
The Unfairness of AI-Flagged Academic Misconduct Investigations in UK Universities

Dr David Grundy, Director of Digital Education, Newcastle University Business School

Abstract

The increasing reliance on AI text detectors in UK higher education to flag potential academic misconduct raises profound fairness concerns. When universities initiate investigations solely based on an AI “red flag,” they risk contaminating the entire process, violating principles of due process and natural justice. Using the Office of the Independent Adjudicator’s (OIA) Good Practice Framework as a benchmark, this article examines how an initial AI suspicion can taint investigations (“fruit of the poisonous tree”), the high rate of false positives that undermines “just cause,” cognitive biases introduced when investigators know about AI flags, and the legal and procedural standards demanding evidence-based, impartial processes. Policy recommendations are offered to ensure AI detection tools serve only as preliminary aids, not as sole grounds for disciplinary action.

Keywords: AI Detection, Academic Misconduct Investigations, Unfair Processes, False Positives, Confirmation Bias, Anchoring Bias, Good Practice Framework

The “Fruit of the Poisonous Tree”

Originating in U.S. criminal law, the “fruit of the poisonous tree” doctrine holds that evidence derived from tainted sources is inadmissible. In UK academia, an AI detection flag functions as a “poisonous tree” when it is inherently unreliable. Initiating a misconduct investigation based solely on such a flag means the process is contaminated at its root, predisposing investigators to find guilt irrespective of subsequent evidence (Mita, 2023). Although UK courts do not formally adopt this doctrine, its underlying principle is embedded in the OIA’s mandate that disciplinary processes be fair and evidence-based (OIA, 2024). An AI-generated suspicion is akin to an unsubstantiated tip-off rather than concrete proof; using it as the trigger for formal inquiries unfairly prejudices the student, as any exculpatory evidence may be discounted through a lens of presumed guilt. To uphold justice, universities must treat AI flags as prompts for further verification, not as grounds for launching full-scale investigations.

False Positives and Lack of “Just Cause” for Investigation

A cornerstone of fairness in academic discipline is “just cause”—reasonable grounds to suspect misconduct. Generative AI detectors, however, exhibit significant error rates (Dalalah & Dalalah, 2023; Giray, 2024). Turnitin’s own data reveal a 4% sentence-level false-positive rate (Chechitelli, 2023), implying that a sizeable fraction of human-written work could be mislabelled. Walters (2023) estimates that even a 10% false-positive rate could wrongfully implicate dozens of students per cohort, accumulating to hundreds over multiple years. Empirical assessments of 14 detectors report accuracies ranging from merely 33% to 81% (Weber-Wulff et al., 2023), and Temple University’s evaluation found Turnitin’s detector only 77% accurate at spotting AI text, with a 7% mis-flag rate for genuine writing (Asselta Law, 2025). The University of Pennsylvania concurs that many detectors have “dangerously high” false-positive defaults (Wood, 2024). Moreover, bias against non-native English writers exacerbates these injustices: Liang et al. (2023) demonstrate that GPT detectors disproportionately flag second-language students. An AI flag, therefore, falls far short of providing reasonable suspicion; it resembles unreliable hearsay rather than the “hard evidence” required to justify formal proceedings (OIA, 2024). Without corroborating indicators—such as verbatim matches to external sources or an inability to reproduce the work—launching investigations on AI scores alone constitutes an unjustified witch-hunt, subjecting innocent students to undue stress and reputational harm (Gorichanaz, 2023).
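To make the scale of the problem concrete, the short calculation below is a minimal illustrative sketch, not a result from any single study: it assumes a hypothetical cohort of 1,000 submissions, 10% of them genuinely AI-written, a 4% false-positive rate and a 77% detection rate, figures drawn loosely from the ranges cited above. It shows both how many honest students such a detector would flag and how often a flag points at an innocent student.

```python
# Back-of-the-envelope sketch: all figures below are illustrative assumptions.

cohort_size = 1000            # hypothetical number of submissions in a cohort
prevalence = 0.10             # assumed share of submissions genuinely AI-written
false_positive_rate = 0.04    # human work wrongly flagged (sentence-level figure cited above)
true_positive_rate = 0.77     # AI-written work correctly flagged (accuracy figure cited above)

ai_written = cohort_size * prevalence
human_written = cohort_size - ai_written

false_flags = human_written * false_positive_rate   # honest students flagged
true_flags = ai_written * true_positive_rate         # genuine misuse flagged

# Positive predictive value: the chance that a flagged submission really is AI-written.
ppv = true_flags / (true_flags + false_flags)

print(f"Honest students flagged:       {false_flags:.0f}")
print(f"Probability a flag is correct: {ppv:.0%}")
# With these assumptions, roughly 36 honest students are flagged per 1,000 submissions,
# and about one flag in three points at an innocent student.
```

Even under these assumed figures, a flag on its own falls well short of the evidential threshold the OIA framework expects, and the lower the true prevalence of misuse in a cohort, the worse the odds become.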

Cognitive Bias and Presumption of Guilt in AI-Flagged Cases

Once a case is initiated on the basis of an AI flag, investigators become vulnerable to cognitive biases. Confirmation bias leads them to seek out evidence that confirms the initial AI suspicion while overlooking exculpatory signs (Rassin, 2022; Wallace, 2015). Anchoring bias further cements this effect: a reported “85% AI-generated” score becomes an immovable reference point, skewing all subsequent evaluations (Ly et al., 2023). Forensic research shows that contextual suggestions of guilt “cannot be unseen,” distorting experts’ judgments even after the context is removed (Kunkler & Roy, 2023). In academic settings, knowing an essay was flagged primes investigators to interpret well-crafted sections suspiciously (“too good to be the student’s own”), rather than neutrally assessing content and process (OIA, 2024). Instructors may unconsciously treat minor stylistic deviations or benign use of editing tools as signs of cheating, while ignoring personal reflections or draft submissions that demonstrate authorship. This is particularly harmful for international students, whose strong writing skills can be misattributed to AI (Mathewson, 2023). By injecting a presumption of guilt and tainting the investigator’s mindset, AI flags undermine the “innocent until proven guilty” ethos and violate natural justice standards that forbid “any reasonable perception of bias or pre-determination” (OIA, 2024).

Due Process, Fairness, and Evidence

UK universities operate under internal regulations and public law requiring fairness, impartiality, and evidence-based decisions (OIA, 2024). Students are entitled to know the precise allegations and the evidence against them, and to respond fully. AI detectors, however, provide opaque probability scores with no transparent rationale, denying students a meaningful opportunity to challenge the “evidence” (Chechitelli, 2023). The typical “balance of probabilities” standard demands substantive proof that misconduct more likely than not occurred—an unsubstantiated AI score cannot meet this threshold. In appeals to the OIA, universities would struggle to justify decisions hinging on black-box algorithms rather than verifiable facts. Furthermore, evidence obtained through improper means—such as uploading student work to unauthorized free detection tools—may violate GDPR and intellectual property rights, rendering it inadmissible. Courts will intervene if procedural fairness or contract terms are breached; a case built chiefly on AI probabilities risks being overturned as procedurally unfair (OIA, 2024). To satisfy due process, any allegation of AI-assisted plagiarism should be substantiated with concrete examples—verbatim matches, inability to reproduce work, or clear stylistic mismatches informed by multiple writing samples—and accompanied by full disclosure of tool limitations to the student.

Policy Recommendations

Thresholds for “Just Cause”

AI detection results must not be the sole trigger for investigations. Universities should require corroborating evidence—verbatim text matches, stark deviations from the student’s known writing style, or inability to demonstrate authorship—before proceeding. Policies should explicitly state that AI flags serve only as preliminary guidance (Webb, 2023; Wargo & Anderson, 2024).

Human Oversight and Professional Scepticism

Flagged submissions should prompt human review by trained subject-matter experts or integrity officers. Reviewers must consider benign explanations (talent, grammar tools, personal reflections) and treat AI outputs as invitations for inquiry, not as proof (Newton & Jones, 2025).

Process Design to Mitigate Bias

Implement partial blinding: require markers to document independent concerns before viewing AI reports, or assign AI review to separate officers. Deliver training on cognitive biases in misconduct investigations, using case studies to highlight the risks of anchoring and confirmation bias (Born, 2024).

Transparent, Fair Regulations

Update academic integrity policies to affirm that no student may be penalized based solely on an AI detector’s output. Incorporate explicit language: “Automated AI detection scores are preliminary aids only; findings must rest on verifiable evidence and academic judgment.” Disclose tool limitations and provide students access to the AI report and its accuracy parameters.

Assessment Design and Pedagogy

Shift away from punitive, detector-centric approaches toward authentic, iterative assessments (vivas, personalized tasks, draft-based assignments) that inherently discourage misconduct. Emphasize trust-based evaluation and support systems, reducing the institutional reliance on unreliable detection software (OIA, 2024).

Statement: Acknowledgement of Assistive Tool Usage

This document is the author’s own original work; however, the Microsoft Word and Grammarly drafting and editing tools were used in its creation, as the author is dyslexic and the text would be unreadable without them. Both tools incorporate elements of machine learning and AI. The author takes full responsibility for the content of the published article.

References

Asselta Law. (2025, February 16). The Hysteria of Professors Accusing Students of Using AI. https://www.asseltalaw.com/blog/2025/02/the-hysteria-of-professors-accusing-students-of-using-ai/

Baker, R. S., & Hawn, A. (2022). Algorithmic Bias in Education. International Journal of Artificial Intelligence in Education, 32(4), 1052–1092. https://doi.org/10.1007/s40593-021-00285-9

Born, R. T. (2024). Stop Fooling Yourself! (Diagnosing and Treating Confirmation Bias). eNeuro, 11(10). https://doi.org/10.1523/ENEURO.0415-24.2024

Cambridge University. (2023, October). Investigating academic misconduct and mark checks. https://www.studentcomplaints.admin.cam.ac.uk/staff-support/investigating-academic-misconduct-and-mark-checks

Chechitelli, A. (2023, June 14). Understanding the false positive rate for sentences of our AI writing detection capability. Turnitin Blog. https://www.turnitin.com/blog/understanding-the-false-positive-rate-for-sentences-of-our-ai-writing-detection-capability

Dalalah, D., & Dalalah, O. M. A. (2023). The false positives and false negatives of generative AI detection tools in education and academic research: The case of ChatGPT. The International Journal of Management Education, 21(2), 100822. https://doi.org/10.1016/j.ijme.2023.100822

Eaton, S. E. (2022, February 22). Check your bias at the door. University Affairs. https://universityaffairs.ca/features/check-your-bias-at-the-door/

Foltynek, T., Bjelobaba, S., Glendinning, I., Khan, Z. R., Santos, R., Pavletic, P., & Kravjar, J. (2023). ENAI Recommendations on the ethical use of Artificial Intelligence in Education. International Journal for Educational Integrity, 19(1), 12. https://doi.org/10.1007/s40979-023-00133-4

Giray, L. (2024). The Problem with False Positives: AI Detection Unfairly Accuses Scholars of AI Plagiarism. The Serials Librarian, 85(5–6), 181–189. https://doi.org/10.1080/0361526X.2024.2433256

Office of the Independent Adjudicator for Higher Education (OIA). (2024, June 3). Good Practice Framework. https://www.oiahe.org.uk/resources-and-publications/good-practice-framework/

Gorichanaz, T. (2023). Accused: How students respond to allegations of using ChatGPT on assessments. Learning: Research and Practice, 9(2), 183–196. https://doi.org/10.1080/23735082.2023.2254787

Kunkler, K. S., & Roy, T. (2023). Reducing the impact of cognitive bias in decision making: Practical actions for forensic science practitioners. Forensic Science International: Synergy, 7, 100341. https://doi.org/10.1016/j.fsisyn.2023.100341

Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., & Zou, J. (2023). GPT detectors are biased against non-native English writers. Patterns, 4(7). https://doi.org/10.1016/j.patter.2023.100779

Ly, D. P., Shekelle, P. G., & Song, Z. (2023). Evidence for Anchoring Bias During Physician Decision-Making. JAMA Internal Medicine, 183(8), 818–823. https://doi.org/10.1001/jamainternmed.2023.2366

Mathewson, T. G. (2023, August 14). AI Detection Tools Falsely Accuse International Students of Cheating. The Markup. https://themarkup.org/machine-learning/2023/08/14/ai-detection-tools-falsely-accuse-international-students-of-cheating

Mita, S. (2023). AI Proctoring: Academic Integrity vs. Student Rights [Note]. Hastings Law Journal, 74(5).

Newton, P. M., & Jones, S. (2025). Education and Training Assessment and Artificial Intelligence. A Pragmatic Guide for Educators. British Journal of Biomedical Science, 81, 14049. https://doi.org/10.3389/bjbs.2024.14049

Rassin, E. (2022). ‘Anyone who commits such a cruel crime, must be criminally irresponsible’: Context effects in forensic psychological assessment. Psychiatry, Psychology and Law, 29(4), 506–515. https://doi.org/10.1080/13218719.2021.1938272

Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., & Feizi, S. (2025). Can AI-Generated Text be Reliably Detected? (No. arXiv:2303.11156). arXiv. https://doi.org/10.48550/arXiv.2303.11156

Wallace, W. A. (2015). The Effect of Confirmation Bias in Criminal Investigative Decision Making [Ph.D., Walden University]. In ProQuest Dissertations and Theses. https://www.proquest.com/docview/1668379477/abstract/576B938495004949PQ/1

Walters, W. H. (2023). The Effectiveness of Software Designed to Detect AI-Generated Writing: A Comparison of 16 AI Text Detectors. Open Information Science, 7(1). https://doi.org/10.1515/opis-2022-0158

Wargo, K., & Anderson, B. (2024, December). Striking a Balance: Navigating the Ethical Dilemmas of AI in Higher Education. EDUCAUSE Review. https://er.educause.edu/articles/2024/12/striking-a-balance-navigating-the-ethical-dilemmas-of-ai-in-higher-education

Webb, M. (2023, September 18). AI Detection—Latest Recommendations. National Centre for AI, Jisc. https://nationalcentreforai.jiscinvolve.org/wp/2023/09/18/ai-detection-latest-recommendations/

Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S., Foltýnek, T., Guerrero-Dib, J., Popoola, O., Šigut, P., & Waddington, L. (2023). Testing of detection tools for AI-generated text. International Journal for Educational Integrity, 19(1), 26. https://doi.org/10.1007/s40979-023-00146-z

Wood, C. (2024, September 10). AI detectors are easily fooled, researchers find. EdScoop. https://edscoop.com/ai-detectors-are-easily-fooled-researchers-find/
