Authorship Attribution of Arabic Criminal Texts Using Large Language Models: A Comparative Evaluation of ChatGPT, DeepSeek, and Gemini

Authors

Alharbi, I.

Abstract

This study investigates whether three large language models (LLMs), ChatGPT, DeepSeek, and Gemini, can attribute authorship of Arabic criminal texts in a zero-shot setting, that is, with no task-specific training or fine-tuning. Using a quantitative experimental design, each model attributed 24 anonymous criminal texts to one of 12 Arabic authors on the basis of reference writings. The results reveal limited effectiveness: only ChatGPT performed significantly better than chance, reaching 25% accuracy against a chance level of 8.3% (1 in 12). These findings indicate that current LLMs in zero-shot settings are not reliable enough for definitive authorship attribution (AA) of short Arabic criminal texts, highlighting a gap between their general linguistic capabilities and the specific requirements of forensic textual analysis. Although LLMs show preliminary potential, in their current form they cannot replace human expertise in high-stakes forensic contexts involving Arabic texts.
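
The comparison between 25% accuracy and the 8.3% chance level can be illustrated with a short calculation. The abstract does not name the significance test that was used, so the sketch below assumes a one-sided exact binomial test in which each of the 12 candidate authors is equally likely under the null hypothesis. The count of 6 correct attributions is derived here from 25% of 24 texts and is not a figure reported in the abstract; the function name `binomial_tail` is likewise illustrative.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


N_TEXTS = 24                      # anonymous texts per model (from the abstract)
N_AUTHORS = 12                    # candidate authors (from the abstract)
CHANCE = 1 / N_AUTHORS            # about 8.3% under uniform guessing

# Assumption: 25% accuracy on 24 texts corresponds to 6 correct attributions.
correct = round(0.25 * N_TEXTS)

# Assumption: one-sided exact binomial test against the chance level.
p_value = binomial_tail(correct, N_TEXTS, CHANCE)

print(f"chance level: {CHANCE:.1%}")
print(f"correct attributions: {correct} of {N_TEXTS}")
print(f"one-sided exact binomial p-value: {p_value:.4f}")
```

Under these assumptions the p-value falls below the conventional 0.05 threshold, which is consistent with the abstract's description of the ChatGPT result as statistically significant. A different test, or a different number of correct attributions, would give a different value.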

Keywords

Authorship attribution, forensic linguistics, Arabic criminal texts, LLMs

Article information

Journal

International Journal of Human Post-Edited AI Qualitative Data Analysis

Volume (Issue)

2(1), (2026)

Pages

1-13

Published

2026-01-20

How to Cite

Alharbi, I. (2026). Authorship Attribution of Arabic Criminal Texts Using Large Language Models: A Comparative Evaluation of ChatGPT, DeepSeek, and Gemini. International Journal of Human Post-Edited AI Qualitative Data Analysis, 2(1), 1-13. https://doi.org/10.65930/yctcgz27
