Lost in Translation: Analyzing Non-English Cybercrime Forums
| dc.contributor.author | Mischinger, Mariella | |
| dc.contributor.author | Hughes, Jack | |
| dc.contributor.author | Vitiugin, Fedor | |
| dc.contributor.author | Pastrana, Sergio | |
| dc.contributor.author | Hutchings, Alice | |
| dc.contributor.author | Suarez-Tangil, Guillermo | |
| dc.date.accessioned | 2026-01-19T16:20:33Z | |
| dc.date.available | 2026-01-19T16:20:33Z | |
| dc.date.issued | 2025-11 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12761/2008 | |
| dc.description.abstract | Cybercrime analysis and Cyber Threat Intelligence are crucial for understanding and defending against cyber threats, and online underground communities are a key source of information. Classification tasks are popular but demand significant manual effort and language-specific expertise. Prior work focuses on English-language forums because analyzing other languages requires fluent domain experts. We evaluate how well machine translation tools preserve contextual information in posts and find that GPT-4 is the most reliable. We then use existing underground forum post classification pipelines to compare their performance on translated text and on original-language text. We find that classification on translated underground forum data is as effective as on original-language text, which lets researchers reuse existing pipelines. Next, we investigate fully machine-generated few-shot and zero-shot classification to reduce reliance on manual labeling, followed by a two-step machine-based classification that combines machine-generated labels with the existing classification pipeline. We find that machine-based labeling causes errors to propagate downstream; for tasks requiring high-quality labels, human expertise remains essential. Finally, we provide a qualitative evaluation of disagreements between annotator labels on the original-language text and on the translations, as well as disagreements between annotators and machine labeling. | es |
| dc.language.iso | eng | es |
| dc.title | Lost in Translation: Analyzing Non-English Cybercrime Forums | es |
| dc.type | conference object | es |
| dc.conference.date | 4-7 November 2025 | es |
| dc.conference.place | San Diego, USA | es |
| dc.conference.title | APWG eCrime 2025 | * |
| dc.event.type | conference | es |
| dc.pres.type | paper | es |
| dc.type.hasVersion | VoR | es |
| dc.rights.accessRights | open access | es |
| dc.description.refereed | TRUE | es |
| dc.description.status | inpress | es |
Files in this item
| Files | Size | Format | View |
|---|---|---|---|
| There are no files associated with this item. | | | |


