Graph-based Techniques for Topic Classification of Tweets in Spanish
dc.contributor.author | Cordobés de la Calle, Héctor | |
dc.contributor.author | Fernández Anta, Antonio | |
dc.contributor.author | Chiroque, Luis F. | |
dc.contributor.author | Pérez, Fernando | |
dc.contributor.author | Redondo, Teófilo | |
dc.contributor.author | Santos, Agustín | |
dc.date.accessioned | 2021-07-13T10:08:52Z | |
dc.date.available | 2021-07-13T10:08:52Z | |
dc.date.issued | 2014-03 | |
dc.identifier.citation | References [1] Aseervatham, Sujeevan. 2007. Apprentissage à base de Noyaux Sémantiques pour le traitement de données textuelles. Ph.D. thesis, Université Paris-Nord-Paris XIII. [2] Blanco, Roi and Christina Lioma. 2012. Graph-based term weighting for information retrieval. Information Retrieval, 15(1):54-92. [3] Brin, Sergey and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst., 30(1-7):107-117, April. [4] Dadachev, Boris, Alexander Balinsky, Helen Balinsky, and Steven Simske. 2012. On the Helmholtz principle for data mining. In Emerging Security Technologies (EST), 2012 Third International Conference on, pages 99-102. IEEE. [5] Fernández Anta, Antonio, Luis Núñez Chiroque, Philippe Morere, and Agustín Santos. 2013. Sentiment analysis and topic detection of Spanish tweets: A comparative study of NLP techniques. Procesamiento del Lenguaje Natural, 50:45-52. [6] Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. 2009. The WEKA data mining software: an update. SIGKDD Explorations, 11(1):10-18. [7] Hassan, Samer, Rada Mihalcea, and Carmen Banea. 2007. Random walk term weighting for improved text classification. International Journal of Semantic Computing, 1(04):421-439. [8] Kleinberg, Jon M. 1999. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604-632, September. [9] Lewis, David D. 1997. Reuters-21578 text categorization test collection. [10] Mihalcea, R. and P. Tarau. 2004. TextRank: Bringing order into texts. In Proceedings of EMNLP-04 and the 2004 Conference on Empirical Methods in Natural Language Processing, July. [11] Nagao, Makoto and Shinsuke Mori. 1994. A new method of n-gram statistics for large number of n and automatic extraction of words and phrases from large text data of Japanese. In Proceedings of the 15th conference on Computational Linguistics, COLING 1994, Volume 1, pages 611-615. Association for Computational Linguistics. [12] Padró, Lluís, Samuel Reese, Eneko Agirre, and Aitor Soroa. 2010. Semantic services in FreeLing 2.1: WordNet and UKB. In Principles, Construction, and Application of Multilingual Wordnets, pages 99-105, Pushpak Bhattacharyya, Christiane Fellbaum, and Piek Vossen, editors, Mumbai, India, February. Global Wordnet Conference 2010, Narosa Publishing House. [13] Porta, Jordi and José Luis Sancho. 2013. Word normalization in Twitter using finite-state transducers. Proc. of the Tweet Normalization Workshop at SEPLN 2013. IV Congreso Español de Informática. [14] Salton, Gerard and Michael J McGill. 1983. Introduction to modern information retrieval. [15] Shuyo, Nakatani. 2010. Language detection library for Java. http://code.google.com/p/language-detection/. [16] Thakkar, Khushboo S, Rajiv V Dharaskar, and MB Chandak. 2010. Graph-based algorithms for text summarization. In Emerging Trends in Engineering and Technology (ICETET), 2010 3rd International Conference on, pages 516-519. IEEE. [17] Vilares, David, Miguel A. Alonso, and Carlos Gómez-Rodríguez. 2013. Una aproximación supervisada para la minería de opiniones sobre tuits en español en base a conocimiento lingüístico. Procesamiento del Lenguaje Natural, 51:127-134. | |
dc.identifier.issn | ISSN 1989-1660 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12761/1287 | |
dc.description.abstract | Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from the input texts. In this work we present techniques based on graph similarity to classify short texts by topic. In our classifier we build graphs from the input texts, and then use properties of these graphs to classify them. We have tested the resulting algorithm by classifying Twitter messages in Spanish among a predefined set of topics, achieving more than 70% accuracy. | |
dc.language.iso | eng | |
dc.publisher | IMAI Research Group | |
dc.title | Graph-based Techniques for Topic Classification of Tweets in Spanish | en |
dc.type | journal article | |
dc.journal.title | IJIMAI International Journal of Interactive Multimedia and Artificial Intelligence (Special issue: AI Techniques to Evaluate Economics and Happiness) | |
dc.type.hasVersion | VoR | |
dc.rights.accessRights | open access | |
dc.volume.number | 2 | |
dc.issue.number | 5 | |
dc.identifier.doi | DOI:10.9781/ijimai.2014.254 | |
dc.page.final | 37 | |
dc.page.initial | 31 | |
dc.subject.keyword | Classification | |
dc.subject.keyword | Graphs | |
dc.subject.keyword | Happiness | |
dc.subject.keyword | NLP | |
dc.subject.keyword | Text Classification | |
dc.subject.keyword | Topic Classification | |
dc.description.status | pub | |
dc.eprint.id | http://eprints.networks.imdea.org/id/eprint/723 |