11 results for Modeling Rapport Using Machine Learning
at Universidad de Alicante
Abstract:
This paper presents a preliminary study in which Machine Learning experiments applied to Opinion Mining in blogs have been carried out. We created and annotated a blog corpus in Spanish using EmotiBlog. We evaluated the utility of the labelled features, first by carrying out experiments with combinations of them and then by applying feature selection techniques. We also dealt with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme, and the language under study, Spanish, which has fewer resources than English. We obtained promising results considering that this is a preliminary study.
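As a minimal sketch of the kind of feature-combination and feature-selection experiment described above, the following hypothetical example uses scikit-learn with a placeholder feature matrix; the EmotiBlog features, corpus and classifier of the study are assumptions and are not reproduced here.

```python
# Hypothetical illustration: combine labelled features and apply feature
# selection before training a sentiment classifier.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

# X: one row per blog sentence, columns are placeholder binary features
# (real features would come from the annotated corpus).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50)).astype(float)
y = rng.integers(0, 2, size=200)          # 0 = objective, 1 = subjective

pipeline = Pipeline([
    ("select", SelectKBest(chi2, k=20)),  # keep the 20 most informative features
    ("clf", LinearSVC()),
])

scores = cross_val_score(pipeline, X, y, cv=5, scoring="f1")
print("Mean F1 over 5 folds:", scores.mean())
```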
Abstract:
Hospitals attached to the Spanish Ministry of Health are currently using the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. Nowadays, this work is done manually by experts. This paper tackles the automatic classification of real Discharge Records in Spanish following the ICD9-CM standard. The challenge is that the Discharge Records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest proved to be the most competitive, achieving an F-measure of 0.876.
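A minimal sketch of the kind of setup the abstract describes, assuming scikit-learn, a TF-IDF representation and toy records; the real discharge records, the ICD9-CM label set and the authors' pipeline are not shown and the code below is only an illustrative assumption.

```python
# Hypothetical illustration: classify free-text discharge records into
# placeholder ICD9-CM categories with a Random Forest and report an F-measure.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

records = ["dolor toracico y disnea", "fractura de femur tras caida",
           "neumonia adquirida en la comunidad", "fractura de radio distal"]
codes   = ["786", "820", "486", "813"]   # placeholder ICD9-CM categories

X_train, X_test, y_train, y_test = train_test_split(
    records, codes, test_size=0.5, random_state=0)

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
])
model.fit(X_train, y_train)
print("Macro F1:", f1_score(y_test, model.predict(X_test), average="macro"))
```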
Resumo:
In the chemical textile domain experts have to analyse chemical components and substances that might be harmful for their usage in clothing and textiles. Part of this analysis is performed searching opinions and reports people have expressed concerning these products in the Social Web. However, this type of information on the Internet is not as frequent for this domain as for others, so its detection and classification is difficult and time-consuming. Consequently, problems associated to the use of chemical substances in textiles may not be detected early enough, and could lead to health problems, such as allergies or burns. In this paper, we propose a framework able to detect, retrieve, and classify subjective sentences related to the chemical textile domain, that could be integrated into a wider health surveillance system. We also describe the creation of several datasets with opinions from this domain, the experiments performed using machine learning techniques and different lexical resources such as WordNet, and the evaluation focusing on the sentiment classification, and complaint detection (i.e., negativity). Despite the challenges involved in this domain, our approach obtains promising results with an F-score of 65% for polarity classification and 82% for complaint detection.
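The two classification tasks mentioned above could be sketched as two independent text classifiers, one for polarity and one for complaint (negativity) detection. The example below is a hypothetical scikit-learn sketch, not the authors' framework; the sentences and labels are invented and the WordNet-based features are omitted.

```python
# Hypothetical two-stage sketch: polarity classification and complaint
# (negativity) detection over chemical-textile opinions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

sentences = ["this shirt gave me a rash after one wash",
             "the fabric feels great and causes no irritation",
             "the dye smells of chemicals and burned my skin",
             "very comfortable sportswear, no allergy at all"]
polarity  = ["negative", "positive", "negative", "positive"]
complaint = [1, 0, 1, 0]   # 1 = complaint (negativity), 0 = no complaint

polarity_clf = Pipeline([("tfidf", TfidfVectorizer()),
                         ("lr", LogisticRegression(max_iter=1000))])
complaint_clf = Pipeline([("tfidf", TfidfVectorizer()),
                          ("lr", LogisticRegression(max_iter=1000))])

polarity_clf.fit(sentences, polarity)
complaint_clf.fit(sentences, complaint)

new = ["the trousers irritated my skin badly"]
print(polarity_clf.predict(new), complaint_clf.predict(new))
```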
Abstract:
Unit 6. Text Mining with Topic Modeling.
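As an illustration of the topic-modeling technique named in this unit, here is a minimal, hypothetical Latent Dirichlet Allocation example with scikit-learn; the documents are invented and the actual course materials are not shown.

```python
# Hypothetical illustration: extract topics from a tiny document collection
# with Latent Dirichlet Allocation.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["machine learning for text classification",
        "neural networks learn word representations",
        "football match results and league standings",
        "the team won the championship final"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"Topic {i}: {top}")
```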
Abstract:
Inspired by the early-detection strategies applied in medicine, we propose the design and construction of a prediction system that detects students' learning problems at an early stage. We start from a gamified system for learning Computational Logic, from which usage data and, above all, students' learning outcomes in problem solving are massively collected. All these data are analysed using Machine Learning techniques that produce, as a result, a prediction of each student's performance. The information is presented weekly as a progression chart that is easy to interpret yet carries very valuable information. The resulting system is highly automated, is progressive, offers results from the beginning of the course with increasingly accurate predictions, uses learning outcomes and not only usage data, makes it possible to evaluate and predict the competences and skills acquired, and contributes to a truly formative assessment. In short, it allows teachers to guide students towards improving their performance from very early stages, redirecting potential failures in time and motivating the students.
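A minimal, hypothetical sketch of the kind of weekly performance prediction described above, assuming scikit-learn and placeholder usage and learning-outcome features; the gamified system's real data and model are not reproduced.

```python
# Hypothetical illustration: predict whether a student is at risk from
# weekly usage data and problem-solving results.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_students = 120
# Placeholder columns: exercises attempted, exercises solved, average
# attempts per exercise, minutes of activity on the platform that week.
X = np.column_stack([
    rng.integers(0, 30, n_students),
    rng.integers(0, 25, n_students),
    rng.uniform(1, 5, n_students),
    rng.uniform(0, 300, n_students),
])
y = rng.integers(0, 2, n_students)   # 1 = at risk of failing, 0 = on track

clf = GradientBoostingClassifier(random_state=0)
print("Cross-validated accuracy:",
      cross_val_score(clf, X, y, cv=5).mean())
```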
Abstract:
The analysis of Web 2.0 texts is a relevant research topic nowadays. However, many problems arise when current tools are applied to this type of text. To be able to measure these difficulties, we first need to know the different registers or degrees of informality that can be found. For this reason, in this work we attempt to characterise informality levels for English texts on the Web 2.0 using unsupervised machine learning techniques, obtaining results of 68% in F1.
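An illustrative, hypothetical sketch of clustering Web 2.0 texts by informality using simple surface features and k-means; the feature set, corpus and method of the study are not reproduced and the features below are assumptions.

```python
# Hypothetical illustration: group texts into informality levels with
# unsupervised learning over simple surface features.
import re
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

texts = ["OMG u won't believe this lol!!!",
         "The committee approved the proposal unanimously.",
         "gr8 pics, thx for sharing :)",
         "This report summarises the quarterly results."]

def surface_features(t):
    words = t.split()
    return [
        sum(c in "!?" for c in t) / max(len(t), 1),             # punctuation ratio
        sum(w.isupper() for w in words) / max(len(words), 1),   # all-caps ratio
        len(re.findall(r"\b(u|lol|thx|gr8|omg)\b", t.lower())), # slang hits
        np.mean([len(w) for w in words]),                       # mean word length
    ]

X = StandardScaler().fit_transform([surface_features(t) for t in texts])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)   # cluster ids, interpreted afterwards as informality levels
```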
Abstract:
The blended-learning ("b-learning") methodology is a new teaching scenario that requires the creation, adaptation and application of new learning tools aimed at the assimilation of new collaborative competences. In this context, knowledge spirals, situational leadership and informal learning are well-known concepts. The knowledge spiral is a basic concept of the knowledge process, based on the idea that knowledge increases when a cycle of four phases is repeated successively: 1) knowledge is created (for instance, an idea is conceived); 2) the knowledge is encoded into a format that can be easily transmitted; 3) the knowledge is modified so that it is easily comprehensible and is put to use; 4) new knowledge is created, which improves the previous one (step 1). Each cycle corresponds to a step of a spiral staircase: by going up the staircase, more knowledge is created. Situational leadership, in turn, is based on the idea that each person has a degree of maturity for carrying out a specific task, and that this maturity increases with experience. The teacher (leader) therefore has to adapt the teaching style to the student's (subordinate's) requirements; in this way, the professional and personal development of the student advances more quickly, improving results and satisfaction. This educational strategy, combined with informal learning, and in particular with the zone of proximal development, and supported by our University's own learning content management system, results in a successful and well-evaluated learning activity in Master's subjects focused on the collaborative preparation and oral presentation of short, specific topics related to these subjects. The teacher thus takes on a relevant, consultant role for the selected topic, guiding and supervising the work and often incorporating previous work done in other courses, acting as a research tutor or more experienced student. In this work, we present the academic results, the degree of interactivity developed in these collaborative tasks, statistics, and the degree of satisfaction shown by our postgraduate students.
Abstract:
The geographic focus of a document identifies the place or places on which the content of the text is centred. This work presents a corpus-based approach to detecting the geographic focus of a text. In contrast to other approaches that rely on purely geographic information to detect the focus, our proposal uses all the textual information present in the documents of the working corpus, starting from the hypothesis that the occurrence of certain people, events, dates and even common terms can be fundamental for this task. To validate our hypothesis, we conducted a study on a corpus of geolocated news items covering events that took place between 2008 and 2011. This temporal distribution also allowed us to analyse the evolution over time of the classifier's performance and of the most representative terms for different locations.
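A hypothetical sketch of the corpus-based idea described above: train a standard text classifier on the full text of geolocated news items and let it predict the geographic focus of unseen documents. The toy data, labels and classifier choice below are assumptions, not the study's setup.

```python
# Hypothetical illustration: predict the geographic focus of a news item
# from its full text, not only from explicit place names.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

news = ["the mayor inaugurated the new tram line downtown",
        "local fishermen protest over port regulations",
        "the film festival opened with a gala at the old theatre",
        "heavy rains flooded the coastal promenade again"]
focus = ["Alicante", "Santa Pola", "Alicante", "Benidorm"]  # placeholder labels

geo_clf = Pipeline([("tfidf", TfidfVectorizer()),
                    ("nb", MultinomialNB())])
geo_clf.fit(news, focus)
print(geo_clf.predict(["a gala concert was held at the theatre"]))
```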
Abstract:
The field of natural language processing (NLP) has grown considerably in recent years; its research areas include information retrieval and extraction, data mining, machine translation, question answering systems, automatic summarisation and sentiment analysis, among others. This article presents concepts and some tools intended to contribute to the understanding of text processing with NLP techniques, with the aim of extracting relevant information that can be used in a wide range of applications. Automatic classifiers can be developed to categorise documents and recommend tags; these classifiers should be platform-independent, easily customisable so that they can be integrated into different projects, and able to learn from examples. This article introduces these classification algorithms, analyses some open-source tools currently available for carrying out these tasks, and compares several implementations using the F-measure to evaluate the classifiers.
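To make that kind of comparison concrete, here is a minimal, hypothetical example of evaluating two document classifiers on the same task with the F-measure, assuming scikit-learn as the open-source toolkit and the public 20 Newsgroups corpus (downloaded on first use); it is not the comparison reported in the article.

```python
# Hypothetical illustration: compare two text classifiers on the same
# categorisation task using the F-measure.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.metrics import f1_score

cats = ["sci.space", "rec.sport.baseball"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

for name, clf in [("NaiveBayes", MultinomialNB()),
                  ("LogisticRegression", LogisticRegression(max_iter=1000))]:
    model = Pipeline([("tfidf", TfidfVectorizer()), ("clf", clf)])
    model.fit(train.data, train.target)
    f1 = f1_score(test.target, model.predict(test.data), average="macro")
    print(f"{name}: macro F1 = {f1:.3f}")
```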
Resumo:
Paper submitted to MML 2013, 6th International Workshop on Machine Learning and Music, Prague, September 23, 2013.
Abstract:
Background and objective: In this paper, we have tested the suitability of using different artificial intelligence-based algorithms for decision support when classifying the risk of congenital heart surgery. Classifying these surgical risks provides enormous benefits, such as the a priori estimation of surgical outcomes depending on the type of disease, the type of repair, and other elements that influence the final result. This preventive estimation may help to avoid future complications, or even death. Methods: We have evaluated four machine learning algorithms to achieve our objective: multilayer perceptron, self-organizing map, radial basis function networks and decision trees. The implemented architectures aim to classify among three types of surgical risk: low complexity, medium complexity and high complexity. Results: Accuracy outcomes range between 80% and 99%, with the multilayer perceptron offering the highest hit ratio. Conclusions: According to the results, it is feasible to develop a clinical decision support system using the evaluated algorithms. Such a system would help cardiology specialists, paediatricians and surgeons to forecast the level of risk related to congenital heart disease surgery.
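A minimal, hypothetical sketch of the best-performing configuration reported (a multilayer perceptron classifying three risk levels), using scikit-learn and synthetic placeholder features rather than the clinical data or network architecture of the study.

```python
# Hypothetical illustration: three-class surgical risk classification with
# a multilayer perceptron.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
# Placeholder features: age (months), weight (kg), type of repair (coded),
# number of previous interventions.
X = np.column_stack([
    rng.integers(1, 180, 300),
    rng.uniform(3, 60, 300),
    rng.integers(0, 5, 300),
    rng.integers(0, 3, 300),
])
y = rng.integers(0, 3, 300)   # 0 = low, 1 = medium, 2 = high complexity

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, mlp.predict(X_test)))
```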