9 resultados para mining machine industry
em Universidad de Alicante
Resumo:
El campo de procesamiento de lenguaje natural (PLN), ha tenido un gran crecimiento en los últimos años; sus áreas de investigación incluyen: recuperación y extracción de información, minería de datos, traducción automática, sistemas de búsquedas de respuestas, generación de resúmenes automáticos, análisis de sentimientos, entre otras. En este artículo se presentan conceptos y algunas herramientas con el fin de contribuir al entendimiento del procesamiento de texto con técnicas de PLN, con el propósito de extraer información relevante que pueda ser usada en un gran rango de aplicaciones. Se pueden desarrollar clasificadores automáticos que permitan categorizar documentos y recomendar etiquetas; estos clasificadores deben ser independientes de la plataforma, fácilmente personalizables para poder ser integrados en diferentes proyectos y que sean capaces de aprender a partir de ejemplos. En el presente artículo se introducen estos algoritmos de clasificación, se analizan algunas herramientas de código abierto disponibles actualmente para llevar a cabo estas tareas y se comparan diversas implementaciones utilizando la métrica F en la evaluación de los clasificadores.
imaxin|software: PLN aplicada a la mejora de la comunicación multilingüe de empresas e instituciones
Resumo:
imaxin|software es una empresa creada en 1997 por cuatro titulados en ingeniería informática cuyo objetivo ha sido el de desarrollar videojuegos multimedia educativos y procesamiento del lenguaje natural multilingüe. 17 años más tarde, hemos desarrollado recursos, herramientas y aplicaciones multilingües de referencia para diferentes lenguas: Portugués (Galicia, Portugal, Brasil, etc.), Español (España, Argentina, México, etc.), Inglés, Catalán y Francés. En este artículo haremos una descripción de aquellos principales hitos en relación a la incorporación de estas tecnologías PLN al sector industrial e institucional.
Resumo:
Tema 6. Text Mining con Topic Modeling.
Resumo:
This paper analyses the consequences of urban environmental degradation on the well-being of Spanish miners. It is based on analyses of differences in mortality and height. The first part of the paper examines new hypotheses regarding the urban penalty. We take into consideration existing works in economic theory that address market failures when analysing the higher urban death rate. We explain the reduction in height using the model recently created by Floud, Fogel, Harris and Hong for British cities. The second part of the paper presents information demonstrating that the urban areas in the two largest mining areas in Spain (Bilbao and the Cartagena-La Unión mountain range) experienced a higher death rate relative to rural areas as a consequence of market failures derived from what we term an ‘anarchic urbanisation’.
Resumo:
This work presents a forensic analysis of buildings affected by mining subsidence, which is based on deformation data obtained by Differential Interferometry (DInSAR). The proposed test site is La Union village (Murcia, SE Spain) where subsidence was triggered in an industrial area due to the collapse of abandoned underground mining labours occurred in 1998. In the first part of this work the study area was introduced, describing the spatial and temporal evolution of ground subsidence, through the elaboration of a cracks map on the buildings located within the affected area. In the second part, the evolution of the most significant cracks found in the most damaged buildings was monitored using biaxial extensometric units and inclinometers. This article describes the work performed in the third part, where DInSAR processing of satellite radar data, available between 1998 and 2008, has permitted to determine the spatial and temporal evolution of the deformation of all the buildings of the study area in a period when no continuous in situ instrumental data is available. Additionally, the comparison of these results with the forensic data gathered in the 2005–2008 period, reveal that there is a coincidence between damaged buildings, buildings where extensometers register significant movements of cracks, and buildings deformation estimated from radar data. As a result, it has been demonstrated that the integration of DInSAR data into forensic analysis methodologies contributes to improve significantly the assessment of the damages of buildings affected by mining subsidence.
Resumo:
Preliminary research demonstrated the EmotiBlog annotated corpus relevance as a Machine Learning resource to detect subjective data. In this paper we compare EmotiBlog with the JRC Quotes corpus in order to check the robustness of its annotation. We concentrate on its coarse-grained labels and carry out a deep Machine Learning experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC Quotes corpus demonstrating the EmotiBlog validity as a resource for the SA task.
Resumo:
This paper presents a preliminary study in which Machine Learning experiments applied to Opinion Mining in blogs have been carried out. We created and annotated a blog corpus in Spanish using EmotiBlog. We evaluated the utility of the features labelled firstly carrying out experiments with combinations of them and secondly using the feature selection techniques, we also deal with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme and the language object of our study, Spanish, with less resource than English. We obtained promising results considering that it is a preliminary study.
Resumo:
Comunicación presentada en las IV Jornadas TIMM, Torres (Jaén), 7-8 abril 2011.
Resumo:
Hospitals attached to the Spanish Ministry of Health are currently using the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. Nowadays, this work is manually done by experts. This paper tackles the automatic classification of real Discharge Records in Spanish following the ICD9-CM standard. The challenge is that the Discharge Records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest resulted in the most competitive one, achieving an F-measure of 0.876.