Biblioteca Digital

874 resultados para granule mining

Collaborative data stream mining in ubiquitous environments using dynamic classifier selection

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In these applications, such as spam filtering or news recommendation, the data stream underlying concept (e.g., interesting mail/news) is likely to change over time. Therefore, the resultant model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that explores the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method where the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate Coll-Stream classification accuracy in situations with concept drift, noise, partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that Coll-Stream resultant model achieves stability and accuracy in a variety of situations using both synthetic and real world datasets.

Educational innovation applied in the UNESCO chair of mining and industrial heritage

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The “Innovatio Educativa Tertio Millennio” group has been 10 years developing educational innovation techniques, actually has reached the level of teaching on the technical teachers has developed, and share them with other groups, that can implement them in their teaching activities. UNESCO Chair of Mining and Industrial Heritage has been years working on heritage, and on the one hand teaching in conservation and maintenance of heritage, and on the other doing raise awareness of the meaning of heritage, the social value and as must be managed effectively. Recently these two groups work together, thus is spreading in a much more effective manner the concepts of heritage, its meaning, its value, and how to manage it and provide effective protection. On one hand being a work of dissemination based on internet and on radio broadcasting, and on the other one of teaching based on educational innovation, and courses, conferences, and face-to-face seminars or distance platforms.

"Innovatio educativa tertio millennio": 12 years evoluting from learning from learning to teaching in courses on educational innovation and collaboration with the UNESCO chair of mining and industrial heritage

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Twelve years ago a group of teachers began to work in educational innovation. In 2002 we received an award for educational innovation, undergoing several stages. Recently, we have decided to focus on being teachers of educational innovation. We create a web scheduled in Joomla offering various services, among which we emphasize teaching courses of educational innovation. The “Instituto de Ciencias de la Educacion” in “Universidad Politécnica de Madrid” has recently incorporated two of these courses, which has been highly praised. These courses will be reissued in new calls, and we are going to offer them to more Universities. We are in contact with several institutions, radio programs, the UNESCO Chair of Mining and Industrial Heritage, and we are working with them in the creation of heritage courses using methods that we have developed.

Combining data mining and ontology engineering to enrich ontologies and linked data

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this position paper, we claim that the need for time consuming data preparation and result interpretation tasks in knowledge discovery, as well as for costly expert consultation and consensus building activities required for ontology building can be reduced through exploiting the interplay of data mining and ontology engineering. The aim is to obtain in a semi-automatic way new knowledge from distributed data sources that can be used for inference and reasoning, as well as to guide the extraction of further knowledge from these data sources. The proposed approach is based on the creation of a novel knowledge discovery method relying on the combination, through an iterative ?feedbackloop?, of (a) data mining techniques to make emerge implicit models from data and (b) pattern-based ontology engineering to capture these models in reusable, conceptual and inferable artefacts.

Data mining for the diagnosis of type 2 diabetes

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Diabetes is the most common disease nowadays in all populations and in all age groups. diabetes contributing to heart disease, increases the risks of developing kidney disease, blindness, nerve damage, and blood vessel damage. Diabetes disease diagnosis via proper interpretation of the diabetes data is an important classification problem. Different techniques of artificial intelligence has been applied to diabetes problem. The purpose of this study is apply the artificial metaplasticity on multilayer perceptron (AMMLP) as a data mining (DM) technique for the diabetes disease diagnosis. The Pima Indians diabetes was used to test the proposed model AMMLP. The results obtained by AMMLP were compared with decision tree (DT), Bayesian classifier (BC) and other algorithms, recently proposed by other researchers, that were applied to the same database. The robustness of the algorithms are examined using classification accuracy, analysis of sensitivity and specificity, confusion matrix. The results obtained by AMMLP are superior to obtained by DT and BC.

Mining recurring concepts in a dynamic feature space

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Most data stream classification techniques assume that the underlying feature space is static. However, in real-world applications the set of features and their relevance to the target concept may change over time. In addition, when the underlying concepts reappear, reusing previously learnt models can enhance the learning process in terms of accuracy and processing time at the expense of manageable memory consumption. In this paper, we propose mining recurring concepts in a dynamic feature space (MReC-DFS), a data stream classification system to address the challenges of learning recurring concepts in a dynamic feature space while simultaneously reducing the memory cost associated with storing past models. MReC-DFS is able to detect and adapt to concept changes using the performance of the learning process and contextual information. To handle recurring concepts, stored models are combined in a dynamically weighted ensemble. Incremental feature selection is performed to reduce the combined feature space. This contribution allows MReC-DFS to store only the features most relevant to the learnt concepts, which in turn increases the memory efficiency of the technique. In addition, an incremental feature selection method is proposed that dynamically determines the threshold between relevant and irrelevant features. Experimental results demonstrating the high accuracy of MReC-DFS compared with state-of-the-art techniques on a variety of real datasets are presented. The results also show the superior memory efficiency of MReC-DFS.

Facing the climate change challenges in the mining and mineral industry as providers of a sustainable manufacturing process

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A sustainable manufacturing process must rely on an also sustainable raw materials and energy supply. This paper is intended to show the results of the studies developed on sustainable business models for the minerals industry as a fundamental previous part of a sustainable manufacturing process. As it has happened in other economic activities, the mining and minerals industry has come under tremendous pressure to improve its social, developmental, and environmental performance. Mining, refining, and the use and disposal of minerals have in some instances led to significant local environmental and social damage. Nowadays, like in other parts of the corporate world, companies are more routinely expected to perform to ever higher standards of behavior, going well beyond achieving the best rate of return for shareholders. They are also increasingly being asked to be more transparent and subject to third-party audit or review, especially in environmental aspects. In terms of environment, there are three inter-related areas where innovation and new business models can make the biggest difference: carbon, water and biodiversity. The focus in these three areas is for two reasons. First, the industrial and energetic minerals industry has significant footprints in each of these areas. Second, these three areas are where the potential environmental impacts go beyond local stakeholders and communities, and can even have global impacts, like in the case of carbon. So prioritizing efforts in these areas will ultimately be a strategic differentiator as the industry businesses continues to grow. Over the next forty years, world?s population is predicted to rise from 6.300 million to 9.500 million people. This will mean a huge demand of natural resources. Indeed, consumption rates are such that current demand for raw materials will probably soon exceed the planet?s capacity. As awareness of the actual situation grows, the public is demanding goods and services that are even more environmentally sustainable. This means that massive efforts are required to reduce the amount of materials we use, including freshwater, minerals and oil, biodiversity, and marine resources. It?s clear that business as usual is no longer possible. Today, companies face not only the economic fallout of the financial crisis; they face the substantial challenge of transitioning to a low-carbon economy that is constrained by dwindling natural resources easily accessible. Innovative business models offer pioneering companies an early start toward the future. They can signal to consumers how to make sustainable choices and provide reward for both the consumer and the shareholder. Climate change and carbon remain major risk discontinuities that we need to better understand and deal with. In the absence of a global carbon solution, the principal objective of any individual country should be to reduce its global carbon emissions by encouraging conservation. The mineral industry internal response is to continue to focus on reducing the energy intensity of our existing operations through energy efficiency and the progressive introduction of new technology. Planning of the new projects must ensure that their energy footprint is minimal from the start. These actions will increase the long term resilience of the business to uncertain energy and carbon markets. This focus, combined with a strong demand for skills in this strategic area for the future requires an appropriate change in initial and continuing training of engineers and technicians and their awareness of the issue of eco-design. It will also need the development of measurement tools for consistent comparisons between companies and the assessments integration of the carbon footprint of mining equipments and services in a comprehensive impact study on the sustainable development of the Economy.

Efficient management of available technologies. Aplication to mining operations for industry supplies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nowadays, processing Industry Sector is going through a series of changes, including right management and reduction of environmental affections. Any productive process which looks for sustainable management is incomplete if Cycle of Life of mineral resources sustainability is not taken into account. Raw materials for manufacturing are provided by mineral resources extraction processes, such as copper, aluminum, iron, gold, silver, silicon, titanium? Those elements are necessary for Mankind development and are obtained from the Earth through mineral extractive processes. Mineral extraction processes are operations which must take care about the environmental consequences. Extraction of huge volumes of rock for their transformation into raw materials for industry must be optimized to reduce ecological cost of the final product as l was possible. Reducing the ecological balance on a global scale has no sense to design an efficient manufacturing in secondary industry (transformation), if in first steps of the supply chain (extraction) impact exceeds the savings of resources in successive phases. Mining operations size suggests that it is an environmental aggressive activity, but precisely because of its great impact must be the first element to be considered. That idea implies that a new concept born: Reduce economical and environmental cost This work aims to make a reflection on the parameters that can be modified to reduce the energy cost of the process without an increasing in operational costs and always ensuring the same production capacity. That means minimize economic and environmental cost at same time. An efficient design of mining operation which has taken into account that idea does not implies an increasing of the operating cost. To get this objective is necessary to think in global operation view to make that all departments involved have common guidelines which make you think in the optimization of global energy costs. Sometimes a single operational cost must be increased to reduce global cost. This work makes a review through different design parameters of surface mining setting some key performance indicators (KPIs) which are estimated from an efficient point of view. Those KPIs can be included by HQE Policies as global indicators. The new concept developed is that a new criteria has to be applied in company policies: improve management, improving OPERATIONAL efficiency. That means, that is better to use current resources properly (machinery, equipment,?) than to replace them with new things but not used correctly. As a conclusion, through an efficient management of current technologies in each extractive operation an important reduction of the energy can be achieved looking at downstream in the process. That implies a lower energetic cost in the whole cycle of life in manufactured product.

Dental Implants Success Prediction: Tomography Based and Data Mining Based Approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There are a number of factors that contribute to the success of dental implant operations. Among others, is the choice of location in which the prosthetic tooth is to be implanted. This project offers a new approach to analyse jaw tissue for the purpose of selecting suitable locations for teeth implant operations. The application developed takes as input jaw computed tomography stack of slices and trims data outside the jaw area, which is the point of interest. It then reconstructs a three dimensional model of the jaw highlighting points of interest on the reconstructed model. On another hand, data mining techniques have been utilised in order to construct a prediction model based on an information dataset of previous dental implant operations with observed stability values. The goal is to find patterns within the dataset that would help predicting the success likelihood of an implant.

La Planificación de Áreas Mineras en Declive : Una respuesta multidimensional = Planning Declining Mining Areas. A multidimensional response

Relevância:

20.00% 20.00%

Publicador:

Resumo:

La actividad minera tiene un gran impacto sobre el territorio, probablemente más que ninguna otra de las actividades humanas, ya que transforma el espacio en todas sus dimensiones: ecológica, ambiental, social y económica. Cuando la reducción de la rentabilidad de la explotación conduce al cierre de ésta, la repercusión sobre su entorno puede llegar a ser brutal. Pero las explotaciones mineras son muy distintas entre ellas y los efectos que su abandono producen sobre el espacio en la que se enclavan pueden ser diversos, por lo que la decisión sobre el futuro de estas áreas no es simple y evidente. Aquí se propone desarrollar una propuesta de clasificación tipológica de las minas y sus regiones con el objetivo de determinar las estrategias de intervención más adecuadas para el futuro de estos espacios y sus habitantes. En concreto se busca diferenciar los conceptos de Mina, Parque Minero, Espacio Minero y Región Minera, todos ellos fruto de la interacción de la huella de la actividad minera con el medio físico, los enclaves urbanizados, y la estructura socioeconómica de la región en la que se enclavan. Mining activity is having a great impact on the territory, probably more than any other human activity, which transforms the space in all of its dimensions, ecological, environmental, social and economic. When reducing the profitability of the operation leads to the conclusion thereof, the impact on the environment can be brutal. But mining are very different between them and the effects they produce on their abandonment in space that interlock can be diverse, so the decision on the future of these areas is not simple and obvious. This proposal aims to develop a typological classification of mines and their regions in order to determine the most appropriate intervention strategies for the future of these spaces and their inhabitants. Specifically, it seeks to differentiate the concepts of Mine, Mining Park, Space Miner and Mining Region, all the result of the interaction of the mining footprint with the physical environment, the urbanized enclaves, and the socio-economic structure of the region which interlock. El presente libro reúne las ponencias presentadas por los investigadores de la red REUSE dentro del 1er Simposio de Reutilización del Espacio Minero; evento organizado por la Universidad Federal de Minas Gerais (UFMG) en Belo Horizonte, entre el 1 y el 3 de octubre de 2012, en el marco del 1er Seminario Internacional de Reconversión de Territorios. La red REUSE es una red realizada gracias a la financiación del programa CYTED

Factors influencing metal price selection in mining feasibility studies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The mineral price assigned in mining project design is critical to determining the economic feasibility of a project. Nevertheless, although it is not difficult to find literature about market metal prices, it is much more complicated to achieve a specific methodology for calculating the value or which justifications are appropriate to include. This study presents an analysis of various methods for selecting metal prices and investigates the mechanisms and motives underlying price selections. The results describe various attitudes adopted by the designers of mining investment projects, and how the price can be determined not just by means of forecasting but also by consideration of other relevant parameters.

Data Mining with Enhanced Neural Networks-CMMSE

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract This paper presents a new method to extract knowledge from existing data sets, that is, to extract symbolic rules using the weights of an Artificial Neural Network. The method has been applied to a neural network with special architecture named Enhanced Neural Network (ENN). This architecture improves the results that have been obtained with multilayer perceptron (MLP). The relationship among the knowledge stored in the weights, the performance of the network and the new implemented algorithm to acquire rules from the weights is explained. The method itself gives a model to follow in the knowledge acquisition with ENN.

1RM prediction: a data mining case study

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last few years there has been a heightened interest in data treatment and analysis with the aim of discovering hidden knowledge and eliciting relationships and patterns within this data. Data mining techniques (also known as Knowledge Discovery in Databases) have been applied over a wide range of fields such as marketing, investment, fraud detection, manufacturing, telecommunications and health. In this study, well-known data mining techniques such as artificial neural networks (ANN), genetic programming (GP), forward selection linear regression (LR) and k-means clustering techniques, are proposed to the health and sports community in order to aid with resistance training prescription. Appropriate resistance training prescription is effective for developing fitness, health and for enhancing general quality of life. Resistance exercise intensity is commonly prescribed as a percent of the one repetition maximum. 1RM, dynamic muscular strength, one repetition maximum or one execution maximum, is operationally defined as the heaviest load that can be moved over a specific range of motion, one time and with correct performance. The safety of the 1RM assessment has been questioned as such an enormous effort may lead to muscular injury. Prediction equations could help to tackle the problem of predicting the 1RM from submaximal loads, in order to avoid or at least, reduce the associated risks. We built different models from data on 30 men who performed up to 5 sets to exhaustion at different percentages of the 1RM in the bench press action, until reaching their actual 1RM. Also, a comparison of different existing prediction equations is carried out. The LR model seems to outperform the ANN and GP models for the 1RM prediction in the range between 1 and 10 repetitions. At 75% of the 1RM some subjects (n = 5) could perform 13 repetitions with proper technique in the bench press action, whilst other subjects (n = 20) performed statistically significant (p < 0:05) more repetitions at 70% than at 75% of their actual 1RM in the bench press action. Rate of perceived exertion (RPE) seems not to be a good predictor for 1RM when all the sets are performed until exhaustion, as no significant differences (p < 0:05) were found in the RPE at 75%, 80% and 90% of the 1RM. Also, years of experience and weekly hours of strength training are better correlated to 1RM (p < 0:05) than body weight. O'Connor et al. 1RM prediction equation seems to arise from the data gathered and seems to be the most accurate 1RM prediction equation from those proposed in literature and used in this study. Epley's 1RM prediction equation is reproduced by means of data simulation from 1RM literature equations. Finally, future lines of research are proposed related to the problem of the 1RM prediction by means of genetic algorithms, neural networks and clustering techniques. RESUMEN En los últimos años ha habido un creciente interés en el tratamiento y análisis de datos con el propósito de descubrir relaciones, patrones y conocimiento oculto en los mismos. Las técnicas de data mining (también llamadas de \Descubrimiento de conocimiento en bases de datos\) se han aplicado consistentemente a lo gran de un gran espectro de áreas como el marketing, inversiones, detección de fraude, producción industrial, telecomunicaciones y salud. En este estudio, técnicas bien conocidas de data mining como las redes neuronales artificiales (ANN), programación genética (GP), regresión lineal con selección hacia adelante (LR) y la técnica de clustering k-means, se proponen a la comunidad del deporte y la salud con el objetivo de ayudar con la prescripción del entrenamiento de fuerza. Una apropiada prescripción de entrenamiento de fuerza es efectiva no solo para mejorar el estado de forma general, sino para mejorar la salud e incrementar la calidad de vida. La intensidad en un ejercicio de fuerza se prescribe generalmente como un porcentaje de la repetición máxima. 1RM, fuerza muscular dinámica, una repetición máxima o una ejecución máxima, se define operacionalmente como la carga máxima que puede ser movida en un rango de movimiento específico, una vez y con una técnica correcta. La seguridad de las pruebas de 1RM ha sido cuestionada debido a que el gran esfuerzo requerido para llevarlas a cabo puede derivar en serias lesiones musculares. Las ecuaciones predictivas pueden ayudar a atajar el problema de la predicción de la 1RM con cargas sub-máximas y son empleadas con el propósito de eliminar o al menos, reducir los riesgos asociados. En este estudio, se construyeron distintos modelos a partir de los datos recogidos de 30 hombres que realizaron hasta 5 series al fallo en el ejercicio press de banca a distintos porcentajes de la 1RM, hasta llegar a su 1RM real. También se muestra una comparación de algunas de las distintas ecuaciones de predicción propuestas con anterioridad. El modelo LR parece superar a los modelos ANN y GP para la predicción de la 1RM entre 1 y 10 repeticiones. Al 75% de la 1RM algunos sujetos (n = 5) pudieron realizar 13 repeticiones con una técnica apropiada en el ejercicio press de banca, mientras que otros (n = 20) realizaron significativamente (p < 0:05) más repeticiones al 70% que al 75% de su 1RM en el press de banca. El ínndice de esfuerzo percibido (RPE) parece no ser un buen predictor del 1RM cuando todas las series se realizan al fallo, puesto que no existen diferencias signifiativas (p < 0:05) en el RPE al 75%, 80% y el 90% de la 1RM. Además, los años de experiencia y las horas semanales dedicadas al entrenamiento de fuerza están más correlacionadas con la 1RM (p < 0:05) que el peso corporal. La ecuación de O'Connor et al. parece surgir de los datos recogidos y parece ser la ecuación de predicción de 1RM más precisa de aquellas propuestas en la literatura y empleadas en este estudio. La ecuación de predicción de la 1RM de Epley es reproducida mediante simulación de datos a partir de algunas ecuaciones de predicción de la 1RM propuestas con anterioridad. Finalmente, se proponen futuras líneas de investigación relacionadas con el problema de la predicción de la 1RM mediante algoritmos genéticos, redes neuronales y técnicas de clustering.

Probabilistic meta-analysis of risk from the exposure to Hg in artisanal gold mining communities in Colombia

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Colombia is one of the largest per capita mercury polluters in the world as a consequence of its artisanal gold mining activities. The severity of this problem in terms of potential health effects was evaluated by means of a probabilistic risk assessment carried out in the twelve departments (or provinces) in Colombia with the largest gold production. The two exposure pathways included in the risk assessment were inhalation of elemental Hg vapors and ingestion of fish contaminated with methyl mercury. Exposure parameters for the adult population (especially rates of fish consumption) were obtained from nation-wide surveys and concentrations of Hg in air and of methyl-mercury in fish were gathered from previous scientific studies. Fish consumption varied between departments and ranged from 0 to 0.3 kg d?1. Average concentrations of total mercury in fish (70 data) ranged from 0.026 to 3.3 lg g?1. A total of 550 individual measurements of Hg in workshop air (ranging from menor queDL to 1 mg m?3) and 261 measurements of Hg in outdoor air (ranging from menor queDL to 0.652 mg m?3) were used to generate the probability distributions used as concentration terms in the calculation of risk. All but two of the distributions of Hazard Quotients (HQ) associated with ingestion of Hg-contaminated fish for the twelve regions evaluated presented median values higher than the threshold value of 1 and the 95th percentiles ranged from 4 to 90. In the case of exposure to Hg vapors, minimum values of HQ for the general population exceeded 1 in all the towns included in this study, and the HQs for miner-smelters burning the amalgam is two orders of magnitude higher, reaching values of 200 for the 95th percentile. Even acknowledging the conservative assumptions included in the risk assessment and the uncertainties associated with it, its results clearly reveal the exorbitant levels of risk endured not only by miner-smelters but also by the general population of artisanal gold mining communities in Colombia.

Nanoinformatics approaches for information extraction and text mining in nanomedical research texts

Relevância:

20.00% 20.00%

Publicador:

Resumo:

La nanotecnología es un área de investigación de reciente creación que trata con la manipulación y el control de la materia con dimensiones comprendidas entre 1 y 100 nanómetros. A escala nanométrica, los materiales exhiben fenómenos físicos, químicos y biológicos singulares, muy distintos a los que manifiestan a escala convencional. En medicina, los compuestos miniaturizados a nanoescala y los materiales nanoestructurados ofrecen una mayor eficacia con respecto a las formulaciones químicas tradicionales, así como una mejora en la focalización del medicamento hacia la diana terapéutica, revelando así nuevas propiedades diagnósticas y terapéuticas. A su vez, la complejidad de la información a nivel nano es mucho mayor que en los niveles biológicos convencionales (desde el nivel de población hasta el nivel de célula) y, por tanto, cualquier flujo de trabajo en nanomedicina requiere, de forma inherente, estrategias de gestión de información avanzadas. Desafortunadamente, la informática biomédica todavía no ha proporcionado el marco de trabajo que permita lidiar con estos retos de la información a nivel nano, ni ha adaptado sus métodos y herramientas a este nuevo campo de investigación. En este contexto, la nueva área de la nanoinformática pretende detectar y establecer los vínculos existentes entre la medicina, la nanotecnología y la informática, fomentando así la aplicación de métodos computacionales para resolver las cuestiones y problemas que surgen con la información en la amplia intersección entre la biomedicina y la nanotecnología. Las observaciones expuestas previamente determinan el contexto de esta tesis doctoral, la cual se centra en analizar el dominio de la nanomedicina en profundidad, así como en el desarrollo de estrategias y herramientas para establecer correspondencias entre las distintas disciplinas, fuentes de datos, recursos computacionales y técnicas orientadas a la extracción de información y la minería de textos, con el objetivo final de hacer uso de los datos nanomédicos disponibles. El autor analiza, a través de casos reales, alguna de las tareas de investigación en nanomedicina que requieren o que pueden beneficiarse del uso de métodos y herramientas nanoinformáticas, ilustrando de esta forma los inconvenientes y limitaciones actuales de los enfoques de informática biomédica a la hora de tratar con datos pertenecientes al dominio nanomédico. Se discuten tres escenarios diferentes como ejemplos de actividades que los investigadores realizan mientras llevan a cabo su investigación, comparando los contextos biomédico y nanomédico: i) búsqueda en la Web de fuentes de datos y recursos computacionales que den soporte a su investigación; ii) búsqueda en la literatura científica de resultados experimentales y publicaciones relacionadas con su investigación; iii) búsqueda en registros de ensayos clínicos de resultados clínicos relacionados con su investigación. El desarrollo de estas actividades requiere el uso de herramientas y servicios informáticos, como exploradores Web, bases de datos de referencias bibliográficas indexando la literatura biomédica y registros online de ensayos clínicos, respectivamente. Para cada escenario, este documento proporciona un análisis detallado de los posibles obstáculos que pueden dificultar el desarrollo y el resultado de las diferentes tareas de investigación en cada uno de los dos campos citados (biomedicina y nanomedicina), poniendo especial énfasis en los retos existentes en la investigación nanomédica, campo en el que se han detectado las mayores dificultades. El autor ilustra cómo la aplicación de metodologías provenientes de la informática biomédica a estos escenarios resulta efectiva en el dominio biomédico, mientras que dichas metodologías presentan serias limitaciones cuando son aplicadas al contexto nanomédico. Para abordar dichas limitaciones, el autor propone un enfoque nanoinformático, original, diseñado específicamente para tratar con las características especiales que la información presenta a nivel nano. El enfoque consiste en un análisis en profundidad de la literatura científica y de los registros de ensayos clínicos disponibles para extraer información relevante sobre experimentos y resultados en nanomedicina —patrones textuales, vocabulario en común, descriptores de experimentos, parámetros de caracterización, etc.—, seguido del desarrollo de mecanismos para estructurar y analizar dicha información automáticamente. Este análisis concluye con la generación de un modelo de datos de referencia (gold standard) —un conjunto de datos de entrenamiento y de test anotados manualmente—, el cual ha sido aplicado a la clasificación de registros de ensayos clínicos, permitiendo distinguir automáticamente los estudios centrados en nanodrogas y nanodispositivos de aquellos enfocados a testear productos farmacéuticos tradicionales. El presente trabajo pretende proporcionar los métodos necesarios para organizar, depurar, filtrar y validar parte de los datos nanomédicos existentes en la actualidad a una escala adecuada para la toma de decisiones. Análisis similares para otras tareas de investigación en nanomedicina ayudarían a detectar qué recursos nanoinformáticos se requieren para cumplir los objetivos actuales en el área, así como a generar conjunto de datos de referencia, estructurados y densos en información, a partir de literatura y otros fuentes no estructuradas para poder aplicar nuevos algoritmos e inferir nueva información de valor para la investigación en nanomedicina. ABSTRACT Nanotechnology is a research area of recent development that deals with the manipulation and control of matter with dimensions ranging from 1 to 100 nanometers. At the nanoscale, materials exhibit singular physical, chemical and biological phenomena, very different from those manifested at the conventional scale. In medicine, nanosized compounds and nanostructured materials offer improved drug targeting and efficacy with respect to traditional formulations, and reveal novel diagnostic and therapeutic properties. Nevertheless, the complexity of information at the nano level is much higher than the complexity at the conventional biological levels (from populations to the cell). Thus, any nanomedical research workflow inherently demands advanced information management. Unfortunately, Biomedical Informatics (BMI) has not yet provided the necessary framework to deal with such information challenges, nor adapted its methods and tools to the new research field. In this context, the novel area of nanoinformatics aims to build new bridges between medicine, nanotechnology and informatics, allowing the application of computational methods to solve informational issues at the wide intersection between biomedicine and nanotechnology. The above observations determine the context of this doctoral dissertation, which is focused on analyzing the nanomedical domain in-depth, and developing nanoinformatics strategies and tools to map across disciplines, data sources, computational resources, and information extraction and text mining techniques, for leveraging available nanomedical data. The author analyzes, through real-life case studies, some research tasks in nanomedicine that would require or could benefit from the use of nanoinformatics methods and tools, illustrating present drawbacks and limitations of BMI approaches to deal with data belonging to the nanomedical domain. Three different scenarios, comparing both the biomedical and nanomedical contexts, are discussed as examples of activities that researchers would perform while conducting their research: i) searching over the Web for data sources and computational resources supporting their research; ii) searching the literature for experimental results and publications related to their research, and iii) searching clinical trial registries for clinical results related to their research. The development of these activities will depend on the use of informatics tools and services, such as web browsers, databases of citations and abstracts indexing the biomedical literature, and web-based clinical trial registries, respectively. For each scenario, this document provides a detailed analysis of the potential information barriers that could hamper the successful development of the different research tasks in both fields (biomedicine and nanomedicine), emphasizing the existing challenges for nanomedical research —where the major barriers have been found. The author illustrates how the application of BMI methodologies to these scenarios can be proven successful in the biomedical domain, whilst these methodologies present severe limitations when applied to the nanomedical context. To address such limitations, the author proposes an original nanoinformatics approach specifically designed to deal with the special characteristics of information at the nano level. This approach consists of an in-depth analysis of the scientific literature and available clinical trial registries to extract relevant information about experiments and results in nanomedicine —textual patterns, common vocabulary, experiment descriptors, characterization parameters, etc.—, followed by the development of mechanisms to automatically structure and analyze this information. This analysis resulted in the generation of a gold standard —a manually annotated training or reference set—, which was applied to the automatic classification of clinical trial summaries, distinguishing studies focused on nanodrugs and nanodevices from those aimed at testing traditional pharmaceuticals. The present work aims to provide the necessary methods for organizing, curating and validating existing nanomedical data on a scale suitable for decision-making. Similar analysis for different nanomedical research tasks would help to detect which nanoinformatics resources are required to meet current goals in the field, as well as to generate densely populated and machine-interpretable reference datasets from the literature and other unstructured sources for further testing novel algorithms and inferring new valuable information for nanomedicine.

«
1
2
...
47
48
49
50
51
52
53
...
58
59
»