15 results for Compositional data analysis-roots in geosciences
at Universidad Politécnica de Madrid
Abstract:
In recent years, significant efforts have been devoted to developing advanced data analysis tools both to predict the occurrence of disruptions and to investigate the operational spaces of devices, with the long-term goals of advancing the understanding of the physics of these events and preparing for ITER. On JET, the latest generation of the disruption predictor, called APODIS, has been deployed in the real-time network during the last campaigns with the new metallic wall. Even though it was trained only with discharges from the carbon wall, it has reached very good performance, with both missed alarms and false alarms on the order of a few percent (and strategies to improve the performance have already been identified). Since predicting the type of disruption is also considered very important for optimising the mitigation measures, a new clustering method, based on the geodesic distance on a probabilistic manifold, has been developed. This technique allows automatic classification of an incoming disruption with a success rate better than 85%. Various other manifold learning tools, particularly Principal Component Analysis and Self Organised Maps, are also producing very interesting results in the comparative analysis of JET and ASDEX Upgrade (AUG) operational spaces, on the route to developing predictors capable of extrapolating from one device to another.
Abstract:
A sensitivity analysis of the multiplication factor, k_eff, to the cross-section data has been carried out for the MYRRHA critical configuration in order to identify the most relevant reactions. Based on these results, a further analysis of the 238Pu and 56Fe cross sections has been performed, comparing the evaluations provided in the JEFF-3.1.2 and ENDF/B-VII.1 libraries for these nuclides. The effect on MYRRHA of the differences between the evaluations is then analysed, and the source of the differences is presented. On this basis, recommendations for the 56Fe and 238Pu evaluations are suggested. The calculations have been performed with SCALE6.1 and MCNPX-2.7e.
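For orientation, the sensitivity of the multiplication factor to a cross section is commonly written as a relative sensitivity coefficient. The abstract does not state which definition was used, so the form below is an assumption based on the usual convention, with \sigma standing for any one reaction cross section of one nuclide.

```latex
S_{k,\sigma}
  = \frac{\sigma}{k_{\mathrm{eff}}}\,
    \frac{\partial k_{\mathrm{eff}}}{\partial \sigma}
  \approx \frac{\Delta k_{\mathrm{eff}} / k_{\mathrm{eff}}}{\Delta \sigma / \sigma}
```

Under this convention, a coefficient of 0.1 would mean that a 1% change in the cross section changes k_eff by roughly 0.1%.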
Abstract:
Tolls have increasingly become a common mechanism to fund road projects in recent decades. Therefore, improving knowledge of demand behavior is a key issue for stakeholders dealing with the management of toll roads. However, the literature on demand elasticity estimates for interurban toll roads is still limited because such roads are relatively scarce internationally. Furthermore, existing research has left some aspects to be investigated, among them the choice of GDP as the most common socioeconomic variable to explain traffic growth over time. This paper intends to determine the variables that best explain the evolution of light vehicle demand on toll roads over the years. To that end, we establish a dynamic panel data methodology aimed at identifying the key socioeconomic variables explaining changes in light vehicle demand over time. The results show that, despite some usefulness, GDP is not the most appropriate explanatory variable, while other parameters such as employment or GDP per capita lead to more stable and consistent results. The methodology is applied to Spanish toll roads for the 1990-2011 period, which constitutes a very interesting case of variations in toll road use, as road demand has experienced a significant decrease since the beginning of the economic crisis in 2008.
Abstract:
In recent years, society has been undergoing a series of changes. One of these is datafication, which can be defined as the systematic transformation of aspects of people's everyday lives into data processed by computers. Every day, every minute and every second, whenever someone uses a digital device, data are being stored somewhere. The data may be the content of an email, but also the number of steps that person has walked or their medical history. Merely storing data provides no added value by itself. Extracting knowledge from data, and thereby giving them value, requires data analysis. Data science, together with data analysis, is becoming increasingly popular. Today, millions of statistical web APIs can be found; these APIs make it possible to analyse trends or sentiments present in social networks or on the internet in general. One of the most popular social networks, Twitter, is public. Every published message, or tweet, can be seen by anyone in the world with an internet connection. This makes Twitter an interesting medium for analysing social habits or consumer profiles. This is the context of the present project. Combining statistical data analysis and content analysis, this work attempts to extract knowledge from public tweets. In particular, it seeks to establish whether gender is an influential factor in the relationships between Twitter users. To that end, a database containing almost 2,000 tweets is analysed. First, the gender of the users is determined by means of web APIs. Second, hypothesis testing is used to find out whether gender influences how users relate to other users.
Finally, a statistical model is built to predict the behaviour of Twitter users in relation to their gender.
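The hypothesis-testing step can be illustrated with a Pearson chi-square test of independence on a 2x2 table of counts. This is only a sketch: the abstract does not say which test the project used, the function name `chi_square_2x2` is made up for illustration, and the counts are hypothetical rather than taken from the tweet database.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value) for 1 degree of freedom, without
    continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (NOT from the project): rows are the gender of the
# tweet author, columns the gender of the user mentioned in the tweet.
counts = [[300, 200], [250, 250]]
stat, p = chi_square_2x2(counts)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

A small p-value would suggest that the gender of the author and the gender of the mentioned user are not independent in the sample. It says nothing about the direction or cause of the association.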
Abstract:
Owing to advances in information technology in general, and in databases in particular, data storage devices are becoming cheaper and data processing speed is increasing. As a result, organizations tend to store large volumes of data that hold great potential information. Decision Support Systems (DSS) try to use the stored data to obtain valuable information for organizations. In this paper, we use both data models and use cases to represent the functionality of data processing in DSS, following Software Engineering processes. We propose a methodology for developing DSS in the Analysis phase, with respect to data processing modeling. As a starting point, we have used a data model adapted to the semantics involved in multidimensional databases or data warehouses (DW). We have also taken an algorithm that provides all the possible ways to automatically cross-check multidimensional model data. Using these, we propose diagrams and descriptions of use cases, which can be considered patterns representing the DSS functionality with regard to the processing of the DW data on which DSS are based. We highlight the reusability and automation benefits that can be achieved, and we think this study can serve as a guide in the development of DSS.
Abstract:
The structural connectivity of the brain is considered to encode species-wise and subject-wise patterns that will unlock large areas of understanding of the human brain. Currently, diffusion MRI of the living brain makes it possible to map the microstructure of tissue and to track the pathways of the fiber bundles connecting the cortical regions across the brain. These bundles are summarized in a network representation called the connectome, which is analyzed using graph theory. The extraction of the connectome from diffusion MRI requires a long processing flow including image enhancement, reconstruction, segmentation, registration, diffusion tracking, etc. Although a concerted effort has been devoted to the definition of standard pipelines for connectome extraction, it is still crucial to define quality assessment protocols for these workflows. The definition of quality control protocols is hindered by the complexity of the pipelines under test and the absolute lack of gold standards for diffusion MRI data. Here we characterize the impact on structural connectivity workflows of the geometrical deformation typically shown by diffusion MRI data due to the inhomogeneity of magnetic susceptibility across the imaged object. We propose an evaluation framework, including whole-brain realistic phantoms, to compare the existing methodologies for correcting these artifacts. Additionally, we design and implement an image segmentation and registration method that avoids the correction task and enables processing in the native space of diffusion data. We release PySDCev, an evaluation framework for the quality control of connectivity pipelines, specialized in the study of susceptibility-derived distortions. In this context, we propose Diffantom, a whole-brain phantom that provides a solution to the lack of gold-standard data. The three correction methodologies under comparison performed reasonably, and it is difficult to determine which method is more advisable.
We demonstrate that susceptibility-derived correction is necessary to increase the sensitivity of connectivity pipelines, at the cost of specificity. Finally, with the registration and segmentation tool called regseg we demonstrate how the problem of susceptibility-derived distortion can be overcome allowing data to be used in their original coordinates. This is crucial to increase the sensitivity of the whole pipeline without any loss in specificity.
Abstract:
In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In these applications, such as spam filtering or news recommendation, the concept underlying the data stream (e.g., interesting mail/news) is likely to change over time. Therefore, the resulting model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that explores the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method in which the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate the classification accuracy of Coll-Stream under varying concept drift, noise, partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that the resulting Coll-Stream model achieves stability and accuracy in a variety of situations using both synthetic and real-world datasets.
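The idea of weighting community classifiers by their local accuracy in each partition of the feature space can be sketched as follows. This is a simplified illustration and not the Coll-Stream algorithm itself: the class name `PartitionWeightedEnsemble`, the Laplace-smoothed weights and the toy classifiers are all assumptions, and no forgetting mechanism for concept drift is included.

```python
from collections import defaultdict


class PartitionWeightedEnsemble:
    """Weighted vote where each classifier's weight is its smoothed
    accuracy inside the partition that contains the query point."""

    def __init__(self, classifiers, partition_of):
        self.classifiers = list(classifiers)  # callables: x -> label
        self.partition_of = partition_of      # callable: x -> partition key
        self.correct = defaultdict(int)       # (classifier, partition) -> hits
        self.seen = defaultdict(int)          # (classifier, partition) -> trials

    def _weight(self, idx, region):
        # Laplace smoothing gives 0.5 to a classifier with no local history.
        key = (idx, region)
        return (self.correct[key] + 1) / (self.seen[key] + 2)

    def predict(self, x):
        region = self.partition_of(x)
        votes = defaultdict(float)
        for idx, clf in enumerate(self.classifiers):
            votes[clf(x)] += self._weight(idx, region)
        return max(votes, key=votes.get)

    def update(self, x, label):
        # Score every classifier on the newly labelled local example.
        region = self.partition_of(x)
        for idx, clf in enumerate(self.classifiers):
            self.seen[(idx, region)] += 1
            if clf(x) == label:
                self.correct[(idx, region)] += 1


# Toy setup (hypothetical): the local concept is "label 1 if x < 5, else 0",
# and the two community classifiers are each right in only one region.
ensemble = PartitionWeightedEnsemble(
    classifiers=[lambda x: 1, lambda x: 0],
    partition_of=lambda x: 0 if x < 5 else 1,
)
for x in range(10):
    ensemble.update(x, 1 if x < 5 else 0)
print(ensemble.predict(2), ensemble.predict(7))
```

Keeping the accuracy counts per partition lets the ensemble trust a classifier from another device only where its concept matches the local one.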
Abstract:
In this PhD Thesis proposal, the principles of diffusion MRI (dMRI) in its application to the human brain mapping of connectivity are reviewed. The background section covers the fundamentals of dMRI, with special focus on those related to the distortions caused by susceptibility inhomogeneity across tissues. Also, a deep survey of available correction methodologies for this common artifact of dMRI is presented. Two methodological approaches to improved correction are introduced. Finally, the PhD proposal describes its objectives, the research plan, and the necessary resources.
Abstract:
Contents: - Center for Open Middleware - POSDATA project - User modeling - Some early results - @posdata service
Abstract:
In this paper, the authors introduce a novel mechanism for data management in a middleware for smart home control, where a relational database and semantic ontology storage are used at the same time in a Data Warehouse. An annotation system has been designed for specifying the storage format and location, registering new ontology concepts and, most importantly, guaranteeing Data Consistency between the two storage methods. To ease the data persistence process, the Data Access Object (DAO) pattern is applied and optimized to strengthen the Data Consistency assurance. This novel mechanism also eases the development of applications and their integration with BATMP. Finally, an application named "Parameter Monitoring Service" is given as an example for assessing the feasibility of the system.
Abstract:
El presente trabajo consistió en el desarrollo de una intervención nutricional a largo plazo llevada a cabo con jugadores profesionales de baloncesto, en función al cumplimiento de las recomendaciones nutricionales, con los siguientes dos objetivos: 1) valorar los cambios que dicha intervención produce sobre las prácticas nutricionales diarias de estos deportistas y 2) conocer la influencia de las modificaciones nutricionales producidas sobre la tasa de percepción del esfuerzo por sesión (RPE-Sesión) y la fatiga, a lo largo de una temporada competitiva, tanto para entrenamientos como partidos oficiales. Los objetivos del estudio se fundamentan en: 1) la numerosa evidencia científica que muestra la inadecuación de los hábitos nutricionales de los jugadores de baloncesto y otros deportistas respecto a las recomendaciones nutricionales; 2) el hecho ampliamente reconocido en la literatura especializada de que una ingesta nutricional óptima permite maximizar el rendimiento deportivo (a nivel físico y cognitivo), promoviendo una rápida recuperación y disminuyendo el riesgo de enfermedades y lesiones deportivas. 
No obstante, pocos estudios han llevado a cabo una intervención nutricional a largo plazo para mejorar los hábitos alimentarios de los deportistas y ninguno de ellos fue realizado con jugadores de baloncesto; 3) la elevada correlación entre la percepción del esfuerzo (RPE) y variables fisiológicas relacionadas al desarrollo de un ejercicio (por ej.: frecuencia cardíaca, consumo máximo de oxígeno o lactato sanguíneo) y los múltiples estudios que muestran la atenuación de la RPE durante la realización del ejercicio mediante una ingesta puntual de nutrientes, (especialmente de hidratos de carbono) aunque ninguno fue desarrollado en baloncesto; 4) el estudio incipiente de la relación entre la ingesta nutricional y la RPE-Sesión, siendo éste un método validado en baloncesto y otros deportes de equipo como indicador de la carga de trabajo interna, el rendimiento deportivo y la intensidad del ejercicio realizado; 5) el hecho de que la fatiga constituye uno de los principales factores influyentes en la percepción del esfuerzo y puede ser retrasada y/o atenuada mediante la ingesta de carbohidratos, pudiendo disminuir consecuentemente la RPE-Sesión y la carga interna del esfuerzo físico, potenciando el rendimiento deportivo y las adaptaciones inducidas por el entrenamiento; 6) la reducida evidencia acerca del comportamiento de la RPE-Sesión ante la modificación de la ingesta de nutrientes, encontrándose sólo un estudio llevado a cabo en baloncesto y 7) la ausencia de investigaciones acerca de la influencia que puede tener la mejora del patrón nutricional de los jugadores sobre la RPE-Sesión y la fatiga, desconociéndose si la adecuación de los hábitos nutricionales conduce a una disminución de estas variables en el largo plazo para todos los entrenamientos y partidos oficiales a nivel profesional. 
Por todo esto, este trabajo comienza con una introducción que presenta el marco teórico de la importancia y función de la nutrición en el deporte, así como de las recomendaciones nutricionales actuales a nivel general y para baloncesto. Además, se describen las intervenciones nutricionales llevadas a cabo previamente con otros deportistas y las consecuentes modificaciones sobre el patrón alimentario, coincidiendo este aspecto con el primer objetivo del presente estudio. Posteriormente, se analiza la RPE, la RPE-Sesión y la fatiga, focalizando el estudio en la relación de dichas variables con la carga de trabajo físico, la intensidad del entrenamiento, el rendimiento deportivo y la recuperación post ejercicio. Finalmente, se combinan todos los aspectos mencionados: ingesta nutricional, RPE percepción del esfuerzo y fatiga, con el fin de conocer la situación actual del estudio de la relación entre dichas variables, conformando la base del segundo objetivo de este estudio. Seguidamente, se exponen y fundamentan los objetivos antes mencionados, para dar lugar después a la explicación de la metodología utilizada en el presente estudio. Ésta consistió en un diseño de estudios de caso, aplicándose una intervención nutricional personalizada a tres jugadores de baloncesto profesional (cada jugador = un estudio de caso; n = 1), con el objetivo de adecuar su ingesta nutricional en el largo plazo a las recomendaciones nutricionales. A su vez, se analizó la respuesta individual de cada uno de los casos a dicha intervención para los dos objetivos del estudio. Para ello, cada jugador completó un registro diario de alimentos (7 días; pesada de alimentos) antes, durante y al final de la intervención. Además, los sujetos registraron diariamente a lo largo del estudio la RPE-Sesión y la fatiga en entrenamientos físicos y de balón y en partidos oficiales de liga, controlándose además en forma cuantitativa otras variables influyentes como el estado de ánimo y el sueño. 
El análisis de los datos consistió en el cálculo de los estadísticos descriptivos para todas las variables, la comparación de la ingesta en los diferentes momentos evaluados con las recomendaciones nutricionales y una comparación de medias no paramétrica entre el período pre intervención y durante la intervención con el test de Wilcoxon (medidas repetidas) para todas las variables. Finalmente, se relacionaron los cambios obtenidos en la ingesta nutricional con la percepción del esfuerzo y la fatiga y la posible influencia del estado de ánimo y el sueño, a través de un estudio correlacional (Tau_b de Kendall). Posteriormente, se presentan los resultados obtenidos y la discusión de los mismos, haciendo referencia a la evidencia científica relacionada que se encuentra publicada hasta el momento, la cual facilitó el análisis de la relación entre RPE-Sesión, fatiga y nutrición a lo largo de una temporada. Los principales hallazgos y su correspondiente análisis, por lo tanto, pueden resumirse en los siguientes: 1) los tres jugadores de baloncesto profesional presentaron inicialmente hábitos nutricionales inadecuados, haciendo evidente la necesidad de un nutricionista deportivo dentro del cuerpo técnico de los equipos profesionales; 2) las principales deficiencias correspondieron a un déficit pronunciado de energía e hidratos de carbono, que fueron reducidas con la intervención nutricional; 3) la ingesta excesiva de grasa total, ácidos grasos saturados, etanol y proteínas que se halló en alguno/s de los casos, también se adecuó a las recomendaciones después de la intervención; 4) la media obtenida durante un período de la temporada para la RPE-Sesión y la fatiga de entrenamientos, podría ser disminuida en un jugador individual mediante el incremento de su ingesta de carbohidratos a largo plazo, siempre que no existan alteraciones psico-emocionales relevantes; 5) el comportamiento de la RPE-Sesión de partidos oficiales no parece estar influido por los factores 
nutricionales modificados en este estudio, dependiendo más de la variación de elementos externos no controlables, intrínsecos a los partidos de baloncesto profesional. Ante estos resultados, se pudo observar que las diferentes características de los jugadores y las distintas respuestas obtenidas después de la intervención, reforzaron la importancia de utilizar un diseño de estudio de casos para el análisis de los deportistas de élite y, asimismo, de realizar un asesoramiento nutricional personalizado. Del mismo modo, la percepción del esfuerzo y la fatiga de cada jugador evolucionaron de manera diferente después de la intervención nutricional, lo cual podría depender de las diferentes características de los sujetos, a nivel físico, psico-social, emocional y contextual. Por ello, se propone que el control riguroso de las variables cualitativas que parecen influir sobre la RPE y la fatiga a largo plazo, facilitaría la comprensión de los datos y la determinación de factores desconocidos que influyen sobre estas variables. Finalmente, al ser la RPE-Sesión un indicador directo de la carga interna del entrenamiento, es decir, del estrés psico-fisiológico experimentado por el deportista, la posible atenuación de esta variable mediante la adecuación de los hábitos nutricionales, permitiría aplicar las cargas externas de entrenamiento planificadas, con menor estrés interno y mejor recuperación entre sesiones, disminuyendo también la sensación de fatiga, a pesar del avance de la temporada. 
ABSTRACT This study consisted of a long-term nutritional intervention carried out with professional basketball players in accordance with nutritional recommendations, with the following two main objectives: 1) to evaluate the changes produced by the intervention in the daily nutritional practices of these athletes and 2) to determine the influence of long-term modifications of nutritional intake on the rate of perceived exertion per session (Session-RPE) and fatigue throughout a competitive season, for both training sessions and competition games. These objectives are based on: 1) the abundant scientific evidence showing the inadequacy of the nutritional habits of basketball players and other athletes with respect to nutritional recommendations; 2) the fact, widely recognized in the scientific literature, that optimal nutrition allows an athlete to achieve maximum performance (both physically and cognitively), promoting fast recovery and decreasing the risk of sports injuries and illnesses. However, only a few studies have carried out a long-term nutritional intervention to improve the nutritional practices of athletes, and none of them involved basketball players; 3) the high correlation between the rate of perceived exertion (RPE) and physiological variables related to the performance of physical exercise (e.g., heart rate, maximum oxygen consumption or blood lactate) and the multiple studies showing the attenuation of RPE during exercise due to the intake of certain nutrients (especially carbohydrates), although none of them was conducted in basketball; 4) the fact that the relation between nutritional intake and Session-RPE has only recently begun to be studied, the Session-RPE method having been validated in basketball and other team sports as an indicator of internal workload, sports performance and exercise intensity; 5) the fact that fatigue is considered one of the main factors influencing RPE and sports performance.
It has also been observed that carbohydrate intake may delay or mitigate the onset of fatigue and thus decrease the perceived exertion and the internal training load, which could improve sports performance and training-induced adaptations; 6) there are few studies evaluating the influence of nutrient intake on Session-RPE, and only one of them has been carried out with basketball players. Moreover, the possible effects on Session-RPE and fatigue of adapting players' nutritional habits through a nutritional intervention have not been analyzed; these variables might decrease for all training sessions and competition games as a result of an improved daily nutritional intake. Therefore, this work begins with an introduction that provides the conceptual framework of this research, focused on the key role of nutrition in sport as well as on the current nutritional recommendations for athletes and specifically for basketball players. In addition, previous nutritional interventions carried out with other athletes are described, together with the resulting modifications of their food patterns, which corresponds to the first objective of the present study. Subsequently, RPE, Session-RPE and fatigue are analyzed, with a focus on their relation to physical workload, training intensity, sports performance and recovery. Finally, all the aforementioned aspects (nutritional intake, RPE and fatigue) are combined in order to establish the current state of knowledge on the relations among them, which forms the basis for the second objective of this study. The objectives mentioned above are then explained, followed by the methodology used in the study. The methodology consisted of a case-study design, carrying out a long-term nutritional intervention with three professional basketball players (each player = one case study; n = 1), in order to adapt their nutritional intake to nutritional recommendations.
At the same time, the individual response of each player to the intervention was analyzed for the two main objectives of the study. Each player completed a food diary (7 days; weighed food) at three points: before, during and at the end of the intervention. In addition, Session-RPE and fatigue were recorded daily throughout the study for all training sessions (training with the ball and physical training) and competition games. Other potentially influential variables, such as mood state and sleep, were also monitored daily throughout the study. Data analysis consisted of the calculation of descriptive statistics for all the variables of the study, the comparison of nutritional intake (evaluated at different times) with nutritional recommendations, and a non-parametric comparison between the pre-intervention and during-intervention periods using the Wilcoxon test (repeated measures) for all variables. Finally, the changes in nutritional intake, mood state and sleep were related to the perceived exertion and fatigue through a correlational study (Kendall's tau-b). After the methodology, the study results and the associated discussion are presented. The discussion is based on the current scientific evidence that contributes to understanding the relation between Session-RPE, fatigue and nutrition throughout the competitive season.
The main findings and their analysis can be summarized as follows: 1) the three professional basketball players initially had inadequate nutritional habits, which clearly shows the need for a sports nutritionist in the coaching staff of professional teams; 2) the major deficiencies of the three players' diets corresponded to a pronounced deficit of energy and carbohydrate intake, which was reduced by the nutritional intervention; 3) the excessive intake of total fat, saturated fatty acids, ethanol and protein found in some cases was also brought into line with the recommendations after the intervention; 4) the mean Session-RPE and fatigue of training sessions over a given period of the competitive season could be decreased in an individual player by increasing his carbohydrate intake in the long term, provided there are no relevant psycho-emotional disturbances; 5) the behavior of Session-RPE in competition games does not seem to be influenced by the nutritional factors modified in this study and seems to depend much more on the variation of external non-controllable factors associated with professional basketball games. Given these results, the different characteristics of each player and the diverse responses observed after the intervention in each individual for all the variables reinforced the importance of using a case-study design for research with elite athletes, as well as of personalized nutritional counselling. In the same way, the different long-term responses obtained for RPE and fatigue in each player after the modification of nutritional habits suggest that these variables depend on the physical, psychosocial, emotional and contextual characteristics of each player. Therefore, it is proposed that rigorous control of the qualitative variables that seem to influence RPE and fatigue in the long term may facilitate the understanding of the data and the identification of unknown factors that could influence these variables.
Finally, because Session-RPE is a direct indicator of the internal training load (the psycho-physiological stress experienced by the athlete), the possible attenuation of Session-RPE through improved nutritional habits would make it possible to apply the planned external training loads with less internal stress and better recovery between sessions, with a decrease in fatigue, despite the advance of the season.
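The correlational step named in the abstract, Kendall's tau-b, can be computed directly from concordant and discordant pairs with a correction for ties. The sketch below is a plain O(n^2) illustration, not the software used in the thesis. The function name `kendall_tau_b` and the sample values are hypothetical.

```python
import math
from collections import Counter


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
            # sign == 0 means the pair is tied in x or y and counts for neither.
    n0 = n * (n - 1) // 2  # total number of pairs
    ties_x = sum(t * (t - 1) // 2 for t in Counter(x).values())
    ties_y = sum(t * (t - 1) // 2 for t in Counter(y).values())
    denominator = math.sqrt((n0 - ties_x) * (n0 - ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical values (NOT study data): weekly carbohydrate intake in g/kg
# and the mean Session-RPE recorded in the same weeks.
carbohydrate = [3.1, 3.4, 4.0, 4.6, 5.2, 5.9]
session_rpe = [520, 540, 470, 450, 455, 400]
print(round(kendall_tau_b(carbohydrate, session_rpe), 3))
```

A rank-based coefficient suits a single-case design with few observations because it makes no normality assumption and handles tied scores.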
Abstract:
En los últimos años ha habido un gran aumento de fuentes de datos biomédicos. La aparición de nuevas técnicas de extracción de datos genómicos y generación de bases de datos que contienen esta información ha creado la necesidad de guardarla para poder acceder a ella y trabajar con los datos que esta contiene. La información contenida en las investigaciones del campo biomédico se guarda en bases de datos. Esto se debe a que las bases de datos permiten almacenar y manejar datos de una manera simple y rápida. Dentro de las bases de datos existen una gran variedad de formatos, como pueden ser bases de datos en Excel, CSV o RDF entre otros. Actualmente, estas investigaciones se basan en el análisis de datos, para a partir de ellos, buscar correlaciones que permitan inferir, por ejemplo, tratamientos nuevos o terapias más efectivas para una determinada enfermedad o dolencia. El volumen de datos que se maneja en ellas es muy grande y dispar, lo que hace que sea necesario el desarrollo de métodos automáticos de integración y homogeneización de los datos heterogéneos. El proyecto europeo p-medicine (FP7-ICT-2009-270089) tiene como objetivo asistir a los investigadores médicos, en este caso de investigaciones relacionadas con el cáncer, proveyéndoles con nuevas herramientas para el manejo de datos y generación de nuevo conocimiento a partir del análisis de los datos gestionados. La ingestión de datos en la plataforma de p-medicine, y el procesamiento de los mismos con los métodos proporcionados, buscan generar nuevos modelos para la toma de decisiones clínicas. Dentro de este proyecto existen diversas herramientas para integración de datos heterogéneos, diseño y gestión de ensayos clínicos, simulación y visualización de tumores y análisis estadístico de datos. 
Precisamente en el ámbito de la integración de datos heterogéneos surge la necesidad de añadir información externa al sistema proveniente de bases de datos públicas, así como relacionarla con la ya existente mediante técnicas de integración semántica. Para resolver esta necesidad se ha creado una herramienta, llamada Term Searcher, que permite hacer este proceso de una manera semiautomática. En el trabajo aquí expuesto se describe el desarrollo y los algoritmos creados para su correcto funcionamiento. Esta herramienta ofrece nuevas funcionalidades que no existían dentro del proyecto para la adición de nuevos datos provenientes de fuentes públicas y su integración semántica con datos privados.---ABSTRACT---Over the last few years, there has been a huge growth of biomedical data sources. The emergence of new techniques of genomic data generation and data base generation that contain this information, has created the need of storing it in order to access and work with its data. The information employed in the biomedical research field is stored in databases. This is due to the capability of databases to allow storing and managing data in a quick and simple way. Within databases there is a variety of formats, such as Excel, CSV or RDF. Currently, these biomedical investigations are based on data analysis, which lead to the discovery of correlations that allow inferring, for example, new treatments or more effective therapies for a specific disease or ailment. The volume of data handled in them is very large and dissimilar, which leads to the need of developing new methods for automatically integrating and homogenizing the heterogeneous data. The p-medicine (FP7-ICT-2009-270089) European project aims to assist medical researchers, in this case related to cancer research, providing them with new tools for managing and creating new knowledge from the analysis of the managed data. 
The ingestion of data into the platform and their subsequent processing with the provided tools aim to enable the generation of new models to support clinical decision making. Within this project, there are different tools related to areas such as the integration of heterogeneous data, the design and management of clinical trials, the simulation and visualization of tumors, and statistical data analysis. Particularly in the field of heterogeneous data integration, there is a need to add external information from public databases and to relate it to the existing information through semantic integration methods. To meet this need, a tool called Term Searcher has been created, which carries out this process in a semiautomatic way. This work describes the development of this tool and the algorithms employed in its operation. The tool provides new functionalities that did not previously exist in the p-medicine project for adding new data from public databases and semantically integrating them with private data.
Resumo:
One of the barriers to applying structural health monitoring (SHM) techniques based on elastic guided waves (GLW) in aircraft is the harmful influence of environmental and operational conditions (EOC). This thesis studies that influence and its compensation, focusing on variations in load state and temperature. The compensation of these effects is based on Artificial Neural Networks (ANN) using experimental data processed with the Chirplet Transform. Changes in the geometry and material properties with respect to the initial state of the structure (the damage) cause changes in the GLW waveform (what we call the damage-sensitive feature, or DSF). Signal processing techniques can be used to look for a relationship between these variations and the damage; this is known as SHM. However, variations in the EOC also produce changes in the acquired GLW data (DSF), which cause errors in the damage diagnosis (SHM) algorithms. This happens because the damage and EOC signatures in the DSF are of the same order. It is therefore necessary to quantify and compensate for the effect of the EOC on the GLW. Although several methodologies exist for compensating EOC effects, such as “Optimal Baseline Selection” (OBS) or “Baseline Signal Stretching” (BSS), they are used exclusively to compensate for thermal effects. The method proposed in this thesis combines experimental data analysis, as in the OBS method, with models based on Artificial Neural Networks (ANN) that replace the physical modelling required by the BSS method. The experimental data analysis consists of applying the Chirplet Transform (CT) to extract the EOC signature on the DSF. An ANN is trained with this information, obtained under various EOC. 
The ANN then acts as an interpolator of references for the undamaged structure, generating reference information for any EOC. Comparing the actual DSF measurements with the values simulated by the ANN yields the damage signature in the DSF, which enables damage diagnosis. This scheme has been applied and verified, under various EOC, for a one-dimensional structure with a single damage path, and for a structure representative of an aircraft fuselage, with curvature and multiple stiffening elements, subjected to a complex load state and with multiple damage paths. The effects of the EOC have been studied in detail in the one-dimensional structure and generalized to the fuselage, demonstrating that the method is independent of the structural configuration and of the type of sensors used for GLW data acquisition. Moreover, this methodology can be used for the simultaneous compensation of a variety of measurable EOC that affect the acquisition of elastic guided wave data. The main result of this thesis, among others, is the CT-ANN methodology for compensating EOC in SHM techniques based on elastic guided waves for damage diagnosis. ABSTRACT One of the open problems in implementing Structural Health Monitoring techniques based on elastic guided waves in real aircraft structures in operation is the influence of the environmental and operational conditions (EOC) on the damage diagnosis problem. This thesis deals with the compensation of these environmental and operational effects, specifically temperature and external loading, using the Chirplet Transform together with Artificial Neural Networks. It is well known that the guided elastic waveform is affected by the appearance of damage (what is known as the damage-sensitive feature or DSF). The DSF is also modified by the temperature and by the load applied to the structure. 
Variations in the EOC alter the acquired data (DSF) and cause errors in damage diagnosis algorithms, because the waveform changes due to EOC variations are of the same order as those due to the occurrence of damage. It is difficult to separate the two effects in order to avoid damage diagnosis errors. It is therefore necessary to quantify and compensate for the effect of the EOC on the GLW waveforms. There are several approaches to compensating EOC effects, such as Optimal Baseline Selection (OBS) or Baseline Signal Stretching (BSS). Usually, they are used for temperature compensation. The new method proposed here combines experimental data analysis, as in the OBS method, with Artificial Neural Network (ANN) models that replace the physical modelling required by the BSS method. The experimental data analysis is based on applying the Chirplet Transform (CT) to extract the EOC signature on the DSF. The information obtained under varying EOC is used to train an ANN. The ANN then acts as a baseline interpolator for the undamaged structure, generating reference information at any EOC. When real measurements of the DSF are compared against the ANN-simulated values, the damage signature appears clearly in the DSF, enabling an accurate damage diagnosis. This scheme has been applied over a range of EOC to a one-dimensional structure containing a single damage path and to a two-dimensional real fuselage structure with stiffener elements and multiple damage paths. The EOC effects tested in the one-dimensional structure have been generalized to the fuselage, showing that the method is independent of the structural arrangement and of the type of sensors used for GLW data acquisition. Moreover, it can be used for the simultaneous compensation of a variety of measurable EOC that affect guided wave data acquisition. The main result of this thesis, among others, is the CT-ANN methodology for the compensation of EOC in GLW-based SHM techniques for damage diagnosis.
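The baseline-interpolation idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's CT-ANN implementation: a piecewise-linear interpolator stands in for the trained ANN, a single scalar stands in for the Chirplet-derived DSF, temperature is the only EOC considered, and every name, value and threshold below is a hypothetical chosen for illustration.

```python
from bisect import bisect_left


def make_baseline(temps, features):
    """Return a function predicting the undamaged feature at any temperature.

    Assumption: piecewise-linear interpolation stands in for the trained ANN.
    """
    pts = sorted(zip(temps, features))
    xs = [t for t, _ in pts]
    ys = [f for _, f in pts]

    def predict(t):
        # Clamp outside the range of temperatures seen during "training".
        if t <= xs[0]:
            return ys[0]
        if t >= xs[-1]:
            return ys[-1]
        i = bisect_left(xs, t)  # xs[i - 1] < t <= xs[i]
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (t - x0) / (x1 - x0)

    return predict


def damage_signature(predict, temperature, measured):
    """Residual between the measured feature and the interpolated baseline."""
    return measured - predict(temperature)


def is_damaged(predict, temperature, measured, threshold):
    """Flag damage when the residual exceeds a (hypothetical) threshold."""
    return abs(damage_signature(predict, temperature, measured)) > threshold


if __name__ == "__main__":
    # Made-up undamaged reference values recorded at three temperatures.
    predict = make_baseline([0.0, 20.0, 40.0], [1.00, 0.96, 0.92])
    print(damage_signature(predict, 30.0, 0.90))
    print(is_damaged(predict, 30.0, 0.90, threshold=0.01))
```

The point of the design is that the residual, rather than the raw measurement, is compared against the threshold, so a change caused only by temperature does not register as damage.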
Resumo:
Researchers in ecology commonly use multivariate analyses (e.g. redundancy analysis, canonical correspondence analysis, Mantel correlation, multivariate analysis of variance) to interpret patterns in biological data and relate these patterns to environmental predictors. There has been, however, little recognition of the errors associated with biological data and the influence that these may have on predictions derived from ecological hypotheses. We present a permutational method that assesses the effects of taxonomic uncertainty on the multivariate analyses typically used in the analysis of ecological data. The procedure is based on iterative randomizations that re-assign the unidentified species in each site to any of the other species found in the remaining sites. After each re-assignment of species identities, the multivariate method in question is run and a parameter of interest is calculated. Consequently, one can estimate a range of plausible values for the parameter of interest under different scenarios of re-assigned species identities. We demonstrate the use of our approach in the calculation of two parameters with an example involving tropical tree species from western Amazonia: 1) the Mantel correlation between compositional similarity and environmental distances between pairs of sites; and 2) the variance explained by environmental predictors in redundancy analysis (RDA). We also investigated the effects of increasing taxonomic uncertainty (i.e. the number of unidentified species) and of the taxonomic resolution at which morphospecies are determined (genus-resolution, family-resolution, or fully undetermined species) on the uncertainty range of these parameters. To achieve this, we performed simulations on a tree dataset from southern Mexico by randomly selecting a portion of the species contained in the dataset and classifying them as unidentified at each level of decreasing taxonomic resolution. 
An analysis of covariance showed that both taxonomic uncertainty and resolution significantly influence the uncertainty range of the resulting parameters. Increasing taxonomic uncertainty widens the uncertainty range of the parameters estimated in both the Mantel test and RDA. The effects of increasing taxonomic resolution, however, are not as evident. The method presented in this study improves on traditional approaches to studying compositional change in ecological communities by accounting for some of the uncertainty inherent in biological data. We hope that this approach can be routinely used to estimate any parameter of interest obtained from compositional data tables when faced with taxonomic uncertainty.
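The randomization procedure described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it uses presence/absence data with Jaccard similarity, a single environmental variable, and treats each unidentified label independently per site. The function names, the toy sites and the labels `m1` to `m3` are all hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity between two sets of species labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pearson(x, y):
    """Pearson correlation; returns NaN when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")


def mantel_r(sites, env):
    """Correlate pairwise compositional similarity with environmental distance."""
    pairs = list(combinations(range(len(sites)), 2))
    sim = [jaccard(sites[i], sites[j]) for i, j in pairs]
    dist = [abs(env[i] - env[j]) for i, j in pairs]
    return pearson(sim, dist)


def reassign(sites, unidentified, rng):
    """Re-assign each unidentified label to a species found in the other sites."""
    out = []
    for k, site in enumerate(sites):
        others = set().union(*(s for j, s in enumerate(sites) if j != k))
        candidates = sorted(others - unidentified - site)
        new_site = set(site - unidentified)
        for label in sorted(site & unidentified):
            pool = [c for c in candidates if c not in new_site]
            # Keep the original label if no identified candidate is left.
            new_site.add(rng.choice(pool) if pool else label)
        out.append(new_site)
    return out


def uncertainty_range(sites, env, unidentified, n_iter=200, seed=0):
    """Range of the Mantel statistic over repeated random re-assignments."""
    rng = random.Random(seed)
    values = [mantel_r(reassign(sites, unidentified, rng), env)
              for _ in range(n_iter)]
    values = [v for v in values if v == v]  # drop NaN results
    return min(values), max(values)


if __name__ == "__main__":
    # Hypothetical toy data: four sites along one environmental gradient.
    sites = [{"A", "B", "m1"}, {"A", "C", "m2"}, {"C", "D"}, {"D", "E", "m3"}]
    env = [0.0, 1.0, 2.0, 3.0]
    print(uncertainty_range(sites, env, {"m1", "m2", "m3"}))
```

The same loop could wrap any other statistic, such as the variance explained in RDA, by swapping `mantel_r` for a different function of the re-assigned site table.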
Resumo:
The properties of data and activities in business processes can be used to greatly facilitate several relevant tasks performed at design- and run-time, such as fragmentation, compliance checking, or top-down design. Business processes are often described using workflows. We present an approach for mechanically inferring business domain-specific attributes of workflow components (including data items, activities, and elements of sub-workflows), taking as a starting point known attributes of workflow inputs and the structure of the workflow. We achieve this by modeling these components as concepts and applying sharing analysis to a Horn clause-based representation of the workflow. The analysis is applicable to workflows featuring complex control and data dependencies, embedded control constructs, such as loops and branches, and embedded component services.