13 resultados para automated thematic analysis of textual data

em Universidad Politécnica de Madrid


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Researchers in ecology commonly use multivariate analyses (e.g. redundancy analysis, canonical correspondence analysis, Mantel correlation, multivariate analysis of variance) to interpret patterns in biological data and relate these patterns to environmental predictors. There has been, however, little recognition of the errors associated with biological data and the influence that these may have on predictions derived from ecological hypotheses. We present a permutational method that assesses the effects of taxonomic uncertainty on the multivariate analyses typically used in the analysis of ecological data. The procedure is based on iterative randomizations that randomly re-assign non identified species in each site to any of the other species found in the remaining sites. After each re-assignment of species identities, the multivariate method at stake is run and a parameter of interest is calculated. Consequently, one can estimate a range of plausible values for the parameter of interest under different scenarios of re-assigned species identities. We demonstrate the use of our approach in the calculation of two parameters with an example involving tropical tree species from western Amazonia: 1) the Mantel correlation between compositional similarity and environmental distances between pairs of sites, and; 2) the variance explained by environmental predictors in redundancy analysis (RDA). We also investigated the effects of increasing taxonomic uncertainty (i.e. number of unidentified species), and the taxonomic resolution at which morphospecies are determined (genus-resolution, family-resolution, or fully undetermined species) on the uncertainty range of these parameters. To achieve this, we performed simulations on a tree dataset from southern Mexico by randomly selecting a portion of the species contained in the dataset and classifying them as unidentified at each level of decreasing taxonomic resolution. An analysis of covariance showed that both taxonomic uncertainty and resolution significantly influence the uncertainty range of the resulting parameters. Increasing taxonomic uncertainty expands our uncertainty of the parameters estimated both in the Mantel test and RDA. The effects of increasing taxonomic resolution, however, are not as evident. The method presented in this study improves the traditional approaches to study compositional change in ecological communities by accounting for some of the uncertainty inherent to biological data. We hope that this approach can be routinely used to estimate any parameter of interest obtained from compositional data tables when faced with taxonomic uncertainty.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The application of thematic maps obtained through the classification of remote images needs the obtained products with an optimal accuracy. The registered images from the airplanes display a very satisfactory spatial resolution, but the classical methods of thematic classification not always give better results than when the registered data from satellite are used. In order to improve these results of classification, in this work, the LIDAR sensor data from first return (Light Detection And Ranging) registered simultaneously with the spectral sensor data from airborne are jointly used. The final results of the thematic classification of the scene object of study have been obtained, quantified and discussed with and without LIDAR data, after applying different methods: Maximum Likehood Classification, Support Vector Machine with four different functions kernel and Isodata clustering algorithm (ML, SVM-L, SVM-P, SVM-RBF, SVM-S, Isodata). The best results are obtained for SVM with Sigmoide kernel. These allow the correlation with others different physical parameters with great interest like Manning hydraulic coefficient, for their incorporation in a GIS and their application in hydraulic modeling.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An important competence of human data analysts is to interpret and explain the meaning of the results of data analysis to end-users. However, existing automatic solutions for intelligent data analysis provide limited help to interpret and communicate information to non-expert users. In this paper we present a general approach to generating explanatory descriptions about the meaning of quantitative sensor data. We propose a type of web application: a virtual newspaper with automatically generated news stories that describe the meaning of sensor data. This solution integrates a variety of techniques from intelligent data analysis into a web-based multimedia presentation system. We validated our approach in a real world problem and demonstrate its generality using data sets from several domains. Our experience shows that this solution can facilitate the use of sensor data by general users and, therefore, can increase the utility of sensor network infrastructures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work is aimed to present the main differences of nuclear data uncertainties among three different nuclear data libraries: EAF-2007, EAF-2010 and SCALE-6.0, under different neutron spectra: LWR, ADS and DEMO (fusion)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract interpretation-based data-flow analysis of logic programs is at this point relatively well understood from the point of view of general frameworks and abstract domains. On the other hand, comparatively little attention has been given to the problems which arise when analysis of a full, practical dialect of the Prolog language is attempted, and only few solutions to these problems have been proposed to date. Such problems relate to dealing correctly with all builtins, including meta-logical and extra-logical predicates, with dynamic predicates (where the program is modified during execution), and with the absence of certain program text during compilation. Existing proposals for dealing with such issues generally restrict in one way or another the classes of programs which can be analyzed if the information from analysis is to be used for program optimization. This paper attempts to fill this gap by considering a full dialect of Prolog, essentially following the recently proposed ISO standard, pointing out the problems that may arise in the analysis of such a dialect, and proposing a combination of known and novel solutions that together allow the correct analysis of arbitrary programs using the full power of the language.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work is aimed to present the main differences of nuclear data uncertainties among three different nuclear data libraries: EAF-2007, EAF-2010 and SCALE-6.0, under different neutron spectra: LWR, ADS and DEMO (fusion). To take into account the neutron spectrum, the uncertainty data are collapsed to onegroup. That is a simple way to see the differences among libraries for one application. Also, the neutron spectrum effect on different applications can be observed. These comparisons are presented only for (n,fission), (n,gamma) and (n,p) reactions, for the main transuranic isotopes (234,235,236,238U, 237Np, 238,239,240,241Pu, 241,242m,243Am, 242,243,244,245,246,247,248Cm, 249Bk, 249,250,251,252Cf). But also general comparisons among libraries are presented taking into account all included isotopes. In other works, target accuracies are presented for nuclear data uncertainties; here, these targets are compared with uncertainties on the above libraries. The main results of these comparisons are that EAF-2010 has reduced their uncertainties for many isotopes from EAF-2007 for (n,gamma) and (n,fission) but not for (n,p); SCALE-6.0 gives lower uncertainties for (n,fission) reactions for ADS and PWR applications, but gives higher uncertainties for (n,p) reactions in all applications. For the (n,gamma) reaction, the amount of isotopes which have higher uncertainties is quite similar to the amount of isotopes which have lower uncertainties when SCALE-6.0 and EAF-2010 are compared. When the effect of neutron spectra is analysed, the ADS neutron spectrum obtained the highest uncertainties for (n,gamma) and (n,fission) reactions of all libraries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Analysis of Neutron Thermal Scattering Data Uncertainties in PWRs

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A sensitivity analysis on the multiplication factor, keffkeff, to the cross section data has been carried out for the MYRRHA critical configuration in order to show the most relevant reactions. With these results, a further analysis on the 238Pu and 56Fe cross sections has been performed, comparing the evaluations provided in the JEFF-3.1.2 and ENDF/B-VII.1 libraries for these nuclides. Then, the effect in MYRRHA of the differences between evaluations are analysed, presenting the source of the differences. With these results, recommendations for the 56Fe and 238Pu evaluations are suggested. These calculations have been performed with SCALE6.1 and MCNPX-2.7e.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An uncertainty propagation methodology based on the Monte Carlo method is applied to PWR nuclear design analysis to assess the impact of nuclear data uncertainties. The importance of the nuclear data uncertainties for 235,238 U, 239 Pu, and the thermal scattering library for hydrogen in water is analyzed. This uncertainty analysis is compared with the design and acceptance criteria to assure the adequacy of bounding estimates in safety margins.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Grapheme-color synesthesia is a neurological phenomenon in which viewing achromatic letters/numbers leads to automatic and involuntary color experiences. In this study, voxel-based morphometry analyses were performed on T1 images and fractional anisotropy measures to examine the whole brain in associator grapheme-color synesthetes. These analyses provide new evidence of variations in emotional areas (both at the cortical and subcortical levels), findings that help understand the emotional component as a relevant aspect of the synesthetic experience. Additionally, this study replicates previous findings in the left intraparietal sulcus and, for the first time, reports the existence of anatomical differences in subcortical gray nuclei of developmental grapheme-color synesthetes, providing a link between acquired and developmental synesthesia. This empirical evidence, which goes beyond modality-specific areas, could lead to a better understanding of grapheme-color synesthesia as well as of other modalities of the phenomenon.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Assessment of diastolic chamber properties of the right ventricle by global fitting of pressure-volume data and conformational analysis of 3D + T echocardiographic sequences

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sentiment analysis has recently gained popularity in the financial domain thanks to its capability to predict the stock market based on the wisdom of the crowds. Nevertheless, current sentiment indicators are still silos that cannot be combined to get better insight about the mood of different communities. In this article we propose a Linked Data approach for modelling sentiment and emotions about financial entities. We aim at integrating sentiment information from different communities or providers, and complements existing initiatives such as FIBO. The ap- proach has been validated in the semantic annotation of tweets of several stocks in the Spanish stock market, including its sentiment information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

El uso de aritmética de punto fijo es una opción de diseño muy extendida en sistemas con fuertes restricciones de área, consumo o rendimiento. Para producir implementaciones donde los costes se minimicen sin impactar negativamente en la precisión de los resultados debemos llevar a cabo una asignación cuidadosa de anchuras de palabra. Encontrar la combinación óptima de anchuras de palabra en coma fija para un sistema dado es un problema combinatorio NP-hard al que los diseñadores dedican entre el 25 y el 50 % del ciclo de diseño. Las plataformas hardware reconfigurables, como son las FPGAs, también se benefician de las ventajas que ofrece la aritmética de coma fija, ya que éstas compensan las frecuencias de reloj más bajas y el uso más ineficiente del hardware que hacen estas plataformas respecto a los ASICs. A medida que las FPGAs se popularizan para su uso en computación científica los diseños aumentan de tamaño y complejidad hasta llegar al punto en que no pueden ser manejados eficientemente por las técnicas actuales de modelado de señal y ruido de cuantificación y de optimización de anchura de palabra. En esta Tesis Doctoral exploramos distintos aspectos del problema de la cuantificación y presentamos nuevas metodologías para cada uno de ellos: Las técnicas basadas en extensiones de intervalos han permitido obtener modelos de propagación de señal y ruido de cuantificación muy precisos en sistemas con operaciones no lineales. Nosotros llevamos esta aproximación un paso más allá introduciendo elementos de Multi-Element Generalized Polynomial Chaos (ME-gPC) y combinándolos con una técnica moderna basada en Modified Affine Arithmetic (MAA) estadístico para así modelar sistemas que contienen estructuras de control de flujo. Nuestra metodología genera los distintos caminos de ejecución automáticamente, determina las regiones del dominio de entrada que ejercitarán cada uno de ellos y extrae los momentos estadísticos del sistema a partir de dichas soluciones parciales. Utilizamos esta técnica para estimar tanto el rango dinámico como el ruido de redondeo en sistemas con las ya mencionadas estructuras de control de flujo y mostramos la precisión de nuestra aproximación, que en determinados casos de uso con operadores no lineales llega a tener tan solo una desviación del 0.04% con respecto a los valores de referencia obtenidos mediante simulación. Un inconveniente conocido de las técnicas basadas en extensiones de intervalos es la explosión combinacional de términos a medida que el tamaño de los sistemas a estudiar crece, lo cual conlleva problemas de escalabilidad. Para afrontar este problema presen tamos una técnica de inyección de ruidos agrupados que hace grupos con las señales del sistema, introduce las fuentes de ruido para cada uno de los grupos por separado y finalmente combina los resultados de cada uno de ellos. De esta forma, el número de fuentes de ruido queda controlado en cada momento y, debido a ello, la explosión combinatoria se minimiza. También presentamos un algoritmo de particionado multi-vía destinado a minimizar la desviación de los resultados a causa de la pérdida de correlación entre términos de ruido con el objetivo de mantener los resultados tan precisos como sea posible. La presente Tesis Doctoral también aborda el desarrollo de metodologías de optimización de anchura de palabra basadas en simulaciones de Monte-Cario que se ejecuten en tiempos razonables. Para ello presentamos dos nuevas técnicas que exploran la reducción del tiempo de ejecución desde distintos ángulos: En primer lugar, el método interpolativo aplica un interpolador sencillo pero preciso para estimar la sensibilidad de cada señal, y que es usado después durante la etapa de optimización. En segundo lugar, el método incremental gira en torno al hecho de que, aunque es estrictamente necesario mantener un intervalo de confianza dado para los resultados finales de nuestra búsqueda, podemos emplear niveles de confianza más relajados, lo cual deriva en un menor número de pruebas por simulación, en las etapas iniciales de la búsqueda, cuando todavía estamos lejos de las soluciones optimizadas. Mediante estas dos aproximaciones demostramos que podemos acelerar el tiempo de ejecución de los algoritmos clásicos de búsqueda voraz en factores de hasta x240 para problemas de tamaño pequeño/mediano. Finalmente, este libro presenta HOPLITE, una infraestructura de cuantificación automatizada, flexible y modular que incluye la implementación de las técnicas anteriores y se proporciona de forma pública. Su objetivo es ofrecer a desabolladores e investigadores un entorno común para prototipar y verificar nuevas metodologías de cuantificación de forma sencilla. Describimos el flujo de trabajo, justificamos las decisiones de diseño tomadas, explicamos su API pública y hacemos una demostración paso a paso de su funcionamiento. Además mostramos, a través de un ejemplo sencillo, la forma en que conectar nuevas extensiones a la herramienta con las interfaces ya existentes para poder así expandir y mejorar las capacidades de HOPLITE. ABSTRACT Using fixed-point arithmetic is one of the most common design choices for systems where area, power or throughput are heavily constrained. In order to produce implementations where the cost is minimized without negatively impacting the accuracy of the results, a careful assignment of word-lengths is required. The problem of finding the optimal combination of fixed-point word-lengths for a given system is a combinatorial NP-hard problem to which developers devote between 25 and 50% of the design-cycle time. Reconfigurable hardware platforms such as FPGAs also benefit of the advantages of fixed-point arithmetic, as it compensates for the slower clock frequencies and less efficient area utilization of the hardware platform with respect to ASICs. As FPGAs become commonly used for scientific computation, designs constantly grow larger and more complex, up to the point where they cannot be handled efficiently by current signal and quantization noise modelling and word-length optimization methodologies. In this Ph.D. Thesis we explore different aspects of the quantization problem and we present new methodologies for each of them: The techniques based on extensions of intervals have allowed to obtain accurate models of the signal and quantization noise propagation in systems with non-linear operations. We take this approach a step further by introducing elements of MultiElement Generalized Polynomial Chaos (ME-gPC) and combining them with an stateof- the-art Statistical Modified Affine Arithmetic (MAA) based methodology in order to model systems that contain control-flow structures. Our methodology produces the different execution paths automatically, determines the regions of the input domain that will exercise them, and extracts the system statistical moments from the partial results. We use this technique to estimate both the dynamic range and the round-off noise in systems with the aforementioned control-flow structures. We show the good accuracy of our approach, which in some case studies with non-linear operators shows a 0.04 % deviation respect to the simulation-based reference values. A known drawback of the techniques based on extensions of intervals is the combinatorial explosion of terms as the size of the targeted systems grows, which leads to scalability problems. To address this issue we present a clustered noise injection technique that groups the signals in the system, introduces the noise terms in each group independently and then combines the results at the end. In this way, the number of noise sources in the system at a given time is controlled and, because of this, the combinato rial explosion is minimized. We also present a multi-way partitioning algorithm aimed at minimizing the deviation of the results due to the loss of correlation between noise terms, in order to keep the results as accurate as possible. This Ph.D. Thesis also covers the development of methodologies for word-length optimization based on Monte-Carlo simulations in reasonable times. We do so by presenting two novel techniques that explore the reduction of the execution times approaching the problem in two different ways: First, the interpolative method applies a simple but precise interpolator to estimate the sensitivity of each signal, which is later used to guide the optimization effort. Second, the incremental method revolves on the fact that, although we strictly need to guarantee a certain confidence level in the simulations for the final results of the optimization process, we can do it with more relaxed levels, which in turn implies using a considerably smaller amount of samples, in the initial stages of the process, when we are still far from the optimized solution. Through these two approaches we demonstrate that the execution time of classical greedy techniques can be accelerated by factors of up to ×240 for small/medium sized problems. Finally, this book introduces HOPLITE, an automated, flexible and modular framework for quantization that includes the implementation of the previous techniques and is provided for public access. The aim is to offer a common ground for developers and researches for prototyping and verifying new techniques for system modelling and word-length optimization easily. We describe its work flow, justifying the taken design decisions, explain its public API and we do a step-by-step demonstration of its execution. We also show, through an example, the way new extensions to the flow should be connected to the existing interfaces in order to expand and improve the capabilities of HOPLITE.