15 results for Automated analysis
at Universidad Politécnica de Madrid
Abstract:
Automatic cost analysis of programs has traditionally concentrated on a reduced number of resources such as execution steps, time, or memory. However, the increasing relevance of analysis applications such as static debugging and/or certification of user-level properties (including for mobile code) makes it interesting to develop analyses for resource notions that are actually application-dependent. This may include, for example, bytes sent or received by an application, number of files left open, number of SMSs sent or received, number of accesses to a database, money spent, energy consumption, etc. We present a fully automated analysis for inferring upper bounds on the usage that a Java bytecode program makes of a set of application programmer-definable resources. In our context, a resource is defined by programmer-provided annotations which state the basic consumption that certain program elements make of that resource. From these definitions our analysis derives functions which return an upper bound on the usage that the whole program (and individual blocks) make of that resource for any given set of input data sizes. The analysis proposed is independent of the particular resource. We also present some experimental results from a prototype implementation of the approach covering a significant set of interesting resources.
Abstract:
Automatic cost analysis of programs has been traditionally studied in terms of a number of concrete, predefined resources such as execution steps, time, or memory. However, the increasing relevance of analysis applications such as static debugging and/or certification of user-level properties (including for mobile code) makes it interesting to develop analyses for resource notions that are actually application-dependent. This may include, for example, bytes sent or received by an application, number of files left open, number of SMSs sent or received, number of accesses to a database, money spent, energy consumption, etc. We present a fully automated analysis for inferring upper bounds on the usage that a Java bytecode program makes of a set of application programmer-definable resources. In our context, a resource is defined by programmer-provided annotations which state the basic consumption that certain program elements make of that resource. From these definitions our analysis derives functions which return an upper bound on the usage that the whole program (and individual blocks) make of that resource for any given set of input data sizes. The analysis proposed is independent of the particular resource. We also present some experimental results from a prototype implementation of the approach covering an ample set of interesting resources.
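The idea of deriving an upper bound from per-element annotations can be illustrated with a minimal sketch. It is not the analysis described in the abstract, which works on Java bytecode and derives closed-form bound functions; here the program is a hand-built tree, and the names `BYTES_SENT`, `Call`, `Loop`, `Branch`, and `upper_bound` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# Hypothetical annotation table: the basic consumption of the resource
# "bytes sent" by individual program elements (names are illustrative).
BYTES_SENT: Dict[str, int] = {"send_header": 16, "send_record": 64, "log": 0}


@dataclass
class Call:
    name: str  # an annotated program element


@dataclass
class Loop:
    iterations: Callable[[int], int]  # upper bound on iterations for input size n
    body: List["Stmt"]


@dataclass
class Branch:
    then: List["Stmt"]
    orelse: List["Stmt"]


Stmt = Union[Call, Loop, Branch]


def upper_bound(block: List[Stmt], costs: Dict[str, int], n: int) -> int:
    """Upper bound on resource usage of `block` for input size n."""
    total = 0
    for stmt in block:
        if isinstance(stmt, Call):
            total += costs[stmt.name]
        elif isinstance(stmt, Loop):
            # Bound of a loop: iteration bound times the bound of its body.
            total += stmt.iterations(n) * upper_bound(stmt.body, costs, n)
        else:
            # Bound of a branch: the more expensive of the two arms.
            total += max(upper_bound(stmt.then, costs, n),
                         upper_bound(stmt.orelse, costs, n))
    return total


# Toy program: send a header, then for each of n items send a record and
# either send a second record or write a log entry.
program: List[Stmt] = [
    Call("send_header"),
    Loop(lambda n: n, [Call("send_record"),
                       Branch([Call("send_record")], [Call("log")])]),
]

print(upper_bound(program, BYTES_SENT, 10))  # 16 + 10 * (64 + 64) = 1296
```

Passing a different cost table (for example, one counting database accesses) to the same `upper_bound` function reflects the point that the analysis itself does not depend on the particular resource.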
Abstract:
Exploiting the full potential of telemedical systems means using platform-based solutions: data are recovered from biomedical sensors, hospital information systems, care-givers, as well as patients themselves, and are processed and redistributed in either a centralized or, more probably, a decentralized way. The integration of all these different devices and interfaces, as well as the automated analysis and representation of all the pieces of information, are current key challenges in telemedicine. Mobile phone technology has just begun to offer great opportunities of using this diverse information for guiding, warning, and educating patients, thus increasing their autonomy and adherence to their prescriptions. However, most of these existing mobile solutions are not based on platform systems and therefore represent limited, isolated applications. This article depicts how telemedical systems, based on integrated health data platforms, can maximize prescription adherence in chronic patients through mobile feedback. The application described here has been developed in an EU-funded R&D project called METABO, dedicated to patients with type 1 or type 2 Diabetes Mellitus.
Abstract:
The properties of data and activities in business processes can be used to greatly facilitate several relevant tasks performed at design- and run-time, such as fragmentation, compliance checking, or top-down design. Business processes are often described using workflows. We present an approach for mechanically inferring business domain-specific attributes of workflow components (including data items, activities, and elements of sub-workflows), taking as starting point known attributes of workflow inputs and the structure of the workflow. We achieve this by modeling these components as concepts and applying sharing analysis to a Horn clause-based representation of the workflow. The analysis is applicable to workflows featuring complex control and data dependencies, embedded control constructs, such as loops and branches, and embedded component services.
Abstract:
The aim of this study was to compare automated ribosomal intergenic spacer analysis (ARISA) and denaturing gradient gel electrophoresis (DGGE) techniques to assess bacterial diversity in the rumen of sheep. Sheep were fed 2 diets with 70% of either alfalfa hay or grass hay, and the solid (SOL) and liquid (LIQ) phases of the rumen were sampled immediately before feeding (0 h) and at 4 and 8 h postfeeding. Both techniques detected similar differences between forages, with alfalfa hay promoting greater (P < 0.05) bacterial diversity than grass hay. In contrast, whereas ARISA analysis showed a decrease (P < 0.05) of bacterial diversity in SOL at 4 h postfeeding compared with 0 and 8 h samplings, no variations (P > 0.05) over the postfeeding period were detected by DGGE. The ARISA technique showed lower (P < 0.05) bacterial diversity in SOL than in LIQ samples at 4 h postfeeding, but no differences (P > 0.05) in bacterial diversity between both rumen phases were detected by DGGE. Under the conditions of this study, the DGGE was not sensitive enough to detect some changes in ruminal bacterial communities, and therefore ARISA was considered more accurate for assessing bacterial diversity of ruminal samples. The results highlight the influence of the fingerprinting technique used to draw conclusions on factors affecting ruminal bacterial diversity.
Abstract:
Using fixed-point arithmetic is one of the most common design choices for systems where area, power or throughput are heavily constrained. In order to produce implementations where the cost is minimized without negatively impacting the accuracy of the results, a careful assignment of word-lengths is required. Finding the optimal combination of fixed-point word-lengths for a given system is an NP-hard combinatorial problem to which developers devote between 25 and 50% of the design-cycle time. Reconfigurable hardware platforms such as FPGAs also benefit from the advantages of fixed-point arithmetic, as it compensates for the slower clock frequencies and less efficient area utilization of the hardware platform with respect to ASICs. As FPGAs become commonly used for scientific computation, designs constantly grow larger and more complex, up to the point where they cannot be handled efficiently by current signal and quantization noise modelling and word-length optimization methodologies. In this Ph.D. Thesis we explore different aspects of the quantization problem and we present new methodologies for each of them. Techniques based on extensions of intervals have made it possible to obtain accurate models of the signal and quantization noise propagation in systems with non-linear operations. We take this approach a step further by introducing elements of Multi-Element Generalized Polynomial Chaos (ME-gPC) and combining them with a state-of-the-art Statistical Modified Affine Arithmetic (MAA) based methodology in order to model systems that contain control-flow structures. Our methodology produces the different execution paths automatically, determines the regions of the input domain that will exercise them, and extracts the system statistical moments from the partial results. We use this technique to estimate both the dynamic range and the round-off noise in systems with the aforementioned control-flow structures. We show the good accuracy of our approach, which in some case studies with non-linear operators shows a deviation of only 0.04% with respect to the simulation-based reference values. A known drawback of the techniques based on extensions of intervals is the combinatorial explosion of terms as the size of the targeted systems grows, which leads to scalability problems. To address this issue we present a clustered noise injection technique that groups the signals in the system, introduces the noise terms in each group independently and then combines the results at the end. In this way, the number of noise sources in the system at a given time is controlled and, because of this, the combinatorial explosion is minimized. We also present a multi-way partitioning algorithm aimed at minimizing the deviation of the results due to the loss of correlation between noise terms, in order to keep the results as accurate as possible. This Ph.D. Thesis also covers the development of methodologies for word-length optimization based on Monte-Carlo simulations that run in reasonable times. We do so by presenting two novel techniques that reduce the execution time by approaching the problem in two different ways. First, the interpolative method applies a simple but precise interpolator to estimate the sensitivity of each signal, which is later used to guide the optimization effort. Second, the incremental method revolves around the fact that, although we strictly need to guarantee a certain confidence level in the simulations for the final results of the optimization process, we can use more relaxed levels, which in turn implies using a considerably smaller number of samples, in the initial stages of the process, when we are still far from the optimized solution. Through these two approaches we demonstrate that the execution time of classical greedy techniques can be accelerated by factors of up to ×240 for small/medium sized problems. Finally, this book introduces HOPLITE, an automated, flexible and modular framework for quantization that includes the implementation of the previous techniques and is provided for public access. The aim is to offer developers and researchers a common ground for easily prototyping and verifying new techniques for system modelling and word-length optimization. We describe its workflow, justify the design decisions taken, explain its public API and give a step-by-step demonstration of its execution. We also show, through an example, how new extensions to the flow should be connected to the existing interfaces in order to expand and improve the capabilities of HOPLITE.
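To make the Monte-Carlo side of word-length optimization concrete, the following is a minimal sketch of estimating round-off noise power by simulation. It is not HOPLITE's API nor the thesis's algorithm: the toy data path `system`, the helpers `quantize` and `noise_power`, and the uniform input distribution are all assumptions chosen for illustration.

```python
import random
import statistics
from typing import Optional


def quantize(x: float, frac_bits: int) -> float:
    """Round x to the nearest multiple of 2**-frac_bits (fixed-point rounding)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def system(x: float, y: float, frac_bits: Optional[int] = None) -> float:
    """Toy data path z = x*y + x.

    With frac_bits set, every signal is quantized to that many fractional
    bits; with frac_bits=None the floating-point reference is computed.
    """
    if frac_bits is None:
        q = lambda v: v
    else:
        q = lambda v: quantize(v, frac_bits)
    x, y = q(x), q(y)
    product = q(x * y)
    return q(product + x)


def noise_power(frac_bits: int, samples: int, seed: int = 0) -> float:
    """Monte-Carlo estimate of the mean squared round-off error at the output."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(samples):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        error = system(x, y, frac_bits) - system(x, y)
        squared_errors.append(error * error)
    return statistics.fmean(squared_errors)


# More fractional bits give lower noise; fewer samples give a cheaper but
# less reliable estimate.
for bits in (6, 8, 12):
    print(bits, noise_power(bits, samples=2000))
```

In this sketch the `samples` argument plays the role of the knob that the incremental method relaxes: a search could call `noise_power` with few samples while candidate word-lengths are still far from the target and raise the count only when confirming the final solution.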
Abstract:
An important competence of human data analysts is to interpret and explain the meaning of the results of data analysis to end-users. However, existing automatic solutions for intelligent data analysis provide limited help to interpret and communicate information to non-expert users. In this paper we present a general approach to generating explanatory descriptions about the meaning of quantitative sensor data. We propose a type of web application: a virtual newspaper with automatically generated news stories that describe the meaning of sensor data. This solution integrates a variety of techniques from intelligent data analysis into a web-based multimedia presentation system. We validated our approach in a real world problem and demonstrate its generality using data sets from several domains. Our experience shows that this solution can facilitate the use of sensor data by general users and, therefore, can increase the utility of sensor network infrastructures.
Abstract:
A Near Infrared Spectroscopy (NIRS) industrial application was developed by the LPF-Tagralia team, and transferred to a Spanish dehydrator company (Agrotécnica Extremeña S.L.) for the classification of dehydrator onion bulbs for breeding purposes. The automated operation of the system has allowed the classification of more than one million onion bulbs during seasons 2004 to 2008 (Table 1). The performance achieved by the original model (R² = 0.65; SEC = 2.28 °Brix) was enough for qualitative classification thanks to the broad range of variation of the initial population (18 °Brix). Nevertheless, a reduction of the classification performance of the model has been observed with the passing of seasons. One of the reasons put forward is the reduction of the range of variation that naturally occurs during a breeding process; the other is variation in parameters other than the variable of interest, whose effects are probably affecting the measurements [1]. This study points to the application of Independent Component Analysis (ICA) on this highly variable dataset coming from a NIRS industrial application for the identification of the different sources of variation present through seasons.
Abstract:
The synapses in the cerebral cortex can be classified into two main types, Gray’s type I and type II, which correspond to asymmetric (mostly glutamatergic excitatory) and symmetric (inhibitory GABAergic) synapses, respectively. Hence, the quantification and identification of their different types and the proportions in which they are found is extraordinarily important in terms of brain function. The ideal approach to calculate the number of synapses per unit volume is to analyze 3D samples reconstructed from serial sections. However, obtaining serial sections by transmission electron microscopy is an extremely time-consuming and technically demanding task. Using focused ion beam/scanning electron microscopy, we recently showed that virtually all synapses can be accurately identified as asymmetric or symmetric synapses when they are visualized, reconstructed, and quantified from large 3D tissue samples obtained in an automated manner. Nevertheless, the analysis, segmentation, and quantification of synapses is still a labor-intensive procedure. Thus, novel solutions are currently necessary to deal with the large volume of data that is being generated by automated 3D electron microscopy. Accordingly, we have developed ESPINA, a software tool that performs the automated segmentation and counting of synapses in a reconstructed 3D volume of the cerebral cortex, and that greatly facilitates and accelerates these processes.
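Counting objects per unit volume in a segmented 3D stack can be sketched as follows. This is not ESPINA's algorithm, which the abstract does not describe; it is a generic connected-component count over a binary volume with 6-connectivity, and the names `count_components` and `density`, the voxel layout `volume[z][y][x]`, and the voxel size are assumptions for illustration.

```python
from collections import deque
from typing import List

# volume[z][y][x] == 1 marks a voxel already segmented as part of a synapse.
Volume = List[List[List[int]]]

# Face-sharing neighbours only (6-connectivity).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def count_components(volume: Volume) -> int:
    """Count connected groups of foreground voxels with a breadth-first flood fill."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    seen = set()
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if volume[z][y][x] != 1 or (z, y, x) in seen:
                    continue
                # New, unvisited object: flood-fill all voxels connected to it.
                count += 1
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in NEIGHBOURS:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = (0 <= nz < depth and 0 <= ny < height
                                  and 0 <= nx < width)
                        if (inside and volume[nz][ny][nx] == 1
                                and (nz, ny, nx) not in seen):
                            seen.add((nz, ny, nx))
                            queue.append((nz, ny, nx))
    return count


def density(volume: Volume, voxel_volume: float) -> float:
    """Objects per unit volume, given the physical volume of one voxel."""
    n_voxels = len(volume) * len(volume[0]) * len(volume[0][0])
    return count_components(volume) / (n_voxels * voxel_volume)


# Two serial sections of 3x3 voxels containing two separate objects.
stack: Volume = [
    [[1, 1, 0], [0, 0, 0], [0, 0, 1]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 1]],
]
print(count_components(stack))  # 2
```

A real pipeline would also need to handle objects cut by the borders of the sample, which this sketch ignores.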
Abstract:
Goal independent analysis of logic programs is commonly discussed in the context of the bottom-up approach. However, while the literature is rich in descriptions of top-down analysers and their application, practical experience with bottom-up analysis is still in a preliminary stage. Moreover, the practical use of existing top-down frameworks for goal independent analysis has not been addressed in a practical system. We illustrate the efficient use of existing goal dependent, top-down frameworks for abstract interpretation in performing goal independent analyses of logic programs much the same as those usually derived from bottom-up frameworks. We present several optimizations for this flavour of top-down analysis. The approach is fully implemented within an existing top-down framework. Several implementation tradeoffs are discussed as well as the influence of domain characteristics. An experimental evaluation including a comparison with a bottom-up analysis for the domain Prop is presented. We conclude that the technique can offer advantages with respect to standard goal dependent analyses.
Abstract:
While workflow technology has gained momentum in the last decade as a means for specifying and enacting computational experiments in modern science, reusing and repurposing existing workflows to build new scientific experiments is still a daunting task. This is partly due to the difficulty that scientists experience when attempting to understand existing workflows, which contain several data preparation and adaptation steps in addition to the scientifically significant analysis steps. One way to tackle the understandability problem is through providing abstractions that give a high-level view of activities undertaken within workflows. As a first step towards abstractions, we report in this paper on the results of a manual analysis performed over a set of real-world scientific workflows from Taverna and Wings systems. Our analysis has resulted in a set of scientific workflow motifs that outline i) the kinds of data intensive activities that are observed in workflows (data oriented motifs), and ii) the different manners in which activities are implemented within workflows (workflow oriented motifs). These motifs can be useful to inform workflow designers on the good and bad practices for workflow development, to inform the design of automated tools for the generation of workflow abstractions, etc.
Abstract:
Workflow technology continues to play an important role as a means for specifying and enacting computational experiments in modern science. Reusing and re-purposing workflows allow scientists to do new experiments faster, since the workflows capture useful expertise from others. As workflow libraries grow, scientists face the challenge of finding workflows appropriate for their task, understanding what each workflow does, and reusing relevant portions of a given workflow. We believe that workflows would be easier to understand and reuse if high-level views (abstractions) of their activities were available in workflow libraries. As a first step towards obtaining these abstractions, we report in this paper on the results of a manual analysis performed over a set of real-world scientific workflows from Taverna, Wings, Galaxy and Vistrails. Our analysis has resulted in a set of scientific workflow motifs that outline (i) the kinds of data-intensive activities that are observed in workflows (Data-Operation motifs), and (ii) the different manners in which activities are implemented within workflows (Workflow-Oriented motifs). These motifs are helpful to identify the functionality of the steps in a given workflow, to develop best practices for workflow design, and to develop approaches for automated generation of workflow abstractions.
Abstract:
The characterisation of mineral texture has been a major concern for process mineralogists, as liberation characteristics of the ores are intimately related to the mineralogical texture. While a great effort has been made to automatically characterise texture in unbroken ores, the characterisation of textural attributes in mineral particles is usually descriptive. However, the quantitative characterisation of texture in mineral particles is essential to improve and predict the performance of minerallurgical processes (i.e. all the processes involved in the liberation and separation of the mineral of interest) and to achieve a more accurate geometallurgical model. Driven by this necessity of achieving a more complete characterisation of textural attributes in mineral particles, a methodology has been recently developed to automatically characterise the type of intergrowth between mineral phases within particles by means of digital image analysis. In this methodology, a set of minerallurgical indices has been developed to quantify different mineralogical features and to identify the intergrowth pattern by discriminant analysis. The paper shows the application of the methodology to the textural characterisation of chalcopyrite in the rougher concentrate of the Kansanshi copper mine (Zambia). In this sample, the variety of intergrowth patterns of chalcopyrite with the other minerals has been used to illustrate the methodology. The results obtained show that the method identifies the intergrowth type and provides quantitative information to achieve a complete and detailed mineralogical characterisation. Therefore, the use of this methodology as a routine tool in automated mineralogy would contribute to a better understanding of the ore behaviour during liberation and separation processes.
Abstract:
Process mineralogy provides the mineralogical information required by geometallurgists to address the inherent variation of geological data. The successful beneficiation of ores mostly depends on the ability of mineral processing to be efficiently adapted to the ore characteristics, liberation being one of the most relevant mineralogical parameters. The liberation characteristics of ores are intimately related to mineral texture. Therefore, the characterization of liberation necessarily requires the identification and quantification of those textural features with a major bearing on mineral liberation. From this point of view, grain size, bonding between mineral grains and intergrowth types are considered the most influential textural attributes. While the quantification of grain size is a usual output of current automated technologies, information about grain boundaries and intergrowth types is usually descriptive and difficult to quantify for inclusion in the geometallurgical model. Aiming at the systematic and quantitative analysis of the intergrowth type within mineral particles, a new methodology based on digital image analysis has been developed. In this work, the ability of this methodology to achieve a more complete characterization of liberation is explored by the analysis of chalcopyrite in the rougher concentrate of the Kansanshi copper-gold mine (Zambia). Results obtained show that the method provides valuable textural information to achieve a better understanding of mineral behaviour during concentration processes. The potential of this method is enhanced by the fact that it provides data unavailable by current technologies. This opens up new perspectives on the quantitative analysis of mineral processing performance based on textural attributes.
Abstract:
Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate Terabytes of image data, the handling and analysis of which becomes a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples for applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines.