19 resultados para Data Structure Evaluation

em Universidad Politécnica de Madrid


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Managing large medical image collections is an increasingly demanding important issue in many hospitals and other medical settings. A huge amount of this information is daily generated, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store them in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data accessing and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on data structures in the database.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In professional video production, users have to access to huge multimedia files simultaneously in an error-free environment, this restriction force the use of expensive disk architectures for video servers. Previous researches proposed different RAID systems for each specific task (ingest, editing, file, play-out, etc.). Video production companies have to acquire different servers with different RAIDs systems in order to support each task in the production workflow. The solution has multiples disadvantages, duplicated material in several RAIDs, duplicated material for different qualities, transfer and transcoding processes, etc. In this work, an architecture for video servers based on the spreading of JPEG200 data in different RAIDs is presented, each individual part of the data structure goes to a specific RAID type depending on the effect that produces the data on the overall image quality, the method provide a redundancy correlated with the data rank. The global storage can be used in all the different tasks of the production workflow saving disk space, redundant files and transfers procedures.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, nonfailure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system librarles), to genérate and simplify run-time tests, and to perform high-level program transformations such as múltiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present in a tutorial fashion CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements a novel program development framework which uses abstract interpretation as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Los avances en el hardware permiten disponer de grandes volúmenes de datos, surgiendo aplicaciones que deben suministrar información en tiempo cuasi-real, la monitorización de pacientes, ej., el seguimiento sanitario de las conducciones de agua, etc. Las necesidades de estas aplicaciones hacen emerger el modelo de flujo de datos (data streaming) frente al modelo almacenar-para-despuésprocesar (store-then-process). Mientras que en el modelo store-then-process, los datos son almacenados para ser posteriormente consultados; en los sistemas de streaming, los datos son procesados a su llegada al sistema, produciendo respuestas continuas sin llegar a almacenarse. Esta nueva visión impone desafíos para el procesamiento de datos al vuelo: 1) las respuestas deben producirse de manera continua cada vez que nuevos datos llegan al sistema; 2) los datos son accedidos solo una vez y, generalmente, no son almacenados en su totalidad; y 3) el tiempo de procesamiento por dato para producir una respuesta debe ser bajo. Aunque existen dos modelos para el cómputo de respuestas continuas, el modelo evolutivo y el de ventana deslizante; éste segundo se ajusta mejor en ciertas aplicaciones al considerar únicamente los datos recibidos más recientemente, en lugar de todo el histórico de datos. En los últimos años, la minería de datos en streaming se ha centrado en el modelo evolutivo. Mientras que, en el modelo de ventana deslizante, el trabajo presentado es más reducido ya que estos algoritmos no sólo deben de ser incrementales si no que deben borrar la información que caduca por el deslizamiento de la ventana manteniendo los anteriores tres desafíos. Una de las tareas fundamentales en minería de datos es la búsqueda de agrupaciones donde, dado un conjunto de datos, el objetivo es encontrar grupos representativos, de manera que se tenga una descripción sintética del conjunto. Estas agrupaciones son fundamentales en aplicaciones como la detección de intrusos en la red o la segmentación de clientes en el marketing y la publicidad. Debido a las cantidades masivas de datos que deben procesarse en este tipo de aplicaciones (millones de eventos por segundo), las soluciones centralizadas puede ser incapaz de hacer frente a las restricciones de tiempo de procesamiento, por lo que deben recurrir a descartar datos durante los picos de carga. Para evitar esta perdida de datos, se impone el procesamiento distribuido de streams, en concreto, los algoritmos de agrupamiento deben ser adaptados para este tipo de entornos, en los que los datos están distribuidos. En streaming, la investigación no solo se centra en el diseño para tareas generales, como la agrupación, sino también en la búsqueda de nuevos enfoques que se adapten mejor a escenarios particulares. Como ejemplo, un mecanismo de agrupación ad-hoc resulta ser más adecuado para la defensa contra la denegación de servicio distribuida (Distributed Denial of Services, DDoS) que el problema tradicional de k-medias. En esta tesis se pretende contribuir en el problema agrupamiento en streaming tanto en entornos centralizados y distribuidos. Hemos diseñado un algoritmo centralizado de clustering mostrando las capacidades para descubrir agrupaciones de alta calidad en bajo tiempo frente a otras soluciones del estado del arte, en una amplia evaluación. Además, se ha trabajado sobre una estructura que reduce notablemente el espacio de memoria necesario, controlando, en todo momento, el error de los cómputos. Nuestro trabajo también proporciona dos protocolos de distribución del cómputo de agrupaciones. Se han analizado dos características fundamentales: el impacto sobre la calidad del clustering al realizar el cómputo distribuido y las condiciones necesarias para la reducción del tiempo de procesamiento frente a la solución centralizada. Finalmente, hemos desarrollado un entorno para la detección de ataques DDoS basado en agrupaciones. En este último caso, se ha caracterizado el tipo de ataques detectados y se ha desarrollado una evaluación sobre la eficiencia y eficacia de la mitigación del impacto del ataque. ABSTRACT Advances in hardware allow to collect huge volumes of data emerging applications that must provide information in near-real time, e.g., patient monitoring, health monitoring of water pipes, etc. The data streaming model emerges to comply with these applications overcoming the traditional store-then-process model. With the store-then-process model, data is stored before being consulted; while, in streaming, data are processed on the fly producing continuous responses. The challenges of streaming for processing data on the fly are the following: 1) responses must be produced continuously whenever new data arrives in the system; 2) data is accessed only once and is generally not maintained in its entirety, and 3) data processing time to produce a response should be low. Two models exist to compute continuous responses: the evolving model and the sliding window model; the latter fits best with applications must be computed over the most recently data rather than all the previous data. In recent years, research in the context of data stream mining has focused mainly on the evolving model. In the sliding window model, the work presented is smaller since these algorithms must be incremental and they must delete the information which expires when the window slides. Clustering is one of the fundamental techniques of data mining and is used to analyze data sets in order to find representative groups that provide a concise description of the data being processed. Clustering is critical in applications such as network intrusion detection or customer segmentation in marketing and advertising. Due to the huge amount of data that must be processed by such applications (up to millions of events per second), centralized solutions are usually unable to cope with timing restrictions and recur to shedding techniques where data is discarded during load peaks. To avoid discarding of data, processing of streams (such as clustering) must be distributed and adapted to environments where information is distributed. In streaming, research does not only focus on designing for general tasks, such as clustering, but also in finding new approaches that fit bests with particular scenarios. As an example, an ad-hoc grouping mechanism turns out to be more adequate than k-means for defense against Distributed Denial of Service (DDoS). This thesis contributes to the data stream mining clustering technique both for centralized and distributed environments. We present a centralized clustering algorithm showing capabilities to discover clusters of high quality in low time and we provide a comparison with existing state of the art solutions. We have worked on a data structure that significantly reduces memory requirements while controlling the error of the clusters statistics. We also provide two distributed clustering protocols. We focus on the analysis of two key features: the impact on the clustering quality when computation is distributed and the requirements for reducing the processing time compared to the centralized solution. Finally, with respect to ad-hoc grouping techniques, we have developed a DDoS detection framework based on clustering.We have characterized the attacks detected and we have evaluated the efficiency and effectiveness of mitigating the attack impact.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Nuestro cerebro contiene cerca de 1014 sinapsis neuronales. Esta enorme cantidad de conexiones proporciona un entorno ideal donde distintos grupos de neuronas se sincronizan transitoriamente para provocar la aparición de funciones cognitivas, como la percepción, el aprendizaje o el pensamiento. Comprender la organización de esta compleja red cerebral en base a datos neurofisiológicos, representa uno de los desafíos más importantes y emocionantes en el campo de la neurociencia. Se han propuesto recientemente varias medidas para evaluar cómo se comunican las diferentes partes del cerebro a diversas escalas (células individuales, columnas corticales, o áreas cerebrales). Podemos clasificarlos, según su simetría, en dos grupos: por una parte, la medidas simétricas, como la correlación, la coherencia o la sincronización de fase, que evalúan la conectividad funcional (FC); mientras que las medidas asimétricas, como la causalidad de Granger o transferencia de entropía, son capaces de detectar la dirección de la interacción, lo que denominamos conectividad efectiva (EC). En la neurociencia moderna ha aumentado el interés por el estudio de las redes funcionales cerebrales, en gran medida debido a la aparición de estos nuevos algoritmos que permiten analizar la interdependencia entre señales temporales, además de la emergente teoría de redes complejas y la introducción de técnicas novedosas, como la magnetoencefalografía (MEG), para registrar datos neurofisiológicos con gran resolución. Sin embargo, nos hallamos ante un campo novedoso que presenta aun varias cuestiones metodológicas sin resolver, algunas de las cuales trataran de abordarse en esta tesis. En primer lugar, el creciente número de aproximaciones para determinar la existencia de FC/EC entre dos o más señales temporales, junto con la complejidad matemática de las herramientas de análisis, hacen deseable organizarlas todas en un paquete software intuitivo y fácil de usar. Aquí presento HERMES (http://hermes.ctb.upm.es), una toolbox en MatlabR, diseñada precisamente con este fin. Creo que esta herramienta será de gran ayuda para todos aquellos investigadores que trabajen en el campo emergente del análisis de conectividad cerebral y supondrá un gran valor para la comunidad científica. La segunda cuestión practica que se aborda es el estudio de la sensibilidad a las fuentes cerebrales profundas a través de dos tipos de sensores MEG: gradiómetros planares y magnetómetros, esta aproximación además se combina con un enfoque metodológico, utilizando dos índices de sincronización de fase: phase locking value (PLV) y phase lag index (PLI), este ultimo menos sensible a efecto la conducción volumen. Por lo tanto, se compara su comportamiento al estudiar las redes cerebrales, obteniendo que magnetómetros y PLV presentan, respectivamente, redes más densamente conectadas que gradiómetros planares y PLI, por los valores artificiales que crea el problema de la conducción de volumen. Sin embargo, cuando se trata de caracterizar redes epilépticas, el PLV ofrece mejores resultados, debido a la gran dispersión de las redes obtenidas con PLI. El análisis de redes complejas ha proporcionado nuevos conceptos que mejoran caracterización de la interacción de sistemas dinámicos. Se considera que una red está compuesta por nodos, que simbolizan sistemas, cuyas interacciones se representan por enlaces, y su comportamiento y topología puede caracterizarse por un elevado número de medidas. Existe evidencia teórica y empírica de que muchas de ellas están fuertemente correlacionadas entre sí. Por lo tanto, se ha conseguido seleccionar un pequeño grupo que caracteriza eficazmente estas redes, y condensa la información redundante. Para el análisis de redes funcionales, la selección de un umbral adecuado para decidir si un determinado valor de conectividad de la matriz de FC es significativo y debe ser incluido para un análisis posterior, se convierte en un paso crucial. En esta tesis, se han obtenido resultados más precisos al utilizar un test de subrogadas, basado en los datos, para evaluar individualmente cada uno de los enlaces, que al establecer a priori un umbral fijo para la densidad de conexiones. Finalmente, todas estas cuestiones se han aplicado al estudio de la epilepsia, caso práctico en el que se analizan las redes funcionales MEG, en estado de reposo, de dos grupos de pacientes epilépticos (generalizada idiopática y focal frontal) en comparación con sujetos control sanos. La epilepsia es uno de los trastornos neurológicos más comunes, con más de 55 millones de afectados en el mundo. Esta enfermedad se caracteriza por la predisposición a generar ataques epilépticos de actividad neuronal anormal y excesiva o bien síncrona, y por tanto, es el escenario perfecto para este tipo de análisis al tiempo que presenta un gran interés tanto desde el punto de vista clínico como de investigación. Los resultados manifiestan alteraciones especificas en la conectividad y un cambio en la topología de las redes en cerebros epilépticos, desplazando la importancia del ‘foco’ a la ‘red’, enfoque que va adquiriendo relevancia en las investigaciones recientes sobre epilepsia. ABSTRACT There are about 1014 neuronal synapses in the human brain. This huge number of connections provides the substrate for neuronal ensembles to become transiently synchronized, producing the emergence of cognitive functions such as perception, learning or thinking. Understanding the complex brain network organization on the basis of neuroimaging data represents one of the most important and exciting challenges for systems neuroscience. Several measures have been recently proposed to evaluate at various scales (single cells, cortical columns, or brain areas) how the different parts of the brain communicate. We can classify them, according to their symmetry, into two groups: symmetric measures, such as correlation, coherence or phase synchronization indexes, evaluate functional connectivity (FC); and on the other hand, the asymmetric ones, such as Granger causality or transfer entropy, are able to detect effective connectivity (EC) revealing the direction of the interaction. In modern neurosciences, the interest in functional brain networks has increased strongly with the onset of new algorithms to study interdependence between time series, the advent of modern complex network theory and the introduction of powerful techniques to record neurophysiological data, such as magnetoencephalography (MEG). However, when analyzing neurophysiological data with this approach several questions arise. In this thesis, I intend to tackle some of the practical open problems in the field. First of all, the increase in the number of time series analysis algorithms to study brain FC/EC, along with their mathematical complexity, creates the necessity of arranging them into a single, unified toolbox that allow neuroscientists, neurophysiologists and researchers from related fields to easily access and make use of them. I developed such a toolbox for this aim, it is named HERMES (http://hermes.ctb.upm.es), and encompasses several of the most common indexes for the assessment of FC and EC running for MatlabR environment. I believe that this toolbox will be very helpful to all the researchers working in the emerging field of brain connectivity analysis and will entail a great value for the scientific community. The second important practical issue tackled in this thesis is the evaluation of the sensitivity to deep brain sources of two different MEG sensors: planar gradiometers and magnetometers, in combination with the related methodological approach, using two phase synchronization indexes: phase locking value (PLV) y phase lag index (PLI), the latter one being less sensitive to volume conduction effect. Thus, I compared their performance when studying brain networks, obtaining that magnetometer sensors and PLV presented higher artificial values as compared with planar gradiometers and PLI respectively. However, when it came to characterize epileptic networks it was the PLV which gives better results, as PLI FC networks where very sparse. Complex network analysis has provided new concepts which improved characterization of interacting dynamical systems. With this background, networks could be considered composed of nodes, symbolizing systems, whose interactions with each other are represented by edges. A growing number of network measures is been applied in network analysis. However, there is theoretical and empirical evidence that many of these indexes are strongly correlated with each other. Therefore, in this thesis I reduced them to a small set, which could more efficiently characterize networks. Within this framework, selecting an appropriate threshold to decide whether a certain connectivity value of the FC matrix is significant and should be included in the network analysis becomes a crucial step, in this thesis, I used the surrogate data tests to make an individual data-driven evaluation of each of the edges significance and confirmed more accurate results than when just setting to a fixed value the density of connections. All these methodologies were applied to the study of epilepsy, analysing resting state MEG functional networks, in two groups of epileptic patients (generalized and focal epilepsy) that were compared to matching control subjects. Epilepsy is one of the most common neurological disorders, with more than 55 million people affected worldwide, characterized by its predisposition to generate epileptic seizures of abnormal excessive or synchronous neuronal activity, and thus, this scenario and analysis, present a great interest from both the clinical and the research perspective. Results revealed specific disruptions in connectivity and network topology and evidenced that networks’ topology is changed in epileptic brains, supporting the shift from ‘focus’ to ‘networks’ which is gaining importance in modern epilepsy research.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Leaf nitrogen and leaf surface area influence the exchange of gases between terrestrial ecosystems and the atmosphere, and play a significant role in the global cycles of carbon, nitrogen and water. The purpose of this study is to use field-based and satellite remote-sensing-based methods to assess leaf nitrogen pools in five diverse European agricultural landscapes located in Denmark, Scotland (United Kingdom), Poland, the Netherlands and Italy. REGFLEC (REGularized canopy reFLECtance) is an advanced image-based inverse canopy radiative transfer modelling system which has shown proficiency for regional mapping of leaf area index (LAI) and leaf chlorophyll (CHLl) using remote sensing data. In this study, high spatial resolution (10–20 m) remote sensing images acquired from the multispectral sensors aboard the SPOT (Satellite For Observation of Earth) satellites were used to assess the capability of REGFLEC for mapping spatial variations in LAI, CHLland the relation to leaf nitrogen (Nl) data in five diverse European agricultural landscapes. REGFLEC is based on physical laws and includes an automatic model parameterization scheme which makes the tool independent of field data for model calibration. In this study, REGFLEC performance was evaluated using LAI measurements and non-destructive measurements (using a SPAD meter) of leaf-scale CHLl and Nl concentrations in 93 fields representing crop- and grasslands of the five landscapes. Furthermore, empirical relationships between field measurements (LAI, CHLl and Nl and five spectral vegetation indices (the Normalized Difference Vegetation Index, the Simple Ratio, the Enhanced Vegetation Index-2, the Green Normalized Difference Vegetation Index, and the green chlorophyll index) were used to assess field data coherence and to serve as a comparison basis for assessing REGFLEC model performance. The field measurements showed strong vertical CHLl gradient profiles in 26% of fields which affected REGFLEC performance as well as the relationships between spectral vegetation indices (SVIs) and field measurements. When the range of surface types increased, the REGFLEC results were in better agreement with field data than the empirical SVI regression models. Selecting only homogeneous canopies with uniform CHLl distributions as reference data for evaluation, REGFLEC was able to explain 69% of LAI observations (rmse = 0.76), 46% of measured canopy chlorophyll contents (rmse = 719 mg m−2) and 51% of measured canopy nitrogen contents (rmse = 2.7 g m−2). Better results were obtained for individual landscapes, except for Italy, where REGFLEC performed poorly due to a lack of dense vegetation canopies at the time of satellite recording. Presence of vegetation is needed to parameterize the REGFLEC model. Combining REGFLEC- and SVI-based model results to minimize errors for a "snap-shot" assessment of total leaf nitrogen pools in the five landscapes, results varied from 0.6 to 4.0 t km−2. Differences in leaf nitrogen pools between landscapes are attributed to seasonal variations, extents of agricultural area, species variations, and spatial variations in nutrient availability. In order to facilitate a substantial assessment of variations in Nl pools and their relation to landscape based nitrogen and carbon cycling processes, time series of satellite data are needed. The upcoming Sentinel-2 satellite mission will provide new multiple narrowband data opportunities at high spatio-temporal resolution which are expected to further improve remote sensing capabilities for mapping LAI, CHLl and Nl.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

La presente Tesis persigue la definición y el desarrollo de un sistema basado en el conocimiento que permita la generación de modelos de líneas de montaje durante la fase conceptual de definición de una aeroestructura aeronáutica. Para ello, se propone la definición de un modelo formal del proceso en concurrencia asociado al diseño de líneas de montaje en la fase conceptual, y de un modelo de la estructura de datos básica para soportar dicho proceso. Ambos modelos sirven de base para el desarrollo de una aplicación de prueba de concepto en el entorno del sistema comercial CAX-PLM CATIA v5. Los modelos de línea generados integran las tres estructuras básicas definidas en el modelo propuesto: producto, procesos y recursos. Los modelos generados son estructuras “de montaje”, basadas en estructuras de producto “de fabricación” a su vez derivadas de estructuras “de diseño”. Cada modelo generado se evalúa en términos de cuatro estimaciones básicas: dimensiones máximas del nodo producto, distancia de transporte y medio a utilizar, tiempo total de ejecución y coste total. La generación de modelos de línea de montaje se realiza en concurrencia con la función diseño de producto, teniendo por tanto la oportunidad de influir en la misma e incluir requerimientos de fabricación y montaje al producto en las primeras fases de su ciclo de vida, lo que proporciona una clara ventaja competitiva. El desarrollo propuesto en esta Tesis permite sentar las bases para realizar desarrollos con objeto de asistir a los diseñadores durante la fase conceptual de generación de diseños de líneas de montaje. La aplicación prototipo desarrollada demuestra la viabilidad de la propuesta conceptual que se realiza en la Tesis. ABSTRACT The current thesis proposes the definition and development of a knowledge-based system to generate aircraft components assembly line models during the conceptual phase of the product life cycle. With this objective, the definition of a formal activity model to represent the design of assembly lines during the conceptual phase is proposed; such model considers the concurrence with the product design process. Associated to the activity model, a data structure model is defined to support such process. Both models are the basis for the development of a proof of concept application within the environment of the commercial CAX-PLM system CATIA v5. The generated assembly line models integrate the three basic structures defined in the proposed model: product, processes and resources. The generated models are “As Prepared” structures based on “As Planned” structures derived from “As Designed” structures. Each generated model is evaluated in terms of four basic estimates: maximum dimensions of the product node, transport distance and transport mean to be used, total execution time and total cost. The assembly line models generation is made in concurrence with the product design function. Therefore, it provides the opportunity to influence on it and allows including manufacturing and assembly requirements early in the product life cycle, which gives a clear competitive advantage. The development proposed in this Thesis allows setting the foundation to carry out further developments with the aim of assisting designers during the conceptual phase of the assembly line design process. The developed prototype application shows the feasibility of the conceptual proposal presented in the Thesis.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In programming languages with dynamic use of memory, such as Java, knowing that a reference variable x points to an acyclic data structure is valuable for the analysis of termination and resource usage (e.g., execution time or memory consumption). For instance, this information guarantees that the depth of the data structure to which x points is greater than the depth of the data structure pointed to by x.f for any field f of x. This, in turn, allows bounding the number of iterations of a loop which traverses the structure by its depth, which is essential in order to prove the termination or infer the resource usage of the loop. The present paper provides an Abstract-Interpretation-based formalization of a static analysis for inferring acyclicity, which works on the reduced product of two abstract domains: reachability, which models the property that the location pointed to by a variable w can be reached by dereferencing another variable v (in this case, v is said to reach w); and cyclicity, modeling the property that v can point to a cyclic data structure. The analysis is proven to be sound and optimal with respect to the chosen abstraction.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper assesses rural vernacular heritage established in a warm temperate climate, with dry, hot summer, in São Vicente e Ventosa (SVV), Alentejo, Portugal, and takes part in a larger investigation intending to create rehabilitation guidelines, with sustainable criteria and integration of recent technologies, to improving indoor comfort, and revert the state of deterioration. To further reach this aim, this paper proposes a four phases methodology: data collection, evaluation, simulation and development; a first survey data analysis, including climate data and the adapted comfort climograph and isopleth diagram, allows an understanding of thermal comfort and main constraints in site, as well as suitable bioclimatic strategies for SVV: high thermal inertia for tempering extreme summer conditions and the considerable temperature amplitudes throughout the year, complementarily night ventilation for passive cooling, small-sized window openings and movable shading systems for solar radiation protection. An efficient behaviour in stabilizing indoor temperature swings is revealed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

El concepto de algoritmo es básico en informática, por lo que es crucial que los alumnos profundicen en él desde el inicio de su formación. Por tanto, contar con una herramienta que guíe a los estudiantes en su aprendizaje puede suponer una gran ayuda en su formación. La mayoría de los autores coinciden en que, para determinar la eficacia de una herramienta de visualización de algoritmos, es esencial cómo se utiliza. Así, los estudiantes que participan activamente en la visualización superan claramente a los que la contemplan de forma pasiva. Por ello, pensamos que uno de los mejores ejercicios para un alumno consiste en simular la ejecución del algoritmo que desea aprender mediante el uso de una herramienta de visualización, i. e. consiste en realizar una simulación visual de dicho algoritmo. La primera parte de esta tesis presenta los resultados de una profunda investigación sobre las características que debe reunir una herramienta de ayuda al aprendizaje de algoritmos y conceptos matemáticos para optimizar su efectividad: el conjunto de especificaciones eMathTeacher, además de un entorno de aprendizaje que integra herramientas que las cumplen: GRAPHs. Hemos estudiado cuáles son las cualidades esenciales para potenciar la eficacia de un sistema e-learning de este tipo. Esto nos ha llevado a la definición del concepto eMathTeacher, que se ha materializado en el conjunto de especificaciones eMathTeacher. Una herramienta e-learning cumple las especificaciones eMathTeacher si actúa como un profesor virtual de matemáticas, i. e. si es una herramienta de autoevaluación que ayuda a los alumnos a aprender de forma activa y autónoma conceptos o algoritmos matemáticos, corrigiendo sus errores y proporcionando pistas para encontrar la respuesta correcta, pero sin dársela explícitamente. En estas herramientas, la simulación del algoritmo no continúa hasta que el usuario introduce la respuesta correcta. Para poder reunir en un único entorno una colección de herramientas que cumplan las especificaciones eMathTeacher hemos creado GRAPHs, un entorno ampliable, basado en simulación visual, diseñado para el aprendizaje activo e independiente de los algoritmos de grafos y creado para que en él se integren simuladores de diferentes algoritmos. Además de las opciones de creación y edición del grafo y la visualización de los cambios producidos en él durante la simulación, el entorno incluye corrección paso a paso, animación del pseudocódigo del algoritmo, preguntas emergentes, manejo de las estructuras de datos del algoritmo y creación de un log de interacción en XML. Otro problema que nos planteamos en este trabajo, por su importancia en el proceso de aprendizaje, es el de la evaluación formativa. El uso de ciertos entornos e-learning genera gran cantidad de datos que deben ser interpretados para llegar a una evaluación que no se limite a un recuento de errores. Esto incluye el establecimiento de relaciones entre los datos disponibles y la generación de descripciones lingüísticas que informen al alumno sobre la evolución de su aprendizaje. Hasta ahora sólo un experto humano era capaz de hacer este tipo de evaluación. Nuestro objetivo ha sido crear un modelo computacional que simule el razonamiento del profesor y genere un informe sobre la evolución del aprendizaje que especifique el nivel de logro de cada uno de los objetivos definidos por el profesor. Como resultado del trabajo realizado, la segunda parte de esta tesis presenta el modelo granular lingüístico de la evaluación del aprendizaje, capaz de modelizar la evaluación y generar automáticamente informes de evaluación formativa. Este modelo es una particularización del modelo granular lingüístico de un fenómeno (GLMP), en cuyo desarrollo y formalización colaboramos, basado en la lógica borrosa y en la teoría computacional de las percepciones. Esta técnica, que utiliza sistemas de inferencia basados en reglas lingüísticas y es capaz de implementar criterios de evaluación complejos, se ha aplicado a dos casos: la evaluación, basada en criterios, de logs de interacción generados por GRAPHs y de cuestionarios de Moodle. Como consecuencia, se han implementado, probado y utilizado en el aula sistemas expertos que evalúan ambos tipos de ejercicios. Además de la calificación numérica, los sistemas generan informes de evaluación, en lenguaje natural, sobre los niveles de competencia alcanzados, usando sólo datos objetivos de respuestas correctas e incorrectas. Además, se han desarrollado dos aplicaciones capaces de ser configuradas para implementar los sistemas expertos mencionados. Una procesa los archivos producidos por GRAPHs y la otra, integrable en Moodle, evalúa basándose en los resultados de los cuestionarios. ABSTRACT The concept of algorithm is one of the core subjects in computer science. It is extremely important, then, for students to get a good grasp of this concept from the very start of their training. In this respect, having a tool that helps and shepherds students through the process of learning this concept can make a huge difference to their instruction. Much has been written about how helpful algorithm visualization tools can be. Most authors agree that the most important part of the learning process is how students use the visualization tool. Learners who are actively involved in visualization consistently outperform other learners who view the algorithms passively. Therefore we think that one of the best exercises to learn an algorithm is for the user to simulate the algorithm execution while using a visualization tool, thus performing a visual algorithm simulation. The first part of this thesis presents the eMathTeacher set of requirements together with an eMathTeacher-compliant tool called GRAPHs. For some years, we have been developing a theory about what the key features of an effective e-learning system for teaching mathematical concepts and algorithms are. This led to the definition of eMathTeacher concept, which has materialized in the eMathTeacher set of requirements. An e-learning tool is eMathTeacher compliant if it works as a virtual math trainer. In other words, it has to be an on-line self-assessment tool that helps students to actively and autonomously learn math concepts or algorithms, correcting their mistakes and providing them with clues to find the right answer. In an eMathTeacher-compliant tool, algorithm simulation does not continue until the user enters the correct answer. GRAPHs is an extendible environment designed for active and independent visual simulation-based learning of graph algorithms, set up to integrate tools to help the user simulate the execution of different algorithms. Apart from the options of creating and editing the graph, and visualizing the changes made to the graph during simulation, the environment also includes step-by-step correction, algorithm pseudo-code animation, pop-up questions, data structure handling and XML-based interaction log creation features. On the other hand, assessment is a key part of any learning process. Through the use of e-learning environments huge amounts of data can be output about this process. Nevertheless, this information has to be interpreted and represented in a practical way to arrive at a sound assessment that is not confined to merely counting mistakes. This includes establishing relationships between the available data and also providing instructive linguistic descriptions about learning evolution. Additionally, formative assessment should specify the level of attainment of the learning goals defined by the instructor. Till now, only human experts were capable of making such assessments. While facing this problem, our goal has been to create a computational model that simulates the instructor’s reasoning and generates an enlightening learning evolution report in natural language. The second part of this thesis presents the granular linguistic model of learning assessment to model the assessment of the learning process and implement the automated generation of a formative assessment report. The model is a particularization of the granular linguistic model of a phenomenon (GLMP) paradigm, based on fuzzy logic and the computational theory of perceptions, to the assessment phenomenon. This technique, useful for implementing complex assessment criteria using inference systems based on linguistic rules, has been applied to two particular cases: the assessment of the interaction logs generated by GRAPHs and the criterion-based assessment of Moodle quizzes. As a consequence, several expert systems to assess different algorithm simulations and Moodle quizzes have been implemented, tested and used in the classroom. Apart from the grade, the designed expert systems also generate natural language progress reports on the achieved proficiency level, based exclusively on the objective data gathered from correct and incorrect responses. In addition, two applications, capable of being configured to implement the expert systems, have been developed. One is geared up to process the files output by GRAPHs and the other one is a Moodle plug-in set up to perform the assessment based on the quizzes results.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Replication Data Management (RDM) aims at enabling the use of data collections from several iterations of an experiment. However, there are several major challenges to RDM from integrating data models and data from empirical study infrastructures that were not designed to cooperate, e.g., data model variation of local data sources. [Objective] In this paper we analyze RDM needs and evaluate conceptual RDM approaches to support replication researchers. [Method] We adapted the ATAM evaluation process to (a) analyze RDM use cases and needs of empirical replication study research groups and (b) compare three conceptual approaches to address these RDM needs: central data repositories with a fixed data model, heterogeneous local repositories, and an empirical ecosystem. [Results] While the central and local approaches have major issues that are hard to resolve in practice, the empirical ecosystem allows bridging current gaps in RDM from heterogeneous data sources. [Conclusions] The empirical ecosystem approach should be explored in diverse empirical environments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Una apropiada evaluación de los márgenes de seguridad de una instalación nuclear, por ejemplo, una central nuclear, tiene en cuenta todas las incertidumbres que afectan a los cálculos de diseño, funcionanmiento y respuesta ante accidentes de dicha instalación. Una fuente de incertidumbre son los datos nucleares, que afectan a los cálculos neutrónicos, de quemado de combustible o activación de materiales. Estos cálculos permiten la evaluación de las funciones respuesta esenciales para el funcionamiento correcto durante operación, y también durante accidente. Ejemplos de esas respuestas son el factor de multiplicación neutrónica o el calor residual después del disparo del reactor. Por tanto, es necesario evaluar el impacto de dichas incertidumbres en estos cálculos. Para poder realizar los cálculos de propagación de incertidumbres, es necesario implementar metodologías que sean capaces de evaluar el impacto de las incertidumbres de estos datos nucleares. Pero también es necesario conocer los datos de incertidumbres disponibles para ser capaces de manejarlos. Actualmente, se están invirtiendo grandes esfuerzos en mejorar la capacidad de analizar, manejar y producir datos de incertidumbres, en especial para isótopos importantes en reactores avanzados. A su vez, nuevos programas/códigos están siendo desarrollados e implementados para poder usar dichos datos y analizar su impacto. Todos estos puntos son parte de los objetivos del proyecto europeo ANDES, el cual ha dado el marco de trabajo para el desarrollo de esta tesis doctoral. Por tanto, primero se ha llevado a cabo una revisión del estado del arte de los datos nucleares y sus incertidumbres, centrándose en los tres tipos de datos: de decaimiento, de rendimientos de fisión y de secciones eficaces. A su vez, se ha realizado una revisión del estado del arte de las metodologías para la propagación de incertidumbre de estos datos nucleares. Dentro del Departamento de Ingeniería Nuclear (DIN) se propuso una metodología para la propagación de incertidumbres en cálculos de evolución isotópica, el Método Híbrido. Esta metodología se ha tomado como punto de partida para esta tesis, implementando y desarrollando dicha metodología, así como extendiendo sus capacidades. Se han analizado sus ventajas, inconvenientes y limitaciones. El Método Híbrido se utiliza en conjunto con el código de evolución isotópica ACAB, y se basa en el muestreo por Monte Carlo de los datos nucleares con incertidumbre. En esta metodología, se presentan diferentes aproximaciones según la estructura de grupos de energía de las secciones eficaces: en un grupo, en un grupo con muestreo correlacionado y en multigrupos. Se han desarrollado diferentes secuencias para usar distintas librerías de datos nucleares almacenadas en diferentes formatos: ENDF-6 (para las librerías evaluadas), COVERX (para las librerías en multigrupos de SCALE) y EAF (para las librerías de activación). Gracias a la revisión del estado del arte de los datos nucleares de los rendimientos de fisión se ha identificado la falta de una información sobre sus incertidumbres, en concreto, de matrices de covarianza completas. Además, visto el renovado interés por parte de la comunidad internacional, a través del grupo de trabajo internacional de cooperación para evaluación de datos nucleares (WPEC) dedicado a la evaluación de las necesidades de mejora de datos nucleares mediante el subgrupo 37 (SG37), se ha llevado a cabo una revisión de las metodologías para generar datos de covarianza. Se ha seleccionando la actualización Bayesiana/GLS para su implementación, y de esta forma, dar una respuesta a dicha falta de matrices completas para rendimientos de fisión. Una vez que el Método Híbrido ha sido implementado, desarrollado y extendido, junto con la capacidad de generar matrices de covarianza completas para los rendimientos de fisión, se han estudiado diferentes aplicaciones nucleares. Primero, se estudia el calor residual tras un pulso de fisión, debido a su importancia para cualquier evento después de la parada/disparo del reactor. Además, se trata de un ejercicio claro para ver la importancia de las incertidumbres de datos de decaimiento y de rendimientos de fisión junto con las nuevas matrices completas de covarianza. Se han estudiado dos ciclos de combustible de reactores avanzados: el de la instalación europea para transmutación industrial (EFIT) y el del reactor rápido de sodio europeo (ESFR), en los cuales se han analizado el impacto de las incertidumbres de los datos nucleares en la composición isotópica, calor residual y radiotoxicidad. Se han utilizado diferentes librerías de datos nucleares en los estudios antreriores, comparando de esta forma el impacto de sus incertidumbres. A su vez, mediante dichos estudios, se han comparando las distintas aproximaciones del Método Híbrido y otras metodologías para la porpagación de incertidumbres de datos nucleares: Total Monte Carlo (TMC), desarrollada en NRG por A.J. Koning y D. Rochman, y NUDUNA, desarrollada en AREVA GmbH por O. Buss y A. Hoefer. Estas comparaciones demostrarán las ventajas del Método Híbrido, además de revelar sus limitaciones y su rango de aplicación. ABSTRACT For an adequate assessment of safety margins of nuclear facilities, e.g. nuclear power plants, it is necessary to consider all possible uncertainties that affect their design, performance and possible accidents. Nuclear data are a source of uncertainty that are involved in neutronics, fuel depletion and activation calculations. These calculations can predict critical response functions during operation and in the event of accident, such as decay heat and neutron multiplication factor. Thus, the impact of nuclear data uncertainties on these response functions needs to be addressed for a proper evaluation of the safety margins. Methodologies for performing uncertainty propagation calculations need to be implemented in order to analyse the impact of nuclear data uncertainties. Nevertheless, it is necessary to understand the current status of nuclear data and their uncertainties, in order to be able to handle this type of data. Great eórts are underway to enhance the European capability to analyse/process/produce covariance data, especially for isotopes which are of importance for advanced reactors. At the same time, new methodologies/codes are being developed and implemented for using and evaluating the impact of uncertainty data. These were the objectives of the European ANDES (Accurate Nuclear Data for nuclear Energy Sustainability) project, which provided a framework for the development of this PhD Thesis. Accordingly, first a review of the state-of-the-art of nuclear data and their uncertainties is conducted, focusing on the three kinds of data: decay, fission yields and cross sections. A review of the current methodologies for propagating nuclear data uncertainties is also performed. The Nuclear Engineering Department of UPM has proposed a methodology for propagating uncertainties in depletion calculations, the Hybrid Method, which has been taken as the starting point of this thesis. This methodology has been implemented, developed and extended, and its advantages, drawbacks and limitations have been analysed. It is used in conjunction with the ACAB depletion code, and is based on Monte Carlo sampling of variables with uncertainties. Different approaches are presented depending on cross section energy-structure: one-group, one-group with correlated sampling and multi-group. Differences and applicability criteria are presented. Sequences have been developed for using different nuclear data libraries in different storing-formats: ENDF-6 (for evaluated libraries) and COVERX (for multi-group libraries of SCALE), as well as EAF format (for activation libraries). A revision of the state-of-the-art of fission yield data shows inconsistencies in uncertainty data, specifically with regard to complete covariance matrices. Furthermore, the international community has expressed a renewed interest in the issue through the Working Party on International Nuclear Data Evaluation Co-operation (WPEC) with the Subgroup (SG37), which is dedicated to assessing the need to have complete nuclear data. This gives rise to this review of the state-of-the-art of methodologies for generating covariance data for fission yields. Bayesian/generalised least square (GLS) updating sequence has been selected and implemented to answer to this need. Once the Hybrid Method has been implemented, developed and extended, along with fission yield covariance generation capability, different applications are studied. The Fission Pulse Decay Heat problem is tackled first because of its importance during events after shutdown and because it is a clean exercise for showing the impact and importance of decay and fission yield data uncertainties in conjunction with the new covariance data. Two fuel cycles of advanced reactors are studied: the European Facility for Industrial Transmutation (EFIT) and the European Sodium Fast Reactor (ESFR), and response function uncertainties such as isotopic composition, decay heat and radiotoxicity are addressed. Different nuclear data libraries are used and compared. These applications serve as frameworks for comparing the different approaches of the Hybrid Method, and also for comparing with other methodologies: Total Monte Carlo (TMC), developed at NRG by A.J. Koning and D. Rochman, and NUDUNA, developed at AREVA GmbH by O. Buss and A. Hoefer. These comparisons reveal the advantages, limitations and the range of application of the Hybrid Method.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Probabilistic graphical models are a huge research field in artificial intelligence nowadays. The scope of this work is the study of directed graphical models for the representation of discrete distributions. Two of the main research topics related to this area focus on performing inference over graphical models and on learning graphical models from data. Traditionally, the inference process and the learning process have been treated separately, but given that the learned models structure marks the inference complexity, this kind of strategies will sometimes produce very inefficient models. With the purpose of learning thinner models, in this master thesis we propose a new model for the representation of network polynomials, which we call polynomial trees. Polynomial trees are a complementary representation for Bayesian networks that allows an efficient evaluation of the inference complexity and provides a framework for exact inference. We also propose a set of methods for the incremental compilation of polynomial trees and an algorithm for learning polynomial trees from data using a greedy score+search method that includes the inference complexity as a penalization in the scoring function.