921 resultados para Processing wikipedia data
Resumo:
The aim of TinyML is to bring the capability of Machine Learning to ultra-low-power devices, typically under a milliwatt, and with this it breaks the traditional power barrier that prevents the widely distributed machine intelligence. TinyML allows greater reactivity and privacy by conducting inference on the computer and near-sensor while avoiding the energy cost associated with wireless communication, which is far higher at this scale than that of computing. In addition, TinyML’s efficiency makes a class of smart, battery-powered, always-on applications that can revolutionize the collection and processing of data in real time. This emerging field, which is the end of a lot of innovation, is ready to speed up its growth in the coming years. In this thesis, we deploy three model on a microcontroller. For the model, datasets are retrieved from an online repository and are preprocessed as per our requirement. The model is then trained on the split of preprocessed data at its best to get the most accuracy out of it. Later the trained model is converted to C language to make it possible to deploy on the microcontroller. Finally, we take step towards incorporating the model into the microcontroller by implementing and evaluating an interface for the user to utilize the microcontroller’s sensors. In our thesis, we will have 4 chapters. The first will give us an introduction of TinyML. The second chapter will help setup the TinyML Environment. The third chapter will be about a major use of TinyML in Wake Word Detection. The final chapter will deal with Gesture Recognition in TinyML.
Resumo:
The aim of this dissertation is to describe the methodologies required to design, operate, and validate the performance of ground stations dedicated to near and deep space tracking, as well as the models developed to process the signals acquired, from raw data to the output parameters of the orbit determination of spacecraft. This work is framed in the context of lunar and planetary exploration missions by addressing the challenges in receiving and processing radiometric data for radio science investigations and navigation purposes. These challenges include the designing of an appropriate back-end to read, convert and store the antenna voltages, the definition of appropriate methodologies for pre-processing, calibration, and estimation of radiometric data for the extraction of information on the spacecraft state, and the definition and integration of accurate models of the spacecraft dynamics to evaluate the goodness of the recorded signals. Additionally, the experimental design of acquisition strategies to perform direct comparison between ground stations is described and discussed. In particular, the evaluation of the differential performance between stations requires the designing of a dedicated tracking campaign to maximize the overlap of the recorded datasets at the receivers, making it possible to correlate the received signals and isolate the contribution of the ground segment to the noise in the single link. Finally, in support of the methodologies and models presented, results from the validation and design work performed on the Deep Space Network (DSN) affiliated nodes DSS-69 and DSS-17 will also be reported.
Resumo:
Background: WGS is increasingly used as a first-line diagnostic test for patients with rare genetic diseases such as neurodevelopmental disorders (NDD). Clinical applications require a robust infrastructure to support processing, storage and analysis of WGS data. The identification and interpretation of SVs from WGS data also needs to be improved. Finally, there is a need for a prioritization system that enables downstream clinical analysis and facilitates data interpretation. Here, we present the results of a clinical application of WGS in a cohort of patients with NDD. Methods: We developed highly portable workflows for processing WGS data, including alignment, quality control, and variant calling of SNVs and SVs. A benchmark analysis of state-of-the-art SV detection tools was performed to select the most accurate combination for SV calling. A gene-based prioritization system was also implemented to support variant interpretation. Results: Using a benchmark analysis, we selected the most accurate combination of tools to improve SV detection from WGS data and build a dedicated pipeline. Our workflows were used to process WGS data from 77 NDD patient-parent families. The prioritization system supported downstream analysis and enabled molecular diagnosis in 32% of patients, 25% of which were SVs and suggested a potential diagnosis in 20% of patients, requiring further investigation to achieve diagnostic certainty. Conclusion: Our data suggest that the integration of SNVs and SVs is a main factor that increases diagnostic yield by WGS and show that the adoption of a dedicated pipeline improves the process of variant detection and interpretation.
Resumo:
L'avanzamento nel campo della long document summarization dipende interamente dalla disponibilità di dataset pubblici di alta qualità e con testi di lunghezza considerevole. Risulta pertanto problematico il fatto che tali dataset risultino spesso solo in lingua inglese, comportandone una limitazione notevole se ci si rivolge a linguaggi le cui risorse sono limitate. A tal scopo, si propone LAWSU-IT, un nuovo dataset giudiziario per long document summarization italiana. LAWSU-IT è il primo dataset italiano di summarization ad avere documenti di grandi dimensioni e a trattare il dominio giudiziario, ed è stato costruito attuando procedure di cleaning dei dati e selezione mirata delle istanze, con lo scopo di ottenere un dataset di long document summarization di alta qualità. Inoltre, sono proposte molteplici baseline sperimentali di natura estrattiva e astrattiva con modelli stato dell'arte e approcci di segmentazione del testo. Si spera che tale risultato possa portare a ulteriori ricerche e sviluppi nell'ambito della long document summarization italiana.
Resumo:
In the metal industry, and more specifically in the forging one, scrap material is a crucial issue and reducing it would be an important goal to reach. Not only would this help the companies to be more environmentally friendly and more sustainable, but it also would reduce the use of energy and lower costs. At the same time, the techniques for Industry 4.0 and the advancements in Artificial Intelligence (AI), especially in the field of Deep Reinforcement Learning (DRL), may have an important role in helping to achieve this objective. This document presents the thesis work, a contribution to the SmartForge project, that was performed during a semester abroad at Karlstad University (Sweden). This project aims at solving the aforementioned problem with a business case of the company Bharat Forge Kilsta, located in Karlskoga (Sweden). The thesis work includes the design and later development of an event-driven architecture with microservices, to support the processing of data coming from sensors set up in the company's industrial plant, and eventually the implementation of an algorithm with DRL techniques to control the electrical power to use in it.
Resumo:
Due to the imprecise nature of biological experiments, biological data is often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contaminations in laboratorial samples. It is the case of gene expression data, where the equipments and tools currently used frequently produce noisy biological data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. This evaluation analyzes the effectiveness of the techniques investigated in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers over the pre-processed data.
Resumo:
Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.
Resumo:
The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcome this problem is to preserve spatial locality in task decomposition. We show in this paper that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
Resumo:
Network control systems (NCSs) are spatially distributed systems in which the communication between sensors, actuators and controllers occurs through a shared band-limited digital communication network. However, the use of a shared communication network, in contrast to using several dedicated independent connections, introduces new challenges which are even more acute in large scale and dense networked control systems. In this paper we investigate a recently introduced technique of gathering information from a dense sensor network to be used in networked control applications. Obtaining efficiently an approximate interpolation of the sensed data is exploited as offering a good tradeoff between accuracy in the measurement of the input signals and the delay to the actuation. These are important aspects to take into account for the quality of control. We introduce a variation to the state-of-the-art algorithms which we prove to perform relatively better because it takes into account the changes over time of the input signal within the process of obtaining an approximate interpolation.
Resumo:
Cooperating objects (COs) is a recently coined term used to signify the convergence of classical embedded computer systems, wireless sensor networks and robotics and control. We present essential elements of a reference architecture for scalable data processing for the CO paradigm.
Resumo:
Beyond the classical statistical approaches (determination of basic statistics, regression analysis, ANOVA, etc.) a new set of applications of different statistical techniques has increasingly gained relevance in the analysis, processing and interpretation of data concerning the characteristics of forest soils. This is possible to be seen in some of the recent publications in the context of Multivariate Statistics. These new methods require additional care that is not always included or refered in some approaches. In the particular case of geostatistical data applications it is necessary, besides to geo-reference all the data acquisition, to collect the samples in regular grids and in sufficient quantity so that the variograms can reflect the spatial distribution of soil properties in a representative manner. In the case of the great majority of Multivariate Statistics techniques (Principal Component Analysis, Correspondence Analysis, Cluster Analysis, etc.) despite the fact they do not require in most cases the assumption of normal distribution, they however need a proper and rigorous strategy for its utilization. In this work, some reflections about these methodologies and, in particular, about the main constraints that often occur during the information collecting process and about the various linking possibilities of these different techniques will be presented. At the end, illustrations of some particular cases of the applications of these statistical methods will also be presented.
Resumo:
Trabalho apresentado no âmbito do Mestrado em Engenharia Informática, como requisito parcial para obtenção do grau de Mestre em Engenharia Informática.
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
Los eventos transitorios únicos analógicos (ASET, Analog Single Event Transient) se producen debido a la interacción de un ión pesado o un protón de alta energía con un dispositivo sensible de un circuito analógico. La interacción del ión con un transistor bipolar o de efecto de campo MOS induce pares electrón-hueco que provocan picos que pueden propagarse a la salida del componente analógico provocando transitorios que pueden inducir fallas en el nivel sistema. Los problemas más graves debido a este tipo de fenómeno se dan en el medioambiente espacial, muy rico en iones pesados. Casos típicos los constituyen las computadoras de a bordo de satélites y otros artefactos espaciales. Sin embargo, y debido a la continua contracción de dimensiones de los transistores (que trae aparejado un aumento de sensibilidad), este fenómeno ha comenzado a observarse a nivel del mar, provocado fundamentalmente por el impacto de neutrones atmosféricos. Estos efectos pueden provocar severos problemas a los sistemas informáticos con interfaces analógicas desde las que obtienen datos para el procesamiento y se han convertido en uno de los problemas más graves a los que tienen que hacer frente los diseñadores de sistemas de alta escala de integración. Casos típicos son los Sistemas en Chip que incluyen módulos de procesamiento de altas prestaciones como las interfaces analógicas.El proyecto persigue como objetivo general estudiar la susceptibilidad de sistemas informáticos a ASETs en sus secciones analógicas, proponiendo estrategias para la mitigación de los errores.Como objetivos específicos se pretende: -Proponer nuevos modelos de ASETs basados en simulaciones en el nivel dispositivo y resueltas por el método de elementos finitos.-Utilizar los modelos para identificar las secciones más propensas a producir errores y consecuentemente para ser candidatos a la aplicación de técnicas de endurecimiento a radiaciones.-Utilizar estos modelos para estudiar la naturaleza de los errores producidos en sistemas de procesamiento de datos.-Proponer soluciones novedosas para la mitigación de estos efectos en los mismos circuitos analógicos evitando su propagación a las secciones digitales.-Proponer soluciones para la mitigación de los efectos en el nivel sistema.Para llevar a cabo el proyecto se plantea un procedimiento ascendente para las investigaciones a realizar, comenzando por descripciones en el nivel físico para posteriormente aumentar el nivel de abstracción en el que se encuentra modelado el circuito. Se propone el modelado físico de los dispositivos MOS y su resolución mediante el Método de Elementos Finitos. La inyección de cargas en las zonas sensibles de los modelos permitirá determinar los perfiles de los pulsos de corriente que deben inyectarse en el nivel circuito para emular estos efectos. Estos procedimientos se realizarán para los distintos bloques constructivos de las interfaces analógicas, proponiendo estrategias de mitigación de errores en diferentes niveles.Los resultados esperados del presente proyecto incluyen hardware para detección de errores y tolerancia a este tipo de eventos que permitan aumentar la confiabilidad de sistemas de tratamiento de la información, así como también nuevos datos referentes a efectos de la radiación en semiconductores, nuevos modelos de fallas transitorias que permitan una simulación de estos eventos en el nivel circuito y la determinación de zonas sensibles de interfaces analógicas típicas que deben ser endurecidas para radiación.
Resumo:
There is described data processing at the flaw detector with combined multisectional eddy-current transducer and heterofrequency magnetic field. The application of this method for detecting flaws in rods and pipes under the conditions of significant transverse displacements is described.