32 results for Collection of Network Data
at Universidad Politécnica de Madrid
Abstract:
Complex networks have been extensively used in the last decade to characterize and analyze complex systems, and they have recently been proposed as a novel instrument for the analysis of spectra extracted from biological samples. Yet the high number of measurements composing spectra, and the consequent high computational cost, make a direct network analysis unfeasible. We here present a comparative analysis of three customary feature selection algorithms, including the binning of spectral data and the use of information theory metrics. Such algorithms are compared by assessing the score obtained in a classification task, where healthy subjects and people suffering from different types of cancer should be discriminated. Results indicate that a feature selection strategy based on Mutual Information outperforms the more classical data binning, while allowing a reduction of the dimensionality of the data set by two orders of magnitude.
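As a rough illustration of what a Mutual Information based feature selection could look like, the sketch below scores each spectral channel by the mutual information between its discretized intensity and the class label, then ranks the channels. It is not the authors' implementation: the function names, the equal-width discretization and the default of 4 bins are assumptions made only for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def discretize(values, n_bins):
    """Equal-width discretization (an assumption; other schemes are possible)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def rank_features(samples, labels, n_bins=4):
    """Return (score, channel index) pairs, most informative channel first.

    samples: list of spectra, each a list of intensities of equal length.
    labels:  one class label per spectrum (e.g. healthy / cancer type).
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        column = [spectrum[j] for spectrum in samples]
        score = mutual_information(discretize(column, n_bins), labels)
        scores.append((score, j))
    return sorted(scores, reverse=True)
```

Keeping only the top-ranked channels from `rank_features` is one way to reduce dimensionality before any network is built; how many channels to keep is a separate choice that this sketch does not address.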
Abstract:
This work studied the combined use of gliadins and SSRs to analyse inter- and intra-accession variability of the Spanish collection of cultivated einkorn (Triticum monococcum L. ssp. monococcum) maintained at the CRF-INIA. In general, gliadin loci presented higher discrimination power than SSRs, reflecting the high variability of the gliadins. The loci on chromosome 6A were the most polymorphic, with similar PIC values for both marker systems, showing that these markers are very useful for genetic variability studies in wheat. The gliadin results indicated that the Spanish einkorn collection possessed high genetic diversity, with large differentiation between varieties and small differentiation within them. Some associations between gliadin alleles and geographical and agro-morphological data were found. Agro-morphological relations were also observed in the clusters of the SSR dendrogram. A high concordance was found between gliadins and SSRs for genotype identification. In addition, both systems provide complementary information to resolve the different cases of intra-accession variability not detected at the agro-morphological level, and to identify separately all the genotypes analysed. The combined use of both genetic markers is an excellent tool for genetic resource evaluation in addition to agro-morphological evaluation.
Abstract:
An important competence of human data analysts is to interpret and explain the meaning of the results of data analysis to end-users. However, existing automatic solutions for intelligent data analysis provide limited help to interpret and communicate information to non-expert users. In this paper we present a general approach to generating explanatory descriptions about the meaning of quantitative sensor data. We propose a type of web application: a virtual newspaper with automatically generated news stories that describe the meaning of sensor data. This solution integrates a variety of techniques from intelligent data analysis into a web-based multimedia presentation system. We validated our approach on a real-world problem and demonstrated its generality using data sets from several domains. Our experience shows that this solution can facilitate the use of sensor data by general users and, therefore, can increase the utility of sensor network infrastructures.
Abstract:
We study the notion of approximate entropy within the framework of network theory. Approximate entropy is an uncertainty measure originally proposed in the context of dynamical systems and time series. We first define a purely structural entropy obtained by computing the approximate entropy of the so-called slide sequence. This is a surrogate of the degree sequence and it is suggested by the frequency partition of a graph. We examine this quantity for standard scale-free and Erdős-Rényi networks. By using classical results of Pincus, we show that our entropy measure often converges with network size to a certain binary Shannon entropy. As a second step, with specific attention to networks generated by dynamical processes, we investigate the approximate entropy of horizontal visibility graphs. Visibility graphs allow us to naturally associate with a network the notion of temporal correlations, therefore providing the measure a dynamical garment. We show that approximate entropy distinguishes visibility graphs generated by processes with different complexity. The result probes these networks to a greater extent for the study of dynamical systems. Applications to certain biological data arising in cancer genomics are finally considered in the light of both approaches.
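For readers unfamiliar with the two ingredients named in this abstract, the sketch below shows the standard approximate entropy of Pincus applied to a plain numeric sequence, and the degree sequence of a horizontal visibility graph built from a time series. It does not reproduce the paper's slide sequence, which the abstract does not define; the function names and the default values of `m` and `r` are assumptions made for the example.

```python
import math


def _phi(series, m, r):
    """Mean log-frequency of length-m templates matching within tolerance r."""
    n = len(series) - m + 1
    templates = [series[i:i + m] for i in range(n)]
    total = 0.0
    for a in templates:
        # Max-norm distance between templates; self-matches are counted,
        # as in the original definition of approximate entropy.
        matches = sum(
            1 for b in templates
            if max(abs(x - y) for x, y in zip(a, b)) <= r
        )
        total += math.log(matches / n)
    return total / n


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) = phi_m(r) - phi_{m+1}(r); needs len(series) > m + 1."""
    return _phi(series, m, r) - _phi(series, m + 1, r)


def hvg_degrees(series):
    """Degree sequence of the horizontal visibility graph of a time series.

    Points i < j are linked when every value strictly between them is
    lower than both series[i] and series[j].
    """
    n = len(series)
    degree = [0] * n
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if min(series[i], series[j]) > highest_between:
                degree[i] += 1
                degree[j] += 1
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # nothing further to the right is visible from i
    return degree
```

One way to combine the two, under these assumptions, is `approximate_entropy(hvg_degrees(series), m=2, r=0.5)`: a regular signal gives a value near zero, while a more irregular one gives a larger value.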
Abstract:
PAMELA (Phased Array Monitoring for Enhanced Life Assessment) SHM™ System is an integrated embedded system based on ultrasonic guided waves, consisting of several electronic devices and one system manager controller. The data collected by all PAMELA devices in the system must be transmitted to the controller, which is responsible for carrying out the advanced signal processing to obtain SHM maps. PAMELA devices consist of hardware based on a Virtex 5 FPGA with a PowerPC 440 running an embedded Linux distribution. Therefore, in addition to performing tests and transmitting the collected data to the controller, PAMELA devices are capable of local data processing or pre-processing (reduction, normalization, pattern recognition, feature extraction, etc.). Local data processing decreases the data traffic over the network and reduces the CPU load of the external computer. PAMELA devices can even run autonomously, performing scheduled tests and communicating with the controller only when structural damage is detected or when programmed to do so. Each PAMELA device integrates a software management application (SMA) that allows developers to download their own algorithm code and add the new data processing algorithm to the device. The SMA is developed in a virtual machine with an Ubuntu Linux distribution that includes all the software tools needed for the entire development cycle. Eclipse IDE (Integrated Development Environment) is used to develop the SMA project and to write the code of each data processing algorithm. This paper presents the developed software architecture and describes the necessary steps to add new data processing algorithms to the SMA in order to increase the processing capabilities of PAMELA devices. An example of basic damage index estimation using the delay-and-sum algorithm is provided.
Abstract:
Researchers in ecology commonly use multivariate analyses (e.g. redundancy analysis, canonical correspondence analysis, Mantel correlation, multivariate analysis of variance) to interpret patterns in biological data and relate these patterns to environmental predictors. There has been, however, little recognition of the errors associated with biological data and the influence that these may have on predictions derived from ecological hypotheses. We present a permutational method that assesses the effects of taxonomic uncertainty on the multivariate analyses typically used in the analysis of ecological data. The procedure is based on iterative randomizations that randomly re-assign unidentified species in each site to any of the other species found in the remaining sites. After each re-assignment of species identities, the multivariate method in question is run and a parameter of interest is calculated. Consequently, one can estimate a range of plausible values for the parameter of interest under different scenarios of re-assigned species identities. We demonstrate the use of our approach in the calculation of two parameters with an example involving tropical tree species from western Amazonia: 1) the Mantel correlation between compositional similarity and environmental distances between pairs of sites; and 2) the variance explained by environmental predictors in redundancy analysis (RDA). We also investigated the effects of increasing taxonomic uncertainty (i.e. number of unidentified species), and of the taxonomic resolution at which morphospecies are determined (genus-resolution, family-resolution, or fully undetermined species), on the uncertainty range of these parameters. To achieve this, we performed simulations on a tree dataset from southern Mexico by randomly selecting a portion of the species contained in the dataset and classifying them as unidentified at each level of decreasing taxonomic resolution.
An analysis of covariance showed that both taxonomic uncertainty and resolution significantly influence the uncertainty range of the resulting parameters. Increasing taxonomic uncertainty expands the uncertainty of the parameters estimated both in the Mantel test and in RDA. The effects of increasing taxonomic resolution, however, are not as evident. The method presented in this study improves on traditional approaches to studying compositional change in ecological communities by accounting for some of the uncertainty inherent to biological data. We hope that this approach can be routinely used to estimate any parameter of interest obtained from compositional data tables when faced with taxonomic uncertainty.
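The re-assignment loop described in this abstract can be sketched roughly as follows. This is only an illustration under simplifying assumptions: sites are presence/absence sets of species names, the candidate pool for a site is taken to be the identified species recorded at the other sites, and the "parameter of interest" is the mean pairwise Jaccard similarity, used here as a simple stand-in for the Mantel correlation or RDA statistics of the paper. All function names are hypothetical.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two species sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity(sites):
    """Stand-in parameter of interest: mean pairwise Jaccard similarity.

    Requires at least two sites.
    """
    pairs = list(combinations(sites, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def reassign(sites, unidentified, rng):
    """Replace each unidentified species with a random identified species
    drawn from the other sites (the pool definition is an assumption)."""
    new_sites = []
    for idx, site in enumerate(sites):
        others = set().union(*(s for k, s in enumerate(sites) if k != idx))
        pool = sorted(others - unidentified)
        resolved = set()
        for species in sorted(site):
            if species in unidentified and pool:
                resolved.add(rng.choice(pool))
            else:
                resolved.add(species)
        new_sites.append(resolved)
    return new_sites


def plausible_range(sites, unidentified, n_iter=200, seed=0):
    """Range of the parameter across random re-assignments."""
    rng = random.Random(seed)
    values = [
        mean_similarity(reassign(sites, unidentified, rng))
        for _ in range(n_iter)
    ]
    return min(values), max(values)
```

With no unidentified species the range collapses to a single value; as more species are flagged as unidentified, the interval returned by `plausible_range` can only be expected to widen, which is the qualitative behaviour the abstract reports.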
Abstract:
Accurate control over the spent nuclear fuel content is essential for its safe and optimized transportation, storage and management. Consequently, the reactivity of spent fuel and its isotopic content must be accurately determined. Nowadays, predicting isotopic evolution throughout irradiation and decay periods is no longer a problem, thanks to the development of powerful codes and methodologies. In order to have a realistic confidence level in the prediction of spent fuel isotopic content, it is desirable to determine how uncertainties in the basic nuclear data affect isotopic prediction calculations by quantifying their associated uncertainties.
Abstract:
The assessment of the accuracy of parameters related to the reactor core performance (e.g., keff) and fuel cycle (e.g., isotopic evolution/transmutation) due to the uncertainties in the basic nuclear data (ND) is a critical issue. Different error propagation techniques (adjoint/forward sensitivity analysis procedures and/or Monte Carlo technique) can be used to address by computational simulation the systematic propagation of uncertainties on the final parameters. To perform this uncertainty assessment, the ENDF covariance files (variance/correlation in energy and cross-reactions-isotopes correlations) are required. In this paper, we assess the impact of ND uncertainties on the isotopic prediction for a conceptual design of a modular European Facility for Industrial Transmutation (EFIT) for a discharge burnup of 150 GWd/tHM. The complete set of uncertainty data for cross sections (EAF2007/UN, SCALE6.0/COVA-44G), radioactive decay and fission yield data (JEFF-3.1.1) are processed and used in ACAB code.
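To make the contrast between sensitivity-based and Monte Carlo propagation concrete, the sketch below applies both to a deliberately simple toy model: a single nuclide depleted by one reaction, N(t)/N0 = exp(-σφt), with a Gaussian uncertainty on the cross section σ. It has nothing to do with the ACAB code or the ENDF covariance files; the model, the function names and the numerical values are assumptions chosen only for illustration.

```python
import math
import random
import statistics

BARN_TO_CM2 = 1e-24


def remaining_fraction(sigma_barn, flux, time_s):
    """Toy depletion model: fraction of a nuclide left, exp(-sigma*phi*t)."""
    return math.exp(-sigma_barn * BARN_TO_CM2 * flux * time_s)


def monte_carlo_rel_std(sigma_barn, rel_unc, flux, time_s,
                        n_samples=20000, seed=0):
    """Sample the cross section, rerun the model, and return the relative
    standard deviation of the remaining fraction."""
    rng = random.Random(seed)
    results = [
        remaining_fraction(
            rng.gauss(sigma_barn, rel_unc * sigma_barn), flux, time_s
        )
        for _ in range(n_samples)
    ]
    return statistics.stdev(results) / statistics.mean(results)


def first_order_rel_std(sigma_barn, rel_unc, flux, time_s):
    """First-order (sensitivity) estimate of the same quantity.

    Relative sensitivity S = (sigma / N) * dN/dsigma = -sigma * phi * t,
    so the relative uncertainty of N is |S| times that of sigma.
    """
    sensitivity = -sigma_barn * BARN_TO_CM2 * flux * time_s
    return abs(sensitivity) * rel_unc
```

For example, with a 100 barn cross section known to 5 %, a flux of 1e15 n/cm²/s and 1e7 s of irradiation, σφt is 1, so the first-order estimate is a 5 % relative uncertainty on the remaining fraction, and the Monte Carlo estimate should land close to it. The two approaches diverge when uncertainties are large or the response is strongly nonlinear, which is one reason both appear in studies of this kind.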
Abstract:
To study the propagation of uncertainty from basic data across different scales and physics phenomena through complex coupled multi-physics and multi-scale simulations.
Abstract:
Accurate control over the spent nuclear fuel content is essential for its safe and optimized transportation, storage and management. Consequently, the reactivity of spent fuel and its isotopic content must be accurately determined.
Abstract:
The uncertainty propagation in fuel cycle calculations due to Nuclear Data (ND) is an important issue for:
• Present fuel cycles (e.g. high burnup fuel programme)
• New fuel cycle designs (e.g. fast breeder reactors and ADS)
Different error propagation techniques can be used:
• Sensitivity analysis
• Response Surface Method
• Monte Carlo technique
In this paper, the impact of ND uncertainties on the decay heat and radiotoxicity is assessed in two applications:
• Fission Pulse Decay Heat calculation (FPDH)
• Conceptual design of the European Facility for Industrial Transmutation (EFIT)
Abstract:
A small Positron Emission Tomography demonstrator based on LYSO slabs and Silicon Photomultiplier matrices is under construction at the University and INFN of Pisa. In this paper we present the characterization results of the read-out electronics and of the detection system. Two SiPM matrices, each composed of 8 × 8 SiPM pixels with 1.5 mm pitch, have been coupled one-to-one to a LYSO crystal array. Custom Front-End ASICs were used to read the 64 channels of each matrix. Data from each Front-End were multiplexed and sent to a DAQ board for digital conversion; a motherboard collects the data and communicates with a host computer through a USB port. Specific tests were carried out on the system in order to assess its performance. Furthermore, we have measured some of the most important parameters of the system for PET application.
Abstract:
The microarray technique is rather powerful, as it allows up to thousands of genes to be tested at a time, but this produces an overwhelming set of data files containing huge amounts of data, which are quite difficult to pre-process, separate, classify and correlate so that interesting conclusions can be extracted. Modern machine learning, data mining and clustering techniques based on information theory are needed to read and interpret the information content buried in those large data sets. The Independent Component Analysis method can be used to correct the data affected by corruption processes, or to filter out the uncorrectable data, and clustering methods can then group similar genes or classify samples. In this paper a hybrid approach is used to obtain a two-way unsupervised clustering of corrected microarray data.
Abstract:
The objectives of this study were to assess the diversity and genetic structure of a collection of Spanish durum wheat (Triticum turgidum L.) landraces, using SSRs, DArTs and gliadin markers, and to correlate the distribution of diversity with geographic and climatic features, as well as agro-morphological traits. A high level of diversity was detected in the genotypes analyzed, which were separated into nine populations with moderate to great genetic divergence among them. The three subspecies taxa present in the collection, dicoccon, turgidum and durum, largely determined the clustering of the populations. Genotype variation was lower in dicoccon (one major population) and turgidum (two major populations) than in durum (five major populations). Genetic differentiation by the agro-ecological zone of origin was greater in dicoccon and turgidum than in durum. DArT markers revealed two geographic substructures, east-west for dicoccon and northeast-southwest for turgidum. The ssp. durum had a more complex structure, consisting of seven populations with high intra-population variation. DArT markers allowed the detection of subgroups within some populations, with agro-morphological and gliadin differences, and distinct agro-ecological zones of origin. Two different phylogenetic groups were detected, revealing that some durum populations were more related to ssp. turgidum from northern Spain, while others seem to be more related to durum wheats from North Africa.
Abstract:
The aim of this paper is to study the importance of nuclear data uncertainties in the prediction of the uncertainties in keff for LWR (Light Water Reactor) unit-cells. The first part of this work focuses on the comparison of different sensitivity/uncertainty propagation methodologies based on the TSUNAMI and MCNP codes; this study is undertaken for fresh fuel at different operational conditions. The second part of this work studies the burnup effect, where the indirect contribution due to the uncertainty of the isotopic evolution is also analyzed.