Digital Library

887 results for Multiple data

DnaSP v5: A software for comprehensive analysis of DNA polymorphism data

Relevance: 70.00%

Abstract:

DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.

Reversible and high capacity data hiding in medical images

Relevance: 70.00%

Abstract:

In this paper we introduce a highly efficient reversible data hiding system. It is based on dividing the image into tiles and shifting the histogram of each image tile between its minimum and maximum frequency. Data are then inserted at the pixel level with the largest frequency to maximize data hiding capacity. The scheme exploits a special property of medical images: the histograms of their nonoverlapping image tiles mostly peak around a few gray values, and the rest of the spectrum is mainly empty. The zeros (or minima) and peaks (maxima) of the histograms of the image tiles are then relocated to embed the data, so the gray values of some pixels are modified. High capacity, high fidelity, reversibility and multiple data insertions are the key requirements of data hiding in medical images. We show how histograms of image tiles of medical images can be exploited to achieve these requirements. Compared with a data hiding method applied to the whole image, our scheme can yield a 30%-200% capacity improvement with better image quality, depending on the medical image content. Additional advantages of the proposed method include hiding data in regions of non-interest and better exploitation of spatial masking.
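
The embedding idea can be illustrated with a minimal sketch of classic single-pair histogram shifting on one tile. This is not the authors' implementation: the function names `embed` and `extract`, the flat list of 8-bit gray values, and the requirement that an empty gray level exists above the peak are all assumptions made for illustration.

```python
from collections import Counter


def embed(pixels, bits):
    """Embed bits into one tile (flat list of 0..255 gray values).

    Returns (marked_pixels, peak, zero). Assumes an empty gray level
    exists somewhere above the histogram peak.
    """
    hist = Counter(pixels)
    peak = max(hist, key=hist.get)  # most frequent gray value
    zero = next((g for g in range(peak + 1, 256) if hist[g] == 0), None)
    if zero is None:
        raise ValueError("no empty gray level above the peak")
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds tile capacity")
    payload = iter(bits)
    marked = []
    for v in pixels:
        if peak < v < zero:
            marked.append(v + 1)                 # shift to free the bin peak + 1
        elif v == peak:
            marked.append(v + next(payload, 0))  # bit 1 -> peak + 1, bit 0 -> peak
        else:
            marked.append(v)
    return marked, peak, zero


def extract(marked, peak, zero, n_bits):
    """Recover the first n_bits and restore the original tile exactly."""
    bits, restored = [], []
    for v in marked:
        if v in (peak, peak + 1):
            if len(bits) < n_bits:
                bits.append(v - peak)
            restored.append(peak)
        elif peak + 1 < v <= zero:
            restored.append(v - 1)               # undo the shift
        else:
            restored.append(v)
    return bits, restored
```

The capacity of a tile in this sketch equals the height of its histogram peak, which is why tiles with sharply peaked histograms can carry more payload than a single histogram computed over the whole image.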

AmalgamScope: merging annotations data across the human genome

Relevance: 70.00%

Abstract:

Recent years have seen enormous advances in sequencing and array-based technologies, which produce supplementary or alternative views of the genome stored in various formats and databases. Their sheer volume and differing data scope make it challenging to visualize and integrate diverse data types jointly. We present AmalgamScope, a new interactive software tool that assists scientists with the annotation of the human genome, and particularly with the integration of annotation files from multiple data types, using gene identifiers and genomic coordinates. Supported platforms include next-generation sequencing and microarray technologies. The available features of AmalgamScope range from the annotation of diverse data types across the human genome to the integration of the data based on the annotation information and the visualization of the merged files within chromosomal regions or the whole genome. Additionally, users can define custom transcriptome library files for any species and use the tool's options for exchanging files with a remote server.

Towards interoperability in P2P world: An indexing middleware for multi-protocol peer-to-peer data sharing

Relevance: 70.00%

Abstract:

Despite the abundant availability of protocols and applications for peer-to-peer file sharing, several drawbacks are still present in the field. Among the most notable is the lack of a simple and interoperable way to share information among independent peer-to-peer networks. Another drawback is that the shared content can be accessed only by a limited number of compatible applications, which makes it inaccessible to other applications and systems. In this work we present a new approach to peer-to-peer data indexing, focused on the organization and retrieval of metadata that describes the shared content. This approach results in a common and interoperable infrastructure, which provides transparent access to data shared on multiple data sharing networks via a simple API. The proposed approach is evaluated using a case study, implemented as a cross-platform extension to the Mozilla Firefox browser, and demonstrates the advantages of such interoperability over conventional distributed data access strategies. © 2009 IEEE.
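
As a rough illustration of what a common metadata index over several networks might look like, here is a minimal in-memory sketch. The class `MetadataIndex`, its methods `publish` and `search`, and the network labels used below are hypothetical and are not taken from the paper's middleware.

```python
from collections import defaultdict


class MetadataIndex:
    """Hypothetical protocol-agnostic index of shared-content metadata."""

    def __init__(self):
        self._by_term = defaultdict(set)  # term -> {(network, content_id)}
        self._records = {}                # (network, content_id) -> metadata

    def publish(self, network, content_id, metadata):
        """Register metadata for content shared on a given network."""
        key = (network, content_id)
        self._records[key] = dict(metadata)
        for value in metadata.values():
            for term in str(value).lower().split():
                self._by_term[term].add(key)

    def search(self, term):
        """Return (network, content_id, metadata) for every match, across networks."""
        keys = sorted(self._by_term.get(term.lower(), ()))
        return [(net, cid, self._records[(net, cid)]) for net, cid in keys]


index = MetadataIndex()
index.publish("gnutella", "f1", {"title": "Open Data Handbook"})
index.publish("bittorrent", "t9", {"title": "Data Sharing Survey"})
hits = index.search("data")  # matches from both networks in one call
```

A client queries the index once and receives results tagged with the network they came from, so it does not need to speak each sharing protocol itself.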

Selecting control genes for RT-QPCR using public microarray data

Relevance: 70.00%

Abstract:

BACKGROUND: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is no generally accepted standard technique for screening a large number of candidates and identifying the best ones.

RESULTS: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one which is breast cancer-specific and one with more general applicability. Two genes from the breast cancer list which had not been previously used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/

CONCLUSION: We proposed a new method for identifying candidate control genes for RT-QPCR which was able to rank thousands of genes according to some predefined suitability criteria, and we applied it to the case of breast cancer. We also showed empirically that translating the results from the microarray to the PCR platform was achievable.
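
The abstract does not spell out the suitability criteria, so the sketch below only illustrates the general idea of ranking candidates by expression stability. It assumes a single, simple criterion, the coefficient of variation across samples, which is not necessarily what the authors use; the function name `rank_by_stability` and the gene labels in the example are hypothetical.

```python
from statistics import mean, pstdev


def rank_by_stability(expression):
    """Rank genes from most to least stable.

    expression: dict mapping gene name -> list of expression values
    pooled across samples. Stability is measured here by the
    coefficient of variation (population std dev / mean); genes with
    a non-positive mean are skipped.
    """
    cv = {
        gene: pstdev(values) / mean(values)
        for gene, values in expression.items()
        if mean(values) > 0
    }
    return sorted(cv, key=cv.get)


ranked = rank_by_stability({
    "GENE_A": [10.0, 10.5, 9.5],
    "GENE_B": [5.0, 15.0, 10.0],
    "GENE_C": [8.0, 8.0, 8.0],
})
```

Genes at the head of the list vary least relative to their mean level, which is the property a normalization reference needs. Combining data sets from different platforms would require an extra normalization step that this sketch leaves out.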

Partitioned likelihood support and the evaluation of data set conflict

Relevance: 70.00%

Abstract:

In simultaneous analyses of multiple data partitions, the trees relevant when measuring support for a clade are the optimal tree and the best tree lacking the clade (i.e., the most reasonable alternative). The parsimony-based method of partitioned branch support (PBS) forces each data set to arbitrate between the two relevant trees. This value is the amount each data set contributes to clade support in the combined analysis, and it can differ greatly from the support apparent in separate analyses. The approach used in PBS can also be employed in likelihood: a simultaneous analysis of all data retrieves the maximum likelihood tree, and the best tree without the clade of interest is also found. Each data set is fitted to the two trees and the log-likelihood difference calculated, giving partitioned likelihood support (PLS) for each data set. These calculations can be performed regardless of the complexity of the ML model adopted. The significance of PLS can be evaluated using a variety of resampling methods, such as the Kishino-Hasegawa test, the Shimodaira-Hasegawa test, or likelihood weights, although the appropriateness and assumptions of these tests remain debated.
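
The PLS calculation described above reduces to a per-partition difference of log-likelihoods. The sketch below assumes the per-partition log-likelihoods on both trees have already been computed by a phylogenetics package; the function name, the partition labels and the numbers are hypothetical.

```python
def partitioned_likelihood_support(lnl_ml_tree, lnl_alt_tree):
    """Per-partition likelihood support for one clade.

    lnl_ml_tree:  dict partition -> log-likelihood on the ML tree
    lnl_alt_tree: dict partition -> log-likelihood on the best tree
                  lacking the clade of interest
    A positive value means the partition favors the clade; a negative
    value means it conflicts with it.
    """
    return {p: lnl_ml_tree[p] - lnl_alt_tree[p] for p in lnl_ml_tree}


# Hypothetical values for two partitions.
pls = partitioned_likelihood_support(
    {"mtDNA": -1200.5, "nuclear": -980.0},
    {"mtDNA": -1210.5, "nuclear": -978.0},
)
total_support = sum(pls.values())
```

In this made-up example the mtDNA partition supports the clade (10.0) while the nuclear partition weakly conflicts with it (-2.0), so the combined support of 8.0 hides a disagreement between data sets.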

Development of prediction models for freeway incident durations using data mining techniques

Relevance: 70.00%

Abstract:

The nation's freeway systems are becoming increasingly congested. A major contribution to traffic congestion on freeways is due to traffic incidents. Traffic incidents are non-recurring events such as accidents or stranded vehicles that cause a temporary roadway capacity reduction, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help the process of identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success.

This dissertation research attempts to improve on this previous effort by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid in the decision making of traffic diversion in the event of an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. The multiple linear regression analysis and decision tree based method were applied to develop the offline models, and the rule-based method and a tree algorithm called M5P were used to develop the online models.

The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.

Sensitivity and uncertainty analyses for burden of disease and risk factor estimates

Relevance: 60.00%

Abstract:

Epidemiological studies report confidence or uncertainty intervals around their estimates. Estimates of the burden of diseases and risk factors are subject to a broader range of uncertainty because of the combination of multiple data sources and value choices. Sensitivity analysis can be used to examine the effects of social values that have been incorporated into the design of the disability–adjusted life year (DALY). Age weight, where a year of healthy life lived at one age is valued differently from at another age, is the most controversial value built into the DALY. The discount rate, which addresses the difference in value of current versus future health benefits, also has been criticized. The distribution of the global disease burden and rankings of various conditions are largely insensitive to alternate assumptions about the discount rate and age weighting. The major effects of discounting and age weighting are to enhance the importance of neuropsychiatric conditions and sexually transmitted infections. The Global Burden of Disease study also has been criticized for estimating mortality and disease burden for regions using incomplete and uncertain data. Including uncertain results, with uncertainty quantified to the extent possible, is preferable, however, to leaving blank cells in tables intended to provide policy makers with an overall assessment of burden of disease. No estimate is generally interpreted as no problem. Greater investment in getting the descriptive epidemiology of diseases and injuries correct in poor countries will do vastly more to reduce uncertainty in disease burden assessments than a philosophical debate about the appropriateness of social value

Testing the relationship between morphological and molecular rates of change along phylogenies

Relevance: 60.00%

Abstract:

Molecular evolution has been considered to be essentially a stochastic process, little influenced by the pace of phenotypic change. This assumption was challenged by a study that demonstrated an association between rates of morphological and molecular change estimated for total-evidence phylogenies, a finding that led some researchers to challenge molecular date estimates of major evolutionary radiations. Here we show that Omland's (1997) result is probably due to methodological bias, particularly phylogenetic nonindependence, rather than being indicative of an underlying evolutionary phenomenon. We apply three new methods specifically designed to overcome phylogenetic bias to 13 published phylogenetic datasets for vertebrate taxa, each of which includes both morphological characters and DNA sequence data. We find no evidence of an association between molecular and morphological rates of change.

Recommendation of Tourism Resources Supported by Crowdsourcing

Relevance: 60.00%

Abstract:

Context-aware recommendation of personalised tourism resources is possible because of personal mobile devices and powerful data filtering algorithms. The devices contribute computing capabilities, on-board sensors, ubiquitous Internet access and continuous user monitoring, whereas the filtering algorithms provide the ability to match the profile (interests and context) of the tourist against a large knowledge base of tourism resources. While, in terms of technology, personal mobile devices can gather user-related information, including the user context, and access multiple data sources, the creation and maintenance of an updated knowledge base of tourism-related resources requires a collaborative approach due to the heterogeneity, volume and dynamic nature of the resources. The current PhD thesis aims to contribute to the solution of this problem by adopting a Crowdsourcing approach for the collaborative maintenance of the knowledge base of resources, Trust and Reputation for the validation of uploaded resources as well as publishers, Big Data for user profiling, and context-aware filtering algorithms for the personalised recommendation of tourism resources.

Monitoring and analysis of queries in distributed databases

Relevance: 60.00%

Abstract:

Master's dissertation in Informatics Engineering

A Real-Time intelligent system for tracking patient condition

Relevance: 60.00%

Abstract:

Hospitals have multiple data sources, such as embedded systems, monitors and sensors. The amount of data available is increasing, and the information is used not only to care for the patient but also to support decision processes. Intelligent environments have been adopted in health care institutions because of their ability to provide useful information to health professionals, either by helping to identify a prognosis or by helping to understand the patient's condition. From this concept arises the Intelligent System presented here, which tracks patient condition (e.g. critical events) in health care. The system has the great advantage of being adaptable to the environment and to user needs. It focuses on identifying critical events from streaming data (e.g. vital signs and ventilation), which is particularly valuable for understanding the patient's condition. This work aims to demonstrate the process of creating an intelligent system capable of operating in a real environment using streaming data provided by ventilators and vital signs monitors. Its development is important to physicians because it makes it possible to cross multiple variables in real time, analyzing whether a value is critical and whether its variation has clinical importance.

DESARROLLO DE APLICACIONES ESTADÍSTICAS PARA LA AGRICULTURA DE PRECISIÓN. STATISTICAL APPLICATIONS FOR PRECISION AGRICULTURE

Relevance: 60.00%

Abstract:

In recent decades, the development and use of Geographic Information Systems (GIS) and Satellite Positioning Systems (GPS) have been promoted to improve the productive efficiency of extensive cropping systems in agronomic, economic and environmental terms. These new technologies make it possible to measure the spatial variability of site properties, such as apparent electrical conductivity and other terrain attributes, as well as their effect on the spatial distribution of yields. Site-specific management can then be applied within fields to improve the efficiency of agrochemical input use, environmental protection and the sustainability of rural life. A wide range of precision agriculture technologies is currently available to capture spatial variation across sites within a field. The optimal use of the large volume of data produced by precision agriculture machinery depends strongly on the capacity to explore information about the complex interactions underlying production outcomes. The spatial covariation of site properties and crop yield has been studied with classical geostatistical models based on the theory of regionalized variables. Contemporary statistical models, notably linear mixed models, are promising tools for handling spatially correlated data. Moreover, because multiple variables are recorded at each site, multivariate analysis techniques could contribute valuable information for visualizing and exploiting georeferenced data.

Understanding the agronomic basis of the complex interactions that occur at the scale of production fields is now possible with these new technologies. The objectives of this project are: (i) to develop methodological strategies that combine multivariate and geostatistical techniques for classifying within-field sites and for studying interdependencies between site variables and yield; (ii) to propose alternative mixed models, based on spatial correlation functions for the error terms, that make it possible to explore spatial correlation patterns of within-field yields and of soil properties at the delimited sites.

Over the last decades, the use and development of Geographical Information Systems (GIS) and Satellite Positioning Systems (GPS) have been strongly promoted in cropping systems. Such technologies allow measuring the spatial variability of site properties, including electrical conductivity and other soil features, as well as their impact on the spatial variability of yields. Therefore, site-specific management could be applied to improve the efficiency of agrochemical use, environmental protection, and the sustainability of rural life. Currently, there is a wide range of technological resources to capture spatial variation across sites within a field. However, the optimum use of data from precision agriculture machinery strongly depends on the capability to explore the information about the complex interactions underlying the productive outputs. The covariation between spatial soil properties and yields from georeferenced data has been treated graphically or with standard geostatistical approaches. New statistical modeling capabilities from the mixed linear model framework are promising for dealing with correlated data such as those produced by precision agriculture. Moreover, given the multivariate nature of the multiple data collected at each site, several multivariate statistical approaches could be crucial tools for analyzing georeferenced data. Understanding the basis of complex interactions at the scale of the production field is now within reach through the use of these new techniques. Our main objectives are: (1) to develop new statistical strategies, based on the complementarity of geostatistics and multivariate methods, to classify sites within fields grown with grain crops and to analyze the interrelationships of several soil and yield variables; (2) to propose mixed linear models to predict yield according to spatial soil variability and to build contour maps to promote a more sustainable agriculture.

A Novel Memory-centric Architecture and Organization of Processors and Computers

Relevance: 60.00%

Abstract:

Modern computer systems are mostly processor-dominant, which means that their memory is treated as a slave element with one major task: to serve the data requirements of the execution units. This organization is based on the classical von Neumann computer model, proposed some seven decades ago, in the 1940s. This model suffers from a substantial processor-memory bottleneck because of the huge disparity between processor and memory working speeds. To address this problem, in this paper we propose a novel architecture and organization of processors and computers that attempts to provide a stronger match between the processing and memory elements in the system. The proposed model uses a memory-centric architecture, wherein execution hardware is added to the memory code blocks, allowing them to perform instruction scheduling and execution, management of data requests and responses, and direct communication with the data memory blocks without using registers. This organization allows concurrent execution of all threads, processes or program segments that fit in the memory at a given time. We therefore describe several possibilities for organizing the proposed memory-centric system with multiple data and logic-memory merged blocks, using a high-speed interconnection switching network.
