903 results for Data selection


Relevance:

30.00%

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

30.00%

Abstract:

Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties that can then be used to direct the execution of other applications. The resulting values are obtained from the distributed computation of functions such as count, sum and average. Example applications include determining the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable number and variety of aggregation algorithms, it can be difficult and time-consuming to determine which techniques are most appropriate in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.
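
To make the idea of computing an average or a count without a central coordinator concrete, here is a minimal Python sketch of pairwise gossip averaging. It is a toy under stated assumptions, not an algorithm taken from the surveyed work: the function names `gossip_average` and `estimate_network_size`, the fully connected topology, the fixed number of rounds and the sample loads are all hypothetical choices made for this example.

```python
import random


def gossip_average(values, rounds=200, seed=0):
    """Pairwise gossip: two random nodes repeatedly replace their values by their mean.

    The sum over all nodes is preserved at every step, so every node's value
    drifts toward the global average. Assumes any two nodes can talk directly.
    """
    rng = random.Random(seed)
    state = [float(v) for v in values]
    n = len(state)
    if n < 2:
        return state
    for _ in range(rounds * n):
        i, j = rng.sample(range(n), 2)  # pick two distinct nodes
        mean = (state[i] + state[j]) / 2.0
        state[i] = mean
        state[j] = mean
    return state


def estimate_network_size(n_nodes, rounds=200, seed=0):
    """Count by averaging: one node starts at 1, all others at 0.

    The average converges to 1/n, so each node can estimate n as 1/average.
    """
    indicator = [1.0] + [0.0] * (n_nodes - 1)
    estimates = gossip_average(indicator, rounds, seed)
    return [1.0 / e if e > 0 else float("inf") for e in estimates]


# Hypothetical per-node loads; each node ends up with an estimate of the mean load.
loads = [10.0, 20.0, 30.0, 40.0]
load_estimates = gossip_average(loads)
size_estimates = estimate_network_size(8)
```

The sketch ignores message loss, node churn and restricted topologies, which are the kinds of accuracy and reliability trade-offs the abstract refers to.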

Relevance:

30.00%

Abstract:

Recently, there has been a growing interest in the field of metabolomics, reflected in a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, and Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful yet simple-to-use environment. The package, already available on CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.

Relevance:

30.00%

Abstract:

During must fermentation by Saccharomyces cerevisiae strains, thousands of volatile aroma compounds are formed. The objective of the present work was to adapt computational approaches to analyze the pheno-metabolomic diversity of a S. cerevisiae strain collection with different origins. Phenotypic and genetic characterization together with individual must fermentations were performed, and metabolites relevant to aromatic profiles were determined. Experimental results were projected onto a common coordinate system, revealing 17 statistically relevant multi-dimensional modules, combining sets of most-correlated features of noteworthy biological importance. The present method made it possible to combine genetic, phenotypic and metabolomic data, which had not been possible so far because of difficulties in comparing different types of data. The proposed computational approach therefore proved successful in shedding light on the holistic characterization of the S. cerevisiae pheno-metabolome in must fermentative conditions. This will allow the identification of combined relevant features with application in the selection of good winemaking strains.

Relevance:

30.00%

Abstract:

Background: Several researchers seek methods for the selection of homogeneous groups of animals in experimental studies, which is justified because homogeneity is an indispensable prerequisite for the randomization of treatments. The lack of robust methods that comply with statistical and biological principles is the reason why researchers use empirical or subjective methods, influencing their results. Objective: To develop a multivariate statistical model for the selection of a homogeneous group of animals for experimental research and to develop a computational package to apply it. Methods: The set of echocardiographic data of 115 male Wistar rats with supravalvular aortic stenosis (AoS) was used as an example of model development. Initially, the data were standardized, making them dimensionless. Then, the variance matrix of the set was submitted to principal components analysis (PCA), aiming at reducing the parametric space while retaining the relevant variability. That technique established a new Cartesian system into which the animals were allocated, and finally the confidence region (ellipsoid) was built for the profile of the animals' homogeneous responses. The animals located inside the ellipsoid were considered as belonging to the homogeneous batch; those outside the ellipsoid were considered spurious. Results: The PCA established eight descriptive axes that together accounted for 88.71% of the variance of the data set. The allocation of the animals in the new system and the construction of the confidence region revealed six spurious animals, leaving a homogeneous batch of 109 animals. Conclusion: The biometric criterion presented proved to be effective, because it considers the animal as a whole, jointly analyzing all parameters measured, in addition to having a small discard rate.
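
The pipeline described in the Methods (standardize, rotate onto principal components, keep what falls inside a confidence ellipsoid) can be sketched in Python for the simplest case of two measured variables. This is a simplification made for illustration, not the authors' package: the study retained eight principal components from many echocardiographic variables, whereas the sketch uses two variables so that the PCA has a closed form. The function name `homogeneous_mask`, the chi-square threshold used for the region and the toy measurements are assumptions.

```python
import math
import statistics


def standardize(xs):
    """Center to mean 0 and scale to sample standard deviation 1."""
    mu = statistics.fmean(xs)
    sd = statistics.stdev(xs)
    return [(x - mu) / sd for x in xs]


def homogeneous_mask(x, y, confidence=0.95):
    """Return True for each unit that lies inside the confidence ellipse.

    For two standardized variables with sample correlation r, the principal
    axes are (1, 1)/sqrt(2) and (1, -1)/sqrt(2), with variances 1 + r and 1 - r.
    Requires |r| < 1. The ellipse boundary uses the chi-square quantile with
    2 degrees of freedom, which equals -2 * ln(1 - confidence) (an assumption;
    the paper may define its region differently).
    """
    zx, zy = standardize(x), standardize(y)
    n = len(zx)
    r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)
    var_pc1, var_pc2 = 1.0 + r, 1.0 - r
    threshold = -2.0 * math.log(1.0 - confidence)
    mask = []
    for a, b in zip(zx, zy):
        pc1 = (a + b) / math.sqrt(2.0)  # score on the first principal axis
        pc2 = (a - b) / math.sqrt(2.0)  # score on the second principal axis
        distance_sq = pc1 ** 2 / var_pc1 + pc2 ** 2 / var_pc2
        mask.append(distance_sq <= threshold)
    return mask


# Hypothetical measurements: the sixth unit breaks the otherwise linear pattern.
var_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
var_b = [1, 2, 3, 4, 5, 12, 7, 8, 9, 10, 11]
inside = homogeneous_mask(var_a, var_b)
spurious = [i for i, ok in enumerate(inside) if not ok]
```

Units flagged in `spurious` correspond to what the abstract calls spurious animals; with more than two variables the same test would be applied to the retained component scores.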

Relevance:

30.00%

Abstract:

Background: Guidelines recommend that in suspected stable coronary artery disease (CAD), a clinical (non-invasive) evaluation should be performed before coronary angiography. Objective: We assessed the efficacy of patient selection for coronary angiography in suspected stable CAD. Methods: We prospectively selected consecutive patients without known CAD, referred to a high-volume tertiary center. Demographic characteristics, risk factors, symptoms and non-invasive test results were correlated to the presence of obstructive CAD. We estimated the CAD probability based on available clinical data and the incremental diagnostic value of previous non-invasive tests. Results: A total of 830 patients were included; median age was 61 years, 49.3% were males, 81% had hypertension and 35.5% were diabetics. Non-invasive tests were performed in 64.8% of the patients. At coronary angiography, 23.8% of the patients had obstructive CAD. The independent predictors for obstructive CAD were: male gender (odds ratio [OR], 3.95; 95% confidence interval [CI], 2.70 - 5.77), age (OR per 5-year increment, 1.15; 95% CI, 1.06 - 1.26), diabetes (OR, 2.01; 95% CI, 1.40 - 2.90), dyslipidemia (OR, 2.02; 95% CI, 1.32 - 3.07), typical angina (OR, 2.92; 95% CI, 1.77 - 4.83) and previous non-invasive test (OR, 1.54; 95% CI, 1.05 - 2.27). Conclusions: In this study, less than a quarter of the patients referred for coronary angiography with suspected CAD had the diagnosis confirmed. A better clinical and non-invasive assessment is necessary to improve the efficacy of patient selection for coronary angiography.

Relevance:

30.00%

Abstract:

The behavioral response of Biomphalaria straminea to light was evaluated in terms of the location of the snail in a Y-shaped aquarium in a situation of selection, and of the rate (cm/hour) and direction of locomotion under homogeneous (vertical) or differential (horizontal) lighting upon only one arm of the aquarium. The light source consisted of daylight fluorescent lamps with a spectrum close to that of natural light, with illumination varying from 28 to 350 lux. Analysis of the data showed that all animals, whether in groups or isolated, were attracted to light, although the time needed to approach the light source was 50% shorter for the former than for the latter. The rate of locomotion of B. straminea was 35% higher than that observed in B. glabrata and 51% higher than that observed in B. tenagophila studied under similar conditions. The results are discussed in terms of social factors and geographical distribution of the three species.

Relevance:

30.00%

Abstract:

The productive characteristics of migrating individuals (emigrant selection) affect welfare. The empirical estimation of the degree of selection suffers from a lack of complete and nationally representative data. This paper uses a new and better dataset to address both issues: the ENET (Mexican Labor Survey), which identifies emigrants right before they leave and allows a direct comparison to non-migrants. This dataset presents a relevant dichotomy: it shows, on average, negative selection for Mexican emigrants to the United States for the period 2000-2004, together with positive selection in Mexican emigration out of rural Mexico to the United States in the same period. Three theories that could explain this dichotomy are tested. Whereas higher skill prices in Mexico than in the US are enough to explain negative selection in urban Mexico, their combination with network effects and wealth constraints is required to account for positive selection in rural Mexico.

Relevance:

30.00%

Abstract:

This paper examines the extent to which Mexican emigrants to the United States are negatively selected, that is, have lower skills than individuals who remain in Mexico. Previous studies have been limited by the lack of nationally representative longitudinal data. This one uses a newly available household survey, which identifies emigrants before they leave and allows a direct comparison to non-migrants. I find that, on average, US-bound Mexican emigrants from 2000 to 2004 earn a lower wage and have fewer years of schooling than individuals who remain in Mexico, evidence of negative selection. This supports the original hypothesis of Borjas (AER, 1987) and argues against recent findings, notably those of Chiquiar and Hanson (JPE, 2005). The discrepancy with the latter is primarily due to an under-count of unskilled migrants in US sources and secondarily to the omission of unobservables in their methodology.

Relevance:

30.00%

Abstract:

This paper develops stochastic search variable selection (SSVS) for zero-inflated count models which are commonly used in health economics. This allows for either model averaging or model selection in situations with many potential regressors. The proposed techniques are applied to a data set from Germany considering the demand for health care. A package for the free statistical software environment R is provided.
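
For readers unfamiliar with zero-inflated count models, the following Python sketch evaluates the probability mass function of a zero-inflated Poisson distribution, one common member of that family. The abstract does not say which count distribution the paper uses, so the Poisson choice, the function name `zip_pmf` and the parameter values are assumptions for illustration; the variable selection step (SSVS) itself is not shown.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing the count k.

    With probability pi the outcome is a structural zero; otherwise the count
    is drawn from a Poisson distribution with mean lam.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


# Hypothetical parameters: 30% structural zeros, Poisson mean of 2 otherwise.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
total = sum(zip_pmf(k, lam=2.0, pi=0.3) for k in range(60))
```

The excess mass at zero is what distinguishes the model from a plain Poisson: here `p_zero` combines the structural zeros with the zeros the Poisson part produces on its own.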

Relevance:

30.00%

Abstract:

This paper uses a panel data sample selection model to correct for selection in the analysis of longitudinal labor market data for married women in European countries. We estimate the female wage equation in a framework of unbalanced panel data models with sample selection. The wage equations of females have several potential sources of.

Relevance:

30.00%

Abstract:

Many new gene copies emerged by gene duplication in hominoids, but little is known with respect to their functional evolution. Glutamate dehydrogenase (GLUD) is an enzyme central to the glutamate and energy metabolism of the cell. In addition to the single, GLUD-encoding gene present in all mammals (GLUD1), humans and apes acquired a second GLUD gene (GLUD2) through retroduplication of GLUD1, which codes for an enzyme with unique, potentially brain-adapted properties. Here we show that whereas the GLUD1 parental protein localizes to mitochondria and the cytoplasm, GLUD2 is specifically targeted to mitochondria. Using evolutionary analysis and resurrected ancestral protein variants, we demonstrate that the enhanced mitochondrial targeting specificity of GLUD2 is due to a single positively selected glutamic acid-to-lysine substitution, which was fixed in the N-terminal mitochondrial targeting sequence (MTS) of GLUD2 soon after the duplication event in the hominoid ancestor approximately 18-25 million years ago. This MTS substitution arose in parallel with two crucial adaptive amino acid changes in the enzyme and likely contributed to the functional adaptation of GLUD2 to the glutamate metabolism of the hominoid brain and other tissues. We suggest that rapid, selectively driven subcellular adaptation, as exemplified by GLUD2, represents a common route underlying the emergence of new gene functions.

Relevance:

30.00%

Abstract:

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.

Relevance:

30.00%

Abstract:

The aim of this paper is to verify, for the Spanish case, whether the internal democracy of the major political parties (PSOE, AP / PP, PCE / IU, PNV and CDC) increased between 1977 and 2008. To do this, we focus on their leadership selection processes, one of the key elements associated with intra-party democracy. The paper introduces data on four different dimensions of leadership selection: the certification process, the voting procedure, the inclusiveness of the selectorate and, finally, the degree of competitiveness. The results show that there have been few changes in the leadership selection processes of the Spanish political parties since 1977. However, the results of the Spanish case are also used to suggest some preliminary links between the four dimensions.

Relevance:

30.00%

Abstract:

The availability of rich firm-level data sets has recently led researchers to uncover new evidence on the effects of trade liberalization. First, trade openness forces the least productive firms to exit the market. Second, it induces surviving firms to increase their innovation efforts. Third, it increases the degree of product market competition. In this paper we propose a model aimed at providing a coherent interpretation of these findings. We introduce firm heterogeneity into an innovation-driven growth model, where incumbent firms operating in oligopolistic industries perform cost-reducing innovations. In this framework, trade liberalization leads to higher product market competition, lower markups and higher quantity produced. These changes in markups and quantities, in turn, promote innovation and productivity growth through a direct competition effect, based on the increase in the size of the market, and a selection effect, produced by the reallocation of resources towards more productive firms. Calibrated to match US aggregate and firm-level statistics, the model predicts that a 10 percent reduction in variable trade costs reduces markups by 1.15 percent and firm survival probabilities by 1 percent, and induces an increase in productivity growth of about 13 percent. More than 90 percent of the trade-induced growth increase can be attributed to the selection effect.