956 results for Data Sets
Abstract:
This work analyzes the effect of data selection on heritability estimates. The heritability of litter size was estimated in a pig population in which the records of the oldest sows were a selected sample. Estimates were obtained using different data sets derived from all the available information. These data sets were compared by evaluating their predictive ability. The heritability estimates obtained using all the available data were found to be underestimated. A maternal trait was also simulated, and a selected data set was generated by removing the records of females without known parents. Several models commonly used when there is no selection of records were considered for estimating heritability. The results showed that none of these models gave unbiased estimates. Only models that accounted for the effect of selection on the residual mean and on the genetic mean and variance gave estimates with little bias. However, applying them requires knowing what selection was carried out. The problem of data selection is difficult to address when the selection process applied in a population is unknown.
Abstract:
Advances in flow cytometry and other single-cell technologies have enabled high-dimensional, high-throughput measurements of individual cells as well as the interrogation of cell population heterogeneity. However, in many instances, computational tools to analyze the wealth of data generated by these technologies are lacking. Here, we present a computational framework for unbiased combinatorial polyfunctionality analysis of antigen-specific T-cell subsets (COMPASS). COMPASS uses a Bayesian hierarchical framework to model all observed cell subsets and select those most likely to have antigen-specific responses. Cell-subset responses are quantified by posterior probabilities, and human subject-level responses are quantified by two summary statistics that describe the quality of an individual's polyfunctional response and can be correlated directly with clinical outcome. Using three clinical data sets of cytokine production, we demonstrate how COMPASS improves characterization of antigen-specific T cells and reveals cellular 'correlates of protection/immunity' in the RV144 HIV vaccine efficacy trial that are missed by other methods. COMPASS is available as open-source software.
Abstract:
The aim of the thesis was to examine the factors that affect the forecasting accuracy of innovation diffusion models. The thesis used a logistic model to forecast the diffusion of mobile phone subscriptions in three European countries: Finland, France, and Greece. The theoretical part focused on forecasting the diffusion of innovations with diffusion models. Particular emphasis was placed on the models' predictive ability and their usability in different situations. The empirical part focused on forecasting with a logistic diffusion model that was calibrated with time series aggregated in different ways. The resulting forecasts were examined to determine the effects of the level of data aggregation. The research design was empirical and involved studying the forecasting accuracy of the logistic diffusion model while varying the aggregation level of the sample data. The data fed into the diffusion model can be collected monthly and per operator without affecting forecasting accuracy. The data must include the inflection point of the diffusion curve, that is, the long-run peak demand point.
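The logistic diffusion curve mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common three-parameter form y(t) = K / (1 + exp(-r (t - t0))), where t0 is the inflection point, and it assumes the saturation level K is known, so that r and t0 follow from a straight-line fit on the logit scale. The function names, the synthetic series, and the parameter values are all hypothetical; the thesis may have calibrated its model differently.

```python
import math


def logistic(t, K, r, t0):
    """Cumulative adopters at time t on a logistic diffusion curve."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic_known_K(times, values, K):
    """Estimate r and t0 by least squares on the logit-linearised curve.

    ln(y / (K - y)) = r * t - r * t0, so a straight-line fit of the
    transformed values against time gives slope r and intercept -r * t0.
    Assumes K is known and 0 < y < K for every observation.
    """
    z = [math.log(y / (K - y)) for y in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_z = sum(z) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxz = sum((t - mean_t) * (zi - mean_z) for t, zi in zip(times, z))
    r = sxz / sxx
    intercept = mean_z - r * mean_t
    t0 = -intercept / r
    return r, t0


# Synthetic, noise-free series sampled every 6 months (hypothetical values).
months = list(range(0, 121, 6))
series = [logistic(t, 100.0, 0.08, 60.0) for t in months]
r_hat, t0_hat = fit_logistic_known_K(months, series, 100.0)
print(r_hat, t0_hat)
```

At the inflection point t0 the curve reaches half of K and growth is fastest, which is why a series that stops before t0 carries little information about the saturation level.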
Abstract:
Wide-range spectral coverage of blazar-type active galactic nuclei is of paramount importance for understanding the particle acceleration mechanisms assumed to take place in their jets. The Major Atmospheric Gamma Imaging Cerenkov (MAGIC) telescope participated in three multiwavelength (MWL) campaigns, observing the blazar Markarian (Mkn) 421 during the nights of April 28 and 29, 2006, and June 14, 2006. Aims. We analyzed the corresponding MAGIC very-high-energy observations during 9 nights from April 22 to 30, 2006 and on June 14, 2006. We inferred light curves with sub-day resolution and night-by-night energy spectra. Methods. MAGIC detects γ-rays by observing extended air showers in the atmosphere. The obtained air-shower images were analyzed using the standard MAGIC analysis chain. Results. A strong γ-ray signal was detected from Mkn 421 on all observation nights. The flux (E > 250 GeV) varied on a night-by-night basis between (0.92 ± 0.11) × 10⁻¹⁰ cm⁻² s⁻¹ (0.57 Crab units) and (3.21 ± 0.15) × 10⁻¹⁰ cm⁻² s⁻¹ (2.0 Crab units) in April 2006. There is a clear indication of intra-night variability with a doubling time of 36± min on the night of April 29, 2006, establishing once more rapid flux variability for this object. For all individual nights γ-ray spectra could be inferred, with power-law indices ranging from 1.66 to 2.47. We did not find statistically significant correlations between the spectral index and the flux state for individual nights. During the June 2006 campaign, a flux substantially lower than the one measured by the Whipple 10-m telescope four days later was found. Using a log-parabolic power-law fit, we deduced the location of the spectral peak in the very-high-energy regime for some data sets. Our results confirm the indications of rising peak energy with increasing flux, as expected in leptonic acceleration models.
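How a spectral peak follows from a log-parabolic fit can be sketched as below. The sketch assumes one common parameterisation, dN/dE = f0 (E/E0)^(-a - b log10(E/E0)); the abstract does not state which convention the authors used, and the values of a, b, and E0 are hypothetical. Under this assumption the spectral energy distribution E² dN/dE peaks at E0 · 10^((2 - a) / (2b)) when b > 0.

```python
import math


def log_parabola(E, f0, a, b, E0=1.0):
    """Differential flux dN/dE = f0 * (E/E0)**(-a - b*log10(E/E0))."""
    x = math.log10(E / E0)
    return f0 * (E / E0) ** (-a - b * x)


def sed_peak_energy(a, b, E0=1.0):
    """Energy at which E^2 dN/dE is maximal for the form above.

    In x = log10(E/E0), log10(E^2 dN/dE) = (2 - a) * x - b * x**2 + const,
    a downward parabola for b > 0 with its vertex at x = (2 - a) / (2 * b).
    """
    if b <= 0:
        raise ValueError("a peak exists only for positive curvature b")
    return E0 * 10 ** ((2.0 - a) / (2.0 * b))


# Hypothetical fit parameters, not values from the paper.
a_example, b_example = 1.8, 0.4
peak = sed_peak_energy(a_example, b_example)
print(peak)
```

In this form a harder spectrum (smaller a) at fixed curvature moves the peak to higher energy.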
Abstract:
The pumpkinseed Lepomis gibbosus, an omnivorous, nest-guarding North American sunfish, was introduced into European waters about 100 years ago. To assess growth performance following introduction, we reviewed the available data for North American and European populations of pumpkinseed and compared the back-calculated age-specific growth for juveniles (standard length, SL, at age two) and adults (age two to five increment) as well as adult body size (SL at age five), von Bertalanffy growth model parameters and the index of growth (in length) performance (φ′). For continental comparisons of growth trajectory, mean growth curves for North America and Europe were calculated with the von Bertalanffy model using pooled data sets for each continent. Juvenile growth rate did not differ between European and North American pumpkinseed, but mean adult body size and adult growth rate were both significantly greater in North American than European populations. Adult body size decreased with increasing latitude (ANOVA) in North American populations, but this was not observed with adult growth rate. In contrast, adult body size tended to increase with latitude in European populations. Adult body size correlated significantly with φ′. The von Bertalanffy model described the overall growth patterns of North American and European populations reasonably well, but on the individual population level, length asymptotes were unrealistic (estimates that were > 20% of the mean back-calculated size for the oldest age class) for a third of European populations and 80% of the North American populations. In contrast to North American pumpkinseed populations, somatic growth in European populations appears to be compromised by limited, but adequate, food resources, probably due to strong intraspecific interactions. This appears to be especially acute in adults, having potential ramifications for life span and reproductive allocation.
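The two growth quantities named in this abstract can be written out as a short sketch. It assumes the standard von Bertalanffy form L(t) = L∞ (1 - exp(-K (t - t0))) and the commonly used definition of the growth performance index, φ′ = log10(K) + 2 log10(L∞). The abstract does not give the formula it applied, and the parameter values below are hypothetical.

```python
import math


def von_bertalanffy(age, L_inf, K, t0=0.0):
    """Expected length at a given age under the von Bertalanffy model."""
    return L_inf * (1.0 - math.exp(-K * (age - t0)))


def growth_performance_index(L_inf, K):
    """Index of growth performance in length: log10(K) + 2 * log10(L_inf)."""
    return math.log10(K) + 2.0 * math.log10(L_inf)


# Hypothetical parameters for one population (standard length in mm).
L_inf_example, K_example = 150.0, 0.3
lengths_at_age = [von_bertalanffy(a, L_inf_example, K_example) for a in range(1, 6)]
phi_prime = growth_performance_index(L_inf_example, K_example)
print(lengths_at_age, phi_prime)
```

Because L∞ is an extrapolated asymptote, a fit to a population with few old fish can return a value well above any observed size, which is the kind of unrealistic asymptote the abstract reports.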
Abstract:
BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate to assess the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogenetic-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web that is freely accessible at http://coev.vital-it.ch/, and was tested in most modern web browsers.
Abstract:
BACKGROUND: Several distributions of country-specific blood pressure (BP) percentiles by sex, age, and height for children and adolescents have been established worldwide. However, there are no globally unified BP references for defining elevated BP in children and adolescents, which limits international comparisons of the prevalence of pediatric elevated BP. We aimed to establish international BP references for children and adolescents by using 7 nationally representative data sets (China, India, Iran, Korea, Poland, Tunisia, and the United States). METHODS AND RESULTS: Data on BP for 52 636 nonoverweight children and adolescents aged 6 to 19 years were obtained from 7 large nationally representative cross-sectional surveys in China, India, Iran, Korea, Poland, Tunisia, and the United States. BP values were obtained with certified mercury sphygmomanometers in all 7 countries by using standard procedures for BP measurement. Smoothed BP percentiles (50th, 90th, 95th, and 99th) by age and height were estimated by using the Generalized Additive Model for Location, Scale and Shape (GAMLSS). BP values were similar between males and females until the age of 13 years and were higher in males than females thereafter. In comparison with the BP levels of the 90th and 95th percentiles of the US Fourth Report at median height, systolic BP of the corresponding percentiles of these international references was lower, whereas diastolic BP was similar. CONCLUSIONS: These international BP references will be a useful tool for international comparison of the prevalence of elevated BP in children and adolescents and may help to identify hypertensive youths in diverse populations.
Abstract:
As increasingly large molecular data sets are collected for phylogenomics, the conflicting phylogenetic signal among gene trees makes it challenging to resolve some difficult nodes of the Tree of Life. Among these nodes, the phylogenetic position of the honey bees (Apini) within the corbiculate bee group remains controversial, despite its considerable importance for understanding the emergence and maintenance of eusociality. Here, we show that this controversy stems in part from pervasive phylogenetic conflicts among GC-rich gene trees. GC-rich genes typically have a high nucleotide heterogeneity among species, which can induce topological conflicts among gene trees. When retaining only the most GC-homogeneous genes or using a nonhomogeneous model of sequence evolution, our analyses reveal a monophyletic group of the three lineages with a eusocial lifestyle (honey bees, bumble bees, and stingless bees). These phylogenetic relationships strongly suggest a single origin of eusociality in the corbiculate bees, with no reversal to solitary living in this group. To accurately reconstruct other important evolutionary steps across the Tree of Life, we suggest removing GC-rich and GC-heterogeneous genes from large phylogenomic data sets. Interpreted as a consequence of genome-wide variations in recombination rates, this GC effect can affect all taxa featuring GC-biased gene conversion, which is common in eukaryotes.
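The suggestion to drop GC-rich and GC-heterogeneous genes can be illustrated with a minimal filtering sketch. It assumes each gene is an alignment stored as a mapping from species to sequence, measures heterogeneity as the range of GC content across species, and uses two hypothetical thresholds. The abstract does not specify the authors' statistic or cut-offs, so the function names, the toy sequences, and the thresholds are illustrative only.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (gaps and N ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc_summary(alignment):
    """Mean GC content and its range across the species of one gene."""
    values = [gc_content(seq) for seq in alignment.values()]
    return sum(values) / len(values), max(values) - min(values)


def keep_gc_homogeneous(genes, max_mean_gc, max_gc_range):
    """Names of genes that are neither GC-rich nor GC-heterogeneous."""
    kept = []
    for name, alignment in genes.items():
        mean_gc, gc_range = gc_summary(alignment)
        if mean_gc <= max_mean_gc and gc_range <= max_gc_range:
            kept.append(name)
    return kept


# Toy alignments (hypothetical sequences, not real data).
genes = {
    "geneA": {"sp1": "ATATGCAT", "sp2": "ATATGCTT", "sp3": "ATTTGCAT"},
    "geneB": {"sp1": "GCGCGCAT", "sp2": "ATATATGC", "sp3": "GCGCGCGC"},
}
retained = keep_gc_homogeneous(genes, max_mean_gc=0.5, max_gc_range=0.1)
print(retained)
```

Here geneB is discarded on both criteria, since its mean GC content is high and its GC content differs strongly between species.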
Abstract:
Mammalian physiology and behavior follow daily rhythms that are orchestrated by endogenous timekeepers known as circadian clocks. Rhythms in transcription are considered the main mechanism to engender rhythmic gene expression, but important roles for posttranscriptional mechanisms have recently emerged as well (reviewed in Lim and Allada (2013) [1]). We have recently reported on the use of ribosome profiling (RPF-seq), a method based on the high-throughput sequencing of ribosome-protected mRNA fragments, to explore the temporal regulation of translation efficiency (Janich et al., 2015 [2]). Through the comparison of around-the-clock RPF-seq and matching RNA-seq data we were able to identify 150 genes, involved in ribosome biogenesis, iron metabolism and other pathways, whose rhythmicity is generated entirely at the level of protein synthesis. The temporal transcriptome and translatome data sets from this study have been deposited in NCBI's Gene Expression Omnibus under the accession number GSE67305. Here we provide additional information on the experimental setup and on important optimization steps pertaining to the ribosome profiling technique in mouse liver and to data analysis.
Abstract:
Aim: Emerging polyploids may depend on environmental niche shifts for successful establishment. Using the alpine plant Ranunculus kuepferi as a model system, we explore the niche shift hypothesis at different spatial resolutions and in contrasting parts of the species range. Location: European Alps. Methods: We sampled 12 individuals from each of 102 populations of R. kuepferi across the Alps, determined their ploidy levels, derived coarse-grain (100 × 100 m) environmental descriptors for all sampling sites by downscaling WorldClim maps, and calculated fine-scale environmental descriptors (2 × 2 m) from indicator values of the vegetation accompanying the sampled individuals. Both coarse and fine-scale variables were further computed for 8239 vegetation plots from across the Alps. Subsequently, we compared niche optima and breadths of diploid and tetraploid cytotypes by combining principal components analysis and kernel smoothing procedures. Comparisons were done separately for coarse and fine-grain data sets and for sympatric, allopatric and the total set of populations. Results: All comparisons indicate that the niches of the two cytotypes differ in optima and/or breadths, but results vary in important details. The whole-range analysis suggests differentiation along the temperature gradient to be most important. However, sympatric comparisons indicate that this climatic shift was not a direct response to competition with diploid ancestors. Moreover, fine-grained analyses demonstrate niche contraction of tetraploids, especially in the sympatric range, that goes undetected with coarse-grained data. Main conclusions: Although the niche optima of the two cytotypes differ, separation along ecological gradients was probably less decisive for polyploid establishment than a shift towards facultative apomixis, a particularly effective strategy to avoid minority cytotype exclusion.
In addition, our results suggest that coarse-grained analyses overestimate niche breadths of widely distributed taxa. Niche comparison analyses should hence be conducted at environmental data resolutions appropriate for the organism and question under study.
Abstract:
Longline fisheries, oil spills, and offshore wind farms are some of the major threats increasing seabird mortality at sea, but the impact of these threats on specific populations has been difficult to determine so far. We tested the use of molecular markers, morphometric measures, and stable isotope (δ15N and δ13C) and trace element concentrations in the first primary feather (grown at the end of the breeding period) to assign the geographic origin of Calonectris shearwaters. Overall, we sampled birds from three taxa: 13 Mediterranean Cory's Shearwater (Calonectris diomedea diomedea) breeding sites, 10 Atlantic Cory's Shearwater (Calonectris diomedea borealis) breeding sites, and one Cape Verde Shearwater (C. edwardsii) breeding site. Assignment rates were investigated at three spatial scales: breeding colony, breeding archipelago, and taxon levels. Genetic analyses based on the mitochondrial control region (198 birds from 21 breeding colonies) correctly assigned 100% of birds to the three main taxa but failed to detect geographic structuring at lower scales. Discriminant analyses based on trace element composition achieved the best rate of correct assignment to colony (77.5%). Body measurements or stable isotopes mainly succeeded in assigning individuals among taxa (87.9% and 89.9%, respectively) but failed at the colony level (27.1% and 38.0%, respectively). Combining all three approaches (morphometrics, isotopes, and trace elements on 186 birds from 15 breeding colonies) substantially improved correct classifications (86.0%, 90.7%, and 100% among colonies, archipelagos, and taxa, respectively). Validations using two independent data sets and jackknife cross-validation confirmed the robustness of the combined approach in the colony assignment (62.5%, 58.8%, and 69.8% for each validation test, respectively).
A preliminary application of the discriminant model based on stable isotope δ15N and δ13C values and trace elements (219 birds from 17 breeding sites) showed that 41 Cory's Shearwaters caught by western Mediterranean long-liners came mainly from breeding colonies in Menorca (48.8%), Ibiza (14.6%), and Crete (31.7%). Our findings show that combining analyses of trace elements and stable isotopes on feathers can achieve high rates of correct geographic assignment of birds in the marine environment, opening new prospects for the study of seabird mortality at sea.
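The jackknife (leave-one-out) validation of assignment rates mentioned in this abstract can be sketched as follows. The study used discriminant analyses. To keep the example self-contained, this sketch swaps in a much simpler nearest-centroid classifier, so it illustrates only the validation loop, not the authors' model. The feature values and colony labels are hypothetical.

```python
import math


def nearest_centroid(train, sample):
    """Assign a sample to the label whose feature centroid is closest."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = math.dist(centroid, sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_assignment_rate(data):
    """Leave each bird out in turn, classify it from the rest, score hits."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_centroid(train, features) == label:
            correct += 1
    return correct / len(data)


# Hypothetical (feature vector, colony) pairs, e.g. two isotope values.
birds = [
    ((1.0, 2.0), "colony_A"), ((1.2, 1.9), "colony_A"), ((0.9, 2.1), "colony_A"),
    ((5.0, 6.0), "colony_B"), ((5.1, 5.8), "colony_B"), ((4.9, 6.2), "colony_B"),
]
rate = jackknife_assignment_rate(birds)
print(rate)
```

Because every bird is classified by a model that never saw it, the resulting rate is a less optimistic estimate than the rate computed on the training birds themselves, which is consistent with the lower validation figures reported in the abstract.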
Abstract:
Coastal birds are an integral part of coastal ecosystems, which nowadays are subject to severe environmental pressures. Effective measures for the management and conservation of seabirds and their habitats call for insight into their population processes and the factors affecting their distribution and abundance. Central to national and international management and conservation measures is the availability of accurate data and information on bird populations, as well as on environmental trends and on measures taken to solve environmental problems. In this thesis I address different aspects of the occurrence, abundance, population trends and breeding success of waterbirds breeding on the Finnish coast of the Baltic Sea, and discuss the implications of the results for seabird monitoring, management and conservation. In addition, I assess the position and prospects of coastal bird monitoring data in the processing and dissemination of biodiversity data and information in accordance with the Convention on Biological Diversity (CBD) and other national and international commitments. I show that important factors for seabird habitat selection are island area and elevation, water depth, shore openness, and the composition of island cover habitats. Habitat preferences are species-specific, with certain similarities within species groups. The occurrence of the colonial Arctic Tern (Sterna paradisaea) is partly affected by different habitat characteristics than its abundance. Using long-term bird monitoring data, I show that eutrophication and winter severity have reduced the populations of several Finnish seabird species. A major demographic factor through which environmental changes influence bird populations is breeding success. Breeding success can function as a more rapid indicator of sublethal environmental impacts than population trends, particularly for long-lived and slow-breeding species, and should therefore be included in coastal bird monitoring schemes.
Among my target species, local breeding success can be shown to affect the populations of the Mallard (Anas platyrhynchos), the Eider (Somateria mollissima) and the Goosander (Mergus merganser) after a time lag corresponding to their species-specific recruitment age. For some of the target species, the number of individuals in late summer can be used as an easier and more cost-effective indicator of breeding success than brood counts. My results highlight that the interpretation and application of habitat and population studies require solid background knowledge of the ecology of the target species. In addition, the special characteristics of coastal birds, their habitats, and coastal bird monitoring data have to be considered in the assessment of their distribution and population trends. According to the results, the relationships between the occurrence, abundance and population trends of coastal birds and environmental factors can be quantitatively assessed using multivariate modelling and model selection. Spatial data sets widely available in Finland can be utilised in the calculation of several variables that are relevant to the habitat selection of Finnish coastal species. Concerning some habitat characteristics field work is still required, due to a lack of remotely sensed data or the low resolution of readily available data in relation to the fine scale of the habitat patches in the archipelago. While long-term data sets exist for water quality and weather, the lack of data concerning for instance the food resources of birds hampers more detailed studies of environmental effects on bird populations. Intensive studies of coastal bird species in different archipelago areas should be encouraged. The provision and free delivery of high-quality coastal data concerning bird populations and their habitats would greatly increase the capability of ecological modelling, as well as the management and conservation of coastal environments and communities. 
International initiatives that promote open spatial data infrastructures and sharing are therefore highly regarded. To function effectively, international information networks, such as the biodiversity Clearing House Mechanism (CHM) under the CBD, need to be rooted at regional and local levels. Attention should also be paid to the processing of data for higher levels of the information hierarchy, so that data are synthesized and developed into high-quality knowledge applicable to management and conservation.
Abstract:
Despite recent advances, early diagnosis of Alzheimer’s disease (AD) from electroencephalography (EEG) remains a difficult task. In this paper, we offer an added measure through which such early diagnoses can potentially be improved. One feature that has been used for discriminative classification is changes in EEG synchrony. So far, only the decrease of synchrony in the higher frequencies has been analyzed in depth. In this paper, we investigate the increase of synchrony found in narrow frequency ranges within the θ band. This particular increase of synchrony is used with the well-known decrease of synchrony in the band to enhance detectable differences between AD patients and healthy subjects. We propose a new synchrony ratio that maximizes the differences between two populations. The ratio is tested using two different data sets, one containing mild cognitive impairment patients and healthy subjects, and the other containing mild AD patients and healthy subjects. The results presented in this paper show that the classification rate is improved, and the statistical difference between AD patients and healthy subjects is increased, using the proposed ratio.
Abstract:
Objective. Recently, significant advances have been made in the early diagnosis of Alzheimer’s disease from EEG. However, choosing suitable measures is a challenging task. Among other measures, frequency Relative Power and loss of complexity have been used with promising results. In the present study we investigate the early diagnosis of AD using synchrony measures and frequency Relative Power on EEG signals, examining the changes found in different frequency ranges. Approach. We first explore the use of a single feature for computing the classification rate, looking for the best frequency range. Then, we present a multiple feature classification system that outperforms all previous results using a feature selection strategy. These two approaches are tested in two different databases, one containing MCI and healthy subjects (patients age: 71.9 ± 10.2; healthy subjects age: 71.7 ± 8.3), and the other containing Mild AD and healthy subjects (patients age: 77.6 ± 10.0; healthy subjects age: 69.4 ± 11.5). Main Results. Using a single feature to compute classification rates we achieve a performance of 78.33% for the MCI data set and of 97.56% for Mild AD. Results are clearly improved using the multiple feature classification, where a classification rate of 95% is found for the MCI data set using 11 features, and 100% for the Mild AD data set using 4 features. Significance. The new feature selection method described in this work may be a reliable tool that could help to design a realistic system that does not require prior knowledge of a patient's status. With that aim, we explore the standardization of features for MCI and Mild AD data sets with promising results.
Abstract:
The present study builds on a previous proposal for assigning probabilities to the outcomes computed using different primary indicators in single-case studies. These probabilities are obtained by comparing the outcome to previously tabulated reference values and reflect the likelihood of the results if there were no intervention effect. The current study explores how well different metrics are translated into p values in the context of simulation data. Furthermore, two published multiple baseline data sets are used to illustrate how well the probabilities could reflect the intervention effectiveness as assessed by the original authors. Finally, the importance of which primary indicator is used in each data set to be integrated is explored; two ways of combining probabilities are used: a weighted average and a binomial test. The results indicate that the translation into p values works well for the two nonoverlap procedures, with the results for the regression-based procedure diverging due to some undesirable features of its performance. These p values, both when taken individually and when combined, were well-aligned with the effectiveness for the real-life data. The results suggest that assigning probabilities can be useful for translating the primary measure into the same metric, using these probabilities as additional evidence on the importance of behavioral change, complementing visual analysis and professionals' judgments.
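The two ways of combining probabilities named in this abstract, a weighted average and a binomial test, can be sketched roughly as below. The abstract does not give the exact procedures, so this sketch makes three assumptions: the weights are supplied by the analyst (for example, series lengths), a p value counts as a "success" when it is at or below a threshold alpha, and the studies are independent. All names and numbers are hypothetical.

```python
from math import comb


def weighted_average_p(p_values, weights):
    """Weighted mean of the individual p values."""
    return sum(p * w for p, w in zip(p_values, weights)) / sum(weights)


def binomial_combination(p_values, alpha=0.05):
    """Probability of at least k 'significant' results out of n by chance.

    k is the number of p values at or below alpha. With no intervention
    effect, each study is assumed to reach that threshold independently
    with probability alpha, so the count follows Binomial(n, alpha).
    """
    n = len(p_values)
    k = sum(p <= alpha for p in p_values)
    return sum(
        comb(n, j) * alpha ** j * (1 - alpha) ** (n - j)
        for j in range(k, n + 1)
    )


# Hypothetical p values from three data sets and hypothetical weights.
example_p = [0.01, 0.02, 0.50]
example_weights = [20, 15, 10]
print(weighted_average_p(example_p, example_weights))
print(binomial_combination(example_p))
```

The weighted average stays on the scale of the individual p values, whereas the binomial test asks only how surprising the number of small p values is, so the two can lead to different conclusions for the same set of studies.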