882 results for large scale data gathering


Abstract:

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200 kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX, was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
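
The contrast between high sensitivity and falling overall accuracy can be illustrated with a small sketch. It assumes the convention common in gene-finding evaluations, where specificity is TP / (TP + FP); the abstract does not state which definition was used, and the function name and toy coordinates are hypothetical.

```python
def nucleotide_accuracy(true_coding, predicted_coding):
    """Return (sensitivity, specificity) at the nucleotide level.

    Both arguments are sets of coding positions. Specificity follows the
    assumed gene-prediction convention TP / (TP + FP), not TN / (TN + FP).
    """
    tp = len(true_coding & predicted_coding)   # coding and predicted coding
    fn = len(true_coding - predicted_coding)   # coding but missed
    fp = len(predicted_coding - true_coding)   # predicted coding but not coding
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity


# Hypothetical toy case: one real exon at positions 100-199.
true_exon = set(range(100, 200))
# The prediction recovers the exon but also calls a false exon
# at positions 500-599, inside intergenic sequence.
prediction = set(range(100, 200)) | set(range(500, 600))
sn, sp = nucleotide_accuracy(true_exon, prediction)
```

In this toy case every real coding position is found, so sensitivity stays at its maximum, while the false exon in the intergenic region halves the specificity.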

Abstract:

A large proportion of the death toll associated with malaria is a consequence of malaria infection during pregnancy, causing up to 200,000 infant deaths annually. We previously published the first extensive genetic association study of placental malaria infection, and here we extend this analysis considerably, investigating genetic variation in over 9,000 SNPs in more than 1,000 genes involved in immunity and inflammation for their involvement in susceptibility to placental malaria infection. We applied a new approach incorporating results from both single-gene analysis and gene-gene interactions on a protein-protein interaction network. We found suggestive associations of variants in the gene KLRK1 in the single-gene analysis, as well as evidence for associations of multiple members of the IL-7/IL-7R signalling cascade in the combined analysis. To our knowledge, this is the first large-scale genetic study on placental malaria infection to date, opening the door for follow-up studies trying to elucidate the genetic basis of this neglected form of malaria.

Abstract:

Risks of significant infant drug exposure through human milk are poorly defined due to lack of large-scale PK data. We propose to use a Bayesian approach based on population PK (popPK)-guided modeling and simulation for risk prediction. As a proof-of-principle study, we exploited fluoxetine milk concentration data from 25 women. popPK parameters including milk-to-plasma ratio (MP ratio) were estimated from the best model. The dose of fluoxetine the breastfed infant would receive through mother's milk, and infant plasma concentrations, were estimated from 1000 simulated mother-infant pairs, using random assignment of feeding times and milk volume. A conservative estimate of CYP2D6 activity of 20% of the allometrically-adjusted adult value was assumed. Derived model parameters, including MP ratio, were consistent with those reported in the literature. Visual predictive check and other model diagnostics showed no signs of model misspecifications. The model simulation predicted that infant exposure levels to fluoxetine via mother's milk were below 10% of weight-adjusted maternal therapeutic doses in >99% of simulated infants. Predicted median ratio of infant-mother serum levels at steady state was 0.093 (range 0.033-0.31), consistent with literature-reported values (mean=0.07; range 0-0.59). Predicted incidence of relatively high infant-mother ratio (>0.2) of steady-state serum fluoxetine concentrations was <1.3%. Overall, our predictions are consistent with clinical observations. Our approach may be valid for other drugs, allowing in silico prediction of infant drug exposure risks through human milk. We will discuss application of this approach to another drug used in lactating women.
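
The simulation step can be sketched as a simple Monte Carlo loop over mother-infant pairs. This is only an illustration of the general idea, not the authors' popPK model: it omits infant clearance and feeding times, and every distribution and numeric value in it is a hypothetical placeholder rather than an estimate from the study.

```python
import math
import random


def simulate_relative_infant_dose(n_pairs=1000, seed=0):
    """Return one relative infant dose (fraction of the weight-adjusted
    maternal dose) per simulated mother-infant pair.

    Every numeric value below is a hypothetical placeholder, not a
    parameter estimate from the study.
    """
    rng = random.Random(seed)
    maternal_dose_mg_per_kg_day = 20.0 / 60.0  # placeholder: 20 mg/day, 60 kg
    ratios = []
    for _ in range(n_pairs):
        # Placeholder log-normal variability in maternal plasma level (mg/L).
        plasma_mg_per_l = rng.lognormvariate(math.log(0.1), 0.4)
        # Placeholder log-normal milk-to-plasma (MP) ratio.
        mp_ratio = rng.lognormvariate(math.log(0.6), 0.3)
        # Placeholder daily milk intake (L per kg of infant body weight).
        milk_l_per_kg_day = rng.uniform(0.10, 0.20)
        infant_dose_mg_per_kg_day = plasma_mg_per_l * mp_ratio * milk_l_per_kg_day
        ratios.append(infant_dose_mg_per_kg_day / maternal_dose_mg_per_kg_day)
    return ratios


ratios = simulate_relative_infant_dose()
# Share of simulated infants below 10% of the weight-adjusted maternal dose.
fraction_below_10_percent = sum(r < 0.10 for r in ratios) / len(ratios)
```

Fixing the seed makes the run reproducible. In a real analysis the placeholder distributions would be replaced by the fitted population parameters and their variability.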

Abstract:

ABSTRACT: BACKGROUND: Local adaptation can drive the divergence of populations but identification of the traits under selection remains a major challenge in evolutionary biology. Reciprocal transplant experiments are ideal tests of local adaptation, yet rarely used for higher vertebrates because of the mobility and potential invasiveness of non-native organisms. Here, we reciprocally transplanted 2500 brown trout (Salmo trutta) embryos from five populations to investigate local adaptation in early life history traits. Embryos were bred in a full-factorial design and raised in natural riverbeds until emergence. Customized egg capsules were used to simulate the natural redd environment and allowed tracking the fate of every individual until retrieval. We predicted that 1) within sites, native populations would outperform non-natives, and 2) across sites, populations would show higher performance at 'home' compared to 'away' sites. RESULTS: There was no evidence for local adaptation but we found large differences in survival and hatching rates between sites, indicative of considerable variation in habitat quality. Survival was generally high across all populations (55% +/- 3%), but ranged from 4% to 89% between sites. Average hatching rate was 25% +/- 3% across populations ranging from 0% to 62% between sites. CONCLUSION: This study provides rare empirical data on variation in early life history traits in a population network of a salmonid, and large-scale breeding and transplantation experiments like ours provide powerful tests for local adaptation. Despite the recently reported genetic and morphological differences between the populations in our study area, local adaptation at the embryo level is small, non-existent, or confined to ecological conditions that our experiment could not capture.

Abstract:

This paper deals with the problem of spatial data mapping. A new method based on wavelet interpolation and geostatistical prediction (kriging) is proposed. The method, wavelet analysis residual kriging (WARK), is developed to address the problems arising for highly variable data in the presence of spatial trends. In these cases stationary prediction models have very limited application. Wavelet analysis is used to model large-scale structures, and kriging of the remaining residuals focuses on small-scale peculiarities. WARK is able to model spatial patterns that feature multiscale structure. In the present work WARK is applied to rainfall data, and the validation results are compared with those obtained from neural network residual kriging (NNRK). NNRK is also a residual-based method, which uses an artificial neural network to model large-scale non-linear trends. The comparison of the results demonstrates the high-quality performance of WARK in predicting hot spots and in reproducing the global statistical characteristics of the distribution and the spatial correlation structure.

Abstract:

The objective of my thesis was to find out how mobile TV service will influence the TV consumption behaviour of Finns. In particular, the study focuses on the consumption behaviour of well-educated young urban people. For my thesis, I provided a detailed analysis of the results of the large-scale questionnaire study FinPilot from 2005, based on an assignment from Nokia Ltd. To deepen the study results, I focused on the above-mentioned group of young people with good education. The goal of the FinPilot research was to answer the following questions: what kinds of programs are watched on the mobile television service, in what circumstances, and for which reasons. The results of the research consisted mainly of data such as figures and graphics. The data were explained from a "helicopter" perspective, which added value to the research and consequently to my own thesis. My study offered complementary, unique information about the group's needs, as it was based on questionnaires supplemented by individual interviews of the group members, their free comments, and group discussions. The study results showed that mobile TV service did not increase total TV consumption time. The time spent watching mobile TV was significantly shorter than the time spent watching traditional TV. According to my study, young urban people with good education are more interested in adopting the mobile TV service than the average Finn. Being eager to use the added value offered by mobile TV, they are a potential target group in launching and marketing processes. On the basis of the outcome of the thesis, the future of mobile TV service seems very promising. The content and the pricing, however, have to match the users' needs and expectations. All the study results indicate that there is a social demand for mobile TV service.

Abstract:

The efficient removal of an N- or C-terminal purification tag from a fusion protein is necessary to obtain a protein in a pure and active form, ready for use in human or animal medicine. Current techniques based on enzymatic cleavage are expensive and result in the presence of additional amino acids at either end of the proteins, as well as contaminating proteases in the preparation. Here we evaluate an alternative method to the one-step affinity/protease purification process for large-scale purification. It is based upon cyanogen bromide (CNBr) cleavage at a single methionine placed between a histidine tag and a Plasmodium falciparum antigen. The C-terminal segment of the circumsporozoite polypeptide was expressed as a fusion protein with a histidine tag in Escherichia coli, purified by Ni-NTA agarose column chromatography, and subsequently cleaved by CNBr to obtain a polypeptide without any extraneous amino acids derived from the cleavage site or from the affinity purification tag. Thus, a recombinant protein is produced without the need for further purification, demonstrating that CNBr cleavage is a precise, efficient, and low-cost alternative to enzymatic digestion, and can be applied to large-scale preparations of recombinant proteins.

Abstract:

Proteomics has come a long way from the initial qualitative analysis of proteins present in a given sample at a given time ("cataloguing") to large-scale characterization of proteomes, their interactions and dynamic behavior. Originally enabled by breakthroughs in protein separation and visualization (by two-dimensional gels) and protein identification (by mass spectrometry), the discipline now encompasses a large body of protein and peptide separation, labeling, detection and sequencing tools supported by computational data processing. The decisive mass spectrometric developments and most recent instrumentation news are briefly mentioned accompanied by a short review of gel and chromatographic techniques for protein/peptide separation, depletion and enrichment. Special emphasis is placed on quantification techniques: gel-based, and label-free techniques are briefly discussed whereas stable-isotope coding and internal peptide standards are extensively reviewed. Another special chapter is dedicated to software and computing tools for proteomic data processing and validation. A short assessment of the status quo and recommendations for future developments round up this journey through quantitative proteomics.

Abstract:

Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available.
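
The idea of building a gene-level test from single-variant results can be sketched for the simplest case, a weighted burden test. The sketch below relies on a common large-sample approximation that is an assumption here, not a detail given in the abstract: the score statistic of variant j is taken as beta_j / se_j^2 and its variance as 1 / se_j^2. The function name and inputs are hypothetical.

```python
import math


def burden_z_from_summary(betas, ses, corr, weights=None):
    """Approximate burden-test Z statistic from single-variant summaries.

    betas, ses: per-variant effect estimates and standard errors.
    corr: correlation matrix of the single-variant test statistics,
          e.g. estimated from one study or a reference panel.
    Assumes score U_j ~ beta_j / se_j**2 with Var(U_j) ~ 1 / se_j**2.
    """
    m = len(betas)
    if weights is None:
        weights = [1.0] * m
    # Recover approximate score statistics from the summary results.
    scores = [b / s ** 2 for b, s in zip(betas, ses)]
    numerator = sum(w * u for w, u in zip(weights, scores))
    # Cov(U_j, U_k) ~ corr[j][k] / (se_j * se_k).
    variance = sum(
        weights[j] * weights[k] * corr[j][k] / (ses[j] * ses[k])
        for j in range(m)
        for k in range(m)
    )
    return numerator / math.sqrt(variance)


# Two hypothetical variants with identical summaries, first independent,
# then perfectly correlated.
z_independent = burden_z_from_summary([0.5, 0.5], [0.25, 0.25], [[1, 0], [0, 1]])
z_correlated = burden_z_from_summary([0.5, 0.5], [0.25, 0.25], [[1, 1], [1, 1]])
```

With independent variants the combined evidence grows by a factor of sqrt(2) over a single variant, while perfectly correlated variants add no information beyond one of them. This is why the correlation matrix is needed alongside the single-variant statistics.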

Abstract:

The vast territories that were radioactively contaminated during the 1986 Chernobyl accident provide a substantial set of radioactive monitoring data, which can be used for the verification and testing of the different spatial estimation (prediction) methods involved in risk assessment studies. Using the Chernobyl data set for such a purpose is motivated by its heterogeneous spatial structure (the data are characterized by large-scale correlations, short-scale variability, spotty features, etc.). The present work is concerned with the application of the Bayesian Maximum Entropy (BME) method to estimate the extent and the magnitude of the radioactive soil contamination by 137Cs due to the Chernobyl fallout. The powerful BME method allows rigorous incorporation of a wide variety of knowledge bases into the spatial estimation procedure, leading to informative contamination maps. Exact measurements ("hard" data) are combined with secondary information on local uncertainties (treated as "soft" data) to generate science-based uncertainty assessment of soil contamination estimates at unsampled locations. BME describes uncertainty in terms of the posterior probability distributions generated across space, whereas no assumption about the underlying distribution is made and non-linear estimators are automatically incorporated. Traditional estimation variances based on the assumption of an underlying Gaussian distribution (analogous, e.g., to the kriging variance) can be derived as a special case of the BME uncertainty analysis. The BME estimates obtained using hard and soft data are compared with the BME estimates obtained using only hard data. The comparison involves both the accuracy of the estimation maps using the exact data and the assessment of the associated uncertainty using repeated measurements. Furthermore, a comparison of the spatial estimation accuracy obtained by the two methods was carried out using a validation data set of hard data. Finally, a separate uncertainty analysis was conducted that evaluated the ability of the posterior probabilities to reproduce the distribution of the raw repeated measurements available in certain populated sites. The analysis provides an illustration of the improvement in mapping accuracy obtained by adding soft data to the existing hard data and, in general, demonstrates that the BME method performs well in terms of both estimation accuracy and estimation error assessment, which are both useful features for the Chernobyl fallout study.

Abstract:

Abiotic factors such as climate and soil determine the species' fundamental niche, which is further constrained by biotic interactions such as interspecific competition. To parameterize this realized niche, species distribution models (SDMs) most often relate species occurrence data to abiotic variables, but few SDM studies include biotic predictors to help explain species distributions. Therefore, most predictions of species distributions under future climates assume implicitly that biotic interactions remain constant or exert only minor influence on large-scale spatial distributions, which is also largely expected for species with high competitive ability. We examined the extent to which variance explained by SDMs can be attributed to abiotic or biotic predictors and how this depends on species traits. We fit generalized linear models for 11 common tree species in Switzerland using three different sets of predictor variables: biotic, abiotic, and the combination of both sets. We used variance partitioning to estimate the proportion of the variance explained by biotic and abiotic predictors, jointly and independently. Inclusion of biotic predictors improved the SDMs substantially. The joint contribution of biotic and abiotic predictors to explained deviance was relatively small (~9%) compared to the contribution of each predictor set individually (~20% each), indicating that the additional information on the realized niche brought by adding other species as predictors was largely independent of the abiotic (topo-climatic) predictors. The influence of biotic predictors was relatively high for species preferably growing under low disturbance and low abiotic stress, species with long seed dispersal distances, species with high shade tolerance as juveniles and adults, and species that occur frequently and are dominant across the landscape. The influence of biotic variables on SDM performance indicates that community composition and other local biotic factors or abiotic processes not included in the abiotic predictors strongly influence prediction of species distributions. Improved prediction of species' potential distributions in future climates and communities may assist strategies for sustainable forest management.
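
The variance partitioning step can be sketched with the usual two-set decomposition, computed from the explained deviance of the abiotic-only, biotic-only, and combined models. The function name is hypothetical, and the inputs are illustrative percentages chosen so that the shares resemble those quoted in the abstract; they are not values taken from the study.

```python
def partition_deviance(d_abiotic, d_biotic, d_combined):
    """Split explained deviance into independent and joint components.

    d_abiotic, d_biotic: deviance explained by each predictor set alone.
    d_combined: deviance explained by the model using both sets.
    """
    return {
        # What abiotic predictors add once biotic ones are in the model.
        "abiotic_only": d_combined - d_biotic,
        # What biotic predictors add once abiotic ones are in the model.
        "biotic_only": d_combined - d_abiotic,
        # Deviance that either set could explain (shared information).
        "joint": d_abiotic + d_biotic - d_combined,
    }


# Hypothetical explained deviance in percent for one species.
parts = partition_deviance(d_abiotic=29, d_biotic=29, d_combined=49)
```

The three components sum to the deviance explained by the combined model. A small joint share, as in this toy case, means the two predictor sets carry largely independent information.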

Abstract:

Summary. Since the beginning of the 20th century, the world population has been confronted with human immunodeficiency virus 1 (HIV-1). This virus mutates particularly fast and can thus evade and adapt to the human host. Our closest evolutionary relatives, the non-human primates, are less susceptible to HIV-1. In a broader sense, primates are differentially susceptible to various retroviruses. Species specificity may be due to genetic differences among primates. In the present study we applied evolutionary and comparative genetic techniques to characterize the evolutionary pattern of host cellular determinants of HIV-1 pathogenesis. Studying the evolution of genes coding for proteins that participate in the restriction or pathogenesis of HIV-1 may help in understanding the genetic basis of modern human susceptibility to infection. To perform comparative genetic analyses, we assembled a collection of primate DNA and RNA to allow generation of de novo sequences of gene orthologs. More recently, the release to the public domain of two new complete primate genomes (Bornean orang-utan and common marmoset), in addition to the three previously available genomes (human, chimpanzee and rhesus monkey), helped scale up the evolutionary and comparative genome analysis. Sequence analysis used phylogenetic and statistical methods for detecting molecular adaptation. We identified different selective pressures acting on host proteins involved in HIV-1 pathogenesis. Proteins with HIV-1 restriction properties in non-human primates were under strong positive selection, in particular in regions of interaction with viral proteins. These regions carried key residues for the antiviral activity. Proteins of the innate immunity presented an evolutionary pattern of conservation (purifying selection), but with signals of relaxed constraint when compared with the average profile of purifying selection in primate genomes. Large-scale analysis revealed patterns of evolutionary pressure according to molecular function, biological process and cellular distribution. The data generated by the various analyses served to guide the ancestral reconstruction of TRIM5α, a potent antiviral host factor. The resurrected TRIM5α from the common ancestor of Old World monkeys was effective against HIV-1, and the more recent resurrected hominoid variants were more effective against other retroviruses. Thus, as a result of trade-offs in the ability to restrict different retroviruses, humans might have been exposed to HIV-1 at a time when TRIM5α lacked the appropriate specific restriction activity. The application of evolutionary and comparative genetic tools should be considered for the systematic assessment of host proteins relevant to viral pathogenesis, and to guide biological and functional studies.

Abstract:

Contingent sovereign debt can create important welfare gains. Nonetheless, there is almost no issuance today. Using hand-collected archival data, we examine the first known case of large-scale use of state-contingent sovereign debt in history. Philip II of Spain entered into hundreds of contracts whose value and due date depended on verifiable, exogenous events such as the arrival of silver fleets. We show that this allowed for effective risk-sharing between the king and his bankers. The data also strongly suggest that the defaults that occurred were excusable: they were simply contingencies over which Crown and bankers had not contracted previously.

Abstract:

BACKGROUND: Human immunodeficiency virus (HIV) takes advantage of multiple host proteins to support its own replication. The gene ZNRD1 (zinc ribbon domain-containing 1) has been identified as encoding a potential host factor that influenced disease progression in HIV-positive individuals in a genomewide association study and also significantly affected HIV replication in a large-scale in vitro short interfering RNA (siRNA) screen. Genes and polymorphisms identified by large-scale analysis need to be followed up by means of functional assays and resequencing efforts to more precisely map causal genes. METHODS: Genotyping and ZNRD1 gene resequencing for 208 HIV-positive subjects (119 who experienced long-term nonprogression [LTNP] and 89 who experienced normal disease progression) was done by either TaqMan genotyping assays or direct sequencing. Genetic association analysis was performed with the SNPassoc package and Haploview software. siRNA and short hairpin RNA (shRNA) specifically targeting ZNRD1 were used to transiently or stably down-regulate ZNRD1 expression in both lymphoid and nonlymphoid cells. Cells were infected with X4 and R5 HIV strains, and efficiency of infection was assessed by reporter gene assay or p24 assay. RESULTS: Genetic association analysis found a strong statistically significant correlation with the LTNP phenotype (single-nucleotide polymorphism rs1048412; [Formula: see text]), independently of HLA-A10 influence. siRNA-based functional analysis showed that ZNRD1 down-regulation by siRNA or shRNA impaired HIV-1 replication at the transcription level in both lymphoid and nonlymphoid cells. CONCLUSION: Genetic association analysis unequivocally identified ZNRD1 as an independent marker of LTNP to AIDS. Moreover, in vitro experiments pointed to viral transcription as the inhibited step. Thus, our data strongly suggest that ZNRD1 is a host cellular factor that influences HIV-1 replication and disease progression in HIV-positive individuals.

Abstract:

Correspondence analysis has found extensive use in ecology, archeology, linguistics and the social sciences as a method for visualizing the patterns of association in a table of frequencies or nonnegative ratio-scale data. Inherent to the method is the expression of the data in each row or each column relative to their respective totals, and it is these sets of relative values (called profiles) that are visualized. This relativization of the data makes perfect sense when the margins of the table represent samples from sub-populations of inherently different sizes. But in some ecological applications sampling is performed on equal areas or equal volumes so that the absolute levels of the observed occurrences may be of relevance, in which case relativization may not be required. In this paper we define the correspondence analysis of the raw unrelativized data and discuss its properties, comparing this new method to regular correspondence analysis and to a related variant of non-symmetric correspondence analysis.
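
The relativization described above, dividing each row by its total to obtain a profile, can be sketched in a few lines. The function name and the toy abundance table are hypothetical and only illustrate what information profiles discard.

```python
def row_profiles(table):
    """Return each row of a nonnegative table divided by its row total."""
    profiles = []
    for row in table:
        total = sum(row)
        # An all-zero row has no defined profile; return zeros for it.
        profiles.append([x / total for x in row] if total else [0.0] * len(row))
    return profiles


# Hypothetical species counts at two sites sampled on equal areas:
# the first site holds ten times as many individuals as the second.
counts = [[10, 30, 60], [1, 3, 6]]
profiles = row_profiles(counts)
```

Both sites end up with the same profile, so regular correspondence analysis would treat them as identical even though their absolute abundances differ tenfold. That difference is exactly what an analysis of the raw, unrelativized data would retain.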