Biblioteca Digital

971 results for ALS data-set

Docking to heme proteins.

Abstract:

In silico screening has become a valuable tool in drug design, but some drug targets represent real challenges for docking algorithms. This is especially true for metalloproteins, whose interactions with ligands are difficult to parametrize. Our docking algorithm, EADock, is based on the CHARMM force field, which ensures a physically sound scoring function and good transferability to a wide range of systems, but also exhibits difficulties in the case of some metalloproteins. Here, we consider the therapeutically important case of heme proteins featuring an iron core at the active site. Using a standard docking protocol, where the iron-ligand interaction is underestimated, we obtained a success rate of 28% for a test set of 50 heme-containing complexes with iron-ligand contact. By introducing Morse-like metal binding potentials (MMBP), which are fitted to reproduce density functional theory calculations, we are able to increase the success rate to 62%. The remaining failures are mainly due to specific ligand-water interactions in the X-ray structures. Testing of the MMBP on a second data set of non-iron binders (14 cases) demonstrates that they do not introduce a spurious bias towards metal binding, which suggests that they may also be used reliably for cross-docking studies.
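
The abstract does not give the functional form of the MMBP. As a rough illustration of what a Morse-type potential looks like, the sketch below evaluates the standard Morse form, shifted so that the energy is zero at large separation. The function name and all parameter values (well depth, width, equilibrium distance) are hypothetical placeholders, not the fitted values used with EADock or CHARMM.

```python
import math


def morse_energy(r, well_depth, width, r_eq):
    """Standard Morse potential, shifted so that E(r_eq) = -well_depth
    and E approaches 0 as r grows large."""
    x = math.exp(-width * (r - r_eq))
    return well_depth * (x * x - 2.0 * x)


# Hypothetical parameters, chosen only to make the curve easy to read.
WELL_DEPTH = 10.0  # kcal/mol (assumed)
WIDTH = 2.0        # 1/Angstrom (assumed)
R_EQ = 2.0         # Angstrom (assumed)

for r in (1.8, 2.0, 2.5, 4.0):
    energy = morse_energy(r, WELL_DEPTH, WIDTH, R_EQ)
    print(f"r = {r:.1f} A   E = {energy:8.3f} kcal/mol")
```

The short-range repulsion, the single minimum at the equilibrium distance, and the smooth decay to zero are the features that make this family of functions a natural candidate for describing a metal-ligand contact.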

An assessment of gene prediction accuracy in large DNA sequences

Abstract:

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. 
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
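
The abstract does not define its accuracy measures. A minimal sketch, assuming the nucleotide-level convention common in the gene-prediction literature (sensitivity = TP / (TP + FN); "specificity" = TP / (TP + FP), i.e. the fraction of predicted coding nucleotides that are truly coding), shows how a spurious exon predicted in an intergenic region lowers specificity while leaving sensitivity untouched. The function names and the toy coordinates are illustrative only.

```python
def coding_positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(annotated_exons, predicted_exons):
    """Return (sensitivity, specificity) at the nucleotide level."""
    actual = coding_positions(annotated_exons)
    predicted = coding_positions(predicted_exons)
    tp = len(actual & predicted)   # coding and predicted coding
    fn = len(actual - predicted)   # coding but missed
    fp = len(predicted - actual)   # predicted coding but not coding
    sensitivity = tp / (tp + fn) if actual else 0.0
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity


# Toy case: both real exons are found, plus one false exon in intergenic sequence.
annotated = [(101, 200), (301, 400)]
predicted = [(101, 200), (301, 400), (601, 700)]
sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"sensitivity = {sn:.3f}, specificity = {sp:.3f}")
```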

Monitoring of illicit pill distribution networks using an image collection exploration framework

Abstract:

This paper proposes a novel approach for the analysis of illicit tablets based on their visual characteristics. In particular, the paper concentrates on the problem of ecstasy pill seizure profiling and monitoring. The presented method extracts the visual information from pill images and builds a representation of it, i.e. it builds a pill profile based on the pill's visual appearance. Different visual features are used to build different image similarity measures, which are the basis for a pill monitoring strategy based on both discriminative and clustering models. The discriminative model makes it possible to infer whether two pills come from the same seizure, while the clustering model groups pills that share similar visual characteristics. The resulting clustering structure allows visual identification of the relationships between different seizures. The proposed approach was evaluated using a data set of 621 ecstasy pill pictures. The results demonstrate that this is a feasible and cost-effective method for performing pill profiling and monitoring.

GENCODE: the reference human genome annotation for The ENCODE Project.

Abstract:

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.

Predicting clinical scores from magnetic resonance scans in Alzheimer's disease.

Abstract:

Machine learning and pattern recognition methods have been used to diagnose Alzheimer's disease (AD) and mild cognitive impairment (MCI) from individual MRI scans. Another application of such methods is to predict clinical scores from individual scans. Using relevance vector regression (RVR), we predicted individuals' performances on established tests from their T1-weighted MRI image in two independent data sets. From Mayo Clinic, 73 probable AD patients and 91 cognitively normal (CN) controls completed the Mini-Mental State Examination (MMSE), Dementia Rating Scale (DRS), and Auditory Verbal Learning Test (AVLT) within 3 months of their scan. Baseline MRIs from the Alzheimer's Disease Neuroimaging Initiative (ADNI) comprised the other data set; 113 AD, 351 MCI, and 122 CN subjects completed the MMSE and Alzheimer's Disease Assessment Scale-Cognitive subtest (ADAS-cog) and 39 AD, 92 MCI, and 32 CN ADNI subjects completed MMSE, ADAS-cog, and AVLT. Predicted and actual clinical scores were highly correlated for the MMSE, DRS, and ADAS-cog tests (P<0.0001). Training with one data set and testing with another demonstrated stability between data sets. DRS, MMSE, and ADAS-cog correlated better than AVLT with whole brain grey matter changes associated with AD. This result underscores their utility for screening and tracking disease. RVR offers a novel way to measure interactions between structural changes and neuropsychological tests beyond that of univariate methods. In clinical practice, we envision using RVR to aid in diagnosis and predict clinical outcome.

PACIC Instrument: disentangling dimensions using published validation models.

Abstract:

OBJECTIVE: To better understand the structure of the Patient Assessment of Chronic Illness Care (PACIC) instrument. More specifically, to test all published validation models, using one single data set and appropriate statistical tools. DESIGN: Validation study using data from a cross-sectional survey. PARTICIPANTS: A population-based sample of non-institutionalized adults with diabetes residing in Switzerland (canton of Vaud). MAIN OUTCOME MEASURE: French version of the 20-item PACIC instrument (5-point response scale). We conducted validation analyses using confirmatory factor analysis (CFA). The original five-dimension model and other published models were tested with three types of CFA: based on (i) a Pearson estimator of the variance-covariance matrix, (ii) a polychoric correlation matrix and (iii) a likelihood estimation with a multinomial distribution for the manifest variables. All models were assessed using loadings and goodness-of-fit measures. RESULTS: The analytical sample included 406 patients. Mean age was 64.4 years and 59% were men. The median of item responses varied between 1 and 4 (range 1-5), and the proportion of missing values ranged between 5.7 and 12.3%. Strong floor and ceiling effects were present. Even though loadings of the tested models were relatively high, the only model showing acceptable fit was the 11-item single-dimension model. PACIC was associated with the expected variables of the field. CONCLUSIONS: Our results showed that the model considering 11 items in a single dimension exhibited the best fit for our data. A single score, in complement to the consideration of single-item results, might be used instead of the five dimensions usually described.

Demographic effects of extreme winter weather in the barn owl.

Abstract:

Extreme weather events can lead to immediate catastrophic mortality. Due to their rare occurrence, however, the long-term impacts of such events for ecological processes are unclear. We examined the effect of extreme winters on barn owl (Tyto alba) survival and reproduction in Switzerland over a 68-year period (approximately 20 generations). This long-term data set allowed us to compare events that occurred only once in several decades to more frequent events. Winter harshness explained 17 and 49% of the variance in juvenile and adult survival, respectively, and the two harshest winters were associated with major population crashes caused by simultaneous low juvenile and adult survival. These two winters increased the correlation between juvenile and adult survival from 0.63 to 0.69. Overall, survival decreased non-linearly with increasing winter harshness in adults, and linearly in juveniles. In contrast, brood size was not related to the harshness of the preceding winter. Our results thus reveal complex interactions between climate and demography. The relationship between weather and survival observed during regular years is likely to underestimate the importance of climate variation for population dynamics.

BME-based uncertainty assessment of the Chernobyl fallout

Abstract:

The vast territories that were radioactively contaminated during the 1986 Chernobyl accident provide a substantial data set of radioactive monitoring data, which can be used for the verification and testing of the different spatial estimation (prediction) methods involved in risk assessment studies. Using the Chernobyl data set for such a purpose is motivated by its heterogeneous spatial structure (the data are characterized by large-scale correlations, short-scale variability, spotty features, etc.). The present work is concerned with the application of the Bayesian Maximum Entropy (BME) method to estimate the extent and the magnitude of the radioactive soil contamination by 137Cs due to the Chernobyl fallout. The powerful BME method allows rigorous incorporation of a wide variety of knowledge bases into the spatial estimation procedure leading to informative contamination maps. Exact measurements (“hard” data) are combined with secondary information on local uncertainties (treated as “soft” data) to generate science-based uncertainty assessment of soil contamination estimates at unsampled locations. BME describes uncertainty in terms of the posterior probability distributions generated across space, whereas no assumption about the underlying distribution is made and non-linear estimators are automatically incorporated. Traditional estimation variances based on the assumption of an underlying Gaussian distribution (analogous, e.g., to the kriging variance) can be derived as a special case of the BME uncertainty analysis. The BME estimates obtained using hard and soft data are compared with the BME estimates obtained using only hard data. The comparison involves both the accuracy of the estimation maps using the exact data and the assessment of the associated uncertainty using repeated measurements. Furthermore, a comparison of the spatial estimation accuracy obtained by the two methods was carried out using a validation data set of hard data.
Finally, a separate uncertainty analysis was conducted that evaluated the ability of the posterior probabilities to reproduce the distribution of the raw repeated measurements available in certain populated sites. The analysis provides an illustration of the improvement in mapping accuracy obtained by adding soft data to the existing hard data and, in general, demonstrates that the BME method performs well in terms of both estimation accuracy and estimation error assessment, which are both useful features for the Chernobyl fallout study.

Phylogenetic structures of the Holarctic Sorex araneus group and its relationships with S. samniticus, as inferred from mtDNA sequences

Abstract:

The shrews of the Sorex araneus group, characterized by the sexual chromosome complex XY1Y2, have been intensively studied by morphological, karyotypical, and biochemical analyses. Nevertheless, the phylogenetic relationships among the species belonging to the araneus complex are still under debate, as different approaches have often given contradictory results. In this paper, partial nucleotide sequences of the mitochondrial DNA cytochrome b gene (1011 bp) were determined for 6 species of the araneus group from Eurasia and North America. We also included in the data set the sequences of Sorex samniticus, whose relationships with the araneus group remain controversial. Three other species representing two major karyological groups were also examined. Both parsimony and distance trees strongly support the monophyly of the araneus group. Sorex samniticus is significantly more closely related to the araneus complex than to the other species included in the analysis. Based on the branching pattern within the araneus group, an attempt has been made to reconstruct the colonization history of the Holarctic region.

The Effect of Four-Lane to Three-Lane Conversion on the Number of Crashes and Crash Rates in Iowa Roads, June 2005

Abstract:

We analyze crash data collected by the Iowa Department of Transportation using Bayesian methods. The data set includes monthly crash numbers, estimated monthly traffic volumes, site length and other information collected at 30 paired sites in Iowa over more than 20 years during which an intervention experiment was set up. The intervention consisted of converting 15 undivided road segments from four lanes to three lanes, while an additional 15 segments, thought to be comparable in terms of traffic safety-related characteristics, were not converted. The main objective of this work is to find out whether the intervention reduces the number of crashes and the crash rates at the treated sites. We fitted a hierarchical Poisson regression model with a change-point to the number of monthly crashes per mile at each of the sites. Explanatory variables in the model included estimated monthly traffic volume, time, an indicator for intervention reflecting whether the site was a “treatment” or a “control” site, and various interactions. We accounted for seasonal effects in the number of crashes at a site by including smooth trigonometric functions with three different periods to reflect the four seasons of the year. A change-point at the month and year in which the intervention was completed for treated sites was also included. The number of crashes at a site can be thought to follow a Poisson distribution. To estimate the association between crashes and the explanatory variables, we used a log link function and added a random effect to account for overdispersion and for autocorrelation among observations obtained at the same site. We used proper but non-informative priors for all parameters in the model, and carried out all calculations using Markov chain Monte Carlo methods implemented in WinBUGS.
We evaluated the effect of the four-lane to three-lane conversion by comparing the expected number of crashes per year per mile during the years preceding the conversion and following the conversion for treatment and control sites. We estimated this difference using the observed traffic volumes at each site and also on a per-100,000,000-vehicle basis. We also conducted a prospective analysis to forecast the expected number of crashes per mile at each site in the study one year, three years and five years following the four-lane to three-lane conversion. Posterior predictive distributions of the number of crashes, the crash rate and the percent reduction in crashes per mile were obtained for each site for the months of January and June one, three and five years after completion of the intervention. The model appears to fit the data well. We found that in most sites, the intervention was effective and reduced the number of crashes. Overall, and for the observed traffic volumes, the reduction in the expected number of crashes per year and mile at converted sites was 32.3% (31.4% to 33.5% with 95% probability) while at the control sites, the reduction was estimated to be 7.1% (5.7% to 8.2% with 95% probability). When the reduction in the expected number of crashes per year, mile and 100,000,000 AADT was computed, the estimates were 44.3% (43.9% to 44.6%) and 25.5% (24.6% to 26.0%) for converted and control sites, respectively. In both cases, the percent reduction in the expected number of crashes during the years following the conversion was significantly larger at converted sites than at control sites, even though the number of crashes appears to decline over time at all sites. Results indicate that the expected number of crashes per mile has a steeper negative slope at converted than at control sites.
Consistent with this, the forecasted reduction in the number of crashes per year and mile during the years after completion of the conversion at converted sites is more pronounced than at control sites. Seasonal effects on the number of crashes have been well documented. In this data set, we found that, as expected, the expected number of monthly crashes per mile tends to be higher during winter months than during the rest of the year. Perhaps more interestingly, we found that there is an interaction between the four-lane to three-lane conversion and season; the reduction in the number of crashes appears to be more pronounced during months when the weather is nice than during other times of the year, even though a reduction was estimated for the entire year. Thus, it appears that the four-lane to three-lane conversion, while effective year-round, is particularly effective in reducing the expected number of crashes in nice weather.
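
The abstract describes the model only in words. One way to write down a model of this kind is sketched below; the notation is ours and the exact parameterization used in the report is not stated, so every symbol here is an assumption. Here $y_{it}$ is the number of crashes at site $i$ in month $t$, $L_i$ the site length in miles, $v_{it}$ the estimated monthly traffic volume, $T_i$ the treatment indicator, $\tau_i$ the change-point month, $p_k$ the three seasonal periods, and $u_i$ a site-level random effect. The "various interactions" mentioned in the abstract are left out.

```latex
\begin{align*}
y_{it} \mid \lambda_{it} &\sim \mathrm{Poisson}\left(L_i \, \lambda_{it}\right), \\
\log \lambda_{it} &= \beta_0 + \beta_1 v_{it} + \beta_2 t + \beta_3 T_i
   + \mathbb{1}(t \ge \tau_i)\left(\gamma_0 + \gamma_1 T_i\right) \\
 &\quad + \sum_{k=1}^{3} \left[ a_k \sin\!\left(\frac{2\pi t}{p_k}\right)
   + b_k \cos\!\left(\frac{2\pi t}{p_k}\right) \right] + u_i, \\
u_i &\sim \mathcal{N}\left(0, \sigma_u^2\right).
\end{align*}
```

In this form, $\gamma_1$ captures the additional change in log crash rate at converted sites after the conversion, relative to the change $\gamma_0$ seen at control sites over the same period.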

Melanin-based colouration predicts natal dispersal in the barn owl, Tyto alba

Abstract:

Searching for a suitable breeding site is an important decision in the life of most animals. The decisions where to settle and how far to travel before doing so depend on many factors. Individual differences in dispersal distance could result from different strategies (e.g. specialists versus generalists), which might result in similar reproductive success in different habitats, or different competitive abilities to acquire a territory close to the natal site. The barn owl is polymorphic in melanic coloration, which is associated with many physiological and behavioural traits such as habitat choice, stress response and docility, raising the possibility that the coloration is also related to dispersal. We studied natal dispersal (from rearing site to site of first breeding attempt) and breeding dispersal (from one breeding site to the next) in barn owls using a long-term data set. Darker reddish individuals moved further than paler individuals during natal dispersal, but not during breeding dispersal. A cross-fostering experiment showed that the colour of the biological and foster parents had no influence on dispersal distance. The distance dispersed by parents and same-sex offspring was correlated, whereas natal and breeding dispersal were not repeatable within individuals, indicating that they are two different processes. Given that the distance travelled in natal dispersal appears to be heritable, the underlying genes might be coupled to those related to coloration. We discuss hypotheses to explain the potential adaptive function of the link between coloration and natal dispersal.

Seeing the Wood through the Trees: The Current State of Higher Systematics in the Strepsirhini.

Abstract:

Strepsirhines comprise 10 living or recently extinct families, ≥50% of extant primate families. Their phylogenetic relationships have been intensively studied, but common topologies have only recently emerged; e.g. all recent reconstructions link the Lepilemuridae and Cheirogaleidae. The position of the indriids, however, remains uncertain, and molecular studies have placed them as the sister to every clade except Daubentonia, the preferred sister group of morphologists. The node subtending Afro-Asian lorisids has been similarly elusive. We probed these phylogenetic inconsistencies using a test data set including 20 strepsirhine taxa and 2 outgroups represented by 3,543 mtDNA base pairs, and 43 selected morphological characters, subjecting the data to maximum parsimony, maximum likelihood and Bayesian inference analyses, and reconstructing topology and node ages jointly from the molecular data using relaxed molecular clock analyses. Our permutations yielded compatible but not identical evolutionary histories, and currently popular techniques seem unable to deal adequately with morphological data. We investigated the influence of morphological characters on tree topologies, and examined the effect of taxon sampling in two experiments: (1) we removed the molecular data only for 5 endangered Malagasy taxa to simulate 'extinction leaving a fossil record'; (2) we removed both the sequence and morphological data for these taxa. Topologies were affected more by the inclusion of morphological data only, indicating that palaeontological studies that involve inserting a partial morphological data set into a combined data matrix of extant species should be interpreted with caution. The gap of approximately 10 million years between the daubentoniid divergence and those of the other Malagasy families deserves more study. 
The apparently contemporaneous divergence of African and non-daubentoniid Malagasy families 40-30 million years ago may be related to regional plume-induced uplift followed by a global period of cooling and drying. © 2013 S. Karger AG, Basel.

Improving generalized regression analysis for the spatial prediction of forest communities

Abstract:

Aim: This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location: State of Vaud, western Switzerland. Methods: Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results: Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that perform better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on main environmental gradients changed the set of selected predictors, and potentially their response curve shape. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when including local spatial autocorrelation. A novel approach to include interactions proved to be an efficient way to account for interactions between all predictors at once.
Main conclusions: Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of model strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions in the final assessment.
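
The abstract does not say how the absence weights were computed. A minimal sketch, assuming the simplest scheme (presences keep weight 1 and every absence gets the same weight, chosen so that weighted presences and weighted absences balance), shows how a weighted prevalence of 0.5 can be obtained. This is plain Python for illustration and is not the grasp package's interface; the function name and toy data are hypothetical.

```python
def prevalence_weights(labels):
    """Return one weight per observation so that the weighted prevalence is 0.5.

    labels: sequence of 1 (presence) and 0 (absence).
    Presences keep weight 1; each absence gets n_presences / n_absences.
    """
    n_presences = sum(1 for y in labels if y == 1)
    n_absences = sum(1 for y in labels if y == 0)
    if n_presences == 0 or n_absences == 0:
        raise ValueError("need at least one presence and one absence")
    absence_weight = n_presences / n_absences
    return [1.0 if y == 1 else absence_weight for y in labels]


# Toy sample with a raw prevalence of 0.2 (2 presences out of 10 plots).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
weights = prevalence_weights(labels)

weighted_presence = sum(w for w, y in zip(weights, labels) if y == 1)
weighted_prevalence = weighted_presence / sum(weights)
print(f"absence weight = {weights[-1]}, weighted prevalence = {weighted_prevalence}")
```

Weights of this kind would then be passed to whatever model-fitting routine is used, so that rare communities are not dominated by their many absences.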

Survey on batch-to-batch variation in spray paints: a collaborative study

Abstract:

This study represents the most extensive analysis of batch-to-batch variations in spray paint samples to date. The survey was performed as a collaborative project of the ENFSI (European Network of Forensic Science Institutes) Paint and Glass Working Group (EPG) and involved 11 laboratories. Several studies have already shown that paint samples of similar color but from different manufacturers can usually be differentiated using an appropriate analytical sequence. The discrimination of paints from the same manufacturer and color (batch-to-batch variations) is of great interest and these data are seldom found in the literature. This survey concerns the analysis of batches from different color groups (white, papaya (special shade of orange), red and black) with a wide range of analytical techniques and leads to the following conclusions. Colored batch samples are more likely to be differentiated since their pigment composition is more complex (pigment mixtures, added pigments) and therefore subject to variations. These variations may occur during the paint production but may also occur when checking the paint shade in quality control processes. For these samples, techniques aimed at color/pigment(s) characterization (optical microscopy, microspectrophotometry (MSP), Raman spectroscopy) provide better discrimination than techniques aimed at the organic (binder) or inorganic composition (Fourier transform infrared spectroscopy (FTIR) or elemental analysis (SEM - scanning electron microscopy and XRF - X-ray fluorescence)). White samples contain mainly titanium dioxide as a pigment and the main differentiation is based on the binder composition (C-H stretches) detected either by FTIR or Raman. The inorganic composition (elemental analysis) also provides some discrimination. Black samples contain mainly carbon black as a pigment and are problematic with most of the spectroscopic techniques.
In this case, pyrolysis-GC/MS represents the best technique to detect differences. Globally, Py-GC/MS may show a high potential of discrimination on all samples but the results are highly dependent on the specific instrumental conditions used. Finally, the discrimination of samples when data was interpreted visually as compared to statistically using principal component analysis (PCA) yielded very similar results. PCA increases sensitivity and could perform better on specific samples, but one first has to ensure that all non-informative variation (baseline deviation) is eliminated by applying correct pre-treatments. Statistical treatments can be used on a large data set and, when combined with an expert's opinion, will provide more objective criteria for decision making.

Drosophilids (Diptera) from an Atlantic Forest Area in Santa Catarina, Southern Brazil

Abstract:

The present work aims to characterize the faunal composition of drosophilids in forest areas of southern Brazil. In addition, the estimation of species richness for this fauna is briefly discussed. Sampling was carried out in three well-preserved areas of the Atlantic Rain Forest in the State of Santa Catarina. In this study, 136,931 specimens were captured and 96.6% of them were identified to species level. The observed species richness (153 species) is the largest that has been registered in faunal inventories conducted in Brazil. Sixty-three of the captured species did not fit the available descriptions, and we believe that most of them are undescribed species. The incidence-based estimators tended to give the largest richness estimates, while the abundance-based ones gave the smallest. Such estimators suggest the presence of 172.28 to 220.65 species in the studied area. Based on these values, 69.35 to 88.81% of the expected species richness was sampled. We suggest that the large richness recorded in this study is a consequence of the large sampling effort, the capture method, recent advances in the taxonomy of drosophilids, the high preservation level and large extent of the sampled fragment, and the high complexity of the Atlantic Rain Forest. Finally, our data set suggests that the use of richness estimators for drosophilid assemblages is useful but requires caution.
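
The abstract does not name the estimators used. As an illustration of how an abundance-based richness estimate and the resulting sampling completeness are computed, the sketch below uses the bias-corrected Chao1 estimator, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of species seen exactly once and exactly twice. The choice of Chao1, the function names, and the toy abundance vector are assumptions for illustration, not the study's method or data.

```python
def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-species abundance counts."""
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                        # observed species richness
    f1 = sum(1 for n in counts if n == 1)      # singletons
    f2 = sum(1 for n in counts if n == 2)      # doubletons
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def percent_sampled(observed, estimated):
    """Observed richness as a percentage of the estimated richness."""
    return 100.0 * observed / estimated


# Toy assemblage: 8 observed species, 3 singletons, 2 doubletons.
toy_abundances = [50, 20, 7, 2, 2, 1, 1, 1]
estimate = chao1(toy_abundances)
print(f"Chao1 = {estimate:.2f}, sampled = {percent_sampled(8, estimate):.1f}%")

# Completeness implied by the figures quoted in the abstract (153 observed species).
for estimated in (172.28, 220.65):
    print(f"{estimated}: {percent_sampled(153, estimated):.1f}% sampled")
```

Applied to the abstract's own figures, the completeness calculation gives roughly 88.8% and 69.3%, in line with the reported range.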
