971 results for Linked Data
Abstract:
The 16S rRNA gene (16S rDNA) is currently the most widely used gene for estimating the evolutionary history of prokaryotes. To date, there are more than 30 000 16S rDNA sequences available from the core databases, GenBank, EMBL and DDBJ. This great number may cause a dilemma when composing datasets for phylogenetic analysis, since the choice and number of reference organisms are known to affect the resulting tree topology. A group of sequences appearing monophyletic in one dataset may not be so in another. This can be especially problematic when establishing the relationships of distantly related sequences at the division (phylum) level. In this study, a multiple-outgroup approach to resolving division-level phylogenetic relationships is suggested using 16S rDNA data. The approach is illustrated by two case studies concerning the monophyly of two recently proposed bacterial divisions, OP9 and OP10.
Abstract:
Qualitative data analysis (QDA) is often a time-consuming and laborious process usually involving the management of large quantities of textual data. Recently developed computer programs offer great advances in the efficiency of the processes of QDA. In this paper we report on an innovative use of a combination of extant computer software technologies to further enhance and simplify QDA. Used in appropriate circumstances, we believe that this innovation greatly enhances the speed with which theoretical and descriptive ideas can be abstracted from rich, complex, and chaotic qualitative data. © 2001 Human Sciences Press, Inc.
Abstract:
Much progress has been made on inferring population history from molecular data. However, complex demographic scenarios have been considered rarely or have proved intractable. The serial introduction of the South-Central American cane toad Bufo marinus in various Caribbean and Pacific islands involves four major phases: a possible genetic admixture during the first introduction, a bottleneck associated with founding, a transitory population boom, and finally, a demographic stabilization. A large amount of historical and demographic information is available for those introductions and can be combined profitably with molecular data. We used a Bayesian approach to combine this information with microsatellite (10 loci) and enzyme (22 loci) data and used a rejection algorithm to simultaneously estimate the demographic parameters describing the four major phases of the introduction history. The general historical trends supported by microsatellites and enzymes were similar. However, there was a stronger support for a larger bottleneck at introductions for microsatellites than enzymes and for a more balanced genetic admixture for enzymes than for microsatellites. Very little information was obtained from either marker about the transitory population boom observed after each introduction. Possible explanations for differences in resolution of demographic events and discrepancies between results obtained with microsatellites and enzymes were explored. Limits of our model and method for the analysis of nonequilibrium populations were discussed.
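The rejection algorithm mentioned in the abstract above can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' model: the toy simulator `simulate_heterozygosity`, the uniform prior on founder number, the starting heterozygosity `h0`, the noise level, and the tolerance are all hypothetical. The sketch shows only the general idea: draw a parameter from the prior, simulate a summary statistic, and keep the draw if the simulated statistic is close to the observed one.

```python
import random


def simulate_heterozygosity(founders, h0=0.7, loci=10, rng=random):
    """Toy simulator (hypothetical): mean heterozygosity across loci after a
    one-generation bottleneck of `founders` individuals, with per-locus noise."""
    expected = h0 * (1.0 - 1.0 / (2.0 * founders))
    draws = [min(1.0, max(0.0, rng.gauss(expected, 0.05))) for _ in range(loci)]
    return sum(draws) / loci


def abc_rejection(observed, n_draws=20000, tolerance=0.01, seed=0):
    """Keep prior draws whose simulated summary lies within `tolerance`
    of the observed summary; the kept draws approximate the posterior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        founders = rng.randint(2, 200)  # assumed uniform prior on founder number
        simulated = simulate_heterozygosity(founders, rng=rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(founders)
    return accepted


if __name__ == "__main__":
    posterior_sample = abc_rejection(observed=0.6)
    print(len(posterior_sample), "draws accepted")
```

A real analysis would use several summary statistics per marker type and a simulator covering all four demographic phases; the acceptance step itself would keep the same shape.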
Abstract:
We investigated whether red cell 2,3-diphosphoglycerate (2,3-DPG) concentrations are reduced in critical illness, whether acidaemia, hypophosphataemia or anaemia influence 2,3-DPG, and whether there is any net effect on in vivo P50. Twenty healthy, non-smoking, male volunteers were compared with 20 male intensive care patients with APACHE 2 scores > 20 on the preceding day. Those transfused in this time were excluded. Venous red cell 2,3-DPG concentrations were measured in both groups. In the patient group, routine multichannel biochemical profile and arterial blood gas analysis were also performed and in vivo P50 calculated. The mean 2,3-DPG concentration was significantly lower in the patient group than in the controls (4.2 ± 1.3 mmol/l vs 4.9 ± 0.5 mmol/l, P=0.016). The patients were well oxygenated (lowest arterial PO2=75 mm Hg) and showed a tendency to acidaemia (median pH 7.37, range 7.06 to 7.48) and anaemia (median haemoglobin concentration 113 g/l, range 89 to 154 g/l). By linear regression of patient data, pH had a significant effect on 2,3-DPG concentrations (r=0.6, P=0.011). Haemoglobin and phosphate concentrations did not, but there were few abnormal phosphate values. There was no correlation between 2,3-DPG concentrations and in vivo P50 (r² ≤ 0.08). We conclude that 2,3-DPG concentrations were reduced in a broad group of critically ill patients. Although this would normally reduce the P50, the reduction was primarily linked with acidaemia, which increases the P50. Overall, there was no net effect on the P50 and thus no affinity-related decrease in tissue oxygenation.
Abstract:
Wilson disease is an autosomal recessive copper transport disorder resulting from defective biliary excretion of copper and subsequent hepatic copper accumulation and liver failure if not treated. The disease is caused by mutations in the ATP7B (WND) gene, which is expressed predominantly in the liver and encodes a copper-transporting P-type ATPase that is structurally and functionally similar to the Menkes protein (MNK), which is defective in the X-linked copper transport disorder Menkes disease. The toxic milk (tx) mouse has a clinical phenotype similar to Wilson disease patients and, recently, the tx mutation within the murine WND homologue (Wnd) of this mouse was identified, establishing it as an animal model for Wilson disease. In this study, cDNA constructs encoding the wild-type (Wnd-wt) and mutant (Wnd-tx) Wilson proteins (Wnd) were generated and expressed in Chinese hamster ovary (CHO) cells. The tx mutation disrupted the copper-induced relocalization of Wnd in CHO cells and abrogated Wnd-mediated copper resistance of transfected CHO cells. In addition, co-localization experiments demonstrated that while Wnd and MNK are located in the trans-Golgi network in basal copper conditions, with elevated copper, these proteins are sorted to different destinations within the same cell. Ultrastructural studies showed that with elevated copper levels, Wnd accumulated in large multivesicular structures resembling late endosomes that may represent a novel compartment for copper transport. The data presented provide further support for a relationship between copper transport activity and the copper-induced relocalization response of mammalian copper ATPases, and an explanation at a molecular level for the observed phenotype of tx mice.
Abstract:
Nine cases of melioidosis with four deaths occurred over a 28-month period in members of a small remote Aboriginal community in the top end of the Northern Territory of Australia. Typing by pulsed-field gel electrophoresis showed isolates of Burkholderia pseudomallei from six of the cases to be clonal and also identical to an isolate from the community water supply, but not to soil isolates. The clonality of the isolates found in this cluster contrasts with the marked genetic diversity of human and environmental isolates found in this region which is hyperendemic for B. pseudomallei. It is possible that the clonal bacteria persisted and were propagated in biofilm in the water supply system. While the exact mode of transmission to humans and the reasons for cessation of the outbreak remain uncertain, contamination of the unchlorinated community water supply is a likely explanation.
Abstract:
Objectives: This study examines human scalp electroencephalographic (EEG) data for evidence of non-linear interdependence between posterior channels. The spectral and phase properties of those epochs of EEG exhibiting non-linear interdependence are studied. Methods: Scalp EEG data was collected from 40 healthy subjects. A technique for the detection of non-linear interdependence was applied to 2.048 s segments of posterior bipolar electrode data. Amplitude-adjusted phase-randomized surrogate data was used to statistically determine which EEG epochs exhibited non-linear interdependence. Results: Statistically significant evidence of non-linear interactions was found in 2.9% (eyes open) to 4.8% (eyes closed) of the epochs. In the eyes-open recordings, these epochs exhibited a peak in the spectral and cross-spectral density functions at about 10 Hz. Two types of EEG epochs are evident in the eyes-closed recordings; one type exhibits a peak in the spectral density and cross-spectrum at 8 Hz. The other type has increased spectral and cross-spectral power across faster frequencies. Epochs identified as exhibiting non-linear interdependence display a tendency towards phase interdependencies across and between a broad range of frequencies. Conclusions: Non-linear interdependence is detectable in a small number of multichannel EEG epochs, and makes a contribution to the alpha rhythm. Non-linear interdependence produces spatially distributed activity that exhibits phase synchronization between oscillations present at different frequencies. The possible physiological significance of these findings is discussed with reference to the dynamical properties of neural systems and the role of synchronous activity in the neocortex. © 2002 Elsevier Science Ireland Ltd. All rights reserved.
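The surrogate-data step in the Methods above can be sketched as follows. This is a simplified illustration and not the procedure used in the study: it implements plain phase randomization only, without the amplitude-adjustment step the abstract names, and it uses a naive O(n²) discrete Fourier transform so that it has no dependencies. The function names are hypothetical. The idea is to keep each Fourier amplitude, replace each phase with a random one while preserving conjugate symmetry, and transform back, giving a real series with the same power spectrum but no non-linear structure.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (fine for short epochs)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def phase_randomized_surrogate(signal, seed=0):
    """Return a surrogate with the same amplitude spectrum and random phases."""
    rng = random.Random(seed)
    n = len(signal)
    spectrum = dft(signal)
    randomized = list(spectrum)  # DC (and Nyquist, if n is even) stay unchanged
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        randomized[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        # Mirror as the complex conjugate so the inverse transform is real.
        randomized[n - k] = randomized[k].conjugate()
    return [value.real for value in idft(randomized)]
```

A test statistic computed on the original epoch is then compared with its distribution over many such surrogates; an epoch is flagged only if its statistic falls in the tail of that distribution.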
Abstract:
In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
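The factorization described in the abstract above can be written out as a short sketch. The notation is assumed here for illustration and is not taken from the paper: $y_{it}$ is the claim cost for unit $i$ in period $t$, $\pi_{it}$ is the probability of a claim, $g(\cdot;\mu_{it},\nu)$ is a gamma density with mean $\mu_{it}$ and shape $\nu$, and the logit and log links with random effects $u_i$ and $v_i$ are one plausible choice.

```latex
% Zero-augmented gamma density: point mass at 0 mixed with a gamma component
f(y_{it}) = (1-\pi_{it})^{\mathbb{1}[y_{it}=0]}
            \bigl\{\pi_{it}\, g(y_{it};\mu_{it},\nu)\bigr\}^{\mathbb{1}[y_{it}>0]}

% Conditional log-likelihood splits into an incidence part and a magnitude part
\ell = \underbrace{\sum_{i,t}\Bigl\{\mathbb{1}[y_{it}=0]\log(1-\pi_{it})
        + \mathbb{1}[y_{it}>0]\log\pi_{it}\Bigr\}}_{\text{incidence of claims}}
     + \underbrace{\sum_{i,t:\,y_{it}>0}\log g(y_{it};\mu_{it},\nu)}_{\text{magnitude given a claim}}

% Assumed linear predictors for the two parts
\operatorname{logit}(\pi_{it}) = x_{it}^{\top}\beta + u_i, \qquad
\log\mu_{it} = z_{it}^{\top}\gamma + v_i
```

Because the first sum involves only $(\beta, u_i)$ and the second only $(\gamma, \nu, v_i)$, the two parts can be maximized separately when $u_i$ and $v_i$ are treated as independent, which is what allows the problem to reduce to two separate GLMM fits.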
Abstract:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
Abstract:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
Abstract:
Genetic research on risk of alcohol, tobacco or drug dependence must make allowance for the partial overlap of risk-factors for initiation of use, and risk-factors for dependence or other outcomes in users. Except in the extreme cases where genetic and environmental risk-factors for initiation and dependence overlap completely or are uncorrelated, there is no consensus about how best to estimate the magnitude of genetic or environmental correlations between Initiation and Dependence in twin and family data. We explore by computer simulation the biases to estimates of genetic and environmental parameters caused by model misspecification when Initiation can only be defined as a binary variable. For plausible simulated parameter values, the two-stage genetic models that we consider yield estimates of genetic and environmental variances for Dependence that, although biased, are not very discrepant from the true values. However, estimates of genetic (or environmental) correlations between Initiation and Dependence may be seriously biased, and may differ markedly under different two-stage models. Such estimates may have little credibility unless external data favor selection of one particular model. These problems can be avoided if Initiation can be assessed as a multiple-category variable (e.g. never versus early-onset versus later onset user), with at least two categories measurable in users at risk for dependence. Under these conditions, under certain distributional assumptions, recovery of simulated genetic and environmental correlations becomes possible. Illustrative application of the model to Australian twin data on smoking confirmed substantial heritability of smoking persistence (42%) with minimal overlap with genetic influences on initiation.
Abstract:
There have been few replicated examples of genotype x environment interaction effects on behavioral variation or risk of psychiatric disorder. We review some of the factors that have made detection of genotype x environment interaction effects difficult, and show how genotype x shared environment interaction (GxSE) effects are commonly confounded with genetic parameters in data from twin pairs reared together. Historic data on twin pairs reared apart can in principle be used to estimate such GxSE effects, but have rarely been used for this purpose. We illustrate this using previously published data from the Swedish Adoption Twin Study of Aging (SATSA), which suggest that GxSE effects could account for as much as 25% of the total variance in risk of becoming a regular smoker. Since few separated twin pairs will be available for study in the future, we also consider methods for modifying variance components linkage analysis to allow for environmental interactions with linked loci.