912 results for sequencing error
Abstract:
A 1887-bp region at the 5' flank of the human p75 tumor necrosis factor receptor (p75 TNF-R)-encoding gene was found to be active in driving expression of the luc (luciferase-encoding) reporter gene, suggesting that it contains the promoter for the receptor. Rather unexpectedly, a 1827-bp region at the 3' end of the first intron of the p75 TNF-R gene also displayed promoter activity. This activity may be artefactual, reflecting only the presence of an enhancer in this region; yet it also raises the possibility that p75 TNF-R is controlled by more than one promoter and that it encodes various forms of the receptor, or even other proteins. We present here the nucleotide sequences of the 5' flanking and intron regions. Possible implications for the transcriptional regulation of the p75 TNF-R gene are discussed.
Abstract:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
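The error-model idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: ordinary PCA on curves sampled on a common grid stands in for FPCA, a linear least-squares map links proxy scores to exact scores (the abstract does not say which regression model is used), and the synthetic curves, function names, and parameter values are all hypothetical.

```python
import numpy as np


def fit_pca(curves, n_components):
    """PCA of discretized curves (a grid-based stand-in for FPCA)."""
    mean = curves.mean(axis=0)
    # Rows of vt are the principal component curves, ordered by variance.
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(curves, mean, components):
    """Scores of each curve on the retained components."""
    return (curves - mean) @ components.T


def fit_error_model(proxy_train, exact_train, n_components=2):
    """Learn a linear map from proxy scores to exact scores on the learning set."""
    p_mean, p_comp = fit_pca(proxy_train, n_components)
    e_mean, e_comp = fit_pca(exact_train, n_components)
    p_scores = project(proxy_train, p_mean, p_comp)
    e_scores = project(exact_train, e_mean, e_comp)
    design = np.column_stack([np.ones(len(p_scores)), p_scores])
    coef, *_ = np.linalg.lstsq(design, e_scores, rcond=None)
    return {"p_mean": p_mean, "p_comp": p_comp,
            "e_mean": e_mean, "e_comp": e_comp, "coef": coef}


def predict_exact(model, proxy_new):
    """Predict exact curves from proxy curves alone."""
    s = project(proxy_new, model["p_mean"], model["p_comp"])
    design = np.column_stack([np.ones(len(s)), s])
    return model["e_mean"] + design @ model["coef"] @ model["e_comp"]


def demo(seed=0):
    """Synthetic check: compare raw proxy error with error-model error."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 50)

    def simulate(n):
        # Hypothetical "realizations": two random parameters per curve.
        a = rng.normal(size=(n, 1))
        b = rng.normal(size=(n, 1))
        exact = a * np.sin(2 * np.pi * t) + b * t
        proxy = 0.8 * a * np.sin(2 * np.pi * t) + 1.2 * b * t + 0.1  # biased
        exact = exact + 0.01 * rng.normal(size=exact.shape)
        proxy = proxy + 0.01 * rng.normal(size=proxy.shape)
        return proxy, exact

    proxy_train, exact_train = simulate(40)   # learning set: both solvers run
    proxy_test, exact_test = simulate(200)    # new realizations: proxy only
    model = fit_error_model(proxy_train, exact_train)
    predicted = predict_exact(model, proxy_test)
    proxy_err = np.mean(np.abs(proxy_test - exact_test))
    model_err = np.mean(np.abs(predicted - exact_test))
    return proxy_err, model_err


if __name__ == "__main__":
    print(demo())
```

In this toy setting the proxy is a biased linear distortion of the exact response, so a linear map between scores is enough; a real application might need more components or a nonlinear regression.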
Abstract:
The RNome of a cell is highly diverse: besides messenger RNAs (mRNAs), transfer RNAs (tRNAs), and ribosomal RNAs (rRNAs), it also contains other small and long transcripts without apparent coding potential. This class of molecules, commonly referred to as non-protein-coding RNAs (ncRNAs), is involved in regulating numerous biological processes and is thought to contribute to cellular complexity. Therefore, much effort is put into their identification and further functional characterization. Here we provide a cost-effective and reliable method for cDNA library construction of small RNAs in the size range of 20-500 residues. The effectiveness of the described method is demonstrated by the analysis of ribosome-associated small RNAs in the eukaryotic model organism Trypanosoma brucei.
Abstract:
Background: Tef (Eragrostis tef), an indigenous cereal critical to food security in the Horn of Africa, is rich in minerals and protein, resistant to many biotic and abiotic stresses and safe for diabetics as well as sufferers of immune reactions to wheat gluten. We present the genome of tef, the first species in the grass subfamily Chloridoideae and the first allotetraploid assembled de novo. We sequenced the tef genome for marker-assisted breeding, to shed light on the molecular mechanisms conferring tef's desirable nutritional and agronomic properties, and to make its genome publicly available as a community resource. Results: The draft genome contains 672 Mbp representing 87% of the genome size estimated from flow cytometry. We also sequenced two transcriptomes, one from a normalized RNA library and another from unnormalized RNASeq data. The normalized RNA library revealed around 38000 transcripts that were then annotated by the SwissProt group. The CoGe comparative genomics platform was used to compare the tef genome to other genomes, notably sorghum. Scaffolds comprising approximately half of the genome size were ordered by syntenic alignment to sorghum producing tef pseudo-chromosomes, which were sorted into A and B genomes as well as compared to the genetic map of tef. The draft genome was used to identify novel SSR markers, investigate target genes for abiotic stress resistance studies, and understand the evolution of the prolamin family of proteins that are responsible for the immune response to gluten. Conclusions: It is highly plausible that breeding targets previously identified in other cereal crops will also be valuable breeding targets in tef. The draft genome and transcriptome will be of great use for identifying these targets for genetic improvement of this orphan crop that is vital for feeding 50 million people in the Horn of Africa.
Abstract:
Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted re-sequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof of concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family trio WGS approach has successfully been used to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Functional annotation of coding sequence differences and prediction of their functional effect with the software packages ANNOVAR and SIFT resulted in seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs.
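The core trio filter described above (homozygous in the affected offspring, heterozygous in both healthy parents) might look like the sketch below. It is not the authors' pipeline: the record layout, the variant identifiers, and the unphased VCF-style genotype strings ("0/0", "0/1", "1/1") are assumptions made for illustration, and real data would also need quality and coverage filters.

```python
# Unphased, biallelic genotype codes in VCF style (an assumption for this sketch).
HOM_REF, HET, HOM_ALT = "0/0", "0/1", "1/1"


def fits_recessive_trio(child, sire, dam):
    """True if the affected child is homozygous for the alternate allele
    and both healthy parents are heterozygous carriers."""
    return child == HOM_ALT and sire == HET and dam == HET


def filter_trio_candidates(variants):
    """Keep only variants whose genotypes fit autosomal recessive inheritance."""
    return [v for v in variants
            if fits_recessive_trio(v["child"], v["sire"], v["dam"])]


# Hypothetical variant records; the ids are placeholders, not real variants.
variants = [
    {"id": "var1", "child": "1/1", "sire": "0/1", "dam": "0/1"},
    {"id": "var2", "child": "1/1", "sire": "1/1", "dam": "0/1"},  # sire homozygous
    {"id": "var3", "child": "0/1", "sire": "0/1", "dam": "0/0"},  # child heterozygous
    {"id": "var4", "child": "1/1", "sire": "0/1", "dam": "0/1"},
]

candidates = filter_trio_candidates(variants)
print([v["id"] for v in candidates])  # ['var1', 'var4']
```

Candidates that pass this filter would then go to functional annotation and to concordance testing in additional cases and controls, as the abstract describes.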
Abstract:
OBJECTIVE Intraarticular gadolinium-enhanced magnetic resonance arthrography (MRA) is commonly applied to characterize morphological disorders of the hip. However, the reproducibility of retrieving anatomic landmarks on MRA scans and their correlation with intraarticular pathologies is unknown. A precise mapping system for the exact localization of hip pathomorphologies with radial MRA sequences is lacking. Therefore, the purpose of the study was the establishment and validation of a reproducible mapping system for radial sequences of hip MRA. MATERIALS AND METHODS Sixty-nine consecutive intraarticular gadolinium-enhanced hip MRAs were evaluated. Radial sequencing consisted of 14 cuts orientated along the axis of the femoral neck. Three orthopedic surgeons read the radial sequences independently. Each MRI was read twice with a minimum interval of 7 days from the first reading. The intra- and inter-observer reliability of the mapping procedure was determined. RESULTS A clockwise system for hip MRA was established. The teardrop figure served to determine the 6 o'clock position of the acetabulum; the center of the greater trochanter served to determine the 12 o'clock position of the femoral head-neck junction. The intra- and inter-observer ICCs to retrieve the correct 6/12 o'clock positions were 0.906-0.996 and 0.978-0.988, respectively. CONCLUSIONS The established mapping system for radial sequences of hip joint MRA is reproducible and easy to perform.
Abstract:
Familial acute myeloid leukemia is rare and linked to germline mutations in RUNX1, GATA2 or CCAAT/enhancer binding protein-α (CEBPA). We re-evaluated a large family with acute myeloid leukemia originally seen at NIH in 1969. We utilized whole-exome sequencing to study this family, and conducted in silico bioinformatics analysis, protein structural modeling and laboratory experiments to assess the impact of the identified CEBPA Q311P mutation. Unlike most previously identified germline mutations in CEBPA, which were N-terminal frameshift mutations, we identified a novel Q311P variant that was located in the C-terminal bZip domain of C/EBPα. Protein structural modeling suggested that the Q311P mutation alters the ability of the CEBPA dimer to bind DNA. Electrophoretic mobility shift assays showed that the Q311P mutant had attenuated binding to DNA, as predicted by the protein modeling. In line with these findings, the Q311P mutant showed reduced transactivation, consistent with a loss-of-function mutation. From 45 years of follow-up, we observed incomplete penetrance (46%) of CEBPA Q311P. This study of a large multi-generational pedigree reveals that a germline mutation in the C-terminal bZip domain can alter the ability of C/EBPα to bind DNA and reduce transactivation, leading to acute myeloid leukemia.
A functional approach to movement analysis and error identification in sports and physical education
Abstract:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process.
For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
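The terms "conservative" and "liberal" above compare an empirical Type I error rate with the nominal level. As a minimal sketch, assuming p-values have been collected from simulated replications generated under the null hypothesis, the rate and its classification could be computed as follows. The function names, the 1.96 margin, and the example p-values are illustrative assumptions, not taken from the study.

```python
import math


def empirical_type1_error(p_values, alpha=0.05):
    """Share of null-hypothesis replications rejected at level alpha,
    with the binomial standard error expected at the nominal rate."""
    rejections = sum(p < alpha for p in p_values)
    rate = rejections / len(p_values)
    se = math.sqrt(alpha * (1 - alpha) / len(p_values))
    return rate, se


def classify(rate, se, alpha=0.05, z=1.96):
    """Label a rate relative to the nominal level, allowing for Monte Carlo noise."""
    if rate > alpha + z * se:
        return "liberal"
    if rate < alpha - z * se:
        return "conservative"
    return "nominal"


# Hypothetical p-values from 1000 replications, 10 of them below 0.05.
p_values = [0.01] * 10 + [0.50] * 990
rate, se = empirical_type1_error(p_values)
print(rate, classify(rate, se))  # 0.01 conservative
```

The margin guards against calling a rate conservative or liberal when the deviation from the nominal level is within simulation noise.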
Abstract:
This paper proposes asymptotically optimal tests for an unstable parameter process under the feasible circumstance that the researcher has little information about the unstable parameter process and the error distribution, and suggests conditions under which the knowledge of those processes does not provide asymptotic power gains. I first derive a test under known error distribution, which is asymptotically equivalent to LR tests for correctly identified unstable parameter processes under suitable conditions. The conditions are weak enough to cover a wide range of unstable processes such as various types of structural breaks and time-varying parameter processes. The test is then extended to semiparametric models in which the underlying distribution is unknown but treated as an unknown infinite-dimensional nuisance parameter. The semiparametric test is adaptive in the sense that its asymptotic power function is equivalent to the power envelope under known error distribution.
Abstract:
Statement of the problem and public health significance. Hospitals were designed to be a safe haven and respite from disease and illness. However, a large body of evidence points to preventable errors in hospitals as the eighth leading cause of death among Americans. Twelve percent of Americans, or over 33.8 million people, are hospitalized each year. This population represents a significant portion of at-risk citizens exposed to hospital medical errors. Since the number of annual deaths due to hospital medical errors is estimated to exceed 44,000, the magnitude of this tragedy makes it a significant public health problem.
Specific aims. The specific aims of this study were threefold. First, this study aimed to analyze the state of the states' mandatory hospital medical error reporting six years after the release of the influential IOM report, "To Err is Human." The second aim was to identify barriers to reporting of medical errors by hospital personnel. The third aim was to identify hospital safety measures implemented to reduce medical errors and enhance patient safety.
Methods. A descriptive, longitudinal, retrospective design was used to address the first stated objective. The study data came from the twenty-one states with mandatory hospital reporting programs which report aggregate hospital error data that is accessible to the public by way of states' websites. The data analysis included calculations of expected number of medical errors for each state according to IOM rates. Where possible, a comparison was made between state-reported data and the calculated IOM expected number of errors. A literature review was performed to achieve the second study aim, identifying barriers to reporting medical errors. The final aim was accomplished by telephone interviews of principal patient safety/quality officers from five Texas hospitals with more than 700 beds.
Results. The state medical error data suggests vast underreporting of hospital medical errors to the states. The telephone interviews suggest that hospitals are working at reducing medical errors and creating safer environments for patients. The literature review suggests the underreporting of medical errors at the state level stems from underreporting of errors at the delivery level.