983 results for Step Length Estimation
Abstract:
The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
Abstract:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
Abstract:
Reaction between 5-(4-amino-2-thiabutyl)-5-methyl-3,7-dithianonane-1,9-diamine (N3S3) and 5-methyl-2,2-bipyridine-5-carbaldehyde and subsequent reduction of the resulting imine with sodium borohydride results in a potentially ditopic ligand (L). Treatment of L with one equivalent of an iron(II) salt led to the monoprotonated complex [Fe(HL)]³⁺, isolated as the hexafluorophosphate salt. The presence of characteristic bands for the tris(bipyridyl)iron(II) chromophore in the UV/vis spectrum indicated that the iron(II) atom is coordinated octahedrally by the three bipyridyl (bipy) groups. The [Fe(bipy)₃] moiety encloses a cavity composed of the N3S3 portion of the ditopic ligand. The mononuclear and monomeric nature of the complex [Fe(HL)]³⁺ has also been established by accurate mass analysis. [Fe(HL)]³⁺ displays reduced stability to base compared with the complex [Fe(bipy)₃]²⁺. In aqueous solution [Fe(HL)]³⁺ exhibits irreversible electrochemical behaviour with an oxidation wave ca. 60 mV to more positive potential than [Fe(bipy)₃]²⁺. Investigations of the interaction of [Fe(L)]²⁺ with copper(II), iron(II), and mercury(II) using mass spectroscopic and potentiometric methods suggested that where complexation occurred, fewer than six of the N3S3 cavity donors were involved. The high affinity of the complex [Fe(L)]²⁺ for protons is one reason suggested to contribute to the reluctance to coordinate a second metal ion.
Abstract:
We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.
Abstract:
The choice of genotyping families vs unrelated individuals is a critical factor in any large-scale linkage disequilibrium (LD) study. The use of unrelated individuals for such studies is promising, but in contrast to family designs, unrelated samples do not facilitate detection of genotyping errors, which have been shown to be of great importance for LD and linkage studies and may be even more important in genotyping collaborations across laboratories. Here we employ some of the most commonly used analysis methods to examine the relative accuracy of haplotype estimation using families vs unrelateds in the presence of genotyping error. The results suggest that even slight amounts of genotyping error can significantly decrease haplotype frequency and reconstruction accuracy, and that the ability to detect such errors in large families is essential when the number/complexity of haplotypes is high (low LD/common alleles). In contrast, in situations of low haplotype complexity (high LD and/or many rare alleles) unrelated individuals offer such a high degree of accuracy that there is little reason for less efficient family designs. Moreover, parent-child trios, which comprise the most popular family design and the most efficient in terms of the number of founder chromosomes per genotype but which contain little information for error detection, offer little or no gain over unrelated samples in nearly all cases, and thus do not seem a useful sampling compromise between unrelated individuals and large families. The implications of these results are discussed in the context of large-scale LD mapping projects such as the proposed genome-wide haplotype map.
Abstract:
High quality MSS membranes were synthesised by single-step and two-step catalysed hydrolyses employing tetraethylorthosilicate (TEOS), absolute ethanol (EtOH), 1 M nitric acid (HNO3) and distilled water (H2O). The Si-29 NMR results showed that the two-step xerogels consistently had a greater contribution of silanol groups (Q(3) and Q(2)) than the single-step xerogel. According to fractal theory, a high contribution of Q(2) and Q(3) species is responsible for the formation of weakly branched systems, leading to a low pore volume of microporous dimension. The transport of diffusing gases in these membranes is shown to be activated, as the permeance increased with temperature. Although the permeance of He for the single-step and two-step membranes is very similar, the two-step membranes' permselectivity (ideal separation factor) for He/CO2 (69-319) and He/CH4 (585-958) is one to two orders of magnitude higher than the single-step membranes' results of 2-7 and 69, respectively. The two-step membranes have a high activation energy for He and H2 permeance, in excess of 16 kJ mol⁻¹. The mobility energy for He permeance is three- to six-fold higher for the two-step than for the single-step membranes. As the mobility energy is higher for small pores than for large pores, and taking the permselectivity results into account, the two-step catalysed hydrolysis sol-gel process resulted in the formation of pore sizes in the region of 3 Å, while the single-step process tended to produce slightly larger pores. (C) 2002 Elsevier Science B.V. All rights reserved.
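The two quantities this abstract relies on can be illustrated with a minimal sketch. It assumes that the ideal separation factor is the ratio of single-gas permeances and that activated transport follows an Arrhenius-type law, P = P0 exp(-Ea/(R T)). The abstract states neither formula explicitly, and the function names and input values below are hypothetical, not data from the paper.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def ideal_separation_factor(permeance_a, permeance_b):
    """Permselectivity of gas A over gas B as a ratio of single-gas permeances."""
    return permeance_a / permeance_b


def activation_energy(p1, t1, p2, t2):
    """Apparent activation energy (J/mol) from permeances at two temperatures (K).

    Assumes P = P0 * exp(-Ea / (R * T)), so that
    ln(p2 / p1) = (Ea / R) * (1 / t1 - 1 / t2).
    """
    return R * math.log(p2 / p1) / (1.0 / t1 - 1.0 / t2)


# Hypothetical permeances (arbitrary units) generated with Ea = 16 kJ/mol.
p_low = math.exp(-16000.0 / (R * 323.0))
p_high = math.exp(-16000.0 / (R * 423.0))
print(activation_energy(p_low, 323.0, p_high, 423.0) / 1000.0, "kJ/mol")
print(ideal_separation_factor(6.9e-8, 1.0e-9))
```

A positive activation energy corresponds to permeance rising with temperature, which is the behaviour the abstract describes as activated transport.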
Abstract:
Introduction: Bioelectrical impedance analysis (BIA) is a useful field measure to estimate total body water (TBW). No prediction formulae have been developed or validated against a reference method in patients with pancreatic cancer. The aim of this study was to assess the agreement between three prediction equations for the estimation of TBW in cachectic patients with pancreatic cancer. Methods: Resistance was measured at frequencies of 50 and 200 kHz in 18 outpatients (10 males and eight females, age 70.2 ± 11.8 years) with pancreatic cancer from two tertiary Australian hospitals. Three published prediction formulae were used to calculate TBW: TBWs, developed in surgical patients, and TBWca-uw and TBWca-nw, developed in underweight and normal weight patients with end-stage cancer. Results: There was no significant difference in the TBW estimated by the three prediction equations: TBWs 32.9 ± 8.3 L, TBWca-nw 36.3 ± 7.4 L, TBWca-uw 34.6 ± 7.6 L. At a population level, there is agreement between prediction of TBW in patients with pancreatic cancer estimated from the three equations. The best combination of low bias and narrow limits of agreement was observed when TBW was estimated from the equation developed in the underweight cancer patients relative to the normal weight cancer patients. When no established BIA prediction equation exists, practitioners should utilize an equation developed in a population with similar critical characteristics such as diagnosis, weight loss, body mass index and/or age. Conclusions: Further research is required to determine the accuracy of the BIA prediction technique against a reference method in patients with pancreatic cancer.
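"Bias" and "limits of agreement" are usually computed in the Bland-Altman manner: the bias is the mean of the paired differences, and the limits are the bias ± 1.96 standard deviations of those differences. The abstract does not spell out its exact procedure, so the following is only a sketch under that assumption; the function name and the TBW values are made up for illustration.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical TBW estimates (litres) from two prediction equations.
tbw_eq1 = [30.0, 35.0, 40.0, 28.5, 33.0]
tbw_eq2 = [29.0, 33.0, 39.0, 28.0, 31.5]
bias, lower, upper = bland_altman(tbw_eq1, tbw_eq2)
print(f"bias={bias:.2f} L, limits of agreement=({lower:.2f}, {upper:.2f}) L")
```

A bias near zero with a narrow interval between the two limits is what "low bias and narrow limits of agreement" refers to when two equations are compared.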
Abstract:
Members of the Culex sitiens subgroup are important vectors of arboviruses, including Japanese encephalitis virus, Murray Valley encephalitis virus and Ross River virus. Of the eight described species, Cx. annulirostris Skuse, Cx. sitiens Wiedemann, and Cx. palpalis Taylor appear to be the most abundant and widespread throughout northern Australia and Papua New Guinea (PNG). Recent investigations using allozymes have shown this subgroup to contain cryptic species that possess overlapping adult morphology. We report the development of a polymerase chain reaction-restriction fragment-length polymorphism (PCR-RFLP) procedure that reliably separates these three species. This procedure utilizes the sequence variation in the ribosomal DNA ITS1 and demonstrates species-specific PCR-RFLP profiles from both colony and field collected material. Assessment of the consistency of this procedure was undertaken on mosquitoes sampled from a wide geographic area including Australia, PNG, and the Solomon Islands. Overlapping adult morphology was observed for Cx. annulirostris and Cx. palpalis in both northern Queensland and PNG and for all three species at one site in northwest Queensland.
Abstract:
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.