935 results for Functional Requirements for Authority Data (FRAD)
Abstract:
The SVWN, BVWN, BP86, BLYP, BPW91, B3P86, B3LYP, B3PW91, B1LYP, mPW1PW, and PBE1PBE density functionals, as implemented in Gaussian 98 and Gaussian 03, were used to calculate ΔG0 and ΔH0 values for 17 deprotonation reactions where the experimental values are accurately known. The PBE1PBE and B3P86 functionals are shown to compute results with accuracy comparable to more computationally intensive compound model chemistries. A rationale for the relative performance of various functionals is explored.
Abstract:
The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology. We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics data by proposing two statistical tests, namely edge permutation and node label permutation tests. We demonstrate the use of the proposed tests by finding significant association between a Gene Ontology-derived "predictome" and data obtained from mRNA expression and phenotypic experiments for Saccharomyces cerevisiae. Moreover, we employ the graph theoretic framework to recast a surprising discrepancy presented in Giaever et al. (2002) between gene expression and knockout phenotype, using expression data from a different set of experiments.
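A node label permutation test of the kind named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the two data sources are given as undirected edge lists over the same node set, and it uses the number of shared edges as the association statistic. The function names and the choice of statistic are assumptions made for this sketch.

```python
import random


def normalize(edges):
    """Represent undirected edges as a set of frozensets so (u, v) == (v, u)."""
    return {frozenset(edge) for edge in edges}


def node_label_permutation_test(nodes, edges_a, edges_b, n_perm=2000, seed=0):
    """Estimate how often a random relabelling of graph B's nodes shares
    at least as many edges with graph A as the observed labelling does.

    Returns (observed_shared_edges, empirical_p_value).
    """
    rng = random.Random(seed)
    nodes = list(nodes)
    set_a = normalize(edges_a)
    observed = len(set_a & normalize(edges_b))

    hits = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        # Apply the random relabelling to every edge of graph B.
        permuted_b = {frozenset((relabel[u], relabel[v])) for u, v in edges_b}
        if len(set_a & permuted_b) >= observed:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    toy_nodes = list(range(8))
    toy_edges = [(i, i + 1) for i in range(7)]
    print(node_label_permutation_test(toy_nodes, toy_edges, toy_edges))
```

Permuting node labels preserves the full topology of graph B, so the null distribution reflects only the loss of correspondence between the two node sets. An edge permutation test would instead randomize which node pairs are connected.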
Abstract:
We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set of tools for increasingly complex data collected in medical and public health studies. Our methods were motivated by and are illustrated with a state-of-the-art study of neuronal tracts in multiple sclerosis patients and healthy controls. Using the tools we have developed, we were able to find those locations along the tract most affected by the disease. However, our methods are general and highly relevant to many functional data sets. In addition to the application to one-dimensional tract profiles illustrated here, higher-dimensional extensions of the methodology could have direct applications to other biological data including functional and structural MRI.
Abstract:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This greatly increases the need for reliable methods to systematically and automatically suggest protein functions and the relationships between them. With the available PPI data, it is now possible to study these functions and relationships in the context of a large-scale network. To date, several network-based schemes have been proposed to annotate protein functions effectively on a large scale. However, because of the noise inherent in high-throughput data generation, new methods and algorithms are needed to increase the reliability of functional annotations. Previous work on a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins and hence suggest their functions. One advantage of that work is that its algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods, which we applied to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work of Samanta and Liang, the algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins. Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins, with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis of a subcluster enriched in the transforming growth factor β signaling pathway (P < 10^-50), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways that are worthy of further experimental investigation. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
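The idea of scoring two proteins by an "unusually large number" of shared neighbors can be illustrated with a hypergeometric tail probability. This is a stand-in sketch under stated assumptions, not the algorithm developed in the study and not a reproduction of the Samanta and Liang formula. The function names and the adjacency-dictionary input format are hypothetical.

```python
from math import comb


def shared_neighbor_pvalue(n_total, deg_a, deg_b, shared):
    """P(X >= shared) when deg_b neighbors are drawn at random from n_total
    candidate proteins, deg_a of which are neighbors of the first protein."""
    upper = min(deg_a, deg_b)
    denom = comb(n_total, deg_b)
    tail = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / denom


def pair_pvalue(adjacency, a, b):
    """Score one protein pair from an adjacency dict {protein: set(neighbors)}.

    The two proteins themselves are excluded from the candidate pool.
    """
    neigh_a = adjacency[a] - {b}
    neigh_b = adjacency[b] - {a}
    n_total = len(adjacency) - 2
    return shared_neighbor_pvalue(
        n_total, len(neigh_a), len(neigh_b), len(neigh_a & neigh_b)
    )
```

A small p-value flags a pair whose overlap is unlikely under random wiring. A hub protein has a large degree, which raises the overlap expected by chance, so a degree-aware score of this kind is one way to limit the influence of hubs.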
Abstract:
Resting-state functional connectivity (FC) fMRI (rs-fcMRI) offers an appealing approach to mapping the brain's intrinsic functional organization. Blood oxygen level dependent (BOLD) and arterial spin labeling (ASL) are the two main rs-fcMRI approaches to assess alterations in brain networks associated with individual differences, behavior and psychopathology. While the BOLD signal is stronger with a higher temporal resolution, ASL provides quantitative, direct measures of the physiology and metabolism of specific networks. This study systematically investigated the similarity and reliability of resting brain networks (RBNs) in BOLD and ASL. A 2×2×2 factorial design was employed where each subject underwent repeated BOLD and ASL rs-fcMRI scans on two occasions on two MRI scanners respectively. Both independent and joint FC analyses revealed common RBNs in ASL and BOLD rs-fcMRI with a moderate to high level of spatial overlap, verified by Dice Similarity Coefficients. Test-retest analyses indicated more reliable spatial network patterns in BOLD (average modal Intraclass Correlation Coefficients: 0.905±0.033 between-sessions; 0.885±0.052 between-scanners) than ASL (0.545±0.048; 0.575±0.059). Nevertheless, ASL provided highly reproducible (0.955±0.021; 0.970±0.011) network-specific CBF measurements. Moreover, we observed positive correlations between regional CBF and FC in core areas of all RBNs indicating a relationship between network connectivity and its baseline metabolism. Taken together, the combination of ASL and BOLD rs-fcMRI provides a powerful tool for characterizing the spatiotemporal and quantitative properties of RBNs. These findings pave the way for future BOLD and ASL rs-fcMRI studies in clinical populations that are carried out across time and scanners.
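The Dice Similarity Coefficient used above to verify spatial overlap is defined for two sets A and B as 2|A ∩ B| / (|A| + |B|). A minimal sketch follows. It assumes each thresholded network map is represented as a collection of voxel indices, and the function name is hypothetical.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice Similarity Coefficient between two sets of voxel indices.

    Returns 1.0 for identical sets and 0.0 for disjoint ones.
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty maps overlap perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))
```

For example, two maps of three voxels each that share two voxels give 2 × 2 / (3 + 3) ≈ 0.67.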
Abstract:
Formation of the FtsZ ring (Z ring) in Escherichia coli is the first step in assembly of the divisome, a molecular machine composed of 14 known proteins which are all required for cell division. Although the biochemical functions of most divisome proteins are unknown, several of these have overlapping roles in ensuring that the Z ring assembles at the cytoplasmic membrane and is active. We identified a single amino acid change in FtsA, R286W, renamed FtsA*, that completely bypasses the requirement for ZipA in cell division. This and other data suggest that FtsA* is a hyperactive form of FtsA that can replace the multiple functions normally assumed by ZipA, which include stabilization of Z rings, recruitment of downstream cell division proteins, and anchoring the Z ring to the membrane. This is the first example of complete functional replacement of an essential prokaryotic cell division protein by another. Cells expressing ftsA* with a complete deletion of ftsK are viable and divide, although many of these ftsK null cells formed multiseptate chains, suggesting a role in cell separation for FtsK. In addition, strains expressing extra ftsAZ, ftsQ, ftsB, zipA or ftsN were also able to survive and divide in the absence of ftsK. The cytoplasmic and transmembrane domains of FtsQ were sufficient to allow viability and septum formation in ftsK-deleted strains. These findings suggest that FtsK is normally involved in stabilizing the divisome and shares functional overlap with other cell division proteins. As well as permitting the removal of other divisome components, the presence of FtsA* in otherwise wild-type cells accelerated Z-ring assembly, which resulted in a significant decrease in the average length of cells. In support of its role in Z-ring stability, FtsA* suppressed the cell division inhibition caused by overexpressing FtsZ. FtsA* did not affect FtsZ turnover within the Z ring as measured by fluorescence recovery after photobleaching. Turnover of FtsA* in the ring was somewhat faster than that of wild-type FtsA. Yeast two-hybrid data suggest that FtsA* has an increased affinity for FtsZ relative to wild-type FtsA. These results indicate that FtsA* interacts with FtsZ more strongly, and its enhancement of Z-ring assembly may explain why FtsA* can permit survival of cells lacking ZipA or FtsK.
Abstract:
Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This poses a great challenge for sequence-based genetic studies of complex diseases. This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, ultimately shifting the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques will be used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of the genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their application, the functional linear models were applied to five quantitative traits in the Framingham Heart Study. This project proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.
Abstract:
Pteropods are a group of holoplanktonic gastropods for which global biomass distribution patterns remain poorly resolved. The aim of this study was to collect and synthesize existing pteropod (Gymnosomata, Thecosomata and Pseudothecosomata) abundance and biomass data, in order to evaluate the global distribution of pteropod carbon biomass, with a particular emphasis on its seasonal, temporal and vertical patterns. We collected 25 902 data points from several online databases and a number of scientific articles. The biomass data have been gridded onto a 360 x 180° grid, with a vertical resolution of 33 WOA depth levels. The data have been converted to NetCDF format. Data were collected between 1951 and 2010, with sampling depths ranging from 0 to 1000 m. Pteropod biomass data were either extracted directly or derived by converting abundance to biomass with pteropod-specific length-to-weight conversions. In the Northern Hemisphere (NH) the data were distributed evenly throughout the year, whereas sampling in the Southern Hemisphere (SH) was biased towards the austral summer months. 86% of all biomass values were located in the NH, most (42%) within the latitudinal band of 30-50° N. The range of global biomass values spanned over three orders of magnitude, with a mean and median biomass concentration of 8.2 mg C l-1 (SD = 61.4) and 0.25 mg C l-1, respectively, for all data points, and with a mean of 9.1 mg C l-1 (SD = 64.8) and a median of 0.25 mg C l-1 for non-zero biomass values. The highest mean and median biomass concentrations were located in the NH between 40-50° N (mean biomass: 68.8 mg C l-1 (SD = 213.4); median biomass: 2.5 mg C l-1) while, in the SH, they were within the 70-80° S latitudinal band (mean: 10.5 mg C l-1 (SD = 38.8) and median: 0.2 mg C l-1). Biomass values were lowest in the equatorial regions. A broad range of biomass concentrations was observed at all depths, with the biomass peak located in the surface layer (0-25 m) and values generally decreasing with depth. However, biomass peaks were located at different depths in different ocean basins: 0-25 m depth in the N Atlantic, 50-100 m in the Pacific, 100-200 m in the Arctic, 200-500 m in the Brazilian region and >500 m in the Indo-Pacific region. Biomass in the NH was relatively invariant over the seasonal cycle, but more seasonally variable in the SH. The collected database provides a valuable tool for modellers studying ecosystem processes and global biogeochemical cycles.
Abstract:
The smallest marine phytoplankton, collectively termed picophytoplankton, have been routinely enumerated by flow cytometry since the late 1980s, during cruises throughout most of the world ocean. We compiled a database of 40,946 data points, with separate abundance entries for Prochlorococcus, Synechococcus and picoeukaryotes. We use average conversion factors for each of the three groups to convert the abundance data to carbon biomass. After gridding with 1° spacing, the database covers 2.4% of the ocean surface area, with the best data coverage in the North Atlantic, the South Pacific and North Indian basins. The average picophytoplankton biomass is 12 ± 22 µg C L-1 or 1.9 g C m-2. We estimate a total global picophytoplankton biomass, excluding N2-fixers, of 0.53 - 0.74 Pg C (17 - 39 % Prochlorococcus, 12 - 15 % Synechococcus and 49 - 69 % picoeukaryotes). Future efforts in this area of research should focus on reporting calibrated cell size, and collecting data in undersampled regions.
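The two processing steps described above, converting abundance to carbon biomass with a per-group factor and then gridding with 1° spacing, can be sketched as follows. The conversion factors below are hypothetical placeholders, not the values used in the database. The record layout, the units (cells per mL in, µg C per L out) and the function names are likewise assumptions made for illustration.

```python
import math
from collections import defaultdict

# HYPOTHETICAL placeholder factors in fg C per cell; not the database's values.
CARBON_PER_CELL_FG = {
    "Prochlorococcus": 50.0,
    "Synechococcus": 200.0,
    "picoeukaryotes": 1500.0,
}


def biomass_ug_c_per_l(abundances):
    """Convert {group: cells per mL} to total carbon biomass in µg C per L."""
    total_fg_per_ml = sum(
        cells * CARBON_PER_CELL_FG[group] for group, cells in abundances.items()
    )
    # 1000 mL per L, and 1e9 fg per µg.
    return total_fg_per_ml * 1000 / 1e9


def grid_cell(lat, lon):
    """Index of the 1° x 1° cell, identified by its south-west corner."""
    return (math.floor(lat), math.floor(lon))


def grid_mean_biomass(samples):
    """Average biomass per 1° cell from (lat, lon, {group: cells per mL}) samples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, abundances in samples:
        cell = grid_cell(lat, lon)
        sums[cell] += biomass_ug_c_per_l(abundances)
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Counting the occupied cells in the result, weighted by cell area, is one way a coverage figure such as the 2.4% of ocean surface area quoted above could be derived.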
Abstract:
This study is a first effort to compile the largest possible body of data available from different plankton databases, as well as from individual published or unpublished datasets, regarding diatom distribution in the world ocean. The data obtained originate from time series studies as well as spatial studies. This effort is supported by the Marine Ecosystem Data (MAREDAT) project, which aims at building consistent data sets for the main PFTs (Plankton Functional Types) in order to help validate biogeochemical ocean models by using C biomass converted from abundance data. Diatom abundance data were obtained from various research programs with the associated geolocation and date of collection, as well as with taxonomic information ranging from group down to species. Minimum, maximum and average cell size information was mined from the literature for each taxonomic entry, and all abundance data were subsequently converted to biovolume and C biomass using the same methodology.
Abstract:
Microzooplankton database. Originally published in: Buitenhuis, Erik, Richard Rivkin, Sévrine Sailley, Corinne Le Quéré (2010) Biogeochemical fluxes through microzooplankton. Global Biogeochemical Cycles Vol. 24, GB4015, doi:10.1029/2009GB003601. This new version corrects some mistakes.