998 results for Algorithms genetics
Abstract:
Integrating evidence from multiple domains is useful in prioritizing disease candidate genes for subsequent testing. We ranked all known human genes (n = 3819) under linkage peaks in the Irish Study of High-Density Schizophrenia Families using three different evidence domains: 1) a meta-analysis of microarray gene expression results using the Stanley Brain collection, 2) a schizophrenia protein-protein interaction network, and 3) a systematic literature search. Each gene was assigned a domain-specific p-value and ranked after evaluating the evidence within each domain. For comparison with this ranking process, a large-scale candidate gene hypothesis was also tested by including genes with Gene Ontology terms related to neurodevelopment. Subsequently, genotypes of 3725 SNPs in 167 genes from a custom Illumina iSelect array were used to evaluate the top-ranked vs. hypothesis-selected genes. Seventy-three genes were both highly ranked and involved in neurodevelopment (category 1), while 42 and 52 genes were exclusive to neurodevelopment (category 2) or highly ranked (category 3), respectively. The most significant associations were observed in genes PRKG1, PRKCE, and CNTN4, but no individual SNPs were significant after correction for multiple testing. Comparison of the approaches showed an excess of significant tests using the hypothesis-driven neurodevelopment category. Random selection of similarly sized genes from two independent genome-wide association studies (GWAS) of schizophrenia showed the excess was unlikely by chance. In a further meta-analysis of three GWAS datasets, four candidate SNPs reached nominal significance. Although gene ranking using integrated sources of prior information did not enrich for significant results in the current experiment, gene selection using an a priori hypothesis (neurodevelopment) was superior to random selection. As such, further development of gene ranking strategies using more carefully selected sources of information is warranted.
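The abstract does not say how the three domain-specific p-values were merged into one ranking. As a minimal sketch only, assuming the domains are independent and using Fisher's method as a stand-in (not necessarily the authors' procedure), the combination and ranking step could look like this. The gene names and p-values are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method."""
    k = len(p_values)
    # Test statistic -2 * sum(ln p) is chi-square with 2k degrees of freedom
    # under the null hypothesis.
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Closed-form chi-square survival function for an even number (2k) of
    # degrees of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def rank_genes(domain_p):
    """Return (gene, combined p) pairs, most significant first."""
    combined = {gene: fisher_combined_p(ps) for gene, ps in domain_p.items()}
    return sorted(combined.items(), key=lambda item: item[1])

# Hypothetical p-values per gene, ordered as
# (expression meta-analysis, interaction network, literature search).
domain_p = {
    "GENE_A": [0.01, 0.20, 0.05],
    "GENE_B": [0.50, 0.60, 0.40],
    "GENE_C": [0.03, 0.04, 0.30],
}
ranking = rank_genes(domain_p)
for gene, p in ranking:
    print(f"{gene}\t{p:.4g}")
```

Fisher's method rewards a gene with one very strong domain as much as a gene with moderate support in every domain, so a different combination rule could produce a different ranking.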
Abstract:
Summary: We present a new R package, diveRsity, for the calculation of various diversity statistics, including common diversity partitioning statistics (θ, G) and population differentiation statistics (D, G'ST, χ² test for population heterogeneity), among others. The package calculates these estimators along with their respective bootstrapped confidence intervals at the locus, pairwise population and global levels. Various plotting tools are also provided for a visual evaluation of estimated values, allowing users to critically assess the validity and significance of statistical tests from a biological perspective. diveRsity has a set of unique features that provide an informed framework for assessing whether traditional F-statistics are valid for inferring demography, with reference to specific marker types, particularly highly polymorphic microsatellite loci. However, the package can be readily used for other co-dominant marker types (e.g. allozymes, SNPs). Detailed examples of usage and descriptions of package capabilities are provided. The examples demonstrate useful strategies for the exploration of data and interpretation of results generated by diveRsity. Additional online resources for the package are also described, including a GUI web app version intended for those with more limited experience using R for statistical analysis. © 2013 British Ecological Society.
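To illustrate why the choice of statistic matters for highly polymorphic loci, the following sketch computes Nei's GST and Jost's D for a single locus from known allele frequencies. It is written in Python for illustration and does not use or reproduce the diveRsity API. It assumes equal population weights and applies no sample-size bias correction, and the allele frequencies are hypothetical.

```python
def differentiation(pops):
    """Return (GST, Jost's D) for one locus.

    pops: list of at least two dicts mapping allele -> frequency,
    each summing to 1. Assumes the locus is polymorphic overall.
    """
    n = len(pops)
    alleles = set().union(*pops)
    # HS: mean expected heterozygosity within populations.
    hs = sum(1.0 - sum(f * f for f in p.values()) for p in pops) / n
    # HT: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freq = {a: sum(p.get(a, 0.0) for p in pops) / n for a in alleles}
    ht = 1.0 - sum(f * f for f in mean_freq.values())
    gst = (ht - hs) / ht
    jost_d = (ht - hs) / (1.0 - hs) * n / (n - 1)
    return gst, jost_d

# Two hypothetical populations, each with ten equally frequent alleles
# and no alleles in common (complete differentiation).
pop1 = {f"a{i}": 0.1 for i in range(10)}
pop2 = {f"b{i}": 0.1 for i in range(10)}
gst, jost_d = differentiation([pop1, pop2])
print(f"GST = {gst:.3f}, D = {jost_d:.3f}")
```

Although the two populations share no alleles, GST stays close to 0.05 because within-population heterozygosity is high, whereas D is close to 1.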
Abstract:
Non-invasive population genetics has become a valuable tool in ecology and conservation biology, allowing genetic studies of wild populations without the need to catch, handle or even observe the study subjects directly. We address some of the concerns regarding the limitations of using non-invasive samples by comparing the quality of population genetic information gained through DNA extracted from faecal samples and biopsy samples of two elusive bat species, Myotis mystacinus and Myotis nattereri. We demonstrate that DNA extracted from faeces and tissue samples gives comparable results for frequency-based population genetic analyses, despite the occurrence of genotyping errors when using faecal DNA. We conclude that non-invasive genetic sampling for population genetic analysis in bats is viable and, although more labour-intensive and expensive, is an alternative to tissue sampling that is particularly pertinent when specimens are rare, endangered or difficult to capture. © 2012 Museum and Institute of Zoology PAS.
Abstract:
Processor architecture has taken a turn towards many-core processors, which integrate multiple processing cores on a single chip to increase overall performance, and there are no signs that this trend will stop in the near future. Many-core processors are harder to program than multi-core and single-core processors because they require parallel or concurrent programs with high degrees of parallelism. Moreover, many-cores have to operate in a mode of strong scaling because of memory bandwidth constraints. In strong scaling, increasingly finer-grain parallelism must be extracted in order to keep all processing cores busy.
Task dataflow programming models have a high potential to simplify parallel programming because they relieve the programmer of identifying precisely all inter-task dependences when writing programs. Instead, the task dataflow runtime system detects and enforces inter-task dependences during execution based on the description of the memory each task accesses. The runtime constructs a task dataflow graph that captures all tasks and their dependences. Tasks are scheduled to execute in parallel, taking into account the dependences specified in the task graph.
Several papers report significant overheads for task dataflow systems, which severely limit the scalability and usability of such systems. In this paper we study efficient schemes to manage task graphs and analyze their scalability. We assume a programming model that supports input, output and in/out annotations on task arguments, as well as commutative in/out and reductions. We analyze the structure of task graphs and identify versions and generations as key concepts for efficient management of task graphs. Then, we present three schemes to manage task graphs, building on graph representations, hypergraphs and lists. We also consider a fourth, edge-less scheme that synchronizes tasks using integers. Analysis using micro-benchmarks shows that the graph representation is not always scalable and that the edge-less scheme introduces the least overhead in nearly all situations.
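The way a runtime can derive inter-task dependences from input, output and in/out annotations can be sketched as follows. This is a simplified, sequential illustration and not any of the schemes evaluated in the paper. The class `TaskGraph` and its methods are hypothetical names. The sketch enforces write-after-read and write-after-write orderings directly instead of using versions, and it omits commutative in/out and reductions.

```python
from collections import defaultdict

class TaskGraph:
    """Builds task dependences from in/out annotations (hypothetical sketch)."""

    def __init__(self):
        self.deps = {}                      # task -> tasks it must wait for
        self.last_writer = {}               # object -> most recent writer
        self.readers = defaultdict(list)    # object -> readers since last write

    def submit(self, task, inputs=(), outputs=(), inouts=()):
        deps = self.deps[task] = set()
        for obj in list(inputs) + list(inouts):
            # Read-after-write: wait for the task that produced the value.
            if obj in self.last_writer:
                deps.add(self.last_writer[obj])
        for obj in list(outputs) + list(inouts):
            # Write-after-read: wait for earlier readers of the old value.
            deps.update(self.readers[obj])
            # Write-after-write: wait for the previous writer.
            if obj in self.last_writer:
                deps.add(self.last_writer[obj])
            self.last_writer[obj] = task
            self.readers[obj] = []
        for obj in inputs:
            self.readers[obj].append(task)

    def waves(self):
        """Group tasks into successive sets that could run in parallel."""
        remaining = {t: set(d) for t, d in self.deps.items()}
        done, order = set(), []
        while remaining:
            ready = sorted(t for t, d in remaining.items() if d <= done)
            order.append(ready)
            done.update(ready)
            for t in ready:
                del remaining[t]
        return order

graph = TaskGraph()
graph.submit("init_a", outputs=["a"])
graph.submit("init_b", outputs=["b"])
graph.submit("sum", inputs=["a", "b"], outputs=["c"])
graph.submit("scale_a", inouts=["a"])
graph.submit("print_c", inputs=["c"])
schedule = graph.waves()
print(schedule)
```

Because each task depends only on tasks submitted before it, the graph is acyclic and `waves` always terminates. Here `scale_a` must wait for `sum`, which still reads the old value of `a`.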
Abstract:
New-onset diabetes after transplantation is a common complication that reduces recipient survival. Research in renal transplant recipients has suggested that pancreatic β-cell dysfunction, as opposed to insulin resistance, may be the key pathologic process. In this study, clinical and genetic factors associated with new-onset diabetes after transplantation were identified in a white population. A joint analysis approach, with an initial genome-wide association study in a subset of cases followed by de novo genotyping in the complete case cohort, was implemented to identify single-nucleotide polymorphisms (SNPs) associated with the development of new-onset diabetes after transplantation. Clinical variables associated with the development of diabetes after renal transplantation included older recipient age, female sex, and percentage weight gain within 12 months of transplantation. The genome-wide association study identified 26 SNPs associated with new-onset diabetes after transplantation; this association was validated for eight SNPs (rs10484821, rs7533125, rs2861484, rs11580170, rs2020902, rs1836882, rs198372, and rs4394754) by de novo genotyping. These associations remained significant after multivariate adjustment for clinical variables. Seven of these SNPs are associated with genes implicated in β-cell apoptosis. These results corroborate recent clinical evidence implicating β-cell dysfunction in the pathophysiology of new-onset diabetes after transplantation and support the pursuit of therapeutic strategies to protect β cells in the post-transplant period.
Abstract:
Many modern networks are reconfigurable, in the sense that the topology of the network can be changed by the nodes in the network. For example, peer-to-peer, wireless and ad-hoc networks are reconfigurable. More generally, many social networks, such as a company's organizational chart; infrastructure networks, such as an airline's transportation network; and biological networks, such as the human brain, are also reconfigurable. Modern reconfigurable networks have a complexity unprecedented in the history of engineering, resembling a dynamic, evolving living animal more than a structure of steel designed from a blueprint. Unfortunately, our mathematical and algorithmic tools have not yet developed enough to handle this complexity and fully exploit the flexibility of these networks. We believe that it is no longer possible to build networks that are scalable and never have node failures. Instead, these networks should be able to admit small, and perhaps periodic, failures and still recover, as skin heals from a cut. This process, in which the network recovers by maintaining key invariants in response to attack by a powerful adversary, is what we call self-healing. Here, we present several fast and provably good distributed algorithms for self-healing in reconfigurable dynamic networks. Each of these algorithms has different properties and a different set of guarantees and limitations. We also discuss future directions and theoretical questions we would like to answer.
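The idea of recovering from a node deletion while maintaining an invariant can be illustrated with a deliberately naive sketch. It is not one of the algorithms presented in the dissertation, and it is simulated centrally, not run as a distributed protocol. When a node is deleted, its former neighbours are linked into a path, which keeps the network connected and raises any surviving node's degree by at most one per deletion. The function names are hypothetical.

```python
def heal_after_deletion(adj, victim):
    """Delete `victim` from adjacency map `adj` and reconnect its neighbours.

    adj: dict mapping node -> set of neighbouring nodes (undirected graph).
    """
    neighbours = sorted(adj.pop(victim))
    for node in neighbours:
        adj[node].discard(victim)
    # Healing step: chain the former neighbours into a path. Each of them
    # lost one edge (to the victim) and gains at most two.
    for a, b in zip(neighbours, neighbours[1:]):
        adj[a].add(b)
        adj[b].add(a)

def is_connected(adj):
    """Check the invariant: every node is reachable from any other."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adj)

# A star network: deleting the hub (node 0) is the worst case for connectivity.
network = {0: {1, 2, 3, 4, 5}}
for leaf in range(1, 6):
    network[leaf] = {0}

heal_after_deletion(network, 0)
print(is_connected(network), max(len(nbrs) for nbrs in network.values()))
```

A path keeps degrees low but can stretch distances between the former neighbours; balancing invariants such as degree and stretch is the kind of trade-off the abstract's guarantees and limitations refer to.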
Abstract:
In a scenario of increasing life expectancy worldwide, it is essential to identify the characteristics of a healthy aging phenotype, including survival predictors, and to disentangle those related to environment/lifestyle from those related to familiality/genetics. To this aim we comprehensively characterised a cohort of 1,160 Italian subjects aged 90 years and over (90+, mean age 93 years; age range 90-106 years), followed for 6-year survival, belonging to 552 sib-ships (familial longevity) and recruited (2005-2008) within the EU-funded GEHA project in three Italian geographic areas (Northern, Central and Southern Italy) that differ in urban/rural and socio-economic characteristics. On the whole, the following factors emerged as significant predictors of survival after 90 years of age: absence of cognitive impairment and physical disability, high hand grip strength scores and body mass index (BMI) values, "excellent/good" self-reported health, high haemoglobin and total cholesterol levels and low creatinine levels. These parameters, excluding BMI values, were also significantly associated within sib-ships, suggesting a strong familial/genetic component. Geographical micro-heterogeneity of survival predictors emerged, such as functional and physical status being more important in Southern than in Central and Northern Italy. In conclusion, we identified modifiable survival predictors related to specific domains, whose role and importance vary according to the geographic area considered and which can help in interpreting the genetic results obtained by the GEHA project, whose major aim is the comprehensive evaluation of phenotypic and genetic data.