862 results for data replication
Abstract:
Improved sequencing technologies offer unprecedented opportunities for investigating the role of rare genetic variation in common disease. However, there are considerable challenges with respect to study design, data analysis and replication. Using pooled next-generation sequencing of 507 genes implicated in the repair of DNA in 1,150 samples, an analytical strategy focused on protein-truncating variants (PTVs) and a large-scale sequencing case-control replication experiment in 13,642 individuals, here we show that rare PTVs in the p53-inducible protein phosphatase PPM1D are associated with predisposition to breast cancer and ovarian cancer. PPM1D PTV mutations were present in 25 out of 7,781 cases versus 1 out of 5,861 controls (P = 1.12 × 10⁻⁵), including 18 mutations in 6,912 individuals with breast cancer (P = 2.42 × 10⁻⁴) and 12 mutations in 1,121 individuals with ovarian cancer (P = 3.10 × 10⁻⁹). Notably, all of the identified PPM1D PTVs were mosaic in lymphocyte DNA and clustered within a 370-base-pair region in the final exon of the gene, carboxy-terminal to the phosphatase catalytic domain. Functional studies demonstrate that the mutations result in enhanced suppression of p53 in response to ionizing radiation exposure, suggesting that the mutant alleles encode hyperactive PPM1D isoforms. Thus, although the mutations cause premature protein truncation, they do not result in the simple loss-of-function effect typically associated with this class of variant, but instead probably have a gain-of-function effect. Our results have implications for the detection and management of breast and ovarian cancer risk. More generally, these data provide new insights into the role of rare and of mosaic genetic variants in common conditions, and the use of sequencing in their identification.
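The case-control counts quoted in the abstract above (25 of 7,781 cases versus 1 of 5,861 controls) can be checked with a small calculation. The abstract does not name the statistical test, so the two-sided Fisher's exact test below is an assumption, and the function name `fisher_exact_two_sided` is hypothetical. It is a minimal sketch using only the Python standard library, not the authors' analysis pipeline.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (not taken from the paper):
      a = cases with a mutation,    b = cases without
      c = controls with a mutation, d = controls without
    """
    row1 = a + b          # total cases
    row2 = c + d          # total controls
    col1 = a + c          # total carriers
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Counts from the abstract: 25/7,781 cases, 1/5,861 controls.
    p = fisher_exact_two_sided(25, 7781 - 25, 1, 5861 - 1)
    print(f"two-sided P = {p:.3g}")
```

Under this assumed test the result is of the same order of magnitude as the reported overall P value; the breast and ovarian subgroup values cannot be rechecked this way because the abstract does not state which controls were used for them.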
Abstract:
Copy number variations (CNVs) as described in the healthy population are purported to contribute significantly to genetic heterogeneity. Recent studies have described CNVs using lymphoblastoid cell lines or by application of specifically developed algorithms to interrogate previously described data. However, the full extent of CNVs remains unclear. Using a high-density SNP array, we have undertaken a comprehensive investigation of chromosome 18 for CNV discovery and characterisation of distribution and association with chromosome architecture. We identified 399 CNVs, of which 98% are losses, 58% are less than 2.5 kb in size and 71% are intergenic. Intronic deletions account for the majority of copy number changes with gene involvement. Furthermore, one-third of CNVs do not have putative breakpoints within repetitive sequences. We conclude that replicative processes, mediated either by repetitive elements or microhomology, account for the majority of CNVs in the healthy population. Genomic instability involving the formation of a non-B structure is demonstrated in one region.
Abstract:
Rapid advances in sequencing technologies (Next Generation Sequencing or NGS) have led to a vast increase in the quantity of bioinformatics data available, with this increasing scale presenting enormous challenges to researchers seeking to identify complex interactions. This paper is concerned with the domain of transcriptional regulation, and the use of visualisation to identify relationships between specific regulatory proteins (the transcription factors or TFs) and their associated target genes (TGs). We present preliminary work from an ongoing study which aims to determine the effectiveness of different visual representations and large-scale displays in supporting discovery. Following an iterative process of implementation and evaluation, representations were tested by potential users in the bioinformatics domain to determine their efficacy, and to understand better the range of ad hoc practices among bioinformatics-literate users. Results from two rounds of small-scale user studies are considered, with initial findings suggesting that bioinformaticians require richly detailed views of TF data, features to compare TF layouts between organisms quickly, and ways to keep track of interesting data points.
Abstract:
Background and aims. Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease characterized by progressive inflammation and fibrosis of the bile ducts eventually leading to biliary cirrhosis. Recent genetic studies in PSC have identified associations at 2q13, 2q35, 3p21, 4q27, 13q31 and suggestive association at 10p15. The aim of this study was to further characterize and refine the genetic architecture of PSC. Methods. We analyzed previously reported associated SNPs at four of these non-HLA loci and 59 SNPs tagging the IL-2/IL-21 (4q27) and IL2RA (10p15) loci in 992 UK PSC cases and 5162 healthy UK controls. Results. The most associated SNPs identified were rs3197999 (3p21 (MST1), p = 1.9 × 10⁻⁶, OR A vs G = 1.28, 95% CI (1.16-1.42)); rs4147359 (10p15 (IL2RA), p = 2.6 × 10⁻⁴, OR A vs G = 1.20, 95% CI (1.09-1.33)) and rs12511287 (4q27 (IL-2/IL-21), p = 3.0 × 10⁻⁴, OR A vs T = 1.21, 95% CI (1.09-1.35)). In addition, we performed a meta-analysis for selected SNPs using published summary statistics from recent studies. We observed genome-wide significance for rs3197999 (3p21 (MST1), P combined = 3.8 × 10⁻¹²) and rs4147359 (10p15 (IL2RA), P combined = 1.5 × 10⁻⁸). Conclusion. We have for the first time confirmed the association of PSC with genetic variants at the 10p15 (IL2RA) locus at genome-wide significance and replicated the associations at the MST1 and IL-2/IL-21 loci in a large homogeneous UK population. These results strongly implicate the IL-2/IL2RA pathway in PSC and provide further confirmation of the MST1 association. © Informa Healthcare.
Abstract:
This thesis has investigated how to cluster a large number of faces within a multi-media corpus in the presence of large session variation. Quality metrics are used to select the best faces to represent a sequence of faces; and session variation modelling improves clustering performance in the presence of wide variations across videos. Findings from this thesis contribute to improving the performance of both face verification systems and the fully automated clustering of faces from a large video corpus.
Abstract:
A strong association between ERAP1 and ankylosing spondylitis (AS) was recently identified by the Wellcome Trust Case Control Consortium and the Australo-Anglo-American Spondylitis Consortium (WTCCC-TASC) study. ERAP1 is highly polymorphic with strong linkage disequilibrium evident across the gene. We therefore conducted a series of experiments to try to identify the primary genetic association(s) with ERAP1. We replicated the original associations in an independent set of 730 patients and 1021 controls, resequenced ERAP1 to define the full extent of coding polymorphisms and tested all variants in additional association studies. The genetic association with ERAP1 was independently confirmed; the strongest association was with rs30187 in the replication set (P = 3.4 × 10⁻³). When the data were combined with the original WTCCC-TASC study the strongest association was with rs27044 (P = 1.1 × 10⁻⁹). We identified 33 sequence polymorphisms in ERAP1, including three novel and eight known non-synonymous polymorphisms. We report several new associations between AS and polymorphisms distributed across ERAP1 from the extended case-control study, the most significant of which was with rs27434 (P = 4.7 × 10⁻⁷). Regression analysis failed to identify a primary association clearly; we therefore used data from HapMap to impute genotypes for an additional 205 non-coding SNPs located within and adjacent to ERAP1. A number of highly significant associations (P < 5 × 10⁻⁹) were identified in regulatory sequences which are good candidates for causing susceptibility to AS, possibly by regulating ERAP1 expression. © 2009 The Author(s).
Abstract:
Sensor networks for environmental monitoring present enormous benefits to the community and society as a whole. Currently there is a need for low cost, compact, solar powered sensors suitable for deployment in rural areas. The purpose of this research is to develop both a ground based wireless sensor network and data collection using unmanned aerial vehicles. The ground based sensor system is capable of measuring environmental data such as temperature or air quality using cost effective low power sensors. The sensor will be configured such that its data is stored on an ATMega16 microcontroller which will have the capability of communicating with a UAV flying overhead using UAV communication protocols. The data is then either sent to the ground in real time or stored on the UAV using a microcontroller until it lands or is close enough to enable the transmission of data to the ground station.
Abstract:
This technical report describes a methodology for Light Detection and Ranging (LiDAR)-augmented optimal path planning in low-level flight for Unmanned Aerial Vehicles (UAVs) used in remote sensing and sampling. The UAV is used to perform remote air sampling and data acquisition from a network of sensors on the ground. The terrain data, in the form of 3D point cloud maps, is processed by the algorithms to find an optimal path. The results show that the method and algorithm are able to use the LiDAR data to avoid obstacles when planning a path from a start to a target point. The report compares the performance of the method as the resolution of the LiDAR map is increased and when a Digital Elevation Model (DEM) is included. From a practical point of view, the optimal path plan is loaded into and works seamlessly with the UAV ground station; the report also shows the UAV ground station software augmented with more accurate LiDAR data.
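The abstract above does not say which planning algorithm the report uses. Purely as an illustration of obstacle-avoiding path planning over terrain data, the sketch below runs Dijkstra's shortest-path search on a 2D height grid. The grid representation, the fixed `flight_altitude` threshold, the four-neighbour moves and the function name `plan_path` are all assumptions made here, not details from the report.

```python
import heapq


def plan_path(heights, start, goal, flight_altitude):
    """Shortest 4-connected path over a height grid (hypothetical model).

    heights: 2D list of terrain/obstacle heights, e.g. rasterised from a
             point cloud (the rasterisation step is assumed, not shown).
    A cell is passable only if its height is below flight_altitude.
    Returns a list of (row, col) cells, or None if no path exists.
    """
    rows, cols = len(heights), len(heights[0])

    def free(cell):
        r, c = cell
        return heights[r][c] < flight_altitude

    if not (free(start) and free(goal)):
        return None

    dist = {start: 0}
    prev = {}
    queue = [(0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist[cell]:
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and free(nxt):
                nd = d + 1  # unit cost per move
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    prev[nxt] = cell
                    heapq.heappush(queue, (nd, nxt))
    return None
```

A finer grid gives the planner more freedom around obstacles at the cost of a larger search, which is one way to think about the resolution comparison the report describes.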
Abstract:
We have genotyped 14,436 nonsynonymous SNPs (nsSNPs) and 897 major histocompatibility complex (MHC) tag SNPs from 1,000 independent cases of ankylosing spondylitis (AS), autoimmune thyroid disease (AITD), multiple sclerosis (MS) and breast cancer (BC). Comparing these data against a common control dataset derived from 1,500 randomly selected healthy British individuals, we report initial association and independent replication in a North American sample of two new loci related to ankylosing spondylitis, ARTS1 and IL23R, and confirmation of the previously reported association of AITD with TSHR and FCRL3. These findings, enabled in part by increased statistical power resulting from the expansion of the control reference group to include individuals from the other disease groups, highlight notable new possibilities for autoimmune regulation and suggest that IL23R may be a common susceptibility factor for the major 'seronegative' diseases.
Abstract:
On the basis of local data, we write in support of the conclusions of Smith and Ahern that current Pharmaceutical Benefits Scheme (PBS) criteria for tumour necrosis factor (TNF)-α inhibitors in ankylosing spondylitis (AS) are not evidence-based [1]. As a prerequisite to the appropriate use of biological therapy in AS, three aspects of the disease need to be defined: (i) diagnosis, (ii) activity and (iii) therapeutic failure (Table 1)....
Abstract:
This poster presents key features of how QUT’s integrated research data storage and management services work with researchers through their own individual or team research life cycle. By understanding the characteristics of research data, and the long-term need to store this data, QUT has provided resources and tools that support QUT’s goal of being a research intensive institute. Key to successful delivery and operation has been the focus upon researchers’ individual needs and the collaboration between providers, in particular, Information Technology Services, High Performance Computing and Research Support, and QUT Library. QUT’s Research Data Storage service provides all QUT researchers (staff and Higher Degree Research students (HDRs)) with a secure data repository throughout the research data lifecycle. Three distinct storage areas provide for raw research data to be acquired, project data to be worked on, and published data to be archived. Since the service was launched in late 2014, it has provided research project teams from all QUT faculties with acquisition, working or archival data space. Feedback indicates that the storage suits the unique needs of researchers and their data. As part of the workflow to establish storage space for researchers, Research Support Specialists and Research Data Librarians consult with researchers and HDRs to identify data storage requirements for projects and individual researchers, and to select and implement the most suitable data storage services and facilities. While research can be a journey into the unknown[1], a plan can help navigate through the uncertainty. Intertwined in the storage provision is QUT’s Research Data Management Planning tool. Launched in March 2015, it has already attracted 273 QUT staff and 352 HDR student registrations, and over 620 plans have been created (2/10/2015). Developed in collaboration with Office of Research Ethics and Integrity (OREI), uptake of the plan has exceeded expectations.
Abstract:
Surveying bird species richness is one of the most intriguing ecological topics for evaluating environmental health. Here, bird species richness denotes the number of unique bird species in a particular area. Factors affecting the investigation of bird species richness include weather, observation bias and, most importantly, the prohibitive cost of conducting surveys at large spatiotemporal scales. Thanks to advances in recording techniques, these problems have been alleviated by deploying sensors for acoustic data collection. Although automated detection techniques have been introduced to identify various bird species, the innate complexity of bird vocalizations, the background noise present in the recordings and the escalating volumes of acoustic data make determining bird species richness a challenging task. In this paper we propose a two-step computer-assisted sampling approach for determining bird species richness in one day of acoustic data. First, a classification model based on acoustic indices is built to filter out minutes that contain few bird species. Then the minutes classified as containing birds are ranked by an acoustic index, and temporally redundant minutes are removed from the ranked sequence. The experimental results show that our method directs experts to bird species more efficiently than previous methods.
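The two-step sampling idea in the abstract above can be sketched roughly as follows. The abstract names neither the classifier, the ranking index nor the redundancy rule, so everything specific here is an assumption: the `Minute` record, a precomputed classifier `score`, the `score_threshold` cut-off, and the treatment of "redundant" as "within `min_gap` minutes of an already selected minute".

```python
from dataclasses import dataclass


@dataclass
class Minute:
    index: int             # minute of the day, 0..1439
    score: float           # hypothetical classifier score for bird activity
    acoustic_index: float  # hypothetical index used for ranking


def select_minutes(minutes, score_threshold=0.5, min_gap=5, budget=60):
    """Pick minutes for expert review (illustrative sketch only)."""
    # Step 1: drop minutes the classifier considers poor in bird species.
    candidates = [m for m in minutes if m.score >= score_threshold]

    # Step 2a: rank the remaining minutes by the acoustic index.
    ranked = sorted(candidates, key=lambda m: m.acoustic_index, reverse=True)

    # Step 2b: skip minutes that sit too close in time to one already chosen.
    selected = []
    for m in ranked:
        if all(abs(m.index - s.index) >= min_gap for s in selected):
            selected.append(m)
        if len(selected) == budget:
            break
    return selected
```

The `budget` parameter reflects the practical constraint that an expert can only listen to a limited number of minutes; it is likewise an assumption and not a figure from the paper.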
Abstract:
This article describes a maximum likelihood method for estimating the parameters of the standard square-root stochastic volatility model and a variant of the model that includes jumps in equity prices. The model is fitted to data on the S&P 500 Index and the prices of vanilla options written on the index, for the period 1990 to 2011. The method is able to estimate both the parameters of the physical measure (associated with the index) and the parameters of the risk-neutral measure (associated with the options), including the volatility and jump risk premia. The estimation is implemented using a particle filter whose efficacy is demonstrated under simulation. The computational load of this estimation method, which previously has been prohibitive, is managed by the effective use of parallel computing using graphics processing units (GPUs). The empirical results indicate that the parameters of the models are reliably estimated and consistent with values reported in previous work. In particular, both the volatility risk premium and the jump risk premium are found to be significant.
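To illustrate how a particle filter yields a likelihood for the square-root stochastic volatility model mentioned in the abstract above, here is a deliberately simplified bootstrap filter. It is not the article's estimator: it uses index returns only (no option prices), omits jumps and the return-volatility correlation, uses an Euler discretisation with variance truncated at zero, and runs on the CPU with the standard library. The function and parameter names are hypothetical.

```python
import math
import random


def sv_particle_loglik(returns, mu, kappa, theta, sigma,
                       dt=1 / 252, n_particles=500, seed=0):
    """Bootstrap particle filter estimate of the log-likelihood.

    Assumed discretised model (a simplification, see lead-in):
      r_t     ~ Normal((mu - v_t / 2) * dt, v_t * dt)
      v_{t+1} = max(v_t + kappa * (theta - v_t) * dt
                    + sigma * sqrt(v_t * dt) * eps, 0)
    """
    rng = random.Random(seed)
    particles = [theta] * n_particles  # start every particle at the long-run variance
    loglik = 0.0
    for r in returns:
        # Weight each particle by the density of the observed return.
        weights = []
        for v in particles:
            var = max(v, 1e-12) * dt
            mean = (mu - 0.5 * v) * dt
            dens = math.exp(-0.5 * (r - mean) ** 2 / var)
            weights.append(dens / math.sqrt(2 * math.pi * var))
        total = sum(weights)
        if total == 0.0:
            return float("-inf")
        # The average weight estimates the one-step predictive likelihood.
        loglik += math.log(total / n_particles)
        # Multinomial resampling, then propagate the variance one step.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        particles = [
            max(v + kappa * (theta - v) * dt
                + sigma * math.sqrt(max(v, 0.0) * dt) * rng.gauss(0.0, 1.0),
                0.0)
            for v in particles
        ]
    return loglik
```

Maximum likelihood estimation would wrap this function in an optimiser over `mu`, `kappa`, `theta` and `sigma`. Each evaluation loops over every particle at every time step, which hints at why the abstract describes the computational load as the main obstacle and turns to parallel hardware.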
Abstract:
Big Data and predictive analytics have received significant attention from the media and academic literature throughout the past few years, and it is likely that these emerging technologies will materially impact the mining sector. This short communication argues, however, that these technological forces will probably unfold differently in the mining industry than they have in many other sectors because of significant differences in the marginal cost of data capture and storage. To this end, we offer a brief overview of what Big Data and predictive analytics are, and explain how they are bringing about changes in a broad range of sectors. We discuss the “N=all” approach to data collection being promoted by many consultants and technology vendors in the marketplace but, by considering the economic and technical realities of data acquisition and storage, we then explain why an “n ≪ all” data collection strategy probably makes more sense for the mining sector. Finally, towards shaping the industry’s policies with regard to technology-related investments in this area, we conclude by putting forward a conceptual model for leveraging Big Data tools and analytical techniques that is a more appropriate fit for the mining sector.