340 results for datasets


Relevance: 10.00%

Abstract:

The SNP-SNP interactome has rarely been explored in the context of neuroimaging genetics, mainly due to the complexity of conducting approximately 10^11 pairwise statistical tests. However, recent advances in machine learning, specifically the iterative sure independence screening (SIS) method, have enabled the analysis of datasets where the number of predictors is much larger than the number of observations. Using an implementation of the SIS algorithm (called EPISIS), we performed an exhaustive search of the genome-wide SNP-SNP interactome to identify and prioritize SNPs for interaction analysis. We identified a significant SNP pair, rs1345203 and rs1213205, associated with temporal lobe volume. We further examined the full-brain, voxelwise effects of the interaction in the ADNI dataset and separately in an independent dataset of healthy twins (QTIM). We found that each additional loading in the epistatic effect was associated with approximately 5% greater regional brain volume (a protective effect) in both the ADNI and QTIM samples.
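The screening idea behind SIS can be illustrated with a minimal sketch: rank every predictor by the absolute value of its marginal correlation with the response and keep only the top few. This is the basic, non-iterative screening step only; it is not the EPISIS implementation, and the function names and toy data below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(predictors, response, keep):
    """Return the indices of the `keep` predictors with the largest |marginal correlation|."""
    scored = [(abs(pearson(col, response)), j) for j, col in enumerate(predictors)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scored[:keep]]


# Hypothetical toy data: 4 predictors (e.g. genotype codes), 6 observations.
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [1, 1, 0, 2, 0, 1],
    [2, 1, 0, 2, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
volume = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]

selected = sis_screen(genotypes, volume, keep=2)
print(selected)
```

In a real analysis the retained predictors would then be passed to a joint model, and the iterative variant repeats the screening on residuals.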

Relevance: 10.00%

Abstract:

The ENIGMA (Enhancing NeuroImaging Genetics through Meta-Analysis) Consortium was set up to analyze brain measures and genotypes from multiple sites across the world to improve the power to detect genetic variants that influence the brain. Diffusion tensor imaging (DTI) yields quantitative measures sensitive to brain development and degeneration, and some common genetic variants may be associated with white matter integrity or connectivity. DTI measures, such as the fractional anisotropy (FA) of water diffusion, may be useful for identifying genetic variants that influence brain microstructure. However, genome-wide association studies (GWAS) require large populations to obtain sufficient power to detect and replicate significant effects, motivating a multi-site consortium effort. As part of an ENIGMA-DTI working group, we analyzed high-resolution FA images from multiple imaging sites across North America, Australia, and Europe, to address the challenge of harmonizing imaging data collected at multiple sites. Four hundred images of healthy adults aged 18-85 from four sites were used to create a template and corresponding skeletonized FA image as a common reference space. Using twin and pedigree samples of different ethnicities, we used our common template to evaluate the heritability of tract-derived FA measures. We show that our template is reliable for integrating multiple datasets by combining results through meta-analysis and unifying the data through exploratory mega-analyses. Our results may help prioritize regions of the FA map that are consistently influenced by additive genetic factors for future genetic discovery studies. Protocols and templates are publicly available at (http://enigma.loni.ucla.edu/ongoing/dti-working-group/).

Relevance: 10.00%

Abstract:

Combining datasets across independent studies can boost statistical power by increasing the number of observations and can yield more accurate estimates of effect sizes. This is especially important for genetic studies, where a large number of observations is required to obtain sufficient power to detect and replicate genetic effects. There is a need to develop and evaluate methods for joint analyses of rich datasets collected in imaging genetics studies. The ENIGMA-DTI consortium is developing and evaluating approaches for obtaining pooled estimates of heritability through meta- and mega-genetic analytical approaches, to estimate the general additive genetic contributions to the intersubject variance in fractional anisotropy (FA) measured from diffusion tensor imaging (DTI). We used the ENIGMA-DTI data harmonization protocol for uniform processing of DTI data from multiple sites. We evaluated this protocol in five family-based cohorts providing data from a total of 2248 children and adults (ages: 9-85) collected with various imaging protocols. We used the imaging genetics analysis tool, SOLAR-Eclipse, to combine twin and family data from Dutch, Australian and Mexican-American cohorts into one large "mega-family". We showed that heritability estimates may vary from one cohort to another. We used two meta-analytical approaches (sample-size weighted and standard-error weighted) and a mega-genetic analysis to calculate heritability estimates across populations. We performed leave-one-out analysis of the joint estimates of heritability, removing a different cohort each time to understand the estimate variability. Overall, meta- and mega-genetic analyses of heritability produced robust estimates of heritability.
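The two meta-analytical weightings and the leave-one-out check can be sketched as follows. This assumes that "standard-error weighted" means the common inverse-variance form (weight = 1/SE²); the abstract does not spell out the formula, and the cohort values below are invented for illustration.

```python
def sample_size_weighted(estimates, sizes):
    """Pooled estimate with each cohort weighted by its number of subjects."""
    return sum(h * n for h, n in zip(estimates, sizes)) / sum(sizes)


def standard_error_weighted(estimates, std_errors):
    """Pooled estimate and its standard error using inverse-variance weights (assumed form)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * h for w, h in zip(weights, estimates)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


def leave_one_out(estimates, sizes):
    """Sample-size-weighted estimate recomputed with each cohort removed in turn."""
    return [
        sample_size_weighted(estimates[:i] + estimates[i + 1:], sizes[:i] + sizes[i + 1:])
        for i in range(len(estimates))
    ]


# Hypothetical heritability estimates from three cohorts.
h2 = [0.5, 0.6, 0.7]
n_subjects = [100, 200, 100]
se = [0.1, 0.1, 0.1]

pooled_n = sample_size_weighted(h2, n_subjects)
pooled_se_est, pooled_se = standard_error_weighted(h2, se)
loo = leave_one_out(h2, n_subjects)
print(pooled_n, pooled_se_est, pooled_se, loo)
```

A small spread in the leave-one-out values suggests that no single cohort dominates the pooled estimate.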

Relevance: 10.00%

Abstract:

Classifying each stage of a progressive disease such as Alzheimer’s disease is a key issue for disease prevention and treatment. In this study, we derived structural brain networks from diffusion-weighted MRI using whole-brain tractography, since there is growing interest in relating connectivity measures to clinical, cognitive, and genetic data. Relatively little work has used machine learning to make inferences about variations in brain networks during the progression of Alzheimer’s disease. Here we developed a framework that uses generalized low rank approximations of matrices (GLRAM) and modified linear discriminant analysis for unsupervised feature learning and classification of connectivity matrices. We apply the methods to brain networks derived from DWI scans of 41 people with Alzheimer’s disease, 73 people with EMCI, 38 people with LMCI, 47 elderly healthy controls and 221 young healthy controls. Our results show that this new framework can significantly improve classification accuracy when combining multiple datasets; this suggests the value of using data beyond the classification task at hand to model variations in brain connectivity.

Relevance: 10.00%

Abstract:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data from the same database for training their models. In this paper, we present a new approach that makes use of one modality of an external dataset in addition to a given audio-visual dataset. This makes it possible to create more powerful models from other extensive audio-only databases and adapt them to our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and the internal audio models by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.

Relevance: 10.00%

Abstract:

This paper presents a visual SLAM method for navigation during temporary satellite dropout, here applied to fixed-wing aircraft. It is designed for flight altitudes beyond typical stereo ranges, but within the range of distance measurement sensors. The proposed visual SLAM method consists of a common localization step with monocular camera resectioning, and a mapping step which incorporates radar altimeter data for absolute scale estimation. As a result, there is no scale drift of the map or the estimated flight path. The method does not require simplifications such as known landmarks, and it is thus suitable for unknown and nearly arbitrary terrain. The method is tested with sensor datasets from a manned Cessna 172 aircraft. With 5% absolute scale error from radar measurements causing approximately 2-6% accumulation error over the flown distance, stable positioning is achieved over several minutes of flight time. The main limitations are flight altitudes above the radar range of 750 m, where the monocular method will suffer from scale drift, and, depending on the flight speed, flights below 50 m, where image processing becomes difficult with a downwards-looking camera due to the high optical flow rates and the low image overlap.

Relevance: 10.00%

Abstract:

Montserrat now provides one of the most complete datasets for understanding the character and tempo of hazardous events at volcanic islands. Much of the erupted material ends up offshore, and this offshore record may be easier to date due to intervening hemipelagic sediments between event beds. The offshore dataset includes the first scientific drilling of volcanic island landslides during IODP Expedition 340, together with an unusually comprehensive set of shallow sediment cores and 2-D and 3-D seismic surveys. Most recently in 2013, Remotely Operated Vehicle (ROV) dives mapped and sampled the surface of the main landslide deposits. This contribution aims to provide an overview of key insights from ongoing work on IODP Expedition 340 Sites offshore Montserrat. Key objectives are to understand the composition (and hence source) and emplacement mechanism (and hence tsunami generation) of major landslides, together with their frequency and timing relative to volcanic eruption cycles. The most recent major collapse event is Deposit 1, which involved ~1.8 km³ of material and produced a blocky deposit at ~12-14 ka. Deposit 1 appears to have involved not only the volcanic edifice, but also a substantial component of a fringing bioclastic shelf, and material locally incorporated from the underlying seafloor. This information allows us to test how first-order landslide morphology (e.g. blocky or elongate lobes) is related to first-order landslide composition. Preliminary analysis suggests that Deposit 1 occurred shortly before a second major landslide on the SW of the island (Deposit 5). It may have initiated English's Crater, but was not associated with a major change in magma composition. An associated turbidite stack suggests it was emplaced in multiple stages, separated by at least a few hours, thus reducing the tsunami magnitude. The ROV dives show that mega-blocks in detail comprise smaller-scale breccias, which can travel significant distances without complete disintegration. Landslide Deposit 2 was emplaced at ~130 ka, and is more voluminous (~8.4 km³). It had a much more profound influence on the magmatic system, as it was linked to a major explosive mafic eruption and formation of a new volcanic centre (South Soufriere Hills) on the island. Site U1395 confirms a hypothesis based on the site survey seismic data that Deposit 2 includes a substantial component of pre-existing seafloor sediment. However, surprisingly, this pre-existing seafloor sediment in the lower part of Deposit 2 at Site U1395 is completely undeformed and flat lying, suggesting that Site U1395 penetrated a flat-lying block. Work on dating material from the upper parts of Sites U1396, U1395 and U1394 will also be summarised. This work is establishing a chronostratigraphy of major events over the last 1 Ma, with particularly detailed constraints during the last ~250 ka. This is helping us to understand whether major landslides are related to cycles of volcanic eruptions.

Relevance: 10.00%

Abstract:

In life cycle assessment studies, greenhouse gas (GHG) emissions from direct land-use change have been estimated to make a significant contribution to the global warming potential of agricultural products. However, these estimates have a high uncertainty due to the complexity of data requirements and difficulty in attribution of land-use change. This paper presents estimates of GHG emissions from direct land-use change from native woodland to grazing land for two beef production regions in eastern Australia, which were the subject of a multi-impact life cycle assessment study for premium beef production. Spatially and temporally consistent datasets were derived for areas of forest cover and biomass carbon stocks using published remotely sensed tree-cover data and regionally applicable allometric equations consistent with Australia's national GHG inventory report. Standard life cycle assessment methodology was used to estimate GHG emissions and removals from direct land-use change attributed to beef production. For the northern-central New South Wales region of Australia, estimates ranged from a net emission of 0.03 t CO2-e ha⁻¹ year⁻¹ to a net removal of 0.12 t CO2-e ha⁻¹ year⁻¹ using low and high scenarios, respectively, for sequestration in regrowing forests. For the same period (1990-2010), the study region in southern-central Queensland was estimated to have net emissions from land-use change in the range of 0.45-0.25 t CO2-e ha⁻¹ year⁻¹. The difference between regions reflects continuation of higher rates of deforestation in Queensland until strict regulation in 2006, whereas native vegetation protection laws were introduced earlier in New South Wales. On the basis of liveweight produced at the farm-gate, emissions from direct land-use change for 1990-2010 were comparable in magnitude to those from other on-farm sources, which were dominated by enteric methane. However, calculation of land-use change impacts for the Queensland region for a period starting in 2006 gave a range from net emissions of 0.11 t CO2-e ha⁻¹ year⁻¹ to net removals of 0.07 t CO2-e ha⁻¹ year⁻¹. This study demonstrated a method for deriving spatially and temporally consistent datasets to improve estimates for direct land-use change impacts in life cycle assessment. It identified areas of uncertainty, including rates of sequestration in woody regrowth and impacts of land-use change on soil carbon stocks in grazed woodlands, but also showed the potential for direct land-use change to represent a net sink for GHG.

Relevance: 10.00%

Abstract:

Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we treat ABOM as a sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of the feature function and the optimisation through multiple experiments.

Relevance: 10.00%

Abstract:

Background: Strand-specific RNAseq data is now more common in RNAseq projects. Visualizing RNAseq data has become an important matter in the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enabled researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for visualization of RNAseq datasets. Results: This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks with other BED and wiggle tracks, all being dynamically built from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand-specific RNAseq data is also supported by RNASeqBrowser, which displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use for users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. Conclusions: The features of RNASeqBrowser: (1) RNASeqBrowser integrates the UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure. It is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulates a lot of raw reads. Thus, RNASeqBrowser collapses exact duplicate reads to reduce visualization space. Normal PCs can show many windows of NGS individual raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids the approach of existing tools (such as IGV), which squeeze all raw reads into one window. This will be helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/
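The duplicate-collapsing step in point (4) can be pictured with a short sketch. It is not RNASeqBrowser's code: the read representation (a tuple of chromosome, start, strand and sequence) and the definition of "exact duplicate" as equality of all four fields are assumptions made for illustration.

```python
from collections import Counter


def collapse_duplicates(reads):
    """Collapse identical reads into (read, count) pairs, keeping first-seen order."""
    counts = Counter(reads)  # Counter preserves insertion order of first occurrences
    return list(counts.items())


# Hypothetical reads: (chromosome, start, strand, sequence)
reads = [
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "-", "ACGT"),
    ("chr1", 105, "+", "GGCA"),
    ("chr1", 100, "+", "ACGT"),
]

collapsed = collapse_duplicates(reads)
for read, count in collapsed:
    print(read, count)
```

Drawing one glyph per collapsed entry, annotated with its count, is one way such a step would reduce the space needed for a pile of identical reads.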

Relevance: 10.00%

Abstract:

Environmental acoustic recordings can be used to perform avian species richness surveys, whereby a trained ornithologist can observe the species present by listening to the recording. This could be made more efficient by using computational methods for iteratively selecting the richest parts of a long recording for the human observer to listen to, a process known as “smart sampling”. This allows scaling up to much larger ecological datasets. In this paper we explore computational approaches based on information and diversity of selected samples. We propose to use an event detection algorithm to estimate the amount of information present in each sample. We further propose to cluster the detected events for a better estimate of this amount of information. Additionally, we present a time dispersal approach to estimating diversity between iteratively selected samples. Combinations of approaches were evaluated on seven 24-hour recordings that have been manually labeled by bird watchers. The results show that on average all the methods we have explored would allow annotators to observe more new species in fewer minutes compared to a baseline of random sampling at dawn.

Relevance: 10.00%

Abstract:

In the mining optimisation literature, most researchers have focused on two open-pit mine optimisation problems, one strategic-level and one tactical-level, termed the ultimate pit limit (UPIT) and constrained pit limit (CPIT) problems, respectively. However, many researchers indicate that the substantial numbers of variables and constraints in real-world instances (e.g., with 50,000 to 1,000,000 blocks) make the CPIT’s mixed integer programming (MIP) model intractable. It is therefore a considerable challenge to solve large-scale CPIT instances without relying on an exact MIP optimiser or on complicated MIP relaxation/decomposition methods. To address this challenge, two new graph-based algorithms, based on network flow graph and conjunctive graph theory, are developed by taking advantage of problem properties. The performance of the proposed algorithms is validated by testing on recent large-scale benchmark UPIT and CPIT instance datasets from MineLib (2013). In comparison with the best known results from MineLib, the proposed algorithms are shown to outperform other CPIT solution approaches in the literature. The proposed graph-based algorithms lead to a more competent mine scheduling optimisation expert system because a third-party MIP optimiser is no longer indispensable and random neighbourhood search is not necessary.

Relevance: 10.00%

Abstract:

User-generated information such as online reviews has become increasingly significant for customers in the decision-making process. Meanwhile, as the volume of online reviews proliferates, there is a pressing demand to help users tackle the information overload problem. To extract useful information from an overwhelming number of reviews, considerable work has been proposed, such as review summarization and review selection. In particular, to avoid redundant information, researchers attempt to select a small set of reviews to represent the entire review corpus by preserving its statistical properties (e.g., opinion distribution). However, one significant drawback of existing work is that it only measures the utility of the extracted reviews as a whole without considering the quality of each individual review. As a result, the set of chosen reviews may include low-quality ones even if its statistical properties are close to those of the original review corpus, which is not preferred by users. In this paper, we propose a review selection method which takes review quality into consideration during the selection process. Specifically, we examine the relationships between product features based upon a domain ontology to capture the review characteristics, and on this basis select reviews that are of good quality and also preserve the opinion distribution. Our experimental results on real-world review datasets demonstrate that the proposed approach is feasible and effectively improves the performance of review selection.

Relevance: 10.00%

Abstract:

Modularity has been suggested to be connected to evolvability because a higher degree of independence among parts allows them to evolve as separate units. Recently, the Escoufier RV coefficient has been proposed as a measure of the degree of integration between modules in multivariate morphometric datasets. However, it has been shown, using randomly simulated datasets, that the value of the RV coefficient depends on sample size. Also, so far there is no statistical test for the difference in the RV coefficient between a priori defined groups of observations. Here, we (1), using a rarefaction analysis, show that the value of the RV coefficient depends on sample size also in real geometric morphometric datasets; (2) propose a permutation procedure to test for the difference in the RV coefficient between a priori defined groups of observations; (3) show, through simulations, that such a permutation procedure has an appropriate Type I error; (4) suggest that a rarefaction procedure could be used to obtain sample-size-corrected values of the RV coefficient; and (5) propose a nearest-neighbor procedure that could be used when studying the variation of modularity in geographic space. The approaches outlined here, readily extendable to non-morphometric datasets, allow study of the variation in the degree of integration between a priori defined modules. A Java application – that will allow performance of the proposed test using a software with graphical user interface – has also been developed and is available at the Morphometrics at Stony Brook Web page (http://life.bio.sunysb.edu/morph/).
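A minimal sketch of the RV coefficient and of a label-shuffling permutation test for its difference between two groups is given below. The RV formula used is the standard one, tr(S12 S21) / sqrt(tr(S11²) tr(S22²)); the permutation scheme (pooling specimens and reshuffling group membership) is an assumption, since the abstract does not describe the procedure in detail, and all names and data are hypothetical.

```python
import random


def _center(rows):
    """Subtract column means from a list of observation rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def _sum_sq_cov(a, b):
    """Sum of squared covariances between every column of a and every column of b."""
    n = len(a)
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            cov = sum(p * q for p, q in zip(col_a, col_b)) / (n - 1)
            total += cov * cov
    return total


def rv_coefficient(block1, block2):
    """Escoufier RV coefficient between two blocks of variables (rows = specimens)."""
    x, y = _center(block1), _center(block2)
    return _sum_sq_cov(x, y) / (_sum_sq_cov(x, x) * _sum_sq_cov(y, y)) ** 0.5


def rv_difference_test(b1_a, b2_a, b1_b, b2_b, n_perm=999, seed=0):
    """Permutation p-value for |RV(group A) - RV(group B)| under shuffled group labels."""
    rng = random.Random(seed)
    observed = abs(rv_coefficient(b1_a, b2_a) - rv_coefficient(b1_b, b2_b))
    pooled = list(zip(b1_a + b1_b, b2_a + b2_b))
    n_a = len(b1_a)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(pooled)
        grp_a, grp_b = pooled[:n_a], pooled[n_a:]
        rv_a = rv_coefficient([r[0] for r in grp_a], [r[1] for r in grp_a])
        rv_b = rv_coefficient([r[0] for r in grp_b], [r[1] for r in grp_b])
        if abs(rv_a - rv_b) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical data: module 1 has two variables, module 2 has one; five specimens.
module1 = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
module2 = [[2.0], [1.0], [4.0], [3.0], [6.0]]
rv = rv_coefficient(module1, module2)
print(rv)
```

Because the RV coefficient depends on sample size, as the abstract stresses, a test of this kind is easiest to interpret when the two groups have similar numbers of specimens.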

Relevance: 10.00%

Abstract:

Osteoporotic fracture is a major cause of morbidity and mortality worldwide. Low bone mineral density (BMD) is a major predisposing factor to fracture and is known to be highly heritable. Site-, gender-, and age-specific genetic effects on BMD are thought to be significant, but have largely not been considered in the design of genome-wide association studies (GWAS) of BMD to date. We report here a GWAS using a novel study design focusing on women of a specific age (postmenopausal women, age 55-85 years), with either extreme high or low hip BMD (age- and gender-adjusted BMD z-scores of +1.5 to +4.0, n = 1055, or -4.0 to -1.5, n = 900), with replication in cohorts of women drawn from the general population (n = 20,898). The study replicates 21 of 26 known BMD-associated genes. Additionally, we report six new suggestive genetic associations in or around the genes CLCN7, GALNT3, IBSP, LTBP3, RSPO3, and SOX4, with replication in two independent datasets. A novel mouse model with a loss-of-function mutation in GALNT3 is also reported, which has high bone mass, supporting the involvement of this gene in BMD determination. In addition to identifying further genes associated with BMD, this study confirms the efficiency of extreme-truncate selection designs for quantitative trait association studies. © 2011 Duncan et al.
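The extreme-truncate selection described above amounts to keeping only subjects whose adjusted z-score falls in one of two tails while discarding implausibly extreme values. A minimal sketch follows, using the thresholds quoted in the abstract; the function name and the example z-scores are hypothetical.

```python
def extreme_truncate(z_scores, lower=1.5, upper=4.0):
    """Return (high_group, low_group) index lists for z-scores in the two selected tails."""
    high_group = [i for i, z in enumerate(z_scores) if lower <= z <= upper]
    low_group = [i for i, z in enumerate(z_scores) if -upper <= z <= -lower]
    return high_group, low_group


# Hypothetical age- and gender-adjusted hip BMD z-scores.
z = [0.2, 1.7, -2.3, 4.5, -1.5, 3.9, -4.2, 0.0]
high, low = extreme_truncate(z)
print(high, low)
```

Subjects outside both ranges, including those beyond ±4.0, are simply left out of the discovery sample.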