970 results for Datasets


Relevance: 10.00%

Abstract:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. In doing so, it is possible to create more powerful models from other extensive audio-only databases and adapt them to our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and the internal audio models by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.

Relevance: 10.00%

Abstract:

This paper presents a visual SLAM method for navigation during temporary satellite dropout, here applied to fixed-wing aircraft. It is designed for flight altitudes beyond typical stereo ranges, but within the range of distance measurement sensors. The proposed visual SLAM method consists of a common localization step with monocular camera resectioning, and a mapping step which incorporates radar altimeter data for absolute scale estimation. As a result, there is no scale drift of the map or of the estimated flight path. The method does not require simplifications like known landmarks and is thus suitable for unknown and nearly arbitrary terrain. The method is tested with sensor datasets from a manned Cessna 172 aircraft. With 5% absolute scale error from radar measurements causing approximately 2-6% accumulation error over the flown distance, stable positioning is achieved over several minutes of flight time. The main limitations are flight altitudes above the radar range of 750 m, where the monocular method will suffer from scale drift, and, depending on the flight speed, flights below 50 m, where image processing gets difficult with a downwards-looking camera due to the high optical flow rates and the low image overlap.

Relevance: 10.00%

Abstract:

Montserrat now provides one of the most complete datasets for understanding the character and tempo of hazardous events at volcanic islands. Much of the erupted material ends up offshore, and this offshore record may be easier to date due to intervening hemipelagic sediments between event beds. The offshore dataset includes the first scientific drilling of volcanic island landslides during IODP Expedition 340, together with an unusually comprehensive set of shallow sediment cores and 2-D and 3-D seismic surveys. Most recently, in 2013, Remotely Operated Vehicle (ROV) dives mapped and sampled the surface of the main landslide deposits. This contribution aims to provide an overview of key insights from ongoing work on IODP Expedition 340 Sites offshore Montserrat. Key objectives are to understand the composition (and hence source) and emplacement mechanism (and hence tsunami generation) of major landslides, together with their frequency and timing relative to volcanic eruption cycles. The most recent major collapse event is Deposit 1, which involved ~1.8 km³ of material and produced a blocky deposit at ~12-14 ka. Deposit 1 appears to have involved not only the volcanic edifice, but also a substantial component of a fringing bioclastic shelf, and material locally incorporated from the underlying seafloor. This information allows us to test how first-order landslide morphology (e.g. blocky or elongate lobes) is related to first-order landslide composition. Preliminary analysis suggests that Deposit 1 occurred shortly before a second major landslide on the SW of the island (Deposit 5). It may have initiated English's Crater, but was not associated with a major change in magma composition. An associated turbidite stack suggests it was emplaced in multiple stages, separated by at least a few hours, thus reducing the tsunami magnitude.
The ROV dives show that mega-blocks in detail comprise smaller-scale breccias, which can travel significant distances without complete disintegration. Landslide Deposit 2 was emplaced at ~130 ka, and is more voluminous (~8.4 km³). It had a much more profound influence on the magmatic system, as it was linked to a major explosive mafic eruption and formation of a new volcanic centre (South Soufriere Hills) on the island. Site U1395 confirms a hypothesis based on the site survey seismic data that Deposit 2 includes a substantial component of pre-existing seafloor sediment. However, surprisingly, this pre-existing seafloor sediment in the lower part of Deposit 2 at Site U1395 is completely undeformed and flat lying, suggesting that Site U1395 penetrated a flat-lying block. Work to date material from the upper part of U1396, U1395 and U1394 will also be summarised. This work is establishing a chronostratigraphy of major events over the last 1 Ma, with particularly detailed constraints during the last ~250 ka. This is helping us to understand whether major landslides are related to cycles of volcanic eruptions.

Relevance: 10.00%

Abstract:

In life cycle assessment studies, greenhouse gas (GHG) emissions from direct land-use change have been estimated to make a significant contribution to the global warming potential of agricultural products. However, these estimates have a high uncertainty due to the complexity of data requirements and difficulty in attribution of land-use change. This paper presents estimates of GHG emissions from direct land-use change from native woodland to grazing land for two beef production regions in eastern Australia, which were the subject of a multi-impact life cycle assessment study for premium beef production. Spatially and temporally consistent datasets were derived for areas of forest cover and biomass carbon stocks using published remotely sensed tree-cover data and regionally applicable allometric equations consistent with Australia's national GHG inventory report. Standard life cycle assessment methodology was used to estimate GHG emissions and removals from direct land-use change attributed to beef production. For the northern-central New South Wales region of Australia, estimates ranged from a net emission of 0.03 t CO2-e ha⁻¹ year⁻¹ to a net removal of 0.12 t CO2-e ha⁻¹ year⁻¹ using low and high scenarios, respectively, for sequestration in regrowing forests. For the same period (1990-2010), the study region in southern-central Queensland was estimated to have net emissions from land-use change in the range of 0.45-0.25 t CO2-e ha⁻¹ year⁻¹. The difference between regions reflects continuation of higher rates of deforestation in Queensland until strict regulation in 2006, whereas native vegetation protection laws were introduced earlier in New South Wales. On the basis of liveweight produced at the farm gate, emissions from direct land-use change for 1990-2010 were comparable in magnitude to those from other on-farm sources, which were dominated by enteric methane.
However, calculation of land-use change impacts for the Queensland region for a period starting in 2006 gave a range from net emissions of 0.11 t CO2-e ha⁻¹ year⁻¹ to net removals of 0.07 t CO2-e ha⁻¹ year⁻¹. This study demonstrated a method for deriving spatially and temporally consistent datasets to improve estimates for direct land-use change impacts in life cycle assessment. It identified areas of uncertainty, including rates of sequestration in woody regrowth and impacts of land-use change on soil carbon stocks in grazed woodlands, but also showed the potential for direct land-use change to represent a net sink for GHG.

Relevance: 10.00%

Abstract:

Product reviews are the foremost source of information for customers and manufacturers to help them make appropriate purchasing and production decisions. Natural language data is typically very sparse; the most common words are those that do not carry a lot of semantic content, and occurrences of any particular content-bearing word are rare, while co-occurrences of these words are rarer. Mining product aspects, along with corresponding opinions, is essential for Aspect-Based Opinion Mining (ABOM) as a result of the e-commerce revolution. Therefore, the need for automatic mining of reviews has reached a peak. In this work, we treat ABOM as a sequence labelling problem and propose a supervised extraction method to identify product aspects and corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of the feature function and the optimisation through multiple experiments.

Relevance: 10.00%

Abstract:

Background: Strand-specific RNAseq data is now more common in RNAseq projects. Visualizing RNAseq data has become an important matter in the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enabled researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for visualization of RNAseq datasets. Results: This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks with other BED and wiggle tracks, all being dynamically built from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand-specific RNAseq data is also supported by RNASeqBrowser, which displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use for users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. Conclusions: The features of RNASeqBrowser: (1) RNASeqBrowser integrates the UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure. It is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulates a lot of raw reads. Thus, RNASeqBrowser collapses exact duplicate reads to reduce visualization space.
Normal PCs can show many windows of NGS individual raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids the approach of existing tools (such as IGV), which squeeze all raw reads into one window. This will be helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/

Relevance: 10.00%

Abstract:

Environmental acoustic recordings can be used to perform avian species richness surveys, whereby a trained ornithologist can observe the species present by listening to the recording. This could be made more efficient by using computational methods for iteratively selecting the richest parts of a long recording for the human observer to listen to, a process known as “smart sampling”. This allows scaling up to much larger ecological datasets. In this paper we explore computational approaches based on information and diversity of selected samples. We propose to use an event detection algorithm to estimate the amount of information present in each sample. We further propose to cluster the detected events for a better estimate of this amount of information. Additionally, we present a time dispersal approach to estimating diversity between iteratively selected samples. Combinations of approaches were evaluated on seven 24-hour recordings that have been manually labeled by bird watchers. The results show that on average all the methods we have explored would allow annotators to observe more new species in fewer minutes compared to a baseline of random sampling at dawn.
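The iterative selection idea can be illustrated with a minimal sketch. It assumes a hypothetical list of per-minute event counts as the information estimate and a fixed minimum time gap as a crude stand-in for the time dispersal estimate; neither the function name nor the parameters come from the paper, and the paper's actual event detection and clustering steps are not reproduced here.

```python
from typing import List


def select_samples(event_counts: List[int], k: int, min_gap: int) -> List[int]:
    """Greedily pick up to k minute indices with the most detected events.

    A candidate minute is skipped if it lies closer than `min_gap` minutes
    to any minute already selected (a simple time-dispersal constraint).
    """
    selected: List[int] = []
    # Rank minutes by event count, highest first; ties keep the earlier minute.
    ranked = sorted(range(len(event_counts)), key=lambda i: (-event_counts[i], i))
    for minute in ranked:
        if len(selected) == k:
            break
        if all(abs(minute - chosen) >= min_gap for chosen in selected):
            selected.append(minute)
    return selected


# Hypothetical event counts for six consecutive one-minute samples.
counts = [5, 9, 8, 1, 7, 2]
print(select_samples(counts, k=2, min_gap=2))
```

With these made-up counts the busiest minute (index 1) is chosen first, its immediate neighbours are excluded by the gap constraint, and the next pick is index 4.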

Relevance: 10.00%

Abstract:

In the mining optimisation literature, most researchers have focused on two open-pit mine optimisation problems, one strategic-level and one tactical-level, which are respectively termed ultimate pit limit (UPIT) and constrained pit limit (CPIT). However, many researchers indicate that the substantial numbers of variables and constraints in real-world instances (e.g., with 50-1000 thousand blocks) make the CPIT's mixed integer programming (MIP) model intractable. Thus, it becomes a considerable challenge to solve large-scale CPIT instances without relying on an exact MIP optimiser or on complicated MIP relaxation/decomposition methods. To meet this challenge, two new graph-based algorithms based on network flow graph and conjunctive graph theory are developed by taking advantage of problem properties. The performance of our proposed algorithms is validated by testing them on the recent large-scale benchmark UPIT and CPIT instance datasets published by MineLib in 2013. In comparison to the best known results from MineLib, it is shown that the proposed algorithms outperform other CPIT solution approaches existing in the literature. The proposed graph-based algorithms lead to a more competent mine scheduling optimisation expert system because a third-party MIP optimiser is no longer indispensable and random neighbourhood search is not necessary.

Relevance: 10.00%

Abstract:

User-generated information such as online reviews has become increasingly significant for customers in the decision-making process. Meanwhile, as the volume of online reviews proliferates, there is an insistent demand to help users tackle the information overload problem. In order to extract useful information from overwhelming numbers of reviews, considerable work has been proposed, such as review summarization and review selection. In particular, to avoid redundant information, researchers attempt to select a small set of reviews to represent the entire review corpus by preserving its statistical properties (e.g., opinion distribution). However, one significant drawback of existing work is that it only measures the utility of the extracted reviews as a whole, without considering the quality of each individual review. As a result, the set of chosen reviews may include low-quality ones even if its statistical properties are close to those of the original review corpus, which is not preferred by users. In this paper, we propose a review selection method which takes review quality into consideration during the selection process. Specifically, we use a domain ontology to examine the relationships between product features and capture review characteristics, and on that basis select reviews that are of good quality and also preserve the opinion distribution. Our experimental results based on real-world review datasets demonstrate that our proposed approach is feasible and able to improve the performance of review selection effectively.

Relevance: 10.00%

Abstract:

Modularity has been suggested to be connected to evolvability because a higher degree of independence among parts allows them to evolve as separate units. Recently, the Escoufier RV coefficient has been proposed as a measure of the degree of integration between modules in multivariate morphometric datasets. However, it has been shown, using randomly simulated datasets, that the value of the RV coefficient depends on sample size. Also, so far there is no statistical test for the difference in the RV coefficient between a priori defined groups of observations. Here, we (1), using a rarefaction analysis, show that the value of the RV coefficient depends on sample size also in real geometric morphometric datasets; (2) propose a permutation procedure to test for the difference in the RV coefficient between a priori defined groups of observations; (3) show, through simulations, that such a permutation procedure has an appropriate Type I error; (4) suggest that a rarefaction procedure could be used to obtain sample-size-corrected values of the RV coefficient; and (5) propose a nearest-neighbor procedure that could be used when studying the variation of modularity in geographic space. The approaches outlined here, readily extendable to non-morphometric datasets, allow study of the variation in the degree of integration between a priori defined modules. A Java application – that will allow performance of the proposed test using a software with graphical user interface – has also been developed and is available at the Morphometrics at Stony Brook Web page (http://life.bio.sunysb.edu/morph/).
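As an illustration of items (2) and (3), the sketch below computes the Escoufier RV coefficient between two blocks of variables and tests the between-group difference by shuffling group labels. The label-shuffling scheme, the function names, and the default of 999 permutations are assumptions made for this example; the abstract does not describe the authors' exact procedure, which may differ.

```python
import numpy as np


def rv_coefficient(x: np.ndarray, y: np.ndarray) -> float:
    """Escoufier RV coefficient between two blocks of variables (rows = specimens)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sxy = x.T @ y  # cross-covariance block (the 1/(n-1) factors cancel in the ratio)
    sxx = x.T @ x
    syy = y.T @ y
    return float(np.trace(sxy @ sxy.T) / np.sqrt(np.trace(sxx @ sxx) * np.trace(syy @ syy)))


def rv_difference_test(xa, ya, xb, yb, n_perm=999, seed=0):
    """Permutation test for |RV(group A) - RV(group B)| by shuffling group labels."""
    rng = np.random.default_rng(seed)
    observed = abs(rv_coefficient(xa, ya) - rv_coefficient(xb, yb))
    x = np.vstack([xa, xb])
    y = np.vstack([ya, yb])
    n_a = len(xa)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(x))
        a, b = idx[:n_a], idx[n_a:]
        diff = abs(rv_coefficient(x[a], y[a]) - rv_coefficient(x[b], y[b]))
        exceed += diff >= observed
    # Count the observed statistic as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical data: two groups of 30 specimens, modules with 4 and 3 variables.
rng = np.random.default_rng(1)
xa, ya = rng.normal(size=(30, 4)), rng.normal(size=(30, 3))
xb, yb = rng.normal(size=(30, 4)), rng.normal(size=(30, 3))
print(rv_difference_test(xa, ya, xb, yb, n_perm=199))
```

Because the RV coefficient is sensitive to sample size, as the abstract notes, a comparison like this is easiest to interpret when the groups are of similar size or have been rarefied to a common size first.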

Relevance: 10.00%

Abstract:

Osteoporotic fracture is a major cause of morbidity and mortality worldwide. Low bone mineral density (BMD) is a major predisposing factor to fracture and is known to be highly heritable. Site-, gender-, and age-specific genetic effects on BMD are thought to be significant, but have largely not been considered in the design of genome-wide association studies (GWAS) of BMD to date. We report here a GWAS using a novel study design focusing on women of a specific age (postmenopausal women, age 55-85 years), with either extreme high or low hip BMD (age- and gender-adjusted BMD z-scores of +1.5 to +4.0, n = 1055, or -4.0 to -1.5, n = 900), with replication in cohorts of women drawn from the general population (n = 20,898). The study replicates 21 of 26 known BMD-associated genes. Additionally, we report suggestive evidence for six further new genetic associations in or around the genes CLCN7, GALNT3, IBSP, LTBP3, RSPO3, and SOX4, with replication in two independent datasets. A novel mouse model with a loss-of-function mutation in GALNT3 is also reported, which has high bone mass, supporting the involvement of this gene in BMD determination. In addition to identifying further genes associated with BMD, this study confirms the efficiency of extreme-truncate selection designs for quantitative trait association studies. © 2011 Duncan et al.

Relevance: 10.00%

Abstract:

Genome-wide association studies (GWAS) have identified around 60 common variants associated with multiple sclerosis (MS), but these loci only explain a fraction of the heritability of MS. Some missing heritability may be caused by rare variants that have been suggested to play an important role in the aetiology of complex diseases such as MS. However, current genetic and statistical methods for detecting rare variants are expensive and time consuming. 'Population-based linkage analysis' (PBLA), or so-called identity-by-descent (IBD) mapping, is a novel way to detect rare variants in extant GWAS datasets. We employed BEAGLE fastIBD to search for rare MS variants utilising IBD mapping in a large GWAS dataset of 3,543 cases and 5,898 controls. We identified a genome-wide significant linkage signal on chromosome 19 (LOD = 4.65; p = 1.9×10⁻⁶). Network analysis of cases and controls sharing haplotypes on chromosome 19 further strengthened the association, as there are more large networks of cases sharing haplotypes than of controls. This linkage region includes a cluster of zinc finger genes of unknown function. Analysis of genome-wide transcriptome data suggests that genes in this zinc finger cluster may be involved in very early developmental regulation of the CNS. Our study also indicates that BEAGLE fastIBD allowed identification of rare variants in a large unrelated population with moderate computational intensity. Even with the development of whole-genome sequencing, IBD mapping may still be a promising way to narrow down the region of interest for sequencing priority. © 2013 Lin et al.
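The relationship between the reported LOD score and p-value can be checked with the standard asymptotic conversion, in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom and tested one-sided. Whether the authors obtained their p-value this way is an assumption; the sketch only shows that the two reported numbers are roughly consistent under that convention, and the function name is hypothetical.

```python
import math


def lod_to_pvalue(lod: float) -> float:
    """One-sided asymptotic p-value for a LOD score (chi-square, 1 df)."""
    chi2 = 2.0 * math.log(10.0) * lod  # likelihood-ratio statistic
    z = math.sqrt(chi2)                # equivalent standard normal deviate
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability


print(f"{lod_to_pvalue(4.65):.2e}")
```

For LOD = 4.65 this conversion gives a value close to the 1.9×10⁻⁶ quoted in the abstract.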

Relevance: 10.00%

Abstract:

This project was a step forward in applying statistical methods and models to provide new insights for more informed decision-making at large spatial scales. The model has been designed to address complicated effects of ecological processes that govern the state of populations and uncertainties inherent in large spatio-temporal datasets. Specifically, the thesis contributes to better understanding and management of the Great Barrier Reef.

Relevance: 10.00%

Abstract:

Antigen selection of B cells within the germinal center reaction generally leads to the accumulation of replacement mutations in the complementarity-determining regions (CDRs) of immunoglobulin genes. Studies of mutations in IgE-associated VDJ gene sequences have cast doubt on the role of antigen selection in the evolution of the human IgE response, and it may be that selection for high affinity antibodies is a feature of some but not all allergic diseases. The severity of IgE-mediated anaphylaxis is such that it could result from higher affinity IgE antibodies. We therefore investigated IGHV mutations in IgE-associated sequences derived from ten individuals with a history of anaphylactic reactions to bee or wasp venom or peanut allergens. IgG sequences, which more certainly experience antigen selection, served as a control dataset. A total of 6025 unique IgE and 5396 unique IgG sequences were generated using high throughput 454 pyrosequencing. The proportion of replacement mutations seen in the CDRs of the IgG dataset was significantly higher than that of the IgE dataset, and the IgE sequences showed little evidence of antigen selection. To exclude the possibility that 454 errors had compromised analysis, rigorous filtering of the datasets led to datasets of 90 core IgE sequences and 411 IgG sequences. These sequences were present as both forward and reverse reads, and so were most unlikely to include sequencing errors. The filtered datasets confirmed that antigen selection plays a greater role in the evolution of IgG sequences than of IgE sequences derived from the study participants.

Relevance: 10.00%

Abstract:

Objective: The Nintendo Wii Fit integrates virtual gaming with body movement, and may be suitable as an adjunct to conventional physiotherapy following lower limb fractures. This study examined the feasibility and safety of using the Wii Fit as an adjunct to outpatient physiotherapy following lower limb fractures, and reports sample size considerations for an appropriately powered randomised trial. Methodology: Ambulatory patients receiving physiotherapy following a lower limb fracture participated in this study (n = 18). All participants received usual care (individual physiotherapy). The first nine participants also used the Wii Fit under the supervision of their treating clinician as an adjunct to usual care. Adverse events, fracture malunion or exacerbation of symptoms were recorded. Pain, balance and patient-reported function were assessed at baseline and discharge from physiotherapy. Results: No adverse events were attributed to either the usual care physiotherapy or Wii Fit intervention for any patient. Overall, 15 (83%) participants completed both assessments and interventions as scheduled. For 80% power in a clinical trial, the number of complete datasets required in each group to detect a small, medium or large effect of the Wii Fit at a post-intervention assessment was calculated at 175, 63 and 25, respectively. Conclusions: The Nintendo Wii Fit was safe and feasible as an adjunct to ambulatory physiotherapy in this sample. When considering a likely small effect size and the 17% dropout rate observed in this study, 211 participants would be required in each clinical trial group. A larger effect size or multiple repeated measures design would require fewer participants.
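The sample size figures can be reproduced with a short calculation. The sketch below uses the normal-approximation formula for comparing two group means, n = 2(z₁₋α/₂ + z_power)² / d², and then inflates for dropout. The standardized effect sizes of 0.3, 0.5 and 0.8 and the two-sided α of 0.05 are assumptions chosen because they reproduce the reported numbers; the abstract does not state them, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Complete datasets needed per group (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def inflate_for_dropout(n_complete: int, dropout_rate: float) -> int:
    """Participants to recruit so that n_complete remain after dropout."""
    return math.ceil(n_complete / (1 - dropout_rate))


# Assumed effect sizes for "small", "medium" and "large" (not stated in the abstract).
for label, d in [("small", 0.3), ("medium", 0.5), ("large", 0.8)]:
    print(label, n_per_group(d))          # 175, 63, 25

print(inflate_for_dropout(175, 0.17))     # 211
```

The last line shows where the 211 in the conclusions comes from: 175 complete datasets divided by the 83% completion rate, rounded up.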