340 results for datasets


Relevance: 10.00%

Abstract:

Marine sediments around volcanic islands contain an archive of volcaniclastic deposits, which can be used to reconstruct the volcanic history of an area. Such records hold many advantages over often-incomplete terrestrial datasets, including the potential for precise and continuous dating of intervening sediment packages, which allows a correlatable and temporally constrained stratigraphic framework to be constructed across multiple marine sediment cores. Here, we discuss a marine record of eruptive and mass-wasting events spanning ~250 ka offshore of Montserrat, using new data from IODP Expedition 340 as well as previously collected cores. Using a combination of high-resolution oxygen isotope stratigraphy, AMS radiocarbon dating, biostratigraphy of foraminifera and calcareous nannofossils, and clast componentry, we identify five major events at Soufriere Hills volcano since 250 ka. Lateral correlation of these events across sediment cores collected offshore to the south and southwest of Montserrat has improved our understanding of the timing, extent and associations between events in this area. Correlations reveal that powerful and potentially erosive density currents travelled at least 33 km offshore, and demonstrate that marine deposits produced by eruption-fed and mass-wasting events on volcanic islands are heterogeneous in their spatial distribution. Thus, multiple drilling/coring sites are needed to reconstruct the full chronostratigraphy of volcanic islands. This multidisciplinary study will be vital to interpreting the chaotic records of submarine landslides at other sites drilled during Expedition 340 and provides a framework that can be applied to the stratigraphic analysis of sediments surrounding other volcanic islands.

Relevance: 10.00%

Abstract:

Background: The sequencing, de novo assembly and annotation of transcriptome datasets generated with next-generation sequencing (NGS) have enabled biologists to answer genomic questions in non-model species with unprecedented ease. Reliable and accurate de novo assembly and annotation, however, are critically important steps for transcriptome assemblies generated from short read sequences, and typical benchmarks for assembly and annotation reliability have been performed with model species. To address the reliability and accuracy of de novo transcriptome assembly in non-model species, we generated an RNA-seq dataset for an intertidal gastropod mollusc, Nerita melanotragus, and compared the assemblies produced by four different de novo transcriptome assemblers (Velvet, Oases, Geneious and Trinity) on a number of quality metrics and on redundancy.

Results: Transcriptome sequencing on the Ion Torrent PGM™ produced 1,883,624 raw reads with a mean length of 133 base pairs (bp). The Trinity and Oases assemblers produced the best assemblies on all quality metrics, including fewer contigs, increased N50 and average contig length, and contigs of greater length. Overall, the BLAST and annotation success of our assemblies was not high, with only 15-19% of contigs assigned a putative function.

Conclusions: We believe that any improvement in annotation success for gastropod species will require more gastropod genome sequences, and in particular an increase in mollusc protein sequences in public databases. Overall, this paper demonstrates that reliable and accurate de novo transcriptome assemblies can be generated from short read sequencers with the right assembly algorithms.

Keywords: Nerita melanotragus; De novo assembly; Transcriptome; Heat shock protein; Ion Torrent
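As a concrete illustration of one of the quality metrics above: N50 is the length of the shortest contig in the smallest set of contigs that together cover at least half of the total assembly length. A minimal sketch (not taken from the paper's pipeline):

```python
def n50(contig_lengths):
    """N50: length of the shortest contig in the minimal set of longest
    contigs whose combined length covers half the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly

# total = 29, half = 14.5; the two longest contigs (10 + 8 = 18) reach it,
# so the N50 is the shorter of those two.
print(n50([10, 8, 5, 3, 2, 1]))  # 8
```

A higher N50 at a comparable total assembly length indicates a less fragmented assembly, which is why the paper pairs it with contig counts and mean contig length.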

Relevance: 10.00%

Abstract:

Background: Small RNA sequencing is commonly used to identify novel miRNAs and to determine their expression levels in plants. Several miRNA identification tools exist for animals, such as miRDeep, miRDeep2 and miRDeep*. miRDeep-P was developed to identify plant miRNAs using miRDeep's probabilistic model of miRNA biogenesis, but it depends on several third-party tools and lacks a user-friendly interface. The objective of our miRPlant program is to predict novel plant miRNAs while providing a user-friendly interface with improved prediction accuracy.

Results: We have developed a user-friendly plant miRNA prediction tool called miRPlant. Using 16 plant miRNA datasets from four different plant species, we show that miRPlant achieves at least a 10% improvement in accuracy over miRDeep-P, the most popular plant miRNA prediction tool. Furthermore, miRPlant uses a graphical user interface for data input and output, and identified miRNAs are shown with all RNA-seq reads in a hairpin diagram.

Conclusions: We have developed miRPlant, which extends miRDeep* to various plant species by adopting suitable strategies to identify hairpin excision regions and to filter hairpin structures for plants. miRPlant does not require any third-party tools, such as mapping or RNA secondary structure prediction tools. miRPlant is also the first plant miRNA prediction tool that dynamically plots the miRNA hairpin structure with small reads for identified novel miRNAs. This feature will enable biologists to visualize the novel pre-miRNA structure and the location of small RNA reads relative to the hairpin. Moreover, miRPlant can be easily used by biologists with limited bioinformatics skills.

Relevance: 10.00%

Abstract:

Age-related macular degeneration (AMD) is one of the major causes of vision loss and blindness in the ageing population. Currently there is no cure for AMD; however, early detection and subsequent treatment may prevent severe vision loss or slow the progression of the disease. AMD can be classified into two types, dry and wet, and most people with macular degeneration are affected by the dry form. Early signs of AMD are the formation of drusen and yellow pigmentation. These lesions are identified by manual inspection of fundus images by ophthalmologists, a time-consuming and tiresome process; an automated AMD screening tool could therefore aid clinicians significantly in their diagnosis. This study proposes an automated dry AMD detection system using various entropies (Shannon, Kapur, Renyi and Yager), Higher Order Spectra (HOS) bispectra features, Fractal Dimension (FD), and Gabor wavelet features extracted from greyscale fundus images. The features are ranked using t-test, Kullback–Leibler Divergence (KLD), Chernoff Bound and Bhattacharyya Distance (CBBD), Receiver Operating Characteristics (ROC) curve-based and Wilcoxon ranking methods in order to select optimum features, and are classified into normal and AMD classes using Naive Bayes (NB), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT) and Support Vector Machine (SVM) classifiers. The performance of the proposed system is evaluated using private (Kasturba Medical Hospital, Manipal, India), Automated Retinal Image Analysis (ARIA) and STructured Analysis of the Retina (STARE) datasets. The proposed system yielded the highest average classification accuracies of 90.19%, 95.07% and 95% with 42, 54 and 38 optimally ranked features using the SVM classifier for the private, ARIA and STARE datasets, respectively.
This automated AMD detection system can be used for mass fundus image screening and aid clinicians by making better use of their expertise on selected images that require further examination.
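Of the features listed above, Shannon entropy is the simplest to illustrate: it measures how spread out the intensity histogram of a greyscale fundus image is. A minimal sketch (the paper's exact preprocessing and bin settings are not specified here):

```python
import numpy as np

def shannon_entropy(image, bins=256):
    """Shannon entropy (in bits) of a greyscale image's intensity histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # ignore empty bins; 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))

# A constant image concentrates all mass in one bin: zero entropy.
flat = np.full((64, 64), 128)
print(shannon_entropy(flat))

# An image using every grey level equally often is maximally disordered.
uniform = np.arange(256).reshape(16, 16)
print(shannon_entropy(uniform))
```

Pathological texture changes such as drusen alter this intensity distribution, which is the intuition behind using entropy-type features for AMD screening.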

Relevance: 10.00%

Abstract:

Existing crowd counting algorithms rely on holistic, local or histogram-based features to capture crowd properties, with regression then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram-based methods, and to compare various image features and regression models. A K-fold cross-validation protocol is followed to evaluate performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram-based features; that optimal performance is observed using all image features except textures; and that GPR outperforms linear, KNN and NN regression.
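The K-fold protocol with one of the evaluated regressors (KNN) can be sketched as follows. The features and counts here are synthetic stand-ins, not the paper's datasets or its full feature set:

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Shuffle indices and split them into k disjoint test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict each test count as the mean count of the k nearest training frames."""
    preds = []
    for x in X_test:
        dist = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dist)[:k]
        preds.append(y_train[nearest].mean())
    return np.array(preds)

# Toy frame features (e.g. foreground area, edge count), loosely linear in crowd size.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(100, 2))
y = 50 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 1, 100)

errors = []
for test_idx in kfold_indices(len(X), 5):
    mask = np.ones(len(X), dtype=bool)
    mask[test_idx] = False  # everything outside this fold is training data
    pred = knn_predict(X[mask], y[mask], X[test_idx])
    errors.append(np.mean(np.abs(pred - y[test_idx])))
print(f"5-fold MAE: {np.mean(errors):.2f}")
```

Swapping `knn_predict` for a GPR or linear model under the same fold structure is what makes the regressor comparison in the paper like-for-like.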

Relevance: 10.00%

Abstract:

Stormwater pollution is linked to stream ecosystem degradation, and various types of modelling techniques are adopted to predict it. The accuracy of the predictions provided by these models depends on data quality, appropriate estimation of model parameters, and the validation undertaken. It is well understood that available water quality datasets in urban areas span only relatively short time scales, unlike water quantity data; this limits the applicability of the developed models in engineering and ecological assessment of urban waterways. This paper presents the application of leave-one-out (LOO) and Monte Carlo cross-validation (MCCV) procedures in a Monte Carlo framework for the validation and estimation of the uncertainty associated with pollutant wash-off when models are developed using a limited dataset. It was found that the application of MCCV is likely to result in a more realistic measure of model coefficients than LOO. Most importantly, MCCV and LOO were found to be effective for model validation when dealing with a small sample size, which otherwise hinders detailed model validation and can undermine the effectiveness of stormwater quality management strategies.
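The mechanics of the two validation procedures can be sketched on a toy wash-off model. The single-coefficient load model and the rainfall/load numbers below are illustrative stand-ins, not the paper's model or data:

```python
import numpy as np

def loo_splits(n):
    """Leave-one-out: each observation serves as the test set exactly once."""
    for i in range(n):
        yield np.delete(np.arange(n), i), np.array([i])

def mccv_splits(n, n_iter=200, test_frac=0.3, seed=0):
    """Monte Carlo CV: repeated random train/test partitions of the sample."""
    rng = np.random.default_rng(seed)
    n_test = max(1, int(n * test_frac))
    for _ in range(n_iter):
        idx = rng.permutation(n)
        yield idx[n_test:], idx[:n_test]

# Hypothetical small dataset: rainfall depth (mm) vs pollutant load (kg/ha).
rain = np.array([2.0, 5.0, 8.0, 12.0, 15.0, 20.0, 25.0])
load = np.array([1.1, 2.4, 4.2, 5.8, 7.9, 9.6, 13.0])

def fit_k(train):
    """Least-squares wash-off coefficient k for load = k * rainfall."""
    return float(rain[train] @ load[train] / (rain[train] @ rain[train]))

k_loo = [fit_k(train) for train, _ in loo_splits(len(rain))]
k_mccv = [fit_k(train) for train, _ in mccv_splits(len(rain))]
print(f"LOO  k: {np.mean(k_loo):.3f} ± {np.std(k_loo):.3f}")
print(f"MCCV k: {np.mean(k_mccv):.3f} ± {np.std(k_mccv):.3f}")
```

Because MCCV leaves out a larger, randomly varying portion of the data on each draw, the spread of its coefficient estimates tends to give a less optimistic picture of parameter uncertainty than LOO on the same small sample.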

Relevance: 10.00%

Abstract:

A nonlinear interface element modelling method is formulated for predicting the deformation and failure of high-adhesive, thin-layer polymer-mortared masonry exhibiting failure of both units and mortar. Plastic flow vectors are explicitly integrated within the implicit finite element framework instead of relying on predictor–corrector-like approaches. The method is calibrated using experimental data from uniaxial compression, shear triplet and flexural beam tests. The model is validated against a thin-layer mortared masonry shear wall whose experimental datasets are reported in the literature, and is then used to examine the behaviour of thin-layer mortared masonry under biaxial loading.

Relevance: 10.00%

Abstract:

In recent years, increasing focus has been placed on making good business decisions using the products of data analysis; with the advent of the Big Data phenomenon, this is more apparent than ever before. But how can organizations trust decisions made on the basis of results obtained from the analysis of untrusted data? Assurances are needed that the data and datasets informing these decisions have not been tainted by any outside agency. This study proposes enabling the authentication of datasets, specifically by extending the RESTful architectural scheme to include authentication parameters, while operating within a larger holistic security framework or model compliant with legislation.
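One plausible shape for such authentication parameters is an HMAC signature over a canonicalized request, so the server can verify that neither the query nor the claimed dataset digest was altered in transit. The key, endpoint and parameter names below are hypothetical, not taken from the study:

```python
import hashlib
import hmac
import json

# Hypothetical shared secret between dataset client and server.
SECRET_KEY = b"shared-secret"

def sign_request(method, path, params):
    """Produce an authentication parameter for a REST call by signing a
    canonical string (method, path, sorted JSON parameters) with HMAC-SHA256."""
    canonical = method + "\n" + path + "\n" + json.dumps(params, sort_keys=True)
    return hmac.new(SECRET_KEY, canonical.encode(), hashlib.sha256).hexdigest()

def verify_request(method, path, params, signature):
    """Constant-time check that the received signature matches the request."""
    return hmac.compare_digest(sign_request(method, path, params), signature)

params = {"dataset": "rainfall-2013", "sha256": "ab12..."}
sig = sign_request("GET", "/api/datasets", params)
print(verify_request("GET", "/api/datasets", params, sig))                 # True
print(verify_request("GET", "/api/datasets", {"dataset": "other"}, sig))   # False
```

Carrying `sig` as an extra request parameter keeps the scheme within REST's stateless, uniform-interface constraints, which is the spirit of the extension the study proposes.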

Relevance: 10.00%

Abstract:

Rating systems are used by many websites to allow customers to rate available items according to their own experience. Reputation models then aggregate the available ratings to generate reputation scores for items. A problem with current reputation models is that they provide solutions that enhance accuracy on sparse datasets without considering how the models perform over dense datasets. In this paper, we propose a novel reputation model that generates more accurate reputation scores for items on any dataset, whether dense or sparse. The proposed model is a weighted average method in which the weights are generated using the normal distribution. Experiments show promising results for the proposed model against state-of-the-art models on both sparse and dense datasets.
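One plausible reading of a normally-weighted average (not necessarily the authors' exact formulation) centres the normal density on a robust location such as the median rating, so that outlying ratings contribute less to the reputation score:

```python
import numpy as np

def reputation_score(ratings):
    """Weighted average of ratings, with weights drawn from an unnormalised
    normal pdf centred on the median rating (illustrative interpretation)."""
    r = np.asarray(ratings, dtype=float)
    mu = np.median(r)
    sigma = r.std() or 1.0  # guard against zero spread
    w = np.exp(-0.5 * ((r - mu) / sigma) ** 2)
    return float((w * r).sum() / w.sum())

# A single hostile outlier drags the plain mean down further than the
# normally-weighted score, which discounts the outlier.
ratings = [4, 4, 5, 4, 1]
print(round(float(np.mean(ratings)), 2))  # 3.6
print(round(reputation_score(ratings), 2))
```

The design choice here is that ratings far from the bulk of opinion receive exponentially smaller weight, which is one way a model can stay accurate on both dense and sparse rating sets.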

Relevance: 10.00%

Abstract:

Many websites offer customers the opportunity to rate items and then use those ratings to generate item reputations, which other users can later draw on for decision-making purposes. The aggregated value of the ratings per item represents the reputation of that item, and the accuracy of reputation scores is important because they are used to rank items. Most aggregation methods neither consider the frequency of distinct ratings nor test how accurate their reputation scores are over datasets of differing sparsity. In this work we propose a new aggregation method, best described as a weighted average in which the weights are generated using the normal distribution. The evaluation results show that the proposed method outperforms state-of-the-art methods over datasets of varying sparsity.

Relevance: 10.00%

Abstract:

After attending this presentation, attendees will gain awareness of the ontogeny of cranial maturation, specifically: (1) the fusion timings of primary ossification centers in the basicranium; and (2) the temporal pattern of closure of the anterior fontanelle, with the aim of developing new population-specific age standards for the medicolegal death investigation of Australian subadults. This presentation will impact the forensic science community by demonstrating the potential of a contemporary forensic subadult Computed Tomography (CT) database of cranial scans and population data to recalibrate existing standards for age estimation and to quantify the growth and development of Australian children. This research presents a study design applicable to all countries faced with a paucity of skeletal repositories. Accurate assessment of the age-at-death of skeletal remains is a key element of forensic anthropology methodology. In Australian casework, age standards derived from American reference samples are applied in light of the scarcity of documented Australian skeletal collections. Practitioners currently rely on antiquated standards, such as the Scheuer and Black1 compilation, for age estimation, despite the implications of secular trends and population variation. Skeletal maturation standards are population specific and should not be extrapolated from one population to another, while secular changes in skeletal dimensions and accelerated maturation underscore the importance of establishing modern standards to estimate age in modern subadults. Despite CT imaging becoming the gold standard for skeletal analysis in Australia, practitioners caution against the application of forensic age standards derived from macroscopic inspection to a CT medium, suggesting a need for revised methodologies. Multi-slice CT scans of subadult crania and cervical vertebrae 1 and 2 were acquired from 350 Australian individuals (males: n=193, females: n=157) aged birth to 12 years. 
The CT database, projected at 920 individuals upon completion (January 2014), comprises thin-slice DICOM data (resolution: 0.5/0.3 mm) of patients scanned since 2010 at major Brisbane children's hospitals. DICOM datasets were subject to manual segmentation, followed by the construction of multi-planar and volume-rendered cranial models for subsequent scoring. The unions of the primary ossification centers of the occipital bone were scored as open, partially closed or completely closed, while the fontanelles and vertebrae were scored in accordance with two stages. Transition analysis was applied to elucidate the age at transition between union states for each center, and robust age parameters were established using Bayesian statistics. In comparison to the literature, closure of the fontanelles and contiguous sutures in Australian infants occurs earlier than previously reported, with the anterior fontanelle transitioning from open to closed at 16.7±1.1 months. The metopic suture is closed prior to 10 weeks post-partum and completely obliterated by 6 months of age, independent of sex. Utilizing reverse engineering capabilities, an alternative method for infant age estimation based on quantification of fontanelle area and non-linear regression with variance component modeling will be presented. Closure models indicate that the greatest rate of change in anterior fontanelle area occurs prior to 5 months of age. This study complements the work of Scheuer and Black1, providing more specific age intervals for the union and temporal maturity of each primary ossification center of the occipital bone. For example, dominant fusion of the sutura intra-occipitalis posterior occurs before 9 months of age, followed by persistence of a hyaline cartilage tongue posterior to the foramen magnum until 2.5 years, with obliteration at 2.9±0.1 years. 
Recalibrated age parameters for the atlas and axis are presented, with the anterior arch of the atlas appearing at 2.9 months in females and 6.3 months in males, while the dentoneural, dentocentral and neurocentral junctions of the axis transitioned from non-union to union at 2.1±0.1 years in females and 3.7±0.1 years in males. These results are an exemplar of significant sexual dimorphism in maturation (p<0.05), with girls exhibiting union earlier than boys, justifying the need for sex-segregated standards for age estimation. Studies such as this are imperative for providing updated standards for Australian forensic and pediatric practice, and provide insight into the skeletal development of this population. During this presentation, the utility of novel regression models for age estimation of infants will be discussed, with emphasis on three-dimensional modeling capabilities for complex structures such as fontanelles, for the development of new age estimation methods.
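Transition analysis of the kind used here can be sketched as maximum-likelihood estimation of the age at which a binary open/closed trait flips, under a logistic transition model. The ages, scores and fixed spread parameter below are illustrative only, not the study's data or its Bayesian machinery:

```python
import numpy as np

# Toy observations: age in months and anterior fontanelle state
# (0 = open, 1 = closed), with some overlap around the transition.
ages = np.array([4, 8, 10, 12, 14, 15, 16, 18, 20, 22, 24, 30], dtype=float)
closed = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1])

def neg_log_lik(t, s=2.0):
    """Logistic transition model: P(closed | age) = 1 / (1 + exp(-(age - t)/s)),
    where t is the transition age and s a fixed spread (assumed here)."""
    p = 1.0 / (1.0 + np.exp(-(ages - t) / s))
    p = np.clip(p, 1e-9, 1 - 1e-9)  # numerical safety for the log
    return -np.sum(closed * np.log(p) + (1 - closed) * np.log(1 - p))

# Grid search for the maximum-likelihood transition age.
grid = np.arange(5.0, 30.0, 0.1)
t_hat = grid[np.argmin([neg_log_lik(t) for t in grid])]
print(f"estimated transition age: {t_hat:.1f} months")
```

The study's approach additionally places Bayesian priors on these parameters to obtain robust age intervals rather than a point estimate, but the likelihood above is the core of what "age at transition between union states" means.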

Relevance: 10.00%

Abstract:

Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment-based heuristic that has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing dataset size and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies, the so-called Next Generation Sequencing (NGS) approaches, have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including the adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task in which the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes, respectively) show that a measure of similarity obtained by locality-sensitive hashing gives highly accurate results while offering a number of avenues that may lead to substantial performance improvements over BLAST.
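The binary-signature idea can be sketched with a SimHash-style construction over protein k-mers; similar sequences share most k-mers, so their signatures agree in most bits. The parameters and toy sequences below are illustrative, not the paper's actual signature scheme:

```python
import hashlib
import numpy as np

def signature(seq, k=3, n_bits=64):
    """SimHash-style binary signature: each k-mer votes +1/-1 on every bit
    position according to its hash; the sign of the tally gives the bit."""
    acc = np.zeros(n_bits)
    for i in range(len(seq) - k + 1):
        h = int(hashlib.md5(seq[i:i + k].encode()).hexdigest(), 16)
        bits = np.array([(h >> b) & 1 for b in range(n_bits)])
        acc += np.where(bits == 1, 1, -1)
    return (acc > 0).astype(int)

def hamming_similarity(a, b):
    """Fraction of signature bits that agree (1 - normalised Hamming distance)."""
    return float((a == b).mean())

p1 = "MKVLAAGITALSLLAGCASTDKV"   # toy protein sequence
p2 = "MKVLAAGITALSLLAGCASTDKI"   # near-identical variant
p3 = "GHWETTYNNARKQPLMFDSVICE"   # unrelated sequence
s1, s2, s3 = signature(p1), signature(p2), signature(p3)
print(hamming_similarity(s1, s2), hamming_similarity(s1, s3))
```

Because comparing fixed-width bit vectors is a handful of machine instructions, this kind of similarity scales far better with dataset size than alignment, and it is insensitive to the order in which shared k-mers appear, hence the robustness to rearrangements.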

Relevance: 10.00%

Abstract:

Motivation: Shotgun sequence read data derived from xenograft material contains a mixture of reads arising from the host and reads arising from the graft. Classifying the read mixture to separate the two allows more precise analysis to be performed.

Results: We present a technique, with an associated tool, Xenome, which performs fast, accurate and specific classification of xenograft-derived sequence read data. We have evaluated it on RNA-Seq data from human, mouse and human-in-mouse xenograft datasets.
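The underlying idea, labelling each read by which reference genome its k-mers occur in, can be sketched as follows. This is a toy stand-in for intuition, not Xenome's actual k-mer index or classification logic:

```python
def kmers(seq, k=5):
    """All k-length substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(reference_seqs, k=5):
    """k-mer set for one genome (toy stand-in for a real k-mer index)."""
    index = set()
    for s in reference_seqs:
        index |= kmers(s, k)
    return index

def classify(read, host_idx, graft_idx, k=5):
    """Label a read by which reference its k-mers hit: host, graft, both, or neither."""
    ks = kmers(read, k)
    in_host = bool(ks & host_idx)
    in_graft = bool(ks & graft_idx)
    return {(True, False): "host", (False, True): "graft",
            (True, True): "both", (False, False): "neither"}[(in_host, in_graft)]

# Tiny illustrative "genomes" for the mouse host and human graft.
mouse = build_index(["ACGTACGTTGCA"])
human = build_index(["TTGGCCAATTGG"])
print(classify("ACGTACGTT", mouse, human))   # host
print(classify("TTGGCCAAT", mouse, human))   # graft
```

Reads in the "both" and "neither" bins are exactly the ambiguous cases that make xenograft analysis hard, and separating them out is what allows the downstream analysis of graft-only reads to be more precise.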

Relevance: 10.00%

Abstract:

This thesis investigates face recognition in video under the presence of large pose variations. It proposes a solution that performs simultaneous detection of facial landmarks and head poses across large pose variations, employs discriminative modelling of feature distributions of faces with varying poses, and applies fusion of multiple classifiers to pose-mismatch recognition. Experiments on several benchmark datasets have demonstrated that improved performance is achieved using the proposed solution.

Relevance: 10.00%

Abstract:

In a tag-based recommender system, the multi-dimensional correlations must be modelled effectively to find quality recommendations. Recently, researchers have begun to use tensor models in recommendation to represent and analyze the latent relationships inherent in multi-dimensional data. A common approach is to build the tensor model, decompose it, and then directly use the reconstructed tensor to generate recommendations based on the maximum values of the tensor elements. To improve accuracy and scalability, we propose an implementation of the -mode block-striped (matrix) product for scalable tensor reconstruction, together with probabilistic ranking of the candidate items generated from the reconstructed tensor. Testing on real-world datasets demonstrates that the proposed method outperforms the benchmark methods in terms of recommendation accuracy and scalability.
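The decompose-reconstruct-rank pipeline described above can be sketched with a plain HOSVD-style decomposition in NumPy. This is a generic stand-in for intuition, not the authors' block-striped product or their probabilistic ranking:

```python
import numpy as np

# Toy user x item x tag tensor of observed tagging events (1 = user tagged item with tag).
T = np.zeros((3, 4, 2))
T[0, 0, 0] = T[0, 1, 0] = 1    # user 0 tagged items 0 and 1 with tag 0
T[1, 0, 0] = T[1, 2, 1] = 1    # user 1 shares an interest with user 0
T[2, 3, 1] = 1

def truncated_basis(unfolding, rank):
    """Leading left singular vectors of one mode's unfolding."""
    U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
    return U[:, :rank]

# HOSVD-style decomposition: factor matrices per mode plus a small core tensor.
n_u, n_i, n_t = T.shape
U1 = truncated_basis(T.reshape(n_u, -1), 2)
U2 = truncated_basis(np.moveaxis(T, 1, 0).reshape(n_i, -1), 2)
U3 = truncated_basis(np.moveaxis(T, 2, 0).reshape(n_t, -1), 2)
core = np.einsum("uit,ua,ib,tc->abc", T, U1, U2, U3)
R = np.einsum("abc,ua,ib,tc->uit", core, U1, U2, U3)  # reconstructed tensor

def recommend(user, top_n=2):
    """Rank unseen items for a user by their maximum reconstructed score over tags."""
    scores = R[user].max(axis=1)
    seen = T[user].max(axis=1) > 0
    scores[seen] = -np.inf       # exclude already-tagged items
    return np.argsort(scores)[::-1][:top_n]

print(recommend(0))
```

Reconstructing the full tensor this way is what becomes the scalability bottleneck on real data, which is the step the paper's block-striped matrix product is designed to speed up.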