970 results for Datasets
Abstract:
Existing crowd counting algorithms rely on holistic, local or histogram-based features to capture crowd properties, with regression then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram-based methods, and to compare various image features and regression models. A K-fold cross-validation protocol is followed to evaluate performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram-based features; that optimal performance is observed using all image features except textures; and that GPR outperforms linear, KNN and NN regression.
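For a concrete sense of the protocol, here is a minimal sketch in Python using scikit-learn: a five-fold comparison of the four regressor families. The feature vectors and crowd counts are synthetic stand-ins; the actual size/shape/edge/keypoint/texture extraction is not reproduced.

```python
# Minimal sketch of the K-fold protocol described above, using scikit-learn.
# Real image features are stubbed with synthetic data; only the evaluation
# loop comparing the four regression families is illustrated.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                                 # per-frame features (stub)
y = X @ rng.normal(size=10) + rng.normal(scale=0.1, size=200)  # crowd counts (stub)

models = {
    "GPR": GaussianProcessRegressor(),
    "Linear": LinearRegression(),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "NN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    mae = -cross_val_score(model, X, y, cv=cv,
                           scoring="neg_mean_absolute_error")
    print(f"{name}: MAE = {mae.mean():.3f} +/- {mae.std():.3f}")
```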
Abstract:
Stormwater pollution is linked to stream ecosystem degradation, and various modelling techniques are adopted to predict it. The accuracy of the predictions provided by these models depends on data quality, appropriate estimation of model parameters, and the validation undertaken. It is well understood that available water quality datasets in urban areas span only relatively short time scales, unlike water quantity data, which limits the applicability of the developed models in engineering and ecological assessment of urban waterways. This paper presents the application of leave-one-out (LOO) and Monte Carlo cross validation (MCCV) procedures in a Monte Carlo framework for the validation and estimation of uncertainty associated with pollutant wash-off when models are developed using a limited dataset. It was found that MCCV is likely to result in a more realistic measure of model coefficients than LOO. Most importantly, MCCV and LOO were found to be effective in model validation when dealing with a small sample size, which otherwise hinders detailed model validation and can undermine the effectiveness of stormwater quality management strategies.
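The sketch below contrasts LOO with MCCV (scikit-learn's ShuffleSplit) on a deliberately small synthetic sample, assuming a simple linear wash-off relation; it illustrates the validation idea only, not the paper's pollutant wash-off model.

```python
# Hedged sketch contrasting leave-one-out (LOO) with Monte Carlo cross
# validation (MCCV) on a small sample. The wash-off model is stubbed as
# linear; scikit-learn's ShuffleSplit plays the MCCV role.
import numpy as np
from sklearn.model_selection import LeaveOneOut, ShuffleSplit
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.uniform(0, 50, size=(15, 1))              # e.g. rainfall intensity (stub)
y = 0.8 * X[:, 0] + rng.normal(scale=3, size=15)  # pollutant load (stub)

def coef_spread(splitter):
    """Fit the model on each training split and report coefficient spread."""
    coefs = [LinearRegression().fit(X[train], y[train]).coef_[0]
             for train, _ in splitter.split(X)]
    return np.mean(coefs), np.std(coefs)

print("LOO :", coef_spread(LeaveOneOut()))
print("MCCV:", coef_spread(ShuffleSplit(n_splits=200, test_size=0.3,
                                        random_state=1)))
```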
Abstract:
A nonlinear interface element modelling method is formulated for predicting the deformation and failure of high-adhesive, thin-layer polymer mortared masonry exhibiting failure of both units and mortar. Plastic flow vectors are explicitly integrated within the implicit finite element framework, rather than relying on predictor–corrector-style approaches. The method is calibrated using experimental data from uniaxial compression, shear triplet and flexural beam tests. The model is validated against a thin-layer mortared masonry shear wall whose experimental datasets are reported in the literature, and is used to examine the behaviour of thin-layer mortared masonry under biaxial loading.
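By way of illustration only, the sketch below applies explicit (forward-Euler) integration to a one-dimensional plastic flow rule with linear hardening, avoiding an iterative predictor–corrector loop; the interface formulation in the paper is considerably richer than this toy model.

```python
# Illustrative sketch only: explicit (forward-Euler) integration of a 1-D
# plastic flow rule with linear hardening, with no iterative return-mapping
# (predictor-corrector) loop. Not the paper's actual interface formulation.
def explicit_step(sigma, alpha, deps, E=1000.0, H=50.0, sigma_y0=10.0):
    """One strain increment; returns updated stress and hardening variable."""
    f = abs(sigma) - (sigma_y0 + H * alpha)   # yield function at current state
    n = 1.0 if sigma >= 0 else -1.0           # plastic flow direction
    if f >= 0.0 and n * E * deps > 0.0:       # plastic loading
        dlam = (n * E * deps) / (E + H)       # consistency condition, explicit
        sigma += E * (deps - dlam * n)        # elastoplastic stress update
        alpha += dlam
    else:                                     # purely elastic step
        sigma += E * deps
    return sigma, alpha

sigma, alpha = 0.0, 0.0
for _ in range(100):                          # monotonic loading
    sigma, alpha = explicit_step(sigma, alpha, deps=1e-3)
print(sigma, alpha)
```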
Abstract:
In recent years, increasing focus has been placed on making good business decisions using the products of data analysis. With the advent of the Big Data phenomenon, this is more apparent than ever before. But how can organizations trust decisions made on the basis of results obtained from the analysis of untrusted data? Assurances are needed that the data and datasets informing these decisions have not been tainted by an outside agency. This study proposes enabling the authentication of datasets by extending the RESTful architectural scheme to include authentication parameters, operating within a larger holistic security framework, architecture or model compliant with legislation.
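A minimal sketch of the underlying idea follows, assuming a shared secret and hypothetical parameter names (dataset_id, digest, signature — none of these are from the paper): a RESTful request is extended with authentication parameters so the server can verify a dataset's integrity and origin.

```python
# Hedged sketch: extending a RESTful request with authentication parameters
# so the server can verify the dataset has not been tampered with. Parameter
# names and the HMAC scheme are illustrative, not the paper's actual design.
import hashlib, hmac, time
from urllib.parse import urlencode

SHARED_KEY = b"provisioned-out-of-band"       # hypothetical shared secret

def signed_dataset_url(base, dataset_id, dataset_bytes):
    params = {
        "dataset_id": dataset_id,
        "timestamp": int(time.time()),        # replay protection
        "digest": hashlib.sha256(dataset_bytes).hexdigest(),
    }
    canonical = urlencode(sorted(params.items()))
    params["signature"] = hmac.new(SHARED_KEY, canonical.encode(),
                                   hashlib.sha256).hexdigest()
    return f"{base}?{urlencode(params)}"

print(signed_dataset_url("https://api.example.org/datasets",
                         "sensor-readings-2013", b"...raw dataset bytes..."))
```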
Abstract:
Rating systems are used by many websites to allow customers to rate available items according to their own experience. Reputation models then aggregate the available ratings to generate reputation scores for the items. A problem with current reputation models is that they focus on enhancing accuracy on sparse datasets without considering how the models perform on dense datasets. In this paper, we propose a novel reputation model that generates more accurate reputation scores for items on any dataset, whether dense or sparse. The proposed model is a weighted average method in which the weights are generated using the normal distribution. Experiments show promising results for the proposed model over state-of-the-art models on both sparse and dense datasets.
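The paper's exact weighting scheme is not detailed here; the sketch below shows one plausible reading, in which each rating is weighted by a normal density centred on the sample mean, so that outlying ratings are down-weighted.

```python
# Minimal sketch of a normally-weighted average of ratings. This is one
# plausible reading of "weights generated using the normal distribution",
# shown purely for illustration, not the paper's actual formula.
import numpy as np
from scipy.stats import norm

def reputation(ratings):
    r = np.asarray(ratings, dtype=float)
    mu, sd = r.mean(), r.std(ddof=0) or 1.0   # guard against all-equal ratings
    w = norm.pdf(r, loc=mu, scale=sd)         # normal-distribution weights
    return float(np.average(r, weights=w))

print(reputation([5, 5, 4, 1, 5, 4]))         # the outlying 1 is down-weighted
print(reputation([3, 3, 3]))                  # degenerate case: plain mean
```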
Abstract:
Many websites let customers rate items and then use those ratings to generate item reputations, which other users can later draw on for decision-making. The aggregated value of the ratings per item represents that item's reputation, and the accuracy of reputation scores matters because they are used to rank items. Most aggregation methods neither consider the frequency of distinct ratings nor test how accurate their reputation scores are over datasets of differing sparsity. In this work we propose a new aggregation method, describable as a weighted average in which the weights are generated using the normal distribution. The evaluation results show that the proposed method outperforms state-of-the-art methods over datasets of differing sparsity.
Abstract:
After attending this presentation, attendees will gain awareness of the ontogeny of cranial maturation, specifically: (1) the fusion timings of primary ossification centers in the basicranium; and (2) the temporal pattern of closure of the anterior fontanelle, to develop new population-specific age standards for medicolegal death investigation of Australian subadults. This presentation will impact the forensic science community by demonstrating the potential of a contemporary forensic subadult Computed Tomography (CT) database of cranial scans and population data to recalibrate existing standards for age estimation and to quantify the growth and development of Australian children. This research offers a study design applicable to all countries faced with a paucity of skeletal repositories.

Accurate assessment of the age-at-death of skeletal remains is a key element of forensic anthropology methodology. In Australian casework, age standards derived from American reference samples are applied in light of the scarcity of documented Australian skeletal collections. Practitioners currently rely on antiquated standards, such as the Scheuer and Black¹ compilation, for age estimation, despite the implications of secular trends and population variation. Skeletal maturation standards are population-specific and should not be extrapolated from one population to another, while secular changes in skeletal dimensions and accelerated maturation underscore the importance of establishing modern standards to estimate age in modern subadults. Although CT imaging is becoming the gold standard for skeletal analysis in Australia, practitioners caution against applying forensic age standards derived from macroscopic inspection to a CT medium, suggesting a need for revised methodologies.

Multi-slice CT scans of subadult crania and cervical vertebrae 1 and 2 were acquired from 350 Australian individuals (males: n=193, females: n=157) aged birth to 12 years. The CT database, projected at 920 individuals upon completion (January 2014), comprises thin-slice DICOM data (resolution: 0.5/0.3mm) of patients scanned since 2010 at major Brisbane children's hospitals. DICOM datasets were subject to manual segmentation, followed by the construction of multi-planar and volume-rendered cranial models for subsequent scoring. The union of each primary ossification center of the occipital bone was scored as open, partially closed or completely closed, while the fontanelles and vertebrae were scored in accordance with two stages. Transition analysis was applied to elucidate the age at transition between union states for each center, and robust age parameters were established using Bayesian statistics.

Closure of the fontanelles and contiguous sutures in Australian infants occurs earlier than reported in the literature, with the anterior fontanelle transitioning from open to closed at 16.7±1.1 months. The metopic suture is closed prior to 10 weeks post-partum and completely obliterated by 6 months of age, independent of sex. Utilizing reverse-engineering capabilities, an alternative method for infant age estimation based on quantification of fontanelle area and non-linear regression with variance-component modeling will be presented. Closure models indicate that the greatest rate of change in anterior fontanelle area occurs prior to 5 months of age. This study complements the work of Scheuer and Black¹, providing more specific age intervals for the union and temporal maturity of each primary ossification center of the occipital bone. For example, dominant fusion of the sutura intra-occipitalis posterior occurs before 9 months of age, followed by persistence of a hyaline cartilage tongue posterior to the foramen magnum until 2.5 years, with obliteration at 2.9±0.1 years.

Recalibrated age parameters for the atlas and axis are presented, with the anterior arch of the atlas appearing at 2.9 months in females and 6.3 months in males, while the dentoneural, dentocentral and neurocentral junctions of the axis transitioned from non-union to union at 2.1±0.1 years in females and 3.7±0.1 years in males. These results exemplify significant sexual dimorphism in maturation (p<0.05), with girls exhibiting union earlier than boys, justifying the need for sex-segregated standards for age estimation. Studies such as this are imperative for providing updated standards for Australian forensic and pediatric practice, and provide insight into the skeletal development of this population. During this presentation, the utility of novel regression models for age estimation of infants will be discussed, with emphasis on the three-dimensional modeling of complex structures such as fontanelles for the development of new age-estimation methods.
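As a hedged illustration of the transition-analysis step, the following sketch fits a cumulative-probit model of closure state against age on synthetic data and reads off an estimated age-at-transition; the study's Bayesian estimation and CT scoring procedure are not reproduced.

```python
# Hedged sketch of the transition-analysis idea: fit a cumulative-probit
# model P(closed | age) by maximum likelihood and report the mean age at
# which the transition from open to closed occurs. Data below are synthetic.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

age = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 24, 30], float)  # months
closed = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1])              # stub scores

def neg_log_lik(theta):
    mu, log_sd = theta
    p = norm.cdf((age - mu) / np.exp(log_sd))   # P(closed at this age)
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.sum(closed * np.log(p) + (1 - closed) * np.log(1 - p))

fit = minimize(neg_log_lik, x0=[15.0, np.log(3.0)], method="Nelder-Mead")
mu_hat, sd_hat = fit.x[0], np.exp(fit.x[1])
print(f"estimated age-at-transition: {mu_hat:.1f} +/- {sd_hat:.1f} months")
```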
Abstract:
Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment-based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing dataset size and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including the adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task in which the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes, respectively) show that a measure of similarity obtained by locality-sensitive hashing gives highly accurate results while offering a number of avenues for substantial performance improvements over BLAST.
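A minimal sketch of the signature idea, assuming random-hyperplane hashing over k-mer count vectors (the paper's exact signature construction may differ): similar sequences receive similar binary signatures, so Hamming agreement between signatures approximates sequence similarity.

```python
# Sketch of locality-sensitive hashing for protein sequences: sequences map
# to k-mer count vectors, which are projected onto random hyperplanes and
# compared by agreement between the resulting binary signatures.
import numpy as np
from itertools import product

AA = "ACDEFGHIKLMNPQRSTVWY"
K = 2
KMERS = {"".join(p): i for i, p in enumerate(product(AA, repeat=K))}

def kmer_vector(seq):
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        idx = KMERS.get(seq[i:i + K])
        if idx is not None:
            v[idx] += 1
    return v

rng = np.random.default_rng(42)
PLANES = rng.normal(size=(256, len(KMERS)))   # 256-bit signatures

def signature(seq):
    return (PLANES @ kmer_vector(seq)) >= 0   # one bit per hyperplane

def hamming_sim(a, b):
    return float(np.mean(signature(a) == signature(b)))

print(hamming_sim("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
                  "MKTAYIAKQRQISFVKSHFSRQPEERLGLIEVQ"))  # near-identical: high
print(hamming_sim("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
                  "GGGGSSSSGGGGSSSSGGGG"))               # unrelated: ~0.5
```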
Abstract:
Motivation: Shotgun sequence read data derived from xenograft material contains a mixture of reads arising from the host and reads arising from the graft. Classifying the read mixture to separate the two allows more precise analysis to be performed.
Results: We present a technique, with an associated tool, Xenome, which performs fast, accurate and specific classification of xenograft-derived sequence read data. We have evaluated it on RNA-Seq data from human, mouse and human-in-mouse xenograft datasets.
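As a toy illustration of the k-mer intuition behind host/graft separation (not Xenome's actual data structures or read categories), a read can be labelled by which reference's k-mer set its own k-mers intersect:

```python
# Toy sketch: classify a read by k-mer membership in host/graft references.
# Xenome's real index and category scheme are far more sophisticated.
def kmers(seq, k=5):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

HOST = kmers("ACGTACGTGGCCATTAGGCTAGCTAACG")    # mouse reference (stub)
GRAFT = kmers("TTGACCAGTAGGCATCGATCGGATCCAA")   # human reference (stub)

def classify(read, k=5):
    ks = kmers(read, k)
    in_host, in_graft = bool(ks & HOST), bool(ks & GRAFT)
    return {(True, False): "host", (False, True): "graft",
            (True, True): "both", (False, False): "neither"}[(in_host, in_graft)]

print(classify("ACGTACGTGG"))   # -> host
print(classify("TTGACCAGTA"))   # -> graft
```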
Abstract:
This thesis investigates face recognition in video in the presence of large pose variations. It proposes a solution that performs simultaneous detection of facial landmarks and head poses across large pose variations, employs discriminative modelling of the feature distributions of faces with varying poses, and applies fusion of multiple classifiers to pose-mismatched recognition. Experiments on several benchmark datasets demonstrate that the proposed solution achieves improved performance.
Abstract:
In a tag-based recommender system, the multi-dimensional
A tag-based personalized item recommendation system using tensor modeling and topic model approaches
Abstract:
This research falls in the area of enhancing the quality of tag-based item recommendation systems. It aims to achieve this by employing a multi-dimensional user profile approach and by analyzing the semantic aspects of tags. Tag-based recommender systems have two characteristics that need to be carefully studied in order to build a reliable system. Firstly, the multi-dimensional correlation, called the tag assignment
Abstract:
This thesis presents an association rule mining approach, association hierarchy mining (AHM). Unlike traditional two-step, bottom-up rule mining, AHM adopts a one-step, top-down strategy to improve the efficiency and effectiveness of mining association rules from datasets. The thesis also presents a novel approach to evaluating the quality of the knowledge discovered by AHM, which focuses on the information difference between the discovered knowledge and the original datasets. Experiments performed on a real application, characterising network traffic behaviour, show that AHM achieves encouraging performance.
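For contrast with AHM's one-step, top-down strategy, the sketch below shows the classic two-step, bottom-up baseline on toy network-traffic transactions: frequent itemsets are enumerated first, then rules meeting a confidence threshold are derived. This is the baseline the thesis argues against, not AHM itself.

```python
# Classic two-step bottom-up rule mining (frequent itemsets, then rules),
# shown as the baseline AHM improves upon. Toy data, minimal implementation.
from itertools import combinations

transactions = [{"http", "dns"}, {"http", "ssh"}, {"http", "dns", "ssh"},
                {"dns"}, {"http", "dns"}]
MIN_SUP, MIN_CONF = 0.4, 0.7

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

items = sorted({i for t in transactions for i in t})
frequent = [frozenset(c) for n in (1, 2, 3)                 # step 1: itemsets
            for c in combinations(items, n) if support(set(c)) >= MIN_SUP]

for fs in (f for f in frequent if len(f) > 1):              # step 2: rules
    for ante in (frozenset(a) for n in range(1, len(fs))
                 for a in combinations(fs, n)):
        conf = support(fs) / support(ante)
        if conf >= MIN_CONF:
            print(f"{set(ante)} -> {set(fs - ante)} (conf={conf:.2f})")
```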
Abstract:
In this paper we propose the hybrid use of illuminant-invariant and RGB images to perform image classification of urban scenes despite challenging variation in lighting conditions. Coping with lighting change (and the shadows it invokes) is a non-negotiable requirement for long-term autonomy using vision; one aspect of this is the ability to reliably classify scene components in the presence of marked and often sudden changes in lighting, which is the focus of this paper. Posed with the task of classifying all parts of a scene from a full-colour image, we propose that lighting-invariant transforms can reduce the variability of the scene, resulting in a more reliable classification. We leverage the ideas of “data transfer” for classification, beginning with full-colour images to obtain candidate scene-level matches using global image descriptors; this is commonly followed by superpixel-level matching with local features. However, we show that if the RGB images are subjected to an illuminant-invariant transform before the superpixel-level features are computed, classification is significantly more robust to scene illumination effects. The approach is evaluated using three datasets: our own dataset and the KITTI dataset, both with manually generated ground truth for quantitative analysis, and a third custom dataset over a 750 m trajectory on which the method is evaluated qualitatively.
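For illustration, one common form of illuminant-invariant transform from the shadow-invariance literature is a log-chromaticity projection, sketched below; the channel weight alpha is camera-dependent, and the value used here is purely illustrative, not the paper's calibration.

```python
# Hedged sketch of a log-chromaticity illuminant-invariant transform of the
# kind used in the shadow-removal literature. alpha depends on the camera's
# spectral response; 0.48 is an illustrative placeholder.
import numpy as np

def illuminant_invariant(rgb, alpha=0.48):
    """rgb: float array of shape (H, W, 3), values in (0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-6                                # avoid log(0)
    return 0.5 + np.log(g + eps) - alpha * np.log(b + eps) \
               - (1 - alpha) * np.log(r + eps)

img = np.random.default_rng(0).uniform(0.1, 1.0, size=(4, 4, 3))
print(illuminant_invariant(img).shape)        # one-channel invariant image
```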
Abstract:
1. Marine ecosystems provide critically important goods and services to society, and hence their accelerated degradation underpins an urgent need to take rapid, ambitious and informed decisions regarding their conservation and management.
2. The capacity, however, to generate the detailed field data required to inform conservation planning at appropriate scales is limited by time- and resource-consuming methods for collecting and analysing field data at the large scales required.
3. The ‘Catlin Seaview Survey’, described here, introduces a novel framework for large-scale monitoring of coral reefs using high-definition underwater imagery collected with customized underwater vehicles in combination with computer vision and machine learning. This enables quantitative and geo-referenced outputs of coral reef features such as habitat types, benthic composition and structural complexity (rugosity) to be generated across multiple kilometre-scale transects with a spatial resolution ranging from 2 to 6 m².
4. The novel application of technology described here has enormous potential to contribute to our understanding of coral reefs and associated impacts by underpinning management decisions with kilometre-scale measurements of reef health.
5. Imagery datasets from an initial survey of 500 km of seascape are freely available through an online tool called the Catlin Global Reef Record. Outputs from the image analysis using the technologies described here will be updated in the online repository as work progresses on each dataset.
6. Case studies illustrate the utility of the outputs as well as their potential to link to information from remote sensing. The potential implications of these innovative technologies for marine resource management and conservation are also discussed, along with the accuracy and efficiency of the methodologies deployed.