354 resultados para Automatic Image Annotation
Resumo:
Citizen Science projects are initiatives in which members of the general public participate in scientific research projects and perform or manage research-related tasks such as data collection and/or data annotation. Citizen Science is technologically possible and scientifically significant. However, as the gathered information is from the crowd, the data quality is always hard to manage. There are many ways to manage data quality, and reputation management is one of the common approaches. In recent year, many research teams have deployed many audio or image sensors in natural environment in order to monitor the status of animals or plants. The collected data will be analysed by ecologists. However, as the amount of collected data is exceedingly huge and the number of ecologists is very limited, it is impossible for scientists to manually analyse all these data. The functions of existing automated tools to process the data are still very limited and the results are still not very accurate. Therefore, researchers have turned to recruiting general citizens who are interested in helping scientific research to do the pre-processing tasks such as species tagging. Although research teams can save time and money by recruiting general citizens to volunteer their time and skills to help data analysis, the reliability of contributed data varies a lot. Therefore, this research aims to investigate techniques to enhance the reliability of data contributed by general citizens in scientific research projects especially for acoustic sensing projects. In particular, we aim to investigate how to use reputation management to enhance data reliability. Reputation systems have been used to solve the uncertainty and improve data quality in many marketing and E-Commerce domains. The commercial organizations which have chosen to embrace the reputation management and implement the technology have gained many benefits. Data quality issues are significant to the domain of Citizen Science due to the quantity and diversity of people and devices involved. However, research on reputation management in this area is relatively new. We therefore start our investigation by examining existing reputation systems in different domains. Then we design novel reputation management approaches for Citizen Science projects to categorise participants and data. We have investigated some critical elements which may influence data reliability in Citizen Science projects. These elements include personal information such as location and education and performance information such as the ability to recognise certain bird calls. The designed reputation framework is evaluated by a series of experiments involving many participants for collecting and interpreting data, in particular, environmental acoustic data. Our research in exploring the advantages of reputation management in Citizen Science (or crowdsourcing in general) will help increase awareness among organizations that are unacquainted with its potential benefits.
Resumo:
Typical flow fields in a stormwater gross pollutant trap (GPT) with blocked retaining screens were experimentally captured and visualised. Particle image velocimetry (PIV) software was used to capture the flow field data by tracking neutrally buoyant particles with a high speed camera. A technique was developed to apply the Image Based Flow Visualization (IBFV) algorithm to the experimental raw dataset generated by the PIV software. The dataset consisted of scattered 2D point velocity vectors and the IBFV visualisation facilitates flow feature characterisation within the GPT. The flow features played a pivotal role in understanding gross pollutant capture and retention within the GPT. It was found that the IBFV animations revealed otherwise unnoticed flow features and experimental artefacts. For example, a circular tracer marker in the IBFV program visually highlighted streamlines to investigate specific areas and identify the flow features within the GPT.
Resumo:
Sound tagging has been studied for years. Among all sound types, music, speech, and environmental sound are three hottest research areas. This survey aims to provide an overview about the state-of-the-art development in these areas.We discuss about the meaning of tagging in different sound areas at the beginning of the journey. Some examples of sound tagging applications are introduced in order to illustrate the significance of this research. Typical tagging techniques include manual, automatic, and semi-automatic approaches.After reviewing work in music, speech and environmental sound tagging, we compare them and state the research progress to date. Research gaps are identified for each research area and the common features and discriminations between three areas are discovered as well. Published datasets, tools used by researchers, and evaluation measures frequently applied in the analysis are listed. In the end, we summarise the worldwide distribution of countries dedicated to sound tagging research for years.
Resumo:
This paper presents a novel technique for segmenting an audio stream into homogeneous regions according to speaker identities, background noise, music, environmental and channel conditions. Audio segmentation is useful in audio diarization systems, which aim to annotate an input audio stream with information that attributes temporal regions of the audio into their specific sources. The segmentation method introduced in this paper is performed using the Generalized Likelihood Ratio (GLR), computed between two adjacent sliding windows over preprocessed speech. This approach is inspired by the popular segmentation method proposed by the pioneering work of Chen and Gopalakrishnan, using the Bayesian Information Criterion (BIC) with an expanding search window. This paper will aim to identify and address the shortcomings associated with such an approach. The result obtained by the proposed segmentation strategy is evaluated on the 2002 Rich Transcription (RT-02) Evaluation dataset, and a miss rate of 19.47% and a false alarm rate of 16.94% is achieved at the optimal threshold.
Resumo:
Many methods exist at the moment for deformable face fitting. A drawback to nearly all these approaches is that they are (i) noisy in terms of landmark positions, and (ii) the noise is biased across frames (i.e. the misalignment is toward common directions across all frames). In this paper we propose a grouped $\mathcal{L}1$-norm anchored method for simultaneously aligning an ensemble of deformable face images stemming from the same subject, given noisy heterogeneous landmark estimates. Impressive alignment performance improvement and refinement is obtained using very weak initialization as "anchors".
Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach
Resumo:
In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.
Resumo:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans,with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
Resumo:
Many state of the art vision-based Simultaneous Localisation And Mapping (SLAM) and place recognition systems compute the salience of visual features in their environment. As computing salience can be problematic in radically changing environments new low resolution feature-less systems have been introduced, such as SeqSLAM, all of which consider the whole image. In this paper, we implement a supervised classifier system (UCS) to learn the salience of image regions for place recognition by feature-less systems. SeqSLAM only slightly benefits from the results of training, on the challenging real world Eynsham dataset, as it already appears to filter less useful regions of a panoramic image. However, when recognition is limited to specific image regions performance improves by more than an order of magnitude by utilising the learnt image region saliency. We then investigate whether the region salience generated from the Eynsham dataset generalizes to another car-based dataset using a perspective camera. The results suggest the general applicability of an image region salience mask for optimizing route-based navigation applications.
Resumo:
Since the first destination image studies were published in the early 1970s, the field has become one of the most popular in the tourism literature. While reviews of the destination image literature show no commonly agreed conceptualisation of the construct, researchers have predominantly used structured questionnaires for measurement. There has been criticism that the way some of these scales have been selected means a greater likelihood of attributes being irrelevant to participants. This opens up the risk of stimulating uninformed responses. The issue of uninformed response was first raised as a source of error 60 years ago. However, there has been little, if any, discussion in relation to destination image measurement, studies of which often require participants to provide opinion-driven rather than fact-based responses. This paper reports the trial of a ‘don’t know’ (DK) non-response option for participants in two destination image questionnaires. It is suggested the use of a DK option provides participants with an alternative to i) skipping the question, ii) using the scale midpoint to denote neutrality, or iii) providing an uninformed response. High levels of DK usage by participants can then alert the marketer of the need to improve awareness of destination performance for potential salient attributes.
Resumo:
In the last decade, smartphones have gained widespread usage. Since the advent of online application stores, hundreds of thousands of applications have become instantly available to millions of smart-phone users. Within the Android ecosystem, application security is governed by digital signatures and a list of coarse-grained permissions. However, this mechanism is not fine-grained enough to provide the user with a sufficient means of control of the applications' activities. Abuse of highly sensible private information such as phone numbers without users' notice is the result. We show that there is a high frequency of privacy leaks even among widely popular applications. Together with the fact that the majority of the users are not proficient in computer security, this presents a challenge to the engineers developing security solutions for the platform. Our contribution is twofold: first, we propose a service which is able to assess Android Market applications via static analysis and provide detailed, but readable reports to the user. Second, we describe a means to mitigate security and privacy threats by automated reverse-engineering and refactoring binary application packages according to the users' security preferences.
Resumo:
The rank and census are two filters based on order statistics which have been applied to the image matching problem for stereo pairs. Advantages of these filters include their robustness to radiometric distortion and small amounts of random noise, and their amenability to hardware implementation. In this paper, a new matching algorithm is presented, which provides an overall framework for matching, and is used to compare the rank and census techniques with standard matching metrics. The algorithm was tested using both real stereo pairs and a synthetic pair with ground truth. The rank and census filters were shown to significantly improve performance in the case of radiometric distortion. In all cases, the results obtained were comparable to, if not better than, those obtained using standard matching metrics. Furthermore, the rank and census have the additional advantage that their computational overhead is less than these metrics. For all techniques tested, the difference between the results obtained for the synthetic stereo pair, and the ground truth results was small.
Resumo:
We applied a texture-based flow visualisation technique to a numerical hydrodynamic model of the Pumicestone Passage in southeast Queensland, Australia. The quality of the visualisations using our flow visualisation tool, are compared with animations generated using more traditional drogue release plot and velocity contour and vector techniques. The texture-based method is found to be far more effective in visualising advective flow within the model domain. In some instances, it also makes it easier for the researcher to identify specific hydrodynamic features within the complex flow regimes of this shallow tidal barrier estuary as compared with the direct and geometric based methods.
Resumo:
Concern that poor image of UK construction industry is restricting recruitment has lead to call for action. This paper gives the results of a recent comparative analysis of the image of both UK and Hungarian industries which indicates the UK image to be relatively good. The perceived cause of Hungarian problems is the poor level of organisation and management.