262 resultados para litter mixture
Resumo:
This thesis has investigated how to cluster a large number of faces within a multi-media corpus in the presence of large session variation. Quality metrics are used to select the best faces to represent a sequence of faces; and session variation modelling improves clustering performance in the presence of wide variations across videos. Findings from this thesis contribute to improving the performance of both face verification systems and the fully automated clustering of faces from a large video corpus.
Resumo:
This paper presents a system to analyze long field recordings with low signal-to-noise ratio (SNR) for bio-acoustic monitoring. A method based on spectral peak track, Shannon entropy, harmonic structure and oscillation structure is proposed to automatically detect anuran (frog) calling activity. Gaussian mixture model (GMM) is introduced for modelling those features. Four anuran species widespread in Queensland, Australia, are selected to evaluate the proposed system. A visualization method based on extracted indices is employed for detection of anuran calling activity which achieves high accuracy.
Resumo:
Road deposited dust is a complex mixture of pollutants derived from a wide range of sources. Accurate identification of these sources is seminal for effective source-oriented control measures. A range of techniques such as enrichment factor analysis (EF), principal component analysis (PCA) and hierarchical cluster analysis (HCA) are available for identifying sources of complex mixtures. However, they have multiple deficiencies when applied individually. This study presents an approach for the effective utilisation of EF, PCA and HCA for source identification, so that their specific deficiencies on an individual basis are eliminated. EF analysis confirmed the non-soil origin of metals such as Na, Cu, Cd, Zn, Sn, K, Ca, Sb, Ba, Ti, Ni and Mo providing guidance in the identification of anthropogenic sources. PCA and HCA identified four sources, with soil and asphalt wear in combination being the most prominent sources. Other sources were tyre wear, brake wear and sea salt.
Resumo:
For clustered survival data, the traditional Gehan-type estimator is asymptotically equivalent to using only the between-cluster ranks, and the within-cluster ranks are ignored. The contribution of this paper is two fold: - (i) incorporating within-cluster ranks in censored data analysis, and; - (ii) applying the induced smoothing of Brown and Wang (2005, Biometrika) for computational convenience. Asymptotic properties of the resulting estimating functions are given. We also carry out numerical studies to assess the performance of the proposed approach and conclude that the proposed approach can lead to much improved estimators when strong clustering effects exist. A dataset from a litter-matched tumorigenesis experiment is used for illustration.
Resumo:
The 2008 US election has been heralded as the first presidential election of the social media era, but took place at a time when social media were still in a state of comparative infancy; so much so that the most important platform was not Facebook or Twitter, but the purpose-built campaign site my.barackobama.com, which became the central vehicle for the most successful electoral fundraising campaign in American history. By 2012, the social media landscape had changed: Facebook and, to a somewhat lesser extent, Twitter are now well-established as the leading social media platforms in the United States, and were used extensively by the campaign organisations of both candidates. As third-party spaces controlled by independent commercial entities, however, their use necessarily differs from that of home-grown, party-controlled sites: from the point of view of the platform itself, a @BarackObama or @MittRomney is technically no different from any other account, except for the very high follower count and an exceptional volume of @mentions. In spite of the significant social media experience which Democrat and Republican campaign strategists had already accumulated during the 2008 campaign, therefore, the translation of such experience to the use of Facebook and Twitter in their 2012 incarnations still required a substantial amount of new work, experimentation, and evaluation. This chapter examines the Twitter strategies of the leading accounts operated by both campaign headquarters: the ‘personal’ candidate accounts @BarackObama and @MittRomney as well as @JoeBiden and @PaulRyanVP, and the campaign accounts @Obama2012 and @TeamRomney. Drawing on datasets which capture all tweets from and at these accounts during the final months of the campaign (from early September 2012 to the immediate aftermath of the election night), we reconstruct the campaigns’ approaches to using Twitter for electioneering from the quantitative and qualitative patterns of their activities, and explore the resonance which these accounts have found with the wider Twitter userbase. A particular focus of our investigation in this context will be on the tweeting styles of these accounts: the mixture of original messages, @replies, and retweets, and the level and nature of engagement with everyday Twitter followers. We will examine whether the accounts chose to respond (by @replying) to the messages of support or criticism which were directed at them, whether they retweeted any such messages (and whether there was any preferential retweeting of influential or – alternatively – demonstratively ordinary users), and/or whether they were used mainly to broadcast and disseminate prepared campaign messages. Our analysis will highlight any significant differences between the accounts we examine, trace changes in style over the course of the final campaign months, and correlate such stylistic differences with the respective electoral positioning of the candidates. Further, we examine the use of these accounts during moments of heightened attention (such as the presidential and vice-presidential debates, or in the context of controversies such as that caused by the publication of the Romney “47%” video; additional case studies may emerge over the remainder of the campaign) to explore how they were used to present or defend key talking points, and exploit or avert damage from campaign gaffes. A complementary analysis of the messages directed at the campaign accounts (in the form of @replies or retweets) will also provide further evidence for the extent to which these talking points were picked up and disseminated by the wider Twitter population. Finally, we also explore the use of external materials (links to articles, images, videos, and other content on the campaign sites themselves, in the mainstream media, or on other platforms) by the campaign accounts, and the resonance which these materials had with the wider follower base of these accounts. This provides an indication of the integration of Twitter into the overall campaigning process, by highlighting how the platform was used as a means of encouraging the viral spread of campaign propaganda (such as advertising materials) or of directing user attention towards favourable media coverage. By building on comprehensive, large datasets of Twitter activity (as of early October, our combined datasets comprise some 3.8 million tweets) which we process and analyse using custom-designed social media analytics tools, and by using our initial quantitative analysis to guide further qualitative evaluation of Twitter activity around these campaign accounts, we are able to provide an in-depth picture of the use of Twitter in political campaigning during the 2012 US election which will provide detailed new insights social media use in contemporary elections. This analysis will then also be able to serve as a touchstone for the analysis of social media use in subsequent elections, in the USA as well as in other developed nations where Twitter and other social media platforms are utilised in electioneering.
Resumo:
In school environments, children are constantly exposed to mixtures of airborne substances, derived from a variety of sources, both in the classroom and in the school surroundings. It is important to evaluate the hazardous properties of these mixtures, in order to conduct risk assessments of their impact on chil¬dren’s health. Within this context, through the application of a Maximum Cumulative Ratio approach, this study aimed to explore whether health risks due to indoor air mixtures are driven by a single substance or are due to cumulative exposure to various substances. This methodology requires knowledge of the concentration of substances in the air mixture, together with a health related weighting factor (i.e. reference concentration or lowest concentration of interest), which is necessary to calculate the Hazard Index. Maximum cumulative ratio and Hazard Index values were then used to categorise the mixtures into four groups, based on their hazard potential and therefore, appropriate risk management strategies. Air samples were collected from classrooms in 25 primary schools in Brisbane, Australia. Analysis was conducted based on the measured concentration of these substances in about 300 air samples. The results showed that in 92% of the schools, indoor air mixtures belonged to the ‘low concern’ group and therefore, they did not require any further assessment. In the remaining schools, toxicity was mainly governed by a single substance, with a very small number of schools having a multiple substance mix which required a combined risk assessment. The proposed approach enables the identification of such schools and thus, aides in the efficient health risk management of pollution emissions and air quality in the school environment.
Resumo:
Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally absences comprise observations or are inferred from comprehensive sampling. When such information is not available, then pseudo-absences are often generated from the background locations within the study region of interest containing the presences, or else absence is implied through the comparison of presences to the whole study region, e.g. as is the case in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves analysis highly susceptible to ‘naughty-noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those that occur beyond the species envelope, can often be more diverse than presences. Here we consider an extension to excess zero models. The two-staged approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. Then SDMs can be fit separately within each envelope, and for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006). We introduce a wider range of model performance measures to improve treatment of naughty noughts in SDM. We retain an overall measure of model performance, the area under the curve (AUC) of the Receiver-Operating Curve (ROC), but focus on its constituent measures of false negative rate (FNR) and false positive rate (FPR), and how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of predictions: false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting those predicted (or potential) presences that correspond to absence. A high FDR may be desirable since it could help target future search efforts, whereas zero or low FOR is desirable since it indicates none of the (often valuable) presences have been ignored in the SDM. For illustration, we chose Bradypus variegatus, a species that has previously been published as an exemplar species for MaxEnt, proposed by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0), eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fit within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM, but otherwise performed better. The FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence being in the same range (0.00-0.20) for both true absences and presences. In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR yet visible in FOR, FNR, and the comparison of predicted probability of presence distribution for pres/absence.