121 results for audio data classification


Relevance:

30.00% 30.00%

Publisher:

Abstract:

• Introduction: Concern and action for rural road safety are relatively new in Australia in comparison to the field of traffic safety as a whole. In 2003, a program of research was begun by the Centre for Accident Research and Road Safety - Queensland (CARRS-Q) and the Rural Health Research Unit (RHRU) at James Cook University to investigate factors contributing to serious rural road crashes in the North Queensland region. This project was funded by the Premier’s Department, Main Roads Department, Queensland Transport, QFleet, Queensland Rail, Queensland Ambulance Service, Department of Natural Resources and Queensland Police Service. Additional funding was provided by NRMA Insurance for a PhD scholarship. In-kind support was provided through the four hospitals used for data collection, namely Cairns Base Hospital, The Townsville Hospital, Mount Isa Hospital and Atherton Hospital.
The primary aim of the project was to identify human factors related to the occurrence of serious traffic incidents in rural and remote areas of Australia, and to the trauma suffered by persons as a result of these incidents, using a sample drawn from a rural and remote area in North Queensland.
The data and analyses presented in this report are the core findings from two broad studies: a general examination of fatalities and casualties from rural and remote crashes for the period 1 March 2004 until 30 June 2007, and a further linked case-comparison study of hospitalised patients compared with a sample of non-crash-involved drivers.
• Method: The study was undertaken in rural North Queensland, as defined by the Australian Bureau of Statistics (ABS) statistical divisions of North Queensland, Far North Queensland and North-West Queensland. Urban areas surrounding Townsville, Thuringowa and Cairns were not included. The study methodology was centred on serious crashes, as defined by a resulting hospitalisation for 24 hours or more and/or a fatality.
Crashes meeting these criteria within the North Queensland region between 1 March 2004 and 30 June 2007 were identified through hospital records, and the casualties involved were interviewed where possible. Additional data was sourced from coroner’s reports, the Queensland Transport road crash database, the Queensland Ambulance Service and the study hospitals in the region.
This report is divided into chapters corresponding to analyses conducted on the collected crash and casualty data.
Chapter 3 presents an overview of all crashes and casualties identified during the study period. Details are presented in regard to the demographics and road user types of casualties; the locations, times, types, and circumstances of crashes; along with the contributing circumstances of crashes.
Chapter 4 presents the results of summary statistics for all casualties for which an interview was able to be conducted. Statistics are presented separately for drivers and riders, passengers, pedestrians and cyclists. Details are also presented separately for drivers and riders crashing in off-road and on-road settings. Results from questionnaire data are presented in relation to demographics; the experience of the crash in narrative form; vehicle characteristics and maintenance; trip characteristics (e.g. purpose and length of journey; periods of fatigue and monotony; distractions from driving task); driving history; alcohol and drug use; medical history; driving attitudes, intentions and behaviour; attitudes to enforcement; and experience of road safety advertising.
Chapter 5 compares the above-listed questionnaire results between on-road crash-involved casualties and interviews conducted in the region with non-crash-involved persons. Direct comparisons as well as age and sex adjusted comparisons are presented.
Chapter 6 presents information on those casualties who were admitted to one of the study hospitals during the study period.
Brief information is given regarding the demographic characteristics of these casualties. Emergency services’ data is used to highlight the characteristics of patient retrieval and transport to and between hospitals. The major injuries resulting from the crashes are presented for each region of the body and analysed by vehicle type, occupant type, seatbelt status, helmet status, alcohol involvement and nature of crash. Estimates are provided of the costs associated with in-hospital treatment and retrieval.
Chapter 7 describes the characteristics of the fatal casualties and the nature and circumstances of the crashes. Demographics, road user types, licence status, crash type and contributing factors for crashes are presented. Coronial data is provided in regard to contributing circumstances (including alcohol, drugs and medical conditions), cause of death, resulting injuries, and restraint and helmet use.
Chapter 8 presents the results of a comparison between casualties’ crash descriptions and police-attributed crash circumstances.
The relative frequencies of contributing circumstances are compared both broadly within the categories of behavioural, environmental, vehicle related, medical and other groupings and specifically for circumstances within these groups.
Chapter 9 reports on the associated research projects which have been undertaken on specific topics related to rural road safety.
Finally, Chapter 10 reports on the conclusions and recommendations made from the program of research.
• Major Recommendations: From the findings of these analyses, a number of major recommendations were made:
+ Male drivers and riders
- Male drivers and riders should continue to be the focus of interventions, given their very high representation among rural and remote road crash fatalities and serious injuries.
- The group of males aged between 30 and 50 years comprised the largest number of casualties and must also be targeted for change if there is to be a meaningful improvement in rural and remote road safety.
+ Motorcyclists
- Single vehicle motorcycle crashes constitute over 80% of serious, on-road rural motorcycle crashes and need particular attention in development of policy and infrastructure.
- The motorcycle safety consultation process currently being undertaken by Queensland Transport (via the "Motorbike Safety in Queensland - Consultation Paper") is strongly endorsed. As part of this process, particular attention needs to be given to initiatives designed to reduce rural and single vehicle motorcycle crashes.
- The safety of off-road riders is a serious problem that falls outside the direct responsibility of either Transport or Health departments.
Responsibility for this issue needs to be attributed to develop appropriate policy, regulations and countermeasures.
+ Road safety for Indigenous people
- Resourcing of The Queensland Aboriginal Peoples and Torres Strait Islander Peoples Driver Licensing Program should be continued and expanded to meet the needs of remote and Indigenous communities with significantly lower licence ownership levels.
- Increased attention needs to focus on the contribution of geographic disadvantage (remoteness) factors to remote and Indigenous road trauma.
+ Road environment
- Speed is the ‘final common pathway’ in determining the severity of rural and remote crashes and rural speed limits should be reduced to 90 km/h for sealed off-highway roads and 80 km/h for all unsealed roads as recommended in the Austroads review and in line with the current Tasmanian government trial.
- The Department of Main Roads should monitor rural crash clusters and where appropriate work with local authorities to conduct relevant audits and take mitigating action.
- The international experts at the workshop reviewed the data and identified the need to focus particular attention on road design management for dangerous curves. They also indicated the need to maximise the use of audio-tactile linemarking (audible lines) and rumble strips to alert drivers to dangerous conditions and behaviours.
+ Trauma costs
- In accordance with Queensland Health priorities, recognition should be given to the substantial financial costs associated with acute management of trauma resulting from serious rural and remote crashes.
- Efforts should be made to develop a comprehensive, regionally specific costing formula for road trauma that incorporates the pre-hospital, hospital and post-hospital phases of care.
This would inform health resource allocation and facilitate the evaluation of interventions.
- The commitment of funds to the development of preventive strategies to reduce rural and remote crashes should take into account the potential cost savings associated with trauma.
- A dedicated study of the rehabilitation needs and associated personal and healthcare costs arising from rural and remote road crashes should be undertaken.
+ Emergency services
- While the study has demonstrated considerable efficiency in the response and retrieval systems of rural and remote North Queensland, relevant Intelligent Transport Systems technologies (such as vehicle alarm systems) to improve crash notification should be both developed and evaluated.
+ Enforcement
- Alcohol and speed enforcement programs should target the period between 2 and 6pm because of the high numbers of crashes in the afternoon period throughout the rural region.
+ Drink driving
- Courtesy buses should be advocated and schemes such as the Skipper project promoted as local drink driving countermeasures in line with the very high levels of community support for these measures identified in the hospital study.
- Programs should be developed to target the high levels of alcohol consumption identified in rural and remote areas and related involvement in crashes.
- Referrals to drink driving rehabilitation programs should be mandated for recidivist offenders.
+ Data requirements
- Rural and remote road crashes should receive the same quality of attention as urban crashes.
As such, it is strongly recommended that increased resources be committed to enable dedicated Forensic Crash Units to investigate rural and remote fatal and serious injury crashes.
- Transport department records of rural and remote crashes should record the crash location using the national ARIA area classifications used by health departments as a means of better identifying rural crashes.
- Rural and remote crashes tend to be unnoticed except in relatively infrequent rural reviews. They should receive the same level of attention and this could be achieved if fatalities and fatal crashes were coded by the ARIA classification system and included in regular crash reporting.
- Health, Transport and Police agencies should collect a common, minimal set of data relating to road crashes and injuries, including presentations to small rural and remote health facilities.
+ Media and community education programmes
- Interventions seeking to highlight the human contribution to crashes should be prioritised. Driver distraction, alcohol and inappropriate speed for the road conditions are key examples of such behaviours.
- Promotion of basic safety behaviours such as the use of seatbelts and helmets should be given a renewed focus.
- Knowledge, attitude and behavioural factors that have been identified for the hospital Brief Intervention Trial should be considered in developing safety campaigns for rural and remote people, for example, challenging the myth of the dangerous ‘other’ or ‘non-local’ driver.
- Special educational initiatives on the issues involved in rural and remote driving should be undertaken, for example, the material used by Main Roads, the Australian Defence Force and local initiatives.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The construction industry has adopted information technology in its processes in terms of computer aided design and drafting, construction documentation and maintenance. The data generated within the construction industry has become increasingly overwhelming. Data mining is a sophisticated data search capability that uses classification algorithms to discover patterns and correlations within a large volume of data. This paper presents the selection and application of data mining techniques on maintenance data of buildings. The results of applying such techniques are presented, and the potential benefits of using those results to identify useful patterns of knowledge and correlations that support decision making for improving the management of the building life cycle are discussed.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The building life cycle process is complex and prone to fragmentation as it moves through its various stages. The number of participants, and the diversity, specialisation and isolation both in space and time of their activities, have dramatically increased over time. The data generated within the construction industry has become increasingly overwhelming. Most currently available computer tools for the building industry have offered productivity improvement in the transmission of graphical drawings and textual specifications, without addressing more fundamental changes in building life cycle management. Facility managers and building owners are primarily concerned with highlighting areas of existing or potential maintenance problems in order to be able to improve the building performance, satisfying occupants and minimising turnover especially the operational cost of maintenance. In doing so, they collect large amounts of data that is stored in the building’s maintenance database. The work described in this paper is targeted at adding value to the design and maintenance of buildings by turning maintenance data into information and knowledge. Data mining technology presents an opportunity to increase significantly the rate at which the volumes of data generated through the maintenance process can be turned into useful information. This can be done using classification algorithms to discover patterns and correlations within a large volume of data. This paper presents how and what data mining techniques can be applied on maintenance data of buildings to identify the impediments to better performance of building assets. It demonstrates what sorts of knowledge can be found in maintenance records. The benefits to the construction industry lie in turning passive data in databases into knowledge that can improve the efficiency of the maintenance process and of future designs that incorporate that maintenance knowledge.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper deals with the problem of using data mining models in a real-world situation where the user cannot provide all the inputs with which the predictive model was built. A learning system framework, Query Based Learning System (QBLS), is developed for improving the performance of predictive models in practice where not all inputs are available for querying the system. An automatic feature selection algorithm called Query Based Feature Selection (QBFS) is developed for selecting features to obtain a balance between the relative minimum subset of features and the relative maximum classification accuracy. Performance of the QBLS system and the QBFS algorithm is successfully demonstrated with a real-world application.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To examine the reliability of work-related activity coding for injury-related hospitalisations in Australia. Method: A random sample of 4373 injury-related hospital separations from 1 July 2002 to 30 June 2004 were obtained from a stratified random sample of 50 hospitals across 4 states in Australia. From this sample, cases were identified as work-related if they contained an ICD-10-AM work-related activity code (U73) allocated by either: (i) the original coder; (ii) an independent auditor, blinded to the original code; or (iii) a research assistant, blinded to both the original and auditor codes, who reviewed narrative text extracted from the medical record. The concordance of activity coding and number of cases identified as work-related using each method were compared. Results: Of the 4373 cases sampled, 318 cases were identified as being work-related using any of the three methods for identification. The original coder identified 217 and the auditor identified 266 work-related cases (68.2% and 83.6% of the total cases identified, respectively). Around 10% of cases were only identified through the text description review. The original coder and auditor agreed on the assignment of work-relatedness for 68.9% of cases. Conclusions and Implications: The current best estimates of the frequency of hospital admissions for occupational injury underestimate the burden by around 32%. This is a substantial underestimate that has major implications for public policy, and highlights the need for further work on improving the quality and completeness of routine, administrative data sources for a more complete identification of work-related injuries.
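The percentages in this abstract can be re-derived from the counts it reports. The short Python sketch below does that arithmetic. The variable and function names are illustrative and not taken from the study, and reading the "around 32%" underestimate as the share of all identified work-related cases that the original coder missed is an assumption.

```python
# Counts reported in the abstract.
total_identified = 318  # work-related cases found by any of the three methods
original_coder = 217    # cases flagged by the original coder
auditor = 266           # cases flagged by the independent auditor


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


coder_share = share(original_coder, total_identified)
auditor_share = share(auditor, total_identified)

# Assumed reading of the "around 32%" figure: cases the original coder missed.
missed_by_coder = round(100 - coder_share, 1)

print(coder_share, auditor_share, missed_by_coder)
```

Under that reading, the original coder's shortfall lands close to the figure quoted in the conclusions.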

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Research has noted a ‘pronounced pattern of increase with increasing remoteness' of death rates in road crashes. However, crash characteristics by remoteness are not commonly or consistently reported, with definitions of rural and urban often relying on proxy representations such as prevailing speed limit. The current paper seeks to evaluate the efficacy of the Accessibility / Remoteness Index of Australia (ARIA+) to identifying trends in road crashes. ARIA+ does not rely on road-specific measures and uses distances to populated centres to attribute a score to an area, which can in turn be grouped into 5 classifications of increasing remoteness. The current paper uses applications of these classifications at the broad level of Australian Bureau of Statistics' Statistical Local Areas, thus avoiding precise crash locating or dedicated mapping software. Analyses used Queensland road crash database details for all 31,346 crashes resulting in a fatality or hospitalisation occurring between 1st July, 2001 and 30th June 2006 inclusive. Results showed that this simplified application of ARIA+ aligned with previous definitions such as speed limit, while also providing further delineation. Differences in crash contributing factors were noted with increasing remoteness such as a greater representation of alcohol and ‘excessive speed for circumstances.' Other factors such as the predominance of younger drivers in crashes differed little by remoteness classification. The results are discussed in terms of the utility of remoteness as a graduated rather than binary (rural/urban) construct and the potential for combining ARIA crash data with census and hospital datasets.
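The grouping of an ARIA+ score into five remoteness classes can be sketched as a simple threshold lookup. The cut-offs and class labels below follow the commonly cited ABS Remoteness Area ranges on the 0 to 15 scale and are an assumption here, since the abstract does not state them. The function name is hypothetical.

```python
from bisect import bisect_left

# Upper bounds of the five classes on the 0-15 ARIA+ scale.
# ASSUMPTION: commonly cited ABS Remoteness Area cut-offs, not from the abstract.
UPPER_BOUNDS = [0.2, 2.4, 5.92, 10.53, 15.0]
LABELS = [
    "Major Cities",
    "Inner Regional",
    "Outer Regional",
    "Remote",
    "Very Remote",
]


def remoteness_class(aria_score: float) -> str:
    """Map an area-level ARIA+ score to one of five remoteness classes."""
    if not 0.0 <= aria_score <= 15.0:
        raise ValueError("ARIA+ scores range from 0 to 15")
    # bisect_left treats each upper bound as inclusive for its class.
    return LABELS[bisect_left(UPPER_BOUNDS, aria_score)]


# Example with made-up scores for three areas.
for score in (0.1, 3.0, 12.0):
    print(score, remoteness_class(score))
```

Because the score is attached to an area rather than to a crash site, each crash inherits the class of the Statistical Local Area in which it was recorded, which is what lets the approach avoid precise crash locating.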

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This report provides an introduction to our analyses of secondary data with respect to violent acts and incidents relating to males living in rural settings in Australia. It clarifies important aspects of our overall approach primarily by concentrating on three elements that required early scoping and resolution. Firstly, a wide and inclusive view of violence which encompasses measures of violent acts and incidents and also data identifying risk taking behaviour and the consequences of violence is outlined and justified. Secondly, the classification used to make comparisons between the city and the bush together with associated caveats is outlined. The third element discussed is in relation to national injury data. Additional commentary resulting from exploration, examination and analyses of secondary data is published online in five subsequent reports in this series.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The inquiry documented in this thesis is located at the nexus of technological innovation and traditional schooling. As we enter the second decade of a new century, few would argue against the increasingly urgent need to integrate digital literacies with traditional academic knowledge. Yet, despite substantial investments from governments and businesses, the adoption and diffusion of contemporary digital tools in formal schooling remain sluggish. To date, research on technology adoption in schools tends to take a deficit perspective of schools and teachers, with the lack of resources and teacher ‘technophobia’ most commonly cited as barriers to digital uptake. Corresponding interventions that focus on increasing funding and upskilling teachers, however, have made little difference to adoption trends in the last decade. Empirical evidence that explicates the cultural and pedagogical complexities of innovation diffusion within long-established conventions of mainstream schooling, particularly from the standpoint of students, is wanting. To address this knowledge gap, this thesis inquires into how students evaluate and account for the constraints and affordances of contemporary digital tools when they engage with them as part of their conventional schooling. It documents the attempted integration of a student-led Web 2.0 learning initiative, known as the Student Media Centre (SMC), into the schooling practices of a long-established, high-performing independent senior boys’ school in urban Australia. The study employed an ‘explanatory’ two-phase research design (Creswell, 2003) that combined complementary quantitative and qualitative methods to achieve both breadth of measurement and richness of characterisation. In the initial quantitative phase, a self-reported questionnaire was administered to the senior school student population to determine adoption trends and predictors of SMC usage (N=481). 
Measurement constructs included individual learning dispositions (learning and performance goals, cognitive playfulness and personal innovativeness), as well as social and technological variables (peer support, perceived usefulness and ease of use). Incremental predictive models of SMC usage were built using Classification and Regression Tree (CART) modelling: (i) individual-level predictors, (ii) individual and social predictors, and (iii) individual, social and technological predictors. Peer support emerged as the best predictor of SMC usage. Other salient predictors included perceived ease of use and usefulness, cognitive playfulness and learning goals. On the whole, an overwhelming proportion of students reported low usage levels, low perceived usefulness and a lack of peer support for engaging with the digital learning initiative. The small minority of frequent users reported having high levels of peer support and robust learning goal orientations, rather than being predominantly driven by performance goals. These findings indicate that tensions around social validation, digital learning and academic performance pressures influence students’ engagement with the Web 2.0 learning initiative. The qualitative phase that followed provided insights into these tensions by shifting the analytics from individual attitudes and behaviours to shared social and cultural reasoning practices that explain students’ engagement with the innovation. Six in-depth focus groups, comprising 60 students with different levels of SMC usage, were conducted, audio-recorded and transcribed. Textual data were analysed using Membership Categorisation Analysis. Students’ accounts converged around a key proposition: the Web 2.0 learning initiative was useful-in-principle but useless-in-practice.
While students endorsed the usefulness of the SMC for enhancing multimodal engagement, extending peer-to-peer networks and acquiring real-world skills, they also called attention to a number of constraints that obfuscated the realisation of these design affordances in practice. These constraints were cast in terms of three binary formulations of social and cultural imperatives at play within the school: (i) ‘cool/uncool’, (ii) ‘dominant staff/compliant student’, and (iii) ‘digital learning/academic performance’. The first formulation foregrounds the social stigma of the SMC among peers and its resultant lack of positive network benefits. The second relates to students’ perception of the school culture as authoritarian and punitive with adverse effects on the very student agency required to drive the innovation. The third points to academic performance pressures in a crowded curriculum with tight timelines. Taken together, findings from both phases of the study provide the following key insights. First, students endorsed the learning affordances of contemporary digital tools such as the SMC for enhancing their current schooling practices. For the majority of students, however, these learning affordances were overshadowed by the performative demands of schooling, both social and academic. The student participants saw engagement with the SMC in-school as distinct from, even oppositional to, the conventional social and academic performance indicators of schooling, namely (i) being ‘cool’ (or at least ‘not uncool’), (ii) being sufficiently ‘compliant’, and (iii) achieving good academic grades. Their reasoned response, therefore, was simply to resist engagement with the digital learning innovation. Second, a small minority of students seemed dispositionally inclined to negotiate the learning affordances and performance constraints of digital learning and traditional schooling more effectively than others.
These students were able to engage more frequently and meaningfully with the SMC in school. Their ability to adapt and traverse seemingly incommensurate social and institutional identities and norms is theorised as cultural agility – a dispositional construct that comprises personal innovativeness, cognitive playfulness and learning goals orientation. The logic then is ‘both and’ rather than ‘either or’ for these individuals with a capacity to accommodate both learning and performance in school, whether in terms of digital engagement and academic excellence, or successful brokerage across multiple social identities and institutional affiliations within the school. In sum, this study takes us beyond the familiar terrain of deficit discourses that tend to blame institutional conservatism, lack of resourcing and teacher resistance for low uptake of digital technologies in schools. It does so by providing an empirical base for the development of a ‘third way’ of theorising technological and pedagogical innovation in schools, one which is more informed by students as critical stakeholders and thus more relevant to the lived culture within the school, and its complex relationship to students’ lives outside of school. It is in this relationship that we find an explanation for how these individuals can, at the one time, be digital kids and analogue students.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To quantify the extent to which alcohol related injuries are adequately identified in hospitalisation data using ICD-10-AM codes indicative of alcohol involvement. Method: A random sample of 4373 injury-related hospital separations from 1 July 2002 to 30 June 2004 were obtained from a stratified random sample of 50 hospitals across 4 states in Australia. From this sample, cases were identified as involving alcohol if they contained an ICD-10-AM diagnosis or external cause code referring to alcohol, or if the text description extracted from the medical records mentioned alcohol involvement. Results: Overall, identification of alcohol involvement using ICD codes detected 38% of the alcohol-related sample, whilst almost 94% of alcohol-related cases were identified through a search of the text extracted from the medical records. The resultant estimate of alcohol involvement in injury-related hospitalisations in this sample was 10%. Emergency department records were the most likely to identify whether the injury was alcohol-related with almost three-quarters of alcohol-related cases mentioning alcohol in the text abstracted from these records. Conclusions and Implications: The current best estimates of the frequency of hospital admissions where alcohol is involved prior to the injury underestimate the burden by around 62%. This is a substantial underestimate that has major implications for public policy, and highlights the need for further work on improving the quality and completeness of routine administrative data sources for identification of alcohol-related injuries.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To examine the sources of coding discrepancy for injury morbidity data and explore the implications of these sources for injury surveillance. Method: An on-site medical record review and recoding study was conducted for 4373 injury-related hospital admissions across Australia. Codes from the original dataset were compared to the recoded data to explore the reliability of coded data and sources of discrepancy. Results: The most common reason for differences in coding overall was assigning the case to a different external cause category, with 8.5% assigned to a different category. Differences in the specificity of codes assigned within a category accounted for 7.8% of coder difference. Differences in intent assignment accounted for 3.7% of the differences in code assignment. Conclusions: In the situation where 8 percent of cases are misclassified by major category, the setting of injury targets on the basis of extent of burden is a somewhat blunt instrument. Monitoring the effect of prevention programs aimed at reducing risk factors is not possible in datasets with this level of misclassification error in injury cause subcategories. Future research is needed to build the evidence base around the quality and utility of the ICD classification system and its application to injury surveillance in the hospital environment.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly-informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more disperse representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mis-matched to the evaluation conditions.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Longitudinal data, where data are repeatedly observed or measured over time or age, provide the foundation for the analysis of processes which evolve over time; models of such processes can be referred to as growth or trajectory models. One of the traditional ways of looking at growth models is to employ either linear or polynomial functional forms to model trajectory shape, and account for variation around an overall mean trend with the inclusion of random effects or individual variation on the functional shape parameters. The identification of distinct subgroups or sub-classes (latent classes) within these trajectory models which are not based on some pre-existing individual classification provides an important methodology with substantive implications. The identification of subgroups or classes has a wide application in the medical arena where responder/non-responder identification based on distinctly differing trajectories delivers further information for clinical processes. This thesis develops Bayesian statistical models and techniques for the identification of subgroups in the analysis of longitudinal data where the number of time intervals is limited. These models are then applied to a single case study which investigates the neuropsychological cognition for early stage breast cancer patients undergoing adjuvant chemotherapy treatment from the Cognition in Breast Cancer Study undertaken by the Wesley Research Institute of Brisbane, Queensland. Alternative formulations to the linear or polynomial approach are taken which use piecewise linear models with a single turning point, change-point or knot at a known time point and latent basis models for the non-linear trajectories found for the verbal memory domain of cognitive function before and after chemotherapy treatment.
Hierarchical Bayesian random eects models are used as a starting point for the latent class modelling process and are extended with the incorporation of covariates in the trajectory profiles and as predictors of class membership. The Bayesian latent basis models enable the degree of recovery post-chemotherapy to be estimated for short and long-term followup occasions, and the distinct class trajectories assist in the identification of breast cancer patients who maybe at risk of long-term verbal memory impairment.
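
As an illustration of the piecewise linear form mentioned in the abstract above, the following is a minimal sketch of a mean trajectory with a single knot at a known time point. The parametrisation (an intercept, a slope before the knot, and a change in slope after it) and all names are assumptions made for illustration; the abstract does not state the thesis's exact formulation, and no Bayesian estimation or latent class structure is shown.

```python
def piecewise_linear(t, intercept, slope_before, slope_change, knot):
    """Mean trajectory with a single knot at a known time point.

    Hypothetical parametrisation (not taken from the thesis):
      value = intercept + slope_before * t              for t <= knot
      the slope becomes slope_before + slope_change     for t >  knot
    """
    return intercept + slope_before * t + slope_change * max(t - knot, 0.0)


# Example: a score that declines until the knot (t = 1) and recovers afterwards.
scores = [piecewise_linear(t, intercept=10.0, slope_before=-2.0,
                           slope_change=3.0, knot=1.0)
          for t in (0.0, 1.0, 3.0)]
```

In a latent class setting, each class would carry its own set of these shape parameters; that extension is not sketched here.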

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A one-sided classifier for a given class of languages converges to 1 on every language from the class and outputs 0 infinitely often on languages outside the class. A two-sided classifier, on the other hand, converges to 1 on languages from the class and converges to 0 on languages outside the class. The present paper investigates one-sided and two-sided classification for classes of recursive languages. Theorems are presented that help assess the classifiability of natural classes. The relationships of classification to inductive learning theory and to structural complexity theory in terms of Turing degrees are studied. Furthermore, the special case of classification from only positive data is also investigated.
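
The two notions defined in the abstract above can be written compactly as follows. The notation is an assumption introduced here for illustration: $\mathcal{C}$ is the class of languages and $M_n(L)$ is the classifier's $n$-th output while reading a presentation of the language $L$.

```latex
% Two-sided classification: the outputs converge on every language.
\lim_{n \to \infty} M_n(L) =
\begin{cases}
  1 & \text{if } L \in \mathcal{C}, \\
  0 & \text{if } L \notin \mathcal{C}.
\end{cases}

% One-sided classification: convergence is required only inside the class.
L \in \mathcal{C} \;\Rightarrow\; \lim_{n \to \infty} M_n(L) = 1,
\qquad
L \notin \mathcal{C} \;\Rightarrow\; M_n(L) = 0 \text{ for infinitely many } n.
```

Every two-sided classifier is therefore also one-sided, since converging to 0 implies outputting 0 infinitely often.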

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Keyword spotting is the task of detecting keywords of interest within continuous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unrestricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have suffered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the field of keyword spotting. The first major contribution is the development of a novel keyword verification method named Cohort Word Verification. This method combines high-level linguistic information with cohort-based verification techniques to obtain dramatic improvements in verification performance, in particular for the problematic short-duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique augments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains significant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple verifier fusion for the task of keyword verification. The reported experiments demonstrate that substantial improvements in verification performance can be obtained through the fusion of multiple keyword verifiers. The research focuses on combinations of speech background model based verifiers and cohort word verifiers.
The final major contribution is a comprehensive study of the effects of limited training data for keyword spotting. This study is performed with consideration as to how these effects impact the immediate development and deployment of speech technologies for non-English languages.
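
The idea of dynamic sequence matching against erroneous lattice realisations can be illustrated with a minimal sketch. It assumes a plain Levenshtein (edit) distance with unit costs between a target phone sequence and one sequence observed in a lattice, and accepts a match when the distance falls below a threshold. The cost function, the threshold, and all names here are hypothetical; the abstract does not specify the matching costs used by Dynamic Match Lattice Spotting.

```python
def edit_distance(target, observed):
    """Levenshtein distance between two phone sequences (unit costs assumed)."""
    prev = list(range(len(observed) + 1))  # distances from the empty target prefix
    for i, t in enumerate(target, start=1):
        curr = [i]  # cost of deleting the first i target phones
        for j, o in enumerate(observed, start=1):
            substitution = 0 if t == o else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + substitution))
        prev = curr
    return prev[-1]


def is_putative_match(target, observed, max_cost=1):
    """Accept an observed sequence when it is within max_cost edits of the target."""
    return edit_distance(target, observed) <= max_cost


# A single substituted phone is still accepted with the default threshold.
hit = is_putative_match(["k", "ae", "t"], ["k", "eh", "t"])
```

Tolerating a small number of edits is what would let a search recover keywords whose lattice realisation contains recognition errors, at the price of more false alarms as the threshold grows.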