976 results for Dataset
Abstract:
This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification systems in real-world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance; however, the robustness of HTPLDA to limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regard to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.
Abstract:
This paper investigates the use of the dimensionality-reduction techniques weighted linear discriminant analysis (WLDA) and weighted median Fisher discriminant analysis (WMFD) before probabilistic linear discriminant analysis (PLDA) modeling for the purpose of improving speaker verification performance in the presence of high inter-session variability. Recently it was shown that WLDA techniques can provide improvement over traditional linear discriminant analysis (LDA) for channel compensation in i-vector based speaker verification systems. We show in this paper that the speaker discriminative information that is available in the distance between pairs of speakers clustered in the development i-vector space can also be exploited in heavy-tailed PLDA modeling by using the weighted discriminant approaches prior to PLDA modeling. Based upon the results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset, we believe that WLDA and WMFD projections before PLDA modeling can provide an improved approach when compared to uncompensated PLDA modeling for i-vector based speaker verification systems.
Abstract:
In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contrast to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state-of-the-art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
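The idea of matching short sub-routes rather than single frames can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-absolute-difference image measure, and the constant-speed assumption (query frame k aligns with reference frame start + k) are all assumptions made for illustration.

```python
def image_difference(a, b):
    """Mean absolute pixel difference between two equal-length tiny images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_sequence_match(query, reference, seq_len):
    """Find the reference window that best matches the last seq_len query frames.

    Assumes (hypothetically) that the route is traversed at the same speed in
    both runs, so the k-th query frame aligns with reference[start + k].
    Returns (start_index, mean_difference_per_frame).
    """
    recent = query[-seq_len:]
    best_start, best_total = None, float("inf")
    for start in range(len(reference) - seq_len + 1):
        total = sum(
            image_difference(frame, reference[start + k])
            for k, frame in enumerate(recent)
        )
        if total < best_total:
            best_start, best_total = start, total
    return best_start, best_total / seq_len


# Hypothetical 4-pixel, 2-bit images (values 0..3) along a reference route.
reference = [
    [0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0],
    [1, 1, 1, 1], [2, 1, 1, 1], [2, 2, 1, 1], [2, 2, 2, 1],
]
# A noisy revisit of reference frames 3..5.
query = [[1, 1, 1, 1], [1, 1, 1, 1], [2, 1, 1, 0]]

start, score = best_sequence_match(query, reference, seq_len=3)
print(start, score)
```

Individual frames in the query are ambiguous (the first two are identical), but the summed difference over the whole sequence singles out one window, which is the intuition behind using coherent sequences when each image carries very little information.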
Abstract:
The popularity of Bayesian Network modelling of complex domains using expert elicitation has raised questions of how one might validate such a model given that no objective dataset exists for the model. Past attempts at delineating a set of tests for establishing confidence in an entirely expert-elicited model have focused on single types of validity stemming from individual sources of uncertainty within the model. This paper seeks to extend the frameworks proposed by earlier researchers by drawing upon other disciplines where measuring latent variables is also an issue. We demonstrate that even in cases where no data exist at all there is a broad range of validity tests that can be used to establish confidence in the validity of a Bayesian Belief Network.
Abstract:
Background: Kallikrein 15 (KLK15)/Prostinogen is a plausible candidate for prostate cancer susceptibility. Elevated KLK15 expression has been reported in prostate cancer and it has been described as an unfavorable prognostic marker for the disease. Objectives: We performed a comprehensive analysis of association of variants in the KLK15 gene with prostate cancer risk and aggressiveness by genotyping tagSNPs, as well as putative functional SNPs identified by extensive bioinformatics analysis. Methods and Data Sources: Twelve out of 22 SNPs, selected on the basis of linkage disequilibrium pattern, were analyzed in an Australian sample of 1,011 histologically verified prostate cancer cases and 1,405 ethnically matched controls. Replication was sought from two existing genome wide association studies (GWAS): the Cancer Genetic Markers of Susceptibility (CGEMS) project and a UK GWAS study. Results: Two KLK15 SNPs, rs2659053 and rs3745522, showed evidence of association (p < 0.05) but were not present on the GWAS platforms. KLK15 SNP rs2659056 was found to be associated with prostate cancer aggressiveness and showed evidence of association in a replication cohort of 5,051 patients from the UK, Australia, and the CGEMS dataset of US samples. A highly significant association with Gleason score was observed when the data was combined from these three studies with an Odds Ratio (OR) of 0.85 (95% CI = 0.77-0.93; p = 2.7 × 10⁻⁴). The rs2659056 SNP is predicted to alter binding of the RORalpha transcription factor, which has a role in the control of cell growth and differentiation and has been suggested to control the metastatic behavior of prostate cancer cells. Conclusions: Our findings suggest a role for KLK15 genetic variation in the etiology of prostate cancer among men of European ancestry, although further studies in very large sample sets are necessary to confirm effect sizes.
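For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following sketch computes both from a 2x2 table using the standard log-odds (Woolf) approximation. The counts are invented for illustration only; the abstract does not give the underlying table, and the study's combined estimate may well have come from a regression model rather than this simple formula.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: carriers among cases      b: non-carriers among cases
    c: carriers among controls   d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lower, upper = odds_ratio_ci(a=10, b=20, c=20, d=10)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that lies entirely below 1, as reported for rs2659056, indicates that the variant is associated with lower odds of the outcome at the chosen confidence level.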
Abstract:
KLK15 over-expression is reported to be a significant predictor of reduced progression-free survival and overall survival in ovarian cancer. Our aim was to analyse the KLK15 gene for putative functional single nucleotide polymorphisms (SNPs) and assess the association of these and KLK15 HapMap tag SNPs with ovarian cancer survival. Results: In silico analysis was performed to identify KLK15 regulatory elements and to classify potentially functional SNPs in these regions. After SNP validation and identification by DNA sequencing of ovarian cancer cell lines and aggressive ovarian cancer patients, 9 SNPs were shortlisted and genotyped using the Sequenom iPLEX Mass Array platform in a cohort of Australian ovarian cancer patients (N = 319). In the Australian dataset we observed significantly worse survival for the KLK15 rs266851 SNP in a dominant model (Hazard Ratio (HR) 1.42, 95% CI 1.02-1.96). This association was observed in the same direction in two independent datasets, with a combined HR for the three studies of 1.16 (1.00-1.34). This SNP lies 15bp downstream of a novel exon and is predicted to be involved in mRNA splicing. The mutant allele is also predicted to abrogate an HSF-2 binding site. Conclusions: We provide evidence of association for the SNP rs266851 with ovarian cancer survival. Our results provide the impetus for downstream functional assays and additional independent validation studies to assess the role of KLK15 regulatory SNPs and KLK15 isoforms with alternative intracellular functional roles in ovarian cancer survival.
Abstract:
Background: Cohort studies can provide valuable evidence of cause and effect relationships but are subject to loss of participants over time, limiting the validity of findings. Computerised record linkage offers a passive and ongoing method of obtaining health outcomes from existing routinely collected data sources. However, the quality of record linkage is reliant upon the availability and accuracy of common identifying variables. We sought to develop and validate a method for linking a cohort study to a state-wide hospital admissions dataset with limited availability of unique identifying variables. Methods: A sample of 2000 participants from a cohort study (n = 41 514) was linked to a state-wide hospitalisations dataset in Victoria, Australia using the national health insurance (Medicare) number and demographic data as identifying variables. Availability of the health insurance number was limited in both datasets; therefore linkage was undertaken both with and without use of this number and agreement tested between both algorithms. Sensitivity was calculated for a sub-sample of 101 participants with a hospital admission confirmed by medical record review. Results: Of the 2000 study participants, 85% were found to have a record in the hospitalisations dataset when the national health insurance number and sex were used as linkage variables and 92% when demographic details only were used. When agreement between the two methods was tested the disagreement fraction was 9%, mainly due to "false positive" links when demographic details only were used. A final algorithm that used multiple combinations of identifying variables resulted in a match proportion of 87%. Sensitivity of this final linkage was 95%. Conclusions: High quality record linkage of cohort data with a hospitalisations dataset that has limited identifiers can be achieved using combinations of a national health insurance number and demographic data as identifying variables.
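A linkage algorithm that tries several combinations of identifying variables in turn might look like the sketch below. It is only an illustration under stated assumptions: the field names (`medicare`, `sex`, `dob`, `postcode`), the order of the passes, and the rule of accepting only unambiguous matches are hypothetical and are not described in the abstract.

```python
def link_records(cohort, hospital, key_sets):
    """Deterministic multi-pass linkage (illustrative, not the study's algorithm).

    Each pass uses one combination of identifying variables. A cohort member
    is linked only if exactly one hospital patient shares all values in that
    combination; members linked in an earlier pass are not revisited.
    """
    links = {}
    for keys in key_sets:
        # Index hospital patients by the current key combination.
        index = {}
        for rec in hospital:
            values = tuple(rec.get(k) for k in keys)
            if None not in values:
                index.setdefault(values, set()).add(rec["hospital_id"])
        for person in cohort:
            if person["cohort_id"] in links:
                continue
            values = tuple(person.get(k) for k in keys)
            if None in values:
                continue  # identifier missing, leave for a later pass
            candidates = index.get(values, set())
            if len(candidates) == 1:
                links[person["cohort_id"]] = next(iter(candidates))
    return links


# Hypothetical records; the Medicare number is missing for some people.
cohort = [
    {"cohort_id": 1, "medicare": "111", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"cohort_id": 2, "medicare": None, "sex": "M", "dob": "1948-05-12", "postcode": "3121"},
    {"cohort_id": 3, "medicare": "333", "sex": "M", "dob": "1960-07-30", "postcode": "3052"},
]
hospital = [
    {"hospital_id": "H1", "medicare": "111", "sex": "F", "dob": "1950-01-01", "postcode": "3000"},
    {"hospital_id": "H2", "medicare": None, "sex": "M", "dob": "1948-05-12", "postcode": "3121"},
]
key_sets = [("medicare", "sex"), ("sex", "dob", "postcode")]

links = link_records(cohort, hospital, key_sets)
match_proportion = len(links) / len(cohort)

# Sensitivity: share of people with a confirmed admission who were linked.
confirmed_admission = {1, 2}
sensitivity = len(confirmed_admission & links.keys()) / len(confirmed_admission)
print(links, match_proportion, sensitivity)
```

Placing the stricter key combination first reflects the abstract's observation that demographic-only matching produced more "false positive" links; the looser pass is used only for people the stricter pass could not resolve.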
Abstract:
The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.
Abstract:
Traditional recommendation methods offer users inanimate items, and the recommendation runs one way only. Emerging applications such as online dating or job recruitment require reciprocal people-to-people recommendations, in which the recommended party is animate and the recommendation must work in both directions. In this paper, we propose a reciprocal collaborative method based on the concepts of users' similarities and common neighbors. The dataset employed for the experiment is gathered from a real life online dating network. The proposed method is compared with baseline methods that use traditional collaborative algorithms. Results show the proposed method can achieve noticeably better performance than the baseline methods.
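A rough sketch of what a reciprocal, neighbor-based score could look like is given below. The abstract does not specify the scoring formula, so everything here is an assumption: Jaccard similarity over the sets of people each user has contacted, a one-way score that sums the similarity of neighbors who contacted the candidate, and a harmonic mean to combine the two directions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def one_way_score(user, candidate, likes):
    """Inferred interest of `user` in `candidate`.

    Sums the similarity between `user` and every other user who has
    contacted `candidate` (a simple neighbor-based collaborative score).
    """
    return sum(
        jaccard(likes[user], likes[other])
        for other in likes
        if other != user and candidate in likes[other]
    )


def reciprocal_score(user, candidate, likes):
    """Harmonic mean of both one-way scores; zero if either side is zero."""
    forward = one_way_score(user, candidate, likes)
    backward = one_way_score(candidate, user, likes)
    if forward == 0 or backward == 0:
        return 0.0
    return 2 * forward * backward / (forward + backward)


# Hypothetical contact data: each user maps to the set of people contacted.
likes = {
    "a1": {"b1", "b2"},
    "a2": {"b1", "b2", "b3"},
    "a3": {"b3"},
    "b1": {"a1", "a2"},
    "b2": {"a1"},
    "b3": {"a1", "a2"},
}

print(reciprocal_score("a1", "b3", likes))
print(reciprocal_score("a3", "b1", likes))
```

The harmonic mean is used here because it stays low unless both parties are likely to be interested, which is the property that distinguishes a two-way recommendation from a conventional one-way score.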
Abstract:
Background: China has one of the highest suicide rates in the world; however, the recent trends in suicide have not been adequately studied. This study aimed to examine the potential changes in the rates and characteristics in a Chinese population. Methods: Data on suicide deaths in 1991–2010 were extracted from the Shandong Disease Surveillance Point (DSP) mortality dataset based on ICD-10 codes. The temporal trend in age-adjusted suicide rates for each subpopulation was tested using log-linear Poisson regression analysis. Results: From 1991 to 2010, there was a marked decrease in the overall suicide rate in Shandong, with an average reduction of 8% per year. The decreasing trend was stronger in rural than in urban areas and more evident in females than in males. Similar decreases were observed for all age groups. Pesticide ingestion and hanging remained the top two methods for suicide. Limitations: There are likely quality concerns in the mortality data, such as underreporting and misclassification, as well as low accuracy in determining the underlying causes of deaths. The representativeness of the DSP system may also be problematic due to the rapid changes in economy and demography. Conclusions: Completed suicides in Shandong have sharply declined over the past 20 years. Higher rates in females versus males and in rural versus urban areas, which were previously considered to be distinguishing features of suicide in China, are becoming less pronounced.
Abstract:
Siamese mud carp (Henichorynchus siamensis) is a freshwater teleost of high economic importance in the Mekong River Basin. However, genetic data relevant for delineating wild stocks for management purposes currently are limited for this species. Here, we used 454 pyrosequencing to generate a partial genome survey sequence (GSS) dataset to develop simple sequence repeat (SSR) markers from H. siamensis genomic DNA. Data generated included a total of 65,954 sequence reads with average length of 264 nucleotides, of which 2.79% contain SSR motifs. Based on GSS-BLASTx results, 10.5% of contigs and 8.1% of singletons possessed significant similarity (E value < 10⁻⁵) with the majority matching well to reported fish sequences. KEGG analysis identified several metabolic pathways that provide insights into specific potential roles and functions of sequences involved in molecular processes in H. siamensis. Top protein domains detected included reverse transcriptase and the top putative functional transcript identified was an ORF2-encoded protein. One thousand eight hundred and thirty-seven sequences containing SSR motifs were identified, of which 422 qualified for primer design and eight polymorphic loci have been tested with average observed and expected heterozygosity estimated at 0.75 and 0.83, respectively. Regardless of their relative levels of polymorphism and heterozygosity, microsatellite loci developed here are suitable for further population genetic studies in H. siamensis and may also be applicable to other related taxa.
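Screening reads for SSR motifs can be approximated with a regular expression, as in the sketch below. The abstract does not name the tool or thresholds used, so the motif lengths (2 to 6 bases) and the minimum of five tandem repeats are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for each SSR found in a read.

    Looks for di- to hexanucleotide motifs repeated in tandem at least
    `min_repeats` times. Runs of a single base are skipped. Thresholds
    are illustrative assumptions, not the study's settings.
    """
    # Group 1 is the motif (shortest first); \1{n,} requires further copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical reads, not taken from the dataset.
reads = ["TTG" + "AC" * 6 + "GGT", "A" * 20, "GATTACAGATTACA"]
reads_with_ssr = [r for r in reads if find_ssrs(r)]
print(find_ssrs(reads[0]))
print(len(reads_with_ssr) / len(reads))
```

Applying such a filter across all reads and dividing by the total read count gives the kind of proportion quoted in the abstract (2.79% of reads containing SSR motifs).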
Abstract:
Members of the Calliphoridae (blowflies) are significant for medical and veterinary management, due to the ability of some species to consume living flesh as larvae, and for forensic investigations due to the ability of others to develop in corpses. Due to the difficulty of accurately identifying larval blowflies to species, there is a need for DNA-based diagnostics for this family; however, the widely used DNA-barcoding marker, cox1, has been shown to fail for several groups within this family. Additionally, many phylogenetic relationships within the Calliphoridae are still unresolved, particularly deeper level relationships. Sequencing whole mt genomes has been demonstrated both as an effective method for identifying the most informative diagnostic markers and for resolving phylogenetic relationships. Twenty-seven complete, or nearly so, mt genomes were sequenced representing 13 species, seven genera and four calliphorid subfamilies and a member of the related family Tachinidae. PCR and sequencing primers developed for sequencing one calliphorid species could be reused to sequence related species within the same superfamily with success rates ranging from 61% to 100%, demonstrating the speed and efficiency with which an mt genome dataset can be assembled. Comparison of molecular divergences for each of the 13 protein-coding genes and 2 ribosomal RNA genes, at a range of taxonomic scales, identified novel targets for developing as diagnostic markers which were 117–200% more variable than the markers which have been used previously in calliphorids. Phylogenetic analysis of whole mt genome sequences resulted in much stronger support for family and subfamily-level relationships. The Calliphoridae are polyphyletic, with the Polleninae more closely related to the Tachinidae, and the Sarcophagidae are the sister group of the remaining calliphorids. 
Within the Calliphoridae, there was strong support for the monophyly of the Chrysomyinae and Luciliinae and for the sister-grouping of Luciliinae with Calliphorinae. Relationships within Chrysomya were not well resolved. Whole mt genome data supported the previously demonstrated paraphyly of Lucilia cuprina with respect to L. sericata and allowed us to conclude that it is due to hybrid introgression prior to the last common ancestor of modern sericata populations, rather than due to recent hybridisation, nuclear pseudogenes or incomplete lineage sorting.
Abstract:
Background: Lower extremity amputation results in significant global morbidity and mortality. Australia appears to have a paucity of studies investigating lower extremity amputation. The primary aim of this retrospective study was to investigate key conditions associated with lower extremity amputations in an Australian population. Secondary objectives were to determine the influence of age and sex on lower extremity amputations, and the reliability of hospital coded amputations. Methods: Lower extremity amputation cases performed at the Princess Alexandra Hospital (Brisbane, Australia) between July 2006 and June 2007 were identified through the relevant hospital discharge dataset (n = 197). All eligible clinical records were interrogated for age, sex, key condition associated with amputation, amputation site, first ever amputation status and the accuracy of the original hospital coding. Exclusion criteria included records unavailable for audit and cases where the key condition was unable to be determined. Chi-squared, t-tests, ANOVA and post hoc tests were used to determine differences between groups. Kappa statistics were used to measure reliability between coded and audited amputations. A minimum significance level of p < 0.05 was used throughout. Results: One hundred and eighty-six cases were eligible and audited. Overall 69% were male, 56% were first amputations, 54% were major amputations, and mean age was 62 ± 16 years. Key conditions associated included type 2 diabetes (53%), peripheral arterial disease (non-diabetes) (18%), trauma (8%), type 1 diabetes (7%) and malignant tumours (5%). Differences in ages at amputation were associated with trauma 36 ± 10 years, type 1 diabetes 52 ± 12 years and type 2 diabetes 67 ± 10 years (p < 0.01). Reliability of original hospital coding was high with Kappa values over 0.8 for all variables. 
Conclusions: This study, the first in over 20 years to report on all levels of lower extremity amputations in Australia, found that people undergoing amputation are more likely to be older, male and have diabetes. It is recommended that large prospective studies are implemented and national lower extremity amputation rates are established to address the large preventable burden of lower extremity amputation in Australia.
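The Kappa statistic used to compare coded and audited amputations can be computed as in the following sketch, which implements the standard Cohen's kappa for two sets of categorical labels. The example labels are invented; the study's records are not reproduced here.

```python
from collections import Counter


def cohens_kappa(coded, audited):
    """Cohen's kappa: agreement between two label lists beyond chance."""
    if len(coded) != len(audited) or not coded:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(coded)
    # Observed proportion of records on which both sources agree.
    observed = sum(c == a for c, a in zip(coded, audited)) / n
    # Agreement expected by chance from the marginal label frequencies.
    coded_counts = Counter(coded)
    audited_counts = Counter(audited)
    expected = sum(coded_counts[k] * audited_counts[k] for k in coded_counts) / n ** 2
    if expected == 1:
        return 1.0  # both sources use a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical amputation-level labels: hospital coding vs. clinical audit.
coded = ["major"] * 5 + ["minor"] * 5
audited = ["major"] * 4 + ["minor"] * 6

kappa = cohens_kappa(coded, audited)
print(round(kappa, 2))
```

In this invented example nine of ten records agree, yet kappa is lower than 0.9 because half of that agreement would be expected by chance; this correction is why kappa is preferred over raw percentage agreement for coding audits.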
Abstract:
Background: Hospitalisation for ambulatory care sensitive conditions (ACSHs) has become a recognised tool to measure access to primary care. Timely and effective outpatient care is highly relevant to refugee populations given the past exposure to torture and trauma, and poor access to adequate health care in their countries of origin and during flight. Little is known about ACSHs among resettled refugee populations. With the aim of examining the hypothesis that people from refugee backgrounds have higher ACSHs than people born in the country of hospitalisation, this study analysed a six-year state-wide hospital discharge dataset to estimate ACSH rates for residents born in refugee-source countries and compared them with the Australia-born population. Methods: Hospital discharge data between 1 July 1998 and 30 June 2004 from the Victorian Admitted Episodes Dataset were used to assess ACSH rates among residents born in eight refugee-source countries, and compare them with the Australia-born average. Rate ratios and 95% confidence intervals were used to illustrate these comparisons. Four categories of ambulatory care sensitive conditions were measured: total, acute, chronic and vaccine-preventable. Country of birth was used as a proxy indicator of refugee status. Results: When compared with the Australia-born population, hospitalisations for total and acute ambulatory care sensitive conditions were lower among refugee-born persons over the six-year period. Chronic and vaccine-preventable ACSHs were largely similar between the two population groups. Conclusion: Contrary to our hypothesis, preventable hospitalisation rates among people born in refugee-source countries were no higher than Australia-born population averages. 
More research is needed to elucidate whether low rates of preventable hospitalisation indicate better health status, appropriate health habits, timely and effective care-seeking behaviour and outpatient care, or overall low levels of health care-seeking due to other more pressing needs during the initial period of resettlement. It is important to unpack dimensions of health status and health care access in refugee populations through ad-hoc surveys as the refugee population is not a homogenous group despite sharing a common experience of forced displacement and violence-related trauma.
Abstract:
Objective: To investigate whether hospital utilisation and health outcomes in Victoria differ between people born in refugee-source countries and those born in Australia. Design and setting: Analysis of a statewide hospital discharge dataset for the 6 financial years from 1 July 1998 to 30 June 2004. Hospital admissions of people born in eight countries for which the majority of entrants to Australia arrived as refugees were included in the analysis. Main outcome measures: Age-standardised rates and rate ratios for: total hospital admissions; emergency admissions; surgical admissions; total days in hospital; discharge at own risk; hospital deaths; admissions due to infectious and parasitic diseases; and admissions due to mental and behavioural disorders. Results: In 2003–04, compared with the Australia-born Victorian population, people born in refugee-source countries had lower rates of surgical admission (rate ratio [RR], 0.85; 95% CI, 0.81–0.88), total days in hospital (RR, 0.74; 95% CI, 0.73–0.75), and admission due to mental and behavioural disorders (RR, 0.70; 95% CI, 0.65–0.76). Over the 6-year period, rates of total days in hospital and rates of admission due to mental and behavioural disorders for people born in refugee-source countries increased towards Australian-born averages, while rates of total admissions, emergency admissions, and admissions due to infectious and parasitic diseases increased above the Australian-born averages. Conclusions: Use of hospital services among people born in refugee-source countries is not higher than that of the Australian-born population and shows a trend towards Australian-born averages. Our findings indicate that the Refugee and Humanitarian Program does not currently place a burden on the Australian hospital system.
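The age-standardised rates and rate ratios named as outcome measures can be illustrated with the sketch below. It assumes direct standardisation against a reference population and a simple log-based interval for a ratio of two crude rates; the abstract does not state which standard population or interval method was used, and all figures in the example are invented.

```python
import math


def age_standardised_rate(events, person_years, standard_pop):
    """Directly age-standardised rate.

    Weights each age-specific rate (events / person_years) by the share
    of that age group in a chosen standard population.
    """
    total = sum(standard_pop)
    return sum(
        e / py * weight
        for e, py, weight in zip(events, person_years, standard_pop)
    ) / total


def rate_ratio_ci(events_a, py_a, events_b, py_b, z=1.96):
    """Ratio of two crude rates with an approximate 95% interval.

    Uses the usual log-scale standard error sqrt(1/events_a + 1/events_b).
    """
    ratio = (events_a / py_a) / (events_b / py_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


# Hypothetical data for two age groups (younger, older).
asr = age_standardised_rate(
    events=[10, 30], person_years=[1000, 1000], standard_pop=[600, 400]
)
rr, rr_low, rr_high = rate_ratio_ci(events_a=50, py_a=1000, events_b=100, py_b=1000)
print(asr, rr, rr_low, rr_high)
```

Standardising before comparing matters here because populations born in refugee-source countries are likely to have a different age structure from the Australia-born population, and hospital use varies strongly with age.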