66 results for speaker clustering


Relevance:

10.00%

Abstract:

The paper disputes two influential claims in the Romance Linguistics literature. The first is that the synthetic future tenses in spoken Western Romance are now rivalled, if not supplanted, as temporal functors by the more recently developed GO futures. The second is that these synthetic futures now have modal rather than temporal meanings in spoken Romance. These claims are seen as reflecting a universal cycle of diachronic change, in which verb forms originally expressing modal (or aspectual) values take on future temporal reference, becoming tenses. The new modal meanings supplant the temporal, which are then taken up by new forms. Challenges to this theory for French are raised on the basis of empirical evidence of two sorts. Positively, future tenses in spoken Romance continue to be used with temporal meaning. Negatively, evidence of modal meaning for these forms is lacking. The evidence comes from corpora of spoken French, native speaker judgements and verb data from a daily broadsheet. Cumulatively, it points to the reverse of the claims noted above: the synthetic future in spoken French has temporal but little modal meaning.

Relevance:

10.00%

Abstract:

A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.

Relevance:

10.00%

Abstract:

In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance because the rule is either tested on tissue samples that were used in the first instance to select the genes being used in the rule or because the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called .632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.
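
The difference between cross-validation that is internal and external to gene selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the gene ranking (absolute difference in class means), the nearest-centroid rule, the pure-noise data, and the function names `select_genes`, `nearest_centroid_predict` and `cv_error` are all hypothetical.

```python
import random

def select_genes(X, y, k):
    """Rank genes by absolute difference in class means; return the top-k indices."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]

def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class with the closer centroid, using only the chosen genes."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_error(X, y, k, n_folds=10, external=True):
    """Cross-validated error; gene selection is redone per fold only if external=True."""
    n = len(X)
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    # Biased variant: genes are chosen once, using all samples (test folds included).
    genes_all = None if external else select_genes(X, y, k)
    errors = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        genes = select_genes(X_tr, y_tr, k) if external else genes_all
        for i in test_idx:
            errors += nearest_centroid_predict(X_tr, y_tr, X[i], genes) != y[i]
    return errors / n

# Pure-noise data: no gene carries real class information.
rng = random.Random(0)
n_samples, n_genes = 40, 500
X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]
biased = cv_error(X, y, k=5, external=False)
unbiased = cv_error(X, y, k=5, external=True)
print(f"selection outside CV: {biased:.2f}, selection inside CV: {unbiased:.2f}")
```

On data with no real signal, an honest error estimate should sit near chance, so any markedly lower figure from the first variant reflects the selection bias described in the abstract.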

Relevance:

10.00%

Abstract:

Multiple sclerosis and idiopathic dilated cardiomyopathy are two conditions in which an autoimmune process is implicated in the pathogenesis. There is evidence to support clustering of autoimmune diseases in patients with multiple sclerosis and their families. To our knowledge, this is the first report of idiopathic dilated cardiomyopathy occurring in a patient with multiple sclerosis.

Relevance:

10.00%

Abstract:

Contrary to the common pattern of spatial terms being metaphorically extended to location in time, the Australian language Jingulu shows an unusual extension of temporal markers to indicate location in space. Light verbs, which typically encode tense, aspect, mood and associated motion, are occasionally found on nouns to indicate the relative location of the referent with respect to the speaker. It is hypothesised that this pattern resulted from the reduction of verbal clauses used as relative modifiers to the nouns in question.

Relevance:

10.00%

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical values or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is used mainly for predictive modelling. A number of classification algorithms are in practical use. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probabilistic methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).

Relevance:

10.00%

Abstract:

Using the framework of communication accommodation theory, the authors examined the effects of convergence and maintenance on evaluations of Chinese and Australian students. In Study 1, Australian students judged interactions between an Anglo-Australian and another interactant who either maintained his speech style or converged in it. Results indicated that participants were aware of convergence but that speaker ethnicity (Anglo-Australian, Chinese Australian or Chinese national) was a stronger influence on evaluations and future intentions to interact with the speaker. In Study 2, Australian students judged Chinese speakers who maintained communication style or converged on interpersonal speech markers, intergroup markers, or both types of markers. Results indicated that the more participants defined themselves in intergroup terms, the more positively they judged intergroup convergence relative to interpersonal convergence and maintenance. This points to the importance of distinguishing between convergence on interpersonal and intergroup speech markers, and underlines the role of individual differences in the evaluation of convergence.

Relevance:

10.00%

Abstract:

Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997, were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. The alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated that genetic changes after two years have been small, with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions in various States but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus if such a strategy is employed as a means of enhancing RHD as a biological control of the wild rabbit in Australia.
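
To give a feel for the identity figures quoted above, the following sketch computes percent identity between two aligned sequences. It assumes identity is counted over aligned, non-gap positions, which the abstract does not specify; the function name `percent_identity` and the toy sequences are hypothetical. Under that assumption, seven mismatches over 398 positions would give roughly 98.2%.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, non-gap positions at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where either sequence has an alignment gap ('-').
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)

# Toy illustration only: a 398-position stretch with 7 substitutions.
reference = "A" * 398
variant = "A" * 391 + "C" * 7
print(f"{percent_identity(reference, variant):.1f}% identity")
```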

Relevance:

10.00%

Abstract:

Cylindrospermopsis raciborskii is a toxic-bloom-forming cyanobacterium that is commonly found in tropical to subtropical climatic regions worldwide, but it is also recognized as a common component of cyanobacterial communities in temperate climates. Genetic profiles of C. raciborskii were examined in 19 cultured isolates originating from geographically diverse regions of Australia and represented by two distinct morphotypes. A 609-bp region of rpoC1, a DNA-dependent RNA polymerase gene, was amplified by PCR from these isolates with cyanobacterium-specific primers. Sequence analysis revealed that all isolates belonged to the same species, including morphotypes with straight or coiled trichomes. Additional rpoC1 gene sequences obtained for a range of cyanobacteria highlighted clustering of C. raciborskii with other heterocyst-producing cyanobacteria (orders Nostocales and Stigonematales). In contrast, randomly amplified polymorphic DNA and short tandemly repeated repetitive sequence profiles revealed a greater level of genetic heterogeneity among C. raciborskii isolates than did rpoC1 gene analysis, and unique band profiles were also found among each of the cyanobacterial genera examined. A PCR test targeting a region of the rpoC1 gene unique to C. raciborskii was developed for the specific identification of C. raciborskii from both purified genomic DNA and environmental samples. The PCR was evaluated with a number of cyanobacterial isolates, but a PCR-positive result was only achieved with C. raciborskii. This method provides an accurate alternative to traditional morphological identification of C. raciborskii.

Relevance:

10.00%

Abstract:

Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
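
Why t components are more robust than normal ones can be sketched for the simplest case: a single univariate t component with the degrees of freedom held fixed. This is a deliberate simplification and not the paper's algorithm; a full mixture fit would also involve posterior component probabilities and, presumably, estimation of the degrees of freedom. The function name `fit_t_location_scale` and the choice of 3 degrees of freedom are assumptions made for illustration.

```python
def fit_t_location_scale(ys, nu=3.0, n_iter=200):
    """EM-type fit of location and scale for one univariate t component, nu fixed."""
    n = len(ys)
    mu = sum(ys) / n
    sigma2 = sum((y - mu) ** 2 for y in ys) / n
    weights = [1.0] * n
    for _ in range(n_iter):
        # E-step: observations far from mu (in scaled distance) get small weights.
        weights = [(nu + 1.0) / (nu + (y - mu) ** 2 / sigma2) for y in ys]
        # Maximization steps: weighted mean, then weighted scale about the new mean.
        mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        sigma2 = sum(w * (y - mu) ** 2 for w, y in zip(weights, ys)) / n
    return mu, sigma2, weights  # weights are those from the last E-step

# Five well-behaved points plus one atypical observation.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 50.0]
mu_t, sigma2_t, w = fit_t_location_scale(data)
print("sample mean:", sum(data) / len(data))
print("t-based location:", mu_t)
print("weight given to the atypical point:", w[-1])
```

The atypical point receives a much smaller weight than the others, so it pulls the location estimate far less than it pulls the ordinary sample mean.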

Relevance:

10.00%

Abstract:

This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. Analysis of property crime rates in Brisbane, Australia is presented. A surprising finding in this research is that there are substantial inconsistencies in standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach in a geographical information system environment for exploratory spatial data analysis.
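
One simple way to combine attribute similarity with spatial proximity is a weighted sum of two distances, which a clustering routine could then use as its dissimilarity measure. The abstract does not give the paper's actual formulation, so the sketch below is purely hypothetical: the function name `combined_dissimilarity`, the `weight` parameter, the record layout and the numbers are all assumptions.

```python
import math

def combined_dissimilarity(a, b, weight=0.5):
    """Weighted sum of attribute distance and spatial distance between two areas.

    a, b: dicts with 'xy' (coordinates) and 'attrs' (attribute values).
    weight: share given to attribute distance; the remainder goes to spatial distance.
    """
    d_attr = math.dist(a["attrs"], b["attrs"])
    d_space = math.dist(a["xy"], b["xy"])
    return weight * d_attr + (1.0 - weight) * d_space

# Two hypothetical areas with the same crime rate, 5 distance units apart.
area_1 = {"xy": (0.0, 0.0), "attrs": (12.0,)}
area_2 = {"xy": (3.0, 4.0), "attrs": (12.0,)}
print(combined_dissimilarity(area_1, area_2, weight=0.5))
```

In practice the two distances would usually be rescaled to comparable ranges before weighting, since attribute units and map units are otherwise not commensurable.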

Relevance:

10.00%

Abstract:

Examples from the Murray-Darling basin in Australia are used to illustrate different methods of disaggregation of reconnaissance-scale maps. One approach for disaggregation revolves around the de-convolution of the soil-landscape paradigm elaborated during a soil survey. The descriptions of soil map units and block diagrams in a soil survey report detail soil-landscape relationships or soil toposequences that can be used to disaggregate map units into component landscape elements. Toposequences can be visualised on a computer by combining soil maps with digital elevation data. Expert knowledge or statistics can be used to implement the disaggregation. Use of a restructuring element and k-means clustering are illustrated. Another approach to disaggregation uses training areas to develop rules to extrapolate detailed mapping into other, larger areas where detailed mapping is unavailable. A two-level decision tree example is presented. At one level, the decision tree method is used to capture mapping rules from the training area; at another level, it is used to define the domain over which those rules can be extrapolated. (C) 2001 Elsevier Science B.V. All rights reserved.

Relevance:

10.00%

Abstract:

Using data from the H I Parkes All Sky Survey (HIPASS), we have searched for neutral hydrogen in galaxies in a region ~25 × 25 deg² centred on NGC 1399, the nominal centre of the Fornax cluster. Within a velocity search range of 300-3700 km s⁻¹ and to a 3σ lower flux limit of ~40 mJy, 110 galaxies with H I emission were detected, one of which is previously uncatalogued. None of the detections has early-type morphology. Previously unknown velocities for 14 galaxies have been determined, with a further four velocity measurements being significantly dissimilar to published values. Identification of an optical counterpart is relatively unambiguous for more than ~90 per cent of our H I galaxies. The galaxies appear to be embedded in a sheet at the cluster velocity which extends for more than 30° across the search area. At the nominal cluster distance of ~20 Mpc, this corresponds to an elongated structure more than 10 Mpc in extent. A velocity gradient across the structure is detected, with radial velocities increasing by ~500 km s⁻¹ from south-east to north-west. The clustering of galaxies evident in optical surveys is only weakly suggested in the spatial distribution of our H I detections. Of 62 H I detections within a 10° projected radius of the cluster centre, only two are within the core region (projected radius

Relevance:

10.00%

Abstract:

The habit of inducing plant galls has evolved multiple times among insects but most species diversity occurs in only a few groups, such as gall midges and gall wasps. This phylogenetic clustering may reflect adaptive radiations in insect groups in which the trait has evolved. Alternatively, multiple independent origins of galling may suggest a selective advantage to the habit. We use DNA sequence data to examine the origins of galling among the most speciose group of gall-inducing scale insects, the eriococcids. We determine that the galling habit has evolved multiple times, including four times in Australian taxa, suggesting that there has been a selective advantage to galling in Australia. Additionally, although most gall-inducing eriococcid species occur on Myrtaceae, we found that lineages feeding on Myrtaceae are no more likely to have evolved the galling habit than those feeding on other plant groups. However, most gall-inducing species-richness is clustered in only two clades (Apiomorpha and Lachnodius + Opisthoscelis), all of which occur exclusively on Eucalyptus s.s. The Eriococcidae and the large genus Eriococcus were determined to be non-monophyletic and each will require revision. (C) 2004 The Linnean Society of London.

Relevance:

10.00%

Abstract:

Objective: To examine the quality of diabetes care and prevention of cardiovascular disease (CVD) in Australian general practice patients with type 2 diabetes and to investigate its relationship with coronary heart disease absolute risk (CHDAR). Methods: A total of 3286 patient records were extracted from registers of patients with type 2 diabetes held by 16 divisions of general practice (250 practices) across Australia for the year 2002. CHDAR was estimated using the United Kingdom Prospective Diabetes Study algorithm with higher CHDAR set at a 10 year risk of >15%. Multivariate multilevel logistic regression investigated the association between CHDAR and diabetes care. Results: 47.9% of diabetic patient records had glycosylated haemoglobin (HbA1c) >7%, 87.6% had total cholesterol >= 4.0 mmol/l, and 73.8% had blood pressure (BP) >= 130/85 mm Hg. 57.6% of patients were at a higher CHDAR, 76.8% of whom were not on lipid modifying medication and 66.2% were not on antihypertensive medication. After adjusting for clustering at the general practice level and age, lipid modifying medication was negatively related to CHDAR (odds ratio (OR) 0.84) and total cholesterol. Antihypertensive medication was positively related to systolic BP but negatively related to CHDAR (OR 0.88). Referral to ophthalmologists/optometrists and attendance at other health professionals were not related to CHDAR. Conclusions: At the time of the study the diabetes and CVD preventive care in Australian general practice was suboptimal, even after a number of national initiatives. The Australian Pharmaceutical Benefits Scheme (PBS) guidelines need to be modified to improve CVD preventive care in patients with type 2 diabetes.