873 results for agglomerative clustering


Abstract:

The canonical representation of speech constitutes a perfect reconstruction (PR) analysis-synthesis system. Its parameters are the autoregressive (AR) model coefficients, the pitch period, and the voiced and unvoiced components of the excitation represented as transform coefficients. Each set of parameters may be operated on independently. A time-frequency unvoiced excitation (TFUNEX) model is proposed that has high time resolution and selective frequency resolution. An improved time-frequency fit is obtained by clustering pitch-synchronous transform tracks, defined in the modulation transform domain, for antialiasing cancellation. The TFUNEX model delivers high-quality speech while compressing the unvoiced excitation representation about 13 times relative to its raw transform coefficient representation for wideband speech.

Abstract:

We study the spreading of contagious diseases in a population of constant size using susceptible-infective-recovered (SIR) models described in terms of ordinary differential equations (ODEs) and probabilistic cellular automata (PCA). In the PCA model, each individual (represented by a cell in the lattice) is mainly locally connected to others. We investigate how the topological properties of the random network representing contacts among individuals influence the transient behavior and the permanent regime of the epidemiological system described by ODE and PCA. Our main conclusions are: (1) the basic reproduction number (commonly called R(0)) related to disease propagation in a population cannot be uniquely determined from some features of the transient behavior of the infective group; (2) R(0) cannot be associated with a unique combination of clustering coefficient and average shortest path length characterizing the contact network. We discuss how these results can hinder the specification of control strategies for combating disease propagation. (C) 2009 Elsevier B.V. All rights reserved.
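As a rough illustration of the ODE side of such a study, the sketch below integrates the textbook SIR equations with forward Euler steps. It is not the authors' model: the equations without births or deaths, the function names, and the parameter values are all assumptions chosen for illustration, and R(0) is taken as beta/gamma, which holds only for this simple form.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.01, steps=20000):
    """Integrate the basic SIR equations with forward Euler steps.

    dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I,
    with S, I, R expressed as fractions of a constant population.
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def basic_reproduction_number(beta, gamma):
    """R0 of this simple model: beta / gamma."""
    return beta / gamma


# Hypothetical parameter values, chosen only for illustration.
trajectory = simulate_sir(beta=0.5, gamma=0.25, s0=0.99, i0=0.01, r0=0.0)
peak_infective = max(i for _, i, _ in trajectory)
print(basic_reproduction_number(0.5, 0.25), peak_infective)
```

In this toy setting the infective fraction only grows when R(0) times the susceptible fraction exceeds 1. The abstract's point is that, once the contact network is taken into account, transient features like the peak no longer pin down R(0) uniquely.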

Abstract:

This paper presents the design and implementation of an embedded soft sensor, i.e., a generic and autonomous hardware module, which can be applied to many complex plants, wherein a certain variable cannot be directly measured. It is implemented based on a fuzzy identification algorithm called "Limited Rules", employed to model continuous nonlinear processes. The fuzzy model has a Takagi-Sugeno-Kang structure and the premise parameters are defined based on the Fuzzy C-Means (FCM) clustering algorithm. The firmware contains the soft sensor and it runs online, estimating the target variable from other available variables. Tests have been performed using a simulated pH neutralization plant. The results of the embedded soft sensor have been considered satisfactory. A complete embedded inferential control system is also presented, including a soft sensor and a PID controller. (c) 2007, ISA. Published by Elsevier Ltd. All rights reserved.
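For readers unfamiliar with Fuzzy C-Means, a minimal sketch of the standard algorithm on scalar data follows. It is not the paper's firmware or its "Limited Rules" procedure; the function name, the fuzzifier m = 2, the fixed iteration count, and the toy data are assumptions made for illustration.

```python
def fuzzy_c_means_1d(points, centers, m=2.0, iterations=50):
    """Run Fuzzy C-Means on scalar data, starting from the given centers.

    Returns (centers, memberships), where memberships[k][j] is the degree
    to which points[k] belongs to cluster j; each row sums to 1.
    """
    centers = list(centers)
    eps = 1e-12  # avoids division by zero when a point sits on a center
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_kj = 1 / sum_l (d_kj / d_kl)^(2/(m-1))
        memberships = []
        for x in points:
            dists = [abs(x - c) + eps for c in centers]
            row = [1.0 / sum((dj / dl) ** exponent for dl in dists) for dj in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u_kj^m
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            centers[j] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, memberships


# Toy data with two obvious groups; initial centers are a hypothetical choice.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
final_centers, final_memberships = fuzzy_c_means_1d(data, centers=[0.0, 5.2])
print(final_centers)
```

In a Takagi-Sugeno-Kang model, cluster centers of this kind are typically used to place the membership functions of the rule premises. How the paper does so is not stated in the abstract.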

Abstract:

Using the random network generation models from Gustedt (2009)[23], we simulate and analyze several characteristics (such as the number of components, the degree distribution and the clustering coefficient) of the generated networks. This is done for a variety of distributions (fixed value, Bernoulli, Poisson, binomial) that are used to control the parameters of the generation process. These parameters are in particular the size of newly appearing sets of objects, the number of contexts in which new elements appear initially, the number of objects that are shared with 'parent' contexts, and the time period inside which a context may serve as a parent context (aging). The results show that these models allow fine-tuning of the generation process such that the graphs adopt properties that can be found in real-world graphs. (C) 2011 Elsevier B.V. All rights reserved.
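Two of the network characteristics named above, the degree distribution and the clustering coefficient, can be computed as in the sketch below. The adjacency-dictionary representation, the function names, and the toy graph are assumptions for illustration; the generation models of Gustedt (2009) are not reproduced here.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention used here: nodes with fewer than 2 neighbours score 0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


def degree_distribution(adjacency):
    """Map each degree to the number of nodes having that degree."""
    counts = {}
    for neighbours in adjacency.values():
        counts[len(neighbours)] = counts.get(len(neighbours), 0) + 1
    return counts


# Toy undirected graph: a triangle (a, b, c) with a pendant node d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(graph), degree_distribution(graph))
```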

Abstract:

Survival models involving frailties are commonly applied in studies where correlated event time data arise due to natural or artificial clustering. In this paper we present an application of such models in the animal breeding field. Specifically, a mixed survival model with a multivariate correlated frailty term is proposed for the analysis of data from over 3611 Brazilian Nellore cattle. The primary aim is to evaluate parental genetic effects on the trait defined as the number of days that their progeny need to reach a commercially specified standard weight gain. This trait is not measured directly but can be estimated from growth data. Results point to the importance of genetic effects and suggest that these models constitute a valuable data analysis tool for beef cattle breeding.

Abstract:

Target region amplification polymorphism (TRAP) markers were used to estimate the genetic similarity (GS) among 53 sugarcane varieties and five species of the Saccharum complex. Seven fixed primers designed from candidate genes involved in sucrose metabolism and three from those involved in drought response metabolism were used in combination with three arbitrary primers. The clusterings of the genotypes for sucrose metabolism and drought response were similar, but the GS based on Jaccard's coefficient changed. The GS based on polymorphism in sucrose genes, estimated in a set of 46 Brazilian varieties, all of which belong to the three Brazilian breeding programs, ranged from 0.52 to 0.9, and that based on drought data ranged from 0.44 to 0.95. The results suggest that genetic variability in the evaluated genes was lower in the sucrose metabolism genes than in the drought response metabolism ones.
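Jaccard's coefficient for presence/absence marker data is the number of bands shared by two genotypes divided by the number of bands present in at least one of them. The sketch below shows the calculation on two hypothetical band profiles; the function name and the data are illustrative assumptions, not values from the study.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient of two 0/1 band profiles: shared bands / bands present in either."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of markers")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two profiles with no bands at all are treated as having similarity 0 here.
    return shared / either if either else 0.0


# Hypothetical presence (1) / absence (0) scores for six marker bands.
variety_1 = [1, 1, 0, 1, 0, 1]
variety_2 = [1, 0, 0, 1, 1, 1]
print(jaccard_similarity(variety_1, variety_2))  # 3 shared / 5 present in either = 0.6
```

Because joint absences are ignored, the coefficient depends on which markers are scored. That is consistent with the GS values changing between the sucrose and drought primer sets even when the clusterings are similar.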

Abstract:

The rhizosphere constitutes a complex niche that may be exploited by a wide variety of bacteria. Bacterium-plant interactions in this niche can be influenced by factors such as the expression of heterologous genes in the plant. The objective of this work was to describe the bacterial communities associated with the rhizosphere and rhizoplane regions of tobacco plants, and to compare communities from transgenic tobacco lines (CAB1, CAB2 and TRP) with those found in wild-type (WT) plants. Samples were collected at two stages of plant development, the vegetative and flowering stages (1 and 3 months after germination). The diversity of the culturable microbial community was assessed by isolation and further characterization of isolates by amplified ribosomal RNA gene restriction analysis (ARDRA) and 16S rRNA sequencing. These analyses revealed the presence of fairly common rhizosphere organisms with the main groups Alphaproteobacteria, Betaproteobacteria, Actinobacteria and Bacilli. Analysis of the total bacterial communities using PCR-DGGE (denaturing gradient gel electrophoresis) revealed that shifts in bacterial communities occurred during early plant development, but the reestablishment of original community structure was observed over time. The effects were smaller in rhizosphere than in rhizoplane samples, where selection of specific bacterial groups by the different plant lines was demonstrated. Clustering patterns and principal components analysis (PCA) were used to distinguish the plant lines according to the fingerprint of their associated bacterial communities. Bands differentially detected in plant lines were found to be affiliated with the genera Pantoea, Bacillus and Burkholderia in WT, CAB and TRP plants, respectively. 
The data revealed that, although rhizosphere/rhizoplane microbial communities can be affected by the cultivation of transgenic plants, soil resilience may be able to restore the original bacterial diversity after one cycle of plant cultivation.

Abstract:

The bacterial diversity present in sediments of a well-preserved mangrove in Ilha do Cardoso, located in the extreme south of the São Paulo State coastline, Brazil, was assessed using culture-independent molecular approaches (denaturing gradient gel electrophoresis (DGGE) and analysis of 166 sequences from a clone library). The data revealed a bacterial community dominated by Alphaproteobacteria (40.36% of clones), Gammaproteobacteria (19.28% of clones) and Acidobacteria (27.71% of clones), while minor components of the assemblage were affiliated with Betaproteobacteria, Deltaproteobacteria, Firmicutes, Actinobacteria and Bacteroidetes. Clustering and redundancy analysis (RDA) based on DGGE were used to determine factors that modulate the diversity of bacterial communities in mangroves, such as depth, seasonal fluctuations, and locations over a transect area from the sea to the land. Profiles of specific DGGE gels showed that both dominant ('universal' Bacteria and Alphaproteobacteria) and low-density bacterial communities (Betaproteobacteria and Actinobacteria) are responsive to shifts in environmental factors. The location within the mangrove was a determining factor for all fractions of the community studied, whereas season was significant for Bacteria, Alphaproteobacteria, and Betaproteobacteria, and sample depth determined the diversity of Alphaproteobacteria and Actinobacteria.

Abstract:

Rectangular dropshafts, commonly used in sewers and storm water systems, are characterised by significant flow aeration. New detailed air-water flow measurements were conducted in a near-full-scale dropshaft at large discharges. In the shaft pool and outflow channel, the results demonstrated the complexity of different competitive air entrainment mechanisms. Bubble size measurements showed a broad range of entrained bubble sizes. Analysis of streamwise distributions of bubbles further suggested some clustering process in the bubbly flow although, in the outflow channel, bubble chords were on average smaller than in the shaft pool. A robust hydrophone was tested to measure bubble acoustic spectra and to assess its field application potential. The acoustic results accurately characterised the order of magnitude of entrained bubble sizes, but the transformation from acoustic frequencies to bubble radii did not correctly predict the probability distribution functions of bubble sizes.

Abstract:

In an open channel, a hydraulic jump is the rapid transition from super- to sub-critical flow associated with strong turbulence and air bubble entrainment in the mixing layer. New experiments were performed at relatively large Reynolds numbers using phase-detection probes. Some new signal analysis provided characteristic air-water time and length scales of the vortical structures advecting the air bubbles in the developing shear flow. An analysis of the longitudinal air-water flow structure suggested little bubble clustering in the mixing layer, although an interparticle arrival time analysis showed some preferential bubble clustering for small bubbles with chord times below 3 ms. Correlation analyses yielded longitudinal air-water time scales Txx*V1/d1 of about 0.8 on average. The transverse integral length scale Z/d1 of the eddies advecting entrained bubbles was typically between 0.25 and 0.4, irrespective of the inflow conditions within the range of the investigations. Overall the findings highlighted the complicated nature of the air-water flow.

Abstract:

A combination of deductive reasoning, clustering, and inductive learning is given as an example of a hybrid system for exploratory data analysis. Visualization is replaced by a dialogue with the data.

Abstract:

In the context of cancer diagnosis and treatment, we consider the problem of constructing an accurate prediction rule on the basis of a relatively small number of tumor tissue samples of known type containing the expression data on very many (possibly thousands of) genes. Recently, results have been presented in the literature suggesting that it is possible to construct a prediction rule from only a few genes such that it has a negligible prediction error rate. However, in these results the test error or the leave-one-out cross-validated error is calculated without allowance for the selection bias. There is no allowance either because the rule is tested on tissue samples that were used in the first instance to select the genes being used in the rule, or because the cross-validation of the rule is not external to the selection process; that is, gene selection is not performed in training the rule at each stage of the cross-validation process. We describe how in practice the selection bias can be assessed and corrected for by either performing a cross-validation or applying the bootstrap external to the selection process. We recommend using 10-fold rather than leave-one-out cross-validation, and concerning the bootstrap, we suggest using the so-called .632+ bootstrap error estimate designed to handle overfitted prediction rules. Using two published data sets, we demonstrate that when correction is made for the selection bias, the cross-validated error is no longer zero for a subset of only a few genes.
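The difference between internal and external cross-validation can be sketched as follows. This is an illustration only: the gene-ranking rule (absolute difference of class means), the nearest-centroid classifier, the fold assignment, and the synthetic noise data are all assumptions standing in for whatever selection method and prediction rule a given study uses.

```python
import random


def select_genes(samples, labels, k):
    """Rank genes by absolute difference between class means; keep the top k."""
    n_genes = len(samples[0])

    def class_mean(g, cls):
        values = [s[g] for s, y in zip(samples, labels) if y == cls]
        return sum(values) / len(values)

    scores = [abs(class_mean(g, 0) - class_mean(g, 1)) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def nearest_centroid_predict(train, train_labels, genes, sample):
    """Assign the sample to the class whose centroid (on the chosen genes) is closest."""
    best_cls, best_dist = None, float("inf")
    for cls in (0, 1):
        members = [s for s, y in zip(train, train_labels) if y == cls]
        dist = 0.0
        for g in genes:
            centroid = sum(s[g] for s in members) / len(members)
            dist += (sample[g] - centroid) ** 2
        if dist < best_dist:
            best_cls, best_dist = cls, dist
    return best_cls


def cv_error(samples, labels, k, n_folds, external):
    """Cross-validated error rate; `external` controls where gene selection happens."""
    indices = list(range(len(samples)))
    folds = [indices[f::n_folds] for f in range(n_folds)]
    # Biased variant: genes chosen once, using every sample, including those later held out.
    genes_all = select_genes(samples, labels, k)
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [samples[i] for i in indices if i not in held_out]
        train_labels = [labels[i] for i in indices if i not in held_out]
        # External variant: selection is repeated on the training part of each fold.
        genes = select_genes(train, train_labels, k) if external else genes_all
        for i in fold:
            if nearest_centroid_predict(train, train_labels, genes, samples[i]) != labels[i]:
                errors += 1
    return errors / len(samples)


# Synthetic data with no real signal: labels are unrelated to the expression values.
rng = random.Random(0)
labels = [0] * 20 + [1] * 20
samples = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in labels]

biased = cv_error(samples, labels, k=10, n_folds=10, external=False)
honest = cv_error(samples, labels, k=10, n_folds=10, external=True)
print(biased, honest)
```

Since the labels here carry no information, any error rate well below 50% in the first figure reflects selection bias rather than predictive skill. The second figure is the estimate that the external procedure described in the abstract would report.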

Abstract:

Multiple sclerosis and idiopathic dilated cardiomyopathy are two conditions in which an autoimmune process is implicated in the pathogenesis. There is evidence to support clustering of autoimmune diseases in patients with multiple sclerosis and their families. To our knowledge, this is the first report of idiopathic dilated cardiomyopathy occurring in a patient with multiple sclerosis.

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).

Abstract:

Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997, were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. The alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated that genetic changes after two years have been small, with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions in various States, but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus if such a strategy is employed as a means of enhancing RHD as a biological control of the wild rabbit in Australia.