840 resultados para Evolutionary clustering
Resumo:
Numerical optimisation methods are being more commonly applied to agricultural systems models, to identify the most profitable management strategies. The available optimisation algorithms are reviewed and compared, with literature and our studies identifying evolutionary algorithms (including genetic algorithms) as superior in this regard to simulated annealing, tabu search, hill-climbing, and direct-search methods. Results of a complex beef property optimisation, using a real-value genetic algorithm, are presented. The relative contributions of the range of operational options and parameters of this method are discussed, and general recommendations listed to assist practitioners applying evolutionary algorithms to the solution of agricultural systems. (C) 2001 Elsevier Science Ltd. All rights reserved.
Resumo:
In this paper a methodology for integrated multivariate monitoring and control of biological wastewater treatment plants during extreme events is presented. To monitor the process, on-line dynamic principal component analysis (PCA) is performed on the process data to extract the principal components that represent the underlying mechanisms of the process. Fuzzy c-means (FCM) clustering is used to classify the operational state. Performing clustering on scores from PCA solves computational problems as well as increases robustness due to noise attenuation. The class-membership information from FCM is used to derive adequate control set points for the local control loops. The methodology is illustrated by a simulation study of a biological wastewater treatment plant, on which disturbances of various types are imposed. The results show that the methodology can be used to determine and co-ordinate control actions in order to shift the control objective and improve the effluent quality.
Resumo:
Sequences from the tuf gene coding for the elongation factor EF-Tu were amplified and sequenced from the genomic DNA of Pirellula marina and Isosphaera pallida, two species of bacteria within the order Planctomycetales. A near-complete (1140-bp) sequence was obtained from Pi. marina and a partial (759-bp) sequence was obtained for I. pallida. Alignment of the deduced Pi. marina EF-Tu amino acid sequence against reference sequences demonstrated the presence of a unique Il-amino acid sequence motif not present in any other division of the domain Bacteria. Pi. marina shared the highest percentage amino acid sequence identity with I. pallida but showed only a low percentage identity with other members of the domain Bacteria. This is consistent with the concept of the planctomycetes as a unique division of the Bacteria. Neither primary sequence comparison of EF-Tu nor phylogenetic analysis supports any close relationship between planctomycetes and the chlamydiae, which has previously been postulated on the basis of 16S rRNA. Phylogenetic analysis of aligned EF-Tu amino acid sequences performed using distance, maximum-parsimony, and maximum likelihood approaches yielded contradictory results with respect to the position of planctomycetes relative to other bacteria, It is hypothesized that long-branch attraction effects due to unequal evolutionary rates and mutational saturation effects may account for some of the contradictions.
Resumo:
We investigate the size and power properties of the AH test of evolutionary change. This involves examining whether the size results are sensitive to both the number of individual frequencies estimated and the spectral shape adopted under the null hypothesis. The power tests examine whether the test has good power to detect shifts in both spectral position (variance) and spectral shape (autocovariance structure).
Resumo:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
Resumo:
It is becoming increasingly apparent that at least some aspects of the evolution of mate recognition may be amenable to manipulation in evolutionary experiments. Quantitative genetic analyses that focus on the genetic consequences of evolutionary processes that result in mate recognition evolution may eventually provide an understanding of the genetic basis of the process of speciation. We review a series of experiments that have attempted to determine the genetic basis of the response to natural and sexual selection on mate recognition in the Drosophila serrata species complex. The genetic basis of mate recognition has been investigated at three levels: (1) between the species of D. serrata and D. birchii using interspecific hybrids, (2) between populations of D. serrata that are sympatric and allopatric with respect to D. birchii, and (3) within populations of D. serrata. These experiments suggest that it may be possible to use evolutionary experiments to observe important events such as the reinforcement of mate recognition, or the generation of the genetic associations that are central to many sexual selection models.
Resumo:
In microarray studies, the application of clustering techniques is often used to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these. clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based -approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes which is a non-standard problem because the number of genes greatly exceed the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
Resumo:
In order to study the impact of premature birth and low income on mother–infant interaction, four Portuguese samples were gathered: full-term, middle-class (n=99); premature, middle-class (n=63); full-term, low income (n=22); and premature, low income (n=21). Infants were filmed in a free play situation with their mothers, and the results were scored using the CARE Index. By means of multinomial regression analysis, social economic status (SES) was found to be the best predictor of maternal sensitivity and infant cooperative behavior within a set of medical and social factors. Contrary to the expectations of the cumulative risk perspective, two factors of risk (premature birth together with low SES) were as negative for mother–infant interaction as low SES solely. In this study, as previous studies have shown, maternal sensitivity and infant cooperative behavior were highly correlated, as was maternal control with infant compliance. Our results further indicate that, when maternal lack of responsiveness is high, the infant displays passive behavior, whereas when the maternal lack of responsiveness is medium, the infant displays difficult behavior. Indeed, our findings suggest that, in these cases, the link between types of maternal and infant interactive behavior is more dependent on the degree of maternal lack of responsiveness than it is on birth status or SES. The results will be discussed under a developmental and evolutionary reasoning
Resumo:
OBJECTIVE: To estimate the incidence rate of type 1 diabetes in the urban area of Santiago, Chile, from March 21, 1997 to March 20, 1998, and to assess the spatio-temporal clustering of cases during that period. METHODS: All sixty-one incident cases were located temporally (day of diagnosis) and spatially (place of residence) in the area of study. Knox's method was used to assess spatio-temporal clustering of incident cases. RESULTS: The overall incidence rate of type 1 diabetes was 4.11 cases per 100,000 children aged less than 15 years per year (95% confidence interval: 3.06--5.14). The incidence rate seems to have increased since the last estimate of the incidence calculated for the years 1986--1992 in the metropolitan region of Santiago. Different combinations of space-time intervals have been evaluated to assess spatio-temporal clustering. The smallest p-value was found for the combination of critical distances of 750 meters and 60 days (uncorrected p-value = 0.048). CONCLUSIONS: Although these are preliminary results regarding space-time clustering in Santiago, exploratory analysis of the data method would suggest a possible aggregation of incident cases in space-time coordinates.
Resumo:
A definition of medium voltage (MV) load diagrams was made, based on the data base knowledge discovery process. Clustering techniques were used as support for the agents of the electric power retail markets to obtain specific knowledge of their customers’ consumption habits. Each customer class resulting from the clustering operation is represented by its load diagram. The Two-step clustering algorithm and the WEACS approach based on evidence accumulation (EAC) were applied to an electricity consumption data from a utility client’s database in order to form the customer’s classes and to find a set of representative consumption patterns. The WEACS approach is a clustering ensemble combination approach that uses subsampling and that weights differently the partitions in the co-association matrix. As a complementary step to the WEACS approach, all the final data partitions produced by the different variations of the method are combined and the Ward Link algorithm is used to obtain the final data partition. Experiment results showed that WEACS approach led to better accuracy than many other clustering approaches. In this paper the WEACS approach separates better the customer’s population than Two-step clustering algorithm.