849 results for constrained clustering
Abstract:
Multiple sclerosis and idiopathic dilated cardiomyopathy are two conditions in which an autoimmune process is implicated in the pathogenesis. There is evidence to support clustering of autoimmune diseases in patients with multiple sclerosis and their families. To our knowledge, this is the first report of idiopathic dilated cardiomyopathy occurring in a patient with multiple sclerosis.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
Silicic volcanic eruptions are typically accompanied by repetitive Long-Period (LP) seismicity that originates from a small region of the upper conduit. These signals have the capability to advance eruption prediction, since they commonly precede a change in the eruption vigour. Shear bands forming along the conduit wall, where the shear stresses are highest, have been linked to providing the seismic trigger. However, existing computational models are unable to generate shear bands at the depths where the LP signals originate using simple magma strength models. Presented here is a model in which the magma strength is determined from a constitutive relationship dependent upon crystallinity and pressure. This results in a depth-dependent magma strength, analogous to planetary lithospheres. Hence, in shallow highly-crystalline regions a macroscopically discontinuous brittle type of deformation will prevail, whilst in deeper crystal-poor regions there will be a macroscopically continuous plastic deformation mechanism. This will result in a depth where the brittle-ductile transition occurs, and here shear bands disconnected from the free surface may develop. We utilize the Finite Element Method and use axi-symmetric coordinates to model magma flow as a viscoplastic material, simulating quasi-static shear bands along the walls of a volcanic conduit. Model results constrained to the Soufrière Hills Volcano, Montserrat, show the generation of two types of shear bands: upper-conduit shear bands that form between the free surface and a few hundred metres below it, and discrete shear bands that form at the depths where LP seismicity is measured to occur, corresponding to the brittle-ductile transition and the plastic shear region. It is beyond the capability of the model to simulate a seismic event, although the modelled viscosity within the discrete shear bands suggests a failure and healing cycle time that supports the observed LP seismicity repeat times. However, due to the paucity of data and the large parameter space available, these results can only be considered qualitative rather than quantitative at this stage.
Abstract:
The writers measured velocity, pressure and energy distributions, wavelengths, and wave amplitudes along undular jumps in a smooth rectangular channel 0.25 m wide. In each case the upstream flow was a fully developed shear flow. Analysis of the data shows that the jump has strong three-dimensional features and that the aspect ratio of the channel is an important parameter. Energy dissipation on the centerline is far from negligible and is largely constrained to the reach between the start of the lateral shock waves and the first wave crest of the jump, in which the boundary layer develops under a strong adverse pressure gradient. A Boussinesq-type solution of the free-surface profile, velocity, and energy and pressure distributions is developed and compared with the data. Limitations of the two-dimensional analysis are discussed.
Abstract:
It is argued that the common classification of abrasive wear into 'two-body abrasion' and 'three-body abrasion' is seriously flawed. No definitions have been agreed upon for these terms, and indeed there are two quite different interpretations, the implications of which are mutually inconsistent. In the dominant interpretation, the primary thrust of the two-body/three-body concept is to describe whether the abrasive particles are constrained (two-body) or free to roll (three-body). In this view, two-body abrasion is generally much more severe than three-body. The alternative interpretation emphasises the presence (three-body) or absence (two-body) of a rigid counterface backing the abrasive. In this view, three-body abrasion is equated to high-stress (or grinding) abrasion and is generally more severe than two-body (low-stress) abrasion. This paper recommends that the 'two-body/three-body' terminology be abandoned, to be replaced by an alternative classification scheme based directly upon the manifest severity of wear. (C) 1998 Elsevier Science S.A.
Abstract:
Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997, were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. The alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated that genetic changes after two years have been small, with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions in various States, but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus if such a strategy is employed as a means of enhancing RHD as a biological control of the wild rabbit in Australia.
Abstract:
Cylindrospermopsis raciborskii is a toxic-bloom-forming cyanobacterium that is commonly found in tropical to subtropical climatic regions worldwide, but it is also recognized as a common component of cyanobacterial communities in temperate climates. Genetic profiles of C. raciborskii were examined in 19 cultured isolates originating from geographically diverse regions of Australia and represented by two distinct morphotypes. A 609-bp region of rpoC1, a DNA-dependent RNA polymerase gene, was amplified by PCR from these isolates with cyanobacterium-specific primers. Sequence analysis revealed that all isolates belonged to the same species, including morphotypes with straight or coiled trichomes. Additional rpoC1 gene sequences obtained for a range of cyanobacteria highlighted clustering of C. raciborskii with other heterocyst-producing cyanobacteria (orders Nostocales and Stigonematales). In contrast, randomly amplified polymorphic DNA and short tandemly repeated repetitive sequence profiles revealed a greater level of genetic heterogeneity among C. raciborskii isolates than did rpoC1 gene analysis, and unique band profiles were also found among each of the cyanobacterial genera examined. A PCR test targeting a region of the rpoC1 gene unique to C. raciborskii was developed for the specific identification of C. raciborskii from both purified genomic DNA and environmental samples. The PCR was evaluated with a number of cyanobacterial isolates, but a PCR-positive result was only achieved with C. raciborskii. This method provides an accurate alternative to traditional morphological identification of C. raciborskii.
Abstract:
Inhibitors of proteolytic enzymes (proteases) are emerging as prospective treatments for diseases such as AIDS and viral infections, cancers, inflammatory disorders, and Alzheimer's disease. Generic approaches to the design of protease inhibitors are limited by the unpredictability of interactions between, and structural changes to, inhibitor and protease during binding. A computer analysis of superimposed crystal structures for 266 small molecule inhibitors bound to 48 proteases (16 aspartic, 17 serine, 8 cysteine, and 7 metallo) provides the first conclusive proof that inhibitors, including substrate analogues, commonly bind in an extended beta-strand conformation at the active sites of all these proteases. Representative superimposed structures are shown for (a) multiple inhibitors bound to a protease of each class, (b) single inhibitors each bound to multiple proteases, and (c) conformationally constrained inhibitors bound to proteases. Thus inhibitor/substrate conformation, rather than sequence/composition alone, influences protease recognition, and this has profound implications for inhibitor design. This conclusion is supported by NMR, CD, and binding studies for HIV-1 protease inhibitors/ substrates which, when preorganized in an extended conformation, have significantly higher protease affinity. Recognition is dependent upon conformational equilibria since helical and turn peptide conformations are not processed by proteases. Conformational selection explains the resistance of folded/structured regions of proteins to proteolytic degradation, the susceptibility of denatured proteins to processing, and the higher affinity of conformationally constrained 'extended' inhibitors/substrates for proteases. Other approaches to extended inhibitor conformations should similarly lead to high-affinity binding to a protease.
Abstract:
Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
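To illustrate why t components are more robust than normal components, the sketch below computes the observation weight that appears in the standard EM/ECM updates for a t distribution, u = (nu + p) / (nu + delta), where delta is the squared Mahalanobis distance and p the dimension. It is a univariate (p = 1), single-component illustration only, not the full mixture algorithm from the paper; the function names and the sample values are hypothetical.

```python
def t_weight(y, mu, sigma2, nu):
    """Weight given to observation y by a univariate t component.

    delta is the squared Mahalanobis distance; with p = 1 the weight
    is (nu + 1) / (nu + delta), so distant points are downweighted.
    """
    delta = (y - mu) ** 2 / sigma2
    return (nu + 1.0) / (nu + delta)


def weighted_mean_update(ys, mu, sigma2, nu):
    """One location update: a mean of ys weighted by the t weights."""
    weights = [t_weight(y, mu, sigma2, nu) for y in ys]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical data: three typical points and one atypical observation.
sample = [-1.0, 0.0, 1.0, 10.0]
plain_mean = sum(sample) / len(sample)
robust_mean = weighted_mean_update(sample, mu=0.0, sigma2=1.0, nu=4.0)
```

With these values the atypical point at 10.0 receives a far smaller weight than the points near zero, so the updated location stays much closer to the bulk of the data than the ordinary mean does.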
Abstract:
Production of sorghum [Sorghum bicolor (L.) Moench], an important cereal crop in semiarid regions of the world, is often limited by drought. When water is limiting during the grain-filling period, hybrids possessing the stay-green trait maintain more photosynthetically active leaves than hybrids not possessing this trait. To improve yield under drought, knowledge of the extent of genetic variation in green leaf area retention is required. Field studies were undertaken in north-eastern Australia on a cracking and self-mulching gray clay to determine the effects of water regime and hybrid on the components of green leaf area at maturity (GLAM). Nine hybrids varying in stay-green were grown under a fully irrigated control, postflowering water deficit, and terminal (pre- and postflowering) water deficit. Water deficit reduced GLAM by 67% in the terminal drought treatment compared with the fully irrigated control. Under terminal water deficit, hybrids possessing the B35 and KS19 sources of stay-green retained more GLAM (1260 cm² plant⁻¹) compared with intermediate (780 cm² plant⁻¹) and senescent (670 cm² plant⁻¹) hybrids. RQL12 hybrids (KS19 source of stay-green) displayed delayed onset and reduced rate of senescence; A35 hybrids displayed only delayed onset. Visual rating of green leaf retention was highly correlated with measured GLAM, although this procedure is constrained by an inability to distinguish among the functional mechanisms determining the phenotype. Linking functional rather than phenotypic differences to molecular markers may improve the efficiency of selecting for traits such as stay-green.
Abstract:
This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. Analysis of property crime rates in Brisbane, Australia is presented. A surprising finding in this research is that there are substantial inconsistencies in standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach in a geographical information system environment for exploratory spatial data analysis.
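One simple way to combine attribute similarity and spatial proximity in a single clustering criterion is a weighted sum of the two distances. The sketch below assumes that form purely for illustration; the abstract does not state the paper's actual formulation, and the field names `rate` and `xy`, the `weight` parameter and the sample values are all hypothetical.

```python
import math


def combined_dissimilarity(a, b, weight=0.5):
    """Blend attribute difference and spatial distance between two areas.

    weight = 1.0 uses the attribute only; weight = 0.0 uses location only.
    In practice both terms would first be rescaled to comparable ranges.
    """
    attribute_gap = abs(a["rate"] - b["rate"])
    spatial_gap = math.dist(a["xy"], b["xy"])
    return weight * attribute_gap + (1.0 - weight) * spatial_gap


# Hypothetical areal units with a crime rate and centroid coordinates.
area_a = {"rate": 10.0, "xy": (0.0, 0.0)}
area_b = {"rate": 14.0, "xy": (3.0, 4.0)}
blended = combined_dissimilarity(area_a, area_b, weight=0.5)
```

A matrix of such pairwise values could then be passed to any clustering method that accepts precomputed dissimilarities, with `weight` controlling how strongly spatial contiguity shapes the resulting groups.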
Abstract:
Examples from the Murray-Darling basin in Australia are used to illustrate different methods of disaggregation of reconnaissance-scale maps. One approach for disaggregation revolves around the de-convolution of the soil-landscape paradigm elaborated during a soil survey. The descriptions of soil map units and block diagrams in a soil survey report detail soil-landscape relationships or soil toposequences that can be used to disaggregate map units into component landscape elements. Toposequences can be visualised on a computer by combining soil maps with digital elevation data. Expert knowledge or statistics can be used to implement the disaggregation. Use of a restructuring element and k-means clustering are illustrated. Another approach to disaggregation uses training areas to develop rules to extrapolate detailed mapping into other, larger areas where detailed mapping is unavailable. A two-level decision tree example is presented. At one level, the decision tree method is used to capture mapping rules from the training area; at another level, it is used to define the domain over which those rules can be extrapolated. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
Using data from the H I Parkes All Sky Survey (HIPASS), we have searched for neutral hydrogen in galaxies in a region ~25 × 25 deg² centred on NGC 1399, the nominal centre of the Fornax cluster. Within a velocity search range of 300-3700 km s⁻¹ and to a 3σ lower flux limit of ~40 mJy, 110 galaxies with H I emission were detected, one of which is previously uncatalogued. None of the detections has early-type morphology. Previously unknown velocities for 14 galaxies have been determined, with a further four velocity measurements being significantly dissimilar to published values. Identification of an optical counterpart is relatively unambiguous for more than ~90 per cent of our H I galaxies. The galaxies appear to be embedded in a sheet at the cluster velocity which extends for more than 30° across the search area. At the nominal cluster distance of ~20 Mpc, this corresponds to an elongated structure more than 10 Mpc in extent. A velocity gradient across the structure is detected, with radial velocities increasing by ~500 km s⁻¹ from south-east to north-west. The clustering of galaxies evident in optical surveys is only weakly suggested in the spatial distribution of our H I detections. Of 62 H I detections within a 10° projected radius of the cluster centre, only two are within the core region (projected radius
Abstract:
In this paper, a genetic algorithm (GA) is applied to the optimum design of reinforced concrete liquid-retaining structures, which involves three discrete design variables: slab thickness, reinforcement diameter and reinforcement spacing. GA, being a search technique based on the mechanics of natural genetics, couples a Darwinian survival-of-the-fittest principle with a random yet structured information exchange amongst a population of artificial chromosomes. As a first step, a penalty-based strategy is employed to transform the constrained design problem into an unconstrained problem, which is appropriate for GA application. A numerical example is then used to demonstrate the strength and capability of the GA in this problem domain. It is shown that near-optimal solutions are obtained with very fast convergence after exploring only a minute portion of the search space. The method can be extended to even more complex optimization problems in other domains.
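The penalty-based transformation mentioned above can be sketched as follows: constraint violations are added to the objective, so that an unconstrained search such as a GA can rank infeasible designs below feasible ones. This is a generic static-penalty illustration, not the paper's formulation; the toy cost function, the minimum-thickness constraint, the penalty weight and all numerical values are hypothetical.

```python
def penalized(objective, constraints, weight=1000.0):
    """Wrap an objective so that constraint violations add to its value.

    Each constraint g counts as satisfied when g(design) <= 0; any
    positive value is treated as the size of the violation.
    """
    def unconstrained(design):
        violation = sum(max(0.0, g(design)) for g in constraints)
        return objective(design) + weight * violation
    return unconstrained


def toy_cost(design):
    """Hypothetical cost of (thickness, bar diameter, bar spacing) in mm."""
    thickness, bar_diameter, bar_spacing = design
    return thickness + 100.0 * bar_diameter / bar_spacing


def minimum_thickness(design):
    """Hypothetical constraint: thickness of at least 200 mm."""
    return 200.0 - design[0]


fitness = penalized(toy_cost, [minimum_thickness], weight=10.0)
feasible_score = fitness((250, 16, 200))    # no violation, plain cost
infeasible_score = fitness((150, 16, 200))  # 50 mm short, so penalized
```

A GA minimizing `fitness` over the discrete design values would then need no separate feasibility handling, although the choice of `weight` affects how sharply infeasible designs are rejected.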
Abstract:
The development of structure perpendicular to and in the plane of the interface has been studied for mesoporous silicate films self-assembled at the air/water interface. The use of constrained X-ray and neutron specular reflectometry has enabled a detailed study of the structural development perpendicular to the interface during the pre-growth phase. Off-specular neutron reflectometry and grazing incidence X-ray diffraction have enabled the in-plane structure to be probed with excellent time resolution. The growth mechanism under the surfactant to silicate source ratios used in this work is clearly due to the self-assembly of micellar and molecular species at the air/liquid interface, resulting in the formation of a planar mesoporous film that is tens of microns thick. (C) 2003 Elsevier Science B.V. All rights reserved.