61 results for clustering techniques

in University of Queensland eSpace - Australia


Relevance:

70.00%

Publisher:

Abstract:

In microarray studies, clustering techniques are often used to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
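
The point that a model-based approach can help choose the number of clusters can be illustrated with a toy sketch. This is not the EMMIX-GENE algorithm: it assumes one-dimensional data, normal mixture components fitted by a basic EM loop, and the Bayesian information criterion (BIC) as a stand-in selection rule. The function names and the `expression` values are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def gmm_log_likelihood(xs, k, iters=50, var_floor=1e-6):
    """Fit a k-component 1-D normal mixture by EM and return its log-likelihood."""
    xs = sorted(xs)
    n = len(xs)
    # Deterministic start: split the sorted data into k contiguous chunks.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    weights = [len(c) / n for c in chunks]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), var_floor)
                 for c, m in zip(chunks, means)]

    def log_joint(x):
        # log(weight_j * N(x | mean_j, variance_j)) for every component j
        return [math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                for w, m, v in zip(weights, means, variances)]

    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            lj = log_joint(x)
            total = log_sum_exp(lj)
            resp.append([math.exp(v - total) for v in lj])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                var_floor)
    return sum(log_sum_exp(log_joint(x)) for x in xs)

def bic(xs, k):
    """BIC of the fitted mixture; lower is better."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return -2.0 * gmm_log_likelihood(xs, k) + n_params * math.log(len(xs))

# Hypothetical values for one gene across ten tissue samples.
expression = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
best_k = min((1, 2), key=lambda k: bic(expression, k))
print(best_k)
```

Unlike a hierarchical method, the mixture fit yields a likelihood for each candidate number of components, so the candidates can be compared on a common scale.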

Relevance:

60.00%

Publisher:

Abstract:

This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces different solutions to k-means analysis because it allows objects to belong to more than one cluster. Traditional clustering methods generate extensional descriptions of groups, which show which objects are members of each cluster. Clustering techniques based on rough sets theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.
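
The idea of multiple cluster membership can be sketched as follows. This is not the procedure used in the paper: it assumes fixed centroids and a simple distance-ratio rule, in the spirit of rough k-means, where an object joins the upper approximation of every cluster that is nearly as close as its nearest one and joins a lower approximation only when exactly one cluster qualifies. The function name, the `ratio` parameter and the sample points are hypothetical.

```python
import math

def rough_assign(points, centroids, ratio=1.3):
    """Return (lower, upper): for each cluster, the indices of its members.

    lower[j] holds objects that belong to cluster j only;
    upper[j] holds every object that may belong to cluster j.
    """
    lower = [[] for _ in centroids]
    upper = [[] for _ in centroids]
    for i, point in enumerate(points):
        dists = [math.dist(point, c) for c in centroids]
        nearest = min(dists)
        # Clusters whose centroid is almost as close as the nearest one.
        close = [j for j, d in enumerate(dists) if d <= ratio * nearest]
        for j in close:
            upper[j].append(i)
        if len(close) == 1:
            lower[close[0]].append(i)
    return lower, upper

# The third point is equidistant from both centroids, so it is shared.
points = [(0, 0), (10, 0), (5, 0)]
centroids = [(1, 0), (9, 0)]
lower, upper = rough_assign(points, centroids)
```

With `ratio=1.0` and no ties the rule reduces to ordinary nearest-centroid assignment, which is the single-membership behaviour the abstract contrasts with.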

Relevance:

30.00%

Publisher:

Abstract:

Identifying and predicting non-technical losses (NTL) are important tasks for many utilities. Data from a customer information system (CIS) can be used for NTL analysis. However, to perform NTL analysis accurately and efficiently, the original CIS data need to be pre-processed before any detailed analysis can be carried out. In this paper, we propose a feature-selection-based method for CIS data pre-processing in order to extract the most relevant information for further analysis such as clustering and classification. By removing irrelevant and redundant features, feature selection is an essential step in the data mining process: it finds an optimal subset of features that improves the quality of results through faster processing, higher accuracy and simpler results with fewer features. A detailed feature selection analysis is presented in the paper. Both time-domain and load shape data are compared based on the accuracy, consistency and statistical dependencies between features.
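
As a rough illustration of removing irrelevant and redundant features, the sketch below drops near-constant columns and columns that are almost perfectly correlated with one already kept. The abstract does not specify the selection criteria, so the variance and Pearson-correlation thresholds, the function names and the column names are all assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def select_features(columns, min_var=1e-9, max_corr=0.95):
    """columns maps a feature name to its values; return the names kept."""
    kept = []
    for name, values in columns.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        if var <= min_var:
            continue  # (near-)constant column: treated as irrelevant
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # redundant with a feature that was already kept
        kept.append(name)
    return kept

# Hypothetical customer records: 'kwh_x2' duplicates 'kwh', 'flag' is constant.
columns = {
    "kwh": [1, 2, 3, 4],
    "kwh_x2": [2, 4, 6, 8],
    "flag": [1, 1, 1, 1],
    "peak": [3, 1, 4, 1],
}
kept = select_features(columns)
```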

Relevance:

20.00%

Publisher:

Abstract:

A piecewise uniform fitted mesh method turns out to be sufficient for the solution of a surprisingly wide variety of singularly perturbed problems involving steep gradients. The technique is applied to a model of adsorption in bidisperse solids for which two fitted mesh techniques, a fitted mesh finite difference method (FMFDM) and a fitted mesh collocation method (FMCM), are presented. A combination (FMCMD) of FMCM and the DASSL integration package is found to be most effective in solving the problems. Numerical solutions (FMFDM and FMCMD) were found to match the analytical solution when the adsorption isotherm is linear, even under conditions involving steep gradients for which global collocation fails. In particular, FMCMD is highly efficient for macropore diffusion control or micropore diffusion control. These techniques are simple and there is no limit on the range of the parameters. The techniques can be applied to a variety of adsorption and desorption problems in bidisperse solids with non-linear isotherms and for arbitrary particle geometry.
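
A piecewise uniform mesh of this kind can be sketched as below. The abstract does not give the mesh construction, so this assumes a Shishkin-type mesh on [0, 1] with a single steep layer at x = 0 and a transition point `tau = min(1/2, sigma * eps * ln n)`. The function name and the parameters `eps` and `sigma` are hypothetical.

```python
import math

def fitted_mesh(n, eps, sigma=2.0):
    """Return n + 1 mesh points on [0, 1] with half the cells in the layer.

    n     -- number of cells (must be even)
    eps   -- small perturbation parameter controlling the layer width
    sigma -- constant in the transition-point formula (an assumption)
    """
    if n % 2:
        raise ValueError("n must be even")
    tau = min(0.5, sigma * eps * math.log(n))  # where fine and coarse parts meet
    half = n // 2
    # Uniform fine mesh on [0, tau], resolving the steep gradient.
    fine = [tau * i / half for i in range(half + 1)]
    # Uniform coarse mesh on (tau, 1]; written so the last point is exactly 1.
    coarse = [1.0 - (1.0 - tau) * (half - i) / half for i in range(1, half + 1)]
    return fine + coarse

mesh = fitted_mesh(8, 0.01)
```

The mesh stays uniform within each piece, so standard finite difference or collocation formulas apply on each part without modification.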

Relevance:

20.00%

Publisher:

Abstract:

Experimental mechanical sieving methods are applied to samples of shellfish remains from three sites in southeast Queensland, Seven Mile Creek Mound, Sandstone Point and One-Tree, to test the efficacy of various recovery and quantification procedures commonly applied to shellfish assemblages in Australia. There has been considerable debate regarding the most appropriate sieve sizes and quantification methods that should be applied in the recovery of vertebrate faunal remains. Few studies, however, have addressed the impact of recovery and quantification methods on the interpretation of invertebrates, specifically shellfish remains. In this study, five shellfish taxa representing four bivalves (Anadara trapezia, Trichomya hirsutus, Saccostrea glomerata, Donax deltoides) and one gastropod (Pyrazus ebeninus) common in eastern Australian midden assemblages are sieved through 10mm, 6.3mm and 3.15mm mesh. Results are quantified using MNI, NISP and weight. Analyses indicate that different structural properties and pre- and postdepositional factors affect recovery rates. Fragile taxa (T. hirsutus) or those with foliated structure (S. glomerata) tend to be overrepresented by NISP measures in smaller sieve fractions, while more robust taxa (A. trapezia and P. ebeninus) tend to be overrepresented by weight measures. Results demonstrate that for all quantification methods tested a 3mm sieve should be used on all sites to allow for regional comparability and to effectively collect all available information about the shellfish remains.

Relevance:

20.00%

Publisher:

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been put into practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).
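
Of the classifier families listed, the example-based one is the simplest to show in a few lines. The sketch below is a minimal k-nearest-neighbours classifier, assuming Euclidean distance and a plain majority vote; the function name and the training data are hypothetical and not taken from the cited works.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for query from (features, label) training pairs."""
    # Keep the k training examples closest to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote among their labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set.
train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_predict(train, (0.5, 0.5)))
```

No model is fitted in advance: all the work happens at prediction time, which is what distinguishes example-based methods from the tree, probabilistic and neural approaches in the same list.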

Relevance:

20.00%

Publisher:

Abstract:

Recent efforts in the characterization of air-water flow properties have included some clustering process analysis. A cluster of bubbles is defined as a group of two or more bubbles, with a distinct separation from other bubbles before and after the cluster. The present paper compares the results of clustering process analyses in two hydraulic structures: a large-size dropshaft and a hydraulic jump in a rectangular horizontal channel. The comparison highlighted some significant differences in clustering production and structures. Both dropshaft and hydraulic jump flows are complex turbulent shear flows, and some clustering index may provide some measure of the bubble-turbulence interactions and associated energy dissipation.
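
The cluster definition above can be turned into a short detection routine. The abstract does not state how "distinct separation" is measured, so this sketch assumes each bubble is recorded as a (start, end) time pair at a fixed probe and that two successive bubbles belong to the same cluster when the water gap between them is at most `max_gap`. The function name, the threshold and the sample signal are hypothetical.

```python
def find_clusters(bubbles, max_gap):
    """Group time-ordered bubbles into clusters of two or more.

    bubbles -- list of (t_start, t_end) pairs, sorted by time
    max_gap -- largest water gap still counted as "same cluster" (assumed rule)
    Returns a list of clusters, each a list of bubble indices.
    """
    clusters, current = [], []
    for i, (start, _end) in enumerate(bubbles):
        # A large gap since the previous bubble closes the current group.
        if current and start - bubbles[i - 1][1] > max_gap:
            if len(current) >= 2:
                clusters.append(current)
            current = []
        current.append(i)
    if len(current) >= 2:
        clusters.append(current)
    return clusters

# Hypothetical probe signal: bubble 2 travels alone, the others form two clusters.
bubbles = [(0.0, 1.0), (1.2, 2.0), (5.0, 5.5), (9.0, 9.4), (9.5, 10.0), (10.1, 10.3)]
clusters = find_clusters(bubbles, max_gap=0.5)
```

From the returned groups one can derive simple indices such as the number of clusters per unit time or the mean number of bubbles per cluster.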

Relevance:

20.00%

Publisher:

Abstract:

Objective: To improve the success of culturing olfactory neurons from human nasal mucosa by investigating the intranasal distribution of the olfactory epithelium and devising new techniques for growing human olfactory epithelium in vitro. Design: Ninety-seven biopsy specimens were obtained from 33 individuals, aged 21 to 74 years, collected from 6 regions of the nasal cavity. Each biopsy specimen was bisected, and 1 piece was processed for immunohistochemistry or electron microscopy while the other piece was dissected further for explant culture. Four culture techniques were performed, including whole explants and explanted biopsy slices. Five days after plating, neuronal differentiation was induced by means of a medium that contained basic fibroblast growth factor. After another 5 days, cultures were processed for immunocytochemical analysis. Results: The probability of finding olfactory epithelium in a biopsy specimen ranged from 30% to 76%, depending on its location. The dorsoposterior regions of the nasal septum and the superior turbinate provided the highest probability, but, surprisingly, olfactory epithelium was also found anteriorly and ventrally on both septum and turbinates. A new method of culturing the olfactory epithelium was devised. This slice culture technique improved the success rate for generating olfactory neurons from 10% to 90%. Conclusions: This study explains and overcomes most of the variability in the success in observing neurogenesis in cultures of adult human olfactory epithelium. The techniques presented here make the human olfactory epithelium a useful model for clinical research into certain olfactory dysfunctions and a model for the causes of neurodevelopmental and neurodegenerative diseases.

Relevance:

20.00%

Publisher:

Abstract:

The high-affinity receptors for human granulocyte-macrophage colony-stimulating factor (GM-CSF), interleukin-3 (IL-3), and IL-5 are heterodimeric complexes consisting of cytokine-specific alpha subunits and a common signal-transducing beta subunit (h beta c). We have previously demonstrated the oncogenic potential of this group of receptors by identifying constitutively activating point mutations in the extracellular and transmembrane domains of h beta c. We report here a comprehensive screen of the entire h beta c molecule that has led to the identification of additional constitutive point mutations by virtue of their ability to confer factor independence on murine FDC-P1 cells. These mutations were clustered exclusively in a central region of h beta c that encompasses the extracellular membrane-proximal domain, transmembrane domain, and membrane-proximal region of the cytoplasmic domain. Interestingly, most h beta c mutants exhibited cell type-specific constitutive activity, with only two transmembrane domain mutants able to confer factor independence on both murine FDC-P1 and BAF-B03 cells. Examination of the biochemical properties of these mutants in FDC-P1 cells indicated that MAP kinase (ERK1/2), STAT, and JAK2 signaling molecules were constitutively activated. In contrast, only some of the mutant beta subunits were constitutively tyrosine phosphorylated. Taken together, these results highlight key regions involved in h beta c activation, dissociate h beta c tyrosine phosphorylation from MAP kinase and STAT activation, and suggest the involvement of distinct mechanisms by which proliferative signals can be generated by h beta c. (C) 1998 by The American Society of Hematology.

Relevance:

20.00%

Publisher:

Abstract:

Techniques and mechanisms for doping controlled amounts of various cations into pillared clays without causing precipitation or damage to the pillared layered structures are reviewed and discussed. Transition metals of great interest in catalysis can be doped in the micropores of pillared clay in ionic forms by a two-step process. The micropore structures and surface nature of pillared clays are altered by the introduced cations, and this results in a significant improvement in the adsorption properties of the clays. Adsorption of water, air components and organic vapors on cation-doped pillared clays was studied. The effects of the amount and species of cations on the pore structure and adsorption behavior are discussed. It is demonstrated that the presence of doped Ca2+ ions can effectively aid the control of modification of pillared clays with large pore openings. Controlled cation doping is a simple and powerful tool for improving the adsorption properties of pillared clay.

Relevance:

20.00%

Publisher:

Abstract:

New techniques in air-displacement plethysmography seem to have overcome many of the previous problems of poor reproducibility and validity. These have made body-density measurements available to a larger range of individuals, including children, elderly and sick patients who often have difficulties in being submerged underwater in hydrodensitometry systems. The BOD POD air-displacement system (BOD POD body composition system; Life Measurement Instruments, Concord, CA, USA) is more precise than hydrodensitometry, is simple and rapid to operate (approximately 1 min measurements) and the results agree closely with those of hydrodensitometry (e.g. ±3.4% for estimation of body fat). Body line scanners employing the principles of three-dimensional photography are potentially able to measure the surface area and volume of the body and its segments even more rapidly (approximately 10 s), but the validity of the measurements needs to be established. Advances in i.r. spectroscopy and mathematical modelling for calculating the area under the curve have improved precision for measuring enrichment of ²H₂O in studies of water dilution (CV 0.1-0.9% within the range of 400-1000 μl/l) in saliva, plasma and urine. The technique is rapid and compares closely with mass spectrometry (bias 1 (SD 2) %). Advances in bedside bioelectrical-impedance techniques are making possible potential measurements of skinfold thicknesses and limb muscle mass electronically. Preliminary results suggest that the electronic method is more reproducible (intra- and inter-individual reproducibility for measuring skinfold thicknesses) and associated with less bias (+12%) than anthropometry (+40%). In addition to these selected examples, the 'mobility' or transfer of reference methods between centres has made the distinction between reference and bedside or field techniques less distinct than in the past.