535 results for Clustering analysis

in Queensland University of Technology - ePrints Archive


Relevance:

100.00%

Abstract:

Objective: To investigate the epidemic characteristics of human cutaneous anthrax (CA) in China, detect spatiotemporal clusters at the county level for preemptive public health interventions, and evaluate the differences in epidemiological characteristics within and outside clusters. Methods: CA cases reported during 2005–2012 from the national surveillance system were evaluated at the county level using the space-time scan statistic. Comparative analysis of the epidemic characteristics within and outside identified clusters was performed using the χ² test or the Kruskal-Wallis test. Results: The 30–39 years age group had the highest incidence of CA, and the fatality rate increased with age, with persons ≥70 years showing a fatality rate of 4.04%. Seasonality analysis showed that most CA cases occurred between May/June and September/October of each year. The primary spatiotemporal cluster contained 19 counties from June 2006 to May 2010 and mainly straddled the borders of Sichuan, Gansu, and Qinghai provinces. In these high-risk areas, CA cases were predominantly found among younger, local, male shepherds who lived on agriculture and stockbreeding, and were characterized by high morbidity, low mortality, and a shorter period from illness onset to diagnosis. Conclusion: CA was geographically and persistently clustered in southwestern China during 2005–2012, with notable differences in the epidemic characteristics within and outside spatiotemporal clusters; this demonstrates the need for CA interventions such as enhanced surveillance, health education, and mandatory, standardized decontamination or disinfection procedures to be geographically targeted to the areas identified in this study.
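As a rough illustration of how a space-time scan statistic ranks candidate clusters, the sketch below scores each candidate window with a log-likelihood ratio. It assumes the Poisson form of the scan statistic; the abstract does not say which probability model was used. The window labels, case counts, and the function name `poisson_scan_llr` are hypothetical.

```python
import math

def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate window (assumed Poisson model).

    cases_in:    observed cases inside the window
    expected_in: cases expected inside the window under the null hypothesis
    total_cases: all cases in the study region and period
    """
    c, e, n = cases_in, expected_in, total_cases
    # Only windows with more cases than expected count as high-risk clusters.
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the window.
    outside = (n - c) * math.log((n - c) / (n - e)) if n > c else 0.0
    return inside + outside

# Hypothetical candidate windows: (label, observed cases, expected cases).
windows = [("A", 12, 10.0), ("B", 30, 12.0), ("C", 5, 8.0)]
TOTAL_CASES = 200

# The window with the largest ratio is the most likely cluster.
best = max(windows, key=lambda w: poisson_scan_llr(w[1], w[2], TOTAL_CASES))
print(best[0])
```

In practice the significance of the best window is usually assessed by Monte Carlo replication, which this sketch leaves out.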

Relevance:

70.00%

Abstract:

Recent studies suggest that genetic and environmental factors do not account for all schizophrenia risk and that epigenetics also plays a role in disease susceptibility. DNA methylation is a heritable epigenetic modification that can regulate gene expression. Genome-wide DNA methylation analysis was performed on post-mortem human brain tissue from 24 patients with schizophrenia and 24 unaffected controls. DNA methylation was assessed at over 485 000 CpG sites using the Illumina Infinium HumanMethylation450 BeadChip. After adjusting for age and post-mortem interval (PMI), 4 641 probes corresponding to 2 929 unique genes were found to be differentially methylated. Of those genes, 1 291 were located in a CpG island and 817 were in a promoter region. These include NOS1, AKT1, DTNBP1, DNMT1, PPP3CC and SOX10, which have previously been associated with schizophrenia. More than 100 of these genes overlap with a previous DNA methylation study of peripheral blood from schizophrenia patients in which 27 000 CpG sites were analysed. Unsupervised clustering analysis of the top 3 000 most variable probes revealed two distinct groups, with significantly more people with schizophrenia than controls in cluster one (p = 1.74 × 10⁻⁴). The first cluster was composed of 88% patients with schizophrenia and only 12% controls, while the second cluster was composed of 27% patients with schizophrenia and 73% controls. These results strongly suggest that differential DNA methylation is important in schizophrenia etiology and add support for the use of DNA methylation profiles as a future prognostic indicator of schizophrenia.

Relevance:

70.00%

Abstract:

Samples of Forsythia suspensa from raw (Laoqiao) and ripe (Qingqiao) fruit were analyzed using HPLC-DAD and EIS-MS techniques. Seventeen peaks were detected, and of these, twelve were identified. Most were related to the glucopyranoside molecular fragment. Samples collected from three geographical areas (Shanxi, Henan and Shandong Provinces) were discriminated using hierarchical clustering analysis (HCA), discriminant analysis (DA), and principal component analysis (PCA) models, but only PCA was able to provide further information about the relationships between objects and loadings; eight peaks were related to the provinces of sample origin. The supervised classification models, namely K-nearest neighbor (KNN), least squares support vector machines (LS-SVM), and counter propagation artificial neural network (CP-ANN), indicated successful classification, but KNN produced a 100% classification rate. Thus, the fruit were discriminated on the basis of their places of origin.
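To make the K-nearest neighbor step concrete, here is a minimal sketch of KNN classification by majority vote. It is not the authors' implementation: the two-feature "peak area" vectors, the origin labels, and the choice of k = 3 are hypothetical values chosen only for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train: list of (feature_vector, label) pairs
    query: feature vector of the sample to classify
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical peak-area profiles (two peaks per sample), labelled by origin.
train = [
    ((1.0, 0.2), "origin_A"), ((1.1, 0.3), "origin_A"),
    ((0.2, 1.0), "origin_B"), ((0.3, 1.2), "origin_B"),
    ((2.0, 2.1), "origin_C"), ((2.2, 1.9), "origin_C"),
]

predicted = knn_predict(train, (1.05, 0.25), k=3)
print(predicted)
```

Euclidean distance is used here for simplicity; chromatographic data would normally be scaled before computing distances.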

Relevance:

60.00%

Abstract:

This paper investigates the determinants of China’s regional innovation capacity (RIC) and variations in these determinants between different types of regions. Based on the framework of national innovation capacity (NIC) and research on innovation systems, this paper develops a framework of RIC in the Chinese context. Using panel data from 1991 to 2009, clustering analysis is first employed to classify regions according to their innovation development path. Panel data regressions with a fixed-effects model are then conducted to explore the determinants of RIC and how these vary across the different regional clusters. We find that the 30 regions can be clustered into three groups, and that there are considerable differences in the drivers of RIC between these regional groups.
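The fixed-effects idea can be sketched with the standard within estimator: each variable is demeaned per region, which removes region-specific intercepts, and the slope is then fitted on the demeaned data. The single-regressor setup, the toy panel, and the name `within_slope` are assumptions for illustration; the paper's actual specification is not given in the abstract.

```python
from collections import defaultdict

def within_slope(rows):
    """Fixed-effects (within) slope of y on a single regressor x.

    rows: iterable of (region, x, y) observations.
    """
    by_region = defaultdict(list)
    for region, x, y in rows:
        by_region[region].append((x, y))

    num = den = 0.0
    for obs in by_region.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        # Demeaning within each region absorbs that region's fixed effect.
        for x, y in obs:
            num += (x - mean_x) * (y - mean_y)
            den += (x - mean_x) ** 2
    return num / den

# Toy panel: y = 2x plus a region-specific intercept (10 for r1, 3 for r2).
rows = [("r1", 1, 12), ("r1", 2, 14), ("r1", 3, 16),
        ("r2", 1, 5), ("r2", 2, 7), ("r2", 3, 9)]

slope = within_slope(rows)
print(slope)
```

Running the same estimator separately on each cluster of regions is one simple way to compare drivers across groups.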

Relevance:

60.00%

Abstract:

Atherosclerotic cardiovascular disease remains the leading cause of morbidity and mortality in industrialized societies. The lack of metabolite biomarkers has so far impeded the clinical diagnosis of atherosclerosis. In this study, stable atherosclerosis patients (n=16) and age- and sex-matched healthy subjects without atherosclerosis (n=28) were recruited from the local community (Harbin, P. R. China). Plasma was collected from each study subject and subjected to metabolomics analysis by GC/MS. Pattern recognition analyses (principal components analysis, orthogonal partial least-squares discriminant analysis, and hierarchical clustering analysis) consistently showed that the plasma metabolome differed significantly between atherosclerotic and non-atherosclerotic subjects. The atherosclerosis-induced metabolic perturbation of fatty acids, such as palmitate, stearate, and 1-monolinoleoylglycerol, was consistent with a previous publication showing that palmitate contributes significantly to atherosclerosis development by targeting apoptosis and inflammation pathways. Altogether, this study demonstrated that the development of atherosclerosis directly perturbed fatty acid metabolism, especially that of palmitate, which was confirmed as a phenotypic biomarker for the clinical diagnosis of atherosclerosis.

Relevance:

60.00%

Abstract:

Entomological surveillance and control are essential to the management of dengue fever (DF). Hence, understanding the spatial and temporal patterns of DF vectors, Aedes (Stegomyia) aegypti (L.) and Ae. (Stegomyia) albopictus (Skuse), is paramount. In the Philippines, resources are limited and entomological surveillance and control are generally commenced during epidemics, when transmission is difficult to control. Recent improvements in spatial epidemiological tools and methods offer opportunities to explore more efficient DF surveillance and control solutions; however, there are few examples in the literature from resource-poor settings. The objectives of this study were to: (i) explore spatial patterns of Aedes populations and (ii) predict areas of high and low vector density to inform DF control in San Jose village, Muntinlupa city, Philippines. Adult female Aedes mosquitoes were collected fortnightly from 50 double-sticky ovitraps (SOs) located in San Jose village for the period June-November 2011. Spatial clustering analysis was performed to identify high- and low-density clusters of Ae. aegypti and Ae. albopictus mosquitoes. Spatial autocorrelation was assessed by examination of semivariograms, and ordinary kriging was undertaken to create a smoothed surface of predicted vector density in the study area. Our results show that both Ae. aegypti and Ae. albopictus were present in San Jose village during the study period. However, one Aedes species was dominant in a given geographic area at a time, suggesting differing habitat preferences and interspecies competition between vectors. Density maps provide information to direct entomological control activities and support the development of geographically enhanced surveillance and control systems to improve DF management in the Philippines.
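The semivariogram examination mentioned above can be illustrated with the usual empirical estimator, γ(h) = Σ (z_i − z_j)² / (2 N(h)), where the sum runs over the N(h) pairs of sites whose separation falls in lag bin h. The sketch below is a minimal version under assumed inputs: the trap coordinates, counts, and bin width are hypothetical, and the study's actual software and settings are not stated in the abstract.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, bin_width):
    """Empirical semivariance per distance bin.

    points: list of (x, y, value), e.g. trap location and mosquito count
    Returns {bin_index: semivariance}; bin b covers lags in
    [b * bin_width, (b + 1) * bin_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            lag = math.hypot(xi - xj, yi - yj)
            b = int(lag // bin_width)
            sums[b] += (zi - zj) ** 2
            counts[b] += 1
    # Half the mean squared difference within each lag bin.
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}

# Hypothetical traps on a line, one unit apart, with made-up counts.
traps = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0)]
gamma = empirical_semivariogram(traps, bin_width=1.0)
print(gamma)
```

Semivariance that rises with lag, as in this toy data, indicates spatial autocorrelation; a fitted model of that curve is what ordinary kriging then uses to weight neighbouring observations.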

Relevance:

60.00%

Abstract:

This project aimed to identify novel genetic risk variants associated with migraine in the Norfolk Island population. Statistical analysis and bioinformatics approaches such as polygenic modeling and gene clustering methods were carried out to explore genotypic and expression data from high-throughput techniques. This project had a particular focus on hormonal genes and other genetic variants and identified a modest effect size on the migraine phenotype.

Relevance:

60.00%

Abstract:

Road traffic emissions are often considered the main source of ultrafine particles (UFP, diameter smaller than 100 nm) in urban environments. However, recent studies worldwide have shown that, at least in high-insolation urban regions, new particle formation events can also contribute to UFP. In order to quantify such events, we systematically studied three cities located in predominantly sunny environments: Barcelona (Spain), Madrid (Spain) and Brisbane (Australia). Three long-term datasets (1-2 years) of fine and ultrafine particle number size distributions (measured by SMPS, Scanning Mobility Particle Sizer) were analysed. Compared to total particle number concentrations, aerosol size distributions offer far more information on the type, origin and atmospheric evolution of the particles. By applying k-means clustering analysis, we categorized the collected aerosol size distributions into three main categories: “Traffic” (prevailing 44-63% of the time), “Nucleation” (14-19%) and “Background pollution and Specific cases” (7-22%). Measurements from Rome (Italy) and Los Angeles (California) were also included to complement the study. The daily variation of the average UFP concentrations for a typical nucleation day at each site revealed a similar pattern for all cities, with three distinct particle bursts. A morning and an evening spike reflected traffic rush hours, whereas a third one at midday reflected nucleation events. The bursts of photochemically nucleated particles lasted 1-4 hours, reaching sizes of 30-40 nm. On average, particle size spectra dominated by nucleation events occurred 16% of the time, showing the importance of this process as a source of UFP in urban environments exposed to high solar radiation.
On average, nucleation events lasting 2 hours or more occurred on 55% of the days, and events lasting more than 4 hours on 28% of the days, demonstrating that atmospheric conditions in urban environments are not favourable to the growth of photochemically nucleated particles. In summary, although traffic remains the main source of UFP in urban areas, in developed countries with high insolation, urban nucleation events are also a main source of UFP. If traffic-related particle concentrations are reduced in the future, nucleation events will likely increase in urban areas, due to the reduced urban condensation sinks.
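For readers unfamiliar with k-means, the sketch below shows Lloyd's algorithm applied to a few short vectors standing in for particle size distributions. It is only an illustration: the three-bin "spectra" and the starting centroids are hypothetical, and the abstract gives no details of the study's own preprocessing or initialisation.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm with caller-supplied starting centroids.

    Returns (labels, centroids); labels[i] is the cluster index of points[i].
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical normalised spectra: fraction of particles in three size bins.
spectra = [
    (0.9, 0.1, 0.0), (0.8, 0.2, 0.0),
    (0.1, 0.8, 0.1), (0.0, 0.9, 0.1),
    (0.1, 0.1, 0.8), (0.0, 0.2, 0.8),
]
labels, centres = kmeans(spectra, [spectra[0], spectra[2], spectra[4]])
print(labels)
```

Counting how often each label occurs over time gives occurrence percentages per category, which is the kind of summary the abstract reports.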

Relevance:

40.00%

Abstract:

Migraine is a painful disorder for which the etiology remains obscure. Diagnosis is largely based on International Headache Society criteria. However, no feature occurs in all patients who meet these criteria, and no single symptom is required for diagnosis. Consequently, this definition may not accurately reflect the phenotypic heterogeneity or genetic basis of the disorder. Such phenotypic uncertainty is typical for complex genetic disorders and has encouraged interest in multivariate statistical methods for classifying disease phenotypes. We applied three popular statistical phenotyping methods—latent class analysis, grade of membership and grade of membership “fuzzy” clustering (Fanny)—to migraine symptom data, and compared heritability and genome-wide linkage results obtained using each approach. Our results demonstrate that different methodologies produce different clustering structures and non-negligible differences in subsequent analyses. We therefore urge caution in the use of any single approach and suggest that multiple phenotyping methods be used.

Relevance:

40.00%

Abstract:

We introduce a framework for population analysis of white matter tracts based on diffusion-weighted images of the brain. The framework enables extraction of fibers from high angular resolution diffusion images (HARDI); clustering of the fibers based partly on prior knowledge from an atlas; representation of the fiber bundles compactly using a path following points of highest density (maximum density path; MDP); and registration of these paths together using geodesic curve matching to find local correspondences across a population. We demonstrate our method on 4-Tesla HARDI scans from 565 young adults to compute localized statistics across 50 white matter tracts based on fractional anisotropy (FA). Experimental results show increased sensitivity in the determination of genetic influences on principal fiber tracts compared to the tract-based spatial statistics (TBSS) method. Our results show that the MDP representation reveals important parts of the white matter structure and considerably reduces the dimensionality over comparable fiber matching approaches.

Relevance:

30.00%

Abstract:

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. Its tree structure allows for efficient disk-based implementations where space requirements exceed main memory.

Relevance:

30.00%

Abstract:

This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an information retrieval context by adapting it for document clustering. Many large-scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results in terms of both efficiency and quality. Document classification was completed using Support Vector Machines.

Relevance:

30.00%

Abstract:

This paper proposes a novel Hybrid Clustering approach for XML documents (HCX) that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The empirical analysis reveals that the proposed method is scalable and accurate.

Relevance:

30.00%

Abstract:

XML document clustering is essential for many document handling applications such as information storage, retrieval, integration and transformation. An XML clustering algorithm should process both the structural and the content information of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. This paper introduces a novel approach that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The proposed method reduces the high dimensionality of input data by using only the structure-constrained content. The empirical analysis reveals that the proposed method can effectively cluster even very large XML datasets and outperform other existing methods.