853 results for Unsupervised clustering


Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper analyses earthquake data from the perspective of dynamical systems and fractional calculus (FC). This new standpoint uses Multidimensional Scaling (MDS) as a powerful clustering and visualization tool. FC extends the concepts of integrals and derivatives to non-integer and complex orders. MDS is a technique that produces spatial or geometric representations of complex objects, such that objects perceived to be similar in some sense are placed close together on the MDS maps, forming clusters. In this study, over three million seismic occurrences, covering the period from January 1, 1904 up to March 14, 2012, are analysed. The events are characterized by their magnitude and spatiotemporal distributions and are divided into fifty groups, according to the Flinn–Engdahl (F–E) seismic regions of Earth. Several correlation indices are proposed to quantify the similarities among regions. MDS maps prove to be an intuitive and useful visual representation of the complex relationships among seismic events, which may not be perceived on traditional geographic maps. Therefore, MDS constitutes a valid alternative to classic visualization tools for understanding the global behaviour of earthquakes.
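
A minimal sketch of how an MDS map can be computed from pairwise similarities. It assumes classical (Torgerson) MDS and a dissimilarity defined as 1 minus a correlation index; the abstract does not state which MDS variant or which indices the authors used. The function name `classical_mds` and the four-region matrix are hypothetical, chosen only for illustration.

```python
import numpy as np

def classical_mds(dissimilarity, n_components=2):
    """Embed items in n_components dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    # Clip small negative eigenvalues that arise for non-Euclidean input.
    scale = np.sqrt(np.clip(eigenvalues[order], 0.0, None))
    return eigenvectors[:, order] * scale

# Hypothetical correlation indices among four regions (not data from the paper).
correlation = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])
coords = classical_mds(1.0 - correlation)
```

Each row of `coords` is a point on the map. Regions with high mutual correlation (here the first two and the last two) land close together, which is what makes clusters visible.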

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the above-mentioned criteria, is the speed of execution, which is especially relevant when dealing with large data sets.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background and aim: Cardiorespiratory fitness (CRF) and diet have been implicated as significant factors in the prevention of cardio-metabolic diseases. This study aimed to assess the impact of the combined associations of CRF and adherence to the Southern European Atlantic Diet (SEADiet) on the clustering of metabolic risk factors in adolescents. Methods and Results: A cross-sectional school-based study was conducted on 468 adolescents aged 15-18, from the Azorean Islands, Portugal. We measured fasting glucose, insulin, total cholesterol (TC), HDL-cholesterol, triglycerides, systolic blood pressure, waist circumference and height. HOMA, TC/HDL-C ratio and waist-to-height ratio were calculated. For each of these variables, a Z-score was computed by age and sex. A metabolic risk score (MRS) was constructed by summing the Z-scores of all individual risk factors. High risk was considered when the individual had 1 SD of this score. CRF was measured with the 20 m Shuttle Run Test. Adherence to SEADiet was assessed with a semi-quantitative food frequency questionnaire. Logistic regression showed that, after adjusting for potential confounders, unfit adolescents with low adherence to SEADiet had the highest odds of having MRS (OR = 9.4; 95% CI: 2.6–33.3), followed by the unfit ones with high adherence to the SEADiet (OR = 6.6; 95% CI: 1.9–22.5), when compared to those who were fit and had higher adherence to SEADiet.
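
The construction of the metabolic risk score can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: Z-scores are computed over a single group rather than by age and sex, high risk is taken to mean a score at or above 1 SD of the score distribution, and the function names and the five-person data set are hypothetical.

```python
import numpy as np

def metabolic_risk_score(measurements):
    """Sum per-variable Z-scores; rows are individuals, columns are risk factors."""
    x = np.asarray(measurements, dtype=float)
    # Assumption: one reference group; the study standardized by age and sex.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    return z.sum(axis=1)

def high_risk(scores, threshold_sd=1.0):
    """Flag scores at or above threshold_sd standard deviations of the score distribution."""
    scores = np.asarray(scores, dtype=float)
    # Assumption: "1 SD of this score" is read as mean + 1 SD or higher.
    return scores >= scores.mean() + threshold_sd * scores.std(ddof=1)

# Hypothetical values: columns are HOMA, TC/HDL-C ratio, waist-to-height ratio.
data = np.array([
    [1.0, 3.0, 0.42],
    [1.2, 3.2, 0.44],
    [1.1, 3.1, 0.43],
    [1.3, 3.3, 0.45],
    [3.5, 5.5, 0.60],
])
scores = metabolic_risk_score(data)
flags = high_risk(scores)
```

Because every Z-score column has mean zero, the summed score also has mean zero, so the cut-off reduces to one standard deviation of the score itself.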

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Master's degree in Management and Business Control

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Scheduling of constrained deadline sporadic task systems on multiprocessor platforms is an area which has received much attention in the recent past. It is widely believed that finding an optimal scheduler is hard, and therefore most studies have focused on developing algorithms with good processor utilization bounds. These algorithms can be broadly classified into two categories: partitioned scheduling, in which tasks are statically assigned to individual processors, and global scheduling, in which each task is allowed to execute on any processor in the platform. In this paper we consider a third, more general, approach called cluster-based scheduling. In this approach each task is statically assigned to a processor cluster, tasks in each cluster are globally scheduled among themselves, and clusters in turn are scheduled on the multiprocessor platform. We develop techniques to support such cluster-based scheduling algorithms, and also consider properties that minimize total processor utilization of individual clusters. In the last part of this paper, we develop new virtual cluster-based scheduling algorithms. For implicit deadline sporadic task systems, we develop an optimal scheduling algorithm that is neither Pfair nor ERfair. We also show that the processor utilization bound of us-edf{m/(2m−1)} can be improved by using virtual clustering. Since neither the partitioned nor the global strategy dominates the other, cluster-based scheduling is a natural direction for research towards achieving improved processor utilization bounds.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

OBJECTIVE: To analyze the evolution in the prevalence and determinants of malnutrition in children in the semiarid region of Brazil. METHODS: Data were collected from two cross-sectional population-based household surveys that used the same methodology. Cluster sampling was used to collect data from 8,000 families in Ceará, Northeastern Brazil, for the years 1987 and 2007. Acute undernutrition was calculated as weight/age < -2 standard deviations (SD); stunting as height/age < -2 SD; wasting as weight/height < -2 SD. Data on biological and sociodemographic determinants were analyzed using hierarchical multivariate analyses based on a theoretical model. RESULTS: Samples of 4,513 and 1,533 children under three years of age, in 1987 and 2007 respectively, were included in the analyses. The prevalence of acute malnutrition was reduced by 60.0%, from 12.6% in 1987 to 4.7% in 2007, while prevalence of stunting was reduced by 50.0%, from 27.0% in 1987 to 13.0% in 2007. Prevalence of wasting changed little in the period. In 1987, socioeconomic and biological characteristics (family income, mother’s education, toilet and tap water availability, children’s medical consultation and hospitalization, age, sex and birth weight) were significantly associated with undernutrition, stunting and wasting. In 2007, the determinants of malnutrition were restricted to biological characteristics (age, sex and birth weight). Only one socioeconomic characteristic, toilet availability, remained associated with stunting. CONCLUSIONS: Socioeconomic development, along with health interventions, may have contributed to improvements in children’s nutritional status. Birth weight, especially extremely low weight (< 1,500 g), appears as the most important risk factor for early childhood malnutrition.
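
The three anthropometric definitions in the methods can be expressed as a small classifier. This sketch assumes the Z-scores have already been computed against a growth reference, which the abstract does not name. The function name `classify_child` and the example values are hypothetical.

```python
def classify_child(weight_for_age_z, height_for_age_z, weight_for_height_z, cutoff=-2.0):
    """Apply the abstract's definitions: an indicator is present when its Z-score is below the cutoff."""
    return {
        "undernutrition": weight_for_age_z < cutoff,   # weight/age < -2 SD
        "stunting": height_for_age_z < cutoff,         # height/age < -2 SD
        "wasting": weight_for_height_z < cutoff,       # weight/height < -2 SD
    }

# Hypothetical Z-scores for one child (not survey data).
example = classify_child(-2.3, -1.1, -2.0)
```

The comparison is strict, so a child exactly at -2 SD is not counted, which matches the "< -2 SD" wording in the abstract.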

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Electrocardiography (ECG) biometrics is emerging as a viable biometric trait. Recent developments at the sensor level have shown the feasibility of performing signal acquisition at the fingers and hand palms, using one-lead sensor technology and dry electrodes. These new locations lead to ECG signals with a lower signal-to-noise ratio that are more prone to noise artifacts; heart rate variability is another major challenge of this biometric trait. In this paper we propose a novel approach to ECG biometrics, with the purpose of reducing the computational complexity and increasing the robustness of the recognition process, enabling the fusion of information across sessions. Our approach is based on clustering, grouping individual heartbeats based on their morphology. We study several methods to perform automatic template selection and account for variations observed in a person's biometric data. This approach allows the identification of different template groupings, taking into account the heart rate variability, and the removal of outliers due to noise artifacts. Experimental evaluation on real world data demonstrates the advantages of our approach.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known "curse of dimensionality". As a consequence, in order to extract useful information from this data, one has to resort to feature selection (FS) techniques to lower the dimensionality of the learning space. In this paper, we assess the impact of FS techniques for silent speech data, in a dataset with 4 non-invasive and promising modalities, namely: video, depth, ultrasonic Doppler sensing, and surface electromyography. We consider two supervised (mutual information and Fisher's ratio) and two unsupervised (mean-median and arithmetic mean-geometric mean) FS filters. The evaluation was made by assessing the classification accuracy (word recognition error) of three well-known classifiers (k-nearest neighbors, support vector machines, and dynamic time warping). The key results of this study show that both unsupervised and supervised FS techniques improve the classification accuracy on both individual and combined modalities. For instance, on the video component, we attain relative performance gains of 36.2% in error rates. FS is also useful as pre-processing for feature fusion. Copyright © 2014 ISCA.
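
An unsupervised FS filter of the kind mentioned above can be sketched as follows. The abstract does not define the mean-median criterion, so this sketch assumes it scores each feature by the absolute difference between its mean and its median and keeps the highest-scoring features. The function names and the toy matrix are hypothetical.

```python
import numpy as np

def mean_median_relevance(features):
    """Score each column by |mean - median| (assumed form of the mean-median filter)."""
    x = np.asarray(features, dtype=float)
    return np.abs(x.mean(axis=0) - np.median(x, axis=0))

def select_top_features(features, k):
    """Return the indices of the k highest-scoring columns, best first."""
    scores = mean_median_relevance(features)
    return np.argsort(scores)[::-1][:k]

# Hypothetical data: 4 samples, 3 features (not from the silent speech dataset).
data = np.array([
    [0.0, 1.0, 5.0],
    [0.0, 2.0, 5.0],
    [0.0, 3.0, 5.0],
    [0.0, 10.0, 6.0],
])
selected = select_top_features(data, 2)
```

No class labels are used, which is what makes the filter unsupervised. The constant first column scores zero and is dropped.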

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Discrete data representations are necessary, or at least convenient, in many machine learning problems. While feature selection (FS) techniques aim at finding relevant subsets of features, the goal of feature discretization (FD) is to find concise (quantized) data representations, adequate for the learning task at hand. In this paper, we propose two incremental methods for FD. The first method belongs to the filter family, in which the quality of the discretization is assessed by a (supervised or unsupervised) relevance criterion. The second method is a wrapper, where discretized features are assessed using a classifier. Both methods can be coupled with any static (unsupervised or supervised) discretization procedure and can be used to perform FS as pre-processing or post-processing stages. The proposed methods attain efficient representations suitable for binary and multi-class problems with different types of data, being competitive with existing methods. Moreover, using well-known FS methods with the features discretized by our techniques leads to better accuracy than with the features discretized by other methods or with the original features. (C) 2013 Elsevier B.V. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This letter presents a new parallel method for hyperspectral unmixing composed of the efficient combination of two popular methods: vertex component analysis (VCA) and sparse unmixing by variable splitting and augmented Lagrangian (SUNSAL). First, VCA extracts the endmember signatures, and then SUNSAL is used to estimate the abundance fractions. Both techniques are highly parallelizable, which significantly reduces the computing time. A design of the two methods for commodity graphics processing units is presented and evaluated. Experimental results obtained for simulated and real hyperspectral data sets reveal speedups of up to 100 times, which provides the real-time response required by many remotely sensed hyperspectral applications.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents the application of multidimensional scaling (MDS) analysis to data emerging from noninvasive lung function tests, namely the input respiratory impedance. The aim is to obtain a geometrical mapping of the diseases in a 3D space representation, allowing analysis of (dis)similarities between subjects within the same pathology groups, as well as between the various groups. The adult patient groups investigated were healthy, diagnosed chronic obstructive pulmonary disease (COPD) and diagnosed kyphoscoliosis. The child patient groups were healthy, asthma and cystic fibrosis. The results suggest that MDS can be successfully employed for mapping restrictive (kyphoscoliosis) and obstructive (COPD) pathologies. Hence, MDS tools can be further examined to define clear limits between pools of patients for clinical classification, and used as a training aid for medical traineeship.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Dissertation presented at the Faculty of Science and Technology of the New University of Lisbon in fulfillment of the requirements for the Masters degree in Electrical Engineering and Computers

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present an analysis and characterization of the regional seismicity recorded by a temporary broadband seismic network deployed in the Cape Verde archipelago between November 2007 and September 2008. The detection of earthquakes was based on spectrograms, allowing discrimination from low-frequency volcanic signals, resulting in 358 events of which 265 were located, the magnitudes usually being smaller than 3. For the location, a new 1-D P-velocity model was derived for the region, showing a crust consistent with an oceanic crustal structure. The seismicity is located mostly offshore of the westernmost and geologically youngest areas of the archipelago, near the islands of Santo Antao and Sao Vicente in the NW and Brava and Fogo in the SW. The SW cluster has a lower occurrence rate and corresponds to seismicity concentrated mainly along an alignment between Brava and the Cadamosto seamount, presenting normal faulting mechanisms. The NW cluster, located offshore SW of Santo Antao, was previously unknown; it concentrates around a recently recognized submarine cone field, presents focal depths extending from the crust to the upper mantle, and suggests volcanic unrest. No evident temporal behaviour could be perceived, although the events tend to occur in bursts of activity lasting a few days. In this recording period, no significant activity was detected at Fogo volcano, the most active volcanic edifice in Cape Verde. The seismicity characteristics point mainly to a volcanic origin. The correlation of the recorded seismicity with active volcanic structures agrees with the tendency for a westward migration of volcanic activity in the archipelago as indicated by the geologic record. (C) 2014 Elsevier B.V. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Finding the structure of a confined liquid crystal is a difficult task since both the density and order parameter profiles are nonuniform. Starting from a microscopic model and density-functional theory, one has to either (i) solve a nonlinear, integral Euler-Lagrange equation, or (ii) perform a direct multidimensional free energy minimization. The traditional implementations of both approaches are computationally expensive and plagued with convergence problems. Here, as an alternative, we introduce an unsupervised variant of the multilayer perceptron (MLP) artificial neural network for minimizing the free energy of a fluid of hard nonspherical particles confined between planar substrates of variable penetrability. We then test our algorithm by comparing its results for the structure (density-orientation profiles) and equilibrium free energy with those obtained by standard iterative solution of the Euler-Lagrange equations and with Monte Carlo simulation results. Very good agreement is found and the MLP method proves competitively fast, flexible, and refinable. Furthermore, it can be readily generalized to the richer patterned-substrate geometries that are now experimentally realizable but very problematic for conventional theoretical treatments.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Dissertation presented as a partial requirement for obtaining the degree of Master in Geographic Information Science and Systems