38 resultados para Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
The purpose of this paper is to study the possible differences among countries as CO2 emitters and to examine the underlying causes of these differences. The starting point of the analysis is the Kaya identity, which allows us to break down per capita emissions in four components: an index of carbon intensity, transformation efficiency, energy intensity and social wealth. Through a cluster analysis we have identified five groups of countries with different behavior according to these four factors. One significant finding is that these groups are stable for the period analyzed. This suggests that a study based on these components can characterize quite accurately the polluting behavior of individual countries, that is to say, the classification found in the analysis could be used in other studies which look to study the behavior of countries in terms of CO2 emissions in homogeneous groups. In this sense, it supposes an advance over the traditional regional or rich-poor countries classifications .
Resumo:
In this paper we study the disability transition probabilities (as well as the mortalityprobabilities) due to concurrent factors to age such as income, gender and education. Althoughit is well known that ageing and socioeconomic status influence the probability ofcausing functional disorders, surprisingly little attention has been paid to the combined effectof those factors along the individuals' life and how this affects the transition from one degreeof disability to another. The assumption that tomorrow's disability state is only a functionof the today's state is very strong, since disability is a complex variable that depends onseveral other elements than time. This paper contributes into the field in two ways: (1) byattending the distinction between the initial disability level and the process that leads tohis course (2) by addressing whether and how education, age and income differentially affectthe disability transitions. Using a Markov chain discrete model and a survival analysis, weestimate the probability by year and individual characteristics that changes the state of disabilityand the duration that it takes its progression in each case. We find that people withan initial state of disability have a higher propensity to change and take less time to transitfrom different stages. Men do that more frequently than women. Education and incomehave negative effects on transition. Moreover, we consider the disability benefits associatedto those changes along different stages of disability and therefore we offer some clues onthe potential savings of preventive actions that may delay or avoid those transitions. Onpure cost considerations, preventive programs for improvement show higher benefits thanthose for preventing deterioration, and in general terms, those focussing individuals below65 should go first. Finally the trend of disability in Spain seems not to change among yearsand regional differences are not found.
Resumo:
To cosmic rays incident near the horizon the Earth's atmosphere represents a beam dump with a slant depth reaching 36 000 g cm-2 at 90. The prompt decay of a heavy quark produced by very high energy cosmic ray showers will leave an unmistakable signature in this dump. We translate the failure of experiments to detect such a signal into an upper limit on the heavy quark hadroproduction cross section in the energy region beyond existing accelerators. Our results disfavor any rapid growth of the cross section or the gluon structure function beyond conservative estimates based on perturbative QCD.
Resumo:
This correspondence studies the formulation of members ofthe Cohen-Posch class of positive time-frequency energy distributions.Minimization of cross-entropy measures with respect to different priorsand the case of no prior or maximum entropy were considered. It isconcluded that, in general, the information provided by the classicalmarginal constraints is very limited, and thus, the final distributionheavily depends on the prior distribution. To overcome this limitation,joint time and frequency marginals are derived based on a "directioninvariance" criterion on the time-frequency plane that are directly relatedto the fractional Fourier transform.
Resumo:
In applied regional analysis, statistical information is usually published at different territorial levels with the aim providing inforamtion of interest for different potential users. When using this information, there are two different choices: first, to use normative regions ( towns, provinces, etc.) or, second, to design analytical regions directly related with the analysed phenomena. In this paper, privincial time series of unemployment rates in Spain are used in order to compare the results obtained by applying yoy analytical regionalisation models ( a two stages procedure based on cluster analysis and a procedure based on mathematical programming) with the normative regions available at two different scales: NUTS II and NUTS I. The results have shown that more homogeneous regions were designed when applying both analytical regionalisation tools. Two other obtained interesting results are related with the fact that analytical regions were also more estable along time and with the effects of scales in the regionalisation process
Resumo:
In applied regional analysis, statistical information is usually published at different territorial levels with the aim providing inforamtion of interest for different potential users. When using this information, there are two different choices: first, to use normative regions ( towns, provinces, etc.) or, second, to design analytical regions directly related with the analysed phenomena. In this paper, privincial time series of unemployment rates in Spain are used in order to compare the results obtained by applying yoy analytical regionalisation models ( a two stages procedure based on cluster analysis and a procedure based on mathematical programming) with the normative regions available at two different scales: NUTS II and NUTS I. The results have shown that more homogeneous regions were designed when applying both analytical regionalisation tools. Two other obtained interesting results are related with the fact that analytical regions were also more estable along time and with the effects of scales in the regionalisation process
Resumo:
During the process of language development, one of the most important tasks that children must face is that of identifying the grammatical category to which words in their language belong. This is essential in order to be able to form grammatically correct utterances. How do children proceed in order to classify words in their language and assign them to their corresponding grammatical category? The present study investigates the usefulness of phonological information for the categorization of nouns in English, given the fact that it is phonology the first source of information that might be available to prelinguistic infants who lack access to semantic information or complex morphosyntactic information. We analyse four different corpora containing linguistic samples of English speaking mothers addressing their children in order to explore the reliability with which words are represented in mothers’ speech based on several phonological criteria. The results of the analysis confirm the prediction that most of the words to which English learning infants are exposed during the first two years of life can be accounted for in terms of their phonological resemblance
Resumo:
In economic literature, information deficiencies and computational complexities have traditionally been solved through the aggregation of agents and institutions. In inputoutput modelling, researchers have been interested in the aggregation problem since the beginning of 1950s. Extending the conventional input-output aggregation approach to the social accounting matrix (SAM) models may help to identify the effects caused by the information problems and data deficiencies that usually appear in the SAM framework. This paper develops the theory of aggregation and applies it to the social accounting matrix model of multipliers. First, we define the concept of linear aggregation in a SAM database context. Second, we define the aggregated partitioned matrices of multipliers which are characteristic of the SAM approach. Third, we extend the analysis to other related concepts, such as aggregation bias and consistency in aggregation. Finally, we provide an illustrative example that shows the effects of aggregating a social accounting matrix model.
Resumo:
Globalization involves several facility location problems that need to be handled at large scale. Location Allocation (LA) is a combinatorial problem in which the distance among points in the data space matter. Precisely, taking advantage of the distance property of the domain we exploit the capability of clustering techniques to partition the data space in order to convert an initial large LA problem into several simpler LA problems. Particularly, our motivation problem involves a huge geographical area that can be partitioned under overall conditions. We present different types of clustering techniques and then we perform a cluster analysis over our dataset in order to partition it. After that, we solve the LA problem applying simulated annealing algorithm to the clustered and non-clustered data in order to work out how profitable is the clustering and which of the presented methods is the most suitable
Resumo:
This final year project presents the design principles and prototype implementation of BIMS (Biomedical Information Management System), a flexible software system which provides an infrastructure to manage all information required by biomedical research projects.The BIMS project was initiated with the motivation to solve several limitations in medical data acquisition of some research projects, in which Universitat Pompeu Fabra takes part. These limitations,based on the lack of control mechanisms to constraint information submitted by clinicians, impact on the data quality, decreasing it.BIMS can easily be adapted to manage information of a wide variety of clinical studies, not being limited to a given clinical specialty. The software can manage both, textual information, like clinical data (measurements, demographics, diagnostics, etc ...), as well as several kinds of medical images (magnetic resonance imaging, computed tomography, etc ...). Moreover, BIMS provides a web - based graphical user interface and is designed to be deployed in a distributed andmultiuser environment. It is built on top of open source software products and frameworks.Specifically, BIMS has been used to represent all clinical data being currently used within the CardioLab platform (an ongoing project managed by Universitat Pompeu Fabra), demonstratingthat it is a solid software system, which could fulfill requirements of a real production environment.
Resumo:
A new drift compensation method based on Common Principal Component Analysis (CPCA) is proposed. The drift variance in data is found as the principal components computed by CPCA. This method finds components that are common for all gasses in feature space. The method is compared in classification task with respect to the other approaches published where the drift direction is estimated through a Principal Component Analysis (PCA) of a reference gas. The proposed new method ¿ employing no specific reference gas, but information from all gases ¿has shown the same performance as the traditional approach with the best-fitted reference gas. Results are shown with data lasting 7-months including three gases at different concentrations for an array of 17 polymeric sensors.
Resumo:
The prediction of rockfall travel distance below a rock cliff is an indispensable activity in rockfall susceptibility, hazard and risk assessment. Although the size of the detached rock mass may differ considerably at each specific rock cliff, small rockfall (<100 m3) is the most frequent process. Empirical models may provide us with suitable information for predicting the travel distance of small rockfalls over an extensive area at a medium scale (1:100 000¿1:25 000). "Solà d'Andorra la Vella" is a rocky slope located close to the town of Andorra la Vella, where the government has been documenting rockfalls since 1999. This documentation consists in mapping the release point and the individual fallen blocks immediately after the event. The documentation of historical rockfalls by morphological analysis, eye-witness accounts and historical images serve to increase available information. In total, data from twenty small rockfalls have been gathered which reveal an amount of a hundred individual fallen rock blocks. The data acquired has been used to check the reliability of the main empirical models widely adopted (reach and shadow angle models) and to analyse the influence of parameters which affecting the travel distance (rockfall size, height of fall along the rock cliff and volume of the individual fallen rock block). For predicting travel distances in maps with medium scales, a method has been proposed based on the "reach probability" concept. The accuracy of results has been tested from the line entailing the farthest fallen boulders which represents the maximum travel distance of past rockfalls. The paper concludes with a discussion of the application of both empirical models to other study areas.
Resumo:
The objective of research was to analyse the potential of Normalized Difference Vegetation Index (NDVI) maps from satellite images, yield maps and grapevine fertility and load variables to delineate zones with different wine grape properties for selective harvesting. Two vineyard blocks located in NE Spain (Cabernet Sauvignon and Syrah) were analysed. The NDVI was computed from a Quickbird-2 multi-spectral image at veraison (July 2005). Yield data was acquired by means of a yield monitor during September 2005. Other variables, such as the number of buds, number of shoots, number of wine grape clusters and weight of 100 berries were sampled in a 10 rows × 5 vines pattern and used as input variables, in combination with the NDVI, to define the clusters as alternative to yield maps. Two days prior to the harvesting, grape samples were taken. The analysed variables were probable alcoholic degree, pH of the juice, total acidity, total phenolics, colour, anthocyanins and tannins. The input variables, alone or in combination, were clustered (2 and 3 Clusters) by using the ISODATA algorithm, and an analysis of variance and a multiple rang test were performed. The results show that the zones derived from the NDVI maps are more effective to differentiate grape maturity and quality variables than the zones derived from the yield maps. The inclusion of other grapevine fertility and load variables did not improve the results.
Resumo:
The main aim of this study was to replicate and extend previous results on subtypes of adolescents with substance use disorders (SUD), according to their Minnesota Multiphasic Personality Inventory for adolescents (MMPI-A) profiles. Sixty patients with SUD and psychiatric comorbidity (41.7% male, mean age = 15.9 years old) completed the MMPI-A, the Teen Addiction Severity Index (T-ASI), the Child Behaviour Checklist (CBCL), and were interviewed in order to determine DSMIV diagnoses and level of substance use. Mean MMPI-A personality profile showed moderate peaks in Psychopathic Deviate, Depression and Hysteria scales. Hierarchical cluster analysis revealed four profiles (acting-out, 35% of the sample; disorganized-conflictive, 15%; normative-impulsive, 15%; and deceptive-concealed, 35%). External correlates were found between cluster 1, CBCL externalizing symptoms at a clinical level and conduct disorders, and between cluster 2 and mixed CBCL internalized/externalized symptoms at a clinical level. Discriminant analysis showed that Depression, Psychopathic Deviate and Psychasthenia MMPI-A scales correctly classified 90% of the patients into the clusters obtained.