912 results for Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data


Relevance:

100.00% 100.00%

Publisher:

Abstract:

OBJECTIVE: To assess a new impunity index and variables that have been found to predict variation in homicide rates in other geographical levels as predictive of state-level homicide rates in Brazil. METHODS: This was a cross-sectional ecological study. Data from the mortality information system relating to the 27 Brazilian states for the years 1996 to 2005 were analyzed. The outcome variables were taken to be homicide victim rates in 2005, for the entire population and for men aged 20-29 years. Measurements of economic and social development, economic inequality, demographic structure and life expectancy were analyzed as predictors. An "impunity index", calculated as the total number of homicides between 1996 and 2005 divided by the number of individuals in prison in 2007, was constructed. The data were analyzed by means of simple linear regression and negative binomial regression. RESULTS: In 2005, state-level crude total homicide rates ranged from 11 to 51 per 100,000; for young men, they ranged from 39 to 241. The impunity index ranged from 0.4 to 3.5 and was the most important predictor of this variability. From negative binomial regression, it was estimated that the homicide victim rate among young males increased by 50% for every increase of one point in this ratio. CONCLUSIONS: Classic predictive factors were not associated with homicides in this analysis of state-level variation in Brazil. However, the impunity index indicated that the greater the impunity, the higher the homicide rate.
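As a rough illustration of the arithmetic in this abstract, the sketch below computes the impunity index as defined there (homicides in 1996-2005 divided by the prison population in 2007) and applies the reported 50% increase per index point as a multiplicative rate ratio. The function names and input figures are hypothetical, and reading the negative binomial estimate as a constant rate ratio of 1.5 per point is an assumption.

```python
def impunity_index(homicides_1996_2005: int, prisoners_2007: int) -> float:
    """Cumulative homicides divided by the prison population, as in the abstract."""
    if prisoners_2007 <= 0:
        raise ValueError("prison population must be positive")
    return homicides_1996_2005 / prisoners_2007


def scaled_rate(baseline_rate: float, index_change: float, rate_ratio: float = 1.5) -> float:
    """Scale a homicide rate by rate_ratio for each one-point change in the index.

    rate_ratio=1.5 encodes the reported 50% increase per point (assumed constant).
    """
    return baseline_rate * rate_ratio ** index_change


# Hypothetical state: 12,000 homicides over the decade, 8,000 prisoners in 2007.
print(impunity_index(12_000, 8_000))  # 1.5
# A baseline rate of 100 per 100,000 after a two-point rise in the index.
print(scaled_rate(100.0, 2.0))  # 225.0
```

Under this reading the effect compounds: two points correspond to a factor of 1.5 × 1.5 = 2.25, not to a 100% increase.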

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicate the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, referred to official statistics, shows its usefulness.
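The abstract does not say which entropy measure the baseline filter uses. As a minimal sketch of one plausible ingredient, the function below computes the Shannon entropy (in bits) of a single categorical feature; the function name and the toy data are hypothetical, and this is not the authors' filter.

```python
from collections import Counter
from math import log2


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the empirical distribution of a categorical feature."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the observed categories.
    return sum((c / total) * log2(total / c) for c in counts.values())


# Toy data: a feature with two equally frequent categories and a constant feature.
print(shannon_entropy(["a", "b", "a", "b"]))  # 1.0
print(shannon_entropy(["x", "x", "x", "x"]))  # 0.0
```

A constant feature has zero entropy and cannot separate clusters, which is the intuition behind using entropy to rank features when no class labels are available.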

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Beyond the classical statistical approaches (determination of basic statistics, regression analysis, ANOVA, etc.), a new set of applications of different statistical techniques has increasingly gained relevance in the analysis, processing and interpretation of data on the characteristics of forest soils. This can be seen in some recent publications in the context of Multivariate Statistics. These new methods require additional care that is not always taken or mentioned in some approaches. In the particular case of geostatistical applications it is necessary, besides geo-referencing all the data acquired, to collect the samples on regular grids and in sufficient quantity so that the variograms reflect the spatial distribution of soil properties in a representative manner. The great majority of Multivariate Statistics techniques (Principal Component Analysis, Correspondence Analysis, Cluster Analysis, etc.) do not, in most cases, require the assumption of a normal distribution, but they do need a proper and rigorous strategy for their use. In this work, some reflections on these methodologies are presented, in particular on the main constraints that often arise during data collection and on the various possibilities for linking these different techniques. Finally, illustrations of some particular applications of these statistical methods are presented.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The choice of an information system is a critical success factor for an organization's performance. Because it involves multiple decision-makers, often with conflicting objectives, and several aggressively marketed alternatives, reaching a consensus is particularly complex. The main objective of this work is to analyse and select an information system to support school management, in its pedagogical and administrative components, using a multicriteria decision aid system. MMASSITI (Multicriteria Methodology to Support the Selection of Information Systems/Information Technologies) integrates a multicriteria model that seeks to provide a systematic approach to the process of choosing Information Systems, able to produce sustained recommendations concerning the decision scope. Its application to a case study identified the relevant factors in the selection process of a school educational and management information system and yielded a solution that allows the decision maker to compare the quality of the various alternatives.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Dissertation submitted for the degree of Master in Environmental Engineering – Environmental Management and Systems profile

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The processes of mobilizing land for public and private infrastructures are developed according to their own legal frameworks and are systematically confronted with the poor national situation regarding cadastral identification and regularization, which leads to major inefficiencies, sometimes with a very negative impact on overall effectiveness. This project report describes the Ferbritas Cadastre Information System (FBSIC) project and tools, which, in conjunction with other applications, allow the entire life-cycle of Land Acquisition and Cadastre to be managed, including support for field activities with the integration of information collected in the field, the development of multi-criteria analysis information, the monitoring of all information in the exploration stage, and the automated generation of outputs. The benefits are evident at the level of operational efficiency, including tools that enable process integration and standardization of procedures, facilitate analysis and quality control, and maximize performance in the acquisition, maintenance and management of registration and expropriation information (expropriation projects). The implemented system therefore achieves levels of robustness, comprehensiveness, openness, scalability and reliability suitable for a structural platform. The resulting solution, FBSIC, is a fit-for-purpose cadastre information system rooted in the field of railway infrastructures. The integrating nature of FBSIC allows it to: meet present needs and scale to meet future services; collect, maintain, manage and share all information on one common platform and transform it into knowledge; relate to other platforms; and increase the accuracy and productivity of business processes related to land property management.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There is a significant potential to improve the plant-beneficial effects of root-colonizing pseudomonads by breeding wheat genotypes with a greater capacity to sustain interactions with these bacteria. However, the interaction between pseudomonads and crop plants at the cultivar level, as well as the conditions which favor the accumulation of beneficial microorganisms in the wheat rhizosphere, is largely unknown. Therefore, we characterized the three Swiss winter wheat (Triticum aestivum) cultivars Arina, Zinal, and Cimetta for their ability to accumulate naturally occurring plant-beneficial pseudomonads in the rhizosphere. Cultivar performance was measured also by the ability to select for specific genotypes of 2,4-diacetylphloroglucinol (DAPG) producers in two different soils. Cultivar-specific differences were found; however, these were strongly influenced by the soil type. Denaturing gradient gel electrophoresis (DGGE) analysis of fragments of the DAPG biosynthetic gene phlD amplified from natural Pseudomonas rhizosphere populations revealed that phlD diversity substantially varied between the two soils and that there was a cultivar-specific accumulation of certain phlD genotypes in one soil but not in the other. Furthermore, the three cultivars were tested for their ability to benefit from Pseudomonas inoculants. Interestingly, Arina, which was best protected against Pythium ultimum infection by inoculation with Pseudomonas fluorescens biocontrol strain CHA0, was the cultivar which profited the least from the bacterial inoculant in terms of plant growth promotion in the absence of the pathogen. Knowledge gained of the interactions between wheat cultivars, beneficial pseudomonads, and soil types allows us to optimize cultivar-soil combinations for the promotion of growth through beneficial pseudomonads. Additionally, this information can be implemented by breeders into a new and unique breeding strategy for low-input and organic conditions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Forest fire sequences can be modelled as a stochastic point process where events are characterized by their spatial locations and occurrence in time. Cluster analysis permits the detection of the space/time pattern distribution of forest fires. These analyses are useful to assist fire-managers in identifying risk areas, implementing preventive measures and conducting strategies for an efficient distribution of the firefighting resources. This paper aims to identify hot spots in forest fire sequences by means of the space-time scan statistics permutation model (STSSP) and a geographical information system (GIS) for data and results visualization. The scan statistical methodology uses a scanning window, which moves across space and time, detecting local excesses of events in specific areas over a certain period of time. Finally, the statistical significance of each cluster is evaluated through Monte Carlo hypothesis testing. The case study is the forest fires registered by the Forest Service in Canton Ticino (Switzerland) from 1969 to 2008. This dataset consists of geo-referenced single events including the location of the ignition points and additional information. The data were aggregated into three sub-periods (considering important preventive legal dispositions) and two main ignition-causes (lightning and anthropogenic causes). Results revealed that forest fire events in Ticino are mainly clustered in the southern region where most of the population is settled. Our analysis uncovered local hot spots arising from extemporaneous arson activities. Results regarding the naturally-caused fires (lightning fires) disclosed two clusters detected in the northern mountainous area.
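The Monte Carlo step mentioned in this abstract can be sketched with the standard rank-based p-value: the observed statistic for a candidate cluster is compared with the statistics obtained from randomized replicas of the data. The function name and the numbers below are hypothetical, and the computation of the scan statistic itself is not shown.

```python
from typing import Sequence


def monte_carlo_p_value(observed: float, simulated: Sequence[float]) -> float:
    """Rank-based Monte Carlo p-value for an observed test statistic.

    `simulated` holds the statistic computed on each randomized replica;
    the observed value counts as one more draw, hence the +1 terms.
    """
    at_least_as_extreme = sum(1 for s in simulated if s >= observed)
    return (at_least_as_extreme + 1) / (len(simulated) + 1)


# Hypothetical example: 3 replicas, one of which reaches the observed value.
print(monte_carlo_p_value(5.0, [1.0, 2.0, 6.0]))  # 0.5
```

With 999 replicas and none reaching the observed value, the same formula gives 1/1000 = 0.001, the smallest p-value that number of replicas can report.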

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Synthetic combinatorial peptide libraries in positional scanning format (PS-SCL) have recently emerged as a useful tool for the analysis of T cell recognition. This includes identification of potentially cross-reactive sequences of self or pathogen origin that could be relevant for the understanding of TCR repertoire selection and maintenance, as well as of the cross-reactive potential of Ag-specific immune responses. In this study, we have analyzed the recognition of sequences retrieved by using a biometric analysis of the data generated by screening a PS-SCL with a tumor-reactive CTL clone specific for an immunodominant peptide from the melanocyte differentiation and tumor-associated Ag Melan-A. We found that 39% of the retrieved peptides were recognized by the CTL clone used for PS-SCL screening. The proportion of peptides recognized was higher among those with both high predicted affinity for the HLA-A2 molecule and high predicted stimulatory score. Interestingly, up to 94% of the retrieved peptides were cross-recognized by other Melan-A-specific CTL. Cross-recognition was at least partially focused, as some peptides were cross-recognized by the majority of CTL. Importantly, stimulation of PBMC from melanoma patients with the most frequently recognized peptides elicited the expansion of heterogeneous CD8(+) T cell populations, one fraction of which cross-recognized Melan-A. Together, these results underline the high predictive value of PS-SCL for the identification of sequences cross-recognized by Ag-specific T cells.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of T cell vaccines is the expansion of antigen-specific T cells able to confer immune protection against pathogens or tumors. Although increase in absolute cell numbers, effector functions and TCR repertoire of vaccine-induced T cells are often evaluated, their reactivity for the cognate antigen versus their cross-reactive potential is rarely considered. In fact, little information is available regarding the influence of vaccines on T cell fine specificity of antigen recognition despite the impact that this feature may have in protective immunity. To shed light on the cross-reactive potential of vaccine-induced cells, we analyzed the reactivity of CD8(+) T cells following vaccination of HLA-A2(+) melanoma patients with Melan-A peptide, incomplete Freund's adjuvant and CpG-oligodeoxynucleotide adjuvant, which was shown to induce strong expansion of Melan-A-reactive CD8(+) T cells in vivo. A collection of predicted Melan-A cross-reactive peptides, identified from a combinatorial peptide library, was used to probe functional antigen recognition of PBMC ex vivo and Melan-A-reactive CD8(+) T cell clones. While Melan-A-reactive CD8(+) T cells prior to vaccination are usually constituted of widely cross-reactive naive cells, we show that peptide vaccination resulted in expansion of memory T cells displaying a reactivity predominantly restricted to the antigen of interest. Importantly, these cells are tumor-reactive.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Previously published scientific papers have reported a negative correlation between drinking water hardness and cardiovascular mortality. Some ecologic and case-control studies suggest the protective effect of calcium and magnesium concentration in drinking water. In this article we present an analysis of this protective relationship in 538 municipalities of Comunidad Valenciana (Spain) from 1991 to 1998. We used the Spanish version of the Rapid Inquiry Facility (RIF) developed under the European Environment and Health Information System (EUROHEIS) research project. The strategy of analysis used in our study conforms to the exploratory nature of the RIF that is used as a tool to obtain quick and flexible insight into epidemiologic surveillance problems. This article describes the use of the RIF to explore possible associations between disease indicators and environmental factors. We used exposure analysis to assess the effect of both protective factors--calcium and magnesium--on mortality from cerebrovascular (ICD-9 430-438) and ischemic heart (ICD-9 410-414) diseases. This study provides statistical evidence of the relationship between mortality from cardiovascular diseases and hardness of drinking water. This relationship is stronger in cerebrovascular disease than in ischemic heart disease, is more pronounced for women than for men, and is more apparent with magnesium than with calcium concentration levels. Nevertheless, the protective nature of these two factors is not clearly established. Our results suggest the possibility of protectiveness but cannot be claimed as conclusive. The weak effects of these covariates make it difficult to separate them from the influence of socioeconomic and environmental factors. We have also performed disease mapping of standardized mortality ratios to detect clusters of municipalities with high risk. Further standardization by levels of calcium and magnesium in drinking water shows changes in the maps when we remove the effect of these covariates.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
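To make the idea of coding ambiguous bases with IUPAC symbols concrete, the sketch below maps a set of plausible bases to its standard IUPAC nucleotide code. The probability threshold used to decide which bases are plausible is an assumption for illustration only; the abstract's method relies on model-based clustering, which is not reproduced here, and the function name is hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(probs: dict, threshold: float = 0.2) -> str:
    """Return the IUPAC symbol for every base whose probability reaches `threshold`.

    `probs` maps "A", "C", "G", "T" to probabilities; the 0.2 cut-off is an
    illustrative assumption, not a value taken from the abstract.
    """
    plausible = frozenset(base for base, p in probs.items() if p >= threshold)
    # If no base is plausible enough, fall back to the fully ambiguous symbol.
    return IUPAC.get(plausible, "N")


print(call_base({"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}))  # A
print(call_base({"A": 0.55, "C": 0.03, "G": 0.40, "T": 0.02}))  # R
```

Keeping an ambiguous read as `R` (A or G) rather than discarding it is what lets a tag still be matched against a reference sequence.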

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Individuals sampled in hybrid zones are usually analysed according to their sampling locality, morphology, behaviour or karyotype. But the increasing availability of genetic information more and more favours its use for individual sorting purposes and numerous assignment methods based on the genetic composition of individuals have been developed. The shrews of the Sorex araneus group offer good opportunities to test the genetic assignment on individuals identified by their karyotype. Here we explored the potential and efficiency of a Bayesian assignment method combined or not with a reference dataset to study admixture and individual assignment in the difficult context of two hybrid zones between karyotypic species of the Sorex araneus group. As a whole, we assigned more than 80% of the individuals to their respective karyotypic categories (i.e. 'pure' species or hybrids). This assignment level is comparable to what was obtained for the same species away from hybrid zones. Additionally, we showed that the assignment result for several individuals was strongly affected by the inclusion or not of a reference dataset. This highlights the importance of such comparisons when analysing hybrid zones. Finally, differences between the admixture levels detected in both hybrid zones support the hypothesis of an impact of chromosomal rearrangements on gene flow.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Iowa livestock industry generates large quantities of manure and other organic residues, composed of feces, urine, bedding material, waste feed, dilution water, and mortalities. Because this material is often viewed as a waste, little has been done to characterize it and determine its usefulness as a resource. The Iowa Department of Natural Resources initiated the process to assess in detail the manure resource and the potential utilization of this resource through anaerobic digestion coupled with energy recovery. Many of the pieces required to assess the manure resource already exist, albeit in disparate forms and locations. This study began by interpreting and integrating existing Federal, State, and ISU studies and other sources of livestock numbers, housing, and management information. With these data, models were analyzed to determine energy production and the economic feasibility of energy recovery using anaerobic digestion facilities on livestock farms. With these data, individual facilities and clusters that appear economically feasible can be identified specifically through the use of a GIS for further investigation. Also, livestock facilities and clusters of facilities with high methane recovery potential can be the focus of targeted educational programs through the Cooperative Extension network and other outreach networks, providing a more intensive counterpoint to broadly based educational efforts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In applied regional analysis, statistical information is usually published at different territorial levels with the aim of providing information of interest for different potential users. When using this information, there are two choices: first, to use normative regions (towns, provinces, etc.) or, second, to design analytical regions directly related to the analysed phenomena. In this paper, provincial time series of unemployment rates in Spain are used to compare the results obtained by applying two analytical regionalisation models (a two-stage procedure based on cluster analysis and a procedure based on mathematical programming) with the normative regions available at two different scales: NUTS II and NUTS I. The results show that more homogeneous regions were designed when applying both analytical regionalisation tools. Two other interesting results are that the analytical regions were also more stable over time, and that scale affects the regionalisation process.