873 results for agglomerative clustering


Relevance: 10.00%

Abstract:

The use of the Internet now has a specific purpose: to find information. Unfortunately, the amount of data available on the Internet is growing exponentially, creating what can be considered a nearly infinite and ever-evolving network with no discernible structure. This rapid growth has raised the question of how to find the most relevant information. Many different techniques have been introduced to address the information overload, including search engines, the Semantic Web, and recommender systems, among others. Recommender systems are computer-based techniques that are used to reduce information overload and recommend products likely to interest a user when given some information about the user's profile. This technique is mainly used in e-Commerce to suggest items that fit a customer's purchasing tendencies. The use of recommender systems for e-Government is a research topic that is intended to improve the interaction among public administrations, citizens, and the private sector by reducing information overload on e-Government services. More specifically, e-Democracy aims to increase citizens' participation in democratic processes through the use of information and communication technologies. In this chapter, an architecture of a recommender system that uses fuzzy clustering methods for e-Elections is introduced. In addition, a comparison with the smartvote system, a Web-based Voting Assistance Application (VAA) used to aid voters in finding the party or candidate that is most in line with their preferences, is presented.

Relevance: 10.00%

Abstract:

Our aim was to critically evaluate the relations among smoking, body weight, body fat distribution, and insulin resistance as reported in the literature. In the short term, nicotine increases energy expenditure and could reduce appetite, which may explain why smokers tend to have lower body weight than do nonsmokers and why smoking cessation is frequently followed by weight gain. In contrast, heavy smokers tend to have greater body weight than do light smokers or nonsmokers, which likely reflects a clustering of risky behaviors (eg, low degree of physical activity, poor diet, and smoking) that is conducive to weight gain. Other factors, such as weight cycling, could also be involved. In addition, smoking increases insulin resistance and is associated with central fat accumulation. As a result, smoking increases the risk of metabolic syndrome and diabetes, and these factors increase the risk of cardiovascular disease. In the context of the worldwide obesity epidemic and a high prevalence of smoking, the greater risk of (central) obesity and insulin resistance among smokers is a matter of major concern.

Relevance: 10.00%

Abstract:

Macroeconomists working with multivariate models typically face uncertainty over which (if any) of their variables have long-run steady states that are subject to breaks. Furthermore, the nature of the break process is often unknown. In this paper, we draw on methods from the Bayesian clustering literature to develop an econometric methodology which: i) finds groups of variables which have the same number of breaks; and ii) determines the nature of the break process within each group. We present an application involving a five-variate steady-state VAR.

Relevance: 10.00%

Abstract:

Clustering techniques can help reduce supervision in pattern-acquisition processes for Information Extraction. This work, which spans four years of research, begins by studying the most suitable document representation for the clustering task. To avoid the biases of individual clustering methods, ensemble clustering methods are considered. Several supervised combination methods are explored, and automatic strategies are added to determine the number of clusters in the combination. Mechanisms for obtaining weighted ensemble clusterings are also considered, as are unsupervised combination strategies. Finally, the clustering results are used in a pattern-acquisition system to replace the elements of human supervision. All of these strategies and methods are evaluated on document clustering and pattern-acquisition tasks using real data. The results show that words as a document representation outperform other models for the clustering task, that ensemble clustering overcomes the limitations of individual clusterings, and that unsupervised pattern-acquisition strategies achieve results competitive with supervised strategies.

Relevance: 10.00%

Abstract:

The location of new-economy companies in urban areas, even though distance is not an important factor, remains significant because of the advantages they gain from being situated together with respect to infrastructure, consumption, sociocultural benefits, and ease of face-to-face transactions. It is inevitable that the first quarter of the twenty-first century will be tied to the creative economy, much as the beginning of the twentieth century was closely tied to the industrial economy and the invention of the mass-production system. The city played one of the most important roles in the development of "the new industrial economy" at the dawn of the twentieth century, as does the knowledge city that hosts "the new creative economy" in the twenty-first century. The morphological, social, economic, and urban outcomes of the two phenomena are clearly quite different, but the impact on cities is very large. The aim of this study is to analyse the agglomeration (clustering) mechanisms of competitive activities based on knowledge creation and advanced services that lie behind leading-edge developments in cities such as Barcelona, with the 22@bcn project, and East London, with the Shoreditch project. The effort that local authorities have put into creating an appropriate environment for attracting and creating innovative companies, as a driver of development in some modern European cities, has resulted in the emergence of very dynamic urban nuclei or centres that are supposedly prepared for, and host, points of knowledge creation ("Urban Knowledge Hubs"), with highly qualified demand and jobs. This is the case for the projects in Barcelona (22@bcn) and East London (Shoreditch).

Relevance: 10.00%

Abstract:

Functional connectivity in the human brain can be represented as a network using electroencephalography (EEG) signals. These networks--whose nodes can vary from tens to hundreds--are characterized by neurobiologically meaningful graph theory metrics. This study investigates the degree to which various graph metrics depend upon the network size. To this end, EEGs from 32 normal subjects were recorded and functional networks of three different sizes were extracted. A state-space based method was used to calculate cross-correlation matrices between different brain regions. These correlation matrices were used to construct binary adjacency connectomes, which were assessed with regard to a number of graph metrics such as clustering coefficient, modularity, efficiency, economic efficiency, and assortativity. We showed that the estimates of these metrics differ significantly depending on the network size. Larger networks had higher efficiency, higher assortativity, and lower modularity compared to smaller networks of the same density. These findings indicate that the network size should be considered in any comparison of networks across studies.
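
As a rough illustration of the pipeline described in this abstract, the following sketch thresholds a correlation matrix to a fixed edge density and computes the mean clustering coefficient of the resulting binary graph. The function names and the keep-the-strongest-edges rule are assumptions made for illustration; the abstract does not say how the authors binarized their matrices or which software they used.

```python
import numpy as np

def binarize_by_density(corr, density):
    """Keep the strongest |corr| edges so the graph has the requested density.

    Hypothetical helper: the thresholding rule is an assumption.
    """
    n = corr.shape[0]
    iu = np.triu_indices(n, k=1)              # upper triangle, no diagonal
    weights = np.abs(corr[iu])
    n_edges = int(round(density * len(weights)))
    adj = np.zeros((n, n), dtype=int)
    if n_edges > 0:
        keep = np.argsort(weights)[-n_edges:]  # indices of the strongest edges
        adj[iu[0][keep], iu[1][keep]] = 1
        adj = adj + adj.T                      # make the graph undirected
    return adj

def mean_clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected binary graph."""
    a = adj.astype(float)
    degree = a.sum(axis=1)
    triangles = np.diag(a @ a @ a) / 2.0       # triangles through each node
    possible = degree * (degree - 1) / 2.0     # neighbour pairs per node
    local = np.divide(triangles, possible,
                      out=np.zeros_like(triangles), where=possible > 0)
    return local.mean()
```

Fixing the density before comparing graphs of different sizes mirrors the abstract's comparison of networks "of the same density"; nodes with fewer than two neighbours are given a coefficient of zero here, which is one common convention among several.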

Relevance: 10.00%

Abstract:

BACKGROUND: School-based intervention studies promoting a healthy lifestyle have shown favorable immediate health effects. However, there is a striking paucity of long-term follow-ups. The aim of this study was therefore to assess the 3-year follow-up of a cluster-randomized controlled school-based physical activity program over nine months with beneficial immediate effects on body fat, aerobic fitness and physical activity. METHODS AND FINDINGS: Initially, 28 classes from 15 elementary schools in Switzerland were grouped into an intervention arm (16 classes from 9 schools, n = 297 children) and a control arm (12 classes from 6 schools, n = 205 children) after stratification for grade (1st and 5th graders). Three years after the end of the nine-month multi-component physical activity program, which included daily physical education (i.e. two additional lessons per week on top of three regular lessons), short physical activity breaks during academic lessons, and daily physical activity homework, 289 children (58%) participated in the follow-up. Primary outcome measures included body fat (sum of four skinfolds), aerobic fitness (shuttle run test), physical activity (accelerometry), and quality of life (questionnaires). After adjustment for grade, gender, baseline value and clustering within classes, children in the intervention arm compared with controls had a significantly higher average level of aerobic fitness at follow-up (0.373 z-score units [95%-CI: 0.157 to 0.59, p = 0.001] corresponding to a shift from the 50th to the 65th percentile between baseline and follow-up), while the immediate beneficial effects on the other primary outcomes were not sustained. CONCLUSIONS: Apart from aerobic fitness, beneficial effects seen after one year were not maintained when the intervention was stopped. A continuous intervention seems necessary to maintain overall beneficial health effects as reached at the end of the intervention. TRIAL REGISTRATION: ControlledTrials.com ISRCTN15360785.
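
The percentile shift quoted in this abstract can be checked with a short calculation. The sketch below assumes the fitness z-scores are approximately standard normal and that the reference child starts at the median; both are assumptions made for illustration and are not stated in the abstract.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Effect size reported in the abstract, in z-score units.
shift = 0.373

# A child at the median (z = 0) who gains `shift` units ends up at this percentile.
percentile_after = 100.0 * normal_cdf(shift)
print(f"Percentile after a shift of {shift} z-score units: {percentile_after:.1f}")
```

Under these assumptions the result lands close to the 65th percentile, consistent with the figure given in the abstract.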

Relevance: 10.00%

Abstract:

There is a vast literature that specifies Bayesian shrinkage priors for vector autoregressions (VARs) of possibly large dimensions. In this paper I argue that many of these priors are not appropriate for multi-country settings, which motivates me to develop priors for panel VARs (PVARs). The parametric and semi-parametric priors I suggest not only perform valuable shrinkage in large dimensions, but also allow for soft clustering of variables or countries which are homogeneous. I discuss the implications of these new priors for modelling interdependencies and heterogeneities among different countries in a panel VAR setting. Monte Carlo evidence and an empirical forecasting exercise show clear and important gains of the new priors compared to existing popular priors for VARs and PVARs.

Relevance: 10.00%

Abstract:

Multiple sequence alignment applications are prototypical of applications that require high computing power and memory. They stand out for the scientific relevance of the results they provide to research in biomedicine, genetics, and pharmacology. Multiple alignment applications are limited in that they cannot process thousands of sequences, so a model is needed to address this problem. Given the volume of data handled in the biological sciences and the complexity of sequence alignment algorithms, the only way to solve the problem is through parallel computing environments and high-performance computing. The goal of our research is to create a parallel model that allows multiple alignment algorithms to increase the number of sequences they can process while trying to maintain the quality of the results, in order to guarantee scientific accuracy. The model we propose is based on clustering the input sequences using biological criteria that preserve the quality of the results. In addition, the model focuses on reducing computation time and memory consumption. To present and validate the model, we use T-Coffee as the development and research platform. The proposed model could be applied to any other multiple sequence alignment algorithm.

Relevance: 10.00%

Abstract:

The Kilombero Malaria Project (KMP) attempts to define operationally useful indicators of levels of transmission and disease, and health-system-relevant monitoring indicators, to evaluate the impact of disease control at the community or health facility level. The KMP is a longitudinal community-based study (N = 1024) in rural Southern Tanzania, investigating risk factors for malarial morbidity and developing household-based malaria control strategies. Biweekly morbidity surveys and bimonthly serological, parasitological and drug consumption surveys are carried out in all study households. Mosquito densities are measured biweekly in 50 sentinel houses by timed light traps. Determinants of transmission and indicators of exposure were not strongly aggregated within households. Subjective morbidity (recalled fever), objective morbidity (elevated body temperature and high parasitaemia) and chloroquine consumption were strongly aggregated within a few households. Nested analysis of anti-NANP40 antibody suggests that only approximately 30% of the titer variance can be explained by household clustering and that the largest proportion of antibody titer variability must be explained by non-measured behavioral determinants relating to an individual's level of exposure within a household. Indicators for evaluation and monitoring and outcome measures are described within the context of health service management to describe control measure output in terms of community effectiveness.

Relevance: 10.00%

Abstract:

As the evolutionary significance of hybridization is largely dictated by its extent beyond the first generation, we broadly surveyed patterns of introgression across a sympatric zone of two native poplars (Populus balsamifera, Populus deltoides) in Quebec, Canada within which European exotic Populus nigra and its hybrids have been extensively planted since the 1800s. Single nucleotide polymorphisms (SNPs) that appeared fixed within each species were characterized by DNA-sequencing pools of pure individuals. Thirty-five of these diagnostic SNPs were employed in a high-throughput assay that genotyped 635 trees of different age classes, sampled from 15 sites with various degrees of anthropogenic disturbance. The degree of admixture within sampled trees was then assessed through Bayesian clustering of genotypes. Hybrids were present in seven of the populations, with 2.4% of all sampled trees showing spontaneous admixture. Sites with hybrids were significantly more disturbed than pure stands, while hybrids comprised both immature juveniles and trees of reproductive age. All three possible F1s were detected. Advanced-generation hybrids were consistently biased towards P. balsamifera regardless of whether hybridization had occurred with P. deltoides or P. nigra. Gene exchange between P. deltoides and P. nigra was not detected beyond the F1 generation; however, detection of a trihybrid demonstrates that even this apparent reproductive isolation does not necessarily result in an evolutionary dead end. Collectively, results demonstrate the natural fertility of hybrid poplars and suggest that introduced genes could potentially affect the genetic integrity of native trees, similar to that arising from introgression between natives.

Relevance: 10.00%

Abstract:

In many fields, the spatial clustering of sampled data points has many consequences. Therefore, several indices have been proposed to assess the level of clustering affecting datasets (e.g. the Morisita index, Ripley's K-function and Rényi's generalized entropy). The classical Morisita index measures how many times more likely it is to select two measurement points from the same quadrat (the data set is covered by a regular grid of changing size) than it would be in the case of a random distribution generated from a Poisson process. The multipoint version (k-Morisita) takes into account k points with k >= 2. The present research deals with a new development of the k-Morisita index for (1) monitoring network characterization and for (2) detection of patterns in monitored phenomena. From a theoretical perspective, a connection between the k-Morisita index and multifractality has also been found and highlighted on a mathematical multifractal set.
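
To make the index described above concrete, here is a minimal sketch of the k-Morisita index for 2-D points on a regular grid. It uses the commonly cited form I_k = Q^(k-1) * sum_i n_i(n_i-1)...(n_i-k+1) / (N(N-1)...(N-k+1)), where Q is the number of quadrats, n_i the count in quadrat i, and N the total number of points. The function name, the unit-square domain, and the grid parameter are assumptions for illustration and are not taken from the abstract.

```python
import numpy as np

def k_morisita(points, n_cells_per_side, k=2):
    """k-Morisita index for 2-D points in the unit square (hypothetical helper).

    k = 2 gives the classical Morisita index.
    """
    points = np.asarray(points, dtype=float)
    n_points = len(points)
    q = n_cells_per_side ** 2                      # number of quadrats

    # Map each point to its quadrat; points on the upper edge go to the last cell.
    cell = np.minimum((points * n_cells_per_side).astype(int),
                      n_cells_per_side - 1)
    flat = cell[:, 0] * n_cells_per_side + cell[:, 1]
    counts = np.bincount(flat, minlength=q).astype(float)

    # Falling factorials n(n-1)...(n-k+1) per quadrat and for the total N.
    numerator = np.ones(q)
    denominator = 1.0
    for j in range(k):
        numerator *= counts - j
        denominator *= n_points - j
    return q ** (k - 1) * numerator.sum() / denominator
```

Evaluating the function over a range of `n_cells_per_side` values corresponds to the "regular grid of changing size" mentioned in the abstract. Values well above 1 indicate clustering at that scale, and the index reaches its maximum, Q^(k-1), when all points fall in one quadrat.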

Relevance: 10.00%

Abstract:

BACKGROUND: Superinfection with drug resistant HIV strains could potentially contribute to compromised therapy in patients initially infected with drug-sensitive virus and receiving antiretroviral therapy. To investigate the importance of this potential route to drug resistance, we developed a bioinformatics pipeline to detect superinfection from routinely collected genotyping data, and assessed whether superinfection contributed to increased drug resistance in a large European cohort of viremic, drug treated patients. METHODS: We used sequence data from routine genotypic tests spanning the protease and partial reverse transcriptase regions in the Virolab and EuResist databases that collated data from five European countries. Superinfection was indicated when sequences of a patient failed to cluster together in phylogenetic trees constructed with selected sets of control sequences. A subset of the indicated cases was validated by re-sequencing pol and env regions from the original samples. RESULTS: 4425 patients had at least two sequences in the database, with a total of 13816 distinct sequence entries (of which 86% belonged to subtype B). We identified 107 patients with phylogenetic evidence for superinfection. In 14 of these cases, we analyzed newly amplified sequences from the original samples for validation purposes: only 2 cases were verified as superinfections in the repeated analyses, the other 12 cases turned out to involve sample or sequence misidentification. Resistance to drugs used at the time of strain replacement did not change in these two patients. A third case could not be validated by re-sequencing, but was supported as superinfection by an intermediate sequence with high degenerate base pair count within the time frame of strain switching. Drug resistance increased in this single patient. 
CONCLUSIONS: Routine genotyping data are informative for the detection of HIV superinfection; however, most cases of non-monophyletic clustering in patient phylogenies arise from sample or sequence mix-up rather than from superinfection, which emphasizes the importance of validation. Non-transient superinfection was rare in our mainly treatment-experienced cohort, and we found a single case of possible transmitted drug resistance by this route. We therefore conclude that in our large cohort, superinfection with drug-resistant HIV did not compromise the efficiency of antiretroviral treatment.

Relevance: 10.00%

Abstract:

The aim of the present work is to investigate innovative processes within a geographical cluster, and thus contribute to the debate on the effects of industrial clusters on innovation capacity. In particular, we would like to ascertain whether the advantages of industrial districts in promoting innovation, as already revealed by the literature (diffusion of knowledge, social capital and trust, efficient networking), are also keys to success in the Tuscan shipbuilding industry of pleasure and sporting boats. First, we verify the existence of clusters of shipbuilding in Tuscany, using a specific methodology. Next, in the identified clusters, we analyse three innovative networks financed under a policy to support innovation, and examine whether the typical features of a cluster for promoting innovation are at work, using a questionnaire administered to 71 actors. Finally, we develop a performance analysis of the cluster firms and ascertain whether their different behaviours also lead to different performances. The results show that our case records effects of industrial clustering on innovation capacity, such as the important role given to trust and social capital, the significant value placed on interfirm relations and on each partner's specific competencies, and the distinctive performance of firms belonging to a cluster.

Relevance: 10.00%

Abstract:

Species delimitation has been invigorated as a discipline in systematics by an influx of new character sets, analytical methods, and conceptual advances. We use genetic data from 68 markers, combined with distributional, bioclimatic, and coloration information, to hypothesize boundaries of evolutionarily independent lineages (species) within the widespread and highly variable nominal fire ant species Solenopsis saevissima, a member of a species group containing invasive pests as well as species that are models for ecological and evolutionary research. Our integrated approach uses diverse methods of analysis to sequentially test whether populations meet specific operational criteria (contingent properties) for candidacy as morphologically cryptic species, including genetic clustering, monophyly, reproductive isolation, and occupation of distinctive niche space. We hypothesize that nominal S. saevissima comprises at least 4-6 previously unrecognized species, including several pairs whose parapatric distributions implicate the development of intrinsic premating or postmating barriers to gene flow. Our genetic data further suggest that regional genetic differentiation in S. saevissima has been influenced by hybridization with other nominal species occurring in sympatry or parapatry, including the quite distantly related Solenopsis geminata. The results of this study illustrate the importance of employing different classes of genetic data (coding and noncoding regions and nuclear and mitochondrial DNA [mtDNA] markers), different methods of genetic data analysis (tree-based and non-tree-based methods), and different sources of data (genetic, morphological, and ecological data) to explicitly test various operational criteria for species boundaries in clades of recently diverged lineages, while warning against overreliance on any single data type (e.g., mtDNA sequence variation) when drawing inferences.