997 results for complete linkage clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Antimicrobial drug resistance is a global challenge for the 21st century with the emergence of resistant bacterial strains worldwide. Transferable resistance to beta-lactam antimicrobial drugs, mediated by production of extended-spectrum beta-lactamases (ESBLs), is of particular concern. In 2004, an ESBL-carrying IncK plasmid (pCT) was isolated from cattle in the United Kingdom. The sequence was a 93,629-bp plasmid encoding a single antimicrobial drug resistance gene, bla(CTX-M-14). From this information, PCRs identifying novel features of pCT were designed and applied to isolates from several countries, showing that the plasmid has disseminated worldwide in bacteria from humans and animals. Complete DNA sequences can be used as a platform to develop rapid epidemiologic tools to identify and trace the spread of plasmids in clinically relevant pathogens, thus facilitating a better understanding of their distribution and ability to transfer between bacteria of humans and animals.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. Therefore, we present a suitable integration scheme that handles the stiffening of the differential equations involved and does not incur additional computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of in the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models. Namely, an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reversed by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble for different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model. The RAW filter is an improvement to the widely popular Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests both at the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time stepping scheme; hence, no retuning of the parameterizations is required. It is found that the accuracy of the medium-term forecasts is increased by using the RAW filter.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M−1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but a transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and an EnSRF modified by resampling through random rotations. The modified EnSRF alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories, and it cannot use some of the algorithms that enhance the performance of the standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault-tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (EpidemicK-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
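
The abstract does not describe how EpidemicK-Means replaces global communication, so the following is only a generic illustration of the underlying idea: pairwise gossip averaging, in which nodes repeatedly average a value with one randomly chosen peer and every node drifts toward the global mean without any global reduction. The function name `gossip_average`, its parameters and the sample values are hypothetical and are not taken from the paper.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise gossip averaging (generic sketch, not the paper's protocol).

    Each round picks two distinct nodes at random and replaces both of
    their local values with the pair's average. The global sum is
    preserved, so all nodes converge toward the global mean.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    state = list(values)
    for _ in range(rounds):
        i, j = rng.sample(range(len(state)), 2)
        avg = (state[i] + state[j]) / 2.0
        state[i] = state[j] = avg
    return state


# Hypothetical local values held by eight nodes (global mean is 5.0).
local_values = [1.0, 5.0, 9.0, 3.0, 7.0, 2.0, 8.0, 5.0]
estimates = gossip_average(local_values, rounds=500)
```

In a k-means setting the averaged quantities would plausibly be per-cluster coordinate sums and point counts, but that mapping is an assumption here rather than something the abstract states.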

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Although most researchers recognise that the language repertoire of bilinguals can vary, few studies have tried to address variation in bilingual competence in any detail. This study aims to take a first step towards further understanding the way in which bilingual competencies can vary at the level of syntax by comparing the use of syntactic embeddings among three different groups of Turkish–German bilinguals. The approach of the present paper is new in that different groups of bilinguals are compared with each other, and not only with monolingual speakers, as is common in most studies in the field. The analysis focuses on differences in the use of embeddings in Turkish, which are generally considered to be one of the more complex aspects of Turkish grammar. The study shows that young Turkish–German bilingual adults who were born and raised in Germany use fewer and less complex embeddings than Turkish–German bilingual returnees who had lived in Turkey for eight years at the time of recording. The present study provides new insights into the nature of bilingual competence, as well as a new perspective on syntactic change in immigrant Turkish as spoken in Europe.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Deforestation and forest degradation are estimated to account for between 12% and 20% of annual greenhouse gas emissions and in the 1990s (largely in the developing world) released about 5.8 Gt per year, which was more than all forms of transport combined. The idea behind REDD+ is that payments for sequestering carbon can tip the economic balance away from loss of forests and in the process yield climate benefits. Recent analysis has suggested that developing country carbon sequestration can effectively compete with other climate investments as part of a cost-effective climate policy. This paper focuses on opportunities and complications associated with bringing community-controlled forests into REDD+. About 25% of developing country forests are community controlled, and it is therefore difficult to envision a successful REDD+ without coming to terms with community-controlled forests. It is widely agreed that REDD+ offers opportunities to bring value to developing country forests, but there are also concerns driven by worries related to insecure and poorly defined community forest tenure, informed by often long histories of government unwillingness to meaningfully devolve to communities. Further, communities are complicated systems, and it is therefore also of concern that REDD+ could destabilize existing well-functioning community forestry systems.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background & Aims: Malnutrition is prevalent in people diagnosed with dementia; however, ensuring adequate oral intake within this group is often problematic. It is important to determine whether providing nutritionally complete oral nutritional supplement (ONS) drinks is an effective way of improving clinical outcomes for older people with dementia. This paper systematically reviewed clinical, wellbeing and nutritional outcomes in people with long-term cognitive impairment. Methods: The CINAHL, Medline and EMBASE databases were searched from their inception until January 2012. Reference lists of the included papers, foreign language papers and review articles obtained were manually searched. Results: Twelve articles were included in the review, containing 1076 people in the supplement groups (intervention) and 748 people in the control groups. Meta-analysis shows there was a significant improvement in weight (p<0.0001), Body Mass Index (BMI) (p<0.0001) and cognition at 6.5 ± 3.9 month follow-up (p=0.002) when supplements were given compared to the control group. Conclusions: Providing ONS drinks has a positive effect on weight gain and cognition at follow-up in older people with dementia. Additional research is required both in comparing nutritional supplements to vitamin/mineral tablets and high protein/calorie shots and in clinical outcomes relevant to hospitalised people with dementia.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Multilocus digenic linkage disequilibria (LD) and their population structure were investigated in eleven landrace populations of barley (Hordeum vulgare ssp. vulgare L.) in Sardinia, using 134 dominant simple-sequence amplified polymorphism markers. The analysis of molecular variance for these markers indicated that the populations were partially differentiated (F_ST = 0.18), and clustered into three geographic areas. Consistent with this population pattern, STRUCTURE analysis allocated individuals from a bulk of all populations into four genetic groups, and these groups also showed geographic patterns. In agreement with other molecular studies in barley, the general level of LD was low (13% of locus pairs, with P < 0.01) in the bulk of 337 lines, and decayed steeply with map distance between markers. The partitioning of multilocus associations into various components indicated that genetic drift and founder effects played a major role in determining the overall genetic makeup of the diversity in these landrace populations, but that epistatic homogenising or diversifying selection was also present. Notably, the variance of the disequilibrium component was relatively high, which implies caution in the pooling of barley lines for association studies. Finally, we compared the analyses of multilocus structure in barley landrace populations with parallel analyses in both composite crosses of barley on the one hand and in natural populations of wild barley on the other. Neither of these serves as a suitable mimic of landraces in barley, which require their own study. Overall, the results suggest that these populations can be exploited for LD mapping if population structure is controlled.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A deeper understanding of random markers is important if they are to be employed for a range of objectives. The sequence specific amplified polymorphism (S-SAP) technique is a powerful genetic analysis tool which exploits the high copy number of retrotransposon long terminal repeats (LTRs) in the plant genome. The distribution and inheritance of S-SAP bands in the barley genome were studied using the Steptoe × Morex (S × M) doubled haploid (DH) population. Six S-SAP primer combinations generated 98 polymorphic bands, and map positions were assigned to all but one band. Eight putative co-dominant loci were detected, representing 16 of the mapped markers. Thus at least 81 of the mapped S-SAP loci were dominant. The markers were distributed along all seven chromosomes and a tendency to cluster was observed. The distribution of S-SAP markers over the barley genome concurred with the knowledge of the high copy number of retrotransposons in plants. This experiment has demonstrated the potential for the S-SAP technique to be applied in a range of analyses such as genetic fingerprinting, marker-assisted breeding, biodiversity assessment and phylogenetic analyses.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Under particular large-scale atmospheric conditions, several windstorms may affect Europe within a short time period. The occurrence of such cyclone families leads to large socioeconomic impacts and cumulative losses. The serial clustering of windstorms is analyzed for the North Atlantic/western Europe. Clustering is quantified as the dispersion (ratio variance/mean) of cyclone passages over a certain area. Dispersion statistics are derived for three reanalysis data sets and a 20-run European Centre Hamburg Version 5/Max Planck Institute Version–Ocean Model Version 1 global climate model (ECHAM5/MPI-OM1 GCM) ensemble. The dependence of the seriality on cyclone intensity is analyzed. Confirming previous studies, serial clustering is identified in reanalysis data sets primarily on both flanks and downstream regions of the North Atlantic storm track. This pattern is a robust feature in the reanalysis data sets. For the whole area, extreme cyclones cluster more than nonextreme cyclones. The ECHAM5/MPI-OM1 GCM is generally able to reproduce the spatial patterns of clustering under recent climate conditions, but some biases are identified. Under future climate conditions (A1B scenario), the GCM ensemble indicates that serial clustering may decrease over the North Atlantic storm track area and parts of western Europe. This decrease is associated with an extension of the polar jet toward Europe, which implies a tendency toward a more regular occurrence of cyclones over parts of the North Atlantic Basin poleward of 50°N and western Europe. An increase in the clustering of cyclones is projected south of Newfoundland. The detected shifts imply a change in the risk of occurrence of cumulative events over Europe under future climate conditions.
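
The dispersion measure named in this abstract (ratio variance/mean of cyclone passages) can be sketched in a few lines. For a Poisson process the variance equals the mean, so a ratio above 1 points to clustering and a ratio below 1 to more regular occurrence. The counts below are invented for illustration, and the use of the population variance is an assumption; the abstract does not say which estimator the study uses.

```python
from statistics import mean, pvariance


def dispersion_ratio(counts):
    """Variance-to-mean ratio of event counts per time window.

    Uses the population variance (an assumption; a sample variance
    would be an equally plausible choice).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return pvariance(counts) / m


# Hypothetical cyclone passages per month over one grid area.
regular = [2, 2, 2, 2, 2, 2]      # evenly spread passages
clustered = [0, 0, 6, 0, 0, 6]    # same total, arriving in bursts

ratio_regular = dispersion_ratio(regular)      # 0: under-dispersed
ratio_clustered = dispersion_ratio(clustered)  # 4: over-dispersed
```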

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.
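
To make the scalability bottleneck concrete, here is a minimal single-process sketch of one iteration of the straightforward parallel formulation: each node computes per-cluster partial sums over its own data partition, and a global reduction then combines them to update the centroids. This shows the baseline that the abstract criticises, not the proposed method. All function names and the sample data are hypothetical, and the "nodes" are simulated as plain Python lists.

```python
def partial_sums(points, centroids):
    """Per-node work: assign local points and accumulate sums and counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Index of the nearest centroid by squared Euclidean distance.
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def global_reduce(partials):
    """The global reduction step: add up every node's sums and counts."""
    k, dim = len(partials[0][0]), len(partials[0][0][0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for sums, counts in partials:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    return total_sums, total_counts


def kmeans_step(partitions, centroids):
    """One iteration: local partial sums, global reduction, centroid update."""
    sums, counts = global_reduce([partial_sums(p, centroids) for p in partitions])
    return [[s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
            for j in range(len(centroids))]


# Two simulated nodes, each holding part of the data.
partitions = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]]
new_centroids = kmeans_step(partitions, [(1.0, 1.0), (9.0, 9.0)])
```

Because `global_reduce` needs a contribution from every node before any centroid can be updated, a single slow or failed node stalls the whole iteration, which is the limitation the abstract targets.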

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a hierarchical clustering method for semantic Web service discovery. The method aims to improve the accuracy and efficiency of traditional service discovery based on the vector space model. Each Web service is converted into a standard vector format through its Web service description document. With the help of WordNet, a semantic analysis is conducted to reduce the dimension of the term vector and to perform semantic expansion to meet the user's service request. The process and algorithm of hierarchical-clustering-based semantic Web service discovery are discussed. Validation is carried out on the dataset.
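
The abstract does not state which linkage criterion or distance the method uses. As a hedged illustration matching the search query, the sketch below applies agglomerative clustering with complete linkage (the distance between two clusters is the largest pairwise distance between their members) to term vectors compared by cosine distance. The function names and the four toy vectors are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def complete_linkage(vectors, n_clusters):
    """Naive agglomerative clustering with the complete-linkage criterion.

    Starts with one cluster per vector and repeatedly merges the two
    clusters whose farthest members are closest, until n_clusters remain.
    Returns clusters as lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the maximum pairwise distance.
                d = max(cosine_distance(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical term vectors for four service descriptions.
service_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
groups = complete_linkage(service_vectors, n_clusters=2)
```

This naive version recomputes every pairwise distance at each merge, which is fine for a handful of vectors but too slow for a real service registry.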

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Cognitive experiments involving motor execution (ME) and motor imagery (MI) have been intensively studied using functional magnetic resonance imaging (fMRI). However, the functional networks of a multitask paradigm that includes both ME and MI have not been widely explored. In this article, we aimed to investigate the functional networks involved in MI and ME using a method combining hierarchical clustering analysis (HCA) and independent component analysis (ICA). Ten right-handed subjects were recruited to participate in a multitask experiment with conditions such as visual cue, MI, ME and rest. The results showed four activation clusters, including parts of the visual network, the ME network, the MI network and parts of the resting state network. Furthermore, the integration among these functional networks was also revealed. The findings further demonstrated that the combined HCA and ICA approach was an effective method for analyzing fMRI data from multitask experiments.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

During the last decades, several windstorm series hit Europe, leading to large aggregated losses. Such storm series are examples of serial clustering of extreme cyclones, presenting a considerable risk for the insurance industry. Clustering of events and return periods of storm series for Germany are quantified based on potential losses using empirical models. Two reanalysis data sets and observations from German weather stations are considered for 30 winters. Histograms of events exceeding selected return levels (1-, 2- and 5-year) are derived. Return periods of historical storm series are estimated based on the Poisson and the negative binomial distributions. Over 4000 years of general circulation model (GCM) simulations forced with current climate conditions are analysed to provide a better assessment of historical return periods. Estimations differ between distributions, for example 40 to 65 years for the 1990 series. For such less frequent series, estimates obtained with the Poisson distribution clearly deviate from empirical data. The negative binomial distribution provides better estimates, even though a sensitivity to return level and data set is identified. The consideration of GCM data permits a strong reduction of uncertainties. The present results support the importance of explicitly considering the clustering of losses for an adequate risk assessment in economic applications.
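
The contrast between the two distributions can be sketched as follows: the return period of a winter with at least k events is taken as the reciprocal of the probability of k or more events, computed once under a Poisson model and once under a negative binomial model with the same mean but a larger variance. The parameter values are invented for illustration and are not the paper's data; the mean/variance parameterisation of the negative binomial is an assumption about how such a model might be fitted.

```python
import math


def poisson_tail(k, lam):
    """P(N >= k) for a Poisson distribution with mean lam."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def negbin_tail(k, mu, var):
    """P(N >= k) for a negative binomial with mean mu and variance var > mu."""
    if var <= mu:
        raise ValueError("negative binomial requires variance > mean")
    r = mu * mu / (var - mu)  # shape parameter
    p = mu / var              # success probability
    below = 0.0
    for i in range(k):
        log_pmf = (math.lgamma(i + r) - math.lgamma(r) - math.lgamma(i + 1)
                   + r * math.log(p) + i * math.log(1.0 - p))
        below += math.exp(log_pmf)
    return 1.0 - below


# Hypothetical: on average 1 loss event per winter; how rare is a
# winter with 4 or more events?
rp_poisson = 1.0 / poisson_tail(4, lam=1.0)          # about 52.7 winters
rp_negbin = 1.0 / negbin_tail(4, mu=1.0, var=2.0)    # 16 winters
```

With identical means, the over-dispersed model assigns more probability to event-rich winters and therefore yields a shorter return period for the storm series, which is the qualitative effect of clustering that the abstract describes.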