7 results for Community detection
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Abstract:
In this paper we propose a graph stream clustering algorithm with a unified similarity measure on both structural and attribute properties of vertices, with each attribute being treated as a vertex. Unlike others, our approach does not require an input parameter for the number of clusters; instead, it dynamically creates new sketch-based clusters and periodically merges existing similar clusters. Experiments on two publicly available datasets reveal the advantages of our approach in detecting vertex clusters in the graph stream. We provide a detailed investigation into how parameters affect the algorithm performance. We also provide a quantitative evaluation and comparison with a well-known offline community detection algorithm which shows that our streaming algorithm can achieve comparable or better average cluster purity.
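The general idea of clustering a stream without fixing the number of clusters in advance can be sketched as follows. This is a simplified illustration and not the authors' algorithm: it keeps exact sets rather than sketches, uses Jaccard similarity as a stand-in for the unified similarity measure, and the class name `StreamClusterer` and the parameters `join_threshold`, `merge_threshold` and `merge_every` are hypothetical.

```python
from dataclasses import dataclass, field


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Cluster:
    members: set = field(default_factory=set)  # vertex ids assigned so far
    profile: set = field(default_factory=set)  # neighbours and attributes seen


class StreamClusterer:
    """Toy streaming clusterer: creates clusters on demand, merges periodically."""

    def __init__(self, join_threshold=0.3, merge_threshold=0.5, merge_every=100):
        self.join_threshold = join_threshold    # hypothetical parameter
        self.merge_threshold = merge_threshold  # hypothetical parameter
        self.merge_every = merge_every          # hypothetical parameter
        self.clusters = []
        self.seen = 0

    def add(self, vertex, neighbours, attributes):
        # Attributes are tagged and placed alongside neighbours, so structure
        # and attributes are compared with a single similarity measure.
        features = set(neighbours) | {("attr", a) for a in attributes}
        best = max(
            self.clusters,
            key=lambda c: jaccard(features, c.profile),
            default=None,
        )
        if best is not None and jaccard(features, best.profile) >= self.join_threshold:
            target = best
        else:
            # No sufficiently similar cluster exists: create a new one.
            target = Cluster()
            self.clusters.append(target)
        target.members.add(vertex)
        target.profile |= features
        self.seen += 1
        if self.seen % self.merge_every == 0:
            self.merge_similar()

    def merge_similar(self):
        """Fold each cluster into the first kept cluster it resembles."""
        merged = []
        for cluster in self.clusters:
            for kept in merged:
                if jaccard(cluster.profile, kept.profile) >= self.merge_threshold:
                    kept.members |= cluster.members
                    kept.profile |= cluster.profile
                    break
            else:
                merged.append(cluster)
        self.clusters = merged
```

A real streaming implementation would replace the exact `profile` sets with fixed-size sketches so that memory stays bounded as the stream grows.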
Abstract:
Background: Anaerobic bacteria are increasingly regarded as important in cystic fibrosis (CF) pulmonary infection. The aim of this study was to determine the effect of antibiotic treatment on aerobic and anaerobic microbial community diversity and abundance during exacerbations in patients with CF.
Methods: Sputum was collected at the start and completion of antibiotic treatment of exacerbations and when clinically stable. Bacteria were quantified and identified following culture, and community composition was also examined using culture-independent methods.
Results: Pseudomonas aeruginosa or Burkholderia cepacia complex were detected by culture in 24/26 samples at the start of treatment, 22/26 samples at completion of treatment and 11/13 stable samples. Anaerobic bacteria were detected in all start of treatment and stable samples and in 23/26 completion of treatment samples. Molecular analysis showed greater bacterial diversity within sputum samples than was detected by culture; there was reasonably good agreement between the methods for the presence or absence of aerobic bacteria such as P aeruginosa (kappa=0.74) and B cepacia complex (kappa=0.92), but agreement was poorer for anaerobes. Both methods showed that the composition of the bacterial community varied between patients but remained relatively stable in most individuals despite treatment. Bacterial abundance decreased transiently following treatment, with this effect more evident for aerobes (median decrease in total viable count 2.3 x 10^7 cfu/g, p=0.005) than for anaerobes (median decrease in total viable count 3 x 10^6 cfu/g, p=0.046).
Conclusion: Antibiotic treatment targeted against aerobes had a minimal effect on abundance of anaerobes and community composition, with both culture and molecular detection methods required for comprehensive characterisation of the microbial community in the CF lung. Further studies are required to determine the clinical significance of and optimal treatment for these newly identified bacteria.
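The kappa values quoted in the Results measure agreement between two detection methods beyond what chance alone would produce. A minimal sketch of Cohen's kappa for presence/absence calls is given below, assuming each method yields one boolean per sample; the function name `cohen_kappa` and the example data are illustrative and are not taken from the study.

```python
def cohen_kappa(method_a, method_b):
    """Cohen's kappa for two equal-length sequences of presence/absence calls."""
    if len(method_a) != len(method_b) or not method_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(method_a)
    # Observed agreement: fraction of samples where both methods agree.
    observed = sum(a == b for a, b in zip(method_a, method_b)) / n
    # Chance agreement from each method's marginal detection rate.
    p_a = sum(bool(a) for a in method_a) / n
    p_b = sum(bool(b) for b in method_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both methods are constant and identical
    return (observed - expected) / (1 - expected)


# Made-up example: culture vs. molecular calls for one organism in 8 samples.
culture = [True, True, True, False, False, False, True, False]
molecular = [True, True, False, False, False, False, True, True]
kappa = cohen_kappa(culture, molecular)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance.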
Abstract:
Background: There is growing interest in the potential utility of molecular diagnostics in improving the detection of life-threatening infection (sepsis). LightCycler® SeptiFast is a multipathogen probe-based real-time PCR system targeting DNA sequences of bacteria and fungi present in blood samples within a few hours. We report here the protocol of the first systematic review of published clinical diagnostic accuracy studies of this technology when compared with blood culture in the setting of suspected sepsis. Methods/design: Data sources: the Cochrane Database of Systematic Reviews, the Database of Abstracts of Reviews of Effects (DARE), the Health Technology Assessment Database (HTA), the NHS Economic Evaluation Database (NHSEED), The Cochrane Library, MEDLINE, EMBASE, ISI Web of Science, BIOSIS Previews, MEDION and the Aggressive Research Intelligence Facility Database (ARIF). Study selection: diagnostic accuracy studies that compare the real-time PCR technology with standard culture results performed on a patient's blood sample during the management of sepsis. Data extraction: three reviewers, working independently, will determine the level of evidence, methodological quality and a standard data set relating to demographics and diagnostic accuracy metrics for each study. Statistical analysis/data synthesis: heterogeneity of studies will be investigated using a coupled forest plot of sensitivity and specificity and a scatter plot in Receiver Operator Characteristic (ROC) space. A bivariate model will be used to estimate summary sensitivity and specificity. The authors will investigate reporting biases using funnel plots based on effective sample size and regression tests of asymmetry. Subgroup analyses are planned for adults, children and infection setting (hospital vs community) if sufficient data are uncovered.
Dissemination: Recommendations will be made to the Department of Health (as part of an open-access HTA report) as to whether the real-time PCR technology has sufficient clinical diagnostic accuracy potential to move forward to efficacy testing during the provision of routine clinical care.
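The per-study accuracy metrics that such a review extracts can be illustrated with a short sketch. It assumes each study reports a 2x2 table of PCR results against blood culture as the reference standard; the function name `diagnostic_accuracy` and the counts are hypothetical. The bivariate pooling step named in the protocol is not shown here.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference standard.

    tp: index positive, reference positive    fp: index positive, reference negative
    fn: index negative, reference positive    tn: index negative, reference negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = tp / (tp + fn)  # share of culture-positive samples the test detects
    specificity = tn / (tn + fp)  # share of culture-negative samples the test clears
    return sensitivity, specificity


# Hypothetical counts for a single study (not data from any real study).
sens, spec = diagnostic_accuracy(tp=40, fp=5, fn=10, tn=95)
```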
Abstract:
Introduction and Aims: Persistent bacterial infection is a major cause of morbidity and mortality in patients with both Cystic Fibrosis (CF) and non-CF Bronchiectasis (non-CFBX). Numerous studies have shown that CF and non-CFBX airways are colonised by a complex microbiota. However, many bacteria are difficult, if not impossible, to culture by conventional laboratory techniques. Therefore, molecular detection techniques offer a more comprehensive view of bacterial diversity within clinical specimens. The objective of this study was to characterise and compare bacterial diversity and relative abundance in patients with CF and non-CFBX during exacerbation and when clinically stable.
Methods: Sputum samples were collected from CF (n=50 samples) and non-CFBX (n=52 samples) patients at the start and end of treatment for an infective exacerbation and when clinically stable. Pyrosequencing was used to assess the microbial diversity and relative genera (or the closest possible taxonomic order) abundance within the samples. Each sequence read was defined based on 3% difference.
Results: High-throughput pyrosequencing allowed a sensitive and detailed examination of microbial community composition. Rich microbial communities were apparent within both CF (171 species-level phylotypes per genus) and non-CFBX airways (144 species-level phylotypes per genus). Relative species distribution within those two environments was considerably different; however, relatively few genera formed a core of microorganisms, representing approximately 90% of all sequences, which dominated both environments. Relative abundance based on observed operational taxonomic units demonstrated that the most abundant bacteria in CF were Pseudomonas (28%), Burkholderia (22%), Streptococcus (13%), family Pseudomonadaceae (8%) and Prevotella (6%). In contrast, the most commonly detected operational taxonomic units in non-CFBX were Haemophilus (22%), Streptococcus (14%), other (unassigned taxa) (11%), Pseudomonas (10%), Veillonella (7%) and Prevotella (6%).
Conclusions: These results suggest that distinctive microbial communities are associated with infection and/or colonisation in patients with both CF and non-CFBX. Although relatively high species richness was observed within the two environments, each was dominated by different core taxa. This suggests that differences in the lung environment of these two diseases may affect adaptability of the relevant bacterial taxa.
Abstract:
This paper presents a machine learning approach to sarcasm detection on Twitter in two languages – English and Czech. Although there has been some research in sarcasm detection in languages other than English (e.g., Dutch, Italian, and Brazilian Portuguese), our work is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manually-labeled tweets and provide it to the community. We evaluate two classifiers with various combinations of features on both the Czech and English datasets. Furthermore, we tackle the issues of rich Czech morphology by examining different preprocessing techniques. Experiments show that our language-independent approach significantly outperforms adapted state-of-the-art methods in English (F-measure 0.947) and also represents a strong baseline for further research in Czech (F-measure 0.582).
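For readers unfamiliar with the metric, the F-measure combines precision and recall into a single score. The sketch below computes the balanced F1 variant from raw counts; the abstract does not say which variant or averaging scheme the authors used, so this is an assumption, and the function name and counts are hypothetical.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)  # share of predicted-sarcastic tweets that are sarcastic
    recall = tp / (tp + fn)     # share of sarcastic tweets that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a sarcasm classifier on a small test set.
score = f_measure(tp=8, fp=2, fn=2)
```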
Abstract:
The problem of detecting spatially-coherent groups of data that exhibit anomalous behavior has started to attract attention due to applications across areas such as epidemic analysis and weather forecasting. Earlier efforts from the data mining community have largely focused on finding outliers, individual data objects that display deviant behavior. Such point-based methods are not easy to extend to find groups of data that exhibit anomalous behavior. Scan Statistics are methods from the statistics community that have considered the problem of identifying regions where data objects exhibit a behavior that is atypical of the general dataset. The spatial scan statistic and methods that build upon it mostly adopt the framework of defining a character for regions (e.g., circular or elliptical) of objects and repeatedly sampling regions of such character followed by applying a statistical test for anomaly detection. In the past decade, there have been efforts from the statistics community to enhance efficiency of scan statistics as well as to enable discovery of arbitrarily shaped anomalous regions. On the other hand, the data mining community has started to look at determining anomalous regions that have behavior divergent from their neighborhood. In this chapter, we survey the space of techniques for detecting anomalous regions on spatial data from across the data mining and statistics communities while outlining connections to well-studied problems in clustering and image segmentation. We analyze the techniques systematically by categorizing them appropriately to provide a structured bird's-eye view of the work on anomalous region detection; we hope that this would encourage better cross-pollination of ideas across communities to help advance the frontier in anomaly detection.
Abstract:
Rheumatic heart disease (RHD) is the largest cardiac cause of morbidity and mortality in the world's youth. Early detection of RHD through echocardiographic screening in asymptomatic children may identify an early stage of disease, when secondary prophylaxis has the greatest chance of stopping disease progression. Latent RHD signifies echocardiographic evidence of RHD with no known history of acute rheumatic fever and no clinical symptoms.
OBJECTIVE: Determine the prevalence of latent RHD among children ages 5-16 in Lilongwe, Malawi.
DESIGN: This is a cross-sectional study in which children ages 5 through 16 were screened for RHD using echocardiography.
SETTING: Screening was conducted in 3 schools and surrounding communities in the Lilongwe district of Malawi between February and April 2014.
OUTCOME MEASURES: Children were diagnosed as having no, borderline, or definite RHD as defined by World Heart Federation criteria. The primary reader completed offline reads of all studies. A second reader reviewed all of the studies diagnosed as RHD, plus a selection of normal studies. A third reader served as tiebreaker for discordant diagnoses. The distribution of results was compared between gender, location, and age categories using Fisher's exact test.
RESULTS: The prevalence of latent RHD was 3.4% (95% CI = 2.45, 4.31), with 0.7% definite RHD and 2.7% borderline RHD. There were no significant differences in prevalence between gender (P = .44), site (P = .6), urban vs. peri-urban (P = .75), or age (P = .79). Of those with definite RHD, all were diagnosed because of pathologic mitral regurgitation (MR) and 2 morphologic features of the mitral valve. Of those with borderline RHD, most met the criteria by having pathological MR (92.3%).
CONCLUSION: Malawi has a high rate of latent RHD, which is consistent with other results from sub-Saharan Africa. This study strongly supports the need for a RHD prevention and control program in Malawi.
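A prevalence estimate with a 95% confidence interval, as reported in the Results, can be computed from screening counts along the following lines. The abstract states neither the number of children screened nor the interval method, so the counts below are hypothetical and the normal-approximation (Wald) interval is just one common choice; the function name `prevalence_ci` is illustrative.

```python
import math


def prevalence_ci(cases, screened, z=1.96):
    """Prevalence with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a 95% interval. Bounds are clipped to [0, 1].
    """
    if screened <= 0 or not 0 <= cases <= screened:
        raise ValueError("need 0 <= cases <= screened and screened > 0")
    p = cases / screened
    half_width = z * math.sqrt(p * (1 - p) / screened)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts, not the study's data: 50 cases among 1000 screened.
p, low, high = prevalence_ci(cases=50, screened=1000)
```

For small samples or prevalences close to zero, an interval such as Wilson's is usually preferred over the Wald interval.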