997 results for complete linkage clustering


Relevance: 20.00%

Abstract:

We present a novel approach to improving subspace clustering by exploiting spatial constraints. The new method encourages the sparse solution to be consistent with the spatial geometry of the tracked points by embedding weights into the sparse formulation. By doing so, we are able to correct sparse representations in a principled manner without introducing much additional computational cost. We discuss alternative ways to treat missing and corrupted data using the latest theory in robust lasso regression and suggest numerical algorithms to solve the proposed formulation. Experiments on the benchmark Johns Hopkins 155 dataset demonstrate that exploiting spatial constraints significantly improves motion segmentation.
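One way to read "embedding weights into the sparse formulation" is as a weighted lasso problem for each tracked point. The form below is only a sketch under that assumption: the symbols x_i (trajectory of point i), X (matrix of all trajectories), c_i (coefficient vector), lambda (trade-off parameter) and the weights w_ij (assumed to grow with the spatial distance between points i and j) are illustrative and are not taken from the abstract.

```latex
\min_{c_i}\; \lVert W_i c_i \rVert_1
  + \frac{\lambda}{2}\,\lVert x_i - X c_i \rVert_2^2
\quad \text{subject to} \quad c_{ii} = 0,
\qquad W_i = \operatorname{diag}(w_{i1}, \dots, w_{iN})
```

Under this reading, a large w_ij makes it expensive to represent point i using a spatially distant point j, which pushes the sparse solution toward spatially nearby points.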

Relevance: 20.00%

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on utilizing machine learning techniques to classify network traffic based on packet and flow level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
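The abstract does not give the details of the modified algorithm, so the following is only a minimal sketch of one common way to honor "same cluster" (must-link) constraints in K-Means: flows joined by a constraint are merged into groups, and each group is assigned as a unit to the centroid nearest to the group mean. The function names, the toy two-feature flows and the initial centroids are all hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(v[d] for v in vectors) / len(vectors)
            for d in range(len(vectors[0]))]


def must_link_groups(n, must_link):
    """Merge indices connected by must-link pairs (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def constrained_kmeans(points, init_centroids, must_link, max_iter=100):
    """K-Means in which every must-link group shares one label."""
    k = len(init_centroids)
    groups = must_link_groups(len(points), must_link)
    group_means = [mean([points[i] for i in g]) for g in groups]
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        new_labels = labels[:]
        # Assignment step: a whole group goes to the centroid closest to
        # its mean, which minimizes the group's summed squared distance.
        for g, gm in zip(groups, group_means):
            best = min(range(k), key=lambda j: euclidean(gm, centroids[j]))
            for i in g:
                new_labels[i] = best
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: ordinary K-Means centroid recomputation.
        for j in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == j]
            if members:
                centroids[j] = mean(members)
    return labels, centroids


# Toy flows described by two made-up features each.
flows = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (2.6, 2.6)]
start = [(0.0, 0.0), (5.0, 5.0)]

# Flow 4 is known (hypothetically, from header correlation) to belong
# to the same application as flow 0.
labels, centroids = constrained_kmeans(flows, start, must_link=[(0, 4)])
unconstrained, _ = constrained_kmeans(flows, start, must_link=[])
```

In this toy run the borderline flow 4 follows flow 0 when the constraint is present and joins the other cluster when it is absent, which illustrates how a single piece of side information can change an assignment.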

Relevance: 20.00%

Abstract:

This paper presents a new spectral clustering method called correlation preserving indexing (CPI), which is performed in the correlation similarity measure space. In this framework, the documents are projected into a low-dimensional semantic space in which the correlations between the documents in the local patches are maximized while the correlations between the documents outside these patches are minimized simultaneously. Since the intrinsic geometrical structure of the document space is often embedded in the similarities between the documents, correlation as a similarity measure is more suitable for detecting the intrinsic geometrical structure of the document space than Euclidean distance. Consequently, the proposed CPI method can effectively discover the intrinsic structures embedded in high-dimensional document space. The effectiveness of the new method is demonstrated by extensive experiments conducted on various data sets and by comparison with existing document clustering methods.
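A small example may help show why correlation can suit documents better than Euclidean distance. The sketch below assumes correlation means the normalized inner product of two term-count vectors (the abstract does not spell out its definition), and the three toy documents are invented for illustration.

```python
import math


def correlation(u, v):
    """Normalized inner product of two term-count vectors (assumed definition)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between two term-count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


short_doc = [1, 2, 0, 3]     # term counts of a short document
long_doc = [10, 20, 0, 30]   # same term proportions, ten times longer
other_doc = [3, 0, 2, 1]     # similar length, different vocabulary use

corr_long = correlation(short_doc, long_doc)
corr_other = correlation(short_doc, other_doc)
dist_long = euclidean(short_doc, long_doc)
dist_other = euclidean(short_doc, other_doc)

print(f"correlation: long={corr_long:.3f}, other={corr_other:.3f}")
print(f"euclidean:   long={dist_long:.3f}, other={dist_other:.3f}")
```

Here Euclidean distance ranks the unrelated document as closer simply because its length is similar, while correlation ranks the longer document with identical term proportions as the better match.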

Relevance: 20.00%

Abstract:

Context Bariatric surgery results in sustained weight loss; reduced incidence of diabetes, cardiovascular events, and cancer; and improved survival. The long-term effect on health care use is unknown.

Objective To assess health care use over 20 years by obese patients treated conventionally or with bariatric surgery.

Design, Setting, and Participants
The Swedish Obese Subjects study is an ongoing, prospective, nonrandomized, controlled intervention study conducted in the Swedish health care system that included 2010 adults who underwent bariatric surgery and 2037 contemporaneously matched controls recruited between 1987 and 2001. Inclusion criteria were age 37 years to 60 years and body mass index of 34 or higher in men and 38 or higher in women. Exclusion criteria were identical in both groups.

Interventions Of the surgery patients, 13% underwent gastric bypass, 19% gastric banding, and 68% vertical-banded gastroplasty. Controls received conventional obesity treatment.

Main Outcome Measures Annual hospital days (follow-up years 1 to 20; data capture 1987-2009; median follow-up 15 years) and nonprimary care outpatient visits (years 2-20; data capture 2001-2009; median follow-up 9 years) were retrieved from the National Patient Register, and drug costs from the Prescribed Drug Register (years 7-20; data capture 2005-2011; median follow-up 6 years). Registry linkage was complete for more than 99% of patients (4044 of 4047). Mean differences were adjusted for baseline age, sex, smoking, diabetes status, body mass index, inclusion period, and (for the inpatient care analysis) hospital days the year before the index date.

Results In the 20 years following their bariatric procedure, surgery patients used a total of 54 mean cumulative hospital days compared with 40 used by those in the control group (adjusted difference, 15; 95% CI, 2-27; P = .03). During the years 2 through 6, surgery patients had an accumulated annual mean of 1.7 hospital days vs 1.2 days among control patients (adjusted difference, 0.5; 95% CI, 0.2 to 0.7; P < .001). From year 7 to 20, both groups had a mean annual 1.8 hospital days (adjusted difference, 0.0; 95% CI, −0.3 to 0.3; P = .95). Surgery patients had a mean annual 1.3 nonprimary care outpatient visits during the years 2 through 6 vs 1.1 among the controls (adjusted difference, 0.3; 95% CI, 0.1 to 0.4; P = .003), but from year 7, the 2 groups did not differ (1.8 vs 1.9 mean annual visits; adjusted difference, −0.2; 95% CI, −0.4 to 0.1; P = .12). From year 7 to 20, the surgery group incurred a mean annual drug cost of US $930; the control patients, $1123 (adjusted difference, −$228; 95% CI, −$335 to −$121; P < .001).

Conclusions Compared with controls, surgically treated patients used more inpatient and nonprimary outpatient care during the first 6-year period after undergoing bariatric surgery but not thereafter. Drug costs from years 7 through 20 were lower for surgery patients than for control patients.

Relevance: 20.00%

Abstract:

Satellite image processing is a complex task that has received considerable attention from many researchers. In this paper, an interactive image query system for satellite imagery searching and retrieval is proposed. As in most image retrieval systems, extraction of image features is the most important step and has a great impact on retrieval performance. Thus, a new technique that fuses color and texture features for segmentation is introduced. The applicability of the proposed technique is assessed using a database containing multispectral satellite imagery. The experiments demonstrate that the proposed segmentation technique improves the quality of the segmentation results as well as the retrieval performance.

Relevance: 20.00%

Abstract:

In this study, an interactive Content-Based Image Retrieval (CBIR) system that allows searching and retrieving images from databases is designed and developed. Based on the fuzzy c-means clustering algorithm, the CBIR system fuses color and texture features in image segmentation. A technique to form compound queries based on the combined features of different images is devised. This technique gives users better control over the search criteria, so a higher retrieval performance can be achieved. A database consisting of skin cancer imagery is used to demonstrate the applicability of the CBIR system.
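For readers unfamiliar with fuzzy c-means, the following is a minimal sketch of the standard algorithm applied to fused feature vectors. It is not the system described in the abstract: the (color, texture) pairs, the initial centers and the fuzzifier m = 2 are assumptions chosen for illustration.

```python
import math


def distance(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def update_memberships(points, centers, m=2.0):
    """Standard FCM membership update; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [distance(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a center: give it full membership
            # there (shared equally if several centers coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
        )
    return memberships


def update_centers(points, memberships, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)  # assumed non-zero for this toy data
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(n_iter):
        memberships = update_memberships(points, centers, m)
        centers = update_centers(points, memberships, m)
    return memberships, centers


# Hypothetical fused (color, texture) features for five image regions.
regions = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.85, 0.90), (0.50, 0.50)]
memberships, centers = fuzzy_c_means(regions, [(0.0, 0.0), (1.0, 1.0)])
for region, row in zip(regions, memberships):
    print(region, [round(u, 3) for u in row])
```

Unlike hard clustering, the region in the middle of the feature space keeps a substantial membership in both clusters, which is the property that makes the method attractive for ambiguous image regions.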

Relevance: 20.00%

Abstract:

Background: The effectiveness of lifestyle interventions in reducing diabetes incidence has been well established. Little is known, however, about factors influencing the reach of diabetes prevention programs. This study examines the predictors of enrolment in the Sydney Diabetes Prevention Program (SDPP), a community-based diabetes prevention program conducted in general practice in New South Wales, Australia, from 2008 to 2011.

Methods:
SDPP was an effectiveness trial. Participating general practitioners (GPs) from three Divisions of General Practice invited individuals aged 50–65 years without known diabetes to complete the Australian Type 2 Diabetes Risk Assessment tool. Individuals at high risk of diabetes were invited to participate in a lifestyle modification program. A multivariate model using generalized estimating equations to control for clustering of enrolment outcomes by GPs was used to examine independent predictors of enrolment in the program. Predictors included age, gender, indigenous status, region of birth, socio-economic status, family history of diabetes, history of high glucose, use of anti-hypertensive medication, smoking status, fruit and vegetable intake, physical activity level and waist measurement.

Results:
Of the 1821 eligible people identified as high risk, one third chose not to enrol in the lifestyle program. In multivariate analysis, physically inactive individuals (OR: 1.48, P = 0.004) and those with a family history of diabetes (OR: 1.67, P = 0.000) and history of high blood glucose levels (OR: 1.48, P = 0.001) were significantly more likely to enrol in the program. However, high risk individuals who smoked (OR: 0.52, P = 0.000), were born in a country with high diabetes risk (OR: 0.52, P = 0.000), were taking blood pressure lowering medications (OR: 0.80, P = 0.040) and consumed little fruit and vegetables (OR: 0.76, P = 0.047) were significantly less likely to take up the program.

Conclusions: Targeted strategies are likely to be needed to engage groups such as smokers and high risk ethnic groups. Further research is required to better understand factors influencing enrolment in diabetes prevention programs in the primary health care setting, both at the GP and individual level.

Relevance: 20.00%

Abstract:

We present a novel method for document clustering using sparse representation of documents in conjunction with spectral clustering. An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities. This document affinity is encoded through a graph on which spectral clustering is performed. The decomposition into multiple subspaces allows documents to be part of a sub-group that shares a smaller set of similar vocabulary, thus allowing for cleaner clusters. Extensive experimental evaluations on two real-world datasets from the Reuters-21578 and 20Newsgroup corpora show that our proposed method consistently outperforms state-of-the-art algorithms. Significantly, the performance improvement over other methods is prominent for these datasets.
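The step from sparse coefficients to a graph can be sketched as follows. This assumes the construction commonly used in sparse subspace clustering, where the affinity is the symmetrized magnitude of the coefficient matrix; the abstract does not confirm this exact choice, and the toy coefficient matrix is invented. The ℓ1 solver and the eigenvector step of spectral clustering are deliberately left out.

```python
import math


def affinity_from_coefficients(C):
    """Symmetric affinity W = |C| + |C|^T from a square coefficient matrix."""
    n = len(C)
    return [[abs(C[i][j]) + abs(C[j][i]) for j in range(n)] for i in range(n)]


def normalized_laplacian(W):
    """L = I - D^(-1/2) W D^(-1/2), the matrix spectral clustering decomposes."""
    n = len(W)
    degree = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [
        [(1.0 if i == j else 0.0) - inv_sqrt[i] * W[i][j] * inv_sqrt[j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical sparse codes: row i holds the coefficients that express
# document i in terms of the other documents (zero diagonal).
C = [
    [0.0, 0.9, 0.0, 0.0],
    [0.7, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.1],
    [0.0, 0.0, -0.8, 0.0],
]

W = affinity_from_coefficients(C)
L = normalized_laplacian(W)
for row in W:
    print([round(w, 2) for w in row])
```

In this toy case documents 0 and 1 only use each other in their representations, as do documents 2 and 3, so the affinity graph splits into two disconnected components that a spectral method would recover as two clusters.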

Relevance: 20.00%

Abstract:

Distal clavicle fracture associated with complete coracoclavicular ligament disruption represents an unstable injury, and osteosynthesis is recommended. This study was performed (1) to retrospectively analyse the clinico-radiological outcomes of two internal fixation techniques, and (2) to identify and analyse radiographic fracture patterns that are associated with this injury. Conclusions: Internal fixation of this fracture pattern is associated with a high union rate and favorable clinical outcomes with both techniques. A combination of distal radius plate and ligament reconstruction device resulted in stable fixation and significantly lower reoperation rates, and should be used when fracture geometry permits (Types 1 and 2).

Relevance: 20.00%

Abstract:

In this paper we propose a novel sparse subspace clustering method that regularizes sparse subspace representation by exploiting the structural sharing between tasks and data points via group sparse coding. We derive simple, provably convergent, and computationally efficient algorithms for solving the proposed group formulations. We demonstrate the advantage of the framework on three challenging benchmark datasets ranging from medical record data to image and text clustering and show that it consistently outperforms rival methods.

Relevance: 20.00%

Abstract:

In recent years, significant effort has been given to predicting protein functions from protein interaction data generated from high throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering-based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and on prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, clustering itself is a dynamic process, and function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering-based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets, and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.