3 results for distance measures
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of triangular inequality, thus making use of most existing indexing and data mining algorithms.
We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
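The abstract above does not state a formula, so the following is only a minimal sketch of how a complexity correction could work, under stated assumptions: the base distance is Euclidean, the complexity of a series is estimated from the length of its "stretched out" line (the root sum of squared consecutive differences), and the base distance is scaled by the ratio of the larger estimate to the smaller one. The function names `euclidean`, `complexity_estimate`, and `complexity_corrected_distance` are hypothetical and chosen for illustration.

```python
import math


def euclidean(q, c):
    """Plain Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(s):
    """Assumed complexity estimate: root sum of squared consecutive differences."""
    return math.sqrt(sum((s[i + 1] - s[i]) ** 2 for i in range(len(s) - 1)))


def complexity_corrected_distance(q, c):
    """Euclidean distance scaled by the ratio of the two complexity estimates.

    The factor is >= 1, so the result is never smaller than the Euclidean
    distance; pairs with very different complexities are pushed further apart.
    """
    ed = euclidean(q, c)
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if low == 0:
        # Both series constant: no correction. Only one constant: the ratio
        # is unbounded, which this sketch reports as infinity.
        return ed if high == 0 else math.inf
    return ed * (high / low)
```

Because the correction factor is at least 1, the Euclidean distance is a lower bound on this corrected distance, which is consistent with the abstract's remark that the measure can be lower bounded for use with existing indexing methods.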
Abstract:
Background: The sieve analysis for the Step trial found evidence that breakthrough HIV-1 sequences for MRKAd5/HIV-1 Gag/Pol/Nef vaccine recipients were more divergent from the vaccine insert than placebo sequences in regions with predicted epitopes. We linked the viral sequence data with immune response and acute viral load data to explore mechanisms for and consequences of the observed sieve effect. Methods: Ninety-one male participants (37 placebo and 54 vaccine recipients) were included; viral sequences were obtained at the time of HIV-1 diagnosis. T-cell responses were measured 4 weeks post-second vaccination and at the first or second week post-diagnosis. Acute viral load was obtained at RNA-positive and antibody-negative visits. Findings: Vaccine recipients had a greater magnitude of post-infection CD8+ T cell response than placebo recipients (median 1.68% vs 1.18%; p = 0.04) and greater breadth of post-infection response (median 4.5 vs 2; p = 0.06). Viral sequences for vaccine recipients were marginally more divergent from the insert than placebo sequences in regions of Nef targeted by pre-infection immune responses (p = 0.04; Pol p = 0.13; Gag p = 0.89). Magnitude and breadth of pre-infection responses did not correlate with distance of the viral sequence to the insert (p > 0.50). Acute log viral load trended lower in vaccine versus placebo recipients (estimated mean 4.7 vs 5.1) but the difference was not significant (p = 0.27). Neither was acute viral load associated with distance of the viral sequence to the insert (p > 0.30). Interpretation: Despite evidence of anamnestic responses, the sieve effect was not well explained by available measures of T-cell immunogenicity. Sequence divergence from the vaccine was not significantly associated with acute viral load. While point estimates suggested weak vaccine suppression of viral load, the result was not significant and more viral load data would be needed to detect suppression.
Abstract:
Objective. Spondyloarthritides (SpA) can present different disease spectra according to ethnic background. The Brazilian Registry of Spondyloarthritis (RBE) is a nationwide registry that comprises a large databank on clinical, functional, and treatment data on Brazilian patients with SpA. The aim of our study was to analyze the influence of ethnic background in SpA disease patterns in a large series of Brazilian patients. Methods. A common protocol of investigation was prospectively applied to 1318 SpA patients in 29 centers distributed through the main geographical regions in Brazil. The group comprised whites (65%), African Brazilians (31.3%), and people of mixed origins (3.7%). Clinical and demographic variables and various disease index scores were compiled. Ankylosing spondylitis (AS) was the most frequent disease in the group (65.1%); others were psoriatic arthritis (18.3%), undifferentiated SpA (6.8%), enteropathic arthritis (3.7%), and reactive arthritis (3.4%). Results. White patients were significantly associated with psoriasis (p = 0.002), positive HLA-B27 (p = 0.014), and use of corticosteroids (p < 0.0001). Hip involvement (p = 0.02), axial inflammatory pain (p = 0.04), and radiographic sacroiliitis (p = 0.025) were associated with African Brazilian descent. Sex distribution, family history, and presence of peripheral arthritis, uveitis, dactylitis, urethritis, and inflammatory bowel disease were similar in the 3 groups, as well as age at disease onset, time from first symptom until diagnosis, and use of anti-tumor necrosis factor-α agents (p > 0.05). Schober test and thoracic expansion were similar in the 3 groups, whereas African Brazilians had higher Maastricht Ankylosing Spondylitis Enthesitis Scores (p = 0.005) and decreased lateral lumbar flexion (p = 0.003), while whites had a higher occiput-to-wall distance (p = 0.02). African Brazilians reported a worse patient global assessment of disease (p = 0.011).
Other index scores and prevalence of work incapacity were similar in the 3 groups, although African Brazilians had worse performance in the Ankylosing Spondylitis Quality of Life questionnaire (p < 0.001). Conclusion. Ethnic background is associated with distinct clinical aspects of SpA in Brazilian patients. African Brazilian patients with SpA have a poorer quality of life and report worse disease compared to whites. (First Release Nov 1 2011; J Rheumatol 2012;39:141-7; doi:10.3899/jrheum.110372)