986 resultados para Similarity measure


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Object matching is a fundamental operation in data analysis. It typically requires the definition of a similarity measure between the classes of objects to be matched. Instead, we develop an approach which is able to perform matching by requiring a similarity measure only within each of the classes. This is achieved by maximizing the dependency between matched pairs of observations by means of the Hilbert Schmidt Independence Criterion. This problem can be cast as one of maximizing a quadratic assignment problem with special structure and we present a simple algorithm for finding a locally optimal solution.

Relevância:

60.00% 60.00%

Publicador:

Relevância:

60.00% 60.00%

Publicador:

Resumo:

小尺寸目标跟踪是视觉跟踪中的难题。本文使用均值移动算法跟踪小尺寸目标。论文首先分析了基于均值移动的小尺寸目标跟踪算法的两个主要问题:跟踪算法中断和跟踪目标丢失。然后,论文在这两个方面对小尺寸目标跟踪算法进行改进。给出了一种新的直方图单元编号方法,使包含目标颜色分量的直方图单元分布得更为集中紧凑。当候选目标与目标模型不匹配时,给出一种平滑算法来处理候选目标的直方图。论文提出一种新的相似性度量函数,推导了相应的权值计算公式,在此基础上建立了基于均值移动的目标跟踪算法。多段真实场景视频序列的跟踪实验表明,本文提出的算法可以有效地跟踪小尺寸目标,跟踪精度也有一定提高。

Relevância:

60.00% 60.00%

Publicador:

Resumo:

首先利用模糊C-均值聚类算法在多特征形成的特征空间上对图像进行区域分割,并在此基础上对区域进行多尺度小波分解;然后利用柯西函数构造区域的模糊相似度,应用模糊相似度及区域信息量构造加权因子,从而得到融合图像的小波系数;最后利用小波逆变换得到融合图像·采用均方根误差、峰值信噪比、熵、交叉熵和互信息5种准则评价融合算法的性能·实验结果表明,文中方法具有良好的融合特性·

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis presents a learning based approach for detecting classes of objects and patterns with variable image appearance but highly predictable image boundaries. It consists of two parts. In part one, we introduce our object and pattern detection approach using a concrete human face detection example. The approach first builds a distribution-based model of the target pattern class in an appropriate feature space to describe the target's variable image appearance. It then learns from examples a similarity measure for matching new patterns against the distribution-based target model. The approach makes few assumptions about the target pattern class and should therefore be fairly general, as long as the target class has predictable image boundaries. Because our object and pattern detection approach is very much learning-based, how well a system eventually performs depends heavily on the quality of training examples it receives. The second part of this thesis looks at how one can select high quality examples for function approximation learning tasks. We propose an {em active learning} formulation for function approximation, and show for three specific approximation function classes, that the active example selection strategy learns its target with fewer data samples than random sampling. We then simplify the original active learning formulation, and show how it leads to a tractable example selection paradigm, suitable for use in many object and pattern detection problems.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

T. Boongoen and Q. Shen. 'Detecting False Identity through Behavioural Patterns', In Proceedings of International Crime Science Conference, British Library, London UK, 2008. Publisher's online version forthcoming.;The full text is currently unavailable in CADAIR pending approval by the publisher. Sponsorship: UK EPSRC grant EP/D057086

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Object detection and recognition are important problems in computer vision. The challenges of these problems come from the presence of noise, background clutter, large within class variations of the object class and limited training data. In addition, the computational complexity in the recognition process is also a concern in practice. In this thesis, we propose one approach to handle the problem of detecting an object class that exhibits large within-class variations, and a second approach to speed up the classification processes. In the first approach, we show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be jointly solved with using a multiplicative form of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation and implicitly enables feature sharing among foreground training samples. For applications where explicit parameterization of the within-class states is unavailable, a nonparametric formulation of the kernel can be constructed with a proper foreground distance/similarity measure. Detector training is accomplished via standard Support Vector Machine learning. The resulting detectors are tuned to specific variations in the foreground class. They also serve to evaluate hypotheses of the foreground state. When the image masks for foreground objects are provided in training, the detectors can also produce object segmentation. Methods for generating a representative sample set of detectors are proposed that can enable efficient detection and tracking. In addition, because individual detectors verify hypotheses of foreground state, they can also be incorporated in a tracking-by-detection frame work to recover foreground state in image sequences. To run the detectors efficiently at the online stage, an input-sensitive speedup strategy is proposed to select the most relevant detectors quickly. The proposed approach is tested on data sets of human hands, vehicles and human faces. On all data sets, the proposed approach achieves improved detection accuracy over the best competing approaches. In the second part of the thesis, we formulate a filter-and-refine scheme to speed up recognition processes. The binary outputs of the weak classifiers in a boosted detector are used to identify a small number of candidate foreground state hypotheses quickly via Hamming distance or weighted Hamming distance. The approach is evaluated in three applications: face recognition on the face recognition grand challenge version 2 data set, hand shape detection and parameter estimation on a hand data set, and vehicle detection and estimation of the view angle on a multi-pose vehicle data set. On all data sets, our approach is at least five times faster than simply evaluating all foreground state hypotheses with virtually no loss in classification accuracy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper describes progress on a project to utilise case based reasoning methods in the design and manufacture of furniture products. The novel feature of this research is that cases are represented as structures in a relational database of products, components and materials. The paper proposes a method for extending the usual "weighted sum" over attribute similarities for a ·single table to encompass relational structures over several tables. The capabilities of the system are discussed, particularly with respect to differing user objectives, such as cost estimation, CAD, cutting scheme re-use, and initial design. It is shown that specification of a target case as a relational structure combined with suitable weights can fulfil several user functions. However, it is also shown that some user functions cannot satisfactorily be specified via a single target case. For these functions it is proposed to allow the specification of a set of target cases. A derived similarity measure between individuals and sets of cases is proposed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we propose a graph stream clustering algorithm with a unied similarity measure on both structural and attribute properties of vertices, with each attribute being treated as a vertex. Unlike others, our approach does not require an input parameter for the number of clusters, instead, it dynamically creates new sketch-based clusters and periodically merges existing similar clusters. Experiments on two publicly available datasets reveal the advantages of our approach in detecting vertex clusters in the graph stream. We provide a detailed investigation into how parameters affect the algorithm performance. We also provide a quantitative evaluation and comparison with a well-known offline community detection algorithm which shows that our streaming algorithm can achieve comparable or better average cluster purity.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Earthquakes are associated with negative events, such as large number of casualties, destruction of buildings and infrastructures, or emergence of tsunamis. In this paper, we apply the Multidimensional Scaling (MDS) analysis to earthquake data. MDS is a set of techniques that produce spatial or geometric representations of complex objects, such that, objects perceived to be similar/distinct in some sense are placed nearby/distant on the MDS maps. The interpretation of the charts is based on the resulting clusters since MDS produces a different locus for each similarity measure. In this study, over three million seismic occurrences, covering the period from January 1, 1904 up to March 14, 2012 are analyzed. The events, characterized by their magnitude and spatiotemporal distributions, are divided into groups, either according to the Flinn–Engdahl seismic regions of Earth or using a rectangular grid based in latitude and longitude coordinates. Space-time and Space-frequency correlation indices are proposed to quantify the similarities among events. MDS has the advantage of avoiding sensitivity to the non-uniform spatial distribution of seismic data, resulting from poorly instrumented areas, and is well suited for accessing dynamics of complex systems. MDS maps are proven as an intuitive and useful visual representation of the complex relationships that are present among seismic events, which may not be perceived on traditional geographic maps. Therefore, MDS constitutes a valid alternative to classic visualization tools, for understanding the global behavior of earthquakes.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of endmembers and evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used and the cosine-theta coefficient as similarity measure. During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check if the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions . In this paper we follow the lines of a paper of R. Tolosana- Delgado et al. (2005) starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: The compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Kew words: compositional data analysis, biplot, deep sea sediments

Relevância:

60.00% 60.00%

Publicador:

Resumo:

La present tesi, tot i que emmarcada dins de la teoria de les Mesures Semblança Molecular Quántica (MQSM), es deriva en tres àmbits clarament definits: - La creació de Contorns Moleculars de IsoDensitat Electrònica (MIDCOs, de l'anglès Molecular IsoDensity COntours) a partir de densitats electròniques ajustades. - El desenvolupament d'un mètode de sobreposició molecular, alternatiu a la regla de la màxima semblança. - Relacions Quantitatives Estructura-Activitat (QSAR, de l'anglès Quantitative Structure-Activity Relationships). L'objectiu en el camp dels MIDCOs és l'aplicació de funcions densitat ajustades, ideades inicialment per a abaratir els càlculs de MQSM, per a l'obtenció de MIDCOs. Així, es realitza un estudi gràfic comparatiu entre diferents funcions densitat ajustades a diferents bases amb densitats obtingudes de càlculs duts a terme a nivells ab initio. D'aquesta manera, l'analogia visual entre les funcions ajustades i les ab initio obtinguda en el ventall de representacions de densitat obtingudes, i juntament amb els valors de les mesures de semblança obtinguts prèviament, totalment comparables, fonamenta l'ús d'aquestes funcions ajustades. Més enllà del propòsit inicial, es van realitzar dos estudis complementaris a la simple representació de densitats, i són l'anàlisi de curvatura i l'extensió a macromolècules. La primera observació correspon a comprovar no només la semblança dels MIDCOs, sinó la coherència del seu comportament a nivell de curvatura, podent-se així observar punts d'inflexió en la representació de densitats i veure gràficament aquelles zones on la densitat és còncava o convexa. Aquest primer estudi revela que tant les densitats ajustades com les calculades a nivell ab initio es comporten de manera totalment anàloga. En la segona part d'aquest treball es va poder estendre el mètode a molècules més grans, de fins uns 2500 àtoms. Finalment, s'aplica part de la filosofia del MEDLA. Sabent que la densitat electrònica decau ràpidament al allunyar-se dels nuclis, el càlcul d'aquesta pot ser obviat a distàncies grans d'aquests. D'aquesta manera es va proposar particionar l'espai, i calcular tan sols les funcions ajustades de cada àtom tan sols en una regió petita, envoltant l'àtom en qüestió. Duent a terme aquest procés, es disminueix el temps de càlcul i el procés esdevé lineal amb nombre d'àtoms presents en la molècula tractada. En el tema dedicat a la sobreposició molecular es tracta la creació d'un algorisme, així com la seva implementació en forma de programa, batejat Topo-Geometrical Superposition Algorithm (TGSA), d'un mètode que proporcionés aquells alineaments que coincideixen amb la intuïció química. El resultat és un programa informàtic, codificat en Fortran 90, el qual alinea les molècules per parelles considerant tan sols nombres i distàncies atòmiques. La total absència de paràmetres teòrics permet desenvolupar un mètode de sobreposició molecular general, que proporcioni una sobreposició intuïtiva, i també de forma rellevant, de manera ràpida i amb poca intervenció de l'usuari. L'ús màxim del TGSA s'ha dedicat a calcular semblances per al seu ús posterior en QSAR, les quals majoritàriament no corresponen al valor que s'obtindria d'emprar la regla de la màxima semblança, sobretot si hi ha àtoms pesats en joc. Finalment, en l'últim tema, dedicat a la Semblança Quàntica en el marc del QSAR, es tracten tres aspectes diferents: - Ús de matrius de semblança. Aquí intervé l'anomenada matriu de semblança, calculada a partir de les semblances per parelles d'entre un conjunt de molècules. Aquesta matriu és emprada posteriorment, degudament tractada, com a font de descriptors moleculars per a estudis QSAR. Dins d'aquest àmbit s'han fet diversos estudis de correlació d'interès farmacològic, toxicològic, així com de diverses propietats físiques. - Aplicació de l'energia d'interacció electró-electró, assimilat com a una forma d'autosemblança. Aquesta modesta contribució consisteix breument en prendre el valor d'aquesta magnitud, i per analogia amb la notació de l'autosemblança molecular quàntica, assimilar-la com a cas particular de d'aquesta mesura. Aquesta energia d'interacció s'obté fàcilment a partir de programari mecanoquàntic, i esdevé ideal per a fer un primer estudi preliminar de correlació, on s'utilitza aquesta magnitud com a únic descriptor. - Càlcul d'autosemblances, on la densitat ha estat modificada per a augmentar el paper d'un substituent. Treballs previs amb densitats de fragments, tot i donar molt bons resultats, manquen de cert rigor conceptual en aïllar un fragment, suposadament responsable de l'activitat molecular, de la totalitat de l'estructura molecular, tot i que les densitats associades a aquest fragment ja difereixen degut a pertànyer a esquelets amb diferents substitucions. Un procediment per a omplir aquest buit que deixa la simple separació del fragment, considerant així la totalitat de la molècula (calcular-ne l'autosemblança), però evitant al mateix temps valors d'autosemblança no desitjats provocats per àtoms pesats, és l'ús de densitats de Forats de fermi, els quals es troben definits al voltant del fragment d'interès. Aquest procediment modifica la densitat de manera que es troba majoritàriament concentrada a la regió d'interès, però alhora permet obtenir una funció densitat, la qual es comporta matemàticament igual que la densitat electrònica regular, podent-se així incorporar dins del marc de la semblança molecular. Les autosemblances calculades amb aquesta metodologia han portat a bones correlacions amb àcids aromàtics substituïts, podent així donar una explicació al seu comportament. Des d'un altre punt de vista, també s'han fet contribucions conceptuals. S'ha implementat una nova mesura de semblança, la d'energia cinètica, la qual consisteix en prendre la recentment desenvolupada funció densitat d'energia cinètica, la qual al comportar-se matemàticament igual a les densitats electròniques regulars, s'ha incorporat en el marc de la semblança. A partir d'aquesta mesura s'han obtingut models QSAR satisfactoris per diferents conjunts moleculars. Dins de l'aspecte del tractament de les matrius de semblança s'ha implementat l'anomenada transformació estocàstica com a alternativa a l'ús de l'índex Carbó. Aquesta transformació de la matriu de semblança permet obtenir una nova matriu no simètrica, la qual pot ser posteriorment tractada per a construir models QSAR.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present an efficient graph-based algorithm for quantifying the similarity of household-level energy use profiles, using a notion of similarity that allows for small time–shifts when comparing profiles. Experimental results on a real smart meter data set demonstrate that in cases of practical interest our technique is far faster than the existing method for computing the same similarity measure. Having a fast algorithm for measuring profile similarity improves the efficiency of tasks such as clustering of customers and cross-validation of forecasting methods using historical data. Furthermore, we apply a generalisation of our algorithm to produce substantially better household-level energy use forecasts from historical smart meter data.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

OBJECTIVES: To develop a method for objective assessment of fine motor timing variability in Parkinson’s disease (PD) patients, using digital spiral data gathered by a touch screen device. BACKGROUND: A retrospective analysis was conducted on data from 105 subjects including65 patients with advanced PD (group A), 15 intermediate patients experiencing motor fluctuations (group I), 15 early stage patients (group S), and 10 healthy elderly subjects (HE) were examined. The subjects were asked to perform repeated upper limb motor tasks by tracing a pre-drawn Archimedes spiral as shown on the screen of the device. The spiral tracing test was performed using an ergonomic pen stylus, using dominant hand. The test was repeated three times per test occasion and the subjects were instructed to complete it within 10 seconds. Digital spiral data including stylus position (x-ycoordinates) and timestamps (milliseconds) were collected and used in subsequent analysis. The total number of observations with the test battery were as follows: Swedish group (n=10079), Italian I group (n=822), Italian S group (n = 811), and HE (n=299). METHODS: The raw spiral data were processed with three data processing methods. To quantify motor timing variability during spiral drawing tasks Approximate Entropy (APEN) method was applied on digitized spiral data. APEN is designed to capture the amount of irregularity or complexity in time series. APEN requires determination of two parameters, namely, the window size and similarity measure. In our work and after experimentation, window size was set to 4 and similarity measure to 0.2 (20% of the standard deviation of the time series). The final score obtained by APEN was normalized by total drawing completion time and used in subsequent analysis. The score generated by this method is hence on denoted APEN. In addition, two more methods were applied on digital spiral data and their scores were used in subsequent analysis. The first method was based on Digital Wavelet Transform and Principal Component Analysis and generated a score representing spiral drawing impairment. The score generated by this method is hence on denoted WAV. The second method was based on standard deviation of frequency filtered drawing velocity. The score generated by this method is hence on denoted SDDV. Linear mixed-effects (LME) models were used to evaluate mean differences of the spiral scores of the three methods across the four subject groups. Test-retest reliability of the three scores was assessed after taking mean of the three possible correlations (Spearman’s rank coefficients) between the three test trials. Internal consistency of the methods was assessed by calculating correlations between their scores. RESULTS: When comparing mean spiral scores between the four subject groups, the APEN scores were different between HE subjects and three patient groups (P=0.626 for S group with 9.9% mean value difference, P=0.089 for I group with 30.2%, and P=0.0019 for A group with 44.1%). However, there were no significant differences in mean scores of the other two methods, except for the WAV between the HE and A groups (P<0.001). WAV and SDDV were highly and significantly correlated to each other with a coefficient of 0.69. However, APEN was not correlated to neither WAV nor SDDV with coefficients of 0.11 and 0.12, respectively. Test-retest reliability coefficients of the three scores were as follows: APEN (0.9), WAV(0.83) and SD-DV (0.55). CONCLUSIONS: The results show that the digital spiral analysis-based objective APEN measure is able to significantly differentiate the healthy subjects from patients at advanced level. In contrast to the other two methods (WAV and SDDV) that are designed to quantify dyskinesias (over-medications), this method can be useful for characterizing Off symptoms in PD. The APEN was not correlated to none of the other two methods indicating that it measures a different construct of upper limb motor function in PD patients than WAV and SDDV. The APEN also had a better test-retest reliability indicating that it is more stable and consistent over time than WAV and SDDV.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

One of the content-based image retrieval techniques is the shape-based technique, which allows users to ask for objects similar in shape to a query object. Sajjanhar and Lu proposed a method for shape representation and similarity measure called the grid-based method [1]. They have shown that the method is effective for the retrieval of segmented objects based on shape. In this paper, we describe a system which uses the grid-based method for retrieval of images with multiple objects. We perform experiments on the prototype system to compare the performance of the grid-based method with the Fourier descriptors method [2]. Preliminary results have been presented.