953 results for Euclidean distance model
Abstract:
Dissertation to Obtain the Degree of Master in Biomedical Engineering
Abstract:
The objective of this dissertation was to study a set of companies listed on the Lisbon stock exchange in order to identify those that behave similarly over time. To this end, we used clustering algorithms such as K-Means, PAM, hierarchical models, Fanny, and C-Means, with both the Euclidean distance and the Manhattan distance. To select the best number of clusters identified by each of the algorithms tested, we relied on several cluster evaluation/validation indices, such as Davies-Bouldin and Calinski-Harabasz, among others.
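As a minimal sketch of the two distance measures named in this abstract, the following compares them on two short series. The function names and the sample values are illustrative assumptions, not taken from the dissertation.

```python
import math

def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block (L1) distance between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical price series for two companies (illustrative values only).
series_a = [1.0, 2.0, 3.0]
series_b = [4.0, 6.0, 3.0]
print(euclidean(series_a, series_b))  # 5.0
print(manhattan(series_a, series_b))  # 7.0
```

The Manhattan distance sums the coordinate differences directly, so it is never smaller than the Euclidean distance for the same pair of points.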
Abstract:
Report for the scientific sojourn at the University of Bern, Switzerland, from March until June 2008. Writer identification consists of determining the writer of a piece of handwriting from a set of writers. Even though a significant number of compositions contain handwritten text in the music scores, the aim of the work is to use only music notation to determine the author. Two approaches have been developed for writer identification in old handwritten music scores. The proposed methods extract features from every music line, as well as features from a texture image of music symbols. The music sheet is first preprocessed to obtain a binarized music score without the staff lines. The classification is performed using a k-NN classifier based on Euclidean distance. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving encouraging identification rates.
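The classification step can be sketched as a plain k-NN vote over Euclidean distances. This is only an illustration under assumed inputs: the feature vectors, the writer labels, and the `knn_predict` helper are hypothetical and do not reproduce the features extracted in the report.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two writers (illustrative values only).
train = [
    ([0.0, 0.0], "writer_A"),
    ([0.0, 1.0], "writer_A"),
    ([1.0, 0.0], "writer_A"),
    ([5.0, 5.0], "writer_B"),
    ([5.0, 6.0], "writer_B"),
    ([6.0, 5.0], "writer_B"),
]
print(knn_predict(train, [0.5, 0.5]))  # writer_A
print(knn_predict(train, [5.5, 5.5]))  # writer_B
```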
Abstract:
A statistical methodology for the objective comparison of LDI-MS mass spectra of blue gel pen inks was evaluated. Thirty-three blue gel pen inks previously studied by Raman spectroscopy were analyzed directly on the paper in both positive and negative mode. The obtained mass spectra were first compared on the relative areas of selected peaks, using the Pearson correlation coefficient and the Euclidean distance. Intra-variability among results from one ink and inter-variability between results from different inks were compared in order to choose a differentiation threshold minimizing the rate of false negatives (i.e. avoiding false differentiation of the inks). This yielded a discriminating power (DP) of up to 77% for analyses made in the negative mode. The whole mass spectra were then compared using the same methodology, allowing for a better DP of 92% in the negative mode using the Pearson correlation on standardized data. The positive mode results generally yielded a lower DP than the negative mode, due to a higher intra-variability relative to the inter-variability in the mass spectra of the ink samples.
Abstract:
The complex relationship between structural and functional connectivity, as measured by noninvasive imaging of the human brain, poses many unresolved challenges and open questions. Here, we apply analytic measures of network communication to the structural connectivity of the human brain and explore the capacity of these measures to predict resting-state functional connectivity across three independently acquired datasets. We focus on the layout of shortest paths across the network and on two communication measures, search information and path transitivity, which account for how these paths are embedded in the rest of the network. Search information is an existing measure of the information needed to access or trace shortest paths; we introduce path transitivity to measure the density of local detours along the shortest path. We find that both search information and path transitivity predict the strength of functional connectivity among both connected and unconnected node pairs. They do so at levels that match or significantly exceed those of path length measures, Euclidean distance, and computational models of neural dynamics. This capacity suggests that dynamic couplings due to interactions among neural elements in brain networks are substantially influenced by the broader network context adjacent to the shortest communication pathways.
Abstract:
Workgroup diversity can be conceptualized as variety, separation, or disparity. Thus, the proper operationalization of diversity depends on how a diversity dimension has been defined. Analytically, minimal diversity must be obtained when there are no differences on an attribute among the members of a group; however, maximal diversity has a different shape for each conceptualization of diversity. Previous work on diversity indexes indicated maximum values for variety (e.g., Blau's index and Teachman's index), separation (e.g., standard deviation and mean Euclidean distance), and disparity (e.g., coefficient of variation and the Gini coefficient of concentration), although these maximum values are not valid for all group characteristics (i.e., group size and group size parity) and attribute scales (i.e., number of categories). We demonstrate analytically appropriate upper boundaries for conditional diversity determined by some specific group characteristics, avoiding the bias related to absolute diversity. This will allow applied researchers to make better interpretations regarding the relationship between group diversity and group outcomes.
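To make two of the indexes named above concrete, here is a small sketch of Blau's index (variety) and the coefficient of variation (disparity), using their standard textbook definitions. The helper names and the example groups are assumptions made for illustration; the paper's own upper-bound formulas are not reproduced here.

```python
from collections import Counter
import statistics

def blau_index(categories):
    """Blau's index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    return 1.0 - sum((count / n) ** 2 for count in Counter(categories).values())

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical group of four members spread over three categories.
group = ["a", "a", "b", "c"]
print(blau_index(group))                 # 0.625
print(blau_index(["a", "a", "a", "a"]))  # 0.0
print(coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

With K = 3 categories the unconditional ceiling of Blau's index is (K - 1) / K = 2/3, but four members cannot be split evenly over three categories, so the most even split (2, 1, 1) reaches only 0.625. This is one instance of a maximum that depends on group size.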
Abstract:
Tire traces can be observed at many crime scenes, as vehicles are often used by criminals. The tread abrasion on the road, while braking or skidding, leads to the production of small rubber particles which can be collected for comparison purposes. This research focused on the statistical comparison of Py-GC/MS profiles of tire traces and tire treads. The optimisation of the analytical method was carried out using experimental designs. The aim was to determine the best pyrolysis parameters regarding the repeatability of the results. The effect of each pyrolysis factor could thus also be calculated. The pyrolysis temperature was found to be five times more important than the pyrolysis time. Finally, a pyrolysis at 650 °C for 15 s was selected. Ten tires of different manufacturers and models were used for this study. Several samples were collected on each tire, and several replicates were carried out to study the variability within each tire (intravariability). More than eighty compounds were integrated for each analysis, and the variability study showed that more than 75% of them presented a relative standard deviation (RSD) below 5% for the ten tires, thus supporting a low intravariability. The variability between the ten tires (intervariability) presented higher values, and the ten most variant compounds had an RSD value above 13%, supporting their high potential for discriminating between the tires tested. Principal Component Analysis (PCA) was able to fully discriminate the ten tires with the help of the first three principal components. The ten tires were finally used to perform braking tests on a racetrack with a vehicle equipped with an anti-lock braking system. The resulting tire traces were adequately collected using sheets of white gelatine. As for the tires, the intravariability of the traces was found to be lower than the intervariability.
Clustering methods were applied, and Ward's method based on the squared Euclidean distance was able to correctly group all of the tire trace replicates in the same cluster as the replicates of their corresponding tire. Blind tests on traces were performed, and the traces were correctly assigned to their tire source. These results support the hypothesis that the tested tires, of different manufacturers and models, can be discriminated by a statistical comparison of their chemical profiles. The traces were found to be not differentiable from their source but differentiable from all the other tires present in the subset. The results are promising and will be extended to a larger sample set.
Abstract:
The objective of this work was to propose a way of using Tocher's method of clustering to obtain a matrix similar to the cophenetic one obtained for hierarchical methods, which would allow the calculation of a cophenetic correlation. To illustrate how the proposed cophenetic matrix is obtained, we used two dissimilarity matrices, one obtained with the generalized squared Mahalanobis distance and the other with the Euclidean distance, between 17 garlic cultivars, based on six morphological characters. Basically, the proposal for obtaining the cophenetic matrix was to use the average distances within and between clusters, after performing the clustering. A function in the R language was proposed to compute the cophenetic matrix for Tocher's method. The empirical distribution of this correlation coefficient was briefly studied. For both dissimilarity measures, the values of cophenetic correlation obtained for Tocher's method were higher than those obtained with the hierarchical methods (Ward's algorithm and average linkage, UPGMA). Comparisons between the clusterings made with the agglomerative hierarchical methods and with Tocher's method can be performed using a common criterion: the correlation between the matrices of original and cophenetic distances.
Abstract:
MOTIVATION: Comparative analyses of gene expression data from different species have become an important component of the study of molecular evolution. Thus methods are needed to estimate evolutionary distances between expression profiles, as well as a neutral reference to estimate selective pressure. Divergence between expression profiles of homologous genes is often calculated with Pearson's or Euclidean distance. Neutral divergence is usually inferred from randomized data. Despite being widely used, neither of these two steps has been well studied. Here, we analyze these methods formally and on real data, highlight their limitations and propose improvements. RESULTS: It has been demonstrated that Pearson's distance, in contrast to Euclidean distance, leads to underestimation of the expression similarity between homologous genes with a conserved uniform pattern of expression. Here, we first extend this study to genes with a conserved but specific pattern of expression. Surprisingly, we find that both Pearson's and Euclidean distances used as a measure of expression similarity between genes depend on the expression specificity of those genes. We also show that the Euclidean distance depends strongly on data normalization. Next, we show that the randomization procedure that is widely used to estimate the rate of neutral evolution is biased when broadly expressed genes are abundant in the data. To overcome this problem, we propose a novel randomization procedure that is unbiased with respect to expression profiles present in the datasets. Applying our method to the mouse and human gene expression data suggests significant gene expression conservation between these species. CONTACT: marc.robinson-rechavi@unil.ch; sven.bergmann@unil.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
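The contrast between the two measures can be illustrated with a small sketch. It assumes the common definition of Pearson's distance as 1 - r and uses made-up, nearly uniform profiles; it does not implement the randomization procedure proposed in the paper.

```python
import math

def pearson_distance(a, b):
    """1 - r, where r is the Pearson correlation of the two profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.dist(a, b)

# Hypothetical, nearly uniform expression profiles across four tissues.
gene_1 = [10.0, 10.1, 9.9, 10.0]
gene_2 = [10.0, 9.9, 10.1, 10.0]
print(pearson_distance(gene_1, gene_2))    # close to 2: looks maximally dissimilar
print(euclidean_distance(gene_1, gene_2))  # about 0.28: profiles are nearly identical

# Rescaling one profile leaves Pearson's distance essentially unchanged
# but changes the Euclidean distance substantially.
scaled = [2.0 * x for x in gene_2]
print(pearson_distance(gene_1, scaled))
print(euclidean_distance(gene_1, scaled))
```

For nearly uniform profiles the correlation is driven by small fluctuations, which is why Pearson's distance can report two almost identical profiles as dissimilar, while the Euclidean value shifts with any rescaling of the data.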
Abstract:
A novel Fe³⁺-selective, turn-on fluorescent probe 1 incorporating a rhodamine fluorophore and a quinoline subunit was synthesized. Probe 1 displayed high selectivity for Fe³⁺ in CH₃CN–H₂O (95:5 v/v) in the presence of other relevant metal cations. Interaction with Fe³⁺ in 1:1 stoichiometry could trigger a significant fluorescence enhancement due to the formation of the ring-open form. The fluorescent response images were investigated by a novel Euclidean distance method based on red, green, and blue values. A linear relationship was observed between fluorescence intensity changes and Fe³⁺ concentrations from 7.3 × 10⁻⁷ to 3.6 × 10⁻⁵ mol L⁻¹.
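One plausible reading of a Euclidean distance computed on red, green, and blue values is the distance between an image's mean colour and that of a blank. The sketch below assumes exactly that; the paper's actual procedure is not described in the abstract, and the colour values and the `rgb_distance` helper are hypothetical.

```python
import math

def rgb_distance(color, reference):
    """Euclidean distance between two (R, G, B) triples."""
    return math.dist(color, reference)

# Hypothetical mean RGB values (0-255) of a blank and two response images.
blank = (200, 200, 200)
low_fe = (203, 196, 200)
high_fe = (230, 160, 200)
print(rgb_distance(low_fe, blank))   # 5.0
print(rgb_distance(high_fe, blank))  # 50.0
```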
Abstract:
In this work, new color-matching functions are derived from the CIE x y z color-matching functions by a linear transformation. The required transformation matrix is found by optimizing the CIE and BFD-RIT color-difference ellipses in the Matlab environment. The work yielded a transformation matrix, the new color-matching functions obtained with it, and a CIELAB-type color space. Measured by Euclidean distance, the shape and size of the CIE and BFD-RIT color-difference ellipses improved by about one third, which was also the goal.
Abstract:
Euclidean distance matrix analysis (EDMA) methods are used to determine whether a significant difference exists between conformational samples of antibody complementarity determining region (CDR) loops, namely the isolated L1 loop and L1 in a three-loop assembly (L1, L3 and H3), obtained from Monte Carlo simulation. After a significant difference is detected, the specific inter-Cα distance that contributes to the difference is identified using EDMA. The estimated and improved mean forms of the conformational samples of the isolated L1 loop and the L1 loop in the three-loop assembly, CDR loops of the antibody binding site, are described using EDMA and distance geometry (DGEOM). To the best of our knowledge, this is the first time that EDMA methods have been used to analyze conformational samples of molecules obtained from Monte Carlo simulations. Therefore, the EDMA methods must be validated using both positive and negative control tests for the conformational samples of the isolated L1 loop and L1 in the three-loop assembly. The EDMA-I bootstrap null hypothesis tests showed false positive results for the comparison of six samples of the isolated L1 loop and true positive results for the comparison of conformational samples of the isolated L1 loop and L1 in the three-loop assembly. The bootstrap confidence interval tests revealed true negative results for comparisons of six samples of the isolated L1 loop, and false negative results for the conformational comparisons between the isolated L1 loop and L1 in the three-loop assembly. Different conformational sample sizes are further explored by combining the samples of the isolated L1 loop to increase the sample size, or by clustering the sample using a self-organizing map (SOM) to narrow the conformational distribution of the samples being compared. However, no improvement is obtained for either the bootstrap null hypothesis tests or the confidence interval tests.
These results show that more work is required before EDMA methods can be used reliably as a method for comparison of samples obtained by Monte Carlo simulations.
Abstract:
This study examines the distance traveled to commit a crime in Gatineau in 2006. Few recent Canadian studies have addressed the subject. Moreover, there is a gap in knowledge about offender mobility in small towns and suburbs. The present research aims to compare three different distance measures, to check whether the distance traveled varies with the type of crime, and to see whether time variables (day of the week, time of day and season) as well as certain characteristics of the suspects (age, sex and place of residence) have an impact on the distance traveled. For each crime, the suspect's address and the crime location were geocoded in order to compute the distance between the two points. Analysis of the shape of the distance curves shows that only sexual assaults exhibit a buffer zone. The results of the statistical analyses indicate that young suspects are more mobile than older ones and that men travel a greater distance than women. Surprisingly, the distance traveled does not differ significantly by season or time of day. Finally, compared with other criminals, offenders who committed a robbery are those who traveled the greatest distances.
Abstract:
We propose to build a 3D digital atlas containing the average characteristics and the variability of the morphology of an organ. Our work is applied in particular to the construction of a 3D digital atlas of the entire human cornea, including the anterior and posterior surfaces, from the topographic maps provided by the Orbscan II topographer. We first normalize an entire population of corneas. In this step, we rely on the ICP (iterative closest point) registration algorithm to simultaneously align the anterior and posterior surfaces of a population of corneas to the anterior and posterior surfaces of a reference cornea. Specifically, we developed a variant of the ICP algorithm adapted to cornea images (maps) that accounts for scale changes during registration and that relies on a neighborhood search using the Euclidean distance to establish the correspondence between points. We then built the corneal atlas by computing the means of the elevations of the registered anterior and posterior surfaces and their associated standard deviations. A population of 100 healthy corneas was used to build the normal corneal atlas. To visualize the atlas, we used color topographic maps similar to those already offered by current topography systems. Finally, observations were made on the corneal atlas that reflect its precision and help develop a better understanding of corneal anatomy.
Abstract:
This thesis investigates the potential use of zerocrossing information for speech sample estimation. It provides a new method to estimate speech samples using composite zerocrossings. A simple linear interpolation technique is developed for this purpose. By using this method, the A/D converter can be avoided in a speech coder. The newly proposed zerocrossing sampling theory is supported with results of computer simulations using real speech data. The thesis also presents two methods for voiced/unvoiced classification. One of these methods is based on a distance measure which is a function of the short-time zerocrossing rate and the short-time energy of the signal. The other is based on the attractor dimension and entropy of the signal. Of these two methods, the first is simple and requires only very few computations compared to the other. This method is used in a later chapter to design an enhanced Adaptive Transform Coder. The later part of the thesis addresses a few problems in Adaptive Transform Coding and presents an improved ATC. The transform coefficient with maximum amplitude is considered as 'side information'. This enables more accurate bit assignment and step-size computation. A new bit reassignment scheme is also introduced in this work. Finally, an ATC which applies switching between the Discrete Cosine Transform and the Discrete Walsh-Hadamard Transform for voiced and unvoiced speech segments respectively is presented. Simulation results are provided to show the improved performance of the coder.
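The two quantities behind the first voiced/unvoiced classifier can be sketched as follows. The definitions used here (fraction of sign changes, mean squared amplitude) are common textbook forms, and the frames are invented for illustration; the thesis's actual distance measure combining them is not given in the abstract and is not reproduced.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

# Hypothetical frames: a slow, strong oscillation and a fast, weak one.
voiced_like = [4.0, 3.0, 1.0, -1.0, -3.0, -4.0, -3.0, -1.0]
unvoiced_like = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
print(zero_crossing_rate(voiced_like), short_time_energy(voiced_like))
print(zero_crossing_rate(unvoiced_like), short_time_energy(unvoiced_like))
```

Voiced speech typically shows a low zerocrossing rate with high energy and unvoiced speech the opposite, which is what makes the pair usable as inputs to a simple classifier.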