974 results for k-nearest neighbours


Relevance:

90.00% 90.00%

Publisher:

Abstract:

One of the greatest technological challenges at present is to generate and maintain, efficiently and consistently, a database of multimedia objects, in particular of images. The need to develop automatic search methods based on the semantic content of images has become of the utmost importance. MPEG-7 is a standard that describes the content of multimedia data and supports these operational requirements. It adds a set of low-level audiovisual descriptors. The histogram is the feature most commonly used to represent the global characteristics of an image. This work uses the Edge Histogram Descriptor (EHD), which yields a low-level representation that allows the similarity between images to be computed. A semantic characterisation of the image is obtained from this descriptor using two classification methods: the k-Nearest Neighbors (k-NN) algorithm and a backpropagation Neural Network (NN). In the k-NN algorithm, the Euclidean distance between the descriptors of two images is used to compute the similarity between different images. The NN requires a prior learning process, which includes responding correctly to the training samples and to the test samples. At the end of this work, a study of the results of the two classification methods is presented.
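The k-NN step described in this abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a numeric descriptor vector; the function names, the toy 4-bin vectors, and the labels "building" and "landscape" are hypothetical and do not come from the work itself (an actual MPEG-7 EHD has 80 bins).

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, training_set, k=3):
    """Label `query` by majority vote among its k nearest training descriptors.

    `training_set` is a list of (descriptor, label) pairs.
    """
    nearest = sorted(training_set, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy descriptors and labels, invented for illustration only.
training = [
    ([0.9, 0.1, 0.0, 0.0], "building"),
    ([0.8, 0.2, 0.0, 0.0], "building"),
    ([0.1, 0.1, 0.4, 0.4], "landscape"),
    ([0.0, 0.2, 0.5, 0.3], "landscape"),
]
predicted = knn_classify([0.85, 0.15, 0.0, 0.0], training, k=3)
```

With k=3 the two "building" descriptors outvote the single "landscape" neighbour, so the query is labelled "building".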

Relevance:

90.00% 90.00%

Publisher:

Abstract:

BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) systems. METHODS A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to lie within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the latter two also analysed with an LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS- or PCA-reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when an NMSE-PLS-LMNN feature extraction method was used in combination with an SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be valid solutions for the presented problem. One of the advances is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also makes (in combination with NMSE and PLS) the variation of this rate more stable. In addition, their generalization ability is another advance, since several experiments were performed on two image modalities (SPECT and PET).

Relevance:

90.00% 90.00%

Publisher:

Abstract:

There is now an emerging need for an efficient modeling strategy to develop a new generation of monitoring systems. One method of approaching the modeling of complex processes is to obtain a global model. It should be able to capture the basic or general behavior of the system, by means of a linear or quadratic regression, and then superimpose a local model on it that can capture the localized nonlinearities of the system. In this paper, a novel method based on a hybrid incremental modeling approach is designed and applied for tool wear detection in turning processes. It involves a two-step iterative process that combines a global model with a local model to take advantage of their underlying, complementary capacities. Thus, the first step constructs a global model using a least squares regression. A local model using the fuzzy k-nearest-neighbors smoothing algorithm is obtained in the second step. A comparative study then demonstrates that the hybrid incremental model provides better error-based performance indices for detecting tool wear than a transductive neurofuzzy model and an inductive neurofuzzy model.
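The global-plus-local idea in this abstract can be sketched as follows. This is a single-pass illustration under stated assumptions, not the authors' two-step iterative method: the global model is a one-variable least-squares line, and the local model is a plain inverse-distance-weighted k-nearest-neighbour average of the residuals, standing in for the fuzzy k-nearest-neighbours smoothing named in the abstract. All function names and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (the global model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def knn_smooth(x, xs, residuals, k=3, eps=1e-9):
    """Inverse-distance-weighted mean of the k nearest residuals (the local model)."""
    nearest = sorted(zip(xs, residuals), key=lambda p: abs(p[0] - x))[:k]
    weights = [1.0 / (abs(xi - x) + eps) for xi, _ in nearest]
    return sum(w * r for w, (_, r) in zip(weights, nearest)) / sum(weights)


def hybrid_predict(x, xs, ys, k=3):
    """Global linear trend plus a local correction learned from its residuals."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * xi) for xi, y in zip(xs, ys)]
    return a + b * x + knn_smooth(x, xs, residuals, k)


# Invented measurements: a roughly linear trend with small local deviations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.2, 1.9, 3.3, 4.0]
estimate = hybrid_predict(2.0, xs, ys)
```

Because the local weights grow as the query approaches a training point, the hybrid estimate at x = 2.0 is pulled almost exactly onto the observed value there, while the global line alone would miss it.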

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The EU enlargement is scheduled to take place in 2004. After this date, it should be a priority for the EU to develop a coherent and comprehensive policy towards its nearest neighbours, i.e. countries bordering the Member States which cannot join the EU in the near future due to their location or the weaknesses of their political and economic systems. There are at least three reasons for this. Firstly, good relations with neighbours will underlie the broadly understood security of the Community. Relations with the nearest neighbours will determine both the military security of the EU (including the combating of terrorism) and its ability to prevent other threats such as illegal migration, smuggling, etc. Secondly, good economic relations with neighbours may contribute to the Member States' economic growth in the longer term. And finally, the EU's ability to develop an effective and adequate policy towards its nearest neighbours will demonstrate its competence as a subject of international politics. In other words, the EU will not be recognised as a reliable political player on the global scene until it develops an effective strategy for its neighbourhood.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

A k-NN query finds the k nearest neighbors of a given point from a point database. When it is sufficient to measure object distance using the Euclidean distance, the key to efficient k-NN query processing is to fetch and check the distances of a minimum number of points from the database. For many applications, such as vehicle movement along road networks or rover and animal movement along terrain surfaces, the distance is only meaningful when it is along a valid movement path. For this type of k-NN query, the focus of efficient query processing is to minimize the cost of computing distances using the environment data (such as the road network data and the terrain data), which can be several orders of magnitude larger than the point data. Efficient processing of k-NN queries based on the Euclidean distance or the road network distance has been investigated extensively in the past. In this paper, we investigate the problem of surface k-NN query processing, where the distance is calculated from the shortest path along a terrain surface. This problem is very challenging, as the terrain data can be very large and the computational cost of finding shortest paths is very high. We propose an efficient solution based on multiresolution terrain models. Our approach eliminates the need for the costly process of finding shortest paths by ranking objects using estimated lower and upper bounds of distance on multiresolution terrain models.
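Ranking by distance bounds, as mentioned at the end of this abstract, can be illustrated with a small pruning sketch. It assumes each object already has an estimated (lower, upper) bound on its surface distance to the query point; how the paper derives those bounds from multiresolution terrain models is not shown here, and the function name and sample values are hypothetical.

```python
def knn_candidates(bounds, k):
    """Return the objects that may still belong to the k nearest neighbours.

    `bounds` maps an object id to a (lower, upper) pair bounding its true
    distance to the query. An object is discarded when its lower bound
    exceeds the k-th smallest upper bound, because at least k other objects
    are then guaranteed to be closer.
    """
    if not 1 <= k <= len(bounds):
        raise ValueError("k must be between 1 and the number of objects")
    uppers = sorted(upper for _, upper in bounds.values())
    threshold = uppers[k - 1]
    return {obj for obj, (lower, _) in bounds.items() if lower <= threshold}


# Invented bounds for four objects.
bounds = {
    "a": (1.0, 2.0),
    "b": (1.5, 3.0),
    "c": (2.5, 4.0),
    "d": (3.5, 5.0),
}
survivors = knn_candidates(bounds, k=2)
```

For k = 2 the second-smallest upper bound is 3.0, so object "d" (lower bound 3.5) is pruned without any shortest-path computation, while "c" must be kept until tighter bounds are available.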

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Alkali tantalates and niobates, including K(Ta / Nb)O3, Li(Ta / Nb)O3 and Na(Ta / Nb)O3, are a very promising ferroic family of lead-free compounds with perovskite-like structures. Their versatile properties make them potentially interesting for current and future applications in microelectronics, photocatalysis, energy and biomedicine. Among them, potassium tantalate, KTaO3 (KTO), has been attracting interest as an alternative to the well-known strontium titanate, SrTiO3 (STO). KTO is a perovskite oxide with quantum paraelectric behaviour when electrically stimulated and a highly polarizable lattice, giving the opportunity to tailor its properties via external or internal stimuli. However, problems related to the fabrication of either bulk or 2D nanostructures mean that KTO is not yet a viable alternative to STO. Within this context, and to contribute scientifically to leveraging the applications of tantalate-based compounds, the main goals of this thesis are: i) to produce and characterise thin films of alkali tantalates by chemical solution deposition on rigid Si-based substrates, at reduced temperatures to be compatible with Si technology, ii) to fill scientific knowledge gaps in these relevant functional materials related to their energetics, and iii) to exploit alternative applications for alkali tantalates, such as photocatalysis. Concerning the synthesis, attention was given to understanding the phase formation in potassium tantalate synthesized via distinct routes, in order to control the crystallization of the desired perovskite structure and to avoid low-temperature pyrochlore or K-deficient phases. The phase formation process in alkali tantalates has been analysed far less deeply than in the case of Pb-containing perovskites; therefore, the work was initially focused on the process-phase relationship to identify the driving forces responsible for regulating the synthesis. A comparison of the phase formation paths in the conventional solid-state reaction and the sol-gel method was conducted. 
The structural analyses revealed that the intermediate pyrochlore K2Ta2O6 structure is not formed at any stage of the reaction using the conventional solid-state reaction. On the other hand, in solution-based processes, such as the alkoxide-based route, the crystallization of the perovskite occurs through the intermediate pyrochlore phase; at low temperatures pyrochlore is dominant, and it is transformed to perovskite at >800 °C. The kinetic analysis carried out using the Johnson-Mehl-Avrami-Kolmogorov model and quantitative X-ray diffraction (XRD) demonstrated that in sol-gel derived powders the crystallization occurs in two stages: i) at the early stage of the reaction, dominated by primary nucleation, the mechanism is phase-boundary controlled, and ii) at the second stage, the low value of the Avrami exponent, n ~ 0.3, does not match any reported category, thus not permitting an easy identification of the mechanism. Then, in collaboration with Prof. Alexandra Navrotsky's group from the University of California at Davis (USA), thermodynamic studies were conducted using high-temperature oxide melt solution calorimetry. The enthalpies of formation of three structures were calculated: pyrochlore, perovskite and tetragonal tungsten bronze K6Ta10.8O30 (TTB). The enthalpies of formation from the corresponding oxides, ∆Hfox, for KTaO3, KTa2.2O6 and K6Ta10.8O30 are -203.63 ± 2.84 kJ/mol, -358.02 ± 3.74 kJ/mol, and -1252.34 ± 10.10 kJ/mol, respectively, whereas from the elements, ∆Hfel, for KTaO3, KTa2.2O6 and K6Ta10.8O30 they are -1408.96 ± 3.73 kJ/mol, -2790.82 ± 6.06 kJ/mol, and -13393.04 ± 31.15 kJ/mol, respectively. The possible decomposition reactions of K-deficient KTa2.2O6 pyrochlore to KTaO3 perovskite and Ta2O5 (reaction 1) or to TTB K6Ta10.8O30 and Ta2O5 (reaction 2) were proposed, and the enthalpies were calculated to be 308.79 ± 4.41 kJ/mol and 895.79 ± 8.64 kJ/mol for reaction 1 and reaction 2, respectively. 
The reactions are strongly endothermic, indicating that these decompositions are energetically unfavourable, since it is unlikely that any entropy term could override such a large positive enthalpy. The energetic studies show that pyrochlore is an energetically more stable phase than perovskite at low temperature. Thus, the local order of the amorphous precipitates drives the crystallization into the most favourable structure, which is the pyrochlore one with similar local organization; the distance between nearest neighbours in the amorphous or short-range ordered phase is very close to that in pyrochlore. Taking into account the stoichiometric deviation in the KTO system, the selection of the most appropriate fabrication / deposition technique in thin-film technology is a key issue, especially concerning complex ferroelectric oxides. Chemical solution deposition has been widely reported as a processing method to grow KTO thin films, but the classical alkoxide route allows the perovskite phase to crystallize only at temperatures >800 °C, while the temperature endurance of platinized Si wafers is ~700 °C. Therefore, alternative diol-based routes, with distinct potassium carboxylate precursors, were developed, aiming to stabilize the precursor solution, to avoid using toxic solvents and to decrease the crystallization temperature of the perovskite phase. Studies on powders revealed that in the case of KTOac (solution based on potassium acetate), a mixture of perovskite and pyrochlore phases is detected at temperatures as low as 450 °C, and a gradual transformation towards the perovskite structure occurs as the temperature increases up to 750 °C; however, the desired monophasic KTaO3 perovskite phase is not achieved. 
In the case of KTOacac (solution with potassium acetylacetonate), a broad peak characteristic of amorphous structures is detected at temperatures <650 °C, while at higher temperatures diffraction lines from pyrochlore and perovskite phases are visible and a monophasic perovskite KTaO3 is formed at >700 °C. Infrared analysis indicated that the differences are due to a strong deformation of the carbonate-based structures upon heating. A series of thin films of alkali tantalates were spin-coated onto Si-based substrates using diol-based routes. Interestingly, monophasic perovskite KTaO3 films deposited using the KTOacac solution were obtained at temperatures as low as 650 °C; films were annealed in a rapid thermal furnace in an oxygen atmosphere for 5 min with a heating rate of 30 °C/s. Other compositions of the tantalum-based system, such as LiTaO3 (LTO) and NaTaO3 (NTO), were also successfully derived on Si substrates at 650 °C. The ferroelectric character of LTO at room temperature was demonstrated. Some of the dielectric properties of KTO could not be measured in the parallel capacitor configuration due to either the substrate-film or the film-electrode interfaces. Thus, further studies have to be conducted to overcome this issue. Application-oriented studies have also been conducted, with two case studies: i) the photocatalytic activity of alkali tantalates and niobates for the decomposition of pollutants, and ii) the bioactivity of alkali tantalate ferroelectric films as functional coatings for bone regeneration. Much attention has recently been paid to developing new types of photocatalytic materials, and tantalum and niobium oxide based compositions have been shown to be active photocatalysts for water splitting due to the high potential of the conduction bands. Thus, various powders of the alkali tantalate and niobate families were tested as catalysts for methylene blue degradation. 
Results showed promising activities for some of the tested compounds, and KNbO3 is the most active among them, reaching over 50 % degradation of the dye after 7 h under UVA exposure. However, further modifications of the powders can improve the performance. In the context of bone regeneration, it is important to have platforms that, with appropriate stimuli, can support the attachment and direct the growth, proliferation and differentiation of the cells. In view of this, we exploited an alternative strategy for bone implants or repairs, based on charge-mediated signals for bone regeneration. This strategy includes coating metallic 316L-type stainless steel (316L-SST) substrates with charged ferroelectric LiTaO3 layers, functionalized via electrical charging or UV-light irradiation. It was demonstrated that the formation of surface calcium phosphates and protein adsorption are considerably enhanced for 316L-SST with functionalized ferroelectric coatings. Our approach can be viewed as a set of guidelines for the development of electrically functionalized platforms that can stimulate tissue regeneration, promoting direct integration of the implant in the host tissue by bone ingrowth and hence ultimately contributing to reducing implant failure.
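For reference, the Johnson-Mehl-Avrami-Kolmogorov model named in this abstract is commonly written as below. Only the Avrami exponent n appears in the abstract; the symbols X (transformed fraction), t (time) and k (rate constant) are notation assumed here, and the thesis may use a different but equivalent parameterisation.

```latex
% JMAK kinetics: transformed fraction X after time t
X(t) = 1 - \exp\!\left[-(k t)^{n}\right]

% Linearised form: n is the slope of ln[-ln(1 - X)] plotted against ln t
\ln\!\left[-\ln\bigl(1 - X(t)\bigr)\right] = n \ln t + n \ln k
```

In this form the exponent n is read off as the slope of the linearised plot, which is how a value such as n ~ 0.3 would typically be extracted from measured transformed fractions.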

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The efficacy of fluorescence spectroscopy to detect squamous cell carcinoma is evaluated in an animal model following laser excitation at 442 and 532 nm. Lesions are chemically induced with a topical DMBA application at the left lateral tongue of Golden Syrian hamsters. The animals are investigated every 2 weeks after the 4th week of induction until a total of 26 weeks. The right lateral tongue of each animal is considered as a control site (normal contralateral tissue) and the induced lesions are analyzed as a set of points covering the entire clinically detectable area. Based on fluorescence spectral differences, four indices are determined to discriminate normal and carcinoma tissues, based on intraspectral analysis. The spectral data are also analyzed using multivariate data analysis and the results are compared with histology as the diagnostic gold standard. The best result achieved is for blue excitation using the KNN (K-nearest neighbor, an interspectral analysis) algorithm, with a sensitivity of 95.7% and a specificity of 91.6%. These high indices indicate that fluorescence spectroscopy may constitute a fast noninvasive auxiliary tool for the diagnosis of cancer within the oral cavity. (C) 2008 Society of Photo-Optical Instrumentation Engineers.
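Sensitivity and specificity, the two figures reported in this abstract, can be computed from classifier output and a gold standard as sketched below. The function name and the five sample labels are invented for illustration and are unrelated to the study's data.

```python
def sensitivity_specificity(truth, predicted, positive="carcinoma"):
    """Return (sensitivity, specificity) of `predicted` against `truth`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical gold-standard labels and classifier output.
histology = ["carcinoma", "carcinoma", "carcinoma", "normal", "normal"]
knn_output = ["carcinoma", "carcinoma", "normal", "normal", "carcinoma"]
sens, spec = sensitivity_specificity(histology, knn_output)
```

Sensitivity is the fraction of true carcinoma sites that the classifier flags, and specificity is the fraction of normal sites it correctly leaves unflagged.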

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Quality control of toys to avoid children's exposure to potentially toxic elements is of utmost relevance, and it is a common requirement in national and/or international norms for health and safety reasons. Laser-induced breakdown spectroscopy (LIBS) was recently evaluated at the authors' laboratory for direct analysis of plastic toys, and one of the main difficulties for the determination of Cd, Cr and Pb was the variety of mixtures and types of polymers. As most norms rely on migration (lixiviation) protocols, chemometric classification models from LIBS spectra were tested for sampling toys that present a potential risk of Cd, Cr and Pb contamination. The classification models were generated from the emission spectra of 51 polymeric toys using Partial Least Squares - Discriminant Analysis (PLS-DA), Soft Independent Modeling of Class Analogy (SIMCA) and K-Nearest Neighbor (KNN). The classification models were built with 40 samples and validated with 11 test samples. The best results were obtained when KNN was used, with correct predictions varying from 95% for Cd to 100% for Cr and Pb. (C) 2011 Elsevier B.V. All rights reserved.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A rapid method for classification of mineral waters is proposed. The discrimination power was evaluated by a novel combination of chemometric data analysis and qualitative multi-elemental fingerprints of mineral water samples acquired from different regions of the Brazilian territory. The classification of mineral waters was assessed using only the wavelength emission intensities obtained by inductively coupled plasma optical emission spectrometry (ICP OES), monitoring different lines of Al, B, Ba, Ca, Cl, Cu, Co, Cr, Fe, K, Mg, Mn, Na, Ni, P, Pb, S, Sb, Si, Sr, Ti, V, and Zn, and Be, Dy, Gd, In, La, Sc and Y as internal standards. Data acquisition was done under robust (RC) and non-robust (NRC) conditions. Also, the combination of signal intensities of two or more emission lines for each element was evaluated instead of the individual lines. The performance of two classification algorithms, k-nearest neighbor (kNN) and soft independent modeling of class analogy (SIMCA), and two preprocessing algorithms, autoscaling and Pareto scaling, was evaluated for the ability to differentiate between the various samples in each approach tested (combination of robust or non-robust conditions with use of individual lines or sum of the intensities of emission lines). It was shown that qualitative ICP OES fingerprinting in combination with multivariate analysis is a promising analytical tool that has the potential to become a recognized procedure for rapid authenticity and adulteration testing of mineral water samples or other material whose physicochemical properties (or origin) are directly related to mineral content.
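The two preprocessing steps named in this abstract differ only in the divisor, which the following sketch makes explicit. It uses the usual textbook definitions (autoscaling divides mean-centred values by the standard deviation, Pareto scaling by its square root); the function names and the sample intensities are hypothetical.

```python
import math
import statistics


def autoscale(column):
    """Mean-centre a variable and divide by its sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def pareto_scale(column):
    """Mean-centre a variable and divide by the square root of its standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / math.sqrt(sd) for x in column]


# Invented emission intensities for one spectral line across three samples.
intensities = [2.0, 4.0, 6.0]
auto = autoscale(intensities)
pareto = pareto_scale(intensities)
```

Autoscaling gives every variable unit variance, so weak and strong lines count equally in a distance-based classifier such as kNN; Pareto scaling reduces the dominance of intense lines without fully equalising them.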

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) to exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods, cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN), were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.
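To make the multilabeled setting concrete, here is a deliberately simplified neighbour-vote sketch. It is not necessarily the M-L-kNN variant used by the authors: it merely assigns every label carried by at least a chosen fraction of the k nearest neighbours. The function name, the 0.5 threshold, the two-dimensional descriptors and the tribe names are all hypothetical.

```python
import math


def multilabel_knn(query, training_set, k=3, threshold=0.5):
    """Return every label held by at least `threshold` of the k nearest neighbours.

    `training_set` is a list of (descriptor, set_of_labels) pairs.
    """
    nearest = sorted(training_set, key=lambda item: math.dist(query, item[0]))[:k]
    candidate_labels = set().union(*(labels for _, labels in nearest))
    return {
        label
        for label in candidate_labels
        if sum(label in labels for _, labels in nearest) / len(nearest) >= threshold
    }


# Invented descriptors; each compound may occur in more than one tribe.
training = [
    ([0.0, 0.0], {"tribe_A"}),
    ([0.1, 0.0], {"tribe_A", "tribe_B"}),
    ([0.0, 0.1], {"tribe_B"}),
    ([5.0, 5.0], {"tribe_C"}),
]
assigned = multilabel_knn([0.05, 0.05], training, k=3)
```

Here two of the three nearest neighbours carry each of "tribe_A" and "tribe_B", so the query receives both labels, which a single-label classifier could not express.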

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The supervised pattern recognition methods K-Nearest Neighbors (KNN), stepwise discriminant analysis (SDA), and soft independent modelling of class analogy (SIMCA) were employed in this work with the aim of investigating the relationship between the molecular structure of 27 cannabinoid compounds and their analgesic activity. Previous analyses using two unsupervised pattern recognition methods (PCA, principal component analysis, and HCA, hierarchical cluster analysis) were performed, and five descriptors were selected as the most relevant for the analgesic activity of the compounds studied: R (3) (charge density on substituent at position C(3)), Q (1) (charge on atom C(1)), A (surface area), log P (logarithm of the partition coefficient) and MR (molecular refractivity). The supervised pattern recognition methods (SDA, KNN, and SIMCA) were employed in order to construct a reliable model that is able to predict the analgesic activity of new cannabinoid compounds and to validate our previous study. The results obtained using the SDA, KNN, and SIMCA methods agree perfectly with our previous model. Comparing the SDA, KNN, and SIMCA results with the PCA and HCA ones, we noticed that all multivariate statistical methods classified the cannabinoid compounds studied into three groups in exactly the same way: active, moderately active, and inactive.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); and example-based methods such as k-nearest neighbors (Duda & Hart, 1973) and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Objective: To develop a model to predict the bleeding source and identify the cohort amongst patients with acute gastrointestinal bleeding (GIB) who require urgent intervention, including endoscopy. Patients with acute GIB, an unpredictable event, are most commonly evaluated and managed by non-gastroenterologists. Rapid and consistently reliable risk stratification of patients with acute GIB for urgent endoscopy may potentially improve outcomes amongst such patients by targeting scarce health-care resources to those who need them the most. Design and methods: Using ICD-9 codes for acute GIB, 189 patients with acute GIB and all available data variables required to develop and test models were identified from a hospital medical records database. Data on 122 patients were utilized for development of the model and data on 67 patients were utilized to perform comparative analysis of the models. Clinical data such as presenting signs and symptoms, demographic data, presence of co-morbidities, laboratory data and corresponding endoscopic diagnosis and outcomes were collected. The clinical data and endoscopic diagnosis collected for each patient were utilized to retrospectively ascertain optimal management for each patient. Clinical presentations and the corresponding treatment were utilized as training examples. Eight mathematical models, including artificial neural network (ANN), support vector machine (SVM), k-nearest neighbor, linear discriminant analysis (LDA), shrunken centroid (SC), random forest (RF), logistic regression, and boosting, were trained and tested. The performance of these models was compared using standard statistical analysis and ROC curves. Results: Overall, the random forest model best predicted the source, need for resuscitation, and disposition, with accuracies of approximately 80% or higher (accuracy for endoscopy was greater than 75%). The area under the ROC curve for RF was greater than 0.85, indicating excellent performance by the random forest model. Conclusion: While most mathematical models are effective as a decision support system for evaluation and management of patients with acute GIB, in our testing the RF model consistently demonstrated the best performance. Amongst patients presenting with acute GIB, mathematical models may facilitate the identification of the source of GIB and the need for intervention, and allow optimization of care and healthcare resource allocation; these, however, require further validation. (c) 2007 Elsevier B.V. All rights reserved.
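The area under the ROC curve used to compare models in this abstract has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name, the scores and the 0/1 labels are invented and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores; label 1 marks patients who needed intervention.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

The quadratic pairwise loop is fine for a cohort of a few hundred patients; a rank-based formula would be preferable for much larger data sets.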

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper presents results on the simulation of the solid state sintering of copper wires using Monte Carlo techniques based on elements of lattice theory and cellular automata. The initial structure is superimposed onto a triangular, two-dimensional lattice, where each lattice site corresponds to either an atom or vacancy. The number of vacancies varies with the simulation temperature, while a cluster of vacancies is a pore. To simulate sintering, lattice sites are picked at random and reoriented in terms of an atomistic model governing mass transport. The probability that an atom has sufficient energy to jump to a vacant lattice site is related to the jump frequency, and hence the diffusion coefficient, while the probability that an atomic jump will be accepted is related to the change in energy of the system as a result of the jump, as determined by the change in the number of nearest neighbours. The jump frequency is also used to relate model time, measured in Monte Carlo Steps, to the actual sintering time. The model incorporates bulk, grain boundary and surface diffusion terms and includes vacancy annihilation on the grain boundaries. The predictions of the model were found to be consistent with experimental data, both in terms of the microstructural evolution and in terms of the sintering time. (C) 2002 Elsevier Science B.V. All rights reserved.
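The acceptance step described in this abstract, where a jump is judged by the change in the number of nearest neighbours, can be sketched as follows. Several things are assumptions made for illustration: the Metropolis acceptance rule, the axial-coordinate representation of the triangular lattice, the `bond_energy` and `kT` parameters, and all names. The paper's actual model also includes jump frequencies, distinct diffusion paths and vacancy annihilation, none of which is shown.

```python
import math
import random

# Six nearest neighbours of a site on a triangular lattice (axial coordinates).
NEIGHBOUR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def occupied_neighbours(atoms, site, exclude=None):
    """Count occupied nearest-neighbour sites of `site`, ignoring `exclude`."""
    q, r = site
    count = 0
    for dq, dr in NEIGHBOUR_OFFSETS:
        neighbour = (q + dq, r + dr)
        if neighbour in atoms and neighbour != exclude:
            count += 1
    return count


def attempt_jump(atoms, source, target, bond_energy=1.0, kT=0.5, rng=random):
    """Try to move the atom at `source` into the vacancy at `target`.

    Each occupied nearest neighbour lowers the energy by `bond_energy`.
    The move is accepted with the Metropolis probability and applied in place.
    """
    if source not in atoms or target in atoms:
        return False
    before = occupied_neighbours(atoms, source)
    after = occupied_neighbours(atoms, target, exclude=source)
    delta_e = -bond_energy * (after - before)
    if delta_e <= 0 or rng.random() < math.exp(-delta_e / kT):
        atoms.remove(source)
        atoms.add(target)
        return True
    return False


# An isolated atom at (2, 1) next to a vacancy at (1, 1) that touches two atoms.
atoms = {(0, 0), (1, 0), (0, 1), (2, 1)}
accepted = attempt_jump(atoms, source=(2, 1), target=(1, 1))
```

In this configuration the atom gains two neighbours by jumping, so the energy decreases and the move is always accepted; jumps that lose neighbours are accepted only occasionally, more often at higher `kT`.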

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Master's degree in Informatics Engineering