923 results for classification aided by clustering
Abstract:
With recent advances in remote sensing processing technology, it has become more feasible to begin analysis of the enormous historic archive of remotely sensed data. This historical data provides valuable information on a wide variety of topics which can influence the lives of millions of people if processed correctly and in a timely manner. One such field of benefit is that of landslide mapping and inventory. This data provides a historical reference to those who live near high-risk areas so future disasters may be avoided. In order to properly map landslides remotely, an optimum method must first be determined. Historically, mapping has been attempted using pixel-based methods such as unsupervised and supervised classification. These methods are limited because they characterize an image only spectrally, based on single pixel values. The result is prone to false positives and often lacks meaningful objects. Recently, several reliable methods of Object Oriented Analysis (OOA) have been developed which utilize a full range of spectral, spatial, textural, and contextual parameters to delineate regions of interest. A comparison of these two methods on a historical dataset of the landslide-affected city of San Juan La Laguna, Guatemala has shown the benefits of OOA methods over those of unsupervised classification. Overall accuracies of 96.5% and 94.3% and F-scores of 84.3% and 77.9% were achieved for OOA and unsupervised classification methods, respectively. The greater difference in F-score is a result of the low precision values of unsupervised classification caused by poor false positive removal, the greatest shortcoming of this method.
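The relationship between overall accuracy and F-score described in this abstract can be illustrated with a small sketch. The pixel counts below are hypothetical and chosen only for illustration; they are not the study's data, and the function name `classification_metrics` is an assumption. The sketch shows how extra false positives lower precision, and therefore the F-score, much more than they lower overall accuracy when the background class dominates.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return overall accuracy, precision, recall and F-score from pixel counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical counts (NOT from the study): both maps find the same landslide
# pixels, but the second one leaves four times as many false positives.
few_false_positives = classification_metrics(tp=80, fp=10, fn=20, tn=890)
many_false_positives = classification_metrics(tp=80, fp=40, fn=20, tn=860)

for name, result in [("few FP", few_false_positives),
                     ("many FP", many_false_positives)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

With these made-up counts the accuracy gap is three percentage points while the F-score gap is several times larger, which mirrors the pattern the abstract attributes to poor false positive removal.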
Abstract:
Doctoral thesis, Ecology (Population Ecology), Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2015
Abstract:
By definition, the domestication process leads to an overall reduction of crop genetic diversity. This has led to the current search for genomic regions in crop wild relatives (CWR), an important task for modern carrot breeding. Nowadays, massive sequencing capabilities allow the discovery of novel genetic resources in wild populations, but this quest could be aided by the use of a surrogate gene (to first identify and prioritize novel wild populations for increased sequencing effort). The alternative oxidase (AOX) gene family seems to be linked to all kinds of abiotic and biotic stress reactions in various organisms and thus has the potential to be used in the identification of CWR hotspots of environment-adapted diversity. High variability of DcAOX1 was found in populations of wild carrot sampled across a West-European environmental gradient. Even though no direct relation was found with the analyzed climatic conditions or with physical distance, population differentiation exists and results mainly from the polymorphisms associated with DcAOX1 exon 1 and intron 1. The relatively high number of amino acid changes and the identification of several unusually variable positions (through a likelihood ratio test) suggest that the DcAOX1 gene might be under positive selection. However, if positive selection is considered, it acts only on some specific populations (i.e. in the form of adaptive differences among population locations), given the observed high genetic diversity. We were able to identify two populations with higher levels of differentiation, which are promising as hotspots of specific functional diversity.
Abstract:
REPORT OF THE SUPERVISED TEACHING PRACTICE OF PHYSICAL EDUCATION IN THE SECONDARY SCHOOL/3 ANDRÉ DE GOUVEIA. This report was written as part of the Supervised Teaching Practice that took place in the 2011/2012 school year at Escola Secundária André de Gouveia, in Évora. Its purpose is to combine the scientific component acquired in the Master's in Physical Education in Elementary and Secondary Education with the work carried out as a trainee student. To this end, my work was aided by the National Program of Physical Education and by the Educational Project in force at the school. Thus, a description of the work carried out during the school year and a critical reflection on it are presented, taking into account four professional dimensions of the teacher, namely: the professional, social and ethical dimension; the dimension of the development of teaching and learning; the dimension of participation in the school and relationship with the community; and, finally, the dimension of professional development throughout life.
Abstract:
The high quality of protected designation of origin (PDO) dry-cured pork products depends largely on the chemical and physical parameters of the fresh meat and their variation during the production process of the final product. The discovery of the mechanisms that regulate the variability of these parameters was aided by the swine reference genome in conjunction with genetic analysis methods. This thesis can contribute to the discovery of genetic mechanisms that regulate the variability of some quality parameters of fresh meat for PDO dry-cured pork production. The first study is a gene expression study and showed that, between low and high glycolytic potential (GP) samples of Semimembranosus muscle of Italian Large White (ILW) pigs in early postmortem, all but one of the differentially expressed genes were overexpressed in low GP. These were involved in ATP biosynthesis processes, calcium homeostasis, and lipid metabolism, including the potential master regulator gene Peroxisome Proliferator-Activated Receptor Alpha (PPARA). The second is a study in commercial hybrid pigs to evaluate correlations between carcass and fresh ham traits, including carcass and fresh ham lean meat percentages, the former being a potential predictor of the latter. In addition, a genome-wide association study allowed the identification of chromosome-wide associations with phenotypic traits for 19 SNPs, and genome-wide associations for 14 SNPs for ferrochelatase activity. The latter could be a determinant of color variation in nitrite-free dry-cured ham. The third study showed gene expression differences in the Longissimus thoracis muscle of ILW pigs fed diets with extruded linseed (a source of polyunsaturated fatty acids) and either vitamin E and selenium (diet three) or natural antioxidants (diet four).
Diet three promoted a more rapid and massive immune system response, possibly determined by improvement in muscle tissue function, while diet four promoted oxidative stability and increased the anti-inflammatory potential of muscle tissue.
Abstract:
This work examines the lexicalization of motion events in Russian in comparison with Italian. Our aim is twofold: on the one hand, we examine the spatial semantics of the verbal prefixes we have selected and determine their semantic contribution to the lexicalization of the motion event. On the other hand, we analyze the correspondences of Russian prefixed verbs of motion in translation into Italian. In particular, we focus on whether the contribution of the prefix is expressed, whether it is conveyed fully or partially, which nuances of its spatial semantics may be omitted and which are obligatorily expressed, and by what linguistic means. The work consists of an introduction, three chapters, and a conclusion. Chapter One presents the theoretical framework on which the comparative analysis rests. The notions of motion and displacement are examined according to the semantic interpretations given both in Russian-language literature and in works in other languages. In addition, the conceptualization of space is described within the cognitive approach to the study of language. The classification of languages by lexicalization of the motion event introduced by L. Talmy is presented, along with the main subsequent studies on the lexicalization of motion events in various languages carried out within the typological approach. A separate section of the first chapter is devoted to the contribution of D. Slobin's research on the lexicalization of motion components, in particular manner of motion, in various languages. Chapter Two describes the system of unprefixed verbs of motion in Russian and the main approaches to their study. Parallels with the Italian system of motion verbs are drawn regularly. The chapter then gives an overview of the system of prefixed verbs of motion in Russian. We separately consider the main approaches to the study of the semantics of verbal prefixes, focusing on their spatial meanings.
Chapter Three presents the selection of verbal prefixes with spatial semantics chosen for the comparative analysis. For each prefix, a dictionary definition is given, its spatial semantics is described according to the conceptions developed by various authors, and contexts of use of the prefixed verbs in Russian are analyzed together with possible strategies for rendering them in Italian. The findings are set out in the conclusion, and a bibliography is appended.
Abstract:
The recent widespread use of social media platforms and web services has led to a vast amount of behavioral data that can be used to model socio-technical systems. A significant part of this data can be represented as graphs or networks, which have become the prevalent mathematical framework for studying the structure and the dynamics of complex interacting systems. However, analyzing and understanding these data presents new challenges due to their increasing complexity and diversity. For instance, characterizing real-world networks requires accounting for their temporal dimension and incorporating higher-order interactions beyond the traditional pairwise formalism. The ongoing growth of AI has led to the integration of traditional graph mining techniques with representation learning and low-dimensional embeddings of networks to address current challenges. These methods capture the underlying similarities and geometry of graph-shaped data, generating latent representations that enable the resolution of various tasks, such as link prediction, node classification, and graph clustering. As these techniques gain popularity, there is also growing concern about their responsible use. In particular, there has been an increased emphasis on addressing the limitations of interpretability in graph representation learning. This thesis contributes to the advancement of knowledge in the field of graph representation learning and has potential applications in a wide range of complex systems domains. We initially focus on forecasting problems related to face-to-face contact networks with time-varying graph embeddings. Then, we study hyperedge prediction and reconstruction with simplicial complex embeddings. Finally, we analyze the problem of interpreting latent dimensions in node embeddings for graphs.
The proposed models are extensively evaluated in multiple experimental settings and the results demonstrate their effectiveness and reliability, achieving state-of-the-art performances and providing valuable insights into the properties of the learned representations.
Abstract:
Depth estimation from images has long been regarded as a preferable alternative to expensive and intrusive active sensors, such as LiDAR and ToF. The topic has attracted the attention of an increasingly wide audience thanks to the large number of application domains, such as autonomous driving, robotic navigation and 3D reconstruction. Among the various techniques employed for depth estimation, stereo matching is one of the most widespread, owing to its robustness, speed and simplicity of setup. Recent developments have been aided by the abundance of annotated stereo images, which has given deep learning the opportunity to thrive in a research area where deep networks can reach state-of-the-art sub-pixel precision in most cases. Despite the recent findings, stereo matching still presents many open challenges, two of them being finding pixel correspondences in the presence of objects that exhibit non-Lambertian behaviour and processing high-resolution images. Recently, a novel dataset named Booster, which contains high-resolution stereo pairs featuring a large collection of labeled non-Lambertian objects, has been released. That work showed that training state-of-the-art deep neural networks on such data improves the generalization capabilities of these networks in the presence of non-Lambertian surfaces as well. Despite being a further step towards tackling the aforementioned challenge, Booster includes a rather small number of annotated images, and thus cannot satisfy the intensive training requirements of deep learning. This thesis aims to investigate novel view synthesis techniques to augment the Booster dataset, with the ultimate goal of improving stereo matching reliability on high-resolution images that display non-Lambertian surfaces.
Abstract:
Biological experiments often produce enormous amounts of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
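The merge-based hierarchical clustering described in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration only: the single-linkage rule, the function name `single_linkage`, and the toy 2-D points are assumptions made here, and the sketch is unrelated to how GOLPE or Sybyl implement the method.

```python
import math


def single_linkage(points, n_clusters):
    """Agglomerative clustering: start with one cluster per point and
    repeatedly merge the two closest clusters until n_clusters remain."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")

    # Each cluster is a list of indices into `points`.
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Single linkage (an assumed choice): distance between the closest pair.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge y into x
        del clusters[y]

    return [sorted(cluster) for cluster in clusters]


# Toy data (hypothetical): two well-separated groups of three points each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = single_linkage(toy_points, n_clusters=2)
print(groups)
```

Other linkage rules (complete, average, Ward) change only `cluster_distance`; which one suits a given dataset is a modelling decision that this sketch does not address.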
Abstract:
A spectral-angle-based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA gives equal priority to global and local features, and thereby tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, the input multispectral MRI is divided into different clusters by a spectral-distance-based clustering. Then, Independent Component Analysis (ICA) is applied to the clustered data, in conjunction with Support Vector Machines (SVM) for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA-based SVM and other conventional classifiers established the stability and efficiency of SC-ICA-based classification, especially in the reproduction of small abnormalities. Clinical abnormal case analysis demonstrated this through the highest Tanimoto Index/accuracy values, 0.75/98.8%, observed against ICA-based SVM results, 0.17/96.1%, for reproduced lesions. Experimental results recommend the proposed method as a promising approach in clinical and pathological studies of brain diseases.
Abstract:
In this paper, we propose a multispectral analysis system using wavelet-based Principal Component Analysis (PCA) to improve brain tissue classification from MRI images. Global transforms like PCA often neglect significant small abnormality details when dealing with a massive amount of multispectral data. In order to resolve this issue, the input dataset is expanded by detail coefficients from multisignal wavelet analysis. Then, PCA is applied to the new dataset to perform feature analysis. Finally, an unsupervised classification with the Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of the classified tissues with those from conventional PCA is also carried out. The proposed method yielded good improvement in the classification of small abnormalities, with high sensitivity/accuracy values, 98.9/98.3, for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.
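The Fuzzy C-Means step mentioned in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the standard textbook update rules with fuzzifier `m = 2`, starts from caller-supplied centres instead of a random initialization, and runs on made-up one-dimensional values in place of the wavelet/PCA features used in the paper. The function name `fuzzy_c_means` is hypothetical.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate the standard FCM membership and centre updates.

    points  : list of equal-length tuples
    centers : initial cluster centres (assumed to be supplied by the caller)
    m       : fuzzifier, must be > 1
    Returns the final centres and the memberships from the last iteration.
    """
    centers = [tuple(c) for c in centers]
    memberships = []
    n_dims = len(points[0])

    for _ in range(n_iter):
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # A point sitting exactly on a centre belongs fully to it.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([value / total for value in row])

        # Centre update: mean of the points weighted by u_ik ** m.
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            weight_sum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / weight_sum
                for d in range(n_dims)
            ))
        centers = new_centers

    return centers, memberships


# Hypothetical 1-D feature values forming two groups around 1.0 and 5.0.
values = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,)]
final_centers, final_memberships = fuzzy_c_means(values, centers=[(0.0,), (6.0,)])
print(final_centers)
```

Unlike k-means, every point receives a graded membership in every cluster, which is why FCM is often used when tissue boundaries are not sharp; taking the largest membership per point gives a hard labelling if one is needed.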
Abstract:
In this work, a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests, the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.
Abstract:
The development of strategies for structural health monitoring (SHM) has become increasingly important because of the necessity of preventing undesirable damage. This paper describes an approach to this problem using vibration data. It involves a three-stage process: reduction of the time-series data using principal component analysis (PCA), the development of a data-based auto-regressive moving average (ARMA) model using data from an undamaged structure, and the classification of whether or not the structure is damaged using a fuzzy clustering approach. The approach is applied to data from a benchmark structure from Los Alamos National Laboratory, USA. Two fuzzy clustering algorithms are compared: fuzzy c-means (FCM) and Gustafson-Kessel (GK) algorithms. It is shown that while both fuzzy clustering algorithms are effective, the GK algorithm marginally outperforms the FCM algorithm. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
Land use classification has become paramount in recent years, since it allows illegal land use to be identified and areas undergoing deforestation to be monitored. Although several research works in the literature address this problem, we propose here land use recognition by means of Optimum-Path Forest Clustering (OPF), which has not previously been applied in this context. Experiments comparing Optimum-Path Forest, Mean Shift and K-Means demonstrated the robustness of OPF for automatic land use classification of images obtained by the CBERS-2B and Ikonos-2 satellites. © 2011 IEEE.
Abstract:
This study subdivides Potter Cove, King George Island, Antarctica, into seafloor regions using multivariate statistical methods. These regions are categories used for comparing, contrasting and quantifying biogeochemical processes and biodiversity between ocean regions geographically, and also between regions undergoing development within the scope of global change. The division obtained is characterized by the dominating components and interpreted in terms of the ruling environmental conditions. The analysis includes a total of 42 different environmental variables, interpolated from samples taken during the austral summer seasons 2010/2011 and 2011/2012. The statistical errors of several interpolation methods (e.g. IDW, Indicator, Ordinary and Co-Kriging) with changing settings have been compared, and the most reasonable method has been applied. The multivariate mathematical procedures used are regionalized classification via k-means cluster analysis, canonical-correlation analysis and multidimensional scaling. Canonical-correlation analysis identifies the influencing factors in the different parts of the cove. Several methods for identifying the optimum number of clusters have been tested, and 4, 7, 10 and 12 were identified as reasonable numbers for clustering Potter Cove. The results for 10 and 12 clusters in particular identify marine-influenced regions which can be clearly separated from those determined by the geological catchment area and those dominated by river discharge.