873 results for agglomerative clustering
Abstract:
The objective of this work was to evaluate the efficiency of EST-SSR markers in assessing the genetic diversity of rubber tree genotypes (Hevea brasiliensis) and to verify the transferability of these markers to wild species of Hevea. Forty-five rubber tree accessions from the Instituto Agronômico (Campinas, SP, Brazil) and six wild species were used. The modified Rogers' genetic distance was used to analyze the EST-SSR data. UPGMA clustering divided the samples into two major groups with high genetic differentiation, while the software Structure distributed the 51 clones into eight groups. A parallel could be established between both clustering analyses. The 30 polymorphic EST-SSRs showed from two to ten alleles and were efficient in amplifying the six wild species. Functional EST-SSR microsatellites are efficient in evaluating the genetic diversity among rubber tree clones and can be used to translate the genetic differences among cultivars and to fingerprint closely related materials. The accessions from the Instituto Agronômico show high genetic diversity. The EST-SSR markers, developed from Hevea brasiliensis, show transferability and are able to amplify other species of Hevea.
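As an illustration of the UPGMA step mentioned in this abstract, the following is a minimal pure-Python sketch of average-linkage agglomerative clustering on a precomputed distance matrix. The labels and distance values are invented for the example and are not data from the study; the function name `upgma` and its input layout are assumptions made here for illustration only.

```python
from itertools import combinations


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering on precomputed distances.

    names: list of labels.
    dist:  dict mapping frozenset({a, b}) -> distance between labels a and b.
    Returns the merges as (cluster_a, cluster_b, height), in merge order.
    """
    # Every cluster is a tuple of its member labels; start with singletons.
    clusters = [(n,) for n in names]
    # Distances between current clusters, keyed by the unordered pair.
    d = {frozenset(((a,), (b,))): dist[frozenset((a, b))]
         for a, b in combinations(names, 2)}
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        merged = a + b
        merges.append((a, b, height))
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # UPGMA update: size-weighted mean of the two old distances.
            d[frozenset((merged, c))] = (
                len(a) * d[frozenset((a, c))] + len(b) * d[frozenset((b, c))]
            ) / len(merged)
        clusters.append(merged)
    return merges


# Hypothetical toy distances between four accessions (not from the paper).
toy = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.5,
    frozenset(("A", "D")): 0.6,
    frozenset(("B", "C")): 0.5,
    frozenset(("B", "D")): 0.6,
    frozenset(("C", "D")): 0.2,
}
merges = upgma(["A", "B", "C", "D"], toy)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

The merge heights can be read as the branch levels of a dendrogram; cutting the tree below the last merge would give the two major groups in this toy case.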
Genetic diversity between improved banana diploids using canonical variables and the Ward-MLM method
Abstract:
The objective of this work was to estimate the genetic diversity of improved banana diploids using data from quantitative analysis and from simple sequence repeat (SSR) markers simultaneously. The experiment was carried out with 33 diploids, in an augmented block design with 30 regular treatments and three common ones. Eighteen agronomic characteristics and 20 SSR primers were used. The agronomic characteristics and the SSR data were analyzed simultaneously by the Ward-MLM, cluster, and IML procedures. The Ward clustering method considered the combined matrix obtained by the Gower algorithm. The Ward-MLM procedure identified three ideal groups (G1, G2, and G3) based on pseudo-F and pseudo-t² statistics. The dendrogram showed relative similarity between the G1 genotypes, justified by genealogy. In G2, 'Calcutta 4' appears in 62% of the genealogies. Similar behavior was observed in G3, in which the 028003-01 diploid is the male parent of the 086079-10 and 042079-06 genotypes. The method with canonical variables had greater discriminatory power than Ward-MLM. Although reduced, the genetic variability available is sufficient to be used in the development of new hybrids.
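The combined matrix referred to in this abstract mixes quantitative traits with marker data. A minimal sketch of the standard Gower dissimilarity for such mixed records is given below. It assumes that numeric traits are scaled by their observed range and that marker scores are compared by simple mismatch; how the study itself coded the SSR data is not stated in the abstract, and all names and values here are hypothetical.

```python
def gower_distance(x, y, ranges, is_numeric):
    """Gower dissimilarity between two mixed-type records.

    x, y:       sequences of values for the same variables.
    ranges:     observed range of each numeric variable (ignored otherwise).
    is_numeric: True for quantitative variables, False for categorical ones.
    """
    total = 0.0
    for xi, yi, r, numeric in zip(x, y, ranges, is_numeric):
        if numeric:
            # Range-scaled absolute difference, so each term lies in [0, 1].
            total += abs(xi - yi) / r if r > 0 else 0.0
        else:
            # Simple mismatch for categorical or presence/absence scores.
            total += 0.0 if xi == yi else 1.0
    return total / len(x)


# Hypothetical genotypes: two agronomic traits followed by two marker scores.
g1 = [2.0, 10.0, 1, 0]
g2 = [3.0, 14.0, 1, 1]
ranges = [4.0, 8.0, None, None]
is_numeric = [True, True, False, False]

d12 = gower_distance(g1, g2, ranges, is_numeric)
print(d12)
```

A full pairwise matrix built this way could then be passed to a hierarchical method such as Ward's; note that Ward's criterion formally assumes Euclidean distances, so applying it to Gower dissimilarities is a common practical choice rather than a strict guarantee.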
Abstract:
The Universitat Oberta de Catalunya (UOC, Open University of Catalonia) is involved in several research projects and educational activities related to the use of Open Educational Resources (OER). Some of the discussed issues in the concept of OER are research issues which are being tackled in two EC projects (OLCOS and SELF). Besides the research part, the UOC aims at developing a virtual centre for analysing and promoting the concept of OER in Europe in the sector of Higher and Further Education. The objectives are to make information and learning services available to provide university management staff, eLearning support centres, faculty and learners with practical information required to create, share and re-use such interoperable digital content, tools and licensing schemes. In the realisation of these objectives, the main activities are the following: to provide organisational and individual e-learning end-users with orientation; to develop perspectives and useful recommendations in the form of a medium-term Roadmap 2010 for OER in Higher and Further Education in Europe; to offer practical information and support services about how to create, share and re-use open educational content by means of tutorials, guidelines, best practices, and specimens of exemplary open e-learning content; to establish a larger group of committed experts throughout Europe and other continents who not only share their expertise but also steer networking, workshops, and clustering efforts; and to foster and support a community of practice in open e-learning content know-how and experiences.
Abstract:
In this paper we present the results of applying several visual methods to a group of sites, dated between the 6th and 1st centuries BC, in the ager Tarraconensis (Tarragona, Spain), the hinterland of the Roman colony of Tarraco. The difficulty of interpreting the diverse results in a combined way has been resolved by means of statistical methods such as Principal Components Analysis (PCA) and K-means clustering analysis. These methods have allowed us to classify the sites according to the visual structure of the landscape that contains them and the visual relationships that could exist among them.
Abstract:
The ability to obtain gene expression profiles from human disease specimens provides an opportunity to identify relevant gene pathways, but is limited by the absence of data sets spanning a broad range of conditions. Here, we analyzed publicly available microarray data from 16 diverse skin conditions in order to gain insight into disease pathogenesis. Unsupervised hierarchical clustering separated samples by disease as well as common cellular and molecular pathways. Disease-specific signatures were leveraged to build a multi-disease classifier, which predicted the diagnosis of publicly and prospectively collected expression profiles with 93% accuracy. In one sample, the molecular classifier differed from the initial clinical diagnosis and correctly predicted the eventual diagnosis as the clinical presentation evolved. Finally, integration of IFN-regulated gene programs with the skin database revealed a significant inverse correlation between IFN-β and IFN-γ programs across all conditions. Our study provides an integrative approach to the study of gene signatures from multiple skin conditions, elucidating mechanisms of disease pathogenesis. In addition, these studies provide a framework for developing tools for personalized medicine toward the precise prediction, prevention, and treatment of disease on an individual level.
Abstract:
The quality of environmental data analysis and the propagation of errors are heavily affected by the representativity of the initial sampling design [CRE 93, DEU 97, KAN 04a, LEN 06, MUL 07]. Geostatistical methods such as kriging rely on field samples, whose spatial distribution is crucial for the correct detection of the phenomena. Literature about the design of environmental monitoring networks (MN) is widespread, and several interesting books have recently been published [GRU 06, LEN 06, MUL 07] to clarify the basic principles of spatial sampling design. An approach to spatial sampling design (monitoring network optimization) based on Support Vector Machines has also been proposed. Nonetheless, modelers often receive real data from environmental monitoring networks that suffer from non-homogeneity (clustering). Clustering can be related to preferential sampling or to the impossibility of reaching certain regions.
Abstract:
With the increasing availability of various 'omics data, high-quality orthology assignment is crucial for evolutionary and functional genomics studies. We here present the fourth version of the eggNOG database (available at http://eggnog.embl.de) that derives nonsupervised orthologous groups (NOGs) from complete genomes, and then applies a comprehensive characterization and analysis pipeline to the resulting gene families. Compared with the previous version, we have more than tripled the underlying species set to cover 3686 organisms, keeping track with genome project completions while prioritizing the inclusion of high-quality genomes to minimize error propagation from incomplete proteome sets. Major technological advances include (i) a robust and scalable procedure for the identification and inclusion of high-quality genomes, (ii) provision of orthologous groups for 107 different taxonomic levels compared with 41 in eggNOGv3, (iii) identification and annotation of particularly closely related orthologous groups, facilitating analysis of related gene families, (iv) improvements of the clustering and functional annotation approach, (v) adoption of a revised tree building procedure based on the multiple alignments generated during the process and (vi) implementation of quality control procedures throughout the entire pipeline. As in previous versions, eggNOGv4 provides multiple sequence alignments and maximum-likelihood trees, as well as broad functional annotation. Users can access the complete database of orthologous groups via a web interface, as well as through bulk download.
Abstract:
The objective of this work was to assess the genetic diversity and population structure of wheat genotypes, to detect significant and stable genetic associations, and to evaluate the efficiency of statistical models in identifying chromosome regions responsible for the expression of spike-related traits. Eight important spike characteristics were measured during five growing seasons in Serbia. A set of 30 microsatellite markers positioned near important agronomic loci was used to evaluate genetic diversity, resulting in a total of 349 alleles. The marker-trait associations were analyzed using the general linear and mixed linear models. The results obtained for number of allelic variants per locus (11.5), average polymorphic information content value (0.68), and average gene diversity (0.722) showed an exceptional level of polymorphism in the genotypes, which is the main requirement for association studies. The population structure estimated by model-based clustering distributed the genotypes into six subpopulations according to the log probability of data. Significant and stable associations were detected on chromosomes 1B, 2A, 2B, 2D, and 6D, which explained from 4.7 to 40.7% of total phenotypic variation. The general linear model identified a significantly larger number of marker-trait associations (192) than the mixed linear model (76). The mixed linear model identified nine markers associated with six traits.
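The two diversity statistics quoted in this abstract are commonly computed per locus from allele frequencies. The sketch below uses the usual textbook definitions: gene diversity as expected heterozygosity, 1 - Σp², and polymorphic information content (PIC) as 1 - Σp² - Σ 2p_i²p_j² over allele pairs. Whether the study used exactly these estimators is an assumption, and the allele frequencies are invented for illustration.

```python
def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphic information content at one locus.

    PIC = 1 - sum(p_i^2) - sum over i < j of 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


# Hypothetical locus with two equally frequent alleles.
example_freqs = [0.5, 0.5]
h = gene_diversity(example_freqs)
p = pic(example_freqs)
print(h, p)
```

PIC is never larger than gene diversity for the same locus, which is consistent with the averages reported above (0.68 versus 0.722).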
Abstract:
Determining the biogeographical histories of rainforests is central to our understanding of the present distribution of tropical biodiversity. Ice age fragmentation of central African rainforests strongly influenced species distributions. Elevated areas characterized by higher species richness and endemism have been postulated to be Pleistocene forest refugia. However, it is often difficult to separate the effects of history and of present-day ecological conditions on diversity patterns at the interspecific level. Intraspecific genetic variation could yield new insights into history, because refugia hypotheses predict patterns not expected on the basis of contemporary environmental dynamics. Here, we test geographically explicit hypotheses of vicariance associated with the presence of putative refugia and provide clues about their location. We intensively sampled populations of Aucoumea klaineana, a forest tree sensitive to forest fragmentation, throughout its geographical range. Characterizing variation at 10 nuclear microsatellite loci, we were able to obtain phylogeographic data of unprecedented detail for this region. Using Bayesian clustering approaches, we demonstrated the presence of four differentiated genetic units. Their distribution matched that of forest refugia postulated from patterns of species richness and endemism. Our data also show differences in diversity dynamics at leading and trailing edges of the species' shifting distribution. Our results confirm predictions based on refugia hypotheses and cannot be explained on the basis of present-day ecological conditions.
Abstract:
The purpose of this master's thesis is to investigate what is required to recognize the similarity of news items automatically. The news items are text-based and have been retrieved from different news sources. The aim is to identify, first, the news items that report the same thing and, second, the news items that are not exactly the same but are nevertheless related to each other. The thesis examines which algorithms perform this identification most effectively on both Finnish and English text. Existing algorithms are compared. The goal is to select a combination of algorithms such that 90% of the compared news items are identified correctly. The study uses two clustering algorithms and three stemming algorithms. These algorithms are compared in terms of both their news recognition effectiveness and their performance. Porter's algorithm proved to be the best stemming algorithm for comparing both Finnish and English news. Of the clustering algorithms, the simpler one, based on various key figures, proved more effective.
Abstract:
We have compared the phylogenetic diversity of methicillin-resistant Staphylococcus aureus (MRSA) strains from Switzerland and their phylogenetic relationships with European epidemic clones, using multiprimer random amplified polymorphic DNA (RAPD). Strains included 24 European epidemic clones (59 strains), 66 sporadic strains isolated in Switzerland in 1996-1997, and 15 reference strains of five other Staphylococcus species. Similarity and clustering analysis with Jaccard's coefficient showed that the maximum genetic distance between MRSA strains was 0.43, whereas the minimum genetic distance between the six Staphylococcus species was 0.97, indicating that the method permits phylogenetic hierarchization. The 24 MRSA clones reported to be epidemic in European countries during the 1990s were distributed into seven different genetic clusters with a maximum distance of 0.29 among them. This clustering pattern was confirmed by the analysis of a subset of MRSA strains by multilocus enzyme electrophoresis at 12 loci. Most of the sporadic Swiss strains were distributed into these seven different genetic clusters, together with the epidemic MRSA clones. This suggests that there is no phylogenetic cluster specific to epidemic clones of MRSA.
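For readers unfamiliar with the coefficient named in this abstract, the sketch below shows how a Jaccard distance can be computed from presence/absence band profiles such as those scored from RAPD gels. The band patterns are invented, and treating two all-absent profiles as distance 0 is a convention chosen here, not something the abstract specifies.

```python
def jaccard_distance(profile_a, profile_b):
    """Jaccard distance between two presence/absence band profiles.

    Each profile is a sequence of 0/1 scores for the same band positions.
    Bands absent in both strains are ignored, as in Jaccard's coefficient.
    """
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    if either == 0:
        # Assumed convention: two empty profiles are treated as identical.
        return 0.0
    return 1.0 - shared / either


# Hypothetical band patterns for two strains (1 = band present).
strain_1 = [1, 1, 0, 1, 0]
strain_2 = [1, 0, 0, 1, 1]

dist_12 = jaccard_distance(strain_1, strain_2)
print(dist_12)
```

A matrix of such pairwise distances is the usual input to the hierarchical clustering step that groups strains into genetic clusters.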
Abstract:
In this paper, we consider active sampling to label pixels grouped with hierarchical clustering. The objective of the method is to match the data relationships discovered by the clustering algorithm with the user's desired class semantics. The former are represented as a complete tree to be pruned, and the latter are iteratively provided by the user. The proposed active learning algorithm searches for the pruning of the tree that best matches the labels of the sampled points. By choosing the part of the tree to sample from according to the current pruning's uncertainty, sampling is focused on the most uncertain clusters. This way, large clusters for which the class membership is already fixed are no longer queried, and sampling is focused on the division of clusters showing mixed labels. The model is tested on a VHR image in a multiclass classification setting. The method clearly outperforms random sampling in a transductive setting, but cannot generalize to unseen data, since it aims at optimizing the classification of a given cluster structure.
Abstract:
The aim of this work is to find a method or model for sorting a large set of ideas of varying quality and for identifying within it the relevant and best ideas for further development. The empirical material consists of ideas generated in a joint ideation session of two companies operating in different industries. The theoretical part presents the definition of innovation and different ways of classifying innovations. It also discusses the principle of open innovation and its possibly still growing importance. The work further addresses factors that affect the exploitation of ideas. The perspectives chosen are factors internal to the company and factors arising from its environment, such as the company's competence as well as its customers and markets. The theoretical part concludes with a brief overview of clustering, its various methods, and portfolio management. In the empirical part, a few ideas were identified, from a set of more than 50, that the companies participating in the session could implement in different ways.
Abstract:
Visible up-conversion in ZnO:Er and ZnO:Er:Yb thin films deposited by RF magnetron sputtering under different O2-rich atmospheres has been studied. Conventional photoluminescence (325 nm laser source) and up-conversion (980 nm laser source) measurements have been performed on the films before and after an annealing process at 800 °C. The resulting spectra demonstrate that thermal treatment, either during or after deposition, optically activates the Er3+ ions, the latter process being much more efficient. Moreover, the atmosphere during deposition was also found to be an important parameter, as deposition under O2 flow increases the optical activity of the Er3+ ions. In addition, the inclusion of Yb3+ ions in the films enhanced the visible up-conversion emission at 660 nm by a factor of 4, which could be attributed either to a better energy transfer from the 2F5/2 Yb level to the 4I11/2 Er level or to the prevention of Er2O3 clustering in the films.
Abstract:
This work reports the development of an automatic analysis system for high-speed image sequences recorded from hybrid welding. The purpose of the system was to produce information that would help the analyst assess the quality of the recorded welding process. The study focused on measuring the regularity of the arc frequency and the flight directions of filler-material droplets. Arcs were detected in the image sequences using the fuzzy c-means clustering method, and the time interval between consecutive arcs was used as the measure of arc frequency regularity. Droplets were located with a method combining principal component analysis and a support vector classifier. A Kalman filter was used to produce estimates of the droplets' flight directions and velocities. The flight direction determination method classified the droplets according to their estimated flight directions. The image sequences available for developing the system differed significantly from one another in image quality and droplet appearance, owing to differences in the imaging and welding processes. The analysis system was developed to work on a small subset of image sequences that shared a particular imaging and welding process and had similar image quality and droplet appearance, but the system was also tested on image sequences outside that subset. The test results showed that the accuracy of flight direction determination was reasonably high within the subset and low in the other image sequences. The determination of arc frequency regularity was accurate in several image sequences.