37 resultados para Graph-based methods


Relevância:

80.00% 80.00%

Publicador:

Resumo:

The Baltic Sea is a unique environment that contains unique genetic populations. In order to study these populations on a genetic level basic molecular research is needed. The aim of this thesis was to provide a basic genetic resource for population genomic studies by de novo assembling a transcriptome for the Baltic Sea isopod Idotea balthica. RNA was extracted from a whole single adult male isopod and sequenced using Illumina (125bp PE) RNA-Seq. The reads were preprocessed using FASTQC for quality control, TRIMMOMATIC for trimming, and RCORRECTOR for error correction. The preprocessed reads were then assembled with TRINITY, a de Bruijn graph-based assembler, using different k-mer sizes. The different assemblies were combined and clustered using CD-HIT. The assemblies were evaluated using TRANSRATE for quality and filtering, BUSCO for completeness, and TRANSDECODER for annotation potential. The 25-mer assembly was annotated using PANNZER (protein annotation with z-score) and BLASTX. The 25-mer assembly represents the best first draft assembly since it contains the most information. However, this assembly shows high levels of polymorphism, which currently cannot be differentiated as paralogs or allelic variants. Furthermore, this assembly is incomplete, which could be improved by sampling additional developmental stages.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Ohjelmoinnin opettaminen yleissivistävänä oppiaineena on viime aikoina herättänyt kiinnostusta Suomessa ja muualla maailmassa. Esimerkiksi Suomen opetushallituksen määrittämien, vuonna 2016 käyttöön otettavien peruskoulun opintosuunnitelman perusteiden mukaan, ohjelmointitaitoja aletaan opettaa suomalaisissa peruskouluissa ensimmäiseltä luokalta alkaen. Ohjelmointia ei olla lisäämässä omaksi oppiaineekseen, vaan sen opetuksen on tarkoitus tapahtua muiden oppiaineiden, kuten matematiikan yhteydessä. Tämä tutkimus käsittelee yleissivistävää ohjelmoinnin opetusta yleisesti, käy läpi yleisimpiä haasteita ohjelmoinnin oppimisessa ja tarkastelee erilaisten opetusmenetelmien soveltuvuutta erityisesti nuorten oppilaiden opettamiseen. Tutkimusta varten toteutettiin verkkoympäristössä toimiva, noin 9–12-vuotiaille oppilaille suunnattu graafista ohjelmointikieltä ja visuaalisuutta tehokkaasti hyödyntävä oppimissovellus. Oppimissovelluksen avulla toteutettiin alakoulun neljänsien luokkien kanssa vertailututkimus, jossa graafisella ohjelmointikielellä tapahtuvan opetuksen toimivuutta vertailtiin toiseen opetusmenetelmään, jossa oppilaat tutustuivat ohjelmoinnin perusteisiin toiminnallisten leikkien avulla. Vertailututkimuksessa kahden neljännen luokan oppilaat suorittivat samankaltaisia, ohjelmoinnin peruskäsitteisiin liittyviä ohjelmointitehtäviä molemmilla opetus-menetelmillä. Tutkimuksen tavoitteena oli selvittää alakouluoppilaiden nykyistä ohjelmointiosaamista, sitä minkälaisen vastaanoton ohjelmoinnin opetus alakouluoppilailta saa, onko erilaisilla opetusmenetelmillä merkitystä opetuksen toteutuksen kannalta ja näkyykö eri opetusmenetelmillä opetettujen luokkien oppimistuloksissa eroja. Oppilaat suhtautuivat kumpaankin opetusmenetelmään myönteisesti, ja osoittivat kiinnostusta ohjelmoinnin opiskeluun. Sisällöllisesti oppitunneille oli varattu turhan paljon materiaalia, mutta esimerkiksi yhden keskeisimmän aiheen, eli toiston käsitteen oppimisessa aktiivisilla leikeillä harjoitellut luokka osoitti huomattavasti graafisella ohjelmointikielellä harjoitellutta luokkaa parempaa osaamista oppitunnin jälkeen. Ohjelmakoodin peräkkäisyyteen liittyvä osaaminen oli neljäsluokkalaisilla hyvin hallussa jo ennen ohjelmointiharjoituksia. Aiheeseen liittyvän taustatutkimuksen ja luokkien opettajien haastatteluiden perusteella havaittiin koulujen valmiuksien opetussuunnitelmauudistuksen mukaiseen ohjelmoinnin opettamiseen olevan vielä heikolla tasolla.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Relationship between organisms within an ecosystem is one of the main focuses in the study of ecology and evolution. For instance, host-parasite interactions have long been under close interest of ecology, evolutionary biology and conservation science, due to great variety of strategies and interaction outcomes. The monogenean ecto-parasites consist of a significant portion of flatworms. Gyrodactylus salaris is a monogenean freshwater ecto-parasite of Atlantic salmon (Salmo salar) whose damage can make fish to be prone to further bacterial and fungal infections. G. salaris is the only one parasite whose genome has been studied so far. The RNA-seq data analyzed in this thesis has already been annotated by using LAST. The RNA-seq data was obtained from Illumina sequencing i.e. yielded reads were assembled into 15777 transcripts. Last resulted in annotation of 46% transcripts and remaining were left unknown. This thesis work was started with whole data and annotation process was continued by the use of PANNZER, CDD and InterProScan. This annotation resulted in 56% successfully annotated sequences having parasite specific proteins identified. This thesis represents the first of Monogenean transcriptomic information which gives an important source for further research on this specie. Additionally, comparison of annotation methods interestingly revealed that description and domain based methods perform better than simple similarity search methods. Therefore it is more likely to suggest the use of these tools and databases for functional annotation. These results also emphasize the need for use of multiple methods and databases. It also highlights the need of more genomic information related to G. salaris.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Main purpose of this thesis is to introduce a new lossless compression algorithm for multispectral images. Proposed algorithm is based on reducing the band ordering problem to the problem of finding a minimum spanning tree in a weighted directed graph, where set of the graph vertices corresponds to multispectral image bands and the arcs’ weights have been computed using a newly invented adaptive linear prediction model. The adaptive prediction model is an extended unification of 2–and 4–neighbour pixel context linear prediction schemes. The algorithm provides individual prediction of each image band using the optimal prediction scheme, defined by the adaptive prediction model and the optimal predicting band suggested by minimum spanning tree. Its efficiency has been compared with respect to the best lossless compression algorithms for multispectral images. Three recently invented algorithms have been considered. Numerical results produced by these algorithms allow concluding that adaptive prediction based algorithm is the best one for lossless compression of multispectral images. Real multispectral data captured from an airplane have been used for the testing.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The drug discovery process is facing new challenges in the evaluation process of the lead compounds as the number of new compounds synthesized is increasing. The potentiality of test compounds is most frequently assayed through the binding of the test compound to the target molecule or receptor, or measuring functional secondary effects caused by the test compound in the target model cells, tissues or organism. Modern homogeneous high-throughput-screening (HTS) assays for purified estrogen receptors (ER) utilize various luminescence based detection methods. Fluorescence polarization (FP) is a standard method for ER ligand binding assay. It was used to demonstrate the performance of two-photon excitation of fluorescence (TPFE) vs. the conventional one-photon excitation method. As result, the TPFE method showed improved dynamics and was found to be comparable with the conventional method. It also held potential for efficient miniaturization. Other luminescence based ER assays utilize energy transfer from a long-lifetime luminescent label e.g. lanthanide chelates (Eu, Tb) to a prompt luminescent label, the signal being read in a time-resolved mode. As an alternative to this method, a new single-label (Eu) time-resolved detection method was developed, based on the quenching of the label by a soluble quencher molecule when displaced from the receptor to the solution phase by an unlabeled competing ligand. The new method was paralleled with the standard FP method. It was shown to yield comparable results with the FP method and found to hold a significantly higher signal-tobackground ratio than FP. Cell-based functional assays for determining the extent of cell surface adhesion molecule (CAM) expression combined with microscopy analysis of the target molecules would provide improved information content, compared to an expression level assay alone. In this work, immune response was simulated by exposing endothelial cells to cytokine stimulation and the resulting increase in the level of adhesion molecule expression was analyzed on fixed cells by means of immunocytochemistry utilizing specific long-lifetime luminophore labeled antibodies against chosen adhesion molecules. Results showed that the method was capable of use in amulti-parametric assay for protein expression levels of several CAMs simultaneously, combined with analysis of the cellular localization of the chosen adhesion molecules through time-resolved luminescence microscopy inspection.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The most common reason for a low-voltage induction motor breakdown is a bearing failure. Along with the increasing popularity of modern frequency converters, bearing failures have become the most important motor fault type. Conditions in which bearing currents are likely to occur are generated as a side effect of fast du/dt switching transients. Once present, different types of bearing currents can accelerate the mechanical wear of bearings by causing deformation of metal parts in the bearing and degradation of the lubricating oil properties.The bearing current phenomena are well known, and several bearing current measurement and mitigation methods have been proposed. Nevertheless, in order to develop more feasible methods to measure and mitigate bearing currents, better knowledge of the phenomena is required. When mechanical wear is caused by bearing currents, the resulting aging impact has to be monitored and dealt with. Moreover, because of the stepwise aging mechanism, periodically executed condition monitoring measurements have been found ineffective. Thus, there is a need for feasible bearing current measurement methods that can be applied in parallel with the normal operation of series production drive systems. In order to reach the objectives of feasibility and applicability, nonintrusive measurement methods are preferred. In this doctoral dissertation, the characteristics and conditions of bearings that are related to the occurrence of different kinds of bearing currents are studied. Further, the study introduces some nonintrusive radio-frequency-signal-based approaches to detect and measure parameters that are associated with the accelerated bearing wear caused by bearing currents.