32 resultados para Hidden Markov random fields

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Even though a large amount of evidence would suggest that PP2A serine/threonine protein phosphatase acts as a tumour suppressor the genomics data to support this claim is limited. We fit a sparse binary Markov random field with individual sample's total mutational frequency as an additional covariate to model the dependencies between the mutations occurring in the PP2A encoding genes. We utilize the data from recent large scale cancer genomics studies, where the whole genome from a human tumour biopsy has been analysed. Our results show a complex network of interactions between the occurrence of mutations in our twenty examined genes. According to our analysis the mutations occurring in the genes PPP2R1A, PPP2R3A, and PPP2R2B are identified as the key mutations. These genes form the core of the network of conditional dependency between the mutations in the investigated twenty genes. Additionally, we note that the mutations occurring in PPP2R4 seem to be more influential in samples with higher number of total mutations. The mutations occurring in the set of genes suggested by our results has been shown to contribute to the transformation of human cells. We conclude that our evidence further supports the claim that PP2A acts as a tumour suppressor and restoring PP2A activity is an appealing therapeutic strategy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Construction of multiple sequence alignments is a fundamental task in Bioinformatics. Multiple sequence alignments are used as a prerequisite in many Bioinformatics methods, and subsequently the quality of such methods can be critically dependent on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments.Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered, those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The results of the predicted alignment quality score highly correlated with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained. To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in the profile hidden Markov models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Speaker diarization is the process of sorting speeches according to the speaker. Diarization helps to search and retrieve what a certain speaker uttered in a meeting. Applications of diarization systemsextend to other domains than meetings, for example, lectures, telephone, television, and radio. Besides, diarization enhances the performance of several speech technologies such as speaker recognition, automatic transcription, and speaker tracking. Methodologies previously used in developing diarization systems are discussed. Prior results and techniques are studied and compared. Methods such as Hidden Markov Models and Gaussian Mixture Models that are used in speaker recognition and other speech technologies are also used in speaker diarization. The objective of this thesis is to develop a speaker diarization system in meeting domain. Experimental part of this work indicates that zero-crossing rate can be used effectively in breaking down the audio stream into segments, and adaptive Gaussian Models fit adequately short audio segments. Results show that 35 Gaussian Models and one second as average length of each segment are optimum values to build a diarization system for the tested data. Uniting the segments which are uttered by same speaker is done in a bottom-up clustering by a newapproach of categorizing the mixture weights.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The subject of the thesis is automatic sentence compression with machine learning, so that the compressed sentences remain both grammatical and retain their essential meaning. There are multiple possible uses for the compression of natural language sentences. In this thesis the focus is generation of television program subtitles, which often are compressed version of the original script of the program. The main part of the thesis consists of machine learning experiments for automatic sentence compression using different approaches to the problem. The machine learning methods used for this work are linear-chain conditional random fields and support vector machines. Also we take a look which automatic text analysis methods provide useful features for the task. The data used for machine learning is supplied by Lingsoft Inc. and consists of subtitles in both compressed an uncompressed form. The models are compared to a baseline system and comparisons are made both automatically and also using human evaluation, because of the potentially subjective nature of the output. The best result is achieved using a CRF - sequence classification using a rich feature set. All text analysis methods help classification and most useful method is morphological analysis. Tutkielman aihe on suomenkielisten lauseiden automaattinen tiivistäminen koneellisesti, niin että lyhennetyt lauseet säilyttävät olennaisen informaationsa ja pysyvät kieliopillisina. Luonnollisen kielen lauseiden tiivistämiselle on monta käyttötarkoitusta, mutta tässä tutkielmassa aihetta lähestytään television ohjelmien tekstittämisen kautta, johon käytännössä kuuluu alkuperäisen tekstin lyhentäminen televisioruudulle paremmin sopivaksi. Tutkielmassa kokeillaan erilaisia koneoppimismenetelmiä tekstin automaatiseen lyhentämiseen ja tarkastellaan miten hyvin erilaiset luonnollisen kielen analyysimenetelmät tuottavat informaatiota, joka auttaa näitä menetelmiä lyhentämään lauseita. Lisäksi tarkastellaan minkälainen lähestymistapa tuottaa parhaan lopputuloksen. Käytetyt koneoppimismenetelmät ovat tukivektorikone ja lineaarisen sekvenssin mallinen CRF. Koneoppimisen tukena käytetään tekstityksiä niiden eri käsittelyvaiheissa, jotka on saatu Lingsoft OY:ltä. Luotuja malleja vertaillaan Lopulta mallien lopputuloksia evaluoidaan automaattisesti ja koska teksti lopputuksena on jossain määrin subjektiivinen myös ihmisarviointiin perustuen. Vertailukohtana toimii kirjallisuudesta poimittu menetelmä. Tutkielman tuloksena paras lopputulos saadaan aikaan käyttäen CRF sekvenssi-luokittelijaa laajalla piirrejoukolla. Kaikki kokeillut teksin analyysimenetelmät auttavat luokittelussa, joista tärkeimmän panoksen antaa morfologinen analyysi.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Selostus: Herukkaviljelmien ravinnetila

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Selostus: Kolmas kevätviljapeltojen rikkakasvikartoitus

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the electrical industry the 50 Hz electric and magnetic fields are often higher than in the average working environment. The electric and magnetic fields can be studied by measuring or by calculatingthe fields in the environment. For example, the electric field under a 400 kV power line is 1 to 10 kV/m, and the magnetic flux density is 1 to 15 µT. Electricand magnetic fields of a power line induce a weak electric field and electric currents in the exposed body. The average current density in a human being standing under a 400 kV line is 1 to 2 mA/m2. The aim of this study is to find out thepossible effects of short term exposure to electric and magnetic fields of electricity power transmission on workers' health, in particular the cardiovascular effects. The study consists of two parts; Experiment I: influence on extrasystoles, and Experiment II: influence on heart rate. In Experiment I two groups, 26 voluntary men (Group 1) and 27 transmission-line workers (Group 2), were measured. Their electrocardiogram (ECG) was recorded with an ambulatory recorder both in and outside the field. In Group 1 the fields were 1.7 to 4.9 kV/m and 1.1 to 7.1 pT; in Group 2 they were 0.1 to 10.2 kV/m and 1.0 to 15.4 pT. In the ECG analysis the only significant observation was a decrease in the heart rate after field exposure (Group 1). The drop cannot be explained with the first measuring method. Therefore Experiment II was carried out. In Experiment II two groups were used; Group 1 (26 male volunteers) were measured in real field exposure, Group 2 (15 male volunteers) in "sham" fields. The subjects of Group 1 spent 1 h outside the field, then 1 h in the field under a 400 kV transmission line, and then again 1 h outside the field. Under the 400 kV linethe field strength varied from 3.5 to 4.3 kV/m, and from 1.4 to 6.6 pT. Group 2spent the entire test period (3 h) in a 33 kV outdoor testing station in a "sham" field. ECG, blood pressure, and electroencephalogram (EEG) were measured by ambulatory methods. Before and after the field exposure, the subjects performed some cardiovascular autonomic function tests. The analysis of the results (Experiments I and II) showed that extrasystoles or arrythmias were as frequent in the field (below 4 kV/m and 4 pT) as outside it. In Experiment II there was no decrease detected in the heart rate, and the systolic and diastolic blood pressure stayed nearly the same. No health effects were found in this study.