5 results for Data manipulation

in Helda - Digital Repository of University of Helsinki


Relevance:

60.00%

Publisher:

Abstract:

This thesis studies human gene expression space using high throughput gene expression data from DNA microarrays. In molecular biology, high throughput techniques allow numerical measurements of expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting the previously unusable and missing data and by improving the access to its data. It also contributed to the creation of several new tools for microarray data manipulation and the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for automatic conversion of human readable free text microarray data annotations into categorised format. Data comparability and minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computation of a range of microarray data quality metrics and exclusion of incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.

Relevance:

30.00%

Publisher:

Abstract:

Defects in mitochondrial DNA (mtDNA) maintenance cause a range of human diseases, including autosomal dominant progressive external ophthalmoplegia (adPEO). This study aimed to clarify the molecular background of adPEO. We discovered that deoxynucleoside triphosphate (dNTP) metabolism plays a crucial role in mtDNA maintenance and were thus prompted to search for therapeutic strategies based on the modulation of cellular dNTP pools or mtDNA copy number. Human mtDNA is a 16.6 kb circular molecule present in hundreds to thousands of copies per cell. mtDNA is compacted into nucleoprotein clusters called nucleoids. mtDNA maintenance diseases result from defects in nuclear encoded proteins that maintain the mtDNA. These syndromes typically afflict highly differentiated, post-mitotic tissues such as muscle and nerve, but virtually any organ can be affected. adPEO is a disease where mtDNA molecules with large-scale deletions accumulate in patients' tissues, particularly in skeletal muscle. Mutations in five nuclear genes, encoding the proteins ANT1, Twinkle, POLG, POLG2 and OPA1, have previously been shown to cause adPEO. Here, we studied a large North American pedigree with adPEO, and identified a novel heterozygous mutation in the gene RRM2B, which encodes the p53R2 subunit of the enzyme ribonucleotide reductase (RNR). RNR is the rate-limiting enzyme in dNTP biosynthesis, and is required for both nuclear and mitochondrial DNA replication. The mutation results in the expression of a truncated form of p53R2, which is likely to compete with the wild-type allele. A change in enzyme function leads to defective mtDNA replication due to altered dNTP pools. Therefore, RRM2B is a novel adPEO disease gene. The importance of adequate dNTP pools and RNR function for mtDNA maintenance has been established in many organisms. In yeast, induction of RNR has previously been shown to increase mtDNA copy number, and to rescue the phenotype caused by mutations in the yeast mtDNA polymerase.
To further study the role of RNR in mammalian mtDNA maintenance, we used mice that broadly overexpress the RNR subunits Rrm1, Rrm2 or p53R2. Active RNR is a heterotetramer consisting of two large subunits (Rrm1) and two small subunits (either Rrm2 or p53R2). We also created bitransgenic mice that overexpress Rrm1 together with either Rrm2 or p53R2. In contrast to the previous findings in yeast, bitransgenic RNR overexpression led to mtDNA depletion in mouse skeletal muscle, without mtDNA deletions or point mutations. The mtDNA depletion was associated with imbalanced dNTP pools. Furthermore, the mRNA expression levels of Rrm1 and p53R2 were found to correlate with mtDNA copy number in two independent mouse models, suggesting nuclear-mitochondrial cross talk with regard to mtDNA copy number. We conclude that tight regulation of RNR is needed to prevent harmful alterations in the dNTP pool balance, which can lead to disordered mtDNA maintenance. Increasing the copy number of wild-type mtDNA has been suggested as a strategy for treating PEO and other mitochondrial diseases. Only two proteins are known to cause a robust increase in mtDNA copy number when overexpressed in mice: the mitochondrial transcription factor A (TFAM) and the mitochondrial replicative helicase Twinkle. We studied the mechanisms by which Twinkle and TFAM elevate mtDNA levels, and showed that Twinkle specifically implements mtDNA synthesis. Furthermore, both Twinkle and TFAM were found to increase mtDNA content per nucleoid. Increased mtDNA content in mouse tissues correlated with an age-related accumulation of mtDNA deletions, depletion of mitochondrial transcripts, and progressive respiratory dysfunction. Simultaneous overexpression of Twinkle and TFAM led to a further increase in the mtDNA content of nucleoids, and aggravated the respiratory deficiency. These results suggested that high mtDNA levels have detrimental long-term effects in mice.
These data have to be considered when developing and evaluating treatment strategies for elevating mtDNA copy number.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, I look into a grammatical phenomenon found among speakers of the Cambridgeshire dialect of English. According to my hypothesis, the phenomenon is a new entry into the past BE verb paradigm in the English language. In my paper, I claim that the structure I have found complements the existing two verb forms, was and were, with a third verb form that I have labelled ‘intermediate past BE’. The paper is divided into two parts. In the first section, I introduce the theoretical ground for the study of variation, which is founded on empiricist principles. In variationist linguistics, the main claim is that heterogeneous language use is structured and ordered. Over the last 50 years of modern linguistics, this claim has been controversial. In the 1960s, the generativist movement spearheaded by Noam Chomsky diverted attention away from grammatical theories that are based on empirical observations. The generativists steered away from language diversity, variation and change in favour of generalisations, abstractions and universalist claims. The theoretical part of my paper goes through the main points of the variationist agenda and concludes that abandoning the concept of language variation in linguistics is harmful for both theory and methodology. In the method part of the paper, I present the Helsinki Archive of Regional English Speech (HARES) corpus. It is an audio archive that contains interviews conducted in England in the 1970s and 1980s. The interviews were done in accordance with methods generally used in traditional dialectology. The informants are mostly elderly men who have lived in the same region throughout their lives and who left school at an early age. The interviews are actually conversations: the interviewer allowed the informant to pick the topic of conversation to induce a maximally relaxed and comfortable atmosphere and thus allow the most natural dialect variant to emerge in the informant’s speech.
In the paper, the corpus chapter introduces some of the transcription and annotation problems associated with spoken language corpora (especially those containing dialectal speech). Questions surrounding the concept of variation are present in this part of the paper too, as transcription work in particular is troubled by the fundamental problem of having to describe the fluctuations of everyday speech in text. In the empirical section of the paper, I use HARES to analyse the speech of four informants, with special focus on the emergence of the intermediate past BE variant. My observations and the subsequent analysis permit me to claim that my hypothesis seems to hold. The intermediate variant occupies almost all contexts where one would expect was or were in the informants’ speech. This means that the new variant is integrated into the speakers’ grammars and exemplifies the kind of variation that is at the heart of this paper.

Relevance:

20.00%

Publisher:

Abstract:

What are the musical features that turn a song into a hit? The aim of this research is to explore the musical features of hit tunes by studying the 224 most popular Finnish evergreens from the 1930s to the 1990s. It is remarkable that 80-90% of Finnish oldies are in a minor key, though parallel major keys have also been widely employed within single pieces through, for example, modulations. Furthermore, melodies are usually diatonic, staying mostly in the same key. Consequently, chromatically altered tones in the melody and short modulations in the bridge sections become more prominent. I have concentrated in particular on the melodic lines in order to find the most typical melodic formulas from the data. These analyzed melodic formulas play an important role, because they serve as leading phrases and punchlines in songs. Analysis has revealed three major melodic formulas, which most often appear in the melodic lines of hit tunes. All of these formulas share common thematic ground, because they originate from the triadic tonic chord. Because the tonic chord is the most conventional opening chord in the verse parts, it is logical that these formulas occur most often in verses. The strong dominance of these formulas is very much a result of the rhythmic flexibility they possess; for instance, they can be found in every musical style from waltz to foxtrot. Alongside the major formulas lies a miscellaneous group of other tonic-related melodic formulas. One group of melodic formulas consists of melodic quotations. These quotations appear in a different musical context, for instance in a harmonically altered form, and are therefore often difficult to recognize as such. Yet despite the contextual manipulation, the distinctive character of the cited melody usually remains the same. Composers have also made use of certain popular chord-progressions in order to create new but familiar-sounding melodies.
The most important individual progression in this case is what is known as a "circle of fifths" and its shortened, prolonged and altered versions. Because that progression is harmonically strong, it is also a contrastive tool used especially in chorus parts and middle sections (AABA). I have also paid attention to ragtime and jazz influences, which can be found in harmony parts and certain melody notes, which extend, suspend or alter the accompanying chords. Other influences from jazz and ragtime in the Finnish evergreen are evident in the use of typical Tin Pan Alley popular song forms. The most important is the AABA form, which dominates the data along with the verse/chorus-type popular song form. To briefly illustrate the main results, the basic concept of the hit tune can be traced back to Tin Pan Alley songs, whereas the major stylistic aspects, such as minor keys and musical styles, bear influences from Russian, Western European, and Finnish traditions.