39 results for tree similarity measure
Abstract:
This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important because the direct measurement of haplotypes is difficult, whereas genotypes are easier to quantify. Haplotypes are key players when studying, for example, the genetic causes of diseases. In this thesis, three methods, referred to as HaploParser, HIT, and BACH, are presented for the haplotype inference problem. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point-mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point-mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point-mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability.
BACH is the most accurate method presented in this thesis and has comparable performance to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences are due to the sequences being related or just due to chance. Similarity of sequences is measured by their best local alignment score, and from that a p-value is computed. This p-value is the probability of picking two sequences from the null model whose best local alignment score is at least as good. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike the previous methods, the presented framework is not affected by so-called edge-effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
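The definition of the p-value above can be illustrated with a small sketch. This is not the framework of the thesis, which computes an upper bound without sampling; it is a naive Monte Carlo estimate shown only to make the definition concrete. The scoring values (match 2, mismatch -1, linear gap -2), the uniform i.i.d. null model over `ACGT`, and the function names are all assumptions made for illustration.

```python
import random


def best_local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def empirical_p_value(a, b, trials=200, alphabet="ACGT", seed=0):
    """Estimate P(null score >= observed score) by sampling random pairs.

    Assumed null model: letters drawn independently and uniformly from
    `alphabet`, with the same lengths as the observed sequences.
    """
    rng = random.Random(seed)
    observed = best_local_score(a, b)
    hits = 0
    for _ in range(trials):
        x = "".join(rng.choice(alphabet) for _ in a)
        y = "".join(rng.choice(alphabet) for _ in b)
        if best_local_score(x, y) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

A sampling estimate like this cannot resolve p-values much smaller than 1/trials, which is one reason an analytic bound is attractive for the very small p-values that arise in homology searches.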
Fire histories and tree ages in unmanaged boreal forests in Eastern Fennoscandia and Onega peninsula
Abstract:
Lignin is a hydrophobic polymer that is synthesised in the secondary cell walls of all vascular plants. It enables water conduction through the stem, supports the upright growth habit and protects against invading pathogens. In addition, lignin hinders the utilisation of the cellulosic cell walls of plants in pulp and paper industry and as forage. Lignin precursors are synthesised in the cytoplasm through the phenylpropanoid pathway, transported into the cell wall and oxidised by peroxidases or laccases to phenoxy radicals that couple to form the lignin polymer. This study was conducted to characterise the lignin biosynthetic pathway in Norway spruce (Picea abies (L.) Karst.). We focused on the less well-known polymerisation stage, to identify the enzymes and the regulatory mechanisms that are involved. Available data for lignin biosynthesis in gymnosperms is scarce and, for example, the latest improvements in precursor biosynthesis have only been verified in herbaceous plants. Therefore, we also wanted to study in detail the roles of individual gene family members during developmental and stress-induced lignification, using EST sequencing and real-time RT-PCR. We used, as a model, a Norway spruce tissue culture line that produces extracellular lignin into the culture medium, and showed that lignin polymerisation in the tissue culture depends on peroxidase activity. We identified in the culture medium a significant NADH oxidase activity that could generate H2O2 for peroxidases. Two basic culture medium peroxidases were shown to have high affinity to coniferyl alcohol. Conservation of the putative substrate-binding amino acids was observed when the spruce peroxidase sequences were compared with other peroxidases with high affinity to coniferyl alcohol. 
We also used different peroxidase fractions to produce synthetic in vitro lignins from coniferyl alcohol; however, the linkage pattern of the suspension culture lignin could not be reproduced in vitro with the purified peroxidases, nor with the full complement of culture medium proteins. This emphasised the importance of the precursor radical concentration in the reaction zone, which is controlled by the cells through the secretion of both the lignin precursors and the oxidative enzymes to the apoplast. In addition, we identified basic peroxidases that were reversibly bound to the lignin precipitate. They could be involved, for example, in the oxidation of polymeric lignin, which is required for polymer growth. The dibenzodioxocin substructure was used as a marker for polymer oxidation in the in vitro polymerisation studies, as it is a typical substructure in wood lignin and in the suspension culture lignin. Using immunolocalisation, we found the structure mainly in the S2+S3 layers of the secondary cell walls of Norway spruce tracheids. The structure was primarily formed during the late phases of lignification. Contrary to earlier assumptions, it appears to be a terminal structure in the lignin macromolecule. Most lignin biosynthetic enzymes are encoded by several genes, not all of which necessarily participate in lignin biosynthesis. In order to identify the gene family members that are responsible for developmental lignification, ESTs were sequenced from the lignin-forming tissue culture and developing xylem of spruce. Expression of the identified lignin biosynthetic genes was studied using real-time RT-PCR. Candidate genes for developmental lignification were identified by a coordinated, high expression of certain genes within the gene families in all lignin-forming tissues. However, such coordinated expression was not found for peroxidase genes.
We also studied stress-induced lignification, either during compression wood formation induced by bending the stems or after Heterobasidion annosum infection. Based on gene expression profiles, stress-induced monolignol biosynthesis appeared similar to the developmental process, and only single PAL and C3H genes were specifically up-regulated by stress. In contrast, the up-regulated peroxidase genes differed between developmental and stress-induced lignification, indicating specific responses.
Abstract:
The ongoing rapid fragmentation of tropical forests is a major threat to global biodiversity. This is because many of the tropical forests are so-called biodiversity 'hotspots', areas that host exceptional species richness and concentrations of endemic species. Forest fragmentation has negative ecological and genetic consequences for plant survival. Proposed reasons for plant species' loss in forest fragments are, e.g., abiotic edge effects, altered species interactions, increased genetic drift, and inbreeding depression. To be able to conserve plants in forest fragments, the ecological and genetic processes that threaten the species have to be understood. That is possible only after obtaining adequate information on their biology, including taxonomy, life history, reproduction, and spatial and genetic structure of the populations. In this research, I focused on the African violet (genus Saintpaulia), a little-studied conservation flagship from the Eastern Arc Mountains and Coastal Forests hotspot of Tanzania and Kenya. The main objective of the research was to increase understanding of the life history, ecology and population genetics of Saintpaulia that is needed for the design of appropriate conservation measures. A further aim was to provide population-level insights into the difficult taxonomy of Saintpaulia. Ecological field work was conducted in a relatively little fragmented protected forest in the Amani Nature Reserve in the East Usambara Mountains, in northeastern Tanzania, complemented by population genetic laboratory work and ecological experiments in Helsinki, Finland. All components of the research were conducted with Saintpaulia ionantha ssp. grotei, which forms a taxonomically controversial population complex in the study area. My results suggest that Saintpaulia has good reproductive performance in forests with low disturbance levels in the East Usambara Mountains. 
Another important finding was that seed production depends on sufficient pollinator service. The availability of pollinators should thus be considered in the in situ management of threatened populations. Dynamic population stage structures were observed, suggesting that the studied populations are demographically viable. High mortality of seedlings and juveniles was observed during the dry season, but this was compensated by ample recruitment of new seedlings after the rainy season. Reduced tree canopy closure and substrate quality are likely to exacerbate seedling and juvenile mortality, and, therefore, forest fragmentation and disturbance are serious threats to the regeneration of Saintpaulia. Restoration of sufficient shade to enhance seedling establishment is an important conservation measure in populations located in disturbed habitats. Long-term demographic monitoring, which enables the forecasting of a population's future, is also recommended in disturbed habitats. High genetic diversities were observed in the populations, which suggests that they possess the variation that is needed for evolutionary responses in a changing environment. Thus, genetic management of the studied populations does not seem necessary as long as the habitats remain favourable for Saintpaulia. The observed high levels of inbreeding in some of the populations, and the reduced fitness of the inbred progeny compared to the outbred progeny, as revealed by the hand-pollination experiment, indicate that inbreeding and inbreeding depression are potential mechanisms contributing to the extinction of Saintpaulia populations. The relatively weak genetic divergence of the three different morphotypes of Saintpaulia ionantha ssp. grotei lends support to the hypothesis that the populations in the Usambara/lowlands region represent a segregating metapopulation (or metapopulations), where subpopulations are adapting to their particular environments.
The partial genetic and phenological integrity, and the distinct trailing habit of the morphotype 'grotei' would, however, justify its placement in a taxonomic rank of its own, perhaps in a subspecific rank.
Abstract:
The first part of this work investigates the molecular epidemiology of a human enterovirus (HEV), echovirus 30 (E-30). This project is part of a series of studies performed in our research team analyzing the molecular epidemiology of HEV-B viruses. A total of 129 virus strains had been isolated in different parts of Europe. The sequence analysis was performed in three different genomic regions: 420 nucleotides (nt) in the VP4/VP2 capsid protein coding region, the entire VP1 capsid protein coding gene of 876 nt, and 150 nt in the VP1/2A junction region. The analysis revealed a succession of dominant sublineages within a major genotype. The temporally earlier genotypes had been replaced by a genetically homogenous lineage that has been circulating in Europe since the late 1970s. The same genotype was found by other research groups in North America and Australia. Globally, other cocirculating genetic lineages also exist. The prevalence of a dominant genotype makes E-30 different from other previously studied HEVs, such as polioviruses and coxsackieviruses B4 and B5, for which several coexisting genetic lineages have been reported. The second part of this work deals with molecular epidemiology of human rhinoviruses (HRVs). A total of 61 field isolates were studied in the 420-nt stretch in the capsid coding region of VP4/VP2. The isolates were collected from children under two years of age in Tampere, Finland. Sequences from the clinical isolates clustered in the two previously known phylogenetic clades. Seasonal clustering was found. Also, several distinct serotype-like clusters were found to co-circulate during the same epidemic season. Reappearance of a cluster after disappearing for a season was observed. The molecular epidemiology of the analyzed strains turned out to be complex, and we decided to continue our studies of HRV. Only five previously published complete genome sequences of HRV prototype strains were available for analysis. 
Therefore, all designated HRV prototype strains (n=102) were sequenced in the VP4/VP2 region, and the possibility of genetic typing of HRV was evaluated. Seventy-six of the 102 prototype strains clustered in HRV genetic group A (HRV-A) and 25 in group B (HRV-B). Serotype 87 clustered separately from other HRVs with HEV species D. The field strains of HRV represented as many as 19 different genotypes, as judged with an approximate demarcation of a 20% nt difference in the VP4/VP2 region. The interserotypic differences of HRV were generally similar to those reported between different HEV serotypes (i.e. about 20%), but smaller differences, less than 10%, were also observed. Because some HRV serotypes are genetically so closely related, we suggest that the genetic typing be performed using the criterion "the closest prototype strain". This study is the first systematic genetic characterization of all known HRV prototype strains, providing a further taxonomic proposal for classification of HRV. We proposed to divide the genus Human rhinoviruses into HRV-A and HRV-B. The final part of the work comprises a phylogenetic analysis of a subset (48) of HRV prototype strains and field isolates (12) in the nonstructural part of the genome coding for the RNA-dependent RNA polymerase (3D). The proposed division of the HRV strains in the species HRV-A and HRV-B was also supported by 3D region. HRV-B clustered closer to HEV species B, C, and also to polioviruses than to HRV-A. Intraspecies variation within both HRV-A and HRV-B was greater in the 3D coding region than in the VP4/VP2 coding region, in contrast to HEV. Moreover, the diversity of HRV in 3D exceeded that of HEV. One group of HRV-A, designated HRV-A', formed a separate cluster outside other HRV-A in the 3D region. It formed a cluster also in the capsid region, but located within HRV-A. This may reflect a different evolutionary history of distinct genomic regions among HRV-A. 
Furthermore, the tree topology within HRV-A in the 3D region differed from that in the VP4/VP2, suggesting possible recombination events in the evolution of the strains. No conflicting phylogenies were observed in any of the 12 field isolates. Possible recombination was further studied using the Similarity and Bootscanning analyses of the complete genome sequences of HRV available in public databases. Evidence for recombination among HRV-A was found, as HRV2 and HRV39 showed higher similarity in the nonstructural part of the genome. Whether HRV2 and HRV39 strains - and perhaps also some other HRV-A strains not yet completely sequenced - are recombinants remains to be determined.
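The "closest prototype strain" typing criterion and the approximate 20% nucleotide-difference demarcation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: sequences are taken to be pre-aligned and of equal length, the distance is the plain proportion of differing sites (p-distance) rather than whatever distance model the authors used, and the function names and toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap character '-' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def closest_prototype(query, prototypes):
    """Return (name, distance) of the prototype nearest to the query.

    `prototypes` maps a strain name to its aligned sequence.
    """
    name = min(prototypes, key=lambda n: p_distance(query, prototypes[n]))
    return name, p_distance(query, prototypes[name])


# Toy, made-up 10-nt sequences; real typing would use the VP4/VP2 region.
toy_prototypes = {
    "proto_1": "ACGTACGTAC",
    "proto_2": "ACGTTTTTAC",
}
name, dist = closest_prototype("ACGTACGTAA", toy_prototypes)
within_demarcation = dist < 0.20  # approximate 20% threshold from the abstract
```

Assigning the nearest prototype instead of applying the threshold alone matches the reasoning in the abstract: some serotypes differ by less than 10%, so a fixed cutoff would merge them.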
Abstract:
A new deterministic three-dimensional neutral and charged particle transport code, MultiTrans, has been developed. In the novel approach, the adaptive tree multigrid technique is used in conjunction with simplified spherical harmonics approximation of the Boltzmann transport equation. The development of the new radiation transport code started in the framework of the Finnish boron neutron capture therapy (BNCT) project. Since the application of the MultiTrans code to BNCT dose planning problems, the testing and development of the MultiTrans code has continued in conventional radiotherapy and reactor physics applications. In this thesis, an overview of different numerical radiation transport methods is first given. Special features of the simplified spherical harmonics method and the adaptive tree multigrid technique are then reviewed. The usefulness of the new MultiTrans code has been indicated by verifying and validating the code performance for different types of neutral and charged particle transport problems, reported in separate publications.
Abstract:
The regeneration ecology and diversity of native woody species, and their potential for landscape restoration, were studied in the remnant natural forest at the College of Forestry and Natural Resources at Wondo Genet, Ethiopia. The forest type is Afromontane rainforest, with many valuable tree species such as Aningeria adolfi-friederici, and it is an important provider of ecological, social and economic services for the population living in the area. The study contains two parts: natural regeneration studies (in the natural forest) and interviews with farmers in the village near the remnant patch. The objective of the first part was to investigate the floristic composition, density and regeneration profiles of native woody species in the forest, paying special attention to the woody species considered most relevant socio-economically. The second part provided information on the woody species preferred by the farmers and on the multiple uses of the adjacent natural forest; it also provided information on, and analysed perceptions of, forest degradation. Systematic plot sampling was used in the forest inventory. Twenty square plots of 20 x 20 m were assessed, with 38 identified woody species (the total number of species was 45), representing 26 families. Of these species, 61% were trees, 13% shrubs, 11% lianas and 16% species that could have both life forms. An analysis of natural regeneration of five important tree species in the natural forest showed that Aningeria adolfi-friederici had the best regeneration results. An analysis of population structure (as determined by height classes) of two commercially important woody species in the forest, Aningeria adolfi-friederici and Podocarpus falcatus, showed a marked difference: Aningeria had a typical “reversed J” frequency distribution, while Podocarpus showed very low values in all height classes.
Multidimensional scaling (MDS) was used to map the sample plots according to their similarity in species composition, using the Sørensen quantitative index, coupled with indicator species analysis. Three groups were identified with their respective indicator species: Group 1: Adhatoda schimperiana; Group 2: Olea hochstetteri; Group 3: Acacia senegal and Aningeria adolfi-friederici. Thirty questionnaire interviews were conducted with farmers in the village of Gotu Onoma who use the nearby remnant forest patch. Their preferred trees were exotic species such as Eucalyptus globulus for construction and fuelwood and Grevillea robusta for shade and fertility. Regarding forest land degradation, the farmers were aware of the problem and suggested that governmental institutions address it by planting more Eucalyptus globulus. The natural forest seemed to have moderate levels of disturbance and was still floristically diverse. However, the low rate of natural regeneration of Podocarpus falcatus suggested that this species is threatened and must be a priority in conservation actions. Plantations and agroforestry seem to be possible solutions for rehabilitation of the surrounding degraded lands, thereby decreasing the existing pressure on the remnant natural forest.
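The Sørensen quantitative index used to compare plots can be written as 2W / (A + B), where A and B are the total abundances in the two plots and W is the sum, over species, of the smaller of the two abundances. It is equivalent to one minus the Bray-Curtis dissimilarity. The sketch below assumes each plot is a mapping from species name to abundance; the function name and the toy counts are hypothetical and are not data from the study.

```python
def sorensen_quantitative(plot_a, plot_b):
    """Sørensen quantitative similarity between two abundance tables.

    Each plot is a dict mapping species name -> abundance (>= 0).
    Returns a value between 0 (no shared abundance) and 1 (identical).
    """
    species = set(plot_a) | set(plot_b)
    shared = sum(min(plot_a.get(s, 0), plot_b.get(s, 0)) for s in species)
    total = sum(plot_a.values()) + sum(plot_b.values())
    if total == 0:
        raise ValueError("both plots are empty")
    return 2 * shared / total


# Made-up stem counts for two plots.
plot_1 = {"species_a": 4, "species_b": 6}
plot_2 = {"species_a": 2, "species_c": 8}
similarity = sorensen_quantitative(plot_1, plot_2)  # 2 * 2 / 20
```

An ordination such as MDS would then take the pairwise dissimilarities (1 minus this value) for all twenty plots as its input matrix.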
Abstract:
Acute heart failure (AHF) is a complex syndrome associated with exceptionally high mortality. Still, characteristics and prognostic factors of contemporary AHF patients have been inadequately studied. Kidney function has emerged as a very powerful prognostic risk factor in cardiovascular disease. This is believed to be the consequence of an interaction between the heart and kidneys, also termed the cardiorenal syndrome, the mechanisms of which are not fully understood. Renal insufficiency is common in heart failure and of particular interest for predicting outcome in AHF. Cystatin C (CysC) is a marker of glomerular filtration rate with properties making it a prospective alternative to the currently used measure creatinine for assessment of renal function. The aim of this thesis is to characterize a representative cohort of patients hospitalized for AHF and to identify risk factors for poor outcome in AHF. In particular, the role of CysC as a marker of renal function is evaluated, including examination of the value of CysC as a predictor of mortality in AHF. The FINN-AKVA (Finnish Acute Heart Failure) study is a national prospective multicenter study conducted to investigate the clinical presentation, aetiology and treatment of, as well as concomitant diseases and outcome in, AHF. Patients hospitalized for AHF were enrolled in the FINN-AKVA study, and mortality was followed for 12 months. The mean age of patients with AHF is 75 years and they frequently have both cardiovascular and non-cardiovascular co-morbidities. The mortality after hospitalization for AHF is high, rising to 27% by 12 months. The present study shows that renal dysfunction is very common in AHF. CysC detects impaired renal function in forty percent of patients. Renal function, measured by CysC, is one of the strongest predictors of mortality independently of other prognostic risk markers, such as age, gender, co-morbidities and systolic blood pressure on admission. 
Moreover, in patients with normal creatinine values, elevated CysC is associated with a marked increase in mortality. Acute kidney injury, defined as an increase in CysC within 48 hours of hospital admission, occurs in a significant proportion of patients and is associated with increased short- and mid-term mortality. The results suggest that CysC can be used for risk stratification in AHF. Markers of inflammation are elevated both in heart failure and in chronic kidney disease, and inflammation is one of the mechanisms thought to mediate heart-kidney interactions in the cardiorenal syndrome. Inflammatory cytokines such as interleukin-6 (IL-6) and tumor necrosis factor-alpha (TNF-α) correlate very differently to markers of cardiac stress and renal function. In particular, TNF-α showed a robust correlation to CysC, but was not associated with levels of NT-proBNP, a marker of hemodynamic cardiac stress. Compared to CysC, the inflammatory markers were not strongly related to mortality in AHF. In conclusion, patients with AHF are elderly with multiple co-morbidities, and renal dysfunction is very common. CysC demonstrates good diagnostic properties both in identifying impaired renal function and acute kidney injury in patients with AHF. CysC, as a measure of renal function, is also a powerful prognostic marker in AHF. CysC shows promise as a marker for assessment of kidney function and risk stratification in patients hospitalized for AHF.
Abstract:
Based on the Aristotelian criterion referred to as 'abductio', Peirce suggests a method of hypothetical inference, which operates in a different way than the deductive and inductive methods. “Abduction is nothing but guessing” (Peirce, 7.219). This principle is of extreme value for the study of our understanding of mathematical self-similarity in both of its typical presentations: relative or absolute. For the first case, abduction incarnates the quantitative/qualitative relationships of a self-similar object or process; for the second case, abduction makes understandable the statistical treatment of self-similarity, 'guessing' the continuity of geometric features to infinity through the use of a systematic stereotype (for instance, the assumption that the general shape of the Sierpiński triangle continues identically into its particular shapes). The metaphor coined by Peirce, of an exact map containing within itself the same exact map (a map of itself), is not only the most important precedent of Mandelbrot’s problem of measuring the boundaries of a continuous irregular surface with a logarithmic ruler, but is also still a useful abstraction for the conceptualisation of relative and absolute self-similarity and its mechanisms of implementation. It is also useful for explaining some of the most basic geometric ontologies as mental constructions: in the notion of infinite convergence of points in the corners of a triangle, or the intuition for defining two parallel straight lines as two lines in a plane that 'never' intersect.
Abstract:
The trees in the Penn Treebank have a standard representation that involves complete balanced bracketing. In this article, an alternative to this standard representation of the tree bank is proposed. The proposed representation for the trees is lossless, but it reduces the total number of brackets by 28%. This is possible by omitting the redundant pairs of special brackets that encode initial and final embedding, using a technique proposed by Krauwer and des Tombe (1981). In terms of the paired brackets, the maximum nesting depth in sentences decreases by 78%. A coverage of 99.9% is achieved with only five non-top levels of paired brackets. The observed shallowness of the reduced bracketing suggests that finite-state based methods for parsing and searching could be a feasible option for tree bank processing.
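The two quantities compared in this abstract, the number of bracket pairs and the maximum nesting depth, can be measured on a standard fully bracketed tree with a short sketch like the one below. It does not implement the reduced representation or the Krauwer and des Tombe technique; it only shows how the baseline figures could be counted. The function name and the example sentence are hypothetical.

```python
def bracket_stats(tree):
    """Return (number of bracket pairs, maximum nesting depth).

    `tree` is a bracketed string such as "(S (NP (DT the)) (VP (VBD sat)))".
    Only '(' and ')' are treated as structural symbols.
    """
    depth = 0
    max_depth = 0
    pairs = 0
    for ch in tree:
        if ch == "(":
            depth += 1
            pairs += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced bracketing: extra ')'")
    if depth != 0:
        raise ValueError("unbalanced bracketing: missing ')'")
    return pairs, max_depth


example = "(S (NP (DT the) (NN cat)) (VP (VBD sat)))"
pairs, max_depth = bracket_stats(example)
```

A single counter suffices here because balanced bracketing is checked with bounded state once the depth is bounded, which is the intuition behind the abstract's remark that shallow nesting makes finite-state methods plausible.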
Abstract:
This thesis studies juvenile tree species diversity in cacao (Theobroma cacao L.) based agroforestry and in primary forest in the natural conservation forest environment of Lore Lindu National Park, Sulawesi, Indonesia. The adult species composition in Lore Lindu National Park is relatively well studied, but less is known about tree species diversity in seedling communities, particularly in the frequently disturbed cacao agroforestry field environment. Cacao production poses a potentially serious threat to keeping the conservation areas in Sulawesi pristine and forested. The impacts of cacao production on the natural environment are directly linked to the diversity and abundance of shade tree usage. The study aims to compare cacao agroforestry and natural forest in the surrounding area in terms of their species composition in the seedling and sapling size categories. The study was carried out in two parts: a biodiversity inventory of seedlings and saplings was combined with a social survey based on farmer interviews. The aim of the survey was to gain knowledge of the cacao fields and of farmers’ observations and choices regarding tree species associated with cacao. Data were collected in summer 2008. The impact of the environmental factors (solar radiation, weeding frequency, cacao tree planting density, distance to forest, distance to the main park road, and type of habitat) on seedling and sapling compositions was assessed with Non-metric Multidimensional Scaling (NMS). Outlier analysis was used to assess distorting variables for NMS, and Multi-Response Permutation Procedures (MRPP) analysis was used to differentiate the impact of categorical variables. Sampling success was estimated with rarefaction curves and a jackknife estimate of species richness. In the inventory, 135 species of trees and shrubs were found. Only a few agroforestry-related species dominated. The most species-rich communities were the sapling communities in forest habitat.
NMS showed generally low linear correlation between variation in species composition and the environmental variables. Solar radiation was the most significant explanatory variable. Cacao and forest habitats were the most clearly separated in the ordination. The results of the seedling and sapling inventory coincided only partly with farmers’ knowledge of the tree species occurring on their fields. More research with frequent assessment of seedling cohorts is needed because of the natural variability of cohorts and the high mortality rate of seedlings.
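The abstract mentions a jackknife estimate of species richness without stating its order. As an illustration only, the sketch below assumes the common first-order jackknife, S_jack1 = S_obs + Q1 (m - 1) / m, where S_obs is the number of observed species, Q1 is the number of species found in exactly one sampling unit, and m is the number of sampling units. The function name and the toy plots are hypothetical.

```python
def jackknife1_richness(plots):
    """First-order jackknife estimate of species richness.

    `plots` is a list of iterables, each listing the species seen in one
    sampling unit. Only presence or absence per plot is used.
    """
    m = len(plots)
    if m == 0:
        raise ValueError("at least one plot is required")
    occurrences = {}
    for plot in plots:
        for species in set(plot):
            occurrences[species] = occurrences.get(species, 0) + 1
    s_obs = len(occurrences)
    # Species recorded in exactly one plot hint at species still unseen.
    uniques = sum(1 for count in occurrences.values() if count == 1)
    return s_obs + uniques * (m - 1) / m


# Made-up presence lists for four plots.
toy_plots = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
estimate = jackknife1_richness(toy_plots)  # 4 + 2 * 3 / 4
```

Comparing such an estimate with the observed count gives a rough sense of sampling completeness: the closer the two values, the fewer species are likely to have been missed.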