94 results for optimal sequential search
in Helda - Digital Repository of University of Helsinki
Abstract:
Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, a more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox assembled herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can also be applied under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
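The statement that maximum likelihood inference can be read as a Gaussian approximation at the posterior mode under a flat prior can be illustrated with a small sketch. The binomial model, the function name `laplace_binomial`, and the example counts are assumptions chosen for illustration; they do not come from the thesis.

```python
import math


def laplace_binomial(successes: int, trials: int) -> tuple[float, float]:
    """Posterior mode and Gaussian (Laplace) standard deviation for a
    binomial proportion under a flat Beta(1, 1) prior (illustrative only)."""
    if not 0 < successes < trials:
        raise ValueError("need 0 < successes < trials for an interior mode")
    # With a flat prior the log posterior equals the log likelihood up to a
    # constant, so the posterior mode coincides with the maximum likelihood
    # estimate.
    mode = successes / trials
    # Negative second derivative of the log posterior evaluated at the mode.
    curvature = successes / mode**2 + (trials - successes) / (1 - mode) ** 2
    return mode, math.sqrt(1 / curvature)


mode, sd = laplace_binomial(30, 100)
```

In this example the approximating standard deviation equals the familiar maximum likelihood standard error, the square root of p(1 - p)/n evaluated at the estimate.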
Abstract:
Segmentation is a data mining technique yielding simplified representations of sequences of ordered points. A sequence is divided into some number of homogeneous blocks, and all points within a segment are described by a single value. The focus in this thesis is on piecewise-constant segments, where the most likely description for each segment and the most likely segmentation into some number of blocks can be computed efficiently. Representing sequences as segmentations is useful in, e.g., storage and indexing tasks in sequence databases, and segmentation can be used as a tool in learning about the structure of a given sequence. The discussion in this thesis begins with basic questions related to segmentation analysis, such as choosing the number of segments, and evaluating the obtained segmentations. Standard model selection techniques are shown to perform well for the sequence segmentation task. Segmentation evaluation is proposed with respect to a known segmentation structure. Applying segmentation on certain features of a sequence is shown to yield segmentations that are significantly close to the known underlying structure. Two extensions to the basic segmentation framework are introduced: unimodal segmentation and basis segmentation. The former is concerned with segmentations where the segment descriptions first increase and then decrease, and the latter with the interplay between different dimensions and segments in the sequence. These problems are formally defined and algorithms for solving them are provided and analyzed. Practical applications for segmentation techniques include time series and data stream analysis, text analysis, and biological sequence analysis. In this thesis segmentation applications are demonstrated in analyzing genomic sequences.
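The efficient computation of the most likely piecewise-constant segmentation mentioned above can be sketched with a textbook dynamic program. The sketch assumes squared error as the fit criterion and describes each block by its mean; the function name `segment` and the sample sequence are hypothetical, and the thesis may use a different formulation.

```python
def segment(values: list[float], k: int) -> tuple[float, list[int]]:
    """Return (total squared error, block start indices) of the best
    piecewise-constant segmentation of `values` into exactly k blocks."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")
    # Prefix sums give the squared error of any block in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i: int, j: int) -> float:
        """Squared error of describing values[i:j] by its mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal error for values[:j] split into m blocks.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i
    # Walk the back-pointers to recover where each block starts.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    return best[k][n], starts[::-1]


error, starts = segment([1, 1, 1, 5, 5, 5, 2, 2], 3)
```

The triple loop costs on the order of k times n squared operations, which is why the number of segments and the sequence length both matter in practice.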
Abstract:
Analyzing statistical dependencies is a fundamental problem in all empirical science. Dependencies help us understand causes and effects, create new scientific theories, and invent cures to problems. Nowadays, large amounts of data are available, but efficient computational tools for analyzing the data are missing. In this research, we develop efficient algorithms for a commonly occurring search problem - searching for the statistically most significant dependency rules in binary data. We consider dependency rules of the form X->A or X->not A, where X is a set of positive-valued attributes and A is a single attribute. Such rules describe which factors either increase or decrease the probability of the consequent A. Classical examples are genetic and environmental factors, which can either cause or prevent a disease. The emphasis in this research is that the discovered dependencies should be genuine - i.e. they should also hold in future data. This is an important distinction from the traditional association rules, which - in spite of their name and a similar appearance to dependency rules - do not necessarily represent statistical dependencies at all or represent only spurious connections, which occur by chance. Therefore, the principal objective is to search for the rules with statistical significance measures. Another important objective is to search for only non-redundant rules, which express the real causes of dependence, without any occasional extra factors. The extra factors do not add any new information on the dependence, but can only blur it and make it less accurate in future data. The problem is computationally very demanding, because the number of all possible rules increases exponentially with the number of attributes. In addition, neither statistical dependency nor statistical significance is a monotonic property, which means that the traditional pruning techniques do not work.
As a solution, we first derive the mathematical basis for pruning the search space with any well-behaved statistical significance measure. The mathematical theory is complemented by a new algorithmic invention, which enables an efficient search without any heuristic restrictions. The resulting algorithm can be used to search for both positive and negative dependencies with any commonly used statistical measure, such as Fisher's exact test, the chi-squared measure, mutual information, and z scores. According to our experiments, the algorithm scales well, especially with Fisher's exact test. It can easily handle even the densest data sets with 10000-20000 attributes. Still, the results are globally optimal, which is a remarkable improvement over the existing solutions. In practice, this means that the user does not have to worry about whether the dependencies hold in future data or whether the data still contain better, but undiscovered, dependencies.
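To make the role of a significance measure concrete, the following sketch evaluates one candidate rule X->A with a one-sided Fisher's exact test computed from the hypergeometric distribution. It only scores a single rule and is not the pruning search described in the abstract; the function name `fisher_p` and the example counts are assumptions for illustration.

```python
from math import comb


def fisher_p(n_xa: int, n_x: int, n_a: int, n: int) -> float:
    """One-sided Fisher p-value for positive dependence between X and A.

    n_xa: rows where X and A both hold; n_x: rows where X holds;
    n_a: rows where A holds; n: total number of rows.
    """
    upper = min(n_x, n_a)
    # With both margins fixed, sum the hypergeometric probabilities of seeing
    # at least n_xa co-occurrences of X and A.
    numerator = sum(
        comb(n_a, i) * comb(n - n_a, n_x - i) for i in range(n_xa, upper + 1)
    )
    return numerator / comb(n, n_x)


# Hypothetical data: X holds in 4 of 8 rows, A holds in 4 of 8 rows,
# and they co-occur in all 4 rows where X holds.
p_value = fisher_p(4, 4, 4, 8)
```

A smaller p-value indicates a more significant positive dependency; a negative rule X->not A can be scored the same way by counting rows where A does not hold.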
Abstract:
This thesis is an empirical study of how two words in Icelandic, "nú" and "núna", are used in contemporary Icelandic conversation. My aims in this study are, first, to explain the differences between the temporal functions of "nú" and "núna", and, second, to describe the non-temporal functions of "nú". In the analysis, a focus is placed on comparing the sequential placement of the two words, on their syntactic distribution, and on their prosodic realization. The empirical data comprise 14 hours and 11 minutes of naturally occurring conversation recorded between 1996 and 2003. The selected conversations represent a wide range of interactional contexts including informal dinner parties, institutional and non-institutional telephone conversations, radio programs for teenagers, phone-in programs, and, finally, a political debate on television. The theoretical and methodological framework is interactional linguistics, which can be described as linguistically oriented conversation analysis (CA). A comparison of "nú" and "núna" shows that the two words have different syntactic distributions. "Nú" has a clear tendency to occur in the front field, before the finite verb, while "núna" typically occurs in the end field, after the object. It is argued that this syntactic difference reflects a functional difference between "nú" and "núna". A sequential analysis of "núna" shows that the word refers to an unspecified period of time which includes the utterance time as well as some time in the past and in the future. This temporal relation is referred to as reference time. "Nú", by contrast, is mainly used in three different environments: 1) in temporal comparisons, 2) in transitions, and 3) when the speaker is taking an affective stance. The non-temporal functions of "nú" are divided into three categories: 1) "nú" as a tone particle, 2) "nú" as an utterance particle, and 3) "nú" as a dialogue particle.
"Nú" as a tone particle is syntactically integrated and can occur in two syntactic positions: pre-verbally and post-verbally. I argue that these instances are employed in utterances in which a speaker is foregrounding information or marking it as particularly important. The study shows that, although these instances are typically prosodically non-prominent and unstressed, they are in some cases delivered with stress and with a higher pitch than the surrounding talk. "Nú" as an utterance particle occurs turn-initially and is syntactically non-integrated. By using "nú", speakers show continuity between turns and link new turns to prior ones. These instances initiate either continuations by the same speaker or new turns after speaker shifts. "Nú" as a dialogue particle occurs as a turn of its own. The study shows that these instances register informings in prior turns as unexpected or as a departure from the normal state of affairs. "Nú" as a dialogue particle is often delivered with a prolonged vowel and a recognizable intonation contour. A comparative sequential and prosodic analysis shows that in these cases there is a correlation between the function of "nú" and the intonation contour by which it is delivered. Finally, I argue that despite the many functions of "nú", all the instances can be said to have a common denominator, which is to display attention towards the present moment and the utterances which are produced before or after the production of "nú". Instead of anchoring the utterances in external time or reference time, these instances position the utterance in discourse-internal time, or discourse time.
Abstract:
Pitch discrimination is a fundamental property of the human auditory system. Our understanding of pitch-discrimination mechanisms is important from both theoretical and clinical perspectives. The discrimination of spectrally complex sounds is crucial in the processing of music and speech. Current methods of cognitive neuroscience can track the brain processes underlying sound processing either with precise temporal (EEG and MEG) or spatial resolution (PET and fMRI). A combination of different techniques is therefore required in contemporary auditory research. One of the problems in comparing the EEG/MEG and fMRI methods, however, is the fMRI acoustic noise. In the present thesis, EEG and MEG in combination with behavioral techniques were used, first, to define the ERP correlates of automatic pitch discrimination across a wide frequency range in adults and neonates and, second, they were used to determine the effect of recorded acoustic fMRI noise on those adult ERP and ERF correlates during passive and active pitch discrimination. Pure tones and complex 3-harmonic sounds served as stimuli in the oddball and matching-to-sample paradigms. The results suggest that pitch discrimination in adults, as reflected by MMN latency, is most accurate in the 1000-2000 Hz frequency range, and that pitch discrimination is facilitated further by adding harmonics to the fundamental frequency. Newborn infants are able to discriminate a 20% frequency change in the 250-4000 Hz frequency range, whereas the discrimination of a 5% frequency change was unconfirmed. Furthermore, the effect of the fMRI gradient noise on the automatic processing of pitch change was more prominent for tones with frequencies exceeding 500 Hz, overlapping with the spectral maximum of the noise. When the fundamental frequency of the tones was lower than the spectral maximum of the noise, fMRI noise had no effect on MMN and P3a, whereas the noise delayed and suppressed N1 and exogenous N2. 
Noise also suppressed the N1 amplitude in a matching-to-sample working memory task. However, the task-related difference observed in the N1 component, suggesting a functional dissociation between the processing of spatial and non-spatial auditory information, was partially preserved in the noise condition. Noise hampered feature coding mechanisms more than it hampered the mechanisms of change detection, involuntary attention, and the segregation of the spatial and non-spatial domains of working memory. The data presented in the thesis can be used to develop clinical ERP-based frequency-discrimination protocols and combined EEG and fMRI experimental paradigms.
Abstract:
In visual search one tries to find the currently relevant item among other, irrelevant items. In the present study, visual search performance for complex objects (characters, faces, computer icons and words) was investigated, together with the contribution of different stimulus properties, such as luminance contrast between characters and background, set size, stimulus size, colour contrast, spatial frequency, and stimulus layout. Subjects were required to search for a target object among distractor objects in two-dimensional stimulus arrays. The outcome measure was threshold search time, that is, the presentation duration of the stimulus array required by the subject to find the target with a certain probability. It reflects the time used for visual processing separated from the time used for decision making and manual reactions. The duration of stimulus presentation was controlled by an adaptive staircase method. The number and duration of eye fixations, saccade amplitude, and perceptual span, i.e., the number of items that can be processed during a single fixation, were measured. It was found that search performance was correlated with the number of fixations needed to find the target. Search time and the number of fixations increased with increasing stimulus set size. On the other hand, several complex objects could be processed during a single fixation, i.e., within the perceptual span. Search time and the number of fixations depended on object type as well as luminance contrast. The size of the perceptual span was smaller for more complex objects, and decreased with decreasing luminance contrast within object type, especially for very low contrasts. In addition, the size and shape of the perceptual span explained the changes in search performance for different stimulus layouts in word search.
Perceptual span was scale invariant for a 16-fold range of stimulus sizes, i.e., the number of items processed during a single fixation was independent of retinal stimulus size or viewing distance. It is suggested that saccadic visual search consists of both serial (eye movements) and parallel (processing within perceptual span) components, and that the size of the perceptual span may explain the effectiveness of saccadic search in different stimulus conditions. Further, low-level visual factors, such as the anatomical structure of the retina, peripheral stimulus visibility and resolution requirements for the identification of different object types are proposed to constrain the size of the perceptual span, and thus, limit visual search performance. Similar methods were used in a clinical study to characterise the visual search performance and eye movements of neurological patients with chronic solvent-induced encephalopathy (CSE). In addition, the data about the effects of different stimulus properties on visual search in normal subjects were presented as simple practical guidelines, so that the limits of human visual perception could be taken into account in the design of user interfaces.
Abstract:
Human growth and attained height are determined by a combination of genetic and environmental effects, and in modern Western societies more than 80% of the observed variation in height is determined by genetic factors. Height is a fundamental human trait that is associated with many socioeconomic and psychosocial factors and health measures; however, little is known of the identity of the specific genes that influence height variation in the general population. This thesis work aimed to identify the genetic variants that influence height in the general population by genome-wide linkage analysis utilizing large family samples. The study focused on analysis of three separate sets of families consisting of: 1) 1,417 individuals from 277 Finnish families (FinnHeight), 2) 8,450 individuals from 3,817 families from Australia and Europe (EUHeight) and 3) 9,306 individuals from 3,302 families from the United States (USHeight). The most significant finding in this study came from the Finnish family sample, where a locus in the chromosomal region 1p21 was linked to adult height. Several regions showed evidence for linkage in the Australian, European and US families, with 8q21 and 15q25 being the most significant. The region on 1p21 was followed up with further studies and we were able to show that the collagen 11-alpha-1 gene (COL11A1) residing at this location was associated with adult height. This association was also confirmed in an independent Finnish population cohort (Health 2000) consisting of 6,542 individuals. From this population sample, we estimated that males and females homozygous for this gene variant were 1.1 and 0.6 cm taller than the respective controls. In this thesis work we identified a gene variant in the COL11A1 gene that influences human height, although this variant alone explains only 0.1% of height variation in the Finnish population.
We also demonstrated in this study that special stratification strategies, such as performing sex-limited analyses, focusing on dizygotic twin pairs, analyzing ethnic groups within a population separately and utilizing homogeneous populations such as the Finns, can significantly improve the statistical power of finding QTL. We also concluded from the results of this study that even though genetic effects explain a great proportion of height variance, it is likely that there are tens or even hundreds of genes with small individual effects underlying the genetic architecture of height.
Abstract:
Schizophrenia is a severe mental disorder affecting 0.4-1% of the population worldwide. It is characterized by impairments in the perception of reality and by significant social or occupational dysfunction. The disorder is one of the major contributors to the global burden of diseases. Studies of twins, families, and adopted children point to strong genetic components for schizophrenia, but environmental factors also play a role in the pathogenesis of the disease. Molecular genetic studies have identified several potential positional candidate genes. The strongest evidence for putative schizophrenia susceptibility loci relates to the genes encoding dysbindin (DTNBP1) and neuregulin (NRG1), but studies are not consistent in the precise genetic regions and alleles implicated. We have studied the role of three potential candidate genes by genotyping 28 single nucleotide polymorphisms in the DTNBP1, NRG1, and AKT1 genes in a large schizophrenia family sample consisting of 441 families with 865 affected individuals from Finland. Our results do not support a major role for these genes in the pathogenesis of schizophrenia in Finland. We have previously identified a region on chromosome 5q21-34 as a susceptibility locus for schizophrenia in a Finnish family sample. Recently, two studies reported association between schizophrenia and the γ-aminobutyric acid type A receptor cluster of genes in this region, and one study showed suggestive evidence for association with another regional gene encoding clathrin interactor 1 (CLINT1, also called Epsin 4 and ENTH). To further address the significance of these genes under the linkage peak in the Finnish families, we genotyped SNPs of these genes, and observed statistically significant association between variants of GABRG2 and schizophrenia. Furthermore, these variants also seem to affect the functioning of the working memory. Fetal events and obstetric complications are associated with schizophrenia.
Rh incompatibility has been implicated as a risk factor for schizophrenia in several epidemiological studies. We conducted a family-based candidate-gene study that assessed the role of maternal-fetal genotype incompatibility at the RhD locus in schizophrenia. There was significant evidence for an RhD maternal-fetal genotype incompatibility, and the risk ratio was estimated at 2.3. This is the first candidate-gene study to explicitly test for and provide evidence of a maternal-fetal genotype incompatibility mechanism in schizophrenia. In conclusion, in this thesis we found evidence that one GABA receptor subunit, GABRG2, is significantly associated with schizophrenia. Furthermore, it also seems to affect the functioning of the working memory. In addition, an RhD maternal-fetal genotype incompatibility increases the risk of schizophrenia approximately twofold.
Abstract:
Forest management is facing new challenges under climate change. By adjusting thinning regimes, conventional forest management can be adapted to various objectives of utilization of forest resources, such as wood quality, forest bioenergy, and carbon sequestration. This thesis aims to develop and apply a simulation-optimization system as a tool for an interdisciplinary understanding of the interactions between wood science, forest ecology, and forest economics. In this thesis, the OptiFor software was developed for forest resources management. The OptiFor simulation-optimization system integrated the process-based growth model PipeQual, wood quality models, biomass production and carbon emission models, as well as energy wood and commercial logging models into a single optimization model. Osyczka's direct and random search algorithm was employed to identify optimal values for a set of decision variables. The numerical studies in this thesis broadened our current knowledge and understanding of the relationships between wood science, forest ecology, and forest economics. The results for timber production show that optimal thinning regimes depend on site quality and initial stand characteristics. Taking wood properties into account, our results show that increasing the intensity of thinning resulted in lower wood density and shorter fibers. The addition of nutrients accelerated volume growth, but lowered wood quality for Norway spruce. Integrating energy wood harvesting into conventional forest management showed that conventional forest management without energy wood harvesting was still superior in sparse stands of Scots pine. Energy wood from pre-commercial thinning turned out to be optimal for dense stands. When carbon balance is taken into account, our results show that changing carbon assessment methods leads to very different optimal thinning regimes and average carbon stocks.
Raising the carbon price resulted in longer rotations and a higher mean annual increment, as well as a significantly higher average carbon stock over the rotation.
Abstract:
Buffer zones are vegetated strip-edges of agricultural fields along watercourses. As linear habitats in agricultural ecosystems, buffer strips dominate and play a leading ecological role in many areas. This thesis focuses on the plant species diversity of the buffer zones in a Finnish agricultural landscape. The main objective of the present study is to identify the determinants of floral species diversity in arable buffer zones from local to regional levels. This study was conducted in a watershed area of a farmland landscape of southern Finland. The study area, Lepsämänjoki, is situated in the Nurmijärvi commune 30 km to the north of Helsinki, Finland. The biotope mosaics were mapped in GIS. A total of 59 buffer zones were surveyed, of which 29 were also sampled by plot. Firstly, two diversity components (species richness and evenness) were investigated to determine whether the relationship between the two is equal and predictable. I found no correlation between species richness and evenness. The relationship between richness and evenness is unpredictable in a small-scale human-shaped ecosystem. Ordination and correlation analyses show that richness and evenness may result from different ecological processes, and thus should be considered separately. Species richness correlated negatively with phosphorus content, and species evenness correlated negatively with the ratio of organic carbon to total nitrogen in soil. The lack of a consistent pattern in the relationship between these two components may be due to site-specific variation in resource utilization by plant species. Within-habitat configuration variables (width, length, and area) were investigated to determine which is most effective for predicting species richness. More species per unit area increment could be obtained from widening the buffer strip than from lengthening it. The width of the strips is an effective determinant of plant species richness.
The increase in species diversity with an increase in the width of buffer strips may be due to cross-sectional habitat gradients within the linear patches. This result can serve as a reference for policy makers, and has application value in agricultural management. In the framework of metacommunity theory, I found that both mass effect (connectivity) and species sorting (resource heterogeneity) were likely to explain species composition and diversity on a local and regional scale. The local and regional processes were interactively dominated by the degree to which dispersal perturbs local communities. In the lowly and intermediately connected regions, species sorting was of primary importance to explain species diversity, while the mass effect surpassed species sorting in the highly connected region. Increasing connectivity in communities containing high habitat heterogeneity can lead to the homogenization of local communities, and consequently, to lower regional diversity, while local species richness was unrelated to the habitat connectivity. Of all species found, Anthriscus sylvestris, Phalaris arundinacea, and Phleum pratense significantly responded to connectivity, and showed high abundance in the highly connected region. We suggest that these species may play a role in switching the force from local resources to regional connectivity shaping the community structure. On the landscape context level, the different responses of local species richness and evenness to landscape context were investigated. Seven landscape structural parameters served to indicate landscape context on five scales. On all scales but the smallest, the Shannon-Wiener diversity of land covers (H') correlated positively with the local richness. The factor (H') showed the highest correlation coefficients with species richness on the second largest scale.
The edge density of arable fields was the only predictor that correlated with species evenness on all scales, and it showed the highest predictive power on the second smallest scale. The different predictive power of the factors on different scales showed a scale-dependent relationship between the landscape context and local plant species diversity, and indicated that different ecological processes determine species richness and evenness. The local richness of species depends on a regional process on large scales, which may relate to the regional species pool, while species evenness depends on a fine- or coarse-grained farming system, which may relate to the patch quality of the habitats of field edges near the buffer strips. My results suggested some guidelines for species diversity conservation in the agricultural ecosystem. To maintain a high level of species diversity in the strips, a high level of phosphorus in strip soil should be avoided. Widening the strips is the most effective means of improving species richness. Habitat connectivity is not always favorable to species diversity because increasing connectivity in communities containing high habitat heterogeneity can lead to the homogenization of local communities (beta diversity) and, consequently, to lower regional diversity. Overall, a synthesis of local and regional factors emerged as the model that best explains variations in plant species diversity. The studies also suggest that the effects of determinants on species diversity have a complex relationship with scale.
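For readers unfamiliar with the diversity measures named in this abstract, the sketch below computes the Shannon-Wiener index H' from a list of abundances (or land-cover areas) together with an evenness value. The abstract does not say which evenness index the thesis uses; Pielou's J (H' divided by the logarithm of richness) is assumed here, and the function names and sample counts are hypothetical.

```python
import math


def shannon_wiener(counts: list[float]) -> float:
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i) over non-empty classes."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts: list[float]) -> float:
    """Pielou's J = H' / ln(S), where S is the number of non-empty classes.

    Assumed evenness index; the thesis may use a different one.
    """
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        raise ValueError("evenness is undefined for fewer than two classes")
    return shannon_wiener(counts) / math.log(richness)


# Hypothetical plot: four species with equal abundance.
h_equal = shannon_wiener([10, 10, 10, 10])
j_equal = pielou_evenness([10, 10, 10, 10])
```

Richness (the count of non-empty classes) and evenness are computed separately here, which mirrors the abstract's point that the two components can respond to different processes.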
Abstract:
Phosphorus is a nutrient needed in crop production. While boosting crop yields, it may also accelerate eutrophication in the surface waters receiving the phosphorus runoff. The privately optimal level of phosphorus use is determined by the input and output prices, and the crop response to phosphorus. Socially optimal use also takes into account the impact of phosphorus runoff on water quality. Increased eutrophication decreases the economic value of surface waters by deteriorating fish stocks, curtailing the potential for recreational activities and increasing the probability of mass algae blooms. In this dissertation, the optimal use of phosphorus is modelled as a dynamic optimization problem. The potentially plant-available phosphorus accumulated in soil is treated as a dynamic state variable, the control variable being the annual phosphorus fertilization. For crop response to phosphorus, the state variable is more important than the annual fertilization. The level of this state variable is also a key determinant of the runoff of dissolved, reactive phosphorus. The loss of particulate phosphorus due to erosion is also considered in the thesis, as well as its mitigation by constructing vegetative buffers. The dynamic model is applied to crop production on clay soils. At the steady state, the analysis focuses on the effects of prices, damage parameterization, discount rate and soil phosphorus carryover capacity on optimal steady state phosphorus use. The economic instruments needed to sustain the social optimum are also analyzed. According to the results, the economic incentives should be conditioned on soil phosphorus values directly, rather than on annual phosphorus applications. The results also emphasize the substantial effects that differences between the discount rates of the farmer and the social planner have on optimal instruments. The thesis analyzes the optimal soil phosphorus paths from alternative initial levels.
It also examines how erosion susceptibility of a parcel affects these optimal paths. The results underline the significance of the prevailing soil phosphorus status on optimal fertilization levels. With very high initial soil phosphorus levels, both the privately and socially optimal phosphorus application levels are close to zero as the state variable is driven towards its steady state. The soil phosphorus processes are slow. Therefore, depleting high phosphorus soils may take decades. The thesis also presents a methodologically interesting phenomenon in problems of maximizing the flow of discounted payoffs. When both the benefits and damages are related to the same state variable, the steady state solution may have an interesting property, under very general conditions: the tail of the payoffs of the privately optimal path as well as the steady state may provide a higher social welfare than the respective tail of the socially optimal path. The result is formalized and applied to the created framework of optimal phosphorus use.
Abstract:
The purpose of this study is to describe the development of applications of mass spectrometry for the structural analysis of non-coding ribonucleic acids during the past decade. Mass spectrometric methods are compared with traditional gel electrophoretic methods, the performance characteristics of mass spectrometric analyses are studied, and the future trends of mass spectrometry of ribonucleic acids are discussed. Non-coding ribonucleic acids are short polymeric biomolecules which are not translated into proteins, but which may affect gene expression in all organisms. Regulatory ribonucleic acids act through transient interactions with key molecules in signal transduction pathways. Interactions are mediated through specific secondary and tertiary structures. Posttranscriptional modifications in the structures of molecules may introduce new properties to the organism, such as adaptation to environmental changes or development of resistance to antibiotics. In the scope of this study, the structural studies include i) determination of the sequence of nucleobases in the polymer chain, ii) characterisation and localisation of posttranscriptional modifications in nucleobases and in the backbone structure, iii) identification of ribonucleic acid-binding molecules and iv) probing of higher order structures in the ribonucleic acid molecule. Bacteria, archaea, viruses and HeLa cancer cells have been used as target organisms. Synthesised ribonucleic acids consisting of structural regions of interest have been frequently used. Electrospray ionisation (ESI) and matrix-assisted laser desorption ionisation (MALDI) have been used for ionisation of ribonucleic acid analytes. Ammonium acetate and 2-propanol are common solvents for ESI. Trihydroxyacetophenone is the optimal MALDI matrix for ionisation of ribonucleic acids and peptides. Ammonium salts are used in ESI buffers and MALDI matrices as additives to remove cation adducts. 
Reverse phase high performance liquid chromatography has been used for desalting and fractionation of analytes either off-line or on-line, coupled with an ESI source. Triethylamine and triethylammonium bicarbonate are used as ion pair reagents almost exclusively. A Fourier transform ion cyclotron resonance analyser using ESI coupled with liquid chromatography is the platform of choice for all forms of structural analysis. A time-of-flight (TOF) analyser using MALDI may offer a sensitive, easy-to-use and economical solution for simple sequencing of longer oligonucleotides and for analyses of analyte mixtures without prior fractionation. Special analysis software is used for computer-aided interpretation of mass spectra. With mass spectrometry, sequences of 20-30 nucleotides in length may be determined unambiguously. Sequencing may be applied to quality control of short synthetic oligomers for analytical purposes. Sequencing in conjunction with other structural studies enables accurate localisation and characterisation of posttranscriptional modifications and identification of nucleobases and amino acids at the sites of interaction. High throughput screening methods for RNA-binding ligands have been developed. Probing of the higher order structures has provided supportive data for computer-generated three-dimensional models of viral pseudoknots. In conclusion, mass spectrometric methods are well suited for structural analyses of small species of ribonucleic acids, such as short non-coding ribonucleic acids in the molecular size region of 20-30 nucleotides. Structural information not attainable with other methods of analysis, such as nuclear magnetic resonance and X-ray crystallography, may be obtained with the use of mass spectrometry. Sequencing may be applied to quality control of short synthetic oligomers for analytical purposes. Ligand screening may be used in the search for possible new therapeutic agents. 
Demanding assay design and challenging interpretation of data require multidisciplinary knowledge. The implementation of mass spectrometry in structural studies of ribonucleic acids is probably most efficiently conducted in specialist groups consisting of researchers from various fields of science.