136 resultados para Probability and Statistics
em Helda - Digital Repository of University of Helsinki
Resumo:
Individual movement is very versatile and inevitable in ecology. In this thesis, I investigate two kinds of movement body condition dependent dispersal and small-range foraging movements resulting in quasi-local competition and their causes and consequences on the individual, population and metapopulation level. Body condition dependent dispersal is a widely evident but barely understood phenomenon. In nature, diverse relationships between body condition and dispersal are observed. I develop the first models that study the evolution of dispersal strategies that depend on individual body condition. In a patchy environment where patches differ in environmental conditions, individuals born in rich (e.g. nutritious) patches are on average stronger than their conspecifics that are born in poorer patches. Body condition (strength) determines competitive ability such that stronger individuals win competition with higher probability than weak individuals. Individuals compete for patches such that kin competition selects for dispersal. I determine the evolutionarily stable strategy (ESS) for different ecological scenarios. My models offer explanations for both dispersal of strong individuals and dispersal of weak individuals. Moreover, I find that within-family dispersal behaviour is not always reflected on the population level. This supports the fact that no consistent pattern is detected in data on body condition dependent dispersal. It also encourages the refining of empirical investigations. Quasi-local competition defines interactions between adjacent populations where one population negatively affects the growth of the other population. I model a metapopulation in a homogeneous environment where adults of different subpopulations compete for resources by spending part of their foraging time in the neighbouring patches, while their juveniles only feed on the resource in their natal patch. I show that spatial patterns (different population densities in the patches) are stable only if one age class depletes the resource very much but mainly the other age group depends on it.
Resumo:
Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling under various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox equipped herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly for justifying the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can be applied also under sampling from a finite population. The main emphasis here is in probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example. The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
Resumo:
This work develops methods to account for shoot structure in models of coniferous canopy radiative transfer. Shoot structure, as it varies along the light gradient inside canopy, affects the efficiency of light interception per unit needle area, foliage biomass, or foliage nitrogen. The clumping of needles in the shoot volume also causes a notable amount of multiple scattering of light within coniferous shoots. The effect of shoot structure on light interception is treated in the context of canopy level photosynthesis and resource use models, and the phenomenon of within-shoot multiple scattering in the context of physical canopy reflectance models for remote sensing purposes. Light interception. A method for estimating the amount of PAR (Photosynthetically Active Radiation) intercepted by a conifer shoot is presented. The method combines modelling of the directional distribution of radiation above canopy, fish-eye photographs taken at shoot locations to measure canopy gap fraction, and geometrical measurements of shoot orientation and structure. Data on light availability, shoot and needle structure and nitrogen content has been collected from canopies of Pacific silver fir (Abies amabilis (Dougl.) Forbes) and Norway spruce (Picea abies (L.) Karst.). Shoot structure acclimated to light gradient inside canopy so that more shaded shoots have better light interception efficiency. Light interception efficiency of shoots varied about two-fold per needle area, about four-fold per needle dry mass, and about five-fold per nitrogen content. Comparison of fertilized and control stands of Norway spruce indicated that light interception efficiency is not greatly affected by fertilization. Light scattering. Structure of coniferous shoots gives rise to multiple scattering of light between the needles of the shoot. Using geometric models of shoots, multiple scattering was studied by photon tracing simulations. Based on simulation results, the dependence of the scattering coefficient of shoot from the scattering coefficient of needles is shown to follow a simple one-parameter model. The single parameter, termed the recollision probability, describes the level of clumping of the needles in the shoot, is wavelength independent, and can be connected to previously used clumping indices. By using the recollision probability to correct for the within-shoot multiple scattering, canopy radiative transfer models which have used leaves as basic elements can use shoots as basic elements, and thus be applied for coniferous forests. Preliminary testing of this approach seems to explain, at least partially, why coniferous forests appear darker than broadleaved forests in satellite data.
Resumo:
Planar curves arise naturally as interfaces between two regions of the plane. An important part of statistical physics is the study of lattice models. This thesis is about the interfaces of 2D lattice models. The scaling limit is an infinite system limit which is taken by letting the lattice mesh decrease to zero. At criticality, the scaling limit of an interface is one of the SLE curves (Schramm-Loewner evolution), introduced by Oded Schramm. This family of random curves is parametrized by a real variable, which determines the universality class of the model. The first and the second paper of this thesis study properties of SLEs. They contain two different methods to study the whole SLE curve, which is, in fact, the most interesting object from the statistical physics point of view. These methods are applied to study two symmetries of SLE: reversibility and duality. The first paper uses an algebraic method and a representation of the Virasoro algebra to find common martingales to different processes, and that way, to confirm the symmetries for polynomial expected values of natural SLE data. In the second paper, a recursion is obtained for the same kind of expected values. The recursion is based on stationarity of the law of the whole SLE curve under a SLE induced flow. The third paper deals with one of the most central questions of the field and provides a framework of estimates for describing 2D scaling limits by SLE curves. In particular, it is shown that a weak estimate on the probability of an annulus crossing implies that a random curve arising from a statistical physics model will have scaling limits and those will be well-described by Loewner evolutions with random driving forces.
Resumo:
In this thesis we deal with the concept of risk. The objective is to bring together and conclude on some normative information regarding quantitative portfolio management and risk assessment. The first essay concentrates on return dependency. We propose an algorithm for classifying markets into rising and falling. Given the algorithm, we derive a statistic: the Trend Switch Probability, for detection of long-term return dependency in the first moment. The empirical results suggest that the Trend Switch Probability is robust over various volatility specifications. The serial dependency in bear and bull markets behaves however differently. It is strongly positive in rising market whereas in bear markets it is closer to a random walk. Realized volatility, a technique for estimating volatility from high frequency data, is investigated in essays two and three. In the second essay we find, when measuring realized variance on a set of German stocks, that the second moment dependency structure is highly unstable and changes randomly. Results also suggest that volatility is non-stationary from time to time. In the third essay we examine the impact from market microstructure on the error between estimated realized volatility and the volatility of the underlying process. With simulation-based techniques we show that autocorrelation in returns leads to biased variance estimates and that lower sampling frequency and non-constant volatility increases the error variation between the estimated variance and the variance of the underlying process. From these essays we can conclude that volatility is not easily estimated, even from high frequency data. It is neither very well behaved in terms of stability nor dependency over time. Based on these observations, we would recommend the use of simple, transparent methods that are likely to be more robust over differing volatility regimes than models with a complex parameter universe. In analyzing long-term return dependency in the first moment we find that the Trend Switch Probability is a robust estimator. This is an interesting area for further research, with important implications for active asset allocation.
Resumo:
Financing trade between economic agents located in different countries is affected by many types of risks, resulting from incomplete information about the debtor, the problems of enforcing international contracts, or the prevalence of political and financial crises. Trade is important for economic development and the availability of trade finance is essential, especially for developing countries. Relatively few studies treat the topic of political risk, particularly in the context of international lending. This thesis explores new ground to identify links between political risk and international debt defaults. The core hypothesis of the study is that the default probability of debt increases with increasing political risk in the country of the borrower. The thesis consists of three essays that support the hypothesis from different angles of the credit evaluation process. The first essay takes the point of view of an international lender assessing the credit risk of a public borrower. The second investigates creditworthiness assessment of companies. The obtained results are substantiated in the third essay that deals with an extensive political risk survey among finance professionals in developing countries. The financial instruments of core interest are export credit guaranteed debt initiated between the Export Credit Agency of Finland and buyers in 145 countries between 1975 and 2006. Default events of the foreign credit counterparts are conditioned on country-specific macroeconomic variables, corporate-specific accounting information as well as political risk indicators from various international sources. Essay 1 examines debt issued to government controlled institutions and conditions public default events on traditional macroeconomic fundamentals, in addition to selected political and institutional risk factors. Confirming previous research, the study finds country indebtedness and the GDP growth rate to be significant indicators of public default. Further, it is shown that public defaults respond to various political risk factors. However, the impact of the risk varies between countries at different stages of economic development. Essay 2 proceeds by investigating political risk factors as conveivable drivers of corporate default and uses traditional accounting variables together with new political risk indicators in the credit evaluation of private debtors. The study finds links between corporate default and leverage, as well as between corporate default and the general investment climate and measeures of conflict in the debtor country. Essay 3 concludes the thesis by offering survey evidence on the impact of political risk on debt default, as perceived and experienced by 103 finance professionals in 38 developing countries. Taken together, the results of the thesis suggest that various forms of political risk are associated with international debt defaults and continue to pose great concerns for both international creditors and borrowers in developing countries. The study provides new insights on the importance of variable selection in country risk analysis, and shows how political risk is actually perceived and experienced in the riskier, often lower income countries of the global economy.
Resumo:
The objective of this paper is to improve option risk monitoring by examining the information content of implied volatility and by introducing the calculation of a single-sum expected risk exposure similar to the Value-at-Risk. The figure is calculated in two steps. First, there is a need to estimate the value of a portfolio of options for a number of different market scenarios, while the second step is to summarize the information content of the estimated scenarios into a single-sum risk measure. This involves the use of probability theory and return distributions, which confronts the user with the problems of non-normality in the return distribution of the underlying asset. Here the hyperbolic distribution is used to describe one alternative for dealing with heavy tails. Results indicate that the information content of implied volatility is useful when predicting future large returns in the underlying asset. Further, the hyperbolic distribution provides a good fit to historical returns enabling a more accurate definition of statistical intervals and extreme events.
Resumo:
In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: a conditional and full likelihood approaches for the disease with variable age at onset phenotype using familial data obtained from population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using latent variable approach, as well as for prospective studies of genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Despite the fact that this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not make a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, reference sample of healthy subjects and birth cohort information of the Finnish population. Finally, a substantial familial variation in the susceptibility of T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.
Resumo:
The topic of this dissertation lies in the intersection of harmonic analysis and fractal geometry. We particulary consider singular integrals in Euclidean spaces with respect to general measures, and we study how the geometric structure of the measures affects certain analytic properties of the operators. The thesis consists of three research articles and an overview. In the first article we construct singular integral operators on lower dimensional Sierpinski gaskets associated with homogeneous Calderón-Zygmund kernels. While these operators are bounded their principal values fail to exist almost everywhere. Conformal iterated function systems generate a broad range of fractal sets. In the second article we prove that many of these limit sets are porous in a very strong sense, by showing that they contain holes spread in every direction. In the following we connect these results with singular integrals. We exploit the fractal structure of these limit sets, in order to establish that singular integrals associated with very general kernels converge weakly. Boundedness questions consist a central topic of investigation in the theory of singular integrals. In the third article we study singular integrals of different measures. We prove a very general boundedness result in the case where the two underlying measures are separated by a Lipshitz graph. As a consequence we show that a certain weak convergence holds for a large class of singular integrals.
Resumo:
A composition operator is a linear operator between spaces of analytic or harmonic functions on the unit disk, which precomposes a function with a fixed self-map of the disk. A fundamental problem is to relate properties of a composition operator to the function-theoretic properties of the self-map. During the recent decades these operators have been very actively studied in connection with various function spaces. The study of composition operators lies in the intersection of two central fields of mathematical analysis; function theory and operator theory. This thesis consists of four research articles and an overview. In the first three articles the weak compactness of composition operators is studied on certain vector-valued function spaces. A vector-valued function takes its values in some complex Banach space. In the first and third article sufficient conditions are given for a composition operator to be weakly compact on different versions of vector-valued BMOA spaces. In the second article characterizations are given for the weak compactness of a composition operator on harmonic Hardy spaces and spaces of Cauchy transforms, provided the functions take values in a reflexive Banach space. Composition operators are also considered on certain weak versions of the above function spaces. In addition, the relationship of different vector-valued function spaces is analyzed. In the fourth article weighted composition operators are studied on the scalar-valued BMOA space and its subspace VMOA. A weighted composition operator is obtained by first applying a composition operator and then a pointwise multiplier. A complete characterization is given for the boundedness and compactness of a weighted composition operator on BMOA and VMOA. Moreover, the essential norm of a weighted composition operator on VMOA is estimated. These results generalize many previously known results about composition operators and pointwise multipliers on these spaces.
Resumo:
In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration of the Finnish Game and Fisheries Research Institute by using the problem of monitoring and prediction of the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is also presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.
Resumo:
Advancements in the analysis techniques have led to a rapid accumulation of biological data in databases. Such data often are in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data give promises of answering various biologically relevant questions in more detail than what has been possible before. For example, one may wish to identify areas in an amino acid sequence, which are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet with the challenge of deriving meaning from the increasing amounts of data. Our main concern is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets provide also a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters. The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.
Resumo:
The topic of this dissertation is the geometric and isometric theory of Banach spaces. This work is motivated by the known Banach-Mazur rotation problem, which asks whether each transitive separable Banach space is isometrically a Hilbert space. A Banach space X is said to be transitive if the isometry group of X acts transitively on the unit sphere of X. In fact, some weaker symmetry conditions than transitivity are studied in the dissertation. One such condition is an almost isometric version of transitivity. Another investigated condition is convex-transitivity, which requires that the closed convex hull of the orbit of any point of the unit sphere under the rotation group is the whole unit ball. Following the tradition developed around the rotation problem, some contemporary problems are studied. Namely, we attempt to characterize Hilbert spaces by using convex-transitivity together with the existence of a 1-dimensional bicontractive projection on the space, and some mild geometric assumptions. The convex-transitivity of some vector-valued function spaces is studied as well. The thesis also touches convex-transitivity of Banach lattices and resembling geometric cases.
Resumo:
The concept of an atomic decomposition was introduced by Coifman and Rochberg (1980) for weighted Bergman spaces on the unit disk. By the Riemann mapping theorem, functions in every simply connected domain in the complex plane have an atomic decomposition. However, a decomposition resulting from a conformal mapping of the unit disk tends to be very implicit and often lacks a clear connection to the geometry of the domain that it has been mapped into. The lattice of points, where the atoms of the decomposition are evaluated, usually follows the geometry of the original domain, but after mapping the domain into another this connection is easily lost and the layout of points becomes seemingly random. In the first article we construct an atomic decomposition directly on a weighted Bergman space on a class of regulated, simply connected domains. The construction uses the geometric properties of the regulated domain, but does not explicitly involve any conformal Riemann map from the unit disk. It is known that the Bergman projection is not bounded on the space L-infinity of bounded measurable functions. Taskinen (2004) introduced the locally convex spaces LV-infinity consisting of measurable and HV-infinity of analytic functions on the unit disk with the latter being a closed subspace of the former. They have the property that the Bergman projection is continuous from LV-infinity onto HV-infinity and, in some sense, the space HV-infinity is the smallest possible substitute to the space H-infinity of analytic functions. In the second article we extend the above result to a smoothly bounded strictly pseudoconvex domain. Here the related reproducing kernels are usually not known explicitly, and thus the proof of continuity of the Bergman projection is based on generalised Forelli-Rudin estimates instead of integral representations. The minimality of the space LV-infinity is shown by using peaking functions first constructed by Bell (1981). Taskinen (2003) showed that on the unit disk the space HV-infinity admits an atomic decomposition. This result is generalised in the third article by constructing an atomic decomposition for the space HV-infinity on a smoothly bounded strictly pseudoconvex domain. In this case every function can be presented as a linear combination of atoms such that the coefficient sequence belongs to a suitable Köthe co-echelon space.
Resumo:
The focus of this study is on statistical analysis of categorical responses, where the response values are dependent of each other. The most typical example of this kind of dependence is when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is the pneumococcal nasopharengyal carriage (yes/no) on 329 children. For each child, the carriage is measured nine times during the first 18 months of life, and thus repeated respones on each child cannot be assumed independent of each other. In the case of the above example, the interest typically lies in the carriage prevalence, and whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focus on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme in this study is on the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented, which substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increase. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.