994 results for data hiding


Relevance:

20.00%

Abstract:

The possible temporal periodicity in the age distribution of Earth's impact craters has provoked wide discussion ever since the phenomenon was first reported in a series of respected scientific articles in 1984. Although current knowledge makes it questionable whether the observed periodicity reflects a real physical phenomenon, it is nevertheless possible that the periodicity truly exists and could be detected with a larger and more accurate impact crater data set. In this study, simulated temporal density and cumulative distribution functions of craters were created for cases in which craters form through either a fully periodic or a fully random process. In addition to these two extreme cases, distributions were also created for two combinations of them. These models also make it possible to account for various inaccuracies in crater age determination. From these distributions, simulated time series of crater ages of varying lengths were generated. Finally, the Rayleigh method was used to search the simulated time series for the periodicity present in the underlying distribution. Based on our study, detecting temporal periodicity in crater time series is nearly impossible if only one third of the craters are caused by a periodic phenomenon, even if a larger and more accurate data set than the current one were to become available in the future. If two thirds of meteorite impacts are periodic, detection is possible but requires a considerably more comprehensive crater data set than is currently available. On the basis of the study, there is reason to suspect that the observed temporal periodicity of craters is not a real phenomenon.
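
The simulate-then-test procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the Gaussian dating error, and the normalisation of the Rayleigh power as (C^2 + S^2) / n^2 are assumptions made here, and the period and time span used with it are hypothetical.

```python
import math
import random


def rayleigh_power(ages, period):
    """Rayleigh power (C^2 + S^2) / n^2 of the ages folded at a trial period.

    Values near 1 mean the ages share a common phase; values near 1/n
    are what a purely random series is expected to give.
    """
    n = len(ages)
    c = sum(math.cos(2.0 * math.pi * t / period) for t in ages)
    s = sum(math.sin(2.0 * math.pi * t / period) for t in ages)
    return (c * c + s * s) / (n * n)


def simulate_ages(n, period, periodic_fraction, sigma, span, rng):
    """Draw n crater ages from a mixture of periodic and random events.

    Assumption for illustration: periodic events fall on multiples of
    `period` and are blurred by a Gaussian dating error `sigma`;
    random events are uniform on [0, span).
    """
    n_cycles = int(span // period)
    ages = []
    for _ in range(n):
        if rng.random() < periodic_fraction:
            cycle = rng.randrange(n_cycles)
            ages.append(cycle * period + rng.gauss(0.0, sigma))
        else:
            ages.append(rng.uniform(0.0, span))
    return ages
```

Calling `simulate_ages` with `periodic_fraction` set to 1/3 or 2/3 and scanning `rayleigh_power` over a range of trial periods mirrors the two mixed cases discussed in the abstract.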

Relevance:

20.00%

Abstract:

Marine species generally have large population sizes, continuous distributions and high dispersal capacity. Despite this, they are often subdivided into separate populations, which are the basic units of fisheries management. For example, populations of some fisheries species across the deep water of the Timor Trench are genetically different, implying minimal movement and interbreeding. When connectivity is higher than in the Timor Trench example, but not so high that the populations become one, connectivity between populations is crinkled. Crinkled connectivity occurs when migration is above the threshold required to link populations genetically, but below the threshold for demographic links. In future, genetic estimates of connectivity over crinkled links could be combined with other data, such as estimates of population size and tagging and tracking data, to quantify demographic connectedness between these types of populations. Elasmobranch species may be ideal targets for this research because connectivity between populations is more likely to be crinkled than for finfish species. Fisheries stock-assessment models could be strengthened with estimates of connectivity to improve the strategic and sustainable harvesting of biological resources.

Relevance:

20.00%

Abstract:

We derive a new method for determining size-transition matrices (STMs) that eliminates probabilities of negative growth and accounts for individual variability. STMs are an important part of size-structured models, which are used in the stock assessment of aquatic species. The elements of STMs represent the probability of growth from one size class to another, given a time step. The growth increment over this time step can be modelled with a variety of methods, but when a population construct is assumed for the underlying growth model, the resulting STM may contain entries that predict negative growth. To solve this problem, we use a maximum likelihood method that incorporates individual variability in the asymptotic length, relative age at tagging, and measurement error to obtain von Bertalanffy growth model parameter estimates. The statistical moments for the future length given an individual’s previous length measurement and time at liberty are then derived. We moment match the true conditional distributions with skewed-normal distributions and use these to accurately estimate the elements of the STMs. The method is investigated with simulated tag–recapture data and tag–recapture data gathered from the Australian eastern king prawn (Melicertus plebejus).
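
The negative-growth problem mentioned in this abstract can be seen directly from the standard population-level von Bertalanffy increment, (L_inf - L)(1 - exp(-k dt)). The sketch below only illustrates that problem; it is not the authors' maximum likelihood method, and the function name and parameter values are hypothetical.

```python
import math


def vb_expected_increment(length, l_inf, k, dt):
    """Population-level von Bertalanffy growth increment over a time step dt.

    Uses a single asymptotic length l_inf for every individual, so any
    animal already longer than l_inf is predicted to shrink.
    """
    return (l_inf - length) * (1.0 - math.exp(-k * dt))


# Hypothetical values: asymptotic length 45 mm, growth rate 0.5 per year.
small = vb_expected_increment(30.0, 45.0, 0.5, 1.0)   # positive growth
large = vb_expected_increment(50.0, 45.0, 0.5, 1.0)   # negative "growth"
```

Allowing the asymptotic length to vary between individuals, as the abstract describes, is one way to avoid the negative increment for large animals.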

Relevance:

20.00%

Abstract:

This article provides a review of techniques for the analysis of survival data arising from respiratory health studies. Popular techniques such as the Kaplan–Meier survival plot and the Cox proportional hazards model are presented and illustrated using data from a lung cancer study. Advanced issues are also discussed, including parametric proportional hazards models, accelerated failure time models, time-varying explanatory variables, simultaneous analysis of multiple types of outcome events and the restricted mean survival time, a novel measure of the effect of treatment.
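
As a reminder of what the Kaplan–Meier plot mentioned in this abstract is built from, here is a minimal product-limit estimator in plain Python. It is a sketch for illustration only (the function name and the toy data in the usage note are not from the article), and it uses the usual convention that subjects censored at a given time are still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up time of each subject
    events -- 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` steps down at times 1, 2 and 3 only, because the subjects censored at times 2 and 4 leave the risk set without changing the estimate.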

Relevance:

20.00%

Abstract:

The average dimensions of the peptide unit have been obtained from the data reported in recent crystal structure analyses of di- and tripeptides. The bond lengths and bond angles agree with those in common use, except for the bond angle C–N–H, which is about 4° less than the accepted value, and the angle C2α–N–H, which is about 4° more. The angle τ (Cα) has a mean value of 114° for glycyl residues and 110° for non-glycyl residues. Attention is directed to these mean values as observed in crystal structures, as they are relevant for model building of peptide chain structures.

Relevance:

20.00%

Abstract:

This research studied distributed computing of all-to-all comparison problems with big data sets. The thesis formalised the problem and developed a high-performance, scalable computing framework with a programming model, data distribution strategies and task scheduling policies to solve it. The study considered storage usage, data locality and load balancing to improve performance. The research outcomes can be applied in bioinformatics, biometrics, data mining and other domains in which all-to-all comparisons are a typical computing pattern.
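
The all-to-all comparison pattern itself is easy to state in code: every unordered pair of items must be compared once, giving n(n-1)/2 tasks to spread over workers. The sketch below is a naive baseline, not the framework developed in the thesis; the function names are hypothetical, and round-robin assignment deliberately ignores the data locality and storage concerns the abstract mentions.

```python
from itertools import combinations


def all_pairs(items):
    """Enumerate every unordered pair of items exactly once."""
    return list(combinations(items, 2))


def assign_round_robin(pairs, n_workers):
    """Spread comparison tasks evenly over workers.

    Balances task counts only; it does not try to place a task on a
    worker that already holds both data items.
    """
    buckets = [[] for _ in range(n_workers)]
    for index, pair in enumerate(pairs):
        buckets[index % n_workers].append(pair)
    return buckets
```

With 6 items and 4 workers this yields 15 tasks in buckets of size 4, 4, 4 and 3. A locality-aware scheduler would instead group pairs that share items, which is the kind of trade-off the abstract refers to.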

Relevance:

20.00%

Abstract:

The behaviour of laterally loaded piles is considerably influenced by the uncertainties in soil properties. Hence probabilistic models for assessment of allowable lateral load are necessary. Cone penetration test (CPT) data are often used to determine soil strength parameters, whereby the allowable lateral load of the pile is computed. In the present study, the maximum lateral displacement and moment of the pile are obtained based on the coefficient of subgrade reaction approach, considering the nonlinear soil behaviour in undrained clay. The coefficient of subgrade reaction is related to the undrained shear strength of soil, which can be obtained from CPT data. The soil medium is modelled as a one-dimensional random field along the depth, and it is described by the standard deviation and scale of fluctuation of the undrained shear strength of soil. Inherent soil variability, measurement uncertainty and transformation uncertainty are taken into consideration. The statistics of maximum lateral deflection and moment are obtained using the first-order, second-moment technique. Hasofer-Lind reliability indices for component and system failure criteria, based on the allowable lateral displacement and moment capacity of the pile section, are evaluated. The geotechnical database from the Konaseema site in India is used as a case example. It is shown that the reliability-based design approach for pile foundations, considering the spatial variability of soil, permits a rational choice of allowable lateral loads.
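
For readers unfamiliar with reliability indices, the simplest first-order, second-moment case is a linear margin g = R - S between an independent capacity R and demand S, for which the index is the mean of g divided by its standard deviation. The sketch below shows only that textbook case under those assumptions; it is not the paper's pile model, and the function name and numbers are hypothetical.

```python
import math


def fosm_reliability_index(mean_capacity, sd_capacity, mean_demand, sd_demand):
    """Reliability index for the linear margin g = R - S.

    Assumes R and S are uncorrelated, so Var(g) = Var(R) + Var(S).
    """
    mean_margin = mean_capacity - mean_demand
    sd_margin = math.sqrt(sd_capacity ** 2 + sd_demand ** 2)
    return mean_margin / sd_margin


# Hypothetical example: allowable displacement 25 +/- 4 mm against a
# computed maximum displacement of 10 +/- 3 mm.
beta = fosm_reliability_index(25.0, 4.0, 10.0, 3.0)
```

In the study itself the demand statistics come from the random-field soil model, so the inputs to such a calculation would be derived rather than assumed.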

Relevance:

20.00%

Abstract:

Deriving an estimate of optimal fishing effort, even an approximate one, is very valuable for managing fisheries with multiple target species. The most challenging task associated with this is allocating effort to individual species when only the total effort is recorded. Spatial information on the distribution of each species within a fishery can be used to justify the allocations, but often such information is not available. To determine the long-term overall effort required to achieve maximum sustainable yield (MSY) and maximum economic yield (MEY), we consider three methods for allocating effort: (i) optimal allocation, which optimally allocates effort among target species; (ii) fixed proportions, which chooses proportions based on past catch data; and (iii) economic allocation, which splits effort based on the expected catch value of each species. Determining the overall fishing effort required to achieve these management objectives is a maximization problem subject to constraints arising from economic and social considerations. We illustrated the approaches using a case study of the Moreton Bay Prawn Trawl Fishery in Queensland (Australia). The results were consistent across the three methods. Importantly, our analysis demonstrated that the optimal total effort was very sensitive to daily fishing costs: it fell from 9500–11 500 boat-days to 6000–7000, 4000 and 2500 boat-days at daily cost estimates of $0, $500, $750 and $950, respectively. The zero daily cost corresponds to the MSY, while a daily cost of $750 most closely represents the actual present fishing cost. Given the recent debate on which costs should be factored into the analyses for deriving MEY, our findings highlight the importance of including an appropriate cost function for practical management advice. The approaches developed here could be applied to other multispecies fisheries where only aggregated fishing effort data are recorded, as the literature on this type of modelling is sparse.

Relevance:

20.00%

Abstract:

Contamination of urban streams is a topic of growing interest worldwide, but the assessment and investigation of stormwater-induced contamination is limited by the large amount of water quality data needed to obtain reliable results. In this study, stream bed sediments were studied to determine their degree of contamination and their applicability in monitoring aquatic metal contamination in urban areas. The interpretation of sedimentary metal concentrations is, however, not straightforward, since the concentrations commonly show spatial and temporal variations as a response to natural processes. The variations of and controls on metal concentrations were examined at different scales to increase the understanding of the usefulness of sediment metal concentrations in detecting anthropogenic metal contamination patterns. The acid-extractable concentrations of Zn, Cu, Pb and Cd were determined from the surface sediments and water of small streams in the Helsinki Metropolitan region, southern Finland. The data consist of two datasets: sediment samples from 53 sites located in the catchment of the Stream Gräsanoja, and sediment and water samples from 67 independent catchments scattered around the metropolitan region. Moreover, the sediment samples were analyzed for their physical and chemical composition (e.g. total organic carbon, clay-%, Al, Li, Fe, Mn) and the speciation of metals (in the dataset of the Stream Gräsanoja). The metal concentrations revealed that the stream sediments were moderately contaminated and posed no immediate threat to the biota. However, at some sites the sediments appeared to be polluted with Cu or Zn. The metal concentrations increased with increasing intensity of urbanization, but site-specific factors, such as point sources, were responsible for the occurrence of the highest metal concentrations. The sediment analyses thus revealed a need for more detailed studies on the processes and factors that cause the hot spot metal concentrations.
The sediment composition and metal speciation analyses indicated that organic matter is a very strong indirect control on metal concentrations, and it should be accounted for when studying anthropogenic metal contamination patterns. The fine-scale spatial and temporal variations of metal concentrations were low enough to allow meaningful interpretation of substantial metal concentration differences between sites. Furthermore, the metal concentrations in the stream bed sediments correlated with the urbanization of the catchment better than the total metal concentrations in the water phase did. These results suggest that stream sediments show true potential for wider use in detecting the spatial differences in metal contamination of urban streams. Consequently, using the sediment approach, regional estimates of stormwater-related metal contamination could be obtained fairly cost-effectively, and the stability and reliability of the results would be higher than for analyses of single water samples. Nevertheless, water samples are essential for analysing the dissolved concentrations of metals, particularly momentary discharges from point sources.

Relevance:

20.00%

Abstract:

This thesis presents novel modelling applications for environmental geospatial data using remote sensing, GIS and statistical modelling techniques. The studies can be classified into four main themes: (i) developing advanced geospatial databases. Paper (I) demonstrates the creation of a geospatial database for the Glanville fritillary butterfly (Melitaea cinxia) in the Åland Islands, south-western Finland; (ii) analysing species diversity and distribution using GIS techniques. Paper (II) presents a diversity and geographical distribution analysis for Scopulini moths at a world-wide scale; (iii) studying spatiotemporal forest cover change. Paper (III) presents a study of exotic and indigenous tree cover change detection in the Taita Hills, Kenya, using airborne imagery and GIS analysis techniques; (iv) exploring predictive modelling techniques using geospatial data. In Paper (IV), human population occurrence and abundance in the Taita Hills highlands was predicted using the generalized additive modelling (GAM) technique. Paper (V) presents techniques to enhance fire prediction and burned area estimation at a regional scale in East Caprivi, Namibia. Paper (VI) compares eight state-of-the-art predictive modelling methods to improve fire prediction, burned area estimation and fire risk mapping in East Caprivi, Namibia. The results in Paper (I) showed that geospatial data can be managed effectively using advanced relational database management systems. Metapopulation data for the Melitaea cinxia butterfly were successfully combined with GPS-delimited habitat patch information and climatic data. Using the geospatial database, spatial analyses were successfully conducted at the habitat patch level or at coarser analysis scales. Moreover, this study showed that, at a large scale, spatially correlated weather conditions are one of the primary causes of spatially correlated changes in Melitaea cinxia population sizes.
In Paper (II), spatiotemporal characteristics of Scopulini moth description, diversity and distribution were analysed at a world-wide scale, and GIS techniques were used for the first time for Scopulini moth geographical distribution analysis. This study revealed that Scopulini moths have a cosmopolitan distribution. The majority of the species have been described from the low latitudes, sub-Saharan Africa being the hot spot of species diversity. However, the taxonomical effort has been uneven among biogeographical regions. Paper (III) showed that forest cover change can be analysed in great detail using modern airborne imagery techniques and historical aerial photographs. However, when spatiotemporal forest cover change is studied using historical black and white aerial photography, care has to be taken in co-registration and image interpretation. In Paper (IV), human population distribution and abundance could be modelled with fairly good results using geospatial predictors and non-Gaussian predictive modelling techniques. Moreover, a land cover layer is not necessarily needed as a predictor, because first- and second-order image texture measurements derived from satellite imagery had more power to explain the variation in dwelling unit occurrence and abundance. Paper (V) showed that the generalized linear model (GLM) is a suitable technique for fire occurrence prediction and for burned area estimation. GLM-based burned area estimates were found to be superior to the existing MODIS burned area product (MCD45A1). However, spatial autocorrelation of fires has to be taken into account when using the GLM technique for fire occurrence prediction. Paper (VI) showed that novel statistical predictive modelling techniques can be used to improve fire prediction, burned area estimation and fire risk mapping at a regional scale. However, there was noticeable variation between the predictive modelling techniques in fire occurrence prediction and burned area estimation.

Relevance:

20.00%

Abstract:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox developed herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly to justify the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can also be applied under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.

Relevance:

20.00%

Abstract:

The problem of recovering information from measurement data has been studied for a long time. In the beginning, the methods were mostly empirical, but towards the end of the 1960s Backus and Gilbert started the development of mathematical methods for the interpretation of geophysical data. The problem of recovering information about a physical phenomenon from measurement data is an inverse problem. Throughout this work, the statistical inversion method is used to obtain a solution. Assuming that the measurement vector is a realization of fractional Brownian motion, the goal is to retrieve the amplitude and the Hurst parameter. We prove that under some conditions, the solution of the discretized problem coincides with the solution of the corresponding continuous problem as the number of observations tends to infinity. The measurement data are usually noisy, and we assume the data to be the sum of two vectors: the trend and the noise. Both vectors are supposed to be realizations of fractional Brownian motions, and the goal is to retrieve their parameters using the statistical inversion method. We prove a partial uniqueness of the solution. Moreover, with the support of numerical simulations, we show that in certain cases the solution is reliable and the reconstruction of the trend vector is quite accurate.
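
The two parameters this abstract sets out to retrieve enter through the covariance of fractional Brownian motion, which for Hurst parameter H is (1/2)(|s|^2H + |t|^2H - |t - s|^2H). The sketch below evaluates that covariance; treating the amplitude as a multiplicative factor a on the process, so that the covariance scales by a^2, is an assumption made here for illustration and may differ from the parameterisation used in the work.

```python
def fbm_covariance(s, t, hurst, amplitude=1.0):
    """Covariance of amplitude * B_H(s) and amplitude * B_H(t).

    B_H is standard fractional Brownian motion with Hurst parameter
    0 < hurst < 1; hurst = 0.5 recovers ordinary Brownian motion.
    """
    h2 = 2.0 * hurst
    core = 0.5 * (abs(s) ** h2 + abs(t) ** h2 - abs(t - s) ** h2)
    return amplitude ** 2 * core


def fbm_covariance_matrix(times, hurst, amplitude=1.0):
    """Covariance matrix of the process sampled at the given times."""
    return [[fbm_covariance(s, t, hurst, amplitude) for t in times] for s in times]
```

A Gaussian likelihood built from `fbm_covariance_matrix` is what a statistical inversion for the amplitude and the Hurst parameter would evaluate at each candidate parameter pair.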

Relevance:

20.00%

Abstract:

Advancements in analysis techniques have led to a rapid accumulation of biological data in databases. Such data often take the form of sequences of observations, for example DNA sequences and amino acid sequences of proteins. The scale and quality of the data promise answers to various biologically relevant questions in more detail than has been possible before. For example, one may wish to identify regions in an amino acid sequence that are important for the function of the corresponding protein, or investigate how characteristics at the level of the DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet the challenge of deriving meaning from the increasing amounts of data. Our main focus is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets also pose a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters.
The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.
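
To illustrate what integrating the parameters out analytically can look like, consider one cluster of categorical observations (for instance, the nucleotides seen at one sequence position) with a symmetric Dirichlet prior on the category probabilities. That specific model is an assumption made here for illustration and is not stated in the abstract; under it, the marginal likelihood has the closed form computed below, and the function names are hypothetical.

```python
import math


def log_marginal_cluster(counts, alpha=1.0):
    """Log marginal likelihood of one cluster's category counts.

    Assumes categorical observations with a symmetric Dirichlet(alpha)
    prior, integrated out in closed form (Dirichlet-categorical).
    """
    k = len(counts)
    n = sum(counts)
    result = math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
    for c in counts:
        result += math.lgamma(alpha + c) - math.lgamma(alpha)
    return result


def log_marginal_partition(cluster_counts, alpha=1.0):
    """Score a partition as the sum of its clusters' log marginals."""
    return sum(log_marginal_cluster(c, alpha) for c in cluster_counts)
```

Because each cluster contributes an independent closed-form term, a stochastic search that moves one item between clusters only needs to recompute the two affected terms, which is one reason such searches can be efficient.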