10 results for Parameters estimation

in Helda - Digital Repository of University of Helsinki


Relevance:

30.00%

Publisher:

Abstract:

This study examines the properties of Generalised Regression (GREG) estimators for domain class frequencies and proportions. The family of GREG estimators forms the class of design-based model-assisted estimators. All GREG estimators utilise auxiliary information via modelling. The classic GREG estimator with a linear fixed effects assisting model (GREG-lin) is one example. But when estimating class frequencies, the study variable is binary or polytomous. Therefore, logistic-type assisting models (e.g. logistic or probit model) should be preferred over the linear one. However, GREG estimators other than GREG-lin are rarely used, and knowledge about their properties is limited. This study examines the properties of L-GREG estimators, which are GREG estimators with fixed-effects logistic-type models. Three research questions are addressed. First, I study whether and when L-GREG estimators are more accurate than GREG-lin. Theoretical results and Monte Carlo experiments which cover both equal and unequal probability sampling designs and a wide variety of model formulations show that in standard situations, the difference between L-GREG and GREG-lin is small. But in the case of a strong assisting model, two interesting situations arise: if the domain sample size is reasonably large, L-GREG is more accurate than GREG-lin, and if the domain sample size is very small, estimation of assisting model parameters may be inaccurate, resulting in bias for L-GREG. Second, I study variance estimation for the L-GREG estimators. The standard variance estimator (S) for all GREG estimators resembles the Sen-Yates-Grundy variance estimator, but it is a double sum of prediction errors, not of the observed values of the study variable. Monte Carlo experiments show that S underestimates the variance of L-GREG, especially if the domain sample size is small or if the assisting model is strong.
Third, since the standard variance estimator S often fails for the L-GREG estimators, I propose a new augmented variance estimator (A). The difference between S and the new estimator A is that the latter takes into account the difference between the sample fit model and the census fit model. In Monte Carlo experiments, the new estimator A outperformed the standard estimator S in terms of bias, root mean square error and coverage rate. Thus the new estimator provides a good alternative to the standard estimator.
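
The model-assisted idea described in this abstract can be illustrated with a minimal sketch. It assumes the textbook GREG form for a domain total (sum of model predictions over the domain population plus the design-weighted sum of sample residuals in the domain); the abstract itself does not give the formulas, so the function names, the Newton-Raphson fitting routine and the simulated data are all hypothetical choices made for illustration, not the author's implementation.

```python
import numpy as np


def fit_linear(X, y, w):
    """Weighted least squares fit; returns a prediction function (GREG-lin style)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return lambda X_new: X_new @ beta


def fit_logistic(X, y, w, n_iter=25):
    """Weighted logistic regression via Newton-Raphson (L-GREG style assisting model)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        # Tiny ridge term keeps the linear solve stable; an illustrative choice.
        beta = beta + np.linalg.solve(hess + 1e-8 * np.eye(X.shape[1]), grad)
    return lambda X_new: 1.0 / (1.0 + np.exp(-(X_new @ beta)))


def greg_domain_total(X_pop, in_domain, sample_idx, y_sample, weights, fit):
    """GREG estimate of a domain total for a binary study variable.

    X_pop      : auxiliary data for every population unit (N x p)
    in_domain  : boolean array of length N marking the domain
    sample_idx : integer indices of the sampled units
    y_sample   : observed study variable for the sampled units
    weights    : design weights (inverse inclusion probabilities) of the sample
    fit        : fit_linear or fit_logistic
    """
    predict = fit(X_pop[sample_idx], y_sample, weights)
    y_hat = predict(X_pop)
    sampled_in_domain = in_domain[sample_idx]
    synthetic = y_hat[in_domain].sum()            # predictions over the domain population
    residuals = y_sample - y_hat[sample_idx]      # prediction errors in the sample
    correction = (weights * residuals)[sampled_in_domain].sum()
    return synthetic + correction


if __name__ == "__main__":
    # Simulated population and a simple random sample (hypothetical data).
    rng = np.random.default_rng(1)
    N, n = 2000, 200
    x = rng.normal(size=N)
    X_pop = np.column_stack([np.ones(N), x])
    y_pop = (rng.random(N) < 1.0 / (1.0 + np.exp(-(0.3 + x)))).astype(float)
    in_domain = np.arange(N) % 4 == 0
    sample_idx = rng.choice(N, size=n, replace=False)
    weights = np.full(n, N / n)
    for name, fit in (("GREG-lin", fit_linear), ("L-GREG", fit_logistic)):
        est = greg_domain_total(X_pop, in_domain, sample_idx, y_pop[sample_idx], weights, fit)
        print(name, "estimate:", est, "true total:", y_pop[in_domain].sum())
```

Dividing the estimated total by the domain population size gives the corresponding proportion estimate. The sketch covers point estimation only; it does not implement the variance estimators S or A discussed in the abstract.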

Relevance:

20.00%

Publisher:

Abstract:

Fluid bed granulation is a key pharmaceutical process which improves many of the powder properties for tablet compression. Dry mixing, wetting and drying phases are included in the fluid bed granulation process. Granules of high quality can be obtained by understanding and controlling the critical process parameters by timely measurements. Physical process measurements and particle size data of a fluid bed granulator that are analysed in an integrated manner are included in process analytical technologies (PAT). Recent regulatory guidelines strongly encourage the pharmaceutical industry to apply scientific and risk management approaches to the development of a product and its manufacturing process. The aim of this study was to utilise PAT tools to increase the process understanding of fluid bed granulation and drying. Inlet air humidity levels and granulation liquid feed affect powder moisture during fluid bed granulation. Moisture influences many process, granule and tablet qualities. The approach in this thesis was to identify sources of variation that are mainly related to moisture. The aim was to determine correlations and relationships, and utilise the PAT and design space concepts for the fluid bed granulation and drying. Monitoring the material behaviour in a fluidised bed has traditionally relied on the observational ability and experience of an operator. There has been a lack of good criteria for characterising material behaviour during spraying and drying phases, even though the entire performance of a process and end product quality are dependent on it. The granules were produced in an instrumented bench-scale Glatt WSG5 fluid bed granulator. The effects of inlet air humidity and granulation liquid feed on the temperature measurements at different locations of a fluid bed granulator system were determined. This revealed dynamic changes in the measurements and enabled finding the optimal sites for process control.
The moisture originating from the granulation liquid and inlet air affected the temperature of the mass and pressure difference over granules. Moreover, the effects of inlet air humidity and granulation liquid feed rate on granule size were evaluated and compensatory techniques used to optimize particle size. Various end-point indication techniques of drying were compared. The ∆T method, which is based on thermodynamic principles, eliminated the effects of humidity variations and resulted in the most precise estimation of the drying end-point. The influence of fluidisation behaviour on drying end-point detection was determined. The feasibility of the ∆T method and thus the similarities of end-point moisture contents were found to be dependent on the variation in fluidisation between manufacturing batches. A novel parameter that describes behaviour of material in a fluid bed was developed. Flow rate of the process air and turbine fan speed were used to calculate this parameter and it was compared to the fluidisation behaviour and the particle size results. The design space process trajectories for smooth fluidisation based on the fluidisation parameters were determined. With this design space it is possible to avoid excessive fluidisation, improper fluidisation and bed collapse. Furthermore, various process phenomena and failure modes were observed with the in-line particle size analyser. Both a rapid increase and a decrease in granule size could be monitored in a timely manner. The fluidisation parameter and the pressure difference over filters were also discovered to express particle size when the granules had been formed. The various physical parameters evaluated in this thesis give valuable information on fluid bed process performance and increase the process understanding.

Relevance:

20.00%

Publisher:

Abstract:

Drug Analysis without Primary Reference Standards: Application of LC-TOFMS and LC-CLND to Biofluids and Seized Material

Primary reference standards for new drugs, metabolites, designer drugs or rare substances may not be obtainable within a reasonable period of time or their availability may also be hindered by extensive administrative requirements. Standards are usually costly and may have a limited shelf life. Finally, many compounds are not available commercially and sometimes not at all. A new approach within forensic and clinical drug analysis involves substance identification based on accurate mass measurement by liquid chromatography coupled with time-of-flight mass spectrometry (LC-TOFMS) and quantification by LC coupled with chemiluminescence nitrogen detection (LC-CLND) possessing equimolar response to nitrogen. Formula-based identification relies on the fact that the accurate mass of an ion from a chemical compound corresponds to the elemental composition of that compound. Single-calibrant nitrogen based quantification is feasible with a nitrogen-specific detector since approximately 90% of drugs contain nitrogen. A method was developed for toxicological drug screening in 1 ml urine samples by LC-TOFMS. A large target database of exact monoisotopic masses was constructed, representing the elemental formulae of reference drugs and their metabolites. Identification was based on matching the sample component's measured parameters with those in the database, including accurate mass and retention time, if available. In addition, an algorithm for isotopic pattern match (SigmaFit) was applied. Differences in ion abundance in urine extracts did not affect the mass accuracy or the SigmaFit values. For routine screening practice, a mass tolerance of 10 ppm and a SigmaFit tolerance of 0.03 were established. Seized street drug samples were analysed instantly by LC-TOFMS and LC-CLND, using a dilute-and-shoot approach.
In the quantitative analysis of amphetamine, heroin and cocaine findings, the mean relative difference between the results of LC-CLND and the reference methods was only 11%. In blood specimens, liquid-liquid extraction recoveries for basic lipophilic drugs were first established and the validity of the generic extraction recovery-corrected single-calibrant LC-CLND was then verified with proficiency test samples. The mean accuracy was 24% and 17% for plasma and whole blood samples, respectively, all results falling within the confidence range of the reference concentrations. Further, metabolic ratios for the opioid drug tramadol were determined in a pharmacogenetic study setting. Extraction recovery estimation, based on model compounds with similar physicochemical characteristics, produced clinically feasible results without reference standards.
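
The screening tolerances and the single-calibrant principle described in this abstract can be sketched in a few lines. The sketch assumes the usual definition of mass error in parts per million, assumes that a lower SigmaFit value means a better isotopic pattern match, and assumes a strictly equimolar nitrogen response for the CLND detector. All function names, parameter names and numeric inputs are hypothetical; only the 10 ppm and 0.03 tolerances come from the abstract.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def is_candidate_match(measured_mz, theoretical_mz, sigma_fit,
                       mass_tol_ppm=10.0, sigma_tol=0.03):
    """Apply the screening tolerances quoted in the abstract.

    Assumes a lower isotopic pattern score (sigma_fit) indicates a better match.
    """
    within_mass = abs(ppm_error(measured_mz, theoretical_mz)) <= mass_tol_ppm
    return within_mass and sigma_fit <= sigma_tol


def clnd_concentration(area_analyte, n_nitrogen_analyte,
                       area_calibrant, conc_calibrant, n_nitrogen_calibrant,
                       recovery=1.0):
    """Single-calibrant quantification assuming equimolar response to nitrogen.

    conc_calibrant is a molar concentration; the result has the same unit.
    recovery is the extraction recovery as a fraction (1.0 means no correction).
    """
    # Detector response per mole of nitrogen, taken from the calibrant.
    response_per_nitrogen = area_calibrant / (conc_calibrant * n_nitrogen_calibrant)
    measured = area_analyte / (response_per_nitrogen * n_nitrogen_analyte)
    return measured / recovery


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(is_candidate_match(300.0015, 300.0000, sigma_fit=0.02))
    print(clnd_concentration(area_analyte=50.0, n_nitrogen_analyte=1,
                             area_calibrant=400.0, conc_calibrant=10.0,
                             n_nitrogen_calibrant=4, recovery=0.5))
```

The recovery argument mirrors the recovery-corrected quantification mentioned for blood specimens: the measured concentration is divided by the estimated extraction recovery fraction.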

Relevance:

20.00%

Publisher:

Abstract:

In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for statistical inference from ascertained data: conditional and full likelihood approaches for a disease with a variable age at onset phenotype, using familial data obtained from a population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using a latent variable approach, as well as for prospective genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Despite the fact that this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not provide a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, a reference sample of healthy subjects and birth cohort information of the Finnish population.
Finally, a substantial familial variation in the susceptibility to T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.

Relevance:

20.00%

Publisher:

Abstract:

There is an increasing need to compare the results obtained with different methods of estimation of tree biomass in order to reduce the uncertainty in the assessment of forest biomass carbon. In this study, tree biomass was investigated in a 30-year-old Scots pine (Pinus sylvestris) (Young-Stand) and a 130-year-old mixed Norway spruce (Picea abies)-Scots pine stand (Mature-Stand) located in southern Finland (61°50' N, 24°22' E). In particular, a comparison of the results of different estimation methods was conducted to assess the reliability and suitability of their applications. For the trees in Mature-Stand, annual stem biomass increment fluctuated following a sigmoid equation, and the fitting curves reached a maximum level (from about 1 kg/yr for understorey spruce to 7 kg/yr for dominant pine) when the trees were 100 years old. Tree biomass was estimated to be about 70 Mg/ha in Young-Stand and about 220 Mg/ha in Mature-Stand. In the region (58.00-62.13 °N, 14-34 °E, ≤ 300 m a.s.l.) surrounding the study stands, the tree biomass accumulation in Norway spruce and Scots pine stands followed a sigmoid equation with stand age, with a maximum of 230 Mg/ha at the age of 140 years. In Mature-Stand, lichen biomass on the trees was 1.63 Mg/ha with more than half of the biomass occurring on dead branches, and the standing crop of litter lichen on the ground was about 0.09 Mg/ha. There were substantial differences among the results estimated by different methods in the stands. These results imply that a possible estimation error should be taken into account when calculating tree biomass in a stand with an indirect approach.

Relevance:

20.00%

Publisher:

Abstract:

This thesis examines the feasibility of a forest inventory method based on two-phase sampling in estimating forest attributes at the stand or substand levels for forest management purposes. The method is based on multi-source forest inventory combining auxiliary data consisting of remote sensing imagery or other geographic information and field measurements. Auxiliary data are utilized as first-phase data for covering all inventory units. Various methods were examined for improving the accuracy of the forest estimates. Pre-processing of auxiliary data in the form of correcting the spectral properties of aerial imagery was examined (I), as was the selection of aerial image features for estimating forest attributes (II). Various spatial units were compared for extracting image features in a remote sensing aided forest inventory utilizing very high resolution imagery (III). A number of data sources were combined and different weighting procedures were tested in estimating forest attributes (IV, V). Correction of the spectral properties of aerial images proved to be a straightforward and advantageous method for improving the correlation between the image features and the measured forest attributes. Testing different image features that can be extracted from aerial photographs (and other very high resolution images) showed that the images contain a wealth of relevant information that can be extracted only by utilizing the spatial organization of the image pixel values. Furthermore, careful selection of image features for the inventory task generally gives better results than inputting all extractable features to the estimation procedure. When the spatial units for extracting very high resolution image features were examined, an approach based on image segmentation generally showed advantages compared with a traditional sample plot-based approach. Combining several data sources resulted in more accurate estimates than any of the individual data sources alone. 
The best combined estimate can be derived by weighting the estimates produced by the individual data sources by the inverse values of their mean square errors. Despite the fact that the plot-level estimation accuracy in two-phase sampling inventory can be improved in many ways, the accuracy of forest estimates based mainly on single-view satellite and aerial imagery is a relatively poor basis for making stand-level management decisions.
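
The weighting rule stated in this abstract (weights proportional to the inverse of each source's mean square error) can be written out as a short sketch. The function name and the example numbers are hypothetical, and the combined-error formula in the code holds only under the added assumption that the individual estimates are unbiased with uncorrelated errors, which the abstract does not state.

```python
def combine_inverse_mse(estimates, mses):
    """Combine estimates of the same quantity using inverse-MSE weights.

    estimates : estimates of one forest attribute from different data sources
    mses      : mean square error of each estimate (all positive)
    Returns the combined estimate and its approximate MSE.
    """
    if len(estimates) != len(mses) or not estimates:
        raise ValueError("estimates and mses must be non-empty and of equal length")
    raw_weights = [1.0 / m for m in mses]
    total = sum(raw_weights)
    combined = sum(w * e for w, e in zip(raw_weights, estimates)) / total
    # Valid only if the estimates are unbiased and their errors are uncorrelated
    # (an assumption made for this sketch).
    combined_mse = 1.0 / total
    return combined, combined_mse


if __name__ == "__main__":
    # Illustrative stand volume estimates (m3/ha) from two sources; made-up numbers.
    print(combine_inverse_mse([100.0, 200.0], [1.0, 3.0]))
```

A source with a smaller mean square error pulls the combined value towards its own estimate, which is why the combination can be more accurate than any single source.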

Relevance:

20.00%

Publisher:

Abstract:

Pressurised hot water extraction (PHWE) exploits the unique temperature-dependent solvent properties of water, minimising the use of harmful organic solvents. Water is an environmentally friendly, cheap and easily available extraction medium. The effects of temperature, pressure and extraction time in PHWE have often been studied, but here the emphasis was on other parameters important for the extraction, most notably the dimensions of the extraction vessel and the stability and solubility of the analytes to be extracted. Non-linear data analysis and self-organising maps were employed in the data analysis to obtain correlations between the parameters studied, recoveries and relative errors. First, PHWE was combined on-line with liquid chromatography-gas chromatography (LC-GC), and the system was applied to the extraction and analysis of polycyclic aromatic hydrocarbons (PAHs) in sediment. The method is of superior sensitivity compared with the traditional methods, and only a small 10 mg sample was required for analysis. The commercial extraction vessels were replaced by laboratory-made stainless steel vessels because of some problems that arose. The performance of the laboratory-made vessels was comparable to that of the commercial ones. In an investigation of the effect of thermal desorption in PHWE, it was found that at lower temperatures (200 °C and 250 °C) the effect of thermal desorption is smaller than the effect of the solvating property of hot water. At 300 °C, however, thermal desorption is the main mechanism. The effect of the geometry of the extraction vessel on recoveries was studied with five specially constructed extraction vessels. In addition to the extraction vessel geometry, the sediment packing style and the direction of water flow through the vessel were investigated.
The geometry of the vessel was found to have only a minor effect on the recoveries, and the same was true of the sediment packing style and the direction of water flow through the vessel. These are good results because these parameters do not have to be carefully optimised before the start of extractions. Liquid-liquid extraction (LLE) and solid-phase extraction (SPE) were compared as trapping techniques for PHWE. LLE was more robust than SPE and it provided better recoveries and repeatabilities than did SPE. Problems related to blocking of the Tenax trap and unrepeatable trapping of the analytes were encountered in SPE. Thus, although LLE is more labour intensive, it can be recommended over SPE. The stabilities of the PAHs in aqueous solutions were measured using a batch-type reaction vessel. Degradation was observed at 300 °C even with the shortest heating time. Ketones, quinones and other oxidation products were observed. Although the conditions of the stability studies differed considerably from the extraction conditions in PHWE, the results indicate that the risk of analyte degradation must be taken into account in PHWE. The aqueous solubilities of acenaphthene, anthracene and pyrene were measured, first below and then above the melting point of the analytes. Measurements below the melting point were made to check that the equipment was working, and the results were compared with those obtained earlier. Good agreement was found between the measured and literature values. A new saturation cell was constructed for the solubility measurements above the melting point of the analytes because the flow-through saturation cell could not be used above the melting point. An exponential relationship was found between the solubilities measured for pyrene and anthracene and temperature.

Relevance:

20.00%

Publisher:

Abstract:

Remote sensing provides methods to infer land cover information over large geographical areas at a variety of spatial and temporal resolutions. Land cover is input data for a range of environmental models and information on land cover dynamics is required for monitoring the implications of global change. Such data are also essential in support of environmental management and policymaking. Boreal forests are a key component of the global climate and a major sink of carbon. The northern latitudes are expected to experience a disproportionate and rapid warming, which can have a major impact on vegetation at forest limits. This thesis examines the use of optical remote sensing for estimating aboveground biomass, leaf area index (LAI), tree cover and tree height in the boreal forests and tundra-taiga transition zone in Finland. The continuous fields of forest attributes are required, for example, to improve the mapping of forest extent. The thesis focuses on studying the feasibility of satellite data at multiple spatial resolutions, assessing the potential of multispectral, multiangular and multitemporal information, and providing a regional evaluation of global land cover data. Preprocessed ASTER, MISR and MODIS products are the principal satellite data. The reference data consist of field measurements, forest inventory data and fine resolution land cover maps. Fine resolution studies demonstrate how statistical relationships between biomass and satellite data are relatively strong in single-species, low-biomass mountain birch forests in comparison to higher biomass coniferous stands. The combination of forest stand data and fine resolution ASTER images provides a method for biomass estimation using medium resolution MODIS data. The multiangular data improve the accuracy of land cover mapping in the sparsely forested tundra-taiga transition zone, particularly in mires.
Similarly, multitemporal data improve the accuracy of coarse resolution tree cover estimates in comparison to single date data. Furthermore, the peak of the growing season is not necessarily the optimal time for land cover mapping in the northern boreal regions. The evaluated coarse resolution land cover data sets have considerable shortcomings in northernmost Finland and should be used with caution in similar regions. The quantitative reference data and upscaling methods for integrating multiresolution data are required for calibration of statistical models and evaluation of land cover data sets. The preprocessed image products have potential for wider use as they can considerably reduce the time and effort used for data processing.

Relevance:

20.00%

Publisher:

Abstract:

The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, the applications of this modern version of the MDL have been quite rare because of computational complexity problems, i.e., for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
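
To make the notion of stochastic complexity concrete, here is a minimal sketch for the simplest discrete case, the Bernoulli model class. It relies on the standard observation that the exponential sum over all binary sequences of length n collapses to a sum over the count of ones, because the maximised likelihood depends on the data only through that count. This is an illustration of the NML definition, not one of the dissertation's own techniques; the function names are hypothetical and code lengths are in nats.

```python
from math import comb, exp, log


def _count_log_ratio(count, n):
    """Return count * log(count / n), with the convention 0 * log 0 = 0."""
    return 0.0 if count == 0 else count * log(count / n)


def max_log_likelihood(k, n):
    """Maximised Bernoulli log-likelihood of a sequence with k ones out of n."""
    return _count_log_ratio(k, n) + _count_log_ratio(n - k, n)


def bernoulli_nml_normalizer(n):
    """Sum of maximised likelihoods over all binary sequences of length n.

    Sequences are grouped by their number of ones, so the sum has n + 1 terms
    instead of 2**n. Intended for moderate n; the binomial coefficient is
    converted to a float and overflows for very large n.
    """
    return sum(comb(n, k) * exp(max_log_likelihood(k, n)) for k in range(n + 1))


def stochastic_complexity(k, n):
    """NML code length (in nats) of a binary sequence with k ones out of n."""
    return -max_log_likelihood(k, n) + log(bernoulli_nml_normalizer(n))


if __name__ == "__main__":
    n = 20
    for k in (0, 5, 10):
        print(k, stochastic_complexity(k, n))
```

For model families with more parameters the grouped sum is no longer this cheap, which is the computational obstacle the abstract refers to.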