907 results for MAXIMUM-LIKELIHOOD-ESTIMATION


Relevance: 100.00%

Abstract:

While molecular and cellular processes are often modeled as stochastic processes, such as Brownian motion, chemical reaction networks and gene regulatory networks, there are few attempts to program a molecular-scale process to physically implement stochastic processes. DNA has been used as a substrate for programming molecular interactions, but its applications are restricted to deterministic functions, and unfavorable properties such as slow processing, thermal annealing, aqueous solvents and difficult readout limit them to proof-of-concept purposes. To date, whether there exists a molecular process that can be programmed to implement stochastic processes for practical applications remains unknown.

In this dissertation, a fully specified Resonance Energy Transfer (RET) network between chromophores is accurately fabricated via DNA self-assembly, and the exciton dynamics in the RET network physically implement a stochastic process, specifically a continuous-time Markov chain (CTMC), which has a direct mapping to the physical geometry of the chromophore network. Excited by a light source, a RET network generates random samples in the temporal domain in the form of fluorescence photons which can be detected by a photon detector. The intrinsic sampling distribution of a RET network is derived as a phase-type distribution configured by its CTMC model. The conclusion is that the exciton dynamics in a RET network implement a general and important class of stochastic processes that can be directly and accurately programmed and used for practical applications of photonics and optoelectronics. Different approaches to using RET networks exist with vast potential applications. As an entropy source that can directly generate samples from virtually arbitrary distributions, RET networks can benefit applications that rely on generating random samples such as 1) fluorescent taggants and 2) stochastic computing.
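The link between a CTMC and its phase-type sampling distribution can be illustrated with a minimal simulation sketch. The two-step network, its rates and the function name below are hypothetical and are not taken from the dissertation; the absorbing state stands in for the emission of a photon, and time units are arbitrary.

```python
import random

def sample_absorption_time(rates, start, absorbing, rng):
    """Simulate one CTMC trajectory and return the time until absorption.

    rates[s] maps each successor of state s to its transition rate.
    """
    state, t = start, 0.0
    while state not in absorbing:
        out = rates[state]
        total = sum(out.values())
        # The holding time in a state is exponential with the total exit rate.
        t += rng.expovariate(total)
        # The next state is chosen with probability proportional to its rate.
        state = rng.choices(list(out), weights=list(out.values()))[0]
    return t

# Hypothetical network: the exciton hops 0 -> 1, then leaves as a photon.
RATES = {
    0: {1: 2.0},
    1: {"photon": 1.0},
}

rng = random.Random(0)
samples = [sample_absorption_time(RATES, 0, {"photon"}, rng) for _ in range(20000)]
# For this series chain the exact mean absorption time is 1/2.0 + 1/1.0 = 1.5.
print(sum(samples) / len(samples))
```

The absorption times produced this way follow a phase-type distribution by definition; changing the rates or adding branches changes the shape of that distribution.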

By using RET networks between chromophores to implement fluorescent taggants with temporally coded signatures, the taggant design is not constrained by resolvable dyes and has a significantly larger coding capacity than spectrally or lifetime coded fluorescent taggants. Meanwhile, the taggant detection process becomes highly efficient, and the Maximum Likelihood Estimation (MLE) based taggant identification guarantees high accuracy even with only a few hundred detected photons.
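A minimal sketch of MLE-based identification follows, assuming each candidate taggant is described by a known arrival-time density. For brevity the densities here are plain exponentials with hypothetical rates, whereas the abstract describes phase-type distributions; the names `CANDIDATES` and `identify_taggant` are illustrative only.

```python
import math
import random

def log_likelihood(times, rate):
    """Log-likelihood of i.i.d. exponential arrival times with the given rate."""
    return len(times) * math.log(rate) - rate * sum(times)

def identify_taggant(times, candidates):
    """Return the candidate whose model maximizes the log-likelihood of the data."""
    return max(candidates, key=lambda name: log_likelihood(times, candidates[name]))

# Hypothetical taggant library: name -> decay rate (arbitrary units).
CANDIDATES = {"tag_A": 1.0, "tag_B": 2.0, "tag_C": 4.0}

rng = random.Random(1)
# Simulate 300 detected photons emitted by tag_B.
photons = [rng.expovariate(CANDIDATES["tag_B"]) for _ in range(300)]
print(identify_taggant(photons, CANDIDATES))
```

Because the log-likelihood sums over photons, the gap between the true candidate and the others grows roughly linearly with the photon count, which is why a few hundred photons can already separate well-spaced signatures.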

Meanwhile, RET-based sampling units (RSU) can be constructed to accelerate probabilistic algorithms for wide applications in machine learning and data analytics. Because probabilistic algorithms often rely on iteratively sampling from parameterized distributions, they can be inefficient in practice on the deterministic hardware traditional computers use, especially for high-dimensional and complex problems. As an efficient universal sampling unit, the proposed RSU can be integrated into a processor / GPU as specialized functional units or organized as a discrete accelerator to bring substantial speedups and power savings.

Relevance: 100.00%

Abstract:

People go through life making all kinds of decisions, and some of these decisions affect their demand for transportation: for example, their choices of where to live and where to work, how and when to travel, and which route to take. Transport-related choices are typically time dependent and characterized by a large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to exploit the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time-dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that reduce the estimation cost.
For example, we propose a decomposition method that not only opens up the possibility of mixing, but also speeds up the estimation for simple logit models, which also has implications for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
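To illustrate why choice probabilities in such models require a dynamic programming step, here is a minimal sketch of a logit-based link choice model on a small acyclic network. The network, its link utilities and the function names are hypothetical; the recursion used is the standard logsum value function, not necessarily the exact formulation of the thesis.

```python
import math

# Hypothetical acyclic network: node -> {next node: link utility (negative cost)}.
NETWORK = {
    "o": {"a": -1.0, "b": -1.5},
    "a": {"d": -2.0},
    "b": {"d": -1.0, "a": -0.2},
    "d": {},
}

def value(node, network, dest, cache=None):
    """Expected maximum utility V(node) = log sum_a exp(v(a|node) + V(a)), V(dest) = 0."""
    cache = {} if cache is None else cache
    if node == dest:
        return 0.0
    if node not in cache:
        cache[node] = math.log(sum(
            math.exp(u + value(nxt, network, dest, cache))
            for nxt, u in network[node].items()
        ))
    return cache[node]

def link_probabilities(node, network, dest):
    """Logit probability of each outgoing link: exp(v + V(next) - V(node))."""
    v_here = value(node, network, dest)
    return {
        nxt: math.exp(u + value(nxt, network, dest) - v_here)
        for nxt, u in network[node].items()
    }

print(link_probabilities("o", NETWORK, "d"))
```

In this simple logit case the product of link probabilities along a path equals a multinomial logit over all paths, so the value function replaces explicit path enumeration. In estimation, the link utilities would depend on parameters and the value function would be recomputed at each likelihood evaluation.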

Relevance: 100.00%

Abstract:

An RVE-based stochastic numerical model is used to calculate the permeability of randomly generated porous media at different values of the fiber volume fraction for the case of transverse flow in a unidirectional ply. Analysis of the numerical results shows that the permeability is not normally distributed. With the aim of proposing a new understanding of this particular topic, permeability data are fitted using both a mixture model and a unimodal distribution. Our findings suggest that permeability can be fitted well using a mixture model based on the lognormal and power law distributions. In the case of a unimodal distribution, it is found, using the maximum-likelihood estimation (MLE) method, that the generalized extreme value (GEV) distribution represents the best fit. Finally, an expression of the permeability as a function of the fiber volume fraction based on the GEV distribution is discussed in light of the previous results.
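Comparing candidate distributions by their maximized likelihood can be sketched as follows. This example swaps in the lognormal distribution for the GEV because its MLE has a closed form; the data values are hypothetical and are not the permeability results of the study.

```python
import math

def normal_max_loglik(xs):
    """Maximized normal log-likelihood; the MLE variance uses a 1/n divisor."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    # At the MLE the quadratic term of the log-likelihood reduces to n/2.
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def lognormal_max_loglik(xs):
    """Maximized lognormal log-likelihood: normal fit to log(x) plus the Jacobian term."""
    logs = [math.log(x) for x in xs]
    return normal_max_loglik(logs) - sum(logs)

# Hypothetical right-skewed, strictly positive sample (arbitrary units).
DATA = [0.8, 1.0, 1.1, 1.3, 1.6, 2.0, 2.7, 4.1, 7.5]

print("normal:   ", normal_max_loglik(DATA))
print("lognormal:", lognormal_max_loglik(DATA))
```

Both models have two parameters here, so the higher maximized log-likelihood also gives the lower AIC. A GEV fit would follow the same pattern but needs numerical optimization.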

Relevance: 100.00%

Abstract:

Background

Kidscreen-27 was developed as part of a cross-cultural European Union-funded project to standardise the measurement of children’s health-related quality of life. Yet, research has reported mixed evidence for the hypothesised 5-factor model, and no confirmatory factor analysis (CFA) has been conducted on the instrument with children of low socio-economic status (SES) across Ireland (Northern and Republic).

Method

The data for this study were collected as part of a clustered randomised controlled trial. A total of 663 (347 male, 315 female) 8–9-year-old children (M = 8.74, SD = .50) of low SES took part. A 5-factor and a modified 7-factor CFA model were specified using maximum likelihood estimation. A nested Chi-square difference test was conducted to compare the fit of the models. Internal consistency and floor and ceiling effects were also examined.

Results

CFA found that the hypothesised 5-factor model was an unacceptable fit. However, the modified 7-factor model was supported. A nested Chi-square difference test confirmed that the fit of the 7-factor model was significantly better than that of the 5-factor model. Internal consistency was unacceptable for just one scale. Ceiling effects were present in all but one of the factors.
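A nested Chi-square difference test can be sketched as below. The fit statistics passed in at the end are hypothetical placeholders, since the abstract does not report them; the tail probability is computed from the standard recurrence for the incomplete gamma function so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form base cases for df = 1 (odd) and df = 2 (even).
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(z)), 0.5
    else:
        q, a = math.exp(-z), 1.0
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    d_chi2 = chi2_restricted - chi2_full
    d_df = df_restricted - df_full
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)

# Hypothetical values: restricted (5-factor) versus less restricted (7-factor) model.
print(chi2_difference_test(900.0, 314, 600.0, 303))
```

A small p-value indicates that the less restricted model fits significantly better than the restricted one.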

Conclusions

Future research should apply the 7-factor model with children of low socio-economic status. Such efforts would help monitor the health status of the population.

Relevance: 100.00%

Abstract:

The impacts of climate change are considered to be strong in countries located in tropical Africa that depend on agriculture for their food, income and livelihood. Therefore, a better understanding of the local dimensions of adaptation strategies is essential to develop appropriate measures that will mitigate adverse consequences. Hence, this study was conducted to identify the adaptation strategies that farm households most commonly practice, among a set of options, to withstand the effects of climate change, and to identify factors that affect the choice of climate change adaptation strategies in the Central Rift Valley of Ethiopia. To address this objective, a multivariate probit model was used. The results of the model indicated that the likelihoods of households adopting improved varieties of crops, adjusting planting date, crop diversification and soil conservation practices were 58.73%, 57.72%, 35.61% and 41.15%, respectively. The simulated maximum likelihood estimation of the multivariate probit model suggested that there was positive and significant interdependence between household decisions to adopt crop diversification and to use improved varieties of crops, and between adjusting planting date and using improved varieties of crops. The results also showed that there was a negative and significant relationship between household decisions to adopt crop diversification and soil conservation practices. The paper also identified household, socioeconomic, institutional and plot characteristics that facilitate or impede the choice of those adaptation strategies.
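The idea behind simulated likelihood for a multivariate probit can be shown with a crude frequency simulator for the two-decision case. This is only a sketch under stated assumptions: the index values and correlation are hypothetical, and practical estimators typically use a smooth simulator such as GHK rather than raw frequencies.

```python
import math
import random

def simulate_joint_adoption(xb1, xb2, rho, rng, draws=100_000):
    """Frequency-simulated P(y1 = 1, y2 = 1) in a bivariate probit model.

    xb1 and xb2 are the linear indices; rho is the error correlation.
    """
    root = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e1 = z1
        e2 = rho * z1 + root * z2  # standard normal, correlated with e1 at rho
        if xb1 + e1 > 0 and xb2 + e2 > 0:
            hits += 1
    return hits / draws

rng = random.Random(0)
# With zero indices the exact value is 1/4 + asin(rho) / (2 * pi).
print(simulate_joint_adoption(0.0, 0.0, 0.5, rng))
```

A positive error correlation raises the probability that both strategies are adopted together, which is the kind of interdependence the estimated correlations describe.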

Relevance: 100.00%

Abstract:

This work represents an original contribution to the methodology for ecosystem models' development as well as the first attempt at an end-to-end (E2E) model of the Northern Humboldt Current Ecosystem (NHCE). The main purpose of the developed model is to build a tool for ecosystem-based management and decision making, which is why the credibility of the model is essential; this can be assessed through confrontation with data. Additionally, the NHCE exhibits high climatic and oceanographic variability at several scales, the major source of interannual variability being the interruption of the upwelling seasonality by the El Niño Southern Oscillation, which has direct effects on larval survival and fish recruitment success. Fishing activity can also be highly variable, depending on the abundance and accessibility of the main fishery resources. This context raises the two main methodological questions addressed in this thesis, through the development of an end-to-end model coupling the high trophic level model OSMOSE to the hydrodynamics and biogeochemical model ROMS-PISCES: i) how to calibrate ecosystem models using time series data and ii) how to incorporate the impact of the interannual variability of the environment and fishing. First, this thesis highlights some issues related to the confrontation of complex ecosystem models with data and proposes a methodology for a sequential multi-phase calibration of ecosystem models. We propose two criteria to classify the parameters of a model: the model dependency and the time variability of the parameters. Then, these criteria, along with the availability of approximate initial estimates, are used as decision rules to determine which parameters need to be estimated, and their precedence order in the sequential calibration process.
Additionally, a new evolutionary algorithm designed for the calibration of stochastic models (e.g., individual-based models) and optimized for maximum likelihood estimation has been developed and applied to the calibration of the OSMOSE model to time series data. The environmental variability is explicit in the model: the ROMS-PISCES model forces the OSMOSE model and drives potential bottom-up effects up the food web through plankton and fish trophic interactions, as well as through changes in the spatial distribution of fish. The latter effect was taken into account using presence/absence species distribution models, which are traditionally assessed through a confusion matrix and the statistical metrics associated with it. However, when considering the prediction of the habitat over time, the variability in the spatial distribution of the habitat can be summarized and validated using the patterns emerging from the shape of the spatial distributions. We modeled the potential habitat of the main species of the Humboldt Current Ecosystem using several sources of information (fisheries, scientific surveys and satellite monitoring of vessels) jointly with environmental data from remote sensing and in situ observations, from 1992 to 2008. The potential habitat was predicted over the study period with monthly resolution, and the model was validated using quantitative and qualitative information on the system using a pattern-oriented approach. The final ROMS-PISCES-OSMOSE E2E ecosystem model for the NHCE was calibrated using our evolutionary algorithm and a likelihood approach to fit monthly time series data of landings, abundance indices and catch-at-length distributions from 1992 to 2008. To conclude, some potential applications of the model for fishery management are presented, and their limitations and perspectives are discussed.
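The general idea of calibrating a model by evolutionary search over a likelihood can be sketched with a generic (mu + lambda) scheme. This is not the algorithm developed in the thesis: the Poisson model, the observations and every parameter value below are hypothetical, and the likelihood here is deterministic, whereas the thesis targets stochastic simulators.

```python
import math
import random

# Hypothetical observed counts; their mean (4.0) is the analytic Poisson MLE.
COUNTS = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]

def log_likelihood(rate, counts=COUNTS):
    """Poisson log-likelihood of the observed counts for a given rate."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)

def evolve(fitness, low, high, rng, pop_size=20, n_parents=5, generations=60):
    """(mu + lambda) evolutionary search for the scalar that maximizes fitness."""
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    sigma = (high - low) / 10.0
    for _ in range(generations):
        # Selection: keep the fittest candidates as parents (elitism).
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        # Variation: Gaussian mutation of randomly chosen parents, clamped to bounds.
        children = [
            min(high, max(low, rng.gauss(rng.choice(parents), sigma)))
            for _ in range(pop_size - n_parents)
        ]
        population = parents + children
        sigma *= 0.9  # shrink the mutation step as the search converges
    return max(population, key=fitness)

estimate = evolve(log_likelihood, 0.1, 20.0, random.Random(0))
print(estimate)
```

Because the search only needs likelihood evaluations and no gradients, the same loop applies when the likelihood comes from a simulation model, at the cost of many model runs.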

Relevance: 100.00%

Abstract:

This paper introduces a new stochastic clustering methodology devised for the analysis of categorized or sorted data. The methodology reveals consumers' common category knowledge as well as individual differences in using this knowledge for classifying brands in a designated product class. A small study involving the categorization of 28 brands of U.S. automobiles is presented where the results of the proposed methodology are compared with those obtained from KMEANS clustering. Finally, directions for future research are discussed.

Relevance: 100.00%

Abstract:

Master's degree in Accounting and Financial Analysis

Relevance: 100.00%

Abstract:

Objectives: Because there is scientific evidence that an appropriate intake of dietary fibre should be part of a healthy diet, given its importance in promoting health, the present study aimed to develop and validate an instrument to evaluate the knowledge of the general population about dietary fibres. Study design: The present study was a cross-sectional study. Methods: The methodological study of psychometric validation was conducted with 6010 participants residing in ten countries on 3 continents. The instrument is a self-response questionnaire aimed at collecting information on knowledge about dietary fibres. For exploratory factor analysis (EFA), principal component analysis was chosen, using varimax orthogonal rotation and retaining eigenvalues greater than 1. In confirmatory factor analysis by structural equation modelling (SEM), the covariance matrix was considered and the Maximum Likelihood Estimation algorithm was adopted for parameter estimation. Results: Exploratory factor analysis retained two factors. The first was called Dietary Fibre and Promotion of Health (DFPH) and included 7 questions that explained 33.94% of total variance (α = 0.852). The second was named Sources of Dietary Fibre (SDF) and included 4 questions that explained 22.46% of total variance (α = 0.786). The model was tested by SEM, giving a final solution with four questions in each factor. This model showed a very good fit in practically all the indexes considered, except for the ratio χ²/df. The values of average variance extracted (0.458 and 0.483) demonstrate the existence of convergent validity; the results also prove the existence of discriminant validity of the factors (r² = 0.028), and finally good internal consistency was confirmed by the values of composite reliability (0.854 and 0.787). Conclusions: This study allowed validating the KADF scale, increasing the degree of confidence in the information obtained through this instrument in this and in future studies.
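Average variance extracted (AVE) and composite reliability (CR) are computed from the standardized factor loadings. The sketch below uses the usual formulas, AVE as the mean squared loading and CR as the squared sum of loadings over itself plus the summed error variances; the loadings are hypothetical, since the abstract does not report them.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    # For standardized loadings, each item's error variance is 1 - loading^2.
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)

# Hypothetical standardized loadings for a four-item factor.
LOADINGS = [0.7, 0.7, 0.7, 0.7]

print(average_variance_extracted(LOADINGS))
print(composite_reliability(LOADINGS))
```

Discriminant validity is then commonly checked by comparing each factor's AVE with the squared correlation between the factors.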

Relevance: 100.00%

Abstract:

In this research the integration of nanostructures and micro-scale devices was investigated using silica nanowires to develop a simple yet robust nanomanufacturing technique for improving the detection parameters of chemical and biological sensors. This has been achieved with the use of a dielectric barrier layer to restrict nanowire growth to site-specific locations, which has removed the need for post-growth processing by making it possible to place nanostructures on pre-patterned substrates. Nanowires were synthesized using the Vapor-Liquid-Solid growth method. Process parameters (temperature and time) and manufacturing aspects (structural integrity and biocompatibility) were investigated. Silica nanowires were observed experimentally to determine how their physical and chemical properties could be tuned for integration into existing sensing structures. Growth kinetic experiments performed using gold and palladium catalysts at 1050 °C for 60 minutes in an open-tube furnace yielded dense and consistent silica nanowire growth. This consistent growth led to the development of growth model fitting through the use of Maximum Likelihood Estimation (MLE) and Bayesian hierarchical modeling. Transmission electron microscopy studies revealed the nanowires to be amorphous, and X-ray diffraction confirmed the composition to be SiO2. Silica nanowires were monitored in epithelial breast cancer media using impedance spectroscopy to test biocompatibility, given their potential in vivo use as a diagnostic aid. It was found that palladium-catalyzed silica nanowires were toxic to breast cancer cells; however, nanowires were inert at 1 µg/mL concentrations. Additionally, a method for direct nanowire integration was developed that allowed silica nanowires to be grown directly into interdigitated sensing structures. This technique eliminates the need for physical nanowire transfer, thus preserving nanowire structure and performance integrity, and further reduces fabrication cost.
Successful nanowire integration was physically verified using Scanning electron microscopy and confirmed electrically using Electrochemical Impedance Spectroscopy of immobilized Prostate Specific Antigens (PSA). The experiments performed above serve as a guideline to addressing the metallurgic challenges in nanoscale integration of materials with varying composition and to understanding the effects of nanomaterials on biological structures that come in contact with the human body.

Relevance: 100.00%

Abstract:

This research develops an econometric framework to analyze time series processes with bounds. The framework is general enough that it can incorporate several different kinds of bounding information that constrain continuous-time stochastic processes between discretely-sampled observations. It applies to situations in which the process is known to remain within an interval between observations, by way of either a known constraint or through the observation of extreme realizations of the process. The main statistical technique employs the theory of maximum likelihood estimation. This approach leads to the development of the asymptotic distribution theory for the estimation of the parameters in bounded diffusion models. The results of this analysis present several implications for empirical research. The advantages are realized in the form of efficiency gains, bias reduction and in the flexibility of model specification. A bias arises in the presence of bounding information that is ignored, while it is mitigated within this framework. An efficiency gain arises, in the sense that the statistical methods make use of conditioning information, as revealed by the bounds. Further, the specification of an econometric model can be uncoupled from the restriction to the bounds, leaving the researcher free to model the process near the bound in a way that avoids bias from misspecification. One byproduct of the improvements in model specification is that the more precise model estimation exposes other sources of misspecification. Some processes reveal themselves to be unlikely candidates for a given diffusion model, once the observations are analyzed in combination with the bounding information. A closer inspection of the theoretical foundation behind diffusion models leads to a more general specification of the model. This approach is used to produce a set of algorithms to make the model computationally feasible and more widely applicable. 
Finally, the modeling framework is applied to a series of interest rates, which, for several years, have been constrained by the lower bound of zero. The estimates from a series of diffusion models suggest a substantial difference in estimation results between models that ignore bounds and the framework that takes bounding information into consideration.

Relevance: 100.00%

Abstract:

Despite recent methodological advances in inferring the time-scale of biological evolution from molecular data, the fundamental question of whether our substitution models are sufficiently well specified to accurately estimate branch-lengths has received little attention. I examine this implicit assumption of all molecular dating methods, on a vertebrate mitochondrial protein-coding dataset. Comparison with analyses in which the data are RY-coded (AG → R; CT → Y) suggests that even rates-across-sites maximum likelihood greatly under-compensates for multiple substitutions among the standard (ACGT) NT-coded data, which has been subject to greater phylogenetic signal erosion. Accordingly, the fossil record indicates that branch-lengths inferred from the NT-coded data translate into divergence time overestimates when calibrated from deeper in the tree. Intriguingly, RY-coding led to the opposite result. The underlying NT and RY substitution model misspecifications likely relate respectively to “hidden” rate heterogeneity and changes in substitution processes across the tree, for which I provide simulated examples. Given the magnitude of the inferred molecular dating errors, branch-length estimation biases may partly explain current conflicts with some palaeontological dating estimates.
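RY-coding as defined above (AG → R; CT → Y) can be sketched in a few lines. The helper names and the example sequences are illustrative only; characters other than A, C, G and T, such as gaps, are passed through unchanged.

```python
# Purines (A, G) map to R; pyrimidines (C, T) map to Y.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})

def ry_code(sequence):
    """Recode a nucleotide sequence into the two-state RY alphabet."""
    return sequence.upper().translate(RY_MAP)

def count_differences(seq1, seq2):
    """Number of aligned positions at which two sequences differ."""
    return sum(a != b for a, b in zip(seq1, seq2))

# Hypothetical aligned sequences that differ only by transitions (A<->G).
s1, s2 = "ACGT", "GCAT"
print(count_differences(s1, s2))                    # nucleotide (NT) coding
print(count_differences(ry_code(s1), ry_code(s2)))  # RY coding
```

Transitions disappear under RY-coding, so only transversions contribute to the inferred distances. This is why the recoded data are less affected by saturation from multiple substitutions.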

Relevance: 100.00%

Abstract:

In this paper, we propose a novel online hidden Markov model (HMM) parameter estimator based on Kerridge inaccuracy rate (KIR) concepts. Under mild identifiability conditions, we prove that our online KIR-based estimator is strongly consistent. In simulation studies, we illustrate the convergence behaviour of our proposed online KIR-based estimator and provide a counter-example illustrating the local convergence properties of the well known recursive maximum likelihood estimator (arguably the best existing solution).