Biblioteca Digital

949 resultados para Statistical model

Living with Terminal Capitalization Rates: A Look at Real Estate Valuation Model Parameter Setting

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper examines assumptions about future prices used in real estate applications of DCF models. We confirm both the widespread reliance on an ad hoc rule of increasing period-zero capitalization rates by 50 to 100 basis points to obtain terminal capitalization rates and the inability of the rule to project future real estate pricing. To understand how investors form expectations about future prices, we model the spread between the contemporaneously period-zero going-in and terminal capitalization rates and the spread between terminal rates assigned in period zero and going-in rates assigned in period N. Our regression results confirm statistical relationships between the terminal and next holding period going-in capitalization rate spread and the period-zero discount rate, although other economically significant variables are statistically insignificant. Linking terminal capitalization rates by assumption to going-in capitalization rates implies investors view future real estate pricing with myopic expectations. We discuss alternative specifications devoid of such linkage that align more with a rational expectations view of future real estate pricing.

Evaluating statistical and machine learning methods to predict risk of in-hospital child mortality in Uganda

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thesis (Master's)--University of Washington, 2016-08

Convex Optimization Algorithms and Statistical Bounds for Learning Structured Models

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Multivariate statistical modeling of an anode backing furnace - Modélisation statistique multivariée du four à cuisson des anodes utilisées dans la fabrication d'aluminium primaire

Relevância:

30.00% 30.00%

Publicador:

Resumo:

La stratégie actuelle de contrôle de la qualité de l’anode est inadéquate pour détecter les anodes défectueuses avant qu’elles ne soient installées dans les cuves d’électrolyse. Des travaux antérieurs ont porté sur la modélisation du procédé de fabrication des anodes afin de prédire leurs propriétés directement après la cuisson en utilisant des méthodes statistiques multivariées. La stratégie de carottage des anodes utilisée à l’usine partenaire fait en sorte que ce modèle ne peut être utilisé que pour prédire les propriétés des anodes cuites aux positions les plus chaudes et les plus froides du four à cuire. Le travail actuel propose une stratégie pour considérer l’histoire thermique des anodes cuites à n’importe quelle position et permettre de prédire leurs propriétés. Il est montré qu’en combinant des variables binaires pour définir l’alvéole et la position de cuisson avec les données routinières mesurées sur le four à cuire, les profils de température des anodes cuites à différentes positions peuvent être prédits. Également, ces données ont été incluses dans le modèle pour la prédiction des propriétés des anodes. Les résultats de prédiction ont été validés en effectuant du carottage supplémentaire et les performances du modèle sont concluantes pour la densité apparente et réelle, la force de compression, la réactivité à l’air et le Lc et ce peu importe la position de cuisson.

Nonparametric Evaluation of Quantitative Traits in Population-Based Association Studies when the Genetic Model is Unknown

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model, or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equi-distant genotype scores, Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (disbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range preserving confidence intervals as an alternative to Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control family-wise error rate in the presence of non-normality and variance heterogeneity contrary to the standard parametric approaches. We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.

Statistical modeling of protein lysate array data

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The protein lysate array is an emerging technology for quantifying the protein concentration ratios in multiple biological samples. It is gaining popularity, and has the potential to answer questions about post-translational modifications and protein pathway relationships. Statistical inference for a parametric quantification procedure has been inadequately addressed in the literature, mainly due to two challenges: the increasing dimension of the parameter space and the need to account for dependence in the data. Each chapter of this thesis addresses one of these issues. In Chapter 1, an introduction to the protein lysate array quantification is presented, followed by the motivations and goals for this thesis work. In Chapter 2, we develop a multi-step procedure for the Sigmoidal models, ensuring consistent estimation of the concentration level with full asymptotic efficiency. The results obtained in this chapter justify inferential procedures based on large-sample approximations. Simulation studies and real data analysis are used to illustrate the performance of the proposed method in finite-samples. The multi-step procedure is simpler in both theory and computation than the single-step least squares method that has been used in current practice. In Chapter 3, we introduce a new model to account for the dependence structure of the errors by a nonlinear mixed effects model. We consider a method to approximate the maximum likelihood estimator of all the parameters. Using the simulation studies on various error structures, we show that for data with non-i.i.d. errors the proposed method leads to more accurate estimates and better confidence intervals than the existing single-step least squares method.

THE PHYSICS OF IDEAS: INFERRING THE MECHANICS OF OPINION FORMATION FROM MACROSCOPIC STATISTICAL PATTERNS

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In a microscopic setting, humans behave in rich and unexpected ways. In a macroscopic setting, however, distinctive patterns of group behavior emerge, leading statistical physicists to search for an underlying mechanism. The aim of this dissertation is to analyze the macroscopic patterns of competing ideas in order to discern the mechanics of how group opinions form at the microscopic level. First, we explore the competition of answers in online Q&A (question and answer) boards. We find that a simple individual-level model can capture important features of user behavior, especially as the number of answers to a question grows. Our model further suggests that the wisdom of crowds may be constrained by information overload, in which users are unable to thoroughly evaluate each answer and therefore tend to use heuristics to pick what they believe is the best answer. Next, we explore models of opinion spread among voters to explain observed universal statistical patterns such as rescaled vote distributions and logarithmic vote correlations. We introduce a simple model that can explain both properties, as well as why it takes so long for large groups to reach consensus. An important feature of the model that facilitates agreement with data is that individuals become more stubborn (unwilling to change their opinion) over time. Finally, we explore potential underlying mechanisms for opinion formation in juries, by comparing data to various types of models. We find that different null hypotheses in which jurors do not interact when reaching a decision are in strong disagreement with data compared to a simple interaction model. These findings provide conceptual and mechanistic support for previous work that has found mutual influence can play a large role in group decisions. In addition, by matching our models to data, we are able to infer the time scales over which individuals change their opinions for different jury contexts. We find that these values increase as a function of the trial time, suggesting that jurors and judicial panels exhibit a kind of stubbornness similar to what we include in our model of voting behavior.

Analyzing change in medication use - statistical approaches

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The objective of this study was to gain an understanding of the effects of population heterogeneity, missing data, and causal relationships on parameter estimates from statistical models when analyzing change in medication use. From a public health perspective, two timely topics were addressed: the use and effects of statins in populations in primary prevention of cardiovascular disease and polypharmacy in older population. Growth mixture models were applied to characterize the accumulation of cardiovascular and diabetes medications among apparently healthy population of statin initiators. The causal effect of statin adherence on the incidence of acute cardiovascular events was estimated using marginal structural models in comparison with discrete-time hazards models. The impact of missing data on the growth estimates of evolution of polypharmacy was examined comparing statistical models under different assumptions for missing data mechanism. The data came from Finnish administrative registers and from the population-based Geriatric Multidisciplinary Strategy for the Good Care of the Elderly study conducted in Kuopio, Finland, during 2004–07. Five distinct patterns of accumulating medications emerged among the population of apparently healthy statin initiators during two years after statin initiation. Proper accounting for time-varying dependencies between adherence to statins and confounders using marginal structural models produced comparable estimation results with those from a discrete-time hazards model. Missing data mechanism was shown to be a key component when estimating the evolution of polypharmacy among older persons. In conclusion, population heterogeneity, missing data and causal relationships are important aspects in longitudinal studies that associate with the study question and should be critically assessed when performing statistical analyses. Analyses should be supplemented with sensitivity analyses towards model assumptions.

Activity-based life cycle costing model for cost estimation and monitoring – Case: manned and autonomous dry bulk carrier

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent developments in automation, robotics and artificial intelligence have given a push to a wider usage of these technologies in recent years, and nowadays, driverless transport systems are already state-of-the-art on certain legs of transportation. This has given a push for the maritime industry to join the advancement. The case organisation, AAWA initiative, is a joint industry-academia research consortium with the objective of developing readiness for the first commercial autonomous solutions, exploiting state-of-the-art autonomous and remote technology. The initiative develops both autonomous and remote operation technology for navigation, machinery, and all on-board operating systems. The aim of this study is to develop a model with which to estimate and forecast the operational costs, and thus enable comparisons between manned and autonomous cargo vessels. The building process of the model is also described and discussed. Furthermore, the model’s aim is to track and identify the critical success factors of the chosen ship design, and to enable monitoring and tracking of the incurred operational costs as the life cycle of the vessel progresses. The study adopts the constructive research approach, as the aim is to develop a construct to meet the needs of a case organisation. Data has been collected through discussions and meeting with consortium members and researchers, as well as through written and internal communications material. The model itself is built using activity-based life cycle costing, which enables both realistic cost estimation and forecasting, as well as the identification of critical success factors due to the process-orientation adopted from activity-based costing and the statistical nature of Monte Carlo simulation techniques. As the model was able to meet the multiple aims set for it, and the case organisation was satisfied with it, it could be argued that activity-based life cycle costing is the method with which to conduct cost estimation and forecasting in the case of autonomous cargo vessels. The model was able to perform the cost analysis and forecasting, as well as to trace the critical success factors. Later on, it also enabled, albeit hypothetically, monitoring and tracking of the incurred costs. By collecting costs this way, it was argued that the activity-based LCC model is able facilitate learning from and continuous improvement of the autonomous vessel. As with the building process of the model, an individual approach was chosen, while still using the implementation and model building steps presented in existing literature. This was due to two factors: the nature of the model and – perhaps even more importantly – the nature of the case organisation. Furthermore, the loosely organised network structure means that knowing the case organisation and its aims is of great importance when conducting a constructive research.

Vocabulary and language model adaptation using information retrieval

Relevância:

30.00% 30.00%

Publicador:

Resumo:

International audience

STATISTICAL METHODS FOR ADVANCED ECONOMETRIC MODELS WITH APPLICATIONS TO VEHICLE HOLDING AND SPEED QUANTILES DISTRIBUTION

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedure to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models explained. Numerical approximation methods based on the Genz algortithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability. A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Eventually, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that takes into account the survey design. All methods are rigorously validated on both real and simulated data.

Implementation of the CDM model to the analisys of the hotel sector

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The main aim of this study was to determine the impact of innovation on productivity in service sector companies — especially those in the hospitality sector — that value the reduction of environmental impact as relevant to the innovation process. We used a structural analysis model based on the one developed by Crépon, Duguet, and Mairesse (1998). This model is known as the CDM model (an acronym of the authors’ surnames). These authors developed seminal studies in the field of the relationships between innovation and productivity (see Griliches 1979; Pakes and Grilliches 1980). The main advantage of the CDM model is its ability to integrate the process of innovation and business productivity from an empirical perspective.

Influence diagnostic analysis in the possibly heteroskedastic linear model with exact restrictions

Relevância:

30.00% 30.00%

Publicador:

Use of posterior predictive assessments to evaluate model fit in multilevel logistic regression

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Assessing the fit of a model is an important final step in any statistical analysis, but this is not straightforward when complex discrete response models are used. Cross validation and posterior predictions have been suggested as methods to aid model criticism. In this paper a comparison is made between four methods of model predictive assessment in the context of a three level logistic regression model for clinical mastitis in dairy cattle; cross validation, a prediction using the full posterior predictive distribution and two “mixed” predictive methods that incorporate higher level random effects simulated from the underlying model distribution. Cross validation is considered a gold standard method but is computationally intensive and thus a comparison is made between posterior predictive assessments and cross validation. The analyses revealed that mixed prediction methods produced results close to cross validation whilst the full posterior predictive assessment gave predictions that were over-optimistic (closer to the observed disease rates) compared with cross validation. A mixed prediction method that simulated random effects from both higher levels was best at identifying the outlying level two (farm-year) units of interest. It is concluded that this mixed prediction method, simulating random effects from both higher levels, is straightforward and may be of value in model criticism of multilevel logistic regression, a technique commonly used for animal health data with a hierarchical structure.

GIS-integrated mathematical modeling of social phenomena at macro- and micro- levels—a multivariate geographically-weighted regression model for identifying locations vulnerable to hosting terrorist safe-houses: France as case study

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat—it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publically available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically-weighted regression analysis produced the most accurate result to accommodate non-stationary coefficient behavior, demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism. This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.

«
1
2
...
56
57
58
59
60
61
62
63
64
»