898 results for Discrete Regression and Qualitative Choice Models


Abstract:

Suppose two or more variables are jointly normally distributed. If these variables share a common relationship, it is important to quantify it with the correlation coefficient, a parameter that measures the strength of the relationship; the coefficient can then be used to develop a prediction equation and, ultimately, to draw testable conclusions about the parent population. This research focused on the correlation coefficient ρ for the bivariate and trivariate normal distributions when equal variances and equal covariances are assumed. In particular, we derived the maximum likelihood estimators (MLEs) of the distribution parameters, assuming all of them are unknown, and we studied the properties and asymptotic distribution of the MLE of ρ. Having shown its asymptotic normality, we were able to construct confidence intervals for ρ and test hypotheses about ρ. In a series of simulations, the performance of our new estimators was studied and compared with that of estimators already in the literature. The results indicated that the MLE performs as well as or better than the others.
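As a rough illustration of the bivariate case, the sketch below computes the MLE of ρ under a common variance, assuming separate unknown means, together with a Wald-type interval based on the delta-method asymptotic variance (1 − ρ²)²/n. The function name, the default z = 1.96 (a nominal 95% level) and the clipping to [−1, 1] are assumptions made for this example; it is not taken from the work summarized above and may differ from the estimators derived there.

```python
import math

def rho_mle_equal_variance(x, y, z=1.96):
    """Sketch: MLE of rho for bivariate normal pairs sharing one variance.

    Assumes separate unknown means for x and y. Returns the estimate and a
    Wald interval built from the asymptotic variance (1 - rho^2)^2 / n.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Pooling the two variances gives 2*Sxy / (Sxx + Syy).
    rho = 2.0 * sxy / (sxx + syy)
    half_width = z * (1.0 - rho ** 2) / math.sqrt(n)
    # Clip the interval to the valid range of a correlation.
    return rho, (max(-1.0, rho - half_width), min(1.0, rho + half_width))
```

Unlike the usual Pearson coefficient, this estimator divides by the pooled sum of squares rather than the geometric mean of the two, which is what the equal-variance assumption buys.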

Abstract:

Most existing open-source search engines use keyword or tf-idf based techniques to find documents and web pages relevant to an input query. Although these methods, with the help of PageRank or knowledge graphs, have proved effective in some cases, they often fail to retrieve relevant instances for more complicated queries that require semantic understanding. In this thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of the Gruppo Maggioli company. Semantic search, or search with meaning, refers to understanding the query instead of simply finding word matches and, in general, represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy for training on unlabeled data, based on the creation of pairs of 'artificial' queries and their respective positive passages. We claim that by removing the reliance on labeled data, we may use the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.

Abstract:

The Invariance Thesis of Slot and van Emde Boas states that a time (respectively, space) cost model is reasonable for a computational model C if there are mutual simulations between Turing machines and C such that the overhead is polynomial in time (respectively, linear in space). The rationale is that under the Invariance Thesis, complexity classes such as LOGSPACE, P and PSPACE become robust, i.e. machine independent. In this dissertation, we want to find out whether it is possible to define a reasonable space cost model for the lambda-calculus, the paradigmatic model for functional programming languages. We start by considering an unusual evaluation mechanism for the lambda-calculus, based on Girard's Geometry of Interaction, that was conjectured to be the key ingredient for obtaining a reasonable space cost model. By a fine complexity analysis of this schema, based on new variants of non-idempotent intersection types, we disprove this conjecture. Then, we change the target of our analysis. We consider a variant of Krivine's abstract machine, a standard evaluation mechanism for the call-by-name lambda-calculus, optimized for space complexity and implemented without any pointers. A fine analysis of the execution of (a refined version of) the encoding of Turing machines into the lambda-calculus allows us to conclude that the space consumed by this machine is indeed a reasonable space cost model. In particular, for the first time we are also able to measure sub-linear space complexities. Moreover, we transfer this result to the call-by-value case. Finally, we also provide an intersection type system that compositionally characterizes this new reasonable space measure. This is done through a minimal, yet non-trivial, modification of the original de Carvalho type system.

Abstract:

The aim of this work is to analyse the chemistry models of low-pressure Helicon discharges fed with iodine and air. In particular, the research seeks to understand the plasma dynamics in order to predict the propulsive performance of iodine and air-breathing Helicon Plasma Thrusters. The two systems have been simulated and analysed with global models, i.e. zero-dimensional tools that solve the set of governing equations by assuming that all quantities are volume averaged. Furthermore, some strategies have been implemented to improve the accuracy of this approach. The global models for both iodine and air have been verified by comparing their results against simulations taken from the literature. Moreover, the iodine global model has been validated against experimental measurements of REGULUS, a Helicon plasma thruster developed by the Italian company T4i, with good agreement. The analysis of the iodine model found a significantly higher density of atomic positive ions than of molecular ions. Negative ions, instead, have been shown to have a negligible effect on the propulsive results. The influence of reactions between heavy particles has also been analysed with the global model. The results demonstrate that, in the iodine case, the chemistry is almost entirely governed by electronic collisions. For the air-breathing case, the effect of orbital height on propulsive performance has been investigated. In particular, the global model has shown that at lower heights the thrust and specific impulse are lower due to a change in atmospheric concentration. Finally, the iodine chemistry model has been introduced in the fluid code 3D-VIRTUS in order to make a preliminary assessment of the plasma properties of a Helicon discharge chamber for electric propulsion.

Abstract:

Quantum clock models are statistical mechanical spin models which may be regarded as a sort of bridge between the one-dimensional quantum Ising model and the one-dimensional quantum XY model. This thesis aims to provide an exhaustive review of these models using both analytical and numerical techniques. We present some important duality transformations which allow us to recast clock models into different forms, involving for example parafermions and lattice gauge theories. Thus, the notion of topological order enters into the game opening new scenarios for possible applications, like topological quantum computing. The second part of this thesis is devoted to the numerical analysis of clock models. We explore their phase diagram under different setups, with and without chirality, starting with a transverse field and then adding a longitudinal field as well. The most important observables we take into account for diagnosing criticality are the energy gap, the magnetisation, the entanglement entropy and the correlation functions.

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Abstract:

Background. The intrafamilial dynamics of endemic infection with human herpesvirus type 8 (HHV-8) in Amerindian populations is unknown. Methods. Serum samples were obtained from 517 Amerindians and tested for HHV-8 anti-latent nuclear antigen (anti-LANA) and antilytic antibodies by immunofluorescence assays. Logistic regression and mixed logistic models were used to estimate the odds of being HHV-8 seropositive among intrafamilial pairs. Results. HHV-8 seroprevalence by either assay was 75.4% (95% confidence interval [CI]: 71.5%-79.1%), and it was age-dependent (P-trend<.001). Familial dependence in HHV-8 seroprevalence by either assay was found between mother-offspring (odds ratio [OR], 5.44; 95% CI: 1.62-18.28) and siblings aged >= 10 years (OR 4.42, 95% CI: 1.70-11.45) or siblings in close age range (<5 years difference) (OR 3.37, 95% CI: 1.21-9.40), or in families with large (>4) number of siblings (OR, 3.20, 95% CI: 1.33-7.67). In separate analyses by serological assay, there was strong dependence in mother-offspring (OR 8.94, 95% CI: 2.94-27.23) and sibling pairs aged >= 10 years (OR, 11.91, 95% CI: 2.23-63.64) measured by LANA but not lytic antibodies. Conclusions. This pattern of familial dependence suggests that, in this endemic population, HHV-8 transmission mainly occurs from mother to offspring and between close siblings during early childhood, probably via saliva. The mother to offspring dependence was derived chiefly from anti-LANA antibodies.

Abstract:

The purpose of this thesis is twofold. The first and major part is devoted to sensitivity analysis of various discrete optimization problems, while the second part addresses methods for calculating measures of solution stability and for solving multicriteria discrete optimization problems. Among the numerous approaches to stability analysis of discrete optimization problems, two major directions can be singled out: quantitative and qualitative. Qualitative sensitivity analysis is conducted for multicriteria discrete optimization problems with minisum, minimax and minimin partial criteria. The main results obtained here are necessary and sufficient conditions for different stability types of optimal solutions (or a set of optimal solutions) of the considered problems. Within the quantitative direction, various measures of solution stability are investigated. A formula for a quantitative characteristic called the stability radius is obtained for the generalized equilibrium situation invariant to changes of game parameters in the case of the Hölder metric. The quality of a problem solution can also be described in terms of robustness analysis. In this work the concepts of accuracy and robustness tolerances are presented for a strategic game with a finite number of players where the initial coefficients (costs) of linear payoff functions are subject to perturbations. The investigation of the stability radius also aims to devise methods for its calculation. A new metaheuristic approach is derived for calculating the stability radius of an optimal solution to the shortest path problem. The main advantage of the developed method is that it is potentially applicable to calculating stability radii of NP-hard problems. The last chapter of the thesis focuses on deriving innovative methods, based on an interactive optimization approach, for solving multicriteria combinatorial optimization problems.
The key idea of the proposed approach is to use a parameterized achievement scalarizing function to calculate solutions and to direct the interactive procedure by changing the weighting coefficients of this function. To illustrate the introduced ideas, a decision-making process is simulated for a three-objective median location problem. The concepts, models, and ideas collected and analyzed in this thesis provide good and relevant grounds for developing more complicated and integrated models of postoptimal analysis and for solving the most computationally challenging problems related to it.

Abstract:

Ties among event times are often recorded in survival studies. For example, in a two-week laboratory study where event times are measured in days, ties are very likely to occur. The proportional hazards model might be used in this setting with an approximated partial likelihood function. This approximation works well when the number of ties is small. On the other hand, discrete regression models are suggested when the data are heavily tied. However, in many situations it is not clear which approach should be used in practice. In this work, empirical guidelines based on Monte Carlo simulations are provided. These recommendations are based on a measure of the amount of tied data present and on the mean square error. An example illustrates the proposed criterion.

Abstract:

This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are analyzed. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed, and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and the difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability.
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Overall, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.

Abstract:

As an important civil engineering material, asphalt concrete (AC) is commonly used to build road surfaces, airports, and parking lots. With traditional laboratory tests and theoretical equations, it is a challenge to fully understand such a random composite material. Based on the discrete element method (DEM), this research seeks to develop and implement computer models as research approaches for improving the understanding of AC microstructure-based mechanics. In this research, three categories of approaches were developed or employed to simulate microstructures of AC materials, namely the randomly generated models, the idealized models, and the image-based models. The image-based models were recommended for accurately predicting AC performance, while the other models were recommended as research tools to obtain deep insight into AC microstructure-based mechanics. A viscoelastic micromechanical model was developed to capture viscoelastic interactions within the AC microstructure. Four types of constitutive models were built to address the four categories of interactions within an AC specimen. Each of the constitutive models consists of three parts that represent three different interaction behaviors: a stiffness model (force-displacement relation), a bonding model (shear and tensile strengths), and a slip model (frictional property). Three techniques were developed to reduce the computational time for AC viscoelastic simulations. It was found that the computational time was significantly reduced, from years or months to days or hours, for typical three-dimensional models. Dynamic modulus and creep stiffness tests were simulated, and methodologies were developed to determine the viscoelastic parameters. It was found that the DE models could successfully predict dynamic modulus, phase angles, and creep stiffness over a wide range of frequencies, temperatures, and time spans.
Mineral aggregate morphology characteristics (sphericity, orientation, and angularity) were studied to investigate their impact on AC creep stiffness. It was found that aggregate characteristics significantly affect creep stiffness. Pavement responses and pavement-vehicle interactions were investigated by simulating pavement sections under a rolling wheel. It was found that wheel acceleration, steady motion, and deceleration significantly affect contact forces. Additionally, a summary and recommendations were provided in the last chapter, and part of the computer code was provided in the appendixes.

Abstract:

My dissertation focuses on developing methods for detecting gene-gene/environment interactions and imprinting effects in human complex diseases and quantitative traits. It includes three sections: (1) generalizing the Natural and Orthogonal interaction (NOIA) model for the coding technique originally developed for gene-gene (GxG) interaction and also to reduced models; (2) developing a novel statistical approach that allows for modeling gene-environment (GxE) interactions influencing disease risk; and (3) developing a statistical approach for modeling genetic variants displaying parent-of-origin effects (POEs), such as imprinting. In the past decade, genetic researchers have identified a large number of causal variants for human genetic diseases and traits by single-locus analysis, and interaction has now become a hot topic in the effort to search for the complex network between multiple genes or environmental exposures contributing to the outcome. Epistasis, also known as gene-gene interaction, is the departure from additive genetic effects of several genes on a trait, which means that the same alleles of one gene could display different genetic effects under different genetic backgrounds. In this study, we propose to implement the NOIA model for association studies along with interaction for human complex traits and diseases. We compare the performance of the new statistical models we developed and the usual functional model by both simulation study and real data analysis. Both simulation and real data analysis revealed higher power of the NOIA GxG interaction model for detecting both main genetic effects and interaction effects. Through application to a melanoma dataset, we confirmed the previously identified significant regions for melanoma risk at 15q13.1, 16q24.3 and 9p21.3. We also identified potential interactions with these significant regions that contribute to melanoma risk.
Based on the NOIA model, we developed a novel statistical approach that allows us to model effects from a genetic factor and a binary environmental exposure that jointly influence disease risk. Both simulation and real data analyses revealed higher power of the NOIA model for detecting both main genetic effects and interaction effects for both quantitative and binary traits. We also found that estimates of the parameters from logistic regression for binary traits are no longer statistically uncorrelated under the alternative model when there is an association. Applying our novel approach to a lung cancer dataset, we confirmed four SNPs in the 5p15 and 15q25 regions to be significantly associated with lung cancer risk in the Caucasian population: rs2736100, rs402710, rs16969968 and rs8034191. We also validated that rs16969968 and rs8034191 in the 15q25 region interact significantly with smoking in the Caucasian population. Our approach identified the potential interaction of SNP rs2256543 in 6p21 with smoking in contributing to lung cancer risk. Genetic imprinting is the best-known cause of the parent-of-origin effect (POE), whereby a gene is differentially expressed depending on the parental origin of the same alleles. Genetic imprinting affects several human disorders, including diabetes, breast cancer, alcoholism, and obesity. This phenomenon has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we propose a NOIA framework for a single-locus association study that estimates both main allelic effects and POEs. We develop statistical (Stat-POE) and functional (Func-POE) models, and demonstrate conditions for orthogonality of the Stat-POE model. We conducted simulations for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs.
Our results showed that the newly proposed Stat-POE model, which ensures orthogonality of variance components if Hardy-Weinberg Equilibrium (HWE) holds or the minor and major allele frequencies are equal, had greater power for detecting the main allelic additive effect than the Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.

Abstract:

The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit. The study found that the $C_g^*$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns. The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations.
Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates. Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided.
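For readers unfamiliar with the deciles-of-risk grouping, here is a minimal sketch of a Hosmer-Lemeshow-type statistic. It is illustrative only: the function name and the rank-based equal-size grouping are assumptions of this example, not the procedure used in the study. In particular, this simple grouping can separate observations with tied probabilities, which the abstract above advises against.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Sketch of a Hosmer-Lemeshow-type goodness-of-fit statistic.

    y: observed 0/1 outcomes; p: fitted probabilities from a logistic model.
    Returns (statistic, degrees_of_freedom) with groups - 2 degrees of freedom.
    """
    # Sort by fitted probability only; the sort is stable for ties.
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        # Rank-based split into (nearly) equal-size groups.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, groups - 2
```

The returned statistic would normally be compared with a chi-square reference distribution on the returned degrees of freedom, subject to the caveats the abstract raises about when that approximation holds.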

Abstract:

In recent years, cities around the world have invested large sums of money in measures for reducing congestion and car trips. These investments are potential solutions to the well-known urban sprawl phenomenon, also called the "development trap", which leads to further congestion and a higher proportion of our time spent in slow-moving cars. In the search for solutions, the complex relationship between urban environment and travel behaviour has been studied in a number of cases. The main question under discussion is how to encourage multi-stop tours. Thus, the objective of this paper is to verify whether unobserved factors influence tour complexity. For this purpose, we use a database from a survey conducted in 2006-2007 in Madrid, a suitable case study for analyzing urban sprawl due to new urban developments and substantial changes in mobility patterns in recent years. A total of 943 individuals were interviewed from 3 selected neighbourhoods (CBD, urban and suburban). We study the effect of unobserved factors on trip frequency. This paper presents the estimation of a hybrid model in which the latent variable is called propensity to travel and the discrete choice model is composed of 5 alternative tour types. The results show that the characteristics of the neighbourhoods in Madrid are important in explaining trip frequency. The influence of land use variables on trip generation is clear, in particular the presence of commercial retail. Through the estimation of elasticities and forecasting, we determine to what extent land-use policy measures modify travel demand. Comparing aggregate elasticities with percentage variations, it can be seen that percentage variations could lead to inconsistent results. The results show that hybrid models explain travel behaviour better than traditional discrete choice models.
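To make the structure of such a hybrid model concrete, the sketch below computes multinomial logit choice probabilities over tour-type alternatives when a latent "propensity to travel" enters each utility linearly. The function name, the linear specification and the loadings are hypothetical choices for illustration; the paper's actual specification and estimates are not given in the abstract.

```python
import math

def tour_choice_probabilities(base_utilities, latent_loadings, propensity):
    """Sketch: logit probabilities with a latent variable in the utilities.

    base_utilities: observed-variable part of each alternative's utility.
    latent_loadings: hypothetical coefficients on the latent propensity.
    propensity: value of the latent variable for one individual.
    """
    utilities = [v + lam * propensity
                 for v, lam in zip(base_utilities, latent_loadings)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]
```

With five alternatives and identical utilities, each tour type receives probability 0.2; raising the loading on one alternative shifts probability toward it for individuals with a higher propensity to travel.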