114 results for Random regression
Abstract:
This paper provides a new proof of a theorem of Chandler-Wilde, Chonchaiya, and Lindner that the spectra of a certain class of infinite, random, tridiagonal matrices contain the unit disc almost surely. It also obtains an analogous result for a more general class of random matrices whose spectra contain a hole around the origin. The presence of the hole forces substantial changes to the analysis.
Abstract:
The problem of calculating the probability of error in a DS/SSMA system has been extensively studied for more than two decades. When random sequences are employed some conditioning must be done before the application of the central limit theorem is attempted, leading to a Gaussian distribution. The authors seek to characterise the multiple access interference as a random-walk with a random number of steps, for random and deterministic sequences. Using results from random-walk theory, they model the interference as a K-distributed random variable and use it to calculate the probability of error in the form of a series, for a DS/SSMA system with a coherent correlation receiver and BPSK modulation under Gaussian noise. The asymptotic properties of the proposed distribution agree with other analyses. This is, to the best of the authors' knowledge, the first attempt to propose a non-Gaussian distribution for the interference. The modelling can be extended to consider multipath fading and general modulation
Abstract:
In this paper we propose an efficient two-level model identification method for a large class of linear-in-the-parameters models from observational data. A new elastic net orthogonal forward regression (ENOFR) algorithm is employed at the lower level to carry out simultaneous model selection and elastic net parameter estimation. The two regularization parameters in the elastic net are optimized using a particle swarm optimization (PSO) algorithm at the upper level by minimizing the leave-one-out (LOO) mean square error (LOOMSE). Illustrative examples are included to demonstrate the effectiveness of the new approaches.
Abstract:
Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consists of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.
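The approximate Bayesian computation idea named in the abstract above can be illustrated with a minimal rejection sampler. This is only a sketch under loud assumptions: it uses a toy coin-flip model with a Uniform(0, 1) prior, not the Ising or exponential random graph models of the paper, and the function name `abc_rejection` and all parameter values are hypothetical. The point is that the likelihood is never evaluated; candidates are kept when data simulated from them resemble the observed data.

```python
import random

def abc_rejection(observed_heads, n_flips, n_draws, tolerance, seed=0):
    """Rejection ABC for a coin's bias under a Uniform(0, 1) prior (toy model)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # candidate parameter drawn from the prior
        # Simulate a data set of the same size as the observed one.
        simulated_heads = sum(rng.random() < theta for _ in range(n_flips))
        # Keep the candidate if the summary statistic is close enough.
        if abs(simulated_heads - observed_heads) <= tolerance:
            accepted.append(theta)
    return accepted

samples = abc_rejection(observed_heads=70, n_flips=100, n_draws=20000, tolerance=2)
posterior_mean = sum(samples) / len(samples)
print(len(samples), round(posterior_mean, 3))
```

A smaller tolerance gives a closer approximation to the exact posterior at the cost of fewer accepted draws.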
Abstract:
Currently, there are limited published data for the population dynamics of antimicrobial-resistant commensal bacteria. This study was designed to evaluate both the proportions of the Escherichia coli populations that are resistant to ampicillin at the level of the individual chicken on commercial broiler farms and the feasibility of obtaining repeated measures of fecal E. coli concentrations. Short-term temporal variation in the concentration of fecal E. coli was investigated, and a preliminary assessment was made of potential factors involved in the shedding of high numbers of ampicillin-resistant E. coli by growing birds in the absence of the use of antimicrobial drugs. Multilevel linear regression modeling revealed that the largest component of random variation in log-transformed fecal E. coli concentrations was seen between sampling occasions for individual birds. The incorporation of fixed effects into the model demonstrated that the older, heavier birds in the study were significantly more likely (P = 0.0003) to shed higher numbers of ampicillin-resistant E. coli. This association between increasing weight and high shedding was not seen for the total fecal E. coli population (P = 0.71). This implies that, in the absence of the administration of antimicrobial drugs, the proportion of fecal E. coli that was resistant to ampicillin increased as the birds grew. This study has shown that it is possible to collect quantitative microbiological data on broiler farms and that such data could make valuable contributions to risk assessments concerning the transfer of resistant bacteria between animal and human populations.
Abstract:
Ensemble learning techniques generate multiple classifiers, so-called base classifiers, whose combined classification results are used in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable and sometimes higher classification accuracy compared with decision tree classifiers, if the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice ensemble techniques tend to reduce the overfitting; however, there exists no ensemble learner for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms in order to enhance Prism’s classification accuracy by reducing overfitting.
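The general ensemble idea described in the abstract above (several base classifiers whose combined votes give the final label) can be sketched as follows. This is an illustration under stated assumptions, not the article's learner: it uses bootstrap sampling with majority voting, and a one-feature threshold rule stands in for a Prism rule inducer, which is not implemented here. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-feature threshold rule: predict 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(label) for x, label in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def train_ensemble(data, n_models, rng):
    """Train each base classifier on a bootstrap sample of the data."""
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
        models.append(train_stump(bootstrap))
    return models

def predict(models, x):
    """Combine the base classifiers by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Toy data: the label is 1 when x >= 5, with 10% of labels flipped as noise.
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    label = int(x >= 5)
    if rng.random() < 0.1:
        label = 1 - label
    data.append((x, label))

ensemble = train_ensemble(data, n_models=25, rng=rng)
print(predict(ensemble, 9.0), predict(ensemble, 1.0))
```

An odd number of base classifiers is used so that a binary vote cannot tie.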
Abstract:
Generally, classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism, like any ensemble learner, suffers from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest-sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Abstract:
In this paper I analyze the general equilibrium in a random Walrasian economy. Dependence among agents is introduced in the form of dependency neighborhoods. Under uncertainty, an agent may fail to survive due to a meager endowment in a particular state (direct effect), as well as due to an unfavorable equilibrium price system at which the value of the endowment falls short of the minimum needed for survival (indirect terms-of-trade effect). To illustrate the main result I compute the stochastic limit of the equilibrium price and the probability of survival of an agent in a large Cobb-Douglas economy.
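The two survival channels in the abstract above can be made concrete with a small two-good Cobb-Douglas exchange economy. This is a sketch under assumptions that are not taken from the paper: agents have utility x^alpha * y^(1 - alpha), good y is the numeraire, endowments are drawn independently (so the dependency neighborhoods are ignored), and the survival threshold `minimum_wealth` is a hypothetical parameter. Market clearing for good x then gives the closed form p = sum(alpha_i * ey_i) / sum((1 - alpha_i) * ex_i).

```python
import random

def equilibrium_price(agents):
    """Market-clearing price of good x, with good y as numeraire.

    Each agent is a tuple (alpha, endow_x, endow_y) with utility
    x**alpha * y**(1 - alpha).
    """
    numerator = sum(alpha * ey for alpha, ex, ey in agents)
    denominator = sum((1 - alpha) * ex for alpha, ex, ey in agents)
    return numerator / denominator

def survival_share(agents, price, minimum_wealth):
    """Share of agents whose endowment value reaches the survival minimum."""
    survivors = sum(price * ex + ey >= minimum_wealth for _, ex, ey in agents)
    return survivors / len(agents)

rng = random.Random(1)
# Random endowments, independent across agents (an assumption of this sketch).
agents = [(0.5, rng.uniform(0, 2), rng.uniform(0, 2)) for _ in range(10000)]

price = equilibrium_price(agents)
share = survival_share(agents, price, minimum_wealth=1.0)

# Check market clearing: aggregate demand for x equals aggregate supply of x.
demand_x = sum(alpha * (price * ex + ey) / price for alpha, ex, ey in agents)
supply_x = sum(ex for _, ex, _ in agents)
print(round(price, 3), round(share, 3))
```

Because the price depends on everyone's endowments, an agent's survival depends both on its own draw and on the realized price.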
Abstract:
Data augmentation is a powerful technique for estimating models with latent or missing data, but applications in agricultural economics have thus far been few. This paper showcases the technique in an application to data on milk market participation in the Ethiopian highlands. There, a key impediment to economic development is an apparently low rate of market participation. Consequently, economic interest centers on the “locations” of nonparticipants in relation to the market and their “reservation values” across covariates. These quantities are of policy interest because they provide measures of the additional inputs necessary in order for nonparticipants to enter the market. One quantity of primary interest is the minimum amount of surplus milk (the “minimum efficient scale of operations”) that the household must acquire before market participation becomes feasible. We estimate this quantity through routine application of data augmentation and Gibbs sampling applied to a random-censored Tobit regression. Incorporating random censoring markedly affects the marketable-surplus requirements of the household, but only slightly affects the covariate requirement estimates, and generally leads to more plausible policy estimates than those obtained from the zero-censored formulation.
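The combination of data augmentation and Gibbs sampling mentioned in the abstract above can be sketched on a deliberately reduced model. The assumptions are substantial and are not the paper's specification: this is a zero-censored (not random-censored) Tobit with an intercept only, unit error variance, and a flat prior on the mean; the function names and simulated data are hypothetical. Each sweep first imputes the latent values behind the censored zeros, then draws the mean given the completed data.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def draw_truncated_below_zero(mu, rng):
    """Draw from N(mu, 1) restricted to (-inf, 0] by inverting the CDF."""
    upper = STD_NORMAL.cdf(-mu)              # P(latent value <= 0)
    u = max(rng.random() * upper, 1e-300)    # keep u strictly positive
    return mu + STD_NORMAL.inv_cdf(u)

def gibbs_tobit_mean(observed, n_iter, burn_in, seed=0):
    """Posterior draws of mu for y = max(0, y*), y* ~ N(mu, 1), flat prior."""
    rng = random.Random(seed)
    n = len(observed)
    uncensored_sum = sum(y for y in observed if y > 0)
    n_censored = sum(1 for y in observed if y <= 0)
    mu = 0.0
    draws = []
    for iteration in range(n_iter):
        # Data augmentation: impute latent values for the censored zeros.
        latent_sum = sum(draw_truncated_below_zero(mu, rng)
                         for _ in range(n_censored))
        # Parameter step: mu given completed data is N(mean, 1 / n).
        mean = (uncensored_sum + latent_sum) / n
        mu = rng.gauss(mean, (1.0 / n) ** 0.5)
        if iteration >= burn_in:
            draws.append(mu)
    return draws

data_rng = random.Random(42)
true_mu = 1.0  # used only to simulate toy data
data = [max(0.0, data_rng.gauss(true_mu, 1.0)) for _ in range(500)]

draws = gibbs_tobit_mean(data, n_iter=600, burn_in=100)
posterior_mean = sum(draws) / len(draws)
print(round(posterior_mean, 3))
```

A random-censored version would replace the fixed censoring point 0 with an additional unknown quantity sampled in its own step.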
Abstract:
We present a model of market participation in which the presence of non-negligible fixed costs leads to random censoring of the traditional double-hurdle model. Fixed costs arise when household resources must be devoted a priori to the decision to participate in the market. These costs, usually of time, are manifested in non-negligible minimum-efficient supplies and supply correspondence that requires modification of the traditional Tobit regression. The costs also complicate econometric estimation of household behavior. These complications are overcome by application of the Gibbs sampler. The algorithm thus derived provides robust estimates of the fixed-costs, double-hurdle model. The model and procedures are demonstrated in an application to milk market participation in the Ethiopian highlands.