37 resultados para Limited dependent variable regression


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Gaussian processes provide natural non-parametric prior distributions over regression functions. In this paper we consider regression problems where there is noise on the output, and the variance of the noise depends on the inputs. If we assume that the noise is a smooth function of the inputs, then it is natural to model the noise variance using a second Gaussian process, in addition to the Gaussian process governing the noise-free output value. We show that prior uncertainty about the parameters controlling both processes can be handled and that the posterior distribution of the noise rate can be sampled from using Markov chain Monte Carlo methods. Our results on a synthetic data set give a posterior noise variance that well-approximates the true variance.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In most treatments of the regression problem it is assumed that the distribution of target data can be described by a deterministic function of the inputs, together with additive Gaussian noise having constant variance. The use of maximum likelihood to train such models then corresponds to the minimization of a sum-of-squares error function. In many applications a more realistic model would allow the noise variance itself to depend on the input variables. However, the use of maximum likelihood to train such models would give highly biased results. In this paper we show how a Bayesian treatment can allow for an input-dependent variance while overcoming the bias of maximum likelihood.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise or corruption. For real-world (errors in variable) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural network framework which allows for input noise given that some model of the noise process exists. In the limit where this noise process is small and symmetric it is shown, using the Laplace approximation, that there is an additional term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable and sampling this jointly with the network's weights, using Markov Chain Monte Carlo methods, it is demonstrated that it is possible to infer the unbiassed regression over the noiseless input.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Correlation and regression are two of the statistical procedures most widely used by optometrists. However, these tests are often misused or interpreted incorrectly, leading to erroneous conclusions from clinical experiments. This review examines the major statistical tests concerned with correlation and regression that are most likely to arise in clinical investigations in optometry. First, the use, interpretation and limitations of Pearson's product moment correlation coefficient are described. Second, the least squares method of fitting a linear regression to data and for testing how well a regression line fits the data are described. Third, the problems of using linear regression methods in observational studies, if there are errors associated in measuring the independent variable and for predicting a new value of Y for a given X, are discussed. Finally, methods for testing whether a non-linear relationship provides a better fit to the data and for comparing two or more regression lines are considered.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ability to distinguish one visual stimulus from another slightly different one depends on the variability of their internal representations. In a recent paper on human visual-contrast discrimination, Kontsevich et al (2002 Vision Research 42 1771 - 1784) re-considered the long-standing question whether the internal noise that limits discrimination is fixed (contrast-invariant) or variable (contrast-dependent). They tested discrimination performance for 3 cycles deg-1 gratings over a wide range of incremental contrast levels at three masking contrasts, and showed that a simple model with an expansive response function and response-dependent noise could fit the data very well. Their conclusion - that noise in visual-discrimination tasks increases markedly with contrast - has profound implications for our understanding and modelling of vision. Here, however, we re-analyse their data, and report that a standard gain-control model with a compressive response function and fixed additive noise can also fit the data remarkably well. Thus these experimental data do not allow us to decide between the two models. The question remains open. [Supported by EPSRC grant GR/S74515/01]

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Two types of prediction problem can be solved using a regression line viz., prediction of the ‘population’ regression line at the point ‘x’ and prediction of an ‘individual’ new member of the population ‘y1’ for which ‘x1’ has been measured. The second problem is probably the most commonly encountered and the most relevant to calibration studies. A regression line is likely to be most useful for calibration if the range of values of the X variable is large, if there is a good representation of the ‘x,y’ values across the range of X, and if several estimates of ‘y’ are made at each ‘x’. It is poor statistical practice to use a regression line for calibration or prediction beyond the limits of the data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Non-linear relationships are common in microbiological research and often necessitate the use of the statistical techniques of non-linear regression or curve fitting. In some circumstances, the investigator may wish to fit an exponential model to the data, i.e., to test the hypothesis that a quantity Y either increases or decays exponentially with increasing X. This type of model is straight forward to fit as taking logarithms of the Y variable linearises the relationship which can then be treated by the methods of linear regression.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

1. Pearson's correlation coefficient only tests whether the data fit a linear model. With large numbers of observations, quite small values of r become significant and the X variable may only account for a minute proportion of the variance in Y. Hence, the value of r squared should always be calculated and included in a discussion of the significance of r. 2. The use of r assumes that a bivariate normal distribution is present and this assumption should be examined prior to the study. If Pearson's r is not appropriate, then a non-parametric correlation coefficient such as Spearman's rs may be used. 3. A significant correlation should not be interpreted as indicating causation especially in observational studies in which there is a high probability that the two variables are correlated because of their mutual correlations with other variables. 4. In studies of measurement error, there are problems in using r as a test of reliability and the ‘intra-class correlation coefficient’ should be used as an alternative. A correlation test provides only limited information as to the relationship between two variables. Fitting a regression line to the data using the method known as ‘least square’ provides much more information and the methods of regression and their application in optometry will be discussed in the next article.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis describes an investigation which was carried out under the Interdisciplinary Higher Degres (IHD) Scheme of The University of Aston in Birmingham. The investigation, which involved joint collaboration between the IHD scheme, the Department of Mechanical Engineering, and G.E.C. Turbine Generators Limited, was concerned with hydrostatic bearing characteristics and of how hydrostatic bearings could be used to enable turbine generator rotor support impedances to be controlled to give an improved rotor dynamic response. Turbine generator rotor critical speeds are determined not only by the mass and flexibility of the rotor itself, which are relatively easily predicted, but also by the dynamic characteristics of the bearing oil film, pedestal, and foundations. It is because of the difficulty in accurately predicting the rotor support characteristics that the designer has a problem in ensuring that a rotor's normal running speed is not close to one of its critical speeds. The consequence of this situation is that some rotors do have critical speeds close to their normal running speed and the resulting high levels of vibration cause noise, high rotor stresses, and a shortening of bearing life. A combined theoretical and experimental investigation of the effects of mounting the normal rotor journal bearing in a hydrostatic bearing was carried out. The purpose of the work was to show that by changing the oil flow resistance offered by capillaries connecting accumulators to the hydrostatic bearing, the overall rotor support characteristics could be tuned to enable rotor critical speeds to be moved at will. Testing of a combined journal and hydrostatic bearing has confirmed the theory of its operation and a theoretical study of a full size machine showed that its critical speed could be moved by over 350 rpm and that its rotor vibration at running speed could be reduced by 80%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An investigator may also wish to select a small subset of the X variables which give the best prediction of the Y variable. In this case, the question is how many variables should the regression equation include? One method would be to calculate the regression of Y on every subset of the X variables and choose the subset that gives the smallest mean square deviation from the regression. Most investigators, however, prefer to use a ‘stepwise multiple regression’ procedure. There are two forms of this analysis called the ‘step-up’ (or ‘forward’) method and the ‘step-down’ (or ‘backward’) method. This Statnote illustrates the use of stepwise multiple regression with reference to the scenario introduced in Statnote 24, viz., the influence of climatic variables on the growth of the crustose lichen Rhizocarpon geographicum (L.)DC.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The aim of this research work was primarily to examine the relevance of patient parameters, ward structures, procedures and practices, in respect of the potential hazards of wound cross-infection and nasal colonisation with multiple resistant strains of Staphylococcus aureus, which it is thought might provide a useful indication of a patient's general susceptibility to wound infection. Information from a large cross-sectional survey involving 12,000 patients from some 41 hospitals and 375 wards was collected over a five-year period from 1967-72, and its validity checked before any subsequent analysis was carried out. Many environmental factors and procedures which had previously been thought (but never conclusively proved) to have an influence on wound infection or nasal colonisation rates, were assessed, and subsequently dismissed as not being significant, provided that the standard of the current range of practices and procedures is maintained and not allowed to deteriorate. Retrospective analysis revealed that the probability of wound infection was influenced by the patient's age, duration of pre-operative hospitalisation, sex, type of wound, presence and type of drain, number of patients in ward, and other special risk factors, whilst nasal colonisation was found to be influenced by the patient's age, total duration of hospitalisation, sex, antibiotics, proportion of occupied beds in the ward, average distance between bed centres and special risk factors. A multi-variate regression analysis technique was used to develop statistical models, consisting of variable patient and environmental factors which were found to have a significant influence on the risks pertaining to wound infection and nasal colonisation. A relationship between wound infection and nasal colonisation was then established and this led to the development of a more advanced model for predicting wound infections, taking advantage of the additional knowledge of the patient's state of nasal colonisation prior to operation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A methodology is presented which can be used to produce the level of electromagnetic interference, in the form of conducted and radiated emissions, from variable speed drives, the drive that was modelled being a Eurotherm 583 drive. The conducted emissions are predicted using an accurate circuit model of the drive and its associated equipment. The circuit model was constructed from a number of different areas, these being: the power electronics of the drive, the line impedance stabilising network used during the experimental work to measure the conducted emissions, a model of an induction motor assuming near zero load, an accurate model of the shielded cable which connected the drive to the motor, and finally the parasitic capacitances that were present in the drive modelled. The conducted emissions were predicted with an error of +/-6dB over the frequency range 150kHz to 16MHz, which compares well with the limits set in the standards which specify a frequency range of 150kHz to 30MHz. The conducted emissions model was also used to predict the current and voltage sources which were used to predict the radiated emissions from the drive. Two methods for the prediction of the radiated emissions from the drive were investigated, the first being two-dimensional finite element analysis and the second three-dimensional transmission line matrix modelling. The finite element model took account of the features of the drive that were considered to produce the majority of the radiation, these features being the switching of the IGBT's in the inverter, the shielded cable which connected the drive to the motor as well as some of the cables that were present in the drive.The model also took account of the structure of the test rig used to measure the radiated emissions. It was found that the majority of the radiation produced came from the shielded cable and the common mode currents that were flowing in the shield, and that it was feasible to model the radiation from the drive by only modelling the shielded cable. The radiated emissions were correctly predicted in the frequency range 30MHz to 200MHz with an error of +10dB/-6dB. The transmission line matrix method modelled the shielded cable which connected the drive to the motor and also took account of the architecture of the test rig. Only limited simulations were performed using the transmission line matrix model as it was found to be a very slow method and not an ideal solution to the problem. However the limited results obtained were comparable, to within 5%, to the results obtained using the finite element model.