18 resultados para Regression discontinuity

em Universidad Politécnica de Madrid


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, the lasso is a shrinkage and selection method for linear regression. We present an algorithm that embeds lasso in an iterative procedure that alternatively computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models for several kinds of scenarios

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There exists an interest in performing full core pin-by-pin computations for present nuclear reactors. In such type of problems the use of a transport approximation like the diffusion equation requires the introduction of correction parameters. Interface discontinuity factors can improve the diffusion solution to nearly reproduce a transport solution. Nevertheless, calculating accurate pin-by-pin IDF requires the knowledge of the heterogeneous neutron flux distribution, which depends on the boundary conditions of the pin-cell as well as the local variables along the nuclear reactor operation. As a consequence, it is impractical to compute them for each possible configuration. An alternative to generate accurate pin-by-pin interface discontinuity factors is to calculate reference values using zero-net-current boundary conditions and to synthesize afterwards their dependencies on the main neighborhood variables. In such way the factors can be accurately computed during fine-mesh diffusion calculations by correcting the reference values as a function of the actual environment of the pin-cell in the core. In this paper we propose a parameterization of the pin-by-pin interface discontinuity factors allowing the implementation of a cross sections library able to treat the neighborhood effect. First results are presented for typical PWR configurations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Interface discontinuity factors based on the Generalized Equivalence Theory are commonly used in nodal homogenized diffusion calculations so that diffusion average values approximate heterogeneous higher order solutions. In this paper, an additional form of interface correction factors is presented in the frame of the Analytic Coarse Mesh Finite Difference Method (ACMFD), based on a correction of the modal fluxes instead of the physical fluxes. In the ACMFD formulation, implemented in COBAYA3 code, the coupled multigroup diffusion equations inside a homogenized region are reduced to a set of uncoupled modal equations through diagonalization of the multigroup diffusion matrix. Then, physical fluxes are transformed into modal fluxes in the eigenspace of the diffusion matrix. It is possible to introduce interface flux discontinuity jumps as the difference of heterogeneous and homogeneous modal fluxes instead of introducing interface discontinuity factors as the ratio of heterogeneous and homogeneous physical fluxes. The formulation in the modal space has been implemented in COBAYA3 code and assessed by comparison with solutions using classical interface discontinuity factors in the physical space

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Linear regression is a technique widely used in digital signal processing. It consists on finding the linear function that better fits a given set of samples. This paper proposes different hardware architectures for the implementation of the linear regression method on FPGAs, specially targeting area restrictive systems. It saves area at the cost of constraining the lengths of the input signal to some fixed values. We have implemented the proposed scheme in an Automatic Modulation Classifier, meeting the hard real-time constraints this kind of systems have.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a methodology for reducing a straight line fitting regression problem to a Least Squares minimization one. This is accomplished through the definition of a measure on the data space that takes into account directional dependences of errors, and the use of polar descriptors for straight lines. This strategy improves the robustness by avoiding singularities and non-describable lines. The methodology is powerful enough to deal with non-normal bivariate heteroscedastic data error models, but can also supersede classical regression methods by making some particular assumptions. An implementation of the methodology for the normal bivariate case is developed and evaluated.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

One of the common pathologies of brickwork masonry structural elements and walls is the cracking associated with the differential settlements and/or excessive deflections of the slabs along the life of the structure. The scarce capacity of the masonry in order to accompany the structural elements that surround it, such as floors, beams or foundations, in their movements makes the brickwork masonry to be an element that frequently presents this kind of problem. This problem is a fracture problem, where the wall is cracked under mixed mode fracture: tensile and shear stresses combination, under static loading. Consequently, it is necessary to advance in the simulation and prediction of brickwork masonry mechanical behaviour under tensile and shear loading. The quasi-brittle behaviour of the brickwork masonry can be studied using the cohesive crack model whose application to other quasibrittle materials like concrete has traditionally provided very satisfactory results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Performing three-dimensional pin-by-pin full core calculations based on an improved solution of the multi-group diffusion equation is an affordable option nowadays to compute accurate local safety parameters for light water reactors. Since a transport approximation is solved, appropriate correction factors, such as interface discontinuity factors, are required to nearly reproduce the fully heterogeneous transport solution. Calculating exact pin-by-pin discontinuity factors requires the knowledge of the heterogeneous neutron flux distribution, which depends on the boundary conditions of the pin-cell as well as the local variables along the nuclear reactor operation. As a consequence, it is impractical to compute them for each possible configuration; however, inaccurate correction factors are one major source of error in core analysis when using multi-group diffusion theory. An alternative to generate accurate pin-by-pin interface discontinuity factors is to build a functional-fitting that allows incorporating the environment dependence in the computed values. This paper suggests a methodology to consider the neighborhood effect based on the Analytic Coarse-Mesh Finite Difference method for the multi-group diffusion equation. It has been applied to both definitions of interface discontinuity factors, the one based on the Generalized Equivalence Theory and the one based on Black-Box Homogenization, and for different few energy groups structures. Conclusions are drawn over the optimal functional-fitting and demonstrative results are obtained with the multi-group pin-by-pin diffusion code COBAYA3 for representative PWR configurations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Este artículo propone un método para llevar a cabo la calibración de las familias de discontinuidades en macizos rocosos. We present a novel approach for calibration of stochastic discontinuity network parameters based on genetic algorithms (GAs). To validate the approach, examples of application of the method to cases with known parameters of the original Poisson discontinuity network are presented. Parameters of the model are encoded as chromosomes using a binary representation, and such chromosomes evolve as successive generations of a randomly generated initial population, subjected to GA operations of selection, crossover and mutation. Such back-calculated parameters are employed to make assessments about the inference capabilities of the model using different objective functions with different probabilities of crossover and mutation. Results show that the predictive capabilities of GAs significantly depend on the type of objective function considered; and they also show that the calibration capabilities of the genetic algorithm can be acceptable for practical engineering applications, since in most cases they can be expected to provide parameter estimates with relatively small errors for those parameters of the network (such as intensity and mean size of discontinuities) that have the strongest influence on many engineering applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from Magnetoencelophalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Aplicación de simulación de Monte Carlo y técnicas de Análisis de la Varianza (ANOVA) a la comparación de modelos estocásticos dinámicos para accidentes de tráfico.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Fractal and multifractal are concepts that have grown increasingly popular in recent years in the soil analysis, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit commonly using least squares method. This shouldn?t be a special problem, however, in many situations using experimental data the researcher has to select the range of scales at which is going to work neglecting the rest of points to achieve the best linearity that in this type of analysis is necessary. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. In this method we don?t have to assume that the outlier point is simply an extreme observation drawn from the tail of a normal distribution not compromising the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points in the experimental data used trying to avoid subjective choices. Based on this analysis we have developed a new work methodology that implies two basic steps: ? Evaluation of the improvement of linear fitting when consecutive points are eliminated based on R pvalue. In this way we consider the implications of reducing the number of points. ? Evaluation of the significance of slope difference between fitting with the two extremes points and fitted with the available points. We compare the results applying this methodology and the common used least squares one. The data selected for these comparisons are coming from experimental soil roughness transect and simulated based on middle point displacement method adding tendencies and noise. The results are discussed indicating the advantages and disadvantages of each methodology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. In addition to recording TOD, the cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also identified for use as the independent variables in the regression analysis. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajectory parame- ters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowledge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a model of Bayesian network for continuous variables, where densities and conditional densities are estimated with B-spline MoPs. We use a novel approach to directly obtain conditional densities estimation using B-spline properties. In particular we implement naive Bayes and wrapper variables selection. Finally we apply our techniques to the problem of predicting neurons morphological variables from electrophysiological ones.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work proposes an automatic methodology for modeling complex systems. Our methodology is based on the combination of Grammatical Evolution and classical regression to obtain an optimal set of features that take part of a linear and convex model. This technique provides both Feature Engineering and Symbolic Regression in order to infer accurate models with no effort or designer's expertise requirements. As advanced Cloud services are becoming mainstream, the contribution of data centers in the overall power consumption of modern cities is growing dramatically. These facilities consume from 10 to 100 times more power per square foot than typical office buildings. Modeling the power consumption for these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers not yet satisfied by analytical approaches. For this case study, our methodology minimizes error in power prediction. This work has been tested using real Cloud applications resulting on an average error in power estimation of 3.98%. Our work improves the possibilities of deriving Cloud energy efficient policies in Cloud data centers being applicable to other computing environments with similar characteristics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Predicting failures in a distributed system based on previous events through logistic regression is a standard approach in literature. This technique is not reliable, though, in two situations: in the prediction of rare events, which do not appear in enough proportion for the algorithm to capture, and in environments where there are too many variables, as logistic regression tends to overfit on this situations; while manually selecting a subset of variables to create the model is error- prone. On this paper, we solve an industrial research case that presented this situation with a combination of elastic net logistic regression, a method that allows us to automatically select useful variables, a process of cross-validation on top of it and the application of a rare events prediction technique to reduce computation time. This process provides two layers of cross- validation that automatically obtain the optimal model complexity and the optimal mode l parameters values, while ensuring even rare events will be correctly predicted with a low amount of training instances. We tested this method against real industrial data, obtaining a total of 60 out of 80 possible models with a 90% average model accuracy.