126 results for Vector Auto Regression


Relevance: 20.00%

Abstract:

Increasingly, semiconductor manufacturers are exploring opportunities for virtual metrology (VM) enabled process monitoring and control as a means of reducing non-value-added metrology and achieving ever more demanding wafer fabrication tolerances. However, developing robust, reliable and interpretable VM models can be very challenging due to the highly correlated input space often associated with the underpinning data sets. A particularly pertinent example is etch rate prediction of plasma etch processes from multichannel optical emission spectroscopy data. This paper proposes a novel input-clustering-based forward stepwise regression methodology for VM model building in such highly correlated input spaces. Max Separation Clustering (MSC) is employed as a pre-processing step to identify a reduced set of well-conditioned, representative variables that can then be used as inputs to state-of-the-art model building techniques such as Forward Selection Regression (FSR), Ridge Regression, LASSO and Forward Selection Ridge Regression (FSRR). The methodology is validated on a benchmark semiconductor plasma etch dataset and the results obtained are compared with those achieved when the state-of-the-art approaches are applied directly to the data without the MSC pre-processing step. Significant performance improvements are observed when MSC is combined with FSR (13%) and FSRR (8.5%), but not with Ridge Regression (-1%) or LASSO (-32%). The optimal VM results are obtained using the MSC-FSR and MSC-FSRR generated models. © 2012 IEEE.
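The pipeline the abstract describes — cluster the correlated inputs, keep one representative per cluster, then run forward stepwise regression on the representatives — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: hierarchical clustering on a correlation distance stands in for Max Separation Clustering, and the dataset is synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Synthetic highly correlated input space: 3 latent factors, 30 channels.
latent = rng.normal(size=(200, 3))
X = np.repeat(latent, 10, axis=1) + 0.05 * rng.normal(size=(200, 30))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

# Step 1: cluster variables on correlation distance, keep one per cluster.
dist = 1 - np.abs(np.corrcoef(X.T))
labels = fcluster(linkage(dist[np.triu_indices(30, 1)], method='average'),
                  t=0.5, criterion='distance')
reps = [np.where(labels == c)[0][0] for c in np.unique(labels)]

# Step 2: forward stepwise regression on the representative variables only.
selected, residual = [], y.copy()
for _ in range(min(3, len(reps))):
    cands = [j for j in reps if j not in selected]
    # Add the candidate most correlated with the current residual.
    j = max(cands, key=lambda j: abs(np.corrcoef(X[:, j], residual)[0, 1]))
    selected.append(j)
    Xs = X[:, selected]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    residual = y - Xs @ beta
```

The clustering step reduces the 30 mutually correlated channels to one well-conditioned representative per latent factor before any model building takes place, which is the role MSC plays in the paper.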

Relevance: 20.00%

Abstract:

In a Bayesian learning setting, the posterior distribution of a predictive model arises from a trade-off between its prior distribution and the conditional likelihood of observed data. Such distribution functions usually rely on additional hyperparameters which need to be tuned in order to achieve optimum predictive performance; this operation can be efficiently performed in an Empirical Bayes fashion by maximizing the posterior marginal likelihood of the observed data. Since the score function of this optimization problem is in general characterized by the presence of local optima, it is necessary to resort to global optimization strategies, which require a large number of function evaluations. Given that the evaluation is usually computationally intensive and scales badly with respect to the dataset size, the maximum number of observations that can be treated simultaneously is quite limited. In this paper, we consider the case of hyperparameter tuning in Gaussian process regression. A straightforward implementation of the posterior log-likelihood for this model requires O(N^3) operations for every iteration of the optimization procedure, where N is the number of examples in the input dataset. We derive a novel set of identities that allow, after an initial overhead of O(N^3), the evaluation of the score function, as well as the Jacobian and Hessian matrices, in O(N) operations. We show that the proposed identities, which follow from the eigendecomposition of the kernel matrix, yield a reduction of several orders of magnitude in the computation time for the hyperparameter optimization problem. Notably, the proposed solution provides computational advantages even with respect to state-of-the-art approximations that rely on sparse kernel matrices.

Relevance: 20.00%

Abstract:

Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes since they allow the analysis of root causes even when the number of input variables is high. In this work, we investigate the performance of three techniques for classification within a Monte Carlo strategy with the aim of root cause analysis. We consider the naive Bayes classifier and the logistic regression model with two different implementations for controlling model complexity, namely, a LASSO-like implementation with an L1-norm regularization and a fully Bayesian implementation of the logistic model, the so-called relevance vector machine. Several challenges can arise when estimating such models, mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, the situation where the number of variables is higher than the number of available data points and the case of unbalanced datasets. Using an ecological and a semiconductor manufacturing dataset, we show advantages and drawbacks of each method, highlighting the superior performance in terms of classification accuracy for the relevance vector machine with respect to the other classifiers. Moreover, we show how the combination of the proposed techniques and the Monte Carlo approach can be used to get more robust insights into the problem under analysis when faced with challenging modelling conditions.
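The combination of an embedded-selection classifier with a Monte Carlo strategy can be sketched as follows: refit an L1-penalised logistic model on many bootstrap resamples and record how often each input survives the penalty, so that root-cause candidates are ranked by selection frequency rather than by a single fit. This is an illustrative sketch on synthetic data using scikit-learn, not the paper's implementation; the penalty strength `C=0.3` is an arbitrary choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
# Only the first two variables actually drive the class label.
logits = 2 * X[:, 0] - 2 * X[:, 1]
y = (logits + rng.normal(size=n) > 0).astype(int)

runs, counts = 100, np.zeros(p)
for _ in range(runs):
    idx = rng.integers(0, n, size=n)          # bootstrap resample
    clf = LogisticRegression(penalty='l1', solver='liblinear', C=0.3)
    clf.fit(X[idx], y[idx])
    counts += (np.abs(clf.coef_[0]) > 1e-6)   # inputs that survived L1

freq = counts / runs  # selection frequency per input variable
```

Variables that are selected in nearly every resample are robust root-cause candidates, while variables picked only occasionally are likely artefacts of a particular sample — the "more robust insights" the abstract refers to.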

Relevance: 20.00%

Abstract:

Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. The prediction models for VM can be drawn from a large variety of linear and nonlinear regression methods, and the selection of a proper regression method for a specific VM problem is not straightforward, especially when the candidate predictor set is of high dimension, correlated and noisy. Using process data from a benchmark semiconductor manufacturing process, this paper evaluates the performance of four typical regression methods for VM: multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), neural networks (NN) and Gaussian process regression (GPR). It is observed that GPR performs the best among the four methods and that, remarkably, the performance of linear regression approaches that of GPR as the subset of selected input variables is increased. The observed competitiveness of high-dimensional linear regression models, which does not hold true in general, is explained in the context of extreme learning machines and functional link neural networks.
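A comparison of the kind described can be sketched with scikit-learn on synthetic data standing in for the benchmark etch dataset. This is a minimal illustration of the evaluation setup, not the paper's experiment; the mildly nonlinear response function and the model hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 5))
# Mildly nonlinear response on a few of the inputs, plus noise.
y = (X[:, 0] + np.sin(2 * X[:, 1]) + 0.3 * X[:, 2] ** 2
     + 0.1 * rng.normal(size=300))

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

models = {
    'MLR': LinearRegression(),
    'LASSO': Lasso(alpha=0.01),
    'GPR': GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                    normalize_y=True),
}
mse = {name: mean_squared_error(yte, m.fit(Xtr, ytr).predict(Xte))
       for name, m in models.items()}
```

On a nonlinear response like this, the purely linear models leave the curvature in the residual while GPR captures it, mirroring the abstract's observation that GPR performs best when the input-output relationship is not linear.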

Relevance: 20.00%

Abstract:

An orthogonal vector approach is proposed for the synthesis of multi-beam directional modulation (DM) transmitters. These systems have the capability of concurrently projecting independent data streams into different specified spatial directions while simultaneously distorting signal constellations in all other directions. Simulated bit error rate (BER) spatial distributions are presented for various multi-beam system configurations in order to illustrate representative examples of physical layer security performance enhancement that can be achieved.

Relevance: 20.00%

Abstract:

Artificial neural network (ANN) methods are used to predict forest characteristics. The data source is the Southeast Alaska (SEAK) Grid Inventory, a ground survey compiled by the USDA Forest Service at several thousand sites. The main objective of this article is to predict characteristics at unsurveyed locations between grid sites. A secondary objective is to evaluate the relative performance of different ANNs. Data from the grid sites are used to train six ANNs: multilayer perceptron, fuzzy ARTMAP, probabilistic, generalized regression, radial basis function, and learning vector quantization. A classification and regression tree method is used for comparison. Topographic variables are used to construct the models: latitude and longitude coordinates, elevation, slope, and aspect. The models classify three forest characteristics: crown closure, species land cover, and tree size/structure. Models are constructed using n-fold cross-validation. Predictive accuracy is calculated using a method that accounts for the influence of misclassification as well as measuring correct classifications. The probabilistic and generalized regression networks are found to be the most accurate. The predictions of the ANN models are compared with a classification of the Tongass National Forest in southeast Alaska based on the interpretation of satellite imagery and are found to be of similar accuracy.

Relevance: 20.00%

Abstract:

This paper presents the results of an investigation into the utility of remote sensing (RS) using meteorological satellite sensors and spatial interpolation (SI) of data from meteorological stations, for the prediction of spatial variation in monthly climate across continental Africa in 1990. Information from the Advanced Very High Resolution Radiometer (AVHRR) of the National Oceanic and Atmospheric Administration's (NOAA) polar-orbiting meteorological satellites was used to estimate land surface temperature (LST) and atmospheric moisture. Cold cloud duration (CCD) data derived from the High Resolution Radiometer (HRR) onboard the European Meteorological Satellite programme's (EUMETSAT) Meteosat satellite series were also used as a RS proxy measurement of rainfall. Temperature, atmospheric moisture and rainfall surfaces were independently derived from SI of measurements from the World Meteorological Organization (WMO) member stations of Africa. These meteorological station data were then used to test the accuracy of each methodology, so that the appropriateness of the two techniques for epidemiological research could be compared. SI was a more accurate predictor of temperature, whereas RS provided a better surrogate for rainfall; both were equally accurate at predicting atmospheric moisture. The implications of these results for mapping short- and long-term climate change, and hence their potential for the study and control of disease vectors, are considered. Taking into account logistic and analytical problems, there were no clear conclusions regarding the optimality of either technique, but there was considerable potential for synergy.

Relevance: 20.00%

Abstract:

BACKGROUND: LuxS may function as a metabolic enzyme or as the synthase of a quorum sensing signalling molecule, auto-inducer-2 (AI-2); hence, the mechanism underlying phenotypic changes upon luxS inactivation is not always clear. In Helicobacter pylori, we have recently shown that, rather than functioning in recycling methionine as in most bacteria, LuxS (along with newly-characterised MccA and MccB), synthesises cysteine via reverse transsulphuration. In this study, we investigated whether and how LuxS controls motility of H. pylori, specifically if it has its effects via luxS-required cysteine metabolism or via AI-2 synthesis only.

RESULTS: We report that disruption of luxS renders H. pylori non-motile in soft agar and by microscopy, whereas disruption of mccAHp or mccBHp (other genes in the cysteine provision pathway) does not, implying that the lost phenotype is not due to disrupted cysteine provision. The motility defect of the DeltaluxSHp mutant was complemented genetically by luxSHp and also by addition of in vitro synthesised AI-2 or 4,5-dihydroxy-2,3-pentanedione (DPD, the precursor of AI-2). In contrast, exogenously added cysteine could not restore motility to the DeltaluxSHp mutant, confirming that AI-2 synthesis, but not the metabolic effect of LuxS, was important. Microscopy showed a reduced number and length of flagella in the DeltaluxSHp mutant. Immunoblotting identified decreased levels of FlaA and FlgE but not FlaB in the DeltaluxSHp mutant, and RT-PCR showed that the expression of flaA, flgE, motA, motB, flhA and fliI but not flaB was reduced. Addition of DPD but not cysteine to the DeltaluxSHp mutant restored flagellar gene transcription, and the number and length of flagella.

CONCLUSIONS: Our data show that as well as being a metabolic enzyme, H. pylori LuxS has an alternative role in regulation of motility by modulating flagellar transcripts and flagellar biosynthesis through production of the signalling molecule AI-2.

Relevance: 20.00%

Abstract:

This paper proposes an efficient learning mechanism to build fuzzy rule-based systems through the construction of sparse least-squares support vector machines (LS-SVMs). In addition to the significantly reduced computational complexity in model training, the resultant LS-SVM-based fuzzy system is sparser while offering satisfactory generalization capability over unseen data. It is well known that LS-SVMs have a computational advantage over conventional SVMs in the model training process; however, model sparseness is lost, which is the main drawback of LS-SVMs. This is an open problem for the LS-SVMs. To tackle the nonsparseness issue, a new regression alternative to the Lagrangian solution for the LS-SVM is first presented. A novel efficient learning mechanism is then proposed to extract a sparse set of support vectors for generating fuzzy IF-THEN rules. This mechanism works in a stepwise subset selection manner, including a forward expansion phase and a backward exclusion phase in each selection step. The implementation of the algorithm is computationally very efficient due to the introduction of a few key techniques that avoid matrix inverse operations and accelerate the training process. The computational efficiency is also confirmed by detailed computational complexity analysis. As a result, the proposed approach not only achieves sparseness of the resultant LS-SVM-based fuzzy systems but also significantly reduces the amount of computational effort in model training. Three experimental examples are presented to demonstrate the effectiveness and efficiency of the proposed learning mechanism and the sparseness of the obtained LS-SVM-based fuzzy systems, in comparison with other SVM-based learning techniques.
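The forward expansion phase can be illustrated with a simple greedy scheme: grow the support-vector set one kernel column at a time, picking the candidate most correlated with the current residual and refitting a ridge-regularised least squares model in the spirit of LS-SVM. This sketch omits the backward exclusion phase and the paper's matrix-inverse-avoiding techniques; the RBF kernel, the regularisation constant `gamma` and the sine-wave data are all assumptions for illustration.

```python
import numpy as np

def rbf(X, Z, ls=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=80)

gamma = 0.1                      # ridge-style regularisation, as in LS-SVM
K = rbf(X, X)
selected, residual = [], y.copy()
for _ in range(8):               # grow the support-vector set to 8 of 80
    # Pick the kernel column most correlated with the current residual.
    scores = np.abs(K.T @ residual)
    scores[selected] = -np.inf   # never re-select a support vector
    j = int(np.argmax(scores))
    selected.append(j)
    Ks = K[:, selected]
    # Regularised least squares on the selected columns only.
    beta = np.linalg.solve(Ks.T @ Ks + gamma * np.eye(len(selected)),
                           Ks.T @ y)
    residual = y - Ks @ beta
```

The fitted model uses only the 8 selected training points as kernel centres — each one would correspond to one fuzzy IF-THEN rule — instead of the full set of 80 that a standard LS-SVM solution retains.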