994 results for ridge regression


Relevance: 20.00%

Abstract:

Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. The prediction models for VM can be drawn from a large variety of linear and nonlinear regression methods, and the selection of a proper regression method for a specific VM problem is not straightforward, especially when the candidate predictor set is high-dimensional, correlated and noisy. Using process data from a benchmark semiconductor manufacturing process, this paper evaluates the performance of four typical regression methods for VM: multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), neural networks (NN) and Gaussian process regression (GPR). It is observed that GPR performs the best among the four methods and that, remarkably, the performance of linear regression approaches that of GPR as the subset of selected input variables is increased. The observed competitiveness of high-dimensional linear regression models, which does not hold true in general, is explained in the context of extreme learning machines and functional link neural networks.

Relevance: 20.00%

Abstract:

A forward and backward least angle regression (LAR) algorithm is proposed to construct the nonlinear autoregressive model with exogenous inputs (NARX) that is widely used to describe a large class of nonlinear dynamic systems. The main objective of this paper is to improve model sparsity and generalization performance of the original forward LAR algorithm. This is achieved by introducing a replacement scheme using an additional backward LAR stage. The backward stage replaces insignificant model terms selected by forward LAR with more significant ones, leading to an improved model in terms of compactness and performance. A numerical example that constructs four types of NARX models, namely polynomials, radial basis function (RBF) networks, neuro-fuzzy networks and wavelet networks, is presented to illustrate the effectiveness of the proposed technique in comparison with some popular methods.

Relevance: 20.00%

Abstract:

In many applications, and especially those where batch processes are involved, a target scalar output of interest is often dependent on one or more time series of data. With the exponential growth in data logging in modern industries, such time series are increasingly available for statistical modelling in soft sensing applications. In order to exploit time series data for predictive modelling, it is necessary to summarise the information they contain as a set of features to use as model regressors. Typically this is done in an unsupervised fashion using simple techniques such as computing statistical moments, principal components or wavelet decompositions, often leading to significant information loss and hence suboptimal predictive models. In this paper, a functional learning paradigm is exploited in a supervised fashion to derive continuous, smooth estimates of time series data (yielding aggregated local information), while simultaneously estimating a continuous shape function yielding optimal predictions. The proposed Supervised Aggregative Feature Extraction (SAFE) methodology can be extended to support nonlinear predictive models by embedding the functional learning framework in a Reproducing Kernel Hilbert Space setting. SAFE has a number of attractive features, including a closed-form solution and the ability to explicitly incorporate first- and second-order derivative information. Using simulation studies and a practical semiconductor manufacturing case study, we highlight the strengths of the new methodology with respect to standard unsupervised feature extraction approaches.

Relevance: 20.00%

Abstract:

Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
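As a rough illustration of the idea in this abstract, the sketch below regresses per-variant test statistics on LD Scores and reads off the intercept. It assumes the commonly cited relationship E[chi2_j] = N*h2*l_j/M + N*a + 1, where l_j is the LD Score of variant j, N the sample size, M the number of variants, h2 the heritability and a the contribution of confounding. The data, the values of N and M, and the helper name `ols_line` are hypothetical, and the fit is plain unweighted least squares rather than the weighted regression a real analysis would use.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical, noise-free toy data (not taken from the paper).
ld_scores = [10.0, 50.0, 100.0, 150.0, 200.0, 300.0]
true_intercept = 1.05   # values above 1 would indicate confounding
true_slope = 0.01       # equals N * h2 / M under the assumed model
chi2 = [true_intercept + true_slope * l for l in ld_scores]

intercept, slope = ols_line(ld_scores, chi2)

# Assumed sample size and variant count, used only to rescale the slope.
N, M = 50_000, 1_000_000
h2_estimate = slope * M / N

print(f"intercept = {intercept:.3f}, slope = {slope:.4f}, h2 = {h2_estimate:.2f}")
```

In this toy setting the slope carries the polygenic signal and the intercept carries the inflation attributable to bias, which is the separation the abstract describes.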

Relevance: 20.00%

Abstract:

Background: Around 10-15% of patients with locally advanced rectal cancer (LARC) undergo a pathologically complete response (TRG4) to neoadjuvant chemoradiotherapy; the remaining patients exhibit a spectrum of tumour regression (TRG1-3). Understanding therapy-related genomic alterations may help us to identify underlying biology or novel targets associated with response that could increase the efficacy of therapy in patients who do not benefit from the current standard of care.
Methods: 48 FFPE rectal cancer biopsies and matched resections were analysed using the WG-DASL HumanHT-12_v4 Beadchip array on the Illumina iScan. Bioinformatic analysis was conducted in Partek Genomics Suite and RStudio. The limma and glmnet packages were used to identify genes differentially expressed between tumour regression grades. Validation of microarray results will be carried out using IHC, RNAscope and RT-PCR.
Results: Immune response genes that may have predictive value were observed in the supervised analysis of the biopsies. Differential gene expression from the resections, as well as pre- and post-therapy analysis, revealed induction of genes in a tumour regression dependent manner. Pathway mapping and Gene Ontology analysis of these genes suggested antigen processing and natural killer cell-mediated cytotoxicity, respectively. The natural killer-like gene signature was switched off in non-responders and on in responders. IHC has confirmed the presence of natural killer cells through CD56+ staining.
Conclusion: Identification of NK cell genes and CD56+ cells in patients responding to neoadjuvant chemoradiotherapy warrants further investigation into their association with tumour regression grade in LARC. NK cells are known to lyse malignant cells and determining whether their presence is a cause or consequence of response is crucial. Interrogation of the cytokines upregulated in our NK-like signature will help guide future in vitro models.

Relevance: 20.00%

Abstract:

Geological, biological, morphological, and hydrochemical data are presented for the newly discovered Moytirra vent field at 45°N. This is the only high-temperature hydrothermal vent known between the Azores and Iceland in the North Atlantic. It is located on a slow to ultraslow-spreading mid-ocean ridge and is uniquely situated on the 300 m high fault scarp of the eastern axial wall, 3.5 km from the axial volcanic ridge crest. Furthermore, the Moytirra vent field is, unusually for tectonically controlled hydrothermal vent systems, basalt hosted and perched midway up the median valley wall, and is presumably heated by an off-axis magma chamber. The Moytirra vent field consists of an alignment of four sites of venting, three actively emitting "black smoke," producing a complex of chimneys and beehive diffusers. The largest chimney is 18 m tall and vigorously venting. The vent fauna described here are the only ones documented for the North Atlantic (Azores to Reykjanes Ridge) and significantly expand our knowledge of North Atlantic biodiversity. The surfaces of the vent chimneys are occupied by aggregations of gastropods (Peltospira sp.) and populations of alvinocaridid shrimp (Mirocaris sp. with Rimicaris sp. also present). Other fauna present include bythograeid crabs (Segonzacia sp.) and zoarcid fish (Pachycara sp.), but bathymodiolin mussels and actinostolid anemones were not observed in the vent field. The discovery of the Moytirra vent field therefore expands the known latitudinal distributions of several vent-endemic genera in the North Atlantic, and reveals faunal affinities with vents south of the Azores rather than north of Iceland. © 2013. American Geophysical Union. All Rights Reserved.

Relevance: 20.00%

Abstract:

Histone deacetylases (HDACs) are enzymes involved in transcriptional repression. We aimed to examine the significance of HDAC1 and HDAC2 gene expression in the prediction of recurrence and survival in 156 patients with hepatocellular carcinoma (HCC) among a South East Asian population who underwent curative surgical resection in Singapore. We found that HDAC1 and HDAC2 were upregulated in the majority of HCC tissues. The presence of HDAC1 in tumor tissues was correlated with poor tumor differentiation. Notably, HDAC1 expression in adjacent non-tumor hepatic tissues was correlated with the presence of satellite nodules and multiple lesions, suggesting that HDAC1 upregulation within the field of HCC may contribute to tumor spread. Using competing risk regression analysis, we found that increased cancer-specific mortality was significantly associated with HDAC2 expression. Mortality was also increased with high HDAC1 expression. In the liver cancer cell lines, HEP3B, HEPG2, PLC5, and a colorectal cancer cell line, HCT116, the combined knockdown of HDAC1 and HDAC2 increased cell death and reduced cell proliferation as well as colony formation. In contrast, knockdown of either HDAC1 or HDAC2 alone had minimal effects on cell death and proliferation. Taken together, our study suggests that both HDAC1 and HDAC2 exert pro-survival effects in HCC cells, and the combination of isoform-specific HDAC inhibitors against both HDACs may be effective in targeting HCC to reduce mortality.

Relevance: 20.00%

Abstract:

Master's dissertation, Water and Coastal Management, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2010

Relevance: 20.00%

Abstract:

Doctoral thesis, Geology (Metallogeny), Universidade de Lisboa, Faculdade de Ciências, 2014

Relevance: 20.00%

Abstract:

Airborne concentrations of Poaceae pollen have been monitored in Poznań for more than ten years and the length of the dataset is now considered sufficient for statistical analysis. The objective of this paper is to produce long-range forecasts that predict certain characteristics of the grass pollen season (such as the start, peak and end dates of the grass pollen season) as well as short-term forecasts that predict daily variations in grass pollen counts for the next day or next few days throughout the main grass pollen season. The method of forecasting was regression analysis. Correlation analysis was used to examine the relationship between grass pollen counts and the factors that affect its production, release and dispersal. The models were constructed with data from 1994-2004 and tested on data from 2005 and 2006. The forecast models predicted the start of the grass pollen season to within 2 days and achieved 61% and 70% accuracy on a scale of 1-4 when forecasting variations in daily grass pollen counts in 2005 and 2006 respectively. This study has emphasised how important the weather during the few weeks or months preceding pollination is to grass pollen production, and draws attention to the importance of considering large-scale patterns of climate variability (indices of the North Atlantic Oscillation) when constructing forecast models for allergenic pollen.

Relevance: 20.00%

Abstract:

The mesoscale (10^0–10^2 m) of river habitats has been identified as the scale that simultaneously offers insights into ecological structure and falls within the practical bounds of river management. Mesoscale habitat (mesohabitat) classifications for relatively large rivers, however, are underdeveloped compared with those produced for smaller streams. Approaches to habitat modelling have traditionally focused on individual species or proceeded on a species-by-species basis. This is particularly problematic in larger rivers where the effects of biological interactions are more complex and intense. Community-level approaches can rapidly model many species simultaneously, thereby integrating the effects of biological interactions while providing information on the relative importance of environmental variables in structuring the community. One such community-level approach, multivariate regression trees, was applied in order to determine the relative influences of abiotic factors on fish assemblages within shoreline mesohabitats of San Pedro River, Chile, and to define reference communities prior to the planned construction of a hydroelectric power plant. Flow depth, bank materials and the availability of riparian and instream cover, including woody debris, were the main variables driving differences between the assemblages. Species strongly indicative of distinctive mesohabitat types included the endemic Galaxias platei. Among other outcomes, the results provide information on the impact of non-native salmonids on river-dwelling Galaxias platei, suggesting a degree of habitat segregation between these taxa based on flow depth. The results support the use of the mesohabitat concept in large, relatively pristine river systems, and they represent a basis for assessing the impact of any future hydroelectric power plant construction and operation. By combining community classifications with simple sets of environmental rules, the multivariate regression trees produced can be used to predict the community structure of any mesohabitat along the reach.
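To make the multivariate regression tree idea concrete, the sketch below finds a single binary split on one environmental variable that minimises the summed within-group squared deviation across all species at once. This is only the core splitting step under a sum-of-squares impurity, assumed here for illustration. The variable names, the flow depths and the two-species abundance table are invented and do not come from the study.

```python
def multivariate_sse(rows):
    """Sum of squared deviations from each species' mean, over all species."""
    if not rows:
        return 0.0
    total = 0.0
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        mean = sum(column) / len(column)
        total += sum((value - mean) ** 2 for value in column)
    return total


def best_split(env, abundances):
    """Return (threshold, sse) of the binary split on env with least impurity."""
    order = sorted(range(len(env)), key=lambda i: env[i])
    best_threshold, best_sse = None, float("inf")
    for k in range(1, len(order)):
        low, high = env[order[k - 1]], env[order[k]]
        if low == high:
            continue  # no threshold can separate tied values
        left = [abundances[i] for i in order[:k]]
        right = [abundances[i] for i in order[k:]]
        sse = multivariate_sse(left) + multivariate_sse(right)
        if sse < best_sse:
            best_threshold, best_sse = (low + high) / 2, sse
    return best_threshold, best_sse


# Hypothetical sites: flow depth in metres, and counts of two species per site.
depth = [0.2, 0.3, 0.4, 1.1, 1.3, 1.5]
abund = [[9, 1], [8, 0], [10, 2], [1, 7], [0, 9], [2, 8]]

threshold, sse = best_split(depth, abund)
print(threshold, sse)
```

A full tree would apply the same search recursively over every candidate environmental variable, and the resulting thresholds are the kind of simple environmental rules the abstract refers to.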

Relevance: 20.00%

Abstract:

Long-term contractual decisions are the basis of efficient risk management. However, those types of decisions have to be supported by a robust price forecast methodology. This paper reports a different approach to long-term price forecasting that tries to answer that need. Making use of regression models, the proposed methodology has as its main objective to find the maximum and minimum Market Clearing Price (MCP) for a specific programming period, with a desired confidence level α. Due to the complexity of the problem, the meta-heuristic Particle Swarm Optimization (PSO) was used to find the best regression parameters, and the results were compared with those obtained using a Genetic Algorithm (GA). To validate these models, results from realistic data are presented and discussed in detail.
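The abstract does not specify the regression model or the objective used, so the following is only a generic sketch of how a particle swarm can search for regression parameters: it fits a straight line by minimising the sum of squared errors. The function names, the swarm settings (inertia 0.7, acceleration coefficients 1.5) and the toy data are all assumptions made for illustration, not the authors' configuration.

```python
import random


def sse(params, xs, ys):
    """Sum of squared errors of the line y = a + b * x."""
    a, b = params
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys))


def pso(objective, dim, n_particles=30, n_iters=200, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5, span=10.0):
    """Minimise objective over dim real parameters with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-span, span) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Hypothetical noise-free data generated from y = 2.0 + 0.5 * x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 + 0.5 * x for x in xs]

params, err = pso(lambda p: sse(p, xs, ys), dim=2)
print(params, err)
```

For a linear least-squares objective like this one a closed-form solution exists, so a swarm search is only worthwhile when the objective is non-smooth or constrained, which may be what the abstract means by problem complexity.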