981 results for rate prediction


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. Methods with minimal user intervention are required to perform VM in a real-time industrial process. In this paper, we propose extreme learning machines (ELMs) as a competitive alternative to popular methods such as lasso and ridge regression for developing VM models. In addition, we propose a new way to choose the hidden layer weights of ELMs that improves their prediction performance.
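
The basic ELM idea mentioned above can be illustrated with a minimal sketch: the hidden-layer weights are drawn at random and left untrained, and only the output weights are fitted by regularised least squares. The function names, the tanh activation, the ridge penalty and the synthetic data are all assumptions made for illustration; the abstract's improved scheme for choosing hidden weights is not described there and is not reproduced here.

```python
import numpy as np


def fit_elm(X, y, n_hidden=50, ridge=1e-3, seed=0):
    """Fit a single-hidden-layer ELM (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Output weights solve a ridge-regularised least-squares problem.
    A = H.T @ H + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ y)
    return W, b, beta


def predict_elm(model, X):
    """Apply the fixed random hidden layer, then the fitted output weights."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta


# Synthetic stand-in for sensor data: 200 samples, 5 features (assumed).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
model = fit_elm(X[:150], y[:150])
rmse = float(np.sqrt(np.mean((predict_elm(model, X[150:]) - y[150:]) ** 2)))
```

Because training reduces to one linear solve, such a model needs little tuning beyond the hidden-layer size and the penalty, which fits the stated requirement of minimal user intervention.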

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes how modern machine learning techniques can be used in conjunction with statistical methods to forecast short-term movements in exchange rates, producing models suitable for use in trading. It compares the results achieved by two different techniques and shows how they can be used in a complementary fashion. The paper draws on experience of both inter- and intra-day forecasting from earlier studies conducted by Logica and on the experience of the Chemical Bank Quantitative Research and Trading (QRT) group in developing trading models.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Background: The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the three-dimensional protein structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and could give deep insights into protein sequence-structure relationships.

Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance: local sequence in the form of PSI-BLAST profiles; local sequence plus amino acid composition; local sequence plus molecular weight; local sequence plus secondary structure predicted by PSIPRED; local sequence plus molecular weight and amino acid composition; local sequence plus molecular weight and predicted secondary structure; and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55 and a root mean square error (RMSE) of 0.82, based on a well-defined dataset of 680 protein sequences. Moreover, incorporating global features such as molecular weight and amino acid composition further improved the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, adding the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, which is at least comparable to the other existing methods.

Conclusion: The SVR method shows a prediction performance competitive with, or at least comparable to, the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the view that support vector regression is a powerful tool for extracting the protein sequence-structure relationship and for estimating protein structural profiles from amino acid sequences.
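
The two evaluation measures quoted in this abstract, the Pearson correlation coefficient (CC) and the root mean square error (RMSE), can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions and are not data from the study.

```python
import math


def pearson_cc(pred, obs):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(pred)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    return cov / (spread_p * spread_o)


def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


# Made-up per-residue values, for illustration only.
observed = [0.2, -0.5, 1.1, 0.7, -1.3]
predicted = [0.1, -0.2, 0.8, 0.9, -0.9]
cc = pearson_cc(predicted, observed)
err = rmse(predicted, observed)
```

CC measures how well the predicted profile tracks the shape of the observed one, while RMSE measures the absolute size of the errors, so the two are usually reported together.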

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Increasingly, semiconductor manufacturers are exploring opportunities for virtual metrology (VM) enabled process monitoring and control as a means of reducing non-value-added metrology and achieving ever more demanding wafer fabrication tolerances. However, developing robust, reliable and interpretable VM models can be very challenging due to the highly correlated input space often associated with the underpinning data sets. A particularly pertinent example is etch rate prediction of plasma etch processes from multichannel optical emission spectroscopy data. This paper proposes a novel input-clustering based forward stepwise regression methodology for VM model building in such highly correlated input spaces. Max Separation Clustering (MSC) is employed as a pre-processing step to identify a reduced set of well-conditioned, representative variables that can then be used as inputs to state-of-the-art model building techniques such as Forward Selection Regression (FSR), Ridge regression, LASSO and Forward Selection Ridge Regression (FSRR). The methodology is validated on a benchmark semiconductor plasma etch dataset and the results obtained are compared with those achieved when the state-of-the-art approaches are applied directly to the data without the MSC pre-processing step. Significant performance improvements are observed when MSC is combined with FSR (13%) and FSRR (8.5%), but not with Ridge Regression (-1%) or LASSO (-32%). The optimal VM results are obtained using the MSC-FSR and MSC-FSRR generated models. © 2012 IEEE.
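
As a rough illustration of the forward selection step named above, the sketch below greedily adds, at each round, the input variable that most reduces the residual sum of squares of an ordinary least-squares fit. The function name and the synthetic data are assumptions; the MSC clustering step is not specified in the abstract and is not implemented here.

```python
import numpy as np


def forward_selection(X, y, max_features):
    """Greedy forward stepwise regression (illustrative sketch only)."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(min(max_features, X.shape[1])):
        best_rss, best_j = None, None
        for j in remaining:
            # Fit an intercept plus the already selected columns and candidate j.
            A = np.column_stack([np.ones(len(y)), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if best_rss is None or rss < best_rss:
                best_rss, best_j = rss, j
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Synthetic example: only columns 2 and 4 drive the response (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = 3.0 * X[:, 2] - 2.0 * X[:, 4] + 0.01 * rng.normal(size=100)
chosen = forward_selection(X, y, max_features=2)
```

With strongly correlated inputs, several candidates can reduce the residual almost equally, so the greedy choice becomes unstable; this is the situation that a pre-processing step yielding well-conditioned representative variables is meant to address.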

Relevance:

60.00% 60.00%

Publisher:

Abstract:

For years, choosing the right career by monitoring the trends and scope of different career paths has been a necessity for young people all over the world. In this paper, we provide a scientific, data-mining-based method for predicting the job absorption rate and the waiting time needed for 100% placement for different engineering courses in India. This will greatly help students in India choose the right discipline for a bright future. Information about graduated students was obtained from the NTMIS (National Technical Manpower Information System) nodal centre in Kochi, India, located at Cochin University of Science and Technology.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Within the regression framework, we show how different levels of nonlinearity influence the instantaneous firing rate prediction of single neurons. Nonlinearity can be achieved in several ways. In particular, we can enrich the predictor set with basis expansions of the input variables (enlarging the number of inputs) or train a simple but different model for each area of the data domain. Spline-based models are popular within the first category. Kernel smoothing methods fall into the second category. Whereas the first choice is useful for globally characterizing complex functions, the second is very handy for temporal data and is able to include inner-state subject variations. Also, interactions among stimuli are considered. We compare state-of-the-art firing rate prediction methods with some more sophisticated spline-based nonlinear methods: multivariate adaptive regression splines and sparse additive models. We also study the impact of kernel smoothing. Finally, we explore the combination of various local models in an incremental learning procedure. Our goal is to demonstrate that appropriate nonlinearity treatment can greatly improve the results. We test our hypothesis on both synthetic data and real neuronal recordings in cat primary visual cortex, giving a plausible explanation of the results from a biological perspective.
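
The second category described above, kernel smoothing, can be sketched with a Nadaraya-Watson estimator: the prediction at a query point is a locally weighted average of the observed responses. The Gaussian kernel, the function name and the sample values are assumptions for illustration and are not taken from the study.

```python
import math


def kernel_smooth(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate: Gaussian-weighted local average of y_train."""
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_train]
    total = sum(weights)
    # total can underflow to zero if x_query is far from every training point.
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Made-up time stamps (s) and firing rates (spikes/s), for illustration only.
times = [0.0, 0.1, 0.2, 0.3, 0.4]
rates = [5.0, 7.0, 12.0, 9.0, 6.0]
estimate = kernel_smooth(times, rates, x_query=0.2, bandwidth=0.05)
```

A small bandwidth makes the fit follow local variations closely, while a large one approaches the global mean, so the bandwidth plays the role of the nonlinearity level discussed in the abstract.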

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Short-range atmospheric dispersion of ammonia (NH3) emitted by agricultural sources and its subsequent deposition to soil and vegetation can lead to the degradation of sensitive ecosystems and acidification of the soil. Atmospheric concentrations and dry deposition rates of NH3 are generally highest near the emission source and so environmental impacts to sensitive ecosystems are often largest at these locations. Under European legislation, several member states use short-range atmospheric dispersion models to estimate the impact of ammonia emissions on nearby designated nature conservation sites. A recent review of assessment methods for short-range impacts of NH3 recommended an intercomparison of the different models to identify whether there are notable differences between the assessment approaches used in different European countries. Based on this recommendation, this thesis compares and evaluates the atmospheric concentration predictions of several models used in these impact assessments for various real and hypothetical scenarios, including Mediterranean meteorological conditions. In addition, various inverse dispersion modelling techniques for the estimation of NH3 emission rates are compared and evaluated, and a simple screening model was developed to calculate the NH3 concentration and dry deposition rate at a sensitive ecosystem located close to an NH3 source.
The model intercomparison evaluated four atmospheric dispersion models (ADMS 4.1; AERMOD v07026; OPS-st v3.0.3 and LADD v2010) for a range of hypothetical case studies representing the atmospheric dispersion from several agricultural NH3 source types. The best agreement between the mean annual concentration predictions of the models was found for simple scenarios with area and volume sources. The agreement between the predictions of the models was worst for the scenario representing the dispersion from a mechanically ventilated livestock house, for which ADMS predicted significantly smaller concentrations than the other models. The reason for these differences appears to be due to the interaction of different plume-rise and boundary layer parameterisations. All four dispersion models were applied to two real case studies of dispersion of NH3 from pig farms in Falster (Denmark) and North Carolina (USA). The mean annual concentration predictions of the models were similar for the USA case study (emissions from naturally ventilated pig houses and a slurry lagoon). The comparison of model predictions with mean annual measured concentrations and the application of established statistical model acceptability criteria concluded that all four models performed acceptably for this case study. This was not the case for the Danish case study (mechanically ventilated pig house) for which the LADD model did not perform acceptably due to the lack of plume-rise processes in the model. Regulatory dispersion models often perform poorly in low wind speed conditions due to the model dispersion theory being inapplicable at low wind speeds. For situations with frequent low wind speed periods, current modelling guidance for regulatory assessments is to use a model that can handle these conditions in an acceptable way. 
This may not always be possible due to insufficient meteorological data and so the only option may be to carry out the assessment using a more common regulatory model, such as the advanced Gaussian models ADMS or AERMOD. In order to assess the suitability of these models for low wind conditions, they were applied to a Mediterranean case study that included many periods of low wind speed. The case study was the dispersion of NH3 emitted by a pig farm in Segovia, Central Spain, for which mean monthly atmospheric NH3 concentration measurements were made at 21 locations surrounding the farm as well as high-temporal-resolution concentration measurements at one location during a one-week campaign. Two strategies to improve the model performance for low wind speed conditions were tested. These were ‘no zero wind’ (NZW), which replaced calm periods with the minimum threshold wind speed of the model and ‘accumulated calm emissions’ (ACE), which forced the model to emit the total emissions during a calm period during the first subsequent non-calm hour. Due to large uncertainties in the model input data (NH3 emission rates, source exit velocities, boundary layer parameters), the case study was also used to assess model prediction uncertainty and assess how this uncertainty can be taken into account in model evaluations. A dynamic emission model modified for the Mediterranean climate was used to estimate the temporal variability in NH3 emission rates and a comparison was made between the simulations using the dynamic emissions and a constant emission rate. Prediction uncertainty due to model input uncertainty was 67-98% of the mean value for ADMS and between 53-83% of the mean value for AERMOD. Most of this uncertainty was due to source emission rate uncertainty (~50%), followed by uncertainty in the meteorological conditions (~10-20%) and uncertainty in exit velocities (~5-10%). 
AERMOD predicted higher concentrations than ADMS and more of the simulations met the model acceptability criteria when compared with the annual mean measured concentrations. However, the ADMS predictions were better correlated spatially with the measurements. The use of dynamic emission estimates improved the performance of ADMS but worsened the performance of AERMOD and the application of strategies to improve model performance had similar contradictory effects. In order to compare different inverse modelling techniques, several models (ADMS, LADD and WindTrax) were applied to a non-agricultural case study of a penguin colony in Antarctica. This case study was used since it gave the opportunity to provide the first experimentally-derived emission factor for an Antarctic penguin colony and also had the advantage of negligible background concentrations. There was sufficient agreement between the emission estimates obtained from the three models to define an emission factor for the penguin colony (1.23 g NH3 per breeding pair per day with an uncertainty range of 0.8-2.54 g NH3 per breeding pair per day). This emission estimate compared favourably to the value obtained using a simple micrometeorological technique (aerodynamic gradient) of 0.98 g ammonia per breeding pair per day (95% confidence interval: 0.2-2.4 g ammonia per breeding pair per day). Further application of the inverse modelling techniques for a range of agricultural case studies also demonstrated good agreement between the emission estimates. It is concluded, therefore, that inverse dispersion modelling is a robust technique for estimating NH3 emission rates. Screening models that can provide a quick and approximate estimate of environmental impacts are a useful tool for impact assessments because they can be used to filter out cases that potentially have a minimal environmental impact allowing resources to be focussed on more potentially damaging cases.
The Simple Calculation of Ammonia Impact Limits (SCAIL) model was developed as a screening model to provide an estimate of the mean NH3 concentration and dry deposition rate downwind of an agricultural source. This screening tool, based on the LADD model, was evaluated and calibrated with several experimental datasets and then validated using independent concentration measurements made near sources. Overall SCAIL performed acceptably according to established statistical criteria. This work has identified situations where the concentration predictions of dispersion models are similar and other situations where the predictions are significantly different. Some models are simply not designed to simulate certain scenarios since they do not include the relevant processes or are beyond the limits of their applicability. An example is the LADD model that is not applicable to sources with significant exit velocity since the model does not include a plume-rise parameterisation. The testing of a simple scheme combining a momentum-driven plume rise and increased turbulence at the source improved model performance, but more testing is required. Even models that are applicable and include the relevant process do not always give similar predictions and the reasons for this need to be investigated. AERMOD for example predicts higher concentrations than ADMS for dispersion from mechanically ventilated livestock housing. There is evidence to suggest that ADMS underestimates concentrations in these situations due to a high wind speed threshold. Conversely, there is also evidence that AERMOD overestimates concentrations in these situations due to overestimation at low wind speeds. However, a simple modification to the meteorological pre-processor appears to improve the performance of the model. It is important that these differences between the predictions of these models are taken into account in regulatory assessments. 
This can be done by applying the most suitable model for the assessment in question or, better still, using multiple or hybrid models.
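
The two low-wind strategies described in this abstract, "no zero wind" (NZW) and "accumulated calm emissions" (ACE), can be sketched as simple pre-processing of hourly model inputs. The function names, the threshold value and the sample series are assumptions for illustration; how ADMS or AERMOD actually ingest such inputs is not covered here.

```python
def no_zero_wind(wind_speeds, threshold):
    """NZW: replace calm hours (speed below threshold) with the threshold speed."""
    return [max(u, threshold) for u in wind_speeds]


def accumulated_calm_emissions(wind_speeds, emissions, threshold):
    """ACE: hold back emissions from calm hours and release them in the
    first subsequent non-calm hour."""
    released, stored = [], 0.0
    for u, e in zip(wind_speeds, emissions):
        if u < threshold:
            stored += e            # calm hour: nothing is emitted yet
            released.append(0.0)
        else:
            released.append(e + stored)
            stored = 0.0
    # Emissions from a calm period at the very end of the series stay stored.
    return released


# Hypothetical hourly wind speeds (m/s) and emission rates (arbitrary units).
wind = [2.0, 0.3, 0.1, 1.5, 0.2]
emis = [1.0, 1.0, 1.0, 1.0, 1.0]
nzw_wind = no_zero_wind(wind, threshold=0.75)
ace_emis = accumulated_calm_emissions(wind, emis, threshold=0.75)
```

In this example the two calm hours after the first hour are emitted together with the fourth hour, while the final calm hour has no following non-calm hour and is therefore not released.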

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Achievement of steady state during indirect calorimetry measurements of resting energy expenditure (REE) is necessary to reduce error and ensure accuracy in the measurement. Steady state is often defined as 5 consecutive min (5-min SS) during which oxygen consumption and carbon dioxide production vary by +/-10%. These criteria, however, are stringent and often difficult to satisfy. This study aimed to assess whether reducing the time period for steady state (4-min SS or 3-min SS) produced measurements of REE that were significantly different from 5-min SS. REE was measured with the use of open-circuit indirect calorimetry in 39 subjects, of whom only 21 (54%) met the 5-min SS criteria. In these 21 subjects, median biases in REE between 5-min SS and 4-min SS and between 5-min SS and 3-min SS were 0.1 and 0.01%, respectively. For individuals, 4-min SS measured REE within a clinically acceptable range of +/-2% of 5-min SS, whereas 3-min SS measured REE within a range of -2% to +3% of 5-min SS. Harris-Benedict prediction equations estimated REE for individuals within +/-20-30% of 5-min SS. Reducing the time period of steady state to 4 min produced measurements of REE for individuals that were within clinically acceptable, predetermined limits. The limits of agreement for 3-min SS fell outside the predefined limits of +/-2%; however, both 4-min SS and 3-min SS criteria greatly increased the proportion of subjects who satisfied steady state within smaller limits than would be achieved if relying on prediction equations.
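
The steady-state criterion described above can be sketched as a sliding-window check over minute-by-minute readings. The abstract does not say how "vary by +/-10%" is computed, so this sketch assumes that every value in the window must lie within 10% of the window mean; the function name and the sample readings are also assumptions.

```python
def first_steady_state(vo2, vco2, window=5, tolerance=0.10):
    """Return the start index of the first window in which both signals stay
    within +/- tolerance of their window mean, or None if no window qualifies."""
    def stable(values):
        mean = sum(values) / len(values)
        return all(abs(v - mean) <= tolerance * mean for v in values)

    for start in range(len(vo2) - window + 1):
        if stable(vo2[start:start + window]) and stable(vco2[start:start + window]):
            return start
    return None


# Made-up minute-by-minute readings (mL/min), for illustration only.
vo2 = [300, 250, 252, 248, 310, 250]
vco2 = [260, 200, 202, 199, 255, 201]
five_min = first_steady_state(vo2, vco2, window=5)
three_min = first_steady_state(vo2, vco2, window=3)
```

In this example no 5-min window qualifies, but a 3-min window starting at the second reading does, which mirrors the abstract's point that shorter windows let more subjects satisfy steady state.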

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Online model order complexity estimation remains one of the key problems in neural network research. The problem is further exacerbated in situations where the underlying system generator is non-stationary. In this paper, we introduce a novelty criterion for resource allocating networks (RANs) which is capable of being applied to both stationary and slowly varying non-stationary problems. The deficiencies of existing novelty criteria are discussed and the relative performances are demonstrated on two real-world problems: electricity load forecasting and exchange rate prediction.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Converge-cast in wireless sensor networks (WSNs) is widely applied in fields such as medical applications and environmental monitoring. WSNs are expected not only to provide high-throughput routing but also to save energy efficiently. Network coding is one of the most promising techniques for reducing energy consumption. By maximizing the encoding number, the message capacity per packet can be extended to its most efficient level, and many researchers have therefore focused on this area. Nevertheless, the packets sent by the outer nodes need to be temporarily stored and delayed in order to maximize the encoding number. To balance the inserted delay against the encoding number, this paper proposes a Converge-cast Scheme based on data collection Rate Prediction (CSRP). To avoid producing outdated information, a prediction method based on the Modifying Index Curve Model is presented to handle the dynamic data collection rate of every sensor in the WSN. Furthermore, novel coding conditions based on CDS are proposed to increase coding opportunities and to solve collision problems. The corresponding analysis and experimental results indicate that CSRP is more feasible and efficient than operation without prediction.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Most standard algorithms for prediction with expert advice depend on a parameter called the learning rate. This learning rate needs to be large enough to fit the data well, but small enough to prevent overfitting. For the exponential weights algorithm, a sequence of prior work has established theoretical guarantees for higher and higher data-dependent tunings of the learning rate, which allow for increasingly aggressive learning. But in practice such theoretical tunings often still perform worse (as measured by their regret) than ad hoc tuning with an even higher learning rate. To close the gap between theory and practice, we introduce an approach to learn the learning rate. Up to a factor that is at most (poly)logarithmic in the number of experts and the inverse of the learning rate, our method performs as well as if we knew the empirically best learning rate from a large range that includes both conservative small values and values that are much higher than those for which formal guarantees were previously available. Our method employs a grid of learning rates, yet runs in linear time regardless of the size of the grid.
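
To make the role of the learning rate concrete, the sketch below runs the standard exponential weights algorithm for a fixed learning rate and compares the regret obtained for several values on a grid. This is a brute-force comparison in hindsight, shown only for illustration; it is not the method proposed in the abstract, which is stated to run in linear time regardless of the grid size. The function names and the loss sequence are assumptions.

```python
import math


def exponential_weights(losses, eta):
    """Run exponential weights and return the learner's total expected loss.

    losses[t][k] is the loss of expert k in round t; eta is the learning rate.
    """
    cumulative = [0.0] * len(losses[0])
    total = 0.0
    for round_losses in losses:
        # Weights are proportional to exp(-eta * cumulative loss); subtracting
        # the minimum keeps the exponentials numerically stable.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        norm = sum(weights)
        total += sum(w / norm * l for w, l in zip(weights, round_losses))
        cumulative = [c + l for c, l in zip(cumulative, round_losses)]
    return total


def regret(losses, eta):
    """Learner's total loss minus the total loss of the best single expert."""
    best = min(sum(column) for column in zip(*losses))
    return exponential_weights(losses, eta) - best


# Hypothetical losses: expert 0 is always right, expert 1 is always wrong.
losses = [[0.0, 1.0]] * 20
grid = [0.01, 0.1, 1.0, 10.0]
regrets = {eta: regret(losses, eta) for eta in grid}
best_eta = min(regrets, key=regrets.get)
```

On this easy sequence the largest learning rate on the grid gives the smallest regret, which illustrates why aggressive tunings can outperform conservative ones in practice; on adversarial sequences a large learning rate can instead hurt.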