991 results for least absolute deviation


Relevance: 80.00%

Abstract:

The objective of this research was to use non-linear models to describe the growth pattern of Santa Ines sheep and, with the best-fitting model, to study the influence of environmental effects on the curve parameters. The models considered were the Brody, Richards, Von Bertalanffy, Gompertz, and Logistic models. We used 773 field reports on 162 animals (46 males and 116 females) ranging in age from 120 to 774 days. The statistics used to evaluate the quality of fit were RMS (residual mean square), C% (percentage of convergence), R² (adjusted coefficient of determination), and MAD (mean absolute deviation). Of the fixed effects studied, the only significant relationship was the effect of sex on parameter A. The Richards model was problematic during convergence. Considering all the criteria studied, the Logistic model gave the best fit in describing the growth pattern of Santa Ines sheep. © 2011 Elsevier B.V. All rights reserved.
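The five growth curves named above are often written in the forms sketched below. This is an illustration only: the abstract does not give the equations, so the parameterizations here are common textbook ones and may differ from those the authors fitted. The parameter names `A` (asymptotic weight), `B`, `k`, and `m` are assumptions, as are the numbers in the usage line.

```python
import math

# Common textbook parameterizations (assumed, not taken from the abstract).
# A: asymptotic weight, B: integration constant, k: maturing rate, m: shape.

def brody(t, A, B, k):
    return A * (1 - B * math.exp(-k * t))

def von_bertalanffy(t, A, B, k):
    return A * (1 - B * math.exp(-k * t)) ** 3

def gompertz(t, A, B, k):
    return A * math.exp(-B * math.exp(-k * t))

def logistic(t, A, B, k):
    return A / (1 + B * math.exp(-k * t))

def richards(t, A, B, k, m):
    return A * (1 - B * math.exp(-k * t)) ** m

# Hypothetical parameter values, for illustration only.
weight_at_one_year = logistic(365, A=50.0, B=4.0, k=0.01)
```

In each form the curve approaches `A` as age increases, which is why an effect of sex on parameter A can be read as a difference in mature weight.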

Relevance: 80.00%

Abstract:

This work examines the single-machine scheduling problem with a common due date and non-identical ready times for the jobs. The objective is to minimize the weighted sum of the jobs' earliness and tardiness penalties. Since this problem is NP-hard, we investigate constructive heuristics that exploit specific characteristics of the problem to improve their performance. The proposed approaches are evaluated in a comparative computational study on a set of 280 benchmark test problems with up to 1000 jobs.
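The objective described above can be evaluated for a given job order as in the sketch below. It assumes each job starts as soon as both the machine and the job are ready (no deliberately inserted idle time), which the abstract does not specify. The function name, the tuple layout, and the job data are hypothetical.

```python
def weighted_earliness_tardiness(sequence, due_date):
    """Cost of processing jobs in the given order on one machine.

    Each job is (ready_time, duration, earliness_weight, tardiness_weight).
    Assumption: a job starts as soon as the machine and the job are ready.
    """
    clock = 0.0
    total = 0.0
    for ready, duration, early_weight, tardy_weight in sequence:
        start = max(clock, ready)      # wait for the job's ready time
        clock = start + duration       # completion time of this job
        total += early_weight * max(0.0, due_date - clock)
        total += tardy_weight * max(0.0, clock - due_date)
    return total

# Made-up jobs sharing a common due date of 10.
jobs = [(0, 4, 1, 2), (2, 5, 1, 3), (0, 3, 2, 1)]
cost = weighted_earliness_tardiness(jobs, due_date=10)
```

A constructive heuristic would build the sequence job by job and could use a function like this to compare candidate orders.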

Relevance: 80.00%

Abstract:

Boiling points ($T_B$) of acyclic alkynes are predicted from their boiling point numbers ($Y_{BP}$) with the relationship $T_B(\mathrm{K}) = -16.802\,Y_{BP}^{2/3} + 337.377\,Y_{BP}^{1/3} - 437.883$. In turn, $Y_{BP}$ values are calculated from structure using the equation $Y_{BP} = 1.726 + A_i + 2.779C + 1.716M_3 + 1.564M + 4.204E_3 + 3.905E + 5.007P - 0.329D + 0.241G + 0.479V + 0.967T + 0.574S$. Here $A_i$ depends on the substitution pattern of the alkyne, and the remainder of the equation is the same as that reported earlier for alkanes. For a data set of 76 acyclic alkynes, the correlation of predicted and literature $T_B$ values had an average absolute deviation of 1.46 K, and the $R^2$ of the correlation was 0.999. In addition, the calculated $Y_{BP}$ values can be used to predict the flash points of alkynes.
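As a small worked example, the first relationship above can be evaluated directly. The sketch uses only the coefficients quoted in the abstract. The function name is hypothetical, and the input value 27 is an arbitrary number chosen because its cube root is exact; it is not the boiling point number of any particular alkyne.

```python
def boiling_point_kelvin(y_bp):
    """T_B in kelvin from a boiling point number Y_BP (coefficients from the abstract)."""
    return -16.802 * y_bp ** (2 / 3) + 337.377 * y_bp ** (1 / 3) - 437.883

# Arbitrary illustrative input: 27 ** (1/3) is 3 and 27 ** (2/3) is 9,
# so T_B = -16.802 * 9 + 337.377 * 3 - 437.883, about 423.03 K.
example_tb = boiling_point_kelvin(27.0)
```

Computing $Y_{BP}$ itself requires the structural counts ($C$, $M$, $E$, and so on), whose definitions are not given in the abstract, so that step is not sketched here.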

Relevance: 80.00%

Abstract:

Most studies that measure plant transpiration, especially in woody fruit species, rely on methods that supply heat to the trunk. This study aimed to calibrate the Thermal Dissipation Probe (TDP) method for estimating transpiration, to study the effects of natural thermal gradients, and to determine the relationship between outside stem diameter and xylem area in young 'Valencia' orange plants. TDPs were installed in 40 orange plants, 15 months old, grown in 500 L boxes in a greenhouse. A correction for natural thermal differences (DTN), estimated from two unheated probes, was tested. The area of the conductive section was related to the outside diameter of the stem by polynomial regression. The equation for estimating sap flow was calibrated against lysimeter measurements of a representative plant, taken as the standard. The angular coefficient (slope) of the sap-flow equation was adjusted by minimizing the absolute deviation between the estimated sap flow and the daily transpiration measured by the lysimeter. Based on these results, it was concluded that the TDP method, with the original calibration adjusted and the DTN correction applied, was effective for assessing transpiration.
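Adjusting a slope by minimizing absolute deviation can be sketched as follows. The abstract does not state the form of the sap-flow equation, so this assumes the simplest case, a straight line through the origin, y = k·x. In that case the slope that minimizes the sum of |y − k·x| is a weighted median of the ratios y/x with weights |x|. The function name and the data are hypothetical.

```python
def lad_slope(x, y):
    """Slope k minimizing sum(|y_i - k * x_i|) for a line through the origin.

    Uses the weighted-median characterization: each ratio y_i / x_i
    carries weight |x_i|, and the minimizer is the ratio at which the
    cumulative weight first reaches half of the total weight.
    """
    pairs = sorted((yi / xi, abs(xi)) for xi, yi in zip(x, y) if xi != 0)
    if not pairs:
        raise ValueError("need at least one nonzero x value")
    half = sum(weight for _, weight in pairs) / 2.0
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return ratio

# Made-up data: three points on y = 2x plus one gross outlier.
slope = lad_slope([1, 2, 3, 4], [2, 4, 6, 100])
```

Unlike a least-squares fit, the estimate here stays at the slope shared by the three consistent points and is not pulled toward the outlier.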

Relevance: 80.00%

Abstract:

The G3, CBS-QB3, and CBS-APNO methods have been used to calculate ΔH and ΔG values for seventeen gas-phase deprotonation reactions for which the experimental values are reported to be accurate to within one kcal/mol. For these reactions, the mean absolute deviation of the three methods from experiment is 0.84 to 1.26 kcal/mol, and the root-mean-square deviations for ΔG and ΔH are 1.43 and 1.49 kcal/mol for the CBS-QB3 method, 1.06 and 1.14 kcal/mol for the CBS-APNO method, and 1.16 and 1.28 kcal/mol for the G3 method. The high accuracy of these methods makes them reliable for calculating gas-phase deprotonation reactions and allows them to serve as a valuable check on the accuracy of experimental data reported in the National Institute of Standards and Technology database.
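The two error statistics quoted above can be computed as in this minimal sketch. The function names are hypothetical, and the numbers in the usage lines are made up for illustration; they are not values from the study.

```python
import math

def mean_absolute_deviation(calculated, experimental):
    """Average of |calculated - experimental| over all reactions."""
    errors = [abs(c - e) for c, e in zip(calculated, experimental)]
    return sum(errors) / len(errors)

def root_mean_square_deviation(calculated, experimental):
    """Square root of the mean squared error; penalizes large errors more."""
    squared = [(c - e) ** 2 for c, e in zip(calculated, experimental)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up values in kcal/mol, for illustration only.
calc = [1.0, 2.0, 3.0]
expt = [1.0, 3.0, 5.0]
mad = mean_absolute_deviation(calc, expt)
rmsd = root_mean_square_deviation(calc, expt)
```

The root-mean-square deviation is never smaller than the mean absolute deviation on the same data, which is consistent with the ordering of the figures reported in the abstract.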

Relevance: 80.00%

Abstract:

BACKGROUND: Short-acting agents for neuromuscular block (NMB) require frequent dosing adjustments to meet individual patients' needs. In this study, we verified a new closed-loop controller for mivacurium dosing in clinical trials. METHODS: Fifteen patients were studied. T1% measured with electromyography was used as the input signal for the model-based controller. After induction of propofol/opiate anaesthesia, we waited for the baseline electromyography signal to stabilize and then administered a bolus of 0.3 mg kg⁻¹ mivacurium to facilitate endotracheal intubation. Closed-loop infusion was started thereafter, targeting a neuromuscular block of 90%. Setpoint deviation, the number of manual interventions, and surgeons' complaints were recorded. Drug use and its variability between and within patients were evaluated. RESULTS: Four patients had to be excluded because of sensor problems. Median time of closed-loop control for the 11 patients included in the data processing was 135 [89-336] min (median [range]). Mean absolute deviation from setpoint was 1.8 ± 0.9 T1%. Neither manual interventions nor complaints from the surgeons were recorded. The mean mivacurium infusion rate required was 7.0 ± 2.2 µg kg⁻¹ min⁻¹. Mean infusion rates over 30-min intervals varied widely within patients, differing by up to a factor of 1.8 between the highest and lowest requirement in the same patient. CONCLUSIONS: Neuromuscular block can be controlled precisely with mivacurium using our model-based controller. The amount of mivacurium needed to maintain T1% at defined constant levels differed largely between and within patients. Closed-loop control therefore seems advantageous for automatically maintaining neuromuscular block at constant levels.

Relevance: 80.00%

Abstract:

Since 2010, the client base of online-trading service providers has grown significantly. Such companies enable small investors to access the stock market at advantageous rates. Because small investors buy and sell stocks in moderate amounts, they should consider fixed transaction costs, integral transaction units, and dividends when selecting their portfolio. In this paper, we consider the small investor’s problem of investing capital in stocks in a way that maximizes the expected portfolio return and guarantees that the portfolio risk does not exceed a prescribed risk level. Portfolio-optimization models known from the literature are in general designed for institutional investors and do not consider the specific constraints of small investors. We therefore extend four well-known portfolio-optimization models to make them applicable for small investors. We consider one nonlinear model that uses variance as a risk measure and three linear models that use the mean absolute deviation from the portfolio return, the maximum loss, and the conditional value-at-risk as risk measures. We extend all models to consider piecewise-constant transaction costs, integral transaction units, and dividends. In an out-of-sample experiment based on Swiss stock-market data and the cost structure of the online-trading service provider Swissquote, we apply both the basic models and the extended models; the former represent the perspective of an institutional investor, and the latter the perspective of a small investor. The basic models compute portfolios that yield on average a slightly higher return than the portfolios computed with the extended models. However, all generated portfolios yield on average a higher return than the Swiss performance index. There are considerable differences between the four risk measures with respect to the mean realized portfolio return and the standard deviation of the realized portfolio return.
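The mean-absolute-deviation risk measure mentioned above can be sketched as follows. This assumes the usual sample definition: the average, over historical periods, of the absolute deviation of the portfolio return from its mean. The abstract does not spell out the formula, and the sketch ignores the transaction costs, integral units, and dividends that the extended models add. The function name and the return data are hypothetical.

```python
def portfolio_mad(returns, weights):
    """Mean absolute deviation of portfolio return over historical periods.

    returns[t][j] is the return of asset j in period t;
    weights[j] is the fraction of capital invested in asset j.
    """
    periods = len(returns)
    assets = len(weights)
    # Average return of each asset across all periods.
    means = [sum(row[j] for row in returns) / periods for j in range(assets)]
    # Absolute deviation of the portfolio return from its mean, per period.
    deviations = [
        abs(sum((row[j] - means[j]) * weights[j] for j in range(assets)))
        for row in returns
    ]
    return sum(deviations) / periods

# Made-up returns for two assets that move in opposite directions.
history = [[0.02, 0.00], [0.00, 0.02]]
risk_single = portfolio_mad(history, [1.0, 0.0])
risk_mixed = portfolio_mad(history, [0.5, 0.5])
```

Because the deviations are linear in the weights inside an absolute value, this measure can be modelled with linear constraints, which is why the abstract groups it with the linear models.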

Relevance: 80.00%

Abstract:

BACKGROUND Many preschool children have wheeze or cough, but only some have asthma later. Existing prediction tools are difficult to apply in clinical practice or exhibit methodological weaknesses. OBJECTIVE We sought to develop a simple and robust tool for predicting asthma at school age in preschool children with wheeze or cough. METHODS From a population-based cohort in Leicestershire, United Kingdom, we included 1- to 3-year-old subjects seeing a doctor for wheeze or cough and assessed the prevalence of asthma 5 years later. We considered only noninvasive predictors that are easy to assess in primary care: demographic and perinatal data, eczema, upper and lower respiratory tract symptoms, and family history of atopy. We developed a model using logistic regression, avoided overfitting with the least absolute shrinkage and selection operator penalty, and then simplified it to a practical tool. We performed internal validation and assessed its predictive performance using the scaled Brier score and the area under the receiver operating characteristic curve. RESULTS Of 1226 symptomatic children with follow-up information, 345 (28%) had asthma 5 years later. The tool consists of 10 predictors yielding a total score between 0 and 15: sex, age, wheeze without colds, wheeze frequency, activity disturbance, shortness of breath, exercise-related and aeroallergen-related wheeze/cough, eczema, and parental history of asthma/bronchitis. The scaled Brier scores for the internally validated model and tool were 0.20 and 0.16, and the areas under the receiver operating characteristic curves were 0.76 and 0.74, respectively. CONCLUSION This tool represents a simple, low-cost, and noninvasive method to predict the risk of later asthma in symptomatic preschool children, which is ready to be tested in other populations.
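The scaled Brier score used above to assess predictive performance can be sketched as follows. This assumes the common definition in which the Brier score is rescaled by the score of a non-informative model that always predicts the observed prevalence; the abstract does not state the exact formula. The function names and data are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    squared = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)

def scaled_brier_score(probabilities, outcomes):
    """1 minus Brier / Brier_max, where Brier_max is the score obtained by
    predicting the observed prevalence for everyone.
    Assumes the outcomes contain both events and non-events."""
    prevalence = sum(outcomes) / len(outcomes)
    reference = prevalence * (1 - prevalence)
    return 1 - brier_score(probabilities, outcomes) / reference

# Made-up outcomes: one child in four develops asthma.
outcomes = [1, 0, 0, 0]
perfect = scaled_brier_score([1.0, 0.0, 0.0, 0.0], outcomes)
uninformative = scaled_brier_score([0.25, 0.25, 0.25, 0.25], outcomes)
```

Under this definition a perfect model scores 1 and a model that only knows the prevalence scores 0, so the reported values of 0.20 and 0.16 sit between those two reference points.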

Relevance: 80.00%

Abstract:

We combined 33 ice core records, 13 from the Northern Hemisphere and 20 from the Southern Hemisphere, to determine the timing and magnitude of the great Kuwae eruption in the mid-15th century. We extracted volcanic deposition signals by applying a high-pass loess filter to the time series and examining peaks that exceed twice the 31-year running median absolute deviation. After accounting for the dating uncertainties associated with each record, these ice core records together reveal a large volcanogenic acid deposition event during 1453-1457 A.D. The results suggest only one major stratospheric injection from the Kuwae eruption and confirm previous findings that the eruption took place in late 1452 or early 1453, which may serve as a reference for evaluating and improving the dating of ice core records. The average total sulfate deposition from the Kuwae eruption was 93 kg SO₄/km² in Antarctica and 25 kg SO₄/km² in Greenland. The deposition in Greenland was probably underestimated, since it was the average of only two northern Greenland sites with very low accumulation rates. After taking the spatial variation into consideration, the average Kuwae deposition in Greenland was estimated to be 45 kg SO₄/km². Applying the same technique to the other major eruptions of the past 700 years, our results suggest that the Kuwae eruption was the largest stratospheric sulfate event of that period, probably surpassing the total sulfate deposition of the Tambora eruption of 1815, which produced 59 kg SO₄/km² in Antarctica and 50 kg SO₄/km² in Greenland.
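The peak-selection rule described above can be sketched as follows. The sketch makes several assumptions the abstract does not spell out: the input has already been high-pass filtered (the loess step is not reproduced), the window is centered on each year, windows are truncated at the ends of the record, and a value counts as a peak when it exceeds the factor times the local median absolute deviation. The function name and the synthetic series are hypothetical.

```python
from statistics import median

def running_mad_peaks(series, window=31, factor=2.0):
    """Indices whose value exceeds factor * running median absolute deviation.

    Assumes `series` is already high-pass filtered, so its baseline is
    near zero. Windows are centered and truncated at the record ends.
    """
    half = window // 2
    peaks = []
    for i, value in enumerate(series):
        chunk = series[max(0, i - half): i + half + 1]
        center = median(chunk)
        mad = median(abs(v - center) for v in chunk)
        if value > factor * mad:
            peaks.append(i)
    return peaks

# Synthetic record: small periodic background with one large spike.
background = [-2, 0, 2, -1, 1]
record = [background[i % 5] for i in range(61)]
record[30] = 50
detected = running_mad_peaks(record)
```

Using the median absolute deviation rather than the standard deviation keeps the threshold from being inflated by the volcanic spikes themselves.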

Relevance: 80.00%

Abstract:

The normal boiling point is a fundamental thermophysical property that is important in describing the transition between the vapor and liquid phases. A reliable method for predicting it is of great importance, especially for compounds for which no experimental data are available. In this work, an improved second-order group contribution method for determining the normal boiling point of organic compounds was developed using experimental data for 632 organic compounds. It is based on the Joback first-order functional groups, with some changes and additional functional groups. The method can distinguish most structural isomers and stereoisomers, including the cis and trans isomers of organic compounds. First- and second-order contributions are given for hydrocarbons and hydrocarbon derivatives containing carbon, hydrogen, oxygen, nitrogen, sulfur, fluorine, chlorine, and bromine atoms. The fminsearch function in MATLAB was used to select an optimal collection of 65 functional groups and subsequently to develop the model. This is a direct search method that uses the simplex search method of Lagarias et al. The results of the new method are compared with those of several currently used methods and are shown to be far more accurate and reliable. The average absolute deviation of the normal boiling point predictions for the 632 organic compounds is 4.4350 K, and the average absolute relative deviation is 1.1047%, which is adequate for many practical applications.

Relevance: 80.00%

Abstract:

Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. It consists of determining the Markov blanket around each class variable using the HITON algorithm and then specifying the directionality over the MBC subgraphs. Our approach is applied to the problem of predicting the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson’s patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, on the Yeast data set, and on a real-world Parkinson’s disease data set containing 488 patients. The experimental study, which includes comparisons with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables.

Relevance: 80.00%

Abstract:

Nowadays, with the ongoing and rapid evolution of information technology and computing devices, large volumes of data are continuously collected and stored in different domains and through various real-world applications. Extracting useful knowledge from such a huge amount of data usually cannot be done manually and requires adequate machine learning and data mining techniques. Classification is one of the most important techniques and has been successfully applied in several areas. Roughly speaking, classification consists of two main steps: first, learning a classification model or classifier from available training data, and second, classifying new, unseen data instances using the learned classifier. Classification is supervised when all class values are present in the training data (i.e., fully labeled data), semi-supervised when only some class values are known (i.e., partially labeled data), and unsupervised when all class values are missing from the training data (i.e., unlabeled data). Besides this taxonomy, the classification problem can be categorized as uni-dimensional or multi-dimensional depending on whether there is one class variable or more, and as stationary or streaming depending on the characteristics of the data and the underlying rate of change. Throughout this thesis, we deal with the classification problem under three different settings: supervised multi-dimensional stationary classification, semi-supervised uni-dimensional streaming classification, and supervised multi-dimensional streaming classification. To accomplish this task, we mainly used Bayesian network classifiers as models.

The first contribution, addressing the supervised multi-dimensional stationary classification problem, consists of two new methods for learning multi-dimensional Bayesian network classifiers (MBCs) from stationary data. They are proposed from two different points of view. The first method, named CB-MBC, is based on a wrapper greedy forward selection approach, while the second, named MB-MBC, is a filter constraint-based approach based on Markov blankets. Both methods are applied to two important real-world problems: the prediction of human immunodeficiency virus type 1 (HIV-1) reverse transcriptase and protease inhibitors, and the prediction of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39). The experimental study includes comparisons of CB-MBC and MB-MBC against state-of-the-art multi-dimensional classification methods, as well as against methods commonly used for the Parkinson’s disease prediction problem, namely multinomial logistic regression, ordinary least squares, and censored least absolute deviations. For both case studies, the results are promising in terms of classification accuracy and in the analysis of the learned MBC graphical structures, which identify known and novel interactions among variables.

The second contribution, addressing the semi-supervised uni-dimensional streaming classification problem, consists of a novel method (CPL-DS) for classifying partially labeled data streams. Data streams differ from stationary data sets in their very rapid generation process and their concept-drifting nature. That is, the learned concepts and/or the underlying distribution are likely to change and evolve over time, which makes the current classification model out of date and in need of updating. CPL-DS uses the Kullback-Leibler divergence and a bootstrapping method to quantify and detect three possible kinds of drift: feature, conditional, or dual. If any drift occurs, a new classification model is learned using the expectation-maximization algorithm; otherwise, the current classification model is kept unchanged. CPL-DS is general, as it can be applied to several classification models. Using two different models, the naive Bayes classifier and logistic regression, CPL-DS is tested with synthetic data streams and applied to the real-world problem of malware detection, where newly received files must be continuously classified as malware or goodware. Experimental results show that our approach is effective at detecting different kinds of drift in partially labeled data streams and achieves good classification performance.

Finally, the third contribution, addressing the supervised multi-dimensional streaming classification problem, consists of two adaptive methods: Locally Adaptive-MB-MBC (LA-MB-MBC) and Globally Adaptive-MB-MBC (GA-MB-MBC). Both methods monitor concept drift over time using the average log-likelihood score and the Page-Hinkley test. If a drift is detected, LA-MB-MBC adapts the current multi-dimensional Bayesian network classifier locally around each changed node, whereas GA-MB-MBC learns a new multi-dimensional Bayesian network classifier from scratch. An experimental study carried out using synthetic multi-dimensional data streams shows the merits of both proposed adaptive methods.
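The Page-Hinkley test mentioned above can be sketched in a few lines. This is a generic textbook form that flags an upward shift in the mean of a monitored score, not the thesis's own implementation. To watch for a drop in average log-likelihood, one would feed it the negated score. The function name, the `delta` and `threshold` values, and the synthetic stream are assumptions.

```python
def page_hinkley(stream, delta=0.005, threshold=5.0):
    """Return the index at which an upward mean shift is flagged, or None.

    delta: tolerated magnitude of change (assumed value).
    threshold: alarm level for the test statistic (assumed value).
    """
    mean = 0.0
    cumulative = 0.0
    minimum = 0.0
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t             # running mean of the stream so far
        cumulative += x - mean - delta     # cumulative deviation from the mean
        minimum = min(minimum, cumulative)
        if cumulative - minimum > threshold:
            return t - 1                   # zero-based index of the alarm
    return None

# Synthetic score stream whose mean jumps from 0 to 1 at index 50.
drifting = [0.0] * 50 + [1.0] * 50
alarm_index = page_hinkley(drifting)
stable_index = page_hinkley([0.0] * 100)
```

The alarm is raised a few observations after the change point rather than at it, because the cumulative deviation needs time to exceed the threshold. Larger thresholds reduce false alarms at the cost of a longer delay.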