868 resultados para Statistical models of Box-Jenkins. Artificial neural networks (ANN). Oil flow curve
Resumo:
Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. ^ Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010).In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that comparing to linear or categorical models, the fractional polynomial models, with the higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of fractional polynomial model was close to that of linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model.^ Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.^
Resumo:
Euphausiids constitute major biomass component in shelf ecosystems and play a fundamental role in the rapid vertical transport of carbon from the ocean surface to the deeper layers during their daily vertical migration (DVM). DVM depth and migration patterns depend on oceanographic conditions with respect to temperature, light and oxygen availability at depth, factors that are highly dependent on season in most marine regions. Changes in the abiotic conditions also shape Euphausiid metabolism including aerobic and anaerobic energy production. Here we introduce a global krill respiration model which includes the effect of latitude (LAT), the day of the year of interest (DoY), and the number of daylight hours on the day of interest (DLh), in addition to the basal variables that determine ectothermal oxygen consumption (temperature, body mass and depth) in the ANN model (Artificial Neural Networks). The newly implemented parameters link space and time in terms of season and photoperiod to krill respiration. The ANN model showed a better fit (r**2=0.780) when DLh and LAT were included, indicating a decrease in respiration with increasing LAT and decreasing DLh. We therefore propose DLh as a potential variable to consider when building physiological models for both hemispheres. We also tested for seasonality the standard respiration rate of the most common species that were investigated until now in a large range of DLh and DoY with Multiple Linear Regression (MLR) or General Additive model (GAM). GAM successfully integrated DLh (r**2= 0.563) and DoY (r**2= 0.572) effects on respiration rates of the Antarctic krill, Euphausia superba, yielding the minimum metabolic activity in mid-June and the maximum at the end of December. Neither the MLR nor the GAM approach worked for the North Pacific krill Euphausia pacifica, and MLR for the North Atlantic krill Meganyctiphanes norvegica remained inconclusive because of insufficient seasonal data coverage. We strongly encourage comparative respiration measurements of worldwide Euphausiid key species at different seasons to improve accuracy in ecosystem modelling.
Resumo:
The fuzzy min–max neural network classifier is a supervised learning method. This classifier takes the hybrid neural networks and fuzzy systems approach. All input variables in the network are required to correspond to continuously valued variables, and this can be a significant constraint in many real-world situations where there are not only quantitative but also categorical data. The usual way of dealing with this type of variables is to replace the categorical by numerical values and treat them as if they were continuously valued. But this method, implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method. The procedure extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides for greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data—the set of the respondents’ individual answers to the questions—of this type of poll are especially suited for evaluating the method since they include a large number of numerical and categorical attributes.
Resumo:
Once admitted the advantages of object-based classification compared to pixel-based classification; the need of simple and affordable methods to define and characterize objects to be classified, appears. This paper presents a new methodology for the identification and characterization of objects at different scales, through the integration of spectral information provided by the multispectral image, and textural information from the corresponding panchromatic image. In this way, it has defined a set of objects that yields a simplified representation of the information contained in the two source images. These objects can be characterized by different attributes that allow discriminating between different spectral&textural patterns. This methodology facilitates information processing, from a conceptual and computational point of view. Thus the vectors of attributes defined can be used directly as training pattern input for certain classifiers, as for example artificial neural networks. Growing Cell Structures have been used to classify the merged information.
Resumo:
There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson’s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
Resumo:
Many neurodegenerative diseases are characterized by malfunction of the DNA damage response. Therefore, it is important to understand the connection between system level neural network behavior and DNA. Neural networks drawn from genetically engineered animals, interfaced with micro-electrode arrays allowed us to unveil connections between networks’ system level activity properties and such genome instability. We discovered that Atm protein deficiency, which in humans leads to progressive motor impairment, leads to a reduced synchronization persistence compared to wild type synchronization, after chemically imposed DNA damage. Not only do these results suggest a role for DNA stability in neural network activity, they also establish an experimental paradigm for empirically determining the role a gene plays on the behavior of a neural network.
Resumo:
The development of new-generation intelligent vehicle technologies will lead to a better level of road safety and CO2 emission reductions. However, the weak point of all these systems is their need for comprehensive and reliable data. For traffic data acquisition, two sources are currently available: 1) infrastructure sensors and 2) floating vehicles. The former consists of a set of fixed point detectors installed in the roads, and the latter consists of the use of mobile probe vehicles as mobile sensors. However, both systems still have some deficiencies. The infrastructure sensors retrieve information fromstatic points of the road, which are spaced, in some cases, kilometers apart. This means that the picture of the actual traffic situation is not a real one. This deficiency is corrected by floating cars, which retrieve dynamic information on the traffic situation. Unfortunately, the number of floating data vehicles currently available is too small and insufficient to give a complete picture of the road traffic. In this paper, we present a floating car data (FCD) augmentation system that combines information fromfloating data vehicles and infrastructure sensors, and that, by using neural networks, is capable of incrementing the amount of FCD with virtual information. This system has been implemented and tested on actual roads, and the results show little difference between the data supplied by the floating vehicles and the virtual vehicles.
Resumo:
In this paper we investigate the effect of biasing the axonal connection delay values in the number of polychronous groups produced for a spiking neuron network model. We use an estimation of distribution algorithm (EDA) that learns tree models to search for optimal delay configurations. Our results indicate that the introduced approach can be used to considerably increase the number of such groups.
Resumo:
This paper presents a new methodology to build parametric models to estimate global solar irradiation adjusted to specific on-site characteristics based on the evaluation of variable im- portance. Thus, those variables higly correlated to solar irradiation on a site are implemented in the model and therefore, different models might be proposed under different climates. This methodology is applied in a study case in La Rioja region (northern Spain). A new model is proposed and evaluated on stability and accuracy against a review of twenty-two already exist- ing parametric models based on temperatures and rainfall in seventeen meteorological stations in La Rioja. The methodology of model evaluation is based on bootstrapping, which leads to achieve a high level of confidence in model calibration and validation from short time series (in this case five years, from 2007 to 2011). The model proposed improves the estimates of the other twenty-two models with average mean absolute error (MAE) of 2.195 MJ/m2 day and average confidence interval width (95% C.I., n=100) of 0.261 MJ/m2 day. 41.65% of the daily residuals in the case of SIAR and 20.12% in that of SOS Rioja fall within the uncertainty tolerance of the pyranometers of the two networks (10% and 5%, respectively). Relative differences between measured and estimated irradiation on an annual cumulative basis are below 4.82%. Thus, the proposed model might be useful to estimate annual sums of global solar irradiation, reaching insignificant differences between measurements from pyranometers.
Resumo:
Automatic blood glucose classification may help specialists to provide a better interpretation of blood glucose data, downloaded directly from patients glucose meter and will contribute in the development of decision support systems for gestational diabetes. This paper presents an automatic blood glucose classifier for gestational diabetes that compares 6 different feature selection methods for two machine learning algorithms: neural networks and decision trees. Three searching algorithms, Greedy, Best First and Genetic, were combined with two different evaluators, CSF and Wrapper, for the feature selection. The study has been made with 6080 blood glucose measurements from 25 patients. Decision trees with a feature set selected with the Wrapper evaluator and the Best first search algorithm obtained the best accuracy: 95.92%.
Resumo:
In Operational Modal Analysis (OMA) of a structure, the data acquisition process may be repeated many times. In these cases, the analyst has several similar records for the modal analysis of the structure that have been obtained at di�erent time instants (multiple records). The solution obtained varies from one record to another, sometimes considerably. The differences are due to several reasons: statistical errors of estimation, changes in the external forces (unmeasured forces) that modify the output spectra, appearance of spurious modes, etc. Combining the results of the di�erent individual analysis is not straightforward. To solve the problem, we propose to make the joint estimation of the parameters using all the records. This can be done in a very simple way using state space models and computing the estimates by maximum-likelihood. The method provides a single result for the modal parameters that combines optimally all the records.