961 resultados para Predictive Mean Squared Efficiency


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this article we investigate the asymptotic and finite-sample properties of predictors of regression models with autocorrelated errors. We prove new theorems associated with the predictive efficiency of generalized least squares (GLS) and incorrectly structured GLS predictors. We also establish the form associated with their predictive mean squared errors as well as the magnitude of these errors relative to each other and to those generated from the ordinary least squares (OLS) predictor. A large simulation study is used to evaluate the finite-sample performance of forecasts generated from models using different corrections for the serial correlation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose exact likelihood-based mean-variance efficiency tests of the market portfolio in the context of Capital Asset Pricing Model (CAPM), allowing for a wide class of error distributions which include normality as a special case. These tests are developed in the frame-work of multivariate linear regressions (MLR). It is well known however that despite their simple statistical structure, standard asymptotically justified MLR-based tests are unreliable. In financial econometrics, exact tests have been proposed for a few specific hypotheses [Jobson and Korkie (Journal of Financial Economics, 1982), MacKinlay (Journal of Financial Economics, 1987), Gib-bons, Ross and Shanken (Econometrica, 1989), Zhou (Journal of Finance 1993)], most of which depend on normality. For the gaussian model, our tests correspond to Gibbons, Ross and Shanken’s mean-variance efficiency tests. In non-gaussian contexts, we reconsider mean-variance efficiency tests allowing for multivariate Student-t and gaussian mixture errors. Our framework allows to cast more evidence on whether the normality assumption is too restrictive when testing the CAPM. We also propose exact multivariate diagnostic checks (including tests for multivariate GARCH and mul-tivariate generalization of the well known variance ratio tests) and goodness of fit tests as well as a set estimate for the intervening nuisance parameters. Our results [over five-year subperiods] show the following: (i) multivariate normality is rejected in most subperiods, (ii) residual checks reveal no significant departures from the multivariate i.i.d. assumption, and (iii) mean-variance efficiency tests of the market portfolio is not rejected as frequently once it is allowed for the possibility of non-normal errors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A large number of ridge regression estimators have been proposed and used with little knowledge of their true distributions. Because of this lack of knowledge, these estimators cannot be used to test hypotheses or to form confidence intervals.^ This paper presents a basic technique for deriving the exact distribution functions for a class of generalized ridge estimators. The technique is applied to five prominent generalized ridge estimators. Graphs of the resulting distribution functions are presented. The actual behavior of these estimators is found to be considerably different than the behavior which is generally assumed for ridge estimators.^ This paper also uses the derived distributions to examine the mean squared error properties of the estimators. A technique for developing confidence intervals based on the generalized ridge estimators is also presented. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Genomewide marker information can improve the reliability of breeding value predictions for young selection candidates in genomic selection. However, the cost of genotyping limits its use to elite animals, and how such selective genotyping affects predictive ability of genomic selection models is an open question. We performed a simulation study to evaluate the quality of breeding value predictions for selection candidates based on different selective genotyping strategies in a population undergoing selection. The genome consisted of 10 chromosomes of 100 cM each. After 5,000 generations of random mating with a population size of 100 (50 males and 50 females), generation G(0) (reference population) was produced via a full factorial mating between the 50 males and 50 females from generation 5,000. Different levels of selection intensities (animals with the largest yield deviation value) in G(0) or random sampling (no selection) were used to produce offspring of G(0) generation (G(1)). Five genotyping strategies were used to choose 500 animals in G(0) to be genotyped: 1) Random: randomly selected animals, 2) Top: animals with largest yield deviation values, 3) Bottom: animals with lowest yield deviations values, 4) Extreme: animals with the 250 largest and the 250 lowest yield deviations values, and 5) Less Related: less genetically related animals. The number of individuals in G(0) and G(1) was fixed at 2,500 each, and different levels of heritability were considered (0.10, 0.25, and 0.50). Additionally, all 5 selective genotyping strategies (Random, Top, Bottom, Extreme, and Less Related) were applied to an indicator trait in generation G(0), and the results were evaluated for the target trait in generation G(1), with the genetic correlation between the 2 traits set to 0.50. The 5 genotyping strategies applied to individuals in G(0) (reference population) were compared in terms of their ability to predict the genetic values of the animals in G(1) (selection candidates). Lower correlations between genomic-based estimates of breeding values (GEBV) and true breeding values (TBV) were obtained when using the Bottom strategy. For Random, Extreme, and Less Related strategies, the correlation between GEBV and TBV became slightly larger as selection intensity decreased and was largest when no selection occurred. These 3 strategies were better than the Top approach. In addition, the Extreme, Random, and Less Related strategies had smaller predictive mean squared errors (PMSE) followed by the Top and Bottom methods. Overall, the Extreme genotyping strategy led to the best predictive ability of breeding values, indicating that animals with extreme yield deviations values in a reference population are the most informative when training genomic selection models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An Artificial Neural Network (ANN) is a computational modeling tool which has found extensive acceptance in many disciplines for modeling complex real world problems. An ANN can model problems through learning by example, rather than by fully understanding the detailed characteristics and physics of the system. In the present study, the accuracy and predictive power of an ANN was evaluated in predicting kinetic viscosity of biodiesels over a wide range of temperatures typically encountered in diesel engine operation. In this model, temperature and chemical composition of biodiesel were used as input variables. In order to obtain the necessary data for model development, the chemical composition and temperature dependent fuel properties of ten different types of biodiesels were measured experimentally using laboratory standard testing equipments following internationally recognized testing procedures. The Neural Networks Toolbox of MatLab R2012a software was used to train, validate and simulate the ANN model on a personal computer. The network architecture was optimised following a trial and error method to obtain the best prediction of the kinematic viscosity. The predictive performance of the model was determined by calculating the absolute fraction of variance (R2), root mean squared (RMS) and maximum average error percentage (MAEP) between predicted and experimental results. This study found that ANN is highly accurate in predicting the viscosity of biodiesel and demonstrates the ability of the ANN model to find a meaningful relationship between biodiesel chemical composition and fuel properties at different temperature levels. Therefore the model developed in this study can be a useful tool in accurately predict biodiesel fuel properties instead of undertaking costly and time consuming experimental tests.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biodiesel, produced from renewable feedstock represents a more sustainable source of energy and will therefore play a significant role in providing the energy requirements for transportation in the near future. Chemically, all biodiesels are fatty acid methyl esters (FAME), produced from raw vegetable oil and animal fat. However, clear differences in chemical structure are apparent from one feedstock to the next in terms of chain length, degree of unsaturation, number of double bonds and double bond configuration-which all determine the fuel properties of biodiesel. In this study, prediction models were developed to estimate kinematic viscosity of biodiesel using an Artificial Neural Network (ANN) modelling technique. While developing the model, 27 parameters based on chemical composition commonly found in biodiesel were used as the input variables and kinematic viscosity of biodiesel was used as output variable. Necessary data to develop and simulate the network were collected from more than 120 published peer reviewed papers. The Neural Networks Toolbox of MatLab R2012a software was used to train, validate and simulate the ANN model on a personal computer. The network architecture and learning algorithm were optimised following a trial and error method to obtain the best prediction of the kinematic viscosity. The predictive performance of the model was determined by calculating the coefficient of determination (R2), root mean squared (RMS) and maximum average error percentage (MAEP) between predicted and experimental results. This study found high predictive accuracy of the ANN in predicting fuel properties of biodiesel and has demonstrated the ability of the ANN model to find a meaningful relationship between biodiesel chemical composition and fuel properties. Therefore the model developed in this study can be a useful tool to accurately predict biodiesel fuel properties instead of undertaking costly and time consuming experimental tests.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays, demand for automated Gas metal arc welding (GMAW) is growing and consequently need for intelligent systems is increased to ensure the accuracy of the procedure. To date, welding pool geometry has been the most used factor in quality assessment of intelligent welding systems. But, it has recently been found that Mahalanobis Distance (MD) not only can be used for this purpose but also is more efficient. In the present paper, Artificial Neural Networks (ANN) has been used for prediction of MD parameter. However, advantages and disadvantages of other methods have been discussed. The Levenberg–Marquardt algorithm was found to be the most effective algorithm for GMAW process. It is known that the number of neurons plays an important role in optimal network design. In this work, using trial and error method, it has been found that 30 is the optimal number of neurons. The model has been investigated with different number of layers in Multilayer Perceptron (MLP) architecture and has been shown that for the aim of this work the optimal result is obtained when using MLP with one layer. Robustness of the system has been evaluated by adding noise into the input data and studying the effect of the noise in prediction capability of the network. The experiments for this study were conducted in an automated GMAW setup that was integrated with data acquisition system and prepared in a laboratory for welding of steel plate with 12 mm in thickness. The accuracy of the network was evaluated by Root Mean Squared (RMS) error between the measured and the estimated values. The low error value (about 0.008) reflects the good accuracy of the model. Also the comparison of the predicted results by ANN and the test data set showed very good agreement that reveals the predictive power of the model. Therefore, the ANN model offered in here for GMA welding process can be used effectively for prediction goals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We report an experimental study of a new type of turbulent flow that is driven purely by buoyancy. The flow is due to an unstable density difference, created using brine and water, across the ends of a long (length/diameter = 9) vertical pipe. The Schmidt number Sc is 670, and the Rayleigh number (Ra) based on the density gradient and diameter is about 10(8). Under these conditions the convection is turbulent, and the time-averaged velocity at any point is `zero'. The Reynolds number based on the Taylor microscale, Re-lambda, is about 65. The pipe is long enough for there to be an axially homogeneous region, with a linear density gradient, about 6-7 diameters long in the midlength of the pipe. In the absence of a mean flow and, therefore, mean shear, turbulence is sustained just by buoyancy. The flow can be thus considered to be an axially homogeneous turbulent natural convection driven by a constant (unstable) density gradient. We characterize the flow using flow visualization and particle image velocimetry (PIV). Measurements show that the mean velocities and the Reynolds shear stresses are zero across the cross-section; the root mean squared (r.m.s.) of the vertical velocity is larger than those of the lateral velocities (by about one and half times at the pipe axis). We identify some features of the turbulent flow using velocity correlation maps and the probability density functions of velocities and velocity differences. The flow away from the wall, affected mainly by buoyancy, consists of vertically moving fluid masses continually colliding and interacting, while the flow near the wall appears similar to that in wall-bound shear-free turbulence. The turbulence is anisotropic, with the anisotropy increasing to large values as the wall is approached. A mixing length model with the diameter of the pipe as the length scale predicts well the scalings for velocity fluctuations and the flux. This model implies that the Nusselt number would scale as (RaSc1/2)-Sc-1/2, and the Reynolds number would scale as (RaSc-1/2)-Sc-1/2. The velocity and the flux measurements appear to be consistent with the Ra-1/2 scaling, although it must be pointed out that the Rayleigh number range was less than 10. The Schmidt number was not varied to check the Sc scaling. The fluxes and the Reynolds numbers obtained in the present configuration are Much higher compared to what would be obtained in Rayleigh-Benard (R-B) convection for similar density differences.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sampling strategies are developed based on the idea of ranked set sampling (RSS) to increase efficiency and therefore to reduce the cost of sampling in fishery research. The RSS incorporates information on concomitant variables that are correlated with the variable of interest in the selection of samples. For example, estimating a monitoring survey abundance index would be more efficient if the sampling sites were selected based on the information from previous surveys or catch rates of the fishery. We use two practical fishery examples to demonstrate the approach: site selection for a fishery-independent monitoring survey in the Australian northern prawn fishery (NPF) and fish age prediction by simple linear regression modelling a short-lived tropical clupeoid. The relative efficiencies of the new designs were derived analytically and compared with the traditional simple random sampling (SRS). Optimal sampling schemes were measured by different optimality criteria. For the NPF monitoring survey, the efficiency in terms of variance or mean squared errors of the estimated mean abundance index ranged from 114 to 199% compared with the SRS. In the case of a fish ageing study for Tenualosa ilisha in Bangladesh, the efficiency of age prediction from fish body weight reached 140%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this study, a quality assessment method based on sampling of primary laser inventory units (microsegments) was analysed. The accuracy of a laser inventory carried out in Kuhmo was analysed as a case study. Field sample plots were measured on the sampled microsegments in the Kuhmo inventory area. Two main questions were considered. Did the ALS based inventory meet the accuracy requirements set for the provider and how should a reliable, cost-efficient and independent quality assessment be undertaken. The agreement between control measurement and ALS based inventory was analysed in four ways: 1) The root mean squared errors (RMSEs) and bias were calculated. 2) Scatter plots with 95% confidence intervals were plotted and the placing of identity lines was checked. 3) Bland-Altman plots were drawn so that the mean difference of attributes between the control method and ALS-method was calculated and plotted against average value of attributes. 4) The tolerance limits were defined and combined with Bland-Altman plots. The RMSE values were compared to a reference study from which the accuracy requirements had been set to the service provider. The accuracy requirements in Kuhmo were achieved, however comparison of RMSE values proved to be difficult. Field control measurements are costly and time-consuming, but they are considered to be robust. However, control measurements might include errors, which are difficult to take into account. Using the Bland-Altman plots none of the compared methods are considered to be completely exact, so this offers a fair way to interpret results of assessment. The tolerance limits to be set on order combined with Bland-Altman plots were suggested to be taken in practise. In addition, bias should be calculated for total area. Some other approaches for quality control were briefly examined. No method was found to fulfil all the required demands of statistical reliability, cost-efficiency, time efficiency, simplicity and speed of implementation. Some benefits and shortcomings of the studied methods were discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Time evolution of mean-squared displacement based on molecular dynamics for a variety of adsorbate-zeolite systems is reported. Transition from ballistic to diffusive behavior is observed for all the systems. The transition times are found to be system dependent and show different types of dependence on temperature. Model calculations on a one-dimensional system are carried out which show that the characteristic length and transition times are dependent on the distance between the barriers, their heights, and temperature. In light of these findings, it is shown that it is possible to obtain valuable information about the average potential energy surface sampled under specific external conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We provide a filterbank precoding framework (FBP) for frequency selective channels using the minimum mean squared error (MMSE) criterion. The design obviates the need for introducing a guard interval between successive blocks, and hence can achieve the maximum possible bandwidth efficiency. This is especially useful in cases where the channel is of a high order. We treat both the presence and the absence of channel knowledge at the transmitter. In the former case, we obtain the jointly optimal precoder-equalizer pair of the specified order. In the latter case, we use a zero padding precoder, and obtain the MMSE equalizer. No restriction on the dimension or nature of the channel matrix is imposed. Simulation results indicate that the filterbank approach outperforms block based methods like OFDM and eigenmode precoding.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We report an experimental study of a new type of turbulent flow that is driven purely by buoyancy. The flow is due to an unstable density difference, created using brine and water, across the ends of a long (length/diameter=9) vertical pipe. The Schmidt number Sc is 670, and the Rayleigh number (Ra) based on the density gradient and diameter is about 108. Under these conditions the convection is turbulent, and the time-averaged velocity at any point is ‘zero’. The Reynolds number based on the Taylor microscale, Reλ, is about 65. The pipe is long enough for there to be an axially homogeneous region, with a linear density gradient, about 6–7 diameters long in the midlength of the pipe. In the absence of a mean flow and, therefore, mean shear, turbulence is sustained just by buoyancy. The flow can be thus considered to be an axially homogeneous turbulent natural convection driven by a constant (unstable) density gradient. We characterize the flow using flow visualization and particle image velocimetry (PIV). Measurements show that the mean velocities and the Reynolds shear stresses are zero across the cross-section; the root mean squared (r.m.s.) of the vertical velocity is larger than those of the lateral velocities (by about one and half times at the pipe axis). We identify some features of the turbulent flow using velocity correlation maps and the probability density functions of velocities and velocity differences. The flow away from the wall, affected mainly by buoyancy, consists of vertically moving fluid masses continually colliding and interacting, while the flow near the wall appears similar to that in wall-bound shear-free turbulence. The turbulence is anisotropic, with the anisotropy increasing to large values as the wall is approached. A mixing length model with the diameter of the pipe as the length scale predicts well the scalings for velocity fluctuations and the flux. This model implies that the Nusselt number would scale as Ra1/2Sc1/2, and the Reynolds number would scale as Ra1/2Sc−1/2. The velocity and the flux measurements appear to be consistent with the Ra1/2 scaling, although it must be pointed out that the Rayleigh number range was less than 10. The Schmidt number was not varied to check the Sc scaling. The fluxes and the Reynolds numbers obtained in the present configuration are much higher compared to what would be obtained in Rayleigh–Bénard (R–B) convection for similar density differences.