32 results for Test data

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

60.00%

Publisher:

Abstract:

We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
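The equivalence stated in the last sentence of this abstract can be checked numerically with a minimal sketch. The abstract gives no formulas, so the sketch makes three assumptions: 0-1 loss, an even sample size, and a toy hypothesis class of 1-D decision stumps (the names `stump_predict` and `hypothesis_class` are hypothetical). Under those assumptions, flipping the labels of the first half gives: maximal discrepancy = 1 - 2 × (minimum empirical error on the flipped sample).

```python
def stump_predict(h, x):
    """Toy 1-D decision stump (illustrative): h = (threshold, sign)."""
    threshold, sign = h
    return 1 if sign * (x - threshold) > 0 else 0


def error(h, xs, ys):
    """Empirical 0-1 error of stump h on the sample (xs, ys)."""
    return sum(stump_predict(h, x) != y for x, y in zip(xs, ys)) / len(xs)


def hypothesis_class(xs):
    """All stumps with a threshold at a data point or below the minimum, in both orientations."""
    thresholds = sorted(set(xs)) + [min(xs) - 1.0]
    return [(t, s) for t in thresholds for s in (1, -1)]


def max_discrepancy_direct(xs, ys):
    """max over h of (error on first half - error on second half). Assumes len(xs) is even."""
    half = len(xs) // 2
    return max(
        error(h, xs[:half], ys[:half]) - error(h, xs[half:], ys[half:])
        for h in hypothesis_class(xs)
    )


def max_discrepancy_via_erm(xs, ys):
    """Same quantity via empirical risk minimization on the sample with first-half labels flipped."""
    half = len(xs) // 2
    flipped = [1 - y for y in ys[:half]] + list(ys[half:])
    min_err = min(error(h, xs, flipped) for h in hypothesis_class(xs))
    return 1.0 - 2.0 * min_err
```

Both functions search the same finite class by brute force, so they should agree up to floating-point rounding. With a richer hypothesis class, the second form lets any existing empirical risk minimizer be reused.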

Relevance:

40.00%

Publisher:

Abstract:

We use historical data covering more than a century of real GDP for industrial countries and employ the Pesaran panel unit root test, which allows for cross-sectional dependence, to test for a unit root in real GDP. We find strong evidence against the unit root null. Our results are robust to the chosen group of countries and the sample period. Key words: real GDP stationarity, cross-sectional dependence, CIPS test. JEL Classification: C23, E32

Relevance:

40.00%

Publisher:

Abstract:

Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
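The transform and back-transform step described in this abstract can be illustrated with a small sketch. The abstract does not say which log-ratio transform the authors use. The code below assumes the centred log-ratio (clr) transform, one of the standard transforms in compositional data analysis, and all function names are illustrative.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log (assumes all parts > 0)."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_inverse(coords):
    """Back-transform real coordinates to positive parts that honour the unit sum constraint."""
    return closure([math.exp(c) for c in coords])
```

Any real-valued vector, for example a forecast produced in the transformed space, maps back through `clr_inverse` to positive values that sum to one. That is the property the abstract relies on for death densities.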

Relevance:

40.00%

Publisher:

Abstract:

Several unit root tests in panel data have recently been proposed. The test developed by Harris and Tzavalis (1999 JoE) performs particularly well when the time dimension is moderate in relation to the cross-section dimension. However, in common with the traditional tests designed for the unidimensional case, it was found to perform poorly when there is a structural break in the time series under the alternative. Here we derive the asymptotic distribution of the test allowing for a shift in the mean, and assess the small sample performance. We apply this new test to show how the hypothesis of (perfect) hysteresis in Spanish unemployment is rejected in favour of the alternative of the natural unemployment rate, when the possibility of a change in the latter is considered.

Relevance:

30.00%

Publisher:

Abstract:

We re-examine the theoretical concept of a production function for cognitive achievement, and argue that an indirect production function that depends upon the variables that constrain parents' choices is both more tractable from an econometric point of view, and more interesting from an economic point of view, than is a direct production function that depends upon a detailed list of direct inputs such as number of books in the household. We estimate flexible econometric models of indirect production functions for two achievement measures from the Woodcock-Johnson Revised battery, using data from two waves of the Child Development Supplement to the PSID. Elasticities of achievement measures with respect to family income and parents' educational levels are positive and significant. Gaps between scores of black and white children narrow or remain constant as children grow older, a result that differs from previous findings in the literature. The elasticities of achievement scores with respect to family income are substantially higher for children of black families, and there are some notable differences in elasticities with respect to parents' educational levels across blacks and whites.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this project is to establish a methodology for evaluating and designing a user test. The goal is not only to evaluate one specific website, but also to establish a set of guidelines that can be applied to any other future application.

Relevance:

30.00%

Publisher:

Abstract:

A condition needed for testing nested hypotheses from a Bayesian viewpoint is that the prior for the alternative model concentrates mass around the small, or null, model. For testing independence in contingency tables, the intrinsic priors satisfy this requirement. Further, the degree of concentration of the priors is controlled by a discrete parameter m, the training sample size, which plays an important role in the resulting answer regardless of the sample size. In this paper we study robustness of the tests of independence in contingency tables with respect to the intrinsic priors with different degrees of concentration around the null, and compare with other “robust” results by Good and Crook. Consistency of the intrinsic Bayesian tests is established. We also discuss conditioning issues and sampling schemes, and argue that conditioning should be on either one margin or the table total, but not on both margins. Examples using real and simulated data are given.

Relevance:

30.00%

Publisher:

Abstract:

Human Immunodeficiency Virus continues to be a pandemic. Spain is one of the European countries with the highest incidence of HIV. Within Catalonia, Spain, many projects have been implemented with the intention of improving HIV knowledge and lowering the incidence. HIV knowledge is also known to have a positive effect on lowering stigma and discrimination against people living with HIV. However, few studies examine the distribution of HIV knowledge and its association with HIV status, age, sex, geographical zone of origin and level of education within the same study. Objectives: To identify whether HIV knowledge is associated with HIV status, age, sex, geographical zone of origin and level of education. Method: Quantitative, cross-sectional, centre-based study comprising people receiving an HIV test in Catalonia, Spain. Data will be collected from the 11 HIV Non-Governmental Organisations in Catalonia, Spain. The Brief HIV Knowledge Scale will be used to assess HIV knowledge; information from the HIV test session will be used to assess HIV status, age, sex, geographic zone of origin and level of education. The association between HIV knowledge and the aforementioned variables will then be calculated.

Relevance:

30.00%

Publisher:

Abstract:

Several eco-toxicological studies have shown that insectivorous mammals, due to their feeding habits, easily accumulate high amounts of pollutants in relation to other mammal species. To assess the bio-accumulation levels of toxic metals and their influence on essential metals, we quantified the concentration of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted (Ebro Delta) and a control (Medas Islands) area. Since chemical contents of a bio-indicator are mainly compositional data, conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them, from an inter-population viewpoint. Hypothesis testing on the adequate balance-coordinates allows us to confirm intuition-based hypotheses and some previous results. The main statistical goal was to test equal means of balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out for the inter-group balances.

Relevance:

30.00%

Publisher:

Abstract:

This paper argues that low-stakes test scores, available in surveys, may be partially determined by test-taking motivation, which is associated with personality traits but not with cognitive ability. Therefore, such test score distributions may not be informative regarding cognitive ability distributions. Moreover, correlations, found in survey data, between high test scores and economic success may be partially caused by favorable personality traits. To demonstrate these points, I use the coding speed test that was administered without incentives to National Longitudinal Survey of Youth 1979 (NLSY) participants. I suggest that due to its simplicity its scores may especially depend on individuals' test-taking motivation. I show that, controlling for conventional measures of cognitive skills, the coding speed scores are correlated with future earnings of male NLSY participants. Moreover, the coding speed scores of a highly motivated, though less educated, population (potential enlistees to the armed forces) are higher than NLSY participants' scores. I then use controlled experiments to show that when no performance-based incentives are provided, participants' characteristics, but not their cognitive skills, affect effort invested in the coding speed test. Thus, participants with the same ability (measured by their scores on an incentivized test) have significantly different scores on tests without performance-based incentives.

Relevance:

30.00%

Publisher:

Abstract:

We consider the application of normal theory methods to the estimation and testing of a general type of multivariate regression models with errors-in-variables, in the case where various data sets are merged into a single analysis and the observable variables possibly deviate from normality. The various samples to be merged can differ on the set of observable variables available. We show that there is a convenient way to parameterize the model so that, despite the possible non-normality of the data, normal-theory methods yield correct inferences for the parameters of interest and for the goodness-of-fit test. The theory described encompasses both the functional and structural model cases, and can be implemented using standard software for structural equation models, such as LISREL, EQS, LISCOMP, among others. An illustration with Monte Carlo data is presented.

Relevance:

30.00%

Publisher:

Abstract:

Hierarchical clustering is a popular method for finding structure in multivariate data, resulting in a binary tree constructed on the particular objects of the study, usually sampling units. The user faces the decision where to cut the binary tree in order to determine the number of clusters to interpret and there are various ad hoc rules for arriving at a decision. A simple permutation test is presented that diagnoses whether non-random levels of clustering are present in the set of objects and, if so, indicates the specific level at which the tree can be cut. The test is validated against random matrices to verify the type I error probability and a power study is performed on data sets with known clusteredness to study the type II error.

Relevance:

30.00%

Publisher:

Abstract:

A family of scaling corrections aimed at improving the chi-square approximation of goodness-of-fit test statistics in small samples, large models, and nonnormal data was proposed in Satorra and Bentler (1994). For structural equation models, Satorra-Bentler's (SB) scaling corrections are available in standard computer software. Often, however, the interest is not in the overall fit of a model, but in a test of the restrictions that a null model, say ${\cal M}_0$, implies on a less restricted one, ${\cal M}_1$. If $T_0$ and $T_1$ denote the goodness-of-fit test statistics associated with ${\cal M}_0$ and ${\cal M}_1$, respectively, then typically the difference $T_d = T_0 - T_1$ is used as a chi-square test statistic with degrees of freedom equal to the difference in the number of independent parameters estimated under the models ${\cal M}_0$ and ${\cal M}_1$. As in the case of the goodness-of-fit test, it is of interest to scale the statistic $T_d$ in order to improve its chi-square approximation in realistic, i.e., nonasymptotic and nonnormal, applications. In a recent paper, Satorra (1999) shows that the difference between two Satorra-Bentler scaled test statistics for overall model fit does not yield the correct SB scaled difference test statistic. Satorra developed an expression that permits scaling the difference test statistic, but his formula has some practical limitations, since it requires heavy computations that are not available in standard computer software. The purpose of the present paper is to provide an easy way to compute the scaled difference chi-square statistic from the scaled goodness-of-fit test statistics of models ${\cal M}_0$ and ${\cal M}_1$. A Monte Carlo study is provided to illustrate the performance of the competing statistics.
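The abstract does not reproduce the formula itself. As a hedged illustration, the sketch below uses the commonly cited form of the scaled difference test: with unscaled statistics $T_0$, $T_1$, their SB-scaled counterparts, and model degrees of freedom $d_0 > d_1$, the scaling constants are $c_i = T_i / \bar{T}_i$, the difference scaling is $c_d = (d_0 c_0 - d_1 c_1)/(d_0 - d_1)$, and the scaled difference is $(T_0 - T_1)/c_d$. The function name and argument names are illustrative, not taken from any package.

```python
def scaled_difference_chi2(t0, t0_scaled, df0, t1, t1_scaled, df1):
    """Scaled chi-square difference for nested models M0 (more restricted) and M1.

    t0, t1: unscaled goodness-of-fit statistics
    t0_scaled, t1_scaled: Satorra-Bentler scaled statistics
    df0, df1: model degrees of freedom, with df0 > df1
    Returns (scaled difference statistic, degrees of freedom of the difference test).
    """
    if df0 <= df1:
        raise ValueError("M0 must be the more restricted model (df0 > df1)")
    c0 = t0 / t0_scaled  # scaling correction implied for M0
    c1 = t1 / t1_scaled  # scaling correction implied for M1
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)  # scaling correction for the difference
    if cd <= 0:
        raise ValueError("non-positive difference scaling; the statistic is undefined here")
    return (t0 - t1) / cd, df0 - df1
```

When the two models share the same scaling correction, the result coincides with the simple difference of the scaled statistics. Otherwise the two generally differ, which is the point the abstract attributes to Satorra (1999).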