103 results for ALS data-set
Abstract:
In this article an empirical analysis of farming costs is performed within the framework of activity-based costing, employing a panel data set of Catalan farms. One of the main conclusions of the study is that there is a limited association between transaction and farm costs, especially in indirect costs. Direct and indirect costs are mainly driven by the volume of production.
Abstract:
Although the histogram is the most widely used density estimator, it is well known that the appearance of a constructed histogram for a given bin width can change markedly for different choices of anchor position. In this paper we construct a stability index $G$ that assesses the potential changes in the appearance of histograms for a given data set and bin width as the anchor position changes. If a particular bin width choice leads to an unstable appearance, the arbitrary choice of any one anchor position is dangerous, and a different bin width should be considered. The index is based on the statistical roughness of the histogram estimate. We show via Monte Carlo simulation that densities with more structure are more likely to lead to histograms with unstable appearance. In addition, ignoring the precision to which the data values are provided when choosing the bin width leads to instability. We provide several real data examples to illustrate the properties of $G$. Applications to other binned density estimators are also discussed.
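The anchor sensitivity described in this abstract can be seen in a minimal sketch. It does not compute the index $G$, whose definition is not given here; it only counts points per bin for two anchor positions. The function name `histogram_counts` and the data values are hypothetical, chosen purely for illustration.

```python
import math
from collections import Counter

def histogram_counts(data, bin_width, anchor):
    """Return bin counts, from the lowest occupied bin to the highest."""
    # Bin k covers the interval [anchor + k*bin_width, anchor + (k+1)*bin_width).
    index = Counter(math.floor((x - anchor) / bin_width) for x in data)
    lo, hi = min(index), max(index)
    # Counter returns 0 for empty bins between lo and hi.
    return [index[k] for k in range(lo, hi + 1)]

# Made-up values, not taken from the paper.
data = [1.0, 1.9, 2.1, 2.9, 3.1, 3.9]

flat = histogram_counts(data, bin_width=1.0, anchor=0.0)
peaked = histogram_counts(data, bin_width=1.0, anchor=0.5)

print(flat)    # [2, 2, 2]
print(peaked)  # [1, 2, 2, 1]
```

With the same data and the same bin width, shifting the anchor by half a bin turns a flat histogram into one with a central plateau, which is the kind of change in appearance the abstract refers to.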
Abstract:
In the analysis of multivariate categorical data, typically the analysis of questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or as a row and column of the Burt matrix, it is not correct to simply analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.
Abstract:
We have analyzed the spatial accuracy of European foreign trade statistics compared with Latin American ones. We have also included US data because of the importance of this country in Latin American trade. We have developed a method for mapping discrepancies between exporters and importers, trying to isolate systematic spatial deviations. Although our results do not allow a unique explanation, they offer some interesting clues to the distribution channels in the Latin American continent as well as to some spatial deviations in the statistics of individual countries. Connecting our results with the specialized literature on the accuracy of foreign trade statistics, we can revisit Morgenstern (1963) as well as Federico and Tena (1991). Morgenstern took a very pessimistic view of the reliability of this statistical source, but his main warning focused on trade balances, not on gross export or import values. Federico and Tena (1991) demonstrated how accuracy increases with aggregation, by geography and by product at the same time. But they remain pessimistic with regard to distribution questions, remarking that it may be more accurate to use import sources in this case. We find that the data set drawn from foreign trade statistics for a sample in 1925, whether from exporters or importers, is a valuable tool for studying the geography of trade patterns, although in some specific cases it needs some spatial adjustments.
Abstract:
This paper investigates the role of employee referrals in the labor market. Using an original data set, I find that industries that pay wage premia and have characteristics associated with high-wage sectors rely mainly on employee referrals to fill jobs. Moreover, unemployment rates are higher in industries which use employee referrals more extensively. This paper develops an equilibrium matching model which can explain these empirical regularities. In this model, the matching process sorts heterogeneous firms and workers into two distinct groups: referrals match "good" jobs to "good" workers, while formal methods (e.g., newspaper ads and employment agencies) match less-attractive jobs to disadvantaged workers. Thus, well-connected workers who learn quickly about job opportunities use referrals to jump job queues, while those who are less well placed in the labor market search for jobs through formal methods. The split of firms and workers between referrals and formal search is, however, not necessarily efficient. Congestion externalities in referral search imply that unemployment would be closer to the optimal rate if firms and workers 'at the margin' searched formally.
Abstract:
This paper offers empirical evidence that a country's choice of exchange rate regime can have a significant impact on its medium-term rate of productivity growth. Moreover, the impact depends critically on the country's level of financial development, its degree of market regulation, and its distance from the global technology frontier. We illustrate how each of these channels may operate in a simple stylized growth model in which real exchange rate uncertainty exacerbates the negative investment effects of domestic credit market constraints. The empirical analysis is based on an 83-country data set spanning the years 1960-2000. Our approach delivers results that are in striking contrast to the vast existing empirical exchange rate literature, which largely finds the effects of exchange rate volatility on real activity to be relatively small and insignificant.
Abstract:
An attendance equation is estimated using data on individual games played in the Spanish First Division Football League. The specification includes as explanatory factors: economic variables, quality, uncertainty and opportunity costs. We concentrate the analysis on some specification issues such as controlling for the effect of unobservables given the panel structure of the data set, the type of functional form and the potential endogeneity of prices. We obtain the expected effects on attendance for all the variables. The estimated price elasticities are smaller than one in absolute value, as usually occurs in this literature, but are sensitive to the specification issues.
Abstract:
This study analyses the distribution of wage income in the county (comarca) of Osona, in comparison with the other counties of Catalonia, and its distribution among municipalities, using an original database built from wage information in the Enquesta d'Estructura Salarial (Wage Structure Survey) and population information from the 1996 and 2001 censuses. The spatial unit used, the census tract, makes it possible to obtain estimates for the different geographical areas and to calculate and decompose inequality indices that show the characteristics of the distributions.
Abstract:
Background: Combining different sources of information to improve the available biological knowledge is a current challenge in bioinformatics. Kernel-based methods are among the most powerful approaches for integrating heterogeneous data types. Kernel-based data integration approaches consist of two basic steps: first, the right kernel is chosen for each data set; second, the kernels from the different data sources are combined to give a complete representation of the available data for a given statistical task. Results: We analyze the integration of data from several sources of information using kernel PCA, from the point of view of reducing dimensionality. Moreover, we improve the interpretability of kernel PCA by adding to the plot the representation of the input variables that belong to any dataset. In particular, for each input variable or linear combination of input variables, we can represent the direction of maximum growth locally, which allows us to identify those samples with higher/lower values of the variables analyzed. Conclusions: The integration of different datasets and the simultaneous representation of samples and variables together give us a better understanding of biological knowledge.
Abstract:
The most adequate approach for benchmarking web accessibility is manual expert evaluation supplemented by automatic analysis tools. But manual evaluation has a high cost and is impractical to apply to large websites. In reality, there is no choice but to rely on automated tools when reviewing large websites for accessibility. The question is: to what extent can the results from automatic evaluation of a website and of individual web pages be used as an approximation of manual results? This paper presents the initial results of an investigation aimed at answering this question. We have performed both manual and automatic evaluations of the accessibility of web pages of two sites and we have compared the results. In our data set, automatically retrieved results could most definitely be used as an approximation of manual evaluation results.
Abstract:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
Abstract:
Drawing on a very rich data set from a recent cohort of PhD graduates, we examine the correlates and consequences of qualification and skills mismatch. We show that job characteristics such as the economic sector and the main activity at work play a fundamental direct role in explaining the probability of being well matched. However, the effect of academic attributes seems to be mainly indirect, since it disappears once we control for the full set of work characteristics. We detected a significant earnings penalty for those who are both overqualified and overskilled and also showed that being mismatched reduces job satisfaction, especially for those whose skills are underutilized. Overall, the problem of mismatch among PhD graduates is closely related to demand-side constraints of the labor market. Increasing the supply of adequate jobs and broadening the skills PhD students acquire during training should be explored as possible responses.
Abstract:
Income distribution in Spain experienced a substantial improvement towards equalisation during the second half of the seventies and the eighties, a period during which most OECD countries experienced the opposite trend. In spite of the many recent papers on the Spanish income distribution, the period they cover stops in 1990. The aim of this paper is to extend the analysis to 1996, employing the same methodology and the same data set (ECPF). Our results not only corroborate the (decreasing inequality) trend found by others during the second half of the eighties, but also suggest that this trend extends over the first half of the nineties. We also show that our main conclusions are robust to changes in the equivalence scale, to changes in the definition of income and to potential data contamination. Finally, we analyse some of the causes which may be driving the overall picture of income inequality using two decomposition techniques. From these analyses three variables emerge as the main factors responsible for the observed improvement in the income distribution: education, household composition and the socioeconomic situation of the household head.