372 results for Sample selection
in Queensland University of Technology - ePrints Archive
Abstract:
Suggests an alternative, computationally simpler approach to non-random sampling in labour economics, where the sample represents the observed outcome of an individual female's choice of whether or not to participate in the labour market. Concludes that there is an alternative to the Heckman two-step estimator.
Abstract:
Consider a general regression model with an arbitrary and unknown link function and a stochastic selection variable that determines whether the outcome variable is observable or missing. The paper proposes U-statistics that are based on kernel functions as estimators for the directions of the parameter vectors in the link function and the selection equation, and shows that these estimators are consistent and asymptotically normal.
Abstract:
This paper utilizes the Survey of Work History (1981) data to examine the importance of non-random sampling in the context of a model of interfirm labour mobility. The paper adopts Heckman's two-step procedure in order to estimate a three-equation model incorporating an individual's mobility status as endogenously determined. The main conclusion is that in estimating wage equations it is important to consider the role of job mobility and to correct for the effects of sample-selection bias. The results generally accord with those reported by Osberg et al. (1986) in the only previous Canadian study of job mobility in a sample-selection context.
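As a rough illustration of the correction term behind Heckman's two-step procedure: the first step fits a probit model for selection, and the second step adds the inverse Mills ratio, evaluated at each observation's fitted probit index, as an extra regressor in the outcome equation. The sketch below computes only that ratio. The function names and the index values are hypothetical and are not taken from the paper.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z: float) -> float:
    """phi(z) / Phi(z), the selection-correction regressor.

    Not numerically robust for very negative z, where Phi(z) underflows.
    """
    return normal_pdf(z) / normal_cdf(z)


# Hypothetical fitted probit index values for three selected observations.
probit_index = [-0.5, 0.0, 1.2]
correction_terms = [inverse_mills_ratio(z) for z in probit_index]
```

The ratio shrinks as the probit index grows, so observations that were very likely to be selected receive a smaller correction.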
Abstract:
Immigration has played an important role in the historical development of Australia. Thus, it is no surprise that a large body of empirical work has developed, which focuses upon how migrants fare in the land of opportunity. Much of the literature is comparatively recent, i.e. the last ten years or so, encouraged by the advent of public availability of Australian cross-section micro data. Several different aspects of migrant welfare have been addressed, with major emphasis being placed upon earnings and unemployment experience. For recent examples see Haig (1980), Stromback (1984), Chiswick and Miller (1985), Tran-Nam and Nevile (1988) and Beggs and Chapman (1988). The present paper contributes to the literature by providing additional empirical evidence on the native/migrant earnings differential. The data utilised are from the rather neglected Australian Bureau of Statistics (ABS) Special Supplementary Survey No. 4, 1982, otherwise known as the Family Survey. The paper also examines the importance of distinguishing between the wage and salary sector and the self-employment sector when discussing native/migrant differentials. Separate earnings equations for the two labour market groups are estimated and the native/migrant earnings differential is broken down by employment status. This is a novel application in the Australian context and provides some insight into the earnings of the self-employed, a group that despite its size (around 20 per cent of the labour force) is frequently ignored by economic research. Most previous empirical research fails to examine the effect of employment status on earnings. Stromback (1984) includes a dummy variable representing self-employment status in an earnings equation estimated over a pooled sample of paid and self-employed workers. The variable is found to be highly significant, which leads Stromback to question the efficacy of including the self-employed in the estimation sample. 
The suggestion is that part of self-employed earnings represents a return to non-human capital investment, i.e. investments in machinery, buildings etc., so that the structural determinants of earnings differ significantly from those for paid employees. Tran-Nam and Nevile (1988) deal with differences between paid employees and the self-employed by deleting the latter from their sample. However, deleting the self-employed from the estimation sample may lead to bias in the OLS estimation method (see Heckman 1979). The desirable properties of OLS are dependent upon estimation on a random sample. Thus, the Tran-Nam and Nevile results are likely to suffer from bias unless individuals are randomly allocated between self-employment and paid employment. The current analysis extends Tran-Nam and Nevile (1988) by explicitly treating the choice of paid employment versus self-employment as being endogenously determined. This allows an explicit test for the appropriateness of deleting self-employed workers from the sample. Earnings equations that are corrected for sample selection are estimated for both natives and migrants in the paid employee sector. The Heckman (1979) two-step estimator is employed. The paper is divided into five major sections. The next section presents the econometric model incorporating the specification of the earnings generating process together with an explicit model determining an individual's employment status. In Section III the data are described. Section IV draws together the main econometric results of the paper. First, the probit estimates of the labour market status equation are documented. This is followed by presentation and discussion of the Heckman two-stage estimates of the earnings specification for both native and migrant Australians. Separate earnings equations are estimated for paid employees and the self-employed. Section V documents estimates of the native/migrant earnings differential for both categories of employees. 
To aid comparison with earlier work, the Oaxaca decomposition of the earnings differential for paid employees is carried out for both the simple OLS regression results and the parameter estimates corrected for sample selection effects. These differentials are interpreted and compared with previous Australian findings. A short section concludes the paper.
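A minimal sketch of a two-fold Oaxaca decomposition may help fix ideas. It splits the gap in mean predicted log earnings between two groups into a part due to differences in average characteristics (evaluated at group A's coefficients) and a part due to differences in coefficients. Taking group A as the reference is an assumption of this sketch, as are the function name and all numbers, which are invented for illustration and are not results from the paper. Any selection-correction term is ignored here.

```python
def oaxaca_decomposition(mean_x_a, beta_a, mean_x_b, beta_b):
    """Two-fold Oaxaca decomposition with group A coefficients as reference.

    Returns (explained, unexplained), where
      explained   = sum_k (xbar_a[k] - xbar_b[k]) * beta_a[k]
      unexplained = sum_k xbar_b[k] * (beta_a[k] - beta_b[k])
    """
    explained = sum(
        (xa - xb) * ba for xa, xb, ba in zip(mean_x_a, mean_x_b, beta_a)
    )
    unexplained = sum(
        xb * (ba - bb) for xb, ba, bb in zip(mean_x_b, beta_a, beta_b)
    )
    return explained, unexplained


# Invented example: intercept, years of schooling, years of experience.
mean_x_native = [1.0, 12.0, 20.0]
beta_native = [1.5, 0.08, 0.02]
mean_x_migrant = [1.0, 11.0, 18.0]
beta_migrant = [1.4, 0.07, 0.02]

explained, unexplained = oaxaca_decomposition(
    mean_x_native, beta_native, mean_x_migrant, beta_migrant
)
```

The two parts sum to the difference between the groups' mean predicted outcomes, which is what makes the split an exact decomposition.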
Abstract:
The question of whether or not there exists a meaningful economic distinction between quits and layoffs has attracted considerable attention. This paper utilizes a recent test proposed by J. S. Cramer and G. Ridder (1991) to test formally whether quits and layoffs may legitimately be aggregated into a single undifferentiated job-mover category. The paper also estimates wage equations for job stayers, quits, and layoffs, corrected for the endogeneity of job mobility. The major results are that quits and layoffs cannot legitimately be pooled and that correction for sample selection appears to be important.
Abstract:
This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular the reduction in annotation effort that the new query strategy makes possible, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrate that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to manually annotate than with state-of-the-art query strategies.
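To make the idea of a confidence-based query strategy concrete, the sketch below implements least-confidence sampling, one common strategy of that family. The abstract does not say which confidence-based strategies were compared, so this choice is an assumption, and the sketch is not DKI. The function name and probabilities are hypothetical.

```python
def least_confidence_query(probabilities, batch_size):
    """Pick the samples the model is least sure about.

    probabilities: one list of class probabilities per unlabeled sample.
    Returns the indices of the batch_size samples whose highest class
    probability is lowest, in order of increasing confidence.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabeled samples, two classes each.
predicted = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.99, 0.01]]
to_annotate = least_confidence_query(predicted, batch_size=2)
```

In an active learning loop, the selected samples would be sent for annotation, added to the training set, and the model retrained before the next query.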
Abstract:
In this paper, we tackle the problem of unsupervised domain adaptation for classification. In the unsupervised scenario where no labeled samples from the target domain are provided, a popular approach consists in transforming the data such that the source and target distributions become similar. To compare the two distributions, existing approaches make use of the Maximum Mean Discrepancy (MMD). However, this does not exploit the fact that probability distributions lie on a Riemannian manifold. Here, we propose to make better use of the structure of this manifold and rely on the distance on the manifold to compare the source and target distributions. In this framework, we introduce a sample selection method and a subspace-based method for unsupervised domain adaptation, and show that both these manifold-based techniques outperform the corresponding approaches based on the MMD. Furthermore, we show that our subspace-based approach yields state-of-the-art results on a standard object recognition benchmark.
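For reference, the MMD baseline mentioned above can be sketched as follows. This is the standard biased estimator of the squared MMD with a Gaussian kernel, not the manifold-based distance the paper proposes. The kernel choice, the bandwidth, and the toy points are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


# Toy 2-D samples: the target is the source shifted by 3 in each coordinate.
source_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target_points = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]
gap = mmd_squared(source_points, target_points)
```

The estimate is zero when both samples are identical and grows as the two samples move apart, which is why it is used as a measure of distribution mismatch.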
Abstract:
Background Despite decades of research, bullying in all its forms is still a significant problem within schools in Australia, as it is internationally. Anti-bullying policies and guidelines are thought to be one strategy as part of a whole school approach to reduce bullying. However, although Australian schools are required to have these policies, their effectiveness is not clear. As policies and guidelines about bullying and cyberbullying are developed within education departments, this paper explores the perspectives of those who are involved in their construction. Purpose This study examined the perspectives of professionals involved in policy construction, across three different Australian states. The aim was to determine how their relative jurisdictions define bullying and cyberbullying, the processes for developing policy, the bullying prevention and intervention recommendations given to schools and the content considered essential in current policies. Sample Eleven key stakeholders from three Australian states with similar education systems were invited to participate. The sample selection criteria included professionals with experience and training in education, cyber-safety and the responsibility to contribute to or make decisions which inform policy in this area for schools in their state. Design and methods Participants were interviewed about the definitions of bullying they used in their state policy frameworks; the extent to which cyberbullying was included; and the content they considered essential for schools to include in anti-bullying policies. Data were collected through in-depth, semi-structured interviews and analysed thematically. 
Findings Seven themes were identified in the data: (1) Definition of bullying and cyberbullying; (2) Existence of a policy template; (3) Policy location; (4) Adding cyberbullying; (5) Distinguishing between bullying and cyberbullying; (6) Effective policy; and (7) Policy as a prevention or intervention tool. The results were similar both across state boundaries and across different disciplines. Conclusion Analysis of the data suggested that, across the themes, there was some lack of information about bullying and cyberbullying. This limitation could affect the subsequent development, dissemination and sustainability of school anti-bullying policies, which have implications for the translation of research to inform better student outcomes.
Abstract:
Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve a controlled trade-off between false alarms and false rejects. Although analysis on text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, best performance with this architecture is obtained for certain combinations of instances. Heuristic rules and diversity measures have been commonly used for classifier selection, but it is shown that optimal performance is achieved for the 'best combination performance' rule. As the search complexity for this rule increases exponentially with the addition of classifiers, a measure, the sequential error ratio (SER), is proposed in this work that is specifically adapted to the characteristics of the sequential fusion architecture. The proposed measure can be used to select a classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM-based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near-optimal performance for sequential fusion of multiple instances with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone or internet-based access control and to other systems such as multiple fingerprint and multiple handwriting sample based identity verification systems.
Abstract:
It has been proposed that body image disturbance is a form of cognitive bias wherein schemas for self-relevant information guide the selective processing of appearance-related information in the environment. This threatening information receives disproportionately more attention and memory, as measured by an Emotional Stroop and incidental recall task. The aim of this thesis was to expand the literature on cognitive processing biases in non-clinical males and females by incorporating a number of significant methodological refinements. To achieve this aim, three phases of research were conducted. The initial two phases of research provided preliminary data to inform the development of the main study. Phase One was a qualitative exploration of body image concerns amongst males and females recruited through the general community and from a university. Seventeen participants (eight male; nine female) provided information on their body image and what factors they saw as positively and negatively impacting on their self-evaluations. The importance of self-esteem, mood, health and fitness, and recognition of the social ideal were identified as key themes. These themes were incorporated as psycho-social measures and Stroop word stimuli in subsequent phases of the research. Phase Two involved the selection and testing of stimuli to be used in the Emotional Stroop task. Six experimental categories of words were developed that reflected a broad range of health and body image concerns for males and females. These categories were high and low calorie food words, positive and negative appearance words, negative emotion words, and physical activity words. Phase Three addressed the central aim of the project by examining cognitive biases for body image information in empirically defined sub-groups. 
A national sample of males (N = 55) and females (N = 144), recruited from the general community and universities, completed an Emotional Stroop task, incidental memory test, and a collection of psycho-social questionnaires. Sub-groups of body image disturbance were sought using a cluster analysis, which identified three sub-groups in males (Normal, Dissatisfied, and Athletic) and four sub-groups in females (Normal, Health Conscious, Dissatisfied, and Symptomatic). No differences were noted between the groups in selective attention, although time taken to colour name the words was associated with some of the psycho-social variables. Memory biases found across the whole sample for negative emotion, low calorie food, and negative appearance words were interpreted as reflecting the current focus on health and stigma against being unattractive. Collectively these results have expanded our understanding of processing biases in the general community by demonstrating that the processing biases are found within non-clinical samples and that not all processing biases are associated with negative functionality.
Abstract:
Global warming can have a significant impact on building energy performance and the indoor thermal environment, as well as on the health and productivity of the people living and working inside buildings. Using building simulation, this paper investigates the adaptation potential of different selections of building physical properties to increased outdoor temperature in Australia. It is found that, overall, an office building with a lower insulation level, a smaller window-to-wall ratio (WWR) and/or a glass type with a lower shading coefficient, and a lower internal load density will have a lower cooling load and total energy use, and therefore a better potential to adapt to the warming external climate. Compared with clear glass, the use of reflective glass for the sample building with a WWR of 0.5 reduces the building cooling load by more than 12%. A lower internal load can also significantly reduce the building cooling load, as well as the building energy use. Comparison of results between current and future weather scenarios shows that the patterns found in the current weather scenario also exist in the future weather scenarios, but to a smaller extent.
Abstract:
Use of appropriate nursery environments will maximize gain from selection for yield of wheat (Triticum aestivum L.) in the target population of environments of a breeding program. The objective of this study was to investigate how well-irrigated (low-stress) nursery environments predict yield of lines in target environments that varied in degree of water limitation. Fifteen lines were sampled from the preliminary yield evaluation stage of the Queensland wheat breeding program and tested in 26 trials under on-farm conditions (Target Environments) across nine years (1985 to 1993) and also in 27 trials conducted at three research stations (Nursery Environments) in three years (1987 to 1989). The nursery environments were structured to impose different levels of water and nitrogen (N) limitation, whereas the target environments represented a random sample of on-farm conditions from the target population of environments. Indirect selection and pattern analysis methods were used to investigate selection for yield in the nursery environments and gain from selection in the target environments. Yield under low-stress nursery conditions was an effective predictor of yield under similar low-stress target environments (r = 0.89, P < 0.01). However, the value of the low-stress nursery as a predictor of yield in the water-limited target environments decreased with increasing water stress (moderate stress r = 0.53, P < 0.05, to r = 0.38, P > 0.05; severe stress r = -0.08, P > 0.05). Yield in the stress nurseries was a poor predictor of yield in the target environments. Until there is a clear understanding of the physiological-genetic basis of variation for adaptation of wheat to the water-limited environments in Queensland, yield improvement can best be achieved by selection for a combination of yield potential in an irrigated low-stress nursery and yield in on-farm trials that sample the range of water-limited environments of the target population of environments.
Abstract:
Background Early feeding practices lay the foundation for children’s eating habits and weight gain. Questionnaires are available to assess parental feeding but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Methods Data were from 462 mothers and children (age 21–27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically-derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Results Following expert review 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach’s α: 0.61-0.89). Four factors reflected non-responsive feeding practices: ‘Distrust in Appetite’, ‘Reward for Behaviour’, ‘Reward for Eating’, and ‘Persuasive Feeding’. Five factors reflected structure of the meal environment and limits: ‘Structured Meal Setting’, ‘Structured Meal Timing’, ‘Family Meal Setting’, ‘Overt Restriction’ and ‘Covert Restriction’. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. Conclusion The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children’s hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required.
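The internal reliability figures quoted above are Cronbach's α values. As a hedged illustration of how such a value is computed for one factor, the sketch below applies the usual formula, α = k/(k−1) · (1 − Σ item variances / variance of total score), to made-up item scores. The data and the function name are hypothetical and unrelated to the NOURISH sample.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores: one list per item, each holding that item's score for
    every respondent (all lists the same length, at least two items).
    """
    k = len(item_scores)
    # Total scale score per respondent.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Made-up scores: three items answered by five respondents.
scores = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 4],
    [1, 3, 3, 4, 5],
]
alpha = cronbach_alpha(scores)
```

Items that move together across respondents push α towards 1, while unrelated items push it towards 0.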
Abstract:
Using a sample of companies from the top 500 listed firms in Australia, we investigate whether the presence of a designated nomination committee and female representation on the nomination committee affect board gender diversity. We also examine whether gender diversity on the board affects firm risk and financial performance. We find that board gender diversity is significantly and positively associated with the presence of a designated nomination committee and that female representation on the nomination committee is a significant explanatory factor of increasing board gender diversity following the release of the 2010 Australian Securities Exchange Corporate Governance Council (ASXCGC) recommendations. Further, our results support the business case for board gender diversity as we find greater gender diversity moderates excessive firm risk which in turn improves firms’ financial performance. Our results are robust after correcting for selection bias and controlling for other board, firm and industry characteristics.