980 results for Categorical Analysis
Abstract:
In this article, we offer a new way of exploring relationships among three dimensions of a business operation: the stage of business development, the methods of creativity and the major cultural values. Although each of these has separately gained enormous attention from the management research community, as evidenced by a large volume of research studies, few studies have attempted to describe the logic that connects these three important aspects of a business, let alone provide empirical evidence of significant relationships among these variables. The paper also provides a data set and an empirical investigation of that data set, using categorical data analysis, and concludes that examining these possible relationships is meaningful and feasible even for seemingly unquantifiable information. The results also show that the most significant category among all creativity methods employed in Vietnamese enterprises is the “creative disciplines” rule in the “entrepreneurial phase,” while in general creative disciplines have played a critical role in explaining the structure of our data sample, for both stages of development under consideration.
Abstract:
We compare correspondence analysis to the logratio approach based on compositional data. We also compare correspondence analysis with an alternative approach using the Hellinger distance for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these approaches. This coefficient can be decomposed into several components, one for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. These three methods of representation can produce quite similar results. One illustrative example is given.
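As a rough illustration of why the two representations can be close, the sketch below computes the chi-square distance (the metric underlying correspondence analysis) and a Hellinger distance between two row profiles of a small contingency table. The table is made up for illustration, the function names are hypothetical, and the Hellinger distance is written without the optional 1/sqrt(2) factor; none of this is taken from the paper itself.

```python
import math

def row_profiles(table):
    """Divide each row by its total so that every row sums to 1."""
    return [[x / sum(row) for x in row] for row in table]

def column_masses(table):
    """Column totals divided by the grand total."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    return [sum(row[j] for row in table) / total for j in range(n_cols)]

def chi_square_distance(p, q, masses):
    """Distance between two profiles, weighting each column by 1/mass."""
    return math.sqrt(sum((a - b) ** 2 / m for a, b, m in zip(p, q, masses)))

def hellinger_distance(p, q):
    """Euclidean distance between square-rooted profiles (no 1/sqrt(2) factor)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

# Hypothetical 3 x 3 contingency table (rows and columns are arbitrary categories).
table = [[20, 10, 10], [10, 20, 10], [5, 5, 30]]
profiles = row_profiles(table)
masses = column_masses(table)

d_chi = chi_square_distance(profiles[0], profiles[1], masses)
d_hel = hellinger_distance(profiles[0], profiles[1])
print(d_chi, d_hel)
```

Both functions return zero for identical profiles; they differ in how strongly rare columns are weighted, which is one place the two representations can diverge.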
Abstract:
Compositional random vectors are fundamental tools in the Bayesian analysis of categorical data. Many of the issues that are discussed with reference to the statistical analysis of compositional data have a natural counterpart in the construction of a Bayesian statistical model for categorical data. This note builds on the idea of cross-fertilization of the two areas recommended by Aitchison (1986) in his seminal book on compositional data. Particular emphasis is put on the problem of what parameterization to use.
Abstract:
Genetic association analyses of family-based studies with ordered categorical phenotypes are often conducted using methods either for quantitative or for binary traits, which can lead to suboptimal analyses. Here we present an alternative likelihood-based method of analysis for single nucleotide polymorphism (SNP) genotypes and ordered categorical phenotypes in nuclear families of any size. Our approach, which extends our previous work for binary phenotypes, permits straightforward inclusion of covariate, gene-gene and gene-covariate interaction terms in the likelihood, incorporates a simple model for ascertainment and allows for family-specific effects in the hypothesis test. Additionally, our method produces interpretable parameter estimates and valid confidence intervals. We assess the proposed method using simulated data, and apply it to a polymorphism in the C-reactive protein (CRP) gene typed in families collected to investigate human systemic lupus erythematosus. By including sex interactions in the analysis, we show that the polymorphism is associated with anti-nuclear autoantibody (ANA) production in females, while there appears to be no effect in males.
Abstract:
Reliability analysis of probabilistic forecasts, in particular through the rank histogram or Talagrand diagram, is revisited. Two shortcomings are pointed out. Firstly, a uniform rank histogram is only a necessary condition for reliability. Secondly, if the forecast is assumed to be reliable, an indication is needed of how far a histogram is expected to deviate from uniformity merely due to randomness. Concerning the first shortcoming, it is suggested that forecasts be grouped or stratified along suitable criteria, and that reliability be analyzed individually for each forecast stratum. A reliable forecast should have uniform histograms for all individual forecast strata, not only for all forecasts as a whole. As to the second shortcoming, instead of the observed frequencies, the probability of the observed frequency is plotted, providing an indication of the likelihood of the result under the hypothesis that the forecast is reliable. Furthermore, a goodness-of-fit statistic is discussed which is essentially the reliability term of the Ignorance score. The discussed tools are applied to medium-range forecasts for 2 m temperature anomalies at several locations and lead times. The forecasts are stratified along the expected ranked probability score. Those forecasts which feature a high expected score turn out to be particularly unreliable.
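The rank histogram and its stratified variant can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the ensemble values, observations and stratum labels are invented, the function names are hypothetical, and ties between an observation and an ensemble member are not given any special treatment.

```python
from collections import defaultdict

def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank among the members.

    With K ensemble members there are K + 1 possible ranks (0..K).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # members below the observation
        counts[rank] += 1
    return counts

def stratified_rank_histograms(ensembles, observations, strata):
    """Build one rank histogram per stratum label."""
    groups = defaultdict(lambda: ([], []))
    for members, obs, label in zip(ensembles, observations, strata):
        groups[label][0].append(members)
        groups[label][1].append(obs)
    return {label: rank_histogram(e, o) for label, (e, o) in groups.items()}

# Hypothetical data: four forecasts with three ensemble members each.
ensembles = [[1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [2.0, 4.0, 6.0], [1.0, 1.2, 1.4]]
observations = [2.5, 0.1, 7.0, 1.3]
strata = ["low", "low", "high", "high"]  # e.g. low or high expected score

overall = rank_histogram(ensembles, observations)
by_stratum = stratified_rank_histograms(ensembles, observations, strata)
print(overall, by_stratum)
```

A histogram that looks flat overall can still be far from flat within a stratum, which is the point of the first shortcoming described above.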
Abstract:
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.
Abstract:
In this study, Bayesian analysis under a threshold animal model was used to estimate genetic correlations between morphological traits (body structure, finishing precocity and muscling) in Nelore cattle evaluated at weaning and yearling. Visual scores obtained from 7651 Nelore cattle at weaning and from 4155 animals at yearling, belonging to the Brazilian Nelore Program, were used. Genetic parameters for the morphological traits were estimated by two-trait Bayesian analysis under a threshold animal model. The genetic correlations between the morphological traits evaluated at two ages of the animal (weaning and yearling) were positive and high for body structure (0.91), finishing precocity (0.96) and muscling (0.94). These results indicate that the traits are mainly determined by the same set of genes of additive action and that direct selection at weaning will also result in genetic progress for the same traits at yearling. Thus, selection of the best genotypes during only one phase of life of the animal is suggested. However, genetic differences between morphological traits were better detected during the growth phase to yearling. Direct selection for body structure, finishing precocity and muscling at only one age, preferentially at yearling, is recommended as genetic differences between traits can be detected at this age.
Abstract:
The need for timely population data for health planning and indicators of need has increased the demand for population estimates. The data required to produce estimates are difficult to obtain and the process is time consuming. Estimation methods that require less effort and fewer data are needed. The structure preserving estimator (SPREE) is a promising technique not previously used to estimate county population characteristics. This study first uses traditional regression estimation techniques to produce estimates of county population totals. Then the structure preserving estimator, using the results produced in the first phase as constraints, is evaluated.
Regression methods are among the most frequently used demographic methods for estimating populations. These methods use symptomatic indicators to predict population change. This research evaluates three regression methods to determine which will produce the best estimates based on the 1970 to 1980 indicators of population change. Strategies for stratifying data to improve the ability of the methods to predict change were tested. Difference-correlation using PMSA strata produced the equation which fit the data best. Regression diagnostics were used to evaluate the residuals.
The second phase of this study evaluates the use of the structure preserving estimator in making estimates of population characteristics. The SPREE estimation approach uses existing data (the association structure) to establish the relationship between the variable of interest and the associated variable(s) at the county level. Marginals at the state level (the allocation structure) supply the current relationship between the variables. The full allocation structure model uses current estimates of county population totals to limit the magnitude of county estimates. The limited full allocation structure model has no constraints on county size. The 1970 county census age-gender population provides the association structure; the allocation structure is the 1980 state age-gender distribution.
The full allocation model produces good estimates of the 1980 county age-gender populations. An unanticipated finding of this research is that the limited full allocation model produces estimates of county population totals that are superior to those produced by the regression methods. The full allocation model is used to produce estimates of 1986 county population characteristics.
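Structure preserving estimates of this kind are commonly computed by iterative proportional fitting (IPF): an old cross-classification is rescaled until it matches new margins while its internal association is kept. The abstract does not say which algorithm was used, so the sketch below is only an assumption-laden illustration. The counts are invented, the function name `ipf` is hypothetical, rows stand for counties and columns for age-gender groups.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-10):
    """Rescale a positive seed table until its margins match the targets.

    seed        -- old table carrying the association structure
    row_targets -- current totals per row (e.g. county totals)
    col_targets -- current totals per column (e.g. state age-gender totals)
    """
    table = [list(map(float, row)) for row in seed]
    for _ in range(iterations):
        # Step 1: scale each row to its target total.
        for i, row in enumerate(table):
            row_sum = sum(row)
            table[i] = [x * row_targets[i] / row_sum for x in row]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
        # Stop once the row margins are also (almost) exactly matched.
        error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if error < tol:
            break
    return table

# Hypothetical old counts for two counties and two age-gender groups.
seed = [[40, 60], [30, 70]]
county_totals = [120, 130]   # assumed current county totals
state_margin = [100, 150]    # assumed current state age-gender totals

fitted = ipf(seed, county_totals, state_margin)
print(fitted)
```

The fitted table reproduces both sets of margins while keeping the cross-product ratio of the seed table, which is the sense in which the old structure is preserved.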
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data that have three categories in the outcome variable. The advantage of this model is that it permits a different number of measurements for each subject, and the duration between two consecutive measurement times can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, this model can also estimate the transition probability for each subject. The Monte Carlo simulation method will be used to investigate the goodness of model fitting compared with that obtained from other models. A public health example will be used to demonstrate the application of this method.
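In a continuous-time Markov chain, the transition probabilities over a gap of length t follow from the generator matrix Q as P(t) = exp(Qt), which is why irregular measurement intervals pose no difficulty. The sketch below illustrates this for three states with a made-up generator; the rates, the absorbing third state and the helper names are assumptions for illustration, not values or code from the dissertation.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=20, squarings=10):
    """Matrix exponential by a truncated Taylor series with scaling and squaring."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probabilities(generator, elapsed):
    """P(t) = exp(Q t) for a generator matrix Q and elapsed time t."""
    return mat_exp([[x * elapsed for x in row] for row in generator])

# Hypothetical generator: rows sum to zero, the third state is absorbing.
Q = [[-0.3, 0.2, 0.1],
     [0.1, -0.4, 0.3],
     [0.0, 0.0, 0.0]]

P_short = transition_probabilities(Q, 0.5)   # visits 0.5 time units apart
P_long = transition_probabilities(Q, 1.7)    # visits 1.7 time units apart
P_total = transition_probabilities(Q, 2.2)   # the two gaps combined
print(P_short[0], P_long[0])
```

In a likelihood-based fit, each observed pair of consecutive states would contribute the matching entry of P(t) for its own gap t, so subjects with different visit schedules can share one generator.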
Abstract:
Bibliographical footnotes.
Abstract:
This paper is devoted to the analysis of career paths and employability. The state of the art on this topic is rather poor in methodologies. Some authors propose distances well adapted to the data, but limit their analysis to hierarchical clustering. Other authors apply sophisticated methods, but only after paying the price of transforming the categorical data into continuous data via a factorial analysis. The latter approach has an important drawback since it makes a linear assumption on the data. We propose a new methodology, inspired by biology and adapted to career paths, combining optimal matching and self-organizing maps. A complete study on real-life data will illustrate our proposal.
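Optimal matching measures the dissimilarity of two categorical sequences as the cheapest series of insertions, deletions and substitutions that turns one into the other. The sketch below is a generic dynamic-programming version with constant costs; the cost values, the state labels and the function name are assumptions for illustration, and the paper may well use different costs.

```python
def optimal_matching_distance(seq_a, seq_b, indel=1.0, substitution=2.0):
    """Minimum total cost of edits turning seq_a into seq_b."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = distance between the first i states of seq_a
    # and the first j states of seq_b.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel
    for j in range(1, m + 1):
        cost[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = seq_a[i - 1] == seq_b[j - 1]
            cost[i][j] = min(
                cost[i - 1][j] + indel,                              # delete
                cost[i][j - 1] + indel,                              # insert
                cost[i - 1][j - 1] + (0.0 if same else substitution) # match or substitute
            )
    return cost[n][m]

# Hypothetical career paths, one state per period.
career_a = ["school", "school", "temp", "perm"]
career_b = ["school", "temp", "temp", "perm"]

distance = optimal_matching_distance(career_a, career_b)
print(distance)
```

A matrix of such pairwise distances keeps the data categorical, so it can feed a clustering or mapping step without first projecting the sequences onto continuous factors.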
Abstract:
The focus of this study is on the statistical analysis of categorical responses where the response values are dependent on each other. The most typical example of this kind of dependence is when repeated responses have been obtained from the same study unit. For example, in Paper I, the response of interest is pneumococcal nasopharyngeal carriage (yes/no) in 329 children. For each child, the carriage is measured nine times during the first 18 months of life, and thus repeated responses on each child cannot be assumed independent of each other. In the case of the above example, the interest typically lies in the carriage prevalence, and whether different risk factors affect the prevalence. Regression analysis is the established method for studying the effects of risk factors. In order to make correct inferences from the regression model, the associations between repeated responses need to be taken into account. The analysis of repeated categorical responses typically focuses on regression modelling. However, further insights can also be gained by investigating the structure of the association. The central theme of this study is the development of joint regression and association models. The analysis of repeated, or otherwise clustered, categorical responses is computationally difficult. Likelihood-based inference is often feasible only when the number of repeated responses for each study unit is small. In Paper IV, an algorithm is presented which substantially facilitates maximum likelihood fitting, especially when the number of repeated responses increases. In addition, a notable result arising from this work is the freely available software for likelihood-based estimation of clustered categorical responses.
Abstract:
This study analyses personal relationships, linking research to sociological theory on the questions of the social bond and of the self as social. From the viewpoint of disruptive life events and experiences, such as loss, divorce and illness, it aims at understanding how selves are bound to their significant others as those specific people ‘close or otherwise important’ to them. Who form the configurations of significant others? How do different bonds respond to disruptions and how do relational processes unfold? How is the embeddedness of selves manifested in the processes of bonding, on the one hand, and in the relational formation of the self, on the other? The bonds are analyzed from an anti-categorical viewpoint based on personal citations of significance as opposed to given relationship categories, such as ‘family’ or ‘friendship’ – the two kinds of relationships that in fact are most frequently significant. The study draws on analysis of the personal narratives of 37 Finnish women and men (in all 80 interviews) and their entire configurations of those specific people whom they cite as ‘close or otherwise important’. The analysis stresses the subjective experiences, while also investigating the actualized relational processes and configurations of all personal relationships with certain relationship histories embedded in micro-level structures. The research is based on four empirical sub-studies of personal relationships and a summary discussing the questions of the self and social bond. Discussion draws on G. H. Mead, C. Cooley, N. Elias, T. Scheff, G. Simmel and the contributors of ‘relational sociology’. Sub-studies analyse bonds to others from the viewpoint of biographical disruption and re-configuration of significant others, estranged family bonds, peer support and the formation of the most intimate relationships into exclusive and inclusive configurations. 
All analyses examine the dialectics of the social and the personal, asking how different structuring mechanisms and personal experiences and negotiations together contribute to the unfolding of the bonds. The summary elaborates personal relationships as social bonds embedded in wider webs of interdependent people and social settings that are laden with cultural expectations. Regarding the question of the relational self, the study proposes both bonding and individuality as significant. They are seen as interdependent phases of the relationality of the self. Bonding anchors the self to its significant relationships, in which individuality is manifested, for example, in contrasting and differentiating dynamics, but also in active attempts to connect with others. Individuality is not a fixed quality of the self, but a fluid and interdependent phase of the relational self. More specifically, it appears in three formats in the flux of relational processes: as a sense of unique self (via cultivation of subjective experiences), as agency and as (a search for) relative autonomy. The study includes an epilogue addressing the ambivalence between the social expectation of individuality in society and the bonded reality of selves.