3 results for random forest data analysis
in Illinois Digital Environment for Access to Learning and Scholarship Repository
Abstract:
The protein lysate array is an emerging technology for quantifying protein concentration ratios in multiple biological samples. It is gaining popularity and has the potential to answer questions about post-translational modifications and protein pathway relationships. Statistical inference for a parametric quantification procedure has been inadequately addressed in the literature, mainly due to two challenges: the increasing dimension of the parameter space and the need to account for dependence in the data. Each chapter of this thesis addresses one of these issues. In Chapter 1, an introduction to protein lysate array quantification is presented, followed by the motivations and goals for this thesis work. In Chapter 2, we develop a multi-step procedure for sigmoidal models, ensuring consistent estimation of the concentration level with full asymptotic efficiency. The results obtained in this chapter justify inferential procedures based on large-sample approximations. Simulation studies and real data analysis are used to illustrate the performance of the proposed method in finite samples. The multi-step procedure is simpler in both theory and computation than the single-step least squares method used in current practice. In Chapter 3, we introduce a nonlinear mixed effects model to account for the dependence structure of the errors. We consider a method to approximate the maximum likelihood estimator of all the parameters. Using simulation studies on various error structures, we show that for data with non-i.i.d. errors the proposed method leads to more accurate estimates and better confidence intervals than the existing single-step least squares method.
Abstract:
Considerable research has been carried out on entrepreneurship in efforts to understand its incidence in order to influence and maximize its benefits. Essentially, researchers and policy makers have sought to understand the link between individuals and business creation: why some people start businesses while others do not. The research indicates that personality traits, individual background factors, and the association of entrepreneurship with career choice and small business enterprises cannot sufficiently explain entrepreneurship. It is recognized that entrepreneurship is an intentional process and, based on Ajzen’s Theory of Planned Behavior, the most defining characteristic of entrepreneurship is the intention to start a business. The purpose of this study was, therefore, to examine factors that influence entrepreneurial intention in high school students in Kenya. Specifically, the study aimed at determining whether there were relationships between the perceptions of desirability and feasibility of entrepreneurship and the entrepreneurial intention of the students, identifying any difference in these perceptions among students of different backgrounds, and developing a model to predict entrepreneurship in the students. The study, therefore, tested how well Ajzen’s Theory of Planned Behavior applied in the Kenyan situation. A questionnaire was developed and administered to 969 final year high school students at a critically important point in their career decision making. Participants were selected using a combined convenience and random sampling technique, considering gender, rural/urban location, cost, and accessibility. A survey was the main method of data collection. Data analysis methods included descriptive statistics, correlation, ANOVA, factor analysis, effect size, and regression analysis. The findings of this study corroborate results from past studies.
Attitudes were found to influence intention and to be moderated by individual background factors. Perceived personal desirability of entrepreneurship was found to have the greatest influence on entrepreneurial intention and perceived feasibility the lowest. The study findings also showed that perceived social desirability and feasibility of entrepreneurship contributed to perception of personal desirability, and that the background factors, including gender and prior experience, influenced entrepreneurial intention both directly and indirectly. In addition, based on the literature reviewed, the study finds that entrepreneurship promotion requires reduction of the high small business mortality rate and creation of both entrepreneurs and entrepreneurial opportunities (Kruger, 2000; Shane & Venkataraman, 2000). These findings have theoretical and practical implications for researchers, policy makers, teachers, and other entrepreneurship practitioners in Kenya.
Abstract:
This dissertation research points out major problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization systems based on controlled vocabulary, which is not very well suited to web resources, and (2) information is organized by professionals, not by users, which means it does not reflect users’ current needs as they are intuitively and instantaneously expressed. In order to explore users’ needs, I examined social tags, which are user-generated uncontrolled vocabulary. As investment in professionally developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding the characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to examine the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they tended to exclude users. Furthermore, the studies have mainly focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM)-based indexing consistency method, since it is suitable for dealing with a large number of indexers.
As a second phase, an analysis of tagging effectiveness in terms of tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of a consistency analysis based only on quantitative measures of vocabulary matching. Finally, to investigate tagging patterns and behaviors, a content analysis of tag attributes was conducted based on the FRBR model. The findings revealed that there was greater consistency over all subjects among taggers compared to that for two groups of professionals. Examination of exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals’ keywords, and it was found that tags of higher specificity tended to have a higher semantic relatedness to professionals’ keywords. This leads to the conclusion that a term’s power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified the important bibliographic attributes of tags beyond describing subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings originally identified that taggers exhibited different tagging behaviors representing distinctive features and tendencies on web documents characterizing digital heterogeneous media resources.
These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. By verifying the quality and efficacy of social tagging, this dissertation research is a necessary first step toward utilizing social tagging in digital information organization. It combined both quantitative (statistics) and qualitative (content analysis using FRBR) approaches to vocabulary analysis of tags, which provided a more complete examination of the quality of tags. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.
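The vector-space view of indexing consistency mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the dissertation's actual formula or weighting scheme, so everything here is an assumption: raw term counts as vector components, cosine similarity as the consistency score, and the hypothetical names `term_vector` and `cosine_consistency`. The sample tags are invented purely for illustration.

```python
from collections import Counter
from math import sqrt


def term_vector(term_lists):
    """Aggregate the terms assigned by many indexers into one count vector.

    Each element of term_lists is the list of terms one indexer (or tagger)
    assigned to the same document. Terms are lower-cased before counting.
    """
    counts = Counter()
    for terms in term_lists:
        counts.update(term.lower() for term in terms)
    return counts


def cosine_consistency(vec_a, vec_b):
    """Cosine similarity between two term-count vectors (assumed measure).

    Returns 0.0 when either vector is empty, so the score is always defined.
    """
    shared = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data: tags from three users and keywords from two professionals
# describing the same web document.
taggers = [["python", "programming"], ["python", "tutorial"], ["programming", "code"]]
professionals = [["programming", "languages"], ["python", "programming"]]

score = cosine_consistency(term_vector(taggers), term_vector(professionals))
print(f"tagger vs. professional consistency: {score:.3f}")
```

Because each group is collapsed into a single vector, the comparison costs the same whether a group has two indexers or two thousand, which is one plausible reason a vector-space measure suits a large number of indexers.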