19 results for multiple linear regression models
in Digital Commons at Florida International University
Abstract:
The multiple linear regression model plays a key role in statistical inference and has extensive applications in business and in the environmental, physical, and social sciences. Multicollinearity is a considerable problem in multiple regression analysis: when the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods can be used to address this problem; those discussed in this thesis are the ridge regression, Liu, two-parameter biased, and LASSO estimators. First, an analytical comparison on the basis of risk was made among the ridge, Liu, and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge, and Liu estimators over a significant portion of the parameter space for large dimension. Second, a simulation study was conducted to compare the performance of the ridge, Liu, and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
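The orthonormal-design comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes X'X = I, in which case each estimator reduces to a simple transformation of the least squares coefficients. The function names, the tuning values k, d, and lam, and the LASSO objective scaling (half the residual sum of squares plus lam times the L1 norm) are illustrative assumptions, not taken from the thesis.

```python
def ridge_orthonormal(b_ols, k):
    """Ridge under X'X = I: uniform shrinkage by 1 / (1 + k)."""
    return [b / (1.0 + k) for b in b_ols]


def liu_orthonormal(b_ols, d):
    """Liu estimator under X'X = I: (X'X + I)^-1 (X'X + d I) b = (1 + d) / 2 * b."""
    return [(1.0 + d) / 2.0 * b for b in b_ols]


def lasso_orthonormal(b_ols, lam):
    """LASSO under X'X = I: soft-thresholding of each coefficient at lam."""
    shrunk = []
    for b in b_ols:
        magnitude = max(abs(b) - lam, 0.0)
        shrunk.append(magnitude if b >= 0 else -magnitude)
    return shrunk


# Hypothetical least squares coefficients, for illustration only.
b_ols = [2.0, -1.0]
print(ridge_orthonormal(b_ols, k=1.0))
print(liu_orthonormal(b_ols, d=0.5))
print(lasso_orthonormal(b_ols, lam=1.5))
```

Unlike ridge and Liu, which rescale every coefficient, the soft-thresholding rule can set small coefficients exactly to zero, which is one intuition for why LASSO can have lower risk when many true coefficients are zero.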
Abstract:
The nation's freeway systems are becoming increasingly congested. Traffic incidents are a major contributor to congestion on freeways. Traffic incidents are non-recurring events such as accidents or stranded vehicles that cause a temporary roadway capacity reduction, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help the process of identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success. This dissertation research attempts to improve on these previous efforts by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid in the decision making of traffic diversion in the event of an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. The multiple linear regression analysis and decision tree based method were applied to develop the offline models, and the rule-based method and a tree algorithm called M5P were used to develop the online models.
The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
Abstract:
Hydrophobicity as measured by Log P is an important molecular property related to toxicity and carcinogenicity. With increasing public health concerns for the effects of Disinfection By-Products (DBPs), there are considerable benefits in developing Quantitative Structure and Activity Relationship (QSAR) models capable of accurately predicting Log P. In this research, Log P values of 173 DBP compounds in 6 functional classes were used to develop QSAR models by Multiple Linear Regression (MLR) analysis with three molecular descriptors: Energy of the Lowest Unoccupied Molecular Orbital (ELUMO), Number of Chlorine (NCl), and Number of Carbon (NC). The QSAR models developed were validated based on the Organization for Economic Co-operation and Development (OECD) principles. The model Applicability Domain (AD) and mechanistic interpretation were explored. Considering the very complex nature of DBPs, the established QSAR models performed very well with respect to goodness-of-fit, robustness and predictability. The predicted values of Log P of DBPs by the QSAR models were found to be significant, with R² values from 81% to 98%. The Leverage Approach by Williams Plot was applied to detect and remove outliers, consequently increasing R² by approximately 2% to 13% for different DBP classes. The developed QSAR models were statistically validated for their predictive power by the Leave-One-Out (LOO) and Leave-Many-Out (LMO) cross validation methods. Finally, Monte Carlo simulation was used to assess the variations and inherent uncertainties in the QSAR models of Log P and determine the most influential parameters in connection with Log P prediction.
The developed QSAR models in this dissertation will have a broad applicability domain because the research data set covered six out of eight common DBP classes, including halogenated alkane, halogenated alkene, halogenated aromatic, halogenated aldehyde, halogenated ketone, and halogenated carboxylic acid, which have been brought to the attention of regulatory agencies in recent years. Furthermore, the QSAR models are suitable for predicting similar DBP compounds within the same applicability domain. The selection and integration of various methodologies developed in this research may also benefit future research in similar fields.
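Leave-One-Out validation of a multiple linear regression, as named in this abstract, can be sketched as follows. This is a minimal pure-Python illustration, not the dissertation's procedure: the helper names (`solve`, `fit_ols`, `predict`, `loo_q2`), the definition Q² = 1 − PRESS / TSS, and the tiny data set are all assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least squares fit via the normal equations; returns [intercept, b1, b2, ...]."""
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(p)] for a in range(p)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(p)]
    return solve(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def loo_q2(rows, y):
    """Refit with each observation left out once; Q^2 = 1 - PRESS / TSS."""
    press = 0.0
    for i in range(len(y)):
        beta = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(beta, rows[i])) ** 2
    mean_y = sum(y) / len(y)
    tss = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / tss


# Hypothetical descriptor rows (two descriptors) and responses, for illustration only.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y = [1, 3, 0, 2, 4, 1]
print(fit_ols(rows, y), loo_q2(rows, y))
```

A Q² close to the fitted R² suggests the model is not relying on any single compound; a large gap between the two is a common warning sign of overfitting.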
Abstract:
Highways are generally designed to serve a mixed traffic flow that consists of passenger cars, trucks, buses, recreational vehicles, etc. The fact that the impacts of these different vehicle types are not uniform creates problems in highway operations and safety. A common approach to reducing the impacts of truck traffic on freeways has been to restrict trucks to certain lane(s) to minimize the interaction between trucks and other vehicles and to compensate for their differences in operational characteristics. The performance of different truck lane restriction alternatives differs under different traffic and geometric conditions. Thus, a good estimate of the operational performance of different truck lane restriction alternatives under prevailing conditions is needed to help make informed decisions on truck lane restriction alternatives. This study develops operational performance models that can be applied to help identify the most operationally efficient truck lane restriction alternative on a freeway under prevailing conditions. The operational performance measures examined in this study include average speed, throughput, speed difference, and lane changes. Prevailing conditions include number of lanes, interchange density, free-flow speeds, volumes, truck percentages, and ramp volumes. Recognizing the difficulty of collecting sufficient data for an empirical modeling procedure that involves a high number of variables, the simulation approach was used to estimate the performance values for various truck lane restriction alternatives under various scenarios. Both the CORSIM and VISSIM simulation models were examined for their ability to model truck lane restrictions. Due to a major problem found in the CORSIM model for truck lane modeling, the VISSIM model was adopted as the simulator for this study.
The VISSIM model was calibrated mainly to replicate the capacity given in the 2000 Highway Capacity Manual (HCM) for various free-flow speeds under the ideal basic freeway section conditions. Non-linear regression models for average speed, throughput, average number of lane changes, and speed difference between the lane groups were developed. Based on the performance models developed, a simple decision procedure was recommended to select the desired truck lane restriction alternative for prevailing conditions.
Abstract:
Annual average daily traffic (AADT) is important information for many transportation planning, design, operation, and maintenance activities, as well as for the allocation of highway funds. Many studies have attempted AADT estimation using the factor approach, regression analysis, time series, and artificial neural networks. However, these methods are unable to account for the spatially variable influence of independent variables on the dependent variable, even though it is well known that spatial context is important for many transportation problems, including AADT estimation. In this study, applications of geographically weighted regression (GWR) methods to estimating AADT were investigated. The GWR-based methods considered the influence of correlations among the variables over space and the spatial non-stationarity of the variables. A GWR model allows different relationships between the dependent and independent variables to exist at different points in space. In other words, model parameters vary from location to location, and the locally linear regression parameters at a point are affected more by observations near that point than by observations farther away. The study area was Broward County, Florida. Broward County lies on the Atlantic coast between Palm Beach and Miami-Dade counties. In this study, a total of 67 variables were considered as potential AADT predictors, and six variables (lanes, speed, regional accessibility, direct access, density of roadway length, and density of seasonal household) were selected to develop the models. To investigate the predictive powers of various AADT predictors over space, statistics including local r-square, local parameter estimates, and local errors were examined and mapped. The local variations in relationships among parameters were investigated, measured, and mapped to assess the usefulness of GWR methods.
The results indicated that the GWR models were able to better explain the variation in the data and to predict AADT with smaller errors than the ordinary linear regression models for the same dataset. Additionally, GWR was able to model the spatial non-stationarity in the data, i.e., the spatially varying relationship between AADT and predictors, which cannot be modeled in ordinary linear regression.
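The idea that local regression parameters are driven mainly by nearby observations can be sketched with a single-predictor weighted least squares fit. This is an illustration only: the Gaussian distance kernel, the `bandwidth` parameter, the function name `gwr_local_fit`, and the synthetic sites are assumptions, and the study's actual six-predictor models are not reproduced here.

```python
import math


def gwr_local_fit(sites, x, y, target, bandwidth):
    """Weighted least squares fit of y = a + b * x centred on `target`.

    Each observation is weighted by a Gaussian kernel of its distance to
    `target`, so nearby sites influence the local parameters the most.
    Returns (intercept, slope).
    """
    w = [math.exp(-0.5 * (math.dist(s, target) / bandwidth) ** 2) for s in sites]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic example: the x-y relationship differs between a western and an eastern cluster.
sites = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]
print(gwr_local_fit(sites, x, y, target=(0, 0), bandwidth=1.0))
print(gwr_local_fit(sites, x, y, target=(10, 0), bandwidth=1.0))
```

A single global regression on these six points would report one compromise slope, whereas the two local fits recover a different slope in each cluster, which is the spatial non-stationarity the abstract refers to.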
Abstract:
The purpose of this research study was to investigate whether the determination of school readiness, as evaluated by Broward County kindergarten teachers on the Florida's Expectations of School Readiness checklist, can be attributed to the effects of gender, chronological age on school entry, racial or ethnic background, attending public preschool, native language other than English, or socioeconomic status. This is a descriptive study in which the number of expectations passed or failed for each of the identifier categories was compared. The Chi-squared distribution was used to evaluate the null hypothesis that "chronological age at entry to school, gender, race or ethnicity, native language other than English, public preschool experience, and socioeconomic status have no effect on the determination of readiness for school". Results were confirmed using t-tests, ANOVA, and linear regression models. The cohort of 1555 Broward County students in the study was evaluated using the Florida's Expectations for School Readiness checklist and was determined not ready for school during the initial data collection year 1996-1997. The determination of school readiness was significantly dependent on the gender and the racial or ethnic background of the students in the cohort. The socioeconomic status and native language other than English designations were significant for students only in the areas of preacademic, academic and literacy development. Chronological age on entry to school or attendance in public preschool prior to entry in kindergarten for the cohort was not significant in the determination of readiness for school. Given the fact that this study followed only students that were determined not ready for school, it is recommended that a second cohort of both "ready" and "not ready" students be studied.
Abstract:
As traffic congestion worsens and new roadway construction is severely constrained because of limited availability of land, high cost of land acquisition, and communities' opposition to the building of major roads, new solutions have to be sought to either make roadway use more efficient or reduce travel demand. There is general agreement that travel demand is affected by land use patterns. However, traditional aggregate four-step models, which are presently the prevailing modeling approach, assume when trip generation is estimated that traffic conditions will not affect people's decisions on whether to make a trip. Existing survey data indicate, however, that differences exist in trip rates for different geographic areas. The reasons for such differences have not been carefully studied, and success in quantifying the influence of land use on travel demand beyond employment, households, and their characteristics has been too limited to be useful to the traditional four-step models. There may be a number of reasons, such as that the representation of the influence of land use on travel demand is aggregated and not explicit, and that land use variables such as density and mix, and accessibility as measured by travel time and congestion, have not been adequately considered. This research employs the artificial neural network (ANN) technique to investigate the potential effects of land use and accessibility on trip productions. Sixty-two variables that may potentially influence trip production are studied. These variables include demographic, socioeconomic, land use and accessibility variables. Different architectures of ANN models are tested. Sensitivity analysis of the models shows that land use does have an effect on trip production, and so does traffic condition. The ANN models are compared with linear regression models and cross-classification models using the same data.
The results show that ANN models are better than the linear regression models and cross-classification models in terms of RMSE. Future work may focus on finding a representation of traffic condition using existing network data and population data that would be available when the variables are needed for prediction.
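The RMSE criterion used for the model comparison can be computed as in the following minimal sketch. The function name and the observed and predicted trip counts are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed trip productions and two sets of model predictions.
observed = [120, 95, 143, 80]
model_a = [118, 99, 140, 85]
model_b = [110, 90, 150, 70]
print(rmse(observed, model_a), rmse(observed, model_b))
```

Because the errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones, so the model with the lower value is preferred under this criterion.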
Abstract:
The purpose of this study was to assess the effect of performance feedback on Athletic Trainers’ (ATs) perceived knowledge (PK) and likelihood to pursue continuing education (CE). The investigation was grounded in the theories of “the definition of the situation” (Thomas & Thomas, 1928) and the “illusion of knowing” (Glenberg, Wilkinson, & Epstein, 1982), suggesting that PK drives behavior. This investigation measured the degree to which knowledge gap predicted CE seeking behavior by providing performance feedback designed to change PK. A pre-test post-test control-group design was used to measure PK and likelihood to pursue CE before and after assessing actual knowledge. ATs (n = 103) were randomly sampled and assigned to two groups, with and without performance feedback. Two independent samples t-tests were used to compare groups on the difference scores of the dependent variables. Likelihood to pursue CE was predicted by three variables using multiple linear regression: perceived knowledge, pre-test likelihood to pursue CE, and knowledge gap. There was a significant 68.4% difference (t(101) = 2.72, p = 0.01, ES = 0.45) between groups in the change scores for likelihood to pursue CE because of the performance feedback (Experimental group = 13.7% increase; Control group = 4.3% increase). The strongest relationship among the dependent variables was between pre-test and post-test measures of likelihood to pursue CE (F(2, 102) = 56.80, p < 0.01, r = 0.73, R² = 0.53). The pre- and post-test predictive relationship was enhanced when group was included in the model. In this model, $Y_{\text{CEpost}} = 0.76\,X_{\text{CEpre}} - 0.34\,X_{\text{group}} + 2.24 + E$, group accounted for a significant amount of unique variance in predicting CE while the pre-test likelihood to pursue CE variable was held constant (F(3, 102) = 40.28, p < 0.01, r = 0.74, R² = 0.55). Pre-test knowledge gap, regardless of group allocation, was a linear predictor of the likelihood to pursue CE (F(1, 102) = 10.90, p = .01, r = .31, R² = .10).
In this investigation, performance feedback significantly increased participants’ likelihood to pursue CE. Pre-test knowledge gap was a significant predictor of likelihood to pursue CE, regardless of whether performance feedback was provided. ATs may have self-assessed and engaged in internal feedback as a result of their test-taking experience. These findings indicate that feedback, both internal and external, may be necessary to trigger CE seeking behavior.
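The fitted equation reported in this abstract can be evaluated directly, and the reported R² values can be checked against the reported r values. In the sketch below the coefficients 0.76, -0.34, and 2.24 come from the abstract; the 0/1 coding of the group variable, the example pre-test score, and the function name are assumptions, since the abstract does not state the coding or the measurement scale.

```python
def predict_ce_post(ce_pre, group):
    """Predicted post-test likelihood to pursue CE, omitting the error term E.

    `group` is assumed to be a 0/1 indicator; the abstract does not say which
    group is coded 1.
    """
    return 0.76 * ce_pre - 0.34 * group + 2.24


# Hypothetical pre-test score of 5.0, evaluated under both group codes.
print(predict_ce_post(5.0, 0), predict_ce_post(5.0, 1))

# Consistency check: each reported R^2 equals the squared reported r after rounding.
for r, r_squared in [(0.73, 0.53), (0.74, 0.55), (0.31, 0.10)]:
    print(r, round(r ** 2, 2), r_squared)
```

Holding the pre-test score fixed, the two group codes differ by the group coefficient, 0.34 points on the outcome scale, which is what "unique variance while the pre-test variable was held constant" refers to.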
Abstract:
Cohort programs have been instituted at many universities to accommodate the growing number of mature adult graduate students who pursue degrees while maintaining multiple commitments such as work and family. While it is estimated that as many as 40–60% of students who begin graduate study fail to complete degrees, it is thought that attrition may be even higher for this population of students. Yet, little is known about the impact of cohorts on the learning environment and whether cohort programs affect graduate student retention. Retention theory stresses the importance of the academic department, quality of faculty-student relationships and student involvement in the life of the academic community as critical determinants in students' decisions to persist to degree completion. However, students who are employed full-time typically spend little time on campus engaged in the learning environment. Using academic and social integration theory, this study examined the experiences of working adult graduate students enrolled in cohort (CEP) and non-cohort (non-CEP) programs and the influence of these experiences on intention to persist. The Graduate Program Context Questionnaire was administered to graduate students (N = 310) to examine measures of academic and social integration and intention to persist. Sample t tests and ANOVAs were conducted to determine whether differences in perceptions could be identified between cohort and non-cohort students. Multiple linear regression was used to identify variables that predict students' intention to persist. While there were many similarities, significant differences were found between CEP and non-CEP student groups on two measures. CEP students rated peer-student relationships higher and scored higher on the intention to persist measure than non-CEP students. The psychological integration measure, however, was the strongest predictor of intention to persist for both the CEP and non-CEP groups. 
This study supports the research literature which suggests that CEP programs encourage the development of peer-student relationships and promote students' commitment to persistence.
Abstract:
Higher education institutions across the United States have developed global learning initiatives to support student achievement of global awareness and global perspective, but assessment options for these outcomes are extremely limited. A review of research for a global learning initiative at a large, Hispanic-serving, urban, public, research university in South Florida found a lack of instruments designed to measure global awareness and global perspective in the context of an authentic performance assessment. This quasi-experimental study explored the development of two rubrics for the global learning initiative and the extent to which evidence supported the rubrics' validity and reliability. One holistic rubric was developed to measure students' global awareness and the second to measure their global perspective. The study utilized a pretest/posttest nonequivalent group design. Multiple linear regression was used to ascertain the rubrics' ability to discern and compare average learning gains of undergraduate students enrolled in two global learning courses and students enrolled in two non-global learning courses. Parallel pretest/posttest forms of the performance task required students to respond to two open-ended questions, aligned with the learning outcomes, concerning a complex case narrative. Trained faculty raters read responses and used the rubrics to measure students' global awareness and perspective. Reliability was tested by calculating the rates of agreement among raters. Evidence supported the finding that the global awareness and global perspective rubrics yielded scores that were highly reliable measures of students' development of these learning outcomes. Chi-square tests of frequency found significant rates of inter-rater agreement exceeding the study's .80 minimum requirement. Evidence also supported the finding that the rubrics yielded scores that were valid measures of students' global awareness and global perspective. 
Regression analyses found little evidence of main effects; however, post hoc analyses revealed a significant interaction between global awareness pretest scores and the treatment, the global learning course. Significant interaction was also found between global perspective pretest scores and the treatment. These crossover interactions supported the finding that the global awareness and global perspective rubrics could be used to detect learning differences between the treatment and control groups as well as differences within the treatment group.
Abstract:
This cross-sectional study investigated the association of tobacco smoke, vitamin D status, anthropometric parameters, and kidney function in Turkish immigrants with type 2 diabetes (T2D) living in the Netherlands. The study sample included a total of 110 participants aged 30 years and older (males = 46; females = 64). Serum cotinine, a biomarker for smoke exposure, was measured with a solid-phase competitive chemiluminescent immunoassay. Serum 25-hydroxyvitamin D [25(OH)D] was determined by electrochemiluminescence immunoassay (ECLIA). Measures of obesity, including body weight, body mass index (BMI), waist circumference (WC), and hip circumference (HC), were taken. Waist-to-hip ratio (WHR) and waist-to-height ratio (WHtR) were calculated. Urine albumin was measured by immunoturbidimetric assay. Urine creatinine was determined using the Jaffe method. All statistical analyses were performed using SPSS, version 19.0 (SPSS Inc., Chicago, IL, USA). Independent samples t-tests, chi-squared tests, multiple linear regression, and logistic regression analysis were used. Cotinine levels were positively associated with cholesterol to HDL ratio and atherosclerosis-index. Serum 25(OH)D levels were negatively associated with diastolic blood pressure. Gender-specific associations between anthropometric measures and high sensitivity C-reactive protein (hs-CRP) levels were observed. Hs-CRP was positively associated with WC and WHR in males and WHtR in females. Microalbuminuria (MAU), as determined by albumin-to-creatinine ratio, was present in 21% of the Turkish immigrants with T2D. Participants with hypertension were 6.58 times more likely (adjusted odds ratio) to have positive MAU as compared to normotensive participants. Our findings indicate that serum cotinine, 25(OH)D, hs-CRP, and MAU may be assessed as a standard of care for T2D management in the Turkish immigrant population.
Further research should be conducted following cohorts to determine the effects of these biomarkers on CVD morbidity and mortality.
Abstract:
In the United States, the federal Empowerment Zone (EZ) program aimed to create and retain business investment in poor communities and to encourage local hiring through the use of special tax credits, relaxed regulations, social service grants, and other incentives. My dissertation explores whether the Round II Urban EZs had a beneficial impact on local communities and what factors influenced the implementation and performance of the EZs, using three modes of inquiry. First, linear regression models investigate whether the federal revitalization program had a statistically significant impact on the creation of new businesses and jobs in Round II Urban EZ communities. Second, location quotient and shift-share analysis are used to reveal the industry clusters in three EZ communities that experienced positive business and job growth. Third, qualitative analysis is employed to explore factors that influenced the implementation and performance of EZs in general, and in particular, Miami-Dade County, Florida. The results show an EZ's presence failed to have a significant influence on local business and job growth. In communities that experienced a beneficial impact from EZs, there has been a pattern of decline in manufacturing companies and increase in service-driven firms. The case study suggests that institutional factors, such as governance structure, leadership, administrative capacity, and community participation have affected the effectiveness of the program's implementation and performance.
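The location quotient used in the second mode of inquiry is commonly defined as an industry's share of local employment divided by its share of employment in a reference area. The sketch below uses that textbook definition; the function name, argument names, and employment figures are hypothetical and are not taken from the dissertation.

```python
def location_quotient(local_industry, local_total, ref_industry, ref_total):
    """Industry share of local employment divided by its share in the reference area."""
    local_share = local_industry / local_total
    ref_share = ref_industry / ref_total
    return local_share / ref_share


# Hypothetical counts: 200 of 1,000 local jobs versus 1,000 of 10,000 reference-area jobs.
print(location_quotient(200, 1000, 1000, 10000))
```

A value above 1 indicates that the industry is more concentrated locally than in the reference area, which is how such an analysis flags candidate industry clusters.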
Abstract:
Amidst concerns about achieving high levels of technology to remain competitive in the global market without compromising economic development, national economies are experiencing a high demand for human capital. As higher education is assumed to be the main source of human capital, this analysis focused on a more specific and less explored area of the generally accepted idea that higher education contributes to economic growth. The purpose of this study, therefore, was to find whether higher education also contributes to economic development, and whether that contribution is more substantial in a globalized context. Consequently, a multiple linear regression analysis was conducted to support with statistical significance the answer to the research question: Does higher education contribute to economic development in the context of globalization? The information analyzed was obtained from historical data of 91 selected countries, and the period of time of the study was 10 years (1990–2000). Some variables, however, were lagged back 5, 10 or 15 years along a 15-year timeframe (1975–1990). The resulting comparative static model was based on the Cobb-Douglas production function and the Solow model to specify economic growth as a function of physical capital, labor, technology, and productivity. Then, formal education, economic development, and globalization were added to the equation. The findings of this study supported the assumption that the independent contribution of the changes in higher education completion and globalization to changes in economic growth is more substantial than the contribution of their interaction. The results also suggested that changes in higher and secondary education completion contribute much more to changes in economic growth in less developed countries than in their more developed counterparts.
In conclusion, based on the results of this study, I proposed the implementation of public policy in less developed countries to promote and expand adequate secondary and higher education systems with the purpose of helping in the achievement of economic development. I also recommended further research efforts on this topic to emphasize the contribution of education to the economy, mainly in less developed countries.
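A Cobb-Douglas production function is multiplicative, yet it can be estimated by multiple linear regression because taking logarithms makes it linear in its parameters. The sketch below shows only the textbook two-input form Y = A · K^alpha · L^beta; the additional education and globalization terms of the study's model are not reproduced, and all numeric values and function names are hypothetical.

```python
import math


def cobb_douglas(a, k, l, alpha, beta):
    """Output from technology a, capital k, and labor l: Y = a * k**alpha * l**beta."""
    return a * k ** alpha * l ** beta


def log_linear_form(a, k, l, alpha, beta):
    """The same relationship after taking logs: ln Y = ln a + alpha ln k + beta ln l."""
    return math.log(a) + alpha * math.log(k) + beta * math.log(l)


# Hypothetical inputs: both routes give the same value of ln Y up to rounding.
args = (2.0, 8.0, 4.0, 0.3, 0.7)
print(math.log(cobb_douglas(*args)), log_linear_form(*args))
```

In the log form, alpha and beta appear as ordinary regression coefficients on ln K and ln L, so they can be read as the elasticities of output with respect to capital and labor.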
Abstract:
Post-Soviet Ukraine is in a time of upheaval and transition. Internal relations between pro-Western and pro-Russian supporters have deteriorated in the light of the recent political events of the Euro Revolution, Russia's occupation of the Crimean peninsula, and the militant confrontations in the southeastern regions of the country. In the light of these developments, intercultural competence is greatly needed to alleviate domestic tensions and enable effective intercultural communication with the representatives of different cultures within the country and beyond its borders. This study established a baseline of psychometric estimates of intercultural competence of Ukrainian higher education faculty. A sample of 276 professors of different academic majors from one university in Western Ukraine participated in the research. The Global Perspective Inventory (GPI; Merrill, Braskamp, & Braskamp, 2012) was chosen as a research instrument to measure intercultural competence of the faculty members. The GPI takes into account cognitive, intrapersonal, and interpersonal domains, each of which contains two scales reflective of theories of cultural development and intercultural communication: Cognitive-Knowing, Cognitive-Knowledge, Intrapersonal-Identity, Intrapersonal-Affect, Interpersonal-Social Responsibility, and Interpersonal-Social Interaction. Because the research instrument has neither been previously used as a measure of intercultural competence, nor administered in Ukraine, it was cross-validated using a Table of Specification (Newman, Lim, & Pineda, 2013) and two sets of factor analyses. As a result, a modified version of the GPI was created for use in Ukraine. Multiple linear regression analyses were used to test relationships between the participants' GPI scores on intercultural competence and several independent variables that consisted of academic discipline, intercultural experience, and how long the participants had taught at the university.
The analyses determined a positive relationship between the scores on three out of six scales of the original version and two out of five scales of the modified version of the GPI and all the independent variables simultaneously. The relationship between the faculty responses on the six scales of both GPI versions and the independent variables controlling for each other produced mixed results. A unique role of intercultural professional development in predicting intercultural competence was discussed.
Abstract:
Purpose: Metabolic syndrome (MetS) is associated with the development of cardiovascular disease (CVD) and type 2 diabetes. Decreases in circulating adiponectin and ghrelin have been associated with MetS. Our primary aim was to evaluate the relationship of MetS with adiponectin and ghrelin for Cuban Americans with and without type 2 diabetes. Methods: Cross-sectional study of 367 adults, self-identified as being of Cuban extraction and randomly recruited from a mailing list of Broward and Miami-Dade counties. Fasted whole blood for adiponectin (ADPN) was collected using K3EDTA tubes and measured by ELISA. Ghrelin was assayed with fasted blood plasma by Enzyme Immunometric Assay. MetS and 10-year risk for coronary heart disease (CHD) were determined using the ATP III criteria. Results: Adiponectin (F = 51.8, R² = 0.21, p < 0.001) and ghrelin (F = 12.77, R² = 0.06, p < 0.001) differed by diabetes status (ANOVA) but not by age or gender. In stepwise linear regression models, triglyceride levels ≥ 150 mg/dL negatively corresponded (coefficient = -0.23) with ghrelin levels for persons without diabetes (F = 7.45, R² = 0.053, p = 0.007); abdominal obesity and fasting plasma glucose predicted high sensitivity C-reactive protein (hs-CRP) for persons with and without diabetes (F = 16.3, R² = 0.144, p < 0.001). Conclusion: Low ghrelin levels were associated with MetS regardless of diabetes status. High adiponectin levels were related to a low probability for those without diabetes only. There was a positive association of hs-CRP with BMI, MetS and number of MetS components.