9 results for Linear regression analysis
in Digital Commons at Florida International University
Abstract:
The multiple linear regression model plays a key role in statistical inference and has extensive applications in business and in the environmental, physical, and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods can be used to address this problem; those discussed in this thesis are the ridge regression, Liu, two-parameter biased, and LASSO estimators. First, an analytical comparison on the basis of risk was made among the ridge, Liu, and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge, and Liu estimators over a significant portion of the parameter space for large dimension. Second, a simulation study was conducted to compare the performance of the ridge, Liu, and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
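As an illustration of the estimators named in this abstract, the following is a minimal NumPy sketch of the ordinary least squares, ridge, and Liu estimators on nearly collinear data. The synthetic data, the seed, and the tuning values `k=1.0` and `d=0.5` are assumptions chosen for illustration only; they are not taken from the thesis, and the two-parameter and LASSO estimators are not shown.

```python
import numpy as np

def ols(X, y):
    # Ordinary least squares estimate
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, k):
    # Ridge estimator: (X'X + kI)^{-1} X'y, with k >= 0
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def liu(X, y, d):
    # Liu estimator: (X'X + I)^{-1} (X'y + d * beta_ols), with 0 < d < 1
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + np.eye(p), X.T @ y + d * ols(X, y))

# Synthetic, nearly collinear regressors (illustrative assumption)
rng = np.random.default_rng(0)
n = 50
z = rng.normal(size=n)
X = np.column_stack([z + 0.01 * rng.normal(size=n),
                     z + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)

b_ols = ols(X, y)
b_ridge = ridge(X, y, k=1.0)   # k is a hypothetical tuning value
b_liu = liu(X, y, d=0.5)       # d is a hypothetical tuning value
print("OLS:  ", b_ols)
print("Ridge:", b_ridge)
print("Liu:  ", b_liu)
```

Both biased estimators shrink the coefficient vector relative to least squares; setting `k=0` in `ridge` or `d=1` in `liu` recovers the least squares estimate.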
Abstract:
The nation's freeway systems are becoming increasingly congested. Traffic incidents are a major contributor to traffic congestion on freeways. They are non-recurring events, such as accidents or stranded vehicles, that cause a temporary reduction in roadway capacity, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help in identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success. This dissertation research attempts to improve on these previous efforts by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid decision making on traffic diversion during an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. Multiple linear regression analysis and a decision-tree-based method were applied to develop the offline models, and a rule-based method and a tree algorithm called M5P were used to develop the online models.
The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
Abstract:
Amidst concerns about achieving high levels of technology to remain competitive in the global market without compromising economic development, national economies are experiencing a high demand for human capital. As higher education is assumed to be the main source of human capital, this analysis focused on a more specific and less explored area of the generally accepted idea that higher education contributes to economic growth. The purpose of this study, therefore, was to find whether higher education also contributes to economic development, and whether that contribution is more substantial in a globalized context. Consequently, a multiple linear regression analysis was conducted to support with statistical significance the answer to the research question: Does higher education contribute to economic development in the context of globalization? The information analyzed was obtained from historical data for 91 selected countries, and the study period was 10 years (1990–2000). Some variables, however, were lagged 5, 10, or 15 years along a 15-year timeframe (1975–1990). The resulting comparative static model was based on the Cobb-Douglas production function and the Solow model to specify economic growth as a function of physical capital, labor, technology, and productivity. Then, formal education, economic development, and globalization were added to the equation. The findings of this study supported the assumption that the independent contribution of the changes in higher education completion and globalization to changes in economic growth is more substantial than the contribution of their interaction. The results also suggested that changes in higher and secondary education completion contribute much more to changes in economic growth in less developed countries than in their more developed counterparts.
In conclusion, based on the results of this study, I proposed the implementation of public policy in less developed countries to promote and expand adequate secondary and higher education systems with the purpose of helping to achieve economic development. I also recommended further research on this topic to emphasize the contribution of education to the economy, mainly in less developed countries.
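For readers unfamiliar with how a Cobb-Douglas production function leads to a linear regression, the textbook log-linearization is sketched below. The symbols Y (output), A (technology or productivity), K (physical capital), L (labor), the exponents, and the error term are the standard textbook ones and are assumptions here; the abstract does not give the study's actual specification, including how education and globalization enter the equation.

```latex
% Textbook Cobb-Douglas form (not the study's exact specification)
Y = A \, K^{\alpha} L^{\beta}
% Taking natural logarithms gives a model that is linear in the parameters,
% which can be estimated by multiple linear regression:
\ln Y = \ln A + \alpha \ln K + \beta \ln L + \varepsilon
```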
Abstract:
Prior to 2000, there were fewer than 1.6 million students enrolled in at least one online course. By fall 2010, student enrollment in online distance education showed a phenomenal 283% increase to 6.1 million. Two years later, this number had grown to 7.1 million. In light of this significant growth and skepticism about quality, there have been calls for greater oversight of this format of educational delivery. Accrediting bodies tasked with this oversight have developed guidelines and standards for online education. There is a lack of empirical studies that examine the relationship between accrediting standards and student success. The purpose of this study was to examine the relationship between the presence of Southern Association of Colleges and Schools Commission on Colleges (SACSCOC) standards for online education in online courses, (a) student support services and (b) curriculum and instruction, and student success. An original 24-item survey with an overall reliability coefficient of .94 was administered to students (N=464) at Florida International University, enrolled in 24 university-wide undergraduate online courses during fall 2014, who rated the presence of these standards in their online courses. The general linear model was used to analyze the data. The results of the study indicated that the two standards, student support services and curriculum and instruction, were both significantly and positively correlated with student success, but with small R² values and strengths of association less than .35 and .20, respectively. Chi-square tests for differences in student success between higher and lower rated online courses produced mixed results when controlling for covariates such as discipline, gender, race/ethnicity, GPA, age, and number of online courses previously taken.
A multiple linear regression analysis revealed that the curriculum and instruction standard was the only variable that accounted for a significant amount of unique variance in student success. Another regression test revealed no significant interaction effect between the two SACSCOC standards and GPA in predicting student success. The results of this study are useful for administrators, faculty, and researchers who are interested in accreditation standards for online education and how these standards relate to student success.
Abstract:
Annual average daily traffic (AADT) is important information for many transportation planning, design, operation, and maintenance activities, as well as for the allocation of highway funds. Many studies have attempted AADT estimation using the factor approach, regression analysis, time series, and artificial neural networks. However, these methods are unable to account for the spatially variable influence of independent variables on the dependent variable, even though it is well known that spatial context is important for many transportation problems, including AADT estimation. In this study, applications of geographically weighted regression (GWR) methods to estimating AADT were investigated. The GWR-based methods considered the influence of correlations among the variables over space and the spatial non-stationarity of the variables. A GWR model allows different relationships between the dependent and independent variables to exist at different points in space. In other words, model parameters vary from location to location, and the locally linear regression parameters at a point are affected more by observations near that point than by observations farther away. The study area was Broward County, Florida, which lies on the Atlantic coast between Palm Beach and Miami-Dade counties. In this study, a total of 67 variables were considered as potential AADT predictors, and six variables (lanes, speed, regional accessibility, direct access, density of roadway length, and density of seasonal households) were selected to develop the models. To investigate the predictive power of the various AADT predictors over space, statistics including local R-square, local parameter estimates, and local errors were examined and mapped. The local variations in relationships among parameters were investigated, measured, and mapped to assess the usefulness of GWR methods.
The results indicated that the GWR models were able to better explain the variation in the data and to predict AADT with smaller errors than the ordinary linear regression models for the same dataset. Additionally, GWR was able to model the spatial non-stationarity in the data, i.e., the spatially varying relationship between AADT and predictors, which cannot be modeled in ordinary linear regression.
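The core idea of GWR described in this abstract, one weighted least squares fit per location with nearer observations weighted more heavily, can be sketched as follows. The Gaussian kernel, the fixed bandwidth, the function name `gwr_coefficients`, and the synthetic data are all assumptions made for illustration; the abstract does not state which kernel, bandwidth selection method, or software the study used.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location.

    Weights come from a Gaussian kernel of the distance to the
    regression point (an assumed kernel choice).
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])      # add intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)
        XtW = Xd.T * w                         # same as Xd.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic data whose slope drifts from west to east (illustrative only)
rng = np.random.default_rng(1)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
slope = 1.0 + 0.5 * coords[:, 0]
y = 2.0 + slope * x + 0.1 * rng.normal(size=n)

local = gwr_coefficients(coords, x.reshape(-1, 1), y, bandwidth=1.5)
print("local slope range:", local[:, 1].min(), local[:, 1].max())
```

With a very large bandwidth all weights approach one, and every local fit collapses to the ordinary (global) linear regression, which is the baseline the abstract compares against.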
Abstract:
The purpose of this study was to better understand the study behaviors and habits of university undergraduate students. It was designed to determine whether undergraduate students could be grouped based on their self-reported study behaviors and, if so, whether group membership was related to students' academic achievement. A total of 152 undergraduate students voluntarily participated in the study by completing the Study Behavior Inventory instrument. All participants were enrolled in the fall semester of 2010 at Florida International University. The Q factor analysis technique, using principal components extraction and a varimax rotation, was used to examine the participants in relation to each other and to detect a pattern of intercorrelations among participants based on their self-reported study behaviors. The Q factor analysis yielded a two-factor structure representing two distinct student types among participants regarding their study behaviors. The first student type (i.e., Factor 1) describes proactive learners who organize both their study materials and study time well. Type 1 students are labeled "Proactive Learners with Well-Organized Study Behaviors". The second type (i.e., Factor 2) represents students who are poorly organized and very likely to procrastinate. Type 2 students are labeled "Disorganized Procrastinators". Hierarchical linear regression was employed to examine the relationship between student type and academic achievement as measured by current grade point averages (GPAs). The results showed significant differences in GPAs between Type 1 and Type 2 students at the .05 significance level. Furthermore, student type was found to be a significant predictor of academic achievement above and beyond students' attribute variables, including sex, age, major, and enrollment status.
The study has several implications for educational researchers, practitioners, and policy makers in terms of improving college students' learning behaviors and outcomes.
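The hierarchical regression step described in this abstract, entering attribute variables first and then testing whether student type adds explanatory power, can be sketched as a comparison of R² between two nested models. Everything in the example is a labeled assumption: the data are randomly generated stand-ins (not the study's data), the covariates are generic numeric placeholders, and the helper `r_squared` is hypothetical.

```python
import numpy as np

def r_squared(X, y):
    # R^2 of an ordinary least-squares fit with an intercept
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]
    resid = y - Xd @ beta
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (illustrative only, not from the study)
rng = np.random.default_rng(3)
n = 152
covariates = rng.normal(size=(n, 3))        # placeholders for attribute variables
student_type = rng.integers(0, 2, size=n)   # 1 = Type 1, 0 = Type 2
gpa = (3.0 + 0.1 * covariates[:, 0] + 0.3 * student_type
       + 0.4 * rng.normal(size=n))

# Step 1: attribute variables only; Step 2: add student type
r2_step1 = r_squared(covariates, gpa)
r2_step2 = r_squared(np.column_stack([covariates, student_type]), gpa)
delta_r2 = r2_step2 - r2_step1
print(f"R2 step 1 = {r2_step1:.3f}, step 2 = {r2_step2:.3f}, "
      f"change = {delta_r2:.3f}")
```

The change in R² between the two steps is the quantity usually tested (with an F test, not shown here) to decide whether the added predictor contributes beyond the variables entered earlier.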
Abstract:
Hydrophobicity as measured by Log P is an important molecular property related to toxicity and carcinogenicity. With increasing public health concerns about the effects of Disinfection By-Products (DBPs), there are considerable benefits in developing Quantitative Structure and Activity Relationship (QSAR) models capable of accurately predicting Log P. In this research, Log P values of 173 DBP compounds in 6 functional classes were used to develop QSAR models by Multiple Linear Regression (MLR) analysis, applying 3 molecular descriptors: Energy of the Lowest Unoccupied Molecular Orbital (ELUMO), Number of Chlorine (NCl), and Number of Carbon (NC). The QSAR models developed were validated based on the Organization for Economic Co-operation and Development (OECD) principles. The model Applicability Domain (AD) and mechanistic interpretation were explored. Considering the very complex nature of DBPs, the established QSAR models performed very well with respect to goodness-of-fit, robustness, and predictability. The Log P values of DBPs predicted by the QSAR models were found to be significant, with a correlation coefficient R² from 81% to 98%. The Leverage Approach by Williams Plot was applied to detect and remove outliers, consequently increasing R² by approximately 2% to 13% for different DBP classes. The developed QSAR models were statistically validated for their predictive power by the Leave-One-Out (LOO) and Leave-Many-Out (LMO) cross-validation methods. Finally, Monte Carlo simulation was used to assess the variations and inherent uncertainties in the QSAR models of Log P and to determine the most influential parameters in connection with Log P prediction.
The QSAR models developed in this dissertation will have a broad applicability domain because the research data set covered six out of eight common DBP classes, including halogenated alkane, halogenated alkene, halogenated aromatic, halogenated aldehyde, halogenated ketone, and halogenated carboxylic acid, which have been brought to the attention of regulatory agencies in recent years. Furthermore, the QSAR models are suitable for predicting similar DBP compounds within the same applicability domain. The selection and integration of the various methodologies developed in this research may also benefit future research in similar fields.
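Two of the validation steps named in this abstract, leverage-based outlier screening and Leave-One-Out cross-validation of an MLR model, can be sketched with NumPy as follows. The descriptor values are randomly generated placeholders (not ELUMO, NCl, or NC values from the dissertation), the helper names are hypothetical, and the warning leverage `3 * (p + 1) / n` is a convention commonly used with Williams plots that is assumed here rather than confirmed by the abstract.

```python
import numpy as np

def design(X):
    # Design matrix with an intercept column
    return np.column_stack([np.ones(X.shape[0]), X])

def leverages(Xd):
    # Diagonal of the hat matrix H = Xd (Xd'Xd)^{-1} Xd'
    return np.einsum("ij,ji->i", Xd, np.linalg.pinv(Xd))

def loo_q2(X, y):
    # Leave-one-out Q^2 via the exact OLS shortcut e_i / (1 - h_i)
    Xd = design(X)
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]
    resid = y - Xd @ beta
    press = np.sum((resid / (1 - leverages(Xd))) ** 2)
    return 1 - press / np.sum((y - y.mean()) ** 2)

# Synthetic placeholder data: 40 compounds, 3 descriptors (illustrative only)
rng = np.random.default_rng(2)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.8, 0.5, -0.3]) + 0.2 * rng.normal(size=n)

h = leverages(design(X))
h_star = 3 * (p + 1) / n           # assumed warning-leverage convention
flagged = np.where(h > h_star)[0]  # candidates for closer inspection
q2 = loo_q2(X, y)
print("high-leverage compounds:", flagged)
print(f"LOO Q2 = {q2:.3f}")
```

The shortcut avoids refitting the model n times; for ordinary least squares it gives the same prediction errors as an explicit leave-one-out loop.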
Abstract:
The drugs studied in this work have reportedly been used to commit drug-facilitated sexual assault (DFSA), commonly known as "date rape". Detection of the drugs was performed using high-performance liquid chromatography with ultraviolet detection (HPLC/UV), and identification was performed using high-performance liquid chromatography-mass spectrometry (HPLC/MS) with selected ion monitoring (SIM). The objective of this study was to develop a single HPLC method for the simultaneous detection, identification, and quantitation of these drugs. The following drugs were analyzed simultaneously: gamma-hydroxybutyrate (GHB), scopolamine, lysergic acid diethylamide, ketamine, flunitrazepam, and diphenhydramine. The results showed increased sensitivity with electrospray (ES) ionization versus atmospheric pressure chemical ionization (APCI) using HPLC/MS. HPLC/ES/MS was approximately six times more sensitive than HPLC/APCI/MS and about fifty times more sensitive than HPLC/UV. A limit of detection (LOD) of 100 ppb was achieved for drug analysis using this method. The average squared linear regression correlation coefficient (r²) was 0.933 for HPLC/UV and 0.998 for HPLC/ES/MS. The detection limits achieved by this method allowed for the detection of drug dosages used in beverage tampering. This method can be used to screen beverages suspected of drug tampering. The results of this study also demonstrated that solid phase microextraction (SPME), used as an extraction technique, did not improve sensitivity compared with direct injection of the drug standards.