11 resultados para Mean squared error
em Digital Commons at Florida International University
Resumo:
An iterative travel time forecasting scheme, named the Advanced Multilane Prediction based Real-time Fastest Path (AMPRFP) algorithm, is presented in this dissertation. This scheme is derived from the conventional kernel estimator based prediction model by the association of real-time nonlinear impacts that caused by neighboring arcs’ traffic patterns with the historical traffic behaviors. The AMPRFP algorithm is evaluated by prediction of the travel time of congested arcs in the urban area of Jacksonville City. Experiment results illustrate that the proposed scheme is able to significantly reduce both the relative mean error (RME) and the root-mean-squared error (RMSE) of the predicted travel time. To obtain high quality real-time traffic information, which is essential to the performance of the AMPRFP algorithm, a data clean scheme enhanced empirical learning (DCSEEL) algorithm is also introduced. This novel method investigates the correlation between distance and direction in the geometrical map, which is not considered in existing fingerprint localization methods. Specifically, empirical learning methods are applied to minimize the error that exists in the estimated distance. A direction filter is developed to clean joints that have negative influence to the localization accuracy. Synthetic experiments in urban, suburban and rural environments are designed to evaluate the performance of DCSEEL algorithm in determining the cellular probe’s position. The results show that the cellular probe’s localization accuracy can be notably improved by the DCSEEL algorithm. Additionally, a new fast correlation technique for overcoming the time efficiency problem of the existing correlation algorithm based floating car data (FCD) technique is developed. The matching process is transformed into a 1-dimensional (1-D) curve matching problem and the Fast Normalized Cross-Correlation (FNCC) algorithm is introduced to supersede the Pearson product Moment Correlation Co-efficient (PMCC) algorithm in order to achieve the real-time requirement of the FCD method. The fast correlation technique shows a significant improvement in reducing the computational cost without affecting the accuracy of the matching process.
Resumo:
Multiple linear regression model plays a key role in statistical inference and it has extensive applications in business, environmental, physical and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. There are some statistical methods that can be used, which are discussed in this thesis are ridge regression, Liu, two parameter biased and LASSO estimators. Firstly, an analytical comparison on the basis of risk was made among ridge, Liu and LASSO estimators under orthonormal regression model. I found that LASSO dominates least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimension. Secondly, a simulation study was conducted to compare performance of ridge, Liu and two parameter biased estimator by their mean squared error criterion. I found that two parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, Liu estimator performs better than both ridge and two parameter biased estimator.
Resumo:
As traffic congestion continues to worsen in large urban areas, solutions are urgently sought. However, transportation planning models, which estimate traffic volumes on transportation network links, are often unable to realistically consider travel time delays at intersections. Introducing signal controls in models often result in significant and unstable changes in network attributes, which, in turn, leads to instability of models. Ignoring the effect of delays at intersections makes the model output inaccurate and unable to predict travel time. To represent traffic conditions in a network more accurately, planning models should be capable of arriving at a network solution based on travel costs that are consistent with the intersection delays due to signal controls. This research attempts to achieve this goal by optimizing signal controls and estimating intersection delays accordingly, which are then used in traffic assignment. Simultaneous optimization of traffic routing and signal controls has not been accomplished in real-world applications of traffic assignment. To this end, a delay model dealing with five major types of intersections has been developed using artificial neural networks (ANNs). An ANN architecture consists of interconnecting artificial neurons. The architecture may either be used to gain an understanding of biological neural networks, or for solving artificial intelligence problems without necessarily creating a model of a real biological system. The ANN delay model has been trained using extensive simulations based on TRANSYT-7F signal optimizations. The delay estimates by the ANN delay model have percentage root-mean-squared errors (%RMSE) that are less than 25.6%, which is satisfactory for planning purposes. Larger prediction errors are typically associated with severely oversaturated conditions. A combined system has also been developed that includes the artificial neural network (ANN) delay estimating model and a user-equilibrium (UE) traffic assignment model. The combined system employs the Frank-Wolfe method to achieve a convergent solution. Because the ANN delay model provides no derivatives of the delay function, a Mesh Adaptive Direct Search (MADS) method is applied to assist in and expedite the iterative process of the Frank-Wolfe method. The performance of the combined system confirms that the convergence of the solution is achieved, although the global optimum may not be guaranteed.
Resumo:
Traffic incidents are a major source of traffic congestion on freeways. Freeway traffic diversion using pre-planned alternate routes has been used as a strategy to reduce traffic delays due to major traffic incidents. However, it is not always beneficial to divert traffic when an incident occurs. Route diversion may adversely impact traffic on the alternate routes and may not result in an overall benefit. This dissertation research attempts to apply Artificial Neural Network (ANN) and Support Vector Regression (SVR) techniques to predict the percent of delay reduction from route diversion to help determine whether traffic should be diverted under given conditions. The DYNASMART-P mesoscopic traffic simulation model was applied to generate simulated data that were used to develop the ANN and SVR models. A sample network that comes with the DYNASMART-P package was used as the base simulation network. A combination of different levels of incident duration, capacity lost, percent of drivers diverted, VMS (variable message sign) messaging duration, and network congestion was simulated to represent different incident scenarios. The resulting percent of delay reduction, average speed, and queue length from each scenario were extracted from the simulation output. The ANN and SVR models were then calibrated for percent of delay reduction as a function of all of the simulated input and output variables. The results show that both the calibrated ANN and SVR models, when applied to the same location used to generate the calibration data, were able to predict delay reduction with a relatively high accuracy in terms of mean square error (MSE) and regression correlation. It was also found that the performance of the ANN model was superior to that of the SVR model. Likewise, when the models were applied to a new location, only the ANN model could produce comparatively good delay reduction predictions under high network congestion level.
Resumo:
This dissertation introduces a new system for handwritten text recognition based on an improved neural network design. Most of the existing neural networks treat mean square error function as the standard error function. The system as proposed in this dissertation utilizes the mean quartic error function, where the third and fourth derivatives are non-zero. Consequently, many improvements on the training methods were achieved. The training results are carefully assessed before and after the update. To evaluate the performance of a training system, there are three essential factors to be considered, and they are from high to low importance priority: (1) error rate on testing set, (2) processing time needed to recognize a segmented character and (3) the total training time and subsequently the total testing time. It is observed that bounded training methods accelerate the training process, while semi-third order training methods, next-minimal training methods, and preprocessing operations reduce the error rate on the testing set. Empirical observations suggest that two combinations of training methods are needed for different case character recognition. Since character segmentation is required for word and sentence recognition, this dissertation provides also an effective rule-based segmentation method, which is different from the conventional adaptive segmentation methods. Dictionary-based correction is utilized to correct mistakes resulting from the recognition and segmentation phases. The integration of the segmentation methods with the handwritten character recognition algorithm yielded an accuracy of 92% for lower case characters and 97% for upper case characters. In the testing phase, the database consists of 20,000 handwritten characters, with 10,000 for each case. The testing phase on the recognition 10,000 handwritten characters required 8.5 seconds in processing time.
Resumo:
This research addresses the problem of cost estimation for product development in engineer-to-order (ETO) operations. An ETO operation starts the product development process with a product specification and ends with delivery of a rather complicated, highly customized product. ETO operations are practiced in various industries such as engineering tooling, factory plants, industrial boilers, pressure vessels, shipbuilding, bridges and buildings. ETO views each product as a delivery item in an industrial project and needs to make an accurate estimation of its development cost at the bidding and/or planning stage before any design or manufacturing activity starts. ^ Many ETO practitioners rely on an ad hoc approach to cost estimation, with use of past projects as reference, adapting them to the new requirements. This process is often carried out on a case-by-case basis and in a non-procedural fashion, thus limiting its applicability to other industry domains and transferability to other estimators. In addition to being time consuming, this approach usually does not lead to an accurate cost estimate, which varies from 30% to 50%. ^ This research proposes a generic cost modeling methodology for application in ETO operations across various industry domains. Using the proposed methodology, a cost estimator will be able to develop a cost estimation model for use in a chosen ETO industry in a more expeditious, systematic and accurate manner. ^ The development of the proposed methodology was carried out by following the meta-methodology as outlined by Thomann. Deploying the methodology, cost estimation models were created in two industry domains (building construction and the steel milling equipment manufacturing). The models are then applied to real cases; the cost estimates are significantly more accurate than the actual estimates, with mean absolute error rate of 17.3%. ^ This research fills an important need of quick and accurate cost estimation across various ETO industries. It differs from existing approaches to the problem in that a methodology is developed for use to quickly customize a cost estimation model for a chosen application domain. In addition to more accurate estimation, the major contributions are in its transferability to other users and applicability to different ETO operations. ^
Resumo:
Interferometric synthetic aperture radar (InSAR) techniques can successfully detect phase variations related to the water level changes in wetlands and produce spatially detailed high-resolution maps of water level changes. Despite the vast details, the usefulness of the wetland InSAR observations is rather limited, because hydrologists and water resources managers need information on absolute water level values and not on relative water level changes. We present an InSAR technique called Small Temporal Baseline Subset (STBAS) for monitoring absolute water level time series using radar interferograms acquired successively over wetlands. The method uses stage (water level) observation for calibrating the relative InSAR observations and tying them to the stage's vertical datum. We tested the STBAS technique with two-year long Radarsat-1 data acquired during 2006–2008 over the Water Conservation Area 1 (WCA1) in the Everglades wetlands, south Florida (USA). The InSAR-derived water level data were calibrated using 13 stage stations located in the study area to generate 28 successive high spatial resolution maps (50 m pixel resolution) of absolute water levels. We evaluate the quality of the STBAS technique using a root mean square error (RMSE) criterion of the difference between InSAR observations and stage measurements. The average RMSE is 6.6 cm, which provides an uncertainty estimation of the STBAS technique to monitor absolute water levels. About half of the uncertainties are attributed to the accuracy of the InSAR technique to detect relative water levels. The other half reflects uncertainties derived from tying the relative levels to the stage stations' datum.
Resumo:
Florida Bay is a highly dynamic estuary that exhibits wide natural fluctuations in salinity due to changes in the balance of precipitation, evaporation and freshwater runoff from the mainland. Rapid and large-scale modification of freshwater flow and construction of transportation conduits throughout the Florida Keys during the late nineteenth and twentieth centuries reshaped water circulation and salinity patterns across the ecosystem. In order to determine long-term patterns in salinity variation across the Florida Bay estuary, we used a diatom-based salinity transfer function to infer salinity within 3.27 ppt root mean square error of prediction from diatom assemblages from four ~130 year old sediment records. Sites were distributed along a gradient of exposure to anthropogenic shifts in the watershed and salinity. Precipitation was found to be the primary driver influencing salinity fluctuations over the entire record, but watershed modifications on the mainland and in the Florida Keys during the late-1800s and 1900s were the most likely cause of significant shifts in baseline salinity. The timing of these shifts in the salinity baseline varies across the Bay: that of the northeastern coring location coincides with the construction of the Florida Overseas Railway (AD 1906–1916), while that of the east-central coring location coincides with the drainage of Lake Okeechobee (AD 1881–1894). Subsequent decreases occurring after the 1960s (east-central region) and early 1980s (southwestern region) correspond to increases in freshwater delivered through water control structures in the 1950s–1970s and again in the 1980s. Concomitant increases in salinity in the northeastern and south-central regions of the Bay in the mid-1960s correspond to an extensive drought period and the occurrence of three major hurricanes, while the drop in the early 1970s could not be related to any natural event. This paper provides information about major factors influencing salinity conditions in Florida Bay in the past and quantitative estimates of the pre- and post-South Florida watershed modification salinity levels in different regions of the Bay. This information should be useful for environmental managers in setting restoration goals for the marine ecosystems in South Florida, especially for Florida Bay.
Resumo:
Colleges base their admission decisions on a number of factors to determine which applicants have the potential to succeed. This study utilized data for students that graduated from Florida International University between 2006 and 2012. Two models were developed (one using SAT as the principal explanatory variable and the other using ACT as the principal explanatory variable) to predict college success, measured using the student’s college grade point average at graduation. Some of the other factors that were used to make these predictions were high school performance, socioeconomic status, major, gender, and ethnicity. The model using ACT had a higher R^2 but the model using SAT had a lower mean square error. African Americans had a significantly lower college grade point average than graduates of other ethnicities. Females had a significantly higher college grade point average than males.
Resumo:
As traffic congestion continues to worsen in large urban areas, solutions are urgently sought. However, transportation planning models, which estimate traffic volumes on transportation network links, are often unable to realistically consider travel time delays at intersections. Introducing signal controls in models often result in significant and unstable changes in network attributes, which, in turn, leads to instability of models. Ignoring the effect of delays at intersections makes the model output inaccurate and unable to predict travel time. To represent traffic conditions in a network more accurately, planning models should be capable of arriving at a network solution based on travel costs that are consistent with the intersection delays due to signal controls. This research attempts to achieve this goal by optimizing signal controls and estimating intersection delays accordingly, which are then used in traffic assignment. Simultaneous optimization of traffic routing and signal controls has not been accomplished in real-world applications of traffic assignment. To this end, a delay model dealing with five major types of intersections has been developed using artificial neural networks (ANNs). An ANN architecture consists of interconnecting artificial neurons. The architecture may either be used to gain an understanding of biological neural networks, or for solving artificial intelligence problems without necessarily creating a model of a real biological system. The ANN delay model has been trained using extensive simulations based on TRANSYT-7F signal optimizations. The delay estimates by the ANN delay model have percentage root-mean-squared errors (%RMSE) that are less than 25.6%, which is satisfactory for planning purposes. Larger prediction errors are typically associated with severely oversaturated conditions. A combined system has also been developed that includes the artificial neural network (ANN) delay estimating model and a user-equilibrium (UE) traffic assignment model. The combined system employs the Frank-Wolfe method to achieve a convergent solution. Because the ANN delay model provides no derivatives of the delay function, a Mesh Adaptive Direct Search (MADS) method is applied to assist in and expedite the iterative process of the Frank-Wolfe method. The performance of the combined system confirms that the convergence of the solution is achieved, although the global optimum may not be guaranteed.
Resumo:
Traffic incidents are a major source of traffic congestion on freeways. Freeway traffic diversion using pre-planned alternate routes has been used as a strategy to reduce traffic delays due to major traffic incidents. However, it is not always beneficial to divert traffic when an incident occurs. Route diversion may adversely impact traffic on the alternate routes and may not result in an overall benefit. This dissertation research attempts to apply Artificial Neural Network (ANN) and Support Vector Regression (SVR) techniques to predict the percent of delay reduction from route diversion to help determine whether traffic should be diverted under given conditions. The DYNASMART-P mesoscopic traffic simulation model was applied to generate simulated data that were used to develop the ANN and SVR models. A sample network that comes with the DYNASMART-P package was used as the base simulation network. A combination of different levels of incident duration, capacity lost, percent of drivers diverted, VMS (variable message sign) messaging duration, and network congestion was simulated to represent different incident scenarios. The resulting percent of delay reduction, average speed, and queue length from each scenario were extracted from the simulation output. The ANN and SVR models were then calibrated for percent of delay reduction as a function of all of the simulated input and output variables. The results show that both the calibrated ANN and SVR models, when applied to the same location used to generate the calibration data, were able to predict delay reduction with a relatively high accuracy in terms of mean square error (MSE) and regression correlation. It was also found that the performance of the ANN model was superior to that of the SVR model. Likewise, when the models were applied to a new location, only the ANN model could produce comparatively good delay reduction predictions under high network congestion level.