9 results for Local classification method

in Digital Commons at Florida International University


Relevance: 80.00%

Abstract:

Modern IT infrastructures are built from large-scale computing systems and administered by IT service providers. Manually maintaining such large computing systems is costly and inefficient. Service providers therefore often seek automatic or semi-automatic methods for detecting and resolving system issues to improve service quality and efficiency. This dissertation investigates several data-driven approaches for helping service providers achieve this goal. The problems studied fall into three aspects of the service workflow: 1) preprocessing raw textual system logs into structured events; 2) refining monitoring configurations to eliminate false positives and false negatives; 3) improving the efficiency of system diagnosis on detected alerts. Solving these problems usually requires a large amount of domain knowledge about the particular computing systems. The approaches investigated in this dissertation are based on event mining algorithms, which can automatically derive part of that knowledge from historical system logs, events, and tickets.

In particular, two textual clustering algorithms are developed for converting raw textual logs into system events. For refining the monitoring configuration, a rule-based alert prediction algorithm is proposed for eliminating false alerts (false positives) without losing any real alerts, and a textual classification method is applied to identify missing alerts (false negatives) from manual incident tickets. For system diagnosis, this dissertation presents an efficient algorithm for discovering the temporal dependencies between system events, with their corresponding time lags, which can help administrators determine the redundancies of deployed monitoring situations and the dependencies of system components. To improve the efficiency of incident ticket resolution, several KNN-based algorithms that recommend relevant historical tickets with resolutions for incoming tickets are investigated. Finally, this dissertation offers a novel algorithm for searching for similar textual event segments in large system logs, which helps administrators locate similar system behaviors in the logs. Extensive empirical evaluation on system logs, events, and tickets from real IT infrastructures demonstrates the effectiveness and efficiency of the proposed approaches.
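
The KNN-based ticket recommendation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation and cosine similarity, which the abstract does not specify; the function names (`tokenize`, `cosine`, `recommend`) and the sample tickets in `HISTORY` are hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter

def tokenize(text):
    """Lower-case whitespace tokenizer (an assumption, kept deliberately simple)."""
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(incoming, history, k=2):
    """Return up to k historical (summary, resolution) pairs most similar to `incoming`."""
    query = Counter(tokenize(incoming))
    scored = [(cosine(query, Counter(tokenize(s))), s, r) for s, r in history]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Tickets with zero similarity are dropped rather than recommended.
    return [(s, r) for score, s, r in scored[:k] if score > 0]

# Hypothetical historical tickets: (summary, resolution).
HISTORY = [
    ("disk usage above threshold on db server", "purge old archive logs"),
    ("cpu utilization high on web server", "restart runaway worker process"),
    ("database connection timeout", "increase connection pool size"),
]

for summary, resolution in recommend("disk usage high on db server", HISTORY, k=2):
    print(summary, "->", resolution)
```

A production system would likely weight terms (for example with TF-IDF) and index the history for fast lookup; this sketch only shows the nearest-neighbor idea.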

Relevance: 30.00%

Abstract:

The financial community is well aware that continued underfunding of state and local government pension plans poses many public policy and fiduciary management concerns. However, a well-defined theoretical rationale has not been developed to explain why and how public sector pension plans underfund. This study uses three methods: a survey of national pension experts, an incomplete covariance panel method, and field interviews.

A survey of national public sector pension experts was conducted to provide a conceptual framework by which underfunding could be evaluated. Experts suggest that plan design, fiscal stress, and political culture factors affect underfunding. However, experts do not agree with previous research findings that unions actively pursue underfunding to secure current wage increases.

Within the conceptual framework and determinants identified by experts, several empirical regularities are documented for the first time. An analysis of 173 local government pension plans, observed from 1987 to 1992, was conducted. Findings indicate that underfunding occurs in plans that have lower retirement ages or increased costs due to benefit enhancements, when the sponsor faces current-year operating deficits, or when a local government relies heavily on inelastic revenue sources. Results also suggest that elected officials artificially inflate interest rate assumptions to reduce current pension costs, consequently shifting these costs to future generations. In agreement with some experts, there are no data to support the assumption that highly unionized employees secure more funding than less unionized employees.

Empirical results provide satisfactory but not overwhelming statistical power, and only minor predictive capacity. To further explore why underfunding occurs, field interviews were carried out with 62 local government officials. Practitioners indicated that perceived fiscal stress, the willingness of policymakers to advance funding, bargaining strategies used by union officials, apathy among employees and retirees, pension board composition, and the level of influence of internal pension experts have an impact on funding outcomes.

A pension funding process model was posited by triangulating the expert survey, empirical findings, and field survey results. The funding process model should help shape and refine our theoretical knowledge of state and local government pension underfunding in the future.

Relevance: 30.00%

Abstract:

Annual average daily traffic (AADT) is important information for many transportation planning, design, operation, and maintenance activities, as well as for the allocation of highway funds. Many studies have attempted AADT estimation using the factor approach, regression analysis, time series, and artificial neural networks. However, these methods are unable to account for the spatially variable influence of independent variables on the dependent variable, even though it is well known that spatial context is important for many transportation problems, including AADT estimation.

In this study, applications of geographically weighted regression (GWR) methods to estimating AADT were investigated. The GWR-based methods considered the influence of correlations among the variables over space and the spatial non-stationarity of the variables. A GWR model allows different relationships between the dependent and independent variables to exist at different points in space. In other words, model parameters vary from location to location, and the locally linear regression parameters at a point are affected more by observations near that point than by observations farther away.

The study area was Broward County, Florida, which lies on the Atlantic coast between Palm Beach and Miami-Dade counties. In this study, a total of 67 variables were considered as potential AADT predictors, and six variables (lanes, speed, regional accessibility, direct access, density of roadway length, and density of seasonal households) were selected to develop the models.

To investigate the predictive power of the various AADT predictors over space, statistics including local r-square, local parameter estimates, and local errors were examined and mapped. The local variations in relationships among parameters were investigated, measured, and mapped to assess the usefulness of GWR methods.

The results indicated that the GWR models were able to better explain the variation in the data and to predict AADT with smaller errors than the ordinary linear regression models for the same dataset. Additionally, GWR was able to model the spatial non-stationarity in the data, i.e., the spatially varying relationship between AADT and predictors, which cannot be modeled in ordinary linear regression.
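
The core idea of GWR described above, a separate weighted least-squares fit at each location with weights that decay with distance, can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the Gaussian kernel, the fixed `bandwidth`, the function name `gwr_fit_at`, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def gwr_fit_at(point, coords, X, y, bandwidth):
    """Local coefficients [intercept, slopes...] of a weighted least-squares fit at `point`."""
    d = np.linalg.norm(coords - point, axis=1)       # distance of each observation to the point
    w = np.exp(-0.5 * (d / bandwidth) ** 2)          # Gaussian distance-decay weights (assumed kernel)
    Xd = np.column_stack([np.ones(len(y)), X])       # design matrix with intercept column
    XtW = Xd.T * w                                   # X^T W without forming the diagonal matrix W
    return np.linalg.solve(XtW @ Xd, XtW @ y)        # solves (X^T W X) beta = X^T W y

# Synthetic data (hypothetical): the slope of y on x grows from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(200, 2))
x = rng.uniform(0.0, 1.0, size=200)
y = (1.0 + 0.5 * coords[:, 0]) * x

west = gwr_fit_at(np.array([1.0, 5.0]), coords, x, y, bandwidth=1.0)
east = gwr_fit_at(np.array([9.0, 5.0]), coords, x, y, bandwidth=1.0)
print("local slope, west:", west[1], "east:", east[1])
```

An ordinary linear regression would return a single slope for the whole area, whereas the local fits recover a smaller slope in the west than in the east. In practice the bandwidth is usually chosen by cross-validation or an information criterion rather than fixed by hand.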

Relevance: 30.00%

Abstract:

This dissertation develops a new figure of merit to measure the similarity (or dissimilarity) of Gaussian distributions through a novel concept that relates the Fisher distance to the percentage of data overlap. The derivations are expanded to provide a generalized mathematical platform for determining an optimal separating boundary of Gaussian distributions in multiple dimensions. Real-world data used for implementation and for carrying out feasibility studies were provided by Beckman-Coulter. Although the data used are flow cytometric in nature, the mathematics are general in their derivation and extend to other types of data as long as their statistical behavior approximates Gaussian distributions.

Because this new figure of merit is heavily based on the statistical nature of the data, a new filtering technique is introduced to account for the accumulation process involved with histogram data. When data are accumulated into a frequency histogram, they are inherently smoothed in a linear fashion, since an averaging effect takes place as the histogram is generated. This new filtering scheme addresses data that are accumulated in the uneven resolution of the channels of the frequency histogram.

The qualitative interpretation of flow cytometric data is currently a time-consuming and imprecise method for evaluating histogram data. The proposed method offers a broader spectrum of capabilities in the analysis of histograms, since the figure of merit derived in this dissertation integrates within its mathematics both a measure of similarity and the percentage of overlap between the distributions under analysis.
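
The link between a Fisher-type separation measure and the percentage of overlap can be illustrated in the simplest case. The sketch below is not the dissertation's figure of merit, whose exact definition the abstract does not give. It assumes two one-dimensional Gaussians with equal standard deviation and takes the Fisher criterion to be J = (mu1 - mu2)^2 / (sigma1^2 + sigma2^2); under those assumptions the shared area of the two densities equals 2 * Phi(-sqrt(J / 2)), where Phi is the standard normal CDF. All function names are hypothetical.

```python
import math

def fisher_ratio(mu1, s1, mu2, s2):
    """Fisher criterion for two 1-D classes (assumed definition)."""
    return (mu1 - mu2) ** 2 / (s1 ** 2 + s2 ** 2)

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def overlap_equal_sigma(mu1, mu2, sigma):
    """Shared area (between 0 and 1) of two Gaussian densities with the same sigma."""
    return 2.0 * norm_cdf(-abs(mu1 - mu2) / (2.0 * sigma))

# Hypothetical example: means two standard deviations apart.
J = fisher_ratio(0.0, 1.0, 2.0, 1.0)
print("Fisher ratio:", J)
print("overlap from means:", overlap_equal_sigma(0.0, 2.0, 1.0))
print("overlap from J:    ", 2.0 * norm_cdf(-math.sqrt(J / 2.0)))
```

With equal variances the optimal separating boundary is the midpoint between the means, which is why the overlap reduces to a single CDF evaluation. Unequal variances or multiple dimensions require the more general treatment the dissertation describes.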

Relevance: 30.00%

Abstract:

Freeway systems are becoming more congested each day. One contributor to freeway traffic congestion is platoons of on-ramp traffic merging into freeway mainlines. As a relatively low-cost countermeasure to the problem, ramp meters are being deployed in both directions of an 11-mile section of I-95 in Miami-Dade County, Florida. The local Fuzzy Logic (FL) ramp metering algorithm implemented in Seattle, Washington, has been selected for deployment. The FL ramp metering algorithm is powered by the Fuzzy Logic Controller (FLC). The FLC depends on a series of parameters that can significantly alter the behavior of the controller, thus affecting the performance of ramp meters. However, the most suitable values for these parameters are often difficult to determine, as they vary with current traffic conditions. Thus, for optimum performance, the parameter values must be fine-tuned. This research presents a new method of fine-tuning the FLC parameters using Particle Swarm Optimization (PSO). PSO attempts to optimize several important parameters of the FLC. The objective function of the optimization model incorporates the METANET macroscopic traffic flow model to minimize delay time, subject to the constraints of reasonable ranges of ramp metering rates and FLC parameters. To further improve performance, a short-term traffic forecasting module using a discrete Kalman filter was incorporated to predict the downstream freeway mainline occupancy, which helps to detect the presence of downstream bottlenecks. The CORSIM microscopic simulation model was selected as the platform to evaluate the performance of the proposed PSO tuning strategy. The ramp metering algorithm incorporating the tuning strategy was implemented using CORSIM's run-time extension (RTE) and was tested on the aforementioned I-95 corridor. The performance of the FLC with PSO tuning was compared with the performance of the existing FLC without PSO tuning. The results show that the FLC with PSO tuning outperforms the existing FL metering, fixed-time metering, and existing conditions without metering in terms of total travel time savings, average speed, and system-wide throughput.
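
The parameter tuning step can be illustrated with a minimal global-best PSO over box-constrained parameters. This is a generic sketch, not the tuning strategy from the dissertation: the inertia and acceleration constants are common textbook defaults, and `toy_objective` is a hypothetical stand-in for the METANET-based delay function, which is not reproduced here.

```python
import random

def pso(objective, bounds, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp to the allowed parameter range (the "reasonable ranges" constraint).
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_objective(p):
    """Hypothetical delay surrogate with its minimum at (0.3, 0.7)."""
    return (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2

best, best_val = pso(toy_objective, [(0.0, 1.0), (0.0, 1.0)])
print("best parameters:", best, "objective:", best_val)
```

In the setting the abstract describes, each objective evaluation would run a traffic flow model for a candidate set of FLC parameters, so the number of particles and iterations would be limited by simulation cost.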

Relevance: 30.00%

Abstract:

This dissertation develops a process improvement method for service operations based on the Theory of Constraints (TOC), a management philosophy that has been shown to be effective in manufacturing for decreasing work in process (WIP) and improving throughput. While TOC has enjoyed much attention and success in the manufacturing arena, its application to services in general has been limited. The contribution to industry and knowledge is a method for improving global performance measures based on TOC principles. The method proposed in this dissertation is tested using discrete event simulation based on the scenario of the service factory of airline turnaround operations. To evaluate the method, a simulation model of aircraft turn operations of a U.S.-based carrier was built and validated using actual data from airline operations. The model was then adjusted to reflect an application of the Theory of Constraints for determining how to deploy the scarce resource of ramp workers. The results indicate that, given slight modifications to TOC terminology and the development of a method for constraint identification, the Theory of Constraints can be applied with success to services. Bottlenecks in services must be defined as those processes for which the process rates and amount of work remaining are such that completing the process will not be possible without an increase in the process rate. The bottleneck ratio is used to determine to what degree a process is a constraint. Simulation results also suggest that redefining performance measures to reflect a global business perspective of reducing costs related to specific flights, rather than the operational local-optimum approach of turning all aircraft quickly, results in significant savings to the company. Savings to the annual operating costs of the airline were simulated to equal 30% of possible current expenses for misconnecting passengers, with a modest increase in utilization of the workers through a more efficient heuristic of deploying them to the highest-priority tasks. This dissertation contributes to the literature on service operations by describing a dynamic, adaptive dispatch approach to managing service factory operations similar to airline turnaround operations using the management philosophy of the Theory of Constraints.

Relevance: 30.00%

Abstract:

Flow cytometry analyzers have become trusted tools due to their ability to perform fast and accurate analyses of human blood. The aim of these analyses is to determine the possible existence of abnormalities in the blood that have been correlated with serious disease states, such as infectious mononucleosis, leukemia, and various cancers. Although these analyzers provide important feedback, improving the accuracy of the results is always desirable, as evidenced by the misclassifications reported by some users of these devices. It is therefore advantageous to provide a pattern interpretation framework with better classification ability than is currently available. Toward this end, the purpose of this dissertation was to establish a feature extraction and pattern classification framework capable of providing improved accuracy for detecting specific hematological abnormalities in flow cytometric blood data.

This involved extracting a unique and powerful set of shift-invariant statistical features from the multi-dimensional flow cytometry data and then using these features as inputs to a pattern classification engine composed of an artificial neural network (ANN). The contribution of this method consisted of developing a descriptor matrix that can be used to reliably assess whether a donor's blood pattern exhibits a clinically abnormal level of variant lymphocytes, which are blood cells that are potentially indicative of disorders such as leukemia and infectious mononucleosis.

This study showed that the set of shift-and-rotation-invariant statistical features extracted from the eigensystem of the flow cytometric data pattern performs better than other commonly used features in this type of disease detection, exhibiting an accuracy of 80.7%, a sensitivity of 72.3%, and a specificity of 89.2%. This performance represents a major improvement for this type of hematological classifier, which has historically been plagued by poor performance, with accuracies as low as 60% in some cases. This research ultimately shows that an improved feature space was developed that can deliver improved performance for the detection of variant lymphocytes in human blood, thus providing significant utility in the realm of suspect flagging algorithms for the detection of blood-related diseases.
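
The three reported measures follow from the entries of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts in the example are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # fraction of abnormal samples that were flagged
    specificity = tn / (tn + fp)                  # fraction of normal samples that were cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)    # fraction of all samples classified correctly
    return accuracy, sensitivity, specificity

# Hypothetical counts: 100 abnormal and 100 normal samples.
acc, sens, spec = classification_metrics(tp=80, fn=20, tn=90, fp=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

When the two classes are the same size, accuracy is the mean of sensitivity and specificity. The reported 80.7% is close to the midpoint of 72.3% and 89.2% (80.75%), which would be consistent with roughly balanced classes; that is an inference, not something the abstract states.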

Relevance: 30.00%

Abstract:

Annual Average Daily Traffic (AADT) is a critical input to many transportation analyses. By definition, AADT is the average 24-hour volume at a highway location over a full year. Traditionally, AADT is estimated using a mix of permanent and temporary traffic counts. Because field collection of traffic counts is expensive, it is usually done only for the major roads, leaving most local roads without any AADT information. However, AADTs are needed for local roads for many applications. For example, AADTs are used by state Departments of Transportation (DOTs) to calculate the crash rates of all local roads in order to identify the top five percent of hazardous locations for annual reporting to the U.S. DOT.

This dissertation develops a new method for estimating AADTs for local roads using travel demand modeling. A major component of the new method is a parcel-level trip generation model that estimates the trips generated by each parcel. The model uses tax parcel data together with the trip generation rates and equations provided by the ITE Trip Generation Report. The generated trips are then distributed to existing traffic count sites using a parcel-level trip distribution gravity model. The all-or-nothing assignment method is then used to assign the trips onto the roadway network to estimate the final AADTs. The entire process was implemented in the Cube demand modeling system with extensive spatial data processing using ArcGIS.

To evaluate the performance of the new method, data from several study areas in Broward County, Florida, were used. The estimated AADTs were compared with those from two existing methods, using actual traffic counts as the ground truth. The results show that the new method performs better than both existing methods. One limitation of the new method is that it relies on Cube, which limits the number of zones to 32,000. Accordingly, a study area exceeding this limit must be partitioned into smaller areas. Because AADT estimates for roads near the boundary areas were found to be less accurate, further research could examine the best way to partition a study area to minimize the impact.
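
The distribution and assignment steps named in this abstract can be sketched in a few lines. This is a generic illustration, not the Cube implementation: it assumes a singly constrained gravity model with a power-function impedance (exponent `beta`), Dijkstra shortest paths for the all-or-nothing loading, and a hypothetical three-node network.

```python
import heapq

def gravity_distribute(productions, attractions, cost, beta=2.0):
    """Singly constrained gravity model; returns trips[i][j] from origin i to destination j."""
    trips = []
    for i, p in enumerate(productions):
        weights = [a * cost[i][j] ** -beta for j, a in enumerate(attractions)]
        total = sum(weights)
        trips.append([p * wgt / total for wgt in weights])   # each row sums to productions[i]
    return trips

def shortest_path(graph, src, dst):
    """Dijkstra on graph = {node: {neighbor: travel_time}}; returns the node sequence."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                                         # stale heap entry
        for v, t in graph[u].items():
            nd = d + t
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def all_or_nothing(graph, od_trips):
    """Load each origin-destination flow entirely onto its shortest path; returns link volumes."""
    volumes = {}
    for (o, d), trips in od_trips.items():
        path = shortest_path(graph, o, d)
        for a, b in zip(path, path[1:]):
            volumes[(a, b)] = volumes.get((a, b), 0.0) + trips
    return volumes

# Hypothetical network: A-B-C (2 time units in total) beats the direct A-C link (3 units).
GRAPH = {"A": {"B": 1.0, "C": 3.0}, "B": {"C": 1.0}, "C": {}}
VOLUMES = all_or_nothing(GRAPH, {("A", "C"): 100.0, ("B", "C"): 50.0})
TRIPS = gravity_distribute([100.0], [1.0, 1.0], [[1.0, 2.0]])
print(VOLUMES)
print(TRIPS)
```

All-or-nothing assignment ignores congestion, so every trip between a pair of locations uses the same route; that simplification is usually acceptable on lightly loaded local roads, which is the setting the abstract addresses.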

Relevance: 30.00%

Abstract:

The fluctuation in water demand in the Redland community of Miami-Dade County was examined using land use data from 2001 and 2011 and water estimation techniques provided by local and state agencies. The data was converted to 30 m mosaicked raster grids that indicated land use change, and associated water demand measured in gallons per day per acre. The results indicate that, first, despite an increase in population, water demand decreased overall in Redland from 2001 to 2011. Second, conversion of agricultural lands to residential lands actually caused a decrease in water demand in most cases while acquisition of farmland by public agencies also caused a sharp decline. Third, conversion of row crops and groves to nurseries was substantial and resulted in a significant increase in water demand in all such areas converted. Finally, estimating water demand based on land use, rather than population, is a more accurate approach.
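
The land-use-based estimate described above amounts to multiplying each raster cell's area by a per-acre demand rate for its land use class and comparing the two years. The sketch below illustrates that arithmetic only: the land use codes, the rates in `RATES`, and the two small grids are hypothetical, while the cell area follows from the 30 m resolution stated in the abstract.

```python
import numpy as np

SQ_M_PER_ACRE = 4046.8564224
CELL_ACRES = (30.0 * 30.0) / SQ_M_PER_ACRE    # area of one 30 m x 30 m cell in acres

# Hypothetical demand rates in gallons per day per acre, indexed by land use code:
# 0 = row crops, 1 = residential, 2 = nursery.
RATES = np.array([2000.0, 1000.0, 4000.0])

def total_demand(land_use_grid):
    """Total water demand (gallons per day) for a grid of integer land use codes."""
    return float((RATES[land_use_grid] * CELL_ACRES).sum())

# Hypothetical 2 x 2 rasters for the two years.
grid_2001 = np.array([[0, 0], [0, 1]])
grid_2011 = np.array([[1, 2], [0, 1]])

change = total_demand(grid_2011) - total_demand(grid_2001)
print("change in demand (gallons/day):", round(change, 2))
```

With these made-up rates, converting one cell of row crops to residential lowers demand while converting another to nursery raises it, which mirrors the direction of the effects the abstract reports.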