3 resultados para spatial clustering algorithms

em Digital Commons at Florida International University


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Modern IT infrastructures are constructed by large scale computing systems and administered by IT service providers. Manually maintaining such large computing systems is costly and inefficient. Service providers often seek automatic or semi-automatic methodologies of detecting and resolving system issues to improve their service quality and efficiency. This dissertation investigates several data-driven approaches for assisting service providers in achieving this goal. The detailed problems studied by these approaches can be categorized into the three aspects in the service workflow: 1) preprocessing raw textual system logs to structural events; 2) refining monitoring configurations for eliminating false positives and false negatives; 3) improving the efficiency of system diagnosis on detected alerts. Solving these problems usually requires a huge amount of domain knowledge about the particular computing systems. The approaches investigated by this dissertation are developed based on event mining algorithms, which are able to automatically derive part of that knowledge from the historical system logs, events and tickets. ^ In particular, two textual clustering algorithms are developed for converting raw textual logs into system events. For refining the monitoring configuration, a rule based alert prediction algorithm is proposed for eliminating false alerts (false positives) without losing any real alert and a textual classification method is applied to identify the missing alerts (false negatives) from manual incident tickets. For system diagnosis, this dissertation presents an efficient algorithm for discovering the temporal dependencies between system events with corresponding time lags, which can help the administrators to determine the redundancies of deployed monitoring situations and dependencies of system components. To improve the efficiency of incident ticket resolving, several KNN-based algorithms that recommend relevant historical tickets with resolutions for incoming tickets are investigated. Finally, this dissertation offers a novel algorithm for searching similar textual event segments over large system logs that assists administrators to locate similar system behaviors in the logs. Extensive empirical evaluation on system logs, events and tickets from real IT infrastructures demonstrates the effectiveness and efficiency of the proposed approaches.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This research pursued the conceptualization and real-time verification of a system that allows a computer user to control the cursor of a computer interface without using his/her hands. The target user groups for this system are individuals who are unable to use their hands due to spinal dysfunction or other afflictions, and individuals who must use their hands for higher priority tasks while still requiring interaction with a computer. ^ The system receives two forms of input from the user: Electromyogram (EMG) signals from muscles in the face and point-of-gaze coordinates produced by an Eye Gaze Tracking (EGT) system. In order to produce reliable cursor control from the two forms of user input, the development of this EMG/EGT system addressed three key requirements: an algorithm was created to accurately translate EMG signals due to facial movements into cursor actions, a separate algorithm was created that recognized an eye gaze fixation and provided an estimate of the associated eye gaze position, and an information fusion protocol was devised to efficiently integrate the outputs of these algorithms. ^ Experiments were conducted to compare the performance of EMG/EGT cursor control to EGT-only control and mouse control. These experiments took the form of two different types of point-and-click trials. The data produced by these experiments were evaluated using statistical analysis, Fitts' Law analysis and target re-entry (TRE) analysis. ^ The experimental results revealed that though EMG/EGT control was slower than EGT-only and mouse control, it provided effective hands-free control of the cursor without a spatial accuracy limitation, and it also facilitated a reliable click operation. This combination of qualities is not possessed by either EGT-only or mouse control, making EMG/EGT cursor control a unique and practical alternative for a user's cursor control needs. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases provides GIS users the ability to perform more interesting spatial analyses, and for applications to support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries require combined indexing strategies of multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or viceversa), which can incur large performance overheads. On the other hand, more recently, the amount of geolocation data has grown rapidly in databases due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data to objects or events. The latter poses potential data ingestion challenges of large data volumes for practical GIS databases. In this dissertation, we first show how indexing spatial data with R-trees (a typical data pre-processing task) can be scaled in MapReduce—a widely-adopted parallel programming model for data intensive problems. The evaluation of our algorithms in a Hadoop cluster showed close to linear scalability in building R-tree indexes. Subsequently, we develop efficient algorithms for processing spatial queries with aspatial conditions. Novel techniques for simultaneously indexing spatial with textual and numeric data are developed to that end. Experimental evaluations with real-world, large spatial datasets measured query response times within the sub-second range for most cases, and up to a few seconds for a small number of cases, which is reasonable for interactive applications. Overall, the previous results show that the MapReduce parallel model is suitable for indexing tasks in spatial databases, and the adequate combination of spatial and aspatial attribute indexes can attain acceptable response times for interactive spatial queries with constraints on aspatial data.