3 results for text vector space model
at DigitalCommons@University of Nebraska - Lincoln
Abstract:
Wildlife biologists are often interested in how an animal uses space and the habitat resources within that space. We propose a single model that estimates an animal’s home range and habitat selection parameters within that range while accounting for the inherent autocorrelation in frequently sampled telemetry data. The model is applied to brown bear telemetry data in southeast Alaska.
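To make the autocorrelation issue concrete, here is a minimal Python sketch, not the authors' model: telemetry fixes are simulated from a discretized Ornstein-Uhlenbeck process attracted to a home-range centre, and the centre, lag-1 autocorrelation, and effective sample size are then estimated, showing how serial correlation shrinks the information carried by frequently sampled locations. The parameter values and estimators are illustrative assumptions.

import numpy as np

# Toy sketch (not the paper's model): telemetry positions from a discretized
# Ornstein-Uhlenbeck process attracted to a home-range centre, which induces
# the temporal autocorrelation typical of frequent GPS fixes.
rng = np.random.default_rng(0)
n_fixes = 2000
centre = np.array([3.0, -1.5])   # hypothetical home-range centre
phi = 0.9                        # autocorrelation between successive fixes
sigma = 0.5                      # movement noise, same units as positions

positions = np.empty((n_fixes, 2))
positions[0] = centre
for t in range(1, n_fixes):
    # AR(1) pull toward the centre plus Gaussian movement noise
    positions[t] = centre + phi * (positions[t - 1] - centre) + rng.normal(0.0, sigma, 2)

centre_hat = positions.mean(axis=0)
phi_hat = np.mean([
    np.corrcoef(positions[:-1, k], positions[1:, k])[0, 1] for k in range(2)
])
# Under AR(1) autocorrelation the effective sample size shrinks, widening the
# uncertainty on the home-range centre relative to an i.i.d. assumption.
n_eff = n_fixes * (1 - phi_hat) / (1 + phi_hat)
print(f"centre ~ {centre_hat.round(2)}, phi ~ {phi_hat:.2f}, effective n ~ {n_eff:.0f}")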
Abstract:
The emerging Cyber-Physical Systems (CPSs) are envisioned to integrate computation, communication, and control with the physical world. CPS therefore requires close interactions between the cyber and physical worlds in both time and space. These interactions are usually governed by events, which occur in the physical world and should autonomously be reflected in the cyber-world, and actions, which are taken by the CPS as a result of event detection and certain decision mechanisms. Both event detection and action decision operations should be performed accurately and in a timely manner to guarantee temporal and spatial correctness. This calls for a flexible architecture and task representation framework for analyzing CPS operations. In this paper, we explore the temporal and spatial properties of events, define a novel CPS architecture, and develop a layered spatiotemporal event model for CPS. An event is represented as a function of attribute-based, temporal, and spatial event conditions, and logical operators are used to combine the different types of conditions to capture composite events. To the best of our knowledge, this is the first event model that captures the heterogeneous characteristics of CPS for formal temporal and spatial analysis.
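As an illustration of the idea rather than the paper's formal model, the sketch below encodes attribute-based, temporal, and spatial event conditions as predicates over an observation and combines them with logical operators to form a composite event. The field names, thresholds, and circular region are assumptions.

from typing import Any, Callable, Dict

# Illustrative sketch: an event condition is a predicate over an observation,
# and composite events combine attribute-based, temporal, and spatial
# conditions with logical operators.
Observation = Dict[str, Any]            # e.g. {"temp": 71.0, "t": 12.4, "x": 3.0, "y": 9.0}
Condition = Callable[[Observation], bool]

def attribute_cond(name: str, threshold: float) -> Condition:
    """Attribute-based condition: a sensed value exceeds a threshold."""
    return lambda obs: obs[name] > threshold

def temporal_cond(t_start: float, t_end: float) -> Condition:
    """Temporal condition: the observation falls inside a time window."""
    return lambda obs: t_start <= obs["t"] <= t_end

def spatial_cond(cx: float, cy: float, radius: float) -> Condition:
    """Spatial condition: the observation lies inside a circular region."""
    return lambda obs: (obs["x"] - cx) ** 2 + (obs["y"] - cy) ** 2 <= radius ** 2

def AND(*conds: Condition) -> Condition:
    return lambda obs: all(c(obs) for c in conds)

def OR(*conds: Condition) -> Condition:
    return lambda obs: any(c(obs) for c in conds)

# Composite event: overheating detected inside a zone during a monitoring window.
overheat_in_zone = AND(
    attribute_cond("temp", 70.0),
    temporal_cond(0.0, 60.0),
    spatial_cond(cx=0.0, cy=0.0, radius=10.0),
)
print(overheat_in_zone({"temp": 71.0, "t": 12.4, "x": 3.0, "y": 9.0}))  # True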
Abstract:
Hundreds of terabytes of CMS (Compact Muon Solenoid) data accumulate in storage every day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when storage space is needed. CMS data is stored in HDFS (the Hadoop Distributed File System), whose logs record file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm for classification and regression that is used in this thesis to build a classifier. The time taken to classify data sets with this method depends on the size of the input HDFS log file, since the MapReduce jobs involved have O(n) complexity. The SVM methodology produces a list of data sets for deletion along with their respective sizes. The methodology was also compared with a heuristic called Retention Cost, calculated from the size of a data set and the time since its last access, which helps decide how useful a data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were later accessed again. The SVM methodology proved more accurate than the Retention Cost heuristic and could be applied to similar problems involving other large data sets.
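A hedged sketch of the general approach, not the thesis code: data sets are described by features plausibly derivable from HDFS access logs, a scikit-learn SVM is trained to flag deletion candidates, and its picks are compared with a Retention Cost heuristic computed from size and time since last access. The feature set, the synthetic labels, and the assumption that a larger Retention Cost marks a better deletion candidate are all illustrative.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative features only; the thesis derives its inputs from HDFS logs,
# but this exact feature engineering is an assumption.
rng = np.random.default_rng(1)
n = 500
size_tb = rng.uniform(0.1, 50.0, n)      # data-set size in TB
days_idle = rng.uniform(0, 365, n)       # days since last HDFS access
accesses = rng.poisson(5, n)             # recent access count

# Synthetic labels: long-idle, rarely accessed data sets marked deletable.
deletable = ((days_idle > 120) & (accesses < 3)).astype(int)
X = np.column_stack([size_tb, days_idle, accesses])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, deletable)

# Retention Cost heuristic for comparison: size times time since last access,
# with larger values assumed to indicate better deletion candidates.
retention_cost = size_tb * days_idle
svm_candidates = np.where(clf.predict(X) == 1)[0]
heuristic_candidates = np.argsort(retention_cost)[::-1][: len(svm_candidates)]
print(len(svm_candidates), "SVM candidates;",
      len(set(svm_candidates) & set(heuristic_candidates)), "shared with heuristic")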