24 resultados para data gathering algorithm
Resumo:
The Three-Layer distributed mediation architecture, designed by Secure System Architecture laboratory, employed a layered framework of presence, integration, and homogenization mediators. The architecture does not have any central component that may affect the system reliability. A distributed search technique was adapted in the system to increase its reliability. An Enhanced Chord-like algorithm (E-Chord) was designed and deployed in the integration layer. The E-Chord is a skip-list algorithm based on Distributed Hash Table (DHT) which is a distributed but structured architecture. DHT is distributed in the sense that no central unit is required to maintain indexes, and it is structured in the sense that indexes are distributed over the nodes in a systematic manner. Each node maintains three kind of routing information: a frequency list, a successor/predecessor list, and a finger table. None of the nodes in the system maintains all indexes, and each node knows about some other nodes in the system. These nodes, also called composer mediators, were connected in a P2P fashion. ^ A special composer mediator called a global mediator initiates the keyword-based matching decomposition of the request using the E-Chord. It generates an Integrated Data Structure Graph (IDSG) on the fly, creates association and dependency relations between nodes in the IDSG, and then generates a Global IDSG (GIDSG). The GIDSG graph is a plan which guides the global mediator how to integrate data. It is also used to stream data from the mediators in the homogenization layer which connected to the data sources. The connectors start sending the data to the global mediator just after the global mediator creates the GIDSG and just before the global mediator sends the answer to the presence mediator. Using the E-Chord and GIDSG made the mediation system more scalable than using a central global schema repository since all the composers in the integration layer are capable of handling and routing requests. Also, when a composer fails, it would only minimally affect the entire mediation system. ^
Resumo:
The deployment of wireless communications coupled with the popularity of portable devices has led to significant research in the area of mobile data caching. Prior research has focused on the development of solutions that allow applications to run in wireless environments using proxy based techniques. Most of these approaches are semantic based and do not provide adequate support for representing the context of a user (i.e., the interpreted human intention.). Although the context may be treated implicitly it is still crucial to data management. In order to address this challenge this dissertation focuses on two characteristics: how to predict (i) the future location of the user and (ii) locations of the fetched data where the queried data item has valid answers. Using this approach, more complete information about the dynamics of an application environment is maintained. ^ The contribution of this dissertation is a novel data caching mechanism for pervasive computing environments that can adapt dynamically to a mobile user's context. In this dissertation, we design and develop a conceptual model and context aware protocols for wireless data caching management. Our replacement policy uses the validity of the data fetched from the server and the neighboring locations to decide which of the cache entries is less likely to be needed in the future, and therefore a good candidate for eviction when cache space is needed. The context aware driven prefetching algorithm exploits the query context to effectively guide the prefetching process. The query context is defined using a mobile user's movement pattern and requested information context. Numerical results and simulations show that the proposed prefetching and replacement policies significantly outperform conventional ones. ^ Anticipated applications of these solutions include biomedical engineering, tele-health, medical information systems and business. ^
Resumo:
Recently, energy efficiency or green IT has become a hot issue for many IT infrastructures as they attempt to utilize energy-efficient strategies in their enterprise IT systems in order to minimize operational costs. Networking devices are shared resources connecting important IT infrastructures, especially in a data center network they are always operated 24/7 which consume a huge amount of energy, and it has been obviously shown that this energy consumption is largely independent of the traffic through the devices. As a result, power consumption in networking devices is becoming more and more a critical problem, which is of interest for both research community and general public. Multicast benefits group communications in saving link bandwidth and improving application throughput, both of which are important for green data center. In this paper, we study the deployment strategy of multicast switches in hybrid mode in energy-aware data center network: a case of famous fat-tree topology. The objective is to find the best location to deploy multicast switch not only to achieve optimal bandwidth utilization but also to minimize power consumption. We show that it is possible to easily achieve nearly 50% of energy consumption after applying our proposed algorithm.
Resumo:
Online Social Network (OSN) services provided by Internet companies bring people together to chat, share the information, and enjoy the information. Meanwhile, huge amounts of data are generated by those services (they can be regarded as the social media ) every day, every hour, even every minute, and every second. Currently, researchers are interested in analyzing the OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, due to the large-scale property of the OSN data, it is difficult to effectively analyze it. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components in the social media data — users and user-generated contents. Specifically, it aims at addressing three problems related to the social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in the social media to benefit other applications, e.g., Marketing Campaign? The contribution of this dissertation is briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents from the social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from the social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.
Resumo:
The electronics industry, is experiencing two trends one of which is the drive towards miniaturization of electronic products. The in-circuit testing predominantly used for continuity testing of printed circuit boards (PCB) can no longer meet the demands of smaller size circuits. This has lead to the development of moving probe testing equipment. Moving Probe Test opens up the opportunity to test PCBs where the test points are on a small pitch (distance between points). However, since the test uses probes that move sequentially to perform the test, the total test time is much greater than traditional in-circuit test. While significant effort has concentrated on the equipment design and development, little work has examined algorithms for efficient test sequencing. The test sequence has the greatest impact on total test time, which will determine the production cycle time of the product. Minimizing total test time is a NP-hard problem similar to the traveling salesman problem, except with two traveling salesmen that must coordinate their movements. The main goal of this thesis was to develop a heuristic algorithm to minimize the Flying Probe test time and evaluate the algorithm against a "Nearest Neighbor" algorithm. The algorithm was implemented with Visual Basic and MS Access database. The algorithm was evaluated with actual PCB test data taken from Industry. A statistical analysis with 95% C.C. was performed to test the hypothesis that the proposed algorithm finds a sequence which has a total test time less than the total test time found by the "Nearest Neighbor" approach. Findings demonstrated that the proposed heuristic algorithm reduces the total test time of the test and, therefore, production cycle time can be reduced through proper sequencing.
Resumo:
Wireless sensor networks are emerging as effective tools in the gathering and dissemination of data. They can be applied in many fields including health, environmental monitoring, home automation and the military. Like all other computing systems it is necessary to include security features, so that security sensitive data traversing the network is protected. However, traditional security techniques cannot be applied to wireless sensor networks. This is due to the constraints of battery power, memory, and the computational capacities of the miniature wireless sensor nodes. Therefore, to address this need, it becomes necessary to develop new lightweight security protocols. This dissertation focuses on designing a suite of lightweight trust-based security mechanisms and a cooperation enforcement protocol for wireless sensor networks. This dissertation presents a trust-based cluster head election mechanism used to elect new cluster heads. This solution prevents a major security breach against the routing protocol, namely, the election of malicious or compromised cluster heads. This dissertation also describes a location-aware, trust-based, compromise node detection, and isolation mechanism. Both of these mechanisms rely on the ability of a node to monitor its neighbors. Using neighbor monitoring techniques, the nodes are able to determine their neighbors’ reputation and trust level through probabilistic modeling. The mechanisms were designed to mitigate internal attacks within wireless sensor networks. The feasibility of the approach is demonstrated through extensive simulations. The dissertation also addresses non-cooperation problems in multi-user wireless sensor networks. A scalable lightweight enforcement algorithm using evolutionary game theory is also designed. The effectiveness of this cooperation enforcement algorithm is validated through mathematical analysis and simulation. This research has advanced the knowledge of wireless sensor network security and cooperation by developing new techniques based on mathematical models. By doing this, we have enabled others to build on our work towards the creation of highly trusted wireless sensor networks. This would facilitate its full utilization in many fields ranging from civilian to military applications.
Resumo:
Traffic incidents are non-recurring events that can cause a temporary reduction in roadway capacity. They have been recognized as a major contributor to traffic congestion on our national highway systems. To alleviate their impacts on capacity, automatic incident detection (AID) has been applied as an incident management strategy to reduce the total incident duration. AID relies on an algorithm to identify the occurrence of incidents by analyzing real-time traffic data collected from surveillance detectors. Significant research has been performed to develop AID algorithms for incident detection on freeways; however, similar research on major arterial streets remains largely at the initial stage of development and testing. This dissertation research aims to identify design strategies for the deployment of an Artificial Neural Network (ANN) based AID algorithm for major arterial streets. A section of the US-1 corridor in Miami-Dade County, Florida was coded in the CORSIM microscopic simulation model to generate data for both model calibration and validation. To better capture the relationship between the traffic data and the corresponding incident status, Discrete Wavelet Transform (DWT) and data normalization were applied to the simulated data. Multiple ANN models were then developed for different detector configurations, historical data usage, and the selection of traffic flow parameters. To assess the performance of different design alternatives, the model outputs were compared based on both detection rate (DR) and false alarm rate (FAR). The results show that the best models were able to achieve a high DR of between 90% and 95%, a mean time to detect (MTTD) of 55-85 seconds, and a FAR below 4%. The results also show that a detector configuration including only the mid-block and upstream detectors performs almost as well as one that also includes a downstream detector. In addition, DWT was found to be able to improve model performance, and the use of historical data from previous time cycles improved the detection rate. Speed was found to have the most significant impact on the detection rate, while volume was found to contribute the least. The results from this research provide useful insights on the design of AID for arterial street applications.
Resumo:
The focus of this thesis is placed on text data compression based on the fundamental coding scheme referred to as the American Standard Code for Information Interchange or ASCII. The research objective is the development of software algorithms that result in significant compression of text data. Past and current compression techniques have been thoroughly reviewed to ensure proper contrast between the compression results of the proposed technique with those of existing ones. The research problem is based on the need to achieve higher compression of text files in order to save valuable memory space and increase the transmission rate of these text files. It was deemed necessary that the compression algorithm to be developed would have to be effective even for small files and be able to contend with uncommon words as they are dynamically included in the dictionary once they are encountered. A critical design aspect of this compression technique is its compatibility to existing compression techniques. In other words, the developed algorithm can be used in conjunction with existing techniques to yield even higher compression ratios. This thesis demonstrates such capabilities and such outcomes, and the research objective of achieving higher compression ratio is attained.
Resumo:
Online Social Network (OSN) services provided by Internet companies bring people together to chat, share the information, and enjoy the information. Meanwhile, huge amounts of data are generated by those services (they can be regarded as the social media ) every day, every hour, even every minute, and every second. Currently, researchers are interested in analyzing the OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, due to the large-scale property of the OSN data, it is difficult to effectively analyze it. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components in the social media data — users and user-generated contents. Specifically, it aims at addressing three problems related to the social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in the social media to benefit other applications, e.g., Marketing Campaign? The contribution of this dissertation is briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents from the social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from the social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.