4 resultados para Data handling

em Digital Commons at Florida International University


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. ^ Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. ^ Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. ^ This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In recent years, the internet has grown exponentially, and become more complex. This increased complexity potentially introduces more network-level instability. But for any end-to-end internet connection, maintaining the connection's throughput and reliability at a certain level is very important. This is because it can directly affect the connection's normal operation. Therefore, a challenging research task is to improve a network's connection performance by optimizing its throughput and reliability. This dissertation proposed an efficient and reliable transport layer protocol (called concurrent TCP (cTCP)), an extension of the current TCP protocol, to optimize end-to-end connection throughput and enhance end-to-end connection fault tolerance. The proposed cTCP protocol could aggregate multiple paths' bandwidth by supporting concurrent data transfer (CDT) on a single connection. Here concurrent data transfer was defined as the concurrent transfer of data from local hosts to foreign hosts via two or more end-to-end paths. An RTT-Based CDT mechanism, which was based on a path's RTT (Round Trip Time) to optimize CDT performance, was developed for the proposed cTCP protocol. This mechanism primarily included an RTT-Based load distribution and path management scheme, which was used to optimize connections' throughput and reliability. A congestion control and retransmission policy based on RTT was also provided. According to experiment results, under different network conditions, our RTT-Based CDT mechanism could acquire good CDT performance. Finally a CWND-Based CDT mechanism, which was based on a path's CWND (Congestion Window), to optimize CDT performance was introduced. This mechanism primarily included: a CWND-Based load allocation scheme, which assigned corresponding data to paths based on their CWND to achieve aggregate bandwidth; a CWND-Based path management, which was used to optimize connections' fault tolerance; and a congestion control and retransmission management policy, which was similar to regular TCP in its separate path handling. According to corresponding experiment results, this mechanism could acquire near-optimal CDT performance under different network conditions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Electronic database handling of buisness information has gradually gained its popularity in the hospitality industry. This article provides an overview on the fundamental concepts of a hotel database and investigates the feasibility of incorporating computer-assisted data mining techniques into hospitality database applications. The author also exposes some potential myths associated with data mining in hospitaltiy database applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a two-fold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases.