861 results for multiple data sources


Relevance: 90.00%

Abstract:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed Data Extractor, a data retrieval solution that allows us to define queries to Web sites and process the resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework.

Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor uses a twofold “custom wrapper” approach to information retrieval: wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that draw on a specially designed library of data retrieval, parsing, and Web access routines. In addition to wrapper development, we thoroughly investigate issues associated with Web site selection, analysis, and processing.

Data Extractor is designed to act as a data retrieval server as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well.

This study confirms the feasibility of building custom wrappers for Web sites. The approach provides accurate data retrieval, together with power and flexibility in handling complex cases.
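The scripted side of the twofold wrapper idea can be sketched in miniature. The class and pattern names below are illustrative, not the actual Data Extractor API; the sketch only shows how a pattern-driven wrapper can turn a page into structured records:

```python
import re

class SiteWrapper:
    """Base interface: extract structured records from a site's page text."""
    def extract(self, page_text, query=None):
        raise NotImplementedError

class RegexWrapper(SiteWrapper):
    """Scripted wrapper: a regex with named groups defines the record shape."""
    def __init__(self, record_pattern):
        self.pattern = re.compile(record_pattern, re.DOTALL)

    def extract(self, page_text, query=None):
        # Each match of the pattern becomes one record (a dict of fields).
        return [m.groupdict() for m in self.pattern.finditer(page_text)]

page = "<li>Widget A: $3</li><li>Widget B: $5</li>"
wrapper = RegexWrapper(r"<li>(?P<name>[^:<]+): \$(?P<price>\d+)</li>")
records = wrapper.extract(page)
```

Complex sites that defeat such patterns would fall to the Java-based wrappers the abstract describes.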

Relevance: 90.00%

Abstract:

The research presented in this dissertation comprises several parts which jointly attain the goal of Semantic Distributed Database Management with Applications to Internet Dissemination of Environmental Data.

Part of the research into more effective and efficient data management has been pursued through enhancements to the Semantic Binary Object-Oriented database (Sem-ODB), such as more effective load-balancing techniques for the database engine and the use of Sem-ODB as a tool for integrating structured and unstructured heterogeneous data sources. Another part of the research in data management has pursued methods for optimizing queries in distributed databases through the intelligent use of network bandwidth; this has applications in networks that provide varying levels of Quality of Service or throughput.

The application of the Semantic Binary database model as a tool for relational database modeling has also been pursued. This has resulted in database applications that are used by researchers at the Everglades National Park to store environmental data and remotely sensed imagery.

The areas of research described above have contributed to the creation of TerraFly, which provides for the dissemination of geospatial data via the Internet. TerraFly research presented herein ranges from the development of TerraFly's back-end database and interfaces, through the features that are presented to the public (such as the ability to provide autopilot scripts and on-demand data about a point), to applications of TerraFly in the areas of hazard mitigation, recreation, and aviation.

Relevance: 90.00%

Abstract:

The purpose of this study was to determine fifth grade students' perceptions of the Fitnessgram physical fitness testing program. This study examined whether the Fitnessgram physical fitness testing experience promotes an understanding of the health-related fitness components, and it examined the relationship between individual fitness test scores and time spent participating in out-of-school physical activity. Lastly, students' thoughts and feelings concerning the Fitnessgram experience were examined.

The primary participant population for the study was 110 fifth grade students at Redland Elementary School, a Miami-Dade County Public School (M-DCPS). Data were collected over the course of 5 months. Multiple sources of data allowed for triangulation. Data sources included Fitnessgram test scores, questionnaires, document analysis, and in-depth interviews.

Interview data were analyzed qualitatively for common broad themes, which were identified and defined. Document analysis included analyzing student fitness test scores and student questionnaire data. This information was analyzed to determine whether the Fitnessgram test scores have an impact on student views about the school fitness-testing program. Data were statistically analyzed using analysis of frequency, crosstabulations (Bryman & Duncan, 1997), and Somers' d correlation (Bryman & Duncan, 1997). The analysis of data on student knowledge of the physical fitness components tested by each Fitnessgram test revealed that students do not understand the health-related fitness components.

Examining the relationship between individuals' fitness test scores and time spent in out-of-school physical activity revealed a significant positive relationship for 2 of the 6 Fitnessgram tests. Students' thoughts and feelings about each Fitnessgram test centered on 2 broad themes: (a) these children do not mind the physical fitness testing, and (b) how they felt about the experience was directly related to how they thought they had performed.

If the goal of physical fitness were only to get children fit, this test may be appropriate. However, the ultimate goal of physical fitness is to encourage students to live active and healthy lifestyles. Findings suggest the Fitnessgram as implemented by M-DCPS may not be the most suitable measurement instrument when assessing attitudinal changes that affect a healthy lifelong lifestyle.
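For readers unfamiliar with the Somers' d statistic used above, a minimal sketch of its computation (with the fitness score as the dependent ordinal variable) looks like this; the data are invented for illustration:

```python
def somers_d(x, y):
    """Somers' d with y as the dependent variable:
    (concordant - discordant) / (concordant + discordant + ties_on_y),
    where pairs tied on x are excluded entirely."""
    concordant = discordant = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = (x[i] > x[j]) - (x[i] < x[j])
            dy = (y[i] > y[j]) - (y[i] < y[j])
            if dx == 0:
                continue            # tied on the independent variable
            if dy == 0:
                ties_y += 1         # tied on y only
            elif dx == dy:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant + ties_y)

# Toy data: more out-of-school activity, higher fitness score (one tie on y).
activity = [1, 2, 3, 4]
fitness = [1, 2, 3, 3]
d = somers_d(activity, fitness)     # 5 concordant, 0 discordant, 1 tie on y
```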

Relevance: 90.00%

Abstract:

Mediation techniques provide interoperability and support integrated query processing among heterogeneous databases. While such techniques help data sharing among different sources, they increase the risk to data security, such as violating access control rules. Successful protection of information by an effective access control mechanism is a basic requirement for interoperation among heterogeneous data sources.

This dissertation first identified the challenges that a mediation system must meet in order to achieve both interoperability and security in an interconnected and collaborative computing environment: (1) context-awareness, (2) semantic heterogeneity, and (3) multiple security policy specification. Few existing approaches address all three security challenges in mediation systems. This dissertation provides a modeling and architectural solution to the problem of mediation security that addresses the aforementioned challenges. A context-aware flexible authorization framework was developed in the dissertation to deal with the security challenges faced by mediation systems.

The authorization framework consists of two major tasks: specifying security policies and enforcing security policies. First, the security policy specification provides a generic and extensible method to model the security policies with respect to the challenges posed by the mediation system. The security policies in this study are specified by 5-tuples followed by a series of authorization constraints, which are identified based on the relationships of the different security components in the mediation system. Two essential features of mediation systems, i.e., the relationships among authorization components and interoperability among heterogeneous data sources, are the focus of this investigation. Second, this dissertation supports effective access control on mediation systems while providing uniform access to heterogeneous data sources. The dynamic security constraints are handled in the authorization phase instead of the authentication phase; thus the maintenance cost of the security specification can be reduced compared with related solutions.
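The abstract specifies 5-tuple policies but not their fields, so the sketch below assumes a plausible (subject, object, action, context, decision) shape purely to illustrate first-match policy evaluation with a default-deny rule:

```python
def evaluate(policies, request):
    """First-match policy evaluation.

    policies: list of (subject, obj, action, context, decision) 5-tuples,
              where any of the first four fields may be "*" as a wildcard.
    request:  a (subject, obj, action, context) 4-tuple.
    """
    for subj, obj, action, context, decision in policies:
        pattern = (subj, obj, action, context)
        if all(p == "*" or p == r for p, r in zip(pattern, request)):
            return decision
    return "deny"   # closed-world default: no matching policy means deny

# Hypothetical policies for a context-aware health-data mediator.
policies = [
    ("doctor", "patient_record", "read",  "on_duty", "permit"),
    ("*",      "patient_record", "write", "*",       "deny"),
]
```

The context field is what makes the rule context-aware: the same subject and action can yield different decisions as conditions change.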

Relevance: 90.00%

Abstract:

The nation's freeway systems are becoming increasingly congested. A major contributor to traffic congestion on freeways is traffic incidents. Traffic incidents are non-recurring events, such as accidents or stranded vehicles, that cause a temporary roadway capacity reduction, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic to avoid incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Determining and understanding these factors can help the process of identifying and developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, yet with limited success.

This dissertation research attempts to improve on these previous efforts by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid decision making about traffic diversion in the event of an ongoing incident. Multiple data mining techniques were applied and evaluated in the research. Multiple linear regression analysis and a decision-tree-based method were applied to develop the offline models, and a rule-based method and a tree algorithm called M5P were used to develop the online models.

The results show that the models can, in general, achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
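In the spirit of the decision-tree-based offline models described above, a minimal sketch groups historical incidents by two factors and predicts each group's mean duration. The factor names and records are invented, and the actual models used far richer inputs:

```python
from collections import defaultdict

def fit_mean_table(records):
    """records: iterable of (incident_type, lanes_blocked, duration_min)."""
    sums = defaultdict(lambda: [0.0, 0])
    for itype, lanes, duration in records:
        cell = sums[(itype, lanes)]
        cell[0] += duration
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def predict(table, itype, lanes, default=30.0):
    """Predict the mean duration of the matching group, or a fallback."""
    return table.get((itype, lanes), default)

# Hypothetical incident history: (type, lanes blocked, duration in minutes).
history = [
    ("crash", 2, 55), ("crash", 2, 65), ("crash", 1, 30),
    ("stall", 1, 12), ("stall", 1, 18),
]
table = fit_mean_table(history)
```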

Relevance: 90.00%

Abstract:

Recent advances in airborne Light Detection and Ranging (LIDAR) technology allow rapid and inexpensive measurements of topography over large areas. Airborne LIDAR systems usually return a 3-dimensional cloud of point measurements from reflective objects scanned by the laser beneath the flight path. This technology is becoming a primary method for extracting information about different kinds of geometrical objects, such as high-resolution digital terrain models (DTMs), buildings, and trees. In the past decade, LIDAR has drawn increasing interest from researchers in the fields of remote sensing and GIS. Compared to traditional data sources, such as aerial photography and satellite images, LIDAR measurements are not influenced by sun shadow and relief displacement. However, the voluminous data pose a new challenge for automated extraction of geometrical information, because many raster image processing techniques cannot be directly applied to irregularly spaced LIDAR points.

In this dissertation, a framework is proposed to automatically extract information about different kinds of geometrical objects, such as terrain and buildings, from LIDAR data. Such information is essential to numerous applications such as flood modeling, landslide prediction, and hurricane animation. The framework consists of several intuitive algorithms. First, a progressive morphological filter was developed to detect non-ground LIDAR measurements. By gradually increasing the window size and elevation difference threshold of the filter, the measurements of vehicles, vegetation, and buildings are removed, while ground data are preserved. Then, building measurements are identified from the non-ground measurements using a region-growing algorithm based on a plane-fitting technique. Raw footprints for segmented building measurements are derived by connecting boundary points, and are further simplified and adjusted by several proposed operations to remove noise caused by irregularly spaced LIDAR measurements. To reconstruct 3D building models, the raw 2D topology of each building is first extracted and then further adjusted. Since the adjusting operations for simple building models do not work well on 2D topology, a 2D snake algorithm is proposed to adjust the topology. The 2D snake algorithm consists of newly defined energy functions for topology adjustment and a linear algorithm to find the minimal energy value of 2D snake problems. Data sets from urbanized areas including large institutional, commercial, and small residential buildings were employed to test the proposed framework. The results demonstrate that the framework achieves very good performance.
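A one-dimensional sketch conveys the core idea of the progressive morphological filter: open the elevation profile with growing windows and flag points that stand above the opened surface by more than the pass threshold. The window sizes, thresholds, and profile below are toy values, not the dissertation's parameters:

```python
def morphological_open(z, w):
    """Grayscale opening (erosion then dilation) of a 1-D elevation profile
    with window half-size w."""
    n = len(z)
    eroded = [min(z[max(0, i - w):i + w + 1]) for i in range(n)]
    return [max(eroded[max(0, i - w):i + w + 1]) for i in range(n)]

def progressive_filter(z, windows, thresholds):
    """Mark non-ground points: in each pass, points standing above the opened
    surface by more than that pass's threshold are flagged and flattened."""
    ground = list(z)
    nonground = [False] * len(z)
    for w, t in zip(windows, thresholds):
        opened = morphological_open(ground, w)
        for i in range(len(ground)):
            if ground[i] - opened[i] > t:
                nonground[i] = True
                ground[i] = opened[i]   # use the opened surface next pass
    return nonground

# Flat terrain at 0 m with a 5 m "building" three samples wide: the small
# window preserves it, the larger window removes it.
profile = [0, 0, 0, 5, 5, 5, 0, 0, 0]
flags = progressive_filter(profile, windows=[1, 2], thresholds=[0.5, 0.5])
```

Gradually raising the elevation threshold alongside the window size (held fixed here for simplicity) is what lets the real filter keep steep ground while removing buildings.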

Relevance: 90.00%

Abstract:

This study describes the case of private higher education in Ohio between 1980 and 2006 using Zumeta's (1996) model of state policy and private higher education. More specifically, this study used case study methodology and multiple sources to demonstrate the usefulness of Zumeta's model and to illustrate its limitations. Ohio served as the subject state, and data for 67 private, 4-year, degree-granting, Higher Learning Commission-accredited institutions were collected. Data sources for this study included the National Center for Education Statistics Integrated Postsecondary Education Data System as well as database information and documents from various state agencies in Ohio, including the Ohio Board of Regents.

The findings of this study indicated that the general state context for higher education in Ohio during the study period was shaped by deteriorating economic factors, stagnating population growth coupled with a rapidly aging society, fluctuating state income, and increasing expenditures in areas such as corrections, transportation, and social services. However, private higher education experienced consistent enrollment growth, an increase in the number of institutions, widening involvement in state-wide planning for higher education, and greater fiscal support from the state in a variety of forms, such as the Ohio Choice Grant. This study also demonstrated that private higher education in Ohio benefited because of its inclusion in state-wide planning and the state's decision to grant state aid directly to students.

Taken together, this study supported Zumeta's (1996) classification of Ohio as having a hybrid market-competitive/central-planning policy posture toward private higher education. Furthermore, this study demonstrated that Zumeta's model is a useful tool for both policy makers and researchers for understanding a state's relationship to its private higher education sector. However, this study also demonstrated that Zumeta's model is less useful when applied over an extended time period. Additionally, this study identifies a further limitation of Zumeta's model resulting from his failure to define "state mandate" and the "level of state mandates," which allows for inconsistent analysis of this component.

Relevance: 90.00%

Abstract:

With the advent of peer-to-peer networks and, more importantly, sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted, and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged.

This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To reduce the complexity of the validation process, the developed solution maps the requirements of the application onto a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system, ascertaining that segregating the data adaptation and prediction processes augments the data reduction rates.

The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast-convergent adaptation process is deployed, data reduction rates are significantly improved. Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation, and homeland security.
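A minimal sketch of spatially validating a stream reading against its neighbors, assuming invented sensor tuples and a simple median-agreement rule (the dissertation's actual criteria are more elaborate):

```python
import math
import statistics

def validate(reading, neighbors, radius, tol):
    """Accept a reading if sensors within `radius` report values whose
    median is within `tol` of it; pass it through when no neighbor is close
    enough to provide spatial evidence. Tuples are (x, y, value)."""
    x, y, value = reading
    nearby = [v for nx, ny, v in neighbors
              if math.hypot(nx - x, ny - y) <= radius]
    if not nearby:
        return True
    return abs(value - statistics.median(nearby)) <= tol

# Hypothetical temperature sensors: two nearby, one far away (ignored).
neighbors = [(1, 0, 20.5), (0, 1, 21.2), (9, 9, 50.0)]
ok = validate((0, 0, 21.0), neighbors, radius=2.0, tol=1.0)
bad = validate((0, 0, 35.0), neighbors, radius=2.0, tol=1.0)
```

Restricting the check to a geometric neighborhood mirrors the abstract's idea of mapping the application onto a geometrical space to pick only the sensor nodes of interest.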

Relevance: 90.00%

Abstract:

It has long been known that vocabulary is essential in the development of reading. Because vocabulary leading to increased comprehension is important, it is necessary to determine strategies for ensuring that the best methods of teaching vocabulary are used to help students make gains in vocabulary leading to reading comprehension. According to the National Reading Panel, multiple strategies that involve active engagement on the part of the student are more effective than the use of just one strategy. The purpose of this study was to determine whether students' use of visualization, student-generated pictures of onset-and-rime-patterned vocabulary, and story read-alouds with discussion would enable diverse first-grade students to increase their vocabulary and comprehension. In addition, this study examined the effect of the multimodal framework of strategies on English learners (ELs).

This quasi-experimental study (N = 69) was conducted in four first-grade classrooms in a low socio-economic school. Two treatment classes used a multimodal framework of strategies to learn weekly vocabulary words and comprehension. Two comparison classrooms used the traditional method of teaching weekly vocabulary and comprehension. Data sources included Florida Assessments for Instruction in Reading (FAIR) comprehension and vocabulary scores, and weekly Macmillan/McGraw-Hill Treasures basal comprehension questions and onset-and-rime vocabulary questions.

This research determined that the treatment had an effect on adjusted FAIR comprehension means by group, with the treatment group (adj M = 5.14) significantly higher than the comparison group (adj M = -8.26) on post scores. However, the treatment means did not increase from pre to post; rather, the comparison means significantly decreased from pre to post as the materials became more challenging. For the FAIR vocabulary, there was a significant difference by group, with the comparison group's adjusted post mean higher than the treatment's, although both groups significantly increased from pre to post. However, the FAIR vocabulary posttest was not part of the Treasures vocabulary, which was taught using the multimodal framework of strategies. The Treasures vocabulary scores were not significantly different by group on the assessment across the weeks, although the treatment means were higher than those of the comparison group. Continued research is needed in the area of vocabulary and comprehension instructional methods in order to determine strategies to increase diverse, urban students' performance.

Relevance: 90.00%

Abstract:

The coastal zone of the Florida Keys features the only living coral reef in the continental United States and as such represents a unique regional environmental resource. Anthropogenic pressures combined with climate disturbances such as hurricanes can affect the biogeochemistry of the region and threaten the health of this unique ecosystem. As such, water quality monitoring has historically been implemented in the Florida Keys, and six spatially distinct zones have been identified. In these studies, however, dissolved organic matter (DOM) has only been studied as a quantitative parameter, yet DOM composition can be a valuable biogeochemical parameter in assessing environmental change in coastal regions. Here we report the first data of its kind on the application of optical properties of DOM, in particular excitation emission matrix fluorescence with parallel factor analysis (EEM-PARAFAC), throughout these six Florida Keys regions in an attempt to assess spatial differences in DOM sources. Our data suggest that while DOM in the Florida Keys can be influenced by distant terrestrial environments such as the Everglades, spatial differences in DOM distribution were also controlled in part by local surface runoff/fringe mangroves, contributions from seagrass communities, as well as the reefs and waters from the Florida Current. Application of principal component analysis (PCA) to the relative abundance of EEM-PARAFAC components allowed for a clear distinction between the sources of DOM (allochthonous vs. autochthonous) and between different autochthonous sources and/or the diagenetic status of DOM, and further clarified the contribution of terrestrial DOM in zones where levels of DOM were low in abundance. The combination of EEM-PARAFAC and PCA proved to be ideally suited to discern DOM composition and source differences in coastal zones with complex hydrology and multiple DOM sources.
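The PCA-on-relative-abundance step can be sketched as follows. The loadings are invented, and a real analysis would use a full PARAFAC/PCA toolbox rather than this bare power iteration for the first principal component:

```python
def relative_abundance(loadings):
    """Normalize one sample's PARAFAC component loadings to fractions."""
    total = sum(loadings)
    return [v / total for v in loadings]

def first_pc(samples, iters=200):
    """First principal component of mean-centered samples, by power
    iteration on the sample covariance matrix (pure-Python sketch)."""
    n, m = len(samples), len(samples[0])
    means = [sum(col) / n for col in zip(*samples)]
    x = [[v - mu for v, mu in zip(row, means)] for row in samples]
    cov = [[sum(x[k][i] * x[k][j] for k in range(n)) / (n - 1)
            for j in range(m)] for i in range(m)]
    v = [1.0] * m
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

ra = relative_abundance([2.0, 1.0, 1.0])        # -> [0.5, 0.25, 0.25]
# Toy samples whose variance lies entirely along the first axis,
# so PC1 points along that axis.
pc1 = first_pc([[0, 0], [1, 0], [2, 0], [3, 0]])
```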

Relevance: 90.00%

Abstract:

The purpose of this project was to evaluate the use of remote sensing 1) to detect and map Everglades wetland plant communities at different scales; and 2) to compare map products delineated and resampled at various scales, with the intent to quantify and describe the quantitative and qualitative differences between such products. We evaluated data provided by DigitalGlobe's WorldView-2 (WV2) sensor, with a spatial resolution of 2 m, and data from Landsat's Thematic Mapper and Enhanced Thematic Mapper (TM and ETM+) sensors, with a spatial resolution of 30 m. We were also interested in the comparability and scalability of products derived from these data sources. The adequacy of each data set to map wetland plant communities was evaluated using two metrics: 1) model-based accuracy estimates of the classification procedures; and 2) design-based post-classification accuracy estimates of the derived maps.
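Design-based post-classification accuracy assessment typically reduces to a confusion matrix between reference and mapped classes. The class labels and samples below are invented for illustration, not the project's validation data:

```python
def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are mapped (predicted) classes."""
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        m[index[ref]][index[pred]] += 1
    return m

def overall_accuracy(m):
    """Share of validation samples on the matrix diagonal."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total

# Hypothetical validation points for three wetland cover classes.
reference = ["sawgrass", "sawgrass", "mangrove", "mangrove", "water"]
mapped    = ["sawgrass", "mangrove", "mangrove", "mangrove", "water"]
cm = confusion_matrix(reference, mapped, ["sawgrass", "mangrove", "water"])
```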



Relevance: 90.00%

Abstract:

Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. One of the main challenges in data integration is reconciling semantic differences among data sources. Approaches that have been used to solve this problem can be categorized as schema-based and attribute-based. Schema-based approaches use schema information to identify semantic similarity in data; furthermore, they focus on reconciling types before reconciling attributes. In contrast, attribute-based approaches use statistical and structural information about attributes to identify the semantic similarity of data in different sources. This research examines an approach to semantic reconciliation based on integrating properties expressed at different levels of abstraction or granularity, using the concept of property precedence. Property precedence reconciles the meaning of attributes by identifying similarities between attributes based on what these attributes represent in the real world. In order to use property precedence for semantic integration, we need to identify the precedence of attributes within and across data sources. The goal of this research is to develop and evaluate a method and algorithms that identify precedence relations among attributes and build a property precedence graph (PPG) that can be used to support integration.
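One way to picture a property precedence graph: nodes are attributes, and an edge a -> b records that a is the more general property. The value-set-containment test below is only a stand-in for the statistical and structural criteria the research actually develops; attribute names and values are invented:

```python
def build_ppg(attributes):
    """Build precedence edges a -> b meaning attribute a is more general
    than attribute b. attributes: dict of attribute name -> set of values."""
    edges = set()
    for a, va in attributes.items():
        for b, vb in attributes.items():
            if a != b and vb < va:   # strict subset: b is more specific
                edges.add((a, b))
    return edges

# Hypothetical attributes observed across data sources.
attrs = {
    "contact":        {"phone", "email", "fax"},
    "phone_or_email": {"phone", "email"},
    "phone":          {"phone"},
}
ppg = build_ppg(attrs)
```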

Relevance: 90.00%

Abstract:

The pore water chemistry of mud volcanoes from the Olimpi Mud Volcano Field and the Anaximander Mountains in the eastern Mediterranean Sea has been studied for three major purposes: (1) modes and velocities of fluid transport were derived to assess the role of (upward) advection and bioirrigation for benthic fluxes; (2) differences in the fluid chemistry at sites of Milano mud volcano (Olimpi area) were compiled in a map to illustrate the spatial heterogeneity reflecting differences in fluid origin and transport in discrete conduits in close proximity; and (3) formation water temperatures of seeping fluids were calculated from theoretical geothermometers to predict the depth of fluid origin and geochemical reactions in the deeper subsurface. No indications of downward advection, as would be required for convection cells, were found. Instead, measured pore water profiles were simulated successfully by accounting for upward advection and bioirrigation. Advective flow velocities are found to be generally moderate (3-50 cm/yr) compared to other cold seep areas. Depth-integrated rates of bioirrigation are 1-2 orders of magnitude higher than advective flow velocities, documenting the importance of bioirrigation for flux considerations in surface sediments. Calculated formation water temperatures for the Anaximander Mountains are in the range of 80 to 145 °C, suggesting a fluid origin from a depth zone associated with the seismic decollement. It is proposed that at that depth clay mineral dehydration leads to the formation and advection of fluids reduced in salinity relative to sea water. This explains the ubiquitous pore water freshening observed in surface sediments of the Anaximander Mountain area. Multiple fluid sources and formation water temperatures of 55 to 80 °C were derived for the expelled fluids of the Olimpi area.
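The transport modeling behind such profile fits rests on a steady-state advection-diffusion balance. A sketch of the textbook solution of D*C'' - v*C' = 0 with fixed top and bottom concentrations follows; the depths, concentrations, and transport parameters are toy values, not the study's:

```python
import math

def steady_profile(z, L, c_top, c_bottom, v, D):
    """Steady-state concentration at depth z (positive downward) for
    D*C'' - v*C' = 0 with C(0) = c_top and C(L) = c_bottom.
    v > 0 is downward advection; v < 0 models upward fluid seepage."""
    if v == 0:
        return c_top + (c_bottom - c_top) * z / L       # pure diffusion
    return c_top + (c_bottom - c_top) * (
        math.expm1(v * z / D) / math.expm1(v * L / D))

# Toy chloride-style profile: with strong upward advection the deep
# (fresher) signature is dragged almost to the sediment surface, while
# pure diffusion gives a linear profile.
mid_advective = steady_profile(0.5, 1.0, 19.0, 10.0, -0.5, 0.01)
mid_diffusive = steady_profile(0.5, 1.0, 19.0, 10.0, 0.0, 0.01)
```

Fitting measured profiles with such curves (plus a bioirrigation term) is what yields the moderate upward velocities reported above.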