11 resultados para Spatial data

em Digital Commons at Florida International University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The virtual quadrilateral is the coalescence of novel data structures that reduces the storage requirements of spatial data without jeopardizing the quality and operability of the inherent information. The data representative of the observed area is parsed to ascertain the necessary contiguous measures that, when contained, implicitly define a quadrilateral. The virtual quadrilateral then represents a geolocated area of the observed space where all of the measures are the same. The area, contoured as a rectangle, is pseudo-delimited by the opposite coordinates of the bounding area. Once defined, the virtual quadrilateral is representative of an area in the observed space and is represented in a database by the attributes of its bounding coordinates and measure of its contiguous space. Virtual quadrilaterals have been found to ensure a lossless reduction of the physical storage, maintain the implied features of the data, facilitate the rapid retrieval of vast amount of the represented spatial data and accommodate complex queries. The methods presented herein demonstrate that virtual quadrilaterals are created quite easily, are stable and versatile objects in a database and have proven to be beneficial to exigent spatial data applications such as geographic information systems. ^

Relevância:

70.00% 70.00%

Publicador:

Resumo:

As massive data sets become increasingly available, people are facing the problem of how to effectively process and understand these data. Traditional sequential computing models are giving way to parallel and distributed computing models, such as MapReduce, both due to the large size of the data sets and their high dimensionality. This dissertation, as in the same direction of other researches that are based on MapReduce, tries to develop effective techniques and applications using MapReduce that can help people solve large-scale problems. Three different problems are tackled in the dissertation. The first one deals with processing terabytes of raster data in a spatial data management system. Aerial imagery files are broken into tiles to enable data parallel computation. The second and third problems deal with dimension reduction techniques that can be used to handle data sets of high dimensionality. Three variants of the nonnegative matrix factorization technique are scaled up to factorize matrices of dimensions in the order of millions in MapReduce based on different matrix multiplication implementations. Two algorithms, which compute CANDECOMP/PARAFAC and Tucker tensor decompositions respectively, are parallelized in MapReduce based on carefully partitioning the data and arranging the computation to maximize data locality and parallelism.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This dissertation documents the everyday lives and spaces of a population of youth typically constructed as out of place, and the broader urban context in which they are rendered as such. Thirty-three female and transgender street youth participated in the development of this youth-based participatory action research (YPAR) project utilizing geo-ethnographic methods, auto-photography, and archival research throughout a six-phase, eighteen-month research process in Bogotá, Colombia. ^ This dissertation details the participatory writing process that enabled the YPAR research team to destabilize dominant representations of both street girls and urban space and the participatory mapping process that enabled the development of a youth vision of the city through cartographic images. The maps display individual and aggregate spatial data indicating trends within and making comparisons between three subgroups of the research population according to nine spatial variables. These spatial data, coupled with photographic and ethnographic data, substantiate that street girls’ mobilities and activity spaces intersect with and are altered by state-sponsored urban renewal projects and paramilitary-led social cleansing killings, both efforts to clean up Bogotá by purging the city center of deviant populations and places. ^ Advancing an ethical approach to conducting research with excluded populations, this dissertation argues for the enactment of critical field praxis and care ethics within a YPAR framework to incorporate young people as principal research actors rather than merely voices represented in adultist academic discourse. Interjection of considerations of space, gender, and participation into the study of street youth produce new ways of envisioning the city and the role of young people in research. Instead of seeing the city from a panoptic view, Bogotá is revealed through the eyes of street youth who participated in the construction and feminist visualization of a new cartography and counter-map of the city grounded in embodied, situated praxis. This dissertation presents a socially responsible approach to conducting action-research with high-risk youth by documenting how street girls reclaim their right to the city on paper and in practice; through maps of their everyday exclusion in Bogotá followed by activism to fight against it.^

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases provides GIS users the ability to perform more interesting spatial analyses, and for applications to support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries require combined indexing strategies of multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or viceversa), which can incur large performance overheads. On the other hand, more recently, the amount of geolocation data has grown rapidly in databases due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data to objects or events. The latter poses potential data ingestion challenges of large data volumes for practical GIS databases. In this dissertation, we first show how indexing spatial data with R-trees (a typical data pre-processing task) can be scaled in MapReduce—a widely-adopted parallel programming model for data intensive problems. The evaluation of our algorithms in a Hadoop cluster showed close to linear scalability in building R-tree indexes. Subsequently, we develop efficient algorithms for processing spatial queries with aspatial conditions. Novel techniques for simultaneously indexing spatial with textual and numeric data are developed to that end. Experimental evaluations with real-world, large spatial datasets measured query response times within the sub-second range for most cases, and up to a few seconds for a small number of cases, which is reasonable for interactive applications. Overall, the previous results show that the MapReduce parallel model is suitable for indexing tasks in spatial databases, and the adequate combination of spatial and aspatial attribute indexes can attain acceptable response times for interactive spatial queries with constraints on aspatial data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

With the exponential growth of the usage of web-based map services, the web GIS application has become more and more popular. Spatial data index, search, analysis, visualization and the resource management of such services are becoming increasingly important to deliver user-desired Quality of Service. First, spatial indexing is typically time-consuming and is not available to end-users. To address this, we introduce TerraFly sksOpen, an open-sourced an Online Indexing and Querying System for Big Geospatial Data. Integrated with the TerraFly Geospatial database [1-9], sksOpen is an efficient indexing and query engine for processing Top-k Spatial Boolean Queries. Further, we provide ergonomic visualization of query results on interactive maps to facilitate the user’s data analysis. Second, due to the highly complex and dynamic nature of GIS systems, it is quite challenging for the end users to quickly understand and analyze the spatial data, and to efficiently share their own data and analysis results with others. Built on the TerraFly Geo spatial database, TerraFly GeoCloud is an extra layer running upon the TerraFly map and can efficiently support many different visualization functions and spatial data analysis models. Furthermore, users can create unique URLs to visualize and share the analysis results. TerraFly GeoCloud also enables the MapQL technology to customize map visualization using SQL-like statements [10]. Third, map systems often serve dynamic web workloads and involve multiple CPU and I/O intensive tiers, which make it challenging to meet the response time targets of map requests while using the resources efficiently. Virtualization facilitates the deployment of web map services and improves their resource utilization through encapsulation and consolidation. Autonomic resource management allows resources to be automatically provisioned to a map service and its internal tiers on demand. v-TerraFly are techniques to predict the demand of map workloads online and optimize resource allocations, considering both response time and data freshness as the QoS target. The proposed v-TerraFly system is prototyped on TerraFly, a production web map service, and evaluated using real TerraFly workloads. The results show that v-TerraFly can accurately predict the workload demands: 18.91% more accurate; and efficiently allocate resources to meet the QoS target: improves the QoS by 26.19% and saves resource usages by 20.83% compared to traditional peak load-based resource allocation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Annual Average Daily Traffic (AADT) is a critical input to many transportation analyses. By definition, AADT is the average 24-hour volume at a highway location over a full year. Traditionally, AADT is estimated using a mix of permanent and temporary traffic counts. Because field collection of traffic counts is expensive, it is usually done for only the major roads, thus leaving most of the local roads without any AADT information. However, AADTs are needed for local roads for many applications. For example, AADTs are used by state Departments of Transportation (DOTs) to calculate the crash rates of all local roads in order to identify the top five percent of hazardous locations for annual reporting to the U.S. DOT. ^ This dissertation develops a new method for estimating AADTs for local roads using travel demand modeling. A major component of the new method involves a parcel-level trip generation model that estimates the trips generated by each parcel. The model uses the tax parcel data together with the trip generation rates and equations provided by the ITE Trip Generation Report. The generated trips are then distributed to existing traffic count sites using a parcel-level trip distribution gravity model. The all-or-nothing assignment method is then used to assign the trips onto the roadway network to estimate the final AADTs. The entire process was implemented in the Cube demand modeling system with extensive spatial data processing using ArcGIS. ^ To evaluate the performance of the new method, data from several study areas in Broward County in Florida were used. The estimated AADTs were compared with those from two existing methods using actual traffic counts as the ground truths. The results show that the new method performs better than both existing methods. One limitation with the new method is that it relies on Cube which limits the number of zones to 32,000. Accordingly, a study area exceeding this limit must be partitioned into smaller areas. Because AADT estimates for roads near the boundary areas were found to be less accurate, further research could examine the best way to partition a study area to minimize the impact.^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Space-for-time substitution is often used in predictive models because long-term time-series data are not available. Critics of this method suggest factors other than the target driver may affect ecosystem response and could vary spatially, producing misleading results. Monitoring data from the Florida Everglades were used to test whether spatial data can be substituted for temporal data in forecasting models. Spatial models that predicted bluefin killifish (Lucania goodei) population response to a drying event performed comparably and sometimes better than temporal models. Models worked best when results were not extrapolated beyond the range of variation encompassed by the original dataset. These results were compared to other studies to determine whether ecosystem features influence whether space-for-time substitution is feasible. Taken in the context of other studies, these results suggest space-for-time substitution may work best in ecosystems with low beta-diversity, high connectivity between sites, and small lag in organismal response to the driver variable.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Space-for-time substitution is often used in predictive models because long-term time-series data are not available. Critics of this method suggest factors other than the target driver may affect ecosystem response and could vary spatially, producing misleading results. Monitoring data from the Florida Everglades were used to test whether spatial data can be substituted for temporal data in forecasting models. Spatial models that predicted bluefin killifish (Lucania goodei) population response to a drying event performed comparably and sometimes better than temporal models. Models worked best when results were not extrapolated beyond the range of variation encompassed by the original dataset. These results were compared to other studies to determine whether ecosystem features influence whether space-for-time substitution is feasible. Taken in the context of other studies, these results suggest space-fortime substitution may work best in ecosystems with low beta-diversity, high connectivity between sites, and small lag in organismal response to the driver variable.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis research describes the design and implementation of a Semantic Geographic Information System (GIS) and the creation of its spatial database. The database schema is designed and created, and all textual and spatial data are loaded into the database with the help of the Semantic DBMS's Binary Database Interface currently being developed at the FIU's High Performance Database Research Center (HPDRC). A friendly graphical user interface is created together with the other main system's areas: displaying process, data animation, and data retrieval. All these components are tightly integrated to form a novel and practical semantic GIS that has facilitated the interpretation, manipulation, analysis, and display of spatial data like: Ocean Temperature, Ozone(TOMS), and simulated SeaWiFS data. At the same time, this system has played a major role in the testing process of the HPDRC's high performance and efficient parallel Semantic DBMS.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An Automatic Vehicle Location (AVL) system is a computer-based vehicle tracking system that is capable of determining a vehicle's location in real time. As a major technology of the Advanced Public Transportation System (APTS), AVL systems have been widely deployed by transit agencies for purposes such as real-time operation monitoring, computer-aided dispatching, and arrival time prediction. AVL systems make a large amount of transit performance data available that are valuable for transit performance management and planning purposes. However, the difficulties of extracting useful information from the huge spatial-temporal database have hindered off-line applications of the AVL data. ^ In this study, a data mining process, including data integration, cluster analysis, and multiple regression, is proposed. The AVL-generated data are first integrated into a Geographic Information System (GIS) platform. The model-based cluster method is employed to investigate the spatial and temporal patterns of transit travel speeds, which may be easily translated into travel time. The transit speed variations along the route segments are identified. Transit service periods such as morning peak, mid-day, afternoon peak, and evening periods are determined based on analyses of transit travel speed variations for different times of day. The seasonal patterns of transit performance are investigated by using the analysis of variance (ANOVA). Travel speed models based on the clustered time-of-day intervals are developed using important factors identified as having significant effects on speed for different time-of-day periods. ^ It has been found that transit performance varied from different seasons and different time-of-day periods. The geographic location of a transit route segment also plays a role in the variation of the transit performance. The results of this research indicate that advanced data mining techniques have good potential in providing automated techniques of assisting transit agencies in service planning, scheduling, and operations control. ^