901 resultados para Data dissemination and sharing
Resumo:
This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore data partitioning and replication management problems need to be addresses in development of an efficient and scalable solution. ^ Part of the research is to study the patterns of geographical raster data processing and to propose the algorithms to improve availability of such data. These algorithms and approaches are targeting granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service(QoS) considering distributed data delivery and processing. To achieve this goal a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to user's requirements such as resolution, temporal range, and target bands to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. ^ Another part of the research pursued methods for efficient acquiring of GIS data from external heterogeneous databases and Web services as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. ^ There are vast numbers of computing, network, and storage resources idling or not fully utilized available on the Internet. Proposed "Crawling Distributed Operating System "(CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in GIS database context. ^ The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database. The approach developed in this dissertation has resulted in creation of TerraFly GIS database that is used by US government, researchers, and general public to facilitate Web access to remotely-sensed imagery and GIS vector information. ^
Resumo:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of the multimedia data, the Multimedia Database management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of a MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to address the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.
Resumo:
With the explosive growth of the volume and complexity of document data (e.g., news, blogs, web pages), it has become a necessity to semantically understand documents and deliver meaningful information to users. Areas dealing with these problems are crossing data mining, information retrieval, and machine learning. For example, document clustering and summarization are two fundamental techniques for understanding document data and have attracted much attention in recent years. Given a collection of documents, document clustering aims to partition them into different groups to provide efficient document browsing and navigation mechanisms. One unrevealed area in document clustering is that how to generate meaningful interpretation for the each document cluster resulted from the clustering process. Document summarization is another effective technique for document understanding, which generates a summary by selecting sentences that deliver the major or topic-relevant information in the original documents. How to improve the automatic summarization performance and apply it to newly emerging problems are two valuable research directions. To assist people to capture the semantics of documents effectively and efficiently, the dissertation focuses on developing effective data mining and machine learning algorithms and systems for (1) integrating document clustering and summarization to obtain meaningful document clusters with summarized interpretation, (2) improving document summarization performance and building document understanding systems to solve real-world applications, and (3) summarizing the differences and evolution of multiple document sources.
Resumo:
Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated everyday in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be effective, it is important to include the visualization techniques in the mining process and to generate the discovered patterns for a more comprehensive visual view. In this dissertation, four related problems: dimensionality reduction for visualizing high dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration are studied to explore the integration of data mining and data visualization. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.
Resumo:
long-term research on freshwater ecosystems provides insights that can be difficult to obtain from other approaches. Widespread monitoring of ecologically relevant water-quality parameters spanning decades can facilitate important tests of ecological principles. Unique long-term data sets and analytical tools are increasingly available, allowing for powerful and synthetic analyses across sites. long-term measurements or experiments in aquatic systems can catch rare events, changes in highly variable systems, time-lagged responses, cumulative effects of stressors, and biotic responses that encompass multiple generations. Data are available from formal networks, local to international agencies, private organizations, various institutions, and paleontological and historic records; brief literature surveys suggest much existing data are not synthesized. Ecological sciences will benefit from careful maintenance and analyses of existing long-term programs, and subsequent insights can aid in the design of effective future long-term experimental and observational efforts. long-term research on freshwaters is particularly important because of their value to humanity.
Resumo:
This paper details the research methods an introductory qualitative research class used to both study an issue related to race and identity, and to familiarize themselves with data collection strategies. Throughout the paper the authors attempt to capture the challenges, disagreements, and consensus building that marked this unusual research endeavor.
Resumo:
The exponential growth of studies on the biological response to ocean acidification over the last few decades has generated a large amount of data. To facilitate data comparison, a data compilation hosted at the data publisher PANGAEA was initiated in 2008 and is updated on a regular basis (doi:10.1594/PANGAEA.149999). By January 2015, a total of 581 data sets (over 4 000 000 data points) from 539 papers had been archived. Here we present the developments of this data compilation five years since its first description by Nisumaa et al. (2010). Most of study sites from which data archived are still in the Northern Hemisphere and the number of archived data from studies from the Southern Hemisphere and polar oceans are still relatively low. Data from 60 studies that investigated the response of a mix of organisms or natural communities were all added after 2010, indicating a welcomed shift from the study of individual organisms to communities and ecosystems. The initial imbalance of considerably more data archived on calcification and primary production than on other processes has improved. There is also a clear tendency towards more data archived from multifactorial studies after 2010. For easier and more effective access to ocean acidification data, the ocean acidification community is strongly encouraged to contribute to the data archiving effort, and help develop standard vocabularies describing the variables and define best practices for archiving ocean acidification data.
Resumo:
The program PanTool was developed as a tool box like a Swiss Army Knife for data conversion and recalculation, written to harmonize individual data collections to standard import format used by PANGAEA. The format of input files the program PanTool needs is a tabular saved in plain ASCII. The user can create this files with a spread sheet program like MS-Excel or with the system text editor. PanTool is distributed as freeware for the operating systems Microsoft Windows, Apple OS X and Linux.
Resumo:
Research on temporal-order perception uses temporal-order judgment (TOJ) tasks or synchrony judgment (SJ) tasks in their binary SJ2 or ternary SJ3 variants. In all cases, two stimuli are presented with some temporal delay, and observers judge the order of presentation. Arbitrary psychometric functions are typically fitted to obtain performance measures such as sensitivity or the point of subjective simultaneity, but the parameters of these functions are uninterpretable. We describe routines in MATLAB and R that fit model-based functions whose parameters are interpretable in terms of the processes underlying temporal-order and simultaneity judgments and responses. These functions arise from an independent-channels model assuming arrival latencies with exponential distributions and a trichotomous decision space. Different routines fit data separately for SJ2, SJ3, and TOJ tasks, jointly for any two tasks, or also jointly for the three tasks (for common cases in which two or even the three tasks were used with the same stimuli and participants). Additional routines provide bootstrap p-values and confidence intervals for estimated parameters. A further routine is included that obtains performance measures from the fitted functions. An R package for Windows and source code of the MATLAB and R routines are available as Supplementary Files.
Resumo:
Peer reviewed
Resumo:
Funding and trial registration: Scottish Government Chief Scientist Office grant CZH/3/17. ClinicalTrials.gov registration NCT01602705.
Resumo:
In this work, we present an adaptive unequal loss protection (ULP) scheme for H264/AVC video transmission over lossy networks. This scheme combines erasure coding, H.264/AVC error resilience techniques and importance measures in video coding. The unequal importance of the video packets is identified in the group of pictures (GOP) and the H.264/AVC data partitioning levels. The presented method can adaptively assign unequal amount of forward error correction (FEC) parity across the video packets according to the network conditions, such as the available network bandwidth, packet loss rate and average packet burst loss length. A near optimal algorithm is developed to deal with the FEC assignment for optimization. The simulation results show that our scheme can effectively utilize network resources such as bandwidth, while improving the quality of the video transmission. In addition, the proposed ULP strategy ensures graceful degradation of the received video quality as the packet loss rate increases. © 2010 IEEE.