Biblioteca Digital

872 results for heterogeneous data sources

Data sources on probation, conditional discharge, supervision, and periodic imprisonment in Illinois /

Relevance:

100.00%

Abstract:

Includes bibliographies.

Estimating cause-specific mortality from community- and facility-based data sources in the United Republic of Tanzania: options and implications for mortality burden estimates

Relevance:

100.00%

Abstract:

Objective. To compare mortality burden estimates based on direct measurement of levels and causes in communities with indirect estimates based on combining health facility cause-specific mortality structures with community measurement of mortality levels. Methods. Data from sentinel vital registration (SVR) with verbal autopsy (VA) were used to determine the cause-specific mortality burden at the community level in two areas of the United Republic of Tanzania. Proportional cause-specific mortality structures from health facilities were applied to counts of deaths obtained by SVR to produce modelled estimates. The burden was expressed in years of life lost. Findings. A total of 2884 deaths were recorded from health facilities and 2167 recorded from SVR/VAs. In the perinatal and neonatal age group cause-specific mortality rates were dominated by perinatal conditions and stillbirths in both the community and the facility data. The modelled estimates for chronic causes were very similar to those from SVR/VA. Acute febrile illnesses were coded more specifically in the facility data than in the VA. Injuries were more prevalent in the SVR/VA data than in that from the facilities. Conclusion. In this setting, improved International classification of diseases and health related problems, tenth revision (ICD-10) coding practices and applying facility-based cause structures to counts of deaths from communities, derived from SVR, appear to produce reasonable estimates of the cause-specific mortality burden in those aged 5 years and older determined directly from VA. For the perinatal and neonatal age group, VA appears to be required. Use of this approach in a nationally representative sample of facilities may produce reliable national estimates of the cause-specific mortality burden for leading causes of death in adults.
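
The indirect (modelled) estimate described in the Methods can be illustrated with a small arithmetic sketch: the proportional cause structure observed in facilities is applied to a community death count. The function name and all counts below are hypothetical and are not figures from the study; the conversion to years of life lost is left out.

```python
def modelled_deaths_by_cause(facility_deaths, community_total):
    """Apply facility cause-of-death proportions to a community death count.

    facility_deaths: mapping of cause -> deaths recorded in facilities.
    community_total: total deaths counted in the community (e.g. by SVR).
    """
    facility_total = sum(facility_deaths.values())
    if facility_total == 0:
        raise ValueError("facility data contain no deaths")
    # Each cause receives the share of community deaths that it has in facilities.
    return {
        cause: community_total * count / facility_total
        for cause, count in facility_deaths.items()
    }


# Hypothetical counts, for illustration only.
facility = {"malaria": 50, "injuries": 10, "chronic": 40}
estimates = modelled_deaths_by_cause(facility, community_total=200)
print(estimates)
```

Such an estimate is only as good as the assumption that the facility cause structure matches the community one, which the abstract reports does not hold equally well for every cause or age group.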

A design pattern for efficient retrieval of large data sets from remote data sources

Relevance:

100.00%

Abstract:

Retrieving large amounts of information over wide area networks, including the Internet, is problematic due to issues arising from latency of response, lack of direct memory access to data serving resources, and fault tolerance. This paper describes a design pattern for solving the issues of handling results from queries that return large amounts of data. Typically these queries would be made by a client process across a wide area network (or Internet), with one or more middle-tiers, to a relational database residing on a remote server. The solution involves implementing a combination of data retrieval strategies, including the use of iterators for traversing data sets and providing an appropriate level of abstraction to the client, double-buffering of data subsets, multi-threaded data retrieval, and query slicing. This design has recently been implemented and incorporated into the framework of a commercial software product developed at Oracle Corporation.
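
A minimal sketch of how three of the listed strategies (an iterator, query slicing, and background retrieval into a bounded buffer) might fit together is given below. It is not the implementation described in the paper: `sliced_rows`, `fetch_slice`, and `fake_fetch` are hypothetical names, and a real `fetch_slice` would issue a paged query to the remote server.

```python
import queue
import threading
from typing import Any, Callable, Iterator, Sequence


def sliced_rows(
    fetch_slice: Callable[[int, int], Sequence[Any]],
    slice_size: int = 1000,
    buffered_slices: int = 1,
) -> Iterator[Any]:
    """Yield rows one at a time while a worker thread prefetches later slices.

    fetch_slice(offset, limit) is an assumed callable that returns up to
    `limit` rows starting at `offset`, and an empty sequence at the end.
    """
    buffer: queue.Queue = queue.Queue(maxsize=buffered_slices)
    done = object()  # sentinel marking the end of the result set

    def worker() -> None:
        offset = 0
        try:
            while True:
                rows = fetch_slice(offset, slice_size)
                if not rows:
                    break
                buffer.put(rows)  # blocks while the buffer is full
                offset += len(rows)
        except Exception as exc:  # hand the failure to the consumer
            buffer.put(exc)
            return
        buffer.put(done)

    # Daemon thread: if the consumer stops early, the worker does not
    # keep the process alive. A fuller design would add cancellation.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = buffer.get()
        if item is done:
            return
        if isinstance(item, Exception):
            raise item
        yield from item


# Stand-in for a remote query, for illustration only.
DATA = list(range(10))


def fake_fetch(offset: int, limit: int) -> Sequence[int]:
    return DATA[offset:offset + limit]


rows = list(sliced_rows(fake_fetch, slice_size=3))
print(rows)
```

The client sees only an ordinary iterator, while the bounded queue caps how much data is held in memory ahead of the consumer.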

Visualisation of heterogeneous data with the generalised generative topographic mapping

Relevance:

100.00%

Abstract:

Heterogeneous and incomplete datasets are common in many real-world visualisation applications. The probabilistic nature of the Generative Topographic Mapping (GTM), which was originally developed for complete continuous data, can be extended to model heterogeneous (i.e. containing both continuous and discrete values) and missing data. This paper describes and assesses the resulting model on both synthetic and real-world heterogeneous data with missing values.

Method of Elimination Data Sources Conflicts in Information System

Relevance:

100.00%

Abstract:

Using modern object-oriented methods to design information systems (IS), and to describe the relationships between an IS and the business processes it automates, makes it necessary to build a single, complete IS from a set of local models of that system. This approach gives rise to contradictions caused by inconsistency among the actions of individual IS developers and, more importantly, among the points of view of individual IS users. Similar contradictions arise while the IS is in service at the enterprise, because individual business processes change constantly. Note also that the overwhelming majority of IS are now developed and maintained as sets of separate functional modules, each of which can function as an independent IS. Integrating these modules into a single system, however, can cause many problems: for example, modules may contain functions that the enterprise does not use as intended, and the data and software integration of modules from different vendors is complex. In most cases these contradictions, and the reasons that cause them, result from the initial representation of the IS as a stable equilibrium system. Work [1] considered a representation of the IS as a dynamic multistable system capable of carrying out the following actions:

Visualisation of heterogeneous data with simultaneous feature saliency using Generalised Generative Topographic Mapping

Relevance:

100.00%

Abstract:

Most machine-learning algorithms are designed for datasets with features of a single type whereas very little attention has been given to datasets with mixed-type features. We recently proposed a model to handle mixed types with a probabilistic latent variable formalism. This proposed model describes the data by type-specific distributions that are conditionally independent given the latent space and is called generalised generative topographic mapping (GGTM). It has often been observed that visualisations of high-dimensional datasets can be poor in the presence of noisy features. In this paper we therefore propose to extend the GGTM to estimate feature saliency values (GGTMFS) as an integrated part of the parameter learning process with an expectation-maximisation (EM) algorithm. The efficacy of the proposed GGTMFS model is demonstrated both for synthetic and real datasets.

A semantic paradigm for intelligent data access

Relevance:

100.00%

Abstract:

An implementation of Sem-ODB, a database management system based on the Semantic Binary Model, is presented. A metaschema of a Sem-ODB database as well as the top-level architecture of the database engine is defined. A new benchmarking technique is proposed which allows databases built on different database models to compete fairly. This technique is applied to show that Sem-ODB has excellent efficiency compared to a relational database on a certain class of database applications. A new semantic benchmark is designed which allows evaluation of the performance of the features characteristic of semantic database applications. An application used in the benchmark represents a class of problems requiring databases with sparse data, complex inheritances and many-to-many relations. Such databases can be naturally accommodated by the semantic model. A fixed predefined implementation is not enforced, allowing the database designer to choose the most efficient structures available in the DBMS tested. The results of the benchmark are analyzed.
A new high-level querying model for semantic databases is defined. It is proven adequate to serve as an efficient native semantic database interface, and has several advantages over the existing interfaces. It is optimizable and parallelizable, supports the definition of semantic user views and the interoperability of semantic databases with other data sources such as the World Wide Web, relational, and object-oriented databases. The query is structured as a semantic database schema graph with interlinking conditionals. The query result is a mini-database, accessible in the same way as the original database. The paradigm supports and utilizes the rich semantics and inherent ergonomics of semantic databases.
The analysis and high-level design of a system that exploits the superiority of the Semantic Database Model to other data models in expressive power and ease of use to allow uniform access to heterogeneous data sources such as semantic databases, relational databases, web sites, ASCII files, and others via a common query interface is presented. The Sem-ODB engine is used to control all the data sources combined under a unified semantic schema. A particular application of the system to provide an ODBC interface to the WWW as a data source is discussed.

Semantic distributed database management with applications to the Internet dissemination of environmental data

Relevance:

100.00%

Abstract:

The research presented in this dissertation comprises several parts which jointly attain the goal of Semantic Distributed Database Management with Applications to Internet Dissemination of Environmental Data.
Part of the research into more effective and efficient data management has been pursued through enhancements to the Semantic Binary Object-Oriented database (Sem-ODB) such as more effective load balancing techniques for the database engine, and the use of Sem-ODB as a tool for integrating structured and unstructured heterogeneous data sources. Another part of the research in data management has pursued methods for optimizing queries in distributed databases through the intelligent use of network bandwidth; this has applications in networks that provide varying levels of Quality of Service or throughput.
The application of the Semantic Binary database model as a tool for relational database modeling has also been pursued. This has resulted in database applications that are used by researchers at the Everglades National Park to store environmental data and remotely sensed imagery.
The areas of research described above have contributed to the creation of TerraFly, which provides for the dissemination of geospatial data via the Internet. TerraFly research presented herein ranges from the development of TerraFly's back-end database and interfaces, through the features that are presented to the public (such as the ability to provide autopilot scripts and on-demand data about a point), to applications of TerraFly in the areas of hazard mitigation, recreation, and aviation.

Kernel Methods for Small Sample and Asymptotic Tail Inference for Dependent, Heterogeneous Data

Relevance:

100.00%

On Tail Index Estimation Using Dependent, Heterogeneous Data

Relevance:

100.00%

Heterogeneous data source access for mobile devices

Relevance:

100.00%

Abstract:

The mediator software architecture design has been developed to provide data integration and retrieval in distributed, heterogeneous environments. Since the initial conceptualization of this architecture, many new technologies have emerged that can facilitate the implementation of this design. The purpose of this thesis was to show that a mediator framework supporting users of mobile devices could be implemented using common software technologies available today. In addition, the prototype was developed with a view to providing a better understanding of what a mediator is and to expose issues that will have to be addressed in full, more robust designs. The prototype developed for this thesis was implemented using various technologies including: Java, XML, and Simple Object Access Protocol (SOAP) among others. SOAP was used to accomplish inter-process communication. In the end, it is expected that more data intensive software applications will be possible in a world with ever-increasing demands for information.
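
The general shape of a mediator can be sketched as follows: per-source wrappers translate one common query into each source's native access method, and the mediator merges the answers. This is an illustrative sketch only; the thesis prototype used Java, XML, and SOAP, and every class, table, and field name below is hypothetical.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterable, List, Protocol, Sequence


class Wrapper(Protocol):
    """Common interface that every source wrapper is assumed to expose."""

    def query(self, field: str, value: str) -> Iterable[Dict[str, str]]: ...


class CsvWrapper:
    """Wraps delimited text; filters rows in memory."""

    def __init__(self, text: str) -> None:
        self.rows = list(csv.DictReader(io.StringIO(text)))

    def query(self, field: str, value: str) -> List[Dict[str, str]]:
        return [dict(r) for r in self.rows if r.get(field) == value]


class SqliteWrapper:
    """Wraps one relational table; translates the query into SQL."""

    def __init__(self, conn: sqlite3.Connection, table: str,
                 columns: Sequence[str]) -> None:
        self.conn, self.table, self.columns = conn, table, list(columns)

    def query(self, field: str, value: str) -> List[Dict[str, str]]:
        if field not in self.columns:  # unknown field: no matches
            return []
        # Identifiers come from the fixed column list; the value is bound.
        sql = (f"SELECT {', '.join(self.columns)} FROM {self.table} "
               f"WHERE {field} = ?")
        return [dict(zip(self.columns, row))
                for row in self.conn.execute(sql, (value,))]


class Mediator:
    """Sends one query to every wrapper and concatenates the results."""

    def __init__(self, wrappers: Sequence[Wrapper]) -> None:
        self.wrappers = list(wrappers)

    def query(self, field: str, value: str) -> List[Dict[str, str]]:
        results: List[Dict[str, str]] = []
        for wrapper in self.wrappers:
            results.extend(wrapper.query(field, value))
        return results


# Hypothetical sources, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, city TEXT)")
conn.execute("INSERT INTO people VALUES ('Ana', 'Lisbon')")
csv_text = "name,city\nBruno,Lisbon\nCarla,Porto\n"

mediator = Mediator([SqliteWrapper(conn, "people", ["name", "city"]),
                     CsvWrapper(csv_text)])
results = mediator.query("city", "Lisbon")
print(results)
```

A client, such as an application on a mobile device, would talk only to the mediator and would not need to know how many sources exist or how each is queried.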

Integrating structured data using property precedence

Relevance:

100.00%

Abstract:

Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. One of the main challenges in data integration is reconciling semantic differences among data sources. Approaches that have been used to solve this problem can be categorized as schema-based and attribute-based. Schema-based approaches use schema information to identify the semantic similarity in data; furthermore, they focus on reconciling types before reconciling attributes. In contrast, attribute-based approaches use statistical and structural information of attributes to identify the semantic similarity of data in different sources. This research examines an approach to semantic reconciliation based on integrating properties expressed at different levels of abstraction or granularity using the concept of property precedence. Property precedence reconciles the meaning of attributes by identifying similarities between attributes based on what these attributes represent in the real world. In order to use property precedence for semantic integration, we need to identify the precedence of attributes within and across data sources. The goal of this research is to develop and evaluate a method and algorithms that will identify precedence relations among attributes and build a property precedence graph (PPG) that can be used to support integration.

Event Monitoring Based On Web Services for Heterogeneous Event Sources

Relevance:

100.00%

Abstract:

This article discusses event monitoring options for heterogeneous event sources as found in today's heterogeneous distributed information systems. It follows the central assumption that a fully generic event monitoring solution cannot provide complete support for event monitoring; instead, event-source-specific semantics, such as certain event types or support for certain event monitoring techniques, have to be taken into account. Following from this, the core result of the work presented here is the extension of a configurable event monitoring (Web) service for a variety of event sources. A service approach allows us to trade genericity for the exploitation of source-specific characteristics. It thus delivers results for the areas of SOA, Web services, CEP and EDA.

Studying the Causes and Consequences of Internal Labor Migration Using Survey and Administrative Data Sources

Relevance:

100.00%

Abstract:

This dissertation comprises three chapters. The first chapter motivates the use of a novel data set combining survey and administrative sources for the study of internal labor migration. By following a sample of individuals from the American Community Survey (ACS) across their employment outcomes over time according to the Longitudinal Employer-Household Dynamics (LEHD) database, I construct a measure of geographic labor mobility that allows me to exploit information about individuals prior to their move. This enables me to explore aspects of the migration decision, such as homeownership and employment status, in ways that have not previously been possible. In the second chapter, I use this data set to test the theory that falling home prices affect a worker’s propensity to take a job in a different metropolitan area from where he is currently located. Employing a within-CBSA and time estimation that compares homeowners to renters in their propensities to relocate for jobs, I find that homeowners who have experienced declines in the nominal value of their homes are approximately 12% less likely than average to take a new job in a location outside of the metropolitan area where they currently reside. This evidence is consistent with the hypothesis that housing lock-in has contributed to the decline in labor mobility of homeowners during the recent housing bust. The third chapter focuses on a sample of unemployed workers in the same data set, in order to compare the unemployment durations of those who find subsequent employment by relocating to a new metropolitan area, versus those who find employment in their original location. Using an instrumental variables strategy to address the endogeneity of the migration decision, I find that out-migrating for a new job significantly reduces the time to re-employment. These results stand in contrast to OLS estimates, which suggest that those who move have longer unemployment durations. 
This implies that those who migrate for jobs in the data may be particularly disadvantaged in their ability to find employment, and thus have strong short-term incentives to relocate.

Serverless middlewares to integrate heterogeneous and distributed services in cloud continuum environments

Relevance:

100.00%

Abstract:

The application of modern ICT technologies is radically changing many fields pushing toward more open and dynamic value chains fostering the cooperation and integration of many connected partners, sensors, and devices. As a valuable example, the emerging Smart Tourism field derived from the application of ICT to Tourism so to create richer and more integrated experiences, making them more accessible and sustainable. From a technological viewpoint, a recurring challenge in these decentralized environments is the integration of heterogeneous services and data spanning multiple administrative domains, each possibly applying different security/privacy policies, device and process control mechanisms, service access, and provisioning schemes, etc. The distribution and heterogeneity of those sources exacerbate the complexity in the development of integrating solutions with consequent high effort and costs for partners seeking them. Taking a step towards addressing these issues, we propose APERTO, a decentralized and distributed architecture that aims at facilitating the blending of data and services. At its core, APERTO relies on APERTO FaaS, a Serverless platform allowing fast prototyping of the business logic, lowering the barrier of entry and development costs to newcomers, (zero) fine-grained scaling of resources servicing end-users, and reduced management overhead. APERTO FaaS infrastructure is based on asynchronous and transparent communications between the components of the architecture, allowing the development of optimized solutions that exploit the peculiarities of distributed and heterogeneous environments. 
In particular, APERTO addresses the provisioning of scalable and cost-efficient mechanisms targeting: i) function composition allowing the definition of complex workloads from simple, ready-to-use functions, enabling smarter management of complex tasks and improved multiplexing capabilities; ii) the creation of end-to-end differentiated QoS slices minimizing interfaces among applications/services running on a shared infrastructure; iii) an abstraction providing uniform and optimized access to heterogeneous data sources; iv) a decentralized approach for the verification of access rights to resources.
