935 results for Data access
As scientific workflows, and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be ready for processing, etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.
Instrumentation and automation play a vital role in managing the water industry. These systems generate vast amounts of data that must be effectively managed in order to enable intelligent decision making. Time series data management software, commonly known as data historians, is used for collecting and managing real-time (time series) information. More advanced software solutions provide a data infrastructure or utility-wide Operations Data Management System (ODMS) that stores, manages, calculates, displays, shares, and integrates data from the multiple disparate automation and business systems that are used daily in water utilities. These ODMS solutions are proven and can manage data ranging from smart water meters to data shared across third-party corporations. This paper focuses on practical utility successes in the water industry, where utility managers are leveraging instantaneous access to data from proven, commercial off-the-shelf ODMS solutions to enable better real-time decision making. Successes include saving $650,000 per year in water loss control, safeguarding water quality, and saving millions of dollars in energy management and asset management. Immediate opportunities exist to integrate the research being done in academia with these ODMS solutions in the field and to extend these successes to utilities around the world.
HydroShare is an online, collaborative system being developed for open sharing of hydrologic data and models. The goal of HydroShare is to enable scientists to easily discover and access hydrologic data and models, retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud or high performance computing model instances as necessary. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models and analyses. HydroShare is expanding the data sharing capability of the CUAHSI Hydrologic Information System by broadening the classes of data accommodated, creating new capability to share models and model components, and taking advantage of emerging social media functionality to enhance information about and collaboration around hydrologic data and models. One of the fundamental concepts in HydroShare is that of a Resource. All content is represented using a Resource Data Model that separates system and science metadata and has elements common to all resources as well as elements specific to the types of resources HydroShare will support. These will include different data types used in the hydrology community and models and workflows that require metadata on execution functionality. The HydroShare web interface and social media functions are being developed using the Drupal content management system. A geospatial visualization and analysis component enables searching, visualizing, and analyzing geographic datasets. The integrated Rule-Oriented Data System (iRODS) is being used to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the execution of models and workflows. 
This presentation will introduce the HydroShare functionality developed to date, describe key elements of the Resource Data Model and outline the roadmap for future development.
The Short-term Water Information and Forecasting Tools (SWIFT) is a suite of tools for flood and short-term streamflow forecasting, consisting of a collection of hydrologic model components and utilities. Catchments are modeled using conceptual subareas and a node-link structure for channel routing. The tools comprise modules for calibration, model state updating, output error correction, ensemble runs and data assimilation. Given the combinatorial nature of the modelling experiments and the sub-daily time steps typically used for simulations, the volume of model configurations and time series data is substantial and its management is not trivial. SWIFT is currently used mostly for research purposes but has also been used operationally, with intersecting but significantly different requirements. Early versions of SWIFT used mostly ad-hoc text files handled via Fortran code, with limited use of netCDF for time series data. The configuration and data handling modules have since been redesigned. The model configuration now follows a design where the data model is decoupled from the on-disk persistence mechanism. For research purposes the preferred on-disk format is JSON, to leverage numerous software libraries in a variety of languages, while retaining the legacy option of custom tab-separated text formats when it is a preferred access arrangement for the researcher. By decoupling data model and data persistence, it is much easier to interchangeably use for instance relational databases to provide stricter provenance and audit trail capabilities in an operational flood forecasting context. For the time series data, given the volume and required throughput, text based formats are usually inadequate. A schema derived from CF conventions has been designed to efficiently handle time series for SWIFT.
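The decoupling of data model and on-disk persistence described in the abstract above can be illustrated with a minimal sketch. This is not SWIFT code: the `SubareaConfig` fields, the `JsonStore` and `TsvStore` classes, and the `round_trip` helper are all hypothetical names chosen for illustration, assuming a configuration object that can be written either as JSON or as a tab-separated line.

```python
import json
from dataclasses import dataclass, asdict
from typing import Protocol


@dataclass
class SubareaConfig:
    # Hypothetical in-memory data model; field names are illustrative only.
    name: str
    area_km2: float
    model: str


class ConfigStore(Protocol):
    # Any persistence backend only needs these two methods.
    def dumps(self, cfg: SubareaConfig) -> str: ...
    def loads(self, text: str) -> SubareaConfig: ...


class JsonStore:
    """JSON persistence, readable from many languages."""

    def dumps(self, cfg: SubareaConfig) -> str:
        return json.dumps(asdict(cfg), sort_keys=True)

    def loads(self, text: str) -> SubareaConfig:
        return SubareaConfig(**json.loads(text))


class TsvStore:
    """Legacy-style persistence: one tab-separated line per configuration."""

    FIELDS = ("name", "area_km2", "model")

    def dumps(self, cfg: SubareaConfig) -> str:
        return "\t".join(str(getattr(cfg, f)) for f in self.FIELDS)

    def loads(self, text: str) -> SubareaConfig:
        name, area, model = text.rstrip("\n").split("\t")
        return SubareaConfig(name=name, area_km2=float(area), model=model)


def round_trip(store: ConfigStore, cfg: SubareaConfig) -> SubareaConfig:
    # The calling code never depends on which on-disk format is in use.
    return store.loads(store.dumps(cfg))
```

Under this assumed design, a relational backend for operational use would be one more class with the same two methods, and the code that builds or consumes `SubareaConfig` objects would not change.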
The current economic environment demands that companies work to know, interact with, differentiate, and increasingly personalize products and services for their customers. This scenario requires management tools and models for handling customer relationships, so that a company can perceive and respond quickly to consumer demands. This work reviews concepts of CRM (Customer Relationship Management) and describes the implementation of a customer relationship management tool at a consortium (consórcio) company. The work reflects a need identified in the company's strategic planning, and information technology tools and database software were used to support the business management goals. As a result, the company now operates a Database Marketing system, created to assist the professionals involved in customer service and customer relationship management. The Database Marketing system is being used to collect customer service data such as service histories, registration data, demographic profile, psychographic profile, and customer value category. During customer interactions, the system eases the specialists' work and helps improve the quality of customer service, addressing the needs of the company's various specialists in areas such as sales, service quality, finance, and business management.
The process began with the formation of an internal working group to discuss strategies and the implementation schedule. The group's first decision was to develop the software in-house, in order to fully serve the company's core business. The project drew on the business knowledge of the company's professionals and on the help of external specialists and consultants. The details of the project, as well as the steps of the action research, are described in the body of the dissertation.
This work presents a case study of data mining in retail. The business in question sells furniture and construction materials. Mining was performed on information generated from sales transactions over a period of 8 months. Customer registration data was also used and cross-referenced with sales information, with the aim of obtaining results that can be converted into actions that in turn generate profit for the company. All data modeling, preparation, and transformation was done to ease the application of the mining techniques that data mining tools offer for knowledge discovery. The process is described in detail for a better understanding of the results obtained. The CRISP methodology used in the work is also discussed, taking into account the difficulties and conveniences that arose during the phases of the process. The strengths and weaknesses of the mining tools used, IBM Intelligent Miner and WEKA (Waikato Environment for Knowledge Analysis), are also analyzed, along with those of all the other software needed to carry out the work. Finally, the results obtained are presented and discussed, together with the opinion of the company's owners about those results and the value each of them could add to the business.
Decision-making systems based on Data Warehouses (DW) are used more and more by large companies and organizations. The multidimensional model of data organization used by these systems, together with online analytical processing (OLAP) techniques, allows complex analyses of business history through a simple and intuitive query interface. Although DWs store historical data by nature, the structures that organize and classify this data, called dimensions, do not strictly have a temporal representation and reflect only the current structure. For a system intended for data analysis, the lack of dimension history makes it impossible to query the actual context in which past data was produced. In addition, changes to multidimensional schemas need to be assisted and managed by an evolution model, so as to guarantee the consistency and integrity of the multidimensional model without loss of relevant information. This work presents seventeen schema change operations and seven instance change operations for multidimensional DW models. A version model, based on associating validity intervals with schemas and instances, is proposed for managing these operations. The entire history of DW definitions and data is maintained by this model, allowing complete analyses of past data and of the evolution of the DW. Besides supporting historical queries on the definitions and instances of the DW, the model also allows more than one schema to remain active simultaneously. That is, two or more schemas can continue to have their data updated periodically, so that applications can query recent data using different schema versions.
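The idea of associating validity intervals with schema versions, so that more than one version can be active at once, can be sketched as follows. This is only an illustrative reading of the abstract above, not the model defined in the work: `SchemaVersion`, its fields, and `versions_valid_on` are hypothetical names, and half-open intervals `[valid_from, valid_to)` are an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SchemaVersion:
    # Hypothetical record: a schema version valid over [valid_from, valid_to).
    version_id: int
    dimensions: Tuple[str, ...]
    valid_from: date
    valid_to: Optional[date] = None  # None means the version is still active

    def valid_on(self, day: date) -> bool:
        # True when `day` falls inside this version's validity interval.
        not_ended = self.valid_to is None or day < self.valid_to
        return self.valid_from <= day and not_ended


def versions_valid_on(versions: List[SchemaVersion], day: date) -> List[SchemaVersion]:
    # A historical query first selects the schema versions valid on that day.
    return [v for v in versions if v.valid_on(day)]


# Illustrative history: version 0 was closed, versions 1 and 2 are both active.
HISTORY = [
    SchemaVersion(0, ("store", "product"), date(1998, 1, 1), date(2000, 1, 1)),
    SchemaVersion(1, ("store", "product", "region"), date(2000, 1, 1)),
    SchemaVersion(2, ("store", "product", "channel"), date(2003, 1, 1)),
]
```

With this sketch, a query dated in 1999 resolves to version 0 only, while a query dated in 2004 resolves to versions 1 and 2, which mirrors the abstract's point that several schemas may remain active simultaneously.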
As enterprises grow and the need to share information across departments and business areas becomes more critical, companies are turning to integration as a method for interconnecting heterogeneous, distributed and autonomous systems. Whether the sales application needs to interface with the inventory application or the procurement application needs to connect to an auction site, it seems that any application can be made better by integrating it with other applications. Integration between applications can face several difficulties, because applications may not have been designed and implemented with integration in mind. With regard to integration, two-tier software systems, composed of the database tier and the “front-end” tier (interface), have shown some limitations. To overcome these limitations, three-tier systems were proposed in the literature. By adding a middle tier (referred to as middleware) between the database tier and the “front-end” tier (or simply the application), three main benefits emerge. First, dividing software systems into three tiers increases their capacity to integrate with other systems. Second, modifications to an individual tier may be carried out without necessarily affecting the other tiers and the integrated systems. Third, as a consequence of the first two, fewer maintenance tasks are needed in the software system and in all integrated systems. Concerning software development in three tiers, this dissertation focuses on two emerging technologies, Semantic Web and Service Oriented Architecture, combined with middleware.
Blending these two technologies with middleware resulted in the development of the Swoat framework (Service and Semantic Web Oriented ArchiTecture) and leads to the following four synergistic advantages: (1) it allows the creation of loosely coupled systems, decoupling the database from the “front-end” tiers and therefore reducing maintenance; (2) the database schema is transparent to the “front-end” tiers, which are aware of the information model (or domain model) that describes what data is accessible; (3) integration with other heterogeneous systems is made possible through services provided by the middleware; (4) a service request from the “front-end” tier focuses on ‘what’ data is needed and not on ‘where’ and ‘how’ related issues, thus reducing application development time.
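The separation between ‘what’ data a front end asks for and ‘where’ and ‘how’ it is stored can be sketched with a toy middle tier. This is not the Swoat API: the `Middleware` class, its `register`, `describe` and `request` methods, and the `Customer` concept are hypothetical names used only to illustrate the general three-tier idea.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


class Middleware:
    """Toy middle tier: front ends name a domain concept, not a table."""

    def __init__(self) -> None:
        self._resolvers: Dict[str, Callable[[], List[Row]]] = {}

    def register(self, concept: str, resolver: Callable[[], List[Row]]) -> None:
        # The resolver hides where the data lives and how it is fetched.
        self._resolvers[concept] = resolver

    def describe(self) -> List[str]:
        # The information model visible to front ends: which concepts exist.
        return sorted(self._resolvers)

    def request(self, concept: str, **criteria: str) -> List[Row]:
        # Front ends state what they want; storage details stay behind this call.
        rows = self._resolvers[concept]()
        return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


# Storage detail known only to the middle tier (a stand-in for a database table).
_T_CUST = [("1", "Ana", "Lisbon"), ("2", "Rui", "Porto")]


def _customers() -> List[Row]:
    return [{"id": i, "name": n, "city": c} for i, n, c in _T_CUST]


mw = Middleware()
mw.register("Customer", _customers)
```

A front end in this sketch would call `mw.request("Customer", city="Porto")` without knowing the name or layout of the underlying table, so the table could be renamed or moved by changing only the registered resolver.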
Sharing sensor data between multiple devices and users can be challenging for naive users, and requires knowledge of programming and the use of different communication channels and/or development tools, leading to non-uniform solutions. This thesis proposes a system that allows users to access sensors, share sensor data and manage sensors. With this system we intend to manage devices, share sensor data, compare sensor data, and set policies to act based on rules. This thesis presents the design and implementation of the system, as well as three case studies of its use.
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
GIS_EM (Geographic Information System for Environmental Monitoring), a program devised to manage geographic information for monitoring soil, surface water, and groundwater and developed for use in the Health, Safety, and Environment Division of Paulinia Refinery, is presented. The program enables registration and management of alphanumeric information pertaining to specific themes such as drilling performed for sample collection and for installation of monitoring wells, geophysical and other tests, results of chemical analyses of soil, surface water, and groundwater, as well as reference values providing orientation for soil and water quality, such as EPA, Dutch List, etc. Management of such themes is performed by means of alphanumeric search tools with specific filters and, in the case of spatial search, through the selection of spatial elements (themes) in map view. Documents existing in digital form, such as reports, photos, and maps, may be registered and managed in the network environment. As the system centralizes information generated by environmental investigations, it expedites access to and search of documents produced and stored in the network environment, minimizing search time and the need to file printed documents. This is an abstract of a paper presented at the AIChE Annual Meeting and Fall Showcase (Cincinnati, OH 10/30/2005-11/4/2005).