889 results for heterogeneous data sources


Relevance:

100.00%

Publisher:

Abstract:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed Data Extractor, a data retrieval solution that allows us to define queries to Web sites and process the resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize a specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development, we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms the feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling complex cases.
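
The idea of treating a Web site as a queryable data source can be illustrated with a minimal Python sketch. This is not the Data Extractor implementation: the abstract does not show its scripting language or its Java library, so the names `make_rule_wrapper`, `run_query`, `PAGE` and the regular-expression rule format are all hypothetical stand-ins. The sketch assumes the "easy" case, in which a declarative rule is enough to turn page markup into rows.

```python
import re
from typing import Callable, Dict, List

Row = Dict[str, str]
Wrapper = Callable[[str], List[Row]]


def make_rule_wrapper(row_pattern: str) -> Wrapper:
    """Build a wrapper from a declarative rule (a regex with named groups)."""
    compiled = re.compile(row_pattern, re.DOTALL)

    def wrapper(page: str) -> List[Row]:
        # Each regex match becomes one row; named groups become columns.
        return [match.groupdict() for match in compiled.finditer(page)]

    return wrapper


def run_query(wrapper: Wrapper, page: str,
              predicate: Callable[[Row], bool]) -> List[Row]:
    """Treat the page as a table: extract rows, then filter them."""
    return [row for row in wrapper(page) if predicate(row)]


# Hypothetical page content; a real deployment would fetch this over HTTP.
PAGE = """
<table>
<tr><td>Widget</td><td>9.50</td></tr>
<tr><td>Gadget</td><td>24.00</td></tr>
</table>
"""

product_wrapper = make_rule_wrapper(
    r"<tr><td>(?P<name>[^<]+)</td><td>(?P<price>[^<]+)</td></tr>"
)

cheap = run_query(product_wrapper, PAGE, lambda row: float(row["price"]) < 10)
print(cheap)  # [{'name': 'Widget', 'price': '9.50'}]
```

A site whose markup defeats a simple rule would, in the spirit of the twofold approach, get a hand-written function with the same `Wrapper` signature, so the query layer would not need to know which kind of wrapper it is calling.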

Relevance:

100.00%

Publisher:

Abstract:

Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different application domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining and knowledge discovery, and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue, resulting in many different approaches. However, no single, generally accepted methodology has emerged in academia or industry that provides ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at the High-performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity, that is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing the Semantic Binary Object-oriented Data Model (Sem-ODM) and the Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution by investigating the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
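
Contribution (ii) refers to comparing the extents of meta-data constructs. The abstract does not spell out how, so the following Python sketch is only an assumption about what such a comparison could look like: it classifies the relation between two constructs by set operations on their extents (the sets of object identifiers they contain). The function name `extent_relation`, the category labels and the sample data are hypothetical.

```python
from typing import Hashable, Iterable


def extent_relation(extent_a: Iterable[Hashable],
                    extent_b: Iterable[Hashable]) -> str:
    """Classify how two constructs relate, judged only by their extents."""
    a, b = set(extent_a), set(extent_b)
    if a == b:
        return "equivalent"
    if a < b:
        return "subset"      # every object of A is also in B
    if a > b:
        return "superset"    # every object of B is also in A
    if a & b:
        return "overlap"     # some shared objects, neither contains the other
    return "disjoint"


# Hypothetical extents from two component schemas, keyed by a shared identifier.
students = {"id1", "id2", "id3"}
teaching_assistants = {"id2", "id3"}
faculty = {"id7", "id8"}

print(extent_relation(teaching_assistants, students))  # subset
print(extent_relation(students, faculty))              # disjoint
```

In practice two schemas rarely share identifiers directly, which is presumably why the thesis pairs this step with shared ontologies for context mediation; the sketch sidesteps that by assuming the identifiers are already comparable.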

Relevance:

100.00%

Publisher:

Abstract:

Query processing is a commonly performed procedure and a vital and integral part of information processing. It is therefore important for information processing applications to continuously improve the accessibility of data sources as well as the ability to perform queries on those data sources. It is well known that the relational database model and the Structured Query Language (SQL) are currently the most popular tools to implement and query databases. However, a certain level of expertise is needed to use SQL and to access relational databases. This study presents a semantic modeling approach that enables the average user to access and query existing relational databases without concern for the database's structure or technicalities. This method includes an algorithm to represent relational database schemas in a more semantically rich way. The result is a semantic view of the relational database. The user performs queries using an adapted version of SQL, namely Semantic SQL. This method substantially reduces the size and complexity of queries. Additionally, it shortens the database application development cycle and improves maintenance and reliability by reducing the size of application programs. Furthermore, a Semantic Wrapper tool illustrating the semantic wrapping method is presented. I further extend the use of this semantic wrapping method to heterogeneous database management. Relational databases, object-oriented databases and Internet data sources are considered to be part of the heterogeneous database environment. Semantic schemas resulting from the algorithm presented in the method were employed to describe the structure of these data sources in a uniform way. Semantic SQL was utilized to query various data sources. As a result, this method provides users with the ability to access and perform queries on heterogeneous database systems in a more natural way.

Relevance:

100.00%

Publisher:

Abstract:

The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the last few years has inspired the data science and research community to address the challenge of extracting knowledge from big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision support. These specifically concern data processing with semantic harmonisation, low-level fusion, analytics, and knowledge modelling with high-level fusion and reasoning. Such approaches will be introduced and presented in the context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.

Relevance:

100.00%

Publisher:

Abstract:

The last decades have been characterized by a continuous adoption of IT solutions in the healthcare sector, which has resulted in the proliferation of tremendous amounts of data over heterogeneous systems. Distinct data types are currently generated, manipulated and stored in the several institutions where patients are treated. Data sharing and integrated access to this information will allow the extraction of relevant knowledge that can lead to better diagnostics and treatments. This thesis proposes new integration models for gathering information and extracting knowledge from multiple and heterogeneous biomedical sources. The complexity of the scenario led us to split the integration problem according to the data type and to the usage specificity. The first contribution is a cloud-based architecture for exchanging medical imaging services. It offers a simplified registration mechanism for providers and services, promotes remote data access, and facilitates the integration of distributed data sources. Moreover, it is compliant with international standards, ensuring the platform's interoperability with current medical imaging devices. The second proposal is a sensor-based architecture for integration of electronic health records. It follows a federated integration model and aims to provide a scalable solution to search and retrieve data from multiple information systems. The last contribution is an open architecture for gathering patient-level data from dispersed and heterogeneous databases. All the proposed solutions were deployed and validated in real-world use cases.

Relevance:

100.00%

Publisher:

Abstract:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud, such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingestion, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that utilizes workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
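
The neighborhood-centric programming model described for NSCALE can be pictured with a small single-machine Python sketch. It is not NSCALE's API, which the abstract does not show: `k_hop_neighborhood`, `count_triangles` and `run_on_neighborhoods` are hypothetical names, and the sketch ignores distribution, packing and memory management entirely. It only illustrates the contrast with vertex-centric code, in that the user program receives a whole subgraph rather than the state of a single vertex.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected adjacency sets


def k_hop_neighborhood(adj: Graph, center: Hashable, k: int) -> Graph:
    """Return the subgraph induced by all vertices within k hops of center."""
    depth = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue  # do not expand beyond k hops
        for v in adj.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    nodes = set(depth)
    return {u: {v for v in adj.get(u, ()) if v in nodes} for u in nodes}


def count_triangles(sub: Graph) -> int:
    """User program: count triangles in one subgraph (no self-loops assumed)."""
    ordered = sum(1 for u in sub for v in sub[u] for w in sub[v] if w in sub[u])
    return ordered // 6  # each triangle is seen as 6 ordered triples


def run_on_neighborhoods(adj: Graph, centers: Iterable[Hashable], k: int,
                         program: Callable[[Graph], int]) -> Dict[Hashable, int]:
    """Apply the user program to the k-hop neighborhood of each chosen vertex."""
    return {c: program(k_hop_neighborhood(adj, c, k)) for c in centers}


# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
GRAPH: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

print(run_on_neighborhoods(GRAPH, ["a", "d"], 1, count_triangles))
# {'a': 1, 'd': 0}
```

Selecting the centers and the radius up front plays the role of the declarative subgraph specification mentioned in the abstract; only those neighborhoods are ever materialized.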

Relevance:

100.00%

Publisher:

Abstract:

Abstract and Summary of Thesis: Background: Individuals with Major Mental Illness (such as schizophrenia and bipolar disorder) experience increased rates of physical health comorbidity compared to the general population. They also experience inequalities in access to certain aspects of healthcare. This ultimately leads to premature mortality. Studies detailing patterns of physical health comorbidity are limited by their definitions of comorbidity, single disease approach to comorbidity and by the study of heterogeneous groups. To date the investigation of possible sources of healthcare inequalities experienced by individuals with Major Mental Illness (MMI) is relatively limited. Moreover studies detailing the extent of premature mortality experienced by individuals with MMI vary both in terms of the measure of premature mortality reported and age of the cohort investigated, limiting their generalisability to the wider population. Therefore local and national data can be used to describe patterns of physical health comorbidity, investigate possible reasons for health inequalities and describe mortality rates. These findings will extend existing work in this area. Aims and Objectives: To review the relevant literature regarding: patterns of physical health comorbidity, evidence for inequalities in physical healthcare and evidence for premature mortality for individuals with MMI. To examine the rates of physical health comorbidity in a large primary care database and to assess for evidence for inequalities in access to healthcare using both routine primary care prescribing data and incentivised national Quality and Outcome Framework (QOF) data. Finally to examine the rates of premature mortality in a local context with a particular focus on cause of death across the lifespan and effect of International Classification of Disease Version 10 (ICD 10) diagnosis and socioeconomic status on rates and cause of death. 
Methods: A narrative review of the literature surrounding patterns of physical health comorbidity, the evidence for inequalities in physical healthcare and premature mortality in MMI was undertaken. Rates of physical health comorbidity and multimorbidity in schizophrenia and bipolar disorder were examined using a large primary care dataset (Scottish Programme for Improving Clinical Effectiveness in Primary Care (SPICE)). Possible inequalities in access to healthcare were investigated by comparing patterns of prescribing in individuals with MMI and comorbid physical health conditions with prescribing rates in individuals with physical health conditions without MMI using SPICE data. Potential inequalities in access to health promotion advice (in the form of smoking cessation) and prescribing of Nicotine Replacement Therapy (NRT) were also investigated using SPICE data. Possible inequalities in access to incentivised primary healthcare were investigated using National Quality and Outcome Framework (QOF) data. Finally a pre-existing case register (Glasgow Psychosis Clinical Information System (PsyCIS)) was linked to Scottish Mortality data (available from the Scottish Government Website) to investigate rates and primary cause of death in individuals with MMI. Rate and primary cause of death were compared to the local population and impact of age, socioeconomic status and ICD 10 diagnosis (schizophrenia vs. bipolar disorder) were investigated. Results: Analysis of the SPICE data found that sixteen out of the thirty two common physical comorbidities assessed, occurred significantly more frequently in individuals with schizophrenia. In individuals with bipolar disorder fourteen occurred more frequently. 
The most prevalent chronic physical health conditions in individuals with schizophrenia and bipolar disorder were: viral hepatitis (Odds Ratios (OR) 3.99 95% Confidence Interval (CI) 2.82-5.64 and OR 5.90 95% CI 3.16-11.03 respectively), constipation (OR 3.24 95% CI 3.01-3.49 and OR 2.84 95% CI 2.47-3.26 respectively) and Parkinson’s disease (OR 3.07 95% CI 2.43-3.89 and OR 2.52 95% CI 1.60-3.97 respectively). Both groups had significantly increased rates of multimorbidity compared to controls: in the schizophrenia group OR for two comorbidities was 1.37 95% CI 1.29-1.45 and in the bipolar disorder group OR was 1.34 95% CI 1.20-1.49. In the studies investigating inequalities in access to healthcare there was evidence of: under-recording of cardiovascular-related conditions for example in individuals with schizophrenia: OR for Atrial Fibrillation (AF) was 0.62 95% CI 0.52 - 0.73, for hypertension 0.71 95% CI 0.67 - 0.76, for Coronary Heart Disease (CHD) 0.76 95% CI 0.69 - 0.83 and for peripheral vascular disease (PVD) 0.83 95% CI 0.72 - 0.97. Similarly in individuals with bipolar disorder OR for AF was 0.56 95% CI 0.41-0.78, for hypertension 0.69 95% CI 0.62 - 0.77 and for CHD 0.77 95% CI 0.66 - 0.91. There was also evidence of less intensive prescribing for individuals with schizophrenia and bipolar disorder who had comorbid hypertension and CHD compared to individuals with hypertension and CHD who did not have schizophrenia or bipolar disorder. Rate of prescribing of statins for individuals with schizophrenia and CHD occurred significantly less frequently than in individuals with CHD without MMI (OR 0.67 95% CI 0.56-0.80). Rates of prescribing of 2 or more anti-hypertensives were lower in individuals with CHD and schizophrenia and CHD and bipolar disorder compared to individuals with CHD without MMI (OR 0.66 95% CI 0.56-0.78 and OR 0.55 95% CI 0.46-0.67, respectively). 
Smoking was more common in individuals with MMI compared to individuals without MMI (OR 2.53 95% CI 2.44-2.63) and was particularly increased in men (OR 2.83 95% CI 2.68-2.98). Rates of ex-smoking and non-smoking were lower in individuals with MMI (OR 0.79 95% CI 0.75-0.83 and OR 0.50 95% CI 0.48-0.52 respectively). However, recorded rates of smoking cessation advice in smokers with MMI were significantly lower than the recorded rates of smoking cessation advice in smokers with diabetes (88.7% vs. 98.0%, p<0.001), smokers with CHD (88.9% vs. 98.7%, p<0.001) and smokers with hypertension (88.3% vs. 98.5%, p<0.001) without MMI. The odds ratio of NRT prescription was also significantly lower in smokers with MMI without diabetes compared to smokers with diabetes without MMI (OR 0.75 95% CI 0.69-0.81). Similar findings were found for smokers with MMI without CHD compared to smokers with CHD without MMI (OR 0.34 95% CI 0.31-0.38) and smokers with MMI without hypertension compared to smokers with hypertension without MMI (OR 0.71 95% CI 0.66-0.76). At a national level, payment and population achievement rates for the recording of body mass index (BMI) in MMI were significantly lower than the payment and population achievement rates for BMI recording in diabetes throughout the whole of the UK combined: payment rate 92.7% (Inter Quartile Range (IQR) 89.3-95.8) vs. 95.5% IQR 93.3-97.2, p<0.001 and population achievement rate 84.0% IQR 76.3-90.0 vs. 92.5% IQR 89.7-94.9, p<0.001, and for each country individually: for example, in Scotland the payment rate was 94.0% IQR 91.4-97.2 vs. 96.3% IQR 94.3-97.8, p<0.001. The exception rate was significantly higher for the recording of BMI in MMI than the exception rate for BMI recording in diabetes for the UK combined: 7.4% IQR 3.3-15.9 vs. 2.3% IQR 0.9-4.7, p<0.001, and for each country individually. For example, in Scotland the exception rate in MMI was 11.8% IQR 5.4-19.3 compared to 3.5% IQR 1.9-6.1 in diabetes. 
Similar findings were found for Blood Pressure (BP) recording: across the whole of the UK, payment and population achievement rates for BP recording in MMI were also significantly reduced compared to payment and population achievement rates for the recording of BP in chronic kidney disease (CKD): payment rate 94.1% IQR 90.9-97.1 vs. 97.8% IQR 96.3-98.9, p<0.001 and population achievement rate 87.0% IQR 81.3-91.7 vs. 97.1% IQR 95.5-98.4, p<0.001. Exception rates again were significantly higher for the recording of BP in MMI compared to CKD (6.4% IQR 3.0-13.1 vs. 0.3% IQR 0.0-1.0, p<0.001). There was also evidence of differences in rates of recording of BMI and BP in MMI across the UK. BMI and BP recording in MMI were significantly lower in Scotland compared to England (BMI: -1.5% 99% CI -2.7 to -0.3%, p<0.001 and BP: -1.8% 99% CI -2.7 to -0.9%, p<0.001). By contrast, rates of BMI and BP recording in diabetes and CKD were similar in Scotland compared to England (BMI: -0.5 99% CI -1.0 to 0.05, p=0.004 and BP: 0.02 99% CI -0.2 to 0.3, p=0.797). Data from the PsyCIS cohort showed an increase in Standardised Mortality Ratios (SMR) across the lifespan for individuals with MMI compared to the local Glasgow and wider Scottish populations (Glasgow SMR 1.8 95% CI 1.6-2.0 and Scotland SMR 2.7 95% CI 2.4-3.1). Increasing socioeconomic deprivation was associated with an increased overall rate of death in MMI (350.3 deaths/10,000 population/5 years in the least deprived quintile compared to 794.6 deaths/10,000 population/5 years in the most deprived quintile). No significant difference in rate of death for individuals with schizophrenia compared with bipolar disorder was reported (6.3% vs. 4.9%, p=0.086), but primary cause of death varied, with higher rates of suicide in individuals with bipolar disorder (22.4% vs. 11.7%, p=0.04). 
Discussion: Local and national datasets can be used for epidemiological study to inform local practice and complement existing national and international studies. While the strengths of this thesis include the large data sets used and therefore their likely representativeness to the wider population, some limitations largely associated with using secondary data sources are acknowledged. While this thesis has confirmed evidence of increased physical health comorbidity and multimorbidity in individuals with MMI, it is likely that these findings represent a significant under reporting and likely under recognition of physical health comorbidity in this population. This is likely due to a combination of patient, health professional and healthcare system factors and requires further investigation. Moreover, evidence of inequality in access to healthcare in terms of: physical health promotion (namely smoking cessation advice), recording of physical health indices (BMI and BP), prescribing of medications for the treatment of physical illness and prescribing of NRT has been found at a national level. While significant premature mortality in individuals with MMI within a Scottish setting has been confirmed, more work is required to further detail and investigate the impact of socioeconomic deprivation on cause and rate of death in this population. It is clear that further education and training is required for all healthcare staff to improve the recognition, diagnosis and treatment of physical health problems in this population with the aim of addressing the significant premature mortality that is seen. Conclusions: Future work lies in the challenge of designing strategies to reduce health inequalities and narrow the gap in premature mortality reported in individuals with MMI. 
Models of care that allow a much more integrated approach to diagnosing, monitoring and treating both the physical and mental health of individuals with MMI, particularly in areas of social and economic deprivation, may be helpful. Strategies to engage this “hard to reach” population also need to be developed. While greater integration of psychiatric services with primary care and with specialist medical services is clearly vital, the evidence on how best to achieve this is limited. While the National Health Service (NHS) is currently undergoing major reform, attention needs to be paid to designing better ways to address the current disconnect between primary and secondary care. This should then help to improve physical, psychological and social outcomes for individuals with MMI.
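
For readers unfamiliar with the "OR ... 95% CI ..." notation used throughout the Results of this abstract, the following Python sketch shows one common way an unadjusted odds ratio and its confidence interval are obtained from a 2x2 table (the log-odds, or Woolf, method). The abstract does not say which method or adjustments the thesis used, and it reports no raw counts, so the function name `odds_ratio_ci` and the counts below are purely hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with the condition      b: exposed without it
    c: unexposed with the condition    d: unexposed without it
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the thesis.
estimate, low, high = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR {estimate:.2f} 95% CI {low:.2f}-{high:.2f}")
```

An interval that excludes 1 corresponds to a difference that is statistically significant at roughly the 5% level, which is how the intervals quoted in the abstract are usually read.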

Relevance:

100.00%

Publisher:

Abstract:

Decision making in university libraries is of great importance; however, it is complicated by the large number of data sources and the large volumes of data to be analyzed. University libraries are used to producing and collecting a great deal of information about their data and services. Common data sources are the output of internal systems, online portals and catalogs, quality assessments and surveys. Unfortunately, these data sources are only partially used for decision making because of the wide variety of formats and standards, as well as the lack of efficient methods and integration tools. This thesis project presents the analysis, design and implementation of a Data Warehouse, an integrated decision-making system for the Centro de Documentación Juan Bautista Vázquez. First, the requirements and the data analysis are presented, based on a methodology that incorporates key elements that influence a library decision, including process analysis, estimated quality, relevant information and user interaction. Next, the architecture and design of the Data Warehouse are proposed, together with its implementation, which supports the integration, processing and storage of data. Finally, the stored data are analyzed using analytical processing tools and Bibliomining techniques, helping the administrators of the documentation center to make optimal decisions about their resources and services.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The overrepresentation of novice drivers in crashes is alarming. Driver training is one of the interventions aimed at reducing the number of crashes that involve young drivers. To our knowledge, Advanced Driver Assistance Systems (ADAS) have never been comprehensively used in designing an intelligent driver training system. Currently, there is a need to develop and evaluate ADAS that could assess driving competencies. The aim is to develop an unsupervised system called Intelligent Driver Training System (IDTS) that analyzes crash risks in a given driving situation. In order to design a comprehensive IDTS, data is collected from the Driver, Vehicle and Environment (DVE), synchronized and analyzed. The first implementation phase of this intelligent driver training system deals with synchronizing multiple variables acquired from DVE. RTMaps is used to collect and synchronize data such as GPS, vehicle dynamics and driver head movement. After the data synchronization, maneuvers are segmented out as right turn, left turn and overtake. Each maneuver is composed of several individual tasks that must be performed in sequence. This paper focuses on turn maneuvers. Some of the tasks required in the analysis of a ‘turn’ maneuver are: detect the start and end of the turn, detect the indicator status change, check if the indicator was turned on within a safe distance and check the lane keeping during the turn maneuver. This paper proposes a fusion and analysis of heterogeneous data, mainly involved in driving, to determine the risk factor of particular maneuvers within the drive. It also explains the segmentation and risk analysis of the turn maneuver in a drive.
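Two of the turn-analysis tasks listed in this abstract (detecting the start and end of the turn, and checking how far before the turn the indicator was switched on) can be illustrated with a minimal sketch. This is not the authors' method: the `Sample` record, the heading-rate threshold and both function names are hypothetical, and the input is assumed to be already-synchronized samples.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One synchronized record (hypothetical layout, not from the paper)."""
    distance_m: float    # cumulative distance travelled
    heading_deg: float   # vehicle heading, 0-360
    indicator_on: bool   # turn indicator status


def find_turn(samples: List[Sample],
              min_rate_deg_per_m: float = 1.0) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the first span in which the heading
    changes at least `min_rate_deg_per_m` degrees per metre (assumed threshold)."""
    start = None
    for i in range(1, len(samples)):
        step = samples[i].distance_m - samples[i - 1].distance_m
        if step <= 0:
            continue  # skip stationary or out-of-order samples
        # Signed heading difference wrapped into [-180, 180)
        diff = (samples[i].heading_deg - samples[i - 1].heading_deg + 180) % 360 - 180
        if abs(diff) / step >= min_rate_deg_per_m:
            if start is None:
                start = i - 1
        elif start is not None:
            return (start, i - 1)
    return (start, len(samples) - 1) if start is not None else None


def indicator_lead_distance(samples: List[Sample], turn_start: int) -> Optional[float]:
    """Distance before the turn start at which the indicator came on.
    Returns None if the indicator is off when the turn starts."""
    if not samples[turn_start].indicator_on:
        return None
    i = turn_start
    while i > 0 and samples[i - 1].indicator_on:
        i -= 1
    return samples[turn_start].distance_m - samples[i].distance_m
```

The lead distance returned by `indicator_lead_distance` could then be compared against whatever safe-distance value applies; the abstract does not state one, so none is assumed here.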

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Objective: To examine the reliability of work-related activity coding for injury-related hospitalisations in Australia. Method: A random sample of 4373 injury-related hospital separations from 1 July 2002 to 30 June 2004 was obtained from a stratified random sample of 50 hospitals across 4 states in Australia. From this sample, cases were identified as work-related if they contained an ICD-10-AM work-related activity code (U73) allocated by either: (i) the original coder; (ii) an independent auditor, blinded to the original code; or (iii) a research assistant, blinded to both the original and auditor codes, who reviewed narrative text extracted from the medical record. The concordance of activity coding and the number of cases identified as work-related using each method were compared. Results: Of the 4373 cases sampled, 318 cases were identified as being work-related using any of the three methods for identification. The original coder identified 217 and the auditor identified 266 work-related cases (68.2% and 83.6% of the total cases identified, respectively). Around 10% of cases were only identified through the text description review. The original coder and auditor agreed on the assignment of work-relatedness for 68.9% of cases. Conclusions and Implications: The current best estimates of the frequency of hospital admissions for occupational injury underestimate the burden by around 32%. This is a substantial underestimate that has major implications for public policy, and highlights the need for further work on improving the quality and completeness of routine, administrative data sources for a more complete identification of work-related injuries.
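The percentages reported in this abstract follow directly from the stated counts, as the short check below shows. Only the counts 318, 217 and 266 come from the abstract; the helper name `share` is illustrative.

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


total_work_related = 318   # cases identified by any of the three methods
original_coder = 217       # cases identified by the original coder
auditor = 266              # cases identified by the independent auditor

original_pct = share(original_coder, total_work_related)   # 68.2
auditor_pct = share(auditor, total_work_related)           # 83.6
# Cases the original coding missed, i.e. the "around 32%" underestimate
missed_pct = share(total_work_related - original_coder, total_work_related)  # 31.8
```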

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We consider the problem of designing a surveillance system to detect a broad range of invasive species across a heterogeneous sampling frame. We present a model to detect a range of invertebrate invasives whilst addressing the challenges of multiple data sources, stratifying for differential risk, managing labour costs and providing sufficient power of detection. We determine the number of detection devices required and their allocation across the landscape within limiting resource constraints. The resulting plan will lead to reduced financial and ecological costs and an optimal surveillance system.
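The abstract does not give the model itself. As a rough illustration of how a device count relates to power of detection, the sketch below uses a standard simplification that is an assumption here, not the authors' method: each device detects an invader independently with the same probability `p`, so the power of `n` devices is `1 - (1 - p)**n`.

```python
import math


def devices_needed(p_detect: float, target_power: float) -> int:
    """Smallest n with 1 - (1 - p_detect)**n >= target_power,
    assuming independent devices with equal detection probability."""
    if not (0 < p_detect < 1 and 0 < target_power < 1):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - target_power) / math.log(1 - p_detect))


def power(p_detect: float, n_devices: int) -> float:
    """Probability that at least one of n independent devices detects."""
    return 1 - (1 - p_detect) ** n_devices
```

For example, with a hypothetical per-device detection probability of 0.1, `devices_needed(0.1, 0.95)` gives 29 devices. Stratifying by risk, as the abstract describes, would mean applying a different `p_detect` in each stratum.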

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Objective: To quantify the extent to which alcohol-related injuries are adequately identified in hospitalisation data using ICD-10-AM codes indicative of alcohol involvement. Method: A random sample of 4373 injury-related hospital separations from 1 July 2002 to 30 June 2004 was obtained from a stratified random sample of 50 hospitals across 4 states in Australia. From this sample, cases were identified as involving alcohol if they contained an ICD-10-AM diagnosis or external cause code referring to alcohol, or if the text description extracted from the medical records mentioned alcohol involvement. Results: Overall, identification of alcohol involvement using ICD codes detected 38% of the alcohol-related sample, whilst almost 94% of alcohol-related cases were identified through a search of the text extracted from the medical records. The resultant estimate of alcohol involvement in injury-related hospitalisations in this sample was 10%. Emergency department records were the most likely to identify whether the injury was alcohol-related, with almost three-quarters of alcohol-related cases mentioning alcohol in the text abstracted from these records. Conclusions and Implications: The current best estimates of the frequency of hospital admissions where alcohol is involved prior to the injury underestimate the burden by around 62%. This is a substantial underestimate that has major implications for public policy, and highlights the need for further work on improving the quality and completeness of routine administrative data sources for identification of alcohol-related injuries.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: International data on child maltreatment are largely derived from child protection agencies, and predominantly report only substantiated cases of child maltreatment. This approach underestimates the incidence of maltreatment and makes inter-jurisdictional comparisons difficult. There has been a growing recognition of the importance of health professionals in identifying, documenting and reporting suspected child maltreatment. This study aimed to describe the issues around case identification using coded morbidity data, outline methods for selecting and grouping relevant codes, and illustrate patterns of maltreatment identified. Methods: A comprehensive review of the ICD-10-AM classification system was undertaken, including review of index terms, a free text search of tabular volumes, and a review of coding standards pertaining to child maltreatment coding. Identified codes were further categorised into maltreatment types including physical abuse, sexual abuse, emotional or psychological abuse, and neglect. Using these code groupings, one year of Australian hospitalisation data for children under 18 years of age was examined to quantify the proportion of patients identified and to explore the characteristics of cases assigned maltreatment-related codes. Results: Fewer than 0.5% of children hospitalised in Australia between 2005 and 2006 had a maltreatment code assigned; almost 4% of children with a principal diagnosis of a mental and behavioural disorder and over 1% of children with an injury or poisoning as the principal diagnosis had a maltreatment code assigned. The patterns of children assigned with definitive T74 codes varied by sex and age group. For males selected as having a maltreatment-related presentation, physical abuse was most commonly coded (62.6% of maltreatment cases) while for females selected as having a maltreatment-related presentation, sexual abuse was the most commonly assigned form of maltreatment (52.9% of maltreatment cases).
Conclusion: This study has demonstrated that hospital data could provide valuable information for routine monitoring and surveillance of child maltreatment, even in the absence of population-based linked data sources. With national and international calls for a public health response to child maltreatment, better understanding of, investment in and utilisation of our core national routinely collected data sources will enhance the evidence base needed to support an appropriate response to children at risk.