970 results for data collections
Abstract:
The reliance on police data for the counting of road crash injuries can be problematic, as it is well known that not all road crash injuries are reported to police, which leads to under-estimation of the overall burden of road crash injuries. The aim of this study was to use multiple linked data sources to estimate the extent of under-reporting of road crash injuries to police in the Australian state of Queensland. Data from the Queensland Road Crash Database (QRCD), the Queensland Hospital Admitted Patients Data Collection (QHAPDC), Emergency Department Information System (EDIS), and the Queensland Injury Surveillance Unit (QISU) for the year 2009 were linked. The completeness of road crash cases reported to police was examined via discordance rates between the police data (QRCD) and the hospital data collections. In addition, the potential bias of this discordance (under-reporting) was assessed based on gender, age, road user group, and regional location. Results showed that the level of under-reporting varied depending on the data set with which the police data was compared. When all hospital data collections were examined together, the estimated population of road crash injuries was approximately 28,000, with around two-thirds not linking to any record in the police data. The results also showed that under-reporting was more likely for motorcyclists, cyclists, males, young people, and injuries occurring in Remote and Inner Regional areas. These results have important implications for road safety research and policy in terms of: prioritising funding and resources; targeting road safety interventions into areas of higher risk; and estimating the burden of road crash injuries.
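The discordance rate described in this abstract can be illustrated with a minimal sketch. It assumes that linkage has already produced a set of hospital case identifiers and the subset that linked to a police record; the function names, identifiers and road user groups below are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def discordance_rate(hospital_ids, police_linked_ids):
    """Share of hospital-identified cases with no linked police record."""
    hospital = set(hospital_ids)
    if not hospital:
        raise ValueError("no hospital cases supplied")
    unlinked = hospital - set(police_linked_ids)
    return len(unlinked) / len(hospital)


def discordance_by_group(cases, police_linked_ids):
    """Discordance rate per group (e.g. road user group) for (id, group) pairs."""
    linked = set(police_linked_ids)
    totals = defaultdict(int)
    unlinked = defaultdict(int)
    for case_id, group in cases:
        totals[group] += 1
        if case_id not in linked:
            unlinked[group] += 1
    return {group: unlinked[group] / totals[group] for group in totals}


# Hypothetical toy records, not study data.
cases = [("p1", "cyclist"), ("p2", "driver"), ("p3", "cyclist"),
         ("p4", "driver"), ("p5", "driver"), ("p6", "cyclist")]
linked_to_police = {"p2", "p5"}

overall = discordance_rate([case_id for case_id, _ in cases], linked_to_police)
by_group = discordance_by_group(cases, linked_to_police)
```

Comparing the per-group rates against the overall rate is one simple way to look for the kind of bias the abstract reports across road user groups.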
Abstract:
Supported file formats:
- CrossRef XML file(s)
- TRiDaS (Tree Ring Data Standard, http://www.tridas.org). Example: hdl:10013/epic.42747.d001
- IMMA (International Maritime Meteorological Archive). Used by the project CLIWOC (García-Herrera et al. 2007, http://doi.pangaea.de/10.1594/PANGAEA.743343)
- NOAA IOAS (International Ocean Atlas Series). Example: hdl:10013/epic.42747.d008
- SOCAT (Surface Ocean CO2 Atlas, Bakker et al. 2014, http://doi.pangaea.de/10.1594/PANGAEA.811776)
- CHUAN (Comprehensive Historical Upper-Air Network, Stickler et al. 2013, http://doi.pangaea.de/10.1594/PANGAEA.821222). Example: hdl:10013/epic.42747.d003
- Thermosalinograph (TSG) data. Format developed by Gerd Rohardt. Example: hdl:10013/epic.42747.d002
- Columbus GPS Data Logger V-900 format to KML or GPX. Example: hdl:10013/epic.42747.d006
Abstract:
In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. To this end, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods to formulate web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is also proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Experiments based on several real-world data collections demonstrate that WebPut outperforms existing approaches.
Abstract:
In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. To this end, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods to formulate web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Moreover, several optimization techniques are proposed to reduce the cost of estimating the confidence of imputation queries at both the tuple level and the database level. Experiments based on several real-world data collections demonstrate not only the effectiveness of WebPut compared to existing approaches, but also the efficiency of our proposed algorithms and optimization techniques.
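The abstract does not spell out the scheduling algorithm, so the following is only a rough sketch of what a confidence-driven greedy schedule could look like, not WebPut's actual method. It assumes a caller-supplied confidence estimator and query runner; `greedy_schedule`, `toy_confidence`, `toy_run_query` and the scores in `BASE` are all hypothetical.

```python
def greedy_schedule(missing_cells, queries, confidence, run_query):
    """Fill cells one at a time, always taking the most confident (cell, query) pair."""
    remaining = list(missing_cells)
    filled = {}   # cell -> imputed value
    order = []    # (cell, query) pairs in the order they were issued
    while remaining:
        # Re-estimate confidence each round, since earlier imputations may
        # change how reliable the remaining queries are.
        _, cell, query = max(
            ((confidence(c, q, filled), c, q) for c in remaining for q in queries),
            key=lambda item: item[0],
        )
        filled[cell] = run_query(cell, query, filled)
        order.append((cell, query))
        remaining.remove(cell)
    return filled, order


# Toy stand-ins (hypothetical): fixed base scores, plus a boost for cell "a"
# once cell "b" is known.
BASE = {("a", "q1"): 0.2, ("a", "q2"): 0.5,
        ("b", "q1"): 0.9, ("b", "q2"): 0.1,
        ("c", "q1"): 0.3, ("c", "q2"): 0.4}


def toy_confidence(cell, query, filled):
    boost = 0.5 if cell == "a" and "b" in filled else 0.0
    return BASE[(cell, query)] + boost


def toy_run_query(cell, query, filled):
    # Placeholder for issuing a web search query and extracting a value.
    return f"{cell}-via-{query}"


filled, order = greedy_schedule(["a", "b", "c"], ["q1", "q2"],
                                toy_confidence, toy_run_query)
```

In this toy run, filling "b" first raises the confidence for "a", which is the kind of ordering effect a schedule of this sort is meant to exploit.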
Abstract:
This program of research linked police and health data collections to investigate the potential benefits for road safety in terms of enhancing the quality of data. This research has important implications for road safety because, although police-collected data has historically underpinned efforts in the area, it is known that many road crashes are not reported to police and that these data lack specific injury severity information. This research shows that data linkage provides a more accurate quantification of the severity and prevalence of road crash injuries, which is essential for prioritising funding, targeting interventions, and estimating the burden and cost of road trauma.
Abstract:
On 19 June 2015, representatives from over 40 Australian research institutions gathered in Canberra to launch their Open Data Collections. The one-day event, hosted by the Australian National Data Service (ANDS), showcased to government and a range of national stakeholders the rich variety of data collections that have been generated through the Major Open Data Collections (MODC) project. Colin Eustace attended the showcase for QUT Library and presented a poster that reflected the work that he and Jodie Vaughan generated through the project. QUT’s Blueprint 4, the University’s five-year institutional strategic plan, outlines the key priorities of developing a commitment to working in partnership with industry, as well as combining disciplinary strengths with interdisciplinary application. The Division of Technology, Information and Learning Support (TILS) has undertaken a number of ANDS-funded projects since 2009 with the aim of developing improved research data management services within the University to support these strategic aims. By leveraging existing tools and systems developed during these projects, the MODC project delivered support to multi-disciplinary collaborative research activities through partnership building between QUT researchers and Queensland government agencies, in order to add to and promote the discovery and reuse of a collection of spatially referenced datasets. The MODC project built upon existing Research Data Finder infrastructure (which uses VIVO open source software, developed by Cornell University) to develop a separate collection, Spatial Data Finder (https://researchdatafinder.qut.edu.au/spatial), as the interface to display the spatial data collection. During the course of the project, 62 dataset descriptions were added to Spatial Data Finder, seven to Research Data Finder and two to Software Finder, another separate collection.
The project team met with 116 individual researchers and attended 13 school and faculty meetings to promote the MODC project and raise awareness of the Library’s services and resources for research data management.
Abstract:
Replication Data Management (RDM) aims at enabling the use of data collections from several iterations of an experiment. However, there are several major challenges to RDM from integrating data models and data from empirical study infrastructures that were not designed to cooperate, e.g., data model variation of local data sources. [Objective] In this paper we analyze RDM needs and evaluate conceptual RDM approaches to support replication researchers. [Method] We adapted the ATAM evaluation process to (a) analyze RDM use cases and needs of empirical replication study research groups and (b) compare three conceptual approaches to address these RDM needs: central data repositories with a fixed data model, heterogeneous local repositories, and an empirical ecosystem. [Results] While the central and local approaches have major issues that are hard to resolve in practice, the empirical ecosystem allows bridging current gaps in RDM from heterogeneous data sources. [Conclusions] The empirical ecosystem approach should be explored in diverse empirical environments.
Abstract:
The program PanTool was developed as a toolbox, like a Swiss Army knife, for data conversion and recalculation, written to harmonize individual data collections to the standard import format used by PANGAEA. The input files PanTool needs are tables saved in plain ASCII. The user can create these files with a spreadsheet program like MS-Excel or with the system text editor. PanTool is distributed as freeware for the operating systems Microsoft Windows, Apple OS X and Linux.
Abstract:
Background: The impact of cancer upon children, teenagers and young people can be profound. Research has been undertaken to explore the impacts upon children, teenagers and young people with cancer, but little is known about how researchers can ‘best’ engage with this group to explore their experiences. This review paper provides an overview of the utility of data collection methods employed when undertaking research with children, teenagers and young people. A systematic review of relevant databases was undertaken utilising the search terms ‘young people’, ‘young adult’, ‘adolescent’ and ‘data collection methods’. The full text of the papers that were deemed eligible from the title and abstract was accessed and, following discussion within the research team, thirty papers were included. Findings: Due to the heterogeneity in the scope of the papers identified, the following data collection methods were included in the results section. Three of the papers identified provided an overview of data collection methods utilised with this population, and the remaining twenty-seven papers covered the following data collection methods: digital technologies; art-based research; comparison of ‘paper and pencil’ research with web-based technologies; the use of games; the use of a specific communication tool; questionnaires and interviews; focus groups and telephone interviews/questionnaires. The strengths and limitations of the range of data collection methods included are discussed, drawing upon such issues as the appropriateness of particular methods for particular age groups, or the most appropriate method to employ when exploring a particularly sensitive topic area. Conclusions: There are a number of data collection methods utilised to undertake research with children, teenagers and young adults. This review provides a summary of the current available evidence and an overview of the strengths and limitations of data collection methods employed.
Abstract:
Over the years, people have often held the hypothesis that negative feedback should be very useful for improving the performance of information filtering systems; however, we have not obtained very effective models to support this hypothesis. This paper proposes an effective model that uses negative relevance feedback based on a pattern mining approach to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three groups (the positive specific category, the general category and the negative specific category) so that their weights can be updated easily. An iterative algorithm is also proposed to implement this approach on RCV1 data collections, and substantial experiments show that the proposed approach achieves encouraging performance.
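The idea of offender documents can be sketched in a few lines, under loudly stated assumptions: documents are bags of terms, the current positive model is a term-to-weight mapping, and a negative document counts as an offender when it ranks among the top k negatives by positive score. The scoring function, the value of k and the sample data are hypothetical and do not reproduce the paper's pattern mining model.

```python
def positive_score(doc_terms, weights):
    """Sum the weights of the distinct positive features found in a document."""
    return sum(weights.get(term, 0.0) for term in set(doc_terms))


def select_offenders(negative_docs, weights, k):
    """Return up to k negative documents that look most like positive ones."""
    ranked = sorted(negative_docs,
                    key=lambda doc: positive_score(doc, weights),
                    reverse=True)
    # Negatives that match no positive feature cannot be mistaken for positives.
    return [doc for doc in ranked[:k] if positive_score(doc, weights) > 0]


# Hypothetical feature weights and negative documents.
weights = {"crash": 2.0, "injury": 1.5, "police": 1.0}
negatives = [
    ["crash", "injury", "sport"],   # score 3.5
    ["weather", "rain"],            # score 0.0
    ["police", "sport"],            # score 1.0
]

offenders = select_offenders(negatives, weights, k=2)
```

Restricting feature revision to these offenders is what reduces the space of negative examples: documents that share nothing with the positive features are ignored.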
Abstract:
The Australian National Data Service (ANDS) was established in 2008 and aims to: influence national policy in the area of data management in the Australian research community; inform best practice for the curation of data; and transform the disparate collections of research data around Australia into a cohesive collection of research resources. One high-profile ANDS activity is to establish the population of Research Data Australia, a set of web pages describing data collections produced by or relevant to Australian researchers. It is designed to promote visibility of research data collections in search engines, in order to encourage their re-use. As part of activities associated with ANDS, an increasing number of Australian universities are choosing to implement VIVO, not as a platform to profile information about researchers, but as a 'metadata store' platform to profile information about institutional research data sets, both locally and as part of a national data commons. To date, the University of Melbourne, Griffith University, the Queensland University of Technology, and the University of Western Australia have all chosen to implement VIVO, with interest from other universities growing.
Abstract:
This paper presents a summary of the key findings of the TTF TPACK Survey developed and administered for the Teaching the Teachers for the Future (TTF) Project implemented in 2011. The TTF Project, funded by an Australian Government ICT Innovation Fund grant, involved all 39 Australian Higher Education Institutions which provide initial teacher education. TTF data collections were undertaken at the end of Semester 1 (T1) and at the end of Semester 2 (T2) in 2011. A total of 12881 participants completed the first survey (T1) and 5809 participants completed the second survey (T2). Groups of like-named items from the T1 survey were subject to a battery of complementary data analysis techniques. The psychometric properties of the four scales (Confidence - teacher items; Usefulness - teacher items; Confidence - student items; Usefulness - student items) were confirmed both at T1 and T2. Among the key findings summarised, at the national level, the scale Confidence to use ICT as a teacher showed measurable growth across the whole scale from T1 to T2, and the scale Confidence to facilitate student use of ICT also showed measurable growth across the whole scale from T1 to T2. Additional key TTF TPACK Survey findings are summarised.