Digital Library

867 results for data-basemanagement

Data distribution and task scheduling for distributed computing of all-to-all comparison problems

Relevance: 20.00%

Abstract:

This research studied distributed computing of all-to-all comparison problems with big data sets. The thesis formalised the problem, and developed a high-performance and scalable computing framework with a programming model, data distribution strategies and task scheduling policies to solve the problem. The study considered storage usage, data locality and load balancing for performance improvement in solving the problem. The research outcomes can be applied in bioinformatics, biometrics, data mining and other domains in which all-to-all comparisons are a typical computing pattern.

Making the most of spatial information in health: A tutorial in Bayesian disease mapping for areal data

Relevance: 20.00%

Abstract:

Disease maps are effective tools for explaining and predicting patterns of disease outcomes across geographical space, identifying areas of potentially elevated risk, and formulating and validating aetiological hypotheses for a disease. Bayesian models have become a standard approach to disease mapping in recent decades. This article aims to provide a basic understanding of the key concepts involved in Bayesian disease mapping methods for areal data. It is anticipated that this will help in the interpretation of published maps, and provide a useful starting point for anyone interested in running disease mapping methods for areal data. The article provides detailed motivation for and descriptions of disease mapping methods by explaining the concepts, defining the technical terms, and illustrating the utility of disease mapping for epidemiological research by demonstrating various ways of visualising model outputs using a case study. The target audience includes spatial scientists in health and other fields, policy or decision makers, health geographers, spatial analysts, public health professionals, and epidemiologists.

Shifting centres: Pedagogical relations in the Era of Big Data

Relevance: 20.00%

Abstract:

This paper presents a cautious argument for re-thinking both the nature and the centrality of the one-to-one teacher/student relationship in contemporary pedagogy. A case is made that learning in and for our times requires us to broaden our understanding of pedagogical relations beyond the singularity of the teacher/student binary and to promote the connected teacher as better placed to lead learning for these times. The argument proceeds in three parts: first, a characterization of our times as defined increasingly by the digital knowledge explosion of Big Data; second, a re-thinking of the nature of pedagogical relationships in the context of Big Data; and third, an account of the ways in which leaders can support their teachers to become more effective in leading learning by being more closely connected to their professional colleagues.

Using Big Data to manage safety-related risk in the upstream oil and gas industry: A research agenda

Relevance: 20.00%

Abstract:

Despite considerable effort and a broad range of new approaches to safety management over the years, the upstream oil & gas industry has been frustrated by the sector’s stubbornly high rate of injuries and fatalities. This short communication points out, however, that the industry may be in a position to make considerable progress by applying “Big Data” analytical tools to the large volumes of safety-related data that have been collected by these organizations. Toward making this case, we examine existing safety-related information management practices in the upstream oil & gas industry, and specifically note that data in this sector often tends to be highly customized, difficult to analyze using conventional quantitative tools, and frequently ignored. We then contend that the application of new Big Data kinds of analytical techniques could potentially reveal patterns and trends that have been hidden or unknown thus far, and argue that these tools could help the upstream oil & gas sector to improve its injury and fatality statistics. Finally, we offer a research agenda toward accelerating the rate at which Big Data and new analytical capabilities could play a material role in helping the industry to improve its health and safety performance.

A data fusion approach of multiple maintenance data sources for real-world reliability modelling

Relevance: 20.00%

Abstract:

A central tenet in the theory of reliability modelling is the quantification of the probability of asset failure. In general, reliability depends on asset age and the maintenance policy applied. Usually, failure and maintenance times are the primary inputs to reliability models. However, for many organisations, different aspects of these data are often recorded in different databases (e.g. work order notifications, event logs, condition monitoring data, and process control data). These recorded data cannot be interpreted individually, since they typically do not have all the information necessary to ascertain failure and preventive maintenance times. This paper presents a methodology for the extraction of failure and preventive maintenance times using commonly-available, real-world data sources. A text-mining approach is employed to extract keywords indicative of the source of the maintenance event. Using these keywords, a Naïve Bayes classifier is then applied to attribute each machine stoppage to one of two classes: failure or preventive. The accuracy of the algorithm is assessed and the classified failure time data are then presented. The applicability of the methodology is demonstrated on a maintenance data set from an Australian electricity company.
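The classification step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword lists, the training records, and the function names `train` and `predict` are all hypothetical, and a plain multinomial Naïve Bayes with add-one smoothing is assumed.

```python
import math
from collections import Counter, defaultdict

def train(records):
    """Count classes and keywords from (keywords, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for keywords, label in records:
        class_counts[label] += 1
        word_counts[label].update(keywords)
        vocabulary.update(keywords)
    return class_counts, word_counts, vocabulary

def predict(model, keywords):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_records = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_records)
        total_words = sum(word_counts[label].values())
        for word in keywords:
            if word not in vocabulary:
                continue  # ignore keywords never seen in training
            # Add-one (Laplace) smoothed keyword likelihood.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical keywords extracted from work-order text (assumption).
records = [
    (["breakdown", "trip", "fault"], "failure"),
    (["leak", "fault", "repair"], "failure"),
    (["scheduled", "inspection", "service"], "preventive"),
    (["planned", "service", "lubrication"], "preventive"),
]
model = train(records)
stoppage_label = predict(model, ["fault", "repair"])
```

In practice the keywords would come from the text-mining step and the labelled records from manually reviewed stoppages; both are stand-ins here.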

A network forensics tool for precise data packet capture and replay in cyber-physical systems

Relevance: 20.00%

Abstract:

Network data packet capture and replay capabilities are basic requirements for forensic analysis of faults and security-related anomalies, as well as for testing and development. Cyber-physical networks, in which data packets are used to monitor and control physical devices, must operate within strict timing constraints, in order to match the hardware devices' characteristics. Standard network monitoring tools are unsuitable for such systems because they cannot guarantee to capture all data packets, may introduce their own traffic into the network, and cannot reliably reproduce the original timing of data packets. Here we present a high-speed network forensics tool specifically designed for capturing and replaying data traffic in Supervisory Control and Data Acquisition systems. Unlike general-purpose "packet capture" tools it does not affect the observed network's data traffic and guarantees that the original packet ordering is preserved. Most importantly, it allows replay of network traffic precisely matching its original timing. The tool was implemented by developing novel user interface and back-end software for a special-purpose network interface card. Experimental results show a clear improvement in data capture and replay capabilities over standard network monitoring methods and general-purpose forensics solutions.

Enhancing ethical data translation in educational qualitative research

Relevance: 20.00%

Abstract:

Many educational researchers conducting studies in non-English speaking settings attempt to report on their project in English to boost their scholarly impact. This requires preparing and presenting translations of data collected from interviews and observations. This paper discusses the process and ethical considerations involved in this invisible methodological phase. The process includes activities prior to data analysis and to its presentation to be undertaken by the bilingual researcher as translator in order to convey participants’ original meanings as well as to establish and fulfil translation ethics. This paper offers strategies to address such issues; the most appropriate translation method for qualitative study; and approaches to address political issues when presenting such data.

Statistical analysis of spectral data: A methodology for designing an intelligent monitoring system for the diabetic foot

Relevance: 20.00%

Abstract:

Early detection of (pre-)signs of ulceration on a diabetic foot is valuable for clinical practice. Hyperspectral imaging is a promising technique for detection and classification of such (pre-)signs. However, the number of the spectral bands should be limited to avoid overfitting, which is critical for pixel classification with hyperspectral image data. The goal was to design a detector/classifier based on spectral imaging (SI) with a small number of optical bandpass filters. The performance and stability of the design were also investigated. The selection of the bandpass filters boils down to a feature selection problem. A dataset was built, containing reflectance spectra of 227 skin spots from 64 patients, measured with a spectrometer. Each skin spot was annotated manually by clinicians as "healthy" or a specific (pre-)sign of ulceration. Statistical analysis on the data set showed the number of required filters is between 3 and 7, depending on additional constraints on the filter set. The stability analysis revealed that shot noise was the most critical factor affecting the classification performance. It indicated that this impact could be avoided in future SI systems with a camera sensor whose saturation level is higher than 10^6, or by postimage processing.

Evaluating the specificity of community injury hospitalization data over time

Relevance: 20.00%

Abstract:

This study identified the areas of poor specificity in national injury hospitalization data and the areas of improvement and deterioration in specificity over time. A descriptive analysis of ten years of national hospital discharge data for Australia from July 2002 to June 2012 was performed. Proportions and percentage changes of defined/undefined codes over time were examined. At the intent block level, accidents and assault were the most poorly defined, with over 11% undefined in each block. The mechanism blocks for accidents showed a significant deterioration in specificity over time, with up to 20% more undefined codes in some mechanisms. Place and activity were poorly defined at the broad block level (43% and 72% undefined respectively). Private hospitals and hospitals in very remote locations recorded the highest proportion of undefined codes. Those aged over 60 years and females had a higher proportion of undefined code usage. This study has identified significant, and worsening, deficiencies in the specificity of coded injury data in several areas. Focal attention is needed to improve the quality of injury data, especially in the areas identified in this study, to provide the evidence base needed to address the significant burden of injury in the Australian community.

Meltdown - a tool for classification and analysis of DSF data

Relevance: 20.00%

Abstract:

An application that translates raw thermal melt curve data into more easily assimilated knowledge is described. This program, called ‘Meltdown’, performs a number of data remediation steps before classifying melt curves and estimating melting temperatures. The final output is a report that summarizes the results of a differential scanning fluorimetry experiment. Meltdown uses a Bayesian classification scheme, enabling reproducible identification of various trends commonly found in DSF datasets. The goal of Meltdown is not to replace human analysis of the raw data, but to provide a sensible interpretation of the data to make this useful experimental technique accessible to naïve users, as well as providing a starting point for detailed analyses by more experienced users.

Tools for compositional data with a total

Relevance: 20.00%

Abstract:

Compositional data analysis usually deals with relative information between parts where the total (abundances, mass, amount, etc.) is unknown or uninformative. This article addresses the question of what to do when the total is known and is of interest. Tools used in this case are reviewed and analysed, in particular the relationship between the positive orthant of D-dimensional real space, the product space of the real line times the D-part simplex, and their Euclidean space structures. The first alternative corresponds to data analysis taking logarithms on each component, and the second one to treat a log-transformed total jointly with a composition describing the distribution of component amounts. Real data about total abundances of phytoplankton in an Australian river motivated the present study and are used for illustration.
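The two alternatives named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names and the sample abundances are hypothetical, and the sketch shows the two coordinate representations only, not the Euclidean space structures the article analyses.

```python
import math

def log_components(amounts):
    """First alternative: take the logarithm of each positive component."""
    return [math.log(a) for a in amounts]

def total_and_composition(amounts):
    """Second alternative: a log-transformed total together with the
    composition (parts rescaled to sum to 1)."""
    total = sum(amounts)
    composition = [a / total for a in amounts]
    return math.log(total), composition

def reconstruct(log_total, composition):
    """Recover the original amounts from the second representation."""
    total = math.exp(log_total)
    return [total * part for part in composition]

# Hypothetical abundances of three parts (assumption, not from the study).
amounts = [12.0, 30.0, 58.0]

logs = log_components(amounts)
log_total, composition = total_and_composition(amounts)
recovered = reconstruct(log_total, composition)
```

Both representations carry the same information for strictly positive data; the second keeps the total as an explicit variable, which is the case the article is concerned with.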

On the integration of self-tracking data amongst quantified self members

Relevance: 20.00%

Abstract:

Self-tracking, the process of recording one's own behaviours, thoughts and feelings, is a popular approach to enhance one's self-knowledge. While dedicated self-tracking apps and devices support data collection, previous research highlights that the integration of data constitutes a barrier for users. In this study we investigated how members of the Quantified Self movement (early adopters of self-tracking tools) overcome these barriers. We conducted a qualitative analysis of 51 videos of Quantified Self presentations to explore intentions for collecting data, methods for integrating and representing data, and how intentions and methods shaped reflection. The findings highlight two different intentions (striving for self-improvement and curiosity in personal data), which shaped how these users integrated data, i.e. the effort required. Furthermore, we identified three methods for representing data (binary, structured and abstract), which influenced reflection. Binary representations supported reflection-in-action, whereas structured and abstract representations supported iterative processes of data collection, integration and reflection. For people tracking out of curiosity, this iterative engagement with personal data often became an end in itself, rather than a means to achieve a goal. We discuss how these findings contribute to our current understanding of self-tracking amongst Quantified Self members and beyond, and we conclude with directions for future work to support self-trackers with their aspirations.

The logic of data-sense: Thinking through Learning Personalisation

Relevance: 20.00%

Abstract:

Big Data and Learning Analytics’ promise to revolutionise educational institutions, endeavours, and actions through more and better data is now compelling. Multiple, and continually updating, data sets produce a new sense of ‘personalised learning’. A crucial attribute of the datafication, and subsequent profiling, of learner behaviour and engagement is the continual modification of the learning environment to induce greater levels of investment on the parts of each learner. The assumption is that more and better data, gathered faster and fed into ever-updating algorithms, provide more complete tools to understand, and therefore improve, learning experiences through adaptive personalisation. The argument in this paper is that Learning Personalisation names a new logistics of investment as the common ‘sense’ of the school, in which disciplinary education is ‘both disappearing and giving way to frightful continual training, to continual monitoring’.

Reputation model based on rating data and application in recommender systems

Relevance: 20.00%

Abstract:

This thesis introduced two novel reputation models to generate accurate item reputation scores using ratings data and the statistics of the dataset. It also presented an innovative method that incorporates reputation awareness in recommender systems by employing voting system methods to produce more accurate top-N item recommendations. Additionally, this thesis introduced a personalisation method for generating reputation scores based on users' interests, where a single item can have different reputation scores for different users. The personalised reputation scores are then used in the proposed reputation-aware recommender systems to enhance the recommendation quality.

Principals of audit: Testing, data and ‘implicated advocacy’

Relevance: 20.00%

Abstract:

Historically, school leaders have occupied a somewhat ambiguous position within networks of power. On the one hand, they appear to be celebrated as what Ball (2003) has termed the ‘new hero of educational reform’; on the other, they are often ‘held to account’ through those same performative processes and technologies. These have become compelling in schools and principals are ‘doubly bound’ through this. Adopting a Foucauldian notion of discursive production, this paper addresses the ways that the discursive ‘field’ of ‘principal’ (within larger regimes of truth such as schools, leadership, quality and efficiency) is produced. It explores how individual principals understand their roles and ethics within those practices of audit emerging in school governance, and how their self-regulation is constituted through NAPLAN – the National Assessment Program, Literacy and Numeracy. A key effect of NAPLAN has been the rise of auditing practices that change how education is valued. Open-ended interviews with 13 primary and secondary school principals from Western Australia, South Australia and New South Wales asked how they perceived NAPLAN's impact on their work, their relationships within their school community and their ethical practice.
