Digital Library

856 results for ADMINISTRATIVE DATA

A principled experimental design approach to Big Data analysis

Relevance: 20.00%

Abstract:

Big Datasets are endemic, but they are often notoriously difficult to analyse because of their size, heterogeneity, history and quality. The purpose of this paper is to open a discourse on the use of modern experimental design methods to analyse Big Data in order to answer particular questions of interest. By appealing to a range of examples, it is suggested that this perspective on Big Data modelling and analysis has wide generality and advantageous inferential and computational properties. In particular, the principled experimental design approach is shown to provide a flexible framework for analysis that, for certain classes of objectives and utility functions, delivers near equivalent answers compared with analyses of the full dataset under a controlled error rate. It can also provide a formalised method for iterative parameter estimation, model checking, identification of data gaps and evaluation of data quality. Finally, it has the potential to add value to other Big Data sampling algorithms, in particular divide-and-conquer strategies, by determining efficient sub-samples.

Collaborative data exploration interfaces - From participatory sensing to participatory sensemaking

Relevance: 20.00%

Abstract:

As technological capabilities for capturing, aggregating, and processing large quantities of data continue to improve, the question becomes how to effectively utilise these resources. Whenever automatic methods fail, it is necessary to rely on human background knowledge, intuition, and deliberation. This creates demand for data exploration interfaces that support the analytical process, allowing users to absorb and derive knowledge from data. Such interfaces have historically been designed for experts. However, existing research has shown promise in involving a broader range of users that act as citizen scientists, placing high demands in terms of usability. Visualisation is one of the most effective analytical tools for humans to process abstract information. Our research focuses on the development of interfaces to support collaborative, community-led inquiry into data, which we refer to as Participatory Data Analytics. The development of data exploration interfaces to support independent investigations by local communities around topics of their interest presents a unique set of challenges, which we discuss in this paper. We present our preliminary work towards suitable high-level abstractions and interaction concepts to allow users to construct and tailor visualisations to their own needs.

Medical Data Access Accountability in EHR Systems, A Practical Perspective

Relevance: 20.00%

Abstract:

The amount of available data has grown sharply worldwide, which calls for better and more specialized tools for data storage, retrieval, and information privacy. Electronic Health Record (EHR) systems have recently emerged to meet this need in health care. They play an important role in medicine by granting access to information that can be used in medical diagnosis. Traditional systems focus on the storage and retrieval of this information and usually leave privacy issues in the background. Doctors and patients may have different objectives when using an EHR system: patients try to restrict sensitive information in their medical records to avoid its misuse, while doctors want to see as much information as possible to ensure a correct diagnosis. One solution to this dilemma is the Accountable e-Health model, an access protocol model based on the Information Accountability Protocol. In this model, patients are notified when doctors access their restricted data, and authenticated doctors are given non-restrictive access. In this work we take FluxMED, an EHR system, and augment it with aspects of the Information Accountability Protocol to address these issues. The implementation of the Information Accountability Framework (IAF) in FluxMED provides ways for both patients and physicians to have their privacy and access needs met. Storage and data security are handled by FluxMED, which contains mechanisms to ensure security and data integrity. The effort required to develop a platform for the management of medical information is mitigated by FluxMED's workflow-based architecture: the system is flexible enough to allow the type and amount of information to be altered without changes to its source code.
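The access rule described in this abstract (authenticated doctors may read everything, and the patient is told whenever restricted data is read) can be illustrated with a minimal sketch. The class names, field names, and in-memory notification list below are hypothetical and are not taken from FluxMED or the Information Accountability Framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessEvent:
    """One entry in the audit trail (hypothetical structure)."""
    doctor: str
    record_field: str
    restricted: bool
    timestamp: datetime


@dataclass
class PatientRecord:
    """Toy record: access is never blocked for authenticated doctors,
    but every read is logged and restricted reads notify the patient."""
    patient: str
    data: dict
    restricted_fields: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)
    notifications: list = field(default_factory=list)

    def read(self, doctor: str, authenticated: bool, record_field: str):
        # Assumption: authentication is decided elsewhere and passed in.
        if not authenticated:
            raise PermissionError("doctor must be authenticated")
        restricted = record_field in self.restricted_fields
        self.audit_log.append(
            AccessEvent(doctor, record_field, restricted,
                        datetime.now(timezone.utc))
        )
        if restricted:
            # Accountability instead of prevention: allow, then notify.
            self.notifications.append(
                f"{doctor} accessed restricted field '{record_field}'"
            )
        return self.data[record_field]
```

The design choice this shows is that restriction does not deny access; it only changes what the patient is told afterwards, which is how the abstract describes reconciling the two parties' goals.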

The role of environmental factors in the spatial distribution of Japanese encephalitis in mainland China

Relevance: 20.00%

Abstract:

Japanese encephalitis (JE) is the most common cause of viral encephalitis and an important public health concern in the Asia-Pacific region, particularly in China, where 50% of global cases are notified. To explore the association between environmental factors and human JE cases and to identify the high-risk areas for JE transmission in China, we used annual notification data on JE cases, located at the center of each administrative township, together with environmental variables at a pixel resolution of 1 km × 1 km from 2005 to 2011 to construct models using ecological niche modeling (ENM) approaches based on maximum entropy. These models were then validated by overlaying reported human JE case localities from 2006 to 2012 onto each prediction map. The ENMs had good discriminatory ability, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.82-0.91 and a low extrinsic omission rate of 5.44-7.42%. The resulting maps showed JE to be present extensively throughout southwestern and central China, with local spatial variations in probability influenced by minimum temperatures, human population density, mean temperatures, and elevation, with contributions of 17.94%-38.37%, 15.47%-21.82%, 3.86%-21.22%, and 12.05%-16.02%, respectively. Approximately 60% of JE cases occurred in predicted high-risk areas, which covered less than 6% of mainland China. Our findings will help inform optimal geographical allocation of the limited resources available for JE prevention and control in China, find hidden high-risk areas, and increase the effectiveness of public health interventions against JE transmission.

Predicting the speed of a Wave Glider autonomous surface vehicle from wave model data

Relevance: 20.00%

Abstract:

A key component of robotic path planning is ensuring that one can reliably navigate a vehicle to a desired location. In addition, when the features of interest are dynamic and move with oceanic currents, vehicle speed plays an important role in the planning exercise to ensure that vehicles are in the right place at the right time. Aquatic robot design is moving towards utilizing the environment for propulsion rather than traditional motors and propellers. These new vehicles are able to realize significantly increased endurance; however, the mission planning problem in turn becomes more difficult because the vehicle velocity is not directly controllable. In this paper, we examine Gaussian process models applied to existing wave model data to predict the behavior, i.e., the velocity, of a Wave Glider (WG) Autonomous Surface Vehicle. Using training data from an on-board sensor and forecasting with the WAVEWATCH III model, our probabilistic regression models provided an effective method for forecasting WG velocity.

FlexAnalytics: A flexible data analytics framework for big data applications with I/O performance improvement

Relevance: 20.00%

Abstract:

Increasingly large-scale applications are generating an unprecedented amount of data. However, the growing gap between computation and I/O capacity on High End Computing (HEC) machines creates a severe bottleneck for data analysis. Instead of moving data from its source to the output storage, in-situ analytics processes output data while simulations are running. However, in-situ data analysis competes with the simulations for computing resources, and this contention severely degrades simulation performance on HEC machines. Since different data processing strategies have different impacts on performance and cost, there is a consequent need for flexibility in the location of data analytics. In this paper, we explore and analyze several potential data-analytics placement strategies along the I/O path. To find the best strategy for reducing data movement in a given situation, we propose a flexible data analytics (FlexAnalytics) framework. Based on this framework, a FlexAnalytics prototype system is developed for analytics placement. The FlexAnalytics system enhances the scalability and flexibility of the current I/O stack on HEC platforms and is useful for data pre-processing, runtime data analysis and visualization, as well as for large-scale data transfer. Two use cases, scientific data compression and remote visualization, were applied in the study to verify the performance of FlexAnalytics. Experimental results demonstrate that the FlexAnalytics framework increases data transition bandwidth and improves the application's end-to-end transfer performance.

Combining HPLC-DAD and ICP-MS data for improved analysis of complex samples: Classification of the root samples from Cortex moutan

Relevance: 20.00%

Abstract:

A combined data matrix consisting of high performance liquid chromatography–diode array detector (HPLC–DAD) and inductively coupled plasma-mass spectrometry (ICP-MS) measurements of samples from the plant roots of Cortex moutan (CM) produced much better classification and prediction results than those obtained from either of the individual data sets. The HPLC peaks (organic components) of the CM samples and the ICP-MS measurements (trace metal elements) were investigated using principal component analysis (PCA) and linear discriminant analysis (LDA); essentially, qualitative results suggested that discrimination of the CM samples from three different provinces was possible, with the combined matrix producing the best results. Three further methods, K-nearest neighbor (KNN), back-propagation artificial neural network (BP-ANN) and least squares support vector machines (LS-SVM), were applied for the classification and prediction of the samples. Again, the combined data matrix analyzed by the KNN method produced the best results (100% correct; prediction set data). Additionally, multiple linear regression (MLR) was used to explore any relationship between the organic constituents and the metal elements of the CM samples; the extracted linear regression equations showed that the essential metals as well as some metallic pollutants were related to the organic compounds on the basis of their concentrations.

Data driven modeling for power transformer lifespan evaluation

Relevance: 20.00%

Abstract:

Large power transformers are important parts of the power supply chain. These critical networks of engineering assets are an essential base of a nation's energy resource infrastructure. This research identifies the key factors influencing normal transformer operating conditions and predicts the asset management lifespan. Engineering asset research has developed few lifespan forecasting methods that incorporate real-time monitoring solutions for transformer maintenance and replacement. Using the rich data source from a remote terminal unit (RTU) system for sensor-data-driven analysis, this research develops an innovative real-time lifespan forecasting approach that applies logistic regression based on the Weibull distribution. The methodology and the implementation prototype are verified using a data series from 161 kV transformers to evaluate their efficiency and accuracy for energy sector applications. Asset stakeholders and suppliers benefit significantly from real-time power transformer lifespan evaluation as decision support for maintenance and replacement.

Occupational injury risk among Australian paramedics: An analysis of national data

Relevance: 20.00%

Abstract:

Objective: To identify the occupational risks for Australian paramedics by describing the rate of injuries and fatalities and comparing those rates with other reports. Design and participants: Retrospective descriptive study using data provided by Safe Work Australia for the period 2000–2010. The subjects were paramedics who had been injured in the course of their duties and for whom a claim had been made for workers' compensation payments. Main outcome measures: Rates of injury calculated from the data provided. Results: The risk of serious injury among Australian paramedics was found to be more than seven times higher than the Australian national average. The fatality rate for paramedics was about six times higher than the national average. On average, every 2 years during the study period, one paramedic died and 30 were seriously injured in vehicle crashes. Ten Australian paramedics were seriously injured each year as a result of an assault. The injury rate for paramedics was more than two times higher than the rate for police officers. Conclusions: The high rate of occupational injuries and fatalities among paramedics is a serious public health issue. The risk of injury in Australia is similar to that in the United States. While it may be anticipated that injury rates would be higher as a result of the nature of the work and environment of paramedics, further research is necessary to identify and validate the strategies required to minimise the rates of occupational injury for paramedics.

Optimal management of the critically ill: Anaesthesia, monitoring, data capture, and point-of-care technological practices in ovine models of critical care

Relevance: 20.00%

Abstract:

Animal models of critical illness are vital in biomedical research. They provide possibilities for the investigation of pathophysiological processes that may not otherwise be possible in humans. In order to be clinically applicable, the model should simulate the critical care situation realistically, including anaesthesia, monitoring, sampling, utilising appropriate personnel skill mix, and therapeutic interventions. There are limited data documenting the constitution of ideal technologically advanced large animal critical care practices and all the processes of the animal model. In this paper, we describe the procedure of animal preparation, anaesthesia induction and maintenance, physiologic monitoring, data capture, point-of-care technology, and animal aftercare that has been successfully used to study several novel ovine models of critical illness. The relevant investigations are on respiratory failure due to smoke inhalation, transfusion related acute lung injury, endotoxin-induced proteogenomic alterations, haemorrhagic shock, septic shock, brain death, cerebral microcirculation, and artificial heart studies. We have demonstrated the functionality of monitoring practices during anaesthesia required to provide a platform for undertaking systematic investigations in complex ovine models of critical illness.

UWB channel measurement and data transfer analysis for multiuser Infostation applications

Relevance: 20.00%

Abstract:

This paper presents the time dispersion parameters obtained from a set of channel measurements conducted in various environments that are typical of multiuser Infostation application scenarios. The measurement procedure takes into account the practical scenarios typical of the positions and movements of the users in the particular Infostation network. To show how much data users can download over a given time and at a given mobile speed, a data transfer analysis for multiband orthogonal frequency division multiplexing (MB-OFDM) is presented. As expected, the rough estimate of simultaneous data transfer in a multiuser Infostation scenario indicates that the percentage of a download completed depends on the data size, the number and speed of the users, and the elapsed time.
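The dependence noted at the end of this abstract can be made concrete with back-of-the-envelope arithmetic. The sketch below is not the paper's analysis: it assumes a fixed link rate shared equally among users and a straight pass through a coverage zone of known length, and every parameter name is hypothetical.

```python
def download_fraction(file_size_mbit, link_rate_mbps, coverage_m,
                      speed_mps, n_users):
    """Fraction of a file (0.0 to 1.0) one user can download in one pass.

    Assumptions (not from the source): the link rate is constant inside
    the coverage zone and is split equally among n_users.
    """
    if min(file_size_mbit, link_rate_mbps, coverage_m, speed_mps) <= 0:
        raise ValueError("sizes, rates, distances and speeds must be positive")
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    dwell_s = coverage_m / speed_mps          # time spent in coverage
    per_user_rate = link_rate_mbps / n_users  # equal sharing
    return min(1.0, per_user_rate * dwell_s / file_size_mbit)
```

Under these assumptions the fraction falls as the file grows, as more users share the link, or as users move faster, which matches the qualitative dependency the abstract reports.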

A data mining approach for fault diagnosis: An application of anomaly detection algorithm

Relevance: 20.00%

Abstract:

Rolling-element bearing failures are the most frequent problems in rotating machinery; they can be catastrophic and cause major downtime. Hence, providing advance failure warning and precise fault detection for such components is pivotal and cost-effective. The vast majority of past research has focused on signal processing and spectral analysis for fault diagnostics in rotating components. In this study, a data mining approach using a machine learning technique called anomaly detection (AD) is presented. This method employs classification techniques to discriminate between defect examples. Two features, kurtosis and Non-Gaussianity Score (NGS), are extracted to develop anomaly detection algorithms. The performance of the developed algorithms was examined using real data from a test-to-failure bearing. Finally, anomaly detection is compared with a popular method, the Support Vector Machine (SVM), to investigate the sensitivity and accuracy of this approach and its ability to detect anomalies at an early stage.
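Kurtosis is useful here because impulsive vibration from a damaged bearing produces heavy-tailed samples. The sketch below computes excess kurtosis per window and flags windows above a threshold. The fixed threshold rule is a simple stand-in chosen for illustration, not the anomaly detection algorithm developed in the paper, and the Non-Gaussianity Score is not implemented because the abstract does not define it.

```python
from statistics import fmean


def excess_kurtosis(samples):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a Gaussian)."""
    mean = fmean(samples)
    m2 = fmean([(x - mean) ** 2 for x in samples])
    m4 = fmean([(x - mean) ** 4 for x in samples])
    return m4 / (m2 ** 2) - 3.0


def flag_windows(signal, window, threshold):
    """Split the signal into non-overlapping windows and flag each one
    whose excess kurtosis exceeds the (hypothetical) threshold."""
    flags = []
    for start in range(0, len(signal) - window + 1, window):
        k = excess_kurtosis(signal[start:start + window])
        flags.append(k > threshold)
    return flags
```

A smooth sinusoidal vibration has negative excess kurtosis, while the same signal with occasional large spikes has a strongly positive value, so even this crude rule separates the two cases.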

A CCG virtual system for big data application communication costs analysis

Relevance: 20.00%

Abstract:

Network topology and routing are two important factors in determining the communication costs of big data applications at large scale. For a given Cluster, Cloud, or Grid (CCG) system, the network topology is fixed, and static or dynamic routing protocols are preinstalled to direct the network traffic. Users cannot change them once the system is deployed. Hence, it is hard for application developers to identify the optimal network topology and routing algorithm for applications with distinct communication patterns. In this study, we design a CCG virtual system (CCGVS), which first uses container-based virtualization to allow users to create a farm of lightweight virtual machines on a single host. It then uses software-defined networking (SDN) techniques to control the network traffic among these virtual machines. Users can change the network topology and control the network traffic programmatically, which enables application developers to evaluate their applications on the same system with different network topologies and routing algorithms. Preliminary experimental results with both synthetic big data programs and NPB benchmarks show that CCGVS can reproduce the application performance variations caused by network topology and routing algorithm.

Missing in space: An evaluation of imputation methods for missing data in spatial analysis of risk factors for type II diabetes

Relevance: 20.00%

Abstract:

Background: Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. Methods: We present a cross-validation approach to select between three imputation methods for health survey data with correlated lifestyle covariates, using as a case study type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs). We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. Results: The choice of imputation method depends upon the application, and the best choice is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Conclusions: Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease with more confidence in the results to inform public policy decision-making.
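The selection idea in this abstract, hiding values that are actually known, imputing them, and keeping the method with the lowest error, can be sketched as follows. This is only an illustration under loud assumptions: leave-one-out cross-validation is used for simplicity, and a plain neighbour average stands in for the spatial method; it is not the multivariate normal or conditional autoregressive model used in the paper. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import fmean


def impute_mean(area, observed, neighbours):
    """Global mean of all other observed areas."""
    return fmean(observed.values())


def impute_neighbour_mean(area, observed, neighbours):
    """Mean of adjacent observed areas (crude spatial stand-in);
    falls back to the global mean if no neighbour is observed."""
    vals = [observed[n] for n in neighbours.get(area, []) if n in observed]
    return fmean(vals) if vals else fmean(observed.values())


def loo_rmse(method, observed, neighbours):
    """Leave-one-out RMSE: hide each known value, impute it, compare."""
    errors = []
    for area, truth in observed.items():
        rest = {a: v for a, v in observed.items() if a != area}
        errors.append((method(area, rest, neighbours) - truth) ** 2)
    return sqrt(fmean(errors))


def select_method(methods, observed, neighbours):
    """Return the name of the lowest-RMSE method and all scores."""
    scores = {name: loo_rmse(fn, observed, neighbours)
              for name, fn in methods.items()}
    return min(scores, key=scores.get), scores
```

For four areas in a chain with values 1, 2, 3, 4 the neighbour average wins, while for values 1, 4, 1, 4 the global mean wins, which mirrors the abstract's point that the more elaborate method is not always the more accurate one.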

5-year Chlamydia vaccination programme could reverse disease-related koala population decline: Predictions from a mathematical model using field data

Relevance: 20.00%

Abstract:

BACKGROUND: Many koala populations around Australia are in serious decline, with a substantial component of this decline in some Southeast Queensland populations attributed to the impact of Chlamydia. A Chlamydia vaccine for koalas is in development and has shown promise in early trials. This study contributes to implementation preparedness by simulating vaccination strategies designed to reverse population decline and by identifying which age and sex category it would be most effective to target. METHODS: We used field data to inform the development and parameterisation of an individual-based stochastic simulation model of a koala population in which Chlamydia is endemic. The model took into account transmission, morbidity and mortality caused by Chlamydia infections. We calibrated the model to characteristics of typical Southeast Queensland koala populations. As there is uncertainty about the effectiveness of the vaccine in real-world settings, a variety of potential vaccine efficacies, half-lives and dosing schedules were simulated. RESULTS: Assuming other threats remain constant, it is expected that current population declines could be reversed in around 5-6 years if female koalas aged 1-2 years are targeted, average vaccine protective efficacy is 75%, and vaccine coverage is around 10% per year. At lower vaccine efficacies the immunological effects of boosting become important: at 45% vaccine efficacy, population decline is predicted to reverse in 6 years under optimistic boosting assumptions but in 9 years under pessimistic boosting assumptions. Terminating a successful vaccination programme at 5 years would lead to a rise in Chlamydia prevalence towards pre-vaccination levels. CONCLUSION: For a range of vaccine efficacy levels it is projected that population decline due to endemic Chlamydia can be reversed under realistic dosing schedules, potentially in just 5 years. However, a vaccination programme might need to continue indefinitely in order to maintain Chlamydia prevalence at a sufficiently low level for population growth to continue.
