182 results for Automated data analysis
Abstract:
This research addresses problems in the field of asset management relating to risk analysis and decision making based on data from a Supervisory Control and Data Acquisition (SCADA) system. Determining risk likelihood in risk analysis is difficult, especially when historical information is unreliable; this is compounded by a problem in SCADA data analysis caused by nested data. A further problem is providing beneficial information from a SCADA system to a managerial-level information system (e.g., Enterprise Resource Planning, ERP). A Hierarchical Model is developed to address these problems. The model comprises three analyses: Hierarchical Analysis, Failure Mode and Effect Analysis, and Interdependence Analysis. The significant contributions of the model are: (a) a new risk analysis model, the Interdependence Risk Analysis Model, which does not rely on the existence of historical information because it utilises Interdependence Relationships to determine risk likelihood; (b) improvement of SCADA data analysis by addressing the nested data problem through the Hierarchical Analysis; and (c) a framework for providing beneficial information from SCADA systems to ERP systems. A case study of a Water Treatment Plant is used for model validation.
Abstract:
Monitoring Internet traffic is critical to acquiring a good understanding of threats to computer and network security and to designing efficient computer security systems. Researchers and network administrators have applied several approaches to monitoring traffic for malicious content, including monitoring network components, aggregating IDS alerts, and monitoring unused IP address spaces. Another widely tried and accepted method for monitoring and analyzing malicious traffic is the use of honeypots. Honeypots are very valuable security resources for gathering artefacts associated with a variety of Internet attack activities. As honeypots run no production services, any contact with them is considered potentially malicious or suspicious by definition. This unique characteristic reduces the amount of collected traffic and makes honeypots a more valuable source of information than other existing techniques. Currently, there is insufficient research in the honeypot data analysis field. To date, most work on honeypots has been devoted to designing new honeypots or optimizing current ones. Approaches for analyzing data collected from honeypots, especially low-interaction honeypots, are presently immature: analysis techniques are manual and focus mainly on identifying existing attacks. This research addresses the need for more advanced techniques for analyzing Internet traffic data collected from low-interaction honeypots. We believe that characterizing honeypot traffic will improve the security of networks and, if the honeypot data is handled in time, give early signs of new vulnerabilities or outbreaks of new automated malicious code, such as worms. The outcomes of this research include:
• Identification of repeated use of attack tools and attack processes by grouping activities that exhibit similar packet inter-arrival time distributions using the cliquing algorithm;
• Application of principal component analysis to detect the structure of attackers' activities present in low-interaction honeypots and to visualize attackers' behaviors;
• Detection of new attacks in low-interaction honeypot traffic through the use of the principal components' residual space and the squared prediction error statistic;
• Real-time detection of new attacks using recursive principal component analysis;
• A proof-of-concept implementation for honeypot traffic analysis and real-time monitoring.
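A minimal sketch of the detection idea named in the outcomes above: traffic feature vectors are projected onto retained principal components, and samples whose squared prediction error (SPE) in the residual space exceeds a control limit are flagged. The feature matrices, the percentile-based limit, and all variable names are hypothetical.

```python
# Sketch: flagging anomalous traffic profiles via the PCA residual space
# and the squared prediction error (SPE, or Q) statistic. The feature
# vectors here are synthetic stand-ins for honeypot traffic features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 10))        # "known" traffic feature vectors
X_new = rng.normal(size=(50, 10)) + 3.0     # candidate new-attack vectors

scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=3).fit(scaler.transform(X_train))

def spe(X):
    """Squared prediction error: squared norm of each sample's residual
    after projecting onto the retained principal components."""
    Z = scaler.transform(X)
    residual = Z - pca.inverse_transform(pca.transform(Z))
    return np.sum(residual ** 2, axis=1)

# Control limit set empirically here (99th percentile of training SPE);
# the literature also offers analytic limits.
limit = np.percentile(spe(X_train), 99)
print("flagged as new:", np.sum(spe(X_new) > limit), "of", len(X_new))
```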
Abstract:
Aims: To describe a local data linkage project to match hospital data with the Australian Institute of Health and Welfare (AIHW) National Death Index (NDI) to assess long-term outcomes of intensive care unit patients. Methods: Data were obtained from hospital intensive care and cardiac surgery databases on all patients aged 18 years and over admitted to either of two intensive care units at a tertiary-referral hospital between 1 January 1994 and 31 December 2005. Date of death was obtained from the AIHW NDI by probabilistic software matching, in addition to manual checking through hospital databases and other sources. Survival was calculated from time of ICU admission, with a censoring date of 14 February 2007. Data for patients with multiple hospital admissions requiring intensive care were analysed only from the first admission. Summary and descriptive statistics were used for preliminary data analysis. Kaplan-Meier survival analysis was used to analyse factors determining long-term survival. Results: During the study period, 21 415 unique patients had 22 552 hospital admissions that included an ICU admission; 19 058 surgical procedures were performed, with a total of 20 092 ICU admissions. There were 4936 deaths. Median follow-up was 6.2 years, totalling 134 203 patient-years. The casemix was predominantly cardiac surgery (80%), followed by cardiac medical (6%) and other medical (4%). Unadjusted survival at 1, 5 and 10 years was 97%, 84% and 70%, respectively. One-year survival ranged from 97% for cardiac surgery to 36% for cardiac arrest. An APACHE II score was available for 16 877 patients. In those discharged alive from hospital, 1-, 5- and 10-year survival varied with discharge location. Conclusions: ICU-based linkage projects are feasible for determining long-term outcomes of ICU patients.
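For illustration, a minimal Kaplan-Meier sketch using the lifelines library, assuming a per-patient table with follow-up time and a death indicator; the column names and values are hypothetical, not the study's databases.

```python
# Sketch: Kaplan-Meier survival estimation from linked ICU data,
# stratified by casemix group. Data below are invented.
import pandas as pd
from lifelines import KaplanMeierFitter

df = pd.DataFrame({
    "years_from_icu_admission": [0.5, 2.1, 6.2, 10.0, 1.3, 8.7],
    "died": [1, 0, 1, 0, 1, 0],        # 0 = alive at censoring date
    "casemix": ["cardiac surgery", "cardiac surgery", "cardiac medical",
                "other medical", "cardiac surgery", "cardiac medical"],
})

kmf = KaplanMeierFitter()
for group, sub in df.groupby("casemix"):
    kmf.fit(sub["years_from_icu_admission"],
            event_observed=sub["died"], label=group)
    print(group, kmf.predict(5.0))     # estimated 5-year survival
```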
Abstract:
This paper presents the results of a study of information behaviors in the context of people's everyday lives, undertaken in order to develop an integrated model of information behavior (IB). Thirty-four participants from across six countries maintained a daily information journal or diary – mainly through a secure web log – for two weeks, yielding an aggregate of 468 participant days over five months. The text-rich diary data were analyzed using a multi-method qualitative-quantitative analysis in the following order: Grounded Theory analysis with manual coding, automated concept analysis using thesaurus-based visualization, and finally a statistical analysis of the coding data. The findings indicate that people engage in several information behaviors simultaneously throughout their everyday lives (including home and work life) and that sense-making is entangled in all aspects of them. Participants engaged in many of the information behaviors in a parallel, distributed, and concurrent fashion: many information behaviors for one information problem, one information behavior across many information problems, and many information behaviors concurrently across many information problems. The findings also indicate that information avoidance – both active and passive – is a common phenomenon, and that information organizing behaviors, or the lack thereof, caused the most problems for participants. An integrated model of information behaviors is presented based on the findings.
Abstract:
Purpose: To analyze the repeatability of measuring nerve fiber length (NFL) from images of the human corneal subbasal nerve plexus using semiautomated software. Methods: Images were captured from the corneas of 50 subjects with type 2 diabetes mellitus who showed varying severity of neuropathy, using the Heidelberg Retina Tomograph 3 with Rostock Corneal Module. Semiautomated nerve analysis software was used independently by two observers to determine NFL from images of the subbasal nerve plexus. This procedure was undertaken on two occasions, 3 days apart. Results: The intraclass correlation coefficient values were 0.95 (95% confidence interval: 0.92–0.97) for individual subjects and 0.95 (95% confidence interval: 0.74–1.00) for observers. Bland-Altman plots of the NFL values indicated a reduced spread of data at lower NFL values. The overall spread of data was less for (a) the observer who was more experienced at analyzing nerve fiber images and (b) the second measurement occasion. Conclusions: Semiautomated measurement of NFL in the subbasal nerve fiber layer is highly repeatable, and repeatability can be enhanced by using more experienced observers. It may be possible to markedly improve repeatability further by using fully automated image analysis software.
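As a rough illustration of the repeatability statistic reported, here is a sketch of a two-way random-effects intraclass correlation, ICC(2,1) in Shrout and Fleiss terms; the ratings matrix is invented, and the abstract does not specify which ICC form the study used.

```python
# Sketch: ICC(2,1) for inter-observer repeatability of NFL measurements.
import numpy as np

def icc_2_1(Y):
    """Y: n_subjects x n_observers matrix of measurements."""
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1, keepdims=True)   # per-subject means
    col_means = Y.mean(axis=0, keepdims=True)   # per-observer means
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)
    mse = np.sum((Y - row_means - col_means + grand) ** 2) / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

nfl = np.array([[12.1, 11.8], [18.4, 18.9], [7.2, 7.5], [22.0, 21.4]])
print(f"ICC(2,1) = {icc_2_1(nfl):.3f}")
```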
Abstract:
The study period, 2007 to 2009, covered the residential property boom that ran from the early 2000s through to the property recession following the Global Financial Crisis (GFC). Since late 2008, a number of residential property markets have suffered significant falls in house prices, but this has not been consistent across all market sectors. This paper will analyze the housing market in Brisbane, Australia, to determine the impact, similarities and differences that the GFC had on a range of residential sectors across a diversified property market. Data analysis will provide an overview of residential property prices, sales and listing volumes over the study period, and will provide a comparison of median house price performance across the geographic and socio-economic areas of Brisbane.
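A sketch of the kind of descriptive comparison described, median house prices by area over the study period, using a hypothetical pandas sales table; all column names and figures are illustrative only.

```python
# Sketch: median house price by area and year from a hypothetical
# sales table, plus year-on-year movement per sector.
import pandas as pd

sales = pd.DataFrame({
    "year": [2007, 2007, 2008, 2008, 2009, 2009],
    "area": ["inner", "outer", "inner", "outer", "inner", "outer"],
    "price": [550_000, 380_000, 540_000, 360_000, 500_000, 345_000],
})

median_prices = sales.pivot_table(values="price", index="year",
                                  columns="area", aggfunc="median")
print(median_prices)
print(median_prices.pct_change())   # year-on-year change per sector
```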
Abstract:
Many initiatives to improve business processes are emerging. The essential roles and contributions of Business Analyst (BA) and Business Process Management (BPM) professionals to such initiatives have been recognized in literature and practice. The roles and responsibilities of a BA or BPM practitioner typically require different skill sets; however, these differences are often vaguely defined. This vagueness creates much confusion in practice and academia. While both the BA and BPM communities have attempted to describe their domains through capability-defining empirical research and the development of Bodies of Knowledge, there has not yet been any attempt to identify the commonality of skills required and the points of uniqueness between the two professions. This study aims to address this gap and presents the findings of a detailed content mapping exercise (using NVivo as a qualitative data analysis tool) of the International Institute of Business Analysis (IIBA®) Guide to the Business Analysis Body of Knowledge (BABOK® Guide) against core BPM competency and capability frameworks.
Abstract:
Understanding the relationship between diet, physical activity and health in humans requires accurate measurement of body composition and daily energy expenditure. Stable isotopes provide a means of measuring total body water (TBW) and daily energy expenditure under free-living conditions. While the use of isotope ratio mass spectrometry (IRMS) for the analysis of 2H (deuterium) and 18O (oxygen-18) is well established in the field of human energy metabolism research, numerous questions remain regarding the factors that influence analytical and measurement error using this methodology. This thesis comprised four studies with the following emphases. The aim of Study 1 was to determine the analytical and measurement error of the IRMS with regard to sample handling under certain conditions. Study 2 compared total daily energy expenditure (TEE) calculated using two commonly employed equations; further, saliva and urine samples collected at different times were used to determine whether clinically significant differences would occur. Study 3 was undertaken to determine the appropriate collection times for TBW estimates and derived body composition values. Finally, Study 4 was a single case study investigating whether TEE measures are affected when the human condition changes due to altered exercise and water intake. The aim of Study 1 was to validate laboratory approaches to measuring isotopic enrichment to ensure accurate (to international standards), precise (reproducible across three replicate samples) and linear (isotope ratio constant over the expected concentration range) results. This established the machine variability of the IRMS equipment in use at Queensland University for both TBW and TEE. Either 0.4 mL or 0.5 mL sample volumes for both oxygen-18 and deuterium were statistically acceptable (p>0.05), with a within-analysis variance of 5.8 delta VSMOW units for deuterium and 0.41 delta VSMOW units for oxygen-18. This variance was used as "within-analytical noise" to determine sample deviations. It was also found that equilibration time had no influence on oxygen-18 or deuterium values when comparing the minimum (oxygen-18: 24 hr; deuterium: 3 days) and maximum (oxygen-18 and deuterium: 14 days) equilibration times. With regard to preparation using the vacuum line, any order of preparation is suitable, as the TEE values fall within 8% of each other regardless of preparation order; an 8% variation is acceptable for TEE values due to biological and technical errors (Schoeller, 1988). However, for the automated line, deuterium must be assessed first, followed by oxygen-18, as the automated line does not evacuate tubes but merely refills them with an injection of gas for a predetermined time. Any fractionation (which may occur for both isotopes) would cause a slight elevation in the values and hence a lower TEE. The purpose of the second and third studies was to investigate the use of IRMS to measure the TEE and TBW of participants and to validate the current IRMS practices in use with regard to sample collection times of urine and saliva, the use of two TEE equations from different research centres, and the body composition values derived from these TEE and TBW values. Following the collection of fasting baseline urine and saliva samples, 10 people (8 women, 2 men) were dosed with a doubly labeled water dose comprising 1.25 g of 10% oxygen-18 and 0.1 g of 100% deuterium per kg body weight.
The samples were collected hourly for 12 hours on the first day, and then morning, midday and evening samples were collected for the next 14 days. The samples were analyzed using an isotope ratio mass spectrometer. For TBW, time to equilibration was determined using three commonly employed data analysis approaches. Isotopic equilibration was reached in 90% of the sample by hour 6, and in 100% of the sample by hour 7. With regard to the TBW estimations, the optimal time for urine collection was found to be between hours 4 and 10, within which there was no significant difference between values. In contrast, statistically significant differences in TBW estimations were found between hours 1-3 and hours 11-12 when compared with hours 4-10. Most of the individuals in this study were in equilibrium after 7 hours. The TEE equations of Prof. Dale Schoeller (Chicago, USA, IAEA) and Prof. K. Westerterp were compared with that of Prof. Andrew Coward (Dunn Nutrition Centre). When comparing values derived from samples collected in the morning and evening, there was no effect of time or equation on the resulting TEE values. The fourth study was a pilot study (n=1) to test the variability in TEE resulting from manipulations of fluid consumption and level of physical activity, of the magnitude of change that may be expected in a sedentary adult. Physical activity levels were manipulated by increasing the number of steps per day to mimic the increases that may result when a sedentary individual commences an activity program. The study comprised three sub-studies completed on the same individual over a period of 8 months. There were no significant changes in TBW across all studies, even though the elimination rates changed with the supplemented water intake and additional physical activity. The extra activity may not have been sufficiently strenuous, nor the water intake high enough, to cause a significant change in TBW and hence in the CO2 production and TEE values. The TEE values measured show good agreement with estimated values calculated from an RMR of 1455 kcal/day, a DIT of 10% of TEE, and activity based on measured steps. The covariance values tracked when plotting the residuals were found to be representative of "well-behaved" data and are indicative of the analytical accuracy. The ratio and product plots were found to reflect water turnover and CO2 production and thus could, with further investigation, be employed to identify changes in physical activity.
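The TEE calculations described above rest on isotope elimination rates estimated from serial enrichment samples. As a generic illustration of that underlying step (not any of the specific equations compared in the thesis), here is a sketch of estimating a rate constant k by log-linear regression; all values are invented.

```python
# Sketch: fit ln(enrichment) vs time to estimate an isotope elimination
# rate constant k. Published equations (e.g. Schoeller, 1988) then turn
# the deuterium and oxygen-18 rate constants and dilution spaces into
# CO2 production and hence TEE.
import numpy as np

days = np.array([1, 3, 5, 7, 9, 11, 13], dtype=float)
enrichment = np.array([310.0, 255.0, 212.0, 176.0, 146.0, 121.0, 101.0])
# enrichment above baseline, hypothetical delta units vs VSMOW

slope, intercept = np.polyfit(days, np.log(enrichment), 1)
k = -slope                      # elimination rate constant, per day
print(f"k = {k:.4f} /day; zero-time intercept = {np.exp(intercept):.1f}")
```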
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom-studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it has become habitual and involves minimal intention or decision making. Key variables investigated are the activity-initialisation timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted using the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting, such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals' highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially captures their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on the use of VB-GMM.
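scikit-learn's BayesianGaussianMixture offers a variational-Bayes GMM fit in which a Dirichlet-process prior prunes surplus components automatically; this is analogous in spirit to, though not the same as, the split-based VB-GMM algorithm the thesis develops. A sketch on synthetic 2-D "location" data:

```python
# Sketch: VB fitting of a GMM where excess components are pruned,
# so the effective number of components is determined by the data.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(300, 2))
               for loc in ([0, 0], [4, 0], [2, 3])])   # three true clusters

vbgmm = BayesianGaussianMixture(
    n_components=10,                       # deliberately too many
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500, random_state=1,
).fit(X)

active = vbgmm.weights_ > 0.01             # components the VB fit kept
print("effective components:", active.sum())
print("segment labels for first 5 points:", vbgmm.predict(X[:5]))
```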
Abstract:
Virtual methods to assess the fit of a fracture fixation plate were proposed recently, however with limitations such as simplified fit criteria or manual data processing. This study aims to automate a fit analysis procedure using clinically based criteria, and then to analyse the results further for borderline fit cases. Three-dimensional (3D) models of 45 bones and of a precontoured distal tibial plate were used to assess the fit of the plate automatically. A Matlab program was developed to automatically measure the shortest distance between the bone and the plate at three regions of interest, together with a plate-bone angle. The measured values, including the fit assessment results, were recorded in a spreadsheet as part of the batch-process routine. An automated fit analysis procedure will enable the processing of larger bone datasets in a significantly shorter time, providing more representative data of the target population for plate shape design and validation. As a result, better-fitting plates can be manufactured and made available to surgeons, reducing the risk and cost associated with complications or corrective procedures. This, in turn, is expected to translate into improved quality of life for patients.
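A minimal Python analogue of the core measurement (the original was a Matlab batch program): nearest bone-to-plate distances per region of interest via a k-d tree. The point clouds, region names, and fit threshold below are hypothetical, not the study's criteria.

```python
# Sketch: shortest distance between plate regions and a bone surface,
# batch-style, using a k-d tree for nearest-neighbour queries.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
bone_surface = rng.uniform(0, 50, size=(5000, 3))   # bone mesh vertices (mm)
plate_regions = {                                   # three regions of interest
    "proximal": rng.uniform(0, 50, size=(100, 3)),
    "middle": rng.uniform(0, 50, size=(100, 3)),
    "distal": rng.uniform(0, 50, size=(100, 3)),
}

tree = cKDTree(bone_surface)
fit_threshold_mm = 2.0                              # assumed criterion
for name, pts in plate_regions.items():
    dists, _ = tree.query(pts)          # nearest bone point per plate vertex
    print(f"{name}: min gap {dists.min():.2f} mm, "
          f"fit = {dists.min() <= fit_threshold_mm}")
```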
Abstract:
The conventional manual power line corridor inspection processes used by most energy utilities are labor-intensive, time consuming and expensive. Remote sensing technologies represent an attractive and cost-effective alternative to these monitoring activities. This paper presents a comprehensive investigation into automated remote-sensing-based power line corridor monitoring, focusing on recent innovations in two areas: increased automation of fixed-wing platforms for aerial data collection, and automated data processing for object recognition using a feature fusion process. Airborne automation is achieved using a novel approach, which we call PTAGS, that provides improved lateral control for tracking corridors and automatic real-time dynamic turning for flying between corridor segments. Improved object recognition is achieved by fusing information from multi-sensor (LiDAR and imagery) data and multiple visual feature descriptors (color and texture). The results of our experiments and field survey illustrate the effectiveness of the proposed aircraft control and feature fusion approaches.
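As a generic illustration of feature-level fusion (not the paper's exact descriptors or its LiDAR/imagery fusion), here is a sketch that concatenates a color histogram with a local-binary-pattern texture histogram into a single descriptor for a classifier.

```python
# Sketch: fusing color and texture features into one vector per image.
import numpy as np
from skimage.feature import local_binary_pattern

def fused_features(rgb_image):
    """rgb_image: HxWx3 uint8 array. Returns one 1-D feature vector."""
    color_hist, _ = np.histogramdd(
        rgb_image.reshape(-1, 3), bins=(8, 8, 8), range=[(0, 256)] * 3)
    gray = rgb_image.mean(axis=2).astype(np.uint8)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    texture_hist, _ = np.histogram(lbp, bins=10, range=(0, 10))
    # normalize each modality before concatenating so neither dominates
    c = color_hist.ravel() / color_hist.sum()
    t = texture_hist / texture_hist.sum()
    return np.concatenate([c, t])

img = np.random.default_rng(3).uniform(0, 255, (64, 64, 3)).astype(np.uint8)
print(fused_features(img).shape)    # (512 + 10,) fused descriptor
```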
Abstract:
Defence organisations perform information security evaluations to confirm that electronic communications devices are safe to use in security-critical situations. Such evaluations include tracing all possible dataflow paths through the device, but this process is tedious and error-prone, so automated reachability analysis tools are needed to make security evaluations faster and more accurate. Previous research produced a tool, SIFA, for dataflow analysis of basic digital circuitry, but it cannot analyse dataflow through microprocessors embedded within the circuit, since this depends on the software they run. We have developed a static analysis tool that produces SIFA-compatible dataflow graphs from embedded microcontroller programs written in C. In this paper we present a case study showing how this new capability supports combined hardware and software dataflow analyses of a security-critical communications device.
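A toy sketch of the static-analysis idea: extracting "source flows to destination" edges from simple C assignments with pycparser. The actual tool produces SIFA-compatible graphs and handles far more of the language; nothing here reflects its implementation.

```python
# Sketch: record a (rhs_var, lhs_var) dataflow edge for each simple
# assignment in a C program, using pycparser's AST visitor.
from pycparser import c_parser, c_ast

class FlowVisitor(c_ast.NodeVisitor):
    def __init__(self):
        self.edges = []                 # (source_var, dest_var) pairs

    def visit_Assignment(self, node):
        dests = self._ids(node.lvalue)
        for src in self._ids(node.rvalue):
            for dst in dests:
                self.edges.append((src, dst))
        self.generic_visit(node)

    def _ids(self, node):
        """Collect identifier names appearing under an expression node."""
        found = []
        if isinstance(node, c_ast.ID):
            found.append(node.name)
        for _, child in node.children():
            found.extend(self._ids(child))
        return found

source = "void f(int secret, int out) { int t; t = secret + 1; out = t; }"
ast = c_parser.CParser().parse(source)
v = FlowVisitor()
v.visit(ast)
print(v.edges)   # [('secret', 't'), ('t', 'out')]: a dataflow path
```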