128 results for automated thematic analysis of textual data
Abstract:
This technical report is concerned with one aspect of environmental monitoring: the detection and analysis of acoustic events in sound recordings of the environment. Sound recordings offer ecologists the advantage of cheaper and more extensive sampling, but make available so much data that automated analysis becomes essential. The report describes a number of tools for automated analysis of recordings, including noise removal from spectrograms, acoustic event detection, event pattern recognition, spectral peak tracking, syntactic pattern recognition applied to call syllables, and oscillation detection. These algorithms are applied to a number of animal call recognition tasks, chosen because they illustrate quite different modes of analysis: (1) the detection of diffuse events caused by wind and rain, which are frequent contaminants of recordings of the terrestrial environment; (2) the detection of bird calls; and (3) the preparation of acoustic maps for whole-ecosystem analysis. This last task utilises the temporal distribution of events over a daily, monthly or yearly cycle.
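The event-detection step described above can be illustrated with a minimal sketch: compute a spectrogram, subtract a per-frequency noise profile, and group thresholded time-frequency cells into candidate events. This is an illustrative outline in Python, not the report's actual toolbox; the function name, threshold and window settings are assumptions.

```python
# Minimal sketch of spectrogram noise removal and acoustic event detection,
# in the spirit of the tools described above (not the authors' implementation).
# Assumes a mono waveform `signal` sampled at `fs`; all names are illustrative.
import numpy as np
from scipy.signal import spectrogram
from scipy.ndimage import label

def detect_events(signal, fs, threshold_db=6.0, min_cells=10):
    # Compute a magnitude spectrogram in decibels.
    f, t, sxx = spectrogram(signal, fs=fs, nperseg=512, noverlap=256)
    sxx_db = 10 * np.log10(sxx + 1e-12)

    # Estimate the background noise profile per frequency bin (median value)
    # and subtract it, a common first step before event detection.
    noise_profile = np.median(sxx_db, axis=1, keepdims=True)
    denoised = sxx_db - noise_profile

    # Mark time-frequency cells that exceed the threshold and group
    # connected regions into candidate acoustic events.
    mask = denoised > threshold_db
    labelled, n = label(mask)
    events = []
    for k in range(1, n + 1):
        rows, cols = np.nonzero(labelled == k)
        if rows.size >= min_cells:          # discard tiny regions
            events.append({
                "t_start": t[cols.min()], "t_end": t[cols.max()],
                "f_low": f[rows.min()], "f_high": f[rows.max()],
            })
    return events
```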
Abstract:
Monitoring and assessing environmental health is becoming increasingly important as human activity and climate change place greater pressure on global biodiversity. Acoustic sensors provide the ability to collect data passively, objectively and continuously across large areas for extended periods of time. While these factors make acoustic sensors attractive as autonomous data collectors, there are significant issues associated with large-scale data manipulation and analysis. We present our current research into techniques for analysing large volumes of acoustic data effectively and efficiently. We provide an overview of a novel online acoustic environmental workbench and discuss a number of approaches to scaling the analysis of acoustic data: collaborative, manual, automatic and human-in-the-loop analysis.
Abstract:
Data analysis sessions are a common feature of discourse analytic communities, often involving participants with varying levels of expertise, from relative newcomers to those with significant expertise. Learning how to do data analysis and working with transcripts, however, are often new experiences for doctoral candidates within the social sciences. While many guides to doctoral education focus on procedures associated with data analysis (Heath, Hindmarsh, & Luff, 2010; McHoul & Rapley, 2001; Silverman, 2011; Wetherell, Taylor, & Yates, 2001), the in situ practices of doing data analysis are relatively undocumented. This chapter has been collaboratively written by members of a special interest research group, the Transcript Analysis Group (TAG), who meet regularly to examine transcripts representing audio- and video-recorded interactional data. Here, we investigate our own actual interactional practices and participation in this group, where each member is both analyst and participant. We particularly focus on the pedagogic practices enacted in the group by investigating how members engage in the scholarly practice of data analysis. A key feature of talk within the data sessions is that members work collaboratively to identify and discuss ‘noticings’ from the audio-recorded and transcribed talk being examined, produce candidate analytic observations based on these discussions, and evaluate these observations. Our investigation of how talk constructs social practices in these sessions shows that participants move fluidly between actions that demonstrate pedagogic practices and expertise. Within any one session, members can display their expertise as analysts and, at the same time, display that they have gained an understanding that they did not have before. We take an ethnomethodological position that asks, ‘what’s going on here?’ in the data analysis session. By observing the in situ practices in fine-grained detail, we show how members participate in the data analysis sessions and make sense of a transcript.
Abstract:
Data flow analysis techniques can be used to help assess threats to data confidentiality and integrity in security critical program code. However, a fundamental weakness of static analysis techniques is that they overestimate the ways in which data may propagate at run time. Discounting large numbers of these false-positive data flow paths wastes an information security evaluator's time and effort. Here we show how to automatically eliminate some false-positive data flow paths by precisely modelling how classified data is blocked by certain expressions in embedded C code. We present a library of detailed data flow models of individual expression elements and an algorithm for introducing these components into conventional data flow graphs. The resulting models can be used to accurately trace byte-level or even bit-level data flow through expressions that are normally treated as atomic. This allows us to identify expressions that safely downgrade their classified inputs and thereby eliminate false-positive data flow paths from the security evaluation process. To validate the approach we have implemented and tested it in an existing data flow analysis toolkit.
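As a rough illustration of the idea of bit-level data flow through expressions (not the authors' toolkit or models), the sketch below tracks a taint mask through a masking-and-shifting expression, showing how classified bits can be blocked so that a conventional word-level analysis would report a false-positive flow:

```python
# Illustrative sketch of bit-level taint propagation through simple expressions.
# Operand widths are assumed to be 8 bits; this is not the evaluated toolkit.
def taint_and(taint_a: int, value_b: int) -> int:
    """Bitwise AND with a constant: bits masked to zero cannot carry taint."""
    return taint_a & value_b

def taint_shift_right(taint_a: int, k: int, width: int = 8) -> int:
    """Logical right shift discards the low k taint bits."""
    return (taint_a >> k) & ((1 << width) - 1)

# Example: all 8 bits of x are classified (taint mask 0xFF).
taint_x = 0xFF
# After `x & 0x0F` only the low nibble can still carry classified data...
print(hex(taint_and(taint_x, 0x0F)))                          # 0xf
# ...and after `(x & 0x0F) >> 4` no classified bits remain, so a word-level
# analysis reporting a flow through this expression is a false positive.
print(hex(taint_shift_right(taint_and(taint_x, 0x0F), 4)))    # 0x0
```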
Abstract:
Typical reference year (TRY) weather data are often used to represent the long-term weather pattern for building simulation and design. Through analysis of ten years of historical hourly weather data for seven major Australian capital cities, using the frequencies procedure of descriptive statistics analysis (in SPSS), this paper investigates:
• the closeness of the typical reference year (TRY) weather data in representing the long-term weather pattern;
• the variations and common features that may exist between relatively hot and cold years.
It is found that, for the given set of input data, the discrepancy between the TRY and multiple years is much smaller for dry bulb temperature, relative humidity and global solar irradiance than for the other weather elements. The overall distribution patterns of key weather elements are also generally similar between the hot and cold years, but with some shift and/or small distortion. There is little common tendency of change between the hot and the cold years for different weather variables at different study locations.
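A minimal sketch of this kind of frequency comparison is shown below. The study used SPSS; here pandas is used instead, and the file names, column name and temperature bands are assumptions.

```python
# Sketch of comparing the hourly frequency distribution of one weather element
# between the TRY and a multi-year record; all names here are illustrative.
import pandas as pd

def frequency_table(series: pd.Series, bins) -> pd.Series:
    """Relative frequency of hourly values falling in each bin."""
    return pd.cut(series, bins=bins).value_counts(normalize=True).sort_index()

# Hourly dry bulb temperature for the TRY and for a ten-year record,
# loaded from hypothetical CSV files with a 'dry_bulb' column.
try_year = pd.read_csv("try_hourly.csv")
ten_years = pd.read_csv("historical_hourly.csv")

bins = range(-5, 50, 5)  # 5 degree C temperature bands
comparison = pd.concat(
    {"TRY": frequency_table(try_year["dry_bulb"], bins),
     "10-year": frequency_table(ten_years["dry_bulb"], bins)},
    axis=1,
)
# The discrepancy column gives a simple measure of how closely the TRY
# reproduces the long-term distribution of this weather element.
comparison["discrepancy"] = (comparison["TRY"] - comparison["10-year"]).abs()
print(comparison)
```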
Abstract:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three-dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model, which consists of eight litho-stratigraphic units, has subsequently been used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques (e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and the distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables more efficient communication of the results of scientific studies to the wider community.
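A hedged sketch of the multivariate workflow described above (standardisation, PCA, then hierarchical clustering into hydrochemical facies) is given below; the file name, ion columns and number of clusters are illustrative assumptions, not values from the study.

```python
# Sketch of PCA followed by hierarchical clustering on a groundwater
# chemistry table; column and file names are assumptions.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

chem = pd.read_csv("wairau_chemistry.csv")          # one row per sample
ions = chem[["Ca", "Mg", "Na", "K", "HCO3", "Cl", "SO4", "NO3"]]

# Standardise so that high-concentration ions do not dominate the analysis.
scaled = StandardScaler().fit_transform(ions)

# Principal Component Analysis: the loadings indicate which processes
# (e.g. water-rock interaction, redox, agricultural input) drive variability.
pca = PCA(n_components=3)
scores = pca.fit_transform(scaled)
print(pca.explained_variance_ratio_)

# Hierarchical Cluster Analysis on the same standardised data to group
# samples into hydrochemical facies.
tree = linkage(scaled, method="ward")
chem["facies"] = fcluster(tree, t=4, criterion="maxclust")
print(chem.groupby("facies")[ions.columns].mean())
```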
Abstract:
Distraction whilst driving on the approach to a signalized intersection is particularly dangerous, as potential vehicular conflicts and the resulting angle collisions tend to be severe. This study examines the decisions of distracted drivers during the onset of amber lights. Driving simulator data were obtained from a sample of 58 drivers under baseline and handheld mobile phone conditions at the University of Iowa National Advanced Driving Simulator. Explanatory variables include age, gender, cell phone use, distance to stop-line, and speed. An iterative combination of decision tree and logistic regression analyses is employed to identify main effects, non-linearities, and interaction effects. Results show that novice (16-17 years) and younger (18-25 years) drivers had heightened amber-light running risk while distracted by a cell phone, and that speed and distance thresholds yielded significant interaction effects. Driver experience, captured by age, has a multiplicative effect with distraction, making the combined effect of being inexperienced and distracted particularly risky. Solutions are needed to combat the use of mobile phones whilst driving.
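The two-stage modelling idea can be sketched as follows: a shallow decision tree to surface candidate thresholds and interactions, then a logistic regression with an explicit interaction term. This is an illustrative outline only; the data file, column names and numeric codings are assumptions.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.linear_model import LogisticRegression

# Hypothetical trial-level data: numeric codings are assumed, e.g. age_group
# 0 = novice, 1 = younger, 2 = older; phone_condition 0 = baseline, 1 = handheld.
data = pd.read_csv("simulator_trials.csv")
X = data[["age_group", "speed", "distance_to_stopline", "phone_condition"]]
y = data["ran_amber"]                       # 1 = proceeded through the amber

# Stage 1: a shallow tree suggests candidate speed/distance cut points
# and age-by-distraction splits worth carrying into the regression.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))

# Stage 2: logistic regression with an explicit interaction term so the
# multiplicative effect of inexperience and distraction can be estimated.
X2 = X.copy()
X2["young_and_phone"] = (X["age_group"] <= 1).astype(int) * X["phone_condition"]
model = LogisticRegression(max_iter=1000).fit(X2, y)
print(dict(zip(X2.columns, model.coef_[0])))
```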
Abstract:
Data mining techniques extract recurring and useful patterns from large data sets, which in turn are used to predict the outcome of future events. The main purpose of the research presented in this paper is to investigate data mining strategies and develop an efficient framework for multi-attribute project information analysis to predict the performance of construction projects. The research team first reviewed existing data mining algorithms, applied them to systematically analyze a large project data set collected through a survey, and finally proposed a data-mining-based decision support framework for project performance prediction. To evaluate the potential of the framework, a case study was conducted using data collected from 139 capital projects to analyze the relationship between the use of information technology and project cost performance. The study results showed that the proposed framework has the potential to support fast, easy-to-use, interpretable, and accurate project data analysis.
Abstract:
Structural health monitoring (SHM) refers to the procedure used to assess the condition of structures so that their performance can be monitored and any damage can be detected early. Early detection of damage and appropriate retrofitting help prevent structural failure, reduce spending on maintenance or replacement, and ensure the structure operates safely and efficiently throughout its intended life. Though visual inspection and other techniques such as vibration-based methods are available for SHM of structures such as bridges, the acoustic emission (AE) technique is an attractive option and is increasingly used. AE waves are high-frequency stress waves generated by the rapid release of energy from localised sources within a material, such as crack initiation and growth. The AE technique involves recording these waves with sensors attached to the surface and then analysing the signals to extract information about the nature of the source. High sensitivity to crack growth, the ability to locate the source, its passive nature (no energy needs to be supplied from outside; the energy released by the damage source itself is utilised) and the possibility of real-time monitoring (detecting a crack as it occurs or grows) are some of the attractive features of the AE technique. In spite of these advantages, challenges still exist in using the AE technique for monitoring applications, especially in the analysis of recorded AE data, as large volumes of data are usually generated during monitoring. The need for effective data analysis can be linked with three main aims of monitoring: (a) accurately locating the source of damage; (b) identifying and discriminating signals from different sources of acoustic emission; and (c) quantifying the level of damage at the AE source for severity assessment. In the AE technique, the location of the emission source is usually calculated using the times of arrival and velocities of the AE signals recorded by a number of sensors. Complications arise, however, because AE waves can travel in a structure in a number of different modes that have different velocities and frequencies. Hence, to accurately locate a source it is necessary to identify the modes recorded by the sensors. This study has proposed and tested the use of time-frequency analysis tools such as the short-time Fourier transform to identify the modes, and the use of the velocities of these modes to achieve very accurate results. Further, this study has explored the possibility of reducing the number of sensors needed for data capture by using the velocities of the modes captured by a single sensor for source localization. A major problem in the practical use of the AE technique is the presence of AE sources other than cracks, such as rubbing and impacts between different components of a structure. These spurious AE signals often mask the signals from crack activity; hence discrimination of signals to identify their sources is very important. This work developed a model that uses signal processing tools such as cross-correlation, magnitude squared coherence and energy distribution in different frequency bands, as well as modal analysis (comparing amplitudes of identified modes), to accurately differentiate signals from different simulated AE sources. Quantification tools to assess the severity of damage sources are highly desirable in practical applications.
Though different damage quantification methods have been proposed for the AE technique, not all have achieved universal acceptance or proven suitable for all situations. The b-value analysis, which involves studying the distribution of AE signal amplitudes, and its modified form (known as improved b-value analysis) were investigated for their suitability for damage quantification in ductile materials such as steel. These were found to give encouraging results for the analysis of laboratory data, supporting the possibility of their use on real-life structures. By addressing these primary issues, it is believed that this thesis has helped improve the effectiveness of the AE technique for structural health monitoring of civil infrastructure such as bridges.
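As an illustration of the b-value analysis mentioned above (not the thesis code), the sketch below estimates the b-value as the slope of the cumulative amplitude-frequency distribution of AE hit amplitudes in dB:

```python
# Minimal sketch of AE b-value estimation by least squares; thresholds,
# synthetic data and scaling choices are illustrative assumptions.
import numpy as np

def b_value(amplitudes_db, step=5.0):
    """Estimate the AE b-value from hit amplitudes in dB."""
    amps = np.asarray(amplitudes_db, dtype=float)
    thresholds = np.arange(amps.min(), amps.max(), step)
    # N(a): number of hits with amplitude >= a, on a log10 scale.
    log_n = np.log10([np.sum(amps >= a) for a in thresholds])
    # Fit log10 N = c - b * (A_dB / 20); the b-value is minus the fitted slope.
    slope, _ = np.polyfit(thresholds / 20.0, log_n, 1)
    return -slope

# Example on synthetic amplitudes: a falling b-value over successive windows
# is commonly interpreted as progression from distributed microcracking
# towards localised macrocrack growth.
rng = np.random.default_rng(0)
hits = rng.exponential(scale=10.0, size=500) + 40.0   # dB
print(round(b_value(hits), 2))
```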
Abstract:
Allegations of child sexual abuse in Family Court cases have gained increasing attention. This study investigates factors involved in Family Court cases in which allegations of child sexual abuse were raised. A qualitative methodology was employed to examine the Records of Judgement and Psychiatric Reports for 20 cases distilled from a data corpus of 102 cases. A seven-stage methodology was developed, utilising a thematic analysis process informed by principles of grounded theory and phenomenology. Eight thematic clusters were explicated. The findings point to the complex issues and dynamics surrounding cases in which child sexual abuse allegations have been raised. The alleging parent’s allegations of sexual abuse against their ex-partner may be the expression of deep unconscious fears for their children’s welfare, or an action to meet their own needs for personal affirmation in the context of the painful upheaval of a relationship break-up. Implications of the findings are discussed.
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches and demonstrate that mCOPA frequently selects more informative features than differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
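The basic COPA-style outlier statistic that mCOPA refines can be sketched as follows: median-centre and MAD-scale each gene, then rank genes by an upper (or lower) percentile of the transformed values. This is an illustrative sketch, not the mCOPA implementation.

```python
# Sketch of a COPA-style outlier statistic on a genes x samples matrix.
import numpy as np

def copa_scores(expr, percentile=95):
    """expr: genes x samples expression matrix. Returns one score per gene."""
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1, keepdims=True) * 1.4826
    transformed = (expr - med) / np.maximum(mad, 1e-9)
    # A high percentile captures over-expressed outliers; a low percentile
    # (e.g. 5) would capture the under-expressed outliers that mCOPA adds.
    return np.percentile(transformed, percentile, axis=1)

rng = np.random.default_rng(1)
expr = rng.normal(size=(1000, 60))
expr[7, :5] += 6.0                           # gene 7 is an outlier in a few samples
print(np.argsort(copa_scores(expr))[-3:])    # gene 7 should rank near the top
```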
Abstract:
Background: Ankle fractures are one of the more commonly occurring forms of trauma managed by orthopaedic teams worldwide. The impacts of these injuries are not restricted to pain and disability caused at the time of the incident, but may also result in long term physical, psychological, and social consequences. There are currently no ankle fracture specific patient-reported outcome measures with a robust content foundation. This investigation aimed to develop a thematic conceptual framework of life impacts following ankle fracture from the experiences of people who have suffered ankle fractures as well as the health professionals who treat them. Methods: A qualitative investigation was undertaken using in-depth semi-structured interviews with people (n=12) who had previously sustained an ankle fracture (patients) and health professionals (n=6) that treat people with ankle fractures. Interviews were audio-recorded and transcribed. Each phrase was individually coded and grouped in categories and aligned under emerging themes by two independent researchers. Results: Saturation occurred after 10 in-depth patient interviews. Time since injury for patients ranged from 6 weeks to more than 2 years. Experience of health professionals ranged from 1 year to 16 years working with people with ankle fractures. Health professionals included an Orthopaedic surgeon (1), physiotherapists (3), a podiatrist (1) and an occupational therapist (1). The emerging framework derived from patient data included eight themes (Physical, Psychological, Daily Living, Social, Occupational and Domestic, Financial, Aesthetic and Medication Taking). Health professional responses did not reveal any additional themes, but tended to focus on physical and occupational themes. Conclusions: The nature of life impact following ankle fractures can extend beyond short term pain and discomfort into many areas of life. The findings from this research have provided an empirically derived framework from which a condition-specific patient-reported outcome measure can be developed.
Abstract:
The use of Wireless Sensor Networks (WSNs) for Structural Health Monitoring (SHM) has become a promising approach due to many advantages such as low cost and fast, flexible deployment. However, inherent technical issues such as data synchronization error and data loss have prevented these systems from being extensively used. Recently, several SHM-oriented WSNs have been proposed and are believed to be able to overcome a large number of technical uncertainties. Nevertheless, there is limited research verifying the applicability of those WSNs with respect to demanding SHM applications such as modal analysis and damage identification. This paper first presents a brief review of the most inherent uncertainties of SHM-oriented WSN platforms and then investigates their effects on the outcomes and performance of the most robust Output-only Modal Analysis (OMA) techniques when employing merged data from multiple tests. The two OMA families selected for this investigation are Frequency Domain Decomposition (FDD) and Data-driven Stochastic Subspace Identification (SSI-data), because both have been widely applied in the past decade. Experimental accelerations collected by a wired sensory system on a large-scale laboratory bridge model are initially used as clean data before being contaminated by different data pollutants in a sequential manner to simulate practical SHM-oriented WSN uncertainties. The results of this study show the robustness of FDD and the precautions needed for the SSI-data family when dealing with SHM-WSN uncertainties. Finally, the use of measurement channel projection for the time-domain OMA techniques and the preferred combination of OMA techniques to cope with SHM-WSN uncertainties are recommended.
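As an illustration of the FDD family referred to above, the sketch below builds the cross power spectral density matrix from multi-channel acceleration records and extracts its first singular value at each frequency line, whose peaks indicate candidate modes. Function and variable names are assumptions, not the paper's code.

```python
# Minimal sketch of the core Frequency Domain Decomposition (FDD) step.
import numpy as np
from scipy.signal import csd

def fdd_first_singular_values(acc, fs, nperseg=1024):
    """acc: channels x samples array of accelerations; fs: sampling rate in Hz."""
    n_ch = acc.shape[0]
    # Cross power spectral density between every pair of channels.
    f, _ = csd(acc[0], acc[0], fs=fs, nperseg=nperseg)
    g = np.zeros((len(f), n_ch, n_ch), dtype=complex)
    for i in range(n_ch):
        for j in range(n_ch):
            _, g[:, i, j] = csd(acc[i], acc[j], fs=fs, nperseg=nperseg)
    # First singular value of G(f) at each frequency line; peaks of this
    # curve indicate candidate modal frequencies.
    s1 = np.array([np.linalg.svd(g[k], compute_uv=False)[0] for k in range(len(f))])
    return f, s1
```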