895 resultados para Data linkage
Resumo:
Recently, vision-based systems have been deployed in professional sports to track the ball and players to enhance analysis of matches. Due to their unobtrusive nature, vision-based approaches are preferred to wearable sensors (e.g. GPS or RFID sensors) as it does not require players or balls to be instrumented prior to matches. Unfortunately, in continuous team sports where players need to be tracked continuously over long-periods of time (e.g. 35 minutes in field-hockey or 45 minutes in soccer), current vision-based tracking approaches are not reliable enough to provide fully automatic solutions. As such, human intervention is required to fix-up missed or false detections. However, in instances where a human can not intervene due to the sheer amount of data being generated - this data can not be used due to the missing/noisy data. In this paper, we investigate two representations based on raw player detections (and not tracking) which are immune to missed and false detections. Specifically, we show that both team occupancy maps and centroids can be used to detect team activities, while the occupancy maps can be used to retrieve specific team activities. An evaluation on over 8 hours of field hockey data captured at a recent international tournament demonstrates the validity of the proposed approach.
Resumo:
The promise of ‘big data’ has generated a significant deal of interest in the development of new approaches to research in the humanities and social sciences, as well as a range of important critical interventions which warn of an unquestioned rush to ‘big data’. Drawing on the experiences made in developing innovative ‘big data’ approaches to social media research, this paper examines some of the repercussions for the scholarly research and publication practices of those researchers who do pursue the path of ‘big data’–centric investigation in their work. As researchers import the tools and methods of highly quantitative, statistical analysis from the ‘hard’ sciences into computational, digital humanities research, must they also subscribe to the language and assumptions underlying such ‘scientificity’? If so, how does this affect the choices made in gathering, processing, analysing, and disseminating the outcomes of digital humanities research? In particular, is there a need to rethink the forms and formats of publishing scholarly work in order to enable the rigorous scrutiny and replicability of research outcomes?
Resumo:
It is only in recent years that the critical role that spatial data can play in disaster management and strengthening community resilience has been recognised. The recognition of this importance is singularly evident from the fact that in Australia spatial data is considered as soft infrastructure. In the aftermath of every disaster this importance is being increasingly strengthened with state agencies paying greater attention to ensuring the availability of accurate spatial data based on the lessons learnt. For example, the major flooding in Queensland during the summer of 2011 resulted in a comprehensive review of responsibilities and accountability for the provision of spatial information during such natural disasters. A high level commission of enquiry completed a comprehensive investigation of the 2011 Brisbane flood inundation event and made specific recommendations concerning the collection of and accessibility to spatial information for disaster management and for strengthening community resilience during and after a natural disaster. The lessons learnt and processes implemented were subsequently tested by natural disasters during subsequent years. This paper provides an overview of the practical implementation of the recommendations of the commission of enquiry. It focuses particularly on the measures adopted by the state agencies with the primary role for managing spatial data and the evolution of this role in Queensland State, Australia. The paper concludes with a review of the development of the role and the increasing importance of spatial data as an infrastructure for disaster planning and management which promotes the strengthening of community resilience.
Resumo:
Bioacoustic data can provide an important base for environmental monitoring. To explore a large amount of field recordings collected, an automated similarity search algorithm is presented in this paper. A region of an audio defined by frequency and time bounds is provided by a user; the content of the region is used to construct a query. In the retrieving process, our algorithm will automatically scan through recordings to search for similar regions. In detail, we present a feature extraction approach based on the visual content of vocalisations – in this case ridges, and develop a generic regional representation of vocalisations for indexing. Our feature extraction method works best for bird vocalisations showing ridge characteristics. The regional representation method allows the content of an arbitrary region of a continuous recording to be described in a compressed format.
Resumo:
Techniques to improve the automated analysis of natural and spontaneous facial expressions have been developed. The outcome of the research has applications in several fields including national security (eg: expression invariant face recognition); education (eg: affect aware interfaces); mental and physical health (eg: depression and pain recognition).
Resumo:
This project was a step forward in developing intrusion detection systems in distributed environments such as web services. It investigates a new approach of detection based on so-called "taint-marking" techniques and introduces a theoretical framework along with its implementation in the Linux kernel.
Resumo:
BACKGROUND: Observational data suggested that supplementation with vitamin D could reduce risk of infection, but trial data are inconsistent. OBJECTIVE: We aimed to examine the effect of oral vitamin D supplementation on antibiotic use. DESIGN: We conducted a post hoc analysis of data from pilot D-Health, which is a randomized trial carried out in a general community setting between October 2010 and February 2012. A total of 644 Australian residents aged 60-84 y were randomly assigned to receive monthly doses of a placebo (n = 214) or 30,000 (n = 215) or 60,000 (n = 215) IU oral cholecalciferol for ≤12 mo. Antibiotics prescribed during the intervention period were ascertained by linkage with pharmacy records through the national health insurance scheme (Medicare Australia). RESULTS: People who were randomly assigned 60,000 IU cholecalciferol had nonsignificant 28% lower risk of having antibiotics prescribed at least once than did people in the placebo group (RR: 0.72; 95% CI: 0.48, 1.07). In analyses stratified by age, in subjects aged ≥70 y, there was a significant reduction in antibiotic use in the high-dose vitamin D compared with placebo groups (RR: 0.53; 95% CI: 0.32, 0.90), whereas there was no effect in participants <70 y old (RR: 1.07; 95% CI: 0.58, 1.97) (P-interaction = 0.1). CONCLUSION: Although this study was a post hoc analysis and statistically nonsignificant, this trial lends some support to the hypothesis that supplementation with 60,000 IU vitamin D/mo is associated with lower risk of infection, particularly in older adults. The trial was registered at the Australian New Zealand Clinical Trials Registry (anzctr.org.au) as ACTRN12609001063202.
Resumo:
Endometrial cancer is one of the most common female diseases in developed nations and is the most commonly diagnosed gynaecological cancer in Australia. The disease is commonly classified by histology: endometrioid or non-endometrioid endometrial cancer. While non-endometrioid endometrial cancers are accepted to be high-grade, aggressive cancers, endometrioid cancers (comprising 80% of all endometrial cancers diagnosed) generally carry a favourable patient prognosis. However, endometrioid endometrial cancer patients endure significant morbidity due to surgery and radiotherapy used for disease treatment, and patients with recurrent disease have a 5-year survival rate of less than 50%. Genetic analysis of women with endometrial cancer could uncover novel markers associated with disease risk and/or prognosis, which could then be used to identify women at high risk and for the use of specialised treatments. Proteases are widely accepted to play an important role in the development and progression of cancer. This PhD project hypothesised that SNPs from two protease gene families, the matrix metalloproteases (MMPs, including their tissue inhibitors, TIMPs) and the tissue kallikrein-related peptidases (KLKs) would be associated with endometrial cancer susceptibility and/or prognosis. In the first part of this study, optimisation of the genotyping techniques was performed. Results from previously published endometrial cancer genetic association studies were attempted to be validated in a large, multicentre replication set (maximum cases n = 2,888, controls n = 4,483, 3 studies). The rs11224561 progesterone receptor SNP (PGR, A/G) was observed to be associated with increased endometrial cancer risk (per A allele OR 1.31, 95% CI 1.12-1.53; p-trend = 0.001), a result which was initially reported among a Chinese sample set. Previously reported associations for the remaining 8 SNPs investigated for this section of the PhD study were not confirmed, thereby reinforcing the importance of validation of genetic association studies. To examine the effect of SNPs from the MMP and KLK families on endometrial cancer risk, we selected the most significantly associated MMP and KLK SNPs from genome-wide association study analysis (GWAS) to be genotyped in the GWAS replication set (cases n = 4,725, controls n = 9,803, 13 studies). The significance of the MMP24 rs932562 SNP was unchanged after incorporation of the stage 2 samples (Stage 1 per allele OR 1.18, p = 0.002; Combined Stage 1 and 2 OR 1.09, p = 0.002). The rs10426 SNP, located 3' to KLK10 was predicted by bioinformatic analysis to effect miRNA binding. This SNP was observed in the GWAS stage 1 result to exhibit a recessive effect on endometrial cancer risk, a result which was not validated in the stage 2 sample set (Stage 1 OR 1.44, p = 0.007; Combined Stage 1 and 2 OR 1.14, p = 0.08). Investigation of the regions imputed surrounding the MMP, TIMP and KLK genes did not reveal any significant targets for further analysis. Analysis of the case data from the endometrial cancer GWAS to identify genetic variation associated with cancer grade did not reveal SNPs from the MMP, TIMP or KLK genes to be statistically significant. However, the representation of SNPs from the MMP, TIMP and KLK families by the GWAS genotyping platform used in this PhD project was examined and observed to be very low, with the genetic variation of four genes (MMP23A, MMP23B, MMP28 and TIMP1) not captured at all by this technique. This suggests that comprehensive candidate gene association studies will be required to assess the role of SNPs from these genes with endometrial cancer risk and prognosis. Meta-analysis of gene expression microarray datasets curated as part of this PhD study identified a number of MMP, TIMP and KLK genes to display differential expression by endometrial cancer status (MMP2, MMP10, MMP11, MMP13, MMP19, MMP25 and KLK1) and histology (MMP2, MMP11, MMP12, MMP26, MMP28, TIMP2, TIMP3, KLK6, KLK7, KLK11 and KLK12). In light of these findings these genes should be prioritised for future targeted genetic association studies. Two SNPs located 43.5 Mb apart on chromosome 15 were observed from the GWAS analysis to be associated with increased endometrial cancer grade, results that were validated in silico in two independent datasets. One of these SNPs, rs8035725 is located in the 5' untranslated region of a MYC promoter binding protein DENND4A (Stage 1 OR 1.15, p = 9.85 x 10P -5 P, combined Stage 1 and in silico validation OR 1.13, p = 5.24 x 10P -6 P). This SNP has previously been reported to alter the expression of PTPLAD1, a gene involved in the synthesis of very long fatty acid chains and in the Rac1 signaling pathway. Meta-analysis of gene expression microarray data found PTPLAD1 to display increased expression in the aggressive non-endometrioid histology compared with endometrioid endometrial cancer, suggesting that the causal SNP underlying the observed genetic association may influence expression of this gene. Neither rs8035725 nor significant SNPs identified by imputation were predicted bioinformatically to affect transcription factor binding sites, indicating that further studies are required to assess their potential effect on other regulatory elements. The other grade- associated SNP, rs6606792, is located upstream of an inferred pseudogene, ELMO2P1 (Stage 1 OR 1.12, p = 5 x 10P -5 P; combined Stage 1 and in silico validation OR 1.09, p = 3.56 x 10P -5 P). Imputation of the ±1 Mb region surrounding this SNP revealed a cluster of significantly associated variants which are predicted to abolish various transcription factor binding sites, and would be expected to decrease gene expression. ELMO2P1 was not included on the microarray platforms collected for this PhD, and so its expression could not be investigated. However, the high sequence homology of ELMO2P1 with ELMO2, a gene important to cell motility, indicates that ELMO2 could be the parent gene for ELMO2P1 and as such, ELMO2P1 could function to regulate the expression of ELMO2. Increased expression of ELMO2 was seen to be associated with increasing endometrial cancer grade, as well as with aggressive endometrial cancer histological subtypes by microarray meta-analysis. Thus, it is hypothesised that SNPs in linkage disequilibrium with rs6606792 decrease the transcription of ELMO2P1, reducing the regulatory effect of ELMO2P1 on ELMO2 expression. Consequently, ELMO2 expression is increased, cell motility is enhanced leading to an aggressive endometrial cancer phenotype. In summary, these findings have identified several areas of research for further study. The results presented in this thesis provide evidence that a SNP in PGR is associated with risk of developing endometrial cancer. This PhD study also reports two independent loci on chromosome 15 to be associated with increased endometrial cancer grade, and furthermore, genes associated with these SNPs to be differentially expressed according in aggressive subtypes and/or by grade. The studies reported in this thesis support the need for comprehensive SNP association studies on prioritised MMP, TIMP and KLK genes in large sample sets. Until these studies are performed, the role of MMP, TIMP and KLK genetic variation remains unclear. Overall, this PhD study has contributed to the understanding of genetic variation involvement in endometrial cancer susceptibility and prognosis. Importantly, the genetic regions highlighted in this study could lead to the identification of novel gene targets to better understand the biology of endometrial cancer and also aid in the development of therapeutics directed at treating this disease.
Resumo:
Purpose Intensity modulated radiotherapy (IMRT) treatments require more beam-on time and produce more linac head leakage to deliver similar doses to conventional, unmodulated, radiotherapy treatments. It is necessary to take this increased leakage into account when evaluating the results of radiation surveys around bunkers that are, or will be, used for IMRT. The recommended procedure of 15 applying a monitor-unit based workload correction factor to secondary barrier survey measurements, to account for this increased leakage when evaluating radiation survey measurements around IMRT bunkers, can lead to potentially-costly over estimation of the required barrier thickness. This study aims to provide initial guidance on the validity of reducing the value of the correction factor when applied to different radiation barriers (primary barriers, doors, maze walls and other walls) by 20 evaluating three different bunker designs. Methods Radiation survey measurements of primary, scattered and leakage radiation were obtained at each of five survey points around each of three different radiotherapy bunkers and the contribution of leakage to the total measured radiation dose at each point was evaluated. Measurements at each survey point were made with the linac gantry set to 12 equidistant positions from 0 to 330o, to 25 assess the effects of radiation beam direction on the results. Results For all three bunker designs, less than 0.5% of dose measured at and alongside the primary barriers, less than 25% of the dose measured outside the bunker doors and up to 100% of the dose measured outside other secondary barriers was found to be caused by linac head leakage. Conclusions Results of this study suggest that IMRT workload corrections are unnecessary, for 30 survey measurements made at and alongside primary barriers. Use of reduced IMRT workload correction factors is recommended when evaluating survey measurements around a bunker door, provided that a subset of the measurements used in this study are repeated for the bunker in question. Reduction of the correction factor for other secondary barrier survey measurements is not recommended unless the contribution from leakage is separetely evaluated.
Resumo:
This paper evaluates the efficiency of a number of popular corpus-based distributional models in performing discovery on very large document sets, including online collections. Literature-based discovery is the process of identifying previously unknown connections from text, often published literature, that could lead to the development of new techniques or technologies. Literature-based discovery has attracted growing research interest ever since Swanson's serendipitous discovery of the therapeutic effects of fish oil on Raynaud's disease in 1986. The successful application of distributional models in automating the identification of indirect associations underpinning literature-based discovery has been heavily demonstrated in the medical domain. However, we wish to investigate the computational complexity of distributional models for literature-based discovery on much larger document collections, as they may provide computationally tractable solutions to tasks including, predicting future disruptive innovations. In this paper we perform a computational complexity analysis on four successful corpus-based distributional models to evaluate their fit for such tasks. Our results indicate that corpus-based distributional models that store their representations in fixed dimensions provide superior efficiency on literature-based discovery tasks.
Resumo:
This research aims to develop a reliable density estimation method for signalised arterials based on cumulative counts from upstream and downstream detectors. In order to overcome counting errors associated with urban arterials with mid-link sinks and sources, CUmulative plots and Probe Integration for Travel timE estimation (CUPRITE) is employed for density estimation. The method, by utilizing probe vehicles’ samples, reduces or cancels the counting inconsistencies when vehicles’ conservation is not satisfied within a section. The method is tested in a controlled environment, and the authors demonstrate the effectiveness of CUPRITE for density estimation in a signalised section, and discuss issues associated with the method.
Resumo:
Numerous statements and declarations have been made over recent decades in support of open access to research data. The growing recognition of the importance of open access to research data has been accompanied by calls on public research funding agencies and universities to facilitate better access to publicly funded research data so that it can be re-used and redistributed as public goods. International and inter-governmental bodies such as the ICSU/CODATA, the OECD and the European Union are strong supporters of open access to and re-use of publicly funded research data. This thesis focuses on the research data created by university researchers in Malaysian public universities whose research activities are funded by the Federal Government of Malaysia. Malaysia, like many countries, has not yet formulated a policy on open access to and re-use of publicly funded research data. Therefore, the aim of this thesis is to develop a policy to support the objective of enabling open access to and re-use of publicly funded research data in Malaysian public universities. Policy development is very important if the objective of enabling open access to and re-use of publicly funded research data is to be successfully achieved. In developing the policy, this thesis identifies a myriad of legal impediments arising from intellectual property rights, confidentiality, privacy and national security laws, novelty requirements in patent law and lack of a legal duty to ensure data quality. Legal impediments such as these have the effect of restricting, obstructing, hindering or slowing down the objective of enabling open access to and re-use of publicly funded research data. A key focus in the formulation of the policy was the need to resolve the various legal impediments that have been identified. This thesis analyses the existing policies and guidelines of Malaysian public universities to ascertain to what extent the legal impediments have been resolved. An international perspective is adopted by making a comparative analysis of the policies of public research funding agencies and universities in the United Kingdom, the United States and Australia to understand how they have dealt with the identified legal impediments. These countries have led the way in introducing policies which support open access to and re-use of publicly funded research data. As well as proposing a policy supporting open access to and re-use of publicly funded research data in Malaysian public universities, this thesis provides procedures for the implementation of the policy and guidelines for addressing the legal impediments to open access and re-use.
Resumo:
Collecting regular personal reflections from first year teachers in rural and remote schools is challenging as they are busily absorbed in their practice, and separated from each other and the researchers by thousands of kilometres. In response, an innovative web-based solution was designed to both collect data and be a responsive support system for early career teachers as they came to terms with their new professional identities within rural and remote school settings. Using an emailed link to a web-based application named goingok.com, the participants are charting their first year plotlines using a sliding scale from ‘distressed’, ‘ok’ to ‘soaring’ and describing their self-assessment in short descriptive posts. These reflections are visible to the participants as a developing online journal, while the collections of de-identified developing plotlines are visible to the research team, alongside numerical data. This paper explores important aspects of the design process, together with the challenges and opportunities encountered in its implementation. A number of the key considerations for choosing to develop a web application for data collection are initially identified, and the resultant application features and scope are then examined. Examples are then provided about how a responsive software development approach can be part of a supportive feedback loop for participants while being an effective data collection process. Opportunities for further development are also suggested with projected implications for future research.
Resumo:
In contemporary game development circles the ‘game making jam’ has become an important rite of passage and baptism event, an exploration space and a central indie lifestyle affirmation and community event. Game jams have recently become a focus for design researchers interested in the creative process. In this paper we tell the story of an established local game jam and our various documentation and data collection methods. We present the beginnings of the current project, which seeks to map the creative teams and their process in the space of the challenge, and which aims to enable participants to be more than the objects of the data collection. A perceived issue is that typical documentation approaches are ‘about’ the event as opposed to ‘made by’ the participants and are thus both at odds with the spirit of the jam as a phenomenon and do not really access the rich playful potential of participant experience. In the data collection and visualisation projects described here, we focus on using collected data to re-include the participants in telling stories about their experiences of the event as a place-based experience. Our goal is to find a means to encourage production of ‘anecdata’ - data based on individual story telling that is subjective, malleable, and resists collection via formal mechanisms - and to enable mimesis, or active narrating, on the part of the participants. We present a concept design for data as game based on the logic of early medieval maps and we reflect on how we could enable participation in the data collection itself.
Resumo:
A 3-year longitudinal study Transforming Children’s Mathematical and Scientific Development integrates, through data modelling, a pedagogical approach focused on mathematical patterns and structural relationships with learning in science. As part of this study, a purposive sample of 21 highly able Grade 1 students was engaged in an innovative data modelling program. In the majority of students, representational development was observed. Their complex graphs depicting categorical and continuous data revealed a high level of structure and enabled identification of structural features critical to this development.