877 results for user data
Abstract:
We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models on prostate cancer progression. The latter application shows that with a combined statistical/bioinformatics analysis, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.
Abstract:
Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n^2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task.
Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
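As a rough illustration of the intrinsic/extrinsic distinction drawn in the abstract above, the following Python sketch computes distortion (sum of squared distances to the assigned centroid) and purity (agreement with human labels) for one clustering. The function names, the toy vectors and labels, and the choice of purity as the extrinsic measure are all assumptions made for illustration; the abstract does not name a specific extrinsic metric.

```python
from collections import Counter, defaultdict


def distortion(vectors, assignments, centroids):
    """Intrinsic measure: sum of squared Euclidean distances to the assigned centroid."""
    total = 0.0
    for vec, cluster in zip(vectors, assignments):
        centre = centroids[cluster]
        total += sum((v - c) ** 2 for v, c in zip(vec, centre))
    return total


def purity(assignments, ground_truth):
    """Extrinsic measure: fraction of documents carrying their cluster's majority label."""
    labels_by_cluster = defaultdict(list)
    for cluster, label in zip(assignments, ground_truth):
        labels_by_cluster[cluster].append(label)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1]
        for labels in labels_by_cluster.values()
    )
    return majority_total / len(assignments)


# Hypothetical toy data: four 2-D document vectors in two clusters.
vectors = [(0.0, 0.0), (0.0, 2.0), (4.0, 4.0), (4.0, 6.0)]
assignments = [0, 0, 1, 1]
centroids = {0: (0.0, 1.0), 1: (4.0, 5.0)}
ground_truth = ["sport", "sport", "sport", "politics"]  # assumed human labels

print(distortion(vectors, assignments, centroids))  # 4.0
print(purity(assignments, ground_truth))            # 0.75
```

Distortion depends on the vector space the documents are embedded in, so its value changes if the representation changes. Purity depends only on cluster membership and the labels, which is why it can be compared across representations.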
Abstract:
Data breach notification laws require organisations to notify affected persons or regulatory authorities when an unauthorised acquisition of personal data occurs. Most laws provide a safe harbour to this obligation if acquired data has been encrypted. There are three types of safe harbour: an exemption; a rebuttable presumption and factor-based analysis. We demonstrate, using three condition-based scenarios, that the broad formulation of most encryption safe harbours is based on the flawed assumption that encryption is the silver bullet for personal information protection. We then contend that reliance upon an encryption safe harbour should be dependent upon a rigorous and competent risk-based review that is required on a case-by-case basis. Finally, we recommend the use of both an encryption safe harbour and a notification trigger as our preferred choice for a data breach notification regulatory framework.
Abstract:
The advent of data breach notification laws in the United States (US) has unearthed a significant problem involving the mismanagement of personal information by a range of public and private sector organisations. At present, there is no statutory obligation under Australian law requiring public or private sector organisations to report a data breach of personal information to law enforcement agencies or affected persons. However, following a comprehensive review of Australian privacy law, the Australian Law Reform Commission (ALRC) has recommended the introduction of a mandatory data breach notification scheme. The issue of data breach notification has ignited fierce debate amongst stakeholders, especially larger private sector entities. The purpose of this article is to document the perspectives of key industry and government representatives to identify their standpoints regarding an appropriate regulatory approach to data breach notification in Australia.
Abstract:
Public and private sector organisations are now able to capture and utilise data on a vast scale, thus heightening the importance of adequate measures for protecting against unauthorised disclosure of personal information. In this respect, data breach notification has emerged as an issue of increasing importance throughout the world. It has been the subject of law reform in the United States and in other jurisdictions. This article reviews US, Australian and EU legal developments regarding the mandatory notification of data breaches. The authors highlight areas of concern based on the extant US experience that require further consideration in Australia and in the EU.
Abstract:
The rapid growth in the number of online services leads to an increasing number of different digital identities each user needs to manage. As a result, many people feel overloaded with credentials, which in turn negatively impacts their ability to manage them securely. Passwords are perhaps the most common type of credential used today. To avoid the tedious task of remembering difficult passwords, users often behave less securely by using low-entropy, weak passwords. Weak passwords and bad password habits represent security threats to online services. Some solutions have been developed to eliminate the need for users to create and manage passwords. A typical solution is based on giving the user a hardware token that generates one-time passwords (OTPs), i.e. passwords for single session or transaction usage. Unfortunately, most of these solutions do not satisfy scalability and/or usability requirements, or they are simply insecure. In this paper, we propose a scalable OTP solution using mobile phones and based on trusted computing technology that combines enhanced usability with strong security.
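To make the idea of a one-time password concrete, here is a minimal Python sketch of the standard counter-based HOTP algorithm from RFC 4226. This is not the scheme proposed in the paper, whose mobile-phone and trusted-computing design is not described in the abstract. The shared secret shown is the RFC's published test key, and the function name is chosen here for illustration.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password as specified in RFC 4226."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the requested number of decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)


# Test key from RFC 4226, Appendix D (not a secret anyone should deploy).
secret = b"12345678901234567890"
print(hotp(secret, 0))  # 755224
print(hotp(secret, 1))  # 287082
```

Each counter value yields a different code, so a captured password cannot be replayed once the server has advanced its counter. Token and server must keep the shared secret and the counter in sync.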
Abstract:
Estimates of potential and actual C sequestration require areal information about various types of management activities. Forest surveys, land use data, and agricultural statistics contribute information enabling calculation of the impacts of current and historical land management on C sequestration in biomass (in forests) or in soil (in agricultural systems). Unfortunately, little information exists on the distribution of various management activities that can impact soil C content in grassland systems. Limited information of this type restricts our ability to carry out bottom-up estimates of the current C balance of grasslands or to assess the potential for grasslands to act as C sinks with changes in management. Here we review currently available information about grassland management, how that information could be related to information about the impacts of management on soil C stocks, information that may be available in the future, and needs that remain to be filled before in-depth assessments may be carried out. We also evaluate constraints induced by variability in information sources within and between countries. It is readily apparent that activity data for grassland management are collected less frequently and on a coarser scale than data for forest or agricultural inventories and that grassland activity data cannot be directly translated into IPCC-type factors as is done for IPCC inventories of agricultural soils. However, those management data that are available can serve to delineate broad-scale differences in management activities within regions in which soil C is likely to change in response to changes in management. This, coupled with the distinct possibility of more intensive surveys planned in the future, may enable more accurate assessments of grassland C dynamics with higher resolution both spatially and in the number of management activities.
Abstract:
We propose a digital rights management approach for sharing electronic health records in a health research facility and argue for the advantages of the approach. We also outline the system under development and our implementation of its security features, and discuss the challenges we faced and future directions.
Abstract:
This paper provides a review of the state-of-the-art work on the use of public mobile data networks for aircraft telemetry and control purposes. Moreover, it describes the characterisation, for airborne uses, of the public mobile data communication systems known broadly as 3G. The motivation for this study was to explore how these mature public communication systems could be used for aviation purposes. An experimental system was fitted to a light aircraft to record communication latency, line speed, RF level, packet loss and cell tower identifier. Communication was established using internet protocols, and a connection was made to a local server. The aircraft was flown in both remote and populous areas at altitudes up to 8500 ft in a region located in South East Queensland, Australia. Results show that the average airborne RF levels are better than those on the ground by 21% and in the order of -77 dBm. Latencies were in the order of 500 ms (half the latency of Iridium), with an average download speed of 0.48 Mb/s, an average uplink speed of 0.85 Mb/s and a packet loss of 6.5%. The maximum communication range was also observed to be 70 km from a single cell station. The paper also describes possible limitations and utility of using such a communications architecture for both manned and unmanned aircraft systems.
Abstract:
Many of the costs associated with greenfield residential development are apparent and tangible. For example, regulatory fees, government taxes, acquisition costs, selling fees, commissions and others are all relatively easily identified since they represent actual costs incurred at a given point in time. However, holding costs are not always immediately evident since, by contrast, they characteristically lack visibility. One reason for this is that, for the most part, they are typically assessed over time in an ever-changing environment. In addition, wide variations exist in development pipeline components: they typically span anywhere from two to over sixteen years, even if located within the same geographical region. Determination of the starting and end points, with regard to holding cost computation, can also prove problematic. Furthermore, the choice between application of prevailing inflation, or interest rates, or a combination of both over time, adds further complexity. Although research is emerging in these areas, a review of the literature reveals that attempts to identify holding cost components are limited. Their quantification (in terms of relative weight or proportionate cost to a development project) is even less apparent; in fact, the computation and methodology behind the calculation of holding costs vary widely and are in some instances completely ignored. In addition, it may be demonstrated that ambiguities exist in terms of the inclusion of various elements of holding costs and assessment of their relative contribution. Yet their impact on housing affordability is widely acknowledged to be profound, with their quantification potentially maximising the opportunities for delivering affordable housing. This paper seeks to build on earlier investigations into those elements related to holding costs, providing theoretical modelling of the size of their impact - specifically on the end user.
At this point the research is reliant upon quantitative data sets; however, additional qualitative analysis (not included here) will be relevant to account for certain variations between expectations and actual outcomes achieved by developers. Although this research stops short of cross-referencing with a regional or international comparison study, an improved understanding of the relationship between holding costs, regulatory charges, and housing affordability results.
Abstract:
In November 2009 the researcher embarked on a project aimed at reducing the amount of paper used by Queensland University of Technology (QUT) staff in their daily workplace activities. The key goal was to communicate to staff that excessive printing has a tangible and negative effect on their workplace and local environment. The research objective was to better understand what motivates staff towards more ecologically sustainable printing practices, whilst meeting their job’s demands. The current study is built on previous research that found that one interface does not address the needs of all users when creating persuasive Human-Computer Interaction (HCI) interventions targeting resource consumption. In response, the current study created and trialled software that communicates individual paper consumption in precise metrics. Based on preliminary research data, different metric sets have been defined to address the different motivations and beliefs of user archetypes using descriptive and injunctive normative information.
Abstract:
Now in its second edition, this book describes tools that are commonly used in transportation data analysis. The first part of the text provides statistical fundamentals while the second part presents continuous dependent variable models. With a focus on count and discrete dependent variable models, the third part features new chapters on mixed logit models, logistic regression, and ordered probability models. The last section provides additional coverage of Bayesian statistical modeling, including Bayesian inference and Markov chain Monte Carlo methods. Data sets are available online to use with the modeling techniques discussed.
Abstract:
National estimates of the prevalence of child abuse-related injuries are obtained from a variety of sectors including welfare, justice, and health, resulting in inconsistent estimates across sectors. The International Classification of Diseases (ICD) is used as the international standard for categorising health data and aggregating data for statistical purposes, though there has been limited validation of the quality, completeness or concordance of these data with other sectors. This research study examined the quality of documentation and coding of child abuse recorded in hospital records in Queensland and the concordance of these data with child welfare records. A retrospective medical record review was used to examine the clinical documentation of over 1000 hospitalised injured children from 20 hospitals in Queensland. A data linkage methodology was used to link these records with records in the child welfare database. Cases were sampled from three sub-groups according to the presence of target ICD codes: definite abuse, possible abuse, and unintentional injury. Fewer than 2% of cases coded as being unintentional were recoded after review as being possible abuse, and only 5% of cases coded as possible abuse cases were reclassified as unintentional, though there was greater variation in the classification of cases as definite abuse compared to possible abuse. Concordance of health data with child welfare data varied across patient subgroups. This study will inform the development of strategies to improve the quality, consistency and concordance of information between health and welfare agencies to ensure adequate system responses to children at risk of abuse.
Abstract:
Emergency departments (EDs) are often the first point of contact with an abused child. Despite legal mandate, the reporting of definite or suspected abusive injury to child safety authorities by ED clinicians varies due to a number of factors including training, access to child safety professionals, departmental culture and a fear of ‘getting it wrong’. This study examined the quality of documentation and coding of child abuse captured by ED-based injury surveillance data and ED medical records in the state of Queensland and the concordance of these data with child welfare records. A retrospective medical record review was used to examine the clinical documentation of almost 1000 injured children included in the Queensland Injury Surveillance Unit (QISU) database from 10 hospitals in urban and rural centres. Independent experts re-coded the records based on their review of the notes. A data linkage methodology was then used to link these records with records in the state government’s child welfare database. Cases were sampled from three sub-groups according to the surveillance intent codes: maltreatment by parent, undetermined, and unintentional injury. Only 0.1% of cases coded as unintentional injury were recoded to maltreatment by parent, while 1.2% of cases coded as maltreatment by parent were reclassified as unintentional and 5% of cases where the intent was undetermined by the triage nurse were recoded as maltreatment by parent. Quality of documentation varied across type of hospital (tertiary referral centre, children’s, urban, regional and remote). Concordance of health data with child welfare data varied across patient subgroups. Outcomes from this research will guide initiatives to improve the quality of intentional child injury surveillance systems.