679 results for Data breach notification law
Abstract:
This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of the recorded zeroes. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts, and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic maximum likelihood estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
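The path sampling idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the thesis's IMCS implementation. It assumes a simplified two-parameter binary lattice model with unnormalized density exp(alpha * sum of x_i + beta * sum over neighbour pairs of x_i * x_j), a 3 x 3 lattice, and exact enumeration in place of MCMC, all of which are illustrative choices. It relies on the standard exponential-family identity that the derivative of log Z with respect to beta equals the expected pair statistic, so integrating that expectation along a path in beta gives the log ratio of normalization constants. All function names are hypothetical.

```python
import itertools
import math


def neighbour_pairs(n):
    """Horizontally and vertically adjacent site pairs on an n x n lattice."""
    pairs = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                pairs.append((r * n + c, r * n + c + 1))
            if r + 1 < n:
                pairs.append((r * n + c, (r + 1) * n + c))
    return pairs


def canonical_stats(n):
    """(sum of x_i, sum over neighbour pairs of x_i * x_j) for every configuration."""
    pairs = neighbour_pairs(n)
    return [
        (sum(x), sum(x[i] * x[j] for i, j in pairs))
        for x in itertools.product((0, 1), repeat=n * n)
    ]


def log_z(alpha, beta, table):
    """Exact log normalization constant by enumeration (tiny lattices only)."""
    terms = [alpha * s1 + beta * s2 for s1, s2 in table]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def mean_pair_stat(alpha, beta, table):
    """Exact expectation of the pair statistic under (alpha, beta)."""
    lz = log_z(alpha, beta, table)
    return sum(s2 * math.exp(alpha * s1 + beta * s2 - lz) for s1, s2 in table)


def path_log_ratio(alpha, beta0, beta1, table, steps=200):
    """Trapezoid-rule integral of the mean pair statistic from beta0 to beta1."""
    h = (beta1 - beta0) / steps
    vals = [mean_pair_stat(alpha, beta0 + k * h, table) for k in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


table = canonical_stats(3)  # 2**9 = 512 configurations
exact = log_z(-0.5, 0.8, table) - log_z(-0.5, 0.0, table)
approx = path_log_ratio(-0.5, 0.0, 0.8, table)
print(exact, approx)
```

On a lattice of realistic size the expectation cannot be enumerated and would have to be estimated by simulation at each grid point, which is where the computational cost noted in the abstract arises.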
Abstract:
In the context of learning paradigms of identification in the limit, we address the question: why is uncertainty sometimes desirable? We use mind change bounds on the output hypotheses as a measure of uncertainty, and interpret ‘desirable’ as reduction in data memorization, also defined in terms of mind change bounds. The resulting model is closely related to iterative learning with bounded mind change complexity, but the dual use of mind change bounds, for hypotheses and for data, is a key distinctive feature of our approach. We show that situations exist where the more mind changes the learner is willing to accept, the smaller the amount of data it needs to remember in order to converge to the correct hypothesis. We also investigate relationships between our model and learning from good examples, set-driven, monotonic and strong-monotonic learners, as well as class-comprising versus class-preserving learnability.
Abstract:
Keyword spotting is the task of detecting keywords of interest within continuous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unrestricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have suffered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the field of keyword spotting. The first major contribution is the development of a novel keyword verification method named Cohort Word Verification. This method combines high level linguistic information with cohort-based verification techniques to obtain dramatic improvements in verification performance, in particular for the problematic short duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique augments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains significant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple verifier fusion for the task of keyword verification. The reported experiments demonstrate that substantial improvements in verification performance can be obtained through the fusion of multiple keyword verifiers. The research focuses on combinations of speech background model based verifiers and cohort word verifiers.
The final major contribution is a comprehensive study of the effects of limited training data for keyword spotting. This study is performed with consideration as to how these effects impact the immediate development and deployment of speech technologies for non-English languages.
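The abstract does not say which dynamic sequence matching technique Dynamic Match Lattice Spotting uses. As a generic illustration only, the sketch below computes a minimum edit distance between a target phone sequence and an observed phone sequence, so that a sequence containing a recognition error can still be matched when the distance is under a threshold. The function name, the unit costs, and the phone labels are assumptions made for this example and are not taken from the thesis.

```python
def edit_distance(target, observed, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimum total cost of substitutions, insertions and deletions
    needed to turn `target` into `observed` (both are sequences of labels)."""
    # prev[j] is the cost of aligning the first i-1 target symbols
    # with the first j observed symbols; start with i = 0.
    prev = [j * ins_cost for j in range(len(observed) + 1)]
    for i, t in enumerate(target, start=1):
        curr = [i * del_cost]
        for j, o in enumerate(observed, start=1):
            match = prev[j - 1] + (0 if t == o else sub_cost)
            delete = prev[j] + del_cost      # drop target symbol t
            insert = curr[j - 1] + ins_cost  # accept extra observed symbol o
            curr.append(min(match, delete, insert))
        prev = curr
    return prev[-1]


# Hypothetical phone sequences: one vowel has been misrecognized.
target_phones = ["k", "ae", "t"]
observed_phones = ["k", "eh", "t"]
distance = edit_distance(target_phones, observed_phones)
print(distance)
```

A practical system would likely use phone-dependent costs, so that confusable phones are cheaper to substitute than dissimilar ones. The uniform costs here are only a placeholder.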
Abstract:
We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models of prostate cancer progression. The latter application shows that with a combined statistical/bioinformatics analysis, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.
Abstract:
With the growth of high-technology industries and knowledge intensive services, the pursuit of industrial competitiveness has progressed from a broad concern with the processes of industrialisation to a more focused analysis of the factors explaining cross-national variation in the level of participation in knowledge industries. From an examination of cross-national data, the paper develops the proposition that particular elements of the domestic science, technology and industry infrastructure—such as the stock of knowledge and competence in the economy, the capacity for learning and generation of new ideas and the capacity to commercialise new ideas—vary cross-nationally and are related to the level of participation of a nation in knowledge intensive activities. Existing understandings of the role of the state in promoting industrial competitiveness might be expanded to incorporate an analysis of the contribution of the state through the building of competencies in science, technology and industry. Keywords: Knowledge; economy; comparative public policy; innovation; science and technology policy
Abstract:
This project proposes a new conceptual framework for the regulation of social networks and virtual communities. By applying a model based upon the rule of law, this thesis addresses the growing tensions that revolve around the public use of private networks. This research examines the shortcomings of traditional contractual governance models and cyberlaw theory and provides a reconstituted approach that will allow public constitutional-type interests to be recognised in the interpretation and enforcement of contractual doctrine.
Abstract:
Estimates of potential and actual C sequestration require areal information about various types of management activities. Forest surveys, land use data, and agricultural statistics contribute information enabling calculation of the impacts of current and historical land management on C sequestration in biomass (in forests) or in soil (in agricultural systems). Unfortunately, little information exists on the distribution of various management activities that can impact soil C content in grassland systems. Limited information of this type restricts our ability to carry out bottom-up estimates of the current C balance of grasslands or to assess the potential for grasslands to act as C sinks with changes in management. Here we review currently available information about grassland management, how that information could be related to information about the impacts of management on soil C stocks, information that may be available in the future, and needs that remain to be filled before in-depth assessments may be carried out. We also evaluate constraints induced by variability in information sources within and between countries. It is readily apparent that activity data for grassland management are collected less frequently and on a coarser scale than data for forest or agricultural inventories and that grassland activity data cannot be directly translated into IPCC-type factors as is done for IPCC inventories of agricultural soils. However, those management data that are available can serve to delineate broad-scale differences in management activities within regions in which soil C is likely to change in response to changes in management. This, coupled with the distinct possibility of more intensive surveys planned in the future, may enable more accurate assessments of grassland C dynamics with higher resolution both spatially and in the number of management activities.
Abstract:
We propose a digital rights management approach for sharing electronic health records in a health research facility and argue for its advantages. We also outline the system under development and our implementation of its security features, and discuss the challenges we faced and future directions.
Abstract:
Marinas currently exist primarily to service recreational boats, and these vessels are a potential cause of both problems and opportunities in environmental management. On the one hand, destructive fuel and other pollutants may be expelled, boat wakes can cause littoral soil erosion, physical damage results from collisions with marine life, and litter and noise pollution occur in otherwise pristine habitat. On the other hand, boats provide access to otherwise inaccessible natural environments for educational and other management reasons. In this study, boat traffic at three large marinas located along the Queensland coastline has been field surveyed for introductory information. No attempt was made at this juncture to survey the behaviour of the boat crews and passengers (concerning actual destinations, activities on board, etc.) or to survey the recreational boat industry. Such studies rely on boat registration records and personal questionnaires. Some other surveys relating to fishing draw on boat ramp surveys and direct submissions by recreational fishers; these provide some data on daily usage of boat ramps, but without particular attention to boats. We believe field observations of overall boat activities in the water are necessary for environmental management purposes. The aim of the survey was to provide information to help prioritize the potential impacts that boats’ activities have on the surrounding natural environment. Any impact by boats will be a product of their numbers, size, frequency of movement, carrying capacity and routes/destinations. The severity of impacts will dictate the appropriate management action.
Abstract:
This paper reviews state-of-the-art work on the use of public mobile data networks for aircraft telemetry and control purposes. Moreover, it describes the characterisation for airborne use of the public mobile data communication systems known broadly as 3G. The motivation for this study was to explore how these mature public communication systems could be used for aviation purposes. An experimental system was fitted to a light aircraft to record communication latency, line speed, RF level, packet loss and cell tower identifier. Communications were established using internet protocols, and a connection was made to a local server. The aircraft was flown in both remote and populous areas at altitudes up to 8500 ft in a region located in South East Queensland, Australia. Results show that the average airborne RF levels are better than those on the ground by 21% and in the order of -77 dBm. Latencies were in the order of 500 ms (half the latency of Iridium), with an average download speed of 0.48 Mb/s, an average uplink speed of 0.85 Mb/s, and a packet loss of 6.5%. The maximum communication range was observed to be 70 km from a single cell station. The paper also describes possible limitations and the utility of using such a communications architecture for both manned and unmanned aircraft systems.
Abstract:
The broad objective of this study was to understand the incidence and severity of aggression among sexually abused girls who were trafficked and who were then further used for commercial sexual exploitation (referred to subsequently as sexually abused trafficked girls). In addition, the impact of counseling for minimizing aggression in these girls was investigated. A group of 120 sexually abused trafficked Indian girls and a group of 120 nonsexually abused Indian girls, aged 13 to 18, participated in the study. The sexually abused trafficked girls were purposively selected from four shelters located in and around Kolkata, India. The nonsexually abused girls were selected randomly from four schools situated near the shelters, and these girls were matched by age with the sexually abused trafficked girls. Data were collected using a Background Information Schedule and a standardized psychological test, that is, The Aggression Scale. Results revealed that 16.7% of the girls were first sexually abused between 6 and 9 years of age, 37.5% between 10 and 13 years of age, and 45.8% between 14 and 17 years of age. Findings further revealed that 4.2% of the sexually abused trafficked girls demonstrated saturated aggression, and 26.7% were highly aggressive, that is, extremely frustrated and rebellious. Across age groups, the sexually abused trafficked girls suffered from more aggression (p < .05), compared with the nonvictimized girls. Psychological interventions, such as individual and group counseling, were found to have a positive impact on the sexually abused trafficked girls. These findings should motivate counselors to deal with sexually abused children. It is also hoped that authorities in welfare homes will understand the importance of counseling for sexually abused trafficked children, and will appoint more counselors for this purpose.