747 results for statistical framework
in Queensland University of Technology - ePrints Archive
Abstract:
This paper presents a robust stochastic framework for the incorporation of visual observations into conventional estimation, data fusion, navigation and control algorithms. The representation combines Isomap, a non-linear dimensionality reduction algorithm, with expectation maximization, a statistical learning scheme. The joint probability distribution of this representation is computed offline based on existing training data. The training phase of the algorithm results in a nonlinear and non-Gaussian likelihood model of natural features conditioned on the underlying visual states. This generative model can be used online to instantiate likelihoods corresponding to observed visual features in real time. The instantiated likelihoods are expressed as a Gaussian mixture model and are conveniently integrated within existing non-linear filtering algorithms. Example applications based on real visual data from heterogeneous, unstructured environments demonstrate the versatility of the generative models.
Abstract:
This report presents the final deliverable from the project titled ‘Conceptual and statistical framework for a water quality component of an integrated report card’, funded by the Marine and Tropical Sciences Research Facility (MTSRF; Project 3.7.7). The key management driver of this, and a number of other MTSRF projects concerned with indicator development, is the requirement for state and federal government authorities and other stakeholders to provide robust assessments of the present ‘state’ or ‘health’ of regional ecosystems in the Great Barrier Reef (GBR) catchments and adjacent marine waters. An integrated report card format that encompasses both biophysical and socioeconomic factors is an appropriate framework through which to deliver these assessments and meet a variety of reporting requirements. It is now well recognised that a ‘report card’ format for environmental reporting is very effective for community and stakeholder communication and engagement, and can be a key driver in galvanising community and political commitment and action. Although a report card needs to be understandable by all levels of the community, it also needs to be underpinned by sound, quality-assured science. In this regard, this project was to develop approaches to address the statistical issues that arise from amalgamation or integration of sets of discrete indicators into a final score or assessment of the state of the system. In brief, the two main issues are (1) selecting, measuring and interpreting specific indicators that vary both in space and time, and (2) integrating a range of indicators in such a way as to provide a succinct but robust overview of the state of the system. Although there is considerable research and knowledge of the use of indicators to inform the management of ecological, social and economic systems, methods for how best to integrate multiple disparate indicators remain poorly developed.
Therefore the objective of this project was to (i) focus on statistical approaches aimed at ensuring that estimates of individual indicators are as robust as possible, and (ii) present methods that can be used to report on the overall state of the system by integrating estimates of individual indicators. It was agreed at the outset that this project was to focus on developing methods for a water quality report card. This was driven largely by the requirements of the Reef Water Quality Protection Plan (RWQPP) and led to strong partner engagement with the Reef Water Quality Partnership.
Abstract:
Plant biosecurity requires statistical tools to interpret field surveillance data in order to manage pest incursions that threaten crop production and trade. Ultimately, management decisions need to be based on the probability that an area is infested or free of a pest. Current informal approaches to delimiting pest extent rely upon expert ecological interpretation of presence / absence data over space and time. Hierarchical Bayesian models provide a cohesive statistical framework that can formally integrate the available information on both pest ecology and data. The overarching method involves constructing an observation model for the surveillance data, conditional on the hidden extent of the pest and uncertain detection sensitivity. The extent of the pest is then modelled as a dynamic invasion process that includes uncertainty in ecological parameters. Modelling approaches to assimilate this information are explored through case studies on spiralling whitefly, Aleurodicus dispersus and red banded mango caterpillar, Deanolis sublimbalis. Markov chain Monte Carlo simulation is used to estimate the probable extent of pests, given the observation and process model conditioned by surveillance data. Statistical methods, based on time-to-event models, are developed to apply hierarchical Bayesian models to early detection programs and to demonstrate area freedom from pests. The value of early detection surveillance programs is demonstrated through an application to interpret surveillance data for exotic plant pests with uncertain spread rates. The model suggests that typical early detection programs provide a moderate reduction in the probability of an area being infested but a dramatic reduction in the expected area of incursions at a given time. Estimates of spiralling whitefly extent are examined at local, district and state-wide scales. 
The local model estimates the rate of natural spread and the influence of host architecture, host suitability and inspector efficiency. These parameter estimates can support the development of robust surveillance programs. Hierarchical Bayesian models for the human-mediated spread of spiralling whitefly are developed for the colonisation of discrete cells connected by a modified gravity model. By estimating dispersal parameters, the model can be used to predict the extent of the pest over time. An extended model predicts the climate restricted distribution of the pest in Queensland. These novel human-mediated movement models are well suited to demonstrating area freedom at coarse spatio-temporal scales. At finer scales, and in the presence of ecological complexity, exploratory models are developed to investigate the capacity for surveillance information to estimate the extent of red banded mango caterpillar. It is apparent that excessive uncertainty about observation and ecological parameters can impose limits on inference at the scales required for effective management of response programs. The thesis contributes novel statistical approaches to estimating the extent of pests and develops applications to assist decision-making across a range of plant biosecurity surveillance activities. Hierarchical Bayesian modelling is demonstrated as both a useful analytical tool for estimating pest extent and a natural investigative paradigm for developing and focussing biosecurity programs.
Abstract:
Robust hashing is an emerging field that can be used to hash certain data types in applications unsuitable for traditional cryptographic hashing methods. Traditional hashing functions have been used extensively for data/message integrity, data/message authentication, efficient file identification and password verification. These applications are possible because the hashing process is compressive, allowing for efficient comparisons in the hash domain, but non-invertible, meaning hashes can be used without revealing the original data. These techniques were developed with deterministic (non-changing) inputs such as files and passwords. For such data types a 1-bit or one-character change can be significant; as a result, the hashing process is sensitive to any change in the input. Unfortunately, there are certain applications where input data are not perfectly deterministic and minor changes cannot be avoided. Digital images and biometric features are two types of data where such changes exist but do not alter the meaning or appearance of the input. For such data types cryptographic hash functions cannot be usefully applied. In light of this, robust hashing has been developed as an alternative to cryptographic hashing and is designed to be robust to minor changes in the input. Although similar in name, robust hashing is fundamentally different from cryptographic hashing. Current robust hashing techniques are not based on cryptographic methods, but instead on pattern recognition techniques. Modern robust hashing algorithms consist of feature extraction followed by a randomization stage that introduces non-invertibility and compression, followed by quantization and binary encoding to produce a binary hash output. In order to preserve robustness of the extracted features, most randomization methods are linear and this is detrimental to the security aspects required of hash functions.
Furthermore, the quantization and encoding stages used to binarize real-valued features require the learning of appropriate quantization thresholds. How these thresholds are learnt has an important effect on hashing accuracy, and the mere presence of such thresholds is a source of information leakage that can reduce hashing security. This dissertation outlines a systematic investigation of the quantization and encoding stages of robust hash functions. While existing literature has focused on the importance of the quantization scheme, this research is the first to emphasise the importance of the quantizer training on both hashing accuracy and hashing security. The quantizer training process is presented in a statistical framework which allows a theoretical analysis of the effects of quantizer training on hashing performance. This is experimentally verified using a number of baseline robust image hashing algorithms over a large database of real world images. This dissertation also proposes a new randomization method for robust image hashing based on Higher Order Spectra (HOS) and Radon projections. The method is non-linear and this is an essential requirement for non-invertibility. The method is also designed to produce features more suited for quantization and encoding. The system can operate without the need for quantizer training, is more easily encoded and displays improved hashing performance when compared to existing robust image hashing algorithms. The dissertation also shows how the HOS method can be adapted to work with biometric features obtained from 2D and 3D face images.
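To make the quantization and encoding stage concrete, the following is a minimal sketch of one simple way to learn thresholds and binarize real-valued features. It is not the dissertation's method: the per-feature median threshold, the function names and the toy feature values are all assumptions chosen for illustration.

```python
import numpy as np

def train_thresholds(training_features):
    """Learn one threshold per feature (assumption: the per-feature median)."""
    return np.median(np.asarray(training_features, dtype=float), axis=0)

def binarize(features, thresholds):
    """Encode each real-valued feature as one bit: 1 if above its threshold."""
    return (np.asarray(features, dtype=float) > thresholds).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary hashes."""
    return int(np.count_nonzero(a != b))

# Hypothetical training features (rows = images, columns = features).
train = np.array([[0.1, 5.0], [0.3, 7.0], [0.5, 9.0]])
thresholds = train_thresholds(train)

original = binarize([0.40, 6.0], thresholds)
perturbed = binarize([0.41, 6.1], thresholds)  # small change in the input
distance = hamming(original, perturbed)
```

In this toy case the small perturbation leaves every bit unchanged, which is the robustness property the abstract describes. The stored thresholds themselves reveal something about the feature distribution, which is the kind of leakage the abstract refers to.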
Abstract:
The selection of optimal camera configurations (camera locations, orientations, etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we propose a statistical framework for the problem and a trans-dimensional simulated annealing algorithm to deal with it effectively. We compare our approach with a state-of-the-art method based on binary integer programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than two alternative heuristics designed to deal with the scalability issue of BIP. Last, we show the versatility of our approach using a number of specific scenarios.
Abstract:
In the commercial food industry, demonstration of microbiological safety and thermal process equivalence often involves a mathematical framework that assumes log-linear inactivation kinetics and invokes concepts of decimal reduction time (DT), z values, and accumulated lethality. However, many microbes, particularly spores, exhibit inactivation kinetics that are not log linear. This has led to alternative modeling approaches, such as the biphasic and Weibull models, that relax strong log-linear assumptions. Using a statistical framework, we developed a novel log-quadratic model, which approximates the biphasic and Weibull models and provides additional physiological interpretability. As a statistical linear model, the log-quadratic model is relatively simple to fit and straightforwardly provides confidence intervals for its fitted values. It allows a DT-like value to be derived, even from data that exhibit obvious "tailing." We also showed how existing models of non-log-linear microbial inactivation, such as the Weibull model, can fit into a statistical linear model framework that dramatically simplifies their solution. We applied the log-quadratic model to thermal inactivation data for the spore-forming bacterium Clostridium botulinum and evaluated its merits compared with those of popular previously described approaches. The log-quadratic model was used as the basis of a secondary model that can capture the dependence of microbial inactivation kinetics on temperature. This model, in turn, was linked to models of spore inactivation of Sapru et al. and Rodriguez et al. that posit different physiological states for spores within a population. We believe that the log-quadratic model provides a useful framework in which to test vitalistic and mechanistic hypotheses of inactivation by thermal and other processes. Copyright © 2009, American Society for Microbiology. All Rights Reserved.
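As a rough illustration of why a log-quadratic model is "relatively simple to fit", the sketch below fits log10 N(t) = b0 + b1*t + b2*t^2 by ordinary least squares. The exact parameterisation used in the paper is not given in the abstract, so this form, the function name and the synthetic data are assumptions.

```python
import numpy as np

def fit_log_quadratic(times, log10_counts):
    """Least-squares fit of log10 N(t) = b0 + b1*t + b2*t**2 (assumed form)."""
    t = np.asarray(times, dtype=float)
    y = np.asarray(log10_counts, dtype=float)
    # Design matrix of a linear model: intercept, linear and quadratic terms.
    X = np.column_stack([np.ones_like(t), t, t**2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical noise-free survivor curve with mild "tailing" (positive curvature).
t = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
y = 8.0 - 1.5 * t + 0.1 * t**2

b0, b1, b2 = fit_log_quadratic(t, y)
```

Because the model is linear in its coefficients, standard linear-model machinery (standard errors, confidence intervals for fitted values) applies directly, which is the convenience the abstract points to.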
Abstract:
Between-subject and within-subject variability is ubiquitous in biology and physiology and understanding and dealing with this is one of the biggest challenges in medicine. At the same time it is difficult to investigate this variability by experiments alone. A recent modelling and simulation approach, known as population of models (POM), allows this exploration to take place by building a mathematical model consisting of multiple parameter sets calibrated against experimental data. However, finding such sets within a high-dimensional parameter space of complex electrophysiological models is computationally challenging. By placing the POM approach within a statistical framework, we develop a novel and efficient algorithm based on sequential Monte Carlo (SMC). We compare the SMC approach with Latin hypercube sampling (LHS), a method commonly adopted in the literature for obtaining the POM, in terms of efficiency and output variability in the presence of a drug block through an in-depth investigation via the Beeler-Reuter cardiac electrophysiological model. We show improved efficiency via SMC and that it produces similar responses to LHS when making out-of-sample predictions in the presence of a simulated drug block.
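The Latin hypercube baseline mentioned above can be sketched as follows. This is only an illustration of the LHS-then-filter idea, not the paper's SMC algorithm: the two-parameter toy model stands in for the Beeler-Reuter model, and the parameter bounds and acceptance range are hypothetical.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points with exactly one point per stratum in each dimension."""
    bounds = np.asarray(bounds, dtype=float)
    d = bounds.shape[0]
    # Stratified uniforms: row k falls in [k/n, (k+1)/n) before shuffling.
    u = (rng.random((n_samples, d)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(d):
        u[:, j] = rng.permutation(u[:, j])  # decouple the strata across dimensions
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def calibrate(samples, model, lower, upper):
    """Keep the parameter sets whose model output lies in the experimental range."""
    outputs = np.array([model(p) for p in samples])
    keep = (outputs >= lower) & (outputs <= upper)
    return samples[keep]

def toy_model(p):
    """Hypothetical stand-in for an electrophysiological model output."""
    return p[0] + p[1]

rng = np.random.default_rng(0)
samples = latin_hypercube(50, [(0.0, 1.0), (0.0, 2.0)], rng)
pom = calibrate(samples, toy_model, 0.5, 1.5)  # accepted population of models
```

With a high-dimensional parameter space, most LHS draws are typically rejected at the filtering step, which is the inefficiency that motivates a more targeted sampler such as SMC.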
Abstract:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern; they also affect Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks, along with associated tools optimised for the specific regions, need to be developed. This will ensure that the decisions made for sustainable development are based on evidence rather than subjective opinion. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. This tool is developed on a web-based platform to allow easy dissemination, quick maintenance and to minimise compatibility issues. The tool is developed based on MapGuide Open Source and it follows the three-tier architecture: Client tier, Web tier and Server tier. The developed tool is interactive and behaves similarly to a familiar desktop-based application. It has the capability to handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool is obtained from the QUT research team. Overall the tool implements four tasks to help in the decision-making process. These are Locality Classification, Trend Display, Impact Assessment, and Data Entry and Update.
The developed tool utilises open source and freely available software and accounts for easy extensibility and long-term sustainability.
Abstract:
In this paper, we used a nonconservative Lagrangian mechanics approach to formulate a new statistical algorithm for fluid registration of 3-D brain images. This algorithm is named SAFIRA, acronym for statistically-assisted fluid image registration algorithm. A nonstatistical version of this algorithm was implemented, where the deformation was regularized by penalizing deviations from a zero rate of strain. In, the terms regularizing the deformation included the covariance of the deformation matrices Σ and the vector fields (q). Here, we used a Lagrangian framework to reformulate this algorithm, showing that the regularizing terms essentially allow nonconservative work to occur during the flow. Given 3-D brain images from a group of subjects, vector fields and their corresponding deformation matrices are computed in a first round of registrations using the nonstatistical implementation. Covariance matrices for both the deformation matrices and the vector fields are then obtained and incorporated (separately or jointly) in the nonconservative terms, creating four versions of SAFIRA. We evaluated and compared our algorithms' performance on 92 3-D brain scans from healthy monozygotic and dizygotic twins; 2-D validations are also shown for corpus callosum shapes delineated at midline in the same subjects. After preliminary tests to demonstrate each method, we compared their detection power using tensor-based morphometry (TBM), a technique to analyze local volumetric differences in brain structure. We compared the accuracy of each algorithm variant using various statistical metrics derived from the images and deformation fields. All these tests were also run with a traditional fluid method, which has been quite widely used in TBM studies. The versions incorporating vector-based empirical statistics on brain variation were consistently more accurate than their counterparts, when used for automated volumetric quantification in new brain images. 
This suggests the advantages of this approach for large-scale neuroimaging studies.
Abstract:
Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Sidak method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
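The sensitivity to the assumed number of silent comparisons can be seen with the standard Šidák adjustment, p_adj = 1 - (1 - p)^m. The sketch below only illustrates that formula; the unadjusted p-value and the candidate values of m are hypothetical and are not taken from the re-analysed investigations.

```python
def sidak_adjust(p_value, n_comparisons):
    """Sidak-adjusted p-value for n_comparisons independent comparisons."""
    return 1.0 - (1.0 - p_value) ** n_comparisons

# Hypothetical unadjusted p-value for a reported cluster.
p_unadjusted = 0.001

# The adjusted value depends heavily on the guessed number of silent comparisons.
adjusted = {m: sidak_adjust(p_unadjusted, m) for m in (1, 10, 100, 1000)}
for m, p_adj in adjusted.items():
    print(f"m = {m:5d}  adjusted p = {p_adj:.4f}")
```

An apparently strong result at m = 1 is no longer significant at conventional levels once m is in the hundreds or thousands, and m itself can only be guessed.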
Abstract:
Optimum Wellness involves the development, refinement and practice of lifestyle choices which resonate with personally meaningful frames of reference. Personal transformations are the means by which our frames of reference are refined across the lifespan. It is through critical reflection, supportive relationships and meaning making of our experiences that we construct and reconstruct our life paths. When individuals are able to be what they are destined to be or reach their higher purpose, then they are able to contribute to the world in positive and meaningful ways. Transformative education facilitates the changes in perspective that enable one to contemplate and travel a path in life that leads to self-actualisation. This thesis argues for an integrated theoretical framework for optimum Wellness Education. It establishes a learner centred approach to Wellness education in the form of an integrated instructional design framework derived from both Wellness and Transformative education constructs. Students’ approaches to learning and their study strategies in a Wellness education context serve to highlight convergences in the manner in which students can experience perspective transformation. As they learn to critically reflect, pursue relationships and adapt their frames of reference to sustain their pursuit of both learning and Wellness goals, strengthening the nexus between instrumental and transformative learning is a strategically important goal for educators. The aim of this exploratory research study was to examine those facets that serve to optimise the learning experiences of students in a Wellness course. This was accomplished through three research issues: 1) What are the relationships between Wellness, approaches to learning and academic success? 2) How are students approaching learning in an undergraduate Wellness subject? Why are students approaching their learning in the ways they do? 
3) What sorts of transformations are students experiencing in their Wellness? How can transformative education be formulated in the context of an undergraduate Wellness subject? Subsequent to a thorough review of the literature pertaining to Wellness education, a mixed method embedded case study design was formulated to explore the research issues. This thesis examines the interrelationships between student, content and context in a one semester university undergraduate unit (a coherent set of learning activities which is assigned a unit code and a credit point value). The experiences of a cohort of 285 undergraduate students in a Wellness course formed the unit of study and seven individual students from a total of sixteen volunteers whose profiles could be constructed from complete data sets were selected for analysis as embedded cases. The introductory level course required participants to engage in a personal project involving a behaviour modification plan for a self-selected, single dimension of Wellness. Students were given access to the Standard Edition Testwell Survey to assess and report their Wellness as a part of their personal projects. To identify relationships among the constructs of Self-Regulated Learning (SRL), Wellness and Student Approaches to Learning (SAL) a blend of quantitative and qualitative methods to collect and analyse data was formulated. Surveys were the primary instruments for acquiring quantitative data. Sources included the Wellness data from Testwell surveys, SAL data from R-SPQ surveys, SRL data from MSLQ surveys and student self-evaluation data from an end of semester survey. Students’ final grades and GPA scores were used as indicators of academic performance. The sources of qualitative data included subject documentation, structured interview transcripts and open-ended responses to survey items. 
Subsequent to a pilot study in which survey reliability and validity were tested in context, amendments to processes for and instruments of data collection were made. Students who adopted meaning oriented (deep/achieving) approaches tended to assess their Wellness at a higher level, seek effective learning strategies and perform better in formal study. Posttest data in the main study revealed that there were significant positive statistical relationships between academic performance and total wellness scores (rs=.297, n=205, p<.01). Deep (rs=.343, n=137, p<.01) and achieving (rs=.286, n=123, p<.01) approaches to learning also significantly correlated with Wellness whilst surface approaches had negative correlations that were not significant. SRL strategies including metacognitive self-regulation, effort, help-seeking and critical thinking were increasingly correlated with Wellness. Qualitative findings suggest that while all students adopt similar patterns of day-to-day activities (for example, attending classes, taking notes and working on assignments), the level of care with which these activities are undertaken varies considerably. The dominant motivational trigger for students in this cohort was the personal relevance and associated benefits of the material being learned and practiced. Students were inclined to set goals that had a positive impact on affect and used “sense of happiness” to evaluate their achievement status. Students who had a higher drive to succeed and/or understand tended to have or seek a wider range of strategies. Their goal orientations were generally learning rather than performance based and barriers presented a challenge which could be overcome as opposed to a blockage which prevented progress. Findings from an empirical analysis of the Testwell data suggest that a single third order Wellness construct exists.
A revision of the instrument is necessary in order to juxtapose it with the chosen six dimensional Wellness model that forms the foundation construct in the course. Further, redevelopment should be sensitive to the Australian context and culture including choice of language, examples and scenarios used in item construction. This study concludes with an heuristic for use in Wellness education. Guided by principles of Transformative education theory and behaviour change theory, and informed by this representative case study, the “CARING” heuristic is proposed as an instructional design tool for Wellness educators seeking to foster transformative learning. Based upon this study, recommendations were made for university educators to provide authentic and personal experiences in Wellness curricula. Emphasis must focus on involving students and teachers in a partnership for implementing Wellness programs both in the curriculum and co-curricularly. The implications of this research for practice are predicated on the willingness of academics to embrace transformative learning at a personal level and a professional one. To explore students’ profiles in detail is not practical; however, teaching students how to guide us in supporting them through the “pain” of learning is a skill which would benefit them and optimise the learning and teaching process. At a theoretical level, this research contributes to an ecological theory of Wellness education as transformational change. By signposting the wider contexts in which learning takes place, it seeks to encourage changing paradigms to ones which harness the energy of each successive contextual layer in which students live. Future research which amplifies the qualities of individuals and groups who are “Well” and seeks the refinement and development of instruments to measure Wellness constructs would be desirable for both theoretical and applied knowledge bases.
Mixed method Wellness research derived and conducted by teams that incorporate expertise from multiple disciplines such as psychology, anthropology, education, and medicine would enable creative and multi-perspective programs of investigation to be designed and implemented. Congruences and inconsistencies in health promotion and education would provide valuable material for strengthening the nexus between transformational learning and behaviour change theories. Future development of and research on the effectiveness of the CARING heuristic would be valuable in advancing the understanding of pedagogies which advance rather than impede learning as a transformative process. Exploring pedagogical models that marry with transformative education may render solutions to the vexing challenge of teaching and learning in diverse contexts.
Abstract:
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and the estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that, with large datasets, the use of WinBUGS becomes more problematic because its term-by-term updating is highly correlated. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that soil moisture under each treatment reaches a level particular to that treatment at a depth of 200 cm and thereafter stays constant, albeit with variances that increase with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, it does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
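As a rough illustration of the CAR layered idea described above, the following Python sketch builds a neighbourhood structure in which every cell's neighbours lie in its own depth layer, and forms one intrinsic CAR precision matrix per layer. The grid shape, the rook (edge-sharing) adjacency, the intrinsic form tau * (D - W), and the function names are all assumptions made for illustration; the abstract does not specify them, and the non-structured variance terms and fixed effects of the thesis model are not shown.

```python
from itertools import product

# Offsets for rook adjacency within a layer (an assumption; the abstract
# does not say which neighbourhood definition was used).
ROOK = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def layered_neighbours(nx, ny, nz):
    """Map each cell (x, y, z) to its neighbours, all in the same depth layer z."""
    result = {}
    for x, y, z in product(range(nx), range(ny), range(nz)):
        result[(x, y, z)] = [
            (x + dx, y + dy, z)
            for dx, dy in ROOK
            if 0 <= x + dx < nx and 0 <= y + dy < ny
        ]
    return result


def layer_precision(nx, ny, tau):
    """Intrinsic CAR precision tau * (D - W) for a single nx-by-ny depth layer."""
    cells = list(product(range(nx), range(ny)))
    index = {cell: k for k, cell in enumerate(cells)}
    q = [[0.0] * len(cells) for _ in cells]
    for (x, y), k in index.items():
        for dx, dy in ROOK:
            nb = (x + dx, y + dy)
            if nb in index:
                q[k][index[nb]] = -tau  # off-diagonal: -tau for each neighbour
                q[k][k] += tau          # diagonal: tau times the neighbour count
    return q


def layered_precisions(nx, ny, taus):
    """One precision block per depth layer.

    Using a separate tau for each layer lets the spatially structured
    variance differ with depth; the joint precision is block diagonal.
    """
    return [layer_precision(nx, ny, tau) for tau in taus]
```

Because no cell has a neighbour at another depth, the structured part of the joint precision is block diagonal, which is the kind of structure a block-updating sampler can exploit layer by layer.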
Abstract:
Computer resource allocation represents a significant challenge, particularly for multiprocessor systems, which consist of shared computing resources to be allocated among co-runner processes and threads. While efficient resource allocation results in a highly efficient and stable overall multiprocessor system and good individual thread performance, poor resource allocation causes significant performance bottlenecks even in a system with abundant computing resources. This thesis proposes a cache aware adaptive closed loop scheduling framework as an efficient resource allocation strategy for the highly dynamic resource management problem, which requires instant estimation of highly uncertain and unpredictable resource patterns. Many different approaches to this highly dynamic resource allocation problem have been developed, but neither the dynamic nature nor the time-varying and uncertain characteristics of the problem have been well considered. These approaches employ either static or dynamic optimization methods, or advanced scheduling algorithms such as the Proportional Fair (PFair) scheduling algorithm. Some of these approaches, which do consider the dynamic nature of multiprocessor systems, apply only a basic closed loop system; hence, they fail to take the time-varying nature and uncertainty of the system into account. Therefore, further research into multiprocessor resource allocation is required. Our closed loop cache aware adaptive scheduling framework takes resource availability and resource usage patterns into account by measuring time-varying factors such as cache miss counts, stalls and instruction counts. More specifically, the cache usage pattern of a thread is identified using the QR recursive least squares (RLS) algorithm and cache miss count time series statistics. For the identified cache resource dynamics, our closed loop cache aware adaptive scheduling framework enforces instruction fairness for the threads.
Fairness in the context of our research project is defined as resource allocation equity, which reduces co-runner thread dependence in a shared resource environment. In this way, instruction count degradation due to shared cache resource conflicts is overcome. In this respect, our closed loop cache aware adaptive scheduling framework contributes to the research field in two major and three minor aspects. The two major contributions lead to the cache aware scheduling system. The first major contribution is the development of the execution fairness algorithm, which reduces the co-runner cache impact on thread performance. The second major contribution is the development of relevant mathematical models, such as thread execution pattern and cache access pattern models, which formulate the execution fairness algorithm in terms of mathematical quantities. Following the development of the cache aware scheduling system, our adaptive self-tuning control framework is constructed to add an adaptive closed loop aspect to the cache aware scheduling system. This control framework consists of two main components: the parameter estimator and the controller design module. The first minor contribution is the development of the parameter estimators; the QR Recursive Least Squares (RLS) algorithm is applied in our closed loop cache aware adaptive scheduling framework to estimate the highly uncertain and time-varying cache resource patterns of threads. The second minor contribution is the design of a controller design module; the algebraic controller design algorithm, Pole Placement, is used to design the relevant controller, which is able to provide the desired time-varying control action. Together, the adaptive self-tuning control framework and the cache aware scheduling system constitute our final framework, the closed loop cache aware adaptive scheduling framework.
The third minor contribution is the validation of the efficiency of this cache aware adaptive closed loop scheduling framework in overcoming the co-runner cache dependency. The time-series statistical counters are developed for the M-Sim Multi-Core Simulator, and the theoretical findings and mathematical formulations are implemented as MATLAB m-files. In this way, the overall framework is tested and the experimental outcomes are analyzed. From these outcomes, it is concluded that our closed loop cache aware adaptive scheduling framework successfully drives the co-runner cache dependent thread instruction count to the co-runner independent instruction count, with an error margin of up to 25% when the cache is highly utilized. In addition, the thread cache access pattern is estimated with 75% accuracy.
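The estimator-plus-controller loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis implementation: it uses the standard exponentially weighted RLS recursion rather than the QR-decomposition variant named in the abstract, a hypothetical first-order model y[k+1] = a*y[k] + b*u[k], and a single desired closed-loop pole. The class and function names, the forgetting factor, and the numeric values in the usage note are all illustrative.

```python
class RecursiveLeastSquares:
    """Standard RLS with exponential forgetting (not the QR variant)."""

    def __init__(self, n_params, forgetting=0.99, init_cov=1e3):
        self.lam = forgetting
        self.theta = [0.0] * n_params
        self.P = [[init_cov if i == j else 0.0 for j in range(n_params)]
                  for i in range(n_params)]

    def update(self, phi, y):
        """Fold in one regressor vector phi and observation y.

        Returns the prediction error computed before the update.
        """
        n = len(phi)
        p_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(phi[i] * p_phi[i] for i in range(n))
        gain = [v / denom for v in p_phi]
        error = y - sum(phi[i] * self.theta[i] for i in range(n))
        self.theta = [self.theta[i] + gain[i] * error for i in range(n)]
        # P is symmetric, so phi^T P equals (P phi)^T.
        self.P = [[(self.P[i][j] - gain[i] * p_phi[j]) / self.lam
                   for j in range(n)] for i in range(n)]
        return error


def pole_placement_control(a, b, pole, y, reference):
    """Input that places the closed-loop pole of y[k+1] = a*y[k] + b*u[k].

    Substituting u gives y[k+1] = pole*y[k] + (1 - pole)*reference,
    which settles at the reference when |pole| < 1.
    """
    return ((pole - a) * y + (1.0 - pole) * reference) / b


def run_self_tuning(a_true, b_true, reference, pole=0.5, steps=60):
    """Simulate a hypothetical noise-free plant under estimate-then-control."""
    rls = RecursiveLeastSquares(2)
    y, u = 0.0, 1.0  # initial output and a nonzero probing input
    for _ in range(steps):
        y_next = a_true * y + b_true * u   # plant response (unknown to the controller)
        rls.update([y, u], y_next)         # re-estimate (a, b) from the new sample
        y = y_next
        a_hat, b_hat = rls.theta
        if abs(b_hat) < 1e-6:
            u = 1.0                        # keep probing until b is identifiable
        else:
            u = pole_placement_control(a_hat, b_hat, pole, y, reference)
    return y, rls.theta
```

For example, `run_self_tuning(0.9, 0.5, reference=10.0)` simulates a plant whose parameters the controller never sees directly; the controller is redesigned from the current estimates at every step, which is the self-tuning idea. In the thesis setting the regressors would come from cache miss and instruction count statistics rather than from this toy plant.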