521 results for Selection techniques
in Queensland University of Technology - ePrints Archive
Abstract:
Continuous user authentication with keystroke dynamics uses character sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. Contemporary feature selection approaches do not guarantee the selection of frequently typed features, which may lead to a less accurate statistical representation of the user. Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistical-based feature selection techniques that mitigate the limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent time; and n-graphs that have large time variance among users. We use Gunetti’s keystroke dataset and the k-means clustering algorithm for our experiments. The results show that, among the proposed techniques, the most-frequent feature selection technique can effectively find user-representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent n-graphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.
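As an illustration of the first technique described above (selecting the most frequently occurring n-graphs), here is a minimal Python sketch. It is not the authors' implementation: the function names `extract_ngraphs` and `most_frequent_ngraphs`, the representation of a typing session as a string of characters, and the toy data are all assumptions made for this example, and keystroke timing is ignored entirely.

```python
from collections import Counter


def extract_ngraphs(keys, n=2):
    """Return every run of n consecutive keys (an n-graph) in a key stream."""
    return [tuple(keys[i:i + n]) for i in range(len(keys) - n + 1)]


def most_frequent_ngraphs(sessions, n=2, top_k=10):
    """Count n-graphs over all sessions and keep the top_k most frequent.

    `sessions` is an iterable of key sequences (hypothetical input format).
    """
    counts = Counter()
    for keys in sessions:
        counts.update(extract_ngraphs(keys, n))
    return [ngraph for ngraph, _ in counts.most_common(top_k)]


# Toy data: two short typing sessions (illustrative only).
sessions = ["thethe", "the"]
selected = most_frequent_ngraphs(sessions, n=2, top_k=2)
print(selected)
```

The other three techniques in the abstract would additionally need per-n-graph timing data (for example mean duration or variance), which this sketch does not model.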
Abstract:
Two approaches are described, which aid the selection of the most appropriate procurement arrangements for a building project. The first is a multi-attribute technique based on the National Economic Development Office procurement path decision chart. A small study is described in which the utility factors involved were weighted by averaging the scores of five 'experts' for three hypothetical building projects. A concordance analysis is used to provide some evidence of any abnormal data sources. When applied to the study data, one of the experts was seen to be atypical. The second approach is by means of discriminant analysis. This was found to provide reasonably consistent predictions through three discriminant functions. The analysis also showed the quality criteria to have no significant impact on the decision process. Both approaches provided identical and intuitively correct answers in the study described. Some concluding remarks are made on the potential of discriminant analysis for future research and development in procurement selection techniques.
Abstract:
Most current computer systems authorise the user at the start of a session and do not detect whether the current user is still the initial authorised user, a substitute user, or an intruder pretending to be a valid user. Therefore, a system is needed that continuously and effectively checks the identity of the user throughout the session without being intrusive to the end user. Such a system is called a continuous authentication system (CAS). Researchers have applied several approaches for CAS, and most of these techniques are based on biometrics. These continuous biometric authentication systems (CBAS) are supplied by user traits and characteristics. One of the main types of biometrics is keystroke dynamics, which has been widely tried and accepted for providing continuous user authentication. Keystroke dynamics is appealing for many reasons. First, it is less obtrusive, since users will be typing on the computer keyboard anyway. Second, it does not require extra hardware. Finally, keystroke dynamics will be available after the authentication step at the start of the computer session. Currently, there is insufficient research in the field of CBAS with keystroke dynamics. To date, most of the existing schemes ignore the continuous authentication scenarios, which might affect their practicality in different real-world applications. Also, the contemporary CBAS with keystroke dynamics approaches use character sequences as features that are representative of user typing behavior, but their feature selection criteria do not guarantee features with strong statistical significance, which may lead to a less accurate statistical representation of the user. Furthermore, their selected features do not inherently incorporate user typing behavior. Finally, the existing CBAS that are based on keystroke dynamics are typically dependent on pre-defined user-typing models for continuous authentication.
This dependency restricts the systems to authenticating only known users whose typing samples are modelled. This research addresses the limitations of the existing CBAS schemes by developing a generic model to better identify and understand the characteristics and requirements of each type of CBAS and continuous authentication scenario. Also, the research proposes four statistical-based feature selection techniques that have the highest statistical significance and encompass different user typing behaviors, which represent user typing patterns effectively. Finally, the research proposes a user-independent threshold approach that is able to authenticate a user accurately without needing any predefined user typing model a priori. Also, we enhance the technique to detect an impostor or intruder who may take over at any point during the computer session.
Abstract:
We use Bayesian model selection techniques to test extensions of the standard flat LambdaCDM paradigm. Dark-energy and curvature scenarios, and primordial perturbation models are considered. To that end, we calculate the Bayesian evidence in favour of each model using Population Monte Carlo (PMC), a new adaptive sampling technique which was recently applied in a cosmological context. The Bayesian evidence is immediately available from the PMC sample used for parameter estimation without further computational effort, and it comes with an associated error evaluation. Besides, it provides an unbiased estimator of the evidence after any fixed number of iterations and it is naturally parallelizable, in contrast with MCMC and nested sampling methods. By comparison with analytical predictions for simulated data, we show that our results obtained with PMC are reliable and robust. The variability in the evidence evaluation and the stability for various cases are estimated both from simulations and from data. For the cases we consider, the log-evidence is calculated with a precision of better than 0.08. Using a combined set of recent CMB, SNIa and BAO data, we find inconclusive evidence between flat LambdaCDM and simple dark-energy models. A curved Universe is moderately to strongly disfavoured with respect to a flat cosmology. Using physically well-motivated priors within the slow-roll approximation of inflation, we find a weak preference for a running spectral index. A Harrison-Zel'dovich spectrum is weakly disfavoured. With the current data, tensor modes are not detected; the large prior volume on the tensor-to-scalar ratio r results in moderate evidence in favour of r=0.
Abstract:
This report reviews the selection, design, and installation of fiber reinforced polymer (FRP) systems for strengthening of reinforced concrete or pre-stressed concrete bridges and other structures. The report is based on the knowledge gained from worldwide experimental research, analytical work, and field applications of FRP systems used to strengthen concrete structures. Information on material properties and on design and installation methods of FRP systems used as external reinforcement is presented. This information can be used to select an FRP system for increasing the strength and stiffness of reinforced concrete beams or the ductility of columns, and for other applications. Based on the available research, the design considerations and concepts are covered in this report. In the next stage of the project, these will be further developed as design tools. It is important to note, however, that the design concepts proposed in the literature have in many cases not been thoroughly developed and proven. Therefore, a considerable amount of research work will be required before the design concepts can be developed into practical design tools, which is a major goal of the current research project. The durability and long-term performance of FRP materials have been the subject of much research, which is still ongoing. Long-term field data are not currently available, and it is still difficult to accurately predict the life of FRP strengthening systems. The report briefly addresses environmental degradation and long-term durability issues as well. A general overview of using FRP bars as primary reinforcement of concrete structures is presented in Chapter 8. Chapter 9 presents a summary of the strengthening techniques identified in this initial stage of the research project and the issues that require careful consideration prior to practical implementation of these techniques.
Abstract:
The decision as to which procurement system to adopt is a complex and challenging task for clients of construction projects. Despite a plethora of tools and techniques available for selecting a procurement method, clients are still uncertain about what method to adopt for a given construction project to achieve success. This paper examines ‘how and why’ procurement methods are selected by public sector clients in Queensland (QLD) and Western Australia (WA). Findings from workshops with senior managers in procurement selection revealed that traditional lump sum methods (TLS) are preferred even though alternative forms could be better suited for a given project. Participants of the workshops agreed that alternative procurement forms should be considered for projects but an embedded culture of uncertainty avoidance meant the selection of TLS methods. It was perceived that only a limited number of contractors operating in the marketplace have the resources and experience to deliver projects using the non-traditional methods.
Abstract:
Purpose: Choosing the appropriate procurement system for construction projects is a complex and challenging task for clients, particularly when professional advice has not been sought. To assist with the decision-making process, a range of procurement selection tools and techniques have been developed by both academic and industry bodies. Public sector clients in Western Australia (WA) remain uncertain about the pairing of procurement method to bespoke construction project and how this decision will ultimately impact upon project success. This paper examines ‘how and why’ a public sector agency selected particular procurement methods.
Methodology/Approach: An analysis of two focus group workshops (with 18 senior project and policy managers involved with procurement selection) is reported.
Findings: The traditional lump sum (TLS) method is still the preferred procurement path even though alternative forms such as design and construct or public-private partnerships could optimize the project outcome. Paradoxically, workshop participants agreed that alternative procurement forms should be considered, but an embedded culture of uncertainty avoidance invariably meant that TLS methods were selected. Senior managers felt that only a limited number of contractors have the resources and experience to deliver projects using the non-traditional methods considered.
Research limitations/implications: The research identifies a need to develop a framework that public sector clients can use to select an appropriate procurement method. A procurement framework should be able to guide the decision-maker rather than provide a prescriptive solution. Learning from previous experiences with regard to procurement selection will further provide public sector clients with knowledge about how to best deliver their projects.
Abstract:
Biased estimation has the advantage of reducing the mean squared error (MSE) of an estimator. The question of interest is how biased estimation affects model selection. In this paper, we introduce biased estimation to a range of model selection criteria. Specifically, we analyze the performance of the minimum description length (MDL) criterion based on biased and unbiased estimation and compare it against modern model selection criteria such as Kay's conditional model order estimator (CME), the bootstrap and the more recently proposed hook-and-loop resampling based model selection. The advantages and limitations of the considered techniques are discussed. The results indicate that, in some cases, biased estimators can slightly improve the selection of the correct model. We also give an example for which the CME with an unbiased estimator fails, but could regain its power when a biased estimator is used.
Abstract:
Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel, wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMMs are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability, an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier.
Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system, providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several characteristics beneficial to the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.
Abstract:
The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency-derived subsets tended to yield slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This influence could be eliminated by consideration of the AUC or Kappa statistic as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in a reduction in MR to between 9.8 and 10.16, with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all datasets based on time-series only, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09.
The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G) perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input.
In the absence of risk factor input, the use of time-series variables after outlier removal and time series variables based on physiological variable values’ being outside the accepted normal range is associated with some improvement in model performance.
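The abstract's remark that misclassification rate (MR) is influenced by class distribution, whereas the Kappa statistic is not, can be made concrete with a small Python sketch. The function names and the confusion-matrix counts below are invented for illustration and are not taken from the study; the Kappa formula is the standard Cohen's kappa for a two-class problem.

```python
def misclassification_rate(tp, fp, fn, tn):
    """Fraction of cases whose predicted class differs from the true class."""
    total = tp + fp + fn + tn
    return (fp + fn) / total


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between prediction and truth beyond chance."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement, from the marginal rates of each class.
    chance_pos = ((tp + fn) / total) * ((tp + fp) / total)
    chance_neg = ((tn + fp) / total) * ((tn + fn) / total)
    expected = chance_pos + chance_neg
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (observed - expected) / (1.0 - expected)


# Hypothetical unbalanced data: 10 positive and 90 negative cases, scored by
# a trivial model that always predicts the majority (negative) class.
mr = misclassification_rate(tp=0, fp=0, fn=10, tn=90)
kappa = cohen_kappa(tp=0, fp=0, fn=10, tn=90)
print(mr, kappa)
```

In this made-up case the trivial model has a low MR simply because the negative class dominates, while its kappa shows no agreement beyond chance. This is the kind of effect that motivates reporting Kappa or AUC alongside MR.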
Abstract:
This paper presents a comprehensive discussion of vegetation management approaches in power line corridors based on aerial remote sensing techniques. We address three issues: 1) strategies for risk management in power line corridors, 2) selection of suitable platforms and sensor suites for data collection, and 3) progress in automated data processing techniques for vegetation management. We present initial results from a series of experiments, along with challenges and lessons learnt from our project.
Abstract:
Life Cycle Cost Analysis (LCCA) provides a form of synopsis of the initial and consequential costs of building-related decisions. These cost figures may be used to justify higher investments, for example in the quality or flexibility of building solutions, through a long-term cost reduction. The emerging discipline of asset management is a promising approach to this problem, because it can do things that techniques such as balanced scorecards and total quantity cannot. Decisions must be made about operating and maintaining infrastructure assets. An injudicious sensitivity of life cycle costing is that the longer something lasts, the less it costs over time. A life cycle cost analysis will be used as an economic evaluation tool in combination with various other analyses. LCCA quantifies costs that are commonly overlooked (by property and asset managers and designers), such as replacement and maintenance costs. The purpose of this research is to examine the Life Cycle Cost Analysis of building floor materials. By implementing the life cycle cost analysis, the true cost of each material will be computed, projecting 60 years as the building service life and 5.4% as the inflation rate, to classify and appreciate the differences among the materials. The analysis results showed the high impact of service life cycle cost on the selection of floor materials.
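To illustrate the kind of computation the abstract describes, the following Python sketch accumulates initial, maintenance, and replacement costs over a 60-year service life with costs inflated at 5.4% per year. Only those two figures come from the abstract. The cost model itself (no discounting, maintenance charged every year, no replacement in the final year), the function name `life_cycle_cost`, and all material figures are assumptions made for this example.

```python
def life_cycle_cost(initial, annual_maintenance, replacement_cost,
                    replacement_interval, service_life=60, inflation=0.054):
    """Sum initial, maintenance, and replacement costs in inflated terms.

    Assumed model: maintenance is paid every year, the material is replaced
    every `replacement_interval` years (but not at the end of the service
    life), and every future cost grows with inflation. No discount rate is
    applied.
    """
    total = float(initial)
    for year in range(1, service_life + 1):
        factor = (1.0 + inflation) ** year
        total += annual_maintenance * factor
        if year % replacement_interval == 0 and year < service_life:
            total += replacement_cost * factor
    return total


# Hypothetical floor materials (all figures invented for illustration):
# a cheap material replaced every 10 years vs. a durable one every 30 years.
cheap = life_cycle_cost(initial=30.0, annual_maintenance=2.0,
                        replacement_cost=30.0, replacement_interval=10)
durable = life_cycle_cost(initial=90.0, annual_maintenance=1.0,
                          replacement_cost=90.0, replacement_interval=30)
print(round(cheap, 2), round(durable, 2))
```

A fuller analysis would normally also discount future costs to present value; the abstract does not state a discount rate, so none is assumed here.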
Abstract:
Corneal-height data are typically measured with videokeratoscopes and modeled using a set of orthogonal Zernike polynomials. We address the estimation of the number of Zernike polynomials, which is formalized as a model-order selection problem in linear regression. Classical information-theoretic criteria tend to overestimate the corneal surface due to the weakness of their penalty functions, while bootstrap-based techniques tend to underestimate the surface or require extensive processing. In this paper, we propose to use the efficient detection criterion (EDC), which has the same general form of information-theoretic-based criteria, as an alternative to estimating the optimal number of Zernike polynomials. We first show, via simulations, that the EDC outperforms a large number of information-theoretic criteria and resampling-based techniques. We then illustrate that using the EDC for real corneas results in models that are in closer agreement with clinical expectations and provides means for distinguishing normal corneal surfaces from astigmatic and keratoconic surfaces.
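The role of the penalty function in model-order selection can be sketched in Python as follows. The sketch assumes the usual Gaussian linear-regression form of these criteria, N·log(RSS_k/N) + k·C_N, where RSS_k is the residual sum of squares of the model with k terms. AIC corresponds to C_N = 2. For the EDC, C_N = sqrt(N·log N) is used here as one common choice; the abstract does not say which penalty the authors used, so that choice, the function names, and the residual values are all assumptions.

```python
import math


def criterion(rss, n_samples, n_params, penalty_per_param):
    """Penalised log-likelihood criterion for Gaussian linear regression."""
    return n_samples * math.log(rss / n_samples) + n_params * penalty_per_param


def select_order(rss_by_order, n_samples, penalty_per_param):
    """Return the model order that minimises the criterion."""
    scores = {
        order: criterion(rss, n_samples, order, penalty_per_param)
        for order, rss in rss_by_order.items()
    }
    return min(scores, key=scores.get)


def edc_penalty(n_samples):
    """One admissible EDC penalty (assumed here): C_N = sqrt(N log N)."""
    return math.sqrt(n_samples * math.log(n_samples))


# Invented residual sums of squares for models with 1 to 4 terms: the fit
# improves sharply at 2 terms and only marginally afterwards.
rss_by_order = {1: 50.0, 2: 10.0, 3: 9.7, 4: 9.6}
n = 100

aic_choice = select_order(rss_by_order, n, penalty_per_param=2.0)
edc_choice = select_order(rss_by_order, n, penalty_per_param=edc_penalty(n))
print(aic_choice, edc_choice)
```

With these made-up residuals, the weaker AIC penalty accepts a third term for a marginal gain in fit, while the stronger EDC penalty stops at two terms. This mirrors the overestimation tendency that the abstract attributes to classical information-theoretic criteria.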