326 resultados para incorporate probabilistic techniques
Resumo:
This paper introduces the Weighted Linear Discriminant Analysis (WLDA) technique, based upon the weighted pairwise Fisher criterion, for the purposes of improving i-vector speaker verification in the presence of high intersession variability. By taking advantage of the speaker discriminative information that is available in the distances between pairs of speakers clustered in the development i-vector space, the WLDA technique is shown to provide an improvement in speaker verification performance over traditional Linear Discriminant Analysis (LDA) approaches. A similar approach is also taken to extend the recently developed Source Normalised LDA (SNLDA) into Weighted SNLDA (WSNLDA) which, similarly, shows an improvement in speaker verification performance in both matched and mismatched enrolment/verification conditions. Based upon the results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset, we believe that both WLDA and WSNLDA are viable as replacement techniques to improve the performance of LDA and SNLDA-based i-vector speaker verification.
Resumo:
EMR (Electronic Medical Record) is an emerging technology that is highly-blended between non-IT and IT area. One methodology is to link the non-IT and IT area is to construct databases. Nowadays, it supports before and after-treatment for patients and should satisfy all stakeholders such as practitioners, nurses, researchers, administrators and financial departments and so on. In accordance with the database maintenance, DAS (Data as Service) model is one solution for outsourcing. However, there are some scalability and strategy issues when we need to plan to use DAS model properly. We constructed three kinds of databases such as plan-text, MS built-in encryption which is in-house model and custom AES (Advanced Encryption Standard) - DAS model scaling from 5K to 2560K records. To perform custom AES-DAS better, we also devised Bucket Index using Bloom Filter. The simulation showed the response times arithmetically increased in the beginning but after a certain threshold, exponentially increased in the end. In conclusion, if the database model is close to in-house model, then vendor technology is a good way to perform and get query response times in a consistent manner. If the model is DAS model, it is easy to outsource the database, however, some techniques like Bucket Index enhances its utilization. To get faster query response times, designing database such as consideration of the field type is also important. This study suggests cloud computing would be a next DAS model to satisfy the scalability and the security issues.
Resumo:
Electronic Health Record (EHR) retrieval processes are complex demanding Information Technology (IT) resources exponentially in particular memory usage. Database-as-a-service (DAS) model approach is proposed to meet the scalability factor of EHR retrieval processes. A simulation study using ranged of EHR records with DAS model was presented. The bucket-indexing model incorporated partitioning fields and bloom filters in a Singleton design pattern were used to implement custom database encryption system. It effectively provided faster responses in the range query compared to different types of queries used such as aggregation queries among the DAS, built-in encryption and the plain-text DBMS. The study also presented with constraints around the approach should consider for other practical applications.
Resumo:
Acoustic emission (AE) analysis is one of the several diagnostic techniques available nowadays for structural health monitoring (SHM) of engineering structures. Some of its advantages over other techniques include high sensitivity to crack growth and capability of monitoring a structure in real time. The phenomenon of rapid release of energy within a material by crack initiation or growth in form of stress waves is known as acoustic emission (AE). In AE technique, these stress waves are recorded by means of suitable sensors placed on the surface of a structure. Recorded signals are subsequently analysed to gather information about the nature of the source. By enabling early detection of crack growth, AE technique helps in planning timely retrofitting or other maintenance jobs or even replacement of the structure if required. In spite of being a promising tool, some challenges do still exist behind the successful application of AE technique. Large amount of data is generated during AE testing, hence effective data analysis is necessary, especially for long term monitoring uses. Appropriate analysis of AE data for quantification of damage level is an area that has received considerable attention. Various approaches available for damage quantification for severity assessment are discussed in this paper, with special focus on civil infrastructure such as bridges. One method called improved b-value analysis is used to analyse data collected from laboratory testing.
Resumo:
In the medical and healthcare arena, patients‟ data is not just their own personal history but also a valuable large dataset for finding solutions for diseases. While electronic medical records are becoming popular and are used in healthcare work places like hospitals, as well as insurance companies, and by major stakeholders such as physicians and their patients, the accessibility of such information should be dealt with in a way that preserves privacy and security. Thus, finding the best way to keep the data secure has become an important issue in the area of database security. Sensitive medical data should be encrypted in databases. There are many encryption/ decryption techniques and algorithms with regard to preserving privacy and security. Currently their performance is an important factor while the medical data is being managed in databases. Another important factor is that the stakeholders should decide more cost-effective ways to reduce the total cost of ownership. As an alternative, DAS (Data as Service) is a popular outsourcing model to satisfy the cost-effectiveness but it takes a consideration that the encryption/ decryption modules needs to be handled by trustworthy stakeholders. This research project is focusing on the query response times in a DAS model (AES-DAS) and analyses the comparison between the outsourcing model and the in-house model which incorporates Microsoft built-in encryption scheme in a SQL Server. This research project includes building a prototype of medical database schemas. There are 2 types of simulations to carry out the project. The first stage includes 6 databases in order to carry out simulations to measure the performance between plain-text, Microsoft built-in encryption and AES-DAS (Data as Service). Particularly, the AES-DAS incorporates implementations of symmetric key encryption such as AES (Advanced Encryption Standard) and a Bucket indexing processor using Bloom filter. The results are categorised such as character type, numeric type, range queries, range queries using Bucket Index and aggregate queries. The second stage takes the scalability test from 5K to 2560K records. The main result of these simulations is that particularly as an outsourcing model, AES-DAS using the Bucket index shows around 3.32 times faster than a normal AES-DAS under the 70 partitions and 10K record-sized databases. Retrieving Numeric typed data takes shorter time than Character typed data in AES-DAS. The aggregation query response time in AES-DAS is not as consistent as that in MS built-in encryption scheme. The scalability test shows that the DBMS reaches in a certain threshold; the query response time becomes rapidly slower. However, there is more to investigate in order to bring about other outcomes and to construct a secured EMR (Electronic Medical Record) more efficiently from these simulations.
Resumo:
Navigational collisions are one of the major safety concerns for many seaports. Continuing growth of shipping traffic in number and sizes is likely to result in increased number of traffic movements, which consequently could result higher risk of collisions in these restricted waters. This continually increasing safety concern warrants a comprehensive technique for modeling collision risk in port waters, particularly for modeling the probability of collision events and the associated consequences (i.e., injuries and fatalities). A number of techniques have been utilized for modeling the risk qualitatively, semi-quantitatively and quantitatively. These traditional techniques mostly rely on historical collision data, often in conjunction with expert judgments. However, these techniques are hampered by several shortcomings, such as randomness and rarity of collision occurrence leading to obtaining insufficient number of collision counts for a sound statistical analysis, insufficiency in explaining collision causation, and reactive approach to safety. A promising alternative approach that overcomes these shortcomings is the navigational traffic conflict technique (NTCT), which uses traffic conflicts as an alternative to the collisions for modeling the probability of collision events quantitatively. This article explores the existing techniques for modeling collision risk in port waters. In particular, it identifies the advantages and limitations of the traditional techniques and highlights the potentials of the NTCT in overcoming the limitations. In view of the principles of the NTCT, a structured method for managing collision risk is proposed. This risk management method allows safety analysts to diagnose safety deficiencies in a proactive manner, which consequently has great potential for managing collision risk in a fast, reliable and efficient manner.
Resumo:
In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contract to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state of the art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
Resumo:
This chapter reviews common barriers to community engagement for Latino youth and suggests ways to move beyond those barriers by empowering them to communicate their experiences, address the challenges they face, and develop recommendations for making their community more youth-friendly. As a case study, this chapter describes a program called Youth FACE IT (Youth Fostering Active Community Engagement for Integration and Transformation)in Boulder County, Colorado. The program enables Latino youth to engage in critical dialogue and participate in a community-based initiative. The chapter concludes by explaining specific strategies that planners can use to support active community engagement and develop a future generation of planners and engaged community members that reflects emerging demographics.
Resumo:
In this paper we consider the variable order time fractional diffusion equation. We adopt the Coimbra variable order (VO) time fractional operator, which defines a consistent method for VO differentiation of physical variables. The Coimbra variable order fractional operator also can be viewed as a Caputo-type definition. Although this definition is the most appropriate definition having fundamental characteristics that are desirable for physical modeling, numerical methods for fractional partial differential equations using this definition have not yet appeared in the literature. Here an approximate scheme is first proposed. The stability, convergence and solvability of this numerical scheme are discussed via the technique of Fourier analysis. Numerical examples are provided to show that the numerical method is computationally efficient. Crown Copyright © 2012 Published by Elsevier Inc. All rights reserved.
Resumo:
The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so called “omics” disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques which come under two primary categories, machine learning strategies and statistical based approaches. Typically proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint, and as such, require pre-treatment to reduce the dimensionality prior to classification. Recently machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach where not only is the importance of the individual protein explicit, they are combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA) followed by both statistical and machine learning classification methods, and compared them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.
Resumo:
The research team recognized the value of network-level Falling Weight Deflectometer (FWD) testing to evaluate the structural condition trends of flexible pavements. However, practical limitations due to the cost of testing, traffic control and safety concerns and the ability to test a large network may discourage some agencies from conducting the network-level FWD testing. For this reason, the surrogate measure of the Structural Condition Index (SCI) is suggested for use. The main purpose of the research presented in this paper is to investigate data mining strategies and to develop a prediction method of the structural condition trends for network-level applications which does not require FWD testing. The research team first evaluated the existing and historical pavement condition, distress, ride, traffic and other data attributes in the Texas Department of Transportation (TxDOT) Pavement Maintenance Information System (PMIS), applied data mining strategies to the data, discovered useful patterns and knowledge for SCI value prediction, and finally provided a reasonable measure of pavement structural condition which is correlated to the SCI. To evaluate the performance of the developed prediction approach, a case study was conducted using the SCI data calculated from the FWD data collected on flexible pavements over a 5-year period (2005 – 09) from 354 PMIS sections representing 37 pavement sections on the Texas highway system. The preliminary study results showed that the proposed approach can be used as a supportive pavement structural index in the event when FWD deflection data is not available and help pavement managers identify the timing and appropriate treatment level of preventive maintenance activities.
Resumo:
Traversability maps are a global spatial representation of the relative difficulty in driving through a local region. These maps support simple optimisation of robot paths and have been very popular in path planning techniques. Despite the popularity of these maps, the methods for generating global traversability maps have been limited to using a-priori information. This paper explores the construction of large scale traversability maps for a vehicle performing a repeated activity in a bounded working environment, such as a repeated delivery task.We evaluate the use of vehicle power consumption, longitudinal slip, lateral slip and vehicle orientation to classify the traversability and incorporate this into a map generated from sparse information.
Resumo:
Effective, statistically robust sampling and surveillance strategies form an integral component of large agricultural industries such as the grains industry. Intensive in-storage sampling is essential for pest detection, Integrated Pest Management (IPM), to determine grain quality and to satisfy importing nation’s biosecurity concerns, while surveillance over broad geographic regions ensures that biosecurity risks can be excluded, monitored, eradicated or contained within an area. In the grains industry, a number of qualitative and quantitative methodologies for surveillance and in-storage sampling have been considered. Primarily, research has focussed on developing statistical methodologies for in storage sampling strategies concentrating on detection of pest insects within a grain bulk, however, the need for effective and statistically defensible surveillance strategies has also been recognised. Interestingly, although surveillance and in storage sampling have typically been considered independently, many techniques and concepts are common between the two fields of research. This review aims to consider the development of statistically based in storage sampling and surveillance strategies and to identify methods that may be useful for both surveillance and in storage sampling. We discuss the utility of new quantitative and qualitative approaches, such as Bayesian statistics, fault trees and more traditional probabilistic methods and show how these methods may be used in both surveillance and in storage sampling systems.