815 resultados para Alcohol Treatment, Machine Learning, Bayesian, Decision Tree
Resumo:
The amount and quality of available biomass is a key factor for the sustainable livestock industry and agricultural management related decision making. Globally 31.5% of land cover is grassland while 80% of Ireland’s agricultural land is grassland. In Ireland, grasslands are intensively managed and provide the cheapest feed source for animals. This dissertation presents a detailed state of the art review of satellite remote sensing of grasslands, and the potential application of optical (Moderate–resolution Imaging Spectroradiometer (MODIS)) and radar (TerraSAR-X) time series imagery to estimate the grassland biomass at two study sites (Moorepark and Grange) in the Republic of Ireland using both statistical and state of the art machine learning algorithms. High quality weather data available from the on-site weather station was also used to calculate the Growing Degree Days (GDD) for Grange to determine the impact of ancillary data on biomass estimation. In situ and satellite data covering 12 years for the Moorepark and 6 years for the Grange study sites were used to predict grassland biomass using multiple linear regression, Neuro Fuzzy Inference Systems (ANFIS) models. The results demonstrate that a dense (8-day composite) MODIS image time series, along with high quality in situ data, can be used to retrieve grassland biomass with high performance (R2 = 0:86; p < 0:05, RMSE = 11.07 for Moorepark). The model for Grange was modified to evaluate the synergistic use of vegetation indices derived from remote sensing time series and accumulated GDD information. As GDD is strongly linked to the plant development, or phonological stage, an improvement in biomass estimation would be expected. It was observed that using the ANFIS model the biomass estimation accuracy increased from R2 = 0:76 (p < 0:05) to R2 = 0:81 (p < 0:05) and the root mean square error was reduced by 2.72%. The work on the application of optical remote sensing was further developed using a TerraSAR-X Staring Spotlight mode time series over the Moorepark study site to explore the extent to which very high resolution Synthetic Aperture Radar (SAR) data of interferometrically coherent paddocks can be exploited to retrieve grassland biophysical parameters. After filtering out the non-coherent plots it is demonstrated that interferometric coherence can be used to retrieve grassland biophysical parameters (i. e., height, biomass), and that it is possible to detect changes due to the grass growth, and grazing and mowing events, when the temporal baseline is short (11 days). However, it not possible to automatically uniquely identify the cause of these changes based only on the SAR backscatter and coherence, due to the ambiguity caused by tall grass laid down due to the wind. Overall, the work presented in this dissertation has demonstrated the potential of dense remote sensing and weather data time series to predict grassland biomass using machine-learning algorithms, where high quality ground data were used for training. At present a major limitation for national scale biomass retrieval is the lack of spatial and temporal ground samples, which can be partially resolved by minor modifications in the existing PastureBaseIreland database by adding the location and extent ofeach grassland paddock in the database. As far as remote sensing data requirements are concerned, MODIS is useful for large scale evaluation but due to its coarse resolution it is not possible to detect the variations within the fields and between the fields at the farm scale. However, this issue will be resolved in terms of spatial resolution by the Sentinel-2 mission, and when both satellites (Sentinel-2A and Sentinel-2B) are operational the revisit time will reduce to 5 days, which together with Landsat-8, should enable sufficient cloud-free data for operational biomass estimation at a national scale. The Synthetic Aperture Radar Interferometry (InSAR) approach is feasible if there are enough coherent interferometric pairs available, however this is difficult to achieve due to the temporal decorrelation of the signal. For repeat-pass InSAR over a vegetated area even an 11 days temporal baseline is too large. In order to achieve better coherence a very high resolution is required at the cost of spatial coverage, which limits its scope for use in an operational context at a national scale. Future InSAR missions with pair acquisition in Tandem mode will minimize the temporal decorrelation over vegetation areas for more focused studies. The proposed approach complements the current paradigm of Big Data in Earth Observation, and illustrates the feasibility of integrating data from multiple sources. In future, this framework can be used to build an operational decision support system for retrieval of grassland biophysical parameters based on data from long term planned optical missions (e. g., Landsat, Sentinel) that will ensure the continuity of data acquisition. Similarly, Spanish X-band PAZ and TerraSAR-X2 missions will ensure the continuity of TerraSAR-X and COSMO-SkyMed.
Resumo:
Research endeavors on spoken dialogue systems in the 1990s and 2000s have led to the deployment of commercial spoken dialogue systems (SDS) in microdomains such as customer service automation, reservation/booking and question answering systems. Recent research in SDS has been focused on the development of applications in different domains (e.g. virtual counseling, personal coaches, social companions) which requires more sophistication than the previous generation of commercial SDS. The focus of this research project is the delivery of behavior change interventions based on the brief intervention counseling style via spoken dialogue systems. Brief interventions (BI) are evidence-based, short, well structured, one-on-one counseling sessions. Many challenges are involved in delivering BIs to people in need, such as finding the time to administer them in busy doctors' offices, obtaining the extra training that helps staff become comfortable providing these interventions, and managing the cost of delivering the interventions. Fortunately, recent developments in spoken dialogue systems make the development of systems that can deliver brief interventions possible. The overall objective of this research is to develop a data-driven, adaptable dialogue system for brief interventions for problematic drinking behavior, based on reinforcement learning methods. The implications of this research project includes, but are not limited to, assessing the feasibility of delivering structured brief health interventions with a data-driven spoken dialogue system. Furthermore, while the experimental system focuses on harmful alcohol drinking as a target behavior in this project, the produced knowledge and experience may also lead to implementation of similarly structured health interventions and assessments other than the alcohol domain (e.g. obesity, drug use, lack of exercise), using statistical machine learning approaches. In addition to designing a dialog system, the semantic and emotional meanings of user utterances have high impact on interaction. To perform domain specific reasoning and recognize concepts in user utterances, a named-entity recognizer and an ontology are designed and evaluated. To understand affective information conveyed through text, lexicons and sentiment analysis module are developed and tested.
Resumo:
Background: Sepsis can lead to multiple organ failure and death. Timely and appropriate treatment can reduce in-hospital mortality and morbidity. Objectives: To determine the clinical effectiveness and cost-effectiveness of three tests [LightCycler SeptiFast Test MGRADE® (Roche Diagnostics, Risch-Rotkreuz, Switzerland); SepsiTest™ (Molzym Molecular Diagnostics, Bremen, Germany); and the IRIDICA BAC BSI assay (Abbott Diagnostics, Lake Forest, IL, USA)] for the rapid identification of bloodstream bacteria and fungi in patients with suspected sepsis compared with standard practice (blood culture with or without matrix-absorbed laser desorption/ionisation time-offlight mass spectrometry). Data sources: Thirteen electronic databases (including MEDLINE, EMBASE and The Cochrane Library) were searched from January 2006 to May 2015 and supplemented by hand-searching relevant articles. Review methods: A systematic review and meta-analysis of effectiveness studies were conducted. A review of published economic analyses was undertaken and a de novo health economic model was constructed. A decision tree was used to estimate the costs and quality-adjusted life-years (QALYs) associated with each test; all other parameters were estimated from published sources. The model was populated with evidence from the systematic review or individual studies, if this was considered more appropriate (base case 1). In a secondary analysis, estimates (based on experience and opinion) from seven clinicians regarding the benefits of earlier test results were sought (base case 2). A NHS and Personal Social Services perspective was taken, and costs and benefits were discounted at 3.5% per annum. Scenario analyses were used to assess uncertainty. Results: For the review of diagnostic test accuracy, 62 studies of varying methodological quality were included. A meta-analysis of 54 studies comparing SeptiFast with blood culture found that SeptiFast had an estimated summary specificity of 0.86 [95% credible interval (CrI) 0.84 to 0.89] and sensitivity of 0.65 (95% CrI 0.60 to 0.71). Four studies comparing SepsiTest with blood culture found that SepsiTest had an estimated summary specificity of 0.86 (95% CrI 0.78 to 0.92) and sensitivity of 0.48 (95% CrI 0.21 to 0.74), and four studies comparing IRIDICA with blood culture found that IRIDICA had an estimated summary specificity of 0.84 (95% CrI 0.71 to 0.92) and sensitivity of 0.81 (95% CrI 0.69 to 0.90). Owing to the deficiencies in study quality for all interventions, diagnostic accuracy data should be treated with caution. No randomised clinical trial evidence was identified that indicated that any of the tests significantly improved key patient outcomes, such as mortality or duration in an intensive care unit or hospital. Base case 1 estimated that none of the three tests provided a benefit to patients compared with standard practice and thus all tests were dominated. In contrast, in base case 2 it was estimated that all cost per QALY-gained values were below £20,000; the IRIDICA BAC BSI assay had the highest estimated incremental net benefit, but results from base case 2 should be treated with caution as these are not evidence based. Limitations: Robust data to accurately assess the clinical effectiveness and cost-effectiveness of the interventions are currently unavailable. Conclusions: The clinical effectiveness and cost-effectiveness of the interventions cannot be reliably determined with the current evidence base. Appropriate studies, which allow information from the tests to be implemented in clinical practice, are required.
Resumo:
There has been an increasing interest in the development of new methods using Pareto optimality to deal with multi-objective criteria (for example, accuracy and time complexity). Once one has developed an approach to a problem of interest, the problem is then how to compare it with the state of art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Standard tests used for this purpose are able to consider jointly neither performance measures nor multiple competitors at once. The aim of this paper is to resolve these issues by developing statistical procedures that are able to account for multiple competing measures at the same time and to compare multiple algorithms altogether. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters of such models, as usually the number of studied cases is very reduced in such comparisons. Data from a comparison among general purpose classifiers is used to show a practical application of our tests.
Resumo:
In questa tesi sono stati analizzati alcuni metodi di ricerca per dati 3D. Viene illustrata una panoramica generale sul campo della Computer Vision, sullo stato dell’arte dei sensori per l’acquisizione e su alcuni dei formati utilizzati per la descrizione di dati 3D. In seguito è stato fatto un approfondimento sulla 3D Object Recognition dove, oltre ad essere descritto l’intero processo di matching tra Local Features, è stata fatta una focalizzazione sulla fase di detection dei punti salienti. In particolare è stato analizzato un Learned Keypoint detector, basato su tecniche di apprendimento di machine learning. Quest ultimo viene illustrato con l’implementazione di due algoritmi di ricerca di vicini: uno esauriente (K-d tree) e uno approssimato (Radial Search). Sono state riportate infine alcune valutazioni sperimentali in termini di efficienza e velocità del detector implementato con diversi metodi di ricerca, mostrando l’effettivo miglioramento di performance senza una considerabile perdita di accuratezza con la ricerca approssimata.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
This paper is reviewing objective assessments of Parkinson’s disease(PD) motor symptoms, cardinal, and dyskinesia, using sensor systems. It surveys the manifestation of PD symptoms, sensors that were used for their detection, types of signals (measures) as well as their signal processing (data analysis) methods. A summary of this review’s finding is represented in a table including devices (sensors), measures and methods that were used in each reviewed motor symptom assessment study. In the gathered studies among sensors, accelerometers and touch screen devices are the most widely used to detect PD symptoms and among symptoms, bradykinesia and tremor were found to be mostly evaluated. In general, machine learning methods are potentially promising for this. PD is a complex disease that requires continuous monitoring and multidimensional symptom analysis. Combining existing technologies to develop new sensor platforms may assist in assessing the overall symptom profile more accurately to develop useful tools towards supporting better treatment process.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
The education of the radiography profession is based within higher education establishments, yet a critical part of all radiography programmes is the clinical component where students learn the practical skills of the profession. Assessments therefore not only have to assess a student’s knowledge, but also their clinical competence and core skills in line with both Health and Care Professions Council and the Society and College of Radiographers requirements. This timely thesis examines the possibility of using the Virtual Environment for RadioTherapy (VERT) as an assessment tool to evaluate a student’s competence so giving the advantage of a standard assessment and relieving time pressures in the clinical department. A mixed methods approach was taken which can be described as a Quantitative Qualitative design with the emphasis being on the Quantitative element; a so called QUAN qual design. The quantitative evaluation compared two simulations, one in the virtual reality environment and another in the department using a real treatment machine. Students were asked to perform two electron setups in each simulation; the order being randomly decided and so the study would be described as a randomised cross-over design. Following this, qualitative data was collected in student focus groups to explore student perspectives in more depth. Findings indicated that the performance between the two simulators was significantly different, p < 0∙001; the virtual simulation scoring significantly lower than the hospital based simulation overall and in virtually all parameters being assessed. Thematic analysis of the qualitative data supported this finding and identified 4 main themes; equipment use, a lack of reality, learning opportunities and assessment of competence. One other sub-theme identified for reality was that of the environment and senses.
Resumo:
This thesis addresses the Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend the state-of-art methods allowing for a faster and more stable learning process, such as required for learning in Robotics. The Q-learning update-rule is widely applied, since it allows to learn without the presence of a model of the environment. However, this update-rule is transition-based and does not take advantage of the underlying episodic structure of collected batch of interactions. The Q-Batch update-rule is proposed in this thesis, to process experiencies along the trajectories collected in the interaction phase. This allows a faster propagation of obtained rewards and penalties, resulting in faster and more robust learning. Non-parametric function approximations are explored, such as Gaussian Processes. This type of approximators allows to encode prior knowledge about the latent function, in the form of kernels, providing a higher level of exibility and accuracy. The application of Gaussian Processes in Batch Reinforcement Learning presented a higher performance in learning tasks than other function approximations used in the literature. Lastly, in order to extract more information from the experiences collected by the agent, model-learning techniques are incorporated to learn the system dynamics. In this way, it is possible to augment the set of collected experiences with experiences generated through planning using the learned models. Experiments were carried out mainly in simulation, with some tests carried out in a physical robotic platform. The obtained results show that the proposed approaches are able to outperform the classical Fitted Q Iteration.
Resumo:
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure with the top layer running a trust-based consensus algorithm and the bottom layer as a subroutine executing a global trust update scheme. We utilize a set of pre-trusted nodes, headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can be easily extensible to contain more complicated decision rules, and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e. it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems however, we assume that nodes are heterogeneous, i.e, given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who are either investing little efforts in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and extensible to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features and the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful in task assignment adaptive to workers' trust in the future. These results can be applied for fusion of information from multiple data sources like sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real life applications. Instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either. Answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge which allows users to express logical relations using first-order logic rules and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers. The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs cost. The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than to average crowd and completion of a challenging task is more costly than a click-away question. In this problem, we address the problem of optimal assignment of heterogeneous tasks to workers of varying trust levels with budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and pre-set budget, and outputs the optimal assignment of tasks to workers. We derive the bound of total error probability that relates to budget, trustworthiness of crowds, and costs of obtaining labels from crowds naturally. Higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component. Therefore, it can be combined with generic trust evaluation algorithms.
Resumo:
An inference task in one in which some known set of information is used to produce an estimate about an unknown quantity. Existing theories of how humans make inferences include specialized heuristics that allow people to make these inferences in familiar environments quickly and without unnecessarily complex computation. Specialized heuristic processing may be unnecessary, however; other research suggests that the same patterns in judgment can be explained by existing patterns in encoding and retrieving memories. This dissertation compares and attempts to reconcile three alternate explanations of human inference. After justifying three hierarchical Bayesian version of existing inference models, the three models are com- pared on simulated, observed, and experimental data. The results suggest that the three models capture different patterns in human behavior but, based on posterior prediction using laboratory data, potentially ignore important determinants of the decision process.
Resumo:
Efficient crop monitoring and pest damage assessments are key to protecting the Australian agricultural industry and ensuring its leading position internationally. An important element in pest detection is gathering reliable crop data frequently and integrating analysis tools for decision making. Unmanned aerial systems are emerging as a cost-effective solution to a number of precision agriculture challenges. An important advantage of this technology is it provides a non-invasive aerial sensor platform to accurately monitor broad acre crops. In this presentation, we will give an overview on how unmanned aerial systems and machine learning can be combined to address crop protection challenges. A recent 2015 study on insect damage in sorghum will illustrate the effectiveness of this methodology. A UAV platform equipped with a high-resolution camera was deployed to autonomously perform a flight pattern over the target area. We describe the image processing pipeline implemented to create a georeferenced orthoimage and visualize the spatial distribution of the damage. An image analysis tool has been developed to minimize human input requirements. The computer program is based on a machine learning algorithm that automatically creates a meaningful partition of the image into clusters. Results show the algorithm delivers decision boundaries that accurately classify the field into crop health levels. The methodology presented in this paper represents a venue for further research towards automated crop protection assessments in the cotton industry, with applications in detecting, quantifying and monitoring the presence of mealybugs, mites and aphid pests.
Resumo:
This dissertation investigates the connection between spectral analysis and frame theory. When considering the spectral properties of a frame, we present a few novel results relating to the spectral decomposition. We first show that scalable frames have the property that the inner product of the scaling coefficients and the eigenvectors must equal the inverse eigenvalues. From this, we prove a similar result when an approximate scaling is obtained. We then focus on the optimization problems inherent to the scalable frames by first showing that there is an equivalence between scaling a frame and optimization problems with a non-restrictive objective function. Various objective functions are considered, and an analysis of the solution type is presented. For linear objectives, we can encourage sparse scalings, and with barrier objective functions, we force dense solutions. We further consider frames in high dimensions, and derive various solution techniques. From here, we restrict ourselves to various frame classes, to add more specificity to the results. Using frames generated from distributions allows for the placement of probabilistic bounds on scalability. For discrete distributions (Bernoulli and Rademacher), we bound the probability of encountering an ONB, and for continuous symmetric distributions (Uniform and Gaussian), we show that symmetry is retained in the transformed domain. We also prove several hyperplane-separation results. With the theory developed, we discuss graph applications of the scalability framework. We make a connection with graph conditioning, and show the in-feasibility of the problem in the general case. After a modification, we show that any complete graph can be conditioned. We then present a modification of standard PCA (robust PCA) developed by Cand\`es, and give some background into Electron Energy-Loss Spectroscopy (EELS). We design a novel scheme for the processing of EELS through robust PCA and least-squares regression, and test this scheme on biological samples. Finally, we take the idea of robust PCA and apply the technique of kernel PCA to perform robust manifold learning. We derive the problem and present an algorithm for its solution. There is also discussion of the differences with RPCA that make theoretical guarantees difficult.
Resumo:
Natural language processing has achieved great success in a wide range of ap- plications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all information it needs and makes a one-time prediction. In this disser- tation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existent batch model and produce a dynamic model to handle sequential inputs and outputs. Webuild our framework upon theories of Markov Decision Process (MDP), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision- making approaches. We first propose a general framework for cost-sensitive prediction, where dif- ferent parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all input, while inducing a much smaller cost. Next, we extend the framework to problems where the input is revealed incremen- tally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this set- ting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.