861 results for Robust Learning Algorithm
Abstract:
This research establishes new optimization methods for pattern recognition and classification of different white blood cells in actual patient data, with the aim of enhancing the diagnostic process. Beckman-Coulter Corporation supplied flow cytometry data from numerous patients, which were used as training sets to exploit the different physiological characteristics of the samples provided. Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were used as promising pattern classification techniques to identify different white blood cell samples and to provide medical doctors with diagnostic references for a specific disease state, leukemia. The results indicate that a neural network classifier that is well configured and trained with cross-validation can perform better than support vector classifiers alone for this type of data. Furthermore, a new unsupervised learning algorithm, the Density-based Adaptive Window Clustering algorithm (DAWC), was designed to process large volumes of data and find the locations of high-density data clusters in real time. It reduces the computational load to ∼O(N) computations, making the algorithm faster and more attractive than current hierarchical algorithms.
Abstract:
Beamforming is a technique widely used in various fields. With the aid of an antenna array, beamforming aims to minimize the contribution of interferers arriving from unknown directions while capturing the desired signal from a given direction. This thesis proposes beamforming techniques that use Reinforcement Learning (RL), through the Q-Learning algorithm, in antenna arrays. One proposal is to use RL to find the optimal policy for selecting between beamforming (BF) and power control (PC), in order to better leverage the individual characteristics of each for a given Signal to Interference plus Noise Ratio (SINR). Another proposal is to use RL to determine the optimal policy for choosing between the blind beamforming algorithms CMA (Constant Modulus Algorithm) and DD (Decision Directed) in multipath environments. Simulation results showed that the RL technique can be effective in achieving an optimal policy for switching between the different techniques.
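As a rough illustration of the Q-Learning idea named in this abstract, the sketch below learns which of two techniques to select in each of two SINR regimes. The two-state environment, the reward function, and all parameter values are hypothetical and are not taken from the thesis; only the one-step Q-learning update itself is standard.

```python
import random

ACTIONS = ("beamforming", "power_control")  # the two techniques named in the abstract


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a dict-of-dicts table."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the two actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def toy_reward(state, action):
    # Hypothetical reward: beamforming pays off in the low-SINR state (0),
    # power control in the high-SINR state (1). Not taken from the thesis.
    good = "beamforming" if state == 0 else "power_control"
    return 1.0 if action == good else 0.0


def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in (0, 1)}
    state = 0
    for _ in range(episodes):
        action = choose_action(q, state, epsilon=0.2, rng=rng)
        reward = toy_reward(state, action)
        next_state = rng.choice((0, 1))  # toy dynamics: the SINR regime changes at random
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q


q_table = train()
# Greedy policy read off the learned table: one preferred technique per state.
policy = {s: max(ACTIONS, key=lambda a: q_table[s][a]) for s in q_table}
```

In a real system the state would presumably be derived from measured SINR and the reward from link quality; those design choices are outside what the abstract specifies.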
Abstract:
Many tracking algorithms have difficulty dealing with occlusions and background clutter, and consequently do not converge to an appropriate solution. Tracking based on the mean shift algorithm has shown robust performance in many circumstances but still fails, for example, when encountering dramatic intensity or colour changes in a pre-defined neighbourhood. In this paper, we present a robust tracking algorithm that integrates the advantages of mean shift tracking with those of tracking local invariant features. These features are integrated into the mean shift formulation so that tracking is performed based on both mean shift and feature probability distributions, coupled with an expectation maximisation scheme. Experimental results show robust tracking performance on a series of complicated real image sequences. © 2010 IEEE.
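For readers unfamiliar with the basic mean shift step that this tracker builds on, here is a minimal sketch with a flat circular kernel. It covers only the plain mean shift iteration, not the feature integration or the expectation maximisation scheme described in the paper. The function name, the point/weight representation, and the example data are illustrative assumptions.

```python
def mean_shift(points, weights, start, bandwidth, max_iter=50, tol=1e-3):
    """Repeatedly move a circular window to the weighted centroid of the
    points it covers, until the shift is smaller than tol (flat kernel)."""
    cx, cy = start
    for _ in range(max_iter):
        sw = sx = sy = 0.0
        for (x, y), w in zip(points, weights):
            if (x - cx) ** 2 + (y - cy) ** 2 <= bandwidth ** 2:
                sw += w
                sx += w * x
                sy += w * y
        if sw == 0.0:
            break  # empty window: nothing to move towards
        nx, ny = sx / sw, sy / sw
        shift = ((nx - cx) ** 2 + (ny - cy) ** 2) ** 0.5
        cx, cy = nx, ny
        if shift < tol:
            break
    return cx, cy


# Hypothetical target: a 5x5 blob of equally weighted pixels centred on (10, 10).
blob = [(x, y) for x in range(8, 13) for y in range(8, 13)]
center = mean_shift(blob, [1.0] * len(blob), start=(7.0, 7.0), bandwidth=3.0)
```

In a tracker, the weights would typically come from a colour-histogram similarity map rather than being uniform.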
Abstract:
Purpose: To investigate the effect of incorporating a beam spreading parameter in a beam angle optimization algorithm and to evaluate its efficacy for creating coplanar IMRT lung plans in conjunction with machine learning generated dose objectives.
Methods: Fifteen anonymized patient cases were each re-planned with ten values over the range of the beam spreading parameter, k, and analyzed with a Wilcoxon signed-rank test to determine whether any particular value resulted in significant improvement over the initially treated plan created by a trained dosimetrist. Dose constraints were generated by a machine learning algorithm and kept constant for each case across all k values. Parameters investigated for potential improvement included mean lung dose, V20 lung, V40 heart, 80% conformity index, and 90% conformity index.
Results: At a significance level of 5%, treatment plans created with this method had significantly better conformity indices. Dose coverage to the PTV was improved by an average of 12% over the initial plans. At the same time, these treatment plans showed no significant difference in mean lung dose, V20 lung, or V40 heart when compared to the initial plans; however, these results could be influenced by the small sample size of patient cases.
Conclusions: The beam angle optimization algorithm, with the inclusion of the beam spreading parameter k, increases the dose conformity of the automatically generated treatment plans over that of the initial plans without adversely affecting the dose to organs at risk. This parameter can be varied according to physician preference in order to control the tradeoff between dose conformity and OAR sparing without compromising the integrity of the plan.
Abstract:
This paper outlines the development of a cross-correlation algorithm and a spiking neural network (SNN) for sound localisation based on real sound recorded in a noisy and dynamic environment by a mobile robot. The SNN architecture aims to simulate the sound localisation ability of the mammalian auditory pathways by exploiting the binaural cue of interaural time difference (ITD). The medial superior olive was the inspiration for the SNN architecture, which required the integration of an encoding layer that produced biologically realistic spike trains, a model of the bushy cells found in the cochlear nucleus, and a supervised learning algorithm. The experimental results demonstrate that biologically inspired sound localisation achieved using an SNN can compare favourably to the more classical technique of cross-correlation.
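The classical cross-correlation approach to estimating an interaural time difference can be sketched as follows. This is a generic brute-force version, not the paper's implementation; the function name, the toy signals, and the 44.1 kHz sample rate are assumptions made for illustration.

```python
def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) that maximizes sum(left[i] * right[i + lag]).
    A positive lag means `right` is a delayed copy of `left`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy binaural pair: the right channel is the left channel delayed by 3 samples.
left = [0.0] * 10 + [1.0, 2.0, 3.0, 2.0, 1.0] + [0.0] * 10
right = [0.0] * 3 + left[:-3]

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the abstract
lag_samples = estimate_delay(left, right, max_lag=8)
itd_seconds = lag_samples / SAMPLE_RATE_HZ
```

The lag search is quadratic in the worst case; for long recordings an FFT-based correlation would normally be used instead.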
Abstract:
The goal of image retrieval and matching is to find and locate object instances in images from a large-scale image database. While visual features are abundant, combining them to improve on the performance of individual features remains a challenging task. In this work, we focus on leveraging multiple features for accurate and efficient image retrieval and matching. We first propose two graph-based approaches to rerank initially retrieved images for generic image retrieval. In the graph, vertices are images and edges are similarities between image pairs. Our first approach fuses graphs using a mixture Markov model based on a random walk over multiple graphs. We introduce a probabilistic model to compute the importance of each feature for graph fusion under a naive Bayesian formulation, which requires statistics of similarities from a manually labeled dataset containing irrelevant images. To reduce human labeling, we further propose a fully unsupervised reranking algorithm based on a submodular objective function that can be efficiently optimized by a greedy algorithm. By maximizing an information gain term over the graph, our submodular function favors a subset of database images that are similar to the query images and resemble each other. The function also exploits the rank relationships of images from multiple ranked lists obtained with different features. We then study a more well-defined application, person re-identification, where the database contains labeled images of human bodies captured by multiple cameras. Re-identifications from multiple cameras are regarded as related tasks in order to exploit shared information. We apply a novel multi-task learning algorithm using both low-level features and attributes. A low-rank attribute embedding is jointly learned within the multi-task learning formulation to embed the original binary attributes in a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered.
To locate objects in images, we design an object detector based on object proposals and deep convolutional neural networks (CNN) in view of the emergence of deep networks. We improve a Fast RCNN framework and investigate two new strategies to detect objects accurately and efficiently: scale-dependent pooling (SDP) and cascaded rejection classifiers (CRC). The SDP improves detection accuracy by exploiting appropriate convolutional features depending on the scale of input object proposals. The CRC effectively utilizes convolutional features and greatly eliminates negative proposals in a cascaded manner, while maintaining a high recall for true objects. The two strategies together improve the detection accuracy and reduce the computational cost.
Abstract:
Efficient crop monitoring and pest damage assessments are key to protecting the Australian agricultural industry and ensuring its leading position internationally. An important element in pest detection is gathering reliable crop data frequently and integrating analysis tools for decision making. Unmanned aerial systems are emerging as a cost-effective solution to a number of precision agriculture challenges. An important advantage of this technology is that it provides a non-invasive aerial sensor platform to accurately monitor broad-acre crops. In this presentation, we give an overview of how unmanned aerial systems and machine learning can be combined to address crop protection challenges. A 2015 study on insect damage in sorghum illustrates the effectiveness of this methodology. A UAV platform equipped with a high-resolution camera was deployed to autonomously fly a pattern over the target area. We describe the image processing pipeline implemented to create a georeferenced orthoimage and visualize the spatial distribution of the damage. An image analysis tool has been developed to minimize human input requirements. The computer program is based on a machine learning algorithm that automatically creates a meaningful partition of the image into clusters. Results show that the algorithm delivers decision boundaries that accurately classify the field into crop health levels. The methodology presented in this paper represents an avenue for further research towards automated crop protection assessments in the cotton industry, with applications in detecting, quantifying and monitoring the presence of mealybugs, mites and aphid pests.
Abstract:
Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all the information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., by plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existing batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon the theory of Markov Decision Processes (MDPs), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches. We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs.
We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all of the input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
Abstract:
Data sources are often geographically dispersed in real-life applications. Finding a knowledge model may require joining all the data sources and running a machine learning algorithm on the joint set. We present an alternative based on a Multi Agent System (MAS): an agent mines one data source in order to extract a local theory (knowledge model) and then merges it with the previous MAS theory using a knowledge fusion technique. In this way, we obtain a global theory that summarizes the distributed knowledge without spending resources and time on joining data sources. New experiments have been carried out, including a statistical significance analysis. The results show that knowledge fusion significantly improves on the accuracy of both the initial theories and the monolithic solution.
Abstract:
Organizations and their environments are complex systems. Such systems are difficult to understand and predict. Nevertheless, prediction is a fundamental task for business management and for decision-making, which always involves risk. Classical forecasting methods (among them linear regression, the Autoregressive Moving Average, and exponential smoothing) make assumptions such as linearity and stability in order to be mathematically and computationally tractable. The limitations of such methods, however, have been demonstrated in various ways. In recent decades, new forecasting methods have emerged that aim to embrace the complexity of organizational systems and their environments rather than avoid it. Among them, the most promising are bio-inspired forecasting methods (e.g., neural networks, genetic/evolutionary algorithms, and artificial immune systems). This article aims to establish the current state of existing and potential applications of bio-inspired forecasting methods in management.
Abstract:
Modifications in vegetation cover can have an impact on the climate through changes in biogeochemical and biogeophysical processes. In this paper, the tree canopy cover percentage of a savannah-like ecosystem (montado/dehesa) was estimated at Landsat pixel level for 2011, and the role of different canopy cover percentages in land surface albedo (LSA) and land surface temperature (LST) was analysed. A modelling procedure using an SGB machine-learning algorithm, with Landsat 5-TM spectral bands and derived vegetation indices as explanatory variables, showed that montado canopy cover was estimated with good agreement (R² = 78.4%). Overall, the montado canopy cover estimates showed that the low canopy cover class (MT_1) is the most representative, with 50.63% of the total montado area. MODIS LSA and LST products were used to investigate the magnitude of differences in mean annual LSA and LST values between contrasting montado canopy cover percentages. A statistically significant relationship was found between montado canopy cover percentage and mean annual surface albedo (R² = 0.866, p < 0.001) and surface temperature (R² = 0.942, p < 0.001). Comparisons between the four contrasting montado canopy cover classes showed marked differences in LSA (χ² = 192.17, df = 3, p < 0.001) and LST (χ² = 318.18, df = 3, p < 0.001). The highest montado canopy cover class (MT_4) generally had lower albedo than the lowest canopy cover class, with a difference of −11.2% in mean annual albedo values. It was also shown that MT_4 and MT_3 are the cooler canopy cover classes and MT_2 and MT_1 the warmer ones, with the MT_1 class differing by 3.42 °C from the MT_4 class.
Overall, this research highlighted the role that potential changes in montado canopy cover may play in local land surface albedo and temperature variations, as an increase in these two biogeophysical parameters may potentially bring about, in the long term, local/regional climatic changes moving towards greater aridity.
Abstract:
Descriptions of tourism products in the hotel, aviation, rent-a-car and holiday package sectors are based mainly on textual descriptions in very heterogeneous natural language, with styles, presentations and content that differ greatly from one another. Since the tourism sector is quite dynamic and its products and offers are constantly changing, manual normalization of all this information is not feasible. In this work, a prototype was built that enables the automatic classification and extraction of information from descriptions of tourism products. Initially, the information is classified by type. Next, the relevant elements of each type are extracted and easily computable objects are generated. From the extracted objects, the prototype uses text and image templates to automatically generate normalized descriptions targeted at a given market. This versatility enables a new set of services for promoting and selling products that would be impossible to implement with the original information. Although it could be applied to other domains, this prototype was evaluated on the normalization of hotel descriptions. The descriptive sentences of a hotel are classified according to their type (Location, Services and/or Equipment) by a machine learning algorithm that obtains average values of 96% recall and 72% precision. Recall was considered the most important measure, since maximizing it ensures that sentences are not lost for later processing. This work also enabled the construction and population of a hotel database that supports searching for hotels by their characteristics. This functionality would not be possible using the original content. ABSTRACT: The description of tourism products, such as hotels, aviation, rent-a-car and holiday packages, relies strongly on natural language expressions.
Due to the extent of tourism offers and the highly dynamic nature of the tourism sector, manual data management is not a reliable or scalable solution. Offer descriptions, numbering in the thousands, are structured in different ways, possibly comprise different languages, and complement and/or overlap one another. This work aims at creating a prototype for the automatic classification and extraction of relevant knowledge from tourism-related text expressions. Captured knowledge is represented in a normalized/standard format to enable new services based on this information for promoting and selling tourism products, services that would be impossible to implement with the raw information. Although it could be applied to other areas, this prototype was evaluated on the normalization of hotel descriptions. Hotel descriptive sentences are classified according to their type (Location, Services and/or Equipment) using a machine learning algorithm. The resulting system obtained an average recall of 96% and precision of 72%. Recall was considered the most important measure of performance, since maximizing it ensures that sentences are not lost in further processing. As a by-product, a database of hotels was built and populated, with facilities for searching on hotel characteristics. This capability would not be possible using the original contents.
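The recall and precision figures quoted in this abstract follow the usual definitions, which can be sketched as below for a single sentence type. The function name and the example labels are hypothetical and do not reproduce the study's data.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for one class from parallel 0/1 label lists."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for five sentences: 1 = "Location" sentence, 0 = other.
predicted = [1, 1, 1, 1, 0]
actual = [1, 1, 1, 0, 0]
precision, recall = precision_recall(predicted, actual)
```

In this toy case every true "Location" sentence is kept (recall 1.0) at the price of one false positive (precision 0.75), which mirrors the trade-off the authors describe when they prioritize recall.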
Abstract:
Digital soil mapping is an alternative for the recognition of soil classes in areas where pedological surveys are not available. The main aim of this study was to obtain a digital soil map using artificial neural networks (ANN) and environmental variables that express soil-landscape relationships. This study was carried out in an area of 11,072 ha located in the Barra Bonita municipality, state of São Paulo, Brazil. A soil survey was obtained from a reference area of approximately 500 ha located in the center of the area studied. With the mapping units identified, together with the environmental variables elevation, slope, slope plan, slope profile, convergence index, geology and geomorphic surfaces, a supervised classification by ANN was implemented. The neural network simulator used was the Java NNS with the "back propagation" learning algorithm. Reference points were collected for evaluating the performance of the digital map produced. The occurrence of soils in the landscape obtained in the reference area was observed in the following digital classification: medium-textured soils at the highest positions of the landscape, originating from sandstone, and clayey loam soils in the end thirds of the hillsides due to the greater presence of basalt. The variables elevation and slope were the most important factors for discriminating soil class through the ANN. An accuracy level of 82% between the reference points and the digital classification was observed. The methodology proposed allowed for a preliminary soil classification of an area not previously mapped, using mapping units obtained in a reference area.
Abstract:
In this thesis, we analyze how climate risk impacts economic players and its consequences for financial markets. Essentially, the literature identifies two main channels through which climate change poses risks to the status quo, namely physical and transition risk, which we cover in three works. On the one hand, the call for a global shift to a net-zero economy implicitly devalues assets that contribute to global warming and that regulators are pushing to phase out. On the other hand, abnormal changes in temperatures as well as weather-related events challenge the environmental equilibrium and could directly affect operations as well as profitability. We start the analysis with the physical component, by presenting a statistical measure that represents shocks to the distribution of temperature anomalies. We compare this statistic with classical physical measures and find that it drives electricity consumption, the weather derivatives market, and the cross-section of equity returns. We find two transmission channels, namely investor attention and firm operations. We then analyze the transition risk component, by associating a characterization of the regulatory horizon with fixed income valuation. We disentangle a risk driver for corporate bond overperformance that is tied to changes in credit riskiness. Using a statistical learning algorithm to forecast excess returns, we include carbon emission metrics but find no clear evidence of an effect. Finally, we analyze the effects of changes in carbon emissions on a regulated market such as the EU ETS by selecting utility sector corporate bonds and, after controlling for possible risk factors, we document how a firm's carbon profile differently affects the term structure of credit riskiness.
Abstract:
Astrocytes are the most numerous glial cell type in the mammalian brain and permeate the entire CNS, interacting with neurons, vasculature, and other glial cells. Astrocytes display intracellular calcium signals that encode information about local synaptic function, distributed network activity, and high-level cognitive functions. Several studies have investigated the calcium dynamics of astrocytes in sensory areas and have shown that these cells can encode sensory stimuli. Nevertheless, only recently has the neuroscientific community focused its attention on the role and functions of astrocytes in associative areas such as the hippocampus. In our first study, we used the information theory formalism to show that astrocytes in the CA1 area of the hippocampus, recorded with 2-photon fluorescence microscopy during spatial navigation, encode spatial information that is complementary and synergistic to information encoded by nearby "place cell" neurons. In our second study, we investigated various computational aspects of applying the information theory formalism to astrocytic calcium data. To this end, we generated realistic simulations of calcium signals in astrocytes to determine optimal hyperparameters and procedures for information measures, and applied them to real astrocytic calcium imaging data. Calcium signals of astrocytes are characterized by complex spatiotemporal dynamics occurring in subcellular parcels of the astrocytic domain, which makes studying these cells in 2-photon calcium imaging recordings difficult. However, current analytical tools that identify the astrocytic subcellular regions are time consuming and rely extensively on user-defined parameters. Here, we present Rapid Astrocytic calcium Spatio-Temporal Analysis (RASTA), a novel machine learning algorithm for spatiotemporal semantic segmentation of 2-photon calcium imaging recordings of astrocytes which operates without human intervention.
We found that RASTA provided fast and accurate identification of astrocytic cell somata, processes, and cellular domains, extracting calcium signals from identified regions of interest across individual cells and populations of hundreds of astrocytes recorded in awake mice.