871 resultados para Data detection
Resumo:
We are investigating the combination of wavelets and decision trees to detect ships and other maritime surveillance targets from medium resolution SAR images. Wavelets have inherent advantages to extract image descriptors while decision trees are able to handle different data sources. In addition, our work aims to consider oceanic features such as ship wakes and ocean spills. In this incipient work, Haar and Cohen-Daubechies-Feauveau 9/7 wavelets obtain detailed descriptors from targets and ocean features and are inserted with other statistical parameters and wavelets into an oblique decision tree. © 2011 Springer-Verlag.
Resumo:
Optical remote sensing techniques have obvious advantages for monitoring gas and aerosol emissions, since they enable the operation over large distances, far from hostile environments, and fast processing of the measured signal. In this study two remote sensing devices, namely a Lidar (Light Detection and Ranging) for monitoring the vertical profile of backscattered light intensity, and a Sodar (Acoustic Radar, Sound Detection and Ranging) for monitoring the vertical profile of the wind vector were operated during specific periods. The acquired data were processed and compared with data of air quality obtained from ground level monitoring stations, in order to verify the possibility of using the remote sensing techniques to monitor industrial emissions. The campaigns were carried out in the area of the Environmental Research Center (Cepema) of the University of São Paulo, in the city of Cubatão, Brazil, a large industrial site, where numerous different industries are located, including an oil refinery, a steel plant, as well as fertilizer, cement and chemical/petrochemical plants. The local environmental problems caused by the industrial activities are aggravated by the climate and topography of the site, unfavorable to pollutant dispersion. Results of a campaign are presented for a 24- hour period, showing data of a Lidar, an air quality monitoring station and a Sodar. © 2011 SPIE.
Resumo:
Detecting misbehavior (such as transmissions of false information) in vehicular ad hoc networks (VANETs) is a very important problem with wide range of implications, including safety related and congestion avoidance applications. We discuss several limitations of existing misbehavior detection schemes (MDS) designed for VANETs. Most MDS are concerned with detection of malicious nodes. In most situations, vehicles would send wrong information because of selfish reasons of their owners, e.g. for gaining access to a particular lane. It is therefore more important to detect false information than to identify misbehaving nodes. We introduce the concept of data-centric misbehavior detection and propose algorithms which detect false alert messages and misbehaving nodes by observing their actions after sending out the alert messages. With the data-centric MDS, each node can decide whether an information received is correct or false. The decision is based on the consistency of recent messages and new alerts with reported and estimated vehicle positions. No voting or majority decisions is needed, making our MDS resilient to Sybil attacks. After misbehavior is detected, we do not revoke all the secret credentials of misbehaving nodes, as done in most schemes. Instead, we impose fines on misbehaving nodes (administered by the certification authority), discouraging them to act selfishly. This reduces the computation and communication costs involved in revoking all the secret credentials of misbehaving nodes. © 2011 IEEE.
Resumo:
Multisensor data fusion is a technique that combines the readings of multiple sensors to detect some phenomenon. Data fusion applications are numerous and they can be used in smart buildings, environment monitoring, industry and defense applications. The main goal of multisensor data fusion is to minimize false alarms and maximize the probability of detection based on the detection of multiple sensors. In this paper a local data fusion algorithm based on luminosity, temperature and flame for fire detection is presented. The data fusion approach was embedded in a low cost mobile robot. The prototype test validation has indicated that our approach can detect fire occurrence. Moreover, the low cost project allow the development of robots that could be discarded in their fire detection missions. © 2013 IEEE.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Concept drift, which refers to non stationary learning problems over time, has increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of data labeling is usually expensive and/or time consuming when compared to acquisition of unlabeled data, thus usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on assumptions that the data is static. Therefore, semi-supervised learning with concept drifts is still an open challenging task in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extend that approach to handle data streams and concept drift. The result is a passive algorithm which uses a single classifier approach, naturally adapted to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 Data of network intrusion, showing its effectiveness.
Resumo:
In the instrumental records of daily precipitation, we often encounter one or more periods in which values below some threshold were not registered. Such periods, besides lacking small values, also have a large number of dry days. Their cumulative distribution function is shifted to the right in relation to that for other portions of the record having more reliable observations. Such problems are examined in this work, based mostly on the two-sample Kolmogorov–Smirnov (KS) test, where the portion of the series with more number of dry days is compared with the portion with less number of dry days. Another relatively common problem in daily rainfall data is the prevalence of integers either throughout the period of record or in some part of it, likely resulting from truncation during data compilation prior to archiving or by coarse rounding of daily readings by observers. This problem is identified by simple calculation of the proportion of integers in the series, taking the expected proportion as 10%. The above two procedures were applied to the daily rainfall data sets from the European Climate Assessment (ECA), Southeast Asian Climate Assessment (SACA), and Brazilian Water Resources Agency (BRA). Taking the statistic D of the KS test >0.15 and the corresponding p-value <0.001 as the condition to classify a given series as suspicious, the proportions of the ECA, SACA, and BRA series falling into this category are, respectively, 34.5%, 54.3%, and 62.5%. With relation to coarse rounding problem, the proportions of series exceeding twice the 10% reference level are 3%, 60%, and 43% for the ECA, SACA, and BRA data sets, respectively. A simple way to visualize the two problems addressed here is by plotting the time series of daily rainfall for a limited range, for instance, 0–10 mm day−1.
Resumo:
In this paper we discuss the detection of glucose and triglycerides using information visualization methods to process impedance spectroscopy data. The sensing units contained either lipase or glucose oxidase immobilized in layer-by-layer (LbL) films deposited onto interdigitated electrodes. The optimization consisted in identifying which part of the electrical response and combination of sensing units yielded the best distinguishing ability. It is shown that complete separation can be obtained for a range of concentrations of glucose and triglyceride when the interactive document map (IDMAP) technique is used to project the data into a two-dimensional plot. Most importantly, the optimization procedure can be extended to other types of biosensors, thus increasing the versatility of analysis provided by tailored molecular architectures exploited with various detection principles. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Máster Universitario en Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería (SIANI)
Resumo:
Nel mondo della sicurezza informatica, le tecnologie si evolvono per far fronte alle minacce. Non è possibile prescindere dalla prevenzione, ma occorre accettare il fatto che nessuna barriera risulterà impenetrabile e che la rilevazione, unitamente ad una pronta risposta, rappresenta una linea estremamente critica di difesa, ma l’unica veramente attuabile per poter guadagnare più tempo possibile o per limitare i danni. Introdurremo quindi un nuovo modello operativo composto da procedure capaci di affrontare le nuove sfide che il malware costantemente offre e allo stesso tempo di sollevare i comparti IT da attività onerose e sempre più complesse, ottimizzandone il processo di comunicazione e di risposta.
Resumo:
Data sets describing the state of the earth's atmosphere are of great importance in the atmospheric sciences. Over the last decades, the quality and sheer amount of the available data increased significantly, resulting in a rising demand for new tools capable of handling and analysing these large, multidimensional sets of atmospheric data. The interdisciplinary work presented in this thesis covers the development and the application of practical software tools and efficient algorithms from the field of computer science, aiming at the goal of enabling atmospheric scientists to analyse and to gain new insights from these large data sets. For this purpose, our tools combine novel techniques with well-established methods from different areas such as scientific visualization and data segmentation. In this thesis, three practical tools are presented. Two of these tools are software systems (Insight and IWAL) for different types of processing and interactive visualization of data, the third tool is an efficient algorithm for data segmentation implemented as part of Insight.Insight is a toolkit for the interactive, three-dimensional visualization and processing of large sets of atmospheric data, originally developed as a testing environment for the novel segmentation algorithm. It provides a dynamic system for combining at runtime data from different sources, a variety of different data processing algorithms, and several visualization techniques. Its modular architecture and flexible scripting support led to additional applications of the software, from which two examples are presented: the usage of Insight as a WMS (web map service) server, and the automatic production of a sequence of images for the visualization of cyclone simulations. The core application of Insight is the provision of the novel segmentation algorithm for the efficient detection and tracking of 3D features in large sets of atmospheric data, as well as for the precise localization of the occurring genesis, lysis, merging and splitting events. Data segmentation usually leads to a significant reduction of the size of the considered data. This enables a practical visualization of the data, statistical analyses of the features and their events, and the manual or automatic detection of interesting situations for subsequent detailed investigation. The concepts of the novel algorithm, its technical realization, and several extensions for avoiding under- and over-segmentation are discussed. As example applications, this thesis covers the setup and the results of the segmentation of upper-tropospheric jet streams and cyclones as full 3D objects. Finally, IWAL is presented, which is a web application for providing an easy interactive access to meteorological data visualizations, primarily aimed at students. As a web application, the needs to retrieve all input data sets and to install and handle complex visualization tools on a local machine are avoided. The main challenge in the provision of customizable visualizations to large numbers of simultaneous users was to find an acceptable trade-off between the available visualization options and the performance of the application. Besides the implementational details, benchmarks and the results of a user survey are presented.
Resumo:
This paper presents an automated solution for precise detection of fiducial screws from three-dimensional (3D) Computerized Tomography (CT)/Digital Volume Tomography (DVT) data for image-guided ENT surgery. Unlike previously published solutions, we regard the detection of the fiducial screws from the CT/DVT volume data as a pose estimation problem. We thus developed a model-based solution. Starting from a user-supplied initialization, our solution detects the fiducial screws by iteratively matching a computer aided design (CAD) model of the fiducial screw to features extracted from the CT/DVT data. We validated our solution on one conventional CT dataset and on five DVT volume datasets, resulting in a total detection of 24 fiducial screws. Our experimental results indicate that the proposed solution achieves much higher reproducibility and precision than the manual detection. Further comparison shows that the proposed solution produces better results on the DVT dataset than on the conventional CT dataset.
Resumo:
With recent advances in mass spectrometry techniques, it is now possible to investigate proteins over a wide range of molecular weights in small biological specimens. This advance has generated data-analytic challenges in proteomics, similar to those created by microarray technologies in genetics, namely, discovery of "signature" protein profiles specific to each pathologic state (e.g., normal vs. cancer) or differential profiles between experimental conditions (e.g., treated by a drug of interest vs. untreated) from high-dimensional data. We propose a data analytic strategy for discovering protein biomarkers based on such high-dimensional mass-spectrometry data. A real biomarker-discovery project on prostate cancer is taken as a concrete example throughout the paper: the project aims to identify proteins in serum that distinguish cancer, benign hyperplasia, and normal states of prostate using the Surface Enhanced Laser Desorption/Ionization (SELDI) technology, a recently developed mass spectrometry technique. Our data analytic strategy takes properties of the SELDI mass-spectrometer into account: the SELDI output of a specimen contains about 48,000 (x, y) points where x is the protein mass divided by the number of charges introduced by ionization and y is the protein intensity of the corresponding mass per charge value, x, in that specimen. Given high coefficients of variation and other characteristics of protein intensity measures (y values), we reduce the measures of protein intensities to a set of binary variables that indicate peaks in the y-axis direction in the nearest neighborhoods of each mass per charge point in the x-axis direction. We then account for a shifting (measurement error) problem of the x-axis in SELDI output. After these pre-analysis processing of data, we combine the binary predictors to generate classification rules for cancer, benign hyperplasia, and normal states of prostate. Our approach is to apply the boosting algorithm to select binary predictors and construct a summary classifier. We empirically evaluate sensitivity and specificity of the resulting summary classifiers with a test dataset that is independent from the training dataset used to construct the summary classifiers. The proposed method performed nearly perfectly in distinguishing cancer and benign hyperplasia from normal. In the classification of cancer vs. benign hyperplasia, however, an appreciable proportion of the benign specimens were classified incorrectly as cancer. We discuss practical issues associated with our proposed approach to the analysis of SELDI output and its application in cancer biomarker discovery.