947 resultados para Data pre-processing


Relevância:

90.00% 90.00%

Publicador:

Resumo:

SQL Injection Attack (SQLIA) remains a technique used by a computer network intruder to pilfer an organisation’s confidential data. This is done by an intruder re-crafting web form’s input and query strings used in web requests with malicious intent to compromise the security of an organisation’s confidential data stored at the back-end database. The database is the most valuable data source, and thus, intruders are unrelenting in constantly evolving new techniques to bypass the signature’s solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is therefore a need for an automated scalable methodology in the pre-processing of SQLIA features fit for a supervised learning model. However, obtaining a ready-made scalable dataset that is feature engineered with numerical attributes dataset items to train Artificial Neural Network (ANN) and Machine Leaning (ML) models is a known issue in applying artificial intelligence to effectively address ever evolving novel SQLIA signatures. This proposed approach applies numerical attributes encoding ontology to encode features (both legitimate web requests and SQLIA) to numerical data items as to extract scalable dataset for input to a supervised learning model in moving towards a ML SQLIA detection and prevention model. In numerical attributes encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA). This combined with proxy and SQL parser Application Programming Interface (API) to intercept and parse web requests in transition to the back-end database. In developing a solution to address SQLIA, this model allows processed web requests at the proxy deemed to contain injected query string to be excluded from reaching the target back-end database. This paper is intended for evaluating the performance metrics of a dataset obtained by numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) studio using Two-Class Support Vector Machines (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Solar Intensity X-ray and particle Spectrometer (SIXS) on board BepiColombo's Mercury Planetary Orbiter (MPO) will study solar energetic particles moving towards Mercury and solar X-rays on the dayside of Mercury. The SIXS instrument consists of two detector sub-systems; X-ray detector SIXS-X and particle detector SIXS-P. The SIXS-P subdetector will detect solar energetic electrons and protons in a broad energy range using a particle telescope approach with five outer Si detectors around a central CsI(Tl) scintillator. The measurements made by the SIXS instrument are necessary for other instruments on board the spacecraft. SIXS data will be used to study the Solar X-ray corona, solar flares, solar energetic particles, the Hermean magnetosphere, and solar eruptions. The SIXS-P detector was calibrated by comparing experimental measurement data from the instrument with Geant4 simulation data. Calibration curves were produced for the different side detectors and the core scintillator for electrons and protons, respectively. The side detector energy response was found to be linear for both electrons and protons. The core scintillator energy response to protons was found to be non-linear. The core scintillator calibration for electrons was omitted due to insufficient experimental data. The electron and proton acceptance of the SIXS-P detector was determined with Geant4 simulations. Electron and proton energy channels are clean in the main energy range of the instrument. At higher energies, protons and electrons produce non-ideal response in the energy channels. Due to the limited bandwidth of the spacecraft's telemetry, the particle measurements made by SIXS-P have to be pre-processed in the data processing unit of the SIXS instrument. A lookup table was created for the pre-processing of data with Geant4 simulations, and the ability of the lookup table to provide spectral information from a simulated electron event was analysed. The lookup table produces clean electron and proton channels and is able to separate protons and electrons. Based on a simulated solar energetic electron event, the incident electron spectrum cannot be determined from channel particle counts with a standard analysis method.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

En este trabajo se propone un nuevo sistema híbrido para el análisis de sentimientos en clase múltiple basado en el uso del diccionario General Inquirer (GI) y un enfoque jerárquico del clasificador Logistic Model Tree (LMT). Este nuevo sistema se compone de tres capas, la capa bipolar (BL) que consta de un LMT (LMT-1) para la clasificación de la polaridad de sentimientos, mientras que la segunda capa es la capa de la Intensidad (IL) y comprende dos LMTs (LMT-2 y LMT3) para detectar por separado tres intensidades de sentimientos positivos y tres intensidades de sentimientos negativos. Sólo en la fase de construcción, la capa de Agrupación (GL) se utiliza para agrupar las instancias positivas y negativas mediante el empleo de 2 k-means, respectivamente. En la fase de Pre-procesamiento, los textos son segmentados por palabras que son etiquetadas, reducidas a sus raíces y sometidas finalmente al diccionario GI con el objetivo de contar y etiquetar sólo los verbos, los sustantivos, los adjetivos y los adverbios con 24 marcadores que se utilizan luego para calcular los vectores de características. En la fase de Clasificación de Sentimientos, los vectores de características se introducen primero al LMT-1, a continuación, se agrupan en GL según la etiqueta de clase, después se etiquetan estos grupos de forma manual, y finalmente las instancias positivas son introducidas a LMT-2 y las instancias negativas a LMT-3. Los tres árboles están entrenados y evaluados usando las bases de datos Movie Review y SenTube con validación cruzada estratificada de 10-pliegues. LMT-1 produce un árbol de 48 hojas y 95 de tamaño, con 90,88% de exactitud, mientras que tanto LMT-2 y LMT-3 proporcionan dos árboles de una hoja y uno de tamaño, con 99,28% y 99,37% de exactitud,respectivamente. Los experimentos muestran que la metodología de clasificación jerárquica propuesta da un mejor rendimiento en comparación con otros enfoques prevalecientes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In order to fulfil European and Portuguese legal requirements, adequate alternatives to traditional municipal waste landfilling must be found namely concerning organic wastes and others susceptible of valorisation. According to the Portuguese Standard NP 4486:2008, refuse derived fuels (RDF) classification is based on three main parameters: lower heating value (considered as an economic parameter), chlorine content (considered as a technical parameter) and mercury content (considered as an environmental parameter). The purpose of this study was to characterize the rejected streams resulting from the mechanical treatment of unsorted municipal solid waste, from the plastic municipal selective collection and from the composting process, in order to evaluate their potential as RDF. To accomplish this purpose six sampling campaigns were performed. Chemical characterization comprised the proximate analysis – moisture content, volatile matter, ashes and fixed carbon, as well as trace elements. Physical characterization was also done. To evaluate their potential as RDF, the following parameters established in the Portuguese standard were also evaluated: heating value and chlorine content. As expected, results show that the refused stream from mechanical treatment is rather different from the selective collection rejected stream and from the rejected from the compost screening in terms of moisture, energetic matter and ashes, as well as heating value and chlorine. Preliminary data allows us to conclude that studied materials have a very interesting potential to be used as RDF. In fact, the rejected from selective collection and the one from composting have a heating value not very different from coal. Therefore, an important key factor may be the blending of these materials with others of higher heating values, after pre-processing, in order to get fuel pellets with good consistency, storage and handling characteristics and, therefore, combustion behavior.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The final goal of the thesis should be a real-world application in the production test data environment. This includes the pre-processing of the data, building models and visualizing the results. To do this, different machine learning models, outlier prediction oriented, should be investigated using a real dataset. Finally, the different outlier prediction algorithms should be compared, and their performance discussed.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The aim of this dissertation is to describe the methodologies required to design, operate, and validate the performance of ground stations dedicated to near and deep space tracking, as well as the models developed to process the signals acquired, from raw data to the output parameters of the orbit determination of spacecraft. This work is framed in the context of lunar and planetary exploration missions by addressing the challenges in receiving and processing radiometric data for radio science investigations and navigation purposes. These challenges include the designing of an appropriate back-end to read, convert and store the antenna voltages, the definition of appropriate methodologies for pre-processing, calibration, and estimation of radiometric data for the extraction of information on the spacecraft state, and the definition and integration of accurate models of the spacecraft dynamics to evaluate the goodness of the recorded signals. Additionally, the experimental design of acquisition strategies to perform direct comparison between ground stations is described and discussed. In particular, the evaluation of the differential performance between stations requires the designing of a dedicated tracking campaign to maximize the overlap of the recorded datasets at the receivers, making it possible to correlate the received signals and isolate the contribution of the ground segment to the noise in the single link. Finally, in support of the methodologies and models presented, results from the validation and design work performed on the Deep Space Network (DSN) affiliated nodes DSS-69 and DSS-17 will also be reported.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Embedded systems are increasingly integral to daily life, improving and facilitating the efficiency of modern Cyber-Physical Systems which provide access to sensor data, and actuators. As modern architectures become increasingly complex and heterogeneous, their optimization becomes a challenging task. Additionally, ensuring platform security is important to avoid harm to individuals and assets. This study primarily addresses challenges in contemporary Embedded Systems, focusing on platform optimization and security enforcement. The initial section of this study delves into the application of machine learning methods to efficiently determine the optimal number of cores for a parallel RISC-V cluster to minimize energy consumption using static source code analysis. Results demonstrate that automated platform configuration is not only viable but also that there is a moderate performance trade-off when relying solely on static features. The second part focuses on addressing the problem of heterogeneous device mapping, which involves assigning tasks to the most suitable computational device in a heterogeneous platform for optimal runtime. The contribution of this section lies in the introduction of novel pre-processing techniques, along with a training framework called Siamese Networks, that enhances the classification performance of DeepLLVM, an advanced approach for task mapping. Importantly, these proposed approaches are independent from the specific deep-learning model used. Finally, this research work focuses on addressing issues concerning the binary exploitation of software running in modern Embedded Systems. It proposes an architecture to implement Control-Flow Integrity in embedded platforms with a Root-of-Trust, aiming to enhance security guarantees with limited hardware modifications. The approach involves enhancing the architecture of a modern RISC-V platform for autonomous vehicles by implementing a side-channel communication mechanism that relays control-flow changes executed by the process running on the host core to the Root-of-Trust. This approach has limited impact on performance and it is effective in enhancing the security of embedded platforms.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Universidade Estadual de Campinas . Faculdade de Educação Física

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This study presents a solid-like finite element formulation to solve geometric non-linear three-dimensional inhomogeneous frames. To achieve the desired representation, unconstrained vectors are used instead of the classic rigid director triad; as a consequence, the resulting formulation does not use finite rotation schemes. High order curved elements with any cross section are developed using a full three-dimensional constitutive elastic relation. Warping and variable thickness strain modes are introduced to avoid locking. The warping mode is solved numerically in FEM pre-processing computational code, which is coupled to the main program. The extra calculations are relatively small when the number of finite elements. with the same cross section, increases. The warping mode is based on a 2D free torsion (Saint-Venant) problem that considers inhomogeneous material. A scheme that automatically generates shape functions and its derivatives allow the use of any degree of approximation for the developed frame element. General examples are solved to check the objectivity, path independence, locking free behavior, generality and accuracy of the proposed formulation. (C) 2009 Elsevier B.V. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The acquisition of HI Parkes All Shy Survey (HIPASS) southern sky data commenced at the Australia Telescope National Facility's Parkes 64-m telescope in 1997 February, and was completed in 2000 March. HIPASS is the deepest HI survey yet of the sky south of declination +2 degrees, and is sensitive to emission out to 170 h(75)(-1) Mpc. The characteristic root mean square noise in the survey images is 13.3 mJy. This paper describes the survey observations, which comprise 23 020 eight-degree scans of 9-min duration, and details the techniques used to calibrate and image the data. The processing algorithms are successfully designed to be statistically robust to the presence of interference signals, and are particular to imaging point (or nearly point) sources. Specifically, a major improvement in image quality is obtained by designing a median-gridding algorithm which uses the median estimator in place of the mean estimator.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recently, regulating mechanisms of branching morphogenesis of fetal lung rat explants have been an essential tool for molecular research. The development of accurate and reliable segmentation techniques may be essential to improve research outcomes. This work presents an image processing method to measure the perimeter and area of lung branches on fetal rat explants. The algorithm starts by reducing the noise corrupting the image with a pre-processing stage. The outcome is input to a watershed operation that automatically segments the image into primitive regions. Then, an image pixel is selected within the lung explant epithelial, allowing a region growing between neighbouring watershed regions. This growing process is controlled by a statistical distribution of each region. When compared with manual segmentation, the results show the same tendency for lung development. High similarities were harder to obtain in the last two days of culture, due to the increased number of peripheral airway buds and complexity of lung architecture. However, using semiautomatic measurements, the standard deviation was lower and the results between independent researchers were more coherent

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recently, regulating mechanisms of branching morphogenesis of fetal lung rat explants have been an essential tool for molecular research. The development of accurate and reliable segmentation techniques may be essential to improve research outcomes. This work presents an image processing method to measure the perimeter and area of lung branches on fetal rat explants. The algorithm starts by reducing the noise corrupting the image with a pre-processing stage. The outcome is input to a watershed operation that automatically segments the image into primitive regions. Then, an image pixel is selected within the lung explant epithelial, allowing a region growing between neighbouring watershed regions. This growing process is controlled by a statistical distribution of each region. When compared with manual segmentation, the results show the same tendency for lung development. High similarities were harder to obtain in the last two days of culture, due to the increased number of peripheral airway buds and complexity of lung architecture. However, using semiautomatic measurements, the standard deviation was lower and the results between independent researchers were more coherent.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The use of iris recognition for human authentication has been spreading in the past years. Daugman has proposed a method for iris recognition, composed by four stages: segmentation, normalization, feature extraction, and matching. In this paper we propose some modifications and extensions to Daugman's method to cope with noisy images. These modifications are proposed after a study of images of CASIA and UBIRIS databases. The major modification is on the computationally demanding segmentation stage, for which we propose a faster and equally accurate template matching approach. The extensions on the algorithm address the important issue of pre-processing that depends on the image database, being mandatory when we have a non infra-red camera, like a typical WebCam. For this scenario, we propose methods for reflection removal and pupil enhancement and isolation. The tests, carried out by our C# application on grayscale CASIA and UBIRIS images show that the template matching segmentation method is more accurate and faster than the previous one, for noisy images. The proposed algorithms are found to be efficient and necessary when we deal with non infra-red images and non uniform illumination.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

OBJECTIVE: As in Brazil cancer registries are mostly based on large cities, there are no estimates per state or per region and information on the disease incidence in the vast in-land areas is very scarce. An incidence survey was conducted in 18 major cities of the state of São Paulo, excluding the capital, aiming to collect information about cancer incidence in the state of São Paulo. METHODS: Of the 18 cities in state of São Paulo included in the survey, all had available resources for cancer management. Data from the year of 1991 were collected by the personnel of the Instituto Brasileiro de Geografia e Estatística (Brazilian Institute of Statistics), who were especially trained by the study coordinators at the Fundação Oncocentro de São Paulo (Cancer Center of São Paulo). The collected data were processed and analyzed at the Oncocentro. Data collection, processing, and analyses were performed according to the recommendations of the International Agency for Research on Cancer. RESULTS: Although some discrepancies were observed in cancer incidence rates between the cities, results obtained for all 18 cities combined were remarkably close to those recently found for the city of São Paulo in the year 1993. One remarkable finding was the relatively high cancer incidence rates in both sexes in the city of Santos. CONCLUSIONS: The very similar all-sites cancer incidence rates found in the year 1991, when compared to those for the city of São Paulo in the year 1993, are suggestive that all regions have common cancer-related factors. Nevertheless, other explanations, such as the inclusion in the study of prevalent cases, as well as of non-residents, may have occurred in both studies, biasing the results. There is a need of further studies to confirm the high cancer incidence in Santos.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Mestrado em Engenharia Informática