942 results for Rule-Based Classification


Relevance: 30.00%

Abstract:

N-gram analysis is an approach that investigates the structure of a program using bytes, characters or text strings. This research uses dynamic analysis to investigate malware detection using a classification approach based on N-gram analysis. The motivation for this research is to find a subset of N-gram features that makes a robust indicator of malware. The experiments within this paper represent programs as N-gram density histograms, gained through dynamic analysis. A Support Vector Machine (SVM) is used as the program classifier to determine the ability of N-grams to correctly determine the presence of malicious software. The preliminary findings show that N-gram sizes of N=3 and N=4 present the best avenues for further analysis.
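As an illustration of the "N-gram density histogram" representation mentioned in this abstract, here is a minimal Python sketch. The abstract does not say how density is computed, so the sketch assumes it means the relative frequency of each byte N-gram in a captured trace. The function name `ngram_density` and the sample trace are hypothetical, and the SVM classification step is not shown.

```python
from collections import Counter


def ngram_density(data: bytes, n: int) -> dict:
    """Return the relative frequency of each byte n-gram in `data`.

    Assumption: "density" is taken to mean count / total number of n-grams.
    """
    total = len(data) - n + 1
    if total <= 0:
        # The input is shorter than n, so it contains no n-grams.
        return {}
    counts = Counter(data[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical trace: its four 3-grams are ABA, BAB, ABA, BAC.
trace = b"ABABAC"
hist = ngram_density(trace, 3)
```

Each histogram could then be turned into a fixed-length feature vector (one dimension per N-gram kept) before being passed to a classifier.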

Relevance: 30.00%

Abstract:

Mobile malware has been growing in scale and complexity as smartphone usage continues to rise. Android has surpassed other mobile platforms as the most popular whilst also witnessing a dramatic increase in malware targeting the platform. A worrying trend that is emerging is the increasing sophistication of Android malware to evade detection by traditional signature-based scanners. As such, Android app marketplaces remain at risk of hosting malicious apps that could evade detection before being downloaded by unsuspecting users. Hence, in this paper we present an effective approach to alleviate this problem based on Bayesian classification models obtained from static code analysis. The models are built from a collection of code and app characteristics that provide indicators of potential malicious activities. The models are evaluated with real malware samples in the wild and results of experiments are presented to demonstrate the effectiveness of the proposed approach.

Relevance: 30.00%

Abstract:

This paper investigates camera control for capturing bottle cap target images in the fault-detection system of an industrial production line. The main purpose is to identify the targeted bottle caps accurately in real time from the images. This is achieved by combining iterative learning control and Kalman filtering to reduce the effect of various disturbances introduced into the detection system. A mathematical model, together with a physical simulation platform, is established based on the actual production requirements, and the convergence properties of the model are analyzed. It is shown that the proposed method enables accurate real-time control of the camera, and the gain range of the learning rule is also obtained. The numerical simulation and experimental results confirm that the proposed method can reduce the effect of not only repeatable disturbances but also non-repeatable ones.

Relevance: 30.00%

Abstract:

Masked implementations of cryptographic algorithms are often used in commercial embedded cryptographic devices to increase their resistance to side channel attacks. In this work we show how neural networks can be used to both identify the mask value, and to subsequently identify the secret key value with a single attack trace with high probability. We propose the use of a pre-processing step using principal component analysis (PCA) to significantly increase the success of the attack. We have developed a classifier that can correctly identify the mask for each trace, hence removing the security provided by that mask and reducing the attack to being equivalent to an attack against an unprotected implementation. The attack is performed on the freely available differential power analysis (DPA) contest data set to allow our work to be easily reproducible. We show that neural networks allow for a robust and efficient classification in the context of side-channel attacks.

Relevance: 30.00%

Abstract:

The Magellanic Clouds are uniquely placed to study the stellar contribution to dust emission. Individual stars can be resolved in these systems even in the mid-infrared, and they are close enough to allow detection of infrared excess caused by dust. We have searched the Spitzer Space Telescope data archive for all Infrared Spectrograph (IRS) staring-mode observations of the Small Magellanic Cloud (SMC) and found that 209 Infrared Array Camera (IRAC) point sources within the footprint of the Surveying the Agents of Galaxy Evolution in the Small Magellanic Cloud (SAGE-SMC) Spitzer Legacy programme were targeted, within a total of 311 staring-mode observations. We classify these point sources using a decision tree method of object classification, based on infrared spectral features, continuum and spectral energy distribution shape, bolometric luminosity, cluster membership and variability information. We find 58 asymptotic giant branch (AGB) stars, 51 young stellar objects, 4 post-AGB objects, 22 red supergiants, 27 stars (of which 23 are dusty OB stars), 24 planetary nebulae (PNe), 10 Wolf-Rayet stars, 3 H II regions, 3 R Coronae Borealis stars, 1 Blue Supergiant and 6 other objects, including 2 foreground AGB stars. We use these classifications to evaluate the success of photometric classification methods reported in the literature.

Relevance: 30.00%

Abstract:

BACKGROUND: Cetuximab has shown significant clinical activity in metastatic colon cancer. However, cetuximab-containing neoadjuvant chemoradiation has not been shown to improve tumor response in locally advanced rectal cancer patients in recent phase I/II trials. We evaluated functional germline polymorphisms of genes involved in epidermal growth factor receptor pathway, angiogenesis, antibody-dependent cell-mediated cytotoxicity, DNA repair, and drug metabolism, for their potential role as molecular predictors for clinical outcome in locally advanced rectal cancer patients treated with preoperative cetuximab-based chemoradiation.

METHODS: 130 patients (74 men and 56 women) with locally advanced rectal cancer (4 with stage II, 109 with stage III, and 15 with stage IV, 2 unknown) who were enrolled in phase I/II clinical trials treated with cetuximab-based chemoradiation in European cancer centers were included. Genomic DNA was extracted from formalin-fixed paraffin-embedded tumor samples and genotyping was done by using PCR-RFLP assays. Fisher's exact test was used to examine associations between polymorphisms and complete pathologic response (pCR) that was determined by a modified Dworak classification system (grade III vs. grade IV: complete response).

RESULTS: Patients with the epidermal growth factor (EGF) 61 G/G genotype had pCR of 45% (5/11), compared with 21% (11/53) in patients heterozygous, and 2% (1/54) in patients homozygous for the A/A allele (P < 0.001). In addition, this association between EGF 61 G allele and pCR remained significant (P = 0.019) in the 59 patients with wild-type KRAS.

CONCLUSION: This study suggested EGF A+61G polymorphism to be a predictive marker for pCR, independent of KRAS mutation status, to cetuximab-based neoadjuvant chemoradiation of patients with locally advanced rectal cancer.

Relevance: 30.00%

Abstract:

Clusters of text documents output by clustering algorithms are often hard to interpret. We describe motivating real-world scenarios that necessitate reconfigurability and high interpretability of clusters and outline the problem of generating clusterings with interpretable and reconfigurable cluster models. We develop two clustering algorithms toward the outlined goal of building interpretable and reconfigurable cluster models. They generate clusters with associated rules that are composed of conditions on word occurrences or nonoccurrences. The proposed approaches vary in the complexity of the format of the rules; RGC employs disjunctions and conjunctions in rule generation whereas RGC-D rules are simple disjunctions of conditions signifying presence of various words. In both cases, each cluster comprises precisely the set of documents that satisfy the corresponding rule. Rules of the latter kind are easy to interpret, whereas the former leads to more accurate clustering. We show that our approaches outperform the unsupervised decision tree approach for rule-generating clustering and also an approach we provide for generating interpretable models for general clusterings, both by significant margins. We empirically show that the purity and F-measure losses to achieve interpretability can be as little as 3% and 5%, respectively, using the algorithms presented herein.
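To make the idea of a cluster defined by a rule concrete, the following Python sketch shows how a disjunctive rule of the RGC-D kind (presence of any one of several words) could select the documents of a cluster. It only illustrates how a rule is applied, not how the rules are generated. The function names, the whitespace tokenization and the sample documents and rules are all hypothetical.

```python
def satisfies(words: set, rule: set) -> bool:
    """A disjunctive rule holds when at least one of its words occurs."""
    return not rule.isdisjoint(words)


def cluster_by_rules(docs: dict, rules: dict) -> dict:
    """Map each rule name to exactly the set of documents that satisfy it."""
    clusters = {}
    for name, rule in rules.items():
        clusters[name] = {
            doc_id
            for doc_id, text in docs.items()
            # Assumption: lowercase whitespace tokenization is good enough here.
            if satisfies(set(text.lower().split()), rule)
        }
    return clusters


# Hypothetical documents and rules, for illustration only.
docs = {
    "d1": "svm kernel margin",
    "d2": "galaxy dust emission",
    "d3": "kernel density of dust",
}
rules = {"ml": {"svm", "kernel"}, "astro": {"galaxy", "dust"}}
clusters = cluster_by_rules(docs, rules)
```

In this toy input, document `d3` satisfies both rules and therefore appears in both clusters; the sketch makes no attempt to resolve such overlaps.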

Relevance: 30.00%

Abstract:

Social media channels, such as Facebook or Twitter, allow people to express their views and opinions about any public topic. Public sentiment related to future events, such as demonstrations or parades, indicates public attitude and therefore may be applied when trying to estimate the level of disruption and disorder during such events. Consequently, sentiment analysis of social media content may be of interest to different organisations, especially in the security and law enforcement sectors. This paper presents a new lexicon-based sentiment analysis algorithm that has been designed with the main focus on real-time Twitter content analysis. The algorithm consists of two key components, namely sentiment normalisation and an evidence-based combination function, which have been used to estimate the intensity of the sentiment rather than a positive/negative label and to support the mixed sentiment classification process. Finally, we illustrate a case study examining the relation between negative sentiment of Twitter posts related to the English Defence League and the level of disorder during the organisation’s related events.

Relevance: 30.00%

Abstract:

Application of sensor-based technology within activity monitoring systems is becoming a popular technique within the smart environment paradigm. Nevertheless, the use of such an approach generates complex constructs of data, which in turn require the use of intricate activity recognition techniques to automatically infer the underlying activity. This paper explores a cluster-based ensemble method as a new solution for the purposes of activity recognition within smart environments. With this approach, activities are modelled as collections of clusters built on different subsets of features. A classification process is performed by assigning a new instance to its closest cluster from each collection. Two different sensor data representations have been investigated, namely numeric and binary. Following the evaluation of the proposed methodology, it has been demonstrated that the cluster-based ensemble method can be successfully applied as a viable option for activity recognition. Results following exposure to data collected from a range of activities indicated that the ensemble method had the ability to perform with accuracies of 94.2% and 97.5% for numeric and binary data, respectively. These results outperformed a range of single classifiers considered as benchmarks.
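The classification step described in this abstract (assigning a new instance to its closest cluster in each collection) can be sketched in Python as follows. The abstract does not state how the per-collection decisions are combined, so this sketch assumes a simple majority vote and Euclidean distance to cluster centroids. The data layout, function names and activity labels are hypothetical.

```python
import math
from collections import Counter


def closest_label(instance, feature_idx, clusters):
    """Project the instance onto one feature subset and return the label
    of the nearest centroid (Euclidean distance is an assumption)."""
    projected = [instance[i] for i in feature_idx]
    _centroid, label = min(clusters, key=lambda c: math.dist(projected, c[0]))
    return label


def ensemble_predict(instance, collections):
    """Combine the per-collection decisions by majority vote (an assumption)."""
    votes = Counter(
        closest_label(instance, idx, clusters) for idx, clusters in collections
    )
    return votes.most_common(1)[0][0]


# Hypothetical model: three collections, each built on a different pair of
# features, each holding (centroid, activity label) pairs.
two_clusters = [((0.0, 0.0), "sleep"), ((1.0, 1.0), "cook")]
collections = [((0, 1), two_clusters), ((1, 2), two_clusters), ((0, 2), two_clusters)]
instance = (0.9, 0.8, 0.0)
prediction = ensemble_predict(instance, collections)
```

Here the first collection votes "cook" while the other two vote "sleep", so the ensemble returns "sleep".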

Relevance: 30.00%

Abstract:

An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to very accurate inferences. The approach can also be specialized to classification with credal networks based on the maximality criterion. A complexity analysis for both the problem and the algorithm is reported together with numerical experiments, which confirm the good performance of the method. While the inner approximation produced by the algorithm gives rise to a classifier which might return a subset of the optimal class set, preliminary empirical results suggest that the accuracy of the optimal class set is seldom affected by the approximate probabilities.

Relevance: 30.00%

Abstract:

In this paper, a novel and effective lip-based biometric identification approach with the Discrete Hidden Markov Model Kernel (DHMMK) is developed. Lips are described by shape features (both geometrical and sequential) on two different grid layouts: rectangular and polar. These features are then specifically modeled by a DHMMK, and learnt by a support vector machine classifier. Our experiments are carried out in a ten-fold cross-validation fashion on three different datasets: GPDS-ULPGC Face Dataset, PIE Face Dataset and RaFD Face Dataset. Results show that our approach has achieved an average classification accuracy of 99.8%, 97.13%, and 98.10%, using only two training images per class, on these three datasets, respectively. Our comparative studies further show that the DHMMK achieved a 53% improvement against the baseline HMM approach. The comparative ROC curves also confirm the efficacy of the proposed lip contour based biometrics learned by DHMMK. We also show that the performance of linear and RBF SVM is comparable under the framework of DHMMK.

Relevance: 30.00%

Abstract:

Breast cancer screening has led to a dramatic increase in the detection of pre-invasive breast lesions. While mastectomy is almost guaranteed to treat the disease, more conservative approaches could be as effective if patients can be stratified based on risk of co-existing or recurrent invasive disease. Here we use a range of biomarkers to interrogate and classify purely non-invasive lesions (PNL) and those with co-existing invasive breast cancer (CEIN). Apart from Ductal Carcinoma In Situ (DCIS), relative homogeneity is observed. DCIS contained a greater spread of molecular subtypes. Interestingly, high expression of p-mTOR was observed in all PNL with lower expression in DCIS and invasive carcinoma while the opposite expression pattern was observed for TOP2A. Comparing PNL with CEIN, we have identified p53 and Ki67 as predictors of CEIN with a combined PPV and NPV of 90.48% and 43.3% respectively. Furthermore, HER2 expression showed the best concordance between DCIS and its invasive counterpart. We propose that these biomarkers can be used to improve the management of patients with pre-invasive breast lesions following further validation and clinical trials. p53 and Ki67 could be used to stratify patients into low and high-risk groups for co-existing disease. Knowledge of expression of more actionable targets such as HER2 or TOP2A can be used to design chemoprevention or neo-adjuvant strategies. Increased knowledge of the molecular profile of pre-invasive lesions can only serve to enhance our understanding of the disease and, in the era of personalised medicine, bring us closer to improving breast cancer care.

Relevance: 30.00%

Abstract:

This paper investigated using lip movements as a behavioural biometric for person authentication. The system was trained, evaluated and tested using the XM2VTS dataset, following the Lausanne Protocol configuration II. Features were selected from the DCT coefficients of the greyscale lip image. This paper investigated the number of DCT coefficients selected, the selection process, and static and dynamic feature combinations. Using a Gaussian Mixture Model - Universal Background Model framework, an Equal Error Rate of 2.20% was achieved during evaluation, and on an unseen test set a False Acceptance Rate of 1.7% and a False Rejection Rate of 3.0% were achieved. This compares favourably with face authentication results on the same dataset whilst not being susceptible to spoofing attacks.
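For readers unfamiliar with the error measures quoted in this abstract, the Python sketch below shows one common way to compute a False Acceptance Rate (FAR), a False Rejection Rate (FRR) and an approximate Equal Error Rate (EER) from verification scores. It is not the paper's evaluation code. It assumes that higher scores mean a better match, that a claim is accepted when its score is at or above the threshold, and that the EER is approximated by scanning the observed scores as candidate thresholds. All names and scores are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def approx_eer(genuine, impostor):
    """Scan observed scores as thresholds and return (threshold, EER estimate)
    at the point where FAR and FRR are closest to each other."""
    def gap(t):
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)

    best = min(sorted(set(genuine) | set(impostor)), key=gap)
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2


# Hypothetical evaluation-set scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
threshold, eer = approx_eer(genuine_scores, impostor_scores)
```

In a protocol with separate evaluation and test sets, the threshold found on the evaluation set would typically be kept fixed and `far_frr` applied to the unseen test scores, which is why FAR and FRR on a test set need not be equal.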

Relevance: 30.00%

Abstract:

Network management tools must be able to monitor and analyze traffic flowing through network systems. According to the OpenFlow protocol applied in Software-Defined Networking (SDN), packets are classified into flows that are searched in flow tables. Further actions, such as packet forwarding, modification, and redirection to a group table, are made in the flow table with respect to the search results. A novel hardware solution for SDN-enabled packet classification is presented in this paper. The proposed scheme is focused on a label-based search method, achieving high flexibility in memory usage. The implemented hardware architecture provides optimal lookup performance by configuring the search algorithm and by performing fast incremental updates as programmed by the software controller.