834 results for Naïve Bayes classifier
Abstract:
Mollusk shells are frequently radiocarbon dated and provide reliable calibrated age ranges when the regional marine reservoir correction is well-established. For mollusks from an estuarine environment the reservoir correction may be significantly different than the regional marine reservoir correction due to the input of bedrock or soil derived carbonates. Some mollusk species such as oysters are tolerant of a significant range of salinities which makes it difficult to determine which reservoir correction is appropriate. A case study is presented of an anomalous radiocarbon age for an oyster shell paint dish found in the fabric of the ruined nave walls of St Mary's Church, Shoreham-by-Sea, West Sussex, England. Stable isotopes (delta O-18 and delta C-13) were used to establish the type of environment in which the oyster had lived. Paired marine and terrestrial samples from a nearby medieval site were radiocarbon dated to provide an appropriate reservoir correction.
Abstract:
Efficient identification and follow-up of astronomical transients is hindered by the need for humans to manually select promising candidates from data streams that contain many false positives. These artefacts arise in the difference images that are produced by most major ground-based time-domain surveys with large format CCD cameras. This dependence on humans to reject bogus detections is unsustainable for next generation all-sky surveys and significant effort is now being invested to solve the problem computationally. In this paper, we explore a simple machine learning approach to real-bogus classification by constructing a training set from the image data of approximately 32 000 real astrophysical transients and bogus detections from the Pan-STARRS1 Medium Deep Survey. We derive our feature representation from the pixel intensity values of a 20 × 20 pixel stamp around the centre of the candidates. This differs from previous work in that it works directly on the pixels rather than catalogued domain knowledge for feature design or selection. Three machine learning algorithms are trained (artificial neural networks, support vector machines and random forests) and their performances are tested on a held-out subset of 25 per cent of the training data. We find the best results from the random forest classifier and demonstrate that by accepting a false positive rate of 1 per cent, the classifier initially suggests a missed detection rate of around 10 per cent. However, we also find that a combination of bright star variability, nuclear transients and uncertainty in human labelling means that our best estimate of the missed detection rate is approximately 6 per cent.
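The trade-off quoted above (a missed detection rate at a fixed false positive rate) can be illustrated with a minimal sketch. The function name, the score convention (higher means more likely real) and the toy data are assumptions made for illustration; this is not the authors' evaluation code.

```python
def missed_detection_rate_at_fpr(scores, labels, target_fpr=0.01):
    """Pick a score threshold whose false positive rate does not exceed
    target_fpr, then report the fraction of real detections it rejects.

    Assumed conventions (not from the abstract): labels are 1 for real and
    0 for bogus, and a candidate is accepted when score > threshold.
    """
    bogus = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    real = [s for s, y in zip(scores, labels) if y == 1]
    if not bogus or not real:
        raise ValueError("need at least one real and one bogus example")

    # Number of bogus candidates we tolerate above the threshold.
    allowed = int(target_fpr * len(bogus))
    # Thresholding at the next-highest bogus score leaves at most
    # `allowed` bogus candidates strictly above it.
    threshold = bogus[allowed] if allowed < len(bogus) else float("-inf")

    missed = sum(1 for s in real if s <= threshold)
    return threshold, missed / len(real)


# Toy data: four bogus and four real candidates.
scores = [0.1, 0.2, 0.3, 0.9, 0.95, 0.25, 0.5, 0.99]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, mdr = missed_detection_rate_at_fpr(scores, labels, target_fpr=0.25)
```

With real classifier output, `scores` would be the predicted probability of the "real" class on the held-out set, and `target_fpr` would be 0.01 to match the operating point described in the abstract.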
Abstract:
BACKGROUND: Combined Fludarabine and Cyclophosphamide is now standard first-line therapy in chronic lymphocytic leukaemia (CLL) and the addition of Rituximab improves outcome.
METHODS: We adopted a modified Fludarabine, Cyclophosphamide and Rituximab (FCR) protocol in treating 39 patients (median age 57 years) with progressive or advanced CLL. Depending on CR, treatment was given for four or six cycles.
RESULT: Twenty-six patients were treatment naïve and 13 were pre-treated. Twelve patients had progressive Binet stage A, 16 stage B and 11 stage C disease. The overall response rate (ORR) was 100%, with 75% achieving CR. Neutropenia was the major toxicity in 71/187 (38%) of the cycles. There were five deaths, two from infection and three from progressive disease. Twenty-six of 31 patients have maintained their post-treatment disease status for a median of 17 months (2-41).
CONCLUSION: We conclude that FCR is a feasible, well-tolerated and effective treatment for patients with CLL.
Abstract:
A commercial Bacillus anthracis (Anthrax) whole genome protein microarray has been used to identify immunogenic Anthrax proteins (IAP) using sera from groups of donors with (a) confirmed B. anthracis naturally acquired cutaneous infection, (b) confirmed B. anthracis intravenous drug use-acquired infection, (c) occupational exposure in a wool-sorters factory, (d) humans and rabbits vaccinated with the UK Anthrax protein vaccine and compared to naïve unexposed controls. Anti-IAP responses were observed for both IgG and IgA in the challenged groups; however, the anti-IAP IgG response was more evident in the vaccinated group and the anti-IAP IgA response more evident in the B. anthracis-infected groups. Infected individuals appeared somewhat suppressed for their general IgG response, compared with other challenged groups. Immunogenic protein antigens were identified in all groups, some of which were shared between groups whilst others were specific for individual groups. The toxin proteins were immunodominant in all vaccinated, infected or other challenged groups. However, a number of other chromosomally-located and plasmid-encoded open reading frame proteins were also recognized by infected or exposed groups in comparison to controls. Some of these antigens, e.g. BA4182, are not recognized by vaccinated individuals, suggesting that there are proteins more specifically expressed by live Anthrax spores in vivo that are not currently found in the UK licensed Anthrax Vaccine (AVP). These may perhaps be preferentially expressed during infection and represent expression of alternative pathways in the B. anthracis "infectome." These may make highly attractive candidates for diagnostic and vaccine biomarker development as they may be more specifically associated with the infectious phase of the pathogen. A number of B. anthracis small hypothetical protein targets have been synthesized, tested in mouse immunogenicity studies and validated in parallel using human sera from the same study.
Abstract:
Masked implementations of cryptographic algorithms are often used in commercial embedded cryptographic devices to increase their resistance to side channel attacks. In this work we show how neural networks can be used to both identify the mask value, and to subsequently identify the secret key value with a single attack trace with high probability. We propose the use of a pre-processing step using principal component analysis (PCA) to significantly increase the success of the attack. We have developed a classifier that can correctly identify the mask for each trace, hence removing the security provided by that mask and reducing the attack to being equivalent to an attack against an unprotected implementation. The attack is performed on the freely available differential power analysis (DPA) contest data set to allow our work to be easily reproducible. We show that neural networks allow for a robust and efficient classification in the context of side-channel attacks.
Abstract:
Americans have been shown to attribute greater intentionality to immoral than to amoral actions in cases of causal deviance, that is, cases where a goal is satisfied in a way that deviates from initially planned means (e.g., a gunman wants to hit a target and his hand slips, but the bullet ricochets off a rock into the target). However, past research has yet to assess whether this asymmetry persists in cases of extreme causal deviance. Here, we manipulated the level of mild to extreme causal deviance of an immoral versus amoral act. The asymmetry in attributions of intentionality was observed at all but the most extreme level of causal deviance, and, as we hypothesized, was mediated by attributions of blame/credit and judgments of action performance. These findings are discussed as they support a multiple-concepts interpretation of the asymmetry, wherein blame renders a naïve concept of intentional action (the outcome matches the intention) more salient than a composite concept (the outcome matches the intention and was brought about by planned means), and in terms of their implications for cross-cultural research on judgments of agency.
Abstract:
This research presents a fast algorithm for projected support vector machines (PSVM). A basis vector set (BVS) is selected for the kernel-induced feature space, and the training points are projected onto the subspace spanned by the selected BVS. A standard linear support vector machine (SVM) is then produced in the subspace with the projected training points. As the dimension of the subspace is determined by the size of the selected basis vector set, the size of the produced SVM expansion can be specified. A two-stage algorithm is derived which selects and refines the basis vector set, achieving a locally optimal model. The model expansion coefficients and bias are updated recursively for increase and decrease in the basis set and support vector set. The condition for a point to be classed as outside the current basis vector and selected as a new basis vector is derived and embedded in the recursive procedure. This guarantees the linear independence of the produced basis set. The proposed algorithm is tested and compared with an existing sparse primal SVM (SpSVM) and a standard SVM (LibSVM) on seven public benchmark classification problems. Our new algorithm is designed for use in the application area of human activity recognition using smart devices and embedded sensors, where their sometimes limited memory and processing resources must be exploited to the full, and the more robust and accurate the classification, the more satisfied the user. Experimental results demonstrate the effectiveness and efficiency of the proposed algorithm. This work builds upon a previously published algorithm specifically created for activity recognition within mobile applications for the EU Haptimap project [1]. The algorithms detailed in this paper are more memory and resource efficient, making them suitable for use with bigger data sets and more easily trained SVMs.
Abstract:
A practically viable multi-biometric recognition system should not only be stable, robust and accurate but should also adhere to real-time processing speed and memory constraints. This study proposes a cascaded classifier-based framework for use in biometric recognition systems. The proposed framework utilises a set of weak classifiers to reduce the enrolled users' dataset to a small list of candidate users. This list is then used by a strong classifier set as the final stage of the cascade to formulate the decision. At each stage, the candidate list is generated by a Mahalanobis distance-based match score quality measure. One of the key features of the authors' framework is that each classifier in the ensemble can be designed to use a different modality, thus providing the advantages of a truly multimodal biometric recognition system. In addition, it is one of the first truly multimodal cascaded classifier-based approaches for biometric recognition. The performance of the proposed system is evaluated both for single and multimodalities to demonstrate the effectiveness of the approach.
Abstract:
An outlier removal based data cleaning technique is proposed to clean manually pre-segmented human skin data in colour images. The 3-dimensional colour data is projected onto three 2-dimensional planes, from which outliers are removed. The cleaned 2-dimensional data projections are merged to yield clean 3D RGB data. This data is finally used to build a look-up table and a single Gaussian classifier for the purpose of human skin detection in colour images.
Abstract:
One of the most popular techniques of generating classifier ensembles is known as stacking which is based on a meta-learning approach. In this paper, we introduce an alternative method to stacking which is based on cluster analysis. Similar to stacking, instances from a validation set are initially classified by all base classifiers. The output of each classifier is subsequently considered as a new attribute of the instance. Following this, a validation set is divided into clusters according to the new attributes and a small subset of the original attributes of the instances. For each cluster, we find its centroid and calculate its class label. The collection of centroids is considered as a meta-classifier. Experimental results show that the new method outperformed all benchmark methods, namely Majority Voting, Stacking J48, Stacking LR, AdaBoost J48, and Random Forest, in 12 out of 22 data sets. The proposed method has two advantageous properties: it is very robust to relatively small training sets and it can be applied in semi-supervised learning problems. We provide a theoretical investigation regarding the proposed method. This demonstrates that for the method to be successful, the base classifiers applied in the ensemble should have greater than 50% accuracy levels.
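The centroid-based meta-classifier described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: cluster assignments are assumed to come from any clustering algorithm run beforehand, each cluster's class label is assumed to be the majority label of its members, and prediction is assumed to use the nearest centroid in Euclidean distance. All names are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist


def build_centroids(points, labels, cluster_ids):
    """Return one (centroid, majority_label) pair per cluster.

    Each point is assumed to hold the base classifiers' outputs followed by
    a small subset of the original attributes.
    """
    members = defaultdict(list)
    for point, label, cid in zip(points, labels, cluster_ids):
        members[cid].append((point, label))

    centroids = []
    for cid in sorted(members):
        pts = [p for p, _ in members[cid]]
        dim = len(pts[0])
        centre = tuple(sum(p[i] for p in pts) / len(pts) for i in range(dim))
        majority = Counter(lab for _, lab in members[cid]).most_common(1)[0][0]
        centroids.append((centre, majority))
    return centroids


def predict(centroids, point):
    """Label a new instance with the class of its nearest centroid."""
    return min(centroids, key=lambda c: dist(c[0], point))[1]


# Toy validation set: (base classifier 1, base classifier 2, one original attribute).
points = [(1, 1, 0.9), (1, 0, 0.8), (0, 0, 0.1), (0, 1, 0.2)]
labels = [1, 1, 0, 0]
cluster_ids = [0, 0, 1, 1]  # assumed output of a prior clustering step
meta = build_centroids(points, labels, cluster_ids)
```

A new instance is first passed through the base classifiers, its outputs are appended to the chosen original attributes, and `predict(meta, instance)` then returns the class of the closest centroid.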
Abstract:
An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to very accurate inferences. The approach can also be specialized to classification with credal networks based on the maximality criterion. A complexity analysis for both the problem and the algorithm is reported together with numerical experiments, which confirm the good performance of the method. While the inner approximation produced by the algorithm gives rise to a classifier which might return a subset of the optimal class set, preliminary empirical results suggest that the accuracy of the optimal class set is seldom affected by the approximate probabilities.
Abstract:
In this paper, a novel and effective lip-based biometric identification approach with the Discrete Hidden Markov Model Kernel (DHMMK) is developed. Lips are described by shape features (both geometrical and sequential) on two different grid layouts: rectangular and polar. These features are then specifically modeled by a DHMMK, and learnt by a support vector machine classifier. Our experiments are carried out in a ten-fold cross-validation fashion on three different datasets: GPDS-ULPGC Face Dataset, PIE Face Dataset and RaFD Face Dataset. Results show that our approach has achieved an average classification accuracy of 99.8%, 97.13%, and 98.10%, using only two training images per class, on these three datasets, respectively. Our comparative studies further show that the DHMMK achieved a 53% improvement against the baseline HMM approach. The comparative ROC curves also confirm the efficacy of the proposed lip contour based biometrics learned by DHMMK. We also show that the performance of linear and RBF SVM is comparable under the framework of DHMMK.
Abstract:
We present a new wrapper feature selection algorithm for human detection. This algorithm is a hybrid feature selection approach combining the benefits of filter and wrapper methods. It allows the selection of an optimal feature vector that well represents the shapes of the subjects in the images. In detail, the proposed feature selection algorithm adopts the k-fold subsampling and sequential backward elimination approach, while the standard linear support vector machine (SVM) is used as the classifier for human detection. We apply the proposed algorithm to the publicly accessible INRIA and ETH pedestrian full image datasets with the PASCAL VOC evaluation criteria. Compared to other state-of-the-art algorithms, our feature selection based approach can improve the detection speed of the SVM classifier by over 50% with up to 2% better detection accuracy. Our algorithm also outperforms the equivalent systems introduced in the deformable part model approach with around 9% improvement in the detection accuracy.
Abstract:
The research presented investigates the optimal set of operational codes (opcodes) that create a robust indicator of malicious software (malware) and also determines a program's execution duration for accurate classification of benign and malicious software. The features extracted from the dataset are opcode density histograms, extracted during the program execution. The classifier used is a support vector machine and is configured to select those features to produce the optimal classification of malware over different program run lengths. The findings demonstrate that malware can be detected using dynamic analysis with relatively few opcodes.
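An opcode density histogram of the kind mentioned above might be computed as in the sketch below. The normalisation (each opcode's count divided by the trace length), the function name and the toy trace are assumptions for illustration; the abstract does not specify the exact feature definition.

```python
from collections import Counter


def opcode_density(trace, vocabulary):
    """Return the relative frequency of each vocabulary opcode in a trace.

    `trace` is a list of opcode mnemonics recorded during execution;
    `vocabulary` fixes the order of the resulting feature vector.
    Assumption: density = count / total number of executed opcodes.
    """
    total = len(trace)
    if total == 0:
        return [0.0] * len(vocabulary)
    counts = Counter(trace)
    return [counts[op] / total for op in vocabulary]


# Hypothetical execution trace and opcode vocabulary.
trace = ["mov", "push", "mov", "call"]
vocabulary = ["mov", "push", "call", "xor"]
features = opcode_density(trace, vocabulary)

# Different program run lengths can be modelled by truncating the trace.
short_run = opcode_density(trace[:2], vocabulary)
```

Each feature vector would then be one training row for the classifier, with one vector per program and run length.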
Abstract:
Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof of principle study that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that the network-based gene signatures from meta collections of protein-protein interaction (PPI) databases such as CPDB and the PPI databases HPRD and BioGrid improved the classification performance compared to single gene based signatures. The network-based signatures that were derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach for large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
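The first step described above, inflating a sample with gene pair expression ratios taken from a network, could look roughly like the following sketch. The ratio definition (expression of the first gene divided by that of the second), the feature naming and the toy values are assumptions; the abstract does not give the exact formula.

```python
def add_pair_ratios(sample, edges):
    """Extend one expression sample with a ratio feature per network edge.

    `sample` maps gene symbol to expression value; `edges` is an iterable of
    (gene_a, gene_b) pairs from an interaction network. Edges with a missing
    gene or a zero denominator are skipped (an assumption of this sketch).
    """
    inflated = dict(sample)
    for gene_a, gene_b in edges:
        if gene_a in sample and gene_b in sample and sample[gene_b] != 0:
            inflated[f"{gene_a}/{gene_b}"] = sample[gene_a] / sample[gene_b]
    return inflated


# Hypothetical expression values and network edges.
sample = {"TP53": 8.0, "TRIM27": 4.0, "GENE_X": 2.0}
edges = [("TP53", "TRIM27"), ("TP53", "NOT_MEASURED")]
inflated = add_pair_ratios(sample, edges)
```

The inflated samples would then be passed to a sparse feature selection step such as the elastic net named in the abstract.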