990 results for Features selection


Relevance:

70.00%

Publisher:

Abstract:

Email has become the critical communication medium for most organizations. Unfortunately, email-borne attacks in computer networks are causing considerable economic losses worldwide. Existing phishing email blocking appliances have little effect in weeding out the vast majority of phishing emails. At the same time, online criminals are becoming more dangerous and sophisticated. Phishing emails are more active than ever before, putting the average computer user and organizations at risk of significant data, brand and financial loss. In this paper, we propose a hybrid feature selection approach that combines content-based and behaviour-based features. The approach can mine attacker behaviour from the email header. On a publicly available test corpus, our hybrid feature selection achieves a 94% accuracy rate.
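As a rough illustration of what combining content-based and header-derived (behaviour-based) features could look like, here is a minimal sketch using only the Python standard library. The abstract does not list the actual features, so every feature name below (`n_links`, `has_urgent_word`, `from_msgid_mismatch`, `from_replyto_mismatch`) and the sample message are hypothetical.

```python
import re
from email import message_from_string
from email.utils import parseaddr

def domain_of(addr_header):
    """Return the lower-cased domain of an address header, or "" if absent."""
    _, addr = parseaddr(addr_header or "")
    return addr.rpartition("@")[2].lower()

def hybrid_features(raw_email):
    """Extract a few illustrative (hypothetical) features from one raw message."""
    msg = message_from_string(raw_email)
    body = msg.get_payload() if not msg.is_multipart() else ""
    from_dom = domain_of(msg.get("From"))
    reply_dom = domain_of(msg.get("Reply-To"))
    msgid = (msg.get("Message-ID") or "").strip().strip("<>")
    msgid_dom = msgid.rpartition("@")[2].lower()
    return {
        # Content-based features: computed from the message body.
        "n_links": len(re.findall(r"https?://", body)),
        "has_urgent_word": int(bool(re.search(r"\b(verify|urgent|suspended)\b", body, re.I))),
        # Behaviour-based features: inconsistencies between header fields.
        "from_msgid_mismatch": int(bool(from_dom and msgid_dom) and from_dom != msgid_dom),
        "from_replyto_mismatch": int(bool(reply_dom) and reply_dom != from_dom),
    }

# Hypothetical message, invented for this example only.
SAMPLE = (
    "From: Support <support@bank.example.com>\n"
    "Reply-To: helpdesk@other.example.net\n"
    "Message-ID: <123@mailer.example.org>\n"
    "Subject: Account notice\n"
    "\n"
    "Urgent: verify your account at http://login.example.net/ now.\n"
)
print(hybrid_features(SAMPLE))
```

A feature vector of this kind would then be passed to whatever selection and classification stage is used; that stage is not sketched here.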

Relevance:

70.00%

Publisher:

Abstract:

It is crucial for a neuron spike sorting algorithm to cluster data from different neurons efficiently. In this study, the search capability of the Genetic Algorithm (GA) is exploited for identifying the optimal feature subset for neuron spike sorting with a clustering algorithm. Two important objectives of the optimization process are considered: to reduce the number of features and increase the clustering performance. Specifically, we employ a binary GA with the silhouette evaluation criterion as the fitness function for neuron spike sorting using the Super-Paramagnetic Clustering (SPC) algorithm. The clustering results of SPC with and without the GA-based feature selector are evaluated using benchmark synthetic neuron spike data sets. The outcome indicates the usefulness of the GA in identifying a smaller feature set with improved clustering performance.
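The idea of a binary GA whose fitness is a silhouette score can be sketched as follows. This is not the authors' implementation: a small deterministic two-means routine stands in for Super-Paramagnetic Clustering, the GA operators (tournament selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are assumptions, and the toy `DATA` set is invented. The sketch also optimizes the silhouette only; it does not explicitly penalize the number of selected features.

```python
import random

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def two_means(points, iters=10):
    """Deterministic 2-means (stand-in for SPC): seeds are the first point
    and the point farthest from it."""
    c = [points[0], max(points, key=lambda p: dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist(p, c[0]) <= dist(p, c[1]) else 1 for p in points]
        for k in (0, 1):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # keep the old centroid if a cluster empties
                c[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient for a two-cluster labelling."""
    if len(set(labels)) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points) if j != i and labels[j] == labels[i]]
        other = [dist(p, q) for j, q in enumerate(points) if labels[j] != labels[i]]
        if not same:  # singleton cluster contributes 0 by convention
            continue
        a, b = sum(same) / len(same), sum(other) / len(other)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)

def fitness(mask, data):
    """Silhouette of the clustering obtained on the features selected by mask."""
    if not any(mask):
        return -1.0
    proj = [[v for v, m in zip(row, mask) if m] for row in data]
    return silhouette(proj, two_means(proj))

def binary_ga(data, pop_size=8, generations=15, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    n = len(data[0])
    # Start from the all-features mask plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    best = max(pop, key=lambda m: fitness(m, data))
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=lambda m: fitness(m, data))
            p2 = max(rng.sample(pop, 2), key=lambda m: fitness(m, data))
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=lambda m: fitness(m, data))
    return best

# Toy data: feature 0 separates two groups, features 1 and 2 are noise.
DATA = [
    [0.0, 3.0, 6.0], [0.1, 7.0, 1.0], [0.2, 0.0, 4.0], [0.1, 5.0, 7.0],
    [5.0, 2.0, 0.0], [5.1, 6.0, 5.0], [5.2, 1.0, 2.0], [5.1, 4.0, 3.0],
]
best_mask = binary_ga(DATA)
print(best_mask, round(fitness(best_mask, DATA), 3))
```

Because of elitism, the returned mask never scores worse than the all-features mask it started from.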

Relevance:

60.00%

Publisher:

Abstract:

Pen-based user interfaces have become a hot research field in recent years, and pen gestures play an important role in them. However, pen gestures are difficult for UI designers to design and for users to learn and use. To address this, we studied the user-centered design and recognition of pen gestures. We surveyed 100 pen gestures in twelve well-known pen-based systems to identify problems with the gestures currently in use, and we administered a questionnaire evaluating how well commands match pen gestures in order to discover the characteristics that a good pen gesture should have. Cognitive theories were then applied to analyze how those characteristics help improve the learnability of pen gestures. Building on this, we analyzed pen gesture recognition performance and proposed some improvements to feature selection in the pen gesture recognition algorithm. Finally, we used a couple of psychology experiments to evaluate twelve pen gestures designed on the basis of this research. The results show that those gestures are easier for users to learn and use. The results of this paper can serve designers as a basic principle for designing pen gestures in pen-based systems.

Relevance:

60.00%

Publisher:

Abstract:

In this research, we study the effect of feature selection on spike detection and sorting accuracy. We introduce a new feature representation for neural spikes from multichannel recordings. Feature selection plays a significant role in analyzing the response of brain neurons. More precise feature selection leads to more accurate spike sorting, which groups spikes into clusters based on their similarity. Proper spike sorting enables the association between spikes and neurons. Unlike other threshold-based methods, our method employs the cepstrum of spike signals to select candidate spike features. To choose the best features among the candidates, the Kolmogorov-Smirnov (KS) test is used. We then rely on the superparamagnetic method to cluster the neural spikes based on the KS-selected features. Simulation results demonstrate that the proposed method not only achieves more accurate clustering results but also reduces the computational burden, which implies that it can be applied to real-time spike analysis.
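The abstract does not say how the KS test is applied. One reading, assumed here, is to rank candidate features by how strongly their distribution deviates from a fitted normal, since multimodal features tend to separate clusters better. The sketch below computes that one-sample KS statistic with the standard library; the function names and the toy candidates are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def ks_normality_stat(values):
    """One-sample KS statistic of `values` against a normal fitted to them."""
    xs = sorted(values)
    n = len(xs)
    sigma = pstdev(xs)
    if sigma == 0:  # constant feature: no information for clustering
        return 0.0
    ref = NormalDist(mean(xs), sigma)
    d = 0.0
    for i, x in enumerate(xs):
        f = ref.cdf(x)
        # Largest gap between the empirical step function and the fitted CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def rank_by_ks(candidates, k):
    """Return the names of the k candidates that deviate most from normality."""
    ranked = sorted(candidates, key=lambda name: ks_normality_stat(candidates[name]),
                    reverse=True)
    return ranked[:k]

# Toy candidates: one roughly normal feature and one clearly bimodal feature.
CANDIDATES = {
    "unimodal": [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)],
    "bimodal": [-5 + 0.1 * i for i in range(10)] + [5 + 0.1 * i for i in range(10)],
}
print(rank_by_ks(CANDIDATES, k=1))  # prints ['bimodal']
```

In a real pipeline the candidates would be the cepstrum-derived coefficients, and the top-ranked ones would be passed to the clustering step.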

Relevance:

60.00%

Publisher:

Abstract:

Dysfunction of the Autonomic Nervous System (ANS) is a typical feature of chronic heart failure and other cardiovascular diseases. As a simple non-invasive technology, heart rate variability (HRV) analysis provides reliable information on autonomic modulation of heart rate. The aim of this thesis was to research and develop automatic methods based on ANS assessment for the evaluation of risk in cardiac patients. Several feature selection and machine learning algorithms were combined to achieve these goals. Automatic assessment of disease severity in Congestive Heart Failure (CHF) patients: a completely automatic method based on long-term HRV was proposed to assess the severity of CHF, achieving a sensitivity of 93% and a specificity of 64% in discriminating severe from mild patients. Automatic identification of hypertensive patients at high risk of vascular events: a completely automatic system was proposed to identify hypertensive patients at higher risk of developing vascular events in the 12 months following the electrocardiographic recordings, achieving a sensitivity of 71% and a specificity of 86% in identifying high-risk subjects among hypertensive patients. Automatic identification of hypertensive patients with a history of falls: the thesis explored whether automatic identification of fallers among hypertensive patients based on HRV was feasible. The results obtained in this thesis could have implications both in clinical practice and in clinical research. The system has been designed and developed to be clinically feasible. Moreover, since a 5-minute ECG recording is inexpensive, easy to assess, and non-invasive, future research will focus on the clinical applicability of the system as a screening tool in non-specialized ambulatories, in order to identify high-risk patients to be shortlisted for more complex investigations.
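For readers unfamiliar with the two rates quoted above, a short worked computation may help. The counts below are hypothetical and are not taken from the thesis; they only show how sensitivity and specificity follow from a confusion matrix.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of truly high-risk patients that are flagged
    specificity = tn / (tn + fp)  # share of truly low-risk patients that are cleared
    return sensitivity, specificity

# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=30, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints sensitivity=0.80, specificity=0.60
```

A screening tool usually favours high sensitivity, accepting more false positives in exchange for missing fewer high-risk patients.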

Relevance:

40.00%

Publisher:

Abstract:

In this thesis, the genetic variation of human populations from the Baltic Sea region was studied in order to elucidate population history as well as evolutionary adaptation in this region. The study provided novel understanding of how the complex population-level processes of migration, genetic drift, and natural selection have shaped genetic variation in North European populations. Results from genome-wide, mitochondrial DNA and Y-chromosomal analyses suggested that the genetic background of the populations of the Baltic Sea region lies predominantly in Continental Europe, which is consistent with earlier studies and archaeological evidence. The late settlement of Fennoscandia after the Ice Age and the subsequent small population size have led to pronounced genetic drift, especially in Finland and Karelia but also in Sweden, evident especially in genome-wide and Y-chromosomal analyses. Consequently, these populations show striking genetic differentiation, as opposed to the much more homogeneous pattern of variation in Central European populations. Additionally, the eastern side of the Baltic Sea was observed to have experienced eastern influence in the genome-wide data as well as in mitochondrial DNA and Y-chromosomal variation, consistent with linguistic connections. However, Slavic influence in the Baltic Sea populations appears minor at the genetic level. While the genetic diversity of the Finnish population overall was low, genome-wide and Y-chromosomal results showed pronounced regional differences. The genetic distance between Western and Eastern Finland was larger than for many geographically distant population pairs, and provinces also showed genetic differences. This is probably mainly due to the late settlement of Eastern Finland and local isolation, although differences in ancestral migration waves may contribute to this, too.
In contrast, mitochondrial DNA and Y-chromosomal analyses of the contemporary Swedish population revealed a much less pronounced population structure and a fusion of the traces of ancient admixture, genetic drift, and recent immigration. Genome-wide datasets also provide a resource for studying the adaptive evolution of human populations. This study revealed tens of loci with strong signs of recent positive selection in Northern Europe. These results provide interesting targets for future research on evolutionary adaptation, and may be important for understanding the background of disease-causing variants in human populations.

Relevance:

40.00%

Publisher:

Abstract:

Feature selection is an important first step in regional hydrologic studies (RHYS). Over the past few decades, advances in data collection facilities have resulted in the development of data archives on a variety of hydro-meteorological variables that may be used as features in RHYS. Currently there are no established procedures for selecting features from such archives. Therefore, hydrologists often use subjective methods to arrive at a set of features, which may lead to misleading results. To alleviate this problem, a probabilistic clustering method for regionalization is presented to determine appropriate features from the available dataset. The effectiveness of the method is demonstrated by application to the regionalization of watersheds in the conterminous United States for low flow frequency analysis. Plausible homogeneous regions formed by the proposed clustering method are compared with those from conventional methods of regionalization using L-moment based homogeneity tests. Results show that the proposed methodology is promising for RHYS.

Relevance:

40.00%

Publisher:

Abstract:

In this study, Iranian and French male and female Oncorhynchus mykiss broodstocks were divided into two groups of 50 and 24, respectively, at the Research center of genetic and breeding of coldwater fishes, Yasouj, Iran, and their genetic structure was investigated using 6 microsatellite markers. Then 19 morphometric and 5 meristic characters of the broodstock were measured and compared between the two populations. As the broodstock matured, 1:1 (female:male) fertilizations were randomly assigned, giving 25 and 12 treatments for the Iranian and French stocks, respectively. Reproductive parameters were recorded for every family. The average number of observed alleles in the Iranian and French stocks was 6.68 and 6.83, respectively, and the average number of effective alleles was 3.13 and 3.45, respectively. The fixation index (Fst), calculated from allelic frequencies, was 0.058 between the two stocks, a significant difference. Morphometric analysis showed significant differences between the two stocks in 8 characteristics, whereas meristic characters did not differ significantly between the broodstock groups. The eyed percentage for the French broodstock was calculated as zero, so this broodstock was removed. For the Iranian broodstock, the fertilization rate ranged from 0 to 100, the eyed percentage from 0 to 98, and the hatch rate from 0 to 98; the average fecundity was 4114.708, the average egg size was 4.88 mm, and survival in the first three months was 19-73%. Considering the quality of eggs and larvae at different stages, and after selection between and within families, 10 treatments remained and are kept as future broodstock. The relationships between fecundity and egg size, fecundity and weight, fecundity and length, and egg size and weight were examined using regression. The results showed that fecundity was influenced more strongly by weight and productive length. This research is a first step toward identifying broodstock in our country.
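Two of the quantities reported above have compact textbook definitions, sketched below for a single locus. The abstract does not state which estimator or software was used, so the heterozygosity-based form Fst = (Ht - Hs) / Ht is an assumption here, and the allele frequencies are made up for illustration.

```python
def effective_alleles(freqs):
    """Effective number of alleles at one locus: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def fst(freqs_a, freqs_b):
    """Heterozygosity-based Fst = (Ht - Hs) / Ht for two equally weighted stocks."""
    hs = (expected_heterozygosity(freqs_a) + expected_heterozygosity(freqs_b)) / 2
    pooled = [(a + b) / 2 for a, b in zip(freqs_a, freqs_b)]
    ht = expected_heterozygosity(pooled)
    return (ht - hs) / ht if ht > 0 else 0.0

# Made-up allele frequencies at one locus in two stocks.
stock_a = [0.6, 0.3, 0.1]
stock_b = [0.4, 0.4, 0.2]
print(round(effective_alleles(stock_a), 2), round(fst(stock_a, stock_b), 3))
```

With several loci, the per-locus values are typically averaged, which is how figures such as a mean number of effective alleles per stock arise.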

Relevance:

40.00%

Publisher:

Abstract:

This work investigates the problem of feature selection in neuroimaging features from structural MRI brain images for the classification of subjects as healthy controls, suffering from Mild Cognitive Impairment or Alzheimer’s Disease. A Genetic Algorithm wrapper method for feature selection is adopted in conjunction with a Support Vector Machine classifier. In very large feature sets, feature selection is found to be redundant, as the accuracy is often worse than that of a Support Vector Machine with no feature selection. However, when just the hippocampal subfields are used, feature selection shows a significant improvement in classification accuracy. Three-class Support Vector Machines and two-class Support Vector Machines combined with weighted voting are also compared with the former and found more useful. The highest accuracy achieved at classifying the test data was 65.5% using a genetic algorithm for feature selection with a three-class Support Vector Machine classifier.

Relevance:

40.00%

Publisher:

Abstract:

Accurate detection of depression at an individual level using structural magnetic resonance imaging (sMRI) remains a challenge. Brain volumetric changes at a structural level appear to have importance in depression biomarkers studies. An automated algorithm is developed to select brain sMRI volumetric features for the detection of depression.

Relevance:

40.00%

Publisher:

Abstract:

The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed.

Relevance:

30.00%

Publisher:

Abstract:

This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using alternate SVM feature sets to the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying feature spaces to provide added robustness.