978 results for Relevant features


Relevance: 100.00%

Abstract:

The quality of the features discovered in relevance feedback (RF) is a key issue for effective search queries. Most existing feedback methods do not carefully address the selection of features for noise reduction. As a result, the noisy features they extract can easily degrade effectiveness. In this paper, we propose a novel feature extraction method for query formulation. The method first extracts term association patterns in RF as knowledge for feature extraction. Negative RF is then used to improve the quality of the discovered knowledge. A novel information filtering (IF) model is developed to evaluate the proposed method. Experimental results on Reuters Corpus Volume 1 and TREC topics confirm that the proposed model achieves encouraging performance compared to state-of-the-art IF models.

Relevance: 100.00%

Abstract:

With the explosion of information resources, there is an imminent need to understand interesting text features or topics in massive text information. This thesis proposes a theoretical model to accurately weight specific text features, such as patterns and n-grams. The proposed model achieves impressive performance in two data collections, Reuters Corpus Volume 1 (RCV1) and Reuters 21578.

Relevance: 100.00%

Abstract:

Data mining can be defined as the extraction of implicit, previously unknown, and potentially useful information from data. Numerous researchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and the modified versions of this dataset, KDDCup99 and NSL-KDD, but until now no one has examined the performance of the Top 10 data mining algorithms selected by experts in data mining. The classification learning algorithms compared in this thesis are C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms is compared in terms of accuracy, error rate and average cost on modified versions of the NSL-KDD train and test datasets, where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally, the most important features for detecting cyber-attacks, in all categories and in each category, are evaluated with Weka's Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with the best performance on the dataset is k-NN. The most important features for detecting cyber-attacks are basic features such as the duration of a network connection in seconds, the protocol used for the connection, the network service used, the normal or error status of the connection and the number of data bytes sent. For DoS, Probing and R2L attacks, the most important features are basic features and the least important are content features, whereas for U2R attacks the content features are the most important.
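The Information Gain ranking mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's Weka setup: the four toy connection records, the feature values and the labels below are invented for illustration, and the sketch handles only categorical features (numeric features would first need to be discretized).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus their entropy after splitting on one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, not taken from NSL-KDD.
rows = [
    {"protocol_type": "tcp", "flag": "SF"},
    {"protocol_type": "tcp", "flag": "S0"},
    {"protocol_type": "udp", "flag": "SF"},
    {"protocol_type": "tcp", "flag": "S0"},
]
labels = ["normal", "dos", "normal", "dos"]

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the `flag` column separates the two classes perfectly, so it is ranked above `protocol_type`.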

Relevance: 70.00%

Abstract:

This article presents a quantitative and objective approach to cat ganglion cell characterization and classification. Several biologically relevant features, such as diameter, eccentricity, fractal dimension, influence histogram, influence area, convex hull area, and convex hull diameter, are derived from geometrical transforms and then processed by three different clustering methods (Ward's hierarchical scheme, K-means and a genetic algorithm), whose results are then combined by a voting strategy. These experiments indicate the superiority of some features and also suggest some possible biological implications.

Relevance: 70.00%

Abstract:

Background: One of the least common types of alternative splicing is the complete retention of an intron in a mature transcript. Intron retention (IR) is believed to be the result of intron, rather than exon, definition associated with failure of the recognition of weak splice sites flanking short introns. Although studies on individual retained introns have been published, few systematic surveys of large amounts of data have been conducted on the mechanisms that lead to IR. Results: To understand how sequence features are associated with or control IR, and to produce a generalized model that could reveal previously unknown signals that regulate this type of alternative splicing, we partitioned intron retention events observed in human cDNAs into two groups based on the relative abundance of both isoforms and compared relevant features. We found that a higher frequency of IR in humans is associated with individual introns that have weaker splice sites, genes with shorter intron lengths, higher expression levels and lower density of both a set of exon splicing silencers (ESSs) and the intronic splicing enhancer GGG. Both groups of retained introns presented events conserved in mouse, in which the retained introns were also short and presented weaker splice sites. Conclusion: Although our results confirmed that weaker splice sites are associated with IR, they showed that this feature alone cannot explain a non-negligible fraction of events. Our analysis suggests that cis-regulatory elements are likely to play a crucial role in regulating IR and also reveals previously unknown features that seem to influence its occurrence. These results highlight the importance of considering the interplay among these features in the regulation of the relative frequency of IR.

Relevance: 60.00%

Abstract:

Data preprocessing is widely recognized as an important stage in anomaly detection. This paper reviews the data preprocessing techniques used by anomaly-based network intrusion detection systems (NIDS), concentrating on which aspects of the network traffic are analyzed, and what feature construction and selection methods have been used. Motivation for the paper comes from the large impact data preprocessing has on the accuracy and capability of anomaly-based NIDS. The review finds that many NIDS limit their view of network traffic to the TCP/IP packet headers. Time-based statistics can be derived from these headers to detect network scans, network worm behavior, and denial of service attacks. A number of other NIDS perform deeper inspection of request packets to detect attacks against network services and network applications. More recent approaches analyze full service responses to detect attacks targeting clients. The review covers a wide range of NIDS, highlighting which classes of attack are detectable by each of these approaches. Data preprocessing is found to predominantly rely on expert domain knowledge for identifying the most relevant parts of network traffic and for constructing the initial candidate set of traffic features. On the other hand, automated methods have been widely used for feature extraction to reduce data dimensionality, and feature selection to find the most relevant subset of features from this candidate set. The review shows a trend toward deeper packet inspection to construct more relevant features through targeted content parsing. These context sensitive features are required to detect current attacks.

Relevance: 60.00%

Abstract:

With the growth of the Web, e-commerce activities are also becoming popular. Product recommendation is an effective way of marketing a product to potential customers. Based on a user's previous searches, most recommendation methods employ two-dimensional models to find relevant items. Such items are then recommended to the user. However, too many irrelevant recommendations worsen the information overload problem for the user. This happens because such models, based on vectors and matrices, are unable to find the latent relationships that exist between users and searches. Identifying user behaviour is a complex process, and usually involves comparing the searches made by that user. In most cases, traditional vector- and matrix-based methods are used to find the prominent features searched for by a user. In this research we employ tensors to find the relevant features searched for by users. These relevant features are then used for making recommendations. Evaluation on real datasets shows the effectiveness of such recommendations over vector- and matrix-based methods.

Relevance: 60.00%

Abstract:

In this paper we demonstrate how to monitor smartphones running the Symbian operating system and Windows Mobile in order to extract features for anomaly detection. These features are sent to a remote server because running a complex intrusion detection system on this kind of mobile device is still not feasible due to capability and hardware limitations. We give examples of how to compute relevant features and introduce the top ten applications used by mobile phone users, based on a study from 2005. The usage of these applications is recorded by a monitoring client and visualized. Additionally, monitoring results for public and self-written malware are shown. To improve the performance of the monitoring client, Principal Component Analysis was applied, which led to a decrease of about 80% in the number of monitored features.

Relevance: 60.00%

Abstract:

This is an update of an earlier paper, and is written for Excel 2007. A series of Excel 2007 models is described. The more advanced versions allow solution of f(x)=0 by examining change of sign of function values. The function is graphed and change of sign easily detected by a change of colour. Relevant features of Excel 2007 used are Names, Scatter Chart and Conditional Formatting. Several sample Excel 2007 models are available for download, and the paper is intended to be used as a lesson plan for students having some familiarity with derivatives. For comparison and reference purposes, the paper also presents a brief outline of several common equation-solving strategies as an Appendix.
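The change-of-sign idea described in this abstract can be sketched outside a spreadsheet. The following Python sketch is an illustration under stated assumptions, not the paper's Excel model: the function names, the sample function x² − 2 and the grid size are all hypothetical. It tabulates f on a grid, keeps the subintervals where the sign changes, and then narrows one of them by bisection.

```python
def find_sign_changes(f, a, b, steps):
    """Return the subintervals of [a, b] on which f changes sign.

    A root that falls exactly on a grid point gives a product of zero
    and is not reported by this simple strict-inequality test.
    """
    width = (b - a) / steps
    brackets = []
    for i in range(steps):
        left = a + i * width
        right = left + width
        if f(left) * f(right) < 0:
            brackets.append((left, right))
    return brackets


def bisect(f, left, right, tol=1e-10):
    """Halve a sign-change bracket until it is narrower than tol."""
    while right - left > tol:
        mid = (left + right) / 2
        if f(left) * f(mid) <= 0:
            right = mid  # the sign change lies in the left half
        else:
            left = mid   # the sign change lies in the right half
    return (left + right) / 2


def f(x):
    # Hypothetical sample function with a root at the square root of 2.
    return x ** 2 - 2


brackets = find_sign_changes(f, 0.0, 2.0, 10)
root = bisect(f, *brackets[0])
print(brackets, root)
```

The grid scan plays the role of the colour change in the spreadsheet: it only locates where a root must lie, and a separate refinement step is needed to pin the root down.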

Relevance: 60.00%

Abstract:

Holistic physics education in upper secondary level based on the optional course of physics. Keywords: physics education, education, holistic, curriculum, world view, values. A physics teacher's task is to put into practice all goals of the curriculum. In this research, holistic physics education means teaching in which the school's common educational goals and the goals particular to the physics curriculum are taken into account. These involve knowledge, skills, and personal value and attitude goals. The research task was to clarify how the educational goals involving students' values and attitudes can be carried out through the subject content of physics. How does the physics teacher communicate the modern world view through the content of the physics class? The goal of this research was to improve teaching, to find new points of view and to widen the perspective on how physics is taught. The teacher, who also acted as a researcher, planned and delivered an optional course in which she could study the possibilities of holistic physics education. In 2001-2002, ten girls and two boys from a 9th grade class participated in that elective course. Following the principles of action research, the teacher-researcher also reflected on her own teaching. The research method was content analysis, which involved analyzing both student feedback and the relevant features of the teacher's knowledge that are needed for planning and giving physics lessons. In this research that means taking into account the teacher's subject matter knowledge, curriculum knowledge, didactics and pedagogical content knowledge. Didactics includes knowledge of the learning process, students' motivation, the specific features of physics didactics and research on physics education. Among other things, the researcher constructed the contents of the curriculum and abstracted sentences into keywords, from which she drew a concept map.
The concept maps, for instance the map of educational goals and the map of the essence of physics, were tools for studying the contents included in holistic physics education. Moreover, conclusions were reached concerning the contents of physics domains through which these goals can be achieved. According to this research, the contents of holistic physics education are as follows: perception, the essence of science, the development of science, new research topics and interactions in physics. The starting point of teaching should be connected with the student's life experiences, and the approach to teaching should be broadly relevant to those experiences. The teacher-researcher observed and analyzed the effects of the experimental physics course through the lens of holistic physics education. The students reported that the goals of holistic physics education were achieved in the course. The students' discourse indicated that in the experimental course they could express their opinions and feelings and make proposals and evaluations. The students felt they had chances to affect the content of the course, and they considered the philosophical physics course interesting: it raised questions, increased their self-esteem and helped them become more aware of their world views. The students' analytic skills developed in the interactive learning environment. The physics teacher needs broad knowledge for planning his or her teaching, which is evaluated in this research from the content maps made as tools for teaching. In holistic physics education the teacher needs an open and curious mind and skills for interaction in teaching. This research indicates the importance of physics teaching in developing attitudes and values alongside the subject matter of physics in the classroom environment.
The different points of view concerning human beings' lives make it possible to construct the students' modern world view and to develop analytic skills and self-esteem, and thus help them in learning. Broad, overall points of view also help to transfer knowledge to practice. Since such content is not covered when teaching the physics included in the standard curriculum, supplementary teaching material that includes such topics is needed.

Relevance: 60.00%

Abstract:

In this paper, we present a methodology for identifying the best features from a large feature space. In a high-dimensional feature space, nearest neighbor search becomes meaningless, and it suffers from both quality and performance issues. Many data mining algorithms rely on nearest neighbor search, so instead of searching over all the features we need to select the relevant ones. We propose feature selection using Non-negative Matrix Factorization (NMF) and apply it to nearest neighbor search. A recent clustering algorithm based on Locally Consistent Concept Factorization (LCCF) achieves better document clustering quality by using the local geometrical and discriminating structure of the data. Using our feature selection method, we show a further improvement in clustering performance.

Relevance: 60.00%

Abstract:

We give strong numerical evidence that a self-interacting probe scalar field in AdS, with only a few modes turned on initially, will undergo fast thermalization only if it is above a certain energetic threshold. Below the threshold the energy stays close to constant in a few modes for a very long time instead of cascading quickly. This indicates the existence of a Strong Stochasticity Threshold (SST) in holography. The idea of SST is familiar from certain statistical mechanical systems, and we suggest that it exists also in AdS gravity. This would naturally reconcile the generic nonlinear instability of AdS observed by Bizon and Rostworowski, with the Fermi-Pasta-Ulam-Tsingou-like quasiperiodicity noticed recently for some classes of initial conditions. We show that our simple setup captures many of the relevant features of the full gravity-scalar system.

Relevance: 60.00%

Abstract:

Selection of relevant features is an open problem in Brain-computer interfacing (BCI) research. Sometimes, features extracted from brain signals are high dimensional which in turn affects the accuracy of the classifier. Selection of the most relevant features improves the performance of the classifier and reduces the computational cost of the system. In this study, we have used a combination of Bacterial Foraging Optimization and Learning Automata to determine the best subset of features from a given motor imagery electroencephalography (EEG) based BCI dataset. Here, we have employed Discrete Wavelet Transform to obtain a high dimensional feature set and classified it by Distance Likelihood Ratio Test. Our proposed feature selector produced an accuracy of 80.291% in 216 seconds.

Relevance: 60.00%

Abstract:

Paulicéia Desvairada, by Mário de Andrade, and Paranóia, by Roberto Piva, carry the revolutionary character of debuts (Mário's modernist debut) that arrive to overthrow the established order. Both meet a hostile environment and fight the standards of their respective eras with verses grounded in the idea of confrontation. Much of this drive can at times hide other features that seem less relevant and are more fully developed in later works. The concept of solidarity of João Luiz Lafetá (1986), a solidarity also reinforced by Giorgio Agamben (1993) and one that would lie at the core of a clearly more radical stance, is the path followed in this dissertation, which brings to the fore the solidary self beyond the background in Paulicéia Desvairada and Paranóia. The polysemic complexity of Mário de Andrade's poetry and the aggressive derangement of the senses in Roberto Piva's poetics come together in similarity and in difference, prompting and enabling this research. Once unveiled, the solidary first person, a kind of embryo yet to be explored, proves larger, with a force that crosses time and invites the reader not to give in to complacency.

Relevance: 60.00%

Abstract:

A study was conducted in the Tebuwana Wathurana Wetland ecosystem to understand its vegetation structure and faunal composition in order to assess its conservation needs. As there are no published records on the flora and fauna of the Wathurana Wetlands in Tebuwana, it is necessary to understand the ecological and other relevant features in order to develop strategies to conserve this wetland. These objectives were pursued by surveying the vegetation of the wetland and by identifying the fish and bird species present. A total of 66 species of flora and 61 species of fauna were identified in the survey. Of the 27 fish species recorded from the Tebuwana Wetland, 9 species were endemic and 17 species belonged to the indigenous category. With regard to the flora of the wetlands, the dominant families were Rubiaceae, Fabaceae and Arecaceae. The 66 species belonged to 39 families and 61 genera, while 12 species were endemic and 4 species were considered highly threatened. The flora was found in four layers. Of the 22 species of birds recorded, two species were endemic. This study revealed that the Wathurana Wetlands have a high species diversity but face many threats, including encroachment, extraction of forest products (mainly timber), land filling, mining and the occurrence of invasive species. It is essential to minimize the exploitation of natural resources from this wetland in the future and, in particular, to mark the boundary, conduct awareness programmes and continue research.