962 results for Web, Search Engine, Overlap


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Feature selection is an important and frequently used technique in data preprocessing. It can improve the efficiency and effectiveness of data mining by reducing the dimensionality of the feature space and removing irrelevant and redundant information. Feature selection can be viewed as a global optimization problem: finding a minimum set of M relevant features that describes the dataset as well as the original N attributes do. In this paper, we apply the adaptive partitioned random search strategy to our feature selection algorithm. Under this search strategy, a partition structure and an evaluation function are proposed for the feature selection problem. The algorithm ensures the globally optimal solution in theory and avoids complete randomness in the search direction. The good properties of the algorithm are shown through theoretical analysis.
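The abstract above does not give the partition structure or the evaluation function, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a simplified greedy variant: the search region is split on one feature at a time (included or excluded), random subsets are sampled in each half, and the search continues in the half with the higher mean score. The names `partitioned_random_search`, `sample_subset`, and `toy_score` are hypothetical, and this simplified version carries no guarantee of reaching the global optimum.

```python
import random
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical evaluation function type: higher scores mean better subsets.
ScoreFn = Callable[[FrozenSet[int]], float]


def sample_subset(n_features: int, fixed: Dict[int, bool],
                  rng: random.Random) -> FrozenSet[int]:
    """Draw a random feature subset that respects the fixed in/out decisions."""
    chosen = set()
    for f in range(n_features):
        include = fixed[f] if f in fixed else rng.random() < 0.5
        if include:
            chosen.add(f)
    return frozenset(chosen)


def partitioned_random_search(n_features: int, score: ScoreFn,
                              samples_per_region: int = 20,
                              seed: int = 0) -> Tuple[FrozenSet[int], float]:
    """Greedy partition-and-sample search over feature subsets (a sketch)."""
    rng = random.Random(seed)
    fixed: Dict[int, bool] = {}
    best_subset = frozenset(range(n_features))  # start from all N features
    best_score = score(best_subset)

    for f in range(n_features):
        region_value = {}
        # Split the current region in two: feature f included or excluded.
        for include in (True, False):
            region = {**fixed, f: include}
            values = []
            for _ in range(samples_per_region):
                subset = sample_subset(n_features, region, rng)
                value = score(subset)
                values.append(value)
                if value > best_score:
                    best_subset, best_score = subset, value
            # Assumption: the mean sampled score measures how promising a region is.
            region_value[include] = sum(values) / len(values)
        # Continue the search only inside the more promising half.
        fixed[f] = region_value[True] > region_value[False]

    # Also evaluate the subset implied by the final in/out decisions.
    final = frozenset(f for f, keep in fixed.items() if keep)
    final_score = score(final)
    if final_score > best_score:
        best_subset, best_score = final, final_score
    return best_subset, best_score


def toy_score(subset: FrozenSet[int]) -> float:
    """Toy evaluation: reward relevant features, penalize irrelevant ones."""
    relevant = {0, 3, 5}  # made-up ground truth, for illustration only
    return len(subset & relevant) - 0.5 * len(subset - relevant)


subset, value = partitioned_random_search(8, toy_score)
print(sorted(subset), value)
```

Sampling inside each region rather than over the whole space is what keeps the search direction from being completely random; a real evaluation function would typically measure classifier accuracy or a consistency criterion on the dataset.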

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Plant-antivenom is a computational Web system about medicinal plants with anti-venom properties. The system consists of a database of these plants, including scientific publications on the subject and amino acid sequences of active principles from venomous animals. The system relates these data, allowing their integration through different search applications. To develop the system, the first surveys were conducted in the scientific literature, allowing the creation of a publication database in a library for reading and user interaction. Then, classes of categories were created, allowing the use of tags and the organization of content. The database on medicinal plants holds information such as family, species, isolated compounds, activity, and inhibited animal venoms, among others. Registered users can submit new information using wiki tools. Submitted content is released in accordance with permission rules defined by the system. The database on venom protein amino acid sequences was built from essential information from the National Center for Biotechnology Information (NCBI). Plant-antivenom's interface is simple, contributing to fast and functional access to the system and to the integration of the different data registered in it. The Plant-antivenom system is available on the Internet at http://gbi.fmrp.usp.br/plantantivenom.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Formal Concept Analysis is an unsupervised machine learning technique that has been successfully applied to document organisation by considering documents as objects and keywords as attributes. The basic algorithms of Formal Concept Analysis then allow an intelligent information retrieval system to cluster documents according to keyword views. This paper investigates the scalability of this idea. In particular, we present the results of applying spatial data structures to large datasets in Formal Concept Analysis. Our experiments are motivated by the application of Formal Concept Analysis to the idea of a virtual filesystem [11,17,15], in particular the libferris [1] Semantic File System. This paper presents customizations to an RD-Tree Generalized Index Search Tree based index structure to better support the application of Formal Concept Analysis to large data sources.
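To make the "documents as objects, keywords as attributes" view concrete, here is a minimal sketch that enumerates all formal concepts of a tiny document-keyword context. It uses the naive closure-under-intersection method, not the RD-Tree index discussed in the abstract, and the function name `formal_concepts` and the sample documents are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent, intent)


def formal_concepts(context: Dict[str, Set[str]]) -> List[Concept]:
    """Enumerate all formal concepts of a small object-to-attributes context."""
    all_attrs = frozenset().union(*context.values()) if context else frozenset()
    # Every intent is an intersection of object rows; the full attribute
    # set is the intent of the (possibly empty) bottom concept.
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & intent for intent in intents}

    concepts = []
    for intent in intents:
        # The extent is every object that carries all attributes of the intent.
        extent = frozenset(obj for obj, attrs in context.items()
                           if intent <= attrs)
        concepts.append((extent, intent))
    # Order from the most general concept (largest extent) downwards.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Made-up documents and keywords, for illustration only.
docs = {
    "d1": {"web", "search"},
    "d2": {"search", "index"},
    "d3": {"web", "search", "index"},
}
concepts = formal_concepts(docs)
for extent, intent in concepts:
    print(sorted(extent), sorted(intent))
```

Each printed pair is one cluster: a maximal group of documents together with exactly the keywords they all share. The enumeration is exponential in the worst case, which is the scalability concern that motivates index structures for large contexts.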

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled into 6,177 clusters, of which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis uncovered tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Objective: To test the feasibility of an evidence-based clinical literature search service to help answer general practitioners' (GPs') clinical questions. Design: Two search services supplied GPs who submitted questions with the best available empirical evidence to answer these questions. The GPs provided feedback on the value of the service, and concordance of answers from the two search services was assessed. Setting: Two literature search services (Queensland and Victoria), operating for nine months from February 1999. Main outcome measures: Use of the service; time taken to locate answers; availability of evidence; value of the service to GPs; and consistency of answers from the two services. Results: 58 GPs asked 160 questions (29 asked one, 11 asked five or more). The questions concerned treatment (65%), aetiology (17%), prognosis (13%), and diagnosis (5%). Answering a question took a mean of 3 hours 32 minutes of personnel time (95% CI, 2.67-3.97); nine questions took longer than 10 hours each to answer, the longest taking 23 hours 30 minutes. Evidence of suitable quality to provide a sound answer was available for 126 (79%) questions. Feedback data for 84 (53%) questions, provided by 42 GPs, showed that they appreciated the service, and asking the questions changed clinical care. There were many minor differences between the answers from the two centres, and substantial differences in the evidence found for 4/14 questions. However, conclusions reached were largely similar, with no or only minor differences for all questions. Conclusions: It is feasible to provide a literature search service, but further assessment is needed to establish its cost effectiveness.