919 results for Information retrieval
Abstract:
CODEX SEARCH is an information retrieval engine specialised in immigration law, built on linguistic tools and knowledge. To develop an efficient IRS (information retrieval system) in this domain, a traditional IR (information retrieval) model, i.e. comparing the terms of the query with those of the answer, is not enough, essentially because such terms do not express implications. The proposed linguistic solution therefore incorporates specialist knowledge by integrating a case library into the system. Cases are examples of procedures applied by experts to solve problems that have occurred in practice and that ended in success or failure. The results obtained in this first phase are very encouraging, but further research in this field is needed to improve the performance of the prototype.
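The abstract does not detail how a new query is matched against the case library; the sketch below shows one common way such case retrieval is realised, using TF-IDF cosine similarity. The case texts and the helper function are invented for illustration, not taken from CODEX SEARCH.

```python
# Minimal sketch of case-library retrieval, assuming cases are stored as
# free-text problem descriptions (the actual CODEX SEARCH case schema is
# not described in the abstract).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

cases = [
    "Work permit renewal denied because the employment contract expired",  # hypothetical
    "Family reunification visa rejected for a missing housing report",     # hypothetical
]

vectorizer = TfidfVectorizer()
case_matrix = vectorizer.fit_transform(cases)

def most_similar_case(query: str) -> str:
    """Return the stored case whose description best matches the query."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, case_matrix)[0]
    return cases[scores.argmax()]

print(most_similar_case("renewal of work permit refused"))
```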
Abstract:
BACKGROUND: The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information we have developed a system which extracts information from the scientific literature for the most frequently annotated PTMs in UniProtKB. RESULTS: The procedure uses a pattern-matching and rule-based approach to extract sentences with information on the type and site of modification. A ranked list of protein candidates for the modification is also provided. For PTM extraction, precision varies from 57% to 94%, and recall from 75% to 95%, according to the type of modification. The procedure was used to track new publications on PTMs and to recover potential supporting evidence for phosphorylation sites annotated based on the results of large-scale proteomics experiments. CONCLUSIONS: The information retrieval and extraction method we have developed in this study forms the basis of a simple tool for the manual curation of protein post-translational modifications in UniProtKB/Swiss-Prot. Our work demonstrates that even simple text-mining tools can be effectively adapted for database curation tasks, provided that a thorough understanding of the working process and requirements is first obtained. This system can be accessed at http://eagl.unige.ch/PTM/.
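As a rough illustration of the pattern-matching idea (not the actual rule set behind the system at eagl.unige.ch), a single regular expression can already capture sentences that state the type and site of a phosphorylation:

```python
# Illustrative sketch only: a regex that catches mentions of a phosphorylated
# residue such as "phosphorylated at Ser-473" or "phosphorylation of Thr308".
import re

SITE_PATTERN = re.compile(
    r"phosphorylat\w*\s+(?:at|of|on)\s+(Ser|Thr|Tyr)[-\s]?(\d+)",
    re.IGNORECASE,
)

text = ("Akt is phosphorylated at Ser-473 by mTORC2, "
        "and phosphorylation of Thr308 requires PDK1.")

for residue, position in SITE_PATTERN.findall(text):
    print(f"modification=phosphorylation site={residue}{position}")
```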
Abstract:
Summary: Using WordNet in information retrieval
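The record carries only a title. For context, one classic use of WordNet in information retrieval is query expansion, where each query term is widened with the lemmas of its WordNet synsets; a minimal NLTK sketch, with invented query terms:

```python
# Requires the NLTK WordNet data: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def expand_query(terms):
    """Widen each query term with the lemma names of its synsets."""
    expanded = set(terms)
    for term in terms:
        for synset in wn.synsets(term):
            expanded.update(l.replace("_", " ") for l in synset.lemma_names())
    return expanded

print(expand_query(["car", "retrieval"]))
```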
Abstract:
Textual autocorrelation is a broad and pervasive concept, referring to the similarity between nearby textual units: lexical repetitions along consecutive sentences, semantic association between neighbouring lexemes, persistence of discourse types (narrative, descriptive, dialogical...) and so on. Textual autocorrelation can also be negative, as illustrated by alternating phonological or morpho-syntactic categories, or the succession of word lengths. This contribution proposes a general Markov formalism for textual navigation, inspired by spatial statistics. The formalism can express well-known constructs in textual data analysis, such as term-document matrices, reference and hyperlink navigation, (web) information retrieval, and in particular textual autocorrelation, as measured by Moran's I relative to the exchange matrix associated with neighbourhoods of various possible types. Four case studies (word-length alternation, lexical repulsion, part-of-speech autocorrelation, and semantic autocorrelation) illustrate the theory. In particular, one observes a short-range repulsion between nouns together with a short-range attraction between verbs, both at the lexical and semantic levels.
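For reference, the standard form of Moran's I named in the abstract is shown below, with the weights w_ij playing the role of the exchange matrix between textual units i and j (the paper's weighted variant may differ in normalisation):

```latex
% Moran's I over n textual units with values x_i and exchange weights w_ij;
% the paper's exact weighted formulation may normalise differently.
\[
  I = \frac{n}{\sum_{i,j} w_{ij}}
      \cdot
      \frac{\sum_{i,j} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}
           {\sum_{i} (x_i - \bar{x})^2},
  \qquad
  \bar{x} = \frac{1}{n}\sum_{i} x_i .
\]
```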
Abstract:
The aim of this project is to become familiar with Semantic Web technologies, to understand what an ontology is and to learn how to model one in a domain of our choosing, and to build a parser that connects to Wikipedia and/or DBpedia to populate that ontology, allowing the user to browse its concepts and study their relations.
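A minimal sketch of the parser step described above, querying DBpedia's public SPARQL endpoint with SPARQLWrapper; the class dbo:ProgrammingLanguage stands in for whatever domain the project actually chose:

```python
# Fetch candidate instances from DBpedia to populate an ontology.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?item ?label WHERE {
        ?item a dbo:ProgrammingLanguage ;
              rdfs:label ?label .
        FILTER (lang(?label) = "en")
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["item"]["value"], "-", binding["label"]["value"])
```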
Abstract:
Over the course of this work, I will delve into the concept of the Semantic Web, an increasingly tangible reality which, under the label Web 3.0, is set to succeed the current web model. Since this is a very broad field of application, we will focus on the design and semi-automatic population of ontologies, these being a key piece in the development and potential success of semantic technologies.
Abstract:
Software for reading and populating an ontology with information from DBpedia and Wikipedia.
Abstract:
In this final-year project (TFC) we aim to study the evolution of the current Web towards the Semantic Web.
Abstract:
Purpose This paper aims to analyse various aspects of an academic social network: the profile of users, the reasons for its use, its perceived benefits and the use of other social media for scholarly purposes. Design/methodology/approach The authors examined the profiles of the users of an academic social network. The users were affiliated with 12 universities. The following were recorded for each user: sex, the number of documents uploaded, the number of followers, and the number of people being followed. In addition, a survey was sent to the individuals who had an email address in their profile. Findings Half of the users of the social network were academics and a third were PhD students. Social sciences scholars accounted for nearly half of all users. Academics used the service to get in touch with other scholars, disseminate research results and follow other scholars. Other widely employed social media included citation indexes, document creation, editing and sharing tools, and communication tools. Users complained about the lack of support for the utilisation of these tools. Research limitations/implications The results are based on a single case study. Originality/value This study provides new insights into the impact of social media in academic contexts by analysing the user profiles and benefits of a social network service that is specifically targeted at the academic community.
Abstract:
This work, on the identification of a research portfolio for the development of filtration equipment, presents a novel approach to identifying promising research topics in the field of design and development of filtration equipment and processes. The proposed approach consists of identifying technological problems often encountered in filtration processes. The sources of information for problem retrieval were patent documents and scientific papers that discussed filtration equipment and processes. The problem identification method adopted in this work focused on the semantic nature of a sentence in order to generate series of subject-action-object structures. This was achieved with software called Knowledgist. Lists of problems often encountered in filtration processes, as mentioned in patent documents and scientific papers, were generated. These problems were carefully studied and categorized, and suggestions were made on the various classes of problems that need further investigation in order to propose a research portfolio. The uses and importance of other methods of information retrieval are also highlighted in this work.
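Knowledgist is proprietary, so as an illustration only, the same subject-action-object reduction can be sketched with spaCy's dependency parser (the example sentence is invented):

```python
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def sao_triples(sentence: str):
    """Yield (subject, action, object) triples found via dependency parsing."""
    doc = nlp(sentence)
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [w.text for w in token.lefts
                        if w.dep_ in ("nsubj", "nsubjpass")]
            objects = [w.text for w in token.rights
                       if w.dep_ in ("dobj", "attr")]
            for s in subjects:
                for o in objects:
                    yield (s, token.lemma_, o)

print(list(sao_triples("The filter cake blocks the drainage channel.")))
```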
Abstract:
The topic-based classification of web portals can be exploited to identify a user's interests by collecting statistics on their browsing habits across the different categories. This thesis examines the areas of web applications in which the collected statistics can be used for personalisation. The general principles of content personalisation, Internet advertising and information retrieval are explained using mathematical models. In addition, the thesis describes the general characteristics of web portals and the issues involved in collecting the statistical data.
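A minimal sketch of the kind of statistic described above: page visits per portal category, normalised into an interest distribution that personalisation or ad targeting could consume. The category names are invented.

```python
# Build a normalised user interest profile from category visit counts.
from collections import Counter

def interest_profile(visited_categories):
    counts = Counter(visited_categories)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

clicks = ["sports", "news", "sports", "finance", "sports"]
print(interest_profile(clicks))  # {'sports': 0.6, 'news': 0.2, 'finance': 0.2}
```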
Abstract:
This work presents a study of the state of the art of the Semantic Web and its current standards, more specifically ontologies. It also describes the practical process followed to design and implement an ontology in the specific domain of Twitter, in OWL format, using the Protégé application for its creation. Finally, it explains the creation (requirements capture, design and implementation) of an application capable of obtaining real data from Twitter, processing it to extract the relevant information, and storing it in the ontology created.
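As a sketch of the storage step, a tweet can be added to an OWL ontology with rdflib as below; the namespace and the names Tweet, hasText and postedBy are hypothetical, since the abstract does not reproduce the actual Protégé ontology:

```python
# Store one extracted tweet as an individual of a (hypothetical) Tweet class.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF

TW = Namespace("http://example.org/twitter-ontology#")  # hypothetical namespace

g = Graph()
g.bind("tw", TW)
g.add((TW.Tweet, RDF.type, OWL.Class))

tweet = TW["tweet_123"]
g.add((tweet, RDF.type, TW.Tweet))
g.add((tweet, TW.hasText, Literal("Semantic Web is evolving fast")))
g.add((tweet, TW.postedBy, TW["user_alice"]))

print(g.serialize(format="turtle"))
```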
Abstract:
This paper presents a reflection on the need for libraries to think about how to facilitate access to the documentary sources they manage. As the number of resources available in electronic form increases, libraries need to provide a simple and usable search tool that integrates the contents of the various information management systems they give access to. To define user expectations for the search interface, some of the features users are accustomed to in their requests for information on the Internet are considered. The technologies that allow a discovery layer to be implemented as a search tool integrating the library's various information systems are then presented, followed by some examples of working implementations that integrate various information sources into a single search engine, as models to consider when implementing a system of this kind. The overall purpose is to present a state of the art of operational deployments as a starting point for any organisation interested in improving the access it offers to its resources.
Abstract:
Summary: Fuzzy translation techniques in cross-language information retrieval between closely related languages
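The record carries only a title. One common fuzzy technique for closely related languages is character n-gram matching, which pairs cognates across languages without a dictionary; a minimal sketch (the word pair is purely illustrative):

```python
# Dice coefficient over character bigrams: a dictionary-free similarity that
# tends to score cognates in closely related languages highly.
def ngrams(word: str, n: int = 2) -> set:
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def dice(a: str, b: str, n: int = 2) -> float:
    ga, gb = ngrams(a, n), ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))

print(dice("university", "universiteit"))  # high overlap -> likely translation
```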