998 results for Anchor Identification


Relevance: 60.00%

Abstract:

This paper presents an overview of the NTCIR-9 Cross-lingual Link Discovery (Crosslink) task. The overview includes: the motivation of cross-lingual link discovery; the Crosslink task definition; the run submission specification; the assessment and evaluation framework; the evaluation metrics; and the evaluation results of submitted runs. Cross-lingual link discovery (CLLD) is a way of automatically finding potential links between documents in different languages. The goal of this task is to create a reusable resource for evaluating automated CLLD approaches. The results of this research can be used in building and refining systems for automated link discovery. The task is focused on linking between English source documents and Chinese, Korean, and Japanese target documents.

Relevance: 60.00%

Abstract:

In this paper we examine automated Chinese-to-English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese-to-English translation on the hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bilingual users where a particular topic is not covered in Chinese, is not equally covered in both languages, or is biased in one language; as well as for language learning.

Relevance: 60.00%

Abstract:

Nowadays people rely heavily on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages. It is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, the pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This could pose serious difficulties to users seeking information or knowledge from different lingual sources, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task, cross-lingual link discovery (CLLD), is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross-lingual link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery in different language domains. This study is specifically focused on Chinese / English link discovery (C/ELD). Chinese / English link discovery is a special case of the cross-lingual link discovery task. It involves tasks including natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation.
With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to the research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new, simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated for achieving a high precision of English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments for better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. It is important in CLLD evaluation to have this framework, which helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify the system performance in the NTCIR-9 Crosslink task, which is the first information retrieval track of this kind.
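
The abstract above mentions mining an existing link structure for anchor probabilities. The thesis's actual method is not given here, so the following is only a minimal sketch of one common reading of that idea: a phrase's anchor probability is the number of documents in which it appears as a link anchor divided by the number of times the phrase occurs in the collection. The input format and the function names `anchor_probabilities` and `recommend_anchors` are hypothetical, chosen for illustration.

```python
from collections import Counter


def anchor_probabilities(documents):
    """Estimate P(link | phrase) from an already-linked collection.

    `documents` is a list of (text, anchors) pairs, where `anchors` lists the
    phrases hyperlinked in that text (hypothetical input format; each anchor
    is assumed to occur in its own text).
    """
    linked = Counter()  # how often each phrase is used as an anchor
    total = Counter()   # how often each phrase occurs at all
    candidates = {a for _, anchors in documents for a in anchors}
    for text, anchors in documents:
        for phrase in candidates:
            total[phrase] += text.count(phrase)
        for anchor in set(anchors):
            linked[anchor] += 1
    # Cap at 1.0 in case an anchor is listed more often than it is found.
    return {p: min(1.0, linked[p] / total[p]) for p in candidates if total[p] > 0}


def recommend_anchors(text, probabilities, threshold=0.5):
    """Return phrases found in `text` whose anchor probability meets the threshold."""
    found = [p for p in probabilities if p in text and probabilities[p] >= threshold]
    return sorted(found, key=lambda p: probabilities[p], reverse=True)


if __name__ == "__main__":
    docs = [
        ("Paris is the capital of France", ["Paris"]),
        ("Paris hosted the games", []),
        ("France borders Spain", ["France", "Spain"]),
    ]
    probs = anchor_probabilities(docs)
    print(recommend_anchors("Spain and Paris", probs, threshold=0.6))
```

In a cross-lingual setting, each recommended anchor would still need to be mapped to a target document in the other language, for example through translation or existing inter-language links; that step is outside this sketch.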

Relevance: 30.00%

Abstract:

GPR40 was formerly an orphan G protein-coupled receptor whose endogenous ligands have recently been identified as free fatty acids (FFAs). The receptor, now named FFA receptor 1, has been implicated in the pathophysiology of type 2 diabetes and is a drug target because of its role in FFA-mediated enhancement of glucose-stimulated insulin release. Guided by molecular modeling, we investigated the molecular determinants contributing to binding of linoleic acid, a C18 polyunsaturated FFA, and GW9508, a synthetic small molecule agonist. Twelve residues within the putative GPR40-binding pocket including hydrophilic/positively charged, aromatic, and hydrophobic residues were identified and were subjected to site-directed mutagenesis. Our results suggest that linoleic acid and GW9508 are anchored on their carboxylate groups by Arg183, Asn244, and Arg258. Moreover, His86, Tyr91, and His137 may contribute to aromatic and/or hydrophobic interactions with GW9508 that are not present, or relatively weak, with linoleic acid. The anchor residues, as well as the residues Tyr12, Tyr91, His137, and Leu186, appear to be important for receptor activation also. Interestingly, His137 and particularly His86 may interact with GW9508 in a manner dependent on its protonation status. The greater number of putative interactions between GPR40 and GW9508 compared with linoleic acid may explain the higher potency of GW9508.

Relevance: 30.00%

Abstract:

Nonstructural protein 4B (NS4B) plays an essential role in the formation of the hepatitis C virus (HCV) replication complex. It is a relatively poorly characterized integral membrane protein predicted to comprise four transmembrane segments in its central portion. Here, we describe a novel determinant for membrane association represented by amino acids (aa) 40 to 69 in the N-terminal portion of NS4B. This segment was sufficient to target and tightly anchor the green fluorescent protein to cellular membranes, as assessed by fluorescence microscopy as well as membrane extraction and flotation analyses. Circular dichroism and nuclear magnetic resonance structural analyses showed that this segment comprises an amphipathic alpha-helix extending from aa 42 to 66. Attenuated total reflection infrared spectroscopy and glycosylation acceptor site tagging revealed that this amphipathic alpha-helix has the potential to traverse the phospholipid bilayer as a transmembrane segment, likely upon oligomerization. Alanine substitution of the fully conserved aromatic residues on the hydrophobic helix side abrogated membrane association of the segment comprising aa 40 to 69 and disrupted the formation of a functional replication complex. These results provide the first atomic resolution structure of an essential membrane-associated determinant of HCV NS4B.

Relevance: 30.00%

Abstract:

Background: Plasmodium vivax malaria remains a major health problem in tropical and sub-tropical regions worldwide. Several rhoptry proteins which are important for interaction with and/or invasion of red blood cells, such as PfRONs, Pf92, Pf38, Pf12 and Pf34, have been described during the last few years and are being considered as potential anti-malarial vaccine candidates. This study describes the identification and characterization of the P. vivax rhoptry neck protein 1 (PvRON1) and examines its antigenicity in natural P. vivax infections. Methods: The PvRON1 encoding gene, which is homologous to that encoding the P. falciparum apical sushi protein (ASP) according to the PlasmoDB database, was selected as our study target. The pvron1 gene transcription was evaluated by RT-PCR using RNA obtained from the P. vivax VCG-1 strain. Two peptides derived from the deduced P. vivax Sal-I PvRON1 sequence were synthesized and inoculated in rabbits for obtaining anti-PvRON1 antibodies, which were used to confirm the protein expression in VCG-1 strain schizonts along with its association with detergent-resistant microdomains (DRMs) by Western blot, and its localization by immunofluorescence assays. The antigenicity of the PvRON1 protein was assessed using human sera from individuals previously exposed to P. vivax malaria by ELISA. Results: In the P. vivax VCG-1 strain, RON1 is a 764 amino acid-long protein. In silico analysis has revealed that PvRON1 shares essential characteristics with different antigens involved in invasion, such as the presence of a secretory signal, a GPI-anchor sequence and a putative sushi domain. The PvRON1 protein is expressed in the parasite's schizont stage, is localized in rhoptry necks and is associated with DRMs. Recombinant protein recognition by human sera indicates that this antigen can trigger an immune response during a natural infection with P. vivax. Conclusions: This study shows the identification and characterization of the P. vivax rhoptry neck protein 1 in the VCG-1 strain. Taking into account that PvRON1 shares several important characteristics with other Plasmodium antigens that play a functional role during RBC invasion and, as shown here, it is antigenic, it could be considered as a good vaccine candidate. Further studies aimed at assessing its immunogenicity and protection-inducing ability in the Aotus monkey model are thus recommended.

Relevance: 30.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance: 30.00%

Abstract:

Cytochrome b-type NAD(P)H oxidoreductases are involved in many physiological processes, including iron uptake in yeast, the respiratory burst, and perhaps oxygen sensing in mammals. We have identified a cytosolic cytochrome b-type NAD(P)H oxidoreductase in mammals, a flavohemoprotein (b5+b5R) containing cytochrome b5 (b5) and b5 reductase (b5R) domains. A genetic approach, using BLAST searches against dbEST for FAD-, NAD(P)H-binding sequences followed by reverse transcription–PCR, was used to clone the complete cDNA sequence of human b5+b5R from the hepatoma cell line Hep 3B. Compared with the classical single-domain b5 and b5R proteins localized on the endoplasmic reticulum membrane, b5+b5R also has binding motifs for heme, FAD, and NAD(P)H prosthetic groups but no membrane anchor. The human b5+b5R transcript was expressed at similar levels in all tissues and cell lines that were tested. The two functional domains b5* and b5R* are linked by an approximately 100-aa-long hinge bearing no sequence homology to any known proteins. When human b5+b5R was expressed as a c-myc adduct in COS-7 cells, confocal microscopy revealed a cytosolic localization at the perinuclear space. The recombinant b5+b5R protein can be reduced by NAD(P)H, generating a spectrum typical of reduced cytochrome b with alpha, beta, and Soret peaks at 557, 527, and 425 nm, respectively. Human b5+b5R flavohemoprotein is an NAD(P)H oxidoreductase, demonstrated by superoxide production in the presence of air and excess NAD(P)H and by cytochrome c reduction in vitro. The properties of this protein make it a plausible candidate oxygen sensor.

Relevance: 30.00%

Abstract:

SNARE proteins have been classified as vesicular (v)- and target (t)-SNAREs and play a central role in the various membrane interactions in eukaryotic cells. Based on the Paramecium genome project, we have identified a multigene family of at least 26 members encoding the t-SNARE syntaxin (PtSyx) that can be grouped into 15 subfamilies. Paramecium syntaxins match the classical build-up of syntaxins, being 'tail-anchored' membrane proteins with an N-terminal cytoplasmic domain and a membrane-bound single C-terminal hydrophobic domain. The membrane anchor is preceded by a conserved SNARE domain of approximately 60 amino acids that is supposed to participate in SNARE complex assembly. In a phylogenetic analysis, most of the Paramecium syntaxin genes were found to cluster in groups together with those from other organisms in a pathway-specific manner, allowing an assignment to different compartments in a homology-dependent way. However, some of them seem to have no counterparts in metazoans. In another approach, we fused one representative member of each of the syntaxin isoforms to green fluorescent protein and assessed the in vivo localization, which was further supported by immunolocalization of some syntaxins. This allowed us to assign syntaxins to all important trafficking pathways in Paramecium.

Relevance: 30.00%

Abstract:

Acknowledgements The authors would like to thank M. N. Cueto and J. M. Antonio (ECOBIOMAR) for molecular analysis and technical support. K. MacKenzie (University of Aberdeen) and A. Roura (ECOBIOMAR) assisted with the taxonomic identification of parasites. We are also grateful to P. Caballero (Service Nature Conservation of the Xunta de Galicia) for fish sampling support.

Relevance: 20.00%

Abstract:

The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
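
The abstract above does not say which texture features were evaluated. As an illustration only, the sketch below computes two statistics from a grey-level co-occurrence matrix (GLCM), one widely used family of texture features; the choice of GLCM, the function names, and the tiny example images are all assumptions rather than the paper's method.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows of integer grey levels in the range [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                pairs += 1
    if pairs == 0:
        raise ValueError("image is too small for the requested offset")
    return [[v / pairs for v in row] for row in counts]


def contrast(p):
    """Weight each co-occurrence by its squared grey-level difference."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of squared entries; high when a few grey-level pairs dominate."""
    return sum(v * v for row in p for v in row)


if __name__ == "__main__":
    horizontal_stripes = [[0, 0], [1, 1]]  # neighbours along a row are equal
    vertical_stripes = [[0, 1], [0, 1]]    # neighbours along a row differ
    for name, img in (("horizontal", horizontal_stripes), ("vertical", vertical_stripes)):
        p = glcm(img, levels=2)
        print(name, contrast(p), energy(p))
```

A script classifier would typically compute such statistics for several offsets over each text block and pass the resulting feature vector to a standard classifier; that stage is not shown.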