240 results for Yue fu (Chinese poetry)
in Queensland University of Technology - ePrints Archive
Abstract:
In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort required to prepare and maintain manually segmented Chinese text for training purposes, and to manually maintain ever-expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. NGMI extends that technique to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.
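The 2-character baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' NGMI algorithm: it only computes pointwise mutual information between adjacent characters and places a word boundary wherever the score falls below a threshold. The toy corpus, the function names, and the threshold of 0.0 are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_counts(corpus):
    """Count single characters and adjacent character pairs in a list of strings."""
    unigrams, bigrams = Counter(), Counter()
    for text in corpus:
        unigrams.update(text)
        bigrams.update(text[i:i + 2] for i in range(len(text) - 1))
    return unigrams, bigrams


def pmi(a, b, unigrams, bigrams):
    """Pointwise mutual information of the adjacent pair (a, b); -inf if unseen."""
    pair = bigrams[a + b]
    if pair == 0 or unigrams[a] == 0 or unigrams[b] == 0:
        return float("-inf")
    p_ab = pair / sum(bigrams.values())
    p_a = unigrams[a] / sum(unigrams.values())
    p_b = unigrams[b] / sum(unigrams.values())
    return math.log(p_ab / (p_a * p_b))


def segment(text, unigrams, bigrams, threshold=0.0):
    """Join adjacent characters whose PMI reaches the threshold (assumed value)."""
    if not text:
        return []
    words = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if pmi(prev, cur, unigrams, bigrams) >= threshold:
            words[-1] += cur      # strong association: extend the current word
        else:
            words.append(cur)     # weak association: start a new word
    return words


# Hypothetical toy corpus; the paper draws its statistics from Chinese Wikipedia.
unigrams, bigrams = train_counts(["北京", "北京", "大学", "大学"])
print(segment("北京大学", unigrams, bigrams))
```

In this toy corpus the pair 京大 is never observed, so its score is negative infinity and the string is split between 北京 and 大学. Handling longer n-character words, as NGMI does, needs more than this pairwise test.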
Abstract:
A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no spaces or boundaries between words. This feature makes Chinese information retrieval more difficult: a retrieved document that contains the query term as a sequence of Chinese characters may not actually be relevant to the query, because the query term (as a sequence of Chinese characters) may not be a valid Chinese word in that document. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on their relevancy to the query, calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if only Chinese segmentation is combined with the traditional character-based approach.
Abstract:
In this paper, we describe a voting mechanism for accurate named entity (NE) translation in English–Chinese question answering (QA). This mechanism involves translations from three different sources: machine translation, online encyclopaedia, and web documents. The translation with the highest number of votes is selected. We evaluated this approach using the test collection, topics and assessment results from the NTCIR-8 evaluation forum. This mechanism achieved 95% accuracy in NE translation and 0.3756 MAP in English–Chinese cross-lingual information retrieval of QA.
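The selection rule described above can be sketched as a simple majority vote. The abstract does not say how many candidates each source contributes or how ties are resolved, so this sketch assumes one candidate per source and breaks ties by the order in which sources are listed; the function name, source labels, and example strings are hypothetical.

```python
from collections import Counter


def vote_translation(candidates):
    """Pick the translation proposed by the most sources.

    candidates: dict mapping a source name to its candidate translation,
    or to None when that source returned nothing.
    """
    votes = Counter(t for t in candidates.values() if t)
    if not votes:
        return None
    best = max(votes.values())
    # Tie-break by source order in the dict: an assumption, not from the paper.
    for translation in candidates.values():
        if translation and votes[translation] == best:
            return translation


# Hypothetical candidates for one English named entity.
print(vote_translation({
    "machine_translation": "奥巴马",
    "online_encyclopaedia": "欧巴马",
    "web_documents": "奥巴马",
}))
```

Listing the most trusted source first makes the tie-break favour it, which is one plausible design choice when all three sources disagree.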
Abstract:
In this paper we examine automated Chinese to English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese to English translation on the hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bi-lingual users where a particular topic is not covered in Chinese, is not equally covered in both languages, or is biased in one language; as well as for language learning.
Abstract:
This paper describes our participation in the Chinese word segmentation task of CIPS-SIGHAN 2010. We implemented an n-gram mutual information (NGMI) based segmentation algorithm that mixes features from unsupervised, supervised and dictionary-based segmentation methods. This algorithm is also combined with a simple strategy for out-of-vocabulary (OOV) word recognition. Evaluation under both open and closed training shows encouraging results for our system. The results for OOV word recognition in the closed training evaluation were, however, unsatisfactory.
Abstract:
Objective: The results of a recent genome-wide association study have shown that ERAP1 and IL23R are associated with ankylosing spondylitis (AS) in Caucasian populations from North America and the UK. Based on these findings, we undertook the current study to investigate whether single-nucleotide polymorphisms (SNPs) covering the genes ERAP1 and IL23R are associated with AS in a Han Chinese population. Methods: A case-control study was performed in Han Chinese patients with AS (n = 527) and controls (n = 945) from Shanghai and Nanjing. All patients met the modified New York criteria for AS. The Sequenom iPlex platform was used to genotype cases and controls for 21 tag SNPs covering IL23R and 38 tag SNPs covering ERAP1. Statistical analysis was performed using the Cochran-Armitage test for trend. Results: Multiple SNPs in ERAP1 were significantly associated with AS (for rs27980, P = 0.0048; for rs7711564, P = 0.0081). However, no association was observed between IL23R and AS (for all SNPs, P > 0.1). The nonsynonymous SNP in IL23R, rs11209026, widely thought to be the primary AS-associated SNP in IL23R in Europeans, was found not to be polymorphic in Chinese. Conclusion: Our results demonstrate that genetic polymorphisms in ERAP1 are associated with AS in Han Chinese, suggesting a common pathogenic mechanism for the disease in Chinese and Caucasian populations, and that IL23R is not associated with AS in Chinese, indicating a difference in the mechanism of disease pathogenesis between Chinese and Caucasian populations. This may result from the fact that rs11209026, the nonsynonymous SNP in IL23R, is not polymorphic in Chinese patients, providing further evidence that rs11209026 is the key polymorphism associated with AS (and likely inflammatory bowel disease and psoriasis) in this gene.
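For readers unfamiliar with the Cochran-Armitage test for trend named in the Methods, here is a minimal sketch of its standard textbook form for a single SNP, with genotypes scored by the number of copies of one allele (weights 0, 1, 2). The function name and the genotype counts are made up for illustration and are not data from the study; the study's exact software and settings are not stated in the abstract.

```python
import math


def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x k table of genotype counts.

    cases, controls: counts per genotype category, in the same order as weights.
    Returns (z, two_sided_p) using the normal approximation.
    """
    totals = [r + s for r, s in zip(cases, controls)]   # column totals n_i
    n = sum(totals)
    p_bar = sum(cases) / n                               # overall case fraction
    # Weighted sum of observed minus expected case counts per category.
    u = sum(w * (r - t * p_bar) for w, r, t in zip(weights, cases, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    var = p_bar * (1 - p_bar) * (sum_w2n - sum_wn ** 2 / n)
    if var <= 0:
        raise ValueError("variance is zero; the SNP may not be polymorphic")
    z = u / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))                 # two-sided p-value
    return z, p


# Hypothetical genotype counts (0, 1, 2 copies of the tested allele).
z, p = cochran_armitage(cases=[10, 20, 40], controls=[40, 20, 10])
print(z, p)
```

The zero-variance guard corresponds to the situation reported for rs11209026: when every sample carries the same genotype, no trend statistic can be computed.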
Abstract:
Objective: To investigate differences in genetic risk factors for rheumatoid arthritis (RA) in Han Chinese as compared with Europeans. Methods: A genome-wide association study was conducted in China with 952 patients and 943 controls, and 32 variants were followed up in 2,132 patients and 2,553 controls. A transpopulation meta-analysis with results from a large European RA study was also performed to compare the genetic architecture across the 2 ethnically remote populations. Results: Three non-major histocompatibility complex (non-MHC) loci were identified at the genome-wide significance level, the effect sizes of which were larger in anti-citrullinated protein antibody (ACPA)-positive patients than in ACPA-negative patients. These included 2 novel variants, rs12617656, located in an intron of DPP4 (odds ratio [OR] 1.56, P = 1.6 × 10⁻²¹), and rs12379034, located in the coding region of CDK5RAP2 (OR 1.49, P = 1.1 × 10⁻¹⁶), as well as a variant at the known CCR6 locus, rs1854853 (OR 0.71, P = 6.5 × 10⁻¹⁵). The analysis of ACPA-positive patients versus ACPA-negative patients revealed that rs12617656 at the DPP4 locus showed a strong interaction effect with ACPAs (P = 5.3 × 10⁻¹⁸), and such an interaction was also observed for rs7748270 at the MHC locus (P = 5.9 × 10⁻⁸). The transpopulation meta-analysis showed genome-wide overlap and enrichment in association signals across the 2 populations, as confirmed by prediction analysis. Conclusion: This study has expanded the list of alleles that confer risk of RA, provided new insight into the pathogenesis of RA, and added empirical evidence to the emerging polygenic nature of complex trait variation driven by common genetic variants. Copyright © 2014 by the American College of Rheumatology.
Abstract:
Purpose: To describe distributions of ocular biometry and their associations with refraction in 7- and 14-year-old children in urban areas of Anyang, central China. Methods: A total of 2271 grade 1 students aged 7.1 ± 0.4 years and 1786 grade 8 students aged 13.7 ± 0.5 years were measured with ocular biometry and cycloplegic refraction. A parental myopia questionnaire was administered to parents. Results: Mean axial length, anterior chamber depth, lens thickness, central corneal thickness, corneal diameter, corneal radius of curvature, axial length/corneal radius of curvature ratio, and spherical equivalent refraction were 22.72 ± 0.76 mm, 2.89 ± 0.24 mm, 3.61 ± 0.19 mm, 540.5 ± 31 μm, 12.06 ± 0.44 mm, 7.80 ± 0.25 mm, 2.91 ± 0.08, and +0.95 ± 1.05 diopters (D), respectively, in 7-year-old children. They were 24.39 ± 1.13 mm, 3.42 ± 0.41 mm, 3.18 ± 0.24 mm, 548.9 ± 33 μm, 12.03 ± 0.43 mm, 7.80 ± 0.26 mm, 3.13 ± 0.14, and −2.06 ± 2.20 D, respectively, in 14-year-old children. Compared with 7-year-old children, the older group had significantly more myopia (−3.0 D), longer axial length (1.7 mm), deeper anterior chamber depth (0.3 mm), thinner lens thickness (−0.2 mm), thicker central corneal thickness (10 μm), and greater axial length/corneal radius of curvature ratio (0.22) (all p < 0.001), as well as smaller corneal diameter (−0.03 mm, p = 0.02) and similar corneal radius of curvature. Sex differences were similar in both age groups, with boys having longer axial length (0.5 mm), deeper anterior chamber depth (0.1 mm), shorter lens thickness (0.03 mm), greater central corneal thickness (5 μm), greater corneal diameter (0.15 mm), and greater corneal radius of curvature (0.14 mm) than girls (all p < 0.01). The most important variables related to spherical equivalent refraction were vitreous length, corneal radius of curvature, and lens thickness. 
Conclusions: The 14-year-old group had larger parameter dimensions than the 7-year-old group except for corneal radius of curvature (unchanged) and lens thickness and corneal diameter (both smaller). Boys had larger parameter dimensions than girls except for lens thickness (smaller). Axial length, corneal radius of curvature, and lens thickness were the most important determinants of refraction.