7 results for NER

in Deakin Research Online - Australia


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The study of interconnection networks is important because the overall performance of a distributed system often hinges critically on the effectiveness of its interconnection network. Heterogeneity, meanwhile, is one of the most important characteristics of such systems. This paper addresses performance modeling of interconnection networks in large-scale distributed systems, with an emphasis on heterogeneous multi-cluster computing systems. We present an analytical model to predict message latency in multi-cluster systems in the presence of cluster size heterogeneity. The model is validated through comprehensive simulation, which demonstrates that it exhibits a good degree of accuracy for various system organizations and under different working conditions.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

DNA repair mechanisms constitute an essential cellular response to DNA damage arising either from metabolic processes or from environmental sources such as ultraviolet radiation. Repair of these lesions may be via direct reversal, or by processes such as nucleotide excision repair (NER), a coordinated pathway in which lesions and the surrounding nucleotides are excised and replaced via DNA resynthesis. The importance of repair is illustrated by human disease states such as xeroderma pigmentosum and Cockayne's syndrome, which result from defects in the NER system arising from mutations in XP- genes or in XP- and CS- genes, respectively. Little detail is known of DNA damage repair processes in plants, despite the economic and ecological importance of these organisms. This study aimed to expand our knowledge of the process of NER in plants, largely via a polymerase chain reaction (PCR)-based approach involving amplification, cloning and characterisation of plant genomic DNA and cDNA. Homologues of the NER components XPF/RAD1 and XPD/RAD3 were isolated as both genomic and complete cDNA sequences from the model dicotyledonous plant Arabidopsis thaliana. The sequence of the 3'-untranslated region of atXPD was also determined. Comparison of genomic and cDNA sequences allowed a detailed analysis of gene structures, including details of intron/exon processing. Variable transcript processing to produce three distinct transcripts was found in the case of atXPF. In an attempt to validate the proposed homologous function of these cDNAs, assays to test complementation of resistance to ultraviolet radiation in the relevant yeast mutants were performed. Despite extensive amino acid sequence conservation, neither plant cDNA was able to restore UV-resistance. As the yeast RAD3 gene product is also involved in vivo in transcription, and so is required for viability, the atXPD cDNA was tested in a complementation assay for this function in an appropriate yeast mutant. The plant cDNA was found to substantially increase the viability of the yeast mutant. The structural and functional significance of these results is discussed comparatively with reference to yeast, human and other known homologues. Other putative NER homologues were identified in A. thaliana database sequences, including those of ERCC1/RAD10 and XPG/ERCC5/RAD2, and are now the subjects of ongoing investigations. This study also describes preliminary investigations of putative REV3 and RAD30 translesion synthesis genes from A. thaliana.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In named entity recognition (NER) for biomedical literature, approaches based on combined classifiers have demonstrated large performance improvements over a single (best) classifier. This is mainly due to a sufficient level of diversity among the classifiers, which is a selective property of the classifier set. Given a large number of classifiers, how to select which classifiers to put into a classifier ensemble is a crucial issue in multiple classifier-ensemble design. With this observation in mind, we propose a generic genetic classifier-ensemble method for classifier selection in biomedical NER. Various diversity measures and majority voting are considered, and disjoint feature subsets are selected to construct individual classifiers. A basic type of individual classifier, the Support Vector Machine (SVM) classifier, is adopted to form an SVM-classifier committee. A multi-objective genetic algorithm (GA) is employed as the classifier selector to help the ensemble classifier improve the overall sample classification accuracy. The proposed approach is tested on the benchmark GENIA version 3.02 corpus, and compared with both the individual best SVM classifier and an SVM-classifier ensemble algorithm, as well as other machine learning methods such as CRF, HMM and MEMM. The results show that the proposed approach outperforms other classification algorithms and can be a useful method for the biomedical NER problem.
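
As a rough illustration of two ingredients named in this abstract, majority voting and a diversity measure, here is a minimal Python sketch. The abstract does not say which diversity measures are used, so the pairwise disagreement rate below is only one common choice, picked as an assumption. The function names and the toy label sequences are hypothetical, and the genetic-algorithm selection step is not shown.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier label sequences by per-token majority vote.

    predictions: list of equal-length label lists, one per classifier.
    Ties go to the label seen first among the tied ones (Counter order).
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def disagreement(a, b):
    """Fraction of positions where two classifiers output different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all classifier pairs (higher = more diverse)."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Toy outputs of three hypothetical classifiers on a three-token sentence.
preds = [
    ["B", "I", "O"],
    ["B", "O", "O"],
    ["O", "I", "O"],
]
ensemble_labels = majority_vote(preds)
diversity = mean_pairwise_disagreement(preds)
```

A selector such as the GA mentioned in the abstract could, in principle, score candidate classifier subsets with a mix of ensemble accuracy and a diversity value of this kind.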

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Named entity recognition (NER) is an essential step in the process of information extraction within text mining. This paper proposes a technique to extract drug named entities from unstructured and informal medical text using a hybrid model of lexicon-based and rule-based techniques. In the proposed model, a lexicon is first used to detect drug named entities. Inference rules are then deployed to extract drug names that the lexicon missed. The designed rules employ part-of-speech tags and morphological features for drug name detection. The proposed hybrid model is evaluated using a benchmark data set from the i2b2 2009 medication challenge, and achieves an f-score of 66.97%.
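
The two-stage idea in this abstract (lexicon lookup first, rules for what the lexicon misses) can be sketched as follows. This is only an illustration under stated assumptions: the lexicon entries, the suffix list standing in for "morphological features", and the function name are all hypothetical, and the part-of-speech rules mentioned in the abstract are left out.

```python
import re

# Hypothetical mini-lexicon; a real system would load a drug dictionary.
LEXICON = {"aspirin", "metformin"}

# Hypothetical morphological cues: suffixes that often end drug names.
DRUG_SUFFIXES = ("cillin", "statin", "olol")


def find_drug_names(text):
    """Return (token, source) pairs, where source is 'lexicon' or 'rule'."""
    found = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        if word in LEXICON:
            # Stage 1: exact lexicon match.
            found.append((token, "lexicon"))
        elif word.endswith(DRUG_SUFFIXES):
            # Stage 2: suffix rule catches names missing from the lexicon.
            found.append((token, "rule"))
    return found


mentions = find_drug_names("Patient took aspirin and Amoxicillin daily")
```

Here "aspirin" is caught by the lexicon stage and "Amoxicillin" only by the suffix rule, which mirrors the division of labour the abstract describes.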

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Objective: The objective of this paper is to formulate an extended segment representation (SR) technique to enhance named entity recognition (NER) in medical applications.

Methods: An extension to the IOBES (Inside/Outside/Begin/End/Single) SR technique is formulated. In the proposed extension, a new class is assigned to words that do not belong to a named entity (NE) in one context but appear as an NE in other contexts. Ambiguity in such cases can negatively affect the results of classification-based NER techniques. Assigning a separate class to words that can potentially cause ambiguity in NER allows a classifier to detect NEs more accurately, thereby increasing classification accuracy.

Results: The proposed SR technique is evaluated using the i2b2 2010 medical challenge data set with eight different classifiers. Each classifier is trained separately to extract three different medical NEs, namely treatment, problem, and test. Across the three experiments, the extended SR technique improves the average F1-measure of seven out of eight classifiers. The kNN classifier shows an average reduction of 0.18% across the three experiments, while the C4.5 classifier records an average improvement of 9.33%.
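
To make the segment representation concrete, the sketch below assigns standard IOBES tags to token spans and then applies a simplified version of the extension described in the abstract. The abstract does not name the new class or say how ambiguous words are identified, so the tag "A", the vocabulary lookup, and the function names are assumptions made purely for illustration.

```python
def iobes_tags(n_tokens, spans):
    """Tag a sentence of n_tokens given entity spans as (start, end) pairs.

    end is exclusive. Tokens outside every span get "O".
    """
    tags = ["O"] * n_tokens
    for start, end in spans:
        if end - start == 1:
            tags[start] = "S"          # single-token entity
        else:
            tags[start] = "B"          # first token
            for i in range(start + 1, end - 1):
                tags[i] = "I"          # interior tokens
            tags[end - 1] = "E"        # last token
    return tags


def mark_ambiguous(tokens, tags, entity_vocab, extra_tag="A"):
    """Relabel outside tokens that occur as entities elsewhere.

    entity_vocab is a hypothetical set of lowercased words seen inside
    entities in other contexts; extra_tag is a made-up name for the new class.
    """
    return [
        extra_tag if tag == "O" and tok.lower() in entity_vocab else tag
        for tok, tag in zip(tokens, tags)
    ]


tokens = ["she", "felt", "cold", "after", "the", "chest", "x-ray"]
base = iobes_tags(len(tokens), [(5, 7)])       # "chest x-ray" is an entity
extended = mark_ambiguous(tokens, base, {"cold"})
```

In this toy sentence "cold" is not an entity, but it could be one (a problem) in another context, so it moves from "O" to the extra class.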

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Accurate Named Entity Recognition (NER) is important for knowledge discovery in text mining. This paper proposes an ensemble machine learning approach to recognise Named Entities (NEs) from unstructured and informal medical text. Specifically, Conditional Random Field (CRF) and Maximum Entropy (ME) classifiers are applied individually to the test data set from the i2b2 2010 medication challenge. Each classifier is trained using a different set of features. The first set focuses on the contextual features of the data, while the second concentrates on the linguistic features of each word. The results of the two classifiers are then combined. The proposed approach achieves an f-score of 81.8%, a considerable improvement over the CRF and ME classifiers individually, which achieve f-scores of 76% and 66.3%, respectively, on the same data set.
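
The f-scores quoted in these abstracts are the harmonic mean of precision and recall. A minimal sketch of how such a score can be computed for entity extraction is given below; exact-match scoring over (start, end, type) triples is an assumption here, since the abstracts do not state the matching criterion, and the toy annotations are invented.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall and F1 over sets of entity annotations."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented annotations: (start token, end token, entity type).
gold = {(0, 2, "treatment"), (5, 6, "problem"), (8, 9, "test")}
predicted = {(0, 2, "treatment"), (5, 6, "test")}
p, r, f1 = precision_recall_f1(gold, predicted)
```

In this example only one of two predictions is correct and one of three gold entities is found, so precision is 0.5, recall is 1/3, and F1 is 0.4.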

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Named Entity Recognition (NER) is a crucial step in text mining. This paper proposes a new graph-based technique for representing unstructured medical text. The new representation is used to extract discriminative features that are able to enhance the NER performance. To evaluate the usefulness of the proposed graph-based technique, the i2b2 medication challenge data set is used. Specifically, the 'treatment' named entities are extracted for evaluation using six different classifiers. The F-measure results of five classifiers are enhanced, with an average improvement of up to 26% in performance.
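
The abstract does not describe how the graph is built or which features are drawn from it. Purely as an illustration of what a graph-based text representation can look like, the sketch below links words that appear next to each other and uses node degree as a feature. The construction, the function names, and the toy sentences are all assumptions and should not be read as the paper's method.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=1):
    """Undirected graph: words are nodes, edges join words within `window`."""
    graph = defaultdict(set)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                other = tokens[j]
                if tok != other:
                    graph[tok].add(other)
                    graph[other].add(tok)
    return graph


def degree_features(graph):
    """Map each word to its number of distinct neighbours."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Invented tokenised sentences.
sentences = [
    ["start", "insulin", "therapy"],
    ["continue", "insulin", "dose"],
]
features = degree_features(build_cooccurrence_graph(sentences))
```

A word such as "insulin" that co-occurs with many distinct neighbours gets a higher degree, and a value like this could be passed to a classifier alongside other token features.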