881 results for Text linguistics


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The development of text classification techniques has advanced considerably over the past decade, driven by the increasing availability and widespread use of digital documents. The performance of text classification usually relies on the quality of the categories and the accuracy of classifiers learned from samples. When training samples are unavailable or the categories are of poor quality, classification performance degrades. In this paper, we propose an unsupervised multi-label text classification method that classifies documents using a large set of categories stored in a world ontology. The approach was evaluated against typical text classification methods on a real-world document collection, using ground truth encoded by human experts, with promising results.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clearly identifying the boundary between positive and negative streams is a major challenge. Several attempts have used negative feedback to address it; however, using negative relevance feedback to improve the effectiveness of information filtering raises two issues. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern-mining-based approach that selects some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.
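The three-way split of terms described in this abstract can be illustrated with a minimal sketch. The abstract does not say how a term is assigned to a category, so the rule below is purely an assumption for illustration: a term seen only in positive documents is "positive specific", a term seen only in negative documents is "negative specific", and a term seen in both is "general". The function name `categorise_terms` is hypothetical.

```python
def categorise_terms(positive_docs, negative_docs):
    """Assign each term to one of three categories.

    Assumed rule (not taken from the abstract): category membership is
    decided only by whether the term occurs in positive documents,
    negative documents, or both.
    """
    pos_terms = set().union(*map(set, positive_docs))
    neg_terms = set().union(*map(set, negative_docs))

    categories = {}
    for term in pos_terms | neg_terms:
        if term in pos_terms and term in neg_terms:
            categories[term] = "general"
        elif term in pos_terms:
            categories[term] = "positive_specific"
        else:
            categories[term] = "negative_specific"
    return categories


# Toy documents, each given as a list of terms.
positive = [["tax", "law"], ["tax", "court"]]
negative = [["law", "football"]]
term_categories = categorise_terms(positive, negative)
```

With separate categories in hand, a different revising strategy could be applied to each one, which is the point the abstract makes about updating extracted features.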

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This chapter provides a historical materialist review of the development of applied and critical linguistics and their extensions and applications to the fields of English language studies. Following Bourdieu, we view intellectual fields and their affiliated discourses as constructed in relation to specific economic and political formations and sociocultural contexts. We therefore take ‘applied linguistics’, ‘critical language studies’ and ‘English language studies’ as fields in dynamic and contested formation and relationship. Our review focuses on three historical moments. In the postwar period, we describe the technologisation of linguistics, with the enlistment of linguistics in the applied fields of language planning, literacy education and second/foreign language teaching. We then turn to document the multinationalisation of English, which, we argue, entails a rationalisation of English as a universal form of economic capital in globalised economic and cultural flows. We conclude by exploring scenarios for the displacement of English language studies as a major field by other emergent economic lingua francas (e.g., Mandarin, Spanish) and shifts in the economic and cultural nexus of control over English from an Anglo/American centre to East and West Asia.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper develops and evaluates an enhanced corpus-based approach for semantic processing. Corpus-based models that build representations of words directly from text do not require pre-existing linguistic knowledge, and have demonstrated psychologically relevant performance on a number of cognitive tasks. However, they have been criticised in the past for not incorporating sufficient structural information. Using ideas underpinning recent attempts to overcome this weakness, we develop an enhanced tensor encoding model to build representations of word meaning for semantic processing. Our enhanced model demonstrates superior performance when compared to a robust baseline model on a number of semantic processing tasks.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Much has been written on Michel Foucault’s reluctance to clearly delineate a research method, particularly with respect to genealogy (Harwood 2000; Meadmore, Hatcher, & McWilliam 2000; Tamboukou 1999). Foucault (1994, p. 288) himself disliked prescription stating, “I take care not to dictate how things should be” and wrote provocatively to disrupt equilibrium and certainty, so that “all those who speak for others or to others” no longer know what to do. It is doubtful, however, that Foucault ever intended for researchers to be stricken by that malaise to the point of being unwilling to make an intellectual commitment to methodological possibilities. Taking criticism of “Foucauldian” discourse analysis as a convenient point of departure to discuss the objectives of poststructural analyses of language, this paper develops what might be called a discursive analytic; a methodological plan to approach the analysis of discourses through the location of statements that function with constitutive effects.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Our everyday environment is full of text but this rich source of information remains largely inaccessible to mobile robots. In this paper we describe an active text spotting system that uses a small number of wide angle views to locate putative text in the environment and then foveates and zooms onto that text in order to improve the reliability of text recognition. We present extensive experimental results obtained with a pan/tilt/zoom camera and a ROS-based mobile robot operating in an indoor environment.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Information retrieval (IR) by clinicians in the healthcare setting is critical for informing clinical decision-making. However, a large part of this information is in the form of free text, which inhibits clinical decision support and effective healthcare services. This makes meaningful use of clinical free text in electronic health records (EHRs) for patient care a difficult task. Within the context of IR, given a repository of free-text clinical reports, one might want to retrieve and analyse data for patients who have a known clinical finding.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Tax law and policy is a vital part of Australian society. Australian society insists that the Federal Government provide extensive public programs, such as health services, education, social security, foreign aid, legal infrastructure, regulation, police services, national defence and funding for sports development. These programs are costly to provide and are funded by taxation. The aim of this book is to introduce and explain the principles of tax law and tax policy in plain English. The book contains detailed commentary on tax principles together with extracts from cases and materials that illustrate the application of the principles. The book considers tax policy and the economic and social aspects of tax law. While tax students must develop technical competence in tax law, given the speed with which changes are made to the technical details of tax law, it is also important to grasp tax principles and policy to understand why tax law has changed or why it should change. The chapters are structured to direct readers to the key provisions of the tax law. Each case is introduced by an explanation of the facts, followed by the taxpayer’s arguments, the Commissioner’s assertions and the decision of the Administrative Appeals Tribunal or a court. The commentary guides readers through the issues considered in the judgments. The book contains extracts from: articles; materials dealing with tax policy; and the Commissioner’s rulings. The book also has references for further reading and medium-neutral citations (Internet citations) for cases decided since 1998.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

With the overwhelming increase in the amount of text on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, extracting patterns with text or data mining algorithms leads to noisy patterns and inconsistency. Thus, different challenges arise, such as how to understand these patterns, whether the model that has been used is suitable, and whether all the extracted patterns are relevant. Furthermore, the research raises the question of how to give a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method, which uses a pattern co-occurrence matrix to find the relations between extracted patterns in order to reduce noisy patterns. The main objective of this paper is not only to reduce the number of closed sequential patterns, but also to improve the performance of pattern mining. The experimental results on the Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.
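To make the idea of a pattern co-occurrence matrix concrete, here is a minimal sketch of how such a matrix might be built. It assumes each document has already been reduced to the list of pattern identifiers it matches; the identifiers (`p1`, `p2`, `p3`) and the function name `cooccurrence_counts` are hypothetical. The abstract does not describe how the matrix is then used to discard noisy patterns, so that step is not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs_patterns):
    """Count, for every unordered pair of patterns, the number of
    documents in which both patterns occur.

    The result is a sparse representation of a symmetric matrix:
    keys are (pattern_a, pattern_b) tuples with pattern_a < pattern_b.
    """
    counts = Counter()
    for patterns in docs_patterns:
        # Deduplicate within a document so each pair counts once per document.
        for a, b in combinations(sorted(set(patterns)), 2):
            counts[(a, b)] += 1
    return counts


# Toy input: three documents and the patterns found in each.
docs = [["p1", "p2", "p3"], ["p1", "p2"], ["p3"]]
matrix = cooccurrence_counts(docs)
```

A pair with a high count, such as `("p1", "p2")` here, indicates patterns that tend to appear together; how such relations translate into a pruning rule is left open by the abstract.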

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Internet services are an important part of daily activities for most of us. These services come with sophisticated authentication requirements that average Internet users may not be able to handle. The management of secure passwords, for example, creates an extra overhead which is often neglected for usability reasons. Furthermore, password-based approaches apply only to initial logins and do not protect against attacks on unlocked workstations. In this paper, we provide a non-intrusive identity verification scheme based on behavioural biometrics, in which keystroke dynamics based on free text are used continuously to verify the identity of a user in real time. We improve existing keystroke-dynamics-based verification schemes in four respects. First, we improve scalability by using a constant number of users, instead of the whole user space, to verify the identity of the target user. Second, we provide an adaptive user model which enables our solution to take changes in user behaviour into consideration in the verification decision. Third, we identify a new distance measure which enables us to verify the identity of a user with shorter text. Fourth, we decrease the number of false results. Our solution is evaluated on a data set which we collected from users while they were interacting with their mailboxes during their daily activities.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A major challenge for text classification is the noisiness of text data, which lowers classification quality. Many classification processes can be divided into two sequential steps: scoring and threshold setting (thresholding). Therefore, to deal with the noisy data problem, it is important both to describe positive features effectively during scoring and to set a suitable threshold. Most existing text classifiers do not concentrate on these two tasks. In this paper, we propose a novel text classifier with pattern-based scoring that describes positive features effectively, followed by threshold setting. The thresholding is based on the scores of the training set, which makes it simple to apply to other scoring methods. Experiments show that our pattern-based classifier is promising.
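The two-step structure described in this abstract, scoring followed by thresholding, can be sketched as follows. This is an illustration under stated assumptions, not the classifier from the paper: a plain sum of term weights stands in for the pattern-based scoring, and the threshold is assumed to be the lowest score among positive training documents. The names `score`, `threshold_from_training` and `classify` are hypothetical.

```python
def score(doc_terms, weights):
    """Step 1: score a document. A sum of term weights is used here as a
    stand-in for the pattern-based scoring the abstract refers to."""
    return sum(weights.get(term, 0.0) for term in doc_terms)


def threshold_from_training(positive_docs, weights):
    """Step 2: derive the threshold from training-set scores.

    Assumed rule: take the minimum score over positive training documents.
    """
    return min(score(doc, weights) for doc in positive_docs)


def classify(doc_terms, weights, threshold):
    """Label a document positive when its score reaches the threshold."""
    return score(doc_terms, weights) >= threshold


# Toy weights and positive training documents.
weights = {"tax": 2.0, "law": 1.0}
training_positives = [["tax", "law"], ["tax"]]
threshold = threshold_from_training(training_positives, weights)

accepted = classify(["tax", "policy"], weights, threshold)
rejected = classify(["law"], weights, threshold)
```

Because the threshold depends only on scores, the `score` function can be swapped for another scoring method without changing the thresholding step, which matches the portability the abstract claims.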
