985 results for Word order


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Written Thai is one of the languages that do not mark word boundaries. To discover the meaning of a document, its text must be separated into syllables, words, sentences, and paragraphs. This paper develops a novel method to segment Thai text by combining a non-dictionary-based technique with a dictionary-based technique. The method first applies Thai grammar rules to the text to identify syllables. A hidden Markov model is then used to merge candidate syllables into words. The identified words are verified against a lexical dictionary, and a decision tree is employed to discover the words not identified by the lexical dictionary. Documents used in the litigation process of Thai court proceedings were used in the experiments. The segmented words obtained by the proposed method outperform the results obtained by other existing methods.
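The syllable-merging step can be illustrated with a minimal sketch, assuming a two-state hidden Markov model that tags each syllable as B (begins a word) or I (continues the current word) and decodes with the Viterbi algorithm. The abstract does not specify the paper's actual states or parameters, so the probabilities and the romanized syllables below are hypothetical.

```python
import math

STATES = ("B", "I")  # B = syllable begins a word, I = syllable continues a word

def viterbi(syllables, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely B/I tag sequence for a list of syllables."""
    def emit(state, syl):
        # Unseen syllables get a small floor probability instead of zero.
        return math.log(emit_p[state].get(syl, floor))

    # best[i][s] = best log-probability of any path ending in state s at position i
    best = [{s: math.log(start_p[s]) + emit(s, syllables[0]) for s in STATES}]
    back = []
    for syl in syllables[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = best[-1][prev] + math.log(trans_p[prev][s]) + emit(s, syl)
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: best[-1][s])
    tags = [state]
    for pointers in reversed(back):
        state = pointers[state]
        tags.append(state)
    return tags[::-1]

def merge(syllables, tags):
    """Join syllables into words according to their B/I tags."""
    words = []
    for syl, tag in zip(syllables, tags):
        if tag == "B" or not words:
            words.append(syl)
        else:
            words[-1] += syl
    return words

# Hypothetical parameters, for illustration only.
start_p = {"B": 0.99, "I": 0.01}
trans_p = {"B": {"B": 0.3, "I": 0.7}, "I": {"B": 0.5, "I": 0.5}}
emit_p = {
    "B": {"sa": 0.5, "khrap": 0.4, "wat": 0.05, "dee": 0.05},
    "I": {"wat": 0.5, "dee": 0.45, "sa": 0.025, "khrap": 0.025},
}

syllables = ["sa", "wat", "dee", "khrap"]
tags = viterbi(syllables, start_p, trans_p, emit_p)
words = merge(syllables, tags)
```

In the method described above, the merged words would then be checked against a lexical dictionary, with a decision tree handling the words the dictionary does not cover; neither of those steps is sketched here.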

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The increasing diversity of the Internet has created a vast number of multilingual resources on the Web. A huge number of these documents are written in languages other than English. Consequently, the demand for searching in non-English languages is growing exponentially. It is desirable for a search engine to be able to search collections of documents in other languages. This research investigates techniques for developing high-quality Chinese information retrieval systems. A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult: a retrieved document that contains the query term as a sequence of Chinese characters may not be relevant to the query, because that sequence may not be a valid Chinese word in that document. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose two approaches to deal with these problems. In the first approach, we propose a hybrid Chinese information retrieval model that combines word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents by their relevance to the query, calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant when segmentation alone is combined with the traditional character-based approach.
In the second approach, we propose a novel query expansion method which applies text mining techniques to find the most relevant words with which to extend the query. Unlike most existing query expansion methods, which generally select the most frequent indexing terms from the retrieved documents to expand the query, our approach uses text mining techniques to find patterns in the retrieved documents that correlate highly with the query term and then uses the relevant words in those patterns to expand the original query. This research project develops and implements a Chinese information retrieval system for evaluating the proposed approaches. The experiments have two stages. The first stage investigates whether high-accuracy segmentation can improve Chinese information retrieval. In the second stage, a text mining based query expansion approach is implemented, and a further experiment compares the performance of the standard Rocchio approach with that of the proposed text mining based query expansion method. The NTCIR5 Chinese collections are used in the experiments. The experimental results show that incorporating the text mining based query expansion into the hybrid model achieves significant improvement in both precision and recall.
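For orientation, the Rocchio baseline named above is the standard relevance-feedback update, in which the query vector is moved towards the centroid of relevant documents and away from the centroid of non-relevant ones. The following is a minimal sketch, assuming sparse term-weight dictionaries; the weights alpha, beta and gamma are commonly quoted defaults rather than values taken from this study, and the example vectors are hypothetical.

```python
from collections import Counter

def rocchio(query, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Expand a query vector with the classic Rocchio formula.

    All vectors are term -> weight mappings. Terms whose updated weight
    is not positive are dropped from the expanded query.
    """
    updated = Counter()
    for term, weight in query.items():
        updated[term] += alpha * weight
    # Add the centroid of the relevant documents, scaled by beta.
    for doc in relevant_docs:
        for term, weight in doc.items():
            updated[term] += beta * weight / len(relevant_docs)
    # Subtract the centroid of the non-relevant documents, scaled by gamma.
    for doc in nonrelevant_docs:
        for term, weight in doc.items():
            updated[term] -= gamma * weight / len(nonrelevant_docs)
    return {term: weight for term, weight in updated.items() if weight > 0}

# Hypothetical toy vectors, for illustration only.
expanded = rocchio(
    query={"a": 1.0},
    relevant_docs=[{"a": 1.0, "b": 2.0}],
    nonrelevant_docs=[{"c": 1.0}],
)
```

In pseudo-relevance feedback the top-ranked documents of an initial search are simply treated as relevant. The method proposed in the abstract differs in how expansion terms are chosen (mined patterns rather than raw term weights), which this sketch does not attempt to reproduce.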

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Project Management (PM) is a well-accepted mode of managing organizations, and more and more organizations are adopting it to satisfy the diversified needs of application areas within a variety of industries and organizations. Concurrently, the number of PM practitioners and of people involved at various levels of qualification is rising vigorously. It is therefore of paramount importance to characterize, define and understand this field and its underlying strength, basis and development. For this purpose we refer to the sociology of actor-networks and to qualitative scientometrics, which lead to the choice of the co-word analysis method to capture the project management field and its dynamics. Results of a study based on the analysis of the EBSCO Business Source Premier database are presented, and some future trends and scenarios are proposed. The following main trends are confirmed, in alignment with previous studies: continuous interest in the “cost engineering” aspects, ongoing interest in economic aspects and contracts, how to deal with various project types (categorizations), and the integration with Supply Chain Management and Learning and Knowledge Management. Besides these continuous trends, we note new areas of interest: the link between strategy and project, governance, the importance of maturity (organizational performance and metrics, control) and Change Management. We see the actors (Professional Bodies, Governmental Bodies, Agencies, Universities, Industries, Researchers, and Practitioners) reinforcing their competing/cooperative strategies in the development of standards and certifications and moving to more “business oriented” relationships with their members and main stakeholders (Governments, Institutions like European Community, Industries, Agencies, NGOs…), at least at central level.
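As a rough illustration of what co-word analysis computes, the sketch below counts how often pairs of keywords are assigned to the same document and derives the equivalence index, a commonly used association measure in co-word studies. The keyword lists are hypothetical, and the abstract does not say which association measure or tooling the study actually used.

```python
from collections import Counter
from itertools import combinations

def coword_counts(keyword_lists):
    """Return (keyword occurrence counts, keyword-pair co-occurrence counts)."""
    occurrences, pairs = Counter(), Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # count each keyword once per document
        occurrences.update(unique)
        pairs.update(combinations(unique, 2))
    return occurrences, pairs

def equivalence_index(occurrences, pairs, a, b):
    """E(a, b) = c_ab^2 / (c_a * c_b), ranging from 0 to 1."""
    c_ab = pairs[tuple(sorted((a, b)))]
    return c_ab ** 2 / (occurrences[a] * occurrences[b])

# Hypothetical keyword assignments for three articles.
docs = [
    ["cost", "risk"],
    ["cost", "risk", "governance"],
    ["governance", "strategy"],
]
occurrences, pairs = coword_counts(docs)
strong = equivalence_index(occurrences, pairs, "cost", "risk")
weak = equivalence_index(occurrences, pairs, "cost", "governance")
```

Pairs with a high index are typically linked into clusters, and changes in those clusters over time are what reveal the trends in a field.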

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Many existing information retrieval models do not explicitly take into account information about word associations. Our approach makes use of first and second order relationships found in natural language, known as syntagmatic and paradigmatic associations, respectively. This is achieved by using a formal model of word meaning within the query expansion process. On ad hoc retrieval, our approach achieves statistically significant improvements in MAP (0.158) and P@20 (0.396) over our baseline model. The ERR@20 and nDCG@20 of our system were 0.249 and 0.192 respectively. Our results and discussion suggest that information about both syntagmatic and paradigmatic associations can assist with improving retrieval effectiveness on ad hoc retrieval.
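The distinction between the two kinds of association can be made concrete with a small sketch. It assumes simple window-based co-occurrence counts as a stand-in for the formal model of word meaning mentioned above, which the abstract does not describe; the function names and the toy sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def syntagmatic(vectors, a, b):
    """First-order association: how often a and b occur near each other."""
    return vectors[a][b]

def paradigmatic(vectors, a, b):
    """Second-order association: cosine similarity of the two context vectors."""
    va, vb = vectors[a], vectors[b]
    dot = sum(va[t] * vb[t] for t in va if t in vb)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

# Toy corpus, for illustration only.
sentences = [["drink", "hot", "coffee"], ["drink", "hot", "tea"]]
vectors = cooccurrence_vectors(sentences)
near = syntagmatic(vectors, "hot", "coffee")      # the two words co-occur
never = syntagmatic(vectors, "coffee", "tea")     # the two words never co-occur
similar = paradigmatic(vectors, "coffee", "tea")  # but they share the same contexts
```

Here "coffee" and "tea" never appear together, yet they occur in identical contexts, which is the kind of paradigmatic relationship a query expansion process can exploit alongside direct co-occurrence.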

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The article discusses the importance of learning to live sustainably in order to provide healthy and fulfilling lives for future generations. It highlights the things that need to be done differently and the innovative partnerships that are required.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The work is based on the assumption that words with similar syntactic usage have similar meaning, which was proposed by Zellig S. Harris (1954, 1968). We study his assumption from two aspects: Firstly, different meanings (word senses) of a word should manifest themselves in different usages (contexts), and secondly, similar usages (contexts) should lead to similar meanings (word senses). If we start with the different meanings of a word, we should be able to find distinct contexts for the meanings in text corpora. We separate the meanings by grouping and labeling contexts in an unsupervised or weakly supervised manner (Publication 1, 2 and 3). We are confronted with the question of how best to represent contexts in order to induce effective classifiers of contexts, because differences in context are the only means we have to separate word senses. If we start with words in similar contexts, we should be able to discover similarities in meaning. We can do this monolingually or multilingually. In the monolingual material, we find synonyms and other related words in an unsupervised way (Publication 4). In the multilingual material, we find translations by supervised learning of transliterations (Publication 5). In both the monolingual and multilingual case, we first discover words with similar contexts, i.e., synonym or translation lists. In the monolingual case we also aim at finding structure in the lists by discovering groups of similar words, e.g., synonym sets. In this introduction to the publications of the thesis, we consider the larger background issues of how meaning arises, how it is quantized into word senses, and how it is modeled. We also consider how to define, collect and represent contexts. We discuss how to evaluate the trained context classifiers and discovered word sense classifications, and finally we present the word sense discovery and disambiguation methods of the publications.
This work supports Harris' hypothesis by implementing three new methods modeled on his hypothesis. The methods have practical consequences for creating thesauruses and translation dictionaries, e.g., for information retrieval and machine translation purposes. Keywords: Word senses, Context, Evaluation, Word sense disambiguation, Word sense discovery.
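The idea of separating word senses by grouping contexts can be sketched as follows. This is a deliberately simple stand-in, assuming bag-of-words contexts, Jaccard overlap and a greedy single-pass grouping; it is not one of the thesis's methods, and the threshold and the example contexts of the word "bank" are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of words, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_contexts(contexts, threshold=0.2):
    """Greedy single-pass grouping of contexts.

    Each context joins the first group whose pooled vocabulary overlaps it
    by at least `threshold`; otherwise it starts a new group. Returns the
    groups as lists of context indices.
    """
    groups = []  # each group: {"vocab": set of words, "members": list of indices}
    for index, context in enumerate(contexts):
        words = set(context)
        for group in groups:
            if jaccard(words, group["vocab"]) >= threshold:
                group["vocab"] |= words
                group["members"].append(index)
                break
        else:
            groups.append({"vocab": set(words), "members": [index]})
    return [group["members"] for group in groups]

# Hypothetical contexts in which the ambiguous word "bank" was observed.
contexts = [
    ["river", "water", "shore"],
    ["money", "loan", "account"],
    ["water", "shore", "fishing"],
    ["loan", "interest", "money"],
]
senses = group_contexts(contexts)
```

Each resulting group is a candidate word sense. How contexts are represented decides whether such groups are meaningful, which is the representation question the abstract raises.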

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skip-gram model. These methods have been shown to produce embeddings that capture higher order relationships between words and that are highly effective in natural language processing tasks involving word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this end, we use neural word embeddings within the well-known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from previous translation language model approaches, and these differences deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings; moreover, our results show that the embeddings do not even need to be produced from the same corpus being used for retrieval.
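A translation language model scores a query word w against a document d as the sum over document words u of p_t(w | u) · p(u | d), so a document can match a query word it never contains. The sketch below assumes that the translation probabilities p_t are obtained by normalising cosine similarities between embeddings (clipped at zero); that normalisation, the tiny two-dimensional vectors and the function names are assumptions for illustration, and smoothing is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def translation_probs(embeddings):
    """p_t(w | u): cosine similarities, clipped at 0 and normalised over w for each u.

    Assumes every embedding is a non-zero vector, so each total is at least 1.
    """
    probs = {}
    for u, vec_u in embeddings.items():
        sims = {w: max(0.0, cosine(vec_w, vec_u)) for w, vec_w in embeddings.items()}
        total = sum(sims.values())
        probs[u] = {w: s / total for w, s in sims.items()}
    return probs

def query_likelihood(query, doc, probs):
    """p(q | d) = product over query words w of sum_u p_t(w | u) * p(u | d).

    p(u | d) is the maximum-likelihood estimate count(u, d) / len(d).
    """
    score = 1.0
    for w in query:
        score *= sum(probs[u][w] * doc.count(u) / len(doc) for u in set(doc))
    return score

# Hypothetical two-dimensional embeddings, for illustration only.
embeddings = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "banana": [0.0, 1.0],
}
probs = translation_probs(embeddings)
score_related = query_likelihood(["car"], ["automobile", "automobile"], probs)
score_unrelated = query_likelihood(["car"], ["banana", "banana"], probs)
```

Neither toy document contains the query word "car", yet the document about "automobile" receives the higher score because its words are closer to the query word in the embedding space.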

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The purpose of this research was to investigate the role of electronic word of mouth (eWOM) in shaping consumer attitudes towards various products and services, concentrating on consumer attitude change. eWOM has long been shown to play an important role in influencing consumer attitudes and has been researched from a variety of perspectives. This study attempts to look deeper into the process of consumer attitude change by applying the Elaboration Likelihood Model of Persuasion by Petty and Cacioppo as its central theory. In examining the background academic and empirical research, the Internet and Web 2.0 are described closely in order to understand how, over the past centuries, technology allowed the rise of various media where consumers can not only share their opinions about products and services online but also communicate with other consumers. Manuel Castells's Internet Galaxy and the research on eWOM by Gildin, by Carl and Noland, and by Hennig-Thurau, Gwinner, Walsh and Gremler are the central works that helped to shape both the theoretical and empirical parts of this study. A mixed-method approach was chosen as the research method. An online survey was conducted via the Surveymonkey.com platform, and eight qualitative in-depth interviews were carried out. The results show that central route cues such as text quality and text argumentativeness are more prominent among the research subjects, while the peripheral route cues, source credibility and source expertise, did not show considerable significance. In addition, the more experience and participation consumers have with user-rating websites and applications, the more inclined they are to elaborate on the central route cues and the more likely they are to search for opinions that they consider rational and credible. These respondents are also less inclined to search for ratings that confirm their existing beliefs about products or services.
The less experience with and participation in eWOM consumers have, the more likely they are to search for reviews that confirm their own views.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The paper examines some reflections and discussions about the role and nature of the press that took place in Buenos Aires during the 1850s, with reference to the difficulties involved in reconciling freedom and order. These difficulties arose because the press was considered a pillar of republican and civilized societies, but also an agent capable of corroding the social and political order.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A postal survey was used to collect data from family members of deceased residents of six long-term care (LTC) facilities in order to explore end-of-life (EOL) care using the Family Perception of Care Scale. This article reports on the results of thematic analysis of family member comments provided while completing the survey. Family comments fell into two themes: 1) appreciation for care and 2) concerns with care. The appreciation for care theme included the following subthemes: psychosocial support, family care, and spiritual care. The concerns with care theme included the subthemes: physical care, staffing levels, staff knowledge, physician availability, communication, and physical environment. This study identified the need for improvement in EOL care skills among LTC staff and attending physicians. As such, there is a need to implement continuing education to address these issues. © 2006 Centre for Bioethics, IRCM.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Dissertation presented to the Escola Superior de Comunicação Social in partial fulfilment of the requirements for the degree of Master in Advertising and Marketing.