2 results for Records as Topic
in Dalarna University College Electronic Archive
Abstract:
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Because Wikipedia is free and anyone can edit its articles, article quality may suffer. Contributors differ in their level of knowledge and in their opinions about a topic, so contributions from different authors can vary. To address this, it is important to classify the articles so that good-quality articles can be separated from poor-quality ones, which should then be removed from the database. The aim of this study is to classify Wikipedia articles into two classes, class 0 (poor quality) and class 1 (good quality), using the Adaptive Neuro-Fuzzy Inference System (ANFIS) and data mining techniques. Two ANFIS models are built using the Fuzzy Logic Toolbox [1] available in Matlab. The first is based on rules obtained from the J48 classifier in WEKA, while the second is built using expert knowledge. The data used for this research consists of records for 226 articles taken from the German version of Wikipedia. The dataset has 19 inputs and one output. The data was preprocessed to remove similar attributes. The input variables relate to the editors, contributors, length, and lifecycle of the articles. Finally, the methods implemented in this research are compared to analyze the performance of each classification method.
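The rule-based classification described in this abstract can be illustrated with a minimal sketch of zero-order Sugeno fuzzy inference, the kind of rule structure an ANFIS tunes. This is not the study's implementation: the two inputs (`editors`, `length`, both assumed scaled to [0, 1]), the fuzzy sets, the four rules, and the 0.5 threshold are all hypothetical, and the parameters are fixed by hand here, whereas an ANFIS would learn them from data.

```python
import math


def gaussian(x, center, width):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


# Hypothetical fuzzy sets as (center, width) over inputs scaled to [0, 1].
SETS = {
    "low": (0.0, 0.3),
    "high": (1.0, 0.3),
}

# Hypothetical rules: (set for editors, set for length, constant consequent).
# Only "many editors AND long article" votes for class 1 (good quality).
RULES = [
    ("low", "low", 0.0),
    ("low", "high", 0.0),
    ("high", "low", 0.0),
    ("high", "high", 1.0),
]


def infer(editors, length):
    """Return the firing-strength-weighted average of the rule consequents."""
    weighted_sum = 0.0
    total_strength = 0.0
    for editors_set, length_set, consequent in RULES:
        # Product t-norm combines the two membership degrees.
        strength = (gaussian(editors, *SETS[editors_set])
                    * gaussian(length, *SETS[length_set]))
        weighted_sum += strength * consequent
        total_strength += strength
    return weighted_sum / total_strength


def classify(editors, length, threshold=0.5):
    """Map the continuous inference output to class 0 (poor) or 1 (good)."""
    return 1 if infer(editors, length) >= threshold else 0
```

Under these assumed rules, an article with high values for both inputs, such as `classify(0.9, 0.9)`, falls in class 1, while one with low values for both, such as `classify(0.1, 0.1)`, falls in class 0.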
Abstract:
In this article, we discuss ellipsis as an interactive strategy by analysing the author’s textchat corpus and the VOICE corpus of English as a Lingua Franca. We find fewer repetitions in the textchat data, which we explain as a consequence of the textchat mode. Textchat contributions are preserved as long as the chat is active or has been saved, so users can scroll back and review the discussion, in contrast to the more fleeting nature of oral conversation. As a result, repetition is less necessary. The frequency of the other functions identified could be attributed to the topic of discourse. Discussions involved much ellipsis used to develop discourse, although some were self-presentations, in which repetition was used to confirm details. The frequency of back-channel support and comments was often low because speakers instead used forms like yeah as supportive utterances.