5 results for Paraphrasing and plagiarism detection

at Universidad de Alicante


Relevance: 100.00%

Abstract:

DIANA is a coordinated project involving the Natural Language Engineering and Pattern Recognition group (ELiRF) of the Universitat Politècnica de València and the Centre de Llenguatge i Computació (CLiC) group of the Universitat de Barcelona. It is a project of the R&D programme (TIN2012-38603) funded by the Ministerio de Economía y Competitividad. Paolo Rosso coordinates the DIANA project and leads the DIANA-Applications subproject, and M. Antònia Martí leads the DIANA-Constructions subproject.

Relevance: 100.00%

Abstract:

The Internet boom of recent years has increased interest in plagiarism detection. Many documents are published on the Net every day, and anyone can access and plagiarize them. Checking every case of plagiarism manually is infeasible, so systems that detect plagiarism automatically are needed. In this paper, we introduce a new hybrid system for plagiarism detection that combines the advantages of the two main plagiarism detection techniques, intrinsic and external detection. The system consists of two analysis phases: the first uses an intrinsic detection technique to dismiss much of the text, and the second employs an external detection technique to identify the plagiarized sections in what remains. With this combination we achieve a detection system that obtains accurate results and is also faster thanks to the prefiltering of the text.
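The two-phase idea in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the stylometric feature (average word length), the z-score threshold, the word-trigram containment measure, and every function name below are assumptions chosen only to show how an intrinsic prefilter can shrink the workload of an external comparison.

```python
import re
from statistics import mean, pstdev


def tokens(text):
    """Lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def style_score(segment):
    """Toy stylometric feature (assumption): average word length."""
    words = tokens(segment)
    return mean(len(w) for w in words) if words else 0.0


def intrinsic_filter(segments, z_threshold=1.0):
    """Phase 1: return indices of segments whose style deviates from the document norm."""
    scores = [style_score(s) for s in segments]
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []  # uniform style: nothing looks suspicious
    return [i for i, sc in enumerate(scores) if abs(sc - mu) / sigma > z_threshold]


def ngrams(text, n=3):
    """Set of word n-grams in the text."""
    words = tokens(text)
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(segment, source, n=3):
    """Fraction of the segment's n-grams that also occur in the source."""
    seg = ngrams(segment, n)
    return len(seg & ngrams(source, n)) / len(seg) if seg else 0.0


def detect(segments, sources, z_threshold=1.0, min_containment=0.5):
    """Phase 2: compare only the segments kept by phase 1 against the sources."""
    hits = []
    for i in intrinsic_filter(segments, z_threshold):
        for j, src in enumerate(sources):
            if containment(segments[i], src) >= min_containment:
                hits.append((i, j))  # (segment index, source index)
    return hits


segments = [
    "the cat sat on the mat and ate",
    "a dog ran to the park and sat",
    "we saw the sun set on the sea",
    "computational stylometry characterises authorship quantitatively",
]
sources = [
    "An essay: computational stylometry characterises authorship quantitatively, as noted.",
    "unrelated text about cooking pasta at home",
]
print(detect(segments, sources))  # prints [(3, 0)]
```

In this toy run only the fourth segment survives the intrinsic phase, so the external phase performs two comparisons instead of eight. The trade-off is that any copied segment whose style does not stand out is never checked against the sources.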

Relevance: 100.00%

Abstract:

The exponential growth of subjective information on the Web 2.0 has created a need for Natural Language Processing tools able to analyse and process such data for multiple practical applications. These tools require training on specifically annotated corpora whose level of detail is fine enough to capture the phenomena involved. This paper presents EmotiBlog, a fine-grained annotation scheme for subjectivity. We show how it is built and demonstrate the benefits it brings to the systems trained on it, through the experiments we carried out on opinion mining and emotion detection. We employ corpora of different textual genres: a set of annotated reported speech extracted from news articles, the set of news titles annotated with polarity and emotion from SemEval 2007 (Task 14), and ISEAR, a corpus of real-life self-expressed emotion. We also show how the model built from the EmotiBlog annotations can be enhanced with external resources. The results demonstrate that EmotiBlog, through its structure and annotation paradigm, offers high-quality training data for systems dealing with both opinion mining and emotion detection.

Relevance: 100.00%

Abstract:

In the chemical textile domain, experts have to analyse chemical components and substances that might be harmful when used in clothing and textiles. Part of this analysis is performed by searching for the opinions and reports people have expressed about these products on the Social Web. However, this type of information is less frequent on the Internet for this domain than for others, so detecting and classifying it is difficult and time-consuming. Consequently, problems associated with the use of chemical substances in textiles may not be detected early enough and could lead to health problems such as allergies or burns. In this paper, we propose a framework able to detect, retrieve, and classify subjective sentences related to the chemical textile domain, which could be integrated into a wider health surveillance system. We also describe the creation of several datasets with opinions from this domain, the experiments performed using machine learning techniques and different lexical resources such as WordNet, and the evaluation, which focuses on sentiment classification and complaint detection (i.e., negativity). Despite the challenges involved in this domain, our approach obtains promising results, with an F-score of 65% for polarity classification and 82% for complaint detection.

Relevance: 100.00%

Abstract:

As BIM adoption continues, the goal of a totally collaborative model with multiple contributors is attainable. Initiatives such as the UK government's 2016 Level 2 BIM deadline are putting pressure on the construction industry to speed up the changeover. Clients and collaborators have higher expectations of using digital 3D models to communicate design ideas and solve practical problems. Contractors and clients are benefitting from the cost saving scheduling and clash detection offered by BIM, and effective collaboration on a project also brings speed and efficiency gains. Despite this, many businesses of varying sizes are still having problems. The cost of the software and training is an obvious barrier for micro-enterprises and could explain a delay in adoption. Many studies have looked at the problems faced by SMEs and micro-enterprises. Larger companies have different problems. The efforts made by government to encourage them are quite comprehensive, but is anything being done to help smaller sectors and keep the industry cohesive? This limited study examines several companies of varying size and project type: architectural design businesses, a main contractor, a structural engineer and a building consultancy. It examines the barriers to a truly collaborative BIM workflow facing different specialities on a larger project and on a contrasting small/medium project. The findings will establish that the different barriers facing each sector are actually pushing the sectors further apart, potentially creating a BIM-only construction elite and leaving small companies on 2D-based drawing.