987 results for Documents électroniques


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSMs for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.
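
As a rough illustration of the idea in this abstract, the sketch below combines a topic-based vector and a type-based vector for each word and then composes an adjective-noun phrase. The abstract does not say how the two representations are combined or which composition function is used, so the concatenation of L2-normalized vectors and the additive composition here are assumptions, and all names and numbers are hypothetical toy values.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(topic_vec, type_vec):
    """Concatenate normalized topic- and type-based vectors (assumed scheme)."""
    return l2_normalize(topic_vec) + l2_normalize(type_vec)


def compose_additive(adj_vec, noun_vec):
    """Compose a phrase vector by element-wise addition (assumed composition)."""
    return [a + n for a, n in zip(adj_vec, noun_vec)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical toy statistics: document co-occurrence (topic) and
# dependency-link counts (type) for two words.
topic = {"red": [2.0, 0.0, 1.0], "car": [0.0, 3.0, 1.0]}
dep = {"red": [1.0, 0.0], "car": [0.0, 2.0]}

word_vecs = {w: combine(topic[w], dep[w]) for w in topic}
phrase_vec = compose_additive(word_vecs["red"], word_vecs["car"])
```

A composed vector such as `phrase_vec` could then be compared against a vector observed for the phrase itself using `cosine`.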

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider the problem of segmenting text documents that have a
two-part structure such as a problem part and a solution part. Documents
of this genre include incident reports that typically involve
description of events relating to a problem followed by those pertaining
to the solution that was tried. Segmenting such documents
into the component two parts would render them usable in knowledge
reuse frameworks such as Case-Based Reasoning. This segmentation
problem presents a hard case for traditional text segmentation
due to the lexical inter-relatedness of the segments. We develop
a two-part segmentation technique that can harness a corpus
of similar documents to model the behavior of the two segments
and their inter-relatedness using language models and translation
models respectively. In particular, we use separate language models
for the problem and solution segment types, whereas the inter-relatedness
between segment types is modeled using an IBM Model
1 translation model. We model documents as being generated starting
from the problem part, which consists of words sampled from
the problem language model, followed by the solution part whose
words are sampled either from the solution language model or from
a translation model conditioned on the words already chosen in the
problem part. We show, through an extensive set of experiments on
real-world data, that our approach outperforms the state-of-the-art
text segmentation algorithms in the accuracy of segmentation, and
that such improved accuracy translates well to improved usability
in Case-Based Reasoning systems. We also analyze the robustness
of our technique to varying amounts and types of noise and empirically
illustrate that our technique is quite noise tolerant, and
degrades gracefully with increasing amounts of noise.
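
The generative story in this abstract can be sketched as a search over split points: the prefix is scored with the problem language model, and each word of the suffix is scored with a mixture of the solution language model and an IBM Model 1 style translation probability conditioned on the prefix. This is only a minimal sketch under stated assumptions: unigram language models, a fixed mixture weight `lam`, a constant floor probability for unseen events, and hand-written toy tables. None of these details come from the abstract.

```python
import math

FLOOR = 1e-6  # assumed floor probability for unseen words or word pairs


def log_prob_problem(words, problem_lm):
    """Log-probability of the problem part under a unigram problem LM."""
    return sum(math.log(problem_lm.get(w, FLOOR)) for w in words)


def log_prob_solution(sol_words, prob_words, solution_lm, trans, lam=0.5):
    """Score solution words with a mixture of the solution LM and a
    Model 1 style translation term averaged over the problem words."""
    total = 0.0
    for s in sol_words:
        p_lm = solution_lm.get(s, FLOOR)
        p_tr = sum(trans.get((s, p), FLOOR) for p in prob_words) / len(prob_words)
        total += math.log(lam * p_lm + (1 - lam) * p_tr)
    return total


def best_split(words, problem_lm, solution_lm, trans, lam=0.5):
    """Return the index k that maximizes the joint log-likelihood of
    words[:k] as the problem part and words[k:] as the solution part."""
    best_k, best_score = None, -math.inf
    for k in range(1, len(words)):
        score = log_prob_problem(words[:k], problem_lm) + log_prob_solution(
            words[k:], words[:k], solution_lm, trans, lam
        )
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Hypothetical toy models; in practice these would be estimated from a
# corpus of similar documents.
problem_lm = {"printer": 0.4, "jammed": 0.4, "error": 0.2}
solution_lm = {"replaced": 0.5, "roller": 0.3, "restarted": 0.2}
trans = {("roller", "jammed"): 0.6, ("replaced", "jammed"): 0.3}  # t(solution word | problem word)

doc = ["printer", "jammed", "replaced", "roller"]
split = best_split(doc, problem_lm, solution_lm, trans)
problem_part, solution_part = doc[:split], doc[split:]
```

Searching every split point is linear in the number of candidate boundaries, which is affordable for short incident reports; a word-level boundary is used here only for brevity, and sentence-level boundaries would work the same way.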

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The electronic cigarette (e-cigarette) is a relatively recent phenomenon that is growing at an unexpected pace, especially among young people. The scientific literature on the subject is still relatively scarce and focuses mainly on prevalence rates. Although e-cigarettes were in theory designed for adults who want to quit smoking, adolescents have become a target audience for these products, and many of them have never smoked traditional cigarettes. From a public health standpoint, one of the major concerns is the possible adverse effect of e-cigarettes in encouraging young people to take up smoking. Many questions remain unanswered about the impact of e-cigarettes on public health. For example, it is unclear whether e-cigarettes are merely a novelty that young people try only once or whether they have the potential to compete with traditional cigarettes. Although e-cigarettes have been available in Switzerland for nearly 10 years, few data exist on young people's reasons for using them, their patterns of use, the effects they seek, and their perception of the products' harmfulness.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Holdings: 1929 (vol. 1, no. 1)-1930 (vol. 2, no. 8)