2 results for news selection

at Universidad Politécnica de Madrid


Relevance:

30.00%

Abstract:

In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In such applications, for example spam filtering or news recommendation, the concept underlying the data stream (e.g., interesting mail or news) is likely to change over time, so the resultant model must be continuously adapted to those changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that exploits similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method in which the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate the classification accuracy of Coll-Stream under varying concept drift, noise, partition granularity, and similarity of the community concepts to the local underlying concept. The experimental results show that the resultant Coll-Stream model achieves stability and accuracy in a variety of situations on both synthetic and real-world datasets.
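The selection-and-weighting idea described in this abstract can be illustrated with a minimal sketch. It is not the published Coll-Stream algorithm: the class name `RegionWeightedEnsemble`, the `region_of` partition function, the `min_accuracy` threshold, and the toy classifiers are all assumptions introduced here for illustration. Each classifier is scored by its observed accuracy inside the partition where an instance falls, and only sufficiently accurate classifiers vote.

```python
from collections import defaultdict


class RegionWeightedEnsemble:
    """Illustrative ensemble: vote weights are per-partition local accuracies."""

    def __init__(self, classifiers, region_of):
        self.classifiers = classifiers  # callables x -> label (hypothetical community models)
        self.region_of = region_of      # callable x -> hashable partition key (assumed)
        self.hits = defaultdict(int)    # (classifier index, region) -> correct predictions
        self.seen = defaultdict(int)    # region -> labelled instances observed locally

    def accuracy(self, idx, region):
        n = self.seen[region]
        # With no local evidence for a region, fall back to a neutral 0.5.
        return self.hits[(idx, region)] / n if n else 0.5

    def update(self, x, y):
        """Record how each classifier performed on a locally labelled instance."""
        region = self.region_of(x)
        self.seen[region] += 1
        for idx, clf in enumerate(self.classifiers):
            if clf(x) == y:
                self.hits[(idx, region)] += 1

    def predict(self, x, min_accuracy=0.5):
        """Weighted vote among classifiers that are accurate enough in x's region."""
        region = self.region_of(x)
        votes = defaultdict(float)
        for idx, clf in enumerate(self.classifiers):
            weight = self.accuracy(idx, region)
            if weight >= min_accuracy:
                votes[clf(x)] += weight
        if not votes:
            # No classifier passed the threshold: use an unweighted vote.
            for clf in self.classifiers:
                votes[clf(x)] += 1.0
        return max(votes, key=votes.get)


# Toy setup (invented for this sketch): two trivial classifiers, two partitions.
always_one = lambda x: 1
always_zero = lambda x: 0
ensemble = RegionWeightedEnsemble(
    [always_one, always_zero],
    region_of=lambda x: x[0] >= 0.5,
)
for x, y in [((0.1,), 0), ((0.3,), 0), ((0.7,), 1), ((0.8,), 1)]:
    ensemble.update(x, y)
```

Because accuracy is tracked per partition, a classifier that is poor overall can still dominate the vote in the region where it matches the local concept. Adapting to concept drift would additionally require forgetting old evidence (for example with a sliding window), which this sketch omits.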

Relevance:

30.00%

Abstract:

Rhizobium leguminosarum bv. viciae is able to establish nitrogen-fixing symbioses with legumes of the genera Pisum, Lens, Lathyrus and Vicia. Classic studies using trap plants (Laguerre et al., Young et al.) provided evidence that different plant hosts are able to select different rhizobial genotypes among those available in a given soil. However, these studies were necessarily limited by the paucity of relevant biodiversity markers. We have now reappraised this problem with the help of genomic tools. A well-characterized agricultural soil (INRA Bretennieres) was used as a source of rhizobia. Plants of Pisum sativum, Lens culinaris, Vicia sativa and V. faba were used as traps. Isolates from 100 nodules were pooled, and DNA from each pool was sequenced (BGI-Hong Kong; Illumina HiSeq 2000, 500 bp PE libraries, 100 bp reads, 12 Mreads). Reads were quality filtered (FastQC, Trimmomatic), mapped against reference R. leguminosarum genomes (Bowtie2, SAMtools), and visualized (IGV). A substantial fraction of the filtered reads was not recruited by the reference genomes, suggesting that the plant isolates contain genes that are not present in those genomes. For this study, we focused on three conserved genomic regions (16S-23S rDNA, atpD and nodDABC), and a Single Nucleotide Polymorphism (SNP) analysis was carried out with the meta/multigenomes from each plant. Although the level of polymorphism varied (lowest in the rRNA region), polymorphic sites could be identified that distinguish the specific soil population from the reference genomes. More importantly, a plant-specific SNP distribution was observed. This observation could be confirmed with many other regions extracted from the reference genomes (data not shown). Our results confirm at the genomic level previous observations regarding plant selection of specific genotypes. We expect that further, ongoing comparative studies on differential meta/multigenomic sequences will identify specific gene components of the plant-selected genotypes.
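The notion of a plant-specific SNP distribution in pooled reads can be made concrete with a small sketch. This is not the authors' analysis pipeline: the function names, the `min_freq_gap` threshold, and the toy pileup strings below are assumptions invented for illustration and are not data from the study. The idea is to compute allele frequencies per site in each host pool and flag sites where some allele's frequency differs strongly between hosts.

```python
from collections import Counter


def allele_frequencies(bases):
    """Return {base: frequency} for the bases observed at one site in one pool."""
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()} if total else {}


def host_specific_sites(pools, min_freq_gap=0.5):
    """Flag positions where an allele's frequency differs between host pools.

    pools: {host name: {position: string of observed bases}} (hypothetical layout).
    min_freq_gap: assumed threshold on the largest between-host frequency difference.
    """
    if not pools:
        return []
    # Only compare positions covered in every pool.
    shared = set.intersection(*(set(sites) for sites in pools.values()))
    flagged = []
    for pos in sorted(shared):
        freqs = [allele_frequencies(sites[pos]) for sites in pools.values()]
        alleles = set().union(*freqs)
        gap = max(
            (
                max(f.get(a, 0.0) for f in freqs) - min(f.get(a, 0.0) for f in freqs)
                for a in alleles
            ),
            default=0.0,
        )
        if gap >= min_freq_gap:
            flagged.append(pos)
    return flagged


# Toy input, invented for this sketch: positions and bases are not real data.
toy_pools = {
    "Pisum sativum": {101: "AAAAAAAAAG", 250: "CCCCCCCCCC"},
    "Vicia faba": {101: "GGGGGGGGGA", 250: "CCCCCCCCCC"},
}
flagged_sites = host_specific_sites(toy_pools)
```

In practice the per-site base strings would come from the mapped reads (for example, a pileup of the BAM files produced after alignment), and a real analysis would also account for sequencing error and coverage depth, which this sketch ignores.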