10 results for Data streams
in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"
Abstract:
Concept drift, which refers to non-stationary learning problems over time, is of increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. However, data labeling is usually expensive and/or time-consuming compared to the acquisition of unlabeled data, so usually only a small fraction of the incoming data can be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them assume that the data is static. Therefore, semi-supervised learning with concept drift is still an open and challenging task in machine learning. Recently, a particle competition and cooperation approach was developed to realize graph-based semi-supervised learning from static data. We have extended that approach to handle data streams and concept drift. The result is a passive algorithm that uses a single classifier and adapts naturally to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 network intrusion data set, showing its effectiveness.
Abstract:
Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far assume that data comes from a fixed distribution, so they are not suitable for handling concept drift. Moreover, some concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. However, labeling data is usually expensive and/or time-consuming compared to acquiring unlabeled data, so only a small fraction of the incoming data can be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them also assume that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items become less influential in the classification of newer data items. Some computer simulations are presented, showing the effectiveness of the proposed method.
Abstract:
Graduate Program in Mechanical Engineering - FEG
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
1. This study aimed to link basic ethnobiological research on local ecological knowledge (LEK) to the conservation of Brazilian streams, based on two case studies: original data on fishermen's LEK about freshwater fish in the Negro River, Amazon, and previously published data on farmers' LEK about the ecological relationship between forest and streams in the Macabuzinho catchment, Atlantic Forest.
2. Information was obtained from fishermen through interviews using standard questionnaires containing open-ended questions. Informants were selected either according to defined criteria or by the 'snowball' method.
3. Fishermen's LEK about the diets and habitats of 14 fish species in the Negro River provided new biological information on plant species that are eaten by fish, in addition to confirming some ecological patterns from the biological literature, such as the dependence of fish on forests as food sources.
4. In the Atlantic Forest, a comparison between farmers' LEK and a rapid stream assessment on the farmers' properties indicated that farmers tended to overestimate the ecological integrity of their streams. Farmers recognized at least 11 forest attributes that correspond to the scientific concept of ecosystem services. Such information may be useful to promote or enhance dialogue among farmers, scientists and managers.
5. These results may contribute to the devising of ecosystem management measures in the Negro River that involve local fishermen and aim to conserve both rivers and their associated floodplain forests. In the Atlantic Forest, we proposed some initiatives, such as allowing farmers direct economic use of their forests, to reconcile farmers' conflicting perceptions of the ecological benefits versus the economic losses of reforestation. Despite their cultural, environmental and geographical differences, the two case studies are complementary, cost-effective and promising approaches to including LEK in the design of ecological research.
Copyright (C) 2007 John Wiley & Sons, Ltd.
Abstract:
Batrachospermum delicatulum specimens from three stream segments in a tropical region of south-eastern Brazil (20°18′-20°49′S, 49°13′-49°46′W) were analyzed. Physical and chemical parameters and the spatial placement of thalli were investigated along with the reproductive characteristics of the gametophytic phase. Sequence data from the cox 2-3 spacer region were also used to evaluate genetic variation in individuals within and among stream segments. Gametophytes occurred under relatively diverse environmental conditions, whereas thallus abundance was weakly or not correlated with environmental variables within the stream segments. All specimens examined were dioecious. The ratio of male/female plants was relatively low (0.5 to 1.3) and male plants tended to occur as clumps (two or three plants together). High reproductive success was observed, as indicated by the occurrence of 100% fertilized (carposporophytic) female plants. This is similar to previous reports for this and other dioecious species, which is remarkable considering the relatively low proportion of male/female plants. The results support the two hypotheses proposed to explain the high reproductive success in dioecious species. The occurrence of male plants in clumps was evidence for a strict spatial relationship (i.e. male plants located upstream of female plants in order to release spermatia, which would be carried by eddies through female plants). In contrast, the occurrence of male and female plants adjacent to each other allowed outcrossing among neighboring plants with intermingled male and female branches, which seemed more applicable to some situations (low turbulence habitats). The cox 2-3 spacer region was 376 bp long in the 18 individuals sequenced, and the DNA sequences were identical, with no base pair substitutions. Likewise, a previous study of another Batrachospermum species showed that the same haplotypes were present in all stream segments from the same drainage basin, even though the stream segments were a considerable distance apart. Short-distance dispersal, either by small birds or through waterway connectivity, might explain these findings.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)