12 results for Data Streams Distribution

in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"


Relevance:

90.00%

Publisher:

Abstract:

Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far assume that data comes from a fixed distribution, so they are not suitable for handling concept drift. Moreover, some concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But labeling data is usually expensive and/or time consuming compared to acquiring unlabeled data, so only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are also based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items become less influential on the classification of newer data items. Some computer simulations are presented, showing the effectiveness of the proposed method.
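The idea of a passive learner that forgets older labeled items without an explicit drift detector can be illustrated with a much simpler stand-in. The sketch below is not the particle competition and cooperation method from the abstract; it is a toy kernel-weighted classifier on one-dimensional data in which each labeled item's vote decays exponentially with its age. The class name, the `decay` and `bandwidth` parameters, and the synthetic stream are all assumptions made for illustration.

```python
import math


class DecayingKernelClassifier:
    """Toy passive stream classifier: older labeled items count less.

    Illustrative stand-in only, not the method described in the abstract.
    """

    def __init__(self, decay=0.5, bandwidth=0.1):
        self.decay = decay          # hypothetical forgetting rate per time step
        self.bandwidth = bandwidth  # hypothetical kernel width in feature space
        self.items = []             # stored as (arrival_time, feature, label)
        self.clock = 0

    def learn(self, x, label):
        """Store one labeled item and advance the stream clock."""
        self.items.append((self.clock, x, label))
        self.clock += 1

    def predict(self, x):
        """Return the label with the largest age- and distance-weighted vote."""
        scores = {}
        for arrival, xi, label in self.items:
            age = self.clock - arrival
            closeness = math.exp(-((x - xi) ** 2) / (2 * self.bandwidth ** 2))
            weight = math.exp(-self.decay * age) * closeness
            scores[label] = scores.get(label, 0.0) + weight
        return max(scores, key=scores.get)


def run_stream(decay):
    """Feed an old concept, then a drifted one, and report both predictions."""
    model = DecayingKernelClassifier(decay=decay)
    for _ in range(20):             # old concept: 0.2 is "low", 0.8 is "high"
        model.learn(0.2, "low")
        model.learn(0.8, "high")
    before = model.predict(0.2)
    for _ in range(5):              # drifted concept: the labels swap
        model.learn(0.2, "high")
        model.learn(0.8, "low")
    after = model.predict(0.2)
    return before, after


forgetting = run_stream(decay=0.5)
no_forgetting = run_stream(decay=0.0)
```

With `decay=0.0` the twenty old items outvote the five new ones and the prediction stays on the stale concept; with a positive decay the same stream is reclassified under the new concept, with no drift detection step involved.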

Relevance:

90.00%

Publisher:

Abstract:

Concept drift, which refers to learning problems that are non-stationary over time, is of increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But labeling data is usually expensive and/or time consuming compared to acquiring unlabeled data, so usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extended that approach to handle data streams and concept drift. The result is a passive algorithm which uses a single classifier and naturally adapts to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 network intrusion data, showing its effectiveness.

Relevance:

80.00%

Publisher:

Abstract:

Trachoma is a disease that has been known for thousands of years and remains a potential cause of blindness all over the world. The authors call attention to the factors related to its transmission, present historical data and the distribution of the disease in Brazil and in the world, and comment on the agent and on the signs and symptoms of this chronic conjunctivitis. They also reinforce the need to train professionals in its diagnosis, detection and treatment. These measures will contribute to the elimination of this important disease as a cause of blindness among us.

Relevance:

80.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

Pós-graduação em Engenharia Mecânica - FEG

Relevance:

40.00%

Publisher:

Abstract:

Distribution of Rhodophyta was investigated in 172 stream segments, which were sampled from May to October in 1992-1993 and 1996-1997 in six natural regions (parts of biomes or geological areas) of São Paulo State, southeastern Brazil. Red algae occurred in 60.5 % of the stream segments sampled, a high frequency in comparison with other major surveys in the world (18-65 %). Seventeen species of freshwater red algae were found, of which the most widespread was Batrachospermum delicatulum, occurring in 17 sites of five regions. The proportion of morphological types was as follows: gelatinous filaments (62.5 %), free filaments (19 %), tufts (12.5 %) and crusts (6 %); all but free filaments can be considered as having mechanisms to tolerate stress provoked by current velocity. No significant difference was found between the frequency distributions of variables measured for all streams and for those with red algae. Rhodophyta occurred under the following conditions (means): temperature (19.0 °C), current velocity (48 cm s⁻¹), specific conductance (74 µS cm⁻¹), turbidity (8 NTU), oxygen (67.3 %) and pH (6.9 ± 0.7). On the basis of species composition among the regions, the following patterns were evident: 1) the number of red algal species per region ranged from 1 to 10; 2) the highest proportion of sites with red algae (65-73 %) was found in hard water regions and in Atlantic rainforest, whereas the lowest (50 %) was found in tropical rainforest; 3) more than half of the species were exclusive to a single region, and the highest proportion of exclusive species was in the subtropical rainforest (50 %). No combination of stream variables was clearly associated with the occurrence of red algae for the regions as a whole. Species composition for streams and rivers of São Paulo State revealed higher similarities with other tropical regions and had few species in common with freshwater red algal floras of other continents.

Relevance:

40.00%

Publisher:

Abstract:

Macroalgal species richness and diversity were analysed along a longitudinal profile at small and large scales, in a small stream and a mid-size river respectively, during Spring, Fall and Winter in the northwest region of São Paulo State, southeastern Brazil (20°23'-20°49'S, 49°26'-51°19'W). Longitudinal variation in species richness and diversity at small scale was strongly associated with incident light. Microhabitat distribution (from data taken by the quadrat technique) revealed no significant correlations. Principal coordinates analysis (PCO) indicated no consistent groupings among sampling sites in distinct seasons (Spring, Fall and Winter). Longitudinal analysis at large scale revealed different patterns in the two seasons sampled (Spring and Winter), whereas species diversity presented a consistent trend: high upstream, low in mid reaches and higher downstream. It was associated with type of substratum in Spring, with rocky substrata presenting the highest values for species richness and diversity. Weak correlations were observed in Winter. Microhabitat distribution showed significant correlations between species abundance and the following variables: positive for rocky substrata and current velocity, and negative for sandy-clayish substratum and macrophyte-dominated substratum. PCO delineated only one consistent grouping, formed by the two headwater sites. Small scale macroalgal distribution corroborated the longitudinal pattern predicted by the River Continuum Concept, whereas the large scale approach showed a distribution more associated with substratum type than with light availability. These results showed an opposite trend in relation to the expected distributional pattern. Longitudinal distribution in macroalgal community structure has yet to be better documented, particularly for tropical streams, and no generalization is possible at this stage.

Relevance:

40.00%

Publisher:

Abstract:

Two stochastic models have been fitted to daily rainfall data for an interior station of Brazil. Of the two, the truncated negative probability model describes the data better than the Markov chain probability model. The Kolmogorov-Smirnov test is applied to assess the significance of the fit of these models. © 1983 Springer-Verlag.
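As a rough illustration of what fitting a Markov chain model to daily rainfall and checking it with a Kolmogorov-Smirnov statistic can look like, the sketch below estimates a two-state (dry/wet) first-order chain from a 0/1 sequence and compares the observed wet-spell lengths with the geometric distribution that such a chain implies. The abstract does not give the model order, the data, or the quantity tested, so the first-order assumption, the short `days` sequence, and all function names are hypothetical. Only the statistic is computed, not a p-value, and the truncated negative probability model is not sketched because the abstract does not specify it.

```python
from itertools import groupby


def transition_probabilities(days):
    """Estimate P(wet tomorrow | state today) from a 0/1 sequence (1 = wet)."""
    counts = {0: [0, 0], 1: [0, 0]}
    for today, tomorrow in zip(days, days[1:]):
        counts[today][tomorrow] += 1
    return {state: c[1] / (c[0] + c[1]) for state, c in counts.items()}


def wet_spell_lengths(days):
    """Lengths of consecutive runs of wet days, in order of occurrence."""
    return [len(list(run)) for state, run in groupby(days) if state == 1]


def ks_statistic(spells, p_wet_wet):
    """Largest gap between the empirical CDF of spell lengths and the
    geometric CDF 1 - p^k implied by a first-order chain."""
    n = len(spells)
    worst = 0.0
    for k in range(1, max(spells) + 1):
        empirical = sum(1 for s in spells if s <= k) / n
        model = 1.0 - p_wet_wet ** k
        worst = max(worst, abs(empirical - model))
    return worst


# Made-up record for illustration only (1 = wet day, 0 = dry day).
days = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
probs = transition_probabilities(days)
spells = wet_spell_lengths(days)
d_stat = ks_statistic(spells, probs[1])
```

A small statistic means the chain reproduces the observed spell lengths closely; comparing the same statistic across competing models is one simple way to rank their fit.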

Relevance:

40.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)