34 results for Data stream mining
Abstract:
Song selection and mood are interdependent. If we capture a song’s sentiment, we can infer the mood of the listener, which can serve as a basis for recommendation systems. Songs are generally classified by genre, which does not entirely reflect sentiment. Thus, we require an unsupervised scheme to mine sentiments. Sentiments are classified into either two classes (positive/negative) or multiple classes (happy/angry/sad/...), depending on the application. We are interested in analyzing the feelings evoked by a song, which involves multi-class sentiments. To mine the hidden sentimental structure behind a song in terms of “topics”, we consider its lyrics and use Latent Dirichlet Allocation (LDA). Each song is a mixture of moods, and the topics mined by LDA can represent moods. This gives us a scheme for collecting songs of similar mood. For validation, we use a dataset of songs covering 6 moods annotated by users of a particular website.
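The idea of treating each song as a mixture of LDA topics can be illustrated with a minimal sketch. The code below is a small collapsed Gibbs sampler written for illustration only; it is not the authors' implementation, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy lyrics are all assumptions introduced here.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not optimized).

    docs is a list of token lists (one per song). Returns, for each song,
    a list of n_topics proportions that sums to 1.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]   # count of word w in topic k
    topic_total = [0] * n_topics                        # tokens assigned to topic k
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-song topic ("mood") proportions.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Hypothetical, already-tokenized lyrics (not from the paper's dataset).
lyrics = [
    ["love", "smile", "dance", "sun", "love"],
    ["tears", "alone", "rain", "cry", "tears"],
    ["dance", "sun", "smile", "party"],
    ["rain", "cry", "alone", "goodbye"],
]
mixtures = lda_gibbs(lyrics, n_topics=2)
```

Songs whose entries in `mixtures` are close to each other could then be grouped as similar-mood songs; how topics map to the 6 annotated moods is a separate validation question that this sketch does not address.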
Abstract:
The current understanding of wildfire effects on water chemistry is limited by the lack of quantification of elemental dissolution rates from ash and element release rates from plant litter, as well as of the specific ash contribution to stream water chemistry. The main objective of the study was to provide such knowledge by combining experimental modelling, field data and end-member mixing analysis (EMMA) of wildfire impact at the watershed scale. The study concerns watershed effects of fire in the Indian subcontinent, a region that is typically not well represented in the fire science literature. In plant litter ash, major elements are hosted either in readily soluble phases (K, Mg) such as salts, carbonates and oxides, or in less soluble carrier phases (Si, Ca) such as amorphous silica, quartz and calcite. Accordingly, elemental release rates, inferred from ash leaching experiments in a batch reactor, indicated that element release into solution followed the order K > Mg > Na > Si > Ca. Experiments on plant litter leaching in a mixed-flow reactor indicated two dissolution regimes: a rapid one on a weekly timescale and a slower one on a monthly timescale. The mean dissolution rates at steady state (R-ss) indicated that the release of major elements from plant litter followed the order Ca > Si > Cl > Mg > K > Na. R-ss values for Si and Ca for tree leaves and herbaceous species are similar to those reported for boreal and European tree species and are higher than those from the dissolution of soil clay minerals. This identifies tropical plant litter as an important source of Si and Ca for tropical surface waters. In the wildfire-impacted year 2004, the EMMA indicated that the streamflow composition (Ca, K, Mg, Na, Si, Cl) was controlled by four main sources: rainwater, throughfall, ash leaching and soil solution.
The influence of the ash end-member was maximal early in the rainy season (the first two storm events) and decreased later in the rainy season, when the stream was dominated by the throughfall end-member. The contribution of plant litter decay to the streamwater composition in a year not impacted by wildfire is significant: the estimated solute fluxes originating from this decay greatly exceed, for most major elements, the annual dissolved elemental fluxes at the Mule Hole watershed outlet. This highlighted the importance of solute retention and vegetation back-uptake processes within the soil profile. Overall, the fire increased the mobility and export of major elements from the soils to the stream. It also shifted the vegetation-related contribution to the elemental fluxes at the watershed outlet from long-term (seasonal) to short-term (daily to monthly). (C) 2014 Elsevier B.V. All rights reserved.
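For readers unfamiliar with EMMA, the generic mass balance behind a four end-member mixing model can be written as below. This is the textbook form, given here as an assumption; the abstract does not state the exact formulation used in the study (EMMA is often carried out in a reduced space obtained by principal component analysis). The symbols $f_i$ (fraction of streamflow from end-member $i$) and $C_{j,i}$ (concentration of element $j$ in end-member $i$) are notation introduced here.

```latex
\begin{align*}
C_j^{\mathrm{stream}} &= \sum_{i=1}^{4} f_i \, C_{j,i},
  \qquad j \in \{\mathrm{Ca}, \mathrm{K}, \mathrm{Mg}, \mathrm{Na}, \mathrm{Si}, \mathrm{Cl}\}, \\
\sum_{i=1}^{4} f_i &= 1, \qquad 0 \le f_i \le 1,
\end{align*}
% i = 1..4: rainwater, throughfall, ash leachate, soil solution
```

With six solutes and four unknown fractions the system is overdetermined, so the fractions would typically be estimated by a constrained least-squares fit for each stream sample.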
Abstract:
Classification of time series data is an interesting problem in the field of data mining. Although several algorithms have been proposed for time series classification, we have developed an innovative algorithm that is computationally fast and, in several cases, accurate when compared with the 1NN classifier. In our method, we calculate the fuzzy membership of each test pattern in each class. We have experimented with 6 benchmark datasets and compared our method with the 1NN classifier.
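The abstract does not say how the fuzzy memberships are computed, so the following is only one plausible reading, sketched under explicit assumptions: each class is summarized by the mean of its training series, and memberships follow the inverse-distance formula used in fuzzy c-means with fuzzifier `m`. The function names and the toy data are hypothetical, and all series are assumed to have equal length.

```python
import math


def class_centroids(train_series, labels):
    """Mean series per class (assumes all series have the same length)."""
    sums, counts = {}, {}
    for series, label in zip(train_series, labels):
        acc = sums.setdefault(label, [0.0] * len(series))
        for i, x in enumerate(series):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def fuzzy_memberships(series, centroids, m=2.0):
    """Fuzzy-c-means-style membership of one test series in each class."""
    dists = {label: math.dist(series, c) for label, c in centroids.items()}
    # A series that coincides with a centroid belongs fully to that class.
    exact = [label for label, d in dists.items() if d == 0.0]
    if exact:
        return {label: (1.0 / len(exact) if label in exact else 0.0) for label in dists}
    exponent = 2.0 / (m - 1.0)
    inverse = {label: d ** -exponent for label, d in dists.items()}
    total = sum(inverse.values())
    return {label: v / total for label, v in inverse.items()}


def classify(series, centroids, m=2.0):
    """Assign the class with the highest membership."""
    memberships = fuzzy_memberships(series, centroids, m)
    return max(memberships, key=memberships.get)


# Hypothetical toy data: two classes of length-3 series.
train = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [5.0, 5.0, 5.0], [5.0, 6.0, 5.0]]
labels = ["a", "a", "b", "b"]
centroids = class_centroids(train, labels)
predicted = classify([0.0, 0.0, 1.0], centroids)
```

Under this reading, a test pattern needs one distance computation per class rather than one per training series as in 1NN, which is one way such a method could be faster; whether this matches the authors' algorithm is not confirmed by the abstract.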
Abstract:
Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs, both during the initial coding phase and during software maintenance. The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains, ranging from machine learning to scientific computation. Our approach, called MATHFINDER, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code composed of API methods to compute the expression by mining unit tests of the API methods. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version. We perform an extensive evaluation of MATHFINDER (1) for API discovery, where math algorithms are to be implemented from scratch, and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MATHFINDER on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MATHFINDER for API discovery, the programmers who used MATHFINDER finished their programming tasks twice as fast as their counterparts who used the usual techniques such as web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MATHFINDER to migrate Weka, a popular machine learning library.
Overall, our evaluation shows that MATHFINDER is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.
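The core idea of matching an executable specification against unit tests can be illustrated with a deliberately simplified sketch. It is not MATHFINDER's algorithm: the test records, the library and method names (`LibA.hypot`, `LibA.add`, `LibB.norm2`), and the function `matching_methods` are all hypothetical, and the sketch only checks whether a whole expression agrees with a method on that method's recorded test inputs.

```python
import math

# Hypothetical unit-test records: (API method name, input arguments, expected output).
UNIT_TESTS = [
    ("LibA.hypot", (3.0, 4.0), 5.0),
    ("LibA.hypot", (6.0, 8.0), 10.0),
    ("LibA.add", (3.0, 4.0), 7.0),
    ("LibB.norm2", (5.0, 12.0), 13.0),
]


def matching_methods(spec, unit_tests, tol=1e-9):
    """Return methods whose every recorded test agrees with the executable spec."""
    agrees = {}
    for method, args, expected in unit_tests:
        try:
            ok = math.isclose(spec(*args), expected, rel_tol=tol, abs_tol=tol)
        except (TypeError, ValueError, ZeroDivisionError):
            ok = False  # the spec cannot be evaluated on these inputs
        agrees[method] = agrees.get(method, True) and ok
    return sorted(method for method, ok in agrees.items() if ok)


# Executable specification of the math expression sqrt(x^2 + y^2).
def spec(x, y):
    return math.sqrt(x * x + y * y)


candidates = matching_methods(spec, UNIT_TESTS)
print(candidates)  # ['LibA.hypot', 'LibB.norm2']
```

A real tool would also have to handle subexpressions, argument reordering, and differing data types across APIs, none of which this sketch attempts.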