3 results for Labels.
in Digital Commons - Michigan Tech
Abstract:
The amount of information on the Internet has exploded in recent decades. As more and more news, blogs, and other kinds of articles are published on the Internet, categorization of articles and documents is increasingly desired. Among the approaches to categorizing articles, labeling is one of the most common methods; it provides a relatively intuitive and effective way to separate articles into different categories. However, manual labeling is limited by its efficiency, even though manually selected labels have relatively high quality. This report explores the topic modeling approach of Online Latent Dirichlet Allocation (Online-LDA). Additionally, a method is implemented to automatically label articles with their latent topics by combining the Online-LDA posterior with a probabilistic automatic labeling algorithm. The goal of this report is to examine the accuracy of the labels generated automatically by a topic model and probabilistic relevance algorithm for a set of real-world, dynamically updated articles from an online Rich Site Summary (RSS) service.
Abstract:
Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporates Big Data analysis into its business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which significantly reduces both computational complexity and space complexity. The proposed stKFCM requires only $O(n^2)$ memory, where $n$ is the (predetermined) size of a data subset (or data chunk) at each time step, which makes this algorithm truly scalable (as $n$ can be chosen based on the available memory). Furthermore, only $2n^2$ elements of the full $N \times N$ kernel matrix (where $N \gg n$) need to be calculated at each time step, thus reducing both the computation time in producing the kernel elements and the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively small $n$, can cluster as accurately as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.
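To give a feel for the scale of the savings described in this abstract, the following is a back-of-the-envelope sketch, not part of the stKFCM algorithm itself. It assumes the data set of N points is split into disjoint, equal-sized chunks of n points and that exactly 2n^2 kernel entries are computed at each time step, as the abstract states. The function name `kernel_entry_counts` and the example sizes are hypothetical.

```python
def kernel_entry_counts(total_points: int, chunk_size: int) -> dict:
    """Compare kernel entries computed chunk by chunk with the full N x N matrix.

    Assumption (not from the abstract): chunks are disjoint and of equal size,
    so there are N / n time steps in total.
    """
    if chunk_size <= 0 or total_points % chunk_size != 0:
        raise ValueError("total_points must be a positive multiple of chunk_size")
    steps = total_points // chunk_size
    per_step = 2 * chunk_size * chunk_size      # 2n^2 entries per time step
    return {
        "steps": steps,
        "per_step": per_step,
        "streaming_total": steps * per_step,    # 2n^2 * (N / n) = 2nN
        "full_matrix": total_points * total_points,
    }


counts = kernel_entry_counts(total_points=1_000_000, chunk_size=1_000)
print(counts["streaming_total"])                          # 2000000000
print(counts["full_matrix"] // counts["streaming_total"])  # 500
```

Under these assumptions the streaming total is 2nN entries, so the ratio to the full matrix is N / (2n); smaller chunks lower both the per-step memory and the total number of kernel evaluations.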
Abstract:
In this report, we survey results on distance magic graphs and some closely related graphs. A distance magic labeling of a graph $G$ with magic constant $k$ is a bijection $l$ from the vertex set to $\{1, 2, \ldots, n\}$, such that for every vertex $x$, $\sum_{y \in N_G(x)} l(y) = k$, where $N_G(x)$ is the set of vertices of $G$ adjacent to $x$. If the graph $G$ has a distance magic labeling we say that $G$ is a distance magic graph. In Chapter 1, we explore the background of distance magic graphs by introducing examples of magic squares, magic graphs, and distance magic graphs. In Chapter 2, we begin by examining some basic results on distance magic graphs. We next look at results on different graph structures including regular graphs, multipartite graphs, graph products, join graphs, and splitting graphs. We conclude with other perspectives on distance magic graphs including embedding theorems, the matrix representation of distance magic graphs, lifted magic rectangles, and distance magic constants. In Chapter 3, we study graph labelings that retain the same labels as distance magic labelings, but alter the definition in some other way. These labelings include balanced distance magic labelings, closed distance magic labelings, D-distance magic labelings, and distance antimagic labelings. In Chapter 4, we examine results on neighborhood magic labelings, group distance magic labelings, and group distance antimagic labelings. These graph labelings change the label set, but are otherwise similar to distance magic graphs. In Chapter 5, we examine some applications of distance magic and distance antimagic labeling to the fair scheduling of tournaments. In Chapter 6, we conclude with some open problems.
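The definition of a distance magic labeling given in this abstract can be checked mechanically. The following is a minimal sketch, assuming the graph is supplied as an adjacency dictionary and the labeling as a vertex-to-integer dictionary; the function name `magic_constant` and the 4-cycle example are illustrative and do not come from the report.

```python
from typing import Dict, Hashable, List, Optional


def magic_constant(adjacency: Dict[Hashable, List[Hashable]],
                   labeling: Dict[Hashable, int]) -> Optional[int]:
    """Return k if `labeling` is a distance magic labeling, otherwise None."""
    n = len(adjacency)
    # The labeling must be a bijection from the vertex set onto {1, ..., n}.
    if set(labeling) != set(adjacency):
        return None
    if sorted(labeling.values()) != list(range(1, n + 1)):
        return None
    # Every vertex must see the same sum of labels over its neighbors.
    sums = {sum(labeling[y] for y in adjacency[x]) for x in adjacency}
    return sums.pop() if len(sums) == 1 else None


# The 4-cycle a-b-c-d-a: opposite vertices share the same two neighbors.
c4 = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}

print(magic_constant(c4, {"a": 1, "b": 2, "c": 4, "d": 3}))  # 5
print(magic_constant(c4, {"a": 1, "b": 2, "c": 3, "d": 4}))  # None
```

In the first labeling each pair of opposite vertices sums to 5, so every neighborhood sum equals 5; in the second, the neighborhood sums are 6 and 4, so the labeling is not distance magic.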