879 results for Graph analytics
Abstract:
In this paper we propose a graph stream clustering algorithm with a unified similarity measure on both structural and attribute properties of vertices, with each attribute being treated as a vertex. Unlike other approaches, ours does not require an input parameter for the number of clusters; instead, it dynamically creates new sketch-based clusters and periodically merges existing similar clusters. Experiments on two publicly available datasets reveal the advantages of our approach in detecting vertex clusters in the graph stream. We provide a detailed investigation into how parameters affect the algorithm's performance. We also provide a quantitative evaluation and comparison with a well-known offline community detection algorithm, which shows that our streaming algorithm can achieve comparable or better average cluster purity.
Abstract:
We present a mathematically rigorous metric which relates the achievable Quality-of-Service (QoS) for a real-time analytics service to the server energy cost of offering the service. Using a new iso-QoS evaluation methodology, we scale server resources to meet QoS targets and directly rank the servers in terms of their energy-efficiency and, by extension, cost of ownership. Our metric and method are platform-independent and enable fair comparison of datacenter compute servers with significant architectural diversity, including micro-servers. We deploy our metric and methodology to compare three servers running financial option pricing workloads on real-life market data. We find that server ranking is sensitive to data inputs and desired QoS level and that, although scale-out micro-servers can be up to two times more energy-efficient than conventional heavyweight servers for the same target QoS, they are still six times less energy-efficient than high-performance computational accelerators.
Abstract:
Do patterns in the YouTube viewing analytics of Lecture Capture videos point to areas of potential teaching and learning performance enhancement? The goal of this action-based research project was to capture and quantitatively analyse the viewing behaviours and patterns of a series of video lecture captures across several computing modules in Queen’s University, Belfast, Northern Ireland. The research sought to establish whether a quantitative analysis of viewing behaviours, coupled with a qualitative evaluation of the material provided by the students, could be correlated to provide generalised patterns that could then be used to understand the learning experience of students during face-to-face lectures and, thereby, present opportunities to reflectively enhance lecturer performance, the students’ overall learning experience and, ultimately, their level of academic attainment.
Abstract:
In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves on the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients by applying linguistic graph information to neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, by correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.
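For orientation, the two baseline indices that MiF is said to improve on can be computed from neighbour sets as sketched below. This is a minimal illustration of the classical Jaccard and Simpson (overlap) coefficients only, not of MiF itself, whose definition is not given in this abstract. The toy association graph and the function names are assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard index: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def simpson(a, b):
    """Simpson (overlap) coefficient: |A ∩ B| / min(|A|, |B|)."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical word-association neighbourhoods (not taken from the
# Edinburgh Word Association Thesaurus).
associations = {
    "dog": {"cat", "bone", "bark", "pet"},
    "cat": {"dog", "pet", "milk"},
}

if __name__ == "__main__":
    print(jaccard(associations["dog"], associations["cat"]))
    print(simpson(associations["dog"], associations["cat"]))
```

Both indices look only at shared direct neighbours. Here the two words share one neighbour out of six distinct ones, so the Jaccard index is 1/6, while the Simpson coefficient normalises by the smaller neighbourhood and gives 1/3.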
Abstract:
Realising memory-intensive applications such as image and video processing on FPGA requires the creation of complex, multi-level memory hierarchies to achieve real-time performance; however, commercial High Level Synthesis tools are unable to automatically derive such structures and hence are unable to meet the demanding bandwidth and capacity constraints of these applications. Current approaches to solving this problem can only derive either single-level memory structures or very deep, highly inefficient hierarchies, leading in either case to high implementation cost, low performance, or both. This paper presents an enhancement to an existing MC-HLS synthesis approach which solves this problem; it exploits and eliminates data duplication at multiple levels of the generated hierarchy, leading to a reduction in the number of levels and ultimately to higher-performance, lower-cost implementations. When applied to synthesis of C-based Motion Estimation, Matrix Multiplication and Sobel Edge Detection applications, this enables reductions in Block RAM and Look Up Table (LUT) cost of up to 25%, whilst simultaneously increasing throughput.
Abstract:
Social media channels, such as Facebook or Twitter, allow people to express their views and opinions about any public topic. Public sentiment related to future events, such as demonstrations or parades, indicates public attitude and may therefore be used to estimate the level of disruption and disorder during such events. Consequently, sentiment analysis of social media content may be of interest to different organisations, especially in the security and law enforcement sectors. This paper presents a new lexicon-based sentiment analysis algorithm designed with a main focus on real-time Twitter content analysis. The algorithm consists of two key components, namely sentiment normalisation and an evidence-based combination function, which are used to estimate the intensity of the sentiment rather than a positive/negative label and to support the mixed sentiment classification process. Finally, we illustrate a case study examining the relation between the negative sentiment of Twitter posts related to the English Defence League and the level of disorder during the organisation’s related events.
Abstract:
We make a case for studying the impact of intra-node parallelism on the performance of data analytics. We identify four performance optimizations that are enabled by an increasing number of processing cores on a chip. We discuss the performance impact of these optimizations on two analytics operators and we identify how these optimizations affect each other.
Abstract:
The proliferation in the use of video lecture capture in universities worldwide presents an opportunity to analyse video watching patterns in an attempt to quantify and qualify how students engage and learn with the videos. It also presents an opportunity to investigate whether there are similar student learning patterns during the equivalent physical lecture. The goal of this action-based research project was to capture and quantitatively analyse the viewing behaviours and patterns of a series of video lecture captures across several university Java programming modules. It sought to study whether a quantitative analysis of viewing behaviours of Lecture Capture videos, coupled with a qualitative evaluation from the students and lecturers, could be correlated to provide generalised patterns that could then be used to understand the learning experience of students during videos and potentially face-to-face lectures and, thereby, present opportunities to reflectively enhance lecturer performance and the students’ overall learning experience. The report establishes a baseline understanding of the analytics of videos of several commonly used pedagogical teaching methods used in the delivery of programming courses. It reflects on possible concurrences within live lecture delivery with the potential to inform and improve lecturing performance.
Abstract:
NanoStreams explores the design, implementation, and system software stack of micro-servers aimed at processing data in-situ and in real time. These micro-servers can serve the emerging Edge computing ecosystem, namely the provisioning of advanced computational, storage, and networking capability near data sources to achieve both low latency event processing and high throughput analytical processing, before considering off-loading some of this processing to high-capacity datacentres. NanoStreams explores a scale-out micro-server architecture that can achieve equivalent QoS to that of conventional rack-mounted servers for high-capacity datacentres, but with dramatically reduced form factors and power consumption. To this end, NanoStreams introduces novel solutions in programmable & configurable hardware accelerators, as well as the system software stack used to access, share, and program those accelerators. Our NanoStreams micro-server prototype has demonstrated 5.5× higher energy-efficiency than a standard Xeon Server. Simulations of the microserver’s memory system extended to leverage hybrid DDR/NVM main memory indicated 5× higher energy-efficiency than a conventional DDR-based system.
Abstract:
A family of quadratic programming problems whose optimal values are upper bounds on the independence number of a graph is introduced. Among this family, the quadratic programming problem which gives the best upper bound is identified. A proof is also given that the upper bound introduced by Hoffman and Lovász for regular graphs is a particular case of this family. In addition, some new results characterizing the class of graphs for which the independence number attains the optimal value of the above best upper bound are given. Finally, a polynomial-time algorithm for approximating the size of the maximum independent set of an arbitrary graph is described, and the computational experiments carried out on 36 DIMACS clique benchmark instances are reported.
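For reference, the bound for regular graphs mentioned above is commonly stated as follows. The notation is an assumption, since the abstract fixes none: $G$ is a $d$-regular graph on $n$ vertices with $d \ge 1$, $\alpha(G)$ is its independence number, and $\lambda_{\min}$ is the least eigenvalue of its adjacency matrix.

```latex
\alpha(G) \;\le\; n \,\frac{-\lambda_{\min}}{d - \lambda_{\min}}
```

As a quick check, the Petersen graph has $n = 10$, $d = 3$ and $\lambda_{\min} = -2$, so the bound gives $10 \cdot 2 / 5 = 4$, which equals its independence number.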
Abstract:
A graph is singular if the zero eigenvalue is in the spectrum of its 0-1 adjacency matrix A. If an eigenvector belonging to the zero eigenspace of A has no zero entries, then the singular graph is said to be a core graph. A (k,t)-regular set is a subset of the vertices inducing a k-regular subgraph such that every vertex not in the subset has t neighbours in it. We consider the case when k=t, which relates to the eigenvalue zero under certain conditions. We show that if a regular graph has a (k,k)-regular set, then it is a core graph. By considering the walk matrix, we develop an algorithm to extract (k,k)-regular sets and formulate a necessary and sufficient condition for a graph to be Hamiltonian.
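The definition of a (k,t)-regular set can be checked directly from adjacency lists, as in the minimal sketch below. It is not the walk-matrix extraction algorithm from the paper, and the function names and the 4-cycle example are assumptions made for illustration. The second helper shows one way to see the link to the zero eigenvalue: in a d-regular graph with a (k,k)-regular set S and 0 < k < d, every vertex has k neighbours in S and d - k outside it, so the vector with entries d - k on S and -k elsewhere satisfies Ax = 0 and has no zero entries.

```python
def is_kt_regular_set(adj, subset, k, t):
    """Return True if `subset` is a (k,t)-regular set of the graph `adj`.

    `adj` maps each vertex to a list of its neighbours (simple graph).
    Vertices in the subset need k neighbours inside it; all others need t.
    """
    s = set(subset)
    for v, nbrs in adj.items():
        inside = sum(1 for u in nbrs if u in s)
        expected = k if v in s else t
        if inside != expected:
            return False
    return True


def null_vector_from_kk_set(adj, subset, k):
    """Candidate null vector for a d-regular graph with a (k,k)-regular set.

    Assumes 0 < k < d, so that no entry of the vector is zero.
    """
    s = set(subset)
    d = len(next(iter(adj.values())))  # common degree of a regular graph
    return {v: (d - k) if v in s else -k for v in adj}


def adjacency_times(adj, x):
    """Compute A x, where A is the 0-1 adjacency matrix of `adj`."""
    return {v: sum(x[u] for u in nbrs) for v, nbrs in adj.items()}


# Illustrative example: the 4-cycle 0-1-2-3-0, which is 2-regular.
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

if __name__ == "__main__":
    s = {0, 1}
    print(is_kt_regular_set(c4, s, 1, 1))
    x = null_vector_from_kk_set(c4, s, 1)
    print(x, adjacency_times(c4, x))
```

In the 4-cycle, S = {0, 1} induces a single edge and each of the two remaining vertices has exactly one neighbour in S, so S is a (1,1)-regular set. The resulting vector is +1 on S and -1 elsewhere, and A maps it to the zero vector.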
Abstract:
Taking Fiedler’s result on the spectrum of a matrix formed from two symmetric matrices as motivation, a more general result is deduced and applied to the determination of adjacency and Laplacian spectra of graphs obtained by a generalized join graph operation on families of graphs (regular in the case of adjacency spectra and arbitrary in the case of Laplacian spectra). Some additional consequences are explored, namely regarding the largest eigenvalue and algebraic connectivity.
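As background, the ordinary join of two graphs gives a classical special case of this kind of result. It is shown here only for orientation and is not the generalized statement of the paper. The notation is an assumption: $G_1$ and $G_2$ have $n_1$ and $n_2$ vertices and Laplacian eigenvalues $0 = \mu_1(G_i) \le \mu_2(G_i) \le \dots \le \mu_{n_i}(G_i)$, and $G_1 \vee G_2$ is obtained by adding every edge between the two graphs.

```latex
\operatorname{Spec}_L(G_1 \vee G_2) =
  \{\, 0,\; n_1 + n_2 \,\}
  \;\cup\; \{\, \mu_i(G_1) + n_2 : 2 \le i \le n_1 \,\}
  \;\cup\; \{\, \mu_j(G_2) + n_1 : 2 \le j \le n_2 \,\}
```

Here the union is taken with multiplicities. No regularity assumption is needed in the Laplacian case, which matches the distinction the abstract draws between adjacency and Laplacian spectra.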