989 resultados para Electrical engineering|Nanoscience
Resumo:
Speaker diarization determines instances of the same speaker within a recording. Extending this task to a collection of recordings for linking together segments spoken by a unique speaker requires speaker linking. In this paper we propose a speaker linking system using linkage clustering and state-of-the-art speaker recognition techniques. We evaluate our approach against two baseline linking systems using agglomerative cluster merging (AC) and agglomerative clustering with model retraining (ACR). We demonstrate that our linking method, using complete-linkage clustering, provides a relative improvement of 20% and 29% in attribution error rate (AER), over the AC and ACR systems, respectively.
Resumo:
In this paper, a new comprehensive planning methodology is proposed for implementing distribution network reinforcement. The load growth, voltage profile, distribution line loss, and reliability are considered in this procedure. A time-segmentation technique is employed to reduce the computational load. Options considered range from supporting the load growth using the traditional approach of upgrading the conventional equipment in the distribution network, through to the use of dispatchable distributed generators (DDG). The objective function is composed of the construction cost, loss cost and reliability cost. As constraints, the bus voltages and the feeder currents should be maintained within the standard level. The DDG output power should not be less than a ratio of its rated power because of efficiency. A hybrid optimization method, called modified discrete particle swarm optimization, is employed to solve this nonlinear and discrete optimization problem. A comparison is performed between the optimized solution based on planning of capacitors along with tap-changing transformer and line upgrading and when DDGs are included in the optimization.
Resumo:
Retrieving information from Twitter is always challenging due to its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods focus on term-based approach, but suffers from the problems of terms variation such as polysemy and synonymy. This problem deteriorates when such methods are applied on Twitter due to the length limit. Over the years, people have held the hypothesis that pattern-based methods should perform better than term-based methods as it provides more context, but limited studies have been conducted to support such hypothesis especially in Twitter. This paper presents an innovative framework to address the issue of performing IR in microblog. The proposed framework discover patterns in tweets as higher level feature to assign weight for low-level features (i.e. terms) based on their distributions in higher level features. We present the experiment results based on TREC11 microblog dataset and shows that our proposed approach significantly outperforms term-based methods Okapi BM25, TF-IDF and pattern based methods, using precision, recall and F measures.
Resumo:
Grouping users in social networks is an important process that improves matching and recommendation activities in social networks. The data mining methods of clustering can be used in grouping the users in social networks. However, the existing general purpose clustering algorithms perform poorly on the social network data due to the special nature of users' data in social networks. One main reason is the constraints that need to be considered in grouping users in social networks. Another reason is the need of capturing large amount of information about users which imposes computational complexity to an algorithm. In this paper, we propose a scalable and effective constraint-based clustering algorithm based on a global similarity measure that takes into consideration the users' constraints and their importance in social networks. Each constraint's importance is calculated based on the occurrence of this constraint in the dataset. Performance of the algorithm is demonstrated on a dataset obtained from an online dating website using internal and external evaluation measures. Results show that the proposed algorithm is able to increases the accuracy of matching users in social networks by 10% in comparison to other algorithms.
Resumo:
Sfinks is a shift register based stream cipher designed for hardware implementation and submitted to the eSTREAM project. In this paper, we analyse the initialisation process of Sfinks. We demonstrate a slid property of the loaded state of the Sfinks cipher, where multiple key-IV pairs may produce phase shifted keystream sequences. The state update functions of both the initialisation process and keystream generation and also the pattern of the padding affect generation of the slid pairs.
Resumo:
Key distribution is one of the most challenging security issues in wireless sensor networks where sensor nodes are randomly scattered over a hostile territory. In such a sensor deployment scenario, there will be no prior knowledge of post deployment configuration. For security solutions requiring pairwise keys, it is impossible to decide how to distribute key pairs to sensor nodes before the deployment. Existing approaches to this problem are to assign more than one key, namely a key-chain, to each node. Key-chains are randomly drawn from a key-pool. Either two neighboring nodes have a key in common in their key-chains, or there is a path, called key-path, among these two nodes where each pair of neighboring nodes on this path has a key in common. Problem in such a solution is to decide on the key-chain size and key-pool size so that every pair of nodes can establish a session key directly or through a path with high probability. The size of the key-path is the key factor for the efficiency of the design. This paper presents novel, deterministic and hybrid approaches based on Combinatorial Design for key distribution. In particular, several block design techniques are considered for generating the key-chains and the key-pools.
Resumo:
Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities or information about the specific entities instead of documents. In this paper, we study the problem of finding entity related information, referred to as attribute-value pairs, that play a significant role in searching target entities. We propose a novel decomposition framework combining reduced relations and the discriminative model, Conditional Random Field (CRF), for automatically finding entity-related attribute-value pairs from free text documents. This decomposition framework allows us to locate potential text fragments and identify the hidden semantics, in the form of attribute-value pairs for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches due to its capability of effective integration of syntactic and semantic features.
Resumo:
Chatrooms, for example Internet Relay Chat, are generally multi-user, multi-channel and multiserver chat-systems which run over the Internet and provide a protocol for real-time text-based conferencing between users all over the world. While a well-trained human observer is able to understand who is chatting with whom, there are no efficient and accurate automated tools to determine the groups of users conversing with each other. A precursor to analysing evolving cyber-social phenomena is to first determine what the conversations are and which groups of chatters are involved in each conversation. We consider this problem in this paper. We propose an algorithm to discover all groups of users that are engaged in conversation. Our algorithms are based on a statistical model of a chatroom that is founded on our experience with real chatrooms. Our approach does not require any semantic analysis of the conversations, rather it is based purely on the statistical information contained in the sequence of posts. We improve the accuracy by applying some graph algorithms to clean the statistical information. We present some experimental results which indicate that one can automatically determine the conversing groups in a chatroom, purely on the basis of statistical analysis.
Resumo:
This paper is concerned with the unsupervised learning of object representations by fusing visual and motor information. The problem is posed for a mobile robot that develops its representations as it incrementally gathers data. The scenario is problematic as the robot only has limited information at each time step with which it must generate and update its representations. Object representations are refined as multiple instances of sensory data are presented; however, it is uncertain whether two data instances are synonymous with the same object. This process can easily diverge from stability. The premise of the presented work is that a robot's motor information instigates successful generation of visual representations. An understanding of self-motion enables a prediction to be made before performing an action, resulting in a stronger belief of data association. The system is implemented as a data-driven partially observable semi-Markov decision process. Object representations are formed as the process's hidden states and are coordinated with motor commands through state transitions. Experiments show the prediction process is essential in enabling the unsupervised learning method to converge to a solution - improving precision and recall over using sensory data alone.
Resumo:
The challenge of persistent appearance-based navigation and mapping is to develop an autonomous robotic vision system that can simultaneously localize, map and navigate over the lifetime of the robot. However, the computation time and memory requirements of current appearance-based methods typically scale not only with the size of the environment but also with the operation time of the platform; also, repeated revisits to locations will develop multiple competing representations which reduce recall performance. In this paper we present a solution to the persistent localization, mapping and global path planning problem in the context of a delivery robot in an office environment over a one-week period. Using a graphical appearance-based SLAM algorithm, CAT-Graph, we demonstrate constant time and memory loop closure detection with minimal degradation during repeated revisits to locations, along with topological path planning that improves over time without using a global metric representation. We compare the localization performance of CAT-Graph to openFABMAP, an appearance-only SLAM algorithm, and the path planning performance to occupancy-grid based metric SLAM. We discuss the limitations of the algorithm with regard to environment change over time and illustrate how the topological graph representation can be coupled with local movement behaviors for persistent autonomous robot navigation.
Resumo:
Well-designed initialisation and keystream generation processes for stream ciphers should ensure that each key-IV pair generates a distinct keystream. In this paper, we analyse some ciphers where this does not happen due to state convergence occurring either during initialisation, keystream generation or both. We show how state convergence occurs in each case and identify two mechanisms which can cause state convergence.
Resumo:
Student performance on examinations is influenced by the level of difficulty of the questions. It seems reasonable to propose therefore that assessment of the difficulty of exam questions could be used to gauge the level of skills and knowledge expected at the end of a course. This paper reports the results of a study investigating the difficulty of exam questions using a subjective assessment of difficulty and a purpose-built exam question complexity classification scheme. The scheme, devised for exams in introductory programming courses, assesses the complexity of each question using six measures: external domain references, explicitness, linguistic complexity, conceptual complexity, length of code involved in the question and/or answer, and intellectual complexity (Bloom level). We apply the scheme to 20 introductory programming exam papers from five countries, and find substantial variation across the exams for all measures. Most exams include a mix of questions of low, medium, and high difficulty, although seven of the 20 have no questions of high difficulty. All of the complexity measures correlate with assessment of difficulty, indicating that the difficulty of an exam question relates to each of these more specific measures. We discuss the implications of these findings for the development of measures to assess learning standards in programming courses.
A qualitative think aloud study of the early Neo-Piagetian stages of reasoning in novice programmers
Resumo:
Recent research indicates that some of the difficulties faced by novice programmers are manifested very early in their learning. In this paper, we present data from think aloud studies that demonstrate the nature of those difficulties. In the think alouds, novices were required to complete short programming tasks which involved either hand executing ("tracing") a short piece of code, or writing a single sentence describing the purpose of the code. We interpret our think aloud data within a neo-Piagetian framework, demonstrating that some novices reason at the sensorimotor and preoperational stages, not at the higher concrete operational stage at which most instruction is implicitly targeted.
Resumo:
Climate change and land use pressures are making environmental monitoring increasingly important. As environmental health is degrading at an alarming rate, ecologists have tried to tackle the problem by monitoring the composition and condition of environment. However, traditional monitoring methods using experts are manual and expensive; to address this issue government organisations designed a simpler and faster surrogate-based assessment technique for consultants, landholders and ordinary citizens. However, it remains complex, subjective and error prone. This makes collected data difficult to interpret and compare. In this paper we describe a work-in-progress mobile application designed to address these shortcomings through the use of augmented reality and multimedia smartphone technology.
Resumo:
Citizen Science projects are initiatives in which members of the general public participate in scientific research projects and perform or manage research-related tasks such as data collection and/or data annotation. Citizen Science is technologically possible and scientifically significant. However, although research teams can save time and money by recruiting general citizens to volunteer their time and skills to help data analysis, the reliability of contributed data varies a lot. Data reliability issues are significant to the domain of Citizen Science due to the quantity and diversity of people and devices involved. Participants may submit low quality, misleading, inaccurate, or even malicious data. Therefore, finding a way to improve the data reliability has become an urgent demand. This study aims to investigate techniques to enhance the reliability of data contributed by general citizens in scientific research projects especially for acoustic sensing projects. In particular, we propose to design a reputation framework to enhance data reliability and also investigate some critical elements that should be aware of during developing and designing new reputation systems.