966 results for Computer sound processing


Relevance:

30.00%

Publisher:

Abstract:

In recent years, a significant number of new robotic systems have appeared in science, industry, and everyday life. To reduce the complexity of these systems, industry builds robots designated for the execution of a specific task such as vacuum cleaning, autonomous driving, observation, or transportation. As a result, such robotic systems need to combine their capabilities to accomplish complex tasks that exceed the abilities of individual robots. However, to achieve emergent cooperative behavior, multi-robot systems require a decision process that copes with the communication challenges of the application domain. This work investigates a distributed multi-robot decision process that addresses unreliable and transient communication. The process is composed of five steps, which we embedded into the ALICA multi-agent coordination language, guided by the PROViDE negotiation middleware. The first step encompasses the specification of the decision problem, which is an integral part of the ALICA implementation. In our decision process, we describe multi-robot problems as continuous nonlinear constraint satisfaction problems. The second step addresses the calculation of solution proposals for this problem specification. Here, we propose an efficient solution algorithm that integrates incomplete local search and interval propagation techniques into a satisfiability solver, forming a satisfiability modulo theories (SMT) solver. In the third step, the PROViDE middleware replicates the solution proposals among the robots. This replication process is parameterized with a distribution method, which determines the consistency properties of the proposals. The fourth step addresses conflict resolution: an acceptance method ensures that each robot supports one of the replicated proposals. Since conflict resolution is integrated into the replication process, a sound selection of the distribution and acceptance methods leads to eventual convergence of the robot proposals. To avoid the execution of conflicting proposals, the last step comprises a decision method, which selects a proposal for implementation in case conflict resolution fails. The evaluation of our work shows that the incomplete solution techniques of the constraint satisfaction solver outperform other state-of-the-art approaches in runtime for many typical robotic problems. We further show, through experimental setups and practical application in the RoboCup environment, that our decision process is suitable for making quick decisions in the presence of packet loss and delay. Moreover, PROViDE requires less memory and bandwidth than other state-of-the-art middleware approaches.
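
Interval propagation, one of the solving techniques mentioned above, repeatedly narrows variable intervals until a fixpoint is reached. The following is a minimal illustrative sketch of this idea on a toy constraint system; the constraint, variable names, and fixpoint loop are hypothetical and do not reproduce the ALICA/PROViDE solver:

```python
# Illustrative interval-propagation sketch (not the ALICA/PROViDE solver):
# narrow interval domains (lo, hi) under x + y == s and x >= y.

def propagate(x, y, s):
    xl, xh = x
    yl, yh = y
    sl, sh = s
    # x + y == s  =>  x in [sl - yh, sh - yl] and y in [sl - xh, sh - xl]
    xl, xh = max(xl, sl - yh), min(xh, sh - yl)
    yl, yh = max(yl, sl - xh), min(yh, sh - xl)
    # x >= y  =>  raise x's lower bound, lower y's upper bound
    xl = max(xl, yl)
    yh = min(yh, xh)
    return (xl, xh), (yl, yh)

def fixpoint(x, y, s):
    # Apply propagation until the intervals stop shrinking.
    while True:
        nx, ny = propagate(x, y, s)
        if (nx, ny) == (x, y):
            return x, y
        x, y = nx, ny

x, y = fixpoint((0.0, 10.0), (0.0, 10.0), (4.0, 4.0))
print(x, y)  # both narrowed to (0.0, 4.0)
```

In the full SMT setting, such propagation would run inside the theory solver, pruning candidate assignments before the incomplete local search resumes.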

Relevance:

30.00%

Publisher:

Abstract:

The generation of heterogeneous big data sources with ever-increasing volumes, velocities and veracities over the last few years has inspired the data science and research community to address the challenge of extracting knowledge from big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision support. This specifically concerns data processing with semantic harmonisation, low-level fusion, analytics, and knowledge modelling with high-level fusion and reasoning. Such approaches will be introduced and presented in the context of the TRIDEC project results on critical oil and gas industry drilling operations, and of the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.

Relevance:

30.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

30.00%

Publisher:

Abstract:

Coupled map lattices (CML) can describe many relaxation and optimization algorithms currently used in image processing. We recently introduced the "plastic-CML" as a paradigm to extract (segment) objects in an image. Here, the image is applied as a set of forces to a metal sheet, which is allowed to undergo plastic deformation parallel to the applied forces. In this paper we present an analysis of our "plastic-CML" in one and two dimensions, deriving the nature and stability of its stationary solutions. We also detail how to use the CML in image processing, how to set the system parameters, and present examples of it at work. We conclude that the plastic-CML is able to segment images with large amounts of noise and a large dynamic range of pixel values, and is suitable for a very large scale integration (VLSI) implementation.
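
As a rough illustration of the CML formalism (not the plastic-CML update rule itself, which involves plastic deformation), the sketch below iterates a standard one-dimensional lattice of diffusively coupled logistic maps; the map, coupling strength, and lattice size are generic textbook choices:

```python
import numpy as np

# Generic 1-D coupled map lattice: logistic maps with diffusive coupling.
# This sketches the CML formalism, not the paper's "plastic-CML" rule.

def logistic(x, r=3.9):
    return r * x * (1.0 - x)

def cml_step(x, eps=0.3):
    f = logistic(x)
    left, right = np.roll(f, 1), np.roll(f, -1)
    # Each site keeps (1 - eps) of its own map value and mixes in the
    # average of its two neighbours (periodic boundary conditions).
    return (1.0 - eps) * f + 0.5 * eps * (left + right)

rng = np.random.default_rng(0)
x = rng.random(64)          # random initial lattice state in [0, 1)
for _ in range(100):
    x = cml_step(x)
print(x.min(), x.max())     # the state stays inside the unit interval
```

Relaxation algorithms of the kind the paper describes replace the logistic map with a site update derived from the image forces, so that the lattice's stationary state encodes the segmentation.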

Relevance:

30.00%

Publisher:

Abstract:

Abstract not available

Relevance:

30.00%

Publisher:

Abstract:

Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general-purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse engineering of software is of great importance within the security and forensics fields, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Unified Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the different binary output formats which may be encountered from the CUDA compiler, and their implications for reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.

Relevance:

30.00%

Publisher:

Abstract:

The usefulness of employing computers in schools and training has been undisputed for several years. There is currently disagreement, however, over which tasks computers can take on independently. When the takeover of teaching functions by computer-based teaching systems is evaluated, shortcomings frequently become apparent. The aim of the present work is to identify, starting from current practical implementations of computer-based teaching systems, different classes of central teaching competencies (student modelling, subject knowledge, and instructional activities in the narrower sense). Within each class, the global capabilities of the teaching systems and the necessary, complementary activities of human tutors are determined. The resulting classification scheme allows both the categorisation of typical teaching systems and the identification of specific competencies that should receive greater attention in future teacher and trainer education. (DIPF/Orig.)

Relevance:

30.00%

Publisher:

Abstract:

This article introduces the genre of the digital audio game and discusses selected play interaction solutions implemented in the Audio Game Hub, a prototype designed and evaluated in 2014 and 2015 at the Gamification Lab at Leuphana University Lüneburg. The Audio Game Hub constitutes a set of familiar playful activities (aiming at a target, reflex-based reacting to sound signals, labyrinth exploration) and casual games (e.g. Tetris, Memory) adapted to the digital medium and converted into the audio sphere, where the player is guided predominantly or solely by sound. The authors discuss the design questions raised at early stages of the project and confront them with the results of user experience testing performed on two groups of sighted and one group of visually impaired gamers.

Relevance:

30.00%

Publisher:

Abstract:

While humans can easily segregate and track a speaker's voice in a loud noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods build models of the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker remains a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. While using no prior information about the target speaker, the method can gracefully incorporate knowledge about the target speaker to further enhance the segregation. Through a series of EEG experiments we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses about the physiological mechanisms underlying the remarkable perceptual ability of humans to segregate acoustic sources, and about its psychophysical manifestations in navigating complex sensory environments. Results from the EEG experiments provide further insights into the assumptions behind the model and motivate future single-unit studies that can provide more direct evidence for the principle of temporal coherence.
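
The temporal-coherence principle can be illustrated with a toy example: feature channels whose slow envelopes rise and fall together are attributed to the same source. The sketch below correlates channel envelopes with an anchor channel; the synthetic envelopes, noise level, and threshold are invented for illustration and are not the dissertation's model:

```python
import numpy as np

# Toy illustration of temporal coherence: channels whose slow envelopes
# correlate with an anchor channel are grouped into the same source.

rng = np.random.default_rng(1)
t = np.linspace(0, 2, 400)
env_a = 0.5 * (1 + np.sin(2 * np.pi * 3 * t))   # source A envelope, 3 Hz
env_b = 0.5 * (1 + np.sin(2 * np.pi * 5 * t))   # source B envelope, 5 Hz

# Six feature channels: three modulated by source A, three by source B,
# each with a little independent noise.
channels = np.stack([env_a, env_a, env_a, env_b, env_b, env_b])
channels = channels + 0.05 * rng.standard_normal(channels.shape)

anchor = channels[0]                  # a channel dominated by source A
corr = np.array([np.corrcoef(anchor, c)[0, 1] for c in channels])
mask = corr > 0.5                     # channels coherent with the anchor
print(mask)                           # the first three channels group together
```

In the actual model, the "channels" are outputs of an auditory spectrotemporal analysis and the coherence is computed over sliding windows, but the grouping logic is the same: coherent channels are assigned to one stream.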

Relevance:

30.00%

Publisher:

Abstract:

This paper presents the evaluation of morpheme, a sketching interface for the control of sound synthesis. We explain the task that was designed to assess the effectiveness of the interface, detect usability issues, and gather participants' responses regarding cognitive, experiential and expressive aspects of the interaction. The evaluation comprises a design task in which participants were asked to design two soundscapes with the morpheme interface for two pieces of video footage. Responses were gathered using a series of Likert-type and open-ended questions. The analysis of the gathered data revealed a number of usability issues; however, the performance of morpheme was satisfactory, and participants recognised the creative potential of the interface and of the synthesis methods for sound design applications.

Relevance:

30.00%

Publisher:

Abstract:

The main objective of this work was to develop an application capable of determining the diffusion times and diffusion coefficients of optical clearing agents and water inside a known type of muscle. Other types of chemical agents, such as medications or metabolic products, can also be studied with the implemented method. Since the diffusion times can be calculated, it is possible to describe the dehydration mechanism that occurs in the muscle. The calculation of the diffusion time of an optical clearing agent makes it possible to characterize the refractive-index-matching mechanism of optical clearing. By using the diffusion times and diffusion coefficients of both water and the clearing agents, not only are the optical clearing mechanisms characterized, but information about the duration and magnitude of the optical clearing effect is also obtained. Such information is crucial for planning a clinical intervention that relies on optical clearing. The experimental method and the equations implemented in the developed application are described throughout this document, demonstrating its effectiveness. The application was developed in MATLAB, and the method was customized to better fit the application's needs. This significantly improved processing efficiency, reduced the time needed to obtain the results, added multiple validations that prevent common errors, and provided extra functionality such as saving application progress and exporting information in different formats. Tests were made using glucose measurements in muscle. Some of the data was also intentionally changed, for testing purposes, in order to obtain different simulations and results from the application. The entire project was validated by comparing the calculated results with those found in the literature, which are also described in this document.
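
The kind of fit such an application performs can be sketched as follows. Assuming the measured signal is normalised and follows a saturating exponential y(t) = 1 - exp(-t/tau), a common single-exponential model for diffusion-driven signals, the diffusion time tau can be recovered by least-squares search. The data, grid, and noise level below are synthetic, and the sketch uses numpy rather than the MATLAB of the thesis application:

```python
import numpy as np

# Sketch: estimate a diffusion (rise) time tau by fitting
# y(t) = 1 - exp(-t / tau) to a measured, normalised signal.
# Synthetic data for illustration only; the thesis app used MATLAB.

def fit_tau(t, y, taus):
    # Grid search: pick the tau that minimises the squared error.
    errs = [np.sum((y - (1 - np.exp(-t / tau))) ** 2) for tau in taus]
    return taus[int(np.argmin(errs))]

t = np.linspace(0, 10, 200)              # time axis, e.g. minutes
true_tau = 2.5
rng = np.random.default_rng(2)
y = 1 - np.exp(-t / true_tau) + 0.01 * rng.standard_normal(t.size)

tau_hat = fit_tau(t, y, np.linspace(0.5, 5.0, 451))
print(tau_hat)                           # recovered close to 2.5
```

In practice the normalised signal would come from repeated optical measurements (e.g. collimated transmittance) during agent application, and tau then feeds the diffusion-coefficient calculation.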

Relevance:

30.00%

Publisher:

Abstract:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges, and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web, and the graph queries of other graph DBMSs, can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models, along with scalable algorithms that find the top-k answers via a suite of intelligent pruning techniques. The suggested models cover a practically important subset of the SPARQL query language, augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from the matched vertices' properties in each answer, in accordance with a user-specified notion of importance. The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them under user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. A probability is calculated from various aspects of an answer, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a great amount of freedom in specifying: (i) what pattern and approximation they consider important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that they are far more efficient than popular triple stores.
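
A common skeleton behind such top-k algorithms, simplified here for illustration (the scoring and bounding functions are invented, not the thesis' SIQ/VIQ/AIQ/PIQ models), keeps a size-k min-heap whose smallest score serves as the pruning threshold: candidates whose cheap upper bound cannot beat it are skipped without exact scoring.

```python
import heapq

# Generic top-k retrieval with upper-bound pruning. The scoring and
# bounding functions below are toy stand-ins for illustration.

def top_k(candidates, k, score, upper_bound):
    """upper_bound(item) must be a cheap value >= score(item)."""
    heap = []  # min-heap of (score, item), kept at size <= k
    for item in candidates:
        # Prune: if even an optimistic bound cannot beat the current
        # k-th best score, skip the expensive exact scoring.
        if len(heap) == k and upper_bound(item) <= heap[0][0]:
            continue
        s = score(item)
        if len(heap) < k:
            heapq.heappush(heap, (s, item))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, item))
    return sorted(heap, reverse=True)

answers = top_k(range(100), 3,
                score=lambda x: x % 10 + x / 100,
                upper_bound=lambda x: x % 10 + 1)
print(answers)  # three highest-scoring items with their scores
```

The thesis' contribution lies in making the bound computation index-driven for subgraph answers; the heap-and-threshold structure above is only the outer loop such algorithms share.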

Relevance:

30.00%

Publisher:

Abstract:

This thesis is an investigation of structural brain abnormalities, as well as multisensory and unisensory processing deficits, in autistic traits and Autism Spectrum Disorder (ASD). To achieve this, structural and functional magnetic resonance imaging (fMRI) and psychophysical techniques were employed. ASD is a neurodevelopmental condition characterised by social communication and interaction deficits, as well as repetitive patterns of behaviour, interests and activities. These traits are also thought to be present in the typical population. The Autism Spectrum Quotient questionnaire (AQ) was developed to assess the prevalence of autistic traits in the general population. Von dem Hagen et al. (2011) revealed a link between AQ and white matter (WM) and grey matter (GM) volume (using voxel-based morphometry). However, their findings revealed no difference in GM in areas associated with social cognition. Cortical thickness (CT) measurements are known to be a more direct measure of cortical morphology than GM volume. Therefore, Chapter 2 investigated the relationship between AQ scores and CT in the same sample of participants. This study showed that AQ scores correlated with CT in the left temporo-occipital junction, left posterior cingulate, right precentral gyrus and bilateral precentral sulcus in a typical population. These areas have previously been associated with structural and functional differences in ASD. Thus the findings suggest that, to some extent, autistic traits are reflected in brain structure in the general population. The ability to integrate auditory and visual information is crucial to everyday life, and results are mixed regarding how ASD influences audiovisual integration. To investigate this question, Chapter 3 examined the Temporal Integration Window (TIW), which indicates how precisely sight and sound need to be temporally aligned for a unitary audiovisual event to be perceived.
Twenty-six adult males with ASD and 26 age- and IQ-matched typically developed (TD) males were presented with beep-flash (BF), point-light drummer, and face-voice (FV) displays with varying degrees of asynchrony and asked to make Synchrony Judgements (SJ) and Temporal Order Judgements (TOJ). Analysis of the data included fitting Gaussian functions as well as fitting an Independent Channels Model (ICM) to the data (Garcia-Perez & Alcala-Quintana, 2012). Gaussian curve fitting for SJs showed that the ASD group had a wider TIW, but for TOJs no group effect was found. The ICM supported these results, and its parameters indicated that the wider TIW for SJs in the ASD group was not due to sensory processing at the unisensory level, but rather to decreased temporal resolution at the decisional level of combining sensory information. Furthermore, when performing TOJs, the ICM revealed a smaller Point of Subjective Simultaneity (PSS; closer to physical synchrony) in the ASD group than in the TD group. Finding that audiovisual temporal processing is different in ASD encouraged us to investigate the neural correlates of multisensory as well as unisensory processing using functional magnetic resonance imaging (fMRI). Therefore, Chapter 4 investigated audiovisual, auditory and visual processing in ASD for simple BF displays and complex, social FV displays. During a block-design experiment, we measured the BOLD signal while 13 adults with ASD and 13 TD age-, sex- and IQ-matched adults were presented with audiovisual, auditory and visual information from BF and FV displays. Our analyses revealed that processing of the audiovisual as well as the unisensory auditory and visual stimulus conditions, in both the BF and FV displays, was associated with reduced activation in ASD. The audiovisual, auditory and visual conditions of FV stimuli revealed reduced activation in ASD in regions of the frontal cortex, while BF stimuli revealed reduced activation in the lingual gyri.
The inferior parietal gyrus revealed an interaction between the stimulus sensory condition of BF stimuli and group. Conjunction analyses revealed smaller audiovisual-sensitive regions of the superior temporal cortex (STC) in ASD. Against our predictions, the STC did not reveal any activation differences, per se, between the two groups. However, a superior frontal area was shown to be sensitive to audiovisual face-voice stimuli in the TD group, but not in the ASD group. Overall, this study indicated differences in brain activity for audiovisual, auditory and visual processing of social and non-social stimuli in individuals with ASD compared to TD individuals. These results contrast with previous behavioural findings suggesting different audiovisual integration, yet intact auditory and visual processing, in ASD. Our behavioural findings revealed audiovisual temporal processing deficits in ASD during SJ tasks; therefore, we investigated the neural correlates of SJ in ASD and TD controls. As in Chapter 4, we used fMRI in Chapter 5 to investigate audiovisual temporal processing in ASD, in the same participants as recruited in Chapter 4. BOLD signals were measured while the ASD and TD participants were asked to make SJs on audiovisual displays at different levels of asynchrony: the participant's PSS, audio leading visual information (audio first), and visual leading audio information (visual first). Whereas no effect of group was found with BF displays, increased putamen activation was observed in ASD participants compared to TD participants when making SJs on FV displays. Investigating SJs on audiovisual displays in the bilateral superior temporal gyrus (STG), an area involved in audiovisual integration (see Chapter 4), we found no group differences or interaction between group and levels of audiovisual asynchrony.
The investigation of different levels of asynchrony revealed a complex pattern of results, indicating a network of areas more involved in processing PSS than audio-first and visual-first displays, as well as areas responding differently to audio-first compared to visual-first displays. These activation differences between audio-first and visual-first conditions in different brain areas are consistent with the view that audio-leading and visual-leading stimuli are processed differently.
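
The Gaussian curve fitting used for the SJ data can be sketched as follows: the proportion of "synchronous" responses is modelled as a Gaussian of the stimulus onset asynchrony (SOA), the fitted centre is read as the PSS, and the fitted width as the TIW. The SOAs, response rates, and parameter grids below are synthetic placeholders, not data from the thesis (which additionally used the Independent Channels Model):

```python
import numpy as np

# Sketch: fit p(SOA) = exp(-(SOA - mu)^2 / (2 * sigma^2)) to synchrony-
# judgement rates by grid search. mu plays the role of the PSS and
# sigma indexes the width of the temporal integration window (TIW).
# Synthetic data for illustration only.

def fit_gaussian(soa, p):
    best, best_err = None, np.inf
    for mu in np.linspace(-100, 100, 81):        # ms
        for sigma in np.linspace(20, 400, 96):   # ms
            pred = np.exp(-(soa - mu) ** 2 / (2 * sigma ** 2))
            err = np.sum((p - pred) ** 2)
            if err < best_err:
                best, best_err = (mu, sigma), err
    return best

soa = np.array([-400, -300, -200, -100, 0, 100, 200, 300, 400], float)
p = np.exp(-(soa - 20) ** 2 / (2 * 150.0 ** 2))  # synthetic SJ rates
mu, sigma = fit_gaussian(soa, p)
print(mu, sigma)  # near the generating values 20 and 150
```

A wider fitted sigma corresponds to a wider TIW, which is the group difference reported for SJs in the ASD sample.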

Relevance:

30.00%

Publisher:

Abstract:

Dataset for publication in PLOS One

Relevance:

30.00%

Publisher:

Abstract:

Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all the information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existing batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon the theory of Markov Decision Processes (MDPs), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithms on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches. We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all of the input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
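
The cost-accuracy trade-off described above can be cast, in miniature, as a sequential decision at each step: READ the next piece of input (paying a cost) or STOP and predict. The toy "confidence" function, word-set target, and threshold policy below are hypothetical stand-ins for the learned MDP policies of the dissertation:

```python
# Minimal sketch of sequential prediction with a cost-accuracy trade-off:
# at each step, either READ the next input token (paying a fixed cost)
# or STOP and predict. The confidence function and threshold stand in
# for a learned policy; they are illustrative, not the thesis' method.

READ_COST = 0.1

def confidence(prefix, target):
    # Toy "model": confidence = fraction of target words already seen.
    return len([w for w in prefix if w in target]) / max(len(target), 1)

def run_policy(tokens, target, threshold):
    prefix, cost = [], 0.0
    for tok in tokens:
        if confidence(prefix, target) >= threshold:
            break                 # STOP: confident enough to predict now
        prefix.append(tok)        # READ: consume one more token
        cost += READ_COST
    return len(prefix), cost

tokens = "the quick brown fox jumps over the lazy dog".split()
target = {"fox", "dog"}
# A stricter threshold reads more of the input: higher cost, more info.
print(run_policy(tokens, target, 0.5))
print(run_policy(tokens, target, 1.0))
```

Sweeping the threshold traces out exactly the cost-accuracy curve the evaluation plots; the dissertation's contribution is learning when to stop (via imitation and reinforcement learning) rather than fixing a hand-tuned threshold.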