14 results for Voice Digital Processing

in DRUM (Digital Repository at the University of Maryland)


Relevance:

80.00%

Publisher:

Abstract:

The proliferation of new mobile communication devices, such as smartphones and tablets, has led to an exponential growth in network traffic. The demand for supporting fast-growing consumer data rates urges wireless service providers and researchers to seek a new, efficient radio access technology, the so-called 5G technology, beyond what current 4G LTE can provide. At the same time, ubiquitous RFID tags, sensors, actuators, mobile phones, and similar devices cut across many areas of modern-day living and offer the ability to measure, infer, and understand environmental indicators. The proliferation of these devices gave rise to the term Internet of Things (IoT). For researchers and engineers in the field of wireless communication, exploring new, effective techniques to support 5G communication and the IoT has become an urgent task, one that not only leads to fruitful research but also enhances the quality of everyday life. Massive MIMO, which has shown great potential for improving the achievable rate with a very large number of antennas, has become a popular candidate. However, deploying a large number of antennas at the base station may not be feasible in indoor scenarios. Is there a good alternative that can achieve system performance similar to that of massive MIMO in indoor environments? In this dissertation, we address this question by proposing the time-reversal (TR) technique as a counterpart of massive MIMO in indoor scenarios with the massive multipath effect. It is well known that radio signals experience many multipaths due to reflection from various scatterers, especially in indoor environments. The traditional TR waveform is able to create a focusing effect at the intended receiver with very low transmitter complexity in a severe multipath channel. TR's focusing effect is in essence a spatial-temporal resonance effect that brings all the multipaths to arrive at a particular location at a specific moment.
We show that by using time-reversal signal processing with a sufficiently large bandwidth, one can harvest the massive multipaths naturally existing in a rich-scattering environment to form a large number of virtual antennas and achieve the desired massive multipath effect with a single antenna. Further, we explore the optimal bandwidth for a TR system to achieve maximal spectral efficiency. By evaluating the spectral efficiency, we find that the optimal bandwidth for a TR system is determined by system parameters, e.g., the number of users and the backoff factor, rather than by the waveform type. Moreover, we investigate the tradeoff between complexity and performance by establishing a generalized relationship between system performance and waveform quantization in a practical communication system. It is shown that 4-bit quantized waveforms can achieve a bit-error rate similar to that of a TR system with perfect-precision waveforms. Besides 5G technology, the Internet of Things (IoT) is another term that has recently attracted increasing attention from both academia and industry. In the second part of this dissertation, the heterogeneity issue within the IoT is explored. Given the massive number of devices in the IoT, one significant form of heterogeneity is device heterogeneity, i.e., heterogeneous bandwidths and associated radio-frequency (RF) components. Traditional middleware techniques result in fragmentation of the whole network, hampering the interoperability of objects and slowing down the development of a unified reference model for the IoT. We propose a novel TR-based heterogeneous system that can address bandwidth heterogeneity while maintaining the benefits of TR. The added complexity of the proposed system lies in the digital processing at the access point (AP) rather than at the devices' ends, and can be easily handled with a more powerful digital signal processor (DSP).
Meanwhile, the complexity of the terminal devices stays low and therefore satisfies the low-complexity and scalability requirements of the IoT. Since there is no middleware in the proposed scheme and the additional physical-layer complexity is concentrated on the AP side, the proposed heterogeneous TR system better satisfies the low-complexity and energy-efficiency requirements of the terminal devices (TDs) than the middleware approach does.
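
As a rough illustration of the temporal focusing effect described in this abstract, the sketch below builds the basic time-reversal waveform (the conjugated, time-reversed channel response, normalized to unit energy) and passes it through the same channel. The channel model (random complex taps with an exponentially decaying power profile), the tap count, and all function names are assumptions made for illustration; they are not taken from the dissertation.

```python
import math
import random

def random_multipath_channel(num_taps, seed=0):
    """Hypothetical channel: complex Gaussian taps with exponentially decaying power."""
    rng = random.Random(seed)
    taps = []
    for k in range(num_taps):
        scale = math.sqrt(math.exp(-k / (num_taps / 4)))
        taps.append(scale * complex(rng.gauss(0, 1), rng.gauss(0, 1)))
    return taps

def tr_waveform(h):
    """Basic TR waveform: time-reversed, conjugated, unit-energy copy of the channel."""
    norm = math.sqrt(sum(abs(tap) ** 2 for tap in h))
    return [tap.conjugate() / norm for tap in reversed(h)]

def convolve(a, b):
    """Plain linear convolution of two sequences."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

h = random_multipath_channel(32)
g = tr_waveform(h)
equivalent = convolve(h, g)  # what the intended receiver observes
magnitudes = [abs(v) for v in equivalent]
peak_index = magnitudes.index(max(magnitudes))
channel_norm = math.sqrt(sum(abs(tap) ** 2 for tap in h))
print(peak_index, magnitudes[peak_index], channel_norm)
```

In this sketch the peak falls at index `len(h) - 1` and its magnitude equals the channel norm, because every tap adds coherently at that one instant. Only the temporal side of focusing is modeled; spatial focusing would require channels to other locations, which are not simulated here.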

Relevance:

30.00%

Publisher:

Abstract:

While humans can easily segregate and track a speaker's voice in a loud, noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods develop models of the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker is still a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. Although it requires no prior information about the target speaker, this method can gracefully incorporate knowledge about the target speaker to further enhance the segregation. Through a series of EEG experiments, we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of the remarkable perceptual ability of humans to segregate acoustic sources, and of its psychophysical manifestations in navigating complex sensory environments.
Results from EEG experiments provide further insights into the assumptions behind the model and provide motivation for future single unit studies that can provide more direct evidence for the principle of temporal coherence.

Relevance:

30.00%

Publisher:

Abstract:

Theories of sparse signal representation, wherein a signal is decomposed as the sum of a small number of constituent elements, play increasing roles in both mathematical signal processing and neuroscience. This happens despite the differences between signal models in the two domains. After reviewing preliminary material on sparse signal models, I use work on compressed sensing for the electron tomography of biological structures as a target for exploring the efficacy of sparse signal reconstruction in a challenging application domain. My research in this area addresses a topic of keen interest to the biological microscopy community, and has resulted in the development of tomographic reconstruction software which is competitive with the state of the art in its field. Moving from the linear signal domain into the nonlinear dynamics of neural encoding, I explain the sparse coding hypothesis in neuroscience and its relationship with olfaction in locusts. I implement a numerical ODE model of the activity of neural populations responsible for sparse odor coding in locusts as part of a project involving offset spiking in the Kenyon cells. I also explain the validation procedures we have devised to help assess the model's similarity to the biology. The thesis concludes with the development of a new, simplified model of locust olfactory network activity, which seeks with some success to explain statistical properties of the sparse coding processes carried out in the network.

Relevance:

30.00%

Publisher:

Abstract:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with predicates. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMS can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model called Substitution Importance Query (SIQ) identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance. 
The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them according to user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks, the vertices' properties in each block, and so on, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation the user considers important, (ii) how to score answers - irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
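
To make the idea of scored top-k subgraph matching concrete, here is a minimal brute-force sketch in the spirit of the first model: it enumerates substitutions for a two-edge path pattern over a toy edge-labeled graph, scores each substitution from vertex properties, and keeps the k best (roughly what ORDER BY with LIMIT k would return). The triples, the `IMPORTANCE` property, the scoring function, and all names are hypothetical, and the sketch uses none of the pruning or indexing techniques the thesis proposes.

```python
import heapq

# Toy edge-labeled graph as (subject, predicate, object) triples (hypothetical data).
TRIPLES = [
    ("alice", "friend", "bob"),
    ("alice", "friend", "carol"),
    ("bob", "friend", "dave"),
    ("carol", "friend", "dave"),
    ("carol", "colleague", "dave"),
    ("dave", "friend", "erin"),
]

# Hypothetical vertex property used for scoring.
IMPORTANCE = {"alice": 0.9, "bob": 0.4, "carol": 0.7, "dave": 0.8, "erin": 0.2}

def match_path(triples, label1, label2):
    """Yield substitutions (x, y, z) such that x -label1-> y -label2-> z."""
    successors = {}
    for s, p, o in triples:
        successors.setdefault((s, p), []).append(o)
    for s, p, o in triples:
        if p == label1:
            for z in successors.get((o, label2), []):
                yield (s, o, z)

def score(substitution):
    """User-specified notion of importance: sum of matched vertices' properties."""
    return sum(IMPORTANCE[v] for v in substitution)

def top_k(answers, k):
    """Keep only the k highest-scoring answers."""
    return heapq.nlargest(k, answers, key=score)

best = top_k(match_path(TRIPLES, "friend", "friend"), 2)
print(best)
```

Swapping in a different `score` function changes the ranking without touching the matcher, which mirrors the separation between pattern and user-defined importance described in the abstract.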

Relevance:

30.00%

Publisher:

Abstract:

Research suggests that supervisors and peers can help employees make sense of what is important or expected from them at work and, thereby, shape their behaviors. In this dissertation, I examine how employees’ organizational citizenship behaviors (OCB), such as helping and voice, are differentially affected by these two sources of influence over time. In particular, I compare the relative and joint effectiveness of two field interventions to enhance OCB: (a) a role clarification intervention in which supervisors are trained to set expectations for OCB for their employees and encourage them to engage in OCB and (b) a norm establishment intervention in which peers are trained to set expectations for each other and encourage each other to perform OCB. I utilize a mixed methods approach involving a quasi-field experiment to test for changes in OCB and qualitative data to explore the theoretical mechanisms over the course of three months in a large food processing plant. I find that role clarification interventions alone have immediate positive effects on OCB, whereas norm establishment interventions alone take a longer period of time to increase OCB. In addition, in the condition where both interventions were combined, norm establishment interventions weaken the effects of role clarification earlier on; however, at later stages in time, this pattern reverses as norm establishment enhances the effects of role clarification on OCB. Through these findings, I highlight how (a) organizations seeking quick increases in citizenship might be better off focusing on supervisors as sources of influence; (b) organizations need to persist with peer-focused interventions to see positive gains; and (c) despite initial hurdles with peer-focused interventions, over time, they can lead to the highest increases in OCB when combined with supervisor-focused interventions.

Relevance:

30.00%

Publisher:

Abstract:

Instructional methods employed by teachers of singing are mostly drawn from personal experience, personal reflections, and methods encountered in their own voice training (Welch & Howard, 2005). Even in academia, singing pedagogy is one of the few disciplines in which research on the efficacy of teaching/learning practices has not been established (Crocco, et al., 2016). This dissertation argues that the reason for this deficit is a lack of operationalization of constructs in singing, which, to date, has not been undertaken. The researcher addresses issues of paradigm, epistemology, and methodology to suggest an appropriate model of experimental research towards the assessment of teaching/learning practice efficacy. A study was conducted adapting attentional focus research methodologies to test the effect of attentional focus on singing voice quality in adult novice singers. Based on previous attentional focus studies, it was hypothesized that external focus conditions would result in singing voice quality superior to that of internal focus conditions. While the hypothesis was partially supported by the data, the researcher welcomed refinement of the suggested research model. It is hoped that new research methodologies will emerge to investigate singing phenomena, yielding data that may be used towards the development of evidence-based frameworks for singing training.

Relevance:

30.00%

Publisher:

Abstract:

Presentation from the MARAC conference in Roanoke, VA on October 7–10, 2015. S8 - Minimal Processing and Preservation: Friends or Foes?

Relevance:

30.00%

Publisher:

Abstract:

Presentation from the MARAC conference in Pittsburgh, PA on April 14–16, 2016. S2 - Making Friends: The Highs, Lows, and Challenges of Inter-Repository Archival Relationships

Relevance:

30.00%

Publisher:

Abstract:

Older adults frequently report that they can hear what they have been told but cannot understand the meaning. This is particularly true in noisy conditions, where the additional challenge of suppressing irrelevant noise (i.e. a competing talker) adds another layer of difficulty to their speech understanding. Hearing aids improve speech perception in quiet, but their success in noisy environments has been modest, suggesting that peripheral hearing loss may not be the only factor in the older adult’s perceptual difficulties. Recent animal studies have shown that auditory synapses and cells undergo significant age-related changes that could impact the integrity of temporal processing in the central auditory system. Psychoacoustic studies carried out in humans have also shown that hearing loss can explain the decline in older adults’ performance in quiet compared to younger adults, but these psychoacoustic measurements are not accurate in describing auditory deficits in noisy conditions. These results would suggest that temporal auditory processing deficits could play an important role in explaining the reduced ability of older adults to process speech in noisy environments. The goals of this dissertation were to understand how age affects neural auditory mechanisms and at which level in the auditory system these changes are particularly relevant for explaining speech-in-noise problems. Specifically, we used non-invasive neuroimaging techniques to tap into the midbrain and the cortex in order to analyze how auditory stimuli are processed in younger (our standard) and older adults. We will also attempt to investigate a possible interaction between processing carried out in the midbrain and cortex.

Relevance:

30.00%

Publisher:

Abstract:

This dissertation examines the role that music has played in the expression of identity and revitalization of culture of the Alevis in Turkey, since the start of their sociocultural revival movement in the late 1980s. Music is central to Alevi claims of ethnic and religious difference—singing and playing the bağlama (Turkish folk lute) constitutes an expressive practice in worship and everyday life. Based on research conducted from 2012 to 2014, I investigate and present Alevi music through the lens of discourses on the construction of identity as a social and musical process. Alevi musicians perform a revived repertoire of the ritual music and folk songs of Anatolian bards and dervish-lodge poets that developed over several centuries. Contemporary media and performance contexts have blurred former distinctions between sacred and secular, yet have provided new avenues to build community in an urban setting. I compare music performances in the worship services of urban and small-town areas, and other community events such as devotional meetings, concerts, clubs, and broadcast and social media to illustrate the ways that participation—both performing and listening—reinforces identity and solidarity. I also examine the influence of these different contexts on performers’ musical choices, and the power of music to evoke a range of responses and emotional feelings in the participants. Through my investigation I argue that the Alevi music repertoire is not only a cultural practice but also a symbol of power and collective action in their struggle for human rights and self-determination. As Alevis have faced a redefined Turkish nationalism that incorporates Sunni Muslim piety, this music has gained even greater potency in their resistance to misrecognition as a folkloric, rather than a living, tradition.

Relevance:

30.00%

Publisher:

Abstract:

Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existent batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon theories of Markov Decision Process (MDP), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches. We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs.
We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves similar prediction quality to methods that use all input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
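
The cost-accuracy trade-off in incremental text classification can be illustrated with a small sketch: a predictor reads tokens one at a time and, after each token, either waits or commits to an answer. Note that this uses a fixed confidence threshold, i.e. the kind of heuristic decision rule the abstract compares against, not a policy learned by imitation or reinforcement learning. The keyword classifier, the keyword sets, and the example question are all hypothetical.

```python
# Hypothetical keyword sets for a toy two-class problem.
KEYWORDS = {
    "physics": {"quantum", "electron", "energy"},
    "music": {"symphony", "composer", "violin"},
}

def keyword_classifier(prefix):
    """Return (label, confidence) for the tokens seen so far."""
    counts = {label: sum(w in words for w in prefix) for label, words in KEYWORDS.items()}
    best = max(counts, key=counts.get)
    total = sum(counts.values())
    # +1 smoothing so a single keyword hit does not give full confidence.
    return best, counts[best] / (total + 1)

def incremental_predict(tokens, classify, threshold):
    """Consume tokens in order; commit once confidence reaches the threshold.

    Returns (label, tokens_consumed). tokens_consumed is the cost.
    """
    label = None
    for t in range(1, len(tokens) + 1):
        label, confidence = classify(tokens[:t])
        if confidence >= threshold:
            return label, t
    return label, len(tokens)

question = "this composer wrote a symphony for violin and orchestra".split()
for threshold in (0.6, 0.7, 0.95):
    print(threshold, incremental_predict(question, keyword_classifier, threshold))
```

Raising the threshold makes the predictor wait longer (higher cost) before answering. Sweeping it traces out one cost-accuracy curve, against which a learned stopping policy could be compared.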

Relevance:

30.00%

Publisher:

Abstract:

This dissertation uses children’s acquisition of adjunct control as a case study to investigate grammatical and performance accounts of language acquisition. In previous research, children have consistently exhibited non-adultlike behavior for sentences with adjunct control. To explain children’s behavior, several different grammatical accounts have been proposed, but evidence for these accounts has been inconclusive. In this dissertation, I take two approaches to account for children’s errors. First, I spell out the predictions of previous grammatical accounts, and test these predictions after accounting for some methodological concerns that might have influenced children’s behavior in previous studies. While I reproduce the non-adultlike behavior observed in previous studies, the predictions of previous grammatical accounts are not borne out, suggesting that extragrammatical factors are needed to explain children’s behavior. Next, I consider the role of two different types of extragrammatical factors in predicting children’s non-adultlike behavior. With a new task designed to address the task demands in previous studies, children exhibit significantly higher accuracy than with previous tasks. This suggests that children’s behavior has been influenced by task-specific processing factors. In addition to the task, I also test the predictions of a similarity-based interference account, which links children’s errors to the same memory mechanisms involved in sentence processing difficulties observed in adults. These predictions are borne out, supporting a more continuous developmental trajectory as children’s processing mechanisms become more resistant to interference. Finally, I consider how children’s errors might influence their acquisition of adjunct control, given the distribution in the linguistic input. I discuss the results of a corpus analysis, including the possibility that adjunct control could be learned from the input.
The kinds of information that could be useful to a learner become much more limited, however, after considering the processing limitations that would interfere with the representations available to the learner.

Relevance:

30.00%

Publisher:

Abstract:

Strawberries harvested for processing as frozen fruits are currently de-calyxed manually in the field. This process requires the removal of the stem cap with green leaves (i.e. the calyx) and incurs many disadvantages when performed by hand. Not only does it require maintaining cutting-tool sanitation, but it also increases labor time and exposure of the de-capped strawberries before in-plant processing. This leads to labor inefficiency and decreased harvest yield. Moving the calyx removal process from the fields to the processing plants would reduce field labor and improve management and logistics, while increasing annual yield. As labor prices continue to increase, the strawberry industry has shown great interest in the development and implementation of an automated calyx removal system. In response, this dissertation describes the design, operation, and performance of a full-scale automatic vision-guided intelligent de-calyxing (AVID) prototype machine. The AVID machine utilizes commercially available equipment to produce a relatively low-cost automated de-calyxing system that can be retrofitted into existing food processing facilities. This dissertation is broken up into five sections. The first two sections include a machine overview and a 12-week processing plant pilot study. Results of the pilot study indicate the AVID machine is able to de-calyx grade-1-with-cap conical strawberries at roughly 66 percent output weight yield at a throughput of 10,000 pounds per hour. The remaining three sections describe in detail the three main components of the machine: a strawberry loading and orientation conveyor, a machine vision system for calyx identification, and a synchronized multi-waterjet knife calyx removal system. In short, the loading system utilizes rotational energy to orient conical strawberries. The machine vision system determines cut locations through RGB real-time feature extraction.
The high-speed multi-waterjet knife system uses direct drive actuation to locate 30,000 psi cutting streams to precise coordinates for calyx removal. Based on the observations and studies performed within this dissertation, the AVID machine is seen to be a viable option for automated high-throughput strawberry calyx removal. A summary of future tasks and further improvements is discussed at the end.

Relevance:

30.00%

Publisher:

Abstract:

Discretion plays a role in nearly every facet of the American criminal justice system. It is widely regarded as necessary to do justice but is not without criticisms – especially when it leads to unfavorable or disparate treatment. The role of discretion in sexual assault cases has been particularly scrutinized. Since the majority of sexual assaults do not fit stereotypic beliefs about what constitutes a “real rape” and “genuine victim,” criminal justice officials use their discretion to filter these cases out of the justice system. This study explored this issue by examining two stages of the criminal justice process: the police decision to refer cases for prosecution and the prosecutorial decision to accept referred cases. In doing so, it contributes to this body of literature in three ways. First, it included sexual assault cases that involve Alaska Native victims and suspects. Second, it addressed a gap in the theoretical scholarship by examining the downstream nature of police decision-making. And finally, it examined the formal reasons prosecutors give for charge dispositions. This study found a significant amount of attrition of sexual assault cases as they progressed through the criminal justice system. Moreover, a combination of legally relevant and extralegal factors was found to be important, but not consistently across all types of sexual assaults. Among legal factors, the number of victim injuries was the most consistent predictor. Among extralegal factors, cases that involved Alaska Native suspects had significantly higher odds of case referral and case acceptance compared to white suspects. The effect of suspect race was particularly pronounced in cases with a white victim. Additionally, the findings suggest that not only are Native American defendants more likely to have their cases referred by police, but once referred, they are also more likely to have them accepted for prosecution. 
Contrary to expectations, non-stranger assaults had higher odds of police referral than stranger assaults. However, the opposite appears to be true for the decision to prosecute cases. Once cases were referred, prosecutors were five times more likely to accept sexual assaults perpetrated by strangers. The implications of these findings are also discussed.