97 results for Noisy corpora.
Abstract:
The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, a collection of over 10 hours of background noise was conducted across 10 unique locations covering 5 common noise scenarios, to create the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. The evaluation of five baseline VAD systems on the QUT-NOISE-TIMIT corpus is conducted to validate the data and show that the variety of noise available will allow for better evaluation of VAD systems than existing approaches in the literature.
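The mixing step described above can be illustrated with a minimal sketch, assuming the SNR is defined from mean power over the whole excerpt. The corpus itself may define it differently, for example over active speech only. The function names `power`, `mix_at_snr` and `measured_snr_db` are hypothetical and are not taken from the corpus tooling.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech`. Returns the mixture and the scaled noise."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have equal length")
    # SNR(dB) = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in decibels between a speech signal and a noise signal."""
    return 10 * math.log10(power(speech) / power(noise))


# Toy stand-ins: a sine wave for speech, Gaussian samples for background noise.
random.seed(0)
speech = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]
mixed, scaled = mix_at_snr(speech, noise, snr_db=5.0)
```

Calling `measured_snr_db(speech, scaled)` afterwards recovers the requested 5 dB up to floating-point error, which is a quick sanity check on the gain computation.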
Abstract:
This silent swarm of stylized crickets is downloading data from Internet and catalogue searches being undertaken by the public at the State Library Queensland. These searches are displayed on the screens on the crickets’ backs. Each cricket downloads the searches and communicates this information to other crickets. Commonly found searches spread like a meme through the swarm. In this work memes replace the crickets’ song, washing like a wave through the swarm and changing on the whim of Internet users. When one cricket begins calling others, the swarm may respond to produce emergent patterns of text. When traffic is slow or of no interest to the crickets, they display onomatopoeia. The work is inspired by R. Murray Schafer’s research into acoustic ecologies. In the 1960s Schafer proposed that many species develop calls that fit niches within their acoustic environment. An increasing background of white noise dominates the acoustic environment of urban human habitats, leaving few acoustic niches for other species to communicate. The popularity of headphones and portable music may be seen as an evolution of our acoustic ecology driven by our desire to hear expressive, meaningful sound above the din of our cities. Similarly, the crickets in this work are hypothetical creatures that have evolved to survive in a noisy human environment. This speculative species replaces auditory calls with onomatopoeia and information memes, communicating with the swarm via radio frequency chirps instead of sound. Whilst these crickets cannot make sound, each individual has been programmed to respond to sound generated by the audience by making onomatopoeic calls in text. Try talking to a cricket, blowing on its tail, or making other sounds to trigger a call.
Abstract:
Since the launch of the ‘Clean Delhi, Green Delhi’ campaign in 2003, slums have become a significant social and political issue in India’s capital city. Through this campaign, the state, in collaboration with Delhi’s middle class through the ‘Bhagidari system’ (literally translated as ‘participatory system’), aims to transform Delhi into a ‘world-class city’ that offers a sanitised, aesthetically appealing urban experience to its citizens and Western visitors. In 2007, Delhi won the bid to host the 2010 Commonwealth Games; since then, this agenda has acquired an urgent, almost violent, impetus to transform Delhi into an environmentally friendly, aesthetically appealing and ‘truly international city’. Slums and slum-dwellers, with their ‘filth, dirt, and noise’, have no place in this imagined city. The violence inflicted upon slum-dwellers, including the denial of their judicial rights, is justified on these accounts. In addition, the juridical discourse since 2000 has re-problematised slums as ‘nuisance’. The rising antagonism of the middle classes against the poor, supported by the state’s ambition to have a ‘world-class city’, has allowed a new rhetoric to situate the slums in the city. These representations articulate slums as homogenised spaces of experience and identity. The ‘illegal’ status of slum-dwellers, as encroachers upon public space, is stretched to involve ‘social, cultural, and moral’ decadence and depravity. This thesis is an ethnographic exploration of everyday life in a prominent slum settlement in Delhi. It sensually examines the social, cultural and political materiality of slums, and the relationship of slums with the middle class. In doing so, it highlights the politics of sensorial ordering of slums as ‘filthy, dirty, and noisy’ by the middle classes to calcify their position as ‘others’ in order to further segregate, exclude and discriminate against the slums.
The ethnographic experience in the slums, however, highlights a complex sensorial ordering and politics of its own. Not only are the interactions between diverse communities in slums highly restricted and sensually ordained, but the middle class is identified as a sensual ‘other’, and its sensual practices prohibited. This is significant in two ways. First, it highlights the multiplicity of social, cultural experience and engagement in the slums, thereby challenging its homogenised representation. Second, the ethnographic exploration allowed me to frame a distinct sense of self amongst the slums, which is denied in mainstream discourses, and allowed me to identify the slums’ own ‘others’, the middle class being one of them. This thesis highlights sound – its production, performances and articulations – as an act with social, cultural, and political implications and manifestations. ‘Noise’ can be understood as a political construct to identify ‘others’ – and both slum-dwellers and the middle classes identify different sonic practices as noise to situate the ‘other’ sonically. It is within this context that this thesis frames the position of Listener and Hearer, which corresponds to their social-political positions. These positions can be, and are, resisted and circumvented through sonic practices. For instance, amplification tactics in the Karimnagar slums, which are understood as ‘uncultured, callous activities to just create more noise’ by the slums’ middle-class neighbours, also serve definite purposes in shaping and navigating the space through the slums’ soundscapes, asserting a presence that is otherwise denied. Such tactics allow the residents to define their sonic territories and scope of sonic performances; they are significant in terms of exerting one’s position, territory and identity, and they are very important in subverting hierarchies.
The residents of the Karimnagar slums have to negotiate many social, cultural, moral and political prejudices in their everyday lives. Their identity is constantly under scrutiny and threat. However, the sonic cultures and practices in the Karimnagar slums allow their residents to exert a definite sonic presence – which the middle class has to hear. The articulation of noise and silence is an act manifesting, referencing and resisting social, cultural, and political power and hierarchies.
Abstract:
Clearly identifying the boundary between positive and negative streams is a major challenge for information filtering systems. Several attempts have used negative feedback to address this challenge; however, two issues arise when using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern-mining-based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance that is also consistent for adaptive filtering.
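The three-way split of terms can be pictured with a deliberately simplified sketch. The rule used here is an assumption made for illustration, not the paper's actual specificity measure: a term seen only in positive documents is treated as positive specific, a term seen only in the selected offenders as negative specific, and a term seen in both as general. The function names are hypothetical.

```python
def vocabulary(docs):
    """Set of lower-cased whitespace-separated terms across all documents."""
    terms = set()
    for doc in docs:
        terms.update(doc.lower().split())
    return terms


def classify_terms(positive_docs, offender_docs):
    """Split terms into three groups using simple set membership.

    Assumed rule (illustrative only): membership in the positive
    vocabulary, the offender vocabulary, or both.
    """
    pos = vocabulary(positive_docs)
    neg = vocabulary(offender_docs)
    return {
        "positive_specific": pos - neg,
        "general": pos & neg,
        "negative_specific": neg - pos,
    }


groups = classify_terms(
    positive_docs=["noisy speech corpus", "speech detection"],
    offender_docs=["noisy engine vibration"],
)
```

With these toy inputs, "noisy" lands in the general group because it occurs on both sides, which is the kind of term a revising strategy would treat differently from the specific ones.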
Abstract:
Extracellular matrix regulates many cellular processes likely to be important for development and regression of corpora lutea. Therefore, we identified the types and components of the extracellular matrix of the human corpus luteum at different stages of the menstrual cycle. Two different types of extracellular matrix were identified by electron microscopy: subendothelial basal laminas and an interstitial matrix located as aggregates at irregular intervals between the non-vascular cells. No basal laminas were associated with luteal cells. At all stages, collagen type IV α1 and laminins α5, β2 and γ1 were localized by immunohistochemistry to subendothelial basal laminas, and collagen type IV α1 and laminins α2, α5, β1 and β2 localized in the interstitial matrix. Laminin α4 and β1 chains occurred in the subendothelial basal lamina from mid-luteal stage to regression; at earlier stages, a punctate pattern of staining was observed. Therefore, human luteal subendothelial basal laminas potentially contain laminin 11 during early luteal development and, additionally, laminins 8, 9 and 10 at the mid-luteal phase. Laminin α1 and α3 chains were not detected in corpora lutea. Versican localized to the connective tissue extremities of the corpus luteum. Thus, during the formation of the human corpus luteum, remodelling of extracellular matrix does not result in basal laminas as present in the adrenal cortex or ovarian follicle. Instead, novel aggregates of interstitial matrix of collagen and laminin are deposited within the luteal parenchyma, and it remains to be seen whether this matrix is important for maintaining the luteal cell phenotype.
Abstract:
Several studies have demonstrated an association between polycystic ovary syndrome (PCOS) and the dinucleotide repeat microsatellite marker D19S884, which is located in intron 55 of the fibrillin-3 (FBN3) gene. Fibrillins, including FBN1 and 2, interact with latent transforming growth factor (TGF)-β-binding proteins (LTBP) and thereby control the bioactivity of TGFβs. TGFβs stimulate fibroblast replication and collagen production. The PCOS ovarian phenotype includes increased stromal collagen and expansion of the ovarian cortex, features feasibly influenced by abnormal fibrillin expression. To examine a possible role of fibrillins in PCOS, particularly FBN3, we undertook tagging and functional single nucleotide polymorphism (SNP) analysis (32 SNPs including 10 that generate non-synonymous amino acid changes) using DNA from 173 PCOS patients and 194 controls. No SNP showed a significant association with PCOS and alleles of most SNPs showed almost identical population frequencies between PCOS and control subjects. No significant differences were observed for microsatellite D19S884. In human PCO stroma/cortex (n = 4) and non-PCO ovarian stroma (n = 9), follicles (n = 3) and corpora lutea (n = 3) and in human ovarian cancer cell lines (KGN, SKOV-3, OVCAR-3, OVCAR-5), FBN1 mRNA levels were approximately 100 times greater than FBN2 and 200–1000-fold greater than FBN3. Expression of LTBP-1 mRNA was 3-fold greater than LTBP-2. We conclude that FBN3 appears to have little involvement in PCOS but cannot rule out that other markers in the region of chromosome 19p13.2 are associated with PCOS or that FBN3 expression occurs in other organs and that this may be influencing the PCOS phenotype.
Abstract:
Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common database of 256 × 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16 × 16 pixel low-pass filtered and decimated versions of the same data.
Abstract:
Condition monitoring of diesel engines can prevent unpredicted engine failures and the associated consequences. This paper presents an experimental study of the signal characteristics of a 4-cylinder diesel engine under various loading conditions. Acoustic emission, vibration and in-cylinder pressure signals were employed to study the effectiveness of these techniques for condition monitoring and identifying symptoms of incipient failures. An event-driven synchronous averaging technique was employed to average the quasi-periodic diesel engine signal in the time domain to eliminate or minimize the effect of engine speed and amplitude variations on the analysis of the condition monitoring signal. It was shown that acoustic emission (AE) is a better technique than the vibration method for condition monitoring of diesel engines due to its ability to produce high-quality signals (i.e., excellent signal-to-noise ratio) in a noisy diesel engine environment. It was found that the peak amplitude of the AE RMS signals correlating to the impact-like combustion-related events generally decreases as the loading increases, due to a more stable mechanical process of the engine. A small shift in the exhaust valve closing time was observed as the engine load increased, which indicates a prolonged combustion process in the cylinder (to produce more power). In contrast, peak amplitudes of the AE RMS attributed to fuel injection increase as the loading increases. This can be explained by the increased fuel friction caused by the increased volume flow rate during the injection. Multiple AE pulses during the combustion process were identified in the study, which were generated by the piston rocking motion and the interaction between the piston and the cylinder wall. The piston rocking motion is caused by the non-uniform pressure distribution acting on the piston head as a result of the non-linear combustion process of the engine.
The rocking motion ceased when the pressure in the cylinder chamber stabilized.
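The idea behind event-driven synchronous averaging can be sketched as follows, assuming a once-per-cycle trigger (for example a crank-position pulse) supplies the sample index at which each engine cycle starts. Each cycle is resampled to a common length before averaging, which removes the effect of speed variation. Amplitude normalisation is not attempted here, and the function names are hypothetical rather than taken from the study.

```python
def resample(segment, n_points):
    """Linearly interpolate `segment` onto `n_points` evenly spaced positions."""
    if len(segment) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(segment) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)  # fractional index into the segment
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(segment[lo] * (1 - frac) + segment[hi] * frac)
    return out


def synchronous_average(signal, event_indices, n_points=64):
    """Average the signal over cycles delimited by consecutive event indices."""
    cycles = [
        resample(signal[start:stop], n_points)
        for start, stop in zip(event_indices, event_indices[1:])
    ]
    # Average point by point across the resampled cycles.
    return [sum(column) / len(cycles) for column in zip(*cycles)]


# Toy signal: three cycles of different duration, each a ramp from 0 to 1.
lengths = [50, 60, 55]
signal, events = [], [0]
for n in lengths:
    signal.extend(i / (n - 1) for i in range(n))
    events.append(len(signal))
averaged = synchronous_average(signal, events, n_points=64)
```

Because every toy cycle has the same shape once stretched to 64 points, the average reproduces the ramp even though the cycles differ in duration.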
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming, is not known. This is an important question to answer, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and the results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
Abstract:
Inverse problems based on using experimental data to estimate unknown parameters of a system often arise in biological and chaotic systems. In this paper, we consider parameter estimation in systems biology involving linear and non-linear complex dynamical models, including the Michaelis–Menten enzyme kinetic system, a dynamical model of competence induction in Bacillus subtilis bacteria and a model of feedback bypass in B. subtilis bacteria. We propose some novel techniques for inverse problems. Firstly, we establish an approximation of a non-linear differential algebraic equation that corresponds to the given biological systems. Secondly, we use the Picard contraction mapping, collage methods and numerical integration techniques to convert the parameter estimation into a minimization problem of the parameters. We propose two optimization techniques: a grid approximation method and a modified hybrid Nelder–Mead simplex search and particle swarm optimization (MH-NMSS-PSO) for non-linear parameter estimation. The two techniques are used for parameter estimation in a model of competence induction in B. subtilis bacteria with noisy data. The MH-NMSS-PSO scheme is applied to a dynamical model of competence induction in B. subtilis bacteria based on experimental data and the model for feedback bypass. Numerical results demonstrate the effectiveness of our approach.
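Parameter estimation by minimisation over a grid can be shown on a much smaller problem than the ones treated in the paper. The sketch below fits the static Michaelis–Menten rate law v = Vmax·S / (Km + S) to noisy synthetic data with a plain exhaustive grid search. It does not use the collage method, the differential-equation models, or the MH-NMSS-PSO scheme, and the data, grid ranges and function names are assumptions chosen for illustration.

```python
import random


def mm_rate(s, vmax, km):
    """Michaelis-Menten reaction rate at substrate concentration `s`."""
    return vmax * s / (km + s)


def sse(params, data):
    """Sum of squared errors between observed rates and the model."""
    vmax, km = params
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in data)


def grid_search(data, vmax_grid, km_grid):
    """Return the (vmax, km) grid point with the smallest squared error."""
    candidates = ((vmax, km) for vmax in vmax_grid for km in km_grid)
    return min(candidates, key=lambda p: sse(p, data))


# Synthetic noisy observations from assumed "true" parameters.
random.seed(1)
true_vmax, true_km = 2.0, 0.5
substrates = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]
data = [
    (s, mm_rate(s, true_vmax, true_km) + random.gauss(0.0, 0.01))
    for s in substrates
]

vmax_grid = [0.5 + 0.05 * i for i in range(61)]  # 0.5 to 3.5
km_grid = [0.05 + 0.05 * i for i in range(40)]   # 0.05 to 2.0
best_vmax, best_km = grid_search(data, vmax_grid, km_grid)
```

The resolution of the estimate is limited by the grid spacing, which is one reason a grid stage is often followed by a local or population-based refinement such as the hybrid simplex and particle swarm search named in the abstract.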
Abstract:
Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely in the audio domain, particularly in noisy acoustic conditions. However, most of the research conducted in visual voice activity detection (VVAD) has neglected to address variabilities in the visual domain such as viewpoint variation. In this paper we investigate the effectiveness of the visual information from the speaker’s frontal and profile views (i.e., left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front end approach and the Gaussian mixture model (GMM) based VVAD framework, and report the experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views, and we give a quantitative comparison of VVAD based on frontal and profile views. The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker’s face may not always be frontal.
Abstract:
Information has no value unless it is accessible. Information must be connected together so a knowledge network can then be built. Such a knowledge base is a key resource for Internet users to interlink information from documents. Information retrieval, a key technology for knowledge management, guarantees access to large corpora of unstructured text. Collaborative knowledge management systems such as Wikipedia are becoming more popular than ever; however, their link creation function is not optimized for discovering possible links in the collection and the quality of automatically generated links has never been quantified. This research begins with an evaluation forum which is intended to cope with the experiments of focused link discovery in a collaborative way as well as with the investigation of the link discovery application. The research focus was on the evaluation strategy: the evaluation framework proposal, including rules, formats, pooling, validation, assessment and evaluation, has proved to be efficient for conducting evaluation and reusable for further extension. The collection-split approach is used to re-construct the Wikipedia collection into a split collection comprising single passage files. This split collection proved feasible for improving the discovery of relevant passages and is intended to serve as a corpus for focused link discovery. Following these experiments, a mobile client-side prototype built on iPhone is developed to resolve the mobile search issue by using focused link discovery technology. According to the interview survey, the proposed mobile interactive UI does improve the experience of mobile information seeking. Based on this evaluation framework, a novel cross-language link discovery proposal using multiple text collections is developed. A dynamic evaluation approach is proposed to enhance both the collaborative effort and the interacting experience between submission and evaluation.
A realistic evaluation scheme has been implemented at NTCIR for cross-language link discovery tasks.
Abstract:
This paper presents an approach to building an observation likelihood function from a set of sparse, noisy training observations taken from known locations by a sensor with no obvious geometric model. The basic approach is to fit an interpolant to the training data, representing the expected observation, and to assume additive sensor noise. This paper takes a Bayesian view of the problem, maintaining a posterior over interpolants rather than simply the maximum-likelihood interpolant, giving a measure of uncertainty in the map at any point. This is done using a Gaussian process framework. To validate the approach experimentally, a model of an environment is built using observations from an omni-directional camera. After a model has been built from the training data, a particle filter is used to localise while traversing this environment.
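A one-dimensional toy version of this idea can be sketched as below, assuming a scalar observation, a squared-exponential kernel and hand-picked hyperparameters, none of which come from the paper. The Gaussian process gives a posterior mean and variance for the expected observation at a query location, and the observation likelihood is then a Gaussian whose variance adds the map uncertainty to the sensor noise. All function names are hypothetical.

```python
import math


def rbf(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two scalar locations."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior(train_x, train_y, query_x, noise_var=0.01):
    """Posterior mean and variance of the expected observation at `query_x`."""
    gram = [
        [rbf(a, b) + (noise_var if i == j else 0.0)
         for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    k_star = [rbf(query_x, a) for a in train_x]
    alpha = solve(gram, train_y)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    weights = solve(gram, k_star)
    var = rbf(query_x, query_x) - sum(k * w for k, w in zip(k_star, weights))
    return mean, var


def observation_likelihood(z, mean, var, noise_var=0.01):
    """Gaussian likelihood of observation `z`, combining map and sensor noise."""
    total = var + noise_var
    return math.exp(-0.5 * (z - mean) ** 2 / total) / math.sqrt(2 * math.pi * total)


train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [math.sin(x) for x in train_x]
near_mean, near_var = gp_posterior(train_x, train_y, 1.0)
far_mean, far_var = gp_posterior(train_x, train_y, 10.0)
```

Far from the training locations the posterior variance returns to the prior variance, so the likelihood becomes flat there. In a particle filter this means particles in unmapped regions are neither strongly rewarded nor strongly penalised by an observation.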
Abstract:
Large margin learning approaches, such as support vector machines (SVM), have been successfully applied to numerous classification tasks, especially for automatic facial expression recognition. The risk of such approaches, however, is their sensitivity to large margin losses due to the influence of noisy training examples and outliers, which is a common problem in the area of affective computing (i.e., manual coding at the frame level is tedious so coarse labels are normally assigned). In this paper, we leverage the relaxation of the parallel-hyperplanes constraint and propose the use of modified correlation filters (MCF). The MCF is similar in spirit to SVMs and correlation filters, but with the key difference of optimizing only a single hyperplane. We demonstrate the superiority of MCF over current techniques on a battery of experiments.
Abstract:
The city and the urban condition, popular subjects of art, literature, and film, have been commonly represented as fragmented, isolating, violent, with silent crowds moving through the hustle and bustle of a noisy, polluted cityspace. Included in this diverse artistic field is children’s literature—an area of creative and critical inquiry that continues to play a central role in illuminating and shaping perceptions of the city, of city lifestyles, and of the people who traverse the urban landscape. Fiction’s textual representations of cities, its sites and sights, lifestyles and characters have drawn on traditions of realist, satirical, and fantastic writing to produce the protean urban story—utopian, dystopian, visionary, satirical—with the goal of offering an account or critique of the contemporary city and the urban condition. In writing about cities and urban life, children’s literature variously locates the child in relation to the social (urban) space. This dialogic relation between subject and social space has been at the heart of writings about/of the flâneur: a figure who experiences modes of being in the city as it transforms under the influences of modernism and postmodernism. Within this context of a changing urban ontology brought about by (post)modern styles and practices, this article examines five contemporary picture books: The Cows Are Going to Paris by David Kirby and Allen Woodman; Ooh-la-la (Max in love) by Maira Kalman; Mr Chicken Goes to Paris and Old Tom’s Holiday by Leigh Hobbs; and The Empty City by David Megarrity. I investigate the possibility of these texts reviving the act of flânerie, but in a way that enables different modes of being a flâneur, a neo-flâneur. I suggest that the neo-flâneur retains some of the characteristics of the original flâneur, but incorporates others that take account of the changes wrought by postmodernity and globalization, particularly tourism and consumption. 
The dual issue at the heart of the discussion is that tourism and consumption as agents of cultural globalization offer a different way of thinking about the phenomenon of flânerie. While the flâneur can be regarded as the precursor to the tourist, the discussion considers how different modes of flânerie, such as the tourist-flâneur, are an inevitable outcome of commodification of the activities that accompany strolling through the (post)modern urban space.