989 results for Acoustic event classification


Relevance:

80.00%

Abstract:

Swallowing disorders are defined as oropharyngeal dysphagia when they present specific signs and symptoms characterized by alterations in any phase of swallowing. Early diagnosis is crucial for the prognosis of patients with dysphagia, and the potential to diagnose dysphagia noninvasively by assessing the sounds of swallowing is a highly attractive option for the dysphagia clinician. This study proposes a new framework for oropharyngeal dysphagia identification with two main contributions: a new set of features extracted from the swallowing signal by the discrete wavelet transform, and dysphagia classification by a novel pattern classifier called OPF. We also employed the well-known SVM algorithm in the dysphagia identification task for comparison purposes. We performed the experiments on two sub-signals: the first is the moment of the maximal peak (MP) of the signal and the second is the swallowing apnea period (SAP). The final OPF accuracies were 85.2% and 80.2% for the MP and SAP signals, respectively, outperforming the SVM results. ©2008 IEEE.
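
As a rough illustration of wavelet-based features, the sketch below computes sub-band energies with a Haar discrete wavelet transform. The abstract does not say which wavelet, decomposition depth, or feature definition was used, so the Haar filter, the three levels, and the names `haar_dwt_step` and `wavelet_energy_features` are all assumptions made for this example.

```python
import math

def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input by repeating the last sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_energy_features(signal, levels=3):
    """Energy of the detail band at each level, then the final approximation energy."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt_step(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

# Hypothetical test signal standing in for a swallowing-sound segment.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = wavelet_energy_features(signal, levels=3)
```

Because the Haar transform is orthonormal, the band energies add up to the energy of the input whenever no padding occurs, which is a convenient sanity check. A feature vector like this could then be passed to any classifier, such as the SVM mentioned above.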

Relevance:

80.00%

Abstract:

Several activities were conducted during my PhD. For the NEMO experiment, a collaboration between the INFN/University groups of Catania and Bologna led to the development and production of a mixed-signal acquisition board for the NEMO Km3 telescope. The research concerned a feasibility study of an acquisition technique quite different from that adopted in the NEMO Phase-1 telescope. The DAQ board that we built uses the LIRA06 front-end chip for the analog acquisition of the anodic and dynodic sources of a PMT (Photo-Multiplier Tube). The low-power analog acquisition allows multiple PMT channels to be sampled simultaneously at different gain factors in order to increase the linearity of the signal response over a wider dynamic range. The auto-triggering and self-event-classification features also help to improve the acquisition performance and the knowledge of the neutrino event. A fully functional interface to the first-level data concentrator, the Floor Control Module, has also been integrated on the board, and specific firmware has been written to comply with the present communication protocols. This stage of the project foresees the use of an FPGA, a high-speed configurable device, to provide the board with a flexible digital logic control core. After validation of the whole front-end architecture, this feature would probably be integrated in a common mixed-signal ASIC (Application Specific Integrated Circuit). The volatile nature of the FPGA configuration memory required the integration of a flash ISP (In-System Programming) memory and a smart architecture for its safe remote reconfiguration. All the integrated features of the board have been tested. At the Catania laboratory, the behavior of the LIRA chip was investigated in the digital environment of the DAQ board, and we succeeded in driving the acquisition with the FPGA.
The PMT pulses generated with an arbitrary waveform generator were correctly triggered and acquired by the analog chip, and subsequently digitized by the on-board ADC under the supervision of the FPGA. For the communication with the data concentrator, a test bench was set up in Bologna where, thanks to a loan from Roma University and INFN, a full readout chain equivalent to that of NEMO Phase-1 was installed. These tests showed good behavior of the digital electronics, which were able to receive and execute commands issued from the PC console and to answer back with a reply. The remotely configurable logic also behaved well and demonstrated, at least in principle, the validity of this technique. A new prototype board is now under development at the Catania laboratory as an evolution of the one described above. This board is going to be deployed within the NEMO Phase-2 tower, on one of its floors dedicated to new front-end proposals. It will integrate a new analog acquisition chip called SAS (Smart Auto-triggering Sampler), thus introducing a new analog front-end while inheriting most of the digital logic present in the current DAQ board discussed in this thesis. As for the activity on high-resolution vertex detectors, I worked within the SLIM5 collaboration on the characterization of a MAPS (Monolithic Active Pixel Sensor) device called APSEL-4D. This chip is a matrix of 4096 active pixel sensors with deep N-well implantations meant for charge collection and for shielding the analog electronics from digital noise. The chip integrates the full-custom sensor matrix and the sparsification/readout logic, implemented with standard cells in STM 130 nm CMOS technology. For the chip characterization, a test beam was set up on the 12 GeV PS (Proton Synchrotron) line at CERN, Geneva (CH).
The collaboration prepared a silicon strip telescope and a DAQ system (hardware and software) for data acquisition and control of the telescope, which allowed about 90 million events to be stored in 7 equivalent days of beam live-time. My activities mainly concerned the implementation of a firmware interface to and from the MAPS chip in order to integrate it into the general DAQ system. Thereafter I worked on the DAQ software to implement a proper Slow Control interface for the APSEL4D. Several APSEL4D chips with different thinning were tested during the test beam. Those thinned to 100 and 300 µm showed an overall efficiency of about 90% at a threshold of 450 electrons. The test beam also allowed the resolution of the pixel sensor to be estimated, giving good results consistent with the pitch/sqrt(12) formula. The MAPS intrinsic resolution was extracted from the width of the residual plot, taking into account the multiple scattering effect.
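
The pitch/sqrt(12) figure is the standard deviation of a uniform distribution over one pixel width. A short derivation follows, assuming (this is not stated in the abstract) that the true hit position is uniformly distributed across a pixel of pitch p and that only the identity of the hit pixel is used. The second relation is one common way to remove multiple scattering and telescope extrapolation errors from a residual width; the symbols for those two contributions are hypothetical, and the thesis may have used a different procedure.

```latex
\sigma_{\text{pixel}}^{2}
  = \frac{1}{p}\int_{-p/2}^{p/2} x^{2}\,dx
  = \frac{p^{2}}{12}
\quad\Longrightarrow\quad
\sigma_{\text{pixel}} = \frac{p}{\sqrt{12}}

\sigma_{\text{intrinsic}}^{2}
  \approx \sigma_{\text{residual}}^{2}
  - \sigma_{\text{MS}}^{2}
  - \sigma_{\text{telescope}}^{2}
```

As a purely illustrative number, a pitch of 50 µm would give p/sqrt(12) of about 14.4 µm.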

Relevance:

80.00%

Abstract:

Underwater Passive Acoustic Monitoring (PAM) refers to the use of underwater listening and recording systems to detect, monitor, and identify sound sources through the pressure waves they produce. It is called passive because such systems only listen, without disturbing the existing acoustic environment, unlike active systems such as sonars. Underwater PAM has several areas of application, such as military surveillance systems, port security, environmental monitoring, development of population density indices for species, species identification, etc. National technology in this area is practically nonexistent despite its importance. In this context, the present work aims to contribute to the development of national technology in this field through the design, construction, and operation of autonomous PAM equipment and of signal processing methods for the automated detection of underwater acoustic events. A device named OceanPod was developed, featuring low manufacturing cost, flexibility, and ease of configuration and use, aimed at scientific and industrial research and at environmental control. Several prototypes of this device were built and used in missions at sea. These monitoring campaigns made it possible to begin building an acoustic database, which provided the raw material for testing automated, real-time acoustic event detectors. In addition, a new method for the detection and identification of acoustic events is proposed, based on statistical analysis of the time-frequency representation of acoustic signals. This new method was tested on the detection of cetaceans present in the database generated by the monitoring missions.
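
The detection method proposed in this work is not detailed in the abstract. As a much simpler stand-in, the sketch below shows a generic frame-energy detector that flags frames whose energy exceeds the median by a multiple of the median absolute deviation (MAD). The frame length, the factor `k`, and the synthetic data are assumptions for illustration only.

```python
import random
import statistics

def frame_energies(samples, frame_len):
    """Sum of squared samples in consecutive, non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def detect_events(samples, frame_len=8, k=5.0):
    """Indices of frames whose energy exceeds median + k * MAD of all frame energies."""
    energies = frame_energies(samples, frame_len)
    med = statistics.median(energies)
    mad = statistics.median([abs(e - med) for e in energies])
    return [i for i, e in enumerate(energies) if e > med + k * mad]

# Synthetic recording: low-level noise with one loud burst in frame 5.
rng = random.Random(0)
samples = [rng.gauss(0.0, 0.01) for _ in range(80)]
samples[40:48] = [1.0] * 8
events = detect_events(samples)
```

The median and MAD are used instead of the mean and standard deviation so that the loud events themselves do not inflate the threshold.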

Relevance:

80.00%

Abstract:

Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Images (videos) are represented in an input (feature) space, and similar images (videos) are obtained by finding nearest neighbors in that representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations in the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class, ideally, to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and to the unique code for all same-class images as the `Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, where the Hamming distance is computed only to the class binary codes. We also propose a deep semantic binary code model, obtained by replacing the output layer of a popular convolutional neural network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes.
For a given query image, we want to retrieve images in a way that preserves the relative order, i.e. we want to retrieve all same-class images first and then images of related classes before images of different classes. We learn such relationship-aware binary codes by minimizing the discrepancy between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related-class retrieval results and show significant gains over the state of the art. High-dimensional descriptors like Fisher Vectors or Vectors of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications, including retrieval. In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors into high-dimensional binary codes to reduce storage complexity. In this approach, we deviate from traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high-dimensional vectors. A practical hierarchical model is presented that uses divide-and-conquer techniques, based on the Random Select and Adjust (RSA) procedure, to compress such high-dimensional vectors. We show that our proposed high-dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods at higher compression ratios. In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting where no training videos are available for the event.
To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.
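
The idea of comparing a query only against one binary code per class, rather than against every database item, can be sketched as follows. The 8-bit codes, class names, and function names are hypothetical, and this is only a minimal illustration of Hamming-distance ranking, not the metric defined in the thesis.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def rank_classes(query_code, class_codes):
    """Class names ordered by Hamming distance from the query code (closest first)."""
    return sorted(class_codes, key=lambda name: hamming(query_code, class_codes[name]))

# Hypothetical 8-bit class binary codes.
class_codes = {"cat": 0b10110010, "car": 0b01001101, "tree": 0b11100001}
query = 0b10110110  # one bit away from the "cat" code
ranking = rank_classes(query, class_codes)
```

With this layout the number of distance computations grows with the number of classes instead of the number of stored images, which is where the reported speed-up for large databases would come from.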

Relevance:

80.00%

Abstract:

Acoustic comfort in dwellings is essential for peaceful and restorative rest. This comfort is obtained mainly by reducing noise and increasing the sound insulation of separating elements. Rating this comfort requires a global analysis of the building that considers factors internal and external to the dwelling, that is, the acoustics of the surroundings (Vicinity), of the building (Building), and of the dwelling unit (Lodging). To this end, the “LNEC method for evaluation and acoustic quality classification of residential buildings” was developed, which allows an overall evaluation of acoustic comfort in dwellings. This method, innovative in Portugal, yields an LNEC Acoustic Class that represents, with reasonable reliability, the acoustic comfort actually perceived and the acoustic quality of the dwelling. The method can be applied to new buildings and to buildings undergoing rehabilitation. Estimating the average cost of moving between given classes (and achieving the corresponding acoustic comfort) requires an appropriate calculation tool. This communication therefore presents a tool for estimating the cost of transition between acoustic classes. The methodology supports better-informed choices when pursuing a given level of acoustic comfort.

Relevance:

40.00%

Abstract:

Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environmental studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate the desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experimental results show that syllable features can achieve an average classification accuracy of 90.5%, which outperforms Mel-frequency cepstral coefficient features (79.0%).
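
The final classification step can be illustrated with a minimal k-nearest neighbor sketch. The two-dimensional feature vectors, the species labels, and the choice of k = 3 with Euclidean distance are assumptions for this example; the abstract does not give these details, and the principal component analysis step is omitted here.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Majority vote among the k closest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, already normalised (duration, dominant frequency) features.
train = [
    ((0.20, 0.30), "species_a"), ((0.25, 0.35), "species_a"), ((0.22, 0.28), "species_a"),
    ((0.80, 0.70), "species_b"), ((0.75, 0.72), "species_b"), ((0.85, 0.65), "species_b"),
]
predicted = knn_predict(train, (0.24, 0.31))
```

Features measured in different units, such as seconds and hertz, should be scaled before distances are computed; otherwise the largest-valued feature dominates the vote.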

Relevance:

40.00%

Abstract:

Frog protection has become increasingly essential due to the rapid decline in frog biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency, and oscillation rate are directly extracted from the track. With the k-means clustering algorithm, the calculated dominant frequencies of all frog species are clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experimental results show that the proposed features can achieve an average classification accuracy of 97.45%, which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficient (MFCC) features (90.80%).
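
The clustering of dominant frequencies can be sketched with a one-dimensional k-means. How the paper turns clusters into a frequency scale is not described in the abstract; taking the midpoints between neighbouring cluster centres as band edges is an assumption made here, as are the frequency values and the initialisation scheme.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain 1-D k-means (Lloyd's algorithm); returns the sorted cluster centres."""
    values = sorted(values)
    # Initialise the centres by spreading them evenly over the sorted values.
    centres = [values[(2 * i + 1) * len(values) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        new_centres = [sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)

# Hypothetical dominant frequencies in Hz.
dominant_freqs = [480, 500, 520, 1900, 2000, 2100, 3900, 4000, 4100]
centres = kmeans_1d(dominant_freqs, k=3)
band_edges = [(a + b) / 2 for a, b in zip(centres, centres[1:])]
```

The resulting edges split the spectrum into bands that are narrow where the species' dominant frequencies concentrate, which is the intuition behind an adaptive frequency scale.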

Relevance:

40.00%

Abstract:

Over the past few decades, frog species have been experiencing dramatic decline around the world. The reasons for this decline include habitat loss, invasive species, climate change, and so on. To better understand the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs by family, genus, and species, covering three families, eleven genera, and eighty-five species collected from Queensland, Australia. For each frog species, six instances are selected, from which ten acoustic features are calculated. Then, the multicollinearity among the ten features is studied in order to select non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machine (SVM) classifier is used for the multi-level classification, with the three most important acoustic features at each level. Our experimental results indicate that using different acoustic feature sets can successfully classify frogs at different levels, and the average classification accuracy can be up to 85.6%, 86.1%, and 56.2% for family, genus, and species, respectively.
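
One simple way to screen features for multicollinearity is to drop any feature that is strongly correlated with one already kept. The sketch below does this greedily with the Pearson correlation; the 0.9 threshold, the feature names, and the values are hypothetical, and the study may have used a different criterion.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_uncorrelated(features, threshold=0.9):
    """features: dict of name -> values. Keeps a feature only if |r| < threshold with all kept ones."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical measurements over six call instances.
features = {
    "duration_s": [1, 2, 3, 4, 5, 6],
    "duration_ms": [1000, 2000, 3000, 4000, 5000, 6000],  # redundant with duration_s
    "dominant_freq": [3, 1, 4, 1, 5, 9],
}
selected = select_uncorrelated(features)
```

A constant feature has zero variance and would cause a division by zero in `pearson`, so such columns should be removed beforehand.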

Relevance:

40.00%

Abstract:

Acoustics is a rich source of environmental information that can reflect ecological dynamics. To deal with escalating volumes of acoustic data, a variety of automated classification techniques have been used for acoustic pattern or scene recognition, including urban soundscapes such as streets and restaurants, and natural soundscapes such as rain and thunder. It is common to classify acoustic patterns under the assumption that a single type of soundscape is present in an audio clip. This assumption is reasonable for some carefully selected audio clips. However, only a few experiments have focused on classifying simultaneous acoustic patterns in long-duration recordings. This paper proposes a binary relevance based multi-label classification approach to recognise simultaneous acoustic patterns in one-minute audio clips. By utilising acoustic indices as global features and a multilayer perceptron as the base classifier, we achieve good classification performance on in-the-field data. Compared with single-label classification, the multi-label classification approach provides more detailed information about the distributions of various acoustic patterns in long-duration recordings. These results will merit further biodiversity investigations, such as bird species surveys.
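
Binary relevance reduces a multi-label problem to one independent yes/no classifier per label. The sketch below shows that decomposition; to stay short it swaps the multilayer perceptron for a tiny nearest-centroid base learner, and the two-dimensional "acoustic index" features and label names are hypothetical.

```python
import math

class CentroidBinaryClassifier:
    """Stand-in base learner: predicts 1 if the sample is closer to the positive centroid."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos = [sum(col) / len(pos) for col in zip(*pos)]
        self.neg = [sum(col) / len(neg) for col in zip(*neg)]
        return self

    def predict(self, x):
        return int(math.dist(x, self.pos) < math.dist(x, self.neg))

def fit_binary_relevance(X, Y, labels):
    """Y: one set of labels per sample. Trains one independent binary model per label."""
    return {lab: CentroidBinaryClassifier().fit(X, [int(lab in ys) for ys in Y])
            for lab in labels}

def predict_labels(models, x):
    """Set of labels whose binary model fires for sample x."""
    return {lab for lab, model in models.items() if model.predict(x)}

# Hypothetical clips described by two acoustic indices.
X = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8),
     (0.9, 0.9), (0.8, 0.8), (0.1, 0.1), (0.2, 0.2)]
Y = [{"birds"}, {"birds"}, {"rain"}, {"rain"},
     {"birds", "rain"}, {"birds", "rain"}, set(), set()]
models = fit_binary_relevance(X, Y, ["birds", "rain"])
both = predict_labels(models, (0.85, 0.85))
neither = predict_labels(models, (0.15, 0.15))
```

Each label needs at least one positive and one negative training example for this base learner to work. Binary relevance also ignores correlations between labels, which is the usual trade-off for its simplicity.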

Relevance:

40.00%

Abstract:

Serial Analysis of Gene Expression (SAGE) is a relatively new method for monitoring gene expression levels and is expected to contribute significantly to the progress in cancer treatment by enabling a precise and early diagnosis. A promising application of SAGE gene expression data is classification of tumors. In this paper, we build three event models (the multivariate Bernoulli model, the multinomial model and the normalized multinomial model) for SAGE data classification. Both binary classification and multicategory classification are investigated. Experiments on two SAGE datasets show that the multivariate Bernoulli model performs well with small feature sizes, but the multinomial performs better at large feature sizes, while the normalized multinomial performs well with medium feature sizes. The multinomial achieves the highest overall accuracy.
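
The difference between the multivariate Bernoulli and multinomial event models can be seen in a small naive Bayes sketch: the Bernoulli model uses only whether a tag is present, while the multinomial model uses how often it occurs. The tag counts, class names, and Laplace smoothing constant below are assumptions for illustration, and the normalized multinomial variant is not shown.

```python
import math

def train_multinomial(docs, labels, n_features, alpha=1.0):
    """Returns {label: (log_prior, [log P(feature | label)])} from count vectors."""
    model = {}
    for c in set(labels):
        rows = [d for d, l in zip(docs, labels) if l == c]
        totals = [sum(r[j] for r in rows) for j in range(n_features)]
        denom = sum(totals) + alpha * n_features
        model[c] = (math.log(len(rows) / len(docs)),
                    [math.log((t + alpha) / denom) for t in totals])
    return model

def predict_multinomial(model, doc):
    """Counts weight the per-feature log-probabilities."""
    return max(model, key=lambda c: model[c][0]
               + sum(n * lp for n, lp in zip(doc, model[c][1])))

def train_bernoulli(docs, labels, n_features, alpha=1.0):
    """Returns {label: (log_prior, [P(feature present | label)])}; counts are binarised."""
    model = {}
    for c in set(labels):
        rows = [d for d, l in zip(docs, labels) if l == c]
        probs = [(sum(1 for r in rows if r[j] > 0) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[c] = (math.log(len(rows) / len(docs)), probs)
    return model

def predict_bernoulli(model, doc):
    """Absent features contribute log(1 - p), unlike in the multinomial model."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(math.log(p if x > 0 else 1.0 - p)
                               for x, p in zip(doc, probs))
    return max(model, key=score)

# Hypothetical SAGE-like tag counts for three tags.
docs = [[5, 0, 1], [4, 1, 0], [0, 6, 2], [1, 5, 3]]
labels = ["tumor", "tumor", "normal", "normal"]
query = [3, 0, 1]
multinomial_guess = predict_multinomial(train_multinomial(docs, labels, 3), query)
bernoulli_guess = predict_bernoulli(train_bernoulli(docs, labels, 3), query)
```

On this toy data both models agree; they tend to diverge when abundance carries information that presence alone does not, which is consistent with the feature-size effects reported in the abstract.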

Relevance:

40.00%

Abstract:

This paper proposes a method for the real-time detection and classification of multiple events in an electrical power system, namely islanding, high-frequency events (loss of load), and low-frequency events (loss of generation). The method is based on principal component analysis of frequency measurements and employs a moving window approach to cope with the time-varying nature of power systems, thereby increasing overall situational awareness of the power system. Numerical case studies using both real data collected from the UK power system and simulated cases constructed using DigSilent PowerFactory, covering islanding events as well as loss of load and generation dip events, are used to demonstrate the reliability of the proposed method.

Relevance:

40.00%

Abstract:

In the last few years, the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how good a voice is when the application is a speech-based interface. In this paper we present a new automatic voice pleasantness classification system based on prosodic and acoustic patterns of voice preference. Our study is based on a multi-language database composed of female voices. In the objective performance evaluation, the system achieved a 7.3% error rate.

Relevance:

40.00%

Abstract:

Condition monitoring of wooden railway sleepers is generally carried out by visual inspection and, if necessary, some impact acoustic examination is carried out intuitively by skilled personnel. In this work, a pattern recognition solution is proposed to automate the process and achieve robust results. The study presents a comparison of several pattern recognition techniques together with various nonstationary feature extraction techniques for the classification of impact acoustic emissions. Pattern classifiers such as the multilayer perceptron, learning vector quantization, and Gaussian mixture models are combined with nonstationary feature extraction techniques such as the Short Time Fourier Transform, Continuous Wavelet Transform, Discrete Wavelet Transform, and Wigner-Ville Distribution. Due to the presence of several different feature extraction and classification techniques, data fusion has been investigated. Data fusion in the current case has mainly been investigated at two levels: the feature level and the classifier level. Fusion at the feature level demonstrated the best results, with an overall accuracy of 82% when compared to the human operator.
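
The two fusion levels can be contrasted with a minimal sketch. The abstract does not say how fusion was performed, so concatenation for feature-level fusion and majority voting for classifier-level fusion are assumed here as common choices, and the feature values and labels are hypothetical.

```python
from collections import Counter

def feature_level_fusion(*feature_vectors):
    """Concatenates feature vectors from different extractors into one input vector."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused

def classifier_level_fusion(predictions):
    """Majority vote over the labels predicted by several classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical features from two extractors for one impact sound.
stft_features = [0.12, 0.40]
dwt_features = [0.90, 0.05, 0.33]
fused_input = feature_level_fusion(stft_features, dwt_features)

# Hypothetical decisions from three classifiers for the same sound.
decision = classifier_level_fusion(["good", "bad", "good"])
```

Feature-level fusion lets a single classifier exploit interactions between feature sets, while classifier-level fusion combines decisions after each model has been trained separately.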