858 results for data pre-processing
Abstract:
In this work, an adaptive modeling and spectral estimation scheme based on dual discrete Kalman filtering (DKF) is proposed for speech enhancement. Both the speech and noise signals are modeled by an autoregressive structure, which provides an underlying time-frame dependency and improves time-frequency resolution. The model parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. Speech enhancement is performed by a dual discrete Kalman filter that simultaneously gives estimates for the models and the signals. This approach is particularly useful as a pre-processing module for parametric speech recognition systems that rely on spectral time-dependent models. System performance was evaluated by a set of human listeners and by spectral distances. In both cases, the use of this pre-processing module led to improved results.
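As a rough illustration of the filtering idea in the abstract above, the following Python sketch runs a scalar Kalman filter on a first-order autoregressive signal observed in additive white noise. It is a simplification under stated assumptions, not the authors' dual scheme: the AR coefficient `a` and the variances `q` and `r` are taken as known, whereas the abstract describes estimating the model parameters jointly with the signal. The function names `kalman_ar1`, `simulate` and `mean_squared_error` are hypothetical.

```python
import random


def kalman_ar1(observations, a, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the assumed model
    x[k] = a * x[k-1] + w[k],  y[k] = x[k] + v[k],
    with process-noise variance q and observation-noise variance r."""
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Prediction step: propagate the state estimate and its variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update step: blend the prediction with the new observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates


def simulate(n, a, q, r, seed=0):
    """Generate a synthetic AR(1) signal and a noisy observation of it."""
    rng = random.Random(seed)
    x = 0.0
    clean, noisy = [], []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, q ** 0.5)
        clean.append(x)
        noisy.append(x + rng.gauss(0.0, r ** 0.5))
    return clean, noisy


def mean_squared_error(u, v):
    """Average squared difference between two equal-length sequences."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)
```

Calling `simulate(2000, a=0.95, q=0.1, r=1.0)` and passing the noisy sequence to `kalman_ar1` with the same parameters should give estimates closer to the clean signal than the raw observations are, which can be checked with `mean_squared_error`.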
Abstract:
Project work submitted to the Instituto de Contabilidade e Administração do Porto for the degree of Master in Specialized Translation and Interpreting, under the supervision of Dr. Sara Cerqueira Pascoal.
Abstract:
The Ministère des Ressources Naturelles et de la Faune (MRNF) commissioned the Montréal geomatics company SYNETIX inc. and the remote sensing laboratory of the Université de Montréal to develop an application dedicated to the automatic detection and updating of the road network on 1:20,000-scale topographic maps from high-spatial-resolution optical imagery. To this end, the contractors undertook to adapt the SIGMA0 software package, which they had jointly developed for map updating from satellite images with a resolution of about 5 metres. The product derived from SIGMA0 was a module named SIGMA-ROUTES, whose road detection principle relies on sweeping a filter along the road vectors of the existing map. The filter responses on very-high-resolution colour images of great radiometric complexity (aerial photographs) lead to the assignment of labels to the identified road segments according to their state: intact, suspect, disappeared or new. The general objective of this project is to evaluate the accuracy of the status assignment by quantifying performance on the basis of the total distances detected in agreement with the reference, and by carrying out a spatial analysis of the inconsistencies. The sequence of tests first targets the effect of resolution on the conformity rate and, second, the gains expected from a succession of enhancement treatments intended to make these images more suitable for road network extraction. The overall approach first involves characterizing a test site in the Sherbrooke region comprising 40 km of roads of various categories, ranging from wooded trails to wide collector roads, over an area of 2.8 km².
A ground-truth map of the transport routes allowed us to establish reference data from a visual detection, against which the SIGMA-ROUTES detection results are compared. Our results confirm that the radiometric complexity of high-resolution images in urban areas benefits from pre-processing such as segmentation and histogram compensation, which make road surfaces more uniform. We also find that performance is hypersensitive to variations in resolution: moving between our three resolutions (84, 168 and 210 cm) alters the detection rate by nearly 15% of the total distances in agreement with the reference, and spatially splits long intact vectors into several portions alternating between the intact, suspect and disappeared statuses. Detection of existing roads in agreement with the reference reached 78% with our most effective combination of resolution and image pre-processing. Chronic detection problems were identified, including several segments left unassigned and ignored by the process. There is also an overestimation of false detections labelled suspect when they should be identified as intact. We estimate, based on the linear measurements and spatial analyses of the detections, that assignment of the intact status should reach 90% conformity with the reference after various adjustments to the algorithm. Detection of new roads was a failure regardless of resolution or image enhancement. The search for new segments, which relies on locating potential starting points of new roads connected to existing roads, generates a runaway of false detections wandering among non-road entities. In connection with these inconsistencies, we isolated numerous false detections of new roads generated parallel to roads previously assigned as intact.
Finally, we suggest a procedure that takes advantage of certain enhanced images while integrating human intervention at a few pivotal phases of the process.
Abstract:
Statistical machine translation systems are tasked with translating from a source language into a target language. In most baseline translation systems, the basic unit considered in textual analysis is the surface form as observed in a text. Such a design yields good performance when translating between two morphologically poor languages. However, this no longer holds when translating into a morphologically rich (or complex) language. The goal of our work is to develop a statistical machine translation system as a solution to the challenges raised by morphological complexity. In this thesis, we first examine a number of methods regarded as extensions to traditional translation systems and evaluate their performance. This evaluation is carried out against state-of-the-art systems (the baseline system) on English-Inuktitut and English-Finnish translation tasks. We then develop a new segmentation algorithm that takes into account information from the language pair being translated. This segmentation algorithm is then integrated into the phrase-based translation model ("Phrase-Based Models") to form our segment-sequence-based translation system. Finally, we combine the resulting system with post-processing algorithms to obtain a complete translation system. The results of the experiments carried out in this thesis show that the proposed segment-sequence-based translation system yields significant improvements in translation quality in terms of the BLEU evaluation metric (Papineni et al., 2002), which is used for evaluation.
More specifically, our segmentation approach slightly improves translation quality over the baseline system, and a significant improvement in translation quality is observed relative to basic pre-processing techniques (baseline).
Abstract:
This paper outlines a methodology for translating text from English into the Dravidian language Malayalam using statistical models. By using a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase, the machine automatically generates Malayalam translations of English sentences. The paper also discusses a technique to improve the alignment model by incorporating part-of-speech information into the bilingual corpus. Removing insignificant alignments from the sentence pairs with this approach has ensured better training results. Pre-processing techniques such as suffix separation in the Malayalam corpus and stop-word elimination in the bilingual corpus also proved effective in training. Various handcrafted rules designed for the suffix separation process, which can be used as a guideline for implementing suffix separation in Malayalam, are also presented. The structural difference between the English-Malayalam pair is resolved in the decoder by applying order conversion rules. Experiments conducted on a sample corpus generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
Abstract:
This paper discusses a methodology for translating text from English into the Dravidian language Malayalam using statistical models. The translator uses a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase and automatically generates the Malayalam translation of an unseen English sentence. Various techniques to improve the alignment model by incorporating morphological inputs into the bilingual corpus are discussed. Removing insignificant alignments from the sentence pairs with this approach has ensured better training results. Pre-processing techniques such as suffix separation in the Malayalam corpus and stop-word elimination in the bilingual corpus also proved effective in producing better alignments. Difficulties in the translation process that arise from the structural difference between the English-Malayalam pair are resolved in the decoding phase by applying order conversion rules. The handcrafted rules designed for the suffix separation process, which can be used as a guideline for implementing suffix separation in Malayalam, are also presented. Experiments conducted on a sample corpus generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics.
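Of the evaluation metrics named in the abstract above, the word error rate (WER) is the simplest to make concrete: it is the word-level edit distance between a reference translation and a system output, divided by the number of reference words. The Python sketch below follows that standard definition; the function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, and nothing here is taken from the paper's own evaluation setup.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions)
    divided by the number of words in the reference.
    Tokenization by whitespace is an assumption of this sketch."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" involves one substitution and one deletion over six reference words, so the function returns 2/6.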
Abstract:
Presently, different audio watermarking methods are available; most of them are inclined towards copyright protection and copy protection. This is the key motive for developing a speaker verification scheme that guarantees non-repudiation services, and this thesis is its outcome. The research presented in this thesis scrutinizes the field of audio watermarking, and the outcome is a speaker verification scheme that is proficient in addressing issues allied to non-repudiation to a great extent. This work aimed at developing novel audio watermarking schemes utilizing the fundamental ideas of the Fast Fourier Transform (FFT) or the Fast Walsh-Hadamard Transform (FWHT). The Mel-Frequency Cepstral Coefficients (MFCC), the best parametric representation of the acoustic signals, along with a few other key acoustic characteristics, are employed in crafting the new schemes. The audio watermark created is entirely dependent on the acoustic features, hence it is named FeatureMark, and it is crucial in this work. In any watermarking scheme, the quality of the extracted watermark depends exclusively on the pre-processing action, and in this work framing and windowing techniques are involved. The theme of non-repudiation has immense significance in the audio watermarking schemes proposed in this work. Modification of the signal spectrum is achieved in a variety of ways by selecting appropriate FFT/FWHT coefficients, and the watermarking schemes were evaluated for imperceptibility, robustness and capacity characteristics. The proposed schemes are unequivocally effective in terms of maintaining the sound quality, retrieving the embedded FeatureMark and in terms of the capacity to hold the mark bits. The robust nature of these marking schemes is achieved with the help of synchronization codes such as the Barker code with the FFT-based FeatureMarking scheme and the Walsh code with the FWHT-based FeatureMarking scheme.
Another important feature associated with this scheme is the employment of an encryption scheme in the preparation of its FeatureMark, which scrambles the signal features and helps to keep them unrevealed. A comparative study with the existing watermarking schemes and the experiments to evaluate imperceptibility, robustness and capacity guarantee that the proposed schemes can be baselined as efficient audio watermarking schemes. The four new digital audio watermarking algorithms are remarkable in terms of their performance, thereby opening more opportunities for further research.
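The framing and windowing pre-processing mentioned in the abstract above can be sketched in a few lines of Python. The abstract does not say which window, frame length or hop size the thesis uses, so the Hamming window and the parameters `frame_len` and `hop` below are assumptions chosen purely for illustration; the function names are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n (standard 0.54 / 0.46 coefficients)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames and taper each frame.

    Trailing samples that do not fill a whole frame are dropped,
    which is a simplification of this sketch."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames
```

With a 10-sample signal, `frame_len=4` and `hop=2`, this yields four frames with 50% overlap; each windowed frame would then be passed to a transform such as the FFT or FWHT before any coefficients are selected.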
Abstract:
The European market for organic food has grown strongly since the 1990s. This was favoured by the introduction of EU Regulation 2092/91 on the certification of organic products and by the payment of subsidies to farmers willing to convert. By the end of the 1990s, these measures had led to an oversupply of some organic products at the European level. Consumer demand did not rise to the same extent as supply, and the need to improve market balance became apparent. This need was formulated by the European Commission in 2004 in the first "European Action Plan for Organic Food and Farming". As a prerequisite for more even market growth, the action plan calls for the creation of a more transparent market through the collection of statistical data on the production and consumption of organic products. Implementation of the action plan has so far been unsatisfactory, however, as there is still no uniform data collection for the organic sector at EU level. The aim of this study is to find appropriate methods for collecting, processing and analysing organic market data. Suitable data sources are identified, and the study examines how the collected data can be checked for plausibility. To this end, an extensive data set on the organic market is analysed, which was collected within the EU research project "Organic Marketing Initiatives and Rural Development" (OMIaRD) and covers all EU-15 countries as well as the Czech Republic, Slovenia, Norway and Switzerland. Data for the following organic product groups are examined: cereals, potatoes, vegetables, fruit, milk, beef, sheep and goat meat, pork, poultry meat and eggs.
A central approach of this study is the compilation of organic supply balance sheets, which provide a summary overview of supply and demand for the respective product groups. The following key variables are examined: organic production, organic sales, organic consumption, organic foreign trade, organic producer prices and organic consumer prices. In addition, the organic market data are set in relation to the corresponding figures for the total market (organic plus conventional) in order to assess the importance of the organic sector at product and country level. Primary and secondary research are used for data collection. As secondary sources, publications from market research institutes, organic producer associations and scientific institutes are evaluated. Empirical data on the organic market are collected through extensive interviews with market experts in all participating countries. The data are examined using correlation and regression analyses, and hypotheses about presumed relationships between key variables of the organic market are tested. The database of this study refers to a single year and thus represents a snapshot of the organic market situation in the EU. To enable market actors to predict future market trends, the establishment of an EU-wide organic market data collection system is called for. This requires harmonised data collection in all EU countries according to uniform standards. The compilation of market data for the organic sector should be compatible with the methods and variables of the existing Eurostat database for the agricultural market as a whole (organic plus conventional). An annually updated organic market database would increase the transparency of the organic market and facilitate the future development of the organic sector.
Abstract:
Rendering realistic animation is known to be an expensive processing task when physically based global illumination methods are used to improve illumination details. This paper presents an acceleration technique for computing animations in radiosity environments. The technique is based on an interpolation approach that exploits temporal coherence in radiosity. A fast global Monte Carlo pre-processing step is introduced into the computation of the whole animated sequence to select important frames. These are fully computed and used as a base for the interpolation of the whole sequence. The approach is completely view-independent. Once the illumination is computed, it can be visualized by any animated camera. The results show significant speed-ups, indicating that the technique could be an interesting alternative to deterministic methods for computing non-interactive radiosity animations for moderately complex scenarios.
Abstract:
Objective. This study investigated whether trait positive schizotypy or trait dissociation was associated with increased levels of data-driven processing and symptoms of post-traumatic distress following a road traffic accident. Methods. Forty-five survivors of road traffic accidents were recruited from a London Accident and Emergency service. Each completed measures of trait positive schizotypy, trait dissociation, data-driven processing, and post-traumatic stress. Results. Trait positive schizotypy was associated with increased levels of data-driven processing and post-traumatic symptoms during a road traffic accident, whereas trait dissociation was not. Conclusions. Previous results which report a significant relationship between trait dissociation and post-traumatic symptoms may be an artefact of the relationship between trait positive schizotypy and trait dissociation.
Abstract:
In this paper, a forward-looking infrared (FLIR) video surveillance system is presented for preventing moving ships from colliding with bridge piers. An image pre-processing algorithm is proposed to reduce clutter noise by multi-scale fractal analysis, in which the blanket method is used for fractal feature computation. The moving-ship detection algorithm is then developed from image differences of the fractal feature in the surveillance region between frames taken at regular intervals. Experimental results show that the approach is feasible and effective. It achieves real-time and reliable alerts to avoid collisions of moving ships with bridge piers.
Abstract:
This paper presents a clocking pipeline technique referred to as a single-pulse pipeline (PP-Pipeline) and applies it to the problem of mapping pipelined circuits to a Field Programmable Gate Array (FPGA). A PP-pipeline replicates the operation of asynchronous micropipelined control mechanisms using synchronous-orientated logic resources commonly found in FPGA devices. Consequently, circuits with an asynchronous-like pipeline operation can be efficiently synthesized using a synchronous design methodology. The technique can be extended to include data-completion circuitry to take advantage of variable data-completion processing time in synchronous pipelined designs. It is also shown that the PP-pipeline reduces the clock tree power consumption of pipelined circuits. These potential applications are demonstrated by post-synthesis simulation of FPGA circuits. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
There is growing interest in the ways in which the location of a person can be utilized by new applications and services. Recent advances in mobile technologies have meant that the technical capability to record and transmit location data for processing is appearing in off-the-shelf handsets. This opens possibilities to profile people based on the places they visit, people they associate with, or other aspects of their complex routines determined through persistent tracking. It is possible that services offering customized information based on the results of such behavioral profiling could become commonplace. However, it may not be immediately apparent to the user that a wealth of information about them, potentially unrelated to the service, can be revealed. Further issues occur if the user agreed, while subscribing to the service, for data to be passed to third parties where it may be used to their detriment. Here, we report in detail on a short case study tracking four people, in three European member states, persistently for six weeks using mobile handsets. The GPS locations of these people have been mined to reveal places of interest and to create simple profiles. The information drawn from the profiling activity ranges from intuitive through special cases to insightful. In this paper, these results and further extensions to the technology are considered in light of European legislation to assess the privacy implications of this emerging technology.
Abstract:
The long observational record is critical to our understanding of the Earth’s climate, but most observing systems were not developed with a climate objective in mind. As a result, tremendous efforts have gone into assessing and reprocessing the data records to improve their usefulness in climate studies. The purpose of this paper is both to review recent progress in reprocessing and reanalyzing observations and to summarize the challenges that must be overcome in order to improve our understanding of climate and variability. Reprocessing improves data quality through more scrutiny and improved retrieval techniques for individual observing systems, while reanalysis merges many disparate observations with models through data assimilation, yet both aim to provide a climatology of Earth processes. Many challenges remain, such as tracking the improvement of processing algorithms and limited spatial coverage. Reanalyses have fostered significant research, yet reliable global trends in many physical fields are not yet attainable, despite significant advances in data assimilation and numerical modeling. Oceanic reanalyses have made significant advances in recent years, but will only be discussed here in terms of progress toward integrated Earth system analyses. Climate data sets are generally adequate for process studies and large-scale climate variability. Communication of the strengths, limitations and uncertainties of reprocessed observations and reanalysis data, not only among the community of developers, but also with the extended research community, including the new generations of researchers and the decision makers, is crucial for further advancement of the observational data records. It must be emphasized that careful investigation of the data and processing methods is required to use the observations appropriately.
Abstract:
The analysis step of the (ensemble) Kalman filter is optimal when (1) the distribution of the background is Gaussian, (2) state variables and observations are related via a linear operator, and (3) the observational error is of additive nature and has Gaussian distribution. When these conditions are largely violated, a pre-processing step known as Gaussian anamorphosis (GA) can be applied. The objective of this procedure is to obtain state variables and observations that better fulfil the Gaussianity conditions in some sense. In this work we analyse GA from a joint perspective, paying attention to the effects of transformations in the joint state variable/observation space. First, we study transformations for state variables and observations that are independent of each other. Then, we introduce a targeted joint transformation with the objective to obtain joint Gaussianity in the transformed space. We focus primarily on the univariate case, and briefly comment on the multivariate one. A key point of this paper is that, when (1)-(3) are violated, using the analysis step of the EnKF will not recover the exact posterior density in spite of any transformations one may perform. These transformations, however, provide approximations of different quality to the Bayesian solution of the problem. Using an example in which the Bayesian posterior can be analytically computed, we assess the quality of the analysis distributions generated after applying the EnKF analysis step in conjunction with different GA options. The value of the targeted joint transformation is particularly clear for the case when the prior is Gaussian, the marginal density for the observations is close to Gaussian, and the likelihood is a Gaussian mixture.
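One common way to realise a univariate Gaussian anamorphosis is an empirical normal-score transform: each ensemble member is replaced by the standard-normal quantile of its empirical rank. The Python sketch below shows that marginal construction only, as an assumed illustration of the pre-processing idea; it is not the targeted joint transformation introduced in the paper, and the function name `gaussian_anamorphosis` and the plotting position `(rank + 0.5) / n` are choices made for this sketch.

```python
from statistics import NormalDist


def gaussian_anamorphosis(sample):
    """Empirical normal-score transform of a univariate sample.

    Each value is mapped to the standard-normal quantile of its
    empirical rank, so the transformed sample has an approximately
    Gaussian marginal. Ties are broken by original position, which
    is a simplification of this sketch."""
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    transformed = [0.0] * n
    for rank, index in enumerate(order):
        # Plotting position strictly inside (0, 1) so the quantile is finite.
        probability = (rank + 0.5) / n
        transformed[index] = standard_normal.inv_cdf(probability)
    return transformed
```

Applied to a strongly skewed ensemble such as `[1.0, 2.0, 4.0, 8.0, 100.0]`, the transform preserves the ordering of the members while producing values symmetric about zero. In a filtering context the analysis step would be carried out in the transformed space and the result mapped back with the inverse transform, which is not shown here.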