Abstract:
The identification of people by measuring traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis focuses on improving fingerprint recognition systems with respect to three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. Effective extraction of salient fingerprint features depends on the quality of the input fingerprint: if the fingerprint is very noisy, no reliable set of features can be detected. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands, like a wildfire, toward low-quality regions. A precise estimation of the orientation field greatly simplifies the estimation of other fingerprint features (singular points, minutiae) and improves the performance of a fingerprint recognition system. Fingerprint orientation extraction is improved along two directions. First, after introducing a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, by highlighting the role of pre- and post-processing, we show how the extraction can be improved. Second, a new hybrid orientation extraction method, which follows an adaptive scheme, significantly improves orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules, so an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that determine actual progress in the state of the art. The lack of a publicly available framework for comparing fingerprint orientation extraction algorithms motivates the introduction of a new benchmark area called FOE (including fingerprints and manually marked orientation ground truth), along with fingerprint matching benchmarks, in the FVC-onGoing framework. The success of this framework is demonstrated by relevant statistics: more than 1450 submitted algorithms and two international competitions.
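Since the thesis builds on baseline orientation extraction methods, the classic squared-gradient scheme is worth sketching. The snippet below is a minimal illustration of that baseline only, not the thesis's hybrid adaptive method; the block size and coherence measure are standard textbook choices assumed here for illustration.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def orientation_field(img, block=16):
    """Baseline squared-gradient orientation estimation (illustrative sketch;
    block=16 is an assumed, typical window size)."""
    gx = sobel(img.astype(float), axis=1)
    gy = sobel(img.astype(float), axis=0)
    # Doubled-angle representation: opposite gradient vectors reinforce.
    gxx = uniform_filter(gx * gx, block)
    gyy = uniform_filter(gy * gy, block)
    gxy = uniform_filter(gx * gy, block)
    # Ridge orientation is orthogonal to the averaged gradient direction.
    theta = 0.5 * np.arctan2(2.0 * gxy, gxx - gyy) + np.pi / 2.0
    # Coherence in [0, 1]: low values flag the noisy regions that
    # adaptive/contextual methods must handle specially.
    coherence = np.sqrt((gxx - gyy) ** 2 + 4 * gxy ** 2) / (gxx + gyy + 1e-12)
    return theta, coherence
```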
Abstract:
The work presented in this thesis focuses on the open-ended coaxial-probe frequency-domain reflectometry technique for measuring the complex permittivity of dispersive dielectric multilayer materials at microwave frequencies. An effective dielectric model is introduced and validated to extend the applicability of this technique to multilayer materials in an on-line system context. In addition, the thesis presents: 1) a numerical study of imperfect contact at the probe-material interface, 2) a review of the available models and techniques, and 3) a new classification of the extraction schemes, with guidelines on how they can be used to improve the overall performance of the probe according to the problem requirements.
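As an illustration of what an extraction scheme for such a probe can look like, the sketch below inverts a measured reflection coefficient through the simplest lumped-capacitance aperture model. This is a toy under stated assumptions: C0, Cf and Z0 are placeholder calibration constants, and the thesis's effective multilayer model is not reproduced.

```python
import numpy as np

def permittivity_from_gamma(gamma, freq, C0=2.2e-14, Cf=1.0e-14, Z0=50.0):
    """First-order lumped-capacitance extraction (illustrative only).
    gamma : complex reflection coefficient measured at the probe aperture
    freq  : frequency in Hz
    C0/Cf : assumed aperture/fringe capacitances (calibration constants)
    """
    omega = 2 * np.pi * np.asarray(freq)
    # Aperture admittance from the measured reflection coefficient.
    Y = (1.0 / Z0) * (1.0 - gamma) / (1.0 + gamma)
    # Lumped model Y = j*omega*(Cf + eps_r*C0), solved for complex eps_r.
    eps_r = (Y / (1j * omega) - Cf) / C0
    return eps_r
```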
Abstract:
Apart from the article forming the main content, most HTML documents on the WWW contain additional contents such as navigation menus, design elements or commercial banners. Several applications require distinguishing main from additional content automatically. Content extraction and template detection are the two approaches to this task. This thesis gives an extensive overview of existing algorithms from both areas. It contributes an objective way to measure and evaluate the performance of content extraction algorithms under different aspects. These evaluation measures allow the first objective comparison of existing extraction solutions. The newly introduced content code blurring algorithm overcomes several drawbacks of previous approaches and proves to be the best content extraction algorithm at the moment. An analysis of methods to cluster web documents according to their underlying templates is the third major contribution of this thesis. In combination with a localised crawling process, this clustering analysis can be used to automatically create sets of training documents for template detection algorithms. As the whole process can be automated, template detection can be performed on a single document, thereby combining the advantages of single- and multi-document algorithms.
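To make the idea concrete, here is a heavily simplified sketch in the spirit of content code blurring: build a per-character content/code indicator, blur it, and keep the dense text regions. The kernel width and threshold are illustrative guesses, not the values used in the thesis.

```python
import re
import numpy as np

def content_code_vector(html):
    """1 for characters of text content, 0 for characters inside tags."""
    vec, in_tag = [], False
    for ch in html:
        if ch == "<":
            in_tag = True
        vec.append(0 if in_tag else 1)
        if ch == ">":
            in_tag = False
    return np.array(vec, dtype=float)

def extract_main_content(html, kernel=80, threshold=0.7):
    """Simplified content-code-blurring sketch (parameters are assumed)."""
    vec = content_code_vector(html)
    kern = np.ones(kernel) / kernel
    blurred = np.convolve(vec, kern, mode="same")  # the "blurring" step
    keep = blurred >= threshold                    # dense text survives
    text = "".join(c for c, v, k in zip(html, vec, keep) if v and k)
    return re.sub(r"\s+", " ", text).strip()
```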
Abstract:
This research work is aimed at the valorization, in the livestock sector, of two types of pomace deriving from the extra virgin olive oil mechanical extraction process, namely olive pomace and a new by-product named “paté”, as important sources of antioxidants and unsaturated fatty acids. In the first study, the suitability of dried stoned olive pomace as a dietary supplement for dairy buffaloes was evaluated. Analysis of qualitative and quantitative parameters proved its effectiveness in modifying the fatty acid composition and improving the oxidative stability of buffalo milk and mozzarella cheese. In the second study, the use of paté in the dietary supplementation of dairy ewes already fed a source of unsaturated fatty acids (extruded linseed) was investigated, in order to assess the effect of this combination on the resulting dairy products. Paté was also characterized as a new by-product, and the optimal conditions for its stabilization and preservation were studied. The main result common to both studies is the detection and characterization of hydrophilic phenols in the milk. The detection of hydroxytyrosol and tyrosol in the milk of ewes fed paté, and of hydroxytyrosol in the milk of buffaloes fed pomace, showed for the first time the presence in milk of hydroxytyrosol, one of the most important bioactive compounds of oil-industry products. The transfer of these antioxidants, together with the proven improvement in milk fat quality, could contribute to the prevention of some human cardiovascular diseases and tumours, thereby increasing the quality of the dairy products and improving their shelf-life. These results also provide important information on the bioavailability of these phenolic compounds.
Abstract:
This thesis investigates methods and software architectures for discovering the typical and frequently occurring structures used for organizing knowledge in the Web. We identify these structures as Knowledge Patterns (KPs). KP discovery needs to address two main research problems: the heterogeneity of sources, formats and semantics in the Web (i.e., the knowledge soup problem) and the difficulty of drawing a relevant boundary around data that captures the meaningful knowledge with respect to a certain context (i.e., the knowledge boundary problem). Hence, we introduce two methods that provide different solutions to these two problems by tackling KP discovery from two different perspectives: (i) the transformation of KP-like artifacts into KPs formalized as OWL2 ontologies; (ii) the bottom-up extraction of KPs by analyzing how data are organized in Linked Data. The two methods address the knowledge soup and boundary problems in different ways. The first method is based on a purely syntactic transformation of the original source to RDF, followed by a refactoring step that adds semantics to the RDF by selecting meaningful RDF triples. The second method draws boundaries around RDF data in Linked Data by analyzing type paths. A type path is a possible route through an RDF graph that takes into account the types associated with the nodes of a path. We then present K~ore, a software architecture conceived as the basis for developing KP discovery systems and designed according to two software architectural styles, namely Component-based and REST. Finally, we provide an example of KP reuse based on Aemoo, an exploratory search tool that exploits KPs for entity summarization.
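A minimal sketch of the type-path idea, assuming rdflib and counting only length-one paths (type of subject, property, type of object); the thesis's boundary-drawing procedure over longer paths is not reproduced here.

```python
from collections import Counter
from rdflib import Graph, RDF

def type_paths(g: Graph):
    """Count length-one type paths (type(s), property, type(o)) in a graph.
    Illustrative only: longer paths and boundary drawing are omitted."""
    types = {}
    for s, o in g.subject_objects(RDF.type):
        types.setdefault(s, set()).add(o)
    paths = Counter()
    for s, p, o in g:
        if p == RDF.type:
            continue
        for ts in types.get(s, set()):
            for to in types.get(o, set()):
                paths[(ts, p, to)] += 1
    return paths

# Frequent type paths hint at a recurring Knowledge Pattern, e.g.:
# g = Graph(); g.parse("dump.nt", format="nt")
# print(type_paths(g).most_common(10))
```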
Abstract:
Over the past ten years, the cross-correlation of long time series of ambient seismic noise (ASN) has been widely adopted to extract the surface-wave part of the Green's function (GF). This stochastic procedure relies on the assumption that the ASN wave field is diffuse and stationary. At frequencies <1 Hz, the ASN is mainly composed of surface waves, whose origin is attributed to the sea-wave climate. Consequently, marked directional properties may be observed, which call for an accurate investigation of the location and temporal evolution of the ASN sources before attempting any GF retrieval. Within this general context, this thesis investigates the feasibility and robustness of noise-based methods for imaging complex geological structures at the local (~10-50 km) scale. The study focuses on the analysis of an extended (11-month) seismological data set collected at the Larderello-Travale geothermal field (Italy), an area for which the underground geological structures are well constrained thanks to decades of geothermal exploration. Focusing on the secondary microseism band (SM; f>0.1 Hz), I first investigate the spectral features and the kinematic properties of the noise wavefield using beamforming analysis, highlighting a marked variability with time and frequency. For the 0.1-0.3 Hz frequency band and during spring and summer, the SM waves propagate with high apparent velocities and from well-defined directions, likely associated with ocean storms in the southern hemisphere. Conversely, at frequencies >0.3 Hz the distribution of back-azimuths is more scattered, indicating that this frequency band is the most appropriate for the application of stochastic techniques. For this latter frequency interval, I tested two correlation-based methods, acting in the time (NCF) and frequency (modified-SPAC) domains, respectively yielding estimates of the group- and phase-velocity dispersions. Velocity data provided by the two methods are markedly discordant; comparison with independent geological and geophysical constraints suggests that the NCF results are more robust and reliable.
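For readers unfamiliar with the NCF workflow, the sketch below shows the core of a single-segment noise cross-correlation with spectral whitening; in practice many such segments are stacked over months of data. Parameter choices are illustrative, not those used in the thesis.

```python
import numpy as np

def noise_cross_correlation(tr_a, tr_b, nfft=None, eps=1e-10):
    """One-segment NCF sketch: spectral whitening followed by
    frequency-domain cross-correlation of two station records."""
    n = len(tr_a)
    nfft = nfft or 2 * n
    A = np.fft.rfft(tr_a, nfft)
    B = np.fft.rfft(tr_b, nfft)
    # Whitening flattens the spectrum so no single storm band dominates.
    A /= np.abs(A) + eps
    B /= np.abs(B) + eps
    ncf = np.fft.irfft(A * np.conj(B), nfft)
    return np.fft.fftshift(ncf)  # zero lag at the center

# Typical use: stack this over months of hour-long segments, then pick
# group-velocity dispersion on the emergent surface-wave arrival.
```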
Abstract:
Time series are ubiquitous. The acquisition and processing of continuously measured data pervades all areas of the natural sciences, medicine and finance. The enormous growth of recorded data volumes, whether from automated monitoring systems or embedded sensors, demands exceptionally fast algorithms in both theory and practice. This thesis is therefore concerned with the efficient computation of subsequence alignments. Complex algorithms such as anomaly detection, motif queries or the unsupervised extraction of prototypical building blocks in time series make heavy use of these alignments, hence the need for fast implementations. The thesis is organized into three approaches that address this challenge, comprising four alignment algorithms and their parallelization on CUDA-capable hardware, an algorithm for the segmentation of data streams, and a unified treatment of Lie-group-valued time series.

The first contribution is a complete CUDA port of the UCR suite, the world-leading implementation of subsequence alignment. It includes a new computation scheme for local alignment scores under the z-normalized Euclidean distance that can be deployed on any parallel hardware supporting fast Fourier transforms. Furthermore, we give a SIMT-friendly formulation of the UCR suite's lower-bound cascade for the efficient computation of local alignment scores under Dynamic Time Warping. Both CUDA implementations compute one to two orders of magnitude faster than established methods.

Second, we investigate two linear-time approximations for the elastic alignment of subsequences. On the one hand, we treat a SIMT-friendly relaxation scheme for greedy DTW and its efficient CUDA parallelization. On the other hand, we introduce a new local distance measure, the Gliding Elastic Match (GEM), which can be computed with the same asymptotic time complexity as greedy DTW while offering a full relaxation of the penalty matrix. Further improvements include invariance against trends on the measurement axis and uniform scaling on the time axis. An extension of GEM to multi-shape segmentation is also discussed and evaluated on motion data. Both CUDA parallelizations achieve runtime improvements of up to two orders of magnitude.

In the literature, the treatment of time series is usually restricted to real-valued measurements. The third contribution is a unified method for handling Lie-group-valued time series. Building on it, distance measures on the rotation group SO(3) and on the Euclidean group SE(3) are treated, and memory-efficient representations as well as group-compatible extensions of elastic measures are discussed.
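A CPU sketch of the FFT-based scheme for sliding z-normalized Euclidean distances, the quantity the CUDA port accelerates; the formulation below is the standard one and serves only as an illustration, not as the thesis's implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def znorm_distance_profile(T, Q):
    """Distance of query Q to every length-m subsequence of T under the
    z-normalized Euclidean distance, using FFT-based sliding dot products
    (O(n log n) overall instead of O(n*m))."""
    T, Q = np.asarray(T, float), np.asarray(Q, float)
    n, m = len(T), len(Q)
    Qz = (Q - Q.mean()) / Q.std()
    # All sliding dot products T[i:i+m] . Qz in one FFT convolution.
    qt = fftconvolve(T, Qz[::-1], mode="valid")          # length n-m+1
    # Rolling mean and std of T from cumulative sums.
    c1 = np.cumsum(np.r_[0.0, T])
    c2 = np.cumsum(np.r_[0.0, T * T])
    mu = (c1[m:] - c1[:-m]) / m
    sig = np.sqrt(np.maximum((c2[m:] - c2[:-m]) / m - mu * mu, 1e-12))
    # Pearson correlation with the z-normalized query, then distance:
    # d^2 = 2m * (1 - corr).
    corr = qt / (m * sig)
    return np.sqrt(np.maximum(2.0 * m * (1.0 - corr), 0.0))
```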
Abstract:
In this thesis we discuss technologies for approaching sentiment analysis of newspaper articles. The final goal of this work is to help social scholars carry out content analysis on large corpora of texts more quickly, with the support of automatic text classification.
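A minimal sketch of the kind of automatic text classification pipeline involved, assuming scikit-learn; the toy articles and labels below are invented placeholders, whereas real labels would come from the scholars' hand-coded corpora.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled articles (placeholders for a hand-coded training corpus).
articles = ["The reform was praised by unions.",
            "Critics slammed the new law."]
labels = ["positive", "negative"]

# TF-IDF features + a linear classifier: a common, fast baseline.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(articles, labels)
print(clf.predict(["The government welcomed the agreement."]))
```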
Abstract:
In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches and whether they can only work in the context of High Performance Computing with vast amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". This dissertation tries to answer these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been heavily revolutionized by recent advances in the field. Practically speaking, the answers are based on an exhaustive comparison of two very different deep learning techniques on this task: the Convolutional Neural Network (CNN) and Hierarchical Temporal Memory (HTM). They stand for two different approaches and points of view within the broad field of deep learning and are well suited to understanding and pointing out the strengths and weaknesses of each. CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed in large corporations like Google and Facebook to solve face recognition and image auto-tagging problems. HTM, on the other hand, is an emerging paradigm and a new, mainly unsupervised method that is more biologically inspired. It tries to gain insights from the computational neuroscience community in order to incorporate concepts like time, context and attention during the learning process, which are typical of the human brain. In the end, the thesis sets out to show that in certain cases, with a lower quantity of data, HTM can outperform CNN.
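For concreteness, a minimal CNN of the kind used as a supervised baseline could look like the PyTorch sketch below; layer sizes are illustrative, and the HTM side has no comparably compact standard snippet, so it is omitted.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal supervised CNN baseline (illustrative sizes, not the
    thesis's networks): two conv/pool stages, then a linear classifier."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_classes)  # 32x32 input

    def forward(self, x):
        x = self.features(x)               # hierarchical feature extraction
        return self.classifier(x.flatten(1))

# x = torch.randn(4, 3, 32, 32); logits = SmallCNN()(x)  # shape (4, 10)
```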
Abstract:
This work introduces the basic concepts of Natural Language Processing, with a focus on Information Extraction, analyzing its application domains, its main tasks and how it differs from Information Retrieval. It then examines the Named Entity Recognition process, concentrating on the main issues in text annotation and on methods for assessing the quality of entity extraction. Finally, it gives an overview of GATE/ANNIE, an open-source language-processing software platform, describing its architecture and main components, with a closer look at the tools GATE offers for the rule-based approach to Named Entity Recognition.
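GATE/ANNIE itself is a Java platform (with its own AnnotationDiff evaluation tooling), but the extraction-quality measures discussed above reduce to precision, recall and F1 over annotation spans. A small Python sketch, with hypothetical gold and predicted spans:

```python
def ner_prf(gold, predicted):
    """Strict-match precision/recall/F1 over (start, end, type) triples;
    a sketch of the evaluation, not GATE's own implementation."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)                     # exact span+type matches
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical annotations for illustration.
gold = [(0, 5, "Person"), (10, 18, "Organization")]
pred = [(0, 5, "Person"), (20, 25, "Location")]
print(ner_prf(gold, pred))  # (0.5, 0.5, 0.5)
```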
Abstract:
Early implant placement is one of the treatment options after tooth extraction. Implant surgery is performed after a healing period of 4 to 8 weeks and is combined with simultaneous contour augmentation using the guided bone regeneration technique to rebuild stable, esthetic facial hard- and soft-tissue contours.
Abstract:
In most pathology laboratories worldwide, formalin-fixed paraffin-embedded (FFPE) samples are the only tissue specimens available for routine diagnostics. Although commercial kits for diagnostic molecular pathology testing are becoming available, most current diagnostic tests are laboratory-based assays. There is thus a need for standardized procedures in molecular pathology, starting from the extraction of nucleic acids. To evaluate the current methods for extracting nucleic acids from FFPE tissues, 13 European laboratories, participating in the European FP6 program IMPACTS (www.impactsnetwork.eu), isolated nucleic acids from four diagnostic FFPE tissues using their routine methods, followed by quality assessment. The DNA extraction protocols ranged from home-made protocols to commercial kits. Except for one home-made protocol, the majority gave comparable results in terms of the quality of the extracted DNA, measured by the ability to amplify control gene fragments of different sizes by PCR. For array applications or tests that require an accurately determined DNA input, we recommend using silica-based adsorption columns for DNA recovery. For RNA extraction, the best results were obtained with chromatography-column-based commercial kits, which yielded the highest quantity and the most assayable RNA. Quality testing using RT-PCR gave successful amplification of 200-250 bp PCR products from most tested tissues. Modifying the proteinase K digestion time led to better results, even when commercial kits were applied. The results of the study emphasize the need for quality control of nucleic acid extracts with standardised methods to prevent false-negative results and to allow data comparison among different diagnostic laboratories.
Abstract:
The aim of this study was to assess the changes in inclination of the maxillary second (M2) and third (M3) molars after orthodontic treatment of Class II Division 1 malocclusion with extraction of maxillary first molars.