897 results for high dimensional geometry
Abstract:
Multi-dimensional classification (MDC) is the supervised learning problem where an instance is associated with multiple classes, rather than with a single class, as in traditional classification problems. Since these classes are often strongly correlated, modeling the dependencies between them allows MDC methods to improve their performance – at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies, one of the most popular and highest-performing methods for multi-label classification (MLC), a particular case of MDC which involves only binary classes (i.e., labels). The original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors along the chain. Here we present novel Monte Carlo schemes, both for finding a good chain sequence and performing efficient inference. Our algorithms remain tractable for high-dimensional data sets and obtain the best predictive performance across several real data sets.
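The contrast between greedy chain inference and sampling-based inference can be illustrated with a toy sketch. It is not the authors' algorithm: the two hand-written link functions, their probabilities, and the function names are all hypothetical, chosen only so that the greedy choice for the first label leads to a lower joint probability than an alternative label vector.

```python
import random

def chain_prob(links, x, labels):
    """Joint probability of a full label vector under the chain factorisation."""
    p = 1.0
    for j, link in enumerate(links):
        p1 = link(x, tuple(labels[:j]))          # P(y_j = 1 | x, y_1..y_{j-1})
        p *= p1 if labels[j] == 1 else 1.0 - p1
    return p

def greedy_inference(links, x):
    """Fix each label to its most likely value before moving down the chain."""
    labels = []
    for link in links:
        labels.append(1 if link(x, tuple(labels)) >= 0.5 else 0)
    return tuple(labels)

def monte_carlo_inference(links, x, n_samples, rng):
    """Sample label vectors from the chain and keep the most probable one seen."""
    best = greedy_inference(links, x)
    best_p = chain_prob(links, x, best)
    for _ in range(n_samples):
        labels = []
        for link in links:
            labels.append(1 if rng.random() < link(x, tuple(labels)) else 0)
        p = chain_prob(links, x, labels)
        if p > best_p:
            best, best_p = tuple(labels), p
    return best

# Hypothetical two-label chain (the feature vector x is ignored in this toy).
links = [
    lambda x, prev: 0.6,                              # P(y1 = 1 | x)
    lambda x, prev: 0.45 if prev[0] == 1 else 0.9,    # P(y2 = 1 | x, y1)
]

greedy = greedy_inference(links, x=None)
sampled = monte_carlo_inference(links, x=None, n_samples=200, rng=random.Random(0))
```

In this toy, the greedy pass commits to the first label early and cannot recover, while sampling explores whole label vectors and compares their joint probabilities. A practical method would also need trained base classifiers and a strategy for choosing the chain order.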
Abstract:
In this work, various turbulent solutions of the two-dimensional (2D) and three-dimensional compressible Reynolds-averaged Navier-Stokes (RANS) equations are analyzed using global stability theory. This analysis is motivated by the onset of flow unsteadiness (Hopf bifurcation) for transonic buffet conditions where moderately high Reynolds numbers and compressible effects must be considered. The buffet phenomenon involves a complex interaction between the separated flow and a shock wave. The efficient numerical methodology presented in this paper predicts the critical parameters, namely, the angle of attack and the Mach and Reynolds numbers beyond which the onset of flow unsteadiness appears. The geometry, a NACA0012 profile, and the flow parameters selected reproduce situations of practical interest for aeronautical applications. The numerical computation is performed in three steps. First, a steady baseflow solution is obtained; second, the Jacobian matrix for the RANS equations based on a finite volume discretization is computed; and finally, the generalized eigenvalue problem is derived when the baseflow is linearly perturbed. The methodology is validated by predicting the 2D Hopf bifurcation for a circular cylinder under laminar flow conditions. This benchmark shows good agreement with previously published computations and experimental data. In the transonic buffet case, the baseflow is computed using the Spalart-Allmaras turbulence model and represents a mean flow where the high frequency content and length scales of the order of the shear-layer thickness have been averaged. The lower frequency content is assumed to be decoupled from the high frequencies, thus allowing a stability analysis to be performed on the low frequency range. In addition, results of the corresponding adjoint problem and the sensitivity map are provided for the first time for the buffet problem.
Finally, an extruded three-dimensional geometry of the NACA0012 airfoil, where all velocity components are considered, was also analyzed as a TriGlobal stability case, and the results were compared with the earlier 2D model, confirming that the buffet onset is well detected.
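The idea of locating a critical parameter from the eigenvalues of a linearised operator can be shown on a toy problem. The sketch below is only an illustration under strong assumptions: the 2x2 matrix returned by `jacobian` is hypothetical and stands in for the very large sparse Jacobian of a discretised flow, where an iterative eigensolver would be needed instead of a closed-form formula. The parameter `alpha` plays the role of a control parameter such as the angle of attack.

```python
import cmath

def jacobian(alpha):
    """Hypothetical 2x2 linearised operator depending on a control parameter."""
    return [[alpha, -1.0],
            [4.0, -1.0]]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + root) / 2.0, (tr - root) / 2.0)

def leading(alpha):
    """Eigenvalue with the largest real part (the least stable mode)."""
    return max(eigenvalues_2x2(jacobian(alpha)), key=lambda ev: ev.real)

def critical_parameter(lo, hi, tol=1e-10):
    """Bisect on alpha until the leading growth rate changes sign."""
    if leading(lo).real >= 0 or leading(hi).real <= 0:
        raise ValueError("bracket must go from stable to unstable")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if leading(mid).real < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

crit = critical_parameter(0.0, 2.0)
freq = abs(leading(crit).imag)   # angular frequency of the marginal mode
```

A nonzero imaginary part at the crossing is what distinguishes an oscillatory (Hopf-type) onset from a steady one. For this toy matrix the crossing lies at `alpha = 1` with frequency `sqrt(3)`, which follows directly from the trace and determinant.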
Abstract:
Flow unsteadiness is an important problem in aerodynamic applications. In fact, there are several types of unsteady phenomena that are still at the cutting edge of research in the field; separation at high angles of attack and transonic buffet are two important examples. Global Stability Analysis can identify the unstable onset conditions, providing important information about the instability location in the domain and the frequency of the unstable phenomenon. The methodology computes a base flow averaged state based on a finite volume discretization and a solution for a generalized eigenvalue problem corresponding to the perturbed linearized equations. The numerical computation is then performed in three steps: first, a steady solution for the RANS equations is computed; second, the Jacobian matrix that represents the linearized problem is obtained; and finally, the generalized eigenvalue problem is derived and solved with an Arnoldi iterative method. As a first validation test, the technique has been applied to a circular cylinder in laminar flow in order to detect the von Karman vortex shedding onset, comparing the results with experiments and with previous calculations. The main part of the study focusses on turbulent and compressible cases. The prediction of the origin and progression of separated flows at high angles of attack has been studied on the NACA0012 airfoil at subsonic and transonic conditions and for the A310 airfoil in take-off configuration.
For all the analyzed geometries, it has been found that gradual separation generates the appearance of one specific unstable mode for angles of attack always greater than the ones related to the maximum lift coefficient. In addition, the adjoint problem has been studied to suggest the location of an external force that results in the largest change to the flow field. From the direct and the adjoint analysis the structural sensitivity map has been computed and the core of the instability has been located. The other important phenomenon analyzed in this work is the transonic buffet. In transonic conditions, the interaction between the shock wave and the boundary layer leads to an oscillation of the shock location and, consequently, of the aerodynamic forces. Knowing the critical operational conditions and their origin can be helpful in preventing such fluctuating forces. The instability onset has then been computed and compared with the literature. Moreover, results of the corresponding adjoint problem and a sensitivity map have been provided for the first time for the buffet problem, indicating the region that must be modified to create the biggest change in flow field properties. Because of the large memory consumption required when a 3D case is approached, a domain reduction study has been carried out with the aim of limiting the domain size to the region where the instability is located. The effectiveness of the domain reduction has been evaluated by investigating the change in the Jacobian matrix size; it proved not very efficient in terms of memory consumption. Since buffet is in general a three-dimensional problem, TriGlobal stability analysis can be seen as a future challenge. To approximate the problem, a first study has been carried out on an extruded three-dimensional geometry of the NACA0012 airfoil. The 3D flow computation and the TriGlobal stability analysis have been performed for the first time on a compressible and turbulent 3D case.
The results have been compared with a 2D model, confirming that the buffet onset evaluated in the 2D case is well detected. Moreover, the computation has given an indication about the memory consumption for a 3D case.
Abstract:
A manufacturing technique for the production of aluminum components is described. A resin-bonded part is formed by a rapid prototyping technique and then debound and infiltrated by a second aluminum alloy under a nitrogen atmosphere. During thermal processing, the aluminum reacts with the nitrogen and is partially transformed into a rigid aluminum nitride skeleton, which provides the structural rigidity during infiltration. The simplicity and rapidity of this process in comparison to conventional production routes, combined with the ability to fabricate complicated parts of almost any geometry and with high dimensional precision, provide an additional means to manufacture aluminum components.
Abstract:
Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena. While normal mixture models are often used to cluster data sets of continuous multivariate data, a more robust clustering can be obtained by considering the t mixture model-based approach. Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data where the number of observations n is not very large relative to their dimension p. As the approach using the multivariate normal family of distributions is sensitive to outliers, it is more robust to adopt the multivariate t family for the component error and factor distributions. The computational aspects associated with robustness and high dimensionality in these approaches to cluster analysis are discussed and illustrated.
Abstract:
This thesis applies a hierarchical latent trait model system to a large quantity of data. The motivation was the lack of viable approaches for analysing High Throughput Screening datasets, which may include thousands of high-dimensional data points. High Throughput Screening (HTS) is an important tool in the pharmaceutical industry for discovering leads which can be optimised and further developed into candidate drugs. Since the development of new robotic technologies, the ability to test the activities of compounds has considerably increased in recent years. Traditional methods, looking at tables and graphical plots for analysing relationships between measured activities and the structure of compounds, are not feasible when facing a large HTS dataset. Instead, data visualisation provides a method for analysing such large datasets, especially those with high dimensions. So far, a few visualisation techniques for drug design have been developed, but most of them cope with only a few properties of compounds at one time. We believe that a latent variable model with a non-linear mapping from the latent space to the data space is a preferred choice for visualising a complex high-dimensional data set. As a type of latent variable model, the latent trait model (LTM) can deal with either continuous data or discrete data, which makes it particularly useful in this domain. In addition, with the aid of differential geometry, we can picture the distribution of data from magnification factor and curvature plots. Rather than obtaining the useful information just from a single plot, a hierarchical LTM arranges a set of LTMs and their corresponding plots in a tree structure. We model the whole data set with an LTM at the top level, which is broken down into clusters at deeper levels of the hierarchy. In this manner, the refined visualisation plots can be displayed in deeper levels and sub-clusters may be found.
The hierarchy of LTMs is trained using the expectation-maximisation (EM) algorithm to maximise its likelihood with respect to the data sample. Training proceeds interactively in a recursive (top-down) fashion. The user subjectively identifies interesting regions on the visualisation plot that they would like to model in greater detail. At each stage of hierarchical LTM construction, the EM algorithm alternates between the E- and M-step. Another problem that can occur when visualising a large data set is that there may be significant overlaps of data clusters. It is very difficult for the user to judge where centres of regions of interest should be put. We address this problem by employing the minimum message length technique, which can help the user to decide the optimal structure of the model. In this thesis we also demonstrate the applicability of the hierarchy of latent trait models in the field of document data mining.
Abstract:
We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.
Abstract:
Popular dimension reduction and visualisation algorithms rely on the assumption that input dissimilarities are typically Euclidean, for instance Metric Multidimensional Scaling, t-distributed Stochastic Neighbour Embedding and the Gaussian Process Latent Variable Model. It is well known that this assumption does not hold for most datasets and often high-dimensional data sits upon a manifold of unknown global geometry. We present a method for improving the manifold charting process, coupled with Elastic MDS, such that we no longer assume that the manifold is Euclidean, or of any particular structure. We draw on the benefits of different dissimilarity measures allowing for the relative responsibilities, under a linear combination, to drive the visualisation process.
Abstract:
This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.
In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.
In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.
Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.
We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal acoustic content of speech signals.
Abstract:
Human activities represent a significant burden on the global water cycle, with large and increasing demands placed on limited water resources by manufacturing, energy production and domestic water use. In addition to changing the quantity of available water resources, human activities lead to changes in water quality by introducing a large and often poorly-characterized array of chemical pollutants, which may negatively impact biodiversity in aquatic ecosystems, leading to impairment of valuable ecosystem functions and services. Domestic and industrial wastewaters represent a significant source of pollution to the aquatic environment due to inadequate or incomplete removal of chemicals introduced into waters by human activities. Currently, incomplete chemical characterization of treated wastewaters limits comprehensive risk assessment of this ubiquitous impact to water. In particular, a significant fraction of the organic chemical composition of treated industrial and domestic wastewaters remains uncharacterized at the molecular level. Efforts aimed at reducing the impacts of water pollution on aquatic ecosystems critically require knowledge of the composition of wastewaters to develop interventions capable of protecting our precious natural water resources.
The goal of this dissertation was to develop a robust, extensible and high-throughput framework for the comprehensive characterization of organic micropollutants in wastewaters by high-resolution accurate-mass mass spectrometry. High-resolution mass spectrometry provides the most powerful analytical technique available for assessing the occurrence and fate of organic pollutants in the water cycle. However, significant limitations in data processing, analysis and interpretation have limited this technique in achieving comprehensive characterization of organic pollutants occurring in natural and built environments. My work aimed to address these challenges by development of automated workflows for the structural characterization of organic pollutants in wastewater and wastewater impacted environments by high-resolution mass spectrometry, and to apply these methods in combination with novel data handling routines to conduct detailed fate studies of wastewater-derived organic micropollutants in the aquatic environment.
In Chapter 2, chemoinformatic tools were implemented along with novel non-targeted mass spectrometric analytical methods to characterize, map, and explore an environmentally-relevant “chemical space” in municipal wastewater. This was accomplished by characterizing the molecular composition of known wastewater-derived organic pollutants and substances that are prioritized as potential wastewater contaminants, using these databases to evaluate the pollutant-likeness of structures postulated for unknown organic compounds that I detected in wastewater extracts using high-resolution mass spectrometry approaches. Results showed that application of multiple computational mass spectrometric tools to structural elucidation of unknown organic pollutants arising in wastewaters improved the efficiency and veracity of screening approaches based on high-resolution mass spectrometry. Furthermore, structural similarity searching was essential for prioritizing substances sharing structural features with known organic pollutants or industrial and consumer chemicals that could enter the environment through use or disposal.
I then applied this comprehensive methodological and computational non-targeted analysis workflow to micropollutant fate analysis in domestic wastewaters (Chapter 3), surface waters impacted by water reuse activities (Chapter 4) and effluents of wastewater treatment facilities receiving wastewater from oil and gas extraction activities (Chapter 5). In Chapter 3, I showed that application of chemometric tools aided in the prioritization of non-targeted compounds arising at various stages of conventional wastewater treatment by partitioning high dimensional data into rational chemical categories based on knowledge of organic chemical fate processes, resulting in the classification of organic micropollutants based on their occurrence and/or removal during treatment. Similarly, in Chapter 4, high-resolution sampling and broad-spectrum targeted and non-targeted chemical analysis were applied to assess the occurrence and fate of organic micropollutants in a water reuse application, wherein reclaimed wastewater was applied for irrigation of turf grass. Results showed that organic micropollutant composition of surface waters receiving runoff from wastewater irrigated areas appeared to be minimally impacted by wastewater-derived organic micropollutants. Finally, Chapter 5 presents results of the comprehensive organic chemical composition of oil and gas wastewaters treated for surface water discharge. Concurrent analysis of effluent samples by complementary, broad-spectrum analytical techniques revealed low levels of hydrophobic organic contaminants but elevated concentrations of polymeric surfactants, which may affect the fate and analysis of contaminants of concern in oil and gas wastewaters.
Taken together, my work represents significant progress in the characterization of polar organic chemical pollutants associated with wastewater-impacted environments by high-resolution mass spectrometry. Application of these comprehensive methods to examine micropollutant fate processes in wastewater treatment systems, water reuse environments, and water applications in oil/gas exploration yielded new insights into the factors that influence transport, transformation, and persistence of organic micropollutants in these systems across an unprecedented breadth of chemical space.
Abstract:
Highlights of Data Expedition:
• Students explored daily observations of local climate data spanning the past 35 years.
• Topological Data Analysis, or TDA for short, provides cutting-edge tools for studying the geometry of data in arbitrarily high dimensions.
• Using TDA tools, students discovered intrinsic dynamical features of the data and learned how to quantify periodic phenomena in a time-series.
• Since nature invariably produces noisy data which rarely has exact periodicity, students also considered the theoretical basis of almost-periodicity and even invented and tested new mathematical definitions of almost-periodic functions.
Summary
The dataset we used for this data expedition comes from the Global Historical Climatology Network. “GHCN (Global Historical Climatology Network)-Daily is an integrated database of daily climate summaries from land surface stations across the globe.” Source: https://www.ncdc.noaa.gov/oa/climate/ghcn-daily/ We focused on the daily maximum and minimum temperatures from January 1, 1980 to April 1, 2015 collected from RDU International Airport. Through a guided series of exercises designed to be performed in Matlab, students explore these time-series, initially by direct visualization and basic statistical techniques. Then students are guided through a special sliding-window construction which transforms a time-series into a high-dimensional geometric curve. These high-dimensional curves can be visualized by projecting down to lower dimensions as in the figure below (Figure 1); however, our focus here was to use persistent homology to directly study the high-dimensional embedding. The shape of these curves carries meaningful information, but how one describes the “shape” of data depends on the scale at which the data is considered, and choosing the appropriate scale is rarely obvious. Persistent homology overcomes this obstacle by allowing us to quantitatively study geometric features of the data across multiple scales.
Through this data expedition, students are introduced to numerically computing persistent homology using the Rips collapse algorithm and interpreting the results. In the specific context of sliding-window constructions, 1-dimensional persistent homology can reveal the nature of periodic structure in the original data. I created a special technique to study how these high-dimensional sliding-window curves form loops in order to quantify the periodicity. Students are guided through this construction and learn how to visualize and interpret this information. Climate data is extremely complex (as anyone who has suffered from a bad weather prediction can attest) and numerous variables play a role in determining our daily weather and temperatures. This complexity, coupled with imperfections of measuring devices, results in very noisy data, which causes the annual seasonal periodicity to be far from exact. To this end, I have students explore existing theoretical notions of almost-periodicity and test them on the data. They find that some existing definitions are inadequate in this context. Hence I challenged them to invent new mathematics by proposing and testing their own definitions. These students rose to the challenge and suggested a number of creative definitions. While autocorrelation and spectral methods based on Fourier analysis are often used to explore periodicity, the construction here provides an alternative paradigm to quantify periodic structure in almost-periodic signals using tools from topological data analysis.
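A sliding-window construction of the kind described above can be sketched in a few lines. This is a generic delay embedding, not the course's Matlab exercise: the function name, the window dimension `dim`, the delay `tau`, and the synthetic cosine signal (a stand-in for a seasonal temperature record) are all assumptions made for illustration.

```python
import math

def sliding_window(series, dim, tau=1):
    """Map a 1-D series to points (x_t, x_{t+tau}, ..., x_{t+(dim-1)*tau}) in R^dim."""
    span = (dim - 1) * tau
    if dim < 1 or tau < 1 or span >= len(series):
        raise ValueError("window does not fit inside the series")
    return [tuple(series[t + k * tau] for k in range(dim))
            for t in range(len(series) - span)]

# Synthetic, exactly periodic signal with period 12 (hypothetical data).
signal = [math.cos(2 * math.pi * t / 12) for t in range(48)]
cloud = sliding_window(signal, dim=4, tau=3)

# For a periodic signal the embedded curve closes up after one period.
gap = max(abs(a - b) for a, b in zip(cloud[0], cloud[12]))
```

For an exactly periodic input the curve returns to its starting point after one period, so it traces a loop; this is the kind of feature that 1-dimensional persistent homology detects. With noisy data the loop closes only approximately, which is where a multi-scale tool becomes useful. Computing persistent homology itself requires a dedicated library and is not attempted here.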
Abstract:
While molecular and cellular processes are often modeled as stochastic processes, such as Brownian motion, chemical reaction networks and gene regulatory networks, there have been few attempts to program a molecular-scale process to physically implement stochastic processes. DNA has been used as a substrate for programming molecular interactions, but its applications are restricted to deterministic functions, and unfavorable properties such as slow processing, thermal annealing, aqueous solvents and difficult readout limit them to proof-of-concept purposes. To date, whether there exists a molecular process that can be programmed to implement stochastic processes for practical applications remains unknown.
In this dissertation, a fully specified Resonance Energy Transfer (RET) network between chromophores is accurately fabricated via DNA self-assembly, and the exciton dynamics in the RET network physically implement a stochastic process, specifically a continuous-time Markov chain (CTMC), which has a direct mapping to the physical geometry of the chromophore network. Excited by a light source, a RET network generates random samples in the temporal domain in the form of fluorescence photons which can be detected by a photon detector. The intrinsic sampling distribution of a RET network is derived as a phase-type distribution configured by its CTMC model. The conclusion is that the exciton dynamics in a RET network implement a general and important class of stochastic processes that can be directly and accurately programmed and used for practical applications of photonics and optoelectronics. Different approaches to using RET networks exist with vast potential applications. As an entropy source that can directly generate samples from virtually arbitrary distributions, RET networks can benefit applications that rely on generating random samples such as 1) fluorescent taggants and 2) stochastic computing.
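The link between a CTMC and a phase-type sampling distribution can be illustrated by simulating the time a chain takes to reach an absorbing state. The sketch below is a software analogy only: the three states, their transition rates, and the function name are hypothetical and do not describe any fabricated RET network. State 2 plays the role of "photon emitted".

```python
import random

def sample_absorption_time(rates, start, absorbing, rng):
    """Simulate one CTMC path and return the time it first reaches `absorbing`.

    rates[i] maps each reachable state j to the transition rate from i to j.
    """
    state, t = start, 0.0
    while state != absorbing:
        out = {j: r for j, r in rates[state].items() if r > 0}
        total = sum(out.values())
        if total == 0:
            raise ValueError("non-absorbing state with no outgoing transitions")
        t += rng.expovariate(total)       # exponential holding time in `state`
        u, acc = rng.random() * total, 0.0
        for j, r in out.items():          # next state chosen in proportion to its rate
            acc += r
            if u < acc:
                break
        state = j                         # falls back to the last state on rounding
    return t

# Hypothetical chain: 0 -> 1 at rate 2, 0 -> 2 at rate 1, 1 -> 2 at rate 3.
rates = {0: {1: 2.0, 2: 1.0}, 1: {2: 3.0}, 2: {}}
rng = random.Random(42)
samples = [sample_absorption_time(rates, 0, 2, rng) for _ in range(20000)]
mean_time = sum(samples) / len(samples)
```

The absorption times drawn this way follow a phase-type distribution determined by the rate table. For this toy chain the exact mean is 1/3 + (2/3)(1/3) = 5/9, which gives a simple check on the simulation. Changing the rates or the graph changes the distribution, which is the sense in which the chain "programs" the sampler.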
By using RET networks between chromophores to implement fluorescent taggants with temporally coded signatures, the taggant design is not constrained by resolvable dyes and has a significantly larger coding capacity than spectrally or lifetime coded fluorescent taggants. Meanwhile, the taggant detection process becomes highly efficient, and the Maximum Likelihood Estimation (MLE) based taggant identification guarantees high accuracy even with only a few hundred detected photons.
Meanwhile, RET-based sampling units (RSU) can be constructed to accelerate probabilistic algorithms for wide applications in machine learning and data analytics. Because probabilistic algorithms often rely on iteratively sampling from parameterized distributions, they can be inefficient in practice on the deterministic hardware traditional computers use, especially for high-dimensional and complex problems. As an efficient universal sampling unit, the proposed RSU can be integrated into a processor / GPU as specialized functional units or organized as a discrete accelerator to bring substantial speedups and power savings.
Abstract:
Nowadays, the new generation of computers provides performance high enough to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common task for a robot and is essential for allowing robots to move through that environment. Traditionally, mobile robots have used a combination of several sensors based on different technologies. Lasers, sonars and contact sensors have typically been used in mobile robotic architectures; however, color cameras are an important sensor because we want robots to use the same information that humans use to sense and move through different environments. Color cameras are cheap and flexible, but a lot of work needs to be done to give robots enough visual understanding of scenes. Computer vision algorithms are computationally complex, but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors such as the Microsoft Kinect, which provide 3D colored point clouds at high frame rates, has made computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of a scene is useful in many areas, such as video game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where once the 3D model of a room is obtained the system can add pieces of furniture using augmented reality techniques.
In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric appearance. In addition, this thesis proposes two methods for 3D data compression and representation of 3D maps. Our 3D representation proposal is based on the Growing Neural Gas (GNG) method. This type of Self-Organizing Map (SOM) has been successfully used for clustering, pattern recognition and topology representation of various kinds of data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organizing neural models have the ability to provide a good representation of the input space. In particular, the GNG is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes within a predefined time. This thesis proposes a hardware implementation that leverages the computing power of modern GPUs, taking advantage of the paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometric 3D compression method seeks to reduce the 3D information by using plane detection as the basic structure for compressing the data. This is because our target environments are man-made, and therefore many points belong to planar surfaces. Our proposed method achieves good compression results in such man-made scenarios. The detected and compressed planes can also be used in other applications, such as surface reconstruction or plane-based registration algorithms.
Finally, we also demonstrate the benefits of GPU technology by obtaining a high-performance implementation of a common CAD/CAM technique called Virtual Digitizing.
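The plane-detection step that the compression method builds on can be illustrated with a small sketch. The abstract does not say which detector the thesis uses; RANSAC is swapped in here as a commonly used stand-in, and the function names, the distance threshold and the iteration count are all hypothetical choices made for illustration.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) for the plane n.x + d = 0 through three points,
    or None when the points are (nearly) collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Assumed RANSAC loop: keep the plane supported by the most inliers."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [
            i for i, pt in enumerate(points)
            if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

Once a dominant plane has been found, its inlier points could in principle be stored as four plane parameters plus 2D in-plane coordinates rather than as full 3D points, which is the kind of saving a plane-based compression scheme relies on in man-made scenes.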
Abstract:
Event extraction from texts aims to detect structured information such as what has happened, to whom, where and when. Event extraction and visualization are typically treated as two separate tasks. In this paper, we propose a novel approach based on probabilistic modelling to jointly extract and visualize events from tweets, so that both tasks benefit from each other. We model each event as a joint distribution over named entities, a date, a location and event-related keywords. Moreover, both tweets and event instances are associated with coordinates in the visualization space. The manifold assumption that the intrinsic geometry of tweets is a low-rank, non-linear manifold within the high-dimensional space is incorporated into the learning framework through a regularization term. Experimental results show that the proposed approach handles both event extraction and visualization effectively and performs remarkably better than both the state-of-the-art event extraction method and a pipeline approach for event extraction and visualization.
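The abstract does not give the form of the regularizer. One common way to encode a manifold assumption, shown here purely as an illustrative assumption and not as the paper's formulation, is a graph-Laplacian penalty on the visualization coordinates. The symbols below are hypothetical: $y_i$ is the coordinate of tweet $i$, $Y$ stacks the $y_i$ as rows, and $w_{ij} = w_{ji} \ge 0$ is a similarity weight between tweets $i$ and $j$ (for example, from a nearest-neighbour graph).

```latex
R(Y) \;=\; \frac{1}{2} \sum_{i,j} w_{ij}\, \lVert y_i - y_j \rVert^2
     \;=\; \operatorname{tr}\!\left( Y^{\top} L\, Y \right),
\qquad L = D - W, \quad D_{ii} = \sum_{j} w_{ij}.
```

Adding a term of this kind to the learning objective penalizes placing similar tweets far apart in the visualization space.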
Abstract:
This thesis builds a framework for evaluating downside risk from multivariate data via a special class of risk measures (RM). The distinguishing feature of the analysis is that it dispenses with strong distributional assumptions on the data and is oriented towards the most critical data in risk management: those with asymmetries and heavy tails. At the same time, under typical assumptions, such as ellipticity of the data's probability distribution, conformity with classical methods is shown. The constructed class of RM is a multivariate generalization of the coherent distortion RM, which possess valuable properties for a risk manager. The design of the framework is twofold. The first part contains new computational geometry methods for high-dimensional data. The developed algorithms demonstrate the computability of the geometrical concepts used for constructing the RM. These concepts aid visualization and simplify interpretation of the RM. The second part develops models for applying the framework to actual problems. The applications range from robust portfolio selection to broader areas, such as stochastic conic optimization with risk constraints or supervised machine learning.
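For orientation, the univariate distortion risk measure that the thesis generalizes can be sketched on an empirical sample. This is the standard one-dimensional construction only; the multivariate class developed in the thesis is not reproduced here, and the function names are hypothetical. A distortion function `g` is assumed to be non-decreasing with `g(0) = 0` and `g(1) = 1`; when `g` is also concave the resulting measure is coherent.

```python
def distortion_risk(losses, g):
    """Empirical distortion risk measure of a sample of losses.

    The i-th smallest loss x_(i) is weighted by g((n-i+1)/n) - g((n-i)/n),
    so a concave g shifts weight towards the largest losses.
    """
    xs = sorted(losses)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = g((n - i + 1) / n) - g((n - i) / n)
        total += x * weight
    return total


def cvar_distortion(alpha):
    """Distortion function that yields expected shortfall (CVaR) at level alpha."""
    def g(u):
        return min(u / (1.0 - alpha), 1.0)
    return g
```

With the identity distortion `g(u) = u` the measure reduces to the sample mean, while `cvar_distortion(0.8)` applied to the losses 1, 2, ..., 10 averages the two largest values and gives approximately 9.5.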