79 results for Embeddings
Abstract:
Every closed, oriented, real analytic Riemannian 3-manifold can be isometrically embedded as a special Lagrangian submanifold of a Calabi-Yau 3-fold, even as the real locus of an antiholomorphic, isometric involution. Every closed, oriented, real analytic Riemannian 4-manifold whose bundle of self-dual 2-forms is trivial can be isometrically embedded as a coassociative submanifold in a G_2-manifold, even as the fixed locus of an anti-G_2 involution. These results, when coupled with McLean's analysis of the moduli spaces of such calibrated submanifolds, yield a plentiful supply of examples of compact calibrated submanifolds with nontrivial deformation spaces.
Abstract:
There are two main aims of the paper. The first one is to extend the criterion for the precompactness of sets in Banach function spaces to the setting of quasi-Banach function spaces. The second one is to extend the criterion for the precompactness of sets in the Lebesgue spaces $L_p(\mathbb{R}^n)$, $1 \leq p < \infty$, to the so-called power quasi-Banach function spaces.
These criteria are applied to establish compact embeddings of abstract Besov spaces into quasi-Banach function spaces. The results are illustrated on embeddings of Besov spaces $B^s_{p,q}(\mathbb{R}^n)$, $0
Abstract:
Let S(M) be the ring of (continuous) semialgebraic functions on a semialgebraic set M and S*(M) its subring of bounded semialgebraic functions. In this work we compute the size of the fibers of the spectral maps Spec(j)_1: Spec(S(N)) → Spec(S(M)) and Spec(j)_2: Spec(S*(N)) → Spec(S*(M)) induced by the inclusion j: N ↪ M of a semialgebraic subset N of M. The ring S(M) can be understood as the localization of S*(M) at the multiplicative subset W_M of those bounded semialgebraic functions on M with empty zero set. This provides a natural inclusion i_M: Spec(S(M)) ↪ Spec(S*(M)) that reduces both problems above to an analysis of the fibers of the spectral map Spec(j)_2: Spec(S*(N)) → Spec(S*(M)). If we denote Z := Cl_{Spec(S*(M))}(M \ N), the restriction map Spec(j)_2|: Spec(S*(N)) \ Spec(j)_2^{-1}(Z) → Spec(S*(M)) \ Z is a homeomorphism. Our problem thus reduces to computing the size of the fibers of Spec(j)_2 at the points of Z. The size of the fibers of prime ideals "close" to the complement Y := M \ N provides valuable information about how N is immersed inside M. If N is dense in M, the map Spec(j)_2 is surjective and the generic fiber of a prime ideal p ∈ Z contains infinitely many elements. However, finite fibers may also appear, and we provide a criterion to decide when the fiber Spec(j)_2^{-1}(p) is a finite set for p ∈ Z. If such is the case, our procedure allows us to compute the size s of Spec(j)_2^{-1}(p). If in addition N is locally compact and M is pure dimensional, s coincides with the number of minimal prime ideals contained in p. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Abstract:
The development of Next Generation Sequencing has brought biology into the Big Data era. The ever-increasing gap between proteins with known sequences and those with a complete functional annotation requires computational methods for automatic structural and functional annotation. My research has focused on proteins and has so far led to the development of three novel tools, DeepREx, E-SNPs&GO and ISPRED-SEQ, based on Machine and Deep Learning approaches. DeepREx computes the solvent exposure of residues in a protein chain. This problem is relevant for the definition of structural constraints regarding the possible folding of the protein. DeepREx exploits Long Short-Term Memory layers to capture residue-level interactions between positions distant in the sequence, achieving state-of-the-art performance. With DeepREx, I conducted a large-scale analysis investigating the relationship between the solvent exposure of a residue and its probability of being pathogenic upon mutation. E-SNPs&GO predicts the pathogenicity of a Single Residue Variation. Variations occurring in a protein sequence can have different effects, possibly leading to the onset of diseases. E-SNPs&GO exploits protein embeddings generated by two novel Protein Language Models (PLMs), as well as a new way of representing functional information coming from the Gene Ontology. The method achieves state-of-the-art performance and is extremely time-efficient when compared to traditional approaches. ISPRED-SEQ predicts the presence of Protein-Protein Interaction sites in a protein sequence. Knowing how a protein interacts with other molecules is crucial for accurate functional characterization. ISPRED-SEQ exploits a convolutional layer to parse local context after embedding the protein sequence with two novel PLMs, greatly surpassing the current state of the art. All methods are published in international journals and are available as user-friendly web servers.
They have been developed with the standard FAIR guidelines in mind (FAIR: Findable, Accessible, Interoperable, Reusable) and are integrated into the public collection of tools provided by ELIXIR, the European infrastructure for Bioinformatics.
Abstract:
Artificial Intelligence (AI) has substantially influenced numerous disciplines in recent years. Biology, chemistry, and bioinformatics are among them, with significant advances in protein structure prediction, paratope prediction, protein-protein interactions (PPIs), and antibody-antigen interactions. Understanding PPIs is critical because they are involved in practically every process in living organisms and are relevant to vaccines, cancer, immunology, and inflammatory illnesses. Machine Learning (ML) offers enormous potential for effectively simulating antibody-antigen interactions and improving in-silico optimization of therapeutic antibodies for desired features, including binding activity, stability, and low immunogenicity. This research examines the use of AI algorithms to better understand antibody-antigen interactions and discusses several difficulties encountered in the field. Furthermore, we contribute a method that outperforms existing state-of-the-art strategies in paratope prediction from sequence data.
Abstract:
This thesis develops AI methods as a contribution to computational musicology, an interdisciplinary field that studies music with computers. In systematic musicology a composition is defined as the combination of harmony, melody and rhythm. According to de La Borde, harmony alone "merits the name of composition". This thesis focuses on analysing harmony from a computational perspective. We concentrate on symbolic music representation and address the problem of formally representing chord progressions in western music compositions. Informally, chords are sets of pitches played simultaneously, and chord progressions constitute the harmony of a composition. Our approach combines ML techniques with knowledge-based techniques. We design and implement the Modal Harmony ontology (MHO), using OWL. It formalises one of the most important theories in western music: the Modal Harmony Theory. We propose and experiment with different types of embedding methods to encode chords, inspired by NLP and adapted to the music domain. They use statistical (extensional) knowledge, by relying on a huge dataset of chord annotations (ChoCo); intensional knowledge, by relying on MHO; or a combination of the two. The methods are evaluated on two musicologically relevant tasks: chord classification and music structure segmentation. The former is verified by comparing the results of the Odd One Out algorithm to the classification obtained with MHO. Good performance (accuracy: 0.86) is achieved. For the latter, we feed our embeddings to an RNN. Results show that the best performance (F1: 0.6) is achieved with embeddings that combine both approaches. Our method outperforms the state of the art (F1 = 0.42) for symbolic music structure segmentation. It is worth noting that embeddings based only on MHO almost equal the best performance (F1 = 0.58). We remark that those embeddings only require the ontology as an input, as opposed to other approaches that rely on large datasets.
Abstract:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solving an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
Abstract:
In this dissertation, we present several techniques for learning semantic spaces for various domains, for example words and images, but also at the intersection of different domains. A representation space is called semantic if entities judged similar by a human being have their similarity preserved in that space. The first publication presents a pipeline of learning methods, including several unsupervised learning techniques, which allowed us to win the “Unsupervised and Transfer Learning Challenge” competition in 2011. The second article presents a way of extracting information from a structured context (177 object detectors at different positions and scales). We show that using the structure of the data combined with unsupervised learning reduces the dimensionality by 97% while improving scene recognition performance by +5% to +11%, depending on the dataset. In the third work, we examine the structure learned by the deep neural networks used in the two previous publications. Several hypotheses are presented and tested experimentally, showing that the learned space has better mixing properties (making it easier to explore different classes during the sampling process). In the fourth publication, we address a syntactic and semantic parsing problem with recurrent neural networks trained on word context windows. In our fifth work, we propose a way to perform “augmented” image search by learning a joint semantic space in which a search for images containing an object would also return images of the object's parts; for example, a search returning images of “car” would also return images of “windshields”, “trunks” and “wheels” in addition to the initial images.
Abstract:
A profile on a graph G is any nonempty multiset whose elements are vertices from G. The corresponding remoteness function associates to each vertex x ∈ V(G) the sum of distances from x to the vertices in the profile. Starting from some nice and useful properties of the remoteness function in hypercubes, the remoteness function is studied in arbitrary median graphs with respect to their isometric embeddings in hypercubes. In particular, a relation between the vertices in a median graph G whose remoteness function is maximum (antimedian set of G) with the antimedian set of the host hypercube is found. While for odd profiles the antimedian set is an independent set that lies in the strict boundary of a median graph, there exist median graphs in which special even profiles yield a constant remoteness function. We characterize such median graphs in two ways: as the graphs whose periphery transversal number is 2, and as the graphs with the geodetic number equal to 2. Finally, we present an algorithm that, given a graph G on n vertices and m edges, decides in O(m log n) time whether G is a median graph with geodetic number 2.
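The definitions in this abstract can be illustrated with a small sketch. It computes the remoteness function of a profile by breadth-first search and extracts the antimedian set on the 3-dimensional hypercube. The function names (`hypercube_neighbors`, `bfs_distances`, `remoteness`, `antimedian_set`) and the two example profiles are assumptions chosen for illustration; this is not the O(m log n) recognition algorithm from the paper.

```python
from collections import deque
from itertools import product


def hypercube_neighbors(v):
    """Yield the hypercube vertices adjacent to the 0/1 tuple v (one bit flipped)."""
    for i in range(len(v)):
        yield v[:i] + (1 - v[i],) + v[i + 1:]


def bfs_distances(source, neighbors):
    """Return shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in neighbors(u):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def remoteness(profile, vertices, neighbors):
    """Map each vertex x to the sum of distances from x to the profile's vertices.

    The profile is a multiset, so repeated vertices are counted repeatedly.
    """
    total = {v: 0 for v in vertices}
    for p in profile:
        dist = bfs_distances(p, neighbors)
        for v in vertices:
            total[v] += dist[v]
    return total


def antimedian_set(rem):
    """Return the vertices at which the remoteness function is maximum."""
    worst = max(rem.values())
    return sorted(v for v, r in rem.items() if r == worst)


# Illustrative host graph: the hypercube Q3 (hypothetical example data).
q3 = list(product((0, 1), repeat=3))

# Odd profile: (0,0,0) taken twice plus (1,1,0).
odd = remoteness([(0, 0, 0), (0, 0, 0), (1, 1, 0)], q3, hypercube_neighbors)
print(antimedian_set(odd))  # [(1, 1, 1)]

# Even profile made of two antipodal vertices: the remoteness is constant.
even = remoteness([(0, 0, 0), (1, 1, 1)], q3, hypercube_neighbors)
print(set(even.values()))  # {3}
```

In the second example every vertex x satisfies d(x, 000) + d(x, 111) = 3, which is the kind of constant remoteness function for a special even profile that the abstract refers to.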
Abstract:
In this work we study the use of distributional semantics and machine learning to improve statistical machine translation. To that end, we propose a machine learning model based on logistic regression to dynamically model the translation probability of phrases. We prove that the proposed model is a generalization of the standard translation probabilities of statistical machine translation, and we use it to incorporate both contextual and distributional semantic information through lexical features, word clusters and word embeddings. In addition, we explore another approach to integrating distributional semantic knowledge into statistical machine translation: using bilingual word embeddings to model the similarity of phrase translations. Our experiments show the usefulness of the proposed models, obtaining promising results over a strong baseline system. Likewise, our work makes important contributions regarding bilingual embedding mappings and phrase similarity measures based on word embeddings, which have a value of their own in the field of distributional semantics beyond machine translation.
Abstract:
We provide bounds on the upper box-counting dimension of negatively invariant subsets of Banach spaces, a problem that is easily reduced to covering the image of the unit ball under a linear map by a collection of balls of smaller radius. As an application of the abstract theory we show that the global attractors of a very broad class of parabolic partial differential equations (semilinear equations in Banach spaces) are finite-dimensional. (C) 2010 Elsevier Inc. All rights reserved.
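For reference, the two notions named in this abstract are usually defined as follows. The symbols $X$, $A$, $f$ and $N(A,\varepsilon)$ are notation assumed here for illustration and are not taken from the abstract itself.

```latex
% Assumed notation: X is a Banach space, A \subset X is compact,
% f : X \to X is a map, and N(A,\varepsilon) is the least number of
% balls of radius \varepsilon in X needed to cover A.
\[
  \overline{\dim}_B(A)
  \;=\;
  \limsup_{\varepsilon \to 0^{+}}
  \frac{\log N(A,\varepsilon)}{\log(1/\varepsilon)},
  \qquad
  A \text{ is negatively invariant for } f
  \iff
  f(A) \supseteq A .
\]
```

A finite value of the upper box-counting dimension means that the covering numbers $N(A,\varepsilon)$ grow at most polynomially in $1/\varepsilon$, which is why the problem comes down to controlling how many smaller balls are needed to cover the image of a ball.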
Abstract:
The concept of taut submanifold of Euclidean space is due to Carter and West, and can be traced back to the work of Chern and Lashof on immersions with minimal total absolute curvature and the subsequent reformulation of that work by Kuiper in terms of critical point theory. In this paper, we classify the reducible representations of compact simple Lie groups, all of whose orbits are tautly embedded in Euclidean space, with respect to Z_2-coefficients.