891 resultados para High-Dimensional Space Geometrical Informatics (HDSGI)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Internet users consume online targeted advertising based on information collected about them and voluntarily share personal information in social networks. Sensor information and data from smart-phones is collected and used by applications, sometimes in unclear ways. As it happens today with smartphones, in the near future sensors will be shipped in all types of connected devices, enabling ubiquitous information gathering from the physical environment, enabling the vision of Ambient Intelligence. The value of gathered data, if not obvious, can be harnessed through data mining techniques and put to use by enabling personalized and tailored services as well as business intelligence practices, fueling the digital economy. However, the ever-expanding information gathering and use undermines the privacy conceptions of the past. Natural social practices of managing privacy in daily relations are overridden by socially-awkward communication tools, service providers struggle with security issues resulting in harmful data leaks, governments use mass surveillance techniques, the incentives of the digital economy threaten consumer privacy, and the advancement of consumergrade data-gathering technology enables new inter-personal abuses. A wide range of fields attempts to address technology-related privacy problems, however they vary immensely in terms of assumptions, scope and approach. Privacy of future use cases is typically handled vertically, instead of building upon previous work that can be re-contextualized, while current privacy problems are typically addressed per type in a more focused way. Because significant effort was required to make sense of the relations and structure of privacy-related work, this thesis attempts to transmit a structured view of it. It is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and addresses privacy issues of different natures. As existing work is framed and discussed, the contributions to the state-of-theart done in the scope of this thesis are presented. The contributions add to five distinct areas: 1) identity in distributed systems; 2) future context-aware services; 3) event-based context management; 4) low-latency information flow control; 5) high-dimensional dataset anonymity. Finally, having laid out such landscape of the privacy-preserving work, the current and future privacy challenges are discussed, considering not only technical but also socio-economic perspectives.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract The ultimate problem considered in this thesis is modeling a high-dimensional joint distribution over a set of discrete variables. For this purpose, we consider classes of context-specific graphical models and the main emphasis is on learning the structure of such models from data. Traditional graphical models compactly represent a joint distribution through a factorization justi ed by statements of conditional independence which are encoded by a graph structure. Context-speci c independence is a natural generalization of conditional independence that only holds in a certain context, speci ed by the conditioning variables. We introduce context-speci c generalizations of both Bayesian networks and Markov networks by including statements of context-specific independence which can be encoded as a part of the model structures. For the purpose of learning context-speci c model structures from data, we derive score functions, based on results from Bayesian statistics, by which the plausibility of a structure is assessed. To identify high-scoring structures, we construct stochastic and deterministic search algorithms designed to exploit the structural decomposition of our score functions. Numerical experiments on synthetic and real-world data show that the increased exibility of context-specific structures can more accurately emulate the dependence structure among the variables and thereby improve the predictive accuracy of the models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current practice for analysing functional neuroimaging data is to average the brain signals recorded at multiple sensors or channels on the scalp over time across hundreds of trials or replicates to eliminate noise and enhance the underlying signal of interest. These studies recording brain signals non-invasively using functional neuroimaging techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) generate complex, high dimensional and noisy data for many subjects at a number of replicates. Single replicate (or single trial) analysis of neuroimaging data have gained focus as they are advantageous to study the features of the signals at each replicate without averaging out important features in the data that the current methods employ. The research here is conducted to systematically develop flexible regression mixed models for single trial analysis of specific brain activities using examples from EEG and MEG to illustrate the models. This thesis follows three specific themes: i) artefact correction to estimate the `brain' signal which is of interest, ii) characterisation of the signals to reduce their dimensions, and iii) model fitting for single trials after accounting for variations between subjects and within subjects (between replicates). The models are developed to establish evidence of two specific neurological phenomena - entrainment of brain signals to an $\alpha$ band of frequencies (8-12Hz) and dipolar brain activation in the same $\alpha$ frequency band in an EEG experiment and a MEG study, respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent progress in the technology for single unit recordings has given the neuroscientific community theopportunity to record the spiking activity of large neuronal populations. At the same pace, statistical andmathematical tools were developed to deal with high-dimensional datasets typical of such recordings.A major line of research investigates the functional role of subsets of neurons with significant co-firingbehavior: the Hebbian cell assemblies. Here we review three linear methods for the detection of cellassemblies in large neuronal populations that rely on principal and independent component analysis.Based on their performance in spike train simulations, we propose a modified framework that incorpo-rates multiple features of these previous methods. We apply the new framework to actual single unitrecordings and show the existence of cell assemblies in the rat hippocampus, which typically oscillate attheta frequencies and couple to different phases of the underlying field rhythm

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Personal information is increasingly gathered and used for providing services tailored to user preferences, but the datasets used to provide such functionality can represent serious privacy threats if not appropriately protected. Work in privacy-preserving data publishing targeted privacy guarantees that protect against record re-identification, by making records indistinguishable, or sensitive attribute value disclosure, by introducing diversity or noise in the sensitive values. However, most approaches fail in the high-dimensional case, and the ones that don’t introduce a utility cost incompatible with tailored recommendation scenarios. This paper aims at a sensible trade-off between privacy and the benefits of tailored recommendations, in the context of privacy-preserving data publishing. We empirically demonstrate that significant privacy improvements can be achieved at a utility cost compatible with tailored recommendation scenarios, using a simple partition-based sanitization method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Compressed covariance sensing using quadratic samplers is gaining increasing interest in recent literature. Covariance matrix often plays the role of a sufficient statistic in many signal and information processing tasks. However, owing to the large dimension of the data, it may become necessary to obtain a compressed sketch of the high dimensional covariance matrix to reduce the associated storage and communication costs. Nested sampling has been proposed in the past as an efficient sub-Nyquist sampling strategy that enables perfect reconstruction of the autocorrelation sequence of Wide-Sense Stationary (WSS) signals, as though it was sampled at the Nyquist rate. The key idea behind nested sampling is to exploit properties of the difference set that naturally arises in quadratic measurement model associated with covariance compression. In this thesis, we will focus on developing novel versions of nested sampling for low rank Toeplitz covariance estimation, and phase retrieval, where the latter problem finds many applications in high resolution optical imaging, X-ray crystallography and molecular imaging. The problem of low rank compressive Toeplitz covariance estimation is first shown to be fundamentally related to that of line spectrum recovery. In absence if noise, this connection can be exploited to develop a particular kind of sampler called the Generalized Nested Sampler (GNS), that can achieve optimal compression rates. In presence of bounded noise, we develop a regularization-free algorithm that provably leads to stable recovery of the high dimensional Toeplitz matrix from its order-wise minimal sketch acquired using a GNS. Contrary to existing TV-norm and nuclear norm based reconstruction algorithms, our technique does not use any tuning parameters, which can be of great practical value. The idea of nested sampling idea also finds a surprising use in the problem of phase retrieval, which has been of great interest in recent times for its convex formulation via PhaseLift, By using another modified version of nested sampling, namely the Partial Nested Fourier Sampler (PNFS), we show that with probability one, it is possible to achieve a certain conjectured lower bound on the necessary measurement size. Moreover, for sparse data, an l1 minimization based algorithm is proposed that can lead to stable phase retrieval using order-wise minimal number of measurements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In these last years a great effort has been put in the development of new techniques for automatic object classification, also due to the consequences in many applications such as medical imaging or driverless cars. To this end, several mathematical models have been developed from logistic regression to neural networks. A crucial aspect of these so called classification algorithms is the use of algebraic tools to represent and approximate the input data. In this thesis, we examine two different models for image classification based on a particular tensor decomposition named Tensor-Train (TT) decomposition. The use of tensor approaches preserves the multidimensional structure of the data and the neighboring relations among pixels. Furthermore the Tensor-Train, differently from other tensor decompositions, does not suffer from the curse of dimensionality making it an extremely powerful strategy when dealing with high-dimensional data. It also allows data compression when combined with truncation strategies that reduce memory requirements without spoiling classification performance. The first model we propose is based on a direct decomposition of the database by means of the TT decomposition to find basis vectors used to classify a new object. The second model is a tensor dictionary learning model, based on the TT decomposition where the terms of the decomposition are estimated using a proximal alternating linearized minimization algorithm with a spectral stepsize.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The thesis deals with the problem of Model Selection (MS) motivated by information and prediction theory, focusing on parametric time series (TS) models. The main contribution of the thesis is the extension to the multivariate case of the Misspecification-Resistant Information Criterion (MRIC), a criterion introduced recently that solves Akaike’s original research problem posed 50 years ago, which led to the definition of the AIC. The importance of MS is witnessed by the huge amount of literature devoted to it and published in scientific journals of many different disciplines. Despite such a widespread treatment, the contributions that adopt a mathematically rigorous approach are not so numerous and one of the aims of this project is to review and assess them. Chapter 2 discusses methodological aspects of MS from information theory. Information criteria (IC) for the i.i.d. setting are surveyed along with their asymptotic properties; and the cases of small samples, misspecification, further estimators. Chapter 3 surveys criteria for TS. IC and prediction criteria are considered for: univariate models (AR, ARMA) in the time and frequency domain, parametric multivariate (VARMA, VAR); nonparametric nonlinear (NAR); and high-dimensional models. The MRIC answers Akaike’s original question on efficient criteria, for possibly-misspecified (PM) univariate TS models in multi-step prediction with high-dimensional data and nonlinear models. Chapter 4 extends the MRIC to PM multivariate TS models for multi-step prediction introducing the Vectorial MRIC (VMRIC). We show that the VMRIC is asymptotically efficient by proving the decomposition of the MSPE matrix and the consistency of its Method-of-Moments Estimator (MoME), for Least Squares multi-step prediction with univariate regressor. Chapter 5 extends the VMRIC to the general multiple regressor case, by showing that the MSPE matrix decomposition holds, obtaining consistency for its MoME, and proving its efficiency. The chapter concludes with a digression on the conditions for PM VARX models.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper describes informatics for cross-sample analysis with comprehensive two-dimensional gas chromatography (GCxGC) and high-resolution mass spectrometry (HRMS). GCxGC-HRMS analysis produces large data sets that are rich with information, but highly complex. The size of the data and volume of information requires automated processing for comprehensive cross-sample analysis, but the complexity poses a challenge for developing robust methods. The approach developed here analyzes GCxGC-HRMS data from multiple samples to extract a feature template that comprehensively captures the pattern of peaks detected in the retention-times plane. Then, for each sample chromatogram, the template is geometrically transformed to align with the detected peak pattern and generate a set of feature measurements for cross-sample analyses such as sample classification and biomarker discovery. The approach avoids the intractable problem of comprehensive peak matching by using a few reliable peaks for alignment and peak-based retention-plane windows to define comprehensive features that can be reliably matched for cross-sample analysis. The informatics are demonstrated with a set of 18 samples from breast-cancer tumors, each from different individuals, six each for Grades 1-3. The features allow classification that matches grading by a cancer pathologist with 78% success in leave-one-out cross-validation experiments. The HRMS signatures of the features of interest can be examined for determining elemental compositions and identifying compounds.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

The aim of the present study is to determine the level of correlation between the 3-dimensional (3D) characteristics of trabecular bone microarchitecture, as evaluated using microcomputed tomography (μCT) reconstruction, and trabecular bone score (TBS), as evaluated using 2D projection images directly derived from 3D μCT reconstruction (TBSμCT). Moreover, we have evaluated the effects of image degradation (resolution and noise) and X-ray energy of projection on these correlations. Thirty human cadaveric vertebrae were acquired on a microscanner at an isotropic resolution of 93μm. The 3D microarchitecture parameters were obtained using MicroView (GE Healthcare, Wauwatosa, MI). The 2D projections of these 3D models were generated using the Beer-Lambert law at different X-ray energies. Degradation of image resolution was simulated (from 93 to 1488μm). Relationships between 3D microarchitecture parameters and TBSμCT at different resolutions were evaluated using linear regression analysis. Significant correlations were observed between TBSμCT and 3D microarchitecture parameters, regardless of the resolution. Correlations were detected that were strongly to intermediately positive for connectivity density (0.711≤r(2)≤0.752) and trabecular number (0.584≤r(2)≤0.648) and negative for trabecular space (-0.407 ≤r(2)≤-0.491), up to a pixel size of 1023μm. In addition, TBSμCT values were strongly correlated between each other (0.77≤r(2)≤0.96). Study results show that the correlations between TBSμCT at 93μm and 3D microarchitecture parameters are weakly impacted by the degradation of image resolution and the presence of noise.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

L'imagerie par résonance magnétique (IRM) peut fournir aux cardiologues des informations diagnostiques importantes sur l'état de la maladie de l'artère coronarienne dans les patients. Le défi majeur pour l'IRM cardiaque est de gérer toutes les sources de mouvement qui peuvent affecter la qualité des images en réduisant l'information diagnostique. Cette thèse a donc comme but de développer des nouvelles techniques d'acquisitions des images IRM, en changeant les techniques de compensation du mouvement, pour en augmenter l'efficacité, la flexibilité, la robustesse et pour obtenir plus d'information sur le tissu et plus d'information temporelle. Les techniques proposées favorisent donc l'avancement de l'imagerie des coronaires dans une direction plus maniable et multi-usage qui peut facilement être transférée dans l'environnement clinique. La première partie de la thèse s'est concentrée sur l'étude du mouvement des artères coronariennes sur des patients en utilisant la techniques d'imagerie standard (rayons x), pour mesurer la précision avec laquelle les artères coronariennes retournent dans la même position battement après battement (repositionnement des coronaires). Nous avons découvert qu'il y a des intervalles dans le cycle cardiaque, tôt dans la systole et à moitié de la diastole, où le repositionnement des coronaires est au minimum. En réponse nous avons développé une nouvelle séquence d'acquisition (T2-post) capable d'acquérir les données aussi tôt dans la systole. Cette séquence a été testée sur des volontaires sains et on a pu constater que la qualité de visualisation des artère coronariennes est égale à celle obtenue avec les techniques standard. De plus, le rapport signal sur bruit fourni par la séquence d'acquisition proposée est supérieur à celui obtenu avec les techniques d'imagerie standard. La deuxième partie de la thèse a exploré un paradigme d'acquisition des images cardiaques complètement nouveau pour l'imagerie du coeur entier. La technique proposée dans ce travail acquiert les données sans arrêt (free-running) au lieu d'être synchronisée avec le mouvement cardiaque. De cette façon, l'efficacité de la séquence d'acquisition est augmentée de manière significative et les images produites représentent le coeur entier dans toutes les phases cardiaques (quatre dimensions, 4D). Par ailleurs, l'auto-navigation de la respiration permet d'effectuer cette acquisition en respiration libre. Cette technologie rend possible de visualiser et évaluer l'anatomie du coeur et de ses vaisseaux ainsi que la fonction cardiaque en quatre dimensions et avec une très haute résolution spatiale et temporelle, sans la nécessité d'injecter un moyen de contraste. Le pas essentiel qui a permis le développement de cette technique est l'utilisation d'une trajectoire d'acquisition radiale 3D basée sur l'angle d'or. Avec cette trajectoire, il est possible d'acquérir continûment les données d'espace k, puis de réordonner les données et choisir les paramètres temporel des images 4D a posteriori. L'acquisition 4D a été aussi couplée avec un algorithme de reconstructions itératif (compressed sensing) qui permet d'augmenter la résolution temporelle tout en augmentant la qualité des images. Grâce aux images 4D, il est possible maintenant de visualiser les artères coronariennes entières dans chaque phase du cycle cardiaque et, avec les mêmes données, de visualiser et mesurer la fonction cardiaque. La qualité des artères coronariennes dans les images 4D est la même que dans les images obtenues avec une acquisition 3D standard, acquise en diastole Par ailleurs, les valeurs de fonction cardiaque mesurées au moyen des images 4D concorde avec les valeurs obtenues avec les images 2D standard. Finalement, dans la dernière partie de la thèse une technique d'acquisition a temps d'écho ultra-court (UTE) a été développée pour la visualisation in vivo des calcifications des artères coronariennes. Des études récentes ont démontré que les acquisitions UTE permettent de visualiser les calcifications dans des plaques athérosclérotiques ex vivo. Cepandent le mouvement du coeur a entravé jusqu'à maintenant l'utilisation des techniques UTE in vivo. Pour résoudre ce problème nous avons développé une séquence d'acquisition UTE avec trajectoire radiale 3D et l'avons testée sur des volontaires. La technique proposée utilise une auto-navigation 3D pour corriger le mouvement respiratoire et est synchronisée avec l'ECG. Trois échos sont acquis pour extraire le signal de la calcification avec des composants au T2 très court tout en permettant de séparer le signal de la graisse depuis le signal de l'eau. Les résultats sont encore préliminaires mais on peut affirmer que la technique développé peut potentiellement montrer les calcifications des artères coronariennes in vivo. En conclusion, ce travail de thèse présente trois nouvelles techniques pour l'IRM du coeur entier capables d'améliorer la visualisation et la caractérisation de la maladie athérosclérotique des coronaires. Ces techniques fournissent des informations anatomiques et fonctionnelles en quatre dimensions et des informations sur la composition du tissu auparavant indisponibles. CORONARY artery magnetic resonance imaging (MRI) has the potential to provide the cardiologist with relevant diagnostic information relative to coronary artery disease of patients. The major challenge of cardiac MRI, though, is dealing with all sources of motions that can corrupt the images affecting the diagnostic information provided. The current thesis, thus, focused on the development of new MRI techniques that change the standard approach to cardiac motion compensation in order to increase the efficiency of cardioavscular MRI, to provide more flexibility and robustness, new temporal information and new tissue information. The proposed approaches help in advancing coronary magnetic resonance angiography (MRA) in the direction of an easy-to-use and multipurpose tool that can be translated to the clinical environment. The first part of the thesis focused on the study of coronary artery motion through gold standard imaging techniques (x-ray angiography) in patients, in order to measure the precision with which the coronary arteries assume the same position beat after beat (coronary artery repositioning). We learned that intervals with minimal coronary artery repositioning occur in peak systole and in mid diastole and we responded with a new pulse sequence (T2~post) that is able to provide peak-systolic imaging. Such a sequence was tested in healthy volunteers and, from the image quality comparison, we learned that the proposed approach provides coronary artery visualization and contrast-to-noise ratio (CNR) comparable with the standard acquisition approach, but with increased signal-to-noise ratio (SNR). The second part of the thesis explored a completely new paradigm for whole- heart cardiovascular MRI. The proposed techniques acquires the data continuously (free-running), instead of being triggered, thus increasing the efficiency of the acquisition and providing four dimensional images of the whole heart, while respiratory self navigation allows for the scan to be performed in free breathing. This enabling technology allows for anatomical and functional evaluation in four dimensions, with high spatial and temporal resolution and without the need for contrast agent injection. The enabling step is the use of a golden-angle based 3D radial trajectory, which allows for a continuous sampling of the k-space and a retrospective selection of the timing parameters of the reconstructed dataset. The free-running 4D acquisition was then combined with a compressed sensing reconstruction algorithm that further increases the temporal resolution of the 4D dataset, while at the same time increasing the overall image quality by removing undersampling artifacts. The obtained 4D images provide visualization of the whole coronary artery tree in each phases of the cardiac cycle and, at the same time, allow for the assessment of the cardiac function with a single free- breathing scan. The quality of the coronary arteries provided by the frames of the free-running 4D acquisition is in line with the one obtained with the standard ECG-triggered one, and the cardiac function evaluation matched the one measured with gold-standard stack of 2D cine approaches. Finally, the last part of the thesis focused on the development of ultrashort echo time (UTE) acquisition scheme for in vivo detection of calcification in the coronary arteries. Recent studies showed that UTE imaging allows for the coronary artery plaque calcification ex vivo, since it is able to detect the short T2 components of the calcification. The heart motion, though, prevented this technique from being applied in vivo. An ECG-triggered self-navigated 3D radial triple- echo UTE acquisition has then been developed and tested in healthy volunteers. The proposed sequence combines a 3D self-navigation approach with a 3D radial UTE acquisition enabling data collection during free breathing. Three echoes are simultaneously acquired to extract the short T2 components of the calcification while a water and fat separation technique allows for proper visualization of the coronary arteries. Even though the results are still preliminary, the proposed sequence showed great potential for the in vivo visualization of coronary artery calcification. In conclusion, the thesis presents three novel MRI approaches aimed at improved characterization and assessment of atherosclerotic coronary artery disease. These approaches provide new anatomical and functional information in four dimensions, and support tissue characterization for coronary artery plaques.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Three comprehensive one-dimensional simulators were used on the same PC to simulate the dynamics of different electrophoretic configurations, including two migrating hybrid boundaries, an isotachophoretic boundary and the zone electrophoretic separation of ten monovalent anions. Two simulators, SIMUL5 and GENTRANS, use a uniform grid, while SPRESSO uses a dynamic adaptive grid. The simulators differ in the way components are handled. SIMUL5 and SPRESSO feature one equation for all components, whereas GENTRANS is based on the use of separate modules for the different types of monovalent components, a module for multivalent components and a module for proteins. The code for multivalent components is executed more slowly compared to those for monovalent components. Furthermore, with SIMUL5, the computational time interval becomes smaller when it is operated with a reduced calculation space that features moving borders, whereas GENTRANS offers the possibility of using data smoothing (removal of negative concentrations), which can avoid numerical oscillations and speed up a simulation. SPRESSO with its adaptive grid could be employed to simulate the same configurations with smaller numbers of grid points and thus is faster in certain but not all cases. The data reveal that simulations featuring a large number of monovalent components distributed such that a high mesh is required throughout a large proportion of the column are fastest executed with GENTRANS.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

We introduce a new fiber-optical approach for reflection based refractive index mapping. Our approach leads to improved stability and reliability over existing free-space confocal instruments and significantly cuts alignment efforts and reduces the number of components needed. Other than properly cleaved fiber end-faces, this setup requires no additional sample preparation. The instrument is calibrated by means of a set of samples with known refractive indices. The index steps of commercially available fibers are measured accurately down to < 10⁻³. The precision limit of the instrument is currently of the order of 10⁻⁴.