914 results for High-dimensional index structure
Abstract:
In face recognition, where high-dimensional representation spaces are generally used, it is very important to take advantage of all the available information. In particular, many labelled facial images accumulate while the recognition system is running, and for practical reasons some of them are often discarded. In this paper, we propose an algorithm that uses this information. The algorithm is fundamentally incremental. It also combines the classification results for the images in the input sequence. Experiments with sequences obtained from a real person detection and tracking system allow us to analyze the performance of the algorithm, as well as its potential improvements.
Abstract:
Obesity is a recognised public health problem. Over the last decade, abdominal obesity (AO) has been regarded as a metabolic disorder that contributes more to the risk of diabetes and cardiovascular disease than general obesity as defined by body mass index. However, in populations of African origin, the relationship between AO and the other biomarkers of cardiometabolic risk (CMR) remains unclear because of the lack of studies in these populations and the absence of specific cut-off values for identifying AO. This study aimed to compare the prevalence of CMR biomarkers (AO, hypertension, hyperglycaemia, dyslipidaemia, insulin resistance and subclinical inflammation) among Beninese in Cotonou and Haitians in Port-au-Prince (PAP); to examine the association of AO with the other CMR biomarkers; to document the role of socio-economic status (SES) and lifestyle in this association; and to identify the anthropometric indicators of AO, namely waist circumference (WC) and the waist-to-height ratio (WHtR), and the cut-offs that best predict CMR in Cotonou and PAP. This was a cross-sectional analysis of data from 452 apparently healthy adults (52% men) aged 25 to 60 years, 200 living in Cotonou (Benin) and 252 in PAP (Haiti). The CMR biomarkers considered were: metabolic syndrome (MetS) according to the 2009 harmonised criteria, and its individual components (AO defined as WC ≥ 94 cm in men and ≥ 80 cm in women, hypertension, dyslipidaemia and hyperglycaemia); insulin resistance, defined across all study subjects from the 75th percentile of the Homeostasis Model Assessment (HOMA-IR); a high atherogenicity ratio (total serum cholesterol/HDL cholesterol); and subclinical inflammation, measured as a high-sensitivity C-reactive protein (hs-CRP) level between 3 and 10 mg/l.
The WHtR was also used to define AO, with a cut-off of 0.5. Data on dietary habits, alcohol consumption, smoking, sociodemographic characteristics and socio-economic conditions, including education level and an income proxy (based on principal component analysis of assets and possessions), were collected by questionnaire. Based on frequency-of-consumption data for Western, urban and traditional foods, dietary patterns of the subjects in each city were identified by cluster analysis. The validity and cut-off values of WC and WHtR for predicting CMR were determined from ROC (Receiver Operating Characteristic) curves. MetS was present in 21.5% and 16.1% of participants in Cotonou and PAP, respectively. The prevalence of AO was higher in Cotonou (52.5%) than in PAP (36%), and higher in women than in men. The serum lipid profile was more atherogenic in PAP, with low HDL-C in 89.3% of subjects in PAP versus 79.7% in Cotonou, and a high TC/HDL-C ratio in 73.4% in PAP versus 42% in Cotonou. The specific cut-off values for WC and WHtR were, respectively, 94 cm and 0.59 in women and 80 cm and 0.50 in men. Multivariate analyses of AO with the most prevalent CMR biomarkers in these two populations showed that AO was associated with an increased risk of insulin resistance, atherogenicity and high blood pressure, independently of socio-economic and lifestyle factors. Two dietary patterns, transitional and traditional, emerged in each city, but they were not associated with CMR biomarkers, although they were related to socio-economic variables. This study confirms the presence of several CMR biomarkers in apparently healthy subjects.
Moreover, AO is a key component of CMR in these two populations. Current WC cut-offs may need to be reconsidered in the light of larger studies, in order to better define AO among Black Africans and people of African origin, which will allow more adequate epidemiological surveillance of CMR biomarkers.
Abstract:
Recent progress in the technology for single unit recordings has given the neuroscientific community the opportunity to record the spiking activity of large neuronal populations. At the same pace, statistical and mathematical tools were developed to deal with high-dimensional datasets typical of such recordings. A major line of research investigates the functional role of subsets of neurons with significant co-firing behavior: the Hebbian cell assemblies. Here we review three linear methods for the detection of cell assemblies in large neuronal populations that rely on principal and independent component analysis. Based on their performance in spike train simulations, we propose a modified framework that incorporates multiple features of these previous methods. We apply the new framework to actual single unit recordings and show the existence of cell assemblies in the rat hippocampus, which typically oscillate at theta frequencies and couple to different phases of the underlying field rhythm.
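The abstract does not spell out the three linear methods it reviews. As a rough illustration of how principal component analysis can flag co-firing groups, the sketch below z-scores binned spike counts, takes the eigenvalues of the neuron-by-neuron correlation matrix, and keeps those above the Marchenko-Pastur upper bound expected for independent neurons. The choice of that bound as the significance threshold, the toy simulation, and the names `simulate_counts` and `detect_assemblies` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def simulate_counts(n_neurons=20, n_bins=5000, members=(0, 1, 2, 3, 4), seed=0):
    """Toy data (hypothetical): independent Poisson counts plus one co-firing group."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(1.0, size=(n_neurons, n_bins)).astype(float)
    active = rng.random(n_bins) < 0.05  # bins in which the group fires together
    counts[np.ix_(list(members), np.flatnonzero(active))] += 5.0
    return counts


def detect_assemblies(counts):
    """Return eigenvalues/eigenvectors of the correlation matrix above the bound.

    counts has shape (n_neurons, n_bins). The threshold is the Marchenko-Pastur
    upper edge for uncorrelated, unit-variance variables (an assumption here).
    """
    n_neurons, n_bins = counts.shape
    mean = counts.mean(axis=1, keepdims=True)
    std = counts.std(axis=1, keepdims=True)
    z = (counts - mean) / std                    # z-score each neuron
    corr = (z @ z.T) / n_bins                    # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    bound = (1.0 + np.sqrt(n_neurons / n_bins)) ** 2
    keep = eigvals > bound
    return eigvals[keep], eigvecs[:, keep], bound
```

Calling `detect_assemblies(simulate_counts())` returns at least one component above the bound, and the neurons with the largest absolute weights in the leading eigenvector are the embedded group. Finite-size fluctuations can push a bulk eigenvalue slightly over the bound, so the number of retained components is best read as an upper estimate.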
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Current practice for analysing functional neuroimaging data is to average the brain signals recorded at multiple sensors or channels on the scalp over time across hundreds of trials or replicates, to eliminate noise and enhance the underlying signal of interest. Studies that record brain signals non-invasively using functional neuroimaging techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) generate complex, high-dimensional and noisy data for many subjects at a number of replicates. Single replicate (or single trial) analysis of neuroimaging data has gained attention because it allows the features of the signals to be studied at each replicate without averaging out important features in the data, as current methods do. The research here systematically develops flexible regression mixed models for single trial analysis of specific brain activities, using examples from EEG and MEG to illustrate the models. This thesis follows three specific themes: i) artefact correction to estimate the 'brain' signal of interest, ii) characterisation of the signals to reduce their dimensions, and iii) model fitting for single trials after accounting for variations between subjects and within subjects (between replicates). The models are developed to establish evidence of two specific neurological phenomena: entrainment of brain signals to an $\alpha$ band of frequencies (8-12Hz) in an EEG experiment, and dipolar brain activation in the same $\alpha$ frequency band in an MEG study.
Abstract:
Personal information is increasingly gathered and used for providing services tailored to user preferences, but the datasets used to provide such functionality can represent serious privacy threats if not appropriately protected. Work in privacy-preserving data publishing has targeted privacy guarantees that protect against record re-identification, by making records indistinguishable, or against sensitive attribute value disclosure, by introducing diversity or noise in the sensitive values. However, most approaches fail in the high-dimensional case, and those that do not fail introduce a utility cost incompatible with tailored recommendation scenarios. This paper aims at a sensible trade-off between privacy and the benefits of tailored recommendations, in the context of privacy-preserving data publishing. We empirically demonstrate that significant privacy improvements can be achieved at a utility cost compatible with tailored recommendation scenarios, using a simple partition-based sanitization method.
Abstract:
Objective: To evaluate the knowledge of diabetes diet and identify factors that may interfere with the adherence to nutritional therapy and food choices of participants in a Community Center for the Elderly in Sairé, PE. Methods: A quantitative, descriptive and cross-sectional study, which evaluated 39 attendees of that center, with or without diabetes mellitus, from July to August 2014. Two questionnaires were applied to assess socioeconomic data, nutrition knowledge and cultural factors, and to check the consumption of foods with high and low glycemic index. Data were analyzed using the Assistat Program 7.0 Beta version. Results: The majority of the respondents have knowledge about types of foods that may influence the treatment of diabetes mellitus, as 51.2% (n=20) reported knowing some food that can reduce the risk of diabetes onset or assist in its treatment. Most of the participants reported having acquired such knowledge through television (35%, n=7) and conversation with peers (35%, n=7). Evaluation of food intake showed higher consumption of foods with high glycemic index. However, among diabetic patients, foods with low glycemic index are consumed more times per week. Conclusion: The knowledge about nutrition and diabetes mellitus was considered adequate, but socioeconomic and cultural factors may interfere with adherence to diet therapy for diabetes or with the food choices made by the individuals. However, food consumption was considered appropriate among diabetics.
Abstract:
The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encode high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. 
Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
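To make the idea of encoding high-dimensional descriptors into compact binary strings concrete, here is a minimal sketch that uses random hyperplane (sign-of-projection) hashing followed by a Hamming-distance search. This is a deliberately simpler, unsupervised stand-in: it is not the Gaussian-process or SVM-based hashing described in the abstract, and the function names and parameters are hypothetical.

```python
import numpy as np


def fit_random_hyperplanes(dim, n_bits, seed=0):
    """Draw one random hyperplane per output bit (stand-in for learned hash functions)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))


def encode(x, planes):
    """Map real-valued descriptors (n, dim) to binary codes (n, n_bits)."""
    return (np.atleast_2d(x) @ planes > 0).astype(np.uint8)


def hamming_search(query_code, db_codes, k):
    """Return indices and Hamming distances of the k closest database codes."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]
```

A typical use is to encode the database once with `encode(db, planes)`, encode each query with the same `planes`, and call `hamming_search`. Comparing short binary codes is much cheaper than exhaustive search over the original descriptors, which is the efficiency argument the abstract makes; learned hash functions aim to make the codes more semantically meaningful than these random ones.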
Abstract:
Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as 'Semantic Binary Codes' and the unique code for all same class images as 'Class Binary Code'. We also propose a new class based Hamming metric that dramatically reduces the retrieval times for larger databases, where the Hamming distance is computed only to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. 
For a given query image, we want to retrieve images that preserve the relative order, i.e. we want to retrieve all same class images first, and then the related class images before different class images. We learn such relationship aware binary codes by minimizing the discrepancy between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state of the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have been shown to improve the performance of many computer vision applications including retrieval. In the third part, we discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm, which is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. 
To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.
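The class-based Hamming lookup described in the first part of this abstract can be sketched as follows: the query code is compared only against one code per class, rather than against every database image, and the images stored under the nearest class code are returned. This is a minimal illustration under assumptions; how the class binary codes are learned is not shown, and the names `nearest_class`, `retrieve` and `class_to_images` are hypothetical.

```python
import numpy as np


def nearest_class(query_code, class_codes):
    """Index of the class whose binary code is closest in Hamming distance."""
    query_code = np.asarray(query_code)
    class_codes = np.asarray(class_codes)
    dists = np.count_nonzero(class_codes != query_code, axis=1)
    return int(np.argmin(dists))


def retrieve(query_code, class_codes, class_to_images):
    """Return the image ids stored under the nearest class code.

    The cost scales with the number of classes, not the number of images.
    """
    return class_to_images[nearest_class(query_code, class_codes)]
```

For example, with `class_codes = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0]]`, a query code `[1, 1, 0, 1]` is one bit away from class 1 and is therefore answered with the images filed under class 1.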
Abstract:
Understanding the meaning of life, and the different meanings that individuals attribute to their lives, is a challenge that has accompanied human beings since the earliest times, helping them deal with perplexities. Meanings of life can be regarded as values, since they guide human conduct and can be learned and shared. This article aimed to compare the structure of values relating to meaning of life among Brazilian and Portuguese managers. Its purpose was to deepen understanding of the similarities and differences between managers from two cultures, Brazilian and Portuguese, which have intertwined histories yet distinct identities. In this exploratory-descriptive study, data were collected by administering the meaning-of-life values scale (VSV) to a sample of 187 Brazilian and 71 Portuguese managers and were analysed by multidimensional scaling. The results showed a single structure of meaning-of-life values for both samples, although with differences of nuance in the composition of the categories it covers, which were interpreted in light of the peculiarities of each culture. The structure found was a radex, composed of two concentric circles: the outer circle contained the meaning-of-life value Spiritual Evolution, and the inner circle a two-dimensional polar structure covering the other theoretically predicted categories: Human Solidarity versus Self-Realization, and Relationships with Close People versus Personal Evolution.
Abstract:
This work focuses on the development and application of dynamic simulation software for studying the structure of hard metal (WC-Co). The hardware used to increase computing capacity was a GeForce 9600 GT GPU together with the PhysX chip, originally created to make games more realistic. The software simulates the three-dimensional carbide structure inside a cubic box, in which tungsten carbide (WC) grains are modeled as triangular prisms and truncated triangular prisms. The program proved effective in verification tests, ranging from calculations of parameter measures to its capacity to increase the number of dynamically simulated particles. It was possible to investigate both the mean values and the distributions of the stereological parameters used to characterize the carbide structure through cutting planes. Based on the cutting planes of the analyzed structures, we investigated the linear intercepts, the area intercepts, and the section perimeters of the intercepted grains, as well as the binder phase of the structure, by calculating the mean value and distribution of the free path. Since the literature shows almost consensually that the distribution of linear intercepts is lognormal, this suggests that the grain distribution is also lognormal. A routine was therefore developed in the program to allow a more detailed investigation of this issue. We observed that, for certain values of the parameters that define the shape and size of the prismatic grains, it is possible to obtain linear intercept distributions that approach the lognormal shape. Across the simulations performed, we observed that the distribution curves of the linear and area intercepts and of the section perimeters are consistent with static computer simulation studies of these parameters.
Abstract:
Compressed covariance sensing using quadratic samplers is gaining increasing interest in recent literature. The covariance matrix often plays the role of a sufficient statistic in many signal and information processing tasks. However, owing to the large dimension of the data, it may become necessary to obtain a compressed sketch of the high dimensional covariance matrix to reduce the associated storage and communication costs. Nested sampling has been proposed in the past as an efficient sub-Nyquist sampling strategy that enables perfect reconstruction of the autocorrelation sequence of Wide-Sense Stationary (WSS) signals, as though it were sampled at the Nyquist rate. The key idea behind nested sampling is to exploit properties of the difference set that naturally arises in the quadratic measurement model associated with covariance compression. In this thesis, we focus on developing novel versions of nested sampling for low rank Toeplitz covariance estimation and for phase retrieval, where the latter problem finds many applications in high resolution optical imaging, X-ray crystallography and molecular imaging. The problem of low rank compressive Toeplitz covariance estimation is first shown to be fundamentally related to that of line spectrum recovery. In the absence of noise, this connection can be exploited to develop a particular kind of sampler called the Generalized Nested Sampler (GNS), which can achieve optimal compression rates. In the presence of bounded noise, we develop a regularization-free algorithm that provably leads to stable recovery of the high dimensional Toeplitz matrix from its order-wise minimal sketch acquired using a GNS. Contrary to existing TV-norm and nuclear norm based reconstruction algorithms, our technique does not use any tuning parameters, which can be of great practical value. 
The idea of nested sampling also finds a surprising use in the problem of phase retrieval, which has been of great interest in recent times for its convex formulation via PhaseLift. By using another modified version of nested sampling, namely the Partial Nested Fourier Sampler (PNFS), we show that with probability one, it is possible to achieve a certain conjectured lower bound on the necessary measurement size. Moreover, for sparse data, an l1 minimization based algorithm is proposed that can lead to stable phase retrieval using an order-wise minimal number of measurements.
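The difference-set property behind nested sampling can be checked numerically. The sketch below assumes the standard two-level nested geometry (a dense inner level at positions 1..N1 and a sparse outer level at multiples of N1+1); it does not reproduce the Generalized Nested Sampler or the Partial Nested Fourier Sampler from the thesis, and the function names are hypothetical.

```python
def nested_sampler(n1, n2):
    """Sample positions of a two-level nested sampler (assumed standard geometry)."""
    inner = list(range(1, n1 + 1))                     # dense level: 1, 2, ..., n1
    outer = [(n1 + 1) * m for m in range(1, n2 + 1)]   # sparse level: multiples of n1 + 1
    return inner + outer


def difference_set(positions):
    """All pairwise differences a - b, i.e. the lags at which the autocorrelation is observed."""
    return sorted({a - b for a in positions for b in positions})


positions = nested_sampler(3, 3)
lags = difference_set(positions)
print(positions)  # [1, 2, 3, 4, 8, 12]
print(lags == list(range(-11, 12)))  # True: every lag from -11 to 11 is covered
```

With only N1 + N2 = 6 samples, the pairwise differences cover every lag up to N2(N1+1) - 1 = 11 without holes, which is why the autocorrelation of a WSS signal can be recovered at all of those lags as if it had been sampled densely.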
Abstract:
This investigation forms part of the studies on the academic path and professional integration of recent graduates from the 2010/11 and 2011/12 academic years of the Faculdade de Motricidade Humana (FMH), carried out in collaboration with the FMH Observatório da Empregabilidade. Its main objective is to characterise the employment of the Faculty's recent graduates. The methodology used and improved a proprietary electronic platform (AgonScopio v.1.7.51) for developing online questionnaires on the Web. The study population comprised the recent graduates of the two academic years under study from the following degree programmes: Sport Sciences, Dance, Ergonomics, Sport Management and Psychomotor Rehabilitation. The sample consisted of the 105 responses obtained from a population of 334 graduates, making it possible to characterise the behaviour of recent graduates along nine dimensions: general data, sociocultural connection with the FMH's field, first job, training, professional experience, work and remuneration, expectations, mobility and post-graduate training. We found that FMH recent graduates have a good employability rate and that their employment is mostly in their field of training. Most graduates (71%) find employment within 12 months of completing their degrees.