930 results for nonlinear dimensionality reduction
Abstract:
Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to the landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index that is correlated with its topological locality. A popular dimensionality reduction technique is the space-filling Hilbert curve, as it possesses good locality-preserving properties. However, little comparison exists between the Hilbert curve and other dimensionality reduction techniques. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. The Hilbert curve, Sammon's mapping and Principal Component Analysis have been used to generate a 1D space with locality-preserving properties. This work provides empirical evidence to support the use of the Hilbert curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2D network model and with a realistic network topology model exhibiting the typical power-law distribution of node connectivity in the Internet. Nearest-neighbour analysis confirms the Hilbert curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvement, and better techniques to preserve locality information are required.
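The core primitive in the abstract above, mapping grid coordinates to a scalar index along the Hilbert curve, can be sketched with the standard bitwise encoding. A minimal 2D version in Python follows; the curve order and the prior quantization of latency measures to grid cells are illustrative assumptions, not details from the paper:

```python
def _rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y

def hilbert_index(order, x, y):
    """Map grid coordinates (x, y) on a 2**order x 2**order grid to their
    one-dimensional index along the Hilbert space-filling curve."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)  # which quadrant this level falls in
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d
```

A node would quantize its landmark latency vector to a cell and use the resulting index as a locality-correlated peer identifier; nearby cells tend to receive nearby indices.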
Abstract:
BACKGROUND: Low vitamin D status has been shown to be a risk factor for several metabolic traits such as obesity, diabetes and cardiovascular disease. The biological actions of 1,25-dihydroxyvitamin D are mediated through the vitamin D receptor (VDR), which heterodimerizes with the retinoid X receptor gamma (RXRG). Hence, we examined the potential interactions between the tagging polymorphisms in the VDR (22 tag SNPs) and RXRG (23 tag SNPs) genes on metabolic outcomes such as body mass index, waist circumference, waist-hip ratio (WHR), high- and low-density lipoprotein (LDL) cholesterol, serum triglycerides, systolic and diastolic blood pressure and glycated haemoglobin in the 1958 British Birth Cohort (1958BC, up to n = 5,231). We used the multifactor dimensionality reduction (MDR) program as a non-parametric test to examine potential interactions between the VDR and RXRG gene polymorphisms in the 1958BC. We used data from the Northern Finland Birth Cohort 1966 (NFBC66, up to n = 5,316) and Twins UK (up to n = 3,943) to replicate our initial findings from the 1958BC. RESULTS: After Bonferroni correction, the joint likelihood ratio test suggested interactions on serum triglycerides (4 SNP-SNP pairs), LDL cholesterol (2 SNP-SNP pairs) and WHR (1 SNP-SNP pair) in the 1958BC. MDR permutation model testing showed one two-way and one three-way interaction to be statistically significant for serum triglycerides in the 1958BC. In a meta-analysis of results from the two replication cohorts (NFBC66 and Twins UK, total n = 8,183), none of the interactions remained after correction for multiple testing (Pinteraction > 0.17). CONCLUSIONS: Our results did not provide strong evidence for interactions between allelic variations in the VDR and RXRG genes on metabolic outcomes; however, further replication studies in large samples are needed to confirm our findings.
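The core MDR step used in studies like this one, pooling multilocus genotype combinations into high- and low-risk groups by their case:control ratio and then scoring the resulting one-dimensional classifier, can be sketched as follows. The function names and the toy threshold are illustrative, not the MDR software's actual API:

```python
from collections import Counter

def mdr_high_risk_cells(genotypes_a, genotypes_b, is_case, threshold=1.0):
    """Label each two-SNP genotype combination high- or low-risk by its
    case:control ratio -- the dimensionality-reduction step of MDR."""
    cases, controls = Counter(), Counter()
    for ga, gb, case in zip(genotypes_a, genotypes_b, is_case):
        (cases if case else controls)[(ga, gb)] += 1
    high_risk = set()
    for cell in set(cases) | set(controls):
        ratio = cases[cell] / max(controls[cell], 1)
        if ratio >= threshold:
            high_risk.add(cell)
    return high_risk

def mdr_accuracy(genotypes_a, genotypes_b, is_case, high_risk):
    """Accuracy of predicting case status from the high/low-risk label."""
    correct = sum(((ga, gb) in high_risk) == bool(case)
                  for ga, gb, case in zip(genotypes_a, genotypes_b, is_case))
    return correct / len(is_case)
```

In the real method this scoring is wrapped in cross-validation and permutation testing, as the abstract's "MDR permutation model testing" indicates.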
Abstract:
The objective of the present study was to validate a recently reported synergistic effect between variants located in the leptin receptor (LEPR) gene and in the beta-2 adrenergic receptor (ADRB2) gene on the risk of overweight/obesity. We studied a middle-aged/elderly sample of 4,193 nondiabetic Japanese subjects stratified according to gender (1,911 women and 2,282 men). The LEPR Gln223Arg (rs1137101) variant as well as both ADRB2 Arg16Gly (rs1042713) and Gln27Glu (rs1042714) polymorphisms were analyzed. The primary outcome was the risk of overweight/obesity, defined as BMI >= 25 kg/m(2), whereas secondary outcomes included the risk of a BMI >= 27 kg/m(2) and BMI as a continuous variable. None of the studied polymorphisms showed statistically significant individual effects, regardless of the group or phenotype studied. Haplotype analysis also did not disclose any associations of ADRB2 polymorphisms with BMI. However, dimensionality reduction-based models confirmed significant interactions among the investigated variants for BMI as a continuous variable as well as for the risk of obesity defined as BMI >= 27 kg/m(2). All disclosed interactions were found in men only. Our results provide external validation for a male-specific ADRB2-LEPR interaction effect on the risk of overweight/obesity, but indicate that the effect sizes associated with these interactions may be smaller in the population studied.
Abstract:
We investigated whether variants in major candidate genes for food intake and body weight regulation contribute to obesity-related traits from a multilocus perspective. We studied 375 Brazilian subjects from partially isolated African-derived populations (quilombos). Seven variants displaying conflicting results in previous reports and supposedly implicated in susceptibility to obesity-related phenotypes were investigated: beta(2)-adrenergic receptor (ADRB2) (Arg16Gly), insulin induced gene 2 (INSIG2) (rs7566605), leptin (LEP) (A19G), LEP receptor (LEPR) (Gln223Arg), perilipin (PLIN) (6209T > C), peroxisome proliferator-activated receptor-gamma (PPARG) (Pro12Ala), and resistin (RETN) (-420C > G). Regression models as well as generalized multifactor dimensionality reduction (GMDR) were employed to test the contribution of individual effects and higher-order interactions to BMI and waist-hip ratio (WHR) variation and to the risk of overweight/obesity. The best multilocus association signal identified in the quilombos was further examined in an independent sample of 334 Brazilian subjects of European ancestry. In the quilombos, only the PPARG polymorphism displayed significant individual effects (WHR variation, P = 0.028). No association was observed with the risk of overweight/obesity (BMI >= 25 kg/m(2)), the risk of obesity alone (BMI >= 30 kg/m(2)) or BMI variation. However, GMDR analyses revealed an interaction between the LEPR and ADRB2 polymorphisms (P = 0.009), as well as a third-order effect involving these two variants plus INSIG2 (P = 0.034), with overweight/obesity. Assessment of the LEPR-ADRB2 interaction in the second sample indicated a marginally significant association (P = 0.0724), which was further verified to be limited to men (P = 0.0118).
Together, our findings suggest evidence for a two-locus interaction between the LEPR Gln223Arg and ADRB2 Arg16Gly variants in the risk of overweight/obesity, and highlight further the importance of multilocus effects in the genetic component of obesity.
Abstract:
Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
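The part-linear idea, projecting only a few representative samples with a costly distance-based method and then fitting a linear map for everything else, can be sketched in a few lines. This is a simplification under stated assumptions: in PLMP proper, the sample projection would itself come from a distance-based technique, while here the sample 2D coordinates are simply taken as given:

```python
import numpy as np

def fit_linear_projection(samples_hd, samples_2d):
    """Least-squares fit of a linear map P such that samples_hd @ P
    approximates samples_2d; only the samples need a costly projection."""
    P, *_ = np.linalg.lstsq(samples_hd, samples_2d, rcond=None)
    return P

def project_all(data_hd, P):
    """Stream the remaining instances through the fitted linear map."""
    return data_hd @ P
```

Because `project_all` is a single matrix product per instance, new instances can be projected as they arrive, which is the streaming trait the abstract highlights.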
Abstract:
The ever-increasing spurt in digital crimes such as image manipulation, image tampering, signature forgery, image forgery and illegal transactions has intensified the demand to combat these forms of criminal activity. In this direction, biometrics, the computer-based validation of a person's identity, is becoming more and more essential, particularly for high-security systems. The essence of biometrics is the measurement of a person's physiological or behavioral characteristics, which enables authentication of that person's identity. Biometric-based authentication is also becoming increasingly important in computer-based applications because the amount of sensitive data stored in such systems is growing. The new demands on biometric systems are robustness, high recognition rates, the capability to handle imprecision and uncertainties of a non-statistical kind, and great flexibility. It is exactly here that soft computing techniques come into play. The main aim of this write-up is to present a pragmatic view of the applications of soft computing techniques in biometrics and to analyze their impact. It is found that soft computing has already made inroads, in terms of individual methods or in combination. Applications of varieties of neural networks top the list, followed by fuzzy logic and evolutionary algorithms. In a nutshell, soft computing paradigms are used for biometric tasks such as feature extraction, dimensionality reduction, pattern identification, pattern mapping and the like.
Abstract:
Self-organizing maps (SOM) are artificial neural networks widely used in the data mining field, mainly because they constitute a dimensionality reduction technique, given the fixed grid of neurons associated with the network. In order to properly partition and visualize the SOM network, the various methods available in the literature must be applied in a post-processing stage, which consists of inferring, through its neurons, relevant characteristics of the data set. In general, applying such processing to the network neurons, instead of to the entire database, reduces the computational cost due to vector quantization. This work proposes a post-processing of the SOM neurons in the input and output spaces, combining visualization techniques with algorithms based on gravitational forces and on the search for the shortest path with the greatest reward. These methods take into account the connection strength between neighbouring neurons and characteristics of pattern density and distances among neurons, both associated with the position the neurons occupy in the data space after training the network. The goal is thus to define more clearly the arrangement of the clusters present in the data. Experiments were carried out to evaluate the proposed methods using various artificially generated data sets, as well as real-world data sets. The results obtained were compared with those from a number of well-known methods in the literature.
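A minimal SOM fit of the kind post-processed here, a fixed 2D grid of neurons quantizing the data, might look like the sketch below. The grid size, decay schedules and learning rate are illustrative choices, not the paper's configuration:

```python
import numpy as np

def train_som(data, grid_w, grid_h, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Minimal self-organizing map: fit a grid_w x grid_h grid of neurons to
    the data, giving the vector quantization / 2D reduction described above."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.normal(size=(grid_w * grid_h, dim))
    # Grid coordinates of each neuron, used by the neighbourhood function.
    coords = np.array([(i, j) for i in range(grid_w) for j in range(grid_h)], float)
    sigma0 = sigma0 or max(grid_w, grid_h) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            lr = lr0 * (1 - t)                     # linearly decaying learning rate
            sigma = sigma0 * (1 - t) + 1e-3        # shrinking neighbourhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2 * sigma ** 2))  # Gaussian neighbourhood kernel
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights.reshape(grid_w, grid_h, dim)
```

The returned neuron weights are exactly the objects the abstract's post-processing operates on: a small codebook, one vector per grid position, in place of the full database.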
Abstract:
In everyday life we constantly perform two frequent and important actions: classifying (sorting by class) and making decisions. When we encounter problems with a relatively high degree of complexity, we tend to seek other opinions, usually from people who have some knowledge of, or are even experts in, the problem domain in question, in order to help us in the decision-making process. In both the classification process and the decision-making process, we are guided by the characteristics involved in the specific problem. The characterization of a set of objects is part of the decision-making process in general. In machine learning, this classification happens through a learning algorithm, and the characterization is applied to databases. Classification algorithms can be employed individually or in machine committees. Choosing the best methods for the construction of a committee is a very arduous task. In this work, meta-learning techniques are investigated for selecting the best configuration parameters of homogeneous committees for application to various classification problems. These parameters are: the base classifier, the architecture, and the size of this architecture. We investigated nine candidate inducers for the base classifier, two methods for generating the architecture, and nine medium-sized groups for the architecture. Dimensionality reduction techniques were applied to the meta-databases in search of improvement. Five classification methods are investigated as meta-learners in the process of choosing the best parameters of a homogeneous committee.
Abstract:
In this article, we evaluate the performance of the T2 chart based on the principal components (PC chart) and the simultaneous univariate control charts based on the original variables (SU X̄ charts) or on the principal components (SUPC charts). The main reason to consider the PC chart lies in dimensionality reduction. However, depending on the disturbance and on the way the original variables are related, the chart is very slow in signaling, except when all variables are negatively correlated and the principal component is wisely selected. Comparing the SU X̄, SUPC and T2 charts, we conclude that the SU X̄ charts (SUPC charts) have a better overall performance when the variables are positively (negatively) correlated. We also develop the expression to obtain the power of two S2 charts designed for monitoring the covariance matrix. These joint S2 charts are, in the majority of cases, more efficient than the generalized variance |S| chart.
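For reference, the two statistics these charts monitor, Hotelling's T² and the projection direction a PC chart plots, can be computed as below. This is a generic sketch of the standard definitions, not the paper's specific chart designs or control limits:

```python
import numpy as np

def t2_statistic(x, mean, cov_inv):
    """Hotelling's T^2 distance of one observation from the in-control mean."""
    d = np.asarray(x, float) - mean
    return float(d @ cov_inv @ d)

def first_pc(cov):
    """Principal component with the largest eigenvalue; a PC chart monitors
    the data projected onto such a direction with a univariate chart."""
    _, vecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    return vecs[:, -1]
```

An observation signals on the T² chart when `t2_statistic` exceeds the chart's control limit; the PC chart instead tracks `x @ first_pc(cov)` over time.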
Abstract:
Besides optimizing classifier predictive performance and addressing the curse-of-dimensionality problem, feature selection techniques keep a classification model as simple as possible. In this paper, we present a wrapper feature selection approach based on the Bat Algorithm (BA) and the Optimum-Path Forest (OPF) classifier, in which we model feature selection as a binary optimization problem, guided by BA and using the OPF accuracy over a validating set as the fitness function to be maximized. Moreover, we present a methodology to better estimate the quality of the reduced feature set. Experiments conducted over six public datasets demonstrate that the proposed approach provides statistically significantly more compact feature sets and, in some cases, can indeed improve the classification effectiveness.
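The binary-wrapper formulation described above can be sketched generically: each candidate is a bit mask over the features, scored by a validation-accuracy fitness. As a stated simplification, a single-bit-flip hill-climber stands in for the Bat Algorithm, and the pluggable `fitness` callable stands in for OPF validation accuracy; neither is the paper's actual implementation:

```python
import random

def wrapper_select(n_features, fitness, iters=200, seed=0):
    """Generic binary wrapper feature selection: search over bit masks,
    maximizing `fitness` and preferring smaller feature sets on ties."""
    rng = random.Random(seed)
    best = [rng.random() < 0.5 for _ in range(n_features)]
    best_score = fitness(best)
    for _ in range(iters):
        cand = best[:]
        cand[rng.randrange(n_features)] ^= True   # flip one feature in/out
        score = fitness(cand)
        if score > best_score or (score == best_score and sum(cand) < sum(best)):
            best, best_score = cand, score
    return best, best_score
```

In the paper's setup, evaluating `fitness` means training/evaluating OPF on the selected columns of the validation set, which is exactly why wrapper methods are accurate but costly.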
Abstract:
The identification and description of the lithological characteristics of a formation are indispensable for the evaluation of complex formations. To this end, combinations of nuclear tools in uncased (open-hole) wells have been used systematically. The resulting logs can be considered as the interaction between two distinct phases: • The radiation transport phase, from the source to one or more detectors, through the formation. • The detection phase, which consists of collecting the radiation, transforming it into current pulses and, finally, building the spectral distribution of these pulses. Since the presence of the detector does not strongly affect the result of the radiation transport, each phase can be simulated independently of the other, which makes it possible to introduce a new type of modelling that decouples the two phases. In this work, the final response is simulated by combining numerical transport solutions with a library of detector response functions, for different incident energies and for each specific arrangement of sources and detectors. The radiation transport is computed with the finite element method (FEM), in the form of a 2½-D scalar flux, obtained from the numerical solution of the multigroup diffusion approximation of the Boltzmann transport equation in phase space, the so-called P1 approximation, in which the direction variable is expanded in terms of orthogonal Legendre polynomials. This reduces the dimensionality of the problem, making it more compatible with the FEM algorithm, where the flux depends exclusively on the spatial variable and on the physical properties of the formation. The response function of the NaI(Tl) detector is obtained independently by the Monte Carlo (MC) method, in which the life of a particle inside the scintillator crystal is reconstructed by simulating, interaction by interaction, the position, direction and energy of the different particles, with the help of random numbers governed by appropriate probability laws.
The possible interaction types (Rayleigh scattering, photoelectric effect, Compton scattering and pair production) are determined similarly. The simulation is completed when the detector response functions are convolved with the scalar flux, producing, as the final response, the pulse-height spectrum of the modelled system. In this spectrum, sets of channels called detection windows are selected. The count rates in each window exhibit different dependencies on the electron density and the lithology. This makes it possible to combine these windows to determine the density and the photoelectric absorption factor of the formations. Following the methodology developed, logs could be simulated in both thick-layer and thin-layer models. The performance of the method was tested in complex formations, mainly those in which the presence of clay minerals, feldspar and mica produced considerable effects capable of perturbing the final response of the tools. The results showed that formations with densities between 1.8 and 4.0 g/cm3 and photoelectric absorption factors in the range of 1.5 to 5 barns/e- had their physical and lithological characteristics perfectly identified. The concentrations of potassium, uranium and thorium could be obtained with the introduction of a new calibration system, capable of correcting the effects of high variances and negative correlations, observed mainly in the calculation of the mass concentrations of uranium and potassium. In the simulation of the CNL sonde response, using Tittle's polynomial regression algorithm, it was verified that, owing to the sonde's limited vertical resolution, layers thinner than the spacing between the source and the farthest detector had their apparent porosity values measured erroneously. This is because Tittle's algorithm applies exclusively to thick layers.
Because of this error, a method was developed that takes into account a contribution factor determined by the relative area of each layer within the zone of maximum information. Thus, the porosity at each point in the subsurface could be determined by convolving these factors with the local porosity indices, while assuming each layer to be sufficiently thick to fit Tittle's algorithm. Finally, the additional limitations imposed by the presence of perturbing minerals were resolved by treating the formation as composed of a base mineral fully saturated with water, with the remaining components considered perturbations on this base case. These results make it possible to compute synthetic well logs, which can be used in inversion schemes aimed at a more detailed quantitative evaluation of complex formations.
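The "random numbers governed by appropriate probability laws" step of the Monte Carlo detector simulation, choosing among the four interaction types, amounts to discrete sampling proportional to the cross sections. A minimal sketch follows; the numerical cross-section values in the test are made-up placeholders, not physical data:

```python
import random

def sample_interaction(cross_sections, rng):
    """Choose an interaction type with probability proportional to its
    (energy-dependent) cross section, as in a Monte Carlo photon history."""
    total = sum(cross_sections.values())
    u = rng.random() * total
    acc = 0.0
    for kind, sigma in cross_sections.items():
        acc += sigma
        if u <= acc:
            return kind
    return kind  # guard against floating-point rounding at the upper edge
```

In a full simulation this draw would be repeated at every collision site, with the cross sections re-evaluated at the photon's current energy.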
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
The pathogenic mechanisms involved in migraine are complex and not completely clarified. Because there is evidence for the involvement of nitric oxide (NO) in migraine pathophysiology, candidate gene approaches focusing on genes affecting endothelial function have been studied, including the genes encoding endothelial NO synthase (eNOS), inducible NO synthase (iNOS), and vascular endothelial growth factor (VEGF). However, investigations of gene-gene interactions are warranted to better elucidate the genetic basis of migraine. This study aimed at characterizing interactions among nine clinically relevant polymorphisms in eNOS (T-786C/rs2070744, the 27 bp VNTR in intron 4, Glu298Asp/rs1799983, and two additional tagSNPs, rs3918226 and rs743506), iNOS (C(-1026)A/rs2779249 and G2087A/rs2297518), and VEGF (C(-2578)A/rs699947 and G(-634)C/rs2010963) in migraine patients and a control group. Genotypes were determined by real-time polymerase chain reaction using TaqMan® allele discrimination assays, or by PCR and fragment separation by electrophoresis, in 99 healthy women without migraine (control group) and in 150 women with migraine divided into two groups: 107 with migraine without aura and 43 with aura. The multifactor dimensionality reduction method was used to detect and characterize gene-gene interactions. We found a significant interaction between the eNOS rs743506 and iNOS 2087G/A polymorphisms in migraine patients compared to the control group (P < 0.05), suggesting that this combination affects susceptibility to migraine. Further studies are needed to determine the molecular mechanisms explaining this interaction.
Abstract:
Polymorphisms of the endothelial nitric oxide synthase (eNOS), matrix metalloproteinase-9 (MMP-9) and vascular endothelial growth factor (VEGF) genes have been shown to be associated with hypertensive disorders of pregnancy. However, epistasis is suggested to be an important component of the genetic susceptibility to preeclampsia (PE). The aim of this study was to characterize the interactions among these genes in PE and gestational hypertension (GH). Seven clinically relevant polymorphisms of eNOS (T-786C, rs2070744; a variable number of tandem repeats in intron 4; and Glu298Asp, rs1799983), MMP-9 (C-1562T, rs3918242 and -90(CA)(13-25), rs2234681) and VEGF (C-2578A, rs699947 and G-634C, rs2010963) were genotyped by TaqMan allelic discrimination assays or by PCR and fragment separation by electrophoresis in 122 patients with PE, 107 patients with GH and a control group of 102 normotensive pregnant (NP) women. A robust multifactor dimensionality reduction analysis was used to characterize gene-gene interactions. Although no significant genotype combinations were observed for the comparison between the GH and NP groups (P > 0.05), the combination of MMP-9 -1562CC with VEGF -634GG was more frequent in NP women than in women with PE (P < 0.05). Moreover, the combination of MMP-9 -1562CC with VEGF -634CC, or MMP-9 -1562CT with VEGF -634CC or -634GG, was more frequent in women with PE than in NP women (P < 0.05). These results are obscured when single polymorphisms in these genes are considered, and suggest that specific genotype combinations of MMP-9 and VEGF contribute to PE susceptibility. Hypertension Research (2012) 35, 917-921; doi:10.1038/hr.2012.60; published online 10 May 2012
Abstract:
Statistical shape analysis techniques commonly employed in the medical imaging community, such as active shape models or active appearance models, rely on principal component analysis (PCA) to decompose shape variability into a reduced set of interpretable components. In this paper we propose principal factor analysis (PFA) as an alternative and complementary tool to PCA, providing a decomposition into modes of variation that can be more easily interpreted, while still being an efficient linear technique that performs dimensionality reduction (as opposed to independent component analysis, ICA). The key difference between PFA and PCA is that PFA models the covariance between variables, rather than the total variance in the data. The added value of PFA is illustrated on 2D landmark data of corpora callosa outlines. Then, a study of the 3D shape variability of the human left femur is performed. Finally, we report results on vector-valued 3D deformation fields resulting from non-rigid registration of ventricles in MRI of the brain.
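The key contrast stated above, PCA diagonalizing the full covariance versus PFA diagonalizing a reduced correlation matrix whose diagonal holds communality estimates, can be sketched as follows. Using squared multiple correlations as the initial communality estimate is one common choice and an assumption here, not necessarily the paper's:

```python
import numpy as np

def pca_modes(X, k):
    """Top-k PCA modes: eigenvectors of the sample covariance (total variance)."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, ::-1][:, :k]          # eigh is ascending; reverse for top-k

def pfa_modes(X, k):
    """Top-k principal-factor modes: eigenvectors of the *reduced* correlation
    matrix, whose diagonal holds communality estimates (squared multiple
    correlations) -- PFA models shared covariance, not total variance."""
    R = np.corrcoef(X, rowvar=False)
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(R))  # SMC estimates
    R_red = R.copy()
    np.fill_diagonal(R_red, communalities)
    _, vecs = np.linalg.eigh(R_red)
    return vecs[:, ::-1][:, :k]
```

Variables carrying mostly unique (unshared) variance receive small communalities, so PFA modes concentrate on the correlated variables, which is what makes the factors easier to interpret.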