960 results for FUNCTIONAL DATA


Relevance:

100.00%

Publisher:

Abstract:

Analyzing functional data often leads to finding common factors, for which functional principal component analysis proves to be a useful tool to summarize and characterize the random variation in a function space. The representation in terms of eigenfunctions is optimal in the sense of L-2 approximation. However, the eigenfunctions are not always directed towards an interesting and interpretable direction in the context of functional data and thus could obscure the underlying structure. To overcome such difficulty, an alternative to functional principal component analysis is proposed that produces directed components which may be more informative and easier to interpret. These structural components are similar to principal components, but are adapted to situations in which the domain of the function may be decomposed into disjoint intervals such that there is effectively independence between intervals and positive correlation within intervals. The approach is demonstrated with synthetic examples as well as real data. Properties for special cases are also studied.
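As a point of reference for the abstract above, the following is a minimal sketch of ordinary functional principal component analysis, not the directed structural components the authors propose. It assumes curves sampled on a common, equally spaced grid, and the function name `leading_fpc` is hypothetical. It extracts only the leading eigenfunction of the sample covariance, using power iteration.

```python
import math


def leading_fpc(curves, n_iter=200):
    """Leading functional principal component of discretized curves.

    Assumes every curve is sampled on the same equally spaced grid
    (grid spacing is treated as 1) and uses the 1/n sample covariance.
    Returns (mean, eigenfunction, eigenvalue).
    """
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]

    def cov_times(v):
        # Apply the covariance operator: C v = (1/n) * sum_i <x_i, v> x_i
        out = [0.0] * m
        for x in centered:
            score = sum(xj * vj for xj, vj in zip(x, v))
            for j in range(m):
                out[j] += score * x[j] / n
        return out

    # Start from the centered curve with the largest norm.
    v = max(centered, key=lambda x: sum(xj * xj for xj in x))
    norm = math.sqrt(sum(vj * vj for vj in v))
    if norm == 0.0:
        return mean, [0.0] * m, 0.0  # all curves identical: no variation
    v = [vj / norm for vj in v]

    # Power iteration converges to the eigenfunction with the largest eigenvalue.
    for _ in range(n_iter):
        w = cov_times(v)
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm == 0.0:
            break
        v = [wj / norm for wj in w]

    eigenvalue = sum(vj * wj for vj, wj in zip(v, cov_times(v)))
    return mean, v, eigenvalue
```

The sign of the returned eigenfunction is arbitrary, which is one reason such components can be hard to interpret without further structure.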

Relevance:

100.00%

Publisher:

Abstract:

This work presents a novel approach to increasing the recognition power of Multiscale Fractal Dimension (MFD) techniques applied to image classification. The proposal uses Functional Data Analysis (FDA) to enhance the precision of the MFD technique, yielding a more representative descriptor vector that recognizes and characterizes objects in an image more precisely. FDA is applied to signatures extracted with the Bouligand-Minkowski MFD technique to generate a descriptor vector from them. To evaluate the improvement, an experiment was carried out on two datasets of objects: a dataset of character shapes (the 26 characters of the Latin alphabet) with different levels of controlled noise, and a dataset of fish image contours. The FDA method was compared with the well-known Fourier and wavelet descriptors. The descriptor vectors were submitted to a Linear Discriminant Analysis (LDA) classifier, and the correct classification rate was compared across the descriptor methods. The results show that FDA outperforms the literature methods (Fourier and wavelets) in processing the information extracted from the MFD signature. The proposed method is therefore an interesting choice for pattern recognition and image classification using fractal analysis.
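For context, the Fourier descriptors used as a comparison baseline in the abstract above can be sketched as follows. This is an illustrative sketch of one common normalization, not the authors' implementation: the function name `fourier_descriptors` and the choice to divide by the first harmonic are assumptions.

```python
import cmath


def fourier_descriptors(contour, n_desc=4):
    """Translation-, rotation- and scale-invariant Fourier descriptors.

    contour: list of (x, y) points sampled in order along a closed outline.
    Returns |Z_2|/|Z_1|, ..., |Z_(n_desc+1)|/|Z_1| of the discrete Fourier
    transform of the complex boundary signal x + iy.
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)

    def dft(k):
        return sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))

    # Skipping Z_0 removes translation, taking magnitudes removes rotation and
    # the choice of starting point, and dividing by |Z_1| removes scale.
    scale = abs(dft(1))  # assumed non-zero for the contour supplied
    return [abs(dft(k)) / scale for k in range(2, n_desc + 2)]
```

The resulting vector can be fed to any classifier; the abstract reports using Linear Discriminant Analysis for that step.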

Relevance:

100.00%

Publisher:

Abstract:

We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set of tools for increasingly complex data collected in medical and public health studies. Our methods were motivated by and are illustrated with a state-of-the-art study of neuronal tracts in multiple sclerosis patients and healthy controls. Using the tools we have developed, we were able to find those locations along the tract most affected by the disease. However, our methods are general and highly relevant to many functional data sets. In addition to the application to one-dimensional tract profiles illustrated here, higher-dimensional extensions of the methodology could have direct applications to other biological data including functional and structural MRI.
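The separation of marginals from dependence described above can be illustrated with a small sketch. It assumes a Gaussian copula at just two locations and, purely as a stand-in, an exponential marginal in place of the skew t family (which is not available in the Python standard library). The names `latent_to_marginal` and `sample_pair` are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def latent_to_marginal(z, rate=1.0):
    """Map a standard-normal latent value to an exponential marginal.

    u = Phi(z) is uniform on (0, 1); applying the exponential inverse CDF,
    -log(1 - u) / rate, gives a right-skewed value. 1 - Phi(z) is computed
    as Phi(-z) to stay accurate in the upper tail.
    """
    return -math.log(_STD_NORMAL.cdf(-z)) / rate


def sample_pair(rho, rate_a=1.0, rate_b=1.0, rng=random):
    """Draw one observation at two locations linked by a Gaussian copula."""
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    z_a = e1
    z_b = rho * e1 + math.sqrt(1.0 - rho * rho) * e2  # corr(z_a, z_b) = rho
    # Each location may have its own marginal parameters (here, its own rate).
    return latent_to_marginal(z_a, rate_a), latent_to_marginal(z_b, rate_b)
```

Letting the marginal parameters vary with location while the latent Gaussian process carries the dependence is the design idea the abstract describes; the specific marginal used here is only for illustration.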

Relevance:

100.00%

Publisher:

Abstract:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable for rare variants. This raises a great challenge for sequence-based genetic studies of complex diseases. This research dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, shifting the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genome regions. In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques are used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of the genome or a gene, regardless of whether the variants are common or rare. Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in the Framingham Heart Study. This project proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as a unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

70.00%

Publisher:

Abstract:

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

Relevance:

70.00%

Publisher:

Abstract:

Functional Data Analysis (FDA) deals with samples where a whole function is observed for each individual. A particular case of FDA is when the observed functions are density functions, which are also an example of infinite-dimensional compositional data. In this work we compare several methods for dimensionality reduction for this particular type of data: functional principal components analysis (PCA) with or without a previous data transformation, and multidimensional scaling (MDS) for different inter-density distances, one of them taking into account the compositional nature of density functions. The different methods are applied to both artificial and real data (household income distributions).
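One way to take the compositional nature of densities into account is the centred log-ratio (clr) transform. The sketch below assumes densities discretized on a common grid with strictly positive values; it is an illustration of the general idea, and the abstract does not state which transformation or distance the authors actually use. The function names are hypothetical.

```python
import math


def clr(density):
    """Centred log-ratio transform of a strictly positive discretized density."""
    logs = [math.log(v) for v in density]
    centre = sum(logs) / len(logs)
    return [lv - centre for lv in logs]


def aitchison_distance(f, g):
    """Euclidean distance between clr-transformed densities on the same grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(f), clr(g))))
```

Because the clr transform removes any constant factor, rescaling a density leaves the distance unchanged. A matrix of such pairwise distances could then be passed to multidimensional scaling, or PCA could be run on the clr-transformed values.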

Relevance:

70.00%

Publisher:

Abstract:

Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
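The distinction between the two kinds of metric can be sketched as follows. The abstract does not give the authors' definitions, so these are assumed, simplified versions for real-valued curves on a common grid; a compositional trajectory would first have to be mapped to real coordinates (for example by a log-ratio transform). Both function names are hypothetical.

```python
import math


def level_and_shape_distance(x, y):
    """Root-mean-square difference between two curves on a common grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def shape_only_distance(x, y):
    """Same distance after removing each curve's own mean level."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return level_and_shape_distance([a - mx for a in x], [b - my for b in y])
```

Two curves that differ only by a vertical shift are far apart under the first distance and identical under the second, so the choice of metric determines whether a clustering algorithm groups trajectories by level, by shape, or by both.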

Relevance:

70.00%

Publisher:

Abstract:

Genetic and functional data indicate that variation in the expression of the neurotrophin-3 receptor gene (NTRK3) may have an impact on neuronal plasticity, suggesting a role for NTRK3 in the pathophysiology of anxiety disorders. MicroRNA (miRNA) posttranscriptional gene regulators act by base-pairing to specific sequence sites, usually at the 3'UTR of the target mRNA. Variants at these sites might result in gene expression changes contributing to disease susceptibility. We investigated genetic variation in two different isoforms of NTRK3 as candidate susceptibility factors for anxiety by resequencing their 3'UTRs in patients with panic disorder (PD), obsessive-compulsive disorder (OCD), and in controls. We have found the C allele of rs28521337, located in a functional target site for miR-485-3p in the truncated isoform of NTRK3, to be significantly associated with the hoarding phenotype of OCD. We have also identified two new rare variants in the 3'UTR of NTRK3, ss102661458 and ss102661460, each present only in one chromosome of a patient with PD. The ss102661458 variant is located in a functional target site for miR-765, and the ss102661460 in functional target sites for two miRNAs, miR-509 and miR-128, the latter being a brain-enriched miRNA involved in neuronal differentiation and synaptic processing. Interestingly, these two variants significantly alter the miRNA-mediated regulation of NTRK3, resulting in recovery of gene expression. These data implicate miRNAs as key posttranscriptional regulators of NTRK3 and provide a framework for allele-specific miRNA regulation of NTRK3 in anxiety disorders.

Relevance:

70.00%

Publisher:

Abstract:

The use of comparative genomics to infer genome function relies on the understanding of how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, and other functional sequences such as regulatory regions, as well as other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes or not. Here we present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between human chromosome 21 (ref. 8) and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. The integration of the properties of the conserved components of human chromosome 21 to the rapidly accumulating functional data for this chromosome will improve considerably our understanding of the role of sequence conservation in mammalian genomes.

Relevance:

70.00%

Publisher:

Abstract:

Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.

Relevance:

70.00%

Publisher:

Abstract:

Spinal cord injuries have a significant impact on quality of life because they can induce motor deficits (paralysis) and sensory deficits. These deficits evolve over time as the central nervous system reorganizes, involving physiological and neurochemical mechanisms that are still poorly understood. The extent of these deficits and the rehabilitation process depend strongly on which anatomical pathways have been damaged in the spinal cord. It is therefore crucial to be able to assess the integrity of the white matter after a spinal injury and to evaluate quantitatively the functional state of spinal neurons. A major advantage of magnetic resonance imaging (MRI) is that it can image the functional and anatomical properties of the central nervous system non-invasively. The first objective of this thesis project was to develop diffusion MRI to assess the integrity of white matter axons after spinal cord injury. The second objective was to evaluate to what extent functional MRI can measure the activity of spinal cord neurons. Although widely applied to the brain, diffusion MRI and functional MRI are more problematic in the spinal cord. The difficulties associated with spinal cord MRI stem from its small size (about 1 cm in diameter in humans), the presence of physiological motion (cardiac and respiratory), and magnetic susceptibility artifacts induced by field inhomogeneities, notably at the level of the intervertebral discs and the lungs. The main objective of this thesis was therefore to develop methods to circumvent these difficulties. This development relied notably on optimizing the acquisition parameters for anatomical images, diffusion-weighted images, and functional data in cats and humans on a 3 Tesla MRI scanner. In addition, various strategies were studied to correct the image distortions induced by magnetic susceptibility artifacts, and a study was conducted on the sensitivity and specificity of functional MRI of the spinal cord. The results of these studies demonstrate the feasibility of acquiring high-quality diffusion-weighted images and of assessing the integrity of specific spinal pathways after complete and partial injury. Moreover, the activity of spinal neurons could be detected by functional MRI in anesthetized cats. Although encouraging, these results highlight the need to develop these new techniques further. A reliable and robust neuroimaging tool capable of confirming clinical parameters would improve diagnosis and prognosis in patients with spinal cord injuries. One major goal would be to monitor and validate the effect of various therapeutic strategies. Such tools represent immense hope for the many people suffering from trauma and neurodegenerative diseases such as spinal cord injuries, spinal tumors, multiple sclerosis, and amyotrophic lateral sclerosis.

Relevance:

70.00%

Publisher:

Abstract:

Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.

Relevance:

70.00%

Publisher:

Abstract:

Functional Data Analysis (FDA) deals with samples where a whole function is observed for each individual. A particular case of FDA is when the observed functions are density functions, which are also an example of infinite-dimensional compositional data. In this work we compare several methods for dimensionality reduction for this particular type of data: functional principal components analysis (PCA) with or without a previous data transformation, and multidimensional scaling (MDS) for different inter-density distances, one of them taking into account the compositional nature of density functions. The different methods are applied to both artificial and real data (household income distributions).

Relevance:

70.00%

Publisher:

Abstract:

OBJECTIVE: The aim of the study was to elucidate the cellular mechanism underlying the suppression of glucose-induced insulin secretion in mice fed a high-fat diet (HFD) for 15 weeks. RESEARCH DESIGN AND METHODS: C57BL6J mice were fed a HFD or a normal diet (ND) for 3 or 15 weeks. Plasma insulin and glucose levels in vivo were assessed by intraperitoneal glucose tolerance test. Insulin secretion in vitro was studied using static incubations and a perfused pancreas preparation. Membrane currents, electrical activity, and exocytosis were examined by patch-clamp measurements. Intracellular calcium concentration ([Ca2+]i) was measured by microfluorimetry. Total internal reflection fluorescence microscopy (TIRFM) was used for optical imaging of exocytosis and submembrane depolarization-evoked [Ca2+]i. The functional data were complemented by analyses of histology and gene transcription. RESULTS: After 15 weeks, but not 3 weeks, mice on HFD exhibited hyperglycemia and hypoinsulinemia. Pancreatic islet content and beta-cell area increased 2- and 1.5-fold, respectively. These changes correlated with a 20-50% reduction of glucose-induced insulin secretion (normalized to insulin content). The latter effect was not associated with impaired electrical activity or [Ca2+]i signaling. Single-cell capacitance and TIRFM measurements of exocytosis revealed a selective suppression (>70%) of exocytosis elicited by short (50 ms) depolarizations, whereas the responses to longer depolarizations (500 ms) were less affected. The loss of rapid exocytosis correlated with dispersion of Ca2+ entry in HFD beta-cells. No changes in gene transcription of key exocytotic proteins were observed. CONCLUSIONS: HFD results in reduced insulin secretion by causing the functional dissociation of voltage-gated Ca2+ entry from exocytosis. These observations suggest a novel explanation for the well-established link between obesity and diabetes. Diabetes 59:1192-1201, 2010