922 results for Functional data analysis
Abstract:
Atmospheric surface boundary layer parameters varied anomalously in response to the annular solar eclipse of 15 January 2010 over Cochin. It was the longest annular solar eclipse to occur over South India, and of high intensity. Because it occurred during the noon hours, it is considered especially significant, with effects in all regions of the atmosphere including the ionosphere. Since insolation is the main driving factor behind the anomalous changes in the surface layer, the eclipse's coincidence with noon makes it particularly valuable for understanding the dynamics of the atmosphere during the eclipse period. The sonic anemometer provides zonal, meridional and vertical wind components as well as air temperature at a temporal resolution of 1 s. Surface boundary layer parameters and turbulent fluxes were computed from the high-resolution station data using the eddy correlation technique: momentum flux, sensible heat flux, turbulent kinetic energy, friction velocity (u*), temperature variance, and the variances of the u, v and w wind components. For comparison, a control run was made using data from the previous and the following day. Over the eclipse period, all of these surface boundary layer parameters varied anomalously relative to the control run. Momentum flux fell to 0.1 N m⁻² from a mean value of 0.2 N m⁻². Sensible heat flux decreased anomalously to 50 W m⁻² from a mean value of 200 W m⁻². Turbulent kinetic energy decreased to 0.2 m² s⁻² from a mean value of 1 m² s⁻². Friction velocity decreased to 0.05 m s⁻¹ from a mean value of 0.2 m s⁻¹. The present study aimed at understanding the dynamics of the surface layer over a tropical coastal station in response to an annular solar eclipse occurring during the noon hours. Key words: annular solar eclipse, surface boundary layer, sonic anemometer
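The eddy correlation computation described above can be sketched in a few lines. This is a minimal illustration on synthetic 1 Hz sonic-anemometer series, not the study's actual processing; the series, constants, and single averaging window are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3600  # one hour of 1 Hz sonic anemometer samples (synthetic)
u = 2.0 + 0.3 * rng.standard_normal(n)    # zonal wind (m/s)
v = 0.5 + 0.3 * rng.standard_normal(n)    # meridional wind (m/s)
w = 0.2 * rng.standard_normal(n)          # vertical wind (m/s)
T = 300.0 + 0.5 * rng.standard_normal(n)  # air temperature (K)

def fluct(x):
    """Deviation from the averaging-period mean (Reynolds decomposition)."""
    return x - x.mean()

up, vp, wp, Tp = map(fluct, (u, v, w, T))

rho, cp = 1.2, 1004.0                     # air density (kg/m^3), cp (J/kg/K)
uw = np.mean(up * wp)                     # kinematic momentum flux components
vw = np.mean(vp * wp)
tau = rho * np.hypot(uw, vw)              # momentum flux (N/m^2)
H = rho * cp * np.mean(wp * Tp)           # sensible heat flux (W/m^2)
tke = 0.5 * (up.var() + vp.var() + wp.var())  # turbulent kinetic energy (m^2/s^2)
u_star = (uw**2 + vw**2) ** 0.25          # friction velocity (m/s)
```

In practice the fluxes would be computed over successive averaging windows (e.g. 30 min) for the eclipse day and the control days, and the eclipse-period values compared against the control run.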
Abstract:
Several eco-toxicological studies have shown that insectivorous mammals, owing to their feeding habits, easily accumulate high amounts of pollutants relative to other mammal species. To assess the bio-accumulation levels of toxic metals and their influence on essential metals, we quantified the concentrations of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted area (Ebro Delta) and a control area (Medas Islands). Since the chemical contents of a bio-indicator are essentially compositional data, the conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them from an inter-population viewpoint. Hypothesis testing on suitable balance-coordinates allowed us to confirm intuition-based hypotheses and some previous results. The main statistical goal was to test the equality of means of balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out on the inter-group balances.
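The balance-coordinate test described above can be sketched as follows. This is a toy illustration, with synthetic Dirichlet "compositions" standing in for the bone-element data and a hypothetical split of parts into a toxic and an essential group; it is not the authors' code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def balance(X, num, den):
    """Isometric log-ratio balance between two groups of parts.
    X: (n, D) compositions; num/den: column indices of the two groups."""
    r, s = len(num), len(den)
    g_num = np.exp(np.log(X[:, num]).mean(axis=1))  # geometric means
    g_den = np.exp(np.log(X[:, den]).mean(axis=1))
    return np.sqrt(r * s / (r + s)) * np.log(g_num / g_den)

# synthetic 4-part compositions for a "polluted" and a "control" site
pol = rng.dirichlet([8, 4, 2, 1], size=30)
ctl = rng.dirichlet([6, 6, 2, 2], size=30)

# hypothetical grouping: parts 0-1 "toxic", parts 2-3 "essential"
b_pol = balance(pol, [0, 1], [2, 3])
b_ctl = balance(ctl, [0, 1], [2, 3])
stat, p = stats.mannwhitneyu(b_pol, b_ctl)  # inter-group balance test
```

A normality check (e.g. Shapiro-Wilk) on each group of balances would decide between the one-way ANOVA and the Mann-Whitney test the abstract mentions.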
Abstract:
Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First, we adapt the well-known spline smoothing techniques to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory; clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for differences in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
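The two steps above, smoothing a compositional trajectory in log-ratio coordinates and then measuring shape-only dissimilarity, can be sketched as follows. The trajectory, the clr representation, and the smoothing parameter are illustrative assumptions, not the paper's actual compositional algebra or code.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def clr(X):
    """Centred log-ratio transform of compositions (rows sum to 1)."""
    L = np.log(X)
    return L - L.mean(axis=1, keepdims=True)

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 50)
# one noisy 3-part compositional trajectory (synthetic)
raw = np.exp(np.c_[np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),
                   np.zeros_like(t)] + 0.1 * rng.standard_normal((50, 3)))
comp = raw / raw.sum(axis=1, keepdims=True)

# smooth each log-ratio coordinate, where Euclidean geometry applies
Y = clr(comp)
smooth = np.column_stack(
    [UnivariateSpline(t, Y[:, j], s=1.0)(t) for j in range(3)])

def shape_dist(A, B):
    """Shape-only dissimilarity: compare curves after removing their levels."""
    return np.linalg.norm((A - A.mean(axis=0)) - (B - B.mean(axis=0)))
```

Hierarchical or partitional clustering would then be run on the smoothed curves using either the shape-only metric or a shape-and-level metric.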
Abstract:
Factor analysis, a frequent technique for multivariate data inspection, is widely used for compositional data analysis as well. The usual way is to use a centred logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1) with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Under the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛᵀ + ψ (2) where ψ = Cov(e) has diagonal form. The diagonal elements of ψ, as well as the loadings matrix Λ, are estimated from an estimate of Cov(y). Let Y be observed clr-transformed data, regarded as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y leads to robust estimates of Λ and ψ in (2); see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y) = V C(Z) Vᵀ, where the matrix V with orthonormal columns comes from the relation between the clr and ilr transformations. Now the parameters in model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation, since the links to the original variables are preserved. The above procedure is applied to data from geochemistry. Our special interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data.
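The ilr detour around the clr singularity can be sketched as follows. The basis construction and the Dirichlet data are illustrative assumptions; a robust estimator such as MCD would replace the classical covariance in a faithful implementation.

```python
import numpy as np

def ilr_basis(D):
    """Orthonormal basis V (D x D-1) linking clr and ilr: z = clr(x) @ V.
    Each column sums to zero, so V maps clr scores to full-rank coordinates."""
    V = np.zeros((D, D - 1))
    for j in range(1, D):
        V[:j, j - 1] = 1.0 / j
        V[j, j - 1] = -1.0
        V[:, j - 1] /= np.linalg.norm(V[:, j - 1])
    return V

rng = np.random.default_rng(3)
X = rng.dirichlet([5, 3, 2, 1], size=200)              # synthetic 4-part data
L = np.log(X)
Y = L - L.mean(axis=1, keepdims=True)                  # clr scores (rank D-1)
V = ilr_basis(4)
Z = Y @ V                                              # full-rank ilr coordinates
C_Z = np.cov(Z, rowvar=False)   # a robust estimator (e.g. MCD) would go here
C_Y = V @ C_Z @ V.T             # back-transformed clr-space covariance
```

Factor loadings and uniquenesses for model (2) would then be estimated from C_Y, keeping the interpretation in terms of the original clr variables.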
Abstract:
A presentation on the collection and analysis of data taken from SOES 6018. This module aims to ensure that MSc Oceanography, MSc Marine Science, Policy & Law and MSc Marine Resource Management students are equipped with the skills they need to function as professional marine scientists, in addition to, and in conjunction with, the skills training in other MSc modules. The module covers training in fieldwork techniques, communication & research skills, IT & data analysis, and professional development.
Abstract:
A class exercise in analysing qualitative data, based on a set of transcripts and augmented by videos from a web site. Discussion covers not only how the data are coded, but also interview bias and the dimensions of analysis. Designed as an introduction.
Abstract:
Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting the brain's responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in the human brain when specific tasks are performed. This paper develops new methodologies that are important improvements to both parametric and nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.
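A nonparametric HRF estimate of the kind discussed above can be sketched with a finite impulse response (FIR) model: regress the BOLD signal on lagged copies of the stimulus train. The design, noise level, and "true" response are synthetic stand-ins, and the paper's error-covariance scheme is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n, lags = 200, 12                       # scans, FIR lags covering the HRF
stim = np.zeros(n)
stim[::25] = 1.0                        # hypothetical event onsets, every 25 scans
true_hrf = np.exp(-0.5 * ((np.arange(lags) - 4) / 1.5) ** 2)  # toy HRF shape
bold = np.convolve(stim, true_hrf)[:n] + 0.1 * rng.standard_normal(n)

# FIR design matrix: column k is the stimulus train shifted by k scans
X = np.column_stack([np.r_[np.zeros(k), stim[:n - k]] for k in range(lags)])
hrf_hat, *_ = np.linalg.lstsq(X, bold, rcond=None)  # nonparametric HRF estimate
```

With well-separated events the least-squares estimate recovers the response shape closely; a realistic analysis would also whiten with the estimated error covariance, as the paper proposes.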
Abstract:
Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in and reliant on social networks for information, news and the opinions of other users on diverse subject matters. This heavy reliance on social network sites causes them to generate massive data characterised by three computational issues: size, noise and dynamism. These issues often make social network data too complex to analyse manually, motivating the use of computational means of analysis. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets, such as trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses the different data mining techniques used in mining diverse aspects of the social network over the decades, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, together with the tools employed and the names of their authors.
Abstract:
In this paper a new parametric method for dealing with discrepant experimental results is developed. The method is based on fitting a probability density function to the data. The paper also compares the characteristics of different methods used to deduce recommended values and uncertainties from a discrepant set of experimental data. The methods are applied to the published half-lives of ¹³⁷Cs and ⁹⁰Sr, and special emphasis is given to the deduced confidence intervals. The obtained results are analysed in terms of two fundamental properties expected of an experimental result: the probability content of confidence intervals and the statistical consistency between different recommended values. The recommended values and uncertainties for the ¹³⁷Cs and ⁹⁰Sr half-lives are 10,984 (24) days and 10,523 (70) days, respectively. (C) 2009 Elsevier B.V. All rights reserved.
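One classical baseline the paper compares against is the weighted mean with a Birge-ratio uncertainty inflation. The sketch below uses hypothetical half-life measurements and is not the paper's new pdf-fitting method.

```python
import numpy as np

# hypothetical discrepant half-life measurements (days) and quoted uncertainties
x = np.array([10970.0, 11020.0, 10990.0, 10940.0])
s = np.array([15.0, 20.0, 10.0, 25.0])

w = 1.0 / s**2
mean = np.sum(w * x) / np.sum(w)        # weighted mean (recommended value)
u_int = np.sqrt(1.0 / np.sum(w))        # internal (statistical) uncertainty
chi2 = np.sum(w * (x - mean) ** 2)
birge = np.sqrt(chi2 / (len(x) - 1))    # Birge ratio: > 1 flags discrepancy
u_rec = u_int * max(1.0, birge)         # inflated recommended uncertainty
```

The paper's criticism of such procedures concerns exactly the probability content of the resulting confidence interval, which motivates fitting a density to the data instead.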
Abstract:
Periodontal disease (PD) is characterized as an inflammatory process that compromises the support and protection of the periodontium. Patients with Down's syndrome (DS) are prone to developing PD. Neutrophils (NE) are the first line of defense against infection, and their absence sets the stage for disease. Aim: To compare the activity and function of NE in peripheral blood from DS patients with and without PD, assisted at the Center for Dental Assistance to Patients with Special Needs affiliated with the School of Dentistry of Araçatuba, Brazil. Methods: Purified NE were collected from the peripheral blood of 22 DS patients. NE were used to detect 5-lipoxygenase (5-LO) expression by RT-PCR. Plasma from peripheral blood was collected to measure tumor necrosis factor-α (TNF-α) and interleukin-8 (IL-8) by ELISA and nitrite using a Griess assay. Results: Data analysis demonstrated that DS patients with PD present higher levels of TNF-α and IL-8 than DS patients without PD. However, there was no statistically significant difference in nitrite production between the groups. Expression of the inflammatory mediator 5-LO was increased in DS patients with PD. Conclusions: According to these results, it was concluded that TNF-α and IL-8 are produced at elevated levels by DS patients with PD. Furthermore, DS patients with PD presented high levels of 5-LO expression, suggesting the presence of leukotriene B₄ (LTB₄) in PD, and thus demonstrating that changes in NE function due to the elevation of inflammatory mediators contribute to PD.
Abstract:
Background: Uterine leiomyomas (ULs) are the most common benign tumours affecting women of reproductive age. ULs represent a major public health problem, as they are the main indication for hysterectomy. Approximately 40-50% of ULs have non-random cytogenetic abnormalities, and half of ULs may have copy number alterations (CNAs). Gene expression microarray studies have demonstrated that cell proliferation genes act in response to growth factors and steroids. However, only a few genes mapping to CNA regions were found to be associated with ULs. Methodology: We applied an integrative analysis using genomic and transcriptomic data to identify the pathways and molecular markers associated with ULs. Fifty-one fresh-frozen specimens were evaluated by array CGH (JISTIC) and gene expression microarrays (SAM). The CONEXIC algorithm was applied to integrate the data. Principal Findings: The integrated analysis identified the top 30 significant genes (P<0.01), which comprised genes associated with cancer, whereas the protein-protein interaction analysis indicated a strong association between FANCA and BRCA1. Functional in silico analysis revealed target molecules for drugs involved in cell proliferation, including FGFR1 and IGFBP5. Transcriptional and protein analyses showed that FGFR1 (P = 0.006 and P<0.01, respectively) and IGFBP5 (P = 0.0002 and P = 0.006, respectively) were up-regulated in the tumours when compared with the adjacent normal myometrium. Conclusions: The integrative genomic and transcriptomic approach indicated that amplification of FGFR1 and IGFBP5, and the consequent up-regulation of their protein products, plays an important role in the aetiology of ULs, providing data for the development of potential drug therapies targeting genes associated with cellular proliferation in ULs. © 2013 Cirilo et al.
Abstract:
Introduction. Tricuspid regurgitation (TR) is the most common valvular dysfunction found after heart transplantation (HTx). It may be related to endomyocardial biopsy (EMB) performed for allograft rejection surveillance. Objective. This investigation evaluated the presence of tricuspid valve tissue fragments in specimens obtained during routine EMB performed after HTx and their possible effect on short-term and long-term hemodynamic status. Method. This single-center review included prospectively collected and retrospectively analyzed data. From 1985 to 2010, 417 patients underwent 3550 EMB after HTx. All myocardial specimens were reviewed to identify the presence of tricuspid valve tissue, initially by 2 observers and, in doubtful cases, by a third observer. The echocardiographic and hemodynamic parameters were considered for valvular functional damage analysis only in cases where tricuspid tissue was inadvertently removed during EMB. Results. The 417 HTx patients underwent 3550 EMB, yielding 17,550 myocardial specimens. Tricuspid valve tissue was observed in 12 (2.9%) patients, corresponding to 0.07% of the removed fragments. The echocardiographic and hemodynamic parameters of these patients before versus after the biopsy showed increased TR in 2 cases (2/12; 16.7%), quantified as moderate without progression in the long term. Only the right atrial pressure showed a significant increase (P = .0420) after tricuspid injury; however, the worsening of the functional class was not significant in any of the subjects. Thus, surgical intervention was not required. Conclusions. Histological evidence of chordal tissue in EMB specimens is a real-world problem of relatively low frequency. Traumatic tricuspid valve injury due to EMB rarely leads to severe valvular regurgitation; only a minority of patients develop significant clinical symptoms. Hemodynamic and echocardiographic alterations are likewise observed in only a minority of patients.
Abstract:
Dimensionality reduction is employed in visual data analysis as a way of obtaining reduced spaces for high-dimensional data or of mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation in reduced or visual spaces, they have limited capabilities for adjusting the results according to the user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high-dimensional data that takes the user's input into account. It employs Partial Least Squares (PLS), a statistical tool for retrieving latent spaces that focus on the discriminability of the data. The method uses a training set to build a highly precise model that can then be applied very effectively to a much larger data set. The reduced data set can be exhibited using various existing visualization techniques. The training data are important for encoding the user's knowledge into the loop. However, this work also devises a strategy for calculating PLS reduced spaces when no training data are available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge, and is capable of working with small and unbalanced training sets.
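A minimal PLS sketch of the kind described above projects labelled data onto two latent directions for a 2D visual mapping. The NIPALS-style routine and the synthetic data are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def pls_scores(X, Y, k=2):
    """Project X onto k latent directions maximising covariance with Y
    (a minimal NIPALS-style PLS, with Y holding class labels)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    T = np.zeros((X.shape[0], k))
    for a in range(k):
        U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
        w = U[:, 0]                   # weight: top X-Y covariance direction
        t = Xc @ w                    # latent scores for this component
        p = Xc.T @ t / (t @ t)        # X loadings
        Xc = Xc - np.outer(t, p)      # deflate before the next component
        T[:, a] = t
    return T

rng = np.random.default_rng(5)
labels = np.repeat([0.0, 1.0], 50)    # small labelled "training" set
X = rng.standard_normal((100, 10))
X[labels == 1] += 2.0                 # class separation in all dimensions
T = pls_scores(X, labels[:, None], k=2)   # 2D scores for visual mapping
```

The 2D scores T would then be handed to any point-based visualization technique; in the paper's setting the fitted directions are reused to project the much larger unlabelled set.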
Abstract:
Independent component analysis (ICA) and seed-based approaches (SBA) in functional magnetic resonance imaging blood oxygenation level dependent (BOLD) data have become widely applied tools to identify functionally connected, large-scale brain networks. Differences between task conditions, as well as specific alterations of the networks in patients compared with healthy controls, have been reported. However, BOLD lacks the possibility of quantifying absolute network metabolic activity, which is of particular interest in the case of pathological alterations. In contrast, arterial spin labeling (ASL) techniques allow quantification of absolute cerebral blood flow (CBF) at rest and in task-related conditions. In this study, we explored the ability to identify networks in ASL data using ICA and to quantify network activity in terms of absolute CBF values. Moreover, we compared the results to SBA and performed a test-retest analysis. Twelve healthy young subjects performed a finger-tapping block-design experiment. During the task, pseudo-continuous ASL was measured. After CBF quantification the individual datasets were concatenated and subjected to the ICA algorithm. ICA proved capable of identifying the somato-motor and the default mode networks. Moreover, absolute network CBF within the separate networks during either condition could be quantified. We demonstrated that functional connectivity analysis using ICA and SBA is feasible and robust in ASL-CBF data. CBF functional connectivity is a novel approach that opens a new strategy to evaluate differences in network activity in terms of absolute network CBF, and thus allows quantification of inter-individual differences in the resting state and in task-related activations and deactivations.
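Of the two connectivity techniques mentioned, the seed-based approach is the simpler to sketch: correlate a seed time course with every other voxel. The synthetic "network" below is a stand-in for ASL-CBF time series, not the study's pipeline.

```python
import numpy as np

rng = np.random.default_rng(7)
n_t, n_vox = 120, 500
net = np.sin(np.linspace(0.0, 8 * np.pi, n_t))   # shared network time course
data = 0.1 * rng.standard_normal((n_t, n_vox))   # voxel-wise noise
data[:, :50] += net[:, None]                     # 50 "voxels" inside the network

seed = data[:, 0]                                # seed voxel time course
z = (data - data.mean(axis=0)) / data.std(axis=0)
zs = (seed - seed.mean()) / seed.std()
r = (z * zs[:, None]).mean(axis=0)               # Pearson correlation map
```

Voxels sharing the seed's time course correlate strongly and define the network; in the CBF setting the mean flow within that mask then gives the absolute network activity the abstract emphasises.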