920 results for Multivariate data analysis


Relevance:

100.00%

Publisher:

Abstract:

The bewildering complexity of cortical microcircuits at the single-cell level gives rise to surprisingly robust emergent activity patterns at the level of laminar and columnar local field potentials (LFPs) in response to targeted local stimuli. Here we report the results of our multivariate data-analytic approach, based on simultaneous multi-site recordings with micro-electrode-array chips, to investigating the microcircuitry of rat somatosensory (barrel) cortex. We find high repeatability of stimulus-induced responses and typical spatial distributions of LFP responses to stimuli in supragranular, granular, and infragranular layers, of which the last form a particularly distinct class. Population spikes appear to travel at about 33 cm/s from granular to infragranular layers. Responses within barrel-related columns have different profiles from those in neighbouring columns to either the left or the right. Variations between slices occur but can be minimized by strictly following controlled experimental protocols. Cluster analysis of normalized recordings indicates specific spatial distributions of time series reflecting the location of sources and sinks, independent of the stimulated layer. Although the precise correspondences between single-cell activity and LFPs are still far from clear, a sophisticated neuroinformatics approach combined with multi-site LFP recordings in the standardized slice preparation is suitable for comparing normal conditions with genetically or pharmacologically altered ones in real cortical microcircuitry.
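The cluster-analysis step can be sketched in a few lines: z-score each recording so only the response shape matters, then group the normalized traces. The abstract does not name the clustering algorithm, so the plain two-centroid k-means and its deterministic seeding below are illustrative assumptions, not the authors' method.

```python
def zscore(series):
    """Normalize a recording to zero mean and unit variance,
    so clustering compares response shapes, not amplitudes."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return [(x - mean) / sd for x in series]

def kmeans(recordings, k=2, iters=20):
    """Plain Lloyd's k-means; for k=2 the first and last recordings
    seed the centroids to keep the sketch deterministic."""
    centroids = [list(recordings[0]), list(recordings[-1])]
    labels = [0] * len(recordings)
    for _ in range(iters):
        # Assign each trace to its nearest centroid.
        for i, r in enumerate(recordings):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])),
            )
        # Recompute centroids as member-wise means.
        for c in range(k):
            members = [r for r, l in zip(recordings, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

After normalization, a sink-like deflection and a source-like deflection fall into separate clusters regardless of amplitude.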

Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are increasingly interested in, and rely on, social networks for information, news and the opinions of other users on diverse subjects. This heavy reliance on social network sites causes them to generate massive data characterised by three computational issues: size, noise and dynamism. These issues often make social network data too complex to analyse manually, making computational means of analysing them pertinent. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, in massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning, and employ data pre-processing, data analysis and data interpretation processes in the course of an analysis. This survey discusses the data mining techniques used over the decades to mine diverse aspects of social networks, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, together with the tools employed and the names of their authors.

In this paper, a new parametric method to deal with discrepant experimental results is developed, based on fitting a probability density function to the data. The paper also compares the characteristics of different methods used to deduce recommended values and uncertainties from a discrepant set of experimental data. The methods are applied to the published half-lives of (137)Cs and (90)Sr, with special emphasis on the deduced confidence intervals. The results are analyzed in terms of two fundamental properties expected of an experimental result: the probability content of confidence intervals and the statistical consistency between different recommended values. The recommended values and uncertainties for the (137)Cs and (90)Sr half-lives are 10,984 (24) days and 10,523 (70) days, respectively. (C) 2009 Elsevier B.V. All rights reserved.
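The paper's own parametric fit is not described in the abstract, but the classical baseline it competes with can be sketched: the inverse-variance weighted mean of the measurements, with the Birge ratio (square root of chi-square per degree of freedom) flagging the kind of discrepancy the paper addresses. Function names and test numbers are illustrative.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its internal uncertainty."""
    w = [1.0 / s ** 2 for s in sigmas]
    mean = sum(wi * v for wi, v in zip(w, values)) / sum(w)
    sigma_int = math.sqrt(1.0 / sum(w))
    return mean, sigma_int

def birge_ratio(values, sigmas, mean):
    """sqrt(chi^2 / (n - 1)); a value well above 1 signals that the
    quoted uncertainties do not explain the scatter (discrepant data)."""
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    return math.sqrt(chi2 / (len(values) - 1))
```

A common prescription inflates the internal uncertainty by the Birge ratio when it exceeds 1, which is one of the methods such comparisons typically include.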

Measurement error models often arise in epidemiological and clinical research. Usually, in this setup it is assumed that the latent variable has a normal distribution; however, the normality assumption may not always be correct. The skew-normal/independent family is a class of asymmetric, thick-tailed distributions that includes the skew-normal distribution as a special case. In this paper, we explore the use of skew-normal/independent distributions as a robust alternative in the null-intercept measurement error model under a Bayesian paradigm. We assume that the random errors and the unobserved value of the covariate (the latent variable) jointly follow a skew-normal/independent distribution, providing an appealing robust alternative to the routine use of the symmetric normal distribution in this type of model. The specific distributions examined include univariate and multivariate versions of the skew-normal, skew-t, skew-slash and skew-contaminated-normal distributions. The methods developed are illustrated on a real data set from a dental clinical trial. (C) 2008 Elsevier B.V. All rights reserved.
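The skew-normal building block of this family can be sketched with Azzalini's construction, which turns two independent standard normals into one SN(alpha) draw; the heavier-tailed skew-normal/independent variants add a mixing variable on top of this, which is omitted here. This is a sketch of the distribution, not of the paper's Bayesian model.

```python
import math
import random

def skew_normal_sample(alpha, rng):
    """One draw from a standard skew-normal SN(alpha) via Azzalini's
    construction: Z = delta*|U0| + sqrt(1 - delta^2)*U1,
    with delta = alpha / sqrt(1 + alpha^2)."""
    delta = alpha / math.sqrt(1.0 + alpha ** 2)
    u0, u1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta ** 2) * u1
```

For alpha > 0 the distribution is right-skewed, with mean delta*sqrt(2/pi); setting alpha = 0 recovers the symmetric normal.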

Researchers analyzing spatiotemporal or panel data, which vary both across locations and over time, often find that their data have holes or gaps. This thesis explores alternative methods for filling those gaps and also suggests a set of techniques for evaluating those gap-filling methods to determine which works best.
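One concrete instance of the thesis's two ingredients, a gap-filling method and a way of evaluating it, can be sketched as linear interpolation scored by masking known values and measuring the error of the re-filled entries. The abstract does not name the thesis's actual methods; these are illustrative stand-ins.

```python
import math

def linear_interpolate(series):
    """Fill None gaps in a 1-D series by linear interpolation
    between the nearest observed neighbours (endpoints are held flat)."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for i, v in enumerate(series):
        if v is None:
            left = max((k for k in known if k < i), default=None)
            right = min((k for k in known if k > i), default=None)
            if left is not None and right is not None:
                t = (i - left) / (right - left)
                filled[i] = series[left] + t * (series[right] - series[left])
            elif left is not None:
                filled[i] = series[left]
            else:
                filled[i] = series[right]
    return filled

def rmse_on_masked(series, mask_idx, method):
    """Evaluate a gap-filler: hide known values, re-fill, score RMSE."""
    masked = [None if i in mask_idx else v for i, v in enumerate(series)]
    filled = method(masked)
    return math.sqrt(sum((filled[i] - series[i]) ** 2 for i in mask_idx) / len(mask_idx))
```

Running the same masked evaluation over several candidate fillers is one way to "determine which works best" on a given panel.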

Diplopods of the subclass Helminthomorpha may have one or both leg pairs of the seventh diplosegment modified into structures that aid copulation, called gonopods. These structures are used as a taxonomic trait in the description of most species. In the genus Rhinocricus the gonopods are closely similar, making it difficult to distinguish species on this trait alone. Two species, R. padbergi and R. varians, are found in the same habitat and have practically identical gonopods; together they span a broad colour gradient, ranging from dark brown to light beige. Morphometric data for individuals of the experimental group were submitted to ANOVA and MANOVA, using the Hotelling-Lawley trace and generalized Mahalanobis distance (D²) tests. The results demonstrated a relationship between size and colour, with darker individuals being larger. On the basis of this preliminary analysis, we may suggest that the two species are distinct, since dark individuals are distant from medium- and light-coloured individuals according to the D² values. This seems to indicate a possible polymorphism among individuals of R. padbergi, which show closely similar values. In all analyses, the main variables were diameter, length and telson size.
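The generalized Mahalanobis distance D² used to separate the colour groups can be sketched for two measured variables; the two-variable case keeps the pooled-covariance inversion explicit. The study used more characters (diameter, length, telson size), so the dimensionality and the data below are illustrative.

```python
def mean_vec(rows):
    """Column-wise mean of a list of measurement rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov2(a, b):
    """Pooled 2x2 covariance of two groups of (x, y) rows."""
    def scatter(rows, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s
    sa, sb = scatter(a, mean_vec(a)), scatter(b, mean_vec(b))
    dof = len(a) + len(b) - 2
    return [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]

def mahalanobis2(a, b):
    """Generalized Mahalanobis distance D^2 between two group means:
    D^2 = (m_a - m_b)^T C^-1 (m_a - m_b), with C the pooled covariance."""
    m = [x - y for x, y in zip(mean_vec(a), mean_vec(b))]
    c = pooled_cov2(a, b)
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = [[c[1][1] / det, -c[0][1] / det], [-c[1][0] / det, c[0][0] / det]]
    return sum(m[i] * inv[i][j] * m[j] for i in range(2) for j in range(2))
```

A large D² between, say, dark and light groups relative to within-group scatter is what supports treating them as distinct.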

The present work uses multivariate statistical analysis to establish the main sources of error in Quantitative Phase Analysis (QPA) using the Rietveld method. The quantitative determination of crystalline phases by X-ray powder diffraction is a complex measurement process whose results are influenced by several factors. Ternary mixtures of Al2O3, MgO and NiO were prepared under controlled conditions and diffraction patterns were collected in the Bragg-Brentano geometry. Four critical sources of variation were established: the experimental absorption and the scale factor of NiO, the phase with the greatest linear absorption coefficient in the ternary mixture; the instrumental characteristics, represented by mechanical errors of the goniometer and sample displacement; the other two phases (Al2O3 and MgO); and the temperature and relative humidity of the air in the laboratory. These error sources can severely impair QPA with the Rietveld method, so it is necessary to control them during the measurement procedure.
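The QPA step itself rests on the Hill-Howard relation, which converts refined Rietveld scale factors into weight fractions; an error in any one scale factor (e.g. NiO's) propagates directly through this normalization to every phase. A sketch with illustrative numbers:

```python
def weight_fractions(scales, zmv):
    """Hill-Howard relation for Rietveld QPA:
    W_i = S_i * (Z M V)_i / sum_j S_j * (Z M V)_j,
    where S is the refined scale factor, Z the formula units per cell,
    M the formula mass and V the unit-cell volume of each phase."""
    sv = [s * z for s, z in zip(scales, zmv)]
    total = sum(sv)
    return [x / total for x in sv]
```

Because the fractions are normalized to sum to one, a biased scale factor for one phase shifts the apparent amounts of all the others, which is why controlling the identified error sources matters.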

In this paper we describe how morphological castes can be distinguished using multivariate statistical methods combined with jackknife estimators of the allometric coefficients. Data from the polymorphic ant Camponotus rufipes produced two distinct patterns of allometric variation, and thus two morphological castes. Morphometric analysis distinguished different allometric patterns within the two castes, with overall variability being greater in the major workers. Caste-specific scaling variabilities were associated with the relative importance of the first principal component. The static multivariate allometric coefficients for each of the 10 measured characters differed between castes, but their relative magnitudes within castes were similar. Multivariate statistical analysis of worker polymorphism in ants is a more complete descriptor of shape variation than, and provides statistical and conceptual advantages over, the standard bivariate techniques commonly used.
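The bivariate building block, an allometric coefficient estimated as the slope of log(character) on log(size), and its leave-one-out jackknife standard error can be sketched as follows; the multivariate version the paper favours replaces the slope with first-principal-component loadings. Function names and test data are illustrative.

```python
import math

def loglog_slope(x, y):
    """OLS slope of log(y) on log(x): the bivariate allometric
    coefficient (slope > 1 means positive allometry)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx

def jackknife(x, y, estimator):
    """Leave-one-out jackknife: bias-corrected estimate and
    standard error for any statistic of paired data."""
    n = len(x)
    full = estimator(x, y)
    loo = [estimator(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    bias_corrected = n * full - (n - 1) * mean_loo
    se = math.sqrt((n - 1) / n * sum((t - mean_loo) ** 2 for t in loo))
    return bias_corrected, se
```

The jackknife gives each measured character a standard error, so coefficients can be compared between castes with a statistical footing.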

Dimensionality reduction is employed in visual data analysis as a way of obtaining reduced spaces for high-dimensional data or of mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation in reduced or visual spaces, they have limited capabilities for adjusting the results according to the user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high-dimensional data that takes the user's input into account. It employs Partial Least Squares (PLS), a statistical tool for retrieving latent spaces that focus on the discriminability of the data. The method uses a training set to build a highly precise model that can then be applied very effectively to a much larger data set. The reduced data set can be displayed using various existing visualization techniques. The training data are important for coding the user's knowledge into the loop; however, this work also devises a strategy for calculating PLS reduced spaces when no training data are available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge, and is capable of working with small and unbalanced training sets.
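The core PLS idea, choosing a latent direction by covariance with the response rather than by variance alone as PCA does, can be sketched for a single PLS1 component: after centering, the first weight vector is proportional to Xᵀy and the scores are the projections onto it. This sketches the statistical tool only, not the paper's interactive visualization pipeline.

```python
import math

def pls1_first_component(X, y):
    """First PLS1 weight vector and scores: after centering,
    w is proportional to X^T y (the direction of maximal covariance
    with the response), and t = X w are the latent scores."""
    n, p = len(X), len(X[0])
    xm = [sum(row[j] for row in X) / n for j in range(p)]
    ym = sum(y) / n
    Xc = [[row[j] - xm[j] for j in range(p)] for row in X]
    yc = [v - ym for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t
```

Further components come from deflating X by the fitted scores and repeating; labels supplied by the user play the role of y, which is how a training set encodes the user's knowledge.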

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array-CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely used algorithms to illustrate the reliability and success of the technique.
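The hidden-Markov backbone can be sketched as a three-state (loss/normal/gain) Gaussian HMM over log2 intensity ratios. The paper does full Bayesian posterior simulation via Metropolis-within-Gibbs; the Viterbi decoder below is a much simpler point-estimate stand-in, and the state means, noise level and transition probabilities are illustrative assumptions.

```python
import math

def gauss_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi_copy_number(log_ratios, means=(-0.5, 0.0, 0.5), sigma=0.15, stay=0.95):
    """Most likely loss(0)/normal(1)/gain(2) state path for array-CGH
    log2 ratios under a 3-state Gaussian HMM; the sticky transition
    matrix encodes the dependence between neighbouring clones."""
    k = len(means)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (k - 1))
    v = [gauss_logpdf(log_ratios[0], means[s], sigma) for s in range(k)]
    back = []
    for x in log_ratios[1:]:
        ptr, nv = [], []
        for s in range(k):
            prev = max(range(k), key=lambda r: v[r] + (log_stay if r == s else log_move))
            ptr.append(prev)
            nv.append(v[prev] + (log_stay if prev == s else log_move)
                      + gauss_logpdf(x, means[s], sigma))
        back.append(ptr)
        v = nv
    # Backtrace from the best final state.
    path = [max(range(k), key=lambda s: v[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

The sticky transitions are what let the model recover extended altered regions instead of flagging each clone independently; the Bayesian version additionally yields posterior probabilities of gain or loss at each clone.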

A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.
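The key diagnostic behind the need for time series methods, correlation among repeated responses, can be sketched as the sample autocorrelation at a given lag; when it is far from zero, ordinary independent-observation methods misstate the uncertainty.

```python
def lag_autocorr(series, lag=1):
    """Sample autocorrelation of a series at the given lag:
    the correlation between observations `lag` steps apart."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return num / denom
```

Daily ozone concentrations or weekly emergency admissions typically show strong positive lag-1 autocorrelation, which is exactly what time series models are built to represent.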

Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data from poplar roots under nitrogen starvation and water deprivation. We found that a low nitrogen concentration led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures through which we successfully identified biologically important genes. Differentially expressed gene (DEG) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (in the same domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes changed during nitrogen starvation. Based on these analyses, we examined the local gene regulatory network (GRN) and identified a number of transcription factors; after testing, one of them proved to be a highly ranked transcription factor that affects root growth under nitrogen starvation. Because analyzing gene expression data manually is tedious and time-consuming, we also automated a computational pipeline that can now identify DEGs and perform protein domain analysis in a single run. It is implemented in Perl and R scripts.
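The DEG step of such a pipeline can be sketched as a fold-change-plus-t-statistic filter over per-gene replicate measurements. The actual pipeline is in Perl and R, and its cutoffs are not given in the abstract; the thresholds, gene names and expression values below are illustrative.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic between two sets of replicate expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def flag_degs(control, treated, t_cut=3.0, lfc_cut=1.0):
    """Flag genes whose |t| and |log2 fold change| both pass cutoffs.
    control/treated map gene name -> list of replicate expression values."""
    degs = []
    for gene in control:
        lfc = math.log2(statistics.mean(treated[gene]) / statistics.mean(control[gene]))
        if abs(welch_t(treated[gene], control[gene])) >= t_cut and abs(lfc) >= lfc_cut:
            degs.append(gene)
    return degs
```

A production pipeline would add multiple-testing correction (e.g. FDR control) before passing DEGs on to domain and GO enrichment.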

BACKGROUND AND OBJECTIVES: Combination antiretroviral therapy (cART) is changing, and this may affect the type and occurrence of side effects. We examined the frequency of lipodystrophy (LD) and weight changes in relation to the use of specific drugs in the Swiss HIV Cohort Study (SHCS). METHODS: In the SHCS, patients are followed twice a year and scored by the treating physician as having 'fat accumulation', 'fat loss', or neither. Treatments, and the reasons for changing them, are recorded. Our study sample included all patients treated with cART between 2003 and 2006 and, in addition, all patients who started cART between 2000 and 2003. RESULTS: From 2003 to 2006, the percentage of patients taking stavudine, didanosine and nelfinavir decreased, the percentage taking lopinavir, nevirapine and efavirenz remained stable, and the percentage taking atazanavir and tenofovir increased by 18.7% and 22.2%, respectively. In Kaplan-Meier life-table analysis, patients starting cART in 2003-2006 were less likely to develop LD than those starting cART from 2000 to 2002 (P<0.02). LD was quoted as the reason for treatment change or discontinuation for 4% of patients on cART in 2003, and for 1% of patients treated in 2006 (P for trend <0.001). In univariate and multivariate regression analyses, patients with a weight gain of ≥5 kg were more likely to be taking lopinavir or atazanavir than patients without such a weight gain [odds ratio (OR) 2, 95% confidence interval (CI) 1.3-2.9, and OR 1.7, 95% CI 1.3-2.1, respectively]. CONCLUSIONS: LD became less frequent in the SHCS from 2000 to 2006. A weight gain of more than 5 kg was associated with the use of atazanavir and lopinavir.
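The Kaplan-Meier estimate used to compare LD onset between treatment eras can be sketched directly: at each event time the survival curve is multiplied by (1 - d/n), where d is the number of events and n the number still at risk; censored patients leave the risk set without an event. The follow-up times and censoring flags below are illustrative, not SHCS data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve as a list of (event_time, S(t)).
    events[i] = 1 marks an observed event (e.g. LD onset);
    events[i] = 0 marks censoring (lost to follow-up, study end)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    s = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        d = n_t = 0
        while i < len(order) and times[order[i]] == t:
            n_t += 1
            d += events[order[i]]
            i += 1
        if d:
            s *= 1 - d / at_risk
            curve.append((t, s))
        at_risk -= n_t
    return curve
```

Comparing the curves of the 2000-2002 and 2003-2006 starters (e.g. with a log-rank test) is what underlies the reported P<0.02.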