920 results for multivariate data analysis


Relevance:

90.00%

Abstract:

Study and analysis of the main techniques in the field of Social Data Analysis. Design and implementation of a software solution written in Java in the Eclipse environment. The software integrates several REST API services to extract social data from Twitter, store them in a non-relational database (built with MongoDB), and manage them. It also supports topic classification and the analysis of aggregate data on the extracted data collections. Finally, it displays a tree of "reshares" starting from individually selected tweets, and a geolocated map containing the users involved in the resharing chain together with the corresponding "retweet" edges.

Relevance:

90.00%

Abstract:

With a view to improving seismic vulnerability assessment for the city of Bishkek (Kyrgyzstan), the global dynamic behaviour of four nine-storey r.c. large-panel buildings in the elastic regime is studied. The four buildings were built during the Soviet era within a serial production system. Since they all belong to the same series, they have very similar geometries both in plan and in height. Firstly, ambient vibration measurements are performed in the four buildings. The data analysis, composed of discrete Fourier transform, modal analysis (frequency domain decomposition) and deconvolution interferometry, yields the modal characteristics and an estimate of the linear impulse response function for the structures of the four buildings. Then, finite element models are set up for all four buildings and the results of the numerical modal analysis are compared with the experimental ones. The numerical models are finally calibrated considering the first three global modes, and their results match the experimental ones with an error of less than 20%.
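As a rough illustration of the first analysis step named above, the sketch below picks the dominant frequency of a record from its discrete Fourier transform. It is only a minimal example under stated assumptions: the synthetic record, the sampling rate, and the function names `dft_amplitudes` and `dominant_frequency` are hypothetical, and neither frequency domain decomposition nor deconvolution interferometry is shown.

```python
import cmath
import math


def dft_amplitudes(signal):
    """Amplitude of each non-negative-frequency DFT bin (naive O(n^2) transform)."""
    n = len(signal)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(signal)
        )
        amps.append(abs(coeff))
    return amps


def dominant_frequency(signal, sampling_rate_hz):
    """Frequency in Hz of the largest spectral peak, ignoring the DC bin."""
    amps = dft_amplitudes(signal)
    peak_bin = max(range(1, len(amps)), key=lambda k: amps[k])
    return peak_bin * sampling_rate_hz / len(signal)


# Hypothetical record (not measured data): a 2 Hz oscillation plus a weaker
# 5 Hz one, sampled at 50 Hz for 4 s.
fs = 50.0
record = [
    math.sin(2 * math.pi * 2.0 * i / fs) + 0.3 * math.sin(2 * math.pi * 5.0 * i / fs)
    for i in range(200)
]

peak_hz = dominant_frequency(record, fs)
```

With real ambient vibration data one would normally average spectra over many windows before picking peaks; the single naive transform here is kept only for readability.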

Relevance:

90.00%

Abstract:

PURPOSE: Tumor stage and nuclear grade are the most important prognostic parameters of clear cell renal cell carcinoma (ccRCC). The progression risk of ccRCC remains difficult to predict, particularly for tumors with organ-confined stage and intermediate differentiation grade. Elucidating molecular pathways deregulated in ccRCC may point to novel prognostic parameters that facilitate planning of therapeutic approaches. EXPERIMENTAL DESIGN: Using tissue microarrays, expression patterns of 15 different proteins were evaluated in over 800 ccRCC patients to analyze pathways reported to be physiologically controlled by the tumor suppressors von Hippel-Lindau protein and phosphatase and tensin homologue (PTEN). Tumor staging and grading were improved by performing variable selection using Cox regression and a recursive bootstrap elimination scheme. RESULTS: Patients with pT2 and pT3 tumors that were p27 and CAIX positive had a better outcome than those with all remaining marker combinations. Prolonged survival among patients with intermediate grade (grade 2) correlated with both nuclear p27 and cytoplasmic PTEN expression, as well as with inactive, nonphosphorylated ribosomal protein S6. When graphical log-linear modeling was applied to over 700 ccRCC cases for which the molecular parameters were available, only a weak conditional dependence was found between the expression of p27, PTEN, CAIX, and p-S6, suggesting that the dysregulation of several independent pathways is crucial for tumor progression. CONCLUSIONS: The use of recursive bootstrap elimination, as well as graphical log-linear modeling, for comprehensive tissue microarray (TMA) data analysis allows the unraveling of complex molecular contexts and may improve predictive evaluations for patients with advanced renal cancer.

Relevance:

90.00%

Abstract:

In multivariate time series analysis, the equal-time cross-correlation is a classic and computationally efficient measure for quantifying linear interrelations between data channels. When the cross-correlation coefficient is estimated using a finite number of data points, its non-random part may be strongly contaminated by a sizable random contribution, such that no reliable conclusion can be drawn about genuine mutual interdependencies. The random correlations are determined by the signals' frequency content and the number of data points used. Here, we introduce adjusted correlation matrices that can be employed to disentangle random from non-random contributions to each matrix element independently of the signal frequencies. Extending our previous work, these matrices allow analyzing spatial patterns of genuine cross-correlation in multivariate data regardless of confounding influences. The performance is illustrated using model systems with known interdependence patterns. Finally, we apply the methods to electroencephalographic (EEG) data with epileptic seizure activity.
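The random contribution described above can be seen in a small sketch that computes the plain equal-time (zero-lag) correlation matrix for independent noise channels. This is an illustration only, assuming white Gaussian noise as test data; the helper names are hypothetical, and the adjusted correlation matrices introduced in the abstract are not implemented here.

```python
import math
import random


def equal_time_correlation(x, y):
    """Pearson (zero-lag) cross-correlation coefficient of two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(channels):
    """Matrix of equal-time correlation coefficients for a list of channels."""
    return [[equal_time_correlation(a, b) for b in channels] for a in channels]


# Hypothetical test data: three mutually independent white-noise channels,
# once with few samples and once with many.
rng = random.Random(0)
short_run = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(3)]
long_run = [[rng.gauss(0.0, 1.0) for _ in range(20000)] for _ in range(3)]

c_short = correlation_matrix(short_run)
c_long = correlation_matrix(long_run)
```

Because the channels are independent, every off-diagonal entry is purely random. For white noise its typical size shrinks roughly as one over the square root of the number of samples, so the entries of `c_short` can be far from zero while those of `c_long` stay close to it.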

Relevance:

90.00%

Abstract:

Recent data have suggested a relation among long-term endurance sport practice, left atrial remodeling, and atrial fibrillation. We investigated the influence of an increased vagal tone, represented by the early repolarization (ER) pattern, on diastolic function and left atrial size in professional soccer players. Fifty-four consecutive athletes underwent electrocardiography, echocardiography, and exercise testing as part of their preparticipation screening. Athletes were divided into 2 groups according to presence or absence of an ER pattern, defined as an ST-segment elevation at the J-point (STE) ≥0.1 mm in 2 leads. For linear comparisons, average STE was calculated. Mean age was 24 ± 4 years. Twenty-five athletes (46%) showed an ER pattern. Athletes with an ER pattern had a significantly lower heart rate (54 ± 9 vs 62 ± 11 beats/min, p = 0.024), an increased E/e' ratio (6.1 ± 1.2 vs 5.1 ± 1.0, p = 0.002), and larger volumes of the left atrium (25.6 ± 7.3 vs 21.8 ± 5.0 ml/m², p = 0.031) compared to athletes without an ER pattern. There were no significant differences concerning maximum workload, left ventricular dimensions, and systolic function. Univariate regression analysis revealed significant correlations among age, STE, and left atrial volume. In a stepwise multivariate regression analysis, age, STE, and e' contributed independently to left atrial size (r = 0.659, p < 0.001). In conclusion, athletes with an ER pattern had an increased E/e' ratio, reflecting a higher left atrial filling pressure, contributing to left atrial remodeling over time.

Relevance:

90.00%

Abstract:

Background: New HIV infections in men who have sex with men (MSM) have increased in Switzerland since 2000 despite combination antiretroviral therapy (cART). The objectives of this mathematical modelling study were: to describe the dynamics of the HIV epidemic in MSM in Switzerland using national data; to explore the effects of hypothetical prevention scenarios; and to conduct a multivariate sensitivity analysis. Methodology/Principal Findings: The model describes HIV transmission, progression and the effects of cART using differential equations. The model was fitted to Swiss HIV and AIDS surveillance data and twelve unknown parameters were estimated. Predicted numbers of diagnosed HIV infections and AIDS cases fitted the observed data well. By the end of 2010, an estimated 13.5% (95% CI 12.5, 14.6%) of all HIV-infected MSM were undiagnosed and accounted for 81.8% (95% CI 81.1, 82.4%) of new HIV infections. The transmission rate was at its lowest from 1995–1999, with a nadir of 46 incident HIV infections in 1999, but increased from 2000. The estimated number of new infections continued to increase to more than 250 in 2010, although the reproduction number was still below the epidemic threshold. Prevention scenarios included temporary reductions in risk behaviour, annual test and treat, and reduction in risk behaviour to levels observed earlier in the epidemic. These led to predicted reductions in new infections from 2 to 26% by 2020. Parameters related to disease progression and relative infectiousness at different HIV stages had the greatest influence on estimates of the net transmission rate. Conclusions/Significance: The model outputs suggest that the increase in HIV transmission amongst MSM in Switzerland is the result of continuing risky sexual behaviour, particularly by those unaware of their infection status. Long-term reductions in the incidence of HIV infection in MSM in Switzerland will require increased and sustained uptake of effective interventions.

Relevance:

90.00%

Abstract:

In this article, we will link neuroimaging, data analysis, and intervention methods in an important psychiatric condition: auditory verbal hallucinations (AVH). The clinical and phenomenological background as well as neurophysiological findings will be covered and discussed with respect to noninvasive brain stimulation. Additionally, methods of noninvasive brain stimulation will be presented as ways to intervene with AVH. Finally, preliminary conclusions and possible future perspectives will be proposed.

Relevance:

90.00%

Abstract:

The Simulation Automation Framework for Experiments (SAFE) streamlines the design and execution of experiments with the ns-3 network simulator. SAFE ensures that best practices are followed throughout the workflow of a network simulation study, guaranteeing that results are both credible and reproducible by third parties. Data analysis is a crucial part of this workflow, where mistakes are often made. Even when appearing in highly regarded venues, scientific graphics in numerous network simulation publications fail to include graphic titles, units, legends, and confidence intervals. After studying the literature in network simulation methodology and information graphics visualization, I developed a visualization component for SAFE to help users avoid these errors in their scientific workflow. The functionality of this new component includes support for interactive visualization through a web-based interface and for the generation of high-quality, static plots that can be included in publications. The overarching goal of my contribution is to help users create graphics that follow best practices in visualization and thereby succeed in conveying the right information about simulation results.

Relevance:

90.00%

Abstract:

The analysis of short segments of noise-contaminated, multivariate real-world data constitutes a challenge. In this paper we compare several techniques of analysis, which are supposed to correctly extract the amount of genuine cross-correlations from a multivariate data set. In order to test the quality of their performance, we derive time series from a linear test model, which allows the analytical derivation of genuine correlations. We compare the numerical estimates of the four measures with the analytical results for different correlation patterns. In the bivariate case, all but one measure perform similarly well. However, in the multivariate case, measures based on the eigenvalues of the equal-time cross-correlation matrix do not extract exclusively information about the amount of genuine correlations, but rather reflect the spatial organization of the correlation pattern. This may lead to failures when interpreting the numerical results, as illustrated by an application to three electroencephalographic recordings of three patients suffering from pharmacoresistant epilepsy.

Relevance:

90.00%

Abstract:

Estimation for bivariate right censored data is a problem that has had much study over the past 15 years. In this paper we propose a new class of estimators for the bivariate survival function based on locally efficient estimation. We introduce the locally efficient estimator for bivariate right censored data, present an asymptotic theorem, present the results of simulation studies and perform a brief data analysis illustrating the use of the locally efficient estimator.

Relevance:

90.00%

Abstract:

We propose robust and efficient tests and estimators for gene-environment/gene-drug interactions in family-based association studies. The methodology is designed for studies in which haplotypes, quantitative phenotypes and complex exposure/treatment variables are analyzed. Using causal inference methodology, we derive family-based association tests and estimators for the genetic main effects and the interactions. The tests and estimators are robust against population admixture and stratification without requiring adjustment for confounding variables. We illustrate the practical relevance of our approach by an application to a COPD study. The data analysis suggests a gene-environment interaction between a SNP in the Serpine gene and smoking status/pack years of smoking that reduces the FEV1 volume by about 0.02 liter per pack year of smoking. Simulation studies show that the proposed methodology is sufficiently powered for realistic sample sizes and that it provides valid tests and effect size estimators in the presence of admixture and stratification.

Relevance:

90.00%

Abstract:

Background: The recent development of semi-automated techniques for staining and analyzing flow cytometry samples has presented new challenges. Quality control and quality assessment are critical when developing new high throughput technologies and their associated information services. Our experience suggests that significant bottlenecks remain in the development of high throughput flow cytometry methods for data analysis and display. In particular, data quality control and quality assessment are crucial steps in processing and analyzing high throughput flow cytometry data. Methods: We propose a variety of graphical exploratory data analytic tools for exploring ungated flow cytometry data. We have implemented a number of specialized functions and methods in the Bioconductor package rflowcyt. We demonstrate the use of these approaches by investigating two independent sets of high throughput flow cytometry data. Results: We found that graphical representations can reveal substantial non-biological differences in samples. Empirical Cumulative Distribution Function and summary scatterplots were especially useful in the rapid identification of problems not identified by manual review. Conclusions: Graphical exploratory data analytic tools are a quick and useful means of assessing data quality. We propose that the described visualizations should be used as quality assessment tools and, where possible, for quality control.
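The empirical cumulative distribution function mentioned in the results can be sketched in a few lines. This is a generic Python illustration, not the rflowcyt API (which is an R package); the function names `ecdf` and `max_ecdf_gap` are hypothetical, and using the largest vertical gap between two ECDFs as a screening number is an assumption added for the example rather than the authors' stated procedure.

```python
import bisect


def ecdf(sample):
    """Return a function F where F(t) is the fraction of sample values <= t."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(t):
        # bisect_right counts how many sorted values are <= t.
        return bisect.bisect_right(ordered, t) / n

    return F


def max_ecdf_gap(sample_a, sample_b):
    """Largest vertical distance between the ECDFs of two samples.

    Both ECDFs are right-continuous step functions, so the maximum is
    reached at one of the observed values.
    """
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(f_a(t) - f_b(t)) for t in points)
```

Plotting `F` for each sample on shared axes gives the kind of overlay in which a sample with a shifted or distorted distribution stands out; `max_ecdf_gap` reduces the same comparison to a single number between 0 and 1.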

Relevance:

90.00%

Abstract:

This paper considers a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function. Special cases in our approach include marginal models for longitudinal/clustered data, conditional logistic regression for matched case-control studies, multivariate measurement error models, generalized linear mixed models with a semiparametric component, and many others. We propose profile-kernel and backfitting estimation methods for these problems, derive their asymptotic distributions, and show that in likelihood problems the methods are semiparametric efficient. Although this is not true in general, with our methods profiling and backfitting are asymptotically equivalent. We also consider pseudolikelihood methods where some nuisance parameters are estimated from a different algorithm. The proposed methods are evaluated using simulation studies and applied to the Kenya hemoglobin data.