5 results for "Smoothed bootstrap"

in Digital Commons at Florida International University


Abstract:

This dissertation develops a new figure of merit that measures the similarity (or dissimilarity) of Gaussian distributions through a novel concept relating the Fisher distance to the percentage of data overlap. The derivations are expanded to provide a generalized mathematical platform for determining an optimal separating boundary between Gaussian distributions in multiple dimensions. Real-world data used for the implementation and the feasibility studies were provided by Beckman-Coulter. Although the data used are flow cytometric in nature, the mathematics are general in their derivation and extend to other types of data, as long as their statistical behavior approximates a Gaussian distribution.

Because this new figure of merit relies heavily on the statistical nature of the data, a new filtering technique is introduced to account for the accumulation process involved with histogram data. When data are accumulated into a frequency histogram, they are inherently smoothed in a linear fashion, since an averaging effect takes place as the histogram is generated. The new filtering scheme addresses data accumulated in the uneven resolution of the channels of the frequency histogram.

The qualitative interpretation of flow cytometric data is currently a time-consuming and imprecise method for evaluating histogram data. The proposed method offers a broader range of capabilities for histogram analysis, since the figure of merit derived in this dissertation integrates both a measure of similarity and the percentage of overlap between the distributions under analysis.
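The abstract does not define its figure of merit, so the following is only a minimal sketch of how a separation measure and an overlap percentage can be linked in the simplest case. It assumes, without confirmation from the source, that "Fisher distance" can be read as the Fisher discriminant ratio, and it covers only two one-dimensional Gaussians with equal standard deviation. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def fisher_ratio(mu1, sigma1, mu2, sigma2):
    """Fisher discriminant ratio (assumed reading of 'Fisher distance')."""
    return (mu1 - mu2) ** 2 / (sigma1 ** 2 + sigma2 ** 2)


def overlap_equal_variance(mu1, mu2, sigma):
    """Overlapping area of two Gaussian densities sharing one sigma.

    The densities cross at the midpoint of the means, so the shared
    area is twice the tail beyond that midpoint.
    """
    return 2 * normal_cdf(-abs(mu1 - mu2) / (2 * sigma))


def overlap_from_fisher(f):
    """Same overlap, written as a function of the Fisher ratio.

    Valid only under the equal-variance assumption made above.
    """
    return 2 * normal_cdf(-math.sqrt(f / 2))
```

Under these assumptions the overlap decreases monotonically as the Fisher ratio grows, which is the kind of relationship the abstract describes. Unequal variances or several dimensions would need a different derivation.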

Abstract:

Background: HIV is known for its ability to exploit numerous genetic and evolutionary mechanisms to ensure its proliferation, among them high replication, mutation, and recombination rates. Sliding MinPD, a recently introduced computational method [1], was used to investigate the patterns of evolution of serially sampled HIV-1 sequence data from eight patients, with a special focus on the emergence of X4 strains. Unlike other phylogenetic methods, Sliding MinPD combines distance-based inference with a nonparametric bootstrap procedure and automated recombination detection to reconstruct the evolutionary history of longitudinal sequence data. We present serial evolutionary networks as a longitudinal representation of the mutational pathways of a viral population in a within-host environment. The longitudinal representation of the evolutionary networks was complemented with charts of clinical markers to facilitate correlation analysis between pertinent clinical information and the evolutionary relationships.

Results: Analysis based on the predicted networks suggests the following: significantly stronger recombination signals (p = 0.003) for the inferred ancestors of the X4 strains; recombination events between different lineages, and between putative reservoir virus and virus from a later population; and an early star-like topology observed for four of the patients who died of AIDS. A significantly higher number of recombinants was predicted at sampling points that corresponded to peaks in viral load (p = 0.0042).

Conclusion: Our results indicate that serial evolutionary networks of HIV sequences enable systematic statistical analysis of the implicit relations embedded in the topology of the structure and can greatly facilitate the identification of patterns of evolution that lead to specific hypotheses and new insights. The conclusions from applying our method to empirical HIV data support the conventional wisdom behind the new generation of HIV treatments: to keep the virus in check, viral loads need to be suppressed to almost undetectable levels.
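The nonparametric bootstrap mentioned in the Background is commonly implemented for sequence data by resampling alignment columns with replacement and recomputing pairwise distances on each replicate. The sketch below illustrates that general idea only. It is not the Sliding MinPD implementation, and the function names and the choice of the p-distance are assumptions.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` is a list of equal-length sequence strings. Columns are
    drawn with replacement, and the same draw is applied to every
    sequence so that sites stay aligned.
    """
    length = len(alignment[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def p_distance(a, b):
    """Proportion of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_distances(alignment, i, j, n_boot=1000, seed=0):
    """Distances between sequences i and j across bootstrap replicates."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_boot):
        replicate = bootstrap_columns(alignment, rng)
        distances.append(p_distance(replicate[i], replicate[j]))
    return distances
```

The spread of the returned distances gives a rough sense of how stable a pairwise relationship is under resampling. A tree-based analysis would instead rebuild the tree on each replicate and count how often each grouping recurs.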

Abstract:

In this study, I divided samples from individuals within Afghanistan by geography (i.e., north versus south). I determined allelic frequencies and other statistical parameters for 15 STR loci (i.e., D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, vWA, TPOX, D18S51, D5S818, and FGA). I conducted pairwise comparisons with 19 neighboring Eurasian populations to assign G-statistics and p-values. Categorizing the populations into five groups (i.e., Central Asia, East Asia, South Asia, the Middle East, and the Caucasus/Anatolia), I derived values for intra-population, inter-population, and total variance. Admixture analyses determined the highest allelic contributions to be from the Caucasus/Anatolia, while Central Asia and East Asia made negligible contributions. A Correspondence Analysis revealed clustering of both northern and southern Afghanistan with Georgia, Turkey, northern Iran, and southern Iran, that is, with populations of the Caucasus/Anatolia and the Middle East. A Neighbor-Joining phylogenetic tree was constructed, with bootstrap values generated over 1,000 reiterations.

Abstract:

This thesis proposes several confidence intervals for the mean of a positively skewed distribution. The following confidence intervals are considered: Student-t, Johnson-t, median-t, mad-t, bootstrap-t, BCA, T1, T3, and six new confidence intervals: the median bootstrap-t, mad bootstrap-t, median T1, mad T1, median T3, and mad T3. A simulation study was conducted in which average widths, coefficients of variation of the widths, and coverage probabilities were recorded and compared across confidence intervals. When coverage probabilities were the same, the interval with the smaller width was judged the better one. Results showed that the median T1 and median T3 outperformed the other confidence intervals in terms of coverage probability, while the mad bootstrap-t, mad-t, and mad T3 outperformed the others in terms of width. Some real-life data are considered to illustrate the findings of the thesis.
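Of the intervals listed, the ordinary bootstrap-t is the one that can be sketched without knowing the thesis's own definitions. The code below is a minimal illustration of the standard studentized bootstrap for the mean. It does not reproduce the median or mad variants proposed in the thesis, and the function name, defaults, and simple quantile rule are assumptions.

```python
import math
import random
import statistics


def bootstrap_t_interval(sample, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap-t (studentized) confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)

    t_stats = []
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in range(n)]
        sd_b = statistics.stdev(resample)
        if sd_b == 0.0:
            continue  # degenerate resample: t statistic is undefined
        se_b = sd_b / math.sqrt(n)
        t_stats.append((statistics.fmean(resample) - mean) / se_b)

    t_stats.sort()
    lower_t = t_stats[int((alpha / 2) * len(t_stats))]
    upper_t = t_stats[int((1 - alpha / 2) * len(t_stats)) - 1]

    # The quantiles swap sides: the upper t quantile sets the lower limit.
    return mean - upper_t * se, mean - lower_t * se
```

For positively skewed data the bootstrap t statistics are typically skewed to the left, so the resulting interval tends to extend further above the sample mean than below it, unlike the symmetric Student-t interval.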

Abstract:

This study computed trends in extreme precipitation events in Florida for 1950-2010. Hourly aggregated rainfall data from 24 stations of the National Climatic Data Center were analyzed to derive time series of extreme rainfall for 12 durations, ranging from 1 hour to 7 days. The non-parametric Mann-Kendall test and the Theil-Sen approach were applied to detect the significance of trends in annual maximum rainfall, the number of above-threshold events, and the average magnitude of above-threshold events for four common analysis periods. The Trend-Free Pre-Whitening (TFPW) approach was applied to remove serial correlation, and a bootstrap resampling approach was used to detect the field significance of trends. The results for annual maximum rainfall revealed predominantly increasing trends at the 0.10 significance level, especially for hourly events in the longer period and daily events in the recent period. The number of above-threshold events exhibited strong decreasing trends for hourly durations in all time periods.
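The two trend tools named above have compact textbook definitions, sketched here for a single evenly spaced series such as annual maxima. This is an illustration under simplifying assumptions (no correction for tied values, no pre-whitening, no field-significance step), not the study's code, and the function names are hypothetical.

```python
import math
import statistics


def mann_kendall(series):
    """Mann-Kendall trend test without tie correction.

    Returns (S, Z, p), where p is the two-sided p-value from the
    normal approximation.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference

    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def theil_sen_slope(series):
    """Median of all pairwise slopes, in units per time step."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return statistics.median(slopes)
```

A trend would be called significant at the 0.10 level used in the study when `p < 0.10`, with the sign of the Theil-Sen slope giving its direction. Rainfall records often contain ties and serial correlation, so a real analysis would add the tie correction and a pre-whitening step such as TFPW before testing.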