14 results for Analysis and statistical methods

in Digital Commons at Florida International University


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Microarray technology provides a high-throughput technique for studying gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data, so effective data processing and analysis are critical for making reliable inferences from the data.
The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study of different classification tools was performed, and the best combinations of gene selection methods and classifiers for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior.
The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method, Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified.
The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
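To make the pooling idea in the abstract above concrete, here is a minimal sketch of two of the named techniques, Fisher's inverse chi-square method and the (optionally weighted) Stouffer Z method. It uses the textbook definitions and is not taken from the dissertation; the function names and the example p-values are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_combined(pvalues):
    """Fisher's method: X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom under the null."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # sf(X) = exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


def stouffer_combined(pvalues, weights=None):
    """Stouffer's Z method; passing weights gives the weighted
    (Liptak-Stouffer) variant. Each p must lie strictly in (0, 1)."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvalues)
    z_scores = [nd.inv_cdf(1.0 - p) for p in pvalues]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(numerator / denominator)


# Hypothetical p-values for one gene from three independent experiments.
example = [0.04, 0.10, 0.07]
pooled_fisher = fisher_combined(example)
pooled_stouffer = stouffer_combined(example, weights=[3.0, 1.0, 2.0])
```

With a single input p-value both functions return that p-value unchanged, which is a quick sanity check on the implementation.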

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation develops a new figure of merit to measure the similarity (or dissimilarity) of Gaussian distributions through a novel concept that relates the Fisher distance to the percentage of data overlap. The derivations are expanded to provide a generalized mathematical platform for determining an optimal separating boundary of Gaussian distributions in multiple dimensions. Real-world data used for implementation and for carrying out feasibility studies were provided by Beckman-Coulter. Although the data used are flow cytometric in nature, the mathematics is general in its derivation and extends to other types of data as long as their statistical behavior approximates a Gaussian distribution.
Because this new figure of merit is heavily based on the statistical nature of the data, a new filtering technique is introduced to account for the accumulation process involved with histogram data. When data are accumulated into a frequency histogram, they are inherently smoothed in a linear fashion, since an averaging effect takes place as the histogram is generated. This new filtering scheme addresses data that are accumulated in the uneven resolution of the channels of the frequency histogram.
The qualitative interpretation of flow cytometric data is currently a time-consuming and imprecise method for evaluating histogram data. The method developed here offers a broader spectrum of capabilities in the analysis of histograms, since the figure of merit derived in this dissertation integrates within its mathematics both a measure of similarity and the percentage of overlap between the distributions under analysis.
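The abstract does not give the figure of merit itself, so the following is only an illustrative sketch of the two quantities it relates, for the one-dimensional case. It assumes that "Fisher distance" can be read as the Fisher discriminant ratio, which is an assumption made here and not confirmed by the source; the function names are hypothetical, while `NormalDist.overlap` is part of the Python standard library.

```python
from statistics import NormalDist


def fisher_ratio(a: NormalDist, b: NormalDist) -> float:
    """Fisher discriminant ratio (assumed reading of 'Fisher distance'):
    squared mean difference divided by the summed variances."""
    return (a.mean - b.mean) ** 2 / (a.variance + b.variance)


def overlap_fraction(a: NormalDist, b: NormalDist) -> float:
    """Overlapping area of the two probability densities, between 0 and 1."""
    return a.overlap(b)


# Two hypothetical unit-variance populations whose means are 2 apart.
pop_a = NormalDist(mu=0.0, sigma=1.0)
pop_b = NormalDist(mu=2.0, sigma=1.0)
separation = fisher_ratio(pop_a, pop_b)
shared = overlap_fraction(pop_a, pop_b)
```

As the separation grows the overlap shrinks toward zero, and two identical distributions have separation 0 and overlap 1.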

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The adverse health effects of long-term exposure to lead are well established, with major uptake into the human body occurring mainly through oral ingestion by young children. Lead-based paint was frequently used in homes built before 1978, particularly in inner-city areas. Minority populations experience the effects of lead poisoning disproportionately.
Lead-based paint abatement is costly. In the United States, residents of about 400,000 homes, occupied by 900,000 young children, lack the means to correct lead-based paint hazards. The magnitude of this problem demands research on affordable methods of hazard control. One method is encapsulation, defined as any covering or coating that acts as a permanent barrier between the lead-based paint surface and the environment.
Two encapsulants were tested for reliability and effective life span through an accelerated lifetime experiment that applied stresses exceeding those encountered under normal use conditions. The resulting time-to-failure data were used to extrapolate the failure time under conditions of normal use. Statistical analysis and models of the test data allow forecasting of long-term reliability relative to the 20-year encapsulation requirement. Typical housing material specimens simulating walls and doors coated with lead-based paint were overstressed before encapsulation. A second, un-aged set was also tested. Specimens were monitored after the stress test with a surface chemical testing pad to identify the presence of lead breaking through the encapsulant.
Graphical analysis proposed by Shapiro and Meeker and the general log-linear model developed by Cox were used to obtain results. Findings for the 80% reliability time to failure varied, with close to 21 years of life under normal use conditions for encapsulant A. The application of product A on the aged gypsum and aged wood substrates yielded slightly lower times. Encapsulant B had an 80% reliable life of 19.78 years.
This study reveals that encapsulation technologies can offer safe and effective control of lead-based paint hazards and may be less expensive than other options. The U.S. Department of Health and Human Services and the CDC are committed to eliminating childhood lead poisoning by 2010. This ambitious target is feasible, provided there is an efficient application of innovative technology, a goal to which this study aims to contribute.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Microarray platforms have been around for many years, and although new technologies are on the rise in laboratories, microarrays are still prevalent. Many methods have been proposed and refined for analyzing microarray data to identify differentially expressed (DE) genes. However, the most popular methods, such as Significance Analysis of Microarrays (SAM), samroc, fold change, and rank product, are far from perfect. Which method is most powerful depends on the characteristics of the sample and the distribution of the gene expressions. The most commonly used method is SAM or samroc, but when the data tend to be skewed, the power of these methods decreases. Building on the idea that the median is a better measure of central tendency than the mean when the data are skewed, this thesis modifies the test statistics of the SAM and fold change methods. The study shows that the median-modified fold change method improves power in many cases when identifying DE genes if the data follow a lognormal distribution.
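The abstract does not spell out the modified statistic, so the sketch below only illustrates the general idea of swapping the mean for the median in a log2 fold change. The function name, the `center` parameter, and the expression values are hypothetical, and the thesis's actual definition may differ.

```python
import math
from statistics import mean, median


def log2_fold_change(treated, control, center=median):
    """Difference of group centers on the log2 scale.
    center=mean gives the usual fold change; center=median gives a
    median-based variant that is less sensitive to skewed values."""
    log_treated = [math.log2(x) for x in treated]
    log_control = [math.log2(x) for x in control]
    return center(log_treated) - center(log_control)


# Hypothetical expression values for one gene; one treated sample is an outlier.
treated = [8.0, 8.0, 8.0, 1024.0]
control = [4.0, 4.0, 4.0, 4.0]
fc_median = log2_fold_change(treated, control, center=median)
fc_mean = log2_fold_change(treated, control, center=mean)
```

In this example the single outlier pulls the mean-based value well above the median-based one, which is the kind of skew sensitivity the abstract describes.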

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The major purpose of this study was to ascertain how needs assessment findings and methodologies are accepted by public decision makers in the U.S. Virgin Islands. To accomplish this, the following five different needs assessments were executed: (1) population survey; (2) key informants survey; (3) community forum; (4) rates-under-treatment (RUT); and (5) social indicators analysis. The assessments measured unmet needs of older persons regarding transportation, in-home care, and socio-recreation services, and determined which of the five methodologies is most costly, time consuming, and valid.
The results of a five-way comparative analysis were presented to public sector decision makers, who were surveyed to determine whether they are influenced more by needs assessment findings or by the methodology used, and to ascertain the factors that lead to their acceptance of needs assessment findings and methodologies.
The survey results revealed that acceptance of findings and methodology is influenced by the congruency of the findings with decision makers' goals and objectives, the feasibility of the findings, and the credibility of the researcher.
The study also found that decision makers are influenced equally by needs assessment findings and methodology; that they prefer population surveys, although these are the most expensive and time consuming of the methodologies; that different types of needs assessments produce different results; and that needs assessment is an essential program planning tool. Executive decision makers are found to be influenced more by management factors than by legal and political factors, while legislative decision makers are influenced more by legal factors. Decision makers overwhelmingly view their leadership style as democratic.
A typology of the five needs assessments, highlighting their strengths and weaknesses, is offered as a planning guide for public decision makers.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The physics of self-organization and complexity is manifested on a variety of biological scales, from large ecosystems to the molecular level. Protein molecules exhibit characteristics of complex systems in terms of their structure, dynamics, and function. Proteins have the extraordinary ability to fold to a specific functional three-dimensional shape, starting from a random coil, in a biologically relevant time. How they accomplish this is one of the secrets of life. In this work, theoretical research into understanding this remarkable behavior is discussed. Thermodynamic and statistical mechanical tools are used in order to investigate the protein folding dynamics and stability. Theoretical analyses of the results from computer simulation of the dynamics of a four-helix bundle show that the excluded volume entropic effects are very important in protein dynamics and crucial for protein stability. The dramatic effects of changing the size of sidechains imply that a strategic placement of amino acid residues with a particular size may be an important consideration in protein engineering. Another investigation deals with modeling protein structural transitions as a phase transition. Using finite size scaling theory, the nature of unfolding transition of a four-helix bundle protein was investigated and critical exponents for the transition were calculated for various hydrophobic strengths in the core. It is found that the order of the transition changes from first to higher order as the strength of the hydrophobic interaction in the core region is significantly increased. Finally, a detailed kinetic and thermodynamic analysis was carried out in a model two-helix bundle. The connection between the structural free-energy landscape and folding kinetics was quantified. 
I show how simple protein engineering, by changing the hydropathy of a small number of amino acids, can enhance protein folding by significantly changing the free energy landscape so that kinetic traps are removed. The results have general applicability in protein engineering as well as understanding the underlying physical mechanisms of protein folding.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation establishes a novel data-driven method to identify language network activation patterns in pediatric epilepsy through the use of Principal Component Analysis (PCA) on functional magnetic resonance imaging (fMRI). A total of 122 subjects’ data sets from five different hospitals were included in the study through a web-based repository site designed here at FIU. Research was conducted to evaluate different classification and clustering techniques in identifying hidden activation patterns and their associations with meaningful clinical variables. The results were assessed through agreement analysis with the conventional methods of lateralization index (LI) and visual rating. What is unique in this approach is the new mechanism designed for projecting language network patterns in the PCA-based decisional space. Synthetic activation maps were randomly generated from real data sets to uniquely establish nonlinear decision functions (NDF), which are then used to classify any new fMRI activation map as typical or atypical. The best nonlinear classifier was obtained on a 4D space with a complexity (nonlinearity) degree of 7. Based on the significant association of language dominance and intensities with the top eigenvectors of the PCA decisional space, a new algorithm was deployed to delineate primary cluster members without intensity normalization. In this case, three distinct activation patterns (groups) were identified (averaged kappa with rating 0.65, with LI 0.76) and were characterized by the regions of: (1) the left inferior frontal gyrus (IFG) and left superior temporal gyrus (STG), considered typical for the language task; (2) the IFG, left mesial frontal lobe, and right cerebellum regions, representing a variant left dominant pattern by higher activation; and (3) the right homologues of the first pattern in Broca's and Wernicke's language areas.
Interestingly, group 2 was found to reflect a different language compensation mechanism than reorganization. Its high intensity activation suggests a possible remote effect on the right hemisphere focus on traditionally left-lateralized functions. In retrospect, this data-driven method provides new insights into mechanisms for brain compensation/reorganization and neural plasticity in pediatric epilepsy.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The purpose of the present dissertation was to evaluate the internal validity of symptoms of four common anxiety disorders included in the Diagnostic and Statistical Manual of Mental Disorders fourth edition (text revision) (DSM-IV-TR; American Psychiatric Association, 2000), namely, separation anxiety disorder (SAD), social phobia (SOP), specific phobia (SP), and generalized anxiety disorder (GAD), in a sample of 625 youth (ages 6 to 17 years) referred to an anxiety disorders clinic and 479 parents. Confirmatory factor analyses (CFAs) were conducted on the dichotomous items of the SAD, SOP, SP, and GAD sections of the youth and parent versions of the Anxiety Disorders Interview Schedule for DSM-IV (ADIS-IV: C/P; Silverman & Albano, 1996) to test and compare a number of factor models, including a factor model based on the DSM. Contrary to predictions, findings from the CFAs showed that a correlated model with five factors (SAD, SOP, SP, GAD worry, and GAD somatic distress) provided the best fit to the youth data as well as the parent data. Multiple group CFAs supported the metric invariance of the correlated five-factor model across boys and girls. Thus, the present study’s findings support the internal validity of DSM-IV SAD, SOP, and SP, but raise doubt regarding the internal validity of GAD.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Reduced organic sulfur (ROS) compounds are environmentally ubiquitous and play an important role in sulfur cycling as well as in biogeochemical cycles of toxic metals, in particular mercury. Development of effective methods for analysis of ROS in environmental samples and investigation of the interactions of ROS with mercury are critical for understanding the role of ROS in mercury cycling, yet both are poorly studied. Covalent affinity chromatography-based methods were attempted for analysis of ROS in environmental water samples. A method was developed for analysis of environmental thiols by preconcentration using a covalent affinity chromatographic column or solid phase extraction, followed by release of the thiols from the thiopropyl sepharose gel using TCEP and analysis using HPLC-UV or HPLC-FL. Under the optimized conditions, the detection limits of the method using HPLC-FL detection were 0.45 and 0.36 nM for Cys and GSH, respectively. Our results suggest that covalent affinity methods are efficient for thiol enrichment and interference elimination, demonstrating their promise for developing a sensitive, reliable, and useful technique for thiol analysis in environmental water samples. The dissolution of mercury sulfide (HgS) in the presence of ROS and dissolved organic matter (DOM) was investigated by quantifying the effects of ROS on HgS dissolution and determining the speciation of the mercury released from ROS-induced HgS dissolution. It was observed that the presence of small ROS (e.g., Cys and GSH) and large-molecule DOM, in particular at high concentrations, could significantly enhance the dissolution of HgS. The dissolved Hg during HgS dissolution determined using the conventional 0.22 μm cutoff method could include colloidal Hg (e.g., HgS colloids) and truly dissolved Hg (e.g., Hg-ROS complexes).
A centrifugal filtration method (with 3 kDa MWCO) was employed to characterize the speciation and reactivity of the Hg released during ROS-enhanced HgS dissolution. The presence of small ROS could produce a considerable fraction (about 40% of total mercury in the solution) of truly dissolved mercury (< 3 kDa), probably due to the formation of Hg-Cys or Hg-GSH complexes. The truly dissolved Hg formed during GSH- or Cys-enhanced HgS dissolution was directly reducible (100% for GSH and 40% for Cys) by stannous chloride, demonstrating its potential role in Hg transformation and bioaccumulation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The elemental analysis of soil is useful in forensic and environmental sciences. Methods were developed and optimized for two laser-based multi-element analysis techniques: laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) and laser-induced breakdown spectroscopy (LIBS). This work represents the first use of a 266 nm laser for forensic soil analysis by LIBS. Sample preparation methods were developed and optimized for a variety of sample types, including pellets for large bulk soil specimens (470 mg) and sediment-laden filters (47 mg), and tape-mounting for small transfer evidence specimens (10 mg). Analytical performance for sediment filter pellets and tape-mounted soils was similar to that achieved with bulk pellets. An inter-laboratory comparison exercise was designed to evaluate the performance of the LA-ICP-MS and LIBS methods, as well as micro X-ray fluorescence (μXRF), across multiple laboratories. Limits of detection (LODs) were 0.01-23 ppm for LA-ICP-MS, 0.25-574 ppm for LIBS, and 16-4400 ppm for μXRF, all well below the levels normally seen in soils. Good intra-laboratory precision (≤ 6 % relative standard deviation (RSD) for LA-ICP-MS; ≤ 8 % for μXRF; ≤ 17 % for LIBS) and inter-laboratory precision (≤ 19 % for LA-ICP-MS; ≤ 25 % for μXRF) were achieved for most elements, which is encouraging for a first inter-laboratory exercise. While LIBS generally has higher LODs and RSDs than LA-ICP-MS, both were capable of generating good quality multi-element data sufficient for discrimination purposes. Multivariate methods using principal components analysis (PCA) and linear discriminant analysis (LDA) were developed for discrimination of soils from different sources. Specimens from different sites that were indistinguishable by color alone were discriminated by elemental analysis. Correct classification rates of 94.5 % or better were achieved in a simulated forensic discrimination of three similar sites for both LIBS and LA-ICP-MS.
Results for tape-mounted specimens were nearly identical to those achieved with pellets. Methods were tested on soils from the USA, Canada, and Tanzania. Within-site heterogeneity was site-specific. Elemental differences were greatest for specimens separated by large distances, even within the same lithology. Elemental profiles can be used to discriminate soils from different locations and narrow down locations even when mineralogy is similar.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The presence of harmful algal blooms (HAB) is a growing concern in aquatic environments. Among HAB organisms, cyanobacteria are of special concern because they have been reported worldwide to cause environmental and human health problems through contamination of drinking water. Although several analytical approaches have been applied to monitoring cyanobacterial toxins, conventional methods are costly and time-consuming, with field sampling and subsequent lab analysis taking weeks. Capillary electrophoresis (CE) is a particularly suitable analytical separation method because it can couple very small samples and rapid separations to a wide range of selective and sensitive detection techniques. This paper demonstrates a method for rapid separation and identification of four microcystin variants commonly found in aquatic environments. Procedures were developed for CE coupled to UV detection and to electrospray ionization time-of-flight mass spectrometry (ESI-TOF). All four analytes were separated within 6 minutes. The ESI-TOF experiment provides accurate molecular information, which further identifies the analytes.
