979 resultados para statistical analysis


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The developmental processes and functions of an organism are controlled by the genes and the proteins that are derived from these genes. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and also evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapter 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and construction of their regulatory networks using gene expression data (chapter 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative as well as qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum method (WS) in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling the population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes. For the second part, I have worked on three problems. First problem involved evaluation of eight gene association methods. A very comprehensive comparison of these methods with further analysis clearly demonstrates the distinct and common performance of these eight gene association methods. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and reconstruct their hierarchical regulatory networks. This algorithm has produced very significant results and it is the first report to produce such hierarchical networks for these pathways. The third problem dealt with developing another algorithm called the top-down graphical Gaussian model that identifies the network governed by a specific TF. The network produced by the algorithm is proven to be of very high accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High density spatial and temporal sampling of EEG data enhances the quality of results of electrophysiological experiments. Because EEG sources typically produce widespread electric fields (see Chapter 3) and operate at frequencies well below the sampling rate, increasing the number of electrodes and time samples will not necessarily increase the number of observed processes, but mainly increase the accuracy of the representation of these processes. This is namely the case when inverse solutions are computed. As a consequence, increasing the sampling in space and time increases the redundancy of the data (in space, because electrodes are correlated due to volume conduction, and time, because neighboring time points are correlated), while the degrees of freedom of the data change only little. This has to be taken into account when statistical inferences are to be made from the data. However, in many ERP studies, the intrinsic correlation structure of the data has been disregarded. Often, some electrodes or groups of electrodes are a priori selected as the analysis entity and considered as repeated (within subject) measures that are analyzed using standard univariate statistics. The increased spatial resolution obtained with more electrodes is thus poorly represented by the resulting statistics. In addition, the assumptions made (e.g. in terms of what constitutes a repeated measure) are not supported by what we know about the properties of EEG data. From the point of view of physics (see Chapter 3), the natural “atomic” analysis entity of EEG and ERP data is the scalp electric field

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJECTIVES In dental research multiple site observations within patients or taken at various time intervals are commonplace. These clustered observations are not independent; statistical analysis should be amended accordingly. This study aimed to assess whether adjustment for clustering effects during statistical analysis was undertaken in five specialty dental journals. METHODS Thirty recent consecutive issues of Orthodontics (OJ), Periodontology (PJ), Endodontology (EJ), Maxillofacial (MJ) and Paediatric Dentristry (PDJ) journals were hand searched. Articles requiring adjustment accounting for clustering effects were identified and statistical techniques used were scrutinized. RESULTS Of 559 studies considered to have inherent clustering effects, adjustment for this was made in the statistical analysis in 223 (39.1%). Studies published in the Periodontology specialty accounted for clustering effects in the statistical analysis more often than articles published in other journals (OJ vs. PJ: OR=0.21, 95% CI: 0.12, 0.37, p<0.001; MJ vs. PJ: OR=0.02, 95% CI: 0.00, 0.07, p<0.001; PDJ vs. PJ: OR=0.14, 95% CI: 0.07, 0.28, p<0.001; EJ vs. PJ: OR=0.11, 95% CI: 0.06, 0.22, p<0.001). A positive correlation was found between increasing prevalence of clustering effects in individual specialty journals and correct statistical handling of clustering (r=0.89). CONCLUSIONS The majority of studies in 5 dental specialty journals (60.9%) examined failed to account for clustering effects in statistical analysis where indicated, raising the possibility of inappropriate decreases in p-values and the risk of inappropriate inferences.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Assemblages of organic-walled dinoflagellate cysts (dinocysts) from 116 marine surface samples have been analysed to assess the relationship between the spatial distribution of dinocysts and modern local environmental conditions [e.g. sea surface temperature (SST), sea surface salinity (SSS), productivity] in the eastern Indian Ocean. Results from the percentage analysis and statistical methods such as multivariate ordination analysis and end-member modelling, indicate the existence of three distinct environmental and oceanographic regions in the study area. Region 1 is located in western and eastern Indonesia and controlled by high SSTs and a low nutrient content of the surface waters. The Indonesian Throughflow (ITF) region (Region 2) is dominated by heterotrophic dinocyst species reflecting the region's high productivity. Region 3 is encompassing the area offshore north-west and west Australia which is characterised by the water masses of the Leeuwin Current, a saline and nutrient depleted southward current featuring energetic eddies.