999 results for Statistical Genetics


Relevance:

20.00%

Publisher:

Abstract:

Construction of multiple sequence alignments is a fundamental task in bioinformatics. Multiple sequence alignments are a prerequisite for many bioinformatics methods, and the quality of such methods can therefore depend critically on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments. Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of the parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered: those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The predicted alignment quality scores correlated highly with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained.
To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in the profile hidden Markov models.
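
To make the idea of a positional conservation score concrete, here is a minimal Python sketch. It assumes an entropy-based score (one minus the normalized Shannon entropy of each column), which is a common choice but not necessarily the measure developed in the thesis; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_conservation(column, alphabet_size=20):
    """Return 1 - H / log2(alphabet_size) for one alignment column.

    Gap characters ('-') are ignored. A fully conserved column scores 1.0.
    This entropy-based score is an assumption, not the thesis's own measure.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    n = len(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)

def alignment_conservation(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*alignment)]

# Toy sequences for illustration only, not real data.
alignment = ["ACDE", "ACDF", "ACE-", "AGDE"]
scores = alignment_conservation(alignment)
```

The first column contains only `A`, so it receives the maximum score, while the mixed columns score lower. Scores of this kind could then feed a downstream quality model, as the abstract describes.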

Abstract:

In this thesis, X-ray tomography is discussed from the Bayesian statistical viewpoint. The unknown parameters are assumed to be random variables and, in contrast to traditional methods, the solution is obtained as a large sample from the distribution of all possible solutions. As an introduction to tomography, an inversion formula for the Radon transform on a plane is presented. The widely used filtered backprojection algorithm is derived. The traditional regularization methods are presented in sufficient detail to ground the Bayesian approach. The measurements are photon counts at the detector pixels, so the assumption of a Poisson-distributed measurement error is justified. Often the error is assumed to be Gaussian, although the electronic noise caused by the measurement device can change the error structure. The assumption of Gaussian measurement error is discussed. The thesis also discusses the use of different prior distributions in X-ray tomography. Especially in severely ill-posed problems, the choice of a suitable prior is the main part of the whole solution process. In the empirical part, the presented prior distributions are tested using simulated measurements, and the effects that the different prior distributions produce are shown. The use of a prior is shown to be obligatory in the case of a severely ill-posed problem.

Abstract:

Molecular genetics of the infertility-causing immotile short tail sperm defect. In the late 1990s, an autosomal recessively inherited sperm defect causing infertility, the immotile short tail sperm (ISTS) defect, became common in Finnish Yorkshire boars. The disease causes the formation of a sperm tail that is shorter than normal and completely immotile. No other symptoms have been observed in affected boars, and sows are asymptomatic. The aim of this study was to map the gene defect causing ISTS and to develop a DNA test for marker- and gene-assisted selection. In a whole-genome scan, the region causing the defect was mapped to porcine chromosome 16. Based on this mapping, a haplotype of two genetic markers was developed for use in marker-assisted selection. Fine mapping of the disease-linked region was continued in order to develop a gene test for carrier diagnostics. By comparative mapping, the region linked to the defect was localized to a 2 cM region on human chromosome 5 (5p13.2). Variation found in the porcine sequences corresponding to the genes located in this region made it possible to refine the disease-linked haplotypes. Based on the haplotypes, the region linked to the ISTS defect was narrowed to a region of eight genes on the human gene map. Sequencing of a candidate gene located in the region (KPL2) revealed a mobile DNA sequence, a Line-1 retroposon, inserted in an intron. This retroposon alters the splicing of the gene so that either the preceding exon is skipped or part of the intron and insert sequence is also incorporated into the mRNA product of the gene. In both cases the result is a truncated KPL2 protein. A gene test based on this retroposon insertion has been used by pig breeders since 2006. Examination of KPL2 gene expression in pig and mouse revealed several tissue-specific splice variants.
The long form of the KPL2 gene is expressed mainly in the testis, which explains why the symptoms caused by the gene defect relate specifically to sperm development. Expression of the KPL2 protein during mouse sperm tail development and its possible interaction with the IFT20 protein point to a role in protein transport to the sperm tail. In addition to a possible transport role, KPL2 may also function as a structural component of the sperm tail, because it was localized to the midpiece of the mature sperm tail. Furthermore, the KPL2 protein may also function in the Golgi apparatus and in the junctions between Sertoli cells and spermatids, but these observations require further study. The results of this study show that the KPL2 gene is important for sperm tail development and that a structural change in it causes the ISTS defect in Finnish Yorkshire boars. The expression and localization of the KPL2 protein during sperm development provide clues to the function of the protein. Because the KPL2 gene sequence is highly conserved, these results provide new information on sperm development and on the causes of male infertility in all mammals.

Abstract:

This thesis focuses on statistical analysis methods and proposes the use of Bayesian inference to extract the information contained in experimental data by estimating the parameters of an Ebola model. The model is a system of differential equations expressing the behavior and dynamics of Ebola. Two sets of data (onset and death data) were both used to estimate the parameters, which was not done in the earlier work (Chowell, 2004). To be able to use both data sets, a new version of the model was built. The model parameters were estimated and then used to calculate the basic reproduction number and to study the disease-free equilibrium. The parameter estimates were useful for determining how well the model fits the data and how good the estimates were, in terms of the information they provided about the possible relationship between variables. The solution showed that the Ebola model fits the observed onset data at 98.95% and the observed death data at 93.6%. Since Bayesian inference cannot be performed analytically, the Markov chain Monte Carlo approach was used to generate samples from the posterior distribution over the parameters. The samples were used to check the accuracy of the model and other characteristics of the target posteriors.
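
The Markov chain Monte Carlo step described above can be illustrated with a minimal random-walk Metropolis sketch in Python. It does not reproduce the Ebola differential-equation model; as a stand-in it assumes a single Poisson rate with a flat prior, and the case counts, step size and function names are all hypothetical.

```python
import math
import random

def log_posterior(lam, counts):
    """Log posterior (up to a constant) of a Poisson rate, flat prior on lam > 0."""
    if lam <= 0:
        return -math.inf
    return sum(counts) * math.log(lam) - len(counts) * lam

def metropolis(counts, n_samples=20000, step=0.5, start=1.0, seed=0):
    """Random-walk Metropolis sampler for the Poisson rate."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, counts)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

# Made-up weekly case counts, for illustration only.
counts = [3, 5, 4, 6, 2, 4, 5, 3]
samples = metropolis(counts)
burned = samples[2000:]  # discard an initial burn-in period
posterior_mean = sum(burned) / len(burned)
```

For this toy target the posterior is a Gamma distribution with mean (sum of counts + 1) / (number of counts) = 4.125, which gives a simple check on the sampler. In a model like the one in the thesis, `log_posterior` would instead compare a numerical solution of the differential equations with the onset and death data.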

Abstract:

The optimal design of a heat exchanger system is based on given model parameters together with given standard ranges for machine design variables. The goals set were met: minimizing the Life Cycle Cost (LCC) function, which represents the price of the saved energy; maximizing the momentary heat recovery output with the given constraints satisfied; and taking into account the uncertainty in the models. The Nondominated Sorting Genetic Algorithm II (NSGA-II) for the design optimization of a system is presented and implemented in the Matlab environment. Markov Chain Monte Carlo (MCMC) methods are also used to take into account the uncertainty in the models. Results show that the price of saved energy can be optimized. A wet heat exchanger is found to be more efficient and beneficial than a dry heat exchanger even though its construction is expensive (160 EUR/m2) compared to the construction of a dry heat exchanger (50 EUR/m2). It has been found that a longer lifetime weights higher CAPEX and lower OPEX, and vice versa, and the effect of the uncertainty in the models has been identified in a simplified case of minimizing the area of a dry heat exchanger.

Abstract:

Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL, OMIM #125310) is an inherited vascular disease. The main symptoms include migrainous headache, recurrent strokes and progressive cognitive impairment. CADASIL is caused by mutations in the NOTCH3 gene which result in degeneration of vascular smooth muscle cells, arteriolar stenosis and impaired cerebral blood flow. The aims of this study were to assess the genetic background of Finnish and Swedish CADASIL patients, to analyse genetic and environmental factors that may influence the phenotype, and to identify the optimal diagnostic strategy. The majority of Finnish CADASIL patients carry the p.Arg133Cys mutation. Haplotype analysis of 18 families revealed a region of linkage disequilibrium around the NOTCH3 locus, which is evidence for a founder effect and a common ancestral mutation. Despite the same mutational background, the clinical course of CADASIL is highly variable between and even within families. The association of several genetic factors with the phenotypic variation was investigated in 120 CADASIL patients. Apolipoprotein E allele 4 was associated with earlier occurrence of strokes, especially in younger patients. Study of a pair of monozygotic twins with CADASIL revealed environmental factors which may influence the phenotype, i.e. smoking, statin medication and physical activity. Knowledge of these factors is useful, since life-style choices may influence the disease progression. The clinical CADASIL diagnosis can be confirmed by detection of either the NOTCH3 mutation or granular osmiophilic material by electron microscopy in skin biopsy, although the sensitivity estimates have been contradictory. Comparison of these two methods in a group of 131 diagnostic cases from Finland, Sweden and France demonstrated that both methods are highly sensitive and reliable.

Abstract:

Two high performance liquid chromatography (HPLC) methods for the quantitative determination of indinavir sulfate were tested, validated and statistically compared. Assays were carried out using, as mobile phases, mixtures of dibutylammonium phosphate buffer pH 6.5 and acetonitrile (55:45) at 1 mL/min or citrate buffer pH 5 and acetonitrile (60:40) at 1 mL/min, with an octylsilane column (RP-8) and a UV spectrophotometric detector at 260 nm. Both methods showed good sensitivity, linearity, precision and accuracy. Statistical analysis using Student's t-test for the determination of indinavir sulfate in raw material and capsules indicated no statistically significant difference between the two methods.

Abstract:

Increasing evidence suggests oceanic traits may play a key role in the genetic structuring of marine organisms. Whereas genetic breaks in the open ocean are well known in fishes and marine invertebrates, the importance of marine habitat characteristics in seabirds remains less certain. We investigated the role of oceanic transitions versus population genetic processes in driving population differentiation in a highly vagile seabird, the Cory's shearwater, combining molecular, morphological and ecological data from 27 breeding colonies distributed across the Mediterranean (Calonectris diomedea diomedea) and the Atlantic (C. d. borealis). Genetic and biometric analyses showed a clear differentiation between Atlantic and Mediterranean Cory's shearwaters. Ringing-recovery data indicated high site fidelity of the species, but we found some cases of dispersal among neighbouring breeding sites (<300 km) and a few long distance movements (>1000 km) within and between each basin. In agreement with this, comparison of phenotypic and genetic data revealed both current and historical dispersal events. Within each region, we did not detect any genetic substructure among archipelagos in the Atlantic, but we found a slight genetic differentiation between western and eastern breeding colonies in the Mediterranean. Accordingly, gene flow estimates suggested substantial dispersal among colonies within basins. Overall, genetic structure of the Cory's shearwater matches main oceanographic breaks (Almería-Oran Oceanic Front and Siculo-Tunisian Strait), but spatial analyses suggest that patterns of genetic differentiation are better explained by geographic rather than oceanographic distances. In line with previous studies, genetic, phenotypic and ecological evidence supported the separation of Atlantic and Mediterranean forms, suggesting the 2 taxa should be regarded as different species.

Abstract:

In the current study, we evaluated various robust statistical methods for comparing two independent groups. Two simulation scenarios were generated: one of equality and another of population mean differences. In each scenario, 33 experimental conditions were used as a function of sample size, standard deviation and asymmetry. For each condition, 5000 replications per group were generated. The results show an adequate Type I error rate but not high power for the confidence intervals. In general, for the two scenarios studied (population mean differences and no population mean differences) and across the conditions analysed, the Mann-Whitney U-test demonstrated strong performance, and the Yuen-Welch t-test performed slightly worse.
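
As a brief illustration of the Mann-Whitney U-test named above, the following Python sketch computes the U statistic by direct pair counting. The two groups are hypothetical values, not data from the study, and the sketch stops at the statistic itself (no p-value or simulation loop).

```python
def mann_whitney_u(x, y):
    """U statistic for sample x against sample y.

    Each pair with x_i > y_j contributes 1 and each tie contributes 0.5.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical measurements for two independent groups.
group_a = [1.1, 2.3, 2.9, 4.0, 5.2]
group_b = [0.4, 1.9, 2.3, 3.1]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
```

The two statistics always sum to the number of pairs, here 5 × 4 = 20, which is a convenient sanity check. A simulation study like the one described would repeat this over many generated samples and record how often the test rejects.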

Abstract:

The identifiability of the parameters of a heat exchanger model without phase change was studied in this Master’s thesis using synthetic data. A fast, two-step Markov chain Monte Carlo (MCMC) method was tested with a couple of case studies and a heat exchanger model. The two-step MCMC method worked well and decreased the computation time compared to the traditional MCMC method. The effect of the measurement accuracy of certain control variables on the identifiability of the parameters was also studied. The accuracy used did not seem to have a notable effect on the identifiability of the parameters. The use of the posterior distribution of the parameters in different heat exchanger geometries was studied. It would be computationally most efficient to use the same posterior distribution among different geometries in the optimisation of heat exchanger networks. According to the results, this was possible when the frontal surface areas were the same among the different geometries. In the other cases the same posterior distribution can be used for optimisation too, but that will give a wider predictive distribution as a result. For condensing surface heat exchangers, the numerical stability of the simulation model was studied. As a result, a stable algorithm was developed.

Abstract:

Genetic diversity is one of the levels of biodiversity that the World Conservation Union (IUCN) has recognized as being important to preserve. This is because genetic diversity is fundamental to the future evolution and to the adaptive flexibility of a species to respond to the inherently dynamic nature of the natural world. Therefore, the key to maintaining biodiversity and healthy ecosystems is to identify, monitor and maintain locally-adapted populations, along with their unique gene pools, upon which future adaptation depends. Thus, conservation genetics deals with the genetic factors that affect extinction risk and the genetic management regimes required to minimize the risk. The conservation of exploited species, such as salmonid fishes, is particularly challenging due to the conflicts between different interest groups. In this thesis, I conduct a series of conservation genetic studies on primarily Finnish populations of two salmonid fish species (European grayling, Thymallus thymallus, and lake-run brown trout, Salmo trutta) which are popular recreational game fishes in Finland. The general aim of these studies was to apply and develop population genetic approaches to assist conservation and sustainable harvest of these populations. The approaches applied included: i) the characterization of population genetic structure at national and local scales; ii) the identification of management units and the prioritization of populations for conservation based on evolutionary forces shaping indigenous gene pools; iii) the detection of population declines and the testing of the assumptions underlying these tests; and iv) the evaluation of the contribution of natural populations to a mixed stock fishery. Based on microsatellite analyses, clear genetic structuring of exploited Finnish grayling and brown trout populations was detected at both national and local scales. 
Finnish grayling were clustered into three genetically distinct groups, corresponding to northern, Baltic and south-eastern geographic areas of Finland. The genetic differentiation among and within population groups of grayling ranged from moderate to high levels. Such strong genetic structuring combined with low genetic diversity strongly indicates that genetic drift plays a major role in the evolution of grayling populations. Further analyses of European grayling covering the majority of the species’ distribution range indicated a strong global footprint of population decline. Using a coalescent approach, the beginning of population reduction was dated back to 1,000 to 10,000 years ago (ca. 200 to 2,000 generations). Forward simulations demonstrated that the bottleneck footprints measured using the M ratio can persist within small populations much longer than previously anticipated in the face of low levels of gene flow. In contrast to the M ratio, two alternative methods for genetic bottleneck detection identified recent bottlenecks in six grayling populations that warrant future monitoring. Consistent with the predominant role of random genetic drift, the effective population size (Ne) estimates of all grayling populations were very low, with the majority of Ne estimates below 50. Taken together, highly structured local populations, limited gene flow and the small Ne of grayling populations indicate that grayling populations are vulnerable to overexploitation and, hence, that monitoring and careful management using the precautionary principle are required not only in Finland but throughout Europe. Population genetic analyses of lake-run brown trout populations in the Inari basin (northernmost Finland) revealed a hierarchical population structure in which individual populations were clustered into three population groups largely corresponding to different geographic regions of the basin.
Similar to my earlier work with European grayling, the genetic differentiation among and within population groups of lake-run brown trout was relatively high. Such strong differentiation indicated that the power to determine the relative contribution of populations in mixed fisheries should be relatively high. Consistent with these expectations, high accuracy and precision in mixed stock analysis (MSA) simulations were observed. Application of MSA to indigenous fish caught in the Inari basin identified altogether twelve populations that contributed significantly to mixed stock fisheries, with the Ivalojoki river system being the major contributor (70%) to the total catch. When the contribution of wild trout populations to the fisheries was evaluated regionally, geographically nearby populations were the main contributors to the local catches. MSA also revealed a clear separation between the lower and upper reaches of the Ivalojoki river system: in contrast to the lower reaches of the Ivalojoki river, which contributed considerably to the catch, populations from the upper reaches of the Ivalojoki river system (>140 km from the river mouth) did not contribute significantly to the fishery. This could be related to the available habitat size but could also be associated with a resident-type life history and an increased cost of migration. The studies in my thesis highlight the importance of dense sampling and wide population coverage at the scale being studied and also demonstrate the importance of critical evaluation of the underlying assumptions of the population genetic models and methods used. These results have important implications for conservation and sustainable fisheries management of Finnish populations of European grayling and brown trout in the Inari basin.
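
The M ratio used for bottleneck detection in the abstract above can be sketched in a few lines of Python. The sketch assumes the commonly used definition M = k / r for a single microsatellite locus, where k is the number of distinct alleles observed and r is the number of repeat states spanned by the allele size range; the thesis may use a variant, and the allele sizes below are invented.

```python
def m_ratio(allele_sizes_bp, repeat_length_bp):
    """M = k / r for one microsatellite locus.

    k: number of distinct alleles observed.
    r: number of possible repeat states between the smallest and largest
       allele, inclusive (an assumed, commonly used definition).
    """
    alleles = set(allele_sizes_bp)
    k = len(alleles)
    r = (max(alleles) - min(alleles)) // repeat_length_bp + 1
    return k / r

# Invented allele sizes (base pairs) for a dinucleotide repeat locus.
full = m_ratio([100, 102, 104, 106, 108], 2)    # every state occupied
gapped = m_ratio([100, 108, 100, 108, 104], 2)  # intermediate alleles lost
```

A population that has lost intermediate alleles keeps a wide size range but fewer alleles, so its M ratio drops below 1. In practice the ratio is averaged over loci and compared against a simulated critical value.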

Abstract:

A statistical mixture-design technique was used to study the effects of different solvents and their mixtures on the yield, total polyphenol content, and antioxidant capacity of the crude extracts from the bark of Schinus terebinthifolius Raddi (Anacardiaceae). The experimental results and their response-surface models showed that ternary mixtures with equal portions of all three solvents (water, ethanol and acetone) were better than the binary mixtures in generating crude extracts with the highest yield (22.04 ± 0.48%), total polyphenol content (29.39 ± 0.39%), and antioxidant capacity (6.38 ± 0.21). An analytical method was developed and validated for the determination of total polyphenols in the extracts. Optimal conditions for the various parameters in this analytical method, namely the time for the chromophoric reaction to stabilize, the wavelength of the absorption maximum to be monitored, the reference standard, and the concentration of sodium carbonate, were determined to be 5 min, 780 nm, pyrogallol, and 14.06% (w/v), respectively. UV-Vis spectrophotometric monitoring of the reaction under these conditions proved the method to be linear, specific, precise, accurate, reproducible, robust, and easy to perform.

Abstract:

We explore a statistical model of DNA to obtain information about the behavior of its thermodynamic quantities. Special attention is given to the thermal denaturation of this macromolecule.

Abstract:

This study aimed to evaluate the genetic variability among individuals of a base population of Eucalyptus grandis and to build a molecular marker database for the analyzed populations. The Eucalyptus grandis base population comprised 327 individuals from Coff's Harbour, Atherton and Rio Claro. A few plants came from other sites (Belthorpe MT. Pandanus, Kenilworth, Yabbra, etc.). Since this base population had a heterogeneous composition, the groups were divided according to geographic location (latitude and longitude) and genetic breeding level. The influence of those two factors (geographic location and genetic breeding level) on the genetic variability detected was then discussed. The RAPD technique allowed the evaluation of 70 loci. The binary matrix was used to estimate the genetic similarity among individuals using Jaccard's coefficient. Parametric statistical tests were used to compare the within-group mean similarities. The results showed that the base population had wide genetic variability and a mean genetic similarity of 0.328. Sub-group 3 (wild materials from the Atherton region) showed a mean genetic similarity of 0.318. S.P.A. (from the Coff's Harbour region) had a mean genetic similarity of 0.322 and was found to be very important for the maintenance of variation in the base population. This can be explained by the fact that the individuals from those groups accounted for most of the base population (48.3% of it). The base population plants with genetic similarity higher than 0.60 should be phenotypically analyzed again in order to clarify the tendency of genetic variability during breeding programs.
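
The similarity calculation described above can be sketched in Python as follows. The sketch computes Jaccard's coefficient between binary band profiles (1 = band present, 0 = absent) and averages it over all pairs of individuals; the three five-locus profiles and the function names are hypothetical and much smaller than the 70-locus RAPD matrix in the study.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient for two binary band profiles.

    Shared presences divided by positions where at least one profile
    has a band; shared absences are ignored.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0

def mean_pairwise_similarity(profiles):
    """Average Jaccard coefficient over all pairs of individuals."""
    pairs = list(combinations(profiles, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical band profiles for three individuals.
profiles = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 1, 0],
]
mean_similarity = mean_pairwise_similarity(profiles)
```

Because shared absences are ignored, two individuals that both lack a band are not counted as more similar, which is the usual reason for preferring Jaccard's coefficient with dominant markers such as RAPD.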

Abstract:

This study aimed to develop a methodology based on multivariate statistical analysis, using principal component analysis and cluster analysis, in order to identify the most representative variables in studies of minimum streamflow regionalization and to optimize the identification of hydrologically homogeneous regions for the Doce river basin. Ten variables were used, referring to the climatic and morphometric characteristics of the river basin. These variables were determined for each of the 61 gauging stations. Three were dependent variables indicative of minimum streamflow (Q7,10, Q90 and Q95), and seven were independent variables concerning the climatic and morphometric characteristics of the basin: total annual rainfall (Pa); total semiannual rainfall of the dry and of the rainy season (Pss and Psc); watershed drainage area (Ad); length of the main river (Lp); total length of the rivers (Lt); and average watershed slope (SL). The results of the principal component analysis indicated that the variable SL was the least representative for the study, so it was discarded. The most representative independent variables were Ad and Psc. The best divisions of hydrologically homogeneous regions for the three studied flow characteristics were obtained using the Mahalanobis similarity matrix and the complete linkage clustering method. The cluster analysis enabled the identification of four hydrologically homogeneous regions in the Doce river basin.