953 results for Bias, Error Rates, Genetic Modelling


Relevance: 100.00%

Abstract:

This dissertation has three separate parts: the first part deals with general pedigree association testing incorporating continuous covariates; the second part deals with association tests under population stratification using conditional likelihood tests; the third part deals with genome-wide association studies based on the real rheumatoid arthritis (RA) data sets from Genetic Analysis Workshop 16 (GAW16) Problem 1.

Many statistical tests have been developed to test linkage and association using either case-control status or phenotype covariates for family data, separately. Such univariate analyses may not use all the information available from family members in practical studies. Moreover, complex human diseases do not have a clear inheritance pattern; the underlying genes may interact or act independently. In Part I, the newly proposed approach, MPDT, focuses on using both case-control information and phenotype covariates, and can be applied to detect multiple marker effects. Built on two popular existing statistics for family studies, one for case-control data and one for quantitative traits, the new approach can be used on simple family structures as well as general pedigrees. The combined statistic is calculated from the two component statistics; a permutation procedure is applied to assess the p-value, with Bonferroni adjustment for the multiple markers. We use simulation studies to evaluate the type I error rates and power of the proposed approach. Our results show that the combined test using both case-control information and phenotype covariates not only has correct type I error rates but is also more powerful than existing methods. For multiple marker interactions, the proposed method is also very powerful.

Selective genotyping is an economical strategy for detecting and mapping quantitative trait loci in the genetic dissection of complex disease. When the samples arise from different ethnic groups or an admixed population, all existing selective genotyping methods may yield spurious association due to different ancestry distributions. The problem can be more serious when the sample size is large, a general requirement for sufficient power to detect the modest genetic effects typical of complex traits. In Part II, I describe a useful strategy for selective genotyping in the presence of population stratification. Our procedure uses a principal-component-based approach to eliminate any effect of population stratification. We evaluate the performance of the procedure using simulated data from an earlier study as well as HapMap data sets, under a variety of population admixture models generated from empirical data.

The rheumatoid arthritis data set of Problem 1 in the Genetic Analysis Workshop 16 (GAW16) has one binary trait and two continuous traits: RA status, AntiCCP and IgM. To allow multiple traits, we propose a set of SNP-level F statistics, based on the concept of multiple correlation, to measure the genetic association between multiple trait values and SNP-specific genotypic scores, and we obtain their null distributions. We then perform six genome-wide association analyses using novel one- and two-stage approaches based on single, double and triple traits. Combining all six analyses, we successfully validate the SNPs previously identified in the literature as responsible for rheumatoid arthritis and detect additional disease-susceptibility SNPs for future follow-up studies. Except for chromosomes 13 and 18, every chromosome is found to harbour susceptibility regions for rheumatoid arthritis or related diseases such as lupus erythematosus. This topic is discussed in Part III.
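
The permutation-with-Bonferroni machinery used here is generic; the following minimal sketch illustrates it for a single marker under simplifying assumptions (unrelated individuals rather than pedigrees, and a hypothetical `combined_stat` standing in for the actual MPDT statistic):

```python
import numpy as np

def permutation_pvalue(stat_fn, genotypes, traits, n_perm=10_000, seed=0):
    """Permutation p-value: shuffling trait labels breaks any genotype-trait
    association while preserving the genotype structure."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(genotypes, traits)
    exceed = sum(stat_fn(genotypes, rng.permutation(traits)) >= observed
                 for _ in range(n_perm))
    return (1 + exceed) / (n_perm + 1)   # +1 correction gives a valid test

def combined_stat(g, y):
    """Hypothetical stand-in for the MPDT combined statistic."""
    return abs(np.corrcoef(g, y)[0, 1]) * np.sqrt(len(y))

rng = np.random.default_rng(1)
g = rng.integers(0, 3, 300)              # genotype codes 0/1/2 at one marker
y = 0.2 * g + rng.standard_normal(300)   # simulated phenotype covariate
p = permutation_pvalue(combined_stat, g, y)

# Bonferroni over m markers: declare marker j significant if p_j < alpha / m
pvals = np.array([p, 0.20, 0.0004])
alpha = 0.05
significant = pvals < alpha / len(pvals)
```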

Relevance: 100.00%

Abstract:

Linkage and association studies are major analytical tools for searching for susceptibility genes for complex diseases. With the availability of large collections of single nucleotide polymorphisms (SNPs), rapid progress in high-throughput genotyping technologies, and the ambitious goals of the International HapMap Project, genetic markers covering the whole genome are becoming available for genome-wide linkage and association studies. To avoid inflating the type I error rate in genome-wide linkage and association studies, the significance level must be adjusted for each independent linkage and/or association test, which has led to suggested genome-wide significance cut-offs as low as 5 × 10⁻⁷. Almost no linkage and/or association study can meet such a stringent threshold with standard statistical methods, so new statistics with higher power are urgently needed. This dissertation proposes and explores a class of novel test statistics, applicable to both population-based and family-based genetic data, that employs a completely new strategy: nonlinear transformations of the sample means are used to construct test statistics for linkage and association studies. Extensive simulation studies illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new way to design test statistics with high power and may open new ways to map susceptibility genes for complex diseases.
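
The dissertation's specific transforms are not given in this abstract; as a hedged illustration of the general recipe, the delta method makes a statistic built on a nonlinear transform g of a sample mean asymptotically chi-square under the null, as sketched below with the classical arcsine-square-root transform as an assumed example:

```python
import numpy as np
from scipy import stats

def nonlinear_mean_test(x, mu0, g, g_prime):
    """Delta-method test of H0: E[X] = mu0 using
    T = n * (g(xbar) - g(mu0))^2 / (g'(mu0)^2 * s^2),
    asymptotically chi-square with 1 df under H0."""
    n = len(x)
    xbar, s2 = np.mean(x), np.var(x, ddof=1)
    t = n * (g(xbar) - g(mu0)) ** 2 / (g_prime(mu0) ** 2 * s2)
    return t, stats.chi2.sf(t, df=1)

# Illustrative choice only: arcsine-square-root transform of an allele
# frequency, a classical variance-stabilising transform for proportions.
g = lambda p: np.arcsin(np.sqrt(p))
gp = lambda p: 1 / (2 * np.sqrt(p * (1 - p)))
x = np.random.default_rng(0).binomial(1, 0.55, size=500)  # allele indicators
t, p = nonlinear_mean_test(x, mu0=0.5, g=g, g_prime=gp)
```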

Relevance: 100.00%

Abstract:

The MFG test is a family-based association test that detects genetic effects contributing to disease in offspring, including offspring allelic effects, maternal allelic effects and MFG incompatibility effects. Like many other family-based association tests, it assumes that offspring survival and offspring-parent genotypes are conditionally independent given that the offspring is affected. However, when the putative disease-increasing locus also affects a competing phenotype, for example offspring viability, the conditional independence assumption fails and these tests can lead to incorrect conclusions about the role of the gene in disease. We propose the v-MFG test to adjust for the genetic effects on one phenotype, e.g., viability, when testing the effects of that locus on another phenotype, e.g., disease. Using genotype data from nuclear families containing parents and at least one affected offspring, the v-MFG test models the distribution of family genotypes conditional on offspring phenotypes, simultaneously estimating genetic effects on two phenotypes, viability and disease. Simulations show that the v-MFG test produces accurate estimates of genetic effects on disease as well as on viability under several different scenarios. It maintains accurate type-I error rates and provides adequate power with moderate sample sizes to detect genetic effects on disease risk when viability is reduced. We demonstrate the v-MFG test with HLA-DRB1 data from study participants with rheumatoid arthritis (RA) and their parents, showing that it successfully detects an MFG incompatibility effect on RA while simultaneously adjusting for a possible viability loss.
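
The v-MFG conditional likelihood itself is beyond the scope of a short sketch, but the "condition on an affected offspring" idea it builds on is shared with the much simpler transmission disequilibrium test (TDT), shown here for orientation (the counts are illustrative):

```python
from scipy import stats

def tdt(b, c):
    """Classical TDT (McNemar-style) statistic. b = transmissions of the
    test allele from heterozygous parents to affected offspring,
    c = non-transmissions. Under no linkage/association,
    b ~ Binomial(b + c, 1/2), and (b - c)^2 / (b + c) is ~ chi-square(1)."""
    chi2 = (b - c) ** 2 / (b + c)
    return chi2, stats.chi2.sf(chi2, df=1)

chi2, p = tdt(b=48, c=30)  # illustrative counts, not data from the paper
```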

Relevance: 100.00%

Abstract:

Speaker verification is the process of verifying the identity of a person by analysing their speech. There are several important applications for automatic speaker verification (ASV) technology, including suspect identification, tracking terrorists and detecting a person’s presence at a remote location in the surveillance domain, as well as person authentication for phone banking and credit card transactions in the private sector. Telephones and telephony networks provide a natural medium for these applications.

The aim of this work is to improve the usefulness of ASV technology for practical applications in the presence of adverse conditions. In a telephony environment, background noise, handset mismatch, channel distortions, room acoustics and restrictions on the available testing and training data are common sources of error for ASV systems. Two research themes were pursued to overcome these adverse conditions: modelling mismatch and modelling uncertainty.

To address the performance degradation incurred through mismatched conditions, it was proposed to model this mismatch explicitly. Feature mapping was evaluated for combating handset mismatch and was extended through the use of a blind clustering algorithm to remove the need for accurate handset labels in the training data. Mismatch modelling was then generalised by explicitly modelling the session conditions as a constrained offset of the speaker model means. This session variability modelling approach enabled the modelling of arbitrary sources of mismatch, including handset type, and halved the error rates in many cases.

Methods to model the uncertainty in speaker model estimates and verification scores were developed to address the difficulties of limited training and testing data. The Bayes factor was introduced to account for the uncertainty of the speaker model estimates in testing by applying Bayesian theory to the verification criterion, with improved performance in matched conditions. Modelling the uncertainty in the verification score itself met with significant success: estimating a confidence interval for the "true" verification score enabled an order-of-magnitude reduction in the average quantity of speech required to make a confident threshold-based verification decision. The confidence measures developed in this work may also have significant applications for forensic speaker verification tasks.
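
As an illustration of the confidence-interval idea, the sketch below implements a generic early-stopping rule that accepts or rejects as soon as a confidence interval on the running mean score clears the threshold; the i.i.d.-score assumption and the warm-up length are simplifications of this sketch, not the thesis's actual procedure:

```python
import numpy as np
from scipy import stats

def early_decision(frame_scores, threshold, conf=0.99):
    """Sequentially test whether the running mean verification score is
    confidently above or below the decision threshold; stop as soon as
    the confidence interval excludes the threshold."""
    for n in range(10, len(frame_scores) + 1):  # warm-up of 10 frames
        x = frame_scores[:n]
        mean, sem = np.mean(x), stats.sem(x)
        lo, hi = stats.t.interval(conf, df=n - 1, loc=mean, scale=sem)
        if lo > threshold:
            return "accept", n    # confidently above threshold
        if hi < threshold:
            return "reject", n    # confidently below threshold
    return "undecided", len(frame_scores)

scores = np.random.default_rng(0).normal(1.2, 1.0, 500)  # toy frame scores
print(early_decision(scores, threshold=0.5))
```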

Relevance: 100.00%

Abstract:

As order dependencies between process tasks can become complex, it is easy to make mistakes in process model design, especially behavioral ones such as deadlocks. Notions such as soundness formalize behavioral errors, and tools exist that can identify such errors. However, these tools do not provide assistance with correcting the process models. Error correction can be very challenging, as the intentions of the process modeler are not known and there may be many ways in which an error can be corrected. We present a novel technique for automatic error correction in process models based on simulated annealing. With this technique, a number of alternative process models are identified that resolve one or more errors in the original model. The technique is implemented and validated on a sample of industrial process models. The tests show that at least one sound solution can be found for each input model and that response times are short.
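
The sketch below shows the generic simulated-annealing skeleton that such a technique builds on; the `cost` and `neighbours` callables (counting behavioural errors and applying model edits) are problem-specific stand-ins, not the paper's actual implementation:

```python
import math
import random

def simulated_annealing(initial, cost, neighbours, t0=1.0, cooling=0.995,
                        steps=10_000, seed=0):
    """Generic simulated annealing: occasionally accept a worse candidate
    with probability exp(-delta/T), so the search can escape local optima.
    For process model correction, `cost` would count the behavioural errors
    remaining in a model (plus, e.g., an edit distance to the original) and
    `neighbours` would apply small correcting edit operations; both are
    user-supplied callables in this sketch."""
    rng = random.Random(seed)
    current = best = initial
    t = t0
    for _ in range(steps):
        candidate = rng.choice(neighbours(current))
        delta = cost(candidate) - cost(current)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if cost(current) < cost(best):
                best = current
        if cost(best) == 0:   # all behavioural errors resolved
            break
        t *= cooling
    return best
```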

Relevance: 100.00%

Abstract:

Fusion techniques have received considerable attention for achieving lower error rates with biometrics. A fused classifier architecture based on sequential integration of multi-instance and multi-sample fusion schemes allows a controlled trade-off between false alarms and false rejects. Expressions for each type of error for the fused system have previously been derived for the case of statistically independent classifier decisions. This paper shows that the performance of this architecture can be improved by modelling the correlation between classifier decisions. Correlation modelling also enables better tuning of the fusion model parameters, ‘N’, the number of classifiers, and ‘M’, the number of attempts/samples, and facilitates the determination of error bounds for false rejects and false accepts for each specific user. The error trade-off performance of the architecture is evaluated using HMM-based speaker verification on utterances of individual digits. Results show that performance is improved for favourably correlated decisions. The architecture investigated here is directly applicable to speaker verification from spoken digit strings, such as credit card numbers, in telephone or voice-over-internet-protocol based applications. It is also applicable to other biometric modalities such as fingerprints and handwriting samples.
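
As a rough illustration of why correlation matters here, the following Monte-Carlo sketch compares the fused false-reject rate under independent and correlated attempt-level decisions; the shared-Gaussian-factor correlation model and the stage logic (an instance accepts if any of its attempts accepts, and all instances must accept) are assumptions of this sketch:

```python
import numpy as np
from scipy.stats import norm

def fused_frr(frr, n_inst, m_samp, rho=0.0, trials=200_000, seed=0):
    """Monte-Carlo false-reject rate for a fusion rule in which each of
    n_inst classifier instances gets m_samp attempts. Correlation between
    attempt-level errors is induced through a shared Gaussian factor,
    a modelling assumption for this sketch."""
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((trials, 1, 1))
    own = rng.standard_normal((trials, n_inst, m_samp))
    latent = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own
    attempt_fails = latent < norm.ppf(frr)   # lower frr-tail => false reject
    inst_fails = attempt_fails.all(axis=2)   # all attempts at a stage fail
    return inst_fails.any(axis=1).mean()     # any failed stage rejects

print(fused_frr(0.05, 3, 2, rho=0.0))  # ~0.0075, matches independence theory
print(fused_frr(0.05, 3, 2, rho=0.5))  # correlation inflates the fused rate
```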

Relevance: 100.00%

Abstract:

Despite the prominent use of the Suchey-Brooks (S-B) method of age estimation in forensic anthropological practice, the method is subject to intrinsic limitations, with reports of differential error rates between geographical populations. This study assessed the accuracy of the S-B method in a contemporary adult population in Queensland, Australia, and provides robust age parameters calibrated for this population. Three-dimensional surface reconstructions were generated from computed tomography scans of the pubic symphysis of male and female Caucasian individuals aged 15–70 years (n = 195) in Amira® and Rapidform®. Error was analyzed on the basis of bias, inaccuracy and percentage of correct classification for left and right symphyseal surfaces. Application of transition analysis and chi-square statistics demonstrated 63.9% and 69.7% correct age classification from the left symphyseal surface of Australian males and females, respectively, using the S-B method. Using Bayesian statistics, probability density distributions for each S-B phase were calculated, providing refined age parameters for our population. Mean inaccuracies of 6.77 (±2.76) and 8.28 (±4.41) years were found for the left surfaces of males and females, respectively, with positive biases for younger individuals (<55 years) and negative biases for older individuals. Significant sexual dimorphism in the application of the S-B method was observed, and asymmetry in phase classification of the pubic symphysis was a frequent phenomenon. These results suggest that the S-B method should be applied with caution in medico-legal death investigations of Queensland skeletal remains, and they warrant further investigation of reliable age estimation techniques.
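
A generic sketch of the Bayesian step, with a uniform prior over an age grid and a hypothetical Gaussian phase-given-age model (the study estimates the actual transition parameters from its CT data):

```python
import numpy as np
from scipy.stats import norm

ages = np.arange(15, 71)                       # age grid (years)
prior = np.ones_like(ages, float) / len(ages)  # uniform prior over the grid

def phase_likelihood(phase, age, mu, sigma=8.0):
    """Toy P(phase | age), centred on an assumed mean age mu[phase];
    in the study these curves come from transition analysis."""
    return norm.pdf(age, loc=mu[phase], scale=sigma)

mu = {1: 19, 2: 24, 3: 29, 4: 38, 5: 48, 6: 60}  # illustrative phase means
observed_phase = 4
post = phase_likelihood(observed_phase, ages, mu) * prior
post /= post.sum()                               # posterior P(age | phase)
map_age = ages[np.argmax(post)]                  # point estimate
ci = ages[np.searchsorted(np.cumsum(post), [0.025, 0.975])]  # 95% interval
```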

Relevance: 100.00%

Abstract:

- Encompasses the whole BPM lifecycle, including process identification, modelling, analysis, redesign, automation and monitoring
- Class-tested textbook complemented with additional teaching material on the accompanying website
- Covers relevant conceptual background, industrial standards and actionable skills

Business Process Management (BPM) is the art and science of how work should be performed in an organization in order to ensure consistent outputs and to take advantage of improvement opportunities, e.g. reducing costs, execution times or error rates. Importantly, BPM is not about improving the way individual activities are performed, but rather about managing entire chains of events, activities and decisions that ultimately produce added value for an organization and its customers. This textbook encompasses the entire BPM lifecycle, from process identification to process monitoring, covering along the way process modelling, analysis, redesign and automation. Concepts, methods and tools from business management, computer science and industrial engineering are blended into one comprehensive and interdisciplinary approach. The presentation is illustrated using the BPMN industry standard, defined by the Object Management Group and widely endorsed by practitioners and vendors worldwide. In addition to explaining the relevant conceptual background, the book provides dozens of examples, more than 100 hands-on exercises – many with solutions – as well as numerous suggestions for further reading. The textbook is the result of many years of combined teaching experience of the authors, both at the undergraduate and graduate levels as well as in the context of professional training. Students and professionals from both business management and computer science will benefit from the step-by-step style of the textbook and its focus on fundamental concepts and proven methods. Lecturers will appreciate the class-tested format and the additional teaching material available on the accompanying website fundamentals-of-bpm.org.

Relevance: 100.00%

Abstract:

Speaker diarization is the process of annotating an input audio stream with information that attributes temporal regions of the audio signal to their respective sources, which may include both speech and non-speech events. For speech regions, the diarization system also specifies the locations of speaker boundaries and assigns relative speaker labels to each homogeneous segment of speech. In short, speaker diarization systems effectively answer the question of ‘who spoke when’.

There are several important applications for speaker diarization technology, such as facilitating speaker indexing systems that allow users to directly access the relevant segments of interest within a given audio file, and assisting with downstream processes such as summarizing and parsing. When combined with automatic speech recognition (ASR) systems, the metadata extracted from a speaker diarization system can provide complementary information for ASR transcripts, including the location of speaker turns and relative speaker segment labels, making the transcripts more readable. Speaker diarization output can also be used to localize the instances of specific speakers and pool data for model adaptation, which in turn boosts transcription accuracy. Speaker diarization therefore plays an important role as a preliminary step in the automatic transcription of audio data.

The aim of this work is to improve the usefulness and practicality of speaker diarization technology through the reduction of diarization error rates. In particular, this research focuses on the segmentation and clustering stages within a diarization system. Although particular emphasis is placed on the broadcast news audio domain, and the systems developed throughout this work are trained and tested on broadcast news data, the techniques proposed in this dissertation are also applicable to other domains, including telephone conversations and meeting audio. Three main research themes were pursued: heuristic rules for speaker segmentation, modelling uncertainty in speaker model estimates, and modelling uncertainty in eigenvoice speaker modelling.

The use of heuristic approaches for the speaker segmentation task was investigated first, with emphasis placed on minimizing missed boundary detections. A set of heuristic rules was proposed to govern the detection and heuristic selection of candidate speaker segment boundaries. A second pass, using the same heuristic algorithm with a smaller window, was also proposed with the aim of improving the detection of boundaries around short speaker segments. Compared to single-threshold-based methods, the proposed heuristic approach was shown to provide improved segmentation performance, leading to a reduction in the overall diarization error rate.

Methods to model the uncertainty in speaker model estimates were developed to address the difficulty of making segmentation and clustering decisions with limited data in the speaker segments. The Bayes factor, derived specifically for multivariate Gaussian speaker modelling, was introduced to account for the uncertainty of the speaker model estimates. The use of the Bayes factor also enabled the incorporation of prior information about the audio to aid segmentation and clustering decisions.

The idea of modelling uncertainty in speaker model estimates was then extended to the eigenvoice speaker modelling framework for the speaker clustering task. Building on the application of Bayesian approaches to the speaker diarization problem, the proposed approach takes into account the uncertainty associated with the explicit estimation of the speaker factors. The proposed decision criteria, based on Bayesian theory, were shown to generally outperform their non-Bayesian counterparts.
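
The thesis's Bayes factor is derived in full there; a widely used simpler relative for the same merge decision is the ΔBIC criterion for two Gaussian-modelled segments, sketched below for orientation:

```python
import numpy as np

def delta_bic(x, y, lam=1.0):
    """ΔBIC merge criterion for two feature segments x, y (frames × dims),
    each modelled by a full-covariance Gaussian. A negative value favours
    merging the segments (same speaker)."""
    n_x, n_y = len(x), len(y)
    n, d = n_x + n_y, x.shape[1]
    z = np.vstack([x, y])
    logdet = lambda a: np.linalg.slogdet(np.cov(a, rowvar=False))[1]
    gain = 0.5 * (n * logdet(z) - n_x * logdet(x) - n_y * logdet(y))
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty

rng = np.random.default_rng(0)
a = rng.normal(0, 1, (200, 12))   # toy segments from the same "speaker"
b = rng.normal(0, 1, (180, 12))
print(delta_bic(a, b))            # negative => merge
```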

Relevance: 100.00%

Abstract:

Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve a controlled trade-off between false alarms and false rejects. Although analysis of text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, the best performance with this architecture is obtained for certain combinations of instances. Heuristic rules and diversity measures have commonly been used for classifier selection, but it is shown that optimal performance is achieved with the 'best combination performance' rule. As the search complexity of this rule increases exponentially with the addition of classifiers, a measure, the sequential error ratio (SER), is proposed in this work that is specifically adapted to the characteristics of the sequential fusion architecture. The proposed measure can be used to select the classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM-based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near-optimal performance for sequential fusion of multiple instances, with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone- or internet-based access control, and to other systems such as identity verification based on multiple fingerprints or handwriting samples.
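
A sketch contrasting the exhaustive 'best combination performance' rule with a greedy stage-by-stage selection; the greedy routine is a generic stand-in illustrating sequential selection (not the SER formula itself), and the toy fused-error function is an assumption:

```python
import math
from itertools import combinations

def best_combination(classifiers, k, fused_error):
    """Exhaustive 'best combination performance' rule: evaluate the fused
    error of every k-subset. Cost grows as C(n, k), which motivates cheaper
    sequential selection measures such as the proposed SER."""
    return min(combinations(classifiers, k), key=fused_error)

def greedy_selection(classifiers, k, fused_error):
    """Greedy stand-in for sequential selection: at each stage add the
    classifier that most reduces the fused error of the current chain."""
    chosen, remaining = [], list(classifiers)
    for _ in range(k):
        nxt = min(remaining, key=lambda c: fused_error(tuple(chosen + [c])))
        chosen.append(nxt)
        remaining.remove(nxt)
    return tuple(chosen)

# Toy fused error assuming independent OR-style fusion of per-classifier
# error rates (in practice this would be measured on an evaluation set).
err = {"c1": 0.04, "c2": 0.06, "c3": 0.05, "c4": 0.08}
toy_fused = lambda subset: 1 - math.prod(1 - err[c] for c in subset)
print(best_combination(err, 2, toy_fused), greedy_selection(err, 2, toy_fused))
```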

Relevance: 100.00%

Abstract:

Reliable performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation, and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent: it is difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is a method, based on classifier fusion techniques, for better controlling the trade-off between verification errors, using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multi-sample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived from the base classifier performances. As this assumption may not always be valid, the expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is evaluated empirically for text-dependent speaker verification, using Hidden Markov Model based digit-dependent speaker models in each stage, with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled via two parameters, the number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on an evaluation/tuning set. The statistical validity of the derived error estimates is evaluated on test data. The performance of the sequential method is further shown to depend on the order in which the digits (instances) are combined and on the nature of the repeated attempts (samples). The false rejection and false acceptance rates of the proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions, and a sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criterion. The error rates are better estimated by incorporating user-dependent information (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent information (such as client-impostor dependent favourable combinations and class-error based threshold estimation). The proposed architecture is desirable in most speaker verification applications, such as remote authentication and telephone and internet shopping; tuning the number of instances and samples serves both the security and user-convenience requirements of speaker-specific verification. The architecture investigated here is also applicable to verification using other biometric modalities such as handwriting, fingerprints and keystrokes.
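
Under the independence assumption, the error expressions for one plausible reading of this architecture (each instance accepts if any of its m attempts accepts; the claim is accepted only if all instances accept) are short enough to sketch; the decision logic is an assumption of this sketch, and the dissertation's correlation-corrected expressions are not reproduced:

```python
import numpy as np

def fused_errors(fars, frrs, m):
    """Analytical error rates for sequential multi-instance, multi-sample
    fusion, assuming statistically independent decisions. fars/frrs are the
    per-instance false-accept and false-reject rates; m is the number of
    attempts allowed at each instance."""
    fars, frrs = np.asarray(fars), np.asarray(frrs)
    stage_far = 1 - (1 - fars) ** m         # impostor passes a stage
    stage_frr = frrs ** m                   # client fails all m attempts
    far_fused = np.prod(stage_far)          # impostor must pass every stage
    frr_fused = 1 - np.prod(1 - stage_frr)  # client fails if any stage fails
    return far_fused, frr_fused

print(fused_errors([0.02, 0.03, 0.02], [0.05, 0.04, 0.06], m=2))
```

Raising m drives the fused false-reject rate down but the fused false-accept rate up, while adding instances does the opposite, which is precisely the controlled trade-off the two parameters provide.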

Relevance: 100.00%

Abstract:

So far, most Phase II trials have been designed and analysed under a frequentist framework, in which a trial is designed so that the overall Type I and Type II errors are controlled at desired levels. Recently, a number of articles have advocated the use of Bayesian designs in practice. Under a Bayesian framework, a trial is designed to stop when the posterior probability of the treatment effect falls within certain prespecified thresholds. In this article, we argue that trials under a Bayesian framework can also be designed to control frequentist error rates, and we introduce a Bayesian version of Simon's well-known two-stage design to achieve this goal. We also consider two other errors, called Bayesian errors in this article because of their similarity to posterior probabilities, and show that our method can control these Bayesian-type errors as well. We compare our method with other recent Bayesian designs in a numerical study and discuss the implications of the different designs for error rates. An example of a clinical trial for patients with nasopharyngeal carcinoma is used to illustrate the differences between the designs.
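
The frequentist operating characteristics of any Simon-style two-stage design follow from binomial sums, as sketched below with hypothetical design numbers:

```python
from scipy.stats import binom

def two_stage_reject_prob(p, n1, r1, n, r):
    """Probability that a Simon-style two-stage design declares the
    treatment promising when the true response rate is p: the trial
    continues past stage 1 only if responses exceed r1 out of n1 patients,
    and rejects H0 at the end if total responses exceed r out of n."""
    return sum(
        binom.pmf(x1, n1, p) * binom.sf(r - x1, n - n1, p)
        for x1 in range(r1 + 1, n1 + 1)
    )

# Hypothetical design: stop after 17 patients unless >3 respond; declare
# the treatment promising if >10 of 37 respond overall.
alpha = two_stage_reject_prob(p=0.20, n1=17, r1=3, n=37, r=10)  # Type I
power = two_stage_reject_prob(p=0.40, n1=17, r1=3, n=37, r=10)  # 1 - Type II
```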

Relevance: 100.00%

Abstract:

Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally, absences comprise observations or are inferred from comprehensive sampling. When such information is not available, pseudo-absences are often generated from the background locations within the study region containing the presences, or absence is implied through the comparison of presences to the whole study region, as in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves the analysis highly susceptible to ‘naughty noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those beyond the species envelope, can often be more diverse than presences.

Here we consider an extension to excess zero models. The two-stage approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. SDMs can then be fitted separately within each envelope; for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006).

We introduce a wider range of model performance measures to improve the treatment of naughty noughts in SDM. We retain an overall measure of model performance, the area under the curve (AUC) of the Receiver Operating Characteristic (ROC) curve, but focus on its constituent measures, the false negative rate (FNR) and false positive rate (FPR), and on how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of the predictions: the false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting the predicted (or potential) presences that correspond to absences. A high FDR may be desirable since it could help target future search efforts, whereas a zero or low FOR is desirable since it indicates that none of the (often valuable) presences have been ignored in the SDM.

For illustration, we chose Bradypus variegatus, a species previously published as an exemplar species for MaxEnt by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0) and eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fitted within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM but otherwise performed better: the FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, the MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence lying in the same range (0.00-0.20) for both true absences and presences.

In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR, yet clearly visible in FOR, FNR, and the comparison of the predicted probability of presence distributions for presences and absences.
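
The four rates can be computed directly from a thresholded confusion matrix, as in this minimal sketch:

```python
import numpy as np

def sdm_error_rates(p_pred, observed, threshold=0.5):
    """Threshold-dependent error rates for presence/absence predictions.
    FNR and FPR condition on the observations; FOR and FDR condition on the
    predictions, the user's perspective emphasised in the text."""
    pred = p_pred >= threshold
    tp = np.sum(pred & observed)
    fp = np.sum(pred & ~observed)
    fn = np.sum(~pred & observed)
    tn = np.sum(~pred & ~observed)
    return {
        "FNR": fn / (fn + tp),   # observed presences predicted absent
        "FPR": fp / (fp + tn),   # observed absences predicted present
        "FOR": fn / (fn + tn),   # predicted absences that were presences
        "FDR": fp / (fp + tp),   # predicted presences that were absences
    }

rng = np.random.default_rng(1)
p_pred = rng.random(1000)                 # toy predicted probabilities
observed = rng.random(1000) < p_pred      # toy presence/absence outcomes
print(sdm_error_rates(p_pred, observed, threshold=0.3))
```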

Relevance: 100.00%

Abstract:

Noise can be defined as unwanted sound, and it may adversely affect the health and well-being of individuals. Noise sensitivity is a personality trait covering attitudes towards noise in general and is a predictor of noise annoyance; noise sensitive individuals are more affected by noise than less sensitive individuals. The determinants and characteristics related to noise sensitivity remain rather poorly known. The risk of health effects caused by noise can be hypothesized to be higher for noise sensitive individuals than for those who are not noise sensitive, with cardiovascular disease being one possible outcome. The general aim of the present study was to investigate the association of noise sensitivity with specific somatic and psychological factors, including the genetic component of noise sensitivity, and the association of noise sensitivity with mortality.

The study was based on the Finnish Twin Cohort of same-sex twin pairs born before 1958. In 1988, a questionnaire was sent to twin pairs discordant for hypertension; 1,495 individuals (688 men, 807 women) aged 31–88 years replied, including 573 complete twin pairs. 218 of the subjects lived in the Helsinki Metropolitan Area. Self-reported noise sensitivity, lifetime noise exposure and hypertension were obtained from the 1988 questionnaire, and other somatic and psychological factors from the 1981 questionnaire for the same individuals. In addition, noise map information (1988–1992) from the Helsinki Metropolitan Area and mortality follow-up for 1989–2003 were used. To evaluate the stability and validity of noise sensitivity, a new questionnaire was sent in 2002 to a sample of the subjects who had replied to the 1988 questionnaire.

Of all subjects who answered the question on noise sensitivity, 38% were noise sensitive. Noise sensitivity was independent of the noise exposure levels indicated in the noise maps. Subjects with high noise sensitivity reported more transportation noise exposure than subjects with low noise sensitivity, and reported transportation noise exposure outside the mapped areas almost twice as often as non-sensitive subjects. Noise sensitivity was associated with hypertension, emphysema, use of psychotropic drugs, smoking, stress and hostility, even when lifetime noise exposure was adjusted for. Monozygotic twin pairs were more similar with regard to noise sensitivity than dizygotic twin pairs, and quantitative genetic modelling indicated significant familiality; the best-fitting genetic model gave a heritability estimate of 36%. Follow-up of subjects in the case-control study showed that cardiovascular mortality was significantly increased among noise sensitive women, but not among men. For coronary heart disease mortality, the interaction of noise sensitivity and lifetime noise exposure was statistically significant in women.

In conclusion, noise sensitivity has both somatic and psychological components. It aggregates in families and probably has a genetic component. Noise sensitivity may be a risk factor for cardiovascular mortality in women.
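
The study fits full quantitative genetic models; the classical Falconer back-of-envelope version of the same idea, shown with hypothetical twin-pair correlations chosen to be roughly consistent with the reported 36% heritability, is:

```python
# Falconer's formula: heritability from twin correlations, h^2 = 2(r_MZ - r_DZ).
# The study itself fitted full quantitative genetic (ACE-type) models; the
# correlations below are illustrative, not the study's estimates.
r_mz = 0.40   # hypothetical monozygotic twin-pair correlation
r_dz = 0.22   # hypothetical dizygotic twin-pair correlation

h2 = 2 * (r_mz - r_dz)   # additive genetic component: 0.36, i.e. ~36%
c2 = 2 * r_dz - r_mz     # shared-environment component
e2 = 1 - r_mz            # unique environment (plus measurement error)
print(f"h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")  # the three sum to 1
```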

Relevance: 100.00%

Abstract:

Cardiovascular disease (CVD) is a complex disease with multifactorial aetiology; both genetic and environmental factors contribute to disease risk. The lifetime risk of CVD differs markedly between men and women, with men at increased risk. Inflammatory reactions contribute to the development of the disease by promoting atherosclerosis in artery walls.

In the first part of this thesis, we identified several inflammation-related CVD risk factors that were associated with the amount of DNA obtained from whole blood samples, indicating a potential source of bias if a genetic study selects participants based on the available amount of DNA. In the subsequent studies, this observation was taken into account by applying whole-genome amplification to samples that would otherwise have been excluded due to very low DNA yield. We then investigated the contribution of inflammatory genes to CVD risk separately in men and women, and looked for sex-genotype interactions.

In the second part, we explored a new candidate gene and its role in CVD risk. Selenoprotein S (SEPS1) is a membrane protein residing in the endoplasmic reticulum, where it participates in the retro-translocation of unfolded proteins for cytosolic protein degradation. Previous studies have indicated that SEPS1 protects cells from oxidative stress and that variants in the gene are associated with circulating levels of inflammatory cytokines. In our study, we identified two variants in the SEPS1 gene that were associated with coronary heart disease and ischemic stroke in women. This is, to our knowledge, the first study to suggest a role for SEPS1 in CVD risk after extensively examining the variation within the gene region.

In the third part of this thesis, we focused on a set of seven genes (angiotensin converting enzyme, angiotensin II receptor type I, C-reactive protein (CRP), and the fibrinogen alpha-, beta- and gamma-chains (FGA, FGB, FGG)) related to the inflammatory cytokine interleukin 6 (IL6), and their association with CVD risk. We identified one variant in the IL6 gene conferring risk for CVD in men, and a variant pair from the IL6 and FGA genes associated with decreased risk. Moreover, we identified and confirmed an association between a rare variant in the CRP gene and lower CRP levels, and found two variants in the FGA and FGG genes associated with fibrinogen levels. The results of this third study suggest a role for interleukin 6 pathway genes in the pathogenesis of CVD and warrant further studies in other populations. In addition to the IL6-related genes, we describe several sex-specific associations in other genes included in this study. The majority of the findings were evident only in women, encouraging future studies of cardiovascular disease to include women and analyse them separately from men.
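
As a sketch of the sex-genotype interaction analysis (on simulated stand-in data; all variable names are hypothetical, and the study's actual models are richer):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data standing in for the cohort (all names hypothetical).
rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "genotype": rng.integers(0, 3, n),   # risk-allele count, 0/1/2
    "female": rng.integers(0, 2, n),     # 1 = woman
})
logit = -2.0 + 0.05 * df.genotype + 0.4 * df.genotype * df.female
df["cvd"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Sex-genotype interaction: a significant `genotype:female` term indicates
# that the genetic effect on CVD risk differs between men and women.
fit = smf.logit("cvd ~ genotype * female", data=df).fit(disp=0)
print(fit.summary())
```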