919 results for genoma, genetica, dna, bioinformatica, mapreduce, snp, gwas, big data, sequenziamento, pipeline


Relevance:

30.00% 30.00%

Publisher:

Abstract:

This thesis was part of a multidisciplinary research project funded by the German Research Foundation (“Bevölkerungsgeschichte des Karpatenbeckens in der Jungsteinzeit und ihr Einfluss auf die Besiedlung Mitteleuropas”, grant no. Al 287/10-1) aimed at elucidating the population history of the Carpathian Basin during the Neolithic. The Carpathian Basin was an important waypoint on the spread of the Neolithic from southeastern to central Europe. On the Great Hungarian Plain (Alföld), the first farming communities appeared around 6000 cal BC. They belonged to the Körös culture, which derived from the Starčevo-Körös-Criş complex in the northern Balkans. Around 5600 cal BC the Alföld-Linearbandkeramik (ALBK), so called due to its stylistic similarities with the Transdanubian and central European LBK, emerged in the northwestern Alföld. Following a short “classical phase”, the ALBK split into several regional subgroups during its later stages, but did not expand beyond the Great Hungarian Plain. Marking the beginning of the late Neolithic period, the Tisza culture first appeared in the southern Alföld around 5000 cal BC and subsequently spread into the central and northern Alföld. Together with the Herpály and Csőszhalom groups it was an integral part of the late Neolithic cultural landscape of the Alföld. Up until now, the Neolithic cultural succession on the Alföld has been almost exclusively studied from an archaeological point of view, while very little is known about the population genetic processes during this time period. The aim of this thesis was to perform ancient DNA (aDNA) analyses on human samples from the Alföld Neolithic and analyse the resulting mitochondrial population data to address the following questions: is there population continuity between the Central European Mesolithic hunter-gatherer metapopulation and the first farming communities on the Alföld? Is there genetic continuity from the early to the late Neolithic? 
Are there genetic as well as cultural differences between the regional groups of the ALBK? Additionally, the relationships between the Alföld and the neighbouring Transdanubian Neolithic as well as other European early farming communities were evaluated to gain insights into the genetic affinities of the Alföld Neolithic in a larger geographic context. In total, 320 individuals were analysed for this study; reproducible mitochondrial haplogroup information (HVS-I and/or SNP data) could be obtained from 242 Neolithic individuals. According to the analyses, population continuity between hunter-gatherers and the Neolithic cultures of the Alföld can be excluded at any stage of the Neolithic. In contrast, there is strong evidence for population continuity from the early to the late Neolithic. All cultural groups on the Alföld were heavily shaped by the genetic substrate introduced into the Carpathian Basin during the early Neolithic by the Körös and Starčevo cultures. Accordingly, genetic differentiation between regional groups of the ALBK is not very pronounced. The Alföld cultures are furthermore genetically highly similar to the Transdanubian Neolithic cultures, probably due to common ancestry. In the wider European context, the Alföld Neolithic cultures are also highly similar to the central European LBK, while they differ markedly from contemporaneous populations of the Iberian Peninsula and the Ukraine. Thus, the Körös culture, the ALBK and the Tisza culture can be regarded as part of a “genetic continuum” that links the Neolithic Carpathian Basin to central Europe and likely has its roots in the Starčevo-Körös-Criş complex of the northern Balkans.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this thesis is to highlight, through a range of statistical analyses and the application of stochastic models, the structural and functional behaviour of the dinucleotides that make up the DNA sequences of different organisms. The organisms we chose to consider are human, mouse and Escherichia coli. This choice was deliberate, intended to bring out some differences between eukaryotic organisms, such as human and mouse, and prokaryotic organisms such as the bacterium E. coli. In the first part of our study, we computed the distances between successive occurrences of the same dinucleotide along the sequence, using a non-overlapping method, and repeated the calculation for all 16 dinucleotides. We then plotted the distance distributions of the 16 dinucleotides for E. coli, mouse and human; the histograms reveal an anomalous behaviour of the CG distribution that is shared by the eukaryotic organisms and absent from the prokaryote examined. This statistical finding is explained by the biological methylation processes that can act on the CG dinucleotide in eukaryotic sequences. Next, to determine how much each of the 16 distributions departs from the others, we used the Jensen-Shannon divergence. To quantify the substantial differences between the CG distributions of the three organisms, we tested whether an exponential or a power law gave the better fit to these curves. The exponential is a good fit for the tails of the mouse and human CG distributions, which reveals the presence of a characteristic length for both organisms.
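The first part of the analysis described above can be illustrated with a minimal Python sketch. It is not the thesis code: the function names are hypothetical, and "non-overlapping" is interpreted here, as an assumption, to mean that the scan jumps past each match so that two occurrences never share a base. The Jensen-Shannon divergence is computed in base 2, so it lies between 0 and 1.

```python
from collections import Counter
from math import log2


def dinucleotide_distances(seq, dinuc):
    """Distances between successive non-overlapping occurrences of `dinuc`."""
    positions = []
    i = 0
    while i <= len(seq) - 2:
        if seq[i:i + 2] == dinuc:
            positions.append(i)
            i += 2  # assumption: skip past the match so occurrences cannot overlap
        else:
            i += 1
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(distances):
    """Normalise a list of distances into a probability distribution."""
    counts = Counter(distances)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    support = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint support
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in support}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence of `a` from the mixture
        return sum(a[k] * log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)
```

Applied to all 16 dinucleotides of a genome, `distance_distribution(dinucleotide_distances(seq, d))` would give the 16 histograms, and `jensen_shannon` could then be evaluated for each pair of them.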
In the second part of the study, the results are compared with Markov models: random sequences generated with Markov chains of order zero (based on the relative frequencies of the nucleotides) and order one (based on the transition probabilities between nucleotides). The latter reproduces the original biological sequence fairly faithfully, so we chose the first-order Markov chain for further statistical analyses of the distributions of nucleotides, dinucleotides and also trinucleotides, with particular attention to those containing CG, in order to check whether the anomaly carries over to them. We therefore believe that methods based on this approach could be used to confirm biological peculiarities and to improve the identification of regions of interest, such as CpG islands and possibly promoters and Lamina Associated Domains (LADs), in the genomes of different organisms.
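The two null models mentioned above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the thesis implementation: the function names are hypothetical, the first base of an order-one sequence is drawn from the base frequencies, and a base with no observed successor falls back to those same frequencies.

```python
import random
from collections import Counter, defaultdict


def fit_order0(seq):
    """Relative frequency of each nucleotide (order-zero model)."""
    counts = Counter(seq)
    return {base: n / len(seq) for base, n in counts.items()}


def fit_order1(seq):
    """Transition probabilities P(next | current) (order-one model)."""
    counts = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return {
        a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
        for a, nxt in counts.items()
    }


def _sample(dist, rng):
    """Draw one symbol from a {symbol: probability} dictionary."""
    symbols = list(dist)
    return rng.choices(symbols, weights=[dist[s] for s in symbols], k=1)[0]


def generate(seq, length, order, seed=0):
    """Generate a random sequence from a model fitted to `seq`."""
    if length <= 0:
        return ""
    rng = random.Random(seed)
    freqs = fit_order0(seq)
    if order == 0:
        return "".join(_sample(freqs, rng) for _ in range(length))
    trans = fit_order1(seq)
    out = [_sample(freqs, rng)]
    while len(out) < length:
        # Assumption: fall back to base frequencies when the previous
        # base was never followed by anything in the training sequence.
        out.append(_sample(trans.get(out[-1], freqs), rng))
    return "".join(out)
```

A surrogate sequence produced by `generate(genome, len(genome), order=1)` could then be passed through the same dinucleotide statistics as the real sequence to see which features the first-order model reproduces.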

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In recent years biology has relied increasingly on computer science to tackle complex analyses that involve large quantities of data. Among the biological sciences that require processing substantial volumes of data is genomics, a branch of molecular biology concerned with the structure, content, function and evolution of the genomes of living organisms. Data warehouse systems are a technology well suited to supporting certain kinds of genomic analysis because they allow exploratory and dynamic analyses, which are useful when summary information must be derived from a large quantity of data and when different perspectives and levels of detail are to be explored. This thesis is part of a broader project on the design of a data warehouse for genomics. The analyses carried out led to the discovery of functional dependencies and, consequently, to the definition of a hierarchy in the data. Inserting this hierarchy into a multidimensional model of the genomic data will widen the range of analyses that can be run on the data warehouse by adding further information about patient characteristics. The first step of the work was loading and filtering the data. The core of the thesis was the implementation of an algorithm for discovering functional dependencies, with the aim of deriving a hierarchy from the data. In the final phase, the derived hierarchy was inserted into a pre-existing multidimensional model. The entire work was carried out using Apache Spark and Apache Hadoop.
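The idea of deriving a hierarchy from functional dependencies can be illustrated with a naive check in plain Python. This is only a sketch of the general technique, not the algorithm implemented in the thesis (which ran on Apache Spark): it tests single-column dependencies A → B by verifying that each value of A maps to exactly one value of B. The column names and rows in the usage note are hypothetical.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """True if every value of column `lhs` maps to exactly one value of `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


def single_column_dependencies(rows, columns):
    """All ordered pairs (A, B) of distinct columns such that A -> B holds."""
    return [(a, b) for a, b in permutations(columns, 2) if holds(rows, a, b)]
```

For toy rows with columns `patient_id`, `city` and `country`, the dependencies `patient_id → city` and `city → country` would suggest the roll-up hierarchy patient, city, country for a dimension of a multidimensional model.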

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Marginal zone B-cell lymphomas (MZLs) have been divided into 3 distinct subtypes (extranodal MZLs of mucosa-associated lymphoid tissue [MALT] type, nodal MZLs, and splenic MZLs). Nevertheless, the relationship between the subtypes is still unclear. We performed a comprehensive analysis of genomic DNA copy number changes in a very large series of MZL cases with the aim of addressing this question. Samples from 218 MZL patients (25 nodal, 57 MALT, 134 splenic, and 2 not better specified MZLs) were analyzed with the Affymetrix Human Mapping 250K SNP arrays, and the data were combined with matched gene expression in 33 of 218 cases. MALT lymphoma significantly more often showed gains at 3p, 6p, and 18p and del(6q23) (TNFAIP3/A20), whereas splenic MZLs were associated with del(7q31) and del(8p). Nodal MZLs did not show statistically significant differences compared with MALT lymphoma while lacking the splenic MZL-related 7q losses. Gains of 3q and 18q were common to all 3 subtypes. del(8p) was often present together with del(17p) (TP53). Although del(17p) did not determine a worse outcome and del(8p) was only of borderline significance, the presence of both deletions had a highly significant negative impact on the outcome of splenic MZLs.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In vitro and in animal models, APE1, OGG1, and PARP-1 have been proposed to be involved in the inflammatory response. In this work, we investigated whether the SNPs APE1 Asn148Glu, OGG1 Ser326Cys, and PARP-1 Val762Ala are associated with meningitis. Patient genotypes were determined by PIRA-PCR or PCR-RFLP. DNA damage was detected in genomic DNA by Fpg treatment. IgG and IgA were measured in plasma, and cytokines and chemokines were measured in cerebrospinal fluid samples using Bio-Plex assays. A higher frequency (P<0.05) of the APE1 Glu allele was observed in bacterial meningitis (BM) and aseptic meningitis (AM) patients. The frequencies of the Asn/Asn genotype in the control group and of the Asn/Glu genotype in the BM group were also higher. For the SNP OGG1 Ser326Cys, the Cys/Cys genotype was more frequent (P<0.05) in the BM group. The frequency of the PARP-1 Val/Val genotype was higher in the control group (P<0.05). The occurrence of combined SNPs was significantly higher in BM patients, indicating that these SNPs may be associated with the disease. An increase in Fpg-sensitive sites was observed in carriers of the APE1 Glu allele or the OGG1 Cys allele, suggesting that the SNPs affect DNA repair activity. Alterations in IgG production were observed in the presence of the SNPs APE1 Asn148Glu, OGG1 Ser326Cys or PARP-1 Val762Ala. Moreover, a reduction in the levels of IL-6, IL-1Ra, MCP-1/CCL2 and IL-8/CXCL8 was observed in the presence of the APE1 Glu allele in BM patients. In conclusion, we obtained indications of an effect of SNPs in DNA repair genes on the regulation of the immune response in meningitis.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

BACKGROUND: The porcine IGF2 and H19 genes are imprinted. IGF2 is paternally expressed, while H19 is maternally expressed. Extensive studies in mice established a boundary model indicating that the H19 differentially methylated domain (DMD) controls, upon binding the CTCF protein, reciprocal imprinting of the IGF2 and H19 genes. IGF2 transcription is tissue- and development-specific and involves the use of 4 promoters. In the liver of adult Large White boars IGF2 is expressed from both parental alleles, whereas in skeletal muscle and kidney tissues we observed variable relaxation of IGF2 imprinting. We hypothesized that IGF2 expression from both parental alleles and relaxation of IGF2 imprinting are reflected in differences in DNA methylation patterns at the H19 DMD and IGF2 differentially methylated regions 1 and 2 (DMR1 and DMR2). RESULTS: Bisulfite sequencing analysis did not show any differences in DNA methylation at the three porcine CTCF binding sites in the H19 DMD between liver, muscle and kidney tissues of adult pigs. A DNA methylation analysis using the methyl-sensitive restriction endonuclease SacII and 'hot-stop' PCR gave results consistent with those from the bisulfite sequencing analysis. We found that the porcine H19 DMD is distinctly differentially methylated, at least for the region formally confirmed by two SNPs, in liver, skeletal muscle and kidney of foetal, newborn and adult pigs, independent of the combined imprinting status of all IGF2 expressed transcripts. DNA methylation at CpG sites in DMR1 of foetal liver was significantly lower than in adult liver due to the presence of hypomethylated molecules. An allele-specific analysis was performed for IGF2 DMR2 using a SNP in the IGF2 3'-UTR. The maternal IGF2 DMR2 of foetal and newborn liver revealed a higher DNA methylation content compared to the respective paternal allele. CONCLUSIONS: Our results indicate that the IGF2 imprinting status is transcript-specific. 
Biallelic IGF2 expression in adult porcine liver and relaxation of IGF2 imprinting in porcine muscle were common features. These results were consistent with the use of IGF2 promoter P1 in adult liver and of IGF2 promoters P2, P3 and P4 in muscle. The results further showed that biallelic IGF2 expression in liver and relaxation of imprinting in muscle and kidney were not associated with DNA methylation variation at and around at least one CTCF binding site in the H19 DMD. The imprinting status in adult liver, muscle and kidney tissues was also not reflected in the methylation patterns of IGF2 DMRs 1 and 2.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

There is increasing evidence that strain variation in Mycobacterium tuberculosis complex (MTBC) might influence the outcome of tuberculosis infection and disease. To assess genotype-phenotype associations, phylogenetically robust molecular markers and appropriate genotyping tools are required. Most current genotyping methods for MTBC are based on mobile or repetitive DNA elements. Because these elements are prone to convergent evolution, the corresponding genotyping techniques are suboptimal for phylogenetic studies and strain classification. By contrast, single nucleotide polymorphisms (SNP) are ideal markers for classifying MTBC into phylogenetic lineages, as they exhibit very low degrees of homoplasy. In this study, we developed two complementary SNP-based genotyping methods to classify strains into the six main human-associated lineages of MTBC, the "Beijing" sublineage, and the clade comprising Mycobacterium bovis and Mycobacterium caprae. Phylogenetically informative SNPs were obtained from 22 MTBC whole-genome sequences. The first assay, referred to as MOL-PCR, is a ligation-dependent PCR with signal detection by fluorescent microspheres and a Luminex flow cytometer, which simultaneously interrogates eight SNPs. The second assay is based on six individual TaqMan real-time PCR assays for singleplex SNP-typing. We compared MOL-PCR and TaqMan results in two panels of clinical MTBC isolates. Both methods agreed fully when assigning 36 well-characterized strains into the main phylogenetic lineages. The sensitivity in allele-calling was 98.6% and 98.8% for MOL-PCR and TaqMan, respectively. Typing of an additional panel of 78 unknown clinical isolates revealed 99.2% and 100% sensitivity in allele-calling, respectively, and 100% agreement in lineage assignment between both methods. While MOL-PCR and TaqMan are both highly sensitive and specific, MOL-PCR is ideal for classification of isolates with no previous information, whereas TaqMan is faster for confirmation. 
Furthermore, both methods are rapid, flexible and comparably inexpensive.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In most microarray technologies, a number of critical steps are required to convert raw intensity measurements into the data relied upon by data analysts, biologists and clinicians. These data manipulations, referred to as preprocessing, can influence the quality of the ultimate measurements. In the last few years, the high-throughput measurement of gene expression has been the most popular application of microarray technology. For this application, various groups have demonstrated that the use of modern statistical methodology can substantially improve the accuracy and precision of gene expression measurements, relative to ad-hoc procedures introduced by designers and manufacturers of the technology. Currently, other applications of microarrays are becoming more and more popular. In this paper we describe a preprocessing methodology for a technology designed for the identification of DNA sequence variants in specific genes or regions of the human genome that are associated with phenotypes of interest such as disease. In particular, we describe methodology useful for preprocessing Affymetrix SNP chips and obtaining genotype calls with the preprocessed data. We demonstrate how our procedure improves on existing approaches using data from three relatively large studies, including one in which a large number of independent calls are available. Software implementing these ideas is available in the Bioconductor oligo package.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Amplifications and deletions of chromosomal DNA, as well as copy-neutral loss of heterozygosity, have been associated with disease processes. High-throughput single nucleotide polymorphism (SNP) arrays are useful for making genome-wide estimates of copy number and genotype calls. Because neighboring SNPs in high-throughput SNP arrays are likely to have dependent copy number and genotype due to the underlying haplotype structure and linkage disequilibrium, hidden Markov models (HMMs) may be useful for improving genotype calls and copy number estimates that do not incorporate information from nearby SNPs. We improve on previous approaches that use an HMM framework for inference in high-throughput SNP arrays by integrating copy number, genotype calls, and the corresponding confidence scores when available. Using simulated data, we demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package ICE.
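How a confidence score can control smoothing in an HMM may be illustrated with a small Viterbi decoder in Python. This is a sketch under explicit assumptions and does not reproduce the model or the API of the ICE package: the hidden states are integer copy numbers, emissions are Gaussian around the state value, and each emission log-likelihood is simply multiplied by a confidence in [0, 1], so that a low-confidence SNP is decided mostly by its neighbours. All names and default parameters are hypothetical.

```python
import math


def viterbi_copy_number(observed, confidence, states=(1, 2, 3), stay=0.99, sd=0.2):
    """Most likely copy-number path given per-SNP estimates and confidences."""
    if not observed:
        return []
    n_states = len(states)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (n_states - 1))

    def emission(obs, state, conf):
        # Gaussian log-density up to a constant, down-weighted by confidence
        # (illustrative assumption, not the ICE emission model).
        return conf * (-0.5 * ((obs - state) / sd) ** 2)

    # Uniform initial distribution over states
    score = [
        math.log(1.0 / n_states) + emission(observed[0], s, confidence[0])
        for s in states
    ]
    back = []
    for obs, conf in zip(observed[1:], confidence[1:]):
        new_score, pointers = [], []
        for j, s in enumerate(states):
            # Best predecessor state for state j at this position
            best_i = max(
                range(n_states),
                key=lambda i: score[i] + (log_stay if i == j else log_move),
            )
            trans = log_stay if best_i == j else log_move
            new_score.append(score[best_i] + trans + emission(obs, s, conf))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state
    idx = max(range(n_states), key=lambda i: score[i])
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [states[i] for i in path]
```

With the estimates `[2.0, 2.1, 1.0, 1.9, 2.0]`, full confidence everywhere yields a single-SNP call of copy number 1 at the third position, whereas lowering only that SNP's confidence to 0.1 smooths the path to copy number 2 throughout.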

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In cattle, at least 39 variants of the 4 casein proteins (α(S1)-, β-, α(S2)- and κ-casein) have been described to date. Many of these variants are known to affect milk-production traits, cheese-processing properties, and the nutritive value of milk. They also provide valuable information for phylogenetic studies. So far, the majority of studies exploring the genetic variability of bovine caseins considered European taurine cattle breeds and were carried out at the protein level by electrophoretic techniques. This only allows the identification of variants that, due to amino acid exchanges, differ in their electric charge, molecular weight, or isoelectric point. In this study, the open reading frames of the casein genes CSN1S1, CSN2, CSN1S2, and CSN3 of 356 animals belonging to 14 taurine and 3 indicine cattle breeds were sequenced. With this approach, we identified 23 alleles, including 5 new DNA sequence variants, with a predicted effect on the protein sequence. The new variants were only found in indicine breeds and in one local Iranian breed, which has been phenotypically classified as a taurine breed. A multidimensional scaling approach based on available SNP chip data, however, revealed an admixture of taurine and indicine populations in this breed as well as in the local Iranian breed Golpayegani. Specific indicine casein alleles were also identified in a few European taurine breeds, indicating the introgression of indicine breeds into these populations. This study shows the existence of substantial undiscovered genetic variability of bovine casein loci, especially in indicine cattle breeds. The identification of new variants is a valuable tool for phylogenetic studies and investigations into the evolution of the milk protein genes.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present the postmortem findings of a fatal road accident involving a motorcyclist, a car, and a common buzzard. Both the motorcyclist and the bird died at the scene of the accident and were examined by postmortem full-body CT and autopsy. In addition, a facial injury of the motorcyclist was compared with the dimensions of the buzzard’s beak and claws by 3D scan technologies. Blood spatter collected from the bird’s beak, feet, and tail was examined by DNA analysis. The overall findings suggested a collision of a common buzzard with a motorcyclist travelling at full speed, causing the motorcyclist to lose control of his vehicle and collide with an approaching car in the oncoming lane.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Different cytokines are secreted in response to specific microbial molecules referred to as pathogen-associated molecular patterns (PAMPs). Interleukin 6 (IL6) and interleukin 10 (IL10), both secreted by macrophages and lymphocytes, play a central role in the immunological response. In this work we obtained the genomic structure and complete DNA sequence of the porcine IL6 and IL10 genes and identified polymorphisms in the genomic sequences of these genes in a panel of ten different pig breeds. Comparative intra- and interbreed sequence analysis revealed a total of eight polymorphisms in the porcine IL6 gene and 21 in the porcine IL10 gene, which include single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (indels). Additionally, the chromosomal localization of the IL10 gene was determined by FISH and RH mapping.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Hereditary nonpolyposis colorectal cancer (HNPCC) is an autosomal dominant disease caused by germline mutations in DNA mismatch repair (MMR) genes. The nucleotide excision repair (NER) pathway plays a very important role in cancer development. We systematically studied interactions between NER and MMR genes to identify NER gene single nucleotide polymorphism (SNP) risk factors that modify the effect of MMR mutations on the risk of cancer in HNPCC. We analyzed data from polymorphisms in 10 NER genes that had been genotyped in HNPCC patients who carry MSH2 and MLH1 gene mutations. The influence of the NER gene SNPs on time to onset of colorectal cancer (CRC) was assessed using survival analysis and a semiparametric proportional hazard model. We found that the median age of onset for CRC among MMR mutation carriers with the ERCC1 mutation was 3.9 years earlier than in patients with wildtype ERCC1 (median 47.7 vs 51.6, log-rank test p=0.035). The influence of the Rad23B A249V SNP on age of onset of HNPCC is age dependent (likelihood ratio test p=0.0056). Interestingly, using the likelihood ratio test, we also found evidence of genetic interactions between the MMR gene mutations and SNPs in the ERCC1 gene (C8092A) and the XPG/ERCC5 gene (D1104H), with p-values of 0.004 and 0.042, respectively. An assessment using tree-structured survival analysis (TSSA) showed distinct gene interactions in MLH1 mutation carriers and MSH2 mutation carriers. ERCC1 SNP genotypes greatly modified the age of onset of HNPCC in MSH2 mutation carriers, while no effect was detected in MLH1 mutation carriers. Given that the NER genes in this study play different roles in the NER pathway, they may have distinct influences on the development of HNPCC. The findings of this study are very important for elucidating the molecular mechanism of colon cancer development and for understanding why some carriers of MSH2 and MLH1 gene mutations develop CRC early and others never develop CRC. 
Overall, the findings also have important implications for the development of early detection and prevention strategies, as well as for understanding the mechanism of colorectal carcinogenesis in HNPCC.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Pumas are one of the most studied terrestrial mammals because of their widespread distribution, substantial ecological impacts, and conflicts with humans. Extensive efforts, often employing genetic methods, are undertaken to manage this species. However, the comparison of population genetic data is difficult because few of the microsatellite loci chosen are shared across research programs. Here, we describe the development of PumaPlex, a high-throughput assay to genotype 25 single nucleotide polymorphisms in pumas. We validated PumaPlex in more than 700 North American pumas (Puma concolor couguar), and demonstrated its ability to generate reproducible genotypes and accurately identify individuals. Furthermore, we compared PumaPlex with traditional genotyping of 12 microsatellite loci in fecal DNA samples and found that PumaPlex produced significantly more genotypes with fewer false alleles. PumaPlex promotes the cross-laboratory comparison of genotypes, is easily expandable in the future, and is a valuable tool for the genetic monitoring and management of North American puma populations.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The seed is the organ that ensures the propagation and evolutionary continuity of spermatophyte plants, and it is an essential component of human food and animal feed. During maturation, cereal seeds accumulate mainly starch and storage proteins in the endosperm. Upon germination these reserves are hydrolysed by hydrolases synthesised in the aleurone in response to gibberellins (GA), providing energy, carbon and nitrogen to the emerging seedling until it becomes photosynthetically active. Both phases of seed development are regulated by a network of transcription factors (TFs) that bind conserved cis- motifs in the promoters of their target genes. TFs are proteins that have played a central role in evolution and domestication and are one of the main mechanisms of gene regulation; around 7% of plant genes encode TFs. TFs are classified into families according to their DNA-binding motif. The DOF (DNA binding with One Finger) family takes part in processes that are exclusive to higher plants and their close ancestors (algae, mosses and ferns). Several DOF proteins with a key role in the regulation of gene expression have been identified in seeds of the Triticeae tribe (Pooideae subfamily). Brachypodium distachyon is the first species of the Pooideae subfamily whose genome (272 Mbp) has been sequenced. Its small plant size, compact genome, short life cycle and amenability to transformation with Agrobacterium tumefaciens (Ti plasmid) make it the model system for studying cereals of the Triticeae tribe of great agronomic importance worldwide, such as wheat and barley.
In this work, 27 Dof genes were identified in the genome of B. distachyon, and the evolutionary relationships between these genes and those of barley (Hordeum vulgare, Pooideae subfamily) and rice (Oryza sativa, Oryzoideae subfamily) were established by building a phylogenetic tree based on the multiple alignment of the DOF DNA-binding domain. Barley contains 26 Dof genes and 30 have been annotated in rice. The phylogenetic analysis establishes four Major Clusters of Orthologous Genes (MCOGs), which are supported by additional conserved motifs, outside the DOF domain, shared by the protein sequences of the same MCOG. A global expression study in different organs and tissues classifies the BdDof genes into two groups; nine of the 27 BdDof genes are abundantly and/or preferentially expressed in seeds. A more detailed expression analysis of these genes during seed maturation and germination shows that BdDof24, the putative ortholog of barley BPBF-HvDOF24, is the most abundantly expressed of them in germinating B. distachyon seeds.
Studies of the transcriptional regulation of genes encoding hydrolases in the aleurone of cereal seeds during post-germination have revealed a conserved tripartite cis- motif in their promoters, GARC (GA-Responsive Complex), which binds TFs of the MYB-R2R3, DOF and MYBR1-SHAQKYF classes. This thesis characterises the Brachypodium gene BdCathB, which encodes a cathepsin B-like protease and is orthologous to wheat Al21 and barley HvCathB, together with the TFs responsible for its transcriptional regulation, BdDOF24 and BdGAMYB (ortholog of HvGAMYB). In silico analysis of the BdCathB promoter identified a GARC motif conserved in position and sequence with its orthologs in wheat and barley. Expression of BdCathB is induced during germination, as is that of BdDof24 and BdGamyb. Moreover, BdDOF24 and BdGAMYB interact in the yeast two-hybrid system (Y2HS) and in planta in bimolecular fluorescence complementation (BiFC) experiments. In transient assays in barley aleurone layers, BdGAMYB activates the BdCathB promoter whereas BdDOF24 represses it, a result similar to that obtained with the orthologous barley TFs BPBF-HvDOF24 and HvGAMYB. However, when aleurone cells are transformed simultaneously with both TFs, BdDOF24 has an additive effect on the trans-activation mediated by BdGAMYB, while its ortholog BPBF-HvDOF24 has the opposite effect, reversing the effect of HvGAMYB on the BdCathB promoter. Differences between the deduced protein sequences of BdDOF24 and BPBF-HvDOF24 could explain their opposite functions in the interaction with GAMYB. Preliminary results with T-DNA insertion (knock-out) lines and stable over-expression lines of BdGamyb support the results obtained in transient expression. In addition, lines homozygous for the BdGamyb knock-out show alterations in anthers and pollen and do not produce viable seeds.