Hydrogeological research usually includes statistical studies designed to elucidate the mean background state, characterise relationships among different hydrochemical parameters, and show the influence of human activities. These goals are achieved either by means of a statistical approach or by mixing models between end-members. Compositional data analysis has proved to be effective with the first approach, but there is no commonly accepted solution to the end-member problem in a compositional framework. We present here a possible solution based on factor analysis of compositions, illustrated with a case study. We find two factors on the compositional bi-plot by fitting two non-centered orthogonal axes to the most representative variables. Each of these axes defines a subcomposition, grouping those variables that lie nearest to it. With each subcomposition a log-contrast is computed and rewritten as an equilibrium equation. These two factors can be interpreted as the isometric log-ratio (ilr) coordinates of three hidden components, which can be plotted in a ternary diagram. These hidden components might be interpreted as end-members. We have analysed 14 molarities at 31 sampling stations all along the Llobregat River and its tributaries, with monthly measurements over two years. We have obtained a bi-plot explaining 57% of the total variance, from which we have extracted two factors: factor G, reflecting geological background enhanced by potash mining; and factor A, essentially controlled by urban and/or farming wastewater. Graphical representation of these two factors allows us to identify three extreme samples, corresponding to pristine waters, potash mining influence and urban sewage influence. To confirm this, we have analyses available of the diffuse and widespread point sources identified in the area: springs, potash mining leachates, sewage, and fertilisers. Each of these sources shows a clear link with one of the extreme samples, except fertilisers, owing to the heterogeneity of their composition. This approach is a useful tool to distinguish end-members and characterise them, an issue generally difficult to solve. It is worth noting that the end-member composition cannot be fully estimated but only characterised through log-ratio relationships among components. Moreover, the influence of each end-member in a given sample must be evaluated relative to the other samples. These limitations are intrinsic to the relative nature of compositional data.
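
The link between two ilr coordinates and a three-part composition can be illustrated with a minimal sketch. It assumes one common orthonormal basis (first coordinate contrasting parts 1 and 2, second contrasting their geometric mean with part 3); the basis actually used in the study is not stated in the abstract, and the function names `closure`, `ilr3` and `ilr3_inverse` are hypothetical.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def ilr3(x):
    """ilr coordinates of a 3-part composition (assumed basis, see lead-in)."""
    x1, x2, x3 = x
    z1 = math.log(x1 / x2) / math.sqrt(2)
    z2 = math.sqrt(2 / 3) * math.log(math.sqrt(x1 * x2) / x3)
    return z1, z2


def ilr3_inverse(z):
    """Map two ilr coordinates back to a closed 3-part composition."""
    z1, z2 = z
    # Centered log-ratio values reconstructed from the two basis vectors.
    clr = [
        z1 / math.sqrt(2) + z2 / math.sqrt(6),
        -z1 / math.sqrt(2) + z2 / math.sqrt(6),
        -2 * z2 / math.sqrt(6),
    ]
    return closure([math.exp(c) for c in clr])


# A pair of factor scores read as ilr coordinates becomes a point
# that can be placed in a ternary diagram.
composition = ilr3_inverse((0.0, 0.0))  # the barycentre: three equal parts
```

Because only log-ratios enter the transform, rescaling all three parts by a common factor leaves the coordinates unchanged, which is consistent with the abstract's remark that end-members can only be characterised in relative terms.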


We study the equidistribution of Fekete points in a compact complex manifold. These are extremal point configurations defined through sections of powers of a positive line bundle. Their equidistribution is a known result. The novelty of our approach is that we relate them to the problem of sampling and interpolation on line bundles, which allows us to estimate the equidistribution of the Fekete points quantitatively. In particular, we estimate the Kantorovich-Wasserstein distance of the Fekete points to their limiting measure. The sampling and interpolation arrays on line bundles are a subject of independent interest, and we provide necessary density conditions through the classical approach of Landau, which in this context measures the local dimension of the space of sections of the line bundle. We obtain a complete geometric characterization of sampling and interpolation arrays in the case of compact manifolds of dimension one, and we prove that there are no arrays of both sampling and interpolation in the more general setting of semipositive line bundles.


Levels of low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, triglycerides and total cholesterol are heritable, modifiable risk factors for coronary artery disease. To identify new loci and refine known loci influencing these lipids, we examined 188,577 individuals using genome-wide and custom genotyping arrays. We identify and annotate 157 loci associated with lipid levels at P < 5 × 10⁻⁸, including 62 loci not previously associated with lipid levels in humans. Using dense genotyping in individuals of European, East Asian, South Asian and African ancestry, we narrow association signals in 12 loci. We find that loci associated with blood lipid levels are often associated with cardiovascular and metabolic traits, including coronary artery disease, type 2 diabetes, blood pressure, waist-hip ratio and body mass index. Our results demonstrate the value of using genetic data from individuals of diverse ancestry and provide insights into the biological mechanisms regulating blood lipids to guide future genetic, biological and therapeutic research.


In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5' regulatory sequence variation in the corresponding genes is indeed increased. However, approximately 42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL.


Early pregnancy and multiparity are known to reduce women's risk of developing breast cancer at menopause. Based on the knowledge that the differentiation of the breast induced by the hormones of pregnancy plays a major role in this protection, this work was performed to identify which differentiation-associated molecular changes persist in the breast until menopause. Core needle biopsies (CNB) obtained from the breast of 42 nulliparous (NP) and 71 parous (P) postmenopausal women were analyzed for morphology, immunocytochemistry and gene expression. Whereas in the NP breast, nuclei of epithelial cells were large and euchromatic, in the P breast they were small and hyperchromatic, showing strong methylation of histone 3 at lysine 9 and 27. Transcriptomic analysis performed using Affymetrix HG_U133 oligonucleotide arrays revealed that in CNB of the P breast, there were 267 upregulated probesets that comprised genes controlling chromatin organization, transcription regulation, splicing machinery, mRNA processing and noncoding elements including XIST. We concluded that the differentiation process induced by pregnancy is centered on chromatin remodeling and on the mRNA processing reactome, both of which emerge as important regulatory pathways. These are indicative of a safeguard step that maintains the fidelity of the transcription process, becoming the ultimate mechanism mediating the protection of the breast conferred by full-term pregnancy.


The biological and therapeutic responses to hyperthermia, when it is envisaged as an anti-tumor treatment modality, are complex and variable. Heat delivery plays a critical role and is counteracted by more or less efficient body cooling, which is largely mediated by blood flow. In the case of the magnetically mediated modality, the delivery of the magnetic particles, most often superparamagnetic iron oxide nanoparticles (SPIONs), is also critically involved. We focus here on the magnetic characterization of two injectable formulations able to gel in situ and entrap silica microparticles embedding SPIONs. These formulations have previously shown suitable syringeability and intratumoral distribution in vivo. The first formulation is based on alginate, and the second on a poly(ethylene-co-vinyl alcohol) (EVAL). Here we investigated the magnetic properties and heating capacities in an alternating magnetic field (141 kHz, 12 mT) for implants with increasing concentrations of magnetic microparticles. We found that the magnetic properties of the magnetic microparticles were preserved using the formulation and in the wet implant at 37 °C, as in vivo. Using two orthogonal methods, a common specific loss power (SLP) of 20 W g⁻¹ was found after weighting by magnetic microparticle fraction, suggesting that both formulations are able to properly carry the magnetic microparticles in situ while preserving their magnetic properties and heating capacities. © 2010 Elsevier B.V. All rights reserved.


Recent advances in high-throughput sequencing and genotyping protocols allow rapid investigation of Mendelian and complex diseases on a scale not previously possible. In my thesis research I took advantage of these modern techniques to study retinitis pigmentosa (RP), a rare inherited disease characterized by progressive loss of photoreceptors and leading to blindness; and hypertension, a common condition affecting 30% of the adult population. Firstly, I compared the performance of different next generation sequencing (NGS) platforms in the sequencing of the RP-linked gene PRPF31. The gene contained a mutation in an intronic repetitive element, which presented difficulties for both classic sequencing methods and NGS. We showed that all NGS platforms are powerful tools to identify rare and common DNA variants, even in more complex sequences. Moreover, we evaluated the features of different NGS platforms that are important in re-sequencing projects. The main focus of my thesis was then to investigate the involvement of pre-mRNA splicing factors in autosomal dominant RP (adRP). I screened 5 candidate genes in a large cohort of patients by using long-range PCR as an enrichment step, followed by NGS. We tested two different approaches: in one, all target PCRs from all patients were pooled and sequenced as a single DNA library; in the other, PCRs from each patient were separated within the pool by DNA barcodes. The first solution was more cost-effective, while the second one yielded faster and more accurate results, but overall they both proved to be effective strategies for gene screenings in many samples. Indeed, we identified novel missense mutations in the SNRNP200 gene, encoding an essential RNA helicase for splicing catalysis. Interestingly, one of these mutations showed incomplete penetrance in one family with adRP. 
Thus, we started to study the possible molecular causes underlying phenotypic differences between asymptomatic and affected members of this family. For the study of hypertension, I joined a European consortium to perform genome-wide association studies (GWAS). Thanks to the use of very informative genotyping arrays and of phenotypically well-characterized cohorts, we could identify a novel susceptibility locus for hypertension in the promoter region of the endothelial nitric oxide synthase gene (NOS3). Moreover, we have proven the direct causality of the associated SNP using three different methods: 1) targeted resequencing, 2) luciferase assay, and 3) population study.
Recent progress in high-throughput sequencing and genotyping protocols has enabled a broader and faster study of Mendelian and multifactorial diseases on a scale never reached before. During my thesis research, I used these new sequencing techniques to study retinitis pigmentosa (RP), a rare hereditary disease characterized by a progressive loss of the photoreceptors of the eye that leads to blindness; and hypertension, a common disease affecting 30% of the adult population. First, I compared the performance of different next generation sequencing (NGS) platforms in sequencing PRPF31, a gene linked to RP. This gene contained a mutation in an intronic repetitive element, which presented sequencing difficulties for both the classic method and NGS. We showed that the NGS platforms analysed are very powerful tools for identifying rare or common DNA variants, including in complex sequences. In addition, we explored the features of the different NGS platforms that are important in re-sequencing projects. 
The main objective of my thesis was then to examine the effect of pre-mRNA splicing factors in an autosomal dominant form of RP (adRP). Five candidate genes were screened in a large cohort of patients, using long-range PCR as an enrichment step followed by NGS. We tested two different approaches: in the first, all target PCRs from all patients were pooled and sequenced as a single DNA library; in the second, the PCRs from each patient were separated by DNA barcodes. The first solution was the more economical, while the second yielded faster and more accurate results. Overall, both strategies proved effective for screening genes in many samples. We were able to identify new missense mutations in the SNRNP200 gene, which encodes a helicase with an essential function in splicing. Interestingly, one of these mutations shows incomplete penetrance in a family affected by adRP. We therefore began a study of the molecular causes underlying the phenotypic differences between affected and asymptomatic members of this family. For the study of hypertension, I joined a European consortium to carry out a genome-wide association study (GWAS). Thanks to the use of very informative genotyping arrays and of cohorts that are extremely well characterized phenotypically, a new locus linked to hypertension was identified in the promoter region of the endothelial nitric oxide synthase gene (NOS3). Furthermore, we proved the direct causality of the associated SNP by means of three different methods: i) resequencing the target with NGS, ii) luciferase assays and iii) a population study.


Advances in large-scale genotyping techniques for genetic polymorphisms are leading a revolution in the fields of genetic epidemiology and human population genetics. The information provided by these techniques has revealed the existence of population structure that can increase error in genome-wide association studies (GWAS). Recent studies have demonstrated the presence of such structure at the interregional and intraregional level in Europe. The present project evaluated the degree of genetic structure in populations of the Iberian Peninsula and other regions of southwestern Europe (Italy and France) in order to quantify the impact that this potential structure may have on the design of GWAS and to reconstruct the demographic history of Mediterranean populations. To achieve these objectives, DNA samples from 770 individuals from 26 populations of the Iberian Peninsula, France, Italy and other Mediterranean countries were analysed. These samples were genotyped for 240,000 SNPs using the Affymetrix 250K StyI array within the framework of this project, or using other Affymetrix arrays in the international HapMap and POPRES projects. Statistical analyses were performed, including principal component analysis, Fst, identity by descent, linkage disequilibrium, genetic barriers, etc. These results have made it possible to build a reference framework of variability in this region, to assess its impact on association studies and to propose measures to avoid an increase in any type of error (type I and II) in national and international studies. In addition, they have made it possible to reconstruct the history of the human populations of the Mediterranean and to analyse their demographic relationships. 
Given the limited duration of this action (24 months, from October 2010 to September 2012), the results of this project are currently being written up and will lead to several publications in international journals and to the preparation of conference communications.
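
As an illustration of one of the statistics listed above, the following minimal sketch computes Wright's Fst for a single biallelic SNP from the allele frequencies in two populations. It assumes equally weighted populations and the textbook definition Fst = (Ht - Hs) / Ht; the project does not state which estimator it used, and the function name `fst_two_populations` is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's Fst for one biallelic SNP, given the frequency of one allele
    in each of two equally weighted populations (assumption, see lead-in)."""
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # SNP is monomorphic overall: no differentiation to measure
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Identical frequencies give 0; fixed differences give 1.
no_structure = fst_two_populations(0.5, 0.5)
fixed_difference = fst_two_populations(0.0, 1.0)
```

In genome-wide data the statistic is usually combined across many SNPs and corrected for sample size, which this single-SNP sketch deliberately leaves out.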


* The 'in planta' visualization of F-actin in all cells and in all developmental stages of a plant is a challenging problem. By using the soybean heat-inducible Gmhsp17.3B promoter instead of a constitutive promoter, we have been able to label all cells in various developmental stages of the moss Physcomitrella patens, through a precise temperature tuning of the expression of green fluorescent protein (GFP)-talin.
* A short moderate heat treatment was sufficient to induce proper labeling of the actin cytoskeleton and to allow the visualization of time-dependent organization of F-actin structures without impairment of cell viability.
* In growing moss cells, dense converging arrays of F-actin structures were present at the growing tips of protonema cells and at the sites of branching. Protonema and leaf cells contained a network of thick actin cables; during de-differentiation of leaf cells into new protonema filaments, the thick bundled actin network disappeared, and a new highly polarized F-actin network formed.
* The controlled expression of GFP-talin through an inducible promoter significantly improves the 'in planta' imaging of actin.


Cancer genomes frequently contain somatic copy number alterations (SCNA) that can significantly perturb the expression level of affected genes and thus disrupt pathways controlling normal growth. In melanoma, many studies have focussed on the copy number and gene expression levels of the BRAF, PTEN and MITF genes, but little has been done to identify new genes using these parameters at the genome-wide scale. Using karyotyping, SNP and CGH arrays, and RNA-seq, we have identified SCNA affecting gene expression ('SCNA-genes') in seven human metastatic melanoma cell lines. We showed that the combination of these techniques is useful to identify candidate genes potentially involved in tumorigenesis. Since few of these alterations were recurrent across our samples, we used a protein network-guided approach to determine whether any pathways were enriched in SCNA-genes in one or more samples. From this unbiased genome-wide analysis, we identified 28 significantly enriched pathway modules. Comparison with two large, independent melanoma SCNA datasets showed less than 10% overlap at the individual gene level, but network-guided analysis revealed 66% shared pathways, including all but three of the pathways identified in our data. Frequently altered pathways included WNT, cadherin signalling, angiogenesis and melanogenesis. Additionally, our results emphasize the potential of the EPHA3 and FRS2 gene products, involved in angiogenesis and migration, as possible therapeutic targets in melanoma. Our study demonstrates the utility of network-guided approaches, for both large and small datasets, to identify pathways recurrently perturbed in cancer.


Functional RNA structures play an important role both in the context of noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic–stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to ∼2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3′-UTRs. While we estimate a significant false discovery rate of ∼50%–70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).


This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.


The anx/anx mouse displays poor appetite and lean appearance and is considered a good model for the study of anorexia nervosa. To identify new genes involved in feeding behavior and body weight regulation, we performed expression profiling of the hypothalamus of anx/anx mice. Using commercial microarrays we detected 156 differentially expressed genes and validated 92 of those using TaqMan low-density arrays. The expression of a set of 87 candidate genes, selected on the basis of evidence from the literature, was also quantified by TaqMan low-density arrays. Our results showed enrichment in deregulated genes involved in cell death, cell morphology and cancer as well as an alteration of several signaling circuits involved in energy balance including neuropeptide Y and melanocortin signaling. The expression profile along with the phenotype led us to conclude that anx/anx mice resemble the anorexia-cachexia syndrome typically observed in cancer, infection with human immunodeficiency virus or chronic diseases, rather than starvation, and that anx/anx mice could be considered a good model for the treatment and investigation of this condition.


The vision-for-action literature favours the idea that the motor output of an action - whether manual or oculomotor - leads to similar results regarding object handling. Findings on line bisection performance challenge this idea: healthy individuals bisect lines manually to the left of centre, and to the right of centre when using eye fixation. If these opposite biases for manual and oculomotor action reflect more universal compensatory mechanisms that cancel each other out to enhance overall accuracy, one would expect to observe comparable opposite biases for other material. In the present study, we report on three independent experiments in which we tested line bisection (by hand, by eye fixation) not only for solid lines, but also for letter lines; the latter, when bisected manually, are known to result in a rightward bias. Accordingly, we expected a leftward bias for letter lines when bisected via eye fixation. Analysis of bisection biases provided evidence for this idea: manual bisection was more rightward for letter as compared to solid lines, while bisection by eye fixation was more leftward for letter as compared to solid lines. Support for the eye fixation observation was particularly obvious in two of the three studies, for which comparability between eye and hand action was increasingly adjusted (paper-pencil versus touch screen for manual action). These findings question the assumption that ocular motor and manual output are always interchangeable, and rather suggest that, at least in some situations, ocular motor and manual output biases are opposite to each other, possibly balancing each other out.


This paper applies random matrix theory to obtain analytical characterizations of the capacity of correlated multiantenna channels. The analysis is not restricted to the popular separable correlation model, but rather it embraces a more general representation that subsumes most of the channel models that have been treated in the literature. For arbitrary signal-to-noise ratios (SNR), the characterization is conducted in the regime of large numbers of antennas. For the low- and high-SNR regions, in turn, we uncover compact capacity expansions that are valid for arbitrary numbers of antennas and that offer insight into how antenna correlation impacts the tradeoffs between power, bandwidth and rate.
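
For orientation, the quantities involved can be written out in a standard textbook form. The notation below is an assumption and not taken from the paper: $n_T$ transmit and $n_R$ receive antennas, channel matrix $\mathbf{H}$, and equal-power (isotropic) inputs, which need not be the input distribution the paper analyses.

```latex
% Ergodic capacity with isotropic inputs (assumed setting, in bits/s/Hz)
C(\mathrm{SNR}) = \mathbb{E}\!\left[ \log_2 \det\!\left( \mathbf{I}_{n_R}
    + \frac{\mathrm{SNR}}{n_T}\, \mathbf{H}\mathbf{H}^{\dagger} \right) \right]

% Separable (Kronecker) correlation model, the special case mentioned above:
% W has i.i.d. zero-mean unit-variance entries; Theta_R and Theta_T are the
% receive and transmit correlation matrices.
\mathbf{H} = \boldsymbol{\Theta}_R^{1/2}\, \mathbf{W}\, \boldsymbol{\Theta}_T^{1/2}
```

Under this assumed model, uncorrelated antennas correspond to $\boldsymbol{\Theta}_R = \mathbf{I}_{n_R}$ and $\boldsymbol{\Theta}_T = \mathbf{I}_{n_T}$, which recovers the i.i.d. channel as a reference case.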