373 results for unsupervised


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Over the past few decades, Fourier transform infrared (FTIR) spectroscopy coupled to microscopy has been recognized as an emerging and potentially powerful tool in cancer research and diagnosis. For this purpose, histological analyses performed by pathologists are mostly carried out on biopsied tissue that undergoes the formalin-fixation and paraffin-embedding (FFPE) procedure. This processing method ensures an optimal and permanent preservation of the samples, making FFPE-archived tissue an extremely valuable source for retrospective studies. Nevertheless, as highlighted by previous studies, this fixation procedure significantly changes the principal constituents of cells, resulting in important effects on their infrared (IR) spectrum. Despite the chemical and spectral influence of FFPE processing, some studies demonstrate that FTIR imaging allows precise identification of the different cell types present in biopsied tissue, indicating that the FFPE process preserves spectral differences between distinct cell types. In this study, we investigated whether this is also the case for closely related cell lines. We analyzed spectra from 8 cancerous epithelial cell lines: 4 breast cancer cell lines and 4 melanoma cell lines. For each cell line, we harvested cells at subconfluence and divided them into two sets. We first tested the "original" capability of FTIR imaging to identify these closely related cell lines on cells just dried on BaF2 slides. We then repeated the test after submitting the cells to the FFPE procedure. Our results show that the IR spectra of FFPE processed cancerous cell lines undergo small but significant changes due to the treatment. The spectral modifications were interpreted as a potential decrease in the phospholipid content and protein denaturation, in line with the scientific literature on the topic. 
Nevertheless, unsupervised analyses showed that spectral proximities and distances between closely related cell lines were mostly, but not entirely, conserved after FFPE processing. Finally, PLS-DA statistical analyses highlighted that closely related cell lines are still successfully identified and efficiently distinguished by FTIR spectroscopy after FFPE treatment. This last result paves the way towards identification and characterization of cellular subtypes on FFPE tissue sections by FTIR imaging, indicating that this analysis technique could become a potentially useful tool in cancer research.
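
One way to check whether inter-cell-line distances are conserved after a treatment is to compare the two pairwise distance tables directly. The sketch below is only an illustration under stated assumptions: the function names, the use of Euclidean distance between mean spectra, and the Pearson correlation as a conservation score are all hypothetical choices, not the procedure used in the study.

```python
import math

def mean_spectrum(spectra):
    """Average a list of equally long spectra point by point."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(means):
    """Distance between the mean spectra of every pair of cell lines."""
    names = sorted(means)
    return {(p, q): euclidean(means[p], means[q])
            for i, p in enumerate(names) for q in names[i + 1:]}

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def distance_conservation(dried, ffpe):
    """Correlate inter-cell-line distances before and after treatment.

    Both arguments map a cell-line name to a list of spectra (hypothetical
    input layout). A value near 1 means the relative distances were kept.
    """
    d_before = pairwise_distances({k: mean_spectrum(v) for k, v in dried.items()})
    d_after = pairwise_distances({k: mean_spectrum(v) for k, v in ffpe.items()})
    keys = sorted(d_before)
    return pearson([d_before[k] for k in keys], [d_after[k] for k in keys])
```

A score close to 1 would correspond to the "mostly conserved" situation described above; at least three cell lines with differing distances are needed for the correlation to be defined.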

Relevance:

10.00% 10.00%

Publisher:

Abstract:

MOTIVATION: Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought after qualities, but have thus far largely been overlooked. RESULTS: More than 1 million hyper-variable internal transcribed spacer 1 (ITS1) sequences of fungal origin have been analyzed. The ITS1 sequences were first properly extracted from 454 reads using generalized profiles. Then, otupipe, cd-hit-454, ESPRIT-Tree and DBC454, a new algorithm presented here, were used to analyze the sequences. A numerical assay was developed to measure the reproducibility and robustness of these algorithms. DBC454 was the most robust, closely followed by ESPRIT-Tree. DBC454 features density-based hierarchical clustering, which complements the other methods by providing insights into the structure of the data. AVAILABILITY: An executable is freely available for non-commercial users at ftp://ftp.vital-it.ch/tools/dbc454. It is designed to run under MPI on a cluster of 64-bit Linux machines running Red Hat 4.x, or on a multi-core OSX system. CONTACT: dbc454@vital-it.ch or nicolas.guex@isb-sib.ch.
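
The abstract does not describe its numerical assay, so the following is only a generic illustration of how reproducibility between two clustering runs can be quantified. It uses the Rand index, which is a swapped-in, standard measure and not necessarily the one used by the authors; the function name and example labels are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of sequence pairs on which two clusterings agree.

    A pair agrees when both runs put the two sequences in the same
    cluster, or both runs put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same sequences")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Two hypothetical runs over the same five sequences (cluster ids are arbitrary).
run_1 = [0, 0, 1, 1, 2]
run_2 = [5, 5, 7, 7, 7]
print(rand_index(run_1, run_2))  # 0.8
```

Running the same algorithm on shuffled or subsampled input and averaging such a score over many runs is one simple way to compare the stability of different clustering tools.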

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
Résumé: This research concerns the development and application of so-called unsupervised learning methods. The target applications of these methods are the analysis of forensic data and the classification of hyperspectral remote sensing images. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed. This measure is obtained by optimizing a cost function related to the preservation of a point's neighborhood structure between the space of the original variables and the space of principal components. The method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on a joint optimization of the variable selection and classification tasks is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes the technique applicable to very high resolution images. The results show that such a technique can classify very large databases without difficulty and gives results that compare favorably with existing methods.
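
To illustrate why a parametric clustering function trained by stochastic gradient descent scales well and handles unseen samples, here is a deliberately simple stand-in: online k-means, where each centroid takes a small gradient step towards every sample assigned to it. This is not the method of the thesis (which trains a neural network); the function names, learning rate, and data are assumptions for illustration.

```python
def nearest(centroids, x):
    """Index of the centroid closest to sample x (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(centroids[k], x)),
    )

def sgd_kmeans(samples, centroids, lr=0.1, epochs=20):
    """Online k-means: one stochastic gradient step per sample.

    Each sample is processed once per epoch, so memory use does not grow
    with the number of samples.
    """
    centroids = [list(c) for c in centroids]
    for _ in range(epochs):
        for x in samples:
            k = nearest(centroids, x)
            # Gradient step on 0.5 * ||x - c_k||^2 with respect to c_k.
            centroids[k] = [c + lr * (v - c) for c, v in zip(centroids[k], x)]
    return centroids

# Hypothetical toy data: two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
model = sgd_kmeans(samples, [(0.0, 0.0), (10.0, 10.0)])

# A sample never seen during training is assigned by the same function,
# which is the sense in which the out-of-sample problem is avoided.
print(nearest(model, (9.0, 9.0)))
```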

Relevance:

10.00% 10.00%

Publisher:

Abstract:

RÉSUMÉ This thesis concerns the development of algorithmic methods for automatically discovering the morphological structure of the words in a corpus. It considers in particular languages approaching the introflectional type, such as Arabic or Hebrew. Linguistic tradition describes the morphology of these languages in terms of discontinuous units: consonantal roots and vocalic patterns. This kind of structure is a challenge for current machine learning systems, which generally operate with continuous units. The strategy adopted here treats the problem as a sequence of two subproblems. The first is phonological: dividing the symbols (phonemes, letters) of the corpus into two groups corresponding as closely as possible to phonetic consonants and vowels. The second is morphological and builds on the results of the first: establishing the inventory of roots and patterns in the corpus and determining their rules of combination. We examine the scope and limits of an approach based on two hypotheses: (i) the distinction between consonants and vowels can be inferred from their tendency to alternate in the speech stream; (ii) roots and patterns can be identified with the sequences of consonants and vowels, respectively, discovered previously. The proposed algorithm uses a purely distributional method to partition the symbols of the corpus. It then applies analogical principles to identify a set of strong candidates for root or pattern status, and to enlarge this set progressively. This extension is subject to an evaluation procedure based on the minimum description length principle, in the spirit of LINGUISTICA (Goldsmith, 2001). The algorithm is implemented as a computer program named ARABICA and evaluated on a corpus of Arabic nouns with regard to its ability to describe the plural system. This study shows that complex linguistic structures can be discovered while making only minimal a priori assumptions about the phenomena considered. It illustrates the possible synergy between learning mechanisms operating at distinct levels of linguistic description, and seeks to determine when and why this cooperation fails. It concludes that the tension between the universality of the consonant-vowel distinction and the specificity of root-and-pattern structure is crucial for explaining the strengths and weaknesses of such an approach.
ABSTRACT This dissertation is concerned with the development of algorithmic methods for the unsupervised learning of natural language morphology, using a symbolically transcribed wordlist. It focuses on the case of languages approaching the introflectional type, such as Arabic or Hebrew. The morphology of such languages is traditionally described in terms of discontinuous units: consonantal roots and vocalic patterns. Inferring this kind of structure is a challenging task for current unsupervised learning systems, which generally operate with continuous units. In this study, the problem of learning root-and-pattern morphology is divided into a phonological and a morphological subproblem. The phonological component of the analysis seeks to partition the symbols of a corpus (phonemes, letters) into two subsets that correspond well with the phonetic definition of consonants and vowels; building around this result, the morphological component attempts to establish the list of roots and patterns in the corpus, and to infer the rules that govern their combinations.
We assess the extent to which this can be done on the basis of two hypotheses: (i) the distinction between consonants and vowels can be learned by observing their tendency to alternate in speech; (ii) roots and patterns can be identified as sequences of the previously discovered consonants and vowels respectively. The proposed algorithm uses a purely distributional method for partitioning symbols. Then it applies analogical principles to identify a preliminary set of reliable roots and patterns, and gradually enlarge it. This extension process is guided by an evaluation procedure based on the minimum description length principle, in line with the approach to morphological learning embodied in LINGUISTICA (Goldsmith, 2001). The algorithm is implemented as a computer program named ARABICA; it is evaluated with regard to its ability to account for the system of plural formation in a corpus of Arabic nouns. This thesis shows that complex linguistic structures can be discovered without recourse to a rich set of a priori hypotheses about the phenomena under consideration. It illustrates the possible synergy between learning mechanisms operating at distinct levels of linguistic description, and attempts to determine where and why such a cooperation fails. It concludes that the tension between the universality of the consonant-vowel distinction and the specificity of root-and-pattern structure is crucial for understanding the advantages and weaknesses of this approach.
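
The two hypotheses can be made concrete with a small sketch. For hypothesis (i) it uses Sukhotin's algorithm, a classic distributional method that exploits the tendency of vowels and consonants to alternate; the abstract does not say which distributional method ARABICA actually uses, so this is a stand-in. For hypothesis (ii) it simply splits each word into its consonant and vowel subsequences. The analogical extension and the minimum description length evaluation are not shown, and the three-word corpus is hypothetical.

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Guess the vowel set of a wordlist from symbol adjacency counts."""
    adjacency = defaultdict(lambda: defaultdict(int))
    for word in words:
        for x, y in zip(word, word[1:]):
            if x != y:  # the diagonal of the adjacency table stays at zero
                adjacency[x][y] += 1
                adjacency[y][x] += 1
    symbols = {ch for word in words for ch in word}
    sums = {s: sum(adjacency[s].values()) for s in symbols}
    vowels = set()
    while True:
        consonants = {s: v for s, v in sums.items() if s not in vowels}
        if not consonants:
            break
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(consonants), key=consonants.get)
        if consonants[best] <= 0:
            break
        vowels.add(best)
        # Symbols adjacent to the new vowel become less vowel-like.
        for s in consonants:
            if s != best:
                sums[s] -= 2 * adjacency[s][best]
    return vowels

def root_and_pattern(word, vowels):
    """Split a word into its consonant sequence and its vowel sequence."""
    root = "".join(ch for ch in word if ch not in vowels)
    pattern = "".join(ch for ch in word if ch in vowels)
    return root, pattern

corpus = ["kataba", "kutub", "katib"]  # hypothetical transliterated forms
vowels = sukhotin_vowels(corpus)
for word in corpus:
    print(word, root_and_pattern(word, vowels))
```

On this toy corpus the partition yields the vowels a, u and i, and all three words share the consonant sequence ktb. Real data are noisier, which is where the evaluation step described in the abstract becomes necessary.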

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This research reviews the analysis and modeling of Swiss franc interest rate curves (IRC) using unsupervised (SOM, Gaussian mixtures) and supervised (MLP) machine learning algorithms. IRC are considered as objects embedded in different feature spaces: maturities, maturity-date, and the parameters of the Nelson-Siegel model (NSM). Analysis of the NSM parameters and their temporal and clustering structures helps to understand the relevance of the model and its potential use for forecasting. A mapping of IRC in the maturity-date feature space is presented and analyzed for visualization and forecasting purposes.
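
For readers unfamiliar with the Nelson-Siegel model, the sketch below evaluates its standard three-factor form, in which the yield at maturity tau is beta0 + beta1 * L + beta2 * (L - exp(-tau/lam)) with L = (1 - exp(-tau/lam)) / (tau/lam). Parameterisations differ between sources (some write the decay as lam * tau), so the tau/lam convention here is an assumption, as are the function name and the example parameter values.

```python
import math

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Yield at maturity tau (> 0) under the Nelson-Siegel model.

    beta0: long-term level, beta1: short-term (slope) component,
    beta2: medium-term (curvature) component, lam: decay scale.
    """
    if tau <= 0 or lam <= 0:
        raise ValueError("tau and lam must be positive")
    x = tau / lam
    loading = -math.expm1(-x) / x  # (1 - exp(-x)) / x, computed stably
    return beta0 + beta1 * loading + beta2 * (loading - math.exp(-x))

# Hypothetical parameters: the four numbers summarise a whole curve,
# which is why they can serve as a compact feature space for clustering.
params = dict(beta0=0.03, beta1=-0.02, beta2=0.01, lam=2.0)
for maturity in (0.25, 1, 2, 5, 10, 30):
    print(maturity, round(nelson_siegel(maturity, **params), 5))
```

As maturity tends to zero the yield tends to beta0 + beta1, and for very long maturities it tends to beta0, which gives the parameters their usual interpretation.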

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This project examines the effects of age, experience, and video-based feedback on the rate and type of safety-relevant events captured on video event recorders in the vehicles of three groups of newly licensed young drivers: 1. 14.5- to 15.5-year-old drivers who hold a minor school license (see Appendix A for the provisions of the Iowa code governing minor school licenses); 2. 16-year-old drivers with an intermediate license who are driving unsupervised for the first time; 3. 16-year-old drivers with an intermediate license who previously drove unsupervised for at least four months with a school license. METHODS: The young drivers’ vehicles were equipped with an event-triggered video recording device for 24 weeks. Half of the participants received feedback regarding their driving, and the other half received no feedback at all and served as a control group. The number of safety-relevant events per 1,000 miles (i.e., “event rate”) was analyzed for 90 participants who completed the study. RESULTS: On average, the young drivers who received the video-based intervention had significantly lower event rates than those in the control group. This finding was true for all three groups. An effect of experience was seen for drivers in the control group; the 16-year-olds with driving experience had significantly lower event rates than the 16-year-olds without experience. When the intervention concluded, an increase in event rate was seen for the school license holders, but not for either group of 16-year-old drivers. There is strong evidence that giving young drivers video-based feedback, regardless of their age or level of driving experience, is effective in reducing the rate of safety-relevant events relative to a control group who do not receive feedback. Specific comparisons with regard to age and experience indicated that the age of the driver did not have an effect on the rate of safety-relevant events, while experience did.
Young drivers with six months or more of additional experience behind the wheel had nearly half as many safety-relevant events as those without that experience.
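
The outcome measure is a simple normalisation, sketched below. The function name and the example figures are hypothetical and are not data from the study.

```python
def event_rate(events, miles):
    """Safety-relevant events per 1,000 miles driven."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    return 1000.0 * events / miles

# Hypothetical driver: 12 recorded events over 3,000 miles.
print(event_rate(12, 3000))  # 4.0
```

Normalising by distance keeps drivers who cover very different mileages comparable.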

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state-of-the-art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need for GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The recognition that colorectal cancer (CRC) is a heterogeneous disease in terms of clinical behaviour and response to therapy translates into an urgent need for robust molecular disease subclassifiers that can explain this heterogeneity beyond current parameters (MSI, KRAS, BRAF). Attempts to fill this gap are emerging. The Cancer Genome Atlas (TCGA) reported two main CRC groups, based on the incidence and spectrum of mutated genes, and another paper reported a subgroup defined by an EMT expression signature. We performed a prior-free analysis of CRC heterogeneity on 1113 CRC gene expression profiles and compared our findings with established molecular determinants and clinical, histopathological and survival data. Unsupervised clustering based on gene modules allowed us to distinguish at least five different gene expression CRC subtypes, which we call surface crypt-like, lower crypt-like, CIMP-H-like, mesenchymal and mixed. A gene set enrichment analysis combined with literature search of gene module members identified distinct biological motifs in different subtypes. The subtypes, which were not derived based on outcome, nonetheless showed differences in prognosis. Known gene copy number variations and mutations in key cancer-associated genes differed between subtypes, but the subtypes provided molecular information beyond that contained in these variables. Morphological features significantly differed between subtypes. The objective existence of the subtypes and their clinical and molecular characteristics were validated in an independent set of 720 CRC expression profiles. Our subtypes provide a novel perspective on the heterogeneity of CRC. The proposed subtypes should be further explored retrospectively on existing clinical trial datasets and, when sufficiently robust, be prospectively assessed for clinical relevance in terms of prognosis and treatment response predictive capacity.
Original microarray data were uploaded to the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) under Accession Nos E-MTAB-990 and E-MTAB-1026. © 2013 Swiss Institute of Bioinformatics. Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The ability to obtain gene expression profiles from human disease specimens provides an opportunity to identify relevant gene pathways, but is limited by the absence of data sets spanning a broad range of conditions. Here, we analyzed publicly available microarray data from 16 diverse skin conditions in order to gain insight into disease pathogenesis. Unsupervised hierarchical clustering separated samples by disease as well as common cellular and molecular pathways. Disease-specific signatures were leveraged to build a multi-disease classifier, which predicted the diagnosis of publicly and prospectively collected expression profiles with 93% accuracy. In one sample, the molecular classifier differed from the initial clinical diagnosis and correctly predicted the eventual diagnosis as the clinical presentation evolved. Finally, integration of IFN-regulated gene programs with the skin database revealed a significant inverse correlation between IFN-β and IFN-γ programs across all conditions. Our study provides an integrative approach to the study of gene signatures from multiple skin conditions, elucidating mechanisms of disease pathogenesis. In addition, these studies provide a framework for developing tools for personalized medicine toward the precise prediction, prevention, and treatment of disease on an individual level.
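
Unsupervised hierarchical clustering of expression profiles can be sketched in a few lines. The version below is a minimal agglomerative procedure with average linkage and Euclidean distance; the abstract does not state which linkage or distance the authors used, so both are assumptions, and the four toy profiles are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns clusters as sorted lists of sample indices. The loop is
    cubic in the number of samples, so this is for illustration only.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all cross-cluster sample pairs.
                total = sum(euclidean(profiles[i], profiles[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: samples 0-1 and 2-3 have opposite expression patterns.
profiles = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
print(average_linkage(profiles, 2))  # [[0, 1], [2, 3]]
```

No disease labels enter the procedure; whether the resulting groups coincide with diagnoses is checked only afterwards, which is what makes the analysis unsupervised.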

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work describes the analysis of different walking paths registered using a Light Detection And Ranging (LIDAR) laser range sensor in order to measure oscillating trajectories during unsupervised walking. Estimates of the gait and trajectory parameters were obtained with a terrestrial LIDAR placed 100 mm above the ground with the scanning plane parallel to the floor, so that the trajectory of the legs could be measured without attaching any markers or modifying the floor. Three different large walking experiments were performed to test the proposed measurement system with straight and oscillating trajectories. The main advantages of the proposed system are the possibility of measuring several steps and obtaining average gait parameters, and the minimal infrastructure required. This measurement system enables the development of new ambulatory applications based on the analysis of the gait and the trajectory during a walk.
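
A planar LIDAR scan arrives as a list of ranges at known angles, so the first processing step is usually a conversion to Cartesian points in the scanning plane. The sketch below shows only that step; the function name, the angle convention, and the units are assumptions, and the leg detection and gait estimation of the paper are not reproduced.

```python
import math

def scan_to_points(ranges_mm, start_angle_deg, step_deg):
    """Convert one planar scan (ranges at evenly spaced angles) to (x, y) in mm."""
    points = []
    for i, r in enumerate(ranges_mm):
        theta = math.radians(start_angle_deg + i * step_deg)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

def centroid(points):
    """Mean position of a group of points, e.g. the returns from one leg."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Hypothetical scan fragment: three returns around 90 degrees at about 1.5 m.
leg = scan_to_points([1500.0, 1495.0, 1500.0], 89.0, 1.0)
print(centroid(leg))
```

Tracking such centroids over successive scans gives the leg trajectories from which step length and oscillation can be derived.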

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work presents a protocol for within-field zoning of vineyards for the purpose of selective harvesting. It is based on acquiring a detailed multispectral image at veraison, from which the normalized difference vegetation index (NDVI) is obtained. This index is classified into areas of high and low vigor by an unsupervised classification process (ISODATA algorithm). The resulting zones are generalized and transferred to the harvest monitor of a grape harvester in order to carry out the selective harvest. Grapes harvested according to this protocol in control plots showed differentiation in quality parameters such as pH, total acidity, polyphenol content and color. The multispectral image used was acquired by the Quickbird-2 satellite. Grape quality data were sampled on a regular grid of 5 rows by 10 vines, and a multiple range statistical test was applied to analyze the separation of means of the variables analyzed in each NDVI zone.
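
The two computational steps of the protocol can be sketched as follows. NDVI is computed per pixel as (NIR - Red) / (NIR + Red). For the classification, the sketch uses a plain two-class iterative mean split on the NDVI values; this is a simplified stand-in for ISODATA (it omits ISODATA's cluster splitting and merging), and the function names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def two_class_threshold(values, iterations=50):
    """Iteratively place a threshold midway between the two class means."""
    low_mean, high_mean = min(values), max(values)
    for _ in range(iterations):
        threshold = (low_mean + high_mean) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break
        new_means = (sum(low) / len(low), sum(high) / len(high))
        if new_means == (low_mean, high_mean):
            break  # class means stopped moving
        low_mean, high_mean = new_means
    return (low_mean + high_mean) / 2

# Hypothetical (NIR, Red) reflectance pairs for six pixels.
pixels = [(0.30, 0.20), (0.32, 0.19), (0.35, 0.18),
          (0.60, 0.08), (0.62, 0.07), (0.65, 0.06)]
index = [ndvi(nir, red) for nir, red in pixels]
cut = two_class_threshold(index)
print(["high" if v > cut else "low" for v in index])
```

In the protocol, the resulting vigor map is then generalized into contiguous zones before being loaded into the harvester's monitor; that spatial smoothing step is not shown here.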

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This study of unauthorized fire handling by children and adolescents is based on the view that the phenomenon can be addressed effectively through interventions if the behavior is detected in time. The phenomenon is marked by concealment of the act and by neutralization, that is, downplaying. The increase of vandalism committed with fire, and its escalation into more aggressive forms, can be prevented by recognizing the problem and responding to children's disruptive behavior. Unauthorized fire use by children and adolescents has not previously been studied in Finland. The first part of the work examines theories related to unauthorized fire handling (e.g., Fineman 1980, 1995), international perspectives, characteristics of the act, and intra-individual processes. It also examines the role of family, school, and peer group in the phenomenon, following the so-called Oregon model (Oregon Treatment Strategies Task Force 1996, 16-47). The empirical part describes the phenomenon and its occurrence among children and adolescents from the perspectives of pupils, parents, and teachers. The study involved 661 pupils from the second, fifth, and eighth grades of comprehensive school, 341 parents, and 22 school staff members. Data from pupils and parents were collected by survey, and teachers were studied by interview. Unauthorized fire use by children is more common than previously thought. Up to the fifth grade, unauthorized fire handling was more common among boys than girls, but gender differences narrowed by puberty. Of the boys, 37% reported having handled fire without permission, as did 25% of the girls. Altogether, one third of pupils reported having played with fire. The most common place for lighting fires was the child's own home or its immediate surroundings, where fire-making materials were most often obtained by asking for them or taking them. Pupils who had handled fire without permission had disrupted lessons. The statistically most significant predictors of frequent unauthorized fire handling were owning one's own fire-making materials and disruptive behavior at school.
Parents did not regard their children's fire use as a significant danger. Adults' attitudes towards children's unauthorized fire use were marked by downplaying of the acts, i.e. neutralization; this downplaying was shared by the children themselves, by parents, and by the authorities. Educators had no effective intervention methods at their disposal for solving the problem. Little cooperation between authorities was reported. Rescue authorities were hardly ever involved in interventions for children's unauthorized fire handling. According to the adults, interventions were handled case by case and haphazardly.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this thesis, the author approaches the problem of automated text classification, which is one of the basic tasks in building an Intelligent Internet Search Agent. The work discusses various approaches to solving sub-problems of automated text classification, such as feature extraction and machine learning on text sources. The author also describes her own multiword approach to feature extraction and presents the results of testing this approach using a classifier based on linear discriminant analysis, and a classifier that combines unsupervised learning for etalon extraction with supervised learning using the common backpropagation algorithm for a multilayer perceptron.
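
The abstract does not define the multiword approach, so the sketch below shows only one plausible reading: counting contiguous word sequences (here single words and word pairs) as features, so that terms such as "machine learning" are kept together. The function name, the tokenisation by whitespace, and the maximum length are assumptions.

```python
from collections import Counter

def multiword_features(text, max_len=2):
    """Count every contiguous word sequence of length 1..max_len."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features

print(multiword_features("machine learning on text"))
```

The resulting counts can be turned into fixed-length vectors over a shared vocabulary and passed to any classifier, including the discriminant-analysis and perceptron classifiers mentioned above.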

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An important task in environmental monitoring is to assess the current state of the environment and the changes that humans have caused to it, and to analyze and identify the relationships between them. Environmental change can be managed by collecting and analyzing data. This master's thesis studies changes observed in aquatic vegetation using remotely sensed measurement data and image analysis methods. Aerial images of Saimaa, the largest lake in Finland, taken in 1996 and 1999 were used for the monitoring. The first stage of the image analysis is geometric correction, whose purpose is to register the images and bring them into the same coordinate system. The second stage is to align the corresponding local regions and to identify changes in vegetation. Various approaches were used to identify vegetation, including supervised and unsupervised recognition methods. The study used real, noisy measurement data, and the experiments based on it gave good results.
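
Once two images have been geometrically corrected into the same coordinate system, the simplest form of change detection is a per-pixel difference with a threshold. The sketch below illustrates only that baseline; it is not the method of the thesis, and the function name, the grey-level grids, and the threshold are hypothetical.

```python
def changed_pixels(before, after, threshold):
    """Mark pixels whose grey level changed by more than threshold.

    Both images must already be co-registered and have the same shape.
    """
    if len(before) != len(after) or any(
        len(row_b) != len(row_a) for row_b, row_a in zip(before, after)
    ):
        raise ValueError("images must have the same shape")
    return [
        [abs(b - a) > threshold for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]

# Hypothetical 2x2 grey-level patches from the two acquisition years.
patch_1996 = [[10, 10], [10, 10]]
patch_1999 = [[10, 50], [12, 10]]
print(changed_pixels(patch_1996, patch_1999, 20))  # [[False, True], [False, False]]
```

With noisy data a fixed threshold produces many false alarms, which is one reason supervised and unsupervised recognition of vegetation classes is used instead of raw differencing.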

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: The structure and organisation of ecological interactions within an ecosystem are modified by the evolution and coevolution of the individual species it contains. Understanding how historical conditions have shaped this architecture is vital for understanding system responses to change at scales from the microbial upwards. However, in the absence of a group selection process, the collective behaviours and ecosystem functions exhibited by the whole community cannot be organised or adapted in a Darwinian sense. A long-standing open question thus persists: Are there alternative organising principles that enable us to understand and predict how the coevolution of the component species creates and maintains complex collective behaviours exhibited by the ecosystem as a whole? RESULTS: Here we answer this question by incorporating principles from connectionist learning, a previously unrelated discipline already using well-developed theories on how emergent behaviours arise in simple networks. Specifically, we show conditions where natural selection on ecological interactions is functionally equivalent to a simple type of connectionist learning, 'unsupervised learning', well-known in neural-network models of cognitive systems to produce many non-trivial collective behaviours. Accordingly, we find that a community can self-organise in a well-defined and non-trivial sense without selection at the community level; its organisation can be conditioned by past experience in the same sense as connectionist learning models habituate to stimuli. This conditioning drives the community to form a distributed ecological memory of multiple past states, causing the community to: a) converge to these states from any random initial composition; b) accurately restore historical compositions from small fragments; c) recover a state composition following disturbance; and d) correctly classify ambiguous initial compositions according to their similarity to learned compositions.
We examine how the formation of alternative stable states alters the community's response to changing environmental forcing, and we identify conditions under which the ecosystem exhibits hysteresis with potential for catastrophic regime shifts. CONCLUSIONS: This work highlights the potential of connectionist theory to expand our understanding of evo-eco dynamics and collective ecological behaviours. Within this framework we find that, despite not being a Darwinian unit, ecological communities can behave like connectionist learning systems, creating internal conditions that habituate to past environmental conditions and actively recalling those conditions. REVIEWERS: This article was reviewed by Prof. Ricard V Solé, Universitat Pompeu Fabra, Barcelona and Prof. Rob Knight, University of Colorado, Boulder.
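
The kind of distributed memory described here can be illustrated with a small Hopfield-style network trained by a Hebbian rule, a textbook example of unsupervised connectionist learning. This is only an analogy: the paper works with ecological interaction dynamics, not with this network, and the function names and the two stored patterns (read as presence or absence of eight hypothetical species) are assumptions.

```python
def hebbian_weights(patterns):
    """Strengthen the link between units that are active (or inactive) together."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / len(patterns)
    return weights

def recall(weights, state, sweeps=5):
    """Update units one at a time until the state settles on a stored pattern."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(weights[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

# Two past "community states": +1 means a species is abundant, -1 rare.
state_a = [1, 1, 1, 1, -1, -1, -1, -1]
state_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = hebbian_weights([state_a, state_b])

# A disturbed version of state_a (first species knocked out) is restored.
disturbed = [-1, 1, 1, 1, -1, -1, -1, -1]
print(recall(weights, disturbed) == state_a)  # True
```

No target output is ever supplied during training: the weights simply record which units co-occur, yet the network recovers a stored state after disturbance, which mirrors behaviours a) to c) listed in the abstract.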