49 results for Lanczos, Linear systems, Generalized cross validation


Relevance: 100.00%

Abstract:

We present a new indicator taxa approach to the prediction of climate change effects on biodiversity at the national level in Switzerland. As indicators, we select a set of the most widely distributed species that account for 95% of geographical variation in sampled species richness of birds, butterflies, and vascular plants. Species data come from a national program designed to monitor spatial and temporal trends in species richness. We examine some opportunities and limitations in using these data. We develop ecological niche models for the species as functions of both climate and land cover variables. We project these models to the future using climate predictions that correspond to two IPCC 3rd assessment scenarios for the development of 'greenhouse' gas emissions. We find that models that are calibrated with Swiss national monitoring data perform well in 10-fold cross-validation, but can fail to capture the hot-dry end of environmental gradients that constrain some species distributions. Models for indicator species in all three higher taxa predict that climate change will result in turnover in species composition even where there is little net change in predicted species richness. Indicator species from high elevations lose most areas of suitable climate even under the relatively mild B2 scenario. We project some areas to increase in the number of species for which climate conditions are suitable early in the current century, but these areas become less suitable for a majority of species by the end of the century. Selection of indicator species based on rank prevalence results in a set of models that predict observed species richness better than a similar set of species selected based on high rank of model AUC values. An indicator species approach based on selected species that are relatively common may facilitate the use of national monitoring data for predicting climate change effects on the distribution of biodiversity.

Relevance: 100.00%

Abstract:

OBJECTIVES: To test the validity of a simple, rapid, field-adapted, portable hand-held impedancemeter (HHI) for the estimation of lean body mass (LBM) and percentage body fat (%BF) in African women, and to develop specific predictive equations. DESIGN: Cross-sectional observational study. SETTINGS: Dakar, the capital city of Senegal, West Africa. SUBJECTS: A total sample of 146 women volunteered. Their mean age was 31.0 y (s.d. 9.1), weight 60.9 kg (s.d. 13.1) and BMI 22.6 kg/m² (s.d. 4.5). METHODS: Body composition values estimated by HHI were compared to those measured by whole body densitometry performed by air displacement plethysmography (ADP). The specific density of LBM in black subjects was taken into account for the calculation of %BF from body density. RESULTS: Estimations from HHI showed a large bias (mean difference) of 5.6 kg LBM (P<10⁻⁴) and -8.8 %BF (P<10⁻⁴) and errors (s.d. of the bias) of 2.6 kg LBM and 3.7 %BF. In order to correct for the bias, specific predictive equations were developed. With the HHI result as a single predictor, error values were 1.9 kg LBM and 3.7 %BF in the prediction group (n=100), and 2.2 kg LBM and 3.6 %BF in the cross-validation group (n=46). Addition of anthropometrical predictors was not necessary. CONCLUSIONS: The HHI analyser significantly overestimated LBM and underestimated %BF in African women. After correction for the bias, the body compartments could easily be estimated with good precision in African women by using the HHI result in an appropriate prediction equation. It remains to be seen whether a combination of arm and leg impedancemetry, in order to take the lower limbs into account, would further improve the prediction of body composition in Africans.
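The bias and error reported above are, respectively, the mean and the standard deviation of the paired differences between the two methods. The following is a minimal sketch of that calculation, not the authors' code. The function name `bias_and_error` and the sample values are hypothetical, and the difference is assumed to be taken as HHI minus ADP (consistent with the stated overestimation of LBM).

```python
from statistics import mean, stdev


def bias_and_error(estimated, reference):
    """Return (bias, error): mean and sample s.d. of estimated - reference."""
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    diffs = [e - r for e, r in zip(estimated, reference)]
    return mean(diffs), stdev(diffs)


# Made-up LBM values in kg, for illustration only (not study data).
hhi_lbm = [48.0, 52.5, 45.0, 50.5]  # hypothetical HHI estimates
adp_lbm = [43.0, 46.0, 40.5, 44.0]  # hypothetical ADP reference values
bias, error = bias_and_error(hhi_lbm, adp_lbm)
print(f"bias = {bias:.3f} kg, error = {error:.3f} kg")
```

A positive bias here means the first method reads higher than the reference on average.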

Relevance: 100.00%

Abstract:

Aim: We examined whether species occurrences are primarily limited by physiological tolerance in the abiotically more stressful end of climatic gradients (the asymmetric abiotic stress limitation (AASL) hypothesis) and the geographical predictions of this hypothesis: abiotic stress mainly determines upper-latitudinal and upper-altitudinal species range limits, and the importance of abiotic stress for these range limits increases the further northwards and upwards a species occurs. Location: Europe and the Swiss Alps. Methods: The AASL hypothesis predicts that species have skewed responses to climatic gradients, with a steep decline towards the more stressful conditions. Based on presence-absence data we examined the shape of plant species responses (measured as probability of occurrence) along three climatic gradients across latitudes in Europe (1577 species) and altitudes in the Swiss Alps (284 species) using Huisman-Olff-Fresco, generalized linear and generalized additive models. Results: We found that almost half of the species from Europe and one-third from the Swiss Alps showed responses consistent with the predictions of the AASL hypothesis. Cold temperatures and a short growing season seemed to determine the upper-latitudinal and upper-altitudinal range limits of up to one-third of the species, while drought provided an important constraint at lower-latitudinal range limits for up to one-fifth of the species. We found a biome-dependent influence of abiotic stress and no clear support for abiotic stress as a stronger upper range-limit determinant for species with higher latitudinal and altitudinal distributions. However, the overall influence of climate as a range-limit determinant increased with latitude.
Main conclusions: Our results support the AASL hypothesis for almost half of the studied species, and suggest that temperature-related stress controls the upper-latitudinal and upper-altitudinal range limits of a large proportion of these species, while other factors including drought stress may be important at the lower range limits.

Relevance: 100.00%

Abstract:

This paper presents general regression neural networks (GRNN) as a nonlinear regression method for the interpolation of monthly wind speeds in complex Alpine orography. GRNN is trained using data from Swiss meteorological networks to learn the statistical relationship between topographic features and wind speed. The terrain convexity, slope and exposure are considered by extracting features from the digital elevation model at different spatial scales using specialised convolution filters. A database of gridded monthly wind speeds is then constructed by applying GRNN in prediction mode for the period 1968-2008. This study demonstrates that using topographic features as inputs in GRNN significantly reduces cross-validation errors with respect to low-dimensional models integrating only geographical coordinates and terrain height for the interpolation of wind speed. The spatial predictability of wind speed is found to be lower in summer than in winter due to more complex and weaker wind-topography relationships. The relevance of these relationships is studied using an adaptive version of the GRNN algorithm, which allows the useful terrain features to be selected by eliminating the noisy ones. This research provides a framework for extending low-dimensional interpolation models to high-dimensional spaces by integrating additional features accounting for the topographic conditions at multiple spatial scales. Copyright (c) 2012 Royal Meteorological Society.

Relevance: 100.00%

Abstract:

Genetic variants influence the risk of developing certain diseases or give rise to differences in drug response. Recent progress in cost-effective, high-throughput genome-wide techniques, such as microarrays measuring Single Nucleotide Polymorphisms (SNPs), has facilitated genotyping of large clinical and population cohorts. Combining the massive genotypic data with measurements of phenotypic traits allows for the determination of genetic differences that explain, at least in part, the phenotypic variations within a population. So far, models combining the most significant variants can only explain a small fraction of the variance, indicating the limitations of current models. In particular, researchers have only begun to address the possibility of interactions between genotypes and the environment. Elucidating the contributions of such interactions is a difficult task because of the large number of genetic as well as possible environmental factors. In this thesis, I worked on several projects within this context. My first and main project was the identification of possible SNP-environment interactions, where the phenotypes were serum lipid levels of patients from the Swiss HIV Cohort Study (SHCS) treated with antiretroviral therapy. Here the genotypes consisted of a limited set of SNPs in candidate genes relevant for lipid transport and metabolism. The environmental variables were the specific combinations of drugs given to each patient over the treatment period. My work explored bioinformatic and statistical approaches to relate patients' lipid responses to these SNPs, drugs and, importantly, their interactions. The goal of this project was to improve our understanding and to explore the possibility of predicting dyslipidemia, a well-known adverse drug reaction of antiretroviral therapy.
Specifically, I quantified how much of the variance in lipid profiles could be explained by the host genetic variants, the administered drugs and SNP-drug interactions and assessed the predictive power of these features on lipid responses. Using cross-validation stratified by patients, we could not validate our hypothesis that models that select a subset of SNP-drug interactions in a principled way have better predictive power than the control models using "random" subsets. Nevertheless, all models tested containing SNP and/or drug terms exhibited significant predictive power (as compared to a random predictor) and explained a sizable proportion of variance in the patient-stratified cross-validation context. Importantly, the model containing stepwise selected SNP terms showed higher capacity to predict triglyceride levels than a model containing randomly selected SNPs. Dyslipidemia is a complex trait for which many factors remain to be discovered and are thus missing from the data, which possibly explains the limitations of our analysis. In particular, the interactions of drugs with SNPs selected from the set of candidate genes likely have small effect sizes which we were unable to detect in a sample of the present size (<800 patients). In the second part of my thesis, I performed genome-wide association studies within the Cohorte Lausannoise (CoLaus). I have been involved in several international projects to identify SNPs that are associated with various traits, such as serum calcium, body mass index, two-hour glucose levels, as well as metabolic syndrome and its components. These phenotypes are all related to major human health issues, such as cardiovascular disease. I applied statistical methods to detect new variants associated with these phenotypes, contributing to the identification of new genetic loci that may lead to new insights into the genetic basis of these traits.
This kind of research will lead to a better understanding of the mechanisms underlying these pathologies, a better evaluation of disease risk, the identification of new therapeutic leads and may ultimately lead to the realization of "personalized" medicine.

Relevance: 100.00%

Abstract:

The paper deals with the development and application of a methodology for the automatic mapping of pollution/contamination data. The General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve this problem. The automatic tuning of isotropic and anisotropic GRNN models using a cross-validation procedure is presented. Results are compared with the k-nearest-neighbours interpolation algorithm using an independent validation data set. Quality of mapping is controlled by the analysis of raw data and the residuals using variography. Maps of probabilities of exceeding a given decision level and "thick" isoline visualization of the uncertainties are presented as examples of decision-oriented mapping. The real case study is based on mapping of radioactively contaminated territories.
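A GRNN prediction is commonly written as a Gaussian-kernel-weighted average of the training targets, with the kernel width as the only parameter to tune. The sketch below illustrates the isotropic case with leave-one-out cross-validation over a list of candidate widths. It is an assumption-laden illustration rather than the paper's implementation: the names `grnn_predict`, `loo_rmse` and `tune_sigma`, the candidate grid and the toy data are all hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Gaussian-kernel-weighted average of training targets (isotropic GRNN)."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_rmse(train_x, train_y, sigma):
    """Leave-one-out root-mean-square error for a given kernel width."""
    sq_errors = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        pred = grnn_predict(rest_x, rest_y, train_x[i], sigma)
        sq_errors.append((pred - train_y[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def tune_sigma(train_x, train_y, candidates):
    """Pick the candidate width with the lowest leave-one-out error."""
    return min(candidates, key=lambda s: loo_rmse(train_x, train_y, s))


# Toy one-dimensional data, for illustration only.
xs = [(0.0,), (1.0,), (2.0,), (3.0,)]
ys = [0.0, 1.0, 2.0, 3.0]
best_sigma = tune_sigma(xs, ys, [0.1, 0.5, 5.0])
```

An anisotropic variant would use one width per input dimension, which turns the one-dimensional search into a search over a vector of widths.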

Relevance: 100.00%

Abstract:

BACKGROUND: Only a few countries have cohorts enabling specific and up-to-date cardiovascular disease (CVD) risk estimation. Individual risk assessment based on study samples that differ too much from the target population could jeopardize the benefit of risk charts in general practice. Our aim was to provide up-to-date and valid CVD risk estimation for a Swiss population using a novel record linkage approach. METHODS: Anonymous record linkage was used to follow up (for mortality, until 2008) 9,853 men and women aged 25-74 years who participated in the Swiss MONICA (MONItoring of trends and determinants in CVD) study of 1983-92. The linkage success was 97.8%; loss to follow-up 1990-2000 was 4.7%. Based on the ESC SCORE methodology (Weibull regression), we used age, sex, blood pressure, smoking, and cholesterol to generate three models. We compared the 1) original SCORE model with a 2) recalibrated and a 3) new model using the Brier score (BS) and cross-validation. RESULTS: Based on the cross-validated BS, the new model (BS = 14107×10⁻⁶) was somewhat more appropriate for risk estimation than the original (BS = 14190×10⁻⁶) and the recalibrated (BS = 14172×10⁻⁶) model. Particularly at younger age, derived absolute risks were consistently lower than those from the original and the recalibrated model, which was mainly due to a smaller impact of total cholesterol. CONCLUSION: Using record linkage of observational and routine data is an efficient procedure to obtain valid and up-to-date CVD risk estimates for a specific population.
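In its basic binary-outcome form, the Brier score is the mean squared difference between predicted probabilities and observed 0/1 outcomes, so lower values indicate better predictions. The sketch below shows only that basic form. It assumes fully observed outcomes and does not reproduce any censoring adjustment the study may have applied. The function name `brier_score` and the example values are hypothetical.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted risks and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted_probs, outcomes)]
    return sum(squared) / len(squared)


# Illustrative values only (not study data): an uninformative 50% forecast.
print(brier_score([0.5, 0.5], [0, 1]))  # 0.25
```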

Relevance: 100.00%

Abstract:

Multiple sclerosis (MS), a variable and diffuse disease affecting white and gray matter, is known to cause functional connectivity anomalies in patients. However, related studies published to-date are post hoc; our hypothesis was that such alterations could discriminate between patients and healthy controls in a predictive setting, laying the groundwork for imaging-based prognosis. Using functional magnetic resonance imaging resting state data of 22 minimally disabled MS patients and 14 controls, we developed a predictive model of connectivity alterations in MS: a whole-brain connectivity matrix was built for each subject from the slow oscillations (<0.11Hz) of region-averaged time series, and a pattern recognition technique was used to learn a discriminant function indicating which particular functional connections are most affected by disease. Classification performance using strict cross-validation yielded a sensitivity of 82% (above chance at p<0.005) and specificity of 86% (p<0.01) to distinguish between MS patients and controls. The most discriminative connectivity changes were found in subcortical and temporal regions, and contralateral connections were more discriminative than ipsilateral connections. The pattern of decreased discriminative connections can be summarized post hoc in an index that correlates positively (ρ=0.61) with white matter lesion load, possibly indicating functional reorganisation to cope with increasing lesion load. These results are consistent with a subtle but widespread impact of lesions in white matter and in gray matter structures serving as high-level integrative hubs. These findings suggest that predictive models of resting state fMRI can reveal specific anomalies due to MS with high sensitivity and specificity, potentially leading to new non-invasive markers.

Relevance: 100.00%

Abstract:

PURPOSE: To improve the risk stratification of patients with rhabdomyosarcoma (RMS) through the use of clinical and molecular biologic data. PATIENTS AND METHODS: Two independent data sets of gene-expression profiling for 124 and 101 patients with RMS were used to derive prognostic gene signatures by using a meta-analysis. These and a previously published metagene signature were evaluated by using cross validation analyses. A combined clinical and molecular risk-stratification scheme that incorporated the PAX3/FOXO1 fusion gene status was derived from 287 patients with RMS and evaluated. RESULTS: We showed that our prognostic gene-expression signature and the one previously published performed well with reproducible and significant effects. However, their effect was reduced when cross validated or tested in independent data and did not add new prognostic information over the fusion gene status, which is simpler to assay. Among nonmetastatic patients, patients who were PAX3/FOXO1 positive had a significantly poorer outcome compared with both alveolar-negative and PAX7/FOXO1-positive patients. Furthermore, a new clinicomolecular risk score that incorporated fusion gene status (negative and PAX3/FOXO1 and PAX7/FOXO1 positive), Intergroup Rhabdomyosarcoma Study TNM stage, and age showed a significant increase in performance over the current risk-stratification scheme. CONCLUSION: Gene signatures can improve current stratification of patients with RMS but will require complex assays to be developed and extensive validation before clinical application. A significant majority of their prognostic value was encapsulated by the fusion gene status. A continuous risk score derived from the combination of clinical parameters with the presence or absence of PAX3/FOXO1 represents a robust approach to improving current risk-adapted therapy for RMS.

Relevance: 100.00%

Abstract:

PURPOSE: Proper delineation of ocular anatomy in 3-dimensional (3D) imaging is a big challenge, particularly when developing treatment plans for ocular diseases. Magnetic resonance imaging (MRI) is presently used in clinical practice for diagnosis confirmation and treatment planning for treatment of retinoblastoma in infants, where it serves as a source of information, complementary to the fundus or ultrasonographic imaging. Here we present a framework to fully automatically segment the eye anatomy for MRI based on 3D active shape models (ASM), and we validate the results and present a proof of concept to automatically segment pathological eyes. METHODS AND MATERIALS: Manual and automatic segmentation were performed in 24 images of healthy children's eyes (3.29 ± 2.15 years of age). Imaging was performed using a 3-T MRI scanner. The ASM consists of the lens, the vitreous humor, the sclera, and the cornea. The model was fitted by first automatically detecting the position of the eye center, the lens, and the optic nerve, and then aligning the model and fitting it to the patient. We validated our segmentation method by using a leave-one-out cross-validation. The segmentation results were evaluated by measuring the overlap, using the Dice similarity coefficient (DSC) and the mean distance error. RESULTS: We obtained a DSC of 94.90 ± 2.12% for the sclera and the cornea, 94.72 ± 1.89% for the vitreous humor, and 85.16 ± 4.91% for the lens. The mean distance error was 0.26 ± 0.09 mm. The entire process took 14 seconds on average per eye. CONCLUSION: We provide a reliable and accurate tool that enables clinicians to automatically segment the sclera, the cornea, the vitreous humor, and the lens, using MRI. We additionally present a proof of concept for fully automatically segmenting eye pathology. This tool reduces the time needed for eye shape delineation and thus can help clinicians when planning eye treatment and confirming the extent of the tumor.
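The Dice similarity coefficient used above measures the overlap between two segmentations as twice the size of their intersection divided by the sum of their sizes. A minimal sketch follows, assuming each segmentation is represented as a set of voxel coordinates. The function name `dice_coefficient`, the set representation and the convention for two empty masks are assumptions, not details from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Illustrative voxel sets only (not study data).
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
automatic = {(0, 0, 1), (0, 1, 0), (1, 0, 0)}
print(dice_coefficient(manual, automatic))  # 2 * 2 / (3 + 3)
```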

Relevance: 100.00%

Abstract:

OBJECTIVE: To develop predictive models for early triage of burn patients based on hypersusceptibility to repeated infections. BACKGROUND: Infection remains a major cause of mortality and morbidity after severe trauma, demanding new strategies to combat infections. Models for infection prediction are lacking. METHODS: Secondary analysis of 459 burn patients (≥16 years old) with 20% or more total body surface area burns recruited from 6 US burn centers. We compared blood transcriptomes, with a 180-hour cutoff on the injury-to-transcriptome interval, of 47 patients (≤1 infection episode) to those of 66 hypersusceptible patients [multiple (≥2) infection episodes (MIE)]. We used LASSO regression to select biomarkers and multivariate logistic regression to build models, the accuracy of which was assessed by area under the receiver operating characteristic curve (AUROC) and cross-validation. RESULTS: Three predictive models were developed using covariates of (1) clinical characteristics; (2) expression profiles of 14 genomic probes; (3) combining (1) and (2). The genomic and clinical models were highly predictive of MIE status [AUROC_Genomic = 0.946 (95% CI: 0.906-0.986); AUROC_Clinical = 0.864 (CI: 0.794-0.933); AUROC_Genomic/AUROC_Clinical P = 0.044]. The combined model has an increased AUROC_Combined of 0.967 (CI: 0.940-0.993) compared with the individual models (AUROC_Combined/AUROC_Clinical P = 0.0069). Hypersusceptible patients show early alterations in immune-related signaling pathways, epigenetic modulation, and chromatin remodeling. CONCLUSIONS: Early triage of burn patients more susceptible to infections can be made using clinical characteristics and/or genomic signatures. The genomic signature suggests new insights into the pathophysiology of hypersusceptibility to infection that may lead to novel potential therapeutic or prophylactic targets.
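The AUROC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that pairwise definition. It is an illustration only: the function name `auroc` and the example scores are hypothetical, and the study's LASSO selection and logistic regression steps are not reproduced.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Illustrative scores only (not study data).
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is fine for cohorts of a few hundred patients. A rank-based formulation would scale better.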

Relevance: 40.00%

Abstract:

Summary: Lipophilicity plays an important role in the determination and the comprehension of the pharmacokinetic behavior of drugs. It is usually expressed by the partition coefficient (log P) in the n-octanol/water system. The use of an additional solvent system (1,2-dichloroethane/water) is necessary to obtain complementary information, as the log Poct values alone are not sufficient to explain all biological properties. The aim of this thesis is to develop tools to predict the lipophilicity of new drugs and to analyze the information yielded by those log P values. Part I presents the development of theoretical models used to predict lipophilicity. Chapter 2 shows the necessity of extending the existing solvatochromic analyses in order to correctly predict the lipophilicity of new and complex neutral compounds. In Chapter 3, solvatochromic analyses are used to develop a model for the prediction of the lipophilicity of ions. A global model was obtained that allows the lipophilicity of neutral, anionic and cationic solutes to be estimated. Part II presents the detailed study of two physicochemical filters. Chapter 4 shows that the Discovery RP Amide C16 stationary phase allows estimation of the lipophilicity of the neutral form of basic and acidic solutes, except for lipophilic acidic solutes. Those solutes present additional interactions with this particular stationary phase. In Chapter 5, four different IAM stationary phases are investigated. For neutral solutes, linear data are obtained whichever IAM column is used. For the ionized solutes, retention is due to a balance of electrostatic and hydrophobic interactions. Thus no discrimination is observed between different series of solutes bearing the same charge, from one column to another. Part III presents two examples illustrating the information obtained thanks to Structure-Properties Relationships (SPR).
Graphically comparing lipophilicity values obtained in two different solvent systems reveals the presence of intramolecular effects such as internal H-bonds (Chapter 6). SPR is used to study the partitioning of ionizable groups encountered in Medicinal Chemistry (Chapter 7).
French abstract (Résumé): Lipophilicity plays an important role in determining and understanding the pharmacokinetic behavior of drugs. It is generally expressed by the partition coefficient (log P) of a compound in the n-octanol/water solvent system. The use of a second solvent system (1,2-dichloroethane/water) proved necessary to obtain complementary information, as log Poct values alone are not sufficient to explain all biological properties. The aim of this thesis is to develop tools for predicting the lipophilicity of new drug candidates and for analyzing the information provided by log P values. Part I presents the development of theoretical models used to predict lipophilicity. Chapter 2 shows the need to update the existing solvatochromic analyses, which are unsuited to predicting the lipophilicity of new neutral compounds. In Chapter 3, the same solvatochromic methodology is used to develop a model for predicting the lipophilicity of ions. The resulting global model predicts the lipophilicity of neutral, anionic and cationic compounds. Part II presents an in-depth study of two physicochemical filters. Chapter 4 shows that the Discovery RP Amide C16 stationary phase allows the lipophilicity of the neutral form of basic and acidic compounds to be determined, with the exception of highly lipophilic acids. The latter show additional interactions with this stationary phase. In Chapter 5, four IAM stationary phases are studied. For the neutral compounds studied, linear retention values are obtained whichever IAM column is used. For ionizable compounds, retention results from a balance between electrostatic and hydrophobic interactions. Consequently, no discrimination is observed from one column to another between the different series of compounds bearing the same charge. Part III presents two examples illustrating the information obtained through structure-property relationships. Graphically comparing the lipophilicity measured in two different solvent systems reveals the presence of intramolecular effects such as intramolecular hydrogen bonds (Chapter 6). This structure-property approach is also applied to the study of the partitioning of ionizable functional groups encountered in medicinal chemistry (Chapter 7).
Lay summary (Résumé large public): To exert its therapeutic effect, a drug must reach its site of action in sufficient quantity. The effective quantity of drug reaching the site of action depends on the number of interactions between the drug and numerous constituents of the body, such as metabolic enzymes or biological membranes. The passage of the drug across these membranes, called permeation, is an important parameter to optimize in order to develop more potent drugs. Lipophilicity plays a key role in understanding the passive permeation of drugs. It is generally expressed by the partition coefficient (log P) in the (immiscible) n-octanol/water solvent system. Log Poct values alone have proved insufficient to explain permeation across all the different biological membranes of the human body. The use of an additional solvent system (1,2-dichloroethane/water) provided the complementary information essential to a good understanding of the permeation process. A large number of experimental and theoretical tools are available for studying lipophilicity. This thesis focuses mainly on developing or improving some of these tools so that they can be applied to a wider range of compounds. Two of these tools are briefly described here: 1) Factorizing lipophilicity in terms of certain structural properties (such as volume) specific to the compounds makes it possible to develop theoretical models usable for predicting the lipophilicity of new compounds or drugs. This approach is applied to the analysis of the lipophilicity of neutral compounds as well as of charged compounds. 2) Reversed-phase high-pressure liquid chromatography (RP-HPLC) is a commonly used method for the experimental determination of log Poct values.

Relevance: 40.00%

Abstract:

AIM: The aim of this study was to interpret and validate a French version of the Oswestry disability index (ODI), using a cross-cultural validation method. The validity and reliability of the questionnaire were assessed in order to establish its psychometric characteristics. METHOD: The cross-cultural validation was carried out according to Beaton's methodology. The study was conducted with 41 patients suffering from low back pain. The correlation between the ODI and the Roland-Morris disability questionnaire (RMDQ), the medical outcome survey short form-36 (MOS SF-36) and a pain visual analogue scale (VAS) was assessed. RESULTS: The validity of the Oswestry questionnaire was studied using the Cronbach alpha coefficient calculation: 0.87 (n=36). The significant correlation between the ODI and RMDQ was 0.8 (P<0.001, n=41) and 0.71 (P<0.001, n=36) for the pain VAS. The correlations between the ODI and certain subscales (physical functioning 0.7 (P<0.001, n=41), physical role 0.49 and bodily pain 0.73 (P<0.001, n=41)) of the MOS SF-36 were equally significant. The reproducibility of the ODI was calculated using the Wilcoxon matched pairs test: there was no significant difference for eight out of ten sections or for the final score. CONCLUSION: This French translation of the ODI should be considered as valid and reliable. It should be used for any future clinical studies carried out with French-speaking patients. Complementary studies must be completed in order to assess its sensitivity to change in the event of any modifications in the patients' functional capacity.
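Cronbach's alpha, reported above, compares the sum of the individual item variances with the variance of the total score: alpha = k/(k-1) × (1 − Σ item variances / total-score variance), where k is the number of items. The sketch below is a minimal illustration of that formula using sample variances. The function name `cronbach_alpha` and the example responses are hypothetical and are not ODI data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)


# Illustrative responses only: three respondents, two perfectly consistent items.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # 1.0
```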

Relevance: 40.00%

Abstract:

Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data, which frequently requires cross-platform studies. Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles. Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments.
Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data. Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model. Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.