958 results for Validation Studies
Abstract:
Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data, which frequently requires cross-platform studies. Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles. Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments. Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data. Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model. Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.
Abstract:
PURPOSE: Cardiovascular magnetic resonance (CMR) has become a robust and important diagnostic imaging modality in cardiovascular medicine. However, insufficient image quality may compromise its diagnostic accuracy. No standardized criteria are available to assess the quality of CMR studies. We aimed to describe and validate standardized criteria to evaluate the quality of CMR studies including: a) cine steady-state free precession, b) delayed gadolinium enhancement, and c) adenosine stress first-pass perfusion. These criteria will serve for the assessment of the image quality in the setting of the Euro-CMR registry. METHOD AND MATERIALS: First, a total of 45 quality criteria were defined (35 qualitative criteria with a score from 0-3, and 10 quantitative criteria). The qualitative score ranged from 0 to 105. The lower the qualitative score, the better the quality. The quantitative criteria were based on the absolute signal intensity (delayed enhancement) and on the signal increase (perfusion) of the anterior/posterior left ventricular wall after gadolinium injection. These criteria were then applied in 30 patients scanned with a 1.5T system and in 15 patients scanned with a 3.0T system. The examinations were jointly interpreted by 3 CMR experts and 1 study nurse. In these 45 patients the correlation between the results of the quality assessment obtained by the different readers was calculated. RESULTS: On the 1.5T machine, the mean quality score was 3.5. The mean difference between each pair of observers was 0.2 (5.7%) with a mean standard deviation of 1.4. On the 3.0T machine, the mean quality score was 4.4. The mean difference between each pair of observers was 0.3 (6.4%) with a mean standard deviation of 1.6. The quantitative quality assessments between observers were well correlated for the 1.5T machine: R was between 0.78 and 0.99 (p […]). CONCLUSION: The described criteria for the assessment of CMR image quality are robust and have a low inter-observer variability, especially on 1.5T systems. CLINICAL RELEVANCE/APPLICATION: These criteria will allow the standardization of CMR examinations. They will help to improve the overall quality of examinations and the comparison between clinical studies.
Abstract:
Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: Given two networks G1 and G2, with both mutation and recombination events, defined on overlapping sets of extant units, the objective is to compute a consensus network G3 with a minimum number of additional recombinations. We describe a polynomial time algorithm with a guarantee that the number of computed new recombination events is within ϵ = sz(G1, G2) (function sz is a well-behaved function of the sizes and topologies of G1 and G2) of the optimal number of recombinations. To date, this is the best known result for a network consensus problem. Results: Although the network consensus problem can be applied to a variety of domains, here we focus on the structure of human populations. With our preliminary analysis on a segment of the human Chromosome X data we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations. Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divide based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model. 
Since the model in principle gives us more information embedded in the networks, in our future work, we plan to investigate more non-traditional questions via these structures computed by our methodology.
Abstract:
OBJECTIVE: Evaluation of a French translation of the Addiction Severity Index (ASI) in 100 (78 male) alcoholic patients. METHOD: Validity of the instrument was assessed by measuring test-retest and interrater reliability, internal consistency and convergence and discrimination between items and scales. Concurrent validity was assessed by comparing the scores from the ASI with those obtained from three other clinimetric instruments. RESULTS: Test-retest reliability of ASI scores (after a 10-day interval) was good (r = 0.63 to r = 0.95). Interrater reliability was evaluated using six video recordings of patient interviews. Severity ratings assigned by six raters were significantly different (p < .05), but 72% of the ratings assigned by those who viewed the videos were within two points of the interviewer's severity ratings. Cronbach's alpha coefficient of internal consistency varied from 0.58 to 0.81 across scales. The average item-to-scale convergent validity (r value) was 0.49 (range 0.0 to 0.84) for composite scores and 0.35 (range 0.00 to 0.68) for severity ratings, whereas discriminant validity was 0.11 on average (range -0.19 to 0.46) for composite scores and 0.12 (range -0.20 to 0.52) for severity ratings. Finally, concurrent validity with the following instruments was assessed: Severity of Alcoholism Dependence Questionnaire (40% shared variance with ASI alcohol scale), Michigan Alcoholism Screening Test (2% shared variance with ASI alcohol scale) and Hamilton Depression Rating Scale (31% shared variance with ASI psychiatric scale). CONCLUSIONS: The Addiction Severity Index covers a large scope of problems encountered among alcoholics and quantifies need for treatment. This French version presents acceptable criteria of reliability and validity.
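The internal-consistency figures above are Cronbach's alpha values. As a rough illustration of how such a coefficient is usually computed, here is a minimal Python sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total score). The function name and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding one score per
    respondent (all lists the same length). Sample variances are used.
    """
    k = len(items)
    n = len(items[0])
    # Total scale score for each respondent.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical toy data: three items answered by four respondents.
identical_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
noisy_items = [[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]]
alpha_identical = cronbach_alpha(identical_items)
alpha_noisy = cronbach_alpha(noisy_items)
```

Perfectly redundant items give an alpha of 1, and less consistent items give lower values, which is why the reported range of 0.58 to 0.81 is read as scale-by-scale internal consistency.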
Abstract:
PURPOSE: Quantification of myocardial blood flow (MBF) with generator-produced (82)Rb is an attractive alternative for centres without an on-site cyclotron. Our aim was to validate (82)Rb-measured MBF in relation to that measured using (15)O-water, as a tracer 100% of which can be extracted from the circulation even at high flow rates, in healthy control subjects and patients with mild coronary artery disease (CAD). METHODS: MBF was measured at rest and during adenosine-induced hyperaemia with (82)Rb and (15)O-water PET in 33 participants (22 control subjects, aged 30 ± 13 years; 11 CAD patients without transmural infarction, aged 60 ± 13 years). A one-tissue compartment (82)Rb model with ventricular spillover correction was used. The (82)Rb flow-dependent extraction rate was derived from (15)O-water measurements in a subset of 11 control subjects. Myocardial flow reserve (MFR) was defined as the hyperaemic/rest MBF. Pearson's correlation r, Bland-Altman 95% limits of agreement (LoA), and Lin's concordance correlation ρ(c) (measuring both precision and accuracy) were used. RESULTS: Over the entire MBF range (0.66-4.7 ml/min/g), concordance was excellent for MBF (r = 0.90, [(82)Rb-(15)O-water] mean difference ± SD = 0.04 ± 0.66 ml/min/g, LoA = -1.26 to 1.33 ml/min/g, ρ(c) = 0.88) and MFR (range 1.79-5.81, r = 0.83, mean difference = 0.14 ± 0.58, LoA = -0.99 to 1.28, ρ(c) = 0.82). Hyperaemic MBF was reduced in CAD patients compared with the subset of 11 control subjects (2.53 ± 0.74 vs. 3.62 ± 0.68 ml/min/g, p = 0.002, for (15)O-water; 2.53 ± 1.01 vs. 3.82 ± 1.21 ml/min/g, p = 0.013, for (82)Rb) and this was paralleled by a lower MFR (2.65 ± 0.62 vs. 3.79 ± 0.98, p = 0.004, for (15)O-water; 2.85 ± 0.91 vs. 3.88 ± 0.91, p = 0.012, for (82)Rb). Myocardial perfusion was homogeneous in 1,114 of 1,122 segments (99.3%) and there were no differences in MBF among the coronary artery territories (p > 0.31). 
CONCLUSION: Quantification of MBF with (82)Rb with a newly derived correction for the nonlinear extraction function was validated against MBF measured using (15)O-water in control subjects and patients with mild CAD, where it was found to be accurate at high flow rates. (82)Rb-derived MBF estimates seem robust for clinical research, advancing a step further towards its implementation in clinical routine.
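The agreement statistics named in this abstract can be illustrated with a short Python sketch. It assumes the conventional definitions: Bland-Altman 95% limits of agreement as mean difference ± 1.96 SD, and Lin's concordance correlation as 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²) with 1/n moments. The function names and toy values are hypothetical; only the summary figures 0.04 and 0.66 come from the abstract.

```python
from statistics import mean, stdev


def limits_of_agreement(bias, sd):
    """95% limits of agreement from a mean difference and its SD."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    lower, upper = limits_of_agreement(bias, stdev(diffs))
    return bias, lower, upper


def lin_ccc(a, b):
    """Lin's concordance correlation coefficient (1/n moments)."""
    n = len(a)
    mean_a, mean_b = mean(a), mean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)


# Summary values reported above for MBF: mean difference 0.04, SD 0.66.
lo, hi = limits_of_agreement(0.04, 0.66)

# Hypothetical paired values: method_b is method_a shifted by a constant.
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [2.0, 3.0, 4.0, 5.0]
ccc_same = lin_ccc(method_a, method_a)
ccc_shifted = lin_ccc(method_a, method_b)
```

With the rounded summary values the limits come out at roughly -1.25 to 1.33, close to the reported -1.26 to 1.33; the small gap is consistent with rounding of the published mean and SD. The shifted toy series shows why the concordance coefficient penalizes a systematic offset even when the Pearson correlation is perfect.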
Abstract:
BACKGROUND: The reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is a widely used, highly sensitive laboratory technique to rapidly and easily detect, identify and quantify gene expression. Reliable RT-qPCR data require accurate normalization with validated control genes (reference genes) whose expression is constant in all studied conditions. This stability has to be demonstrated. We performed a literature search for studies using quantitative or semi-quantitative PCR in the rat spared nerve injury (SNI) model of neuropathic pain to verify whether any reference genes had previously been validated. We then analyzed the stability over time of 7 commonly used reference genes in the nervous system, specifically in the spinal cord dorsal horn and the dorsal root ganglion (DRG). These were: Actin beta (Actb), Glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ribosomal proteins 18S (18S), L13a (RPL13a) and L29 (RPL29), hypoxanthine phosphoribosyltransferase 1 (HPRT1) and hydroxymethylbilane synthase (HMBS). We compared the candidate genes and established a stability ranking using the geNorm algorithm. Finally, we assessed the number of reference genes necessary for accurate normalization in this neuropathic pain model. RESULTS: We found GAPDH, HMBS, Actb, HPRT1 and 18S cited as reference genes in literature on studies using the SNI model. Only HPRT1 and 18S had been once previously demonstrated as stable in RT-qPCR arrays. All the genes tested in this study, using the geNorm algorithm, presented gene stability values (M-value) acceptable enough for them to qualify as potential reference genes in both DRG and spinal cord. Using the coefficient of variation, 18S failed the 50% cut-off with a value of 61% in the DRG. The two most stable genes in the dorsal horn were RPL29 and RPL13a; in the DRG they were HPRT1 and Actb. Using a 0.15 cut-off for pairwise variations we found that any pair of stable reference genes was sufficient for the normalization process. CONCLUSIONS: In the rat SNI model, we validated and ranked Actb, RPL29, RPL13a, HMBS, GAPDH, HPRT1 and 18S as good reference genes in the spinal cord. In the DRG, 18S did not fulfill stability criteria. The combination of any two stable reference genes was sufficient to provide an accurate normalization.
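The gene stability value M used for the ranking above can be sketched in a few lines of Python. This assumes the usual geNorm definition: for each gene, M is the average, over all other candidate genes, of the standard deviation across samples of the log2 expression ratio. The function name and the expression values are hypothetical, and the sketch covers only the M computation, not geNorm's stepwise exclusion or the pairwise-variation cut-off.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Return the stability value M for each candidate reference gene.

    expr maps a gene name to its relative expression quantities
    (linear scale, one value per sample, same sample order for all genes).
    Lower M means more stable expression relative to the other candidates.
    """
    m_values = {}
    for gene_j, values_j in expr.items():
        pairwise_sd = []
        for gene_k, values_k in expr.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(values_j, values_k)]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values


# Hypothetical quantities for three samples (gene names from the abstract).
toy_expression = {
    "RPL29": [1.0, 2.0, 4.0],
    "RPL13a": [2.0, 4.0, 8.0],  # constant ratio to RPL29
    "18S": [1.0, 1.0, 1.0],     # does not track the other two
}
m_values = genorm_m(toy_expression)
```

In this toy set the two genes whose ratio is constant across samples share the lowest M, while the gene that does not co-vary with them gets the highest, mirroring how a ranking from most to least stable is obtained.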
Abstract:
Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes have been published has generated new challenges, notably at the level of genome annotation. Even if gene predictors and annotation software are increasingly efficient, the ultimate validation is still the observation of the predicted gene product(s). Mass spectrometry-based proteomics provides the necessary high-throughput technology to provide evidence of protein presence and, from the identified sequences, confirmation or invalidation of predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.
Abstract:
INTRODUCTION: A clinical decision rule to improve the accuracy of a diagnosis of influenza could help clinicians avoid unnecessary use of diagnostic tests and treatments. Our objective was to develop and validate a simple clinical decision rule for diagnosis of influenza. METHODS: We combined data from 2 studies of influenza diagnosis in adult outpatients with suspected influenza: one set in California and one in Switzerland. Patients in both studies underwent a structured history and physical examination and had a reference standard test for influenza (polymerase chain reaction or culture). We randomly divided the dataset into derivation and validation groups and then evaluated simple heuristics and decision rules from previous studies and 3 rules based on our own multivariate analysis. Cutpoints for stratification of risk groups in each model were determined using the derivation group before evaluating them in the validation group. For each decision rule, the positive predictive value and likelihood ratio for influenza in low-, moderate-, and high-risk groups, and the percentage of patients allocated to each risk group, were reported. RESULTS: The simple heuristics (fever and cough; fever, cough, and acute onset) were helpful when positive but not when negative. The most useful and accurate clinical rule assigned 2 points for fever plus cough, 2 points for myalgias, and 1 point each for duration <48 hours and chills or sweats. The risk of influenza was 8% for 0 to 2 points, 30% for 3 points, and 59% for 4 to 6 points; the rule performed similarly in derivation and validation groups. Approximately two-thirds of patients fell into the low- or high-risk group and would not require further diagnostic testing. CONCLUSION: A simple, valid clinical rule can be used to guide point-of-care testing and empiric therapy for patients with suspected influenza.
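The point-based rule described in the results can be written out as a small Python sketch. The point weights, score bands and risk percentages are those stated in the abstract; the function and parameter names are hypothetical, and this is an illustration of the arithmetic rather than a clinical tool.

```python
def flu_score(fever_and_cough, myalgias, onset_under_48h, chills_or_sweats):
    """Sum the points: 2 for fever plus cough, 2 for myalgias,
    1 for duration under 48 hours, 1 for chills or sweats (maximum 6)."""
    return (2 * bool(fever_and_cough) + 2 * bool(myalgias)
            + bool(onset_under_48h) + bool(chills_or_sweats))


def risk_group(score):
    """Map a score to the risk band and influenza risk reported above."""
    if score <= 2:
        return "low", 0.08
    if score == 3:
        return "moderate", 0.30
    return "high", 0.59


# Example: fever plus cough with acute onset, no myalgias or chills.
example_score = flu_score(True, False, True, False)
example_group = risk_group(example_score)
```

A patient with fever plus cough and onset under 48 hours scores 3 points and lands in the moderate band, the group for which the abstract implies further diagnostic testing is most informative.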
Abstract:
BACKGROUND: Cardiovascular magnetic resonance (CMR) has become an important diagnostic imaging modality in cardiovascular medicine. However, insufficient image quality may compromise its diagnostic accuracy. We aimed to describe and validate standardized criteria to evaluate a) cine steady-state free precession (SSFP), b) late gadolinium enhancement (LGE), and c) stress first-pass perfusion images. These criteria will serve for quality assessment in the setting of the Euro-CMR registry. METHODS: Thirty-five qualitative criteria were defined (scores 0-3) with lower scores indicating better image quality. In addition, quantitative parameters were measured yielding 2 additional quality criteria, i.e. signal-to-noise ratio (SNR) of non-infarcted myocardium (as a measure of correct signal nulling of healthy myocardium) for LGE and % signal increase during contrast medium first-pass for perfusion images. These qualitative and quantitative criteria were assessed in a total of 90 patients (60 patients scanned at our own institution at 1.5T (n=30) and 3T (n=30) and in 30 patients randomly chosen from the Euro-CMR registry examined at 1.5T). Analyses were performed by 2 SCMR level-3 experts, 1 trained study nurse, and 1 trained medical student. RESULTS: The global quality score was 6.7±4.6 (n=90, mean of 4 observers, maximum possible score 64), range 6.4-6.9 (p=0.76 between observers). It ranged from 4.0-4.3 for 1.5T (p=0.96 between observers), from 5.9-6.9 for 3T (p=0.33 between observers), and from 8.6-10.3 for the Euro-CMR cases (p=0.40 between observers). The inter- (n=4) and intra-observer (n=2) agreement for the global quality score, i.e. the percentage of assignments to the same quality tertile ranged from 80% to 88% and from 90% to 98%, respectively. 
The agreement for the quantitative assessment for LGE images (scores 0-2 for SNR <2, 2-5, >5, respectively) ranged from 78-84% for the entire population, and 70-93% at 1.5T, 64-88% at 3T, and 72-90% for the Euro-CMR cases. The agreement for perfusion images (scores 0-2 for %SI increase >200%, 100%-200%, <100%, respectively) ranged from 81-91% for the entire population, and 76-100% at 1.5T, 67-96% at 3T, and 62-90% for the Euro-CMR registry cases. The intra-class correlation coefficient for the global quality score was 0.83. CONCLUSIONS: The described criteria for the assessment of CMR image quality are robust with a good inter- and intra-observer agreement. Further research is needed to define the impact of image quality on the diagnostic and prognostic yield of CMR studies.
Abstract:
OBJECTIVE: To validate a revision of the Mini Nutritional Assessment short-form (MNA®-SF) against the full MNA, a standard tool for nutritional evaluation. METHODS: A literature search identified studies that used the MNA for nutritional screening in geriatric patients. The contacted authors submitted original datasets that were merged into a single database. Various combinations of the questions on the current MNA-SF were tested using this database through combination analysis and ROC-based derivation of classification thresholds. RESULTS: Twenty-seven datasets (n=6257 participants) were initially processed, from which twelve were used in the current analysis on a sample of 2032 study participants (mean age 82.3y) with complete information on all MNA items. The original MNA-SF was a combination of six questions from the full MNA. A revised MNA-SF in which calf circumference (CC) was substituted for BMI performed equally well. A revised three-category scoring classification for this revised MNA-SF, using BMI and/or CC, had good sensitivity compared to the full MNA. CONCLUSION: The newly revised MNA-SF is a valid nutritional screening tool applicable to geriatric health care professionals with the option of using CC when BMI cannot be calculated. This revised MNA-SF increases the applicability of this rapid screening tool in clinical practice through the inclusion of a "malnourished" category.
Abstract:
The turbot (Scophthalmus maximus) is a commercially valuable flatfish and one of the most promising aquaculture species in Europe. Two transcriptome 454-pyrosequencing runs were used in order to detect Single Nucleotide Polymorphisms (SNPs) in genes related to immune response and gonad differentiation. A total of 866 true SNPs were detected in 140 different contigs representing 262,093 bp as a whole. Only one true SNP was analyzed in each contig. One hundred and thirteen SNPs out of the 140 analyzed were feasible (genotyped), while 111 were polymorphic in a wild population. Transition/transversion ratio (1.354) was similar to that observed in other fish studies. Unbiased gene diversity (He) estimates ranged from 0.060 to 0.510 (mean = 0.351), minor allele frequency (MAF) from 0.030 to 0.500 (mean = 0.259) and all loci were in Hardy-Weinberg equilibrium after Bonferroni correction. A large number of SNPs (49) were located in the coding region, 33 representing synonymous and 16 non-synonymous changes. Most SNP-containing genes were related to immune response and gonad differentiation processes, and could be candidates for functional changes leading to phenotypic changes. These markers will be useful for population screening to look for adaptive variation in wild and domestic turbot.
Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.
Abstract:
BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
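The kind of bias this study examines can be illustrated with a deliberately simplified Python simulation. This is a sketch under stated assumptions, not the authors' setup: it uses pure Gaussian noise instead of real expression data, a nearest-centroid classifier instead of the classifiers listed above, and plain k-fold cross-validation without feature selection or tuning. All names, sample sizes and the batch shift of 2.0 are hypothetical. There is no true group difference; group and batch are fully confounded in the training data.

```python
import random


def simulate(n, n_features, batch_shift, rng):
    """Draw n samples of pure noise plus a constant per-batch shift."""
    return [[rng.gauss(0.0, 1.0) + batch_shift for _ in range(n_features)]
            for _ in range(n)]


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(X, y):
    """Nearest-centroid classifier: one centroid per group label."""
    return {label: centroid([x for x, l in zip(X, y) if l == label])
            for label in set(y)}


def predict(model, x):
    return min(model, key=lambda label: sq_dist(model[label], x))


def accuracy(model, X, y):
    return sum(predict(model, x) == l for x, l in zip(X, y)) / len(y)


def cv_accuracy(X, y, k, rng):
    """Plain k-fold cross-validated accuracy."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(y)


rng = random.Random(0)
# Training data: group 1 only in batch A (shift 0), group 2 only in batch B.
X_train = simulate(40, 20, 0.0, rng) + simulate(40, 20, 2.0, rng)
y_train = [1] * 40 + [2] * 40
cv_acc = cv_accuracy(X_train, y_train, 5, rng)

# Independent data: a single batch, both groups present, still no signal.
X_new = simulate(40, 20, 0.0, rng)
y_new = [1, 2] * 20
indep_acc = accuracy(fit(X_train, y_train), X_new, y_new)
```

In this toy setting the cross-validated accuracy is close to 1 because the classifier learns the batch shift, while accuracy on the independent batch sits at chance level. The gap between the two numbers is the estimation bias the abstract refers to.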
Abstract:
Molecular shape has long been known to be an important property for the process of molecular recognition. Previous studies postulated the existence of a drug-like shape space that could be used to artificially bias the composition of screening libraries, with the aim of increasing the chance of success in Hit Identification. In this work, it was analysed to what extent this assumption holds true. Normalized Principal Moments of Inertia Ratios (NPRs) have been used to describe the molecular shape of small molecules. It was investigated whether active molecules of diverse targets are located in preferred subspaces of the NPR shape space. Results illustrated a significantly stronger clustering than could be expected by chance, with parts of the space unlikely to be occupied by active compounds. Furthermore, a strong enrichment of elongated, rather flat shapes could be observed, while globular compounds were highly underrepresented. This was confirmed for a wide range of small molecule datasets from different origins. Active compounds exhibited a high overlap in their shape distributions across different targets, making a purely shape-based discrimination very difficult. An additional perspective was provided by comparing the shapes of protein binding pockets with those of their respective ligands. Although more globular than their ligands, binding-site shapes were observed to exhibit a similarly skewed distribution in shape space: spherical shapes were highly underrepresented. This was different for unoccupied binding pockets of smaller size, which were on the contrary found to possess a more globular shape. The relation between shape complementarity and exhibited bioactivity was analysed; a moderate correlation between bioactivity and parameters including pocket coverage, distance in shape space, and others could be identified, which reflects the importance of shape complementarity. 
However, this also suggests that other aspects are of relevance for molecular recognition. A subsequent analysis assessed if and how shape and volume information retrieved from pocket or respective reference ligands could be used as a pre-filter in a virtual screening approach. In Lead Optimization, compounds need to be optimized with respect to a variety of parameters. Here, the availability of past success stories is very valuable, as they can guide medicinal chemists during their analogue synthesis plans. However, although of tremendous interest for the public domain, so far only large corporations had the ability to mine historical knowledge in their proprietary databases. With the aim of providing such information, the SwissBioisostere database was developed and released during this thesis. This database contains information on 21,293,355 performed substructural exchanges, corresponding to 5,586,462 unique replacements that have been measured in 35,039 assays against 1,948 molecular targets representing 30 target classes, and on their impact on bioactivity. A user-friendly interface was developed that provides facile access to these data and is accessible at http://www.swissbioisostere.ch. The ChEMBL database was used as primary data source of bioactivity information. Matched molecular pairs have been identified in the extracted and cleaned data. Success-based scores were developed and integrated into the database to allow re-ranking of proposed replacements by their past outcomes. It was analysed to which degree these scores correlate with chemical similarity of the underlying fragments. An unexpectedly weak relationship was detected and further investigated. 
Use cases of this database were envisioned, and functionalities implemented accordingly: replacement outcomes are aggregatable at the assay level, and it was shown that an aggregation at the target or target class level could also be performed, but should be accompanied by a careful case-by-case assessment. It was furthermore observed that replacement success depends on the activity of the starting compound A within a matched molecular pair A-B. With increasing potency the probability of losing bioactivity through any substructural exchange was significantly higher than for low-affinity binders. A potential existence of a publication bias could be refuted. Furthermore, frequently used medicinal chemistry strategies for structure-activity-relationship exploration were analysed using the acquired data. Finally, data originating from pharmaceutical companies were compared with those reported in the literature. It could be seen that industrial medicinal chemistry can access replacement information not available in the public domain. In contrast, a large amount of often-performed replacements within companies could also be identified in literature data. Preferences for particular replacements differed between these two sources. The value of combining different endpoints in an evaluation of molecular replacements was investigated. The performed studies highlighted furthermore that there seems to exist no universal substructural replacement that always retains bioactivity irrespective of the biological environment. A generalization of bioisosteric replacements seems therefore not possible. - The three-dimensional shape of molecules has long been recognized as an important property for the process of molecular recognition. Previous studies postulated that drugs preferentially occupy a subset of molecular shape space. 
This subset could be used to bias the composition of screening libraries, with the aim of increasing the chances of identifying hits. The analysis and validation of this assertion is the subject of this first part. Normalized Principal Moments of Inertia Ratios (NPRs) were used to describe the three-dimensional shape of small drug-like molecules. It was investigated whether molecules active on different targets co-localize in preferred subspaces of shape space. The results show clustering of molecules that is incompatible with a random distribution, with some parts of the space unlikely to be occupied by active compounds. Furthermore, a strong enrichment in elongated, rather flat shapes was observed, while globular compounds were strongly underrepresented. This was confirmed for a wide range of molecule collections of different origins. The shape distributions of molecules active on different targets overlap extensively, making discrimination based on shape alone very difficult. An additional perspective was added by comparing the shapes of ligands with those of their binding sites (pockets) in their respective proteins. Although more globular than their ligands, pocket shapes were observed to show a distribution in shape space with the same type of asymmetry as that observed for ligands: spherical shapes are strongly underrepresented. A different result was obtained for smaller pockets crystallized without a ligand: they had a more globular shape. 
The relationship between shape complementarity and bioactivity was also analysed; a moderate correlation between bioactivity and parameters such as pocket filling, distance in shape space, and others could be identified. This reflects the importance of shape complementarity, but also the involvement of other factors. A subsequent analysis assessed whether and how the shape and volume of a pocket or of its reference ligands could be used as a pre-filter in a virtual screening approach. During lead optimization, many parameters must be optimized simultaneously. In this context, the availability of examples of successful optimizations is valuable, as they can guide medicinal chemists in their analogue synthesis plans. However, although of great interest to researchers in the public domain, only large pharmaceutical companies have so far had the capacity to exploit such knowledge within their internal databases. To remedy this limitation, the SwissBioisostere database was developed and released into the public domain during this thesis. This database contains information on 21,293,355 observed substructural exchanges, corresponding to 5,586,462 unique replacements measured in 35,039 assays against 1,948 targets representing 30 families, as well as on their impact on bioactivity. An interface was developed to provide easy access to these data, accessible at http://www.swissbioisostere.ch. The ChEMBL database was used as the source of bioactivity data. A modified version of the Hussain and Rea algorithm was implemented to identify Matched Molecular Pairs (MMPs) in the previously prepared data. 
Success scores were developed and integrated into the database to allow the proposed replacements to be re-ranked according to their previously observed outcomes. The correlation between these scores and the chemical similarity of the corresponding fragments was studied. A weaker correlation than expected was detected and analyzed. Different use cases for the database were considered and the corresponding functionalities implemented: replacement results are aggregated at the level of each assay, and it was shown that aggregation could also be performed at the target or target-class level, subject to case-by-case analysis. It was further found that the success of a replacement depends on the activity of compound A within an A-B pair. The probability of losing bioactivity following any molecular replacement was shown to be higher among the most active molecules than among molecules of lower activity. The potential existence of a bias linked to the article-based publication process could be refuted. In addition, frequent medicinal chemistry strategies for exploring structure-activity relationships were analyzed using the acquired data. Finally, data from pharmaceutical companies were compared with those reported in the literature. It was found that medicinal chemists in industry have access to replacements that are not available in the public domain. On the other hand, a large number of replacements frequently observed in industry data could also be identified in the literature data. Preferences for particular replacements differ between the two sources. The value of evaluating molecular replacements simultaneously against several parameters (e.g., bioactivity and metabolic stability) was also studied.
The studies carried out underlined that there appears to be no universal substructural replacement that always preserves bioactivity regardless of the biological context. A generalization of bioisosteric replacements therefore does not seem possible.
Resumo:
BACKGROUND: A 70-gene signature was previously shown to have prognostic value in patients with node-negative breast cancer. Our goal was to validate the signature in an independent group of patients. METHODS: Patients (n = 307, with 137 events after a median follow-up of 13.6 years) from five European centers were divided into high- and low-risk groups based on the gene signature classification and on clinical risk classifications. Patients were assigned to the gene signature low-risk group if their 5-year distant metastasis-free survival probability as estimated by the gene signature was greater than 90%. Patients were assigned to the clinicopathologic low-risk group if their 10-year survival probability, as estimated by Adjuvant! software, was greater than 88% (for estrogen receptor [ER]-positive patients) or 92% (for ER-negative patients). Hazard ratios (HRs) were estimated to compare time to distant metastases, disease-free survival, and overall survival in high- versus low-risk groups. RESULTS: The 70-gene signature outperformed the clinicopathologic risk assessment in predicting all endpoints. For time to distant metastases, the gene signature yielded HR = 2.32 (95% confidence interval [CI] = 1.35 to 4.00) without adjustment for clinical risk and hazard ratios ranging from 2.13 to 2.15 after adjustment for various estimates of clinical risk; clinicopathologic risk using Adjuvant! software yielded an unadjusted HR = 1.68 (95% CI = 0.92 to 3.07). For overall survival, the gene signature yielded an unadjusted HR = 2.79 (95% CI = 1.60 to 4.87) and adjusted hazard ratios ranging from 2.63 to 2.89; clinicopathologic risk yielded an unadjusted HR = 1.67 (95% CI = 0.93 to 2.98). For patients in the gene signature high-risk group, 10-year overall survival was 0.69 for patients in both the low- and high-clinical risk groups; for patients in the gene signature low-risk group, the 10-year survival rates were 0.88 and 0.89, respectively. 
CONCLUSIONS: The 70-gene signature adds independent prognostic information to clinicopathologic risk assessment for patients with early breast cancer.
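The two risk-group rules stated in the METHODS can be written out as a minimal sketch. Only the thresholds come from the abstract; the function and parameter names are hypothetical, and the survival probabilities are assumed to be supplied as inputs (by the gene signature and by the Adjuvant! software), since neither estimator is reproduced here.

```python
def gene_signature_low_risk(dmfs_5y: float) -> bool:
    """Low risk if the estimated 5-year distant metastasis-free
    survival probability is greater than 90%."""
    return dmfs_5y > 0.90


def clinicopathologic_low_risk(survival_10y: float, er_positive: bool) -> bool:
    """Low risk if the estimated 10-year survival probability is greater
    than 88% (ER-positive) or 92% (ER-negative)."""
    threshold = 0.88 if er_positive else 0.92
    return survival_10y > threshold


# Hypothetical patient: input values are illustrative, not study data.
patient = {"dmfs_5y": 0.93, "survival_10y": 0.90, "er_positive": False}
print("gene signature low risk:", gene_signature_low_risk(patient["dmfs_5y"]))
print("clinical low risk:", clinicopathologic_low_risk(
    patient["survival_10y"], patient["er_positive"]))
```

Keeping the two classifications as separate functions mirrors the study design, in which each patient receives both a gene-signature and a clinicopathologic risk label that can agree or disagree.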
Resumo:
BACKGROUND & AIMS: Standardized instruments are needed to assess the activity of eosinophilic esophagitis (EoE) and to provide end points for clinical trials and observational studies. We aimed to develop and validate a patient-reported outcome (PRO) instrument and score, based on items that could account for variations in patient assessments of disease severity. We also evaluated relationships between patient assessment of disease severity and EoE-associated endoscopic, histologic, and laboratory findings. METHODS: We collected information from 186 patients with EoE in Switzerland and the United States (69.4% male; median age, 43 y) via surveys (n = 135), focus groups (n = 27), and semistructured interviews (n = 24). Items were generated for the instruments to assess biologic activity based on physician input. Linear regression was used to quantify the extent to which variations in patient-reported disease characteristics could account for variations in patient assessment of EoE severity. The PRO instrument was used prospectively in 153 adult patients with EoE (72.5% male; median age, 38 y), and validated in an independent group of 120 patients with EoE (60.8% male; median age, 40.5 y). RESULTS: Seven PRO factors that are used to assess characteristics of dysphagia, behavioral adaptations to living with dysphagia, and pain while swallowing accounted for 67% of the variation in patient assessment of disease severity. Based on statistical consideration and patient input, a 7-day recall period was selected. Highly active EoE, based on endoscopic and histologic findings, was associated with an increase in patient-assessed disease severity. In the validation study, the mean difference between patient assessment of EoE severity (range, 0-10) and PRO score (range, 0-8.52) was 0.15. CONCLUSIONS: We developed and validated an EoE scoring system based on 7 PRO items that assess symptoms over a 7-day recall period. Clinicaltrials.gov number: NCT00939263.