76 resultados para in-domain data requirement
em Université de Lausanne, Switzerland
Resumo:
Analyzing functional data often leads to finding common factors, for which functional principal component analysis proves to be a useful tool to summarize and characterize the random variation in a function space. The representation in terms of eigenfunctions is optimal in the sense of L-2 approximation. However, the eigenfunctions are not always directed towards an interesting and interpretable direction in the context of functional data and thus could obscure the underlying structure. To overcome such difficulty, an alternative to functional principal component analysis is proposed that produces directed components which may be more informative and easier to interpret. These structural components are similar to principal components, but are adapted to situations in which the domain of the function may be decomposed into disjoint intervals such that there is effectively independence between intervals and positive correlation within intervals. The approach is demonstrated with synthetic examples as well as real data. Properties for special cases are also studied.
Resumo:
The use of Geographic Information Systems has revolutionalized the handling and the visualization of geo-referenced data and has underlined the critic role of spatial analysis. The usual tools for such a purpose are geostatistics which are widely used in Earth science. Geostatistics are based upon several hypothesis which are not always verified in practice. On the other hand, Artificial Neural Network (ANN) a priori can be used without special assumptions and are known to be flexible. This paper proposes to discuss the application of ANN in the case of the interpolation of a geo-referenced variable.
Resumo:
BACKGROUND: Greater tobacco smoking and alcohol consumption and lower body mass index (BMI) increase odds ratios (OR) for oral cavity, oropharyngeal, hypopharyngeal, and laryngeal cancers; however, there are no comprehensive sex-specific comparisons of ORs for these factors. METHODS: We analyzed 2,441 oral cavity (925 women and 1,516 men), 2,297 oropharynx (564 women and 1,733 men), 508 hypopharynx (96 women and 412 men), and 1,740 larynx (237 women and 1,503 men) cases from the INHANCE consortium of 15 head and neck cancer case-control studies. Controls numbered from 7,604 to 13,829 subjects, depending on analysis. Analyses fitted linear-exponential excess ORs models. RESULTS: ORs were increased in underweight (<18.5 BMI) relative to normal weight (18.5-24.9) and reduced in overweight and obese categories (>/=25 BMI) for all sites and were homogeneous by sex. ORs by smoking and drinking in women compared with men were significantly greater for oropharyngeal cancer (p < 0.01 for both factors), suggestive for hypopharyngeal cancer (p = 0.05 and p = 0.06, respectively), but homogeneous for oral cavity (p = 0.56 and p = 0.64) and laryngeal (p = 0.18 and p = 0.72) cancers. CONCLUSIONS: The extent that OR modifications of smoking and drinking by sex for oropharyngeal and, possibly, hypopharyngeal cancers represent true associations, or derive from unmeasured confounders or unobserved sex-related disease subtypes (e.g., human papillomavirus-positive oropharyngeal cancer) remains to be clarified.
Resumo:
Despite their limited proliferation capacity, regulatory T cells (T(regs)) constitute a population maintained over the entire lifetime of a human organism. The means by which T(regs) sustain a stable pool in vivo are controversial. Using a mathematical model, we address this issue by evaluating several biological scenarios of the origins and the proliferation capacity of two subsets of T(regs): precursor CD4(+)CD25(+)CD45RO(-) and mature CD4(+)CD25(+)CD45RO(+) cells. The lifelong dynamics of T(regs) are described by a set of ordinary differential equations, driven by a stochastic process representing the major immune reactions involving these cells. The model dynamics are validated using data from human donors of different ages. Analysis of the data led to the identification of two properties of the dynamics: (1) the equilibrium in the CD4(+)CD25(+)FoxP3(+)T(regs) population is maintained over both precursor and mature T(regs) pools together, and (2) the ratio between precursor and mature T(regs) is inverted in the early years of adulthood. Then, using the model, we identified three biologically relevant scenarios that have the above properties: (1) the unique source of mature T(regs) is the antigen-driven differentiation of precursors that acquire the mature profile in the periphery and the proliferation of T(regs) is essential for the development and the maintenance of the pool; there exist other sources of mature T(regs), such as (2) a homeostatic density-dependent regulation or (3) thymus- or effector-derived T(regs), and in both cases, antigen-induced proliferation is not necessary for the development of a stable pool of T(regs). This is the first time that a mathematical model built to describe the in vivo dynamics of regulatory T cells is validated using human data. The application of this model provides an invaluable tool in estimating the amount of regulatory T cells as a function of time in the blood of patients that received a solid organ transplant or are suffering from an autoimmune disease.
Resumo:
OBJECTIVE: As part of the WHO ICD-11 development initiative, the Topic Advisory Group on Quality and Safety explores meta-features of morbidity data sets, such as the optimal number of secondary diagnosis fields. DESIGN: The Health Care Quality Indicators Project of the Organization for Economic Co-Operation and Development collected Patient Safety Indicator (PSI) information from administrative hospital data of 19-20 countries in 2009 and 2011. We investigated whether three countries that expanded their data systems to include more secondary diagnosis fields showed increased PSI rates compared with six countries that did not. Furthermore, administrative hospital data from six of these countries and two American states, California (2011) and Florida (2010), were analysed for distributions of coded patient safety events across diagnosis fields. RESULTS: Among the participating countries, increasing the number of diagnosis fields was not associated with any overall increase in PSI rates. However, high proportions of PSI-related diagnoses appeared beyond the sixth secondary diagnosis field. The distribution of three PSI-related ICD codes was similar in California and Florida: 89-90% of central venous catheter infections and 97-99% of retained foreign bodies and accidental punctures or lacerations were captured within 15 secondary diagnosis fields. CONCLUSIONS: Six to nine secondary diagnosis fields are inadequate for comparing complication rates using hospital administrative data; at least 15 (and perhaps more with ICD-11) are recommended to fully characterize clinical outcomes. Increasing the number of fields should improve the international and intra-national comparability of data for epidemiologic and health services research, utilization analyses and quality of care assessment.
Resumo:
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method that yet requires substantial computing power.
Resumo:
The transmembrane protein HER2 is over-expressed in approximately 15% of invasive breast cancers as a result of HER2 gene amplification. HER2 proteolytic cleavage (HER2 shedding) generates soluble truncated HER2 molecules that include only the extracellular domain and the concentration of which can be measured in the serum fraction of blood. HER2 shedding also generates a constitutively active truncated intracellular receptor of 95kDa (p95(HER2)). Another soluble truncated HER2 protein (Herstatin), which can also be found in serum, is the product of an alternatively spliced HER2 transcript. Recent preclinical findings may provide crucial insights into the biological and clinical relevance of increased sHER2 concentrations for the outcome of HER2-positive breast cancer and sensitivity to trastuzumab and lapatinib treatment. We present here the most recent findings about the role and biology of sHER2 based on data obtained using a standardized test, which has been cleared by FDA in 2000, for measuring sHER2. This test includes quality control assessments and has been already widely used to evaluate the clinical utility of sHER2 as a biomarker in breast cancer. We will describe in detail data concerning the assessment of sHER2 as a surrogate maker to optimize the evaluation of the HER2 status of a primary tumor and as a prognosis and predictive marker of response to therapies, both in early and metastatic breast cancer.
Resumo:
The recent developments in high magnetic field 13C magnetic resonance spectroscopy with improved localization and shimming techniques have led to important gains in sensitivity and spectral resolution of 13C in vivo spectra in the rodent brain, enabling the separation of several 13C isotopomers of glutamate and glutamine. In this context, the assumptions used in spectral quantification might have a significant impact on the determination of the 13C concentrations and the related metabolic fluxes. In this study, the time domain spectral quantification algorithm AMARES (advanced method for accurate, robust and efficient spectral fitting) was applied to 13 C magnetic resonance spectroscopy spectra acquired in the rat brain at 9.4 T, following infusion of [1,6-(13)C2 ] glucose. Using both Monte Carlo simulations and in vivo data, the goal of this work was: (1) to validate the quantification of in vivo 13C isotopomers using AMARES; (2) to assess the impact of the prior knowledge on the quantification of in vivo 13C isotopomers using AMARES; (3) to compare AMARES and LCModel (linear combination of model spectra) for the quantification of in vivo 13C spectra. AMARES led to accurate and reliable 13C spectral quantification similar to those obtained using LCModel, when the frequency shifts, J-coupling constants and phase patterns of the different 13C isotopomers were included as prior knowledge in the analysis.
Resumo:
Myosin V motors are believed to contribute to cell polarization by carrying cargoes along actin tracks. In Schizosaccharomyces pombe, Myosin Vs transport secretory vesicles along actin cables, which are dynamic actin bundles assembled by the formin For3 at cell poles. How these flexible structures are able to extend longitudinally in the cell through the dense cytoplasm is unknown. Here we show that in myosin V (myo52 myo51) null cells, actin cables are curled, bundled, and fail to extend into the cell interior. They also exhibit reduced retrograde flow, suggesting that formin-mediated actin assembly is impaired. Myo52 may contribute to actin cable organization by delivering actin regulators to cell poles, as myoV defects are partially suppressed by diverting cargoes toward cell tips onto microtubules with a kinesin 7-Myo52 tail chimera. In addition, Myo52 motor activity may pull on cables to provide the tension necessary for their extension and efficient assembly, as artificially tethering actin cables to the nuclear envelope via a Myo52 motor domain restores actin cable extension and retrograde flow in myoV mutants. Together these in vivo data reveal elements of a self-organizing system in which the motors shape their own tracks by transporting cargoes and exerting physical pulling forces.
Resumo:
Validated in vitro methods for skin corrosion and irritation were adopted by the OECD and by the European Union during the last decade. In the EU, Switzerland and countries adopting the EU legislation, these assays may allow the full replacement of animal testing for identifying and classifying compounds as skin corrosives, skin irritants, and non irritants. In order to develop harmonised recommendations on the use of in vitro data for regulatory assessment purposes within the European framework, a workshop was organized by the Swiss Federal Office of Public Health together with ECVAM and the BfR. It comprised stakeholders from various European countries involved in the process from in vitro testing to the regulatory assessment of in vitro data. Discussions addressed the following questions: (1) the information requirements considered useful for regulatory assessment; (2) the applicability of in vitro skin corrosion data to assign the corrosive subcategories as implemented by the EU Classification, Labelling and Packaging Regulation; (3) the applicability of testing strategies for determining skin corrosion and irritation hazards; and (4) the applicability of the adopted in vitro assays to test mixtures, preparations and dilutions. Overall, a number of agreements and recommendations were achieved in order to clarify and facilitate the assessment and use of in vitro data from regulatory accepted methods, and ultimately help regulators and scientists facing with the new in vitro approaches to evaluate skin irritation and corrosion hazards and risks without animal data.
Resumo:
With advances in the effectiveness of treatment and disease management, the contribution of chronic comorbid diseases (comorbidities) found within the Charlson comorbidity index to mortality is likely to have changed since development of the index in 1984. The authors reevaluated the Charlson index and reassigned weights to each condition by identifying and following patients to observe mortality within 1 year after hospital discharge. They applied the updated index and weights to hospital discharge data from 6 countries and tested for their ability to predict in-hospital mortality. Compared with the original Charlson weights, weights generated from the Calgary, Alberta, Canada, data (2004) were 0 for 5 comorbidities, decreased for 3 comorbidities, increased for 4 comorbidities, and did not change for 5 comorbidities. The C statistics for discriminating in-hospital mortality between the new score generated from the 12 comorbidities and the Charlson score were 0.825 (new) and 0.808 (old), respectively, in Australian data (2008), 0.828 and 0.825 in Canadian data (2008), 0.878 and 0.882 in French data (2004), 0.727 and 0.723 in Japanese data (2008), 0.831 and 0.836 in New Zealand data (2008), and 0.869 and 0.876 in Swiss data (2008). The updated index of 12 comorbidities showed good-to-excellent discrimination in predicting in-hospital mortality in data from 6 countries and may be more appropriate for use with more recent administrative data.
Resumo:
Molecular shape has long been known to be an important property for the process of molecular recognition. Previous studies postulated the existence of a drug-like shape space that could be used to artificially bias the composition of screening libraries, with the aim to increase the chance of success in Hit Identification. In this work, it was analysed to which extend this assumption holds true. Normalized Principal Moments of Inertia Ratios (NPRs) have been used to describe the molecular shape of small molecules. It was investigated, whether active molecules of diverse targets are located in preferred subspaces of the NPR shape space. Results illustrated a significantly stronger clustering than could be expected by chance, with parts of the space unlikely to be occupied by active compounds. Furthermore, a strong enrichment of elongated, rather flat shapes could be observed, while globular compounds were highly underrepresented. This was confirmed for a wide range of small molecule datasets from different origins. Active compounds exhibited a high overlap in their shape distributions across different targets, making a purely shape based discrimination very difficult. An additional perspective was provided by comparing the shapes of protein binding pockets with those of their respective ligands. Although more globular than their ligands, it was observed that binding sites shapes exhibited a similarly skewed distribution in shape space: spherical shapes were highly underrepresented. This was different for unoccupied binding pockets of smaller size. These were on the contrary identified to possess a more globular shape. The relation between shape complementarity and exhibited bioactivity was analysed; a moderate correlation between bioactivity and parameters including pocket coverage, distance in shape space, and others could be identified, which reflects the importance of shape complementarity. However, this also suggests that other aspects are of relevance for molecular recognition. A subsequent analysis assessed if and how shape and volume information retrieved from pocket or respective reference ligands could be used as a pre-filter in a virtual screening approach. ln Lead Optimization compounds need to get optimized with respect to a variety of pararneters. Here, the availability of past success stories is very valuable, as they can guide medicinal chemists during their analogue synthesis plans. However, although of tremendous interest for the public domain, so far only large corporations had the ability to mine historical knowledge in their proprietary databases. With the aim to provide such information, the SwissBioisostere database was developed and released during this thesis. This database contains information on 21,293,355 performed substructural exchanges, corresponding to 5,586,462 unique replacements that have been measured in 35,039 assays against 1,948 molecular targets representing 30 target classes, and on their impact on bioactivity . A user-friendly interface was developed that provides facile access to these data and is accessible at http//www.swissbioisostere.ch. The ChEMBL database was used as primary data source of bioactivity information. Matched molecular pairs have been identified in the extracted and cleaned data. Success-based scores were developed and integrated into the database to allow re-ranking of proposed replacements by their past outcomes. It was analysed to which degree these scores correlate with chemical similarity of the underlying fragments. An unexpectedly weak relationship was detected and further investigated. Use cases of this database were envisioned, and functionalities implemented accordingly: replacement outcomes are aggregatable at the assay level, and it was shawn that an aggregation at the target or target class level could also be performed, but should be accompanied by a careful case-by-case assessment. It was furthermore observed that replacement success depends on the activity of the starting compound A within a matched molecular pair A-B. With increasing potency the probability to lose bioactivity through any substructural exchange was significantly higher than in low affine binders. A potential existence of a publication bias could be refuted. Furthermore, often performed medicinal chemistry strategies for structure-activity-relationship exploration were analysed using the acquired data. Finally, data originating from pharmaceutical companies were compared with those reported in the literature. It could be seen that industrial medicinal chemistry can access replacement information not available in the public domain. In contrast, a large amount of often-performed replacements within companies could also be identified in literature data. Preferences for particular replacements differed between these two sources. The value of combining different endpoints in an evaluation of molecular replacements was investigated. The performed studies highlighted furthermore that there seem to exist no universal substructural replacement that always retains bioactivity irrespective of the biological environment. A generalization of bioisosteric replacements seems therefore not possible. - La forme tridimensionnelle des molécules a depuis longtemps été reconnue comme une propriété importante pour le processus de reconnaissance moléculaire. Des études antérieures ont postulé que les médicaments occupent préférentiellement un sous-ensemble de l'espace des formes des molécules. Ce sous-ensemble pourrait être utilisé pour biaiser la composition de chimiothèques à cribler, dans le but d'augmenter les chances d'identifier des Hits. L'analyse et la validation de cette assertion fait l'objet de cette première partie. Les Ratios de Moments Principaux d'Inertie Normalisés (RPN) ont été utilisés pour décrire la forme tridimensionnelle de petites molécules de type médicament. Il a été étudié si les molécules actives sur des cibles différentes se co-localisaient dans des sous-espaces privilégiés de l'espace des formes. Les résultats montrent des regroupements de molécules incompatibles avec une répartition aléatoire, avec certaines parties de l'espace peu susceptibles d'être occupées par des composés actifs. Par ailleurs, un fort enrichissement en formes allongées et plutôt plates a pu être observé, tandis que les composés globulaires étaient fortement sous-représentés. Cela a été confirmé pour un large ensemble de compilations de molécules d'origines différentes. Les distributions de forme des molécules actives sur des cibles différentes se recoupent largement, rendant une discrimination fondée uniquement sur la forme très difficile. Une perspective supplémentaire a été ajoutée par la comparaison des formes des ligands avec celles de leurs sites de liaison (poches) dans leurs protéines respectives. Bien que plus globulaires que leurs ligands, il a été observé que les formes des poches présentent une distribution dans l'espace des formes avec le même type d'asymétrie que celle observée pour les ligands: les formes sphériques sont fortement sous représentées. Un résultat différent a été obtenu pour les poches de plus petite taille et cristallisées sans ligand: elles possédaient une forme plus globulaire. La relation entre complémentarité de forme et bioactivité a été également analysée; une corrélation modérée entre bioactivité et des paramètres tels que remplissage de poche, distance dans l'espace des formes, ainsi que d'autres, a pu être identifiée. Ceci reflète l'importance de la complémentarité des formes, mais aussi l'implication d'autres facteurs. Une analyse ultérieure a évalué si et comment la forme et le volume d'une poche ou de ses ligands de référence pouvaient être utilisés comme un pré-filtre dans une approche de criblage virtuel. Durant l'optimisation d'un Lead, de nombreux paramètres doivent être optimisés simultanément. Dans ce contexte, la disponibilité d'exemples d'optimisations réussies est précieuse, car ils peuvent orienter les chimistes médicinaux dans leurs plans de synthèse par analogie. Cependant, bien que d'un extrême intérêt pour les chercheurs dans le domaine public, seules les grandes sociétés pharmaceutiques avaient jusqu'à présent la capacité d'exploiter de telles connaissances au sein de leurs bases de données internes. Dans le but de remédier à cette limitation, la base de données SwissBioisostere a été élaborée et publiée dans le domaine public au cours de cette thèse. Cette base de données contient des informations sur 21 293 355 échanges sous-structuraux observés, correspondant à 5 586 462 remplacements uniques mesurés dans 35 039 tests contre 1948 cibles représentant 30 familles, ainsi que sur leur impact sur la bioactivité. Une interface a été développée pour permettre un accès facile à ces données, accessible à http:/ /www.swissbioisostere.ch. La base de données ChEMBL a été utilisée comme source de données de bioactivité. Une version modifiée de l'algorithme de Hussain et Rea a été implémentée pour identifier les Matched Molecular Pairs (MMP) dans les données préparées au préalable. Des scores de succès ont été développés et intégrés dans la base de données pour permettre un reclassement des remplacements proposés selon leurs résultats précédemment observés. La corrélation entre ces scores et la similarité chimique des fragments correspondants a été étudiée. Une corrélation plus faible qu'attendue a été détectée et analysée. Différents cas d'utilisation de cette base de données ont été envisagés, et les fonctionnalités correspondantes implémentées: l'agrégation des résultats de remplacement est effectuée au niveau de chaque test, et il a été montré qu'elle pourrait également être effectuée au niveau de la cible ou de la classe de cible, sous réserve d'une analyse au cas par cas. Il a en outre été constaté que le succès d'un remplacement dépend de l'activité du composé A au sein d'une paire A-B. Il a été montré que la probabilité de perdre la bioactivité à la suite d'un remplacement moléculaire quelconque est plus importante au sein des molécules les plus actives que chez les molécules de plus faible activité. L'existence potentielle d'un biais lié au processus de publication par articles a pu être réfutée. En outre, les stratégies fréquentes de chimie médicinale pour l'exploration des relations structure-activité ont été analysées à l'aide des données acquises. Enfin, les données provenant des compagnies pharmaceutiques ont été comparées à celles reportées dans la littérature. Il a pu être constaté que les chimistes médicinaux dans l'industrie peuvent accéder à des remplacements qui ne sont pas disponibles dans le domaine public. Par contre, un grand nombre de remplacements fréquemment observés dans les données de l'industrie ont également pu être identifiés dans les données de la littérature. Les préférences pour certains remplacements particuliers diffèrent entre ces deux sources. L'intérêt d'évaluer les remplacements moléculaires simultanément selon plusieurs paramètres (bioactivité et stabilité métabolique par ex.) a aussi été étudié. Les études réalisées ont souligné qu'il semble n'exister aucun remplacement sous-structural universel qui conserve toujours la bioactivité quel que soit le contexte biologique. Une généralisation des remplacements bioisostériques ne semble donc pas possible.