135 results for Knowledge Discovery


Relevance:

60.00%

Publisher:

Abstract:

We present a new framework for large-scale data clustering. The main idea is to modify functional dimensionality reduction techniques to directly optimize over discrete labels using stochastic gradient descent. Compared to methods like spectral clustering, our approach solves a single optimization problem, rather than an ad hoc two-stage optimization approach, does not require a matrix inversion, can easily encode prior knowledge in the set of implementable functions, and does not have an "out-of-sample" problem. Experimental results on both artificial and real-world datasets show the usefulness of our approach.

Relevance:

60.00%

Publisher:

Relevance:

30.00%

Publisher:

Abstract:

Arenaviruses are a large group of emerging viruses including several causative agents of severe hemorrhagic fevers with high mortality in humans. Considering the number of people affected and the currently limited therapeutic options, novel efficacious therapeutics against arenaviruses are urgently needed. Over the past decade, significant advances in knowledge about the basic virology of arenaviruses have been accompanied by the development of novel therapeutics targeting different steps of the arenaviral life cycle. High-throughput, small-molecule screens identified potent and broadly active inhibitors of arenavirus entry that were instrumental for the dissection of unique features of arenavirus fusion. Novel inhibitors of arenavirus replication have been successfully tested in animal models and hold promise for application in humans. Late in the arenavirus life cycle, the proteolytic processing of the arenavirus envelope glycoprotein precursor and cellular factors critically involved in virion assembly and budding provide further promising 'druggable' targets for novel therapeutics to combat human arenavirus infection.

Relevance:

30.00%

Publisher:

Abstract:

In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication, with overall P-values ranging from 4.5 × 10⁻⁴ to 2.2 × 10⁻⁷. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
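The abstract above requires "association thresholds consistent with multiple testing" but does not say which correction was applied. As a minimal sketch, assuming a plain Bonferroni correction, the threshold is the family-wise error rate divided by the number of tests. The function names and the SNP labels and P-values in the usage note are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_hits(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return the labels whose P-value passes the Bonferroni threshold.

    Assumption: every entry in p_values counts as one independent test.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [label for label, p in p_values.items() if p <= threshold]
```

For example, `bonferroni_threshold(0.05, 1_000_000)` gives the conventional genome-wide significance level of 5 × 10⁻⁸, and `significant_hits({"snp_a": 1e-4, "snp_b": 0.03, "snp_c": 0.5})` keeps only `snp_a` because the threshold for three tests is about 0.0167. Restricting the analysis to a smaller, knowledge-driven set of candidate SNPs lowers the number of tests and therefore relaxes the per-test threshold.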

Relevance:

30.00%

Publisher:

Abstract:

Molecular shape has long been known to be an important property for the process of molecular recognition. Previous studies postulated the existence of a drug-like shape space that could be used to artificially bias the composition of screening libraries, with the aim of increasing the chance of success in Hit Identification. This work analysed to what extent this assumption holds true. Normalized Principal Moments of Inertia Ratios (NPRs) were used to describe the molecular shape of small molecules. It was investigated whether active molecules of diverse targets are located in preferred subspaces of the NPR shape space. The results showed significantly stronger clustering than could be expected by chance, with parts of the space unlikely to be occupied by active compounds. Furthermore, a strong enrichment of elongated, rather flat shapes was observed, while globular compounds were highly underrepresented. This was confirmed for a wide range of small-molecule datasets from different origins. Active compounds exhibited a high overlap in their shape distributions across different targets, making a purely shape-based discrimination very difficult. An additional perspective was provided by comparing the shapes of protein binding pockets with those of their respective ligands. Although more globular than their ligands, binding-site shapes were observed to exhibit a similarly skewed distribution in shape space: spherical shapes were highly underrepresented. This was different for unoccupied binding pockets of smaller size, which were instead found to possess a more globular shape. The relation between shape complementarity and exhibited bioactivity was analysed; a moderate correlation between bioactivity and parameters including pocket coverage, distance in shape space, and others could be identified, which reflects the importance of shape complementarity.
However, this also suggests that other aspects are of relevance for molecular recognition. A subsequent analysis assessed if and how shape and volume information retrieved from the pocket or its respective reference ligands could be used as a pre-filter in a virtual screening approach. In Lead Optimization, compounds need to be optimized with respect to a variety of parameters. Here, the availability of past success stories is very valuable, as they can guide medicinal chemists during their analogue synthesis plans. However, although of tremendous interest for the public domain, so far only large corporations had the ability to mine historical knowledge in their proprietary databases. With the aim of providing such information, the SwissBioisostere database was developed and released during this thesis. This database contains information on 21,293,355 performed substructural exchanges, corresponding to 5,586,462 unique replacements that have been measured in 35,039 assays against 1,948 molecular targets representing 30 target classes, and on their impact on bioactivity. A user-friendly interface was developed that provides facile access to these data and is accessible at http://www.swissbioisostere.ch. The ChEMBL database was used as the primary source of bioactivity information. Matched molecular pairs were identified in the extracted and cleaned data using a modified version of the Hussain and Rea algorithm. Success-based scores were developed and integrated into the database to allow re-ranking of proposed replacements by their past outcomes. It was analysed to what degree these scores correlate with the chemical similarity of the underlying fragments. An unexpectedly weak relationship was detected and further investigated.
Use cases of this database were envisioned, and functionalities implemented accordingly: replacement outcomes can be aggregated at the assay level, and it was shown that aggregation at the target or target-class level could also be performed, but should be accompanied by a careful case-by-case assessment. It was furthermore observed that replacement success depends on the activity of the starting compound A within a matched molecular pair A-B. The probability of losing bioactivity through any substructural exchange was significantly higher for highly potent compounds than for low-affinity binders. The potential existence of a publication bias could be refuted. Furthermore, frequently performed medicinal chemistry strategies for structure-activity-relationship exploration were analysed using the acquired data. Finally, data originating from pharmaceutical companies were compared with those reported in the literature. It could be seen that industrial medicinal chemistry can access replacement information not available in the public domain. On the other hand, a large number of replacements frequently performed within companies could also be identified in literature data. Preferences for particular replacements differed between these two sources. The value of combining different endpoints (e.g. bioactivity and metabolic stability) in an evaluation of molecular replacements was investigated. The studies performed furthermore highlighted that there seems to exist no universal substructural replacement that always retains bioactivity irrespective of the biological environment. A generalization of bioisosteric replacements therefore does not seem possible.
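The thesis abstract above describes molecular shape with Normalized Principal Moments of Inertia Ratios (NPRs) but gives no formula. A minimal sketch, assuming the usual definition NPR1 = I1/I3 and NPR2 = I2/I3 with the principal moments sorted as I1 ≤ I2 ≤ I3, is shown below. The function names are hypothetical, atoms are treated as point masses, and the eigenvalues come from the standard closed-form solution for symmetric 3×3 matrices so that the example needs only the standard library. It is not the implementation used in the thesis.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def inertia_tensor(coords: Sequence[Point], masses: Sequence[float]) -> List[List[float]]:
    """Inertia tensor of point masses, taken about their centre of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    t = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        x, y, z = (c[k] - com[k] for k in range(3))
        t[0][0] += m * (y * y + z * z)
        t[1][1] += m * (x * x + z * z)
        t[2][2] += m * (x * x + y * y)
        t[0][1] -= m * x * y
        t[0][2] -= m * x * z
        t[1][2] -= m * y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t


def symmetric_eigenvalues(a: List[List[float]]) -> List[float]:
    """Eigenvalues of a symmetric 3x3 matrix in ascending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest
    return [smallest, middle, largest]


def npr(coords: Sequence[Point], masses: Sequence[float]) -> Tuple[float, float]:
    """Return (NPR1, NPR2) = (I1/I3, I2/I3) for a set of point masses."""
    i1, i2, i3 = symmetric_eigenvalues(inertia_tensor(coords, masses))
    if i3 <= 0.0:
        raise ValueError("at least two distinct points are required")
    return i1 / i3, i2 / i3
```

Under this definition the shape space is a triangle: a perfect rod maps to approximately (0, 1), a flat disc to (0.5, 0.5), and a sphere to (1, 1). The "elongated, rather flat" shapes that the abstract reports as enriched would lie along the rod-disc edge, away from the sphere corner.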

Relevance:

20.00%

Publisher:

Abstract:

Gestures are the first forms of conventional communication that young children develop in order to intentionally convey a specific message. However, at first, infants rarely communicate successfully with their gestures, prompting caregivers to interpret them. Although the role of caregivers in early communication development has been examined, little is known about how caregivers attribute a specific communicative function to infants' gestures. In this study, we argue that caregivers rely on the knowledge about the referent that is shared with infants in order to interpret what communicative function infants wish to convey with their gestures. We videotaped interactions from six caregiver-infant dyads playing with toys when infants were 8, 10, 12, 14, and 16 months old. We coded infants' gesture production and we determined whether caregivers interpreted those gestures as conveying a clear communicative function or not; we also coded whether infants used objects according to their conventions of use as a measure of shared knowledge about the referent. Results revealed an association between infants' increasing knowledge of object use and maternal interpretations of infants' gestures as conveying a clear communicative function. Our findings emphasize the importance of shared knowledge in shaping infants' emergent communicative skills.

Relevance:

20.00%

Publisher:

Abstract:

Intrarenal neurotransmission implies the co-release of neuropeptides at the neuro-effector junction with direct influence on parameters of kidney function. The presence of an angiotensin (Ang) II-containing phenotype in catecholaminergic postganglionic and sensory fibers of the kidney, based on immunocytological investigations, has only recently been reported. These angiotensinergic fibers display a distinct morphology and intrarenal distribution, suggesting anatomical and functional subspecialization linked to neuronal Ang II-expression. This review discusses the present knowledge concerning these fibers, and their significance for renal physiology and the pathogenesis of hypertension in light of established mechanisms. The data suggest a new role of Ang II as a co-transmitter stimulating renal target cells or modulating nerve traffic from or to the kidney. Neuronal Ang II is likely to be an independent source of intrarenal Ang II. Further physiological experimentation will have to explore the role of the angiotensinergic renal innervation and integrate it into existing concepts.

Relevance:

20.00%

Publisher:

Abstract:

Recent developments in high magnetic field 13C magnetic resonance spectroscopy, with improved localization and shimming techniques, have led to important gains in sensitivity and spectral resolution of 13C in vivo spectra in the rodent brain, enabling the separation of several 13C isotopomers of glutamate and glutamine. In this context, the assumptions used in spectral quantification might have a significant impact on the determination of the 13C concentrations and the related metabolic fluxes. In this study, the time domain spectral quantification algorithm AMARES (advanced method for accurate, robust and efficient spectral fitting) was applied to 13C magnetic resonance spectroscopy spectra acquired in the rat brain at 9.4 T, following infusion of [1,6-13C2]glucose. Using both Monte Carlo simulations and in vivo data, this work aimed to (1) validate the quantification of in vivo 13C isotopomers using AMARES; (2) assess the impact of the prior knowledge on the quantification of in vivo 13C isotopomers using AMARES; and (3) compare AMARES and LCModel (linear combination of model spectra) for the quantification of in vivo 13C spectra. AMARES yielded accurate and reliable 13C spectral quantification, similar to that obtained using LCModel, when the frequency shifts, J-coupling constants and phase patterns of the different 13C isotopomers were included as prior knowledge in the analysis.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this study is to perform a thorough comparison of quantitative susceptibility mapping (QSM) techniques and their dependence on the assumptions made. The compared methodologies were: two iterative single orientation methodologies minimizing the l2, l1TV norm of the prior knowledge of the edges of the object, one over-determined multiple orientation method (COSMOS) and a newly proposed modulated closed-form solution (MCF). The performance of these methods was compared using a numerical phantom and in vivo high-resolution (0.65 mm isotropic) brain data acquired at 7 T using a new coil combination method. For all QSM methods, the relevant regularization and prior-knowledge parameters were systematically changed in order to evaluate the optimal reconstruction in the presence and absence of a ground truth. Additionally, the QSM contrast was compared to conventional gradient recalled echo (GRE) magnitude and R2* maps obtained from the same dataset. The QSM reconstruction results of the single orientation methods show comparable performance. The MCF method has the highest correlation (corrMCF = 0.95, r²MCF = 0.97) with the state-of-the-art method (COSMOS), with the additional advantage of an extremely fast computation time. The L-curve method gave the visually most satisfactory balance between reduction of streaking artifacts and over-regularization, with the latter being overemphasized when using the COSMOS susceptibility maps as ground truth. R2* and susceptibility maps, when calculated from the same datasets, although based on distinct features of the data, have a comparable ability to distinguish deep gray matter structures.

Relevance:

20.00%

Publisher:

Abstract:

Coastal primary rainforests have suffered damage in Côte d'Ivoire as a result of a lack of protection and urban pressures. Consequently, the highly endemic and critically endangered Wimmer's shrew, Crocidura wimmeri, known only from its type locality, Adiopodoumé, near Abidjan, was considered to have been extinct since 1976. Shrew species assignment is often problematic because of strong phenotypic similarities among many species. The phylogenetic position of C. wimmeri within the African Crocidura species should thus be clarified. In light of its recent rediscovery in the nearby small Banco National Park (34 km²), we investigated the genetic identity of seven specimens of C. wimmeri, based on 1020 bp of the mitochondrial DNA cytochrome b gene compared to other species sampled in the same region and published sequences from GenBank. Crocidura wimmeri formed a well-defined clade, the most closely related species being Crocidura sp., an as yet unknown species from the Taï and Ziama forests, at a distance of 9.3%. These results thus confirmed the validity of this species. This genetic characterization not only contributes to our knowledge of the evolution of West African shrews, but also may help in the discovery of additional populations of this critically endangered species.
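The abstract above reports a genetic distance of 9.3% over 1020 bp of cytochrome b without naming the distance model. As a minimal sketch, assuming the simplest option, an uncorrected pairwise distance (the proportion of differing sites between two aligned sequences), the calculation could look like this. The function name is hypothetical and the study may well have used a corrected model instead.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

Under this assumption, a distance of 9.3% over 1020 fully comparable sites would correspond to roughly 95 differing positions.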