899 results for Information Filtering, Pattern Mining, Relevance Feature Discovery, Text Mining


Relevance:

40.00% 40.00%

Publisher:

Abstract:

The synthesis and biological evaluation of novel 1-aryl-3-[2-, 3- or 4-(thieno[3,2-b]pyridin-7-ylthio)phenyl]ureas 3, 4 and 5 as VEGFR-2 tyrosine kinase inhibitors are reported. The 1-aryl-3-[3-(thieno[3,2-b]pyridin-7-ylthio)phenyl]ureas 4a-4h, with the arylurea in the meta position to the thioether, showed the lowest IC50 values in enzymatic assays (10-206 nM); the most potent compounds, 4d-4h (IC50 10-28 nM), bear hydrophobic groups (Me, F, CF3 and Cl) on the terminal phenyl ring. A convincing rationalization was achieved for the most potent compounds 4 as type II VEGFR-2 inhibitors, based on the simultaneous presence of: (1) the thioether linker and (2) the arylurea moiety in the meta position. For compounds 4, significant inhibition of Human Umbilical Vein Endothelial Cell (HUVEC) proliferation (BrdU assay), migration (wound-healing assay) and tube formation was observed at low concentrations. These compounds were also shown to increase apoptosis in the TUNEL assay. Immunostaining for total and phosphorylated (active) VEGFR-2 was performed by Western blotting. The phosphorylation of the receptor was significantly inhibited at 1.0 and 2.5 µM for the most promising compounds. Altogether, these findings point to an antiangiogenic effect in HUVECs.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The Smart Drug Search is publicly accessible at http://sing.ei.uvigo.es/sds/. The BIOMedical Search Engine Framework is freely available for non-commercial use at https://github.com/agjacome/biomsef

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Storage prices are falling steadily, so it is no surprise that companies accumulate and collect enormous amounts of data. These immense volumes of data must, however, be analyzed with suitable methods in order to identify patterns that are vital to the company's survival. Such patterns can represent problems but also opportunities. In either case, it is of the utmost importance to discover these patterns in time so as to be able to react promptly. To appeal to a broad range of users, analysis methods must furthermore be easy to use, provide immediate feedback, and offer intuitive visualizations. In this thesis I propose methods for visualizing and filtering association rules based on their changes over time. I will use linguistic terms (modeled by fuzzy sets) to characterize the histories of rule evaluation measures and thereby generate an ordering of relevant rules. Furthermore, I will transfer the proposed methods to other model types, present the software platform that makes the analysis methods accessible to the user, and finally present empirical evaluations on real-world data from industry collaborations that demonstrate the effectiveness of my proposals.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A detailed knowledge of the distribution patterns of schistosome intermediate hosts, their population dynamics, and the factors affecting these patterns will provide useful information about the possibility and desirability of conducting snail control measures in various transmission situations. On the basis of various case studies, the association between the occurrence of human water contacts and the presence of schistosome intermediate hosts, or of infections in those hosts, is illustrated. Other parameters affecting snail distribution patterns and density fluctuations are discussed. It is concluded that ecological studies on the intermediate host are extremely relevant, either to optimally apply existing control measures or to develop alternative measures of snail control, such as ecological or biological control.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The development of palaeoparasitology in Japan has occurred in recent decades. Despite the fact that archaeology in Japan has been slow to develop techniques for excavating ancient toilets, important information about the development of sanitation has been derived from the analysis of a few sites. This shows that the earliest people had very simple methods of sanitation. As populations increased, sanitation became more complex. Ditches surrounding early towns were used for excrement disposal. Eventually distinct toilets were developed followed by cesspit type toilets and flushing toilets. The parasites recovered from these toilets include many species that infect humans today. These parasite spectra reflect local use of aquatic, marine, and land animals. Fecal borne disease was an increasing problem as represented by whipworm and ascarid roundworm eggs. Interestingly, ascarid roundworms were absent in the earliest cultures and only became common with rice agriculture. Finds of pollen and seeds in toilet sediments reveal the use of medicinal plants to control the emerging problem of parasites.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

To determine the features of the papers, authors, and citations of eleven journals in tropical medicine indexed by Science Citation Index Expanded, the database of the Institute for Scientific Information, we analyzed original articles, editorials, reviews, corrections, letters, biographies, and news published in these journals. The results show that these journals covered 107 countries or regions on six continents. The average number of references was 23.05, with 87.89% of the references from periodicals. The Price Index was 31.43% and the self-citing rate was 7.02%. The references in the first 20 journals ranked by number of citations accounted for 36.71% of the total citations. Brazil, the United States, India, and England are more advanced in tropical medicine research. The conclusion is that these journals covered most research done in these countries or regions. Most research was done through cooperation among researchers, but many of the publications cited outdated articles and should include newer information.
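The Price Index mentioned above is commonly defined as the percentage of a paper's references that are no more than five years old at the time of citation. The sketch below shows that calculation under this assumed definition; the function name `price_index`, the five-year window, and the sample reference years are illustrative and do not come from the study.

```python
def price_index(reference_years, citing_year, window=5):
    """Percentage of references published within `window` years of the citing paper.

    Assumption: a reference counts as recent when its age
    (citing_year - reference_year) is at most `window` years.
    """
    if not reference_years:
        raise ValueError("at least one reference is required")
    recent = sum(1 for year in reference_years if citing_year - year <= window)
    return 100 * recent / len(reference_years)


# Invented example: five references cited by a paper published in 2010.
sample_years = [2009, 2008, 2003, 1999, 2007]
sample_index = price_index(sample_years, citing_year=2010)
```

Under this definition, a Price Index of 31.43% would mean that slightly fewer than one in three references fell inside the five-year window.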

Relevance:

40.00% 40.00%

Publisher:

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
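The topic-discovery step described above can be illustrated with a small pLSA fit by expectation-maximization. This is a minimal sketch, not the authors' implementation: the count matrix is invented toy data standing in for bag of visual words histograms, the names `plsa` and `normalize` are hypothetical, and the classifier stage is omitted. In the approach described, each row of P(z|d) would serve as the feature vector passed to a k-nearest neighbor or SVM classifier.

```python
import random


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a document-by-word count matrix (list of lists).

    Returns (p_z_d, p_w_z): per-document topic mixtures P(z|d)
    and per-topic word distributions P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization of both distributions.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: responsibility of each topic for word w in document d.
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_z_d[d][z] += counts[d][w] * post[z]
                    new_w_z[z][w] += counts[d][w] * post[z]
        # M-step: renormalize expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


# Invented toy data: 4 images described by counts over 4 visual words.
visual_word_counts = [
    [8, 6, 1, 0],
    [7, 9, 0, 1],
    [1, 0, 9, 7],
    [0, 1, 6, 8],
]
topic_mix, topic_words = plsa(visual_word_counts, n_topics=2)
```

The design point the abstract tests is that `topic_mix` has only as many columns as there are topics, so the classifier works in a much lower-dimensional space than the raw visual-word histogram.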

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Human T-cell lymphotropic virus type 1 (HTLV-1) is mainly associated with two diseases: tropical spastic paraparesis/HTLV-1-associated myelopathy (TSP/HAM) and adult T-cell leukaemia/lymphoma. This retrovirus infects 5-10 million individuals throughout the world. Previously, we developed a database that annotates sequence data from GenBank, and the present study aimed to describe the clinical, molecular and epidemiological scenarios of HTLV-1 infection through the sequences stored in this database. A total of 2,545 registered complete and partial sequences of HTLV-1 were collected, and 1,967 (77.3%) of those sequences represented unique isolates. Among these isolates, 93% contained geographic origin information and only 39% were related to any clinical status. A total of 1,091 sequences contained information about the geographic origin and viral subtype, and 93% of these sequences were identified as subtype “a”. Ethnicity data are very scarce. Regarding clinical status data, 29% of the sequences were generated from TSP/HAM and 67.8% from healthy carrier individuals. Although the data mining enabled some inferences about specific aspects of HTLV-1 infection to be made, the relative scarcity of data for the available sequences made it impossible to delineate a global scenario of HTLV-1 infection.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data, which frequently requires cross-platform studies. Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles. Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments. Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data. Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model. Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.
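The cross-platform probe mapping described above amounts to joining two flat annotation tables on a shared gene identifier. The following is a minimal sketch of that idea in Python rather than R, and it does not reproduce the annotationTools API: the column names `probe_id` and `gene_id`, the helper functions, and all table contents are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented tab-delimited annotation tables for two hypothetical platforms.
PLATFORM_A = "probe_id\tgene_id\nA_001\t101\nA_002\t102\nA_003\t\n"
PLATFORM_B = "probe_id\tgene_id\nB_900\t102\nB_901\t101\nB_902\t101\n"


def read_annotation(text):
    """Parse a flat annotation table into {probe_id: gene_id}.

    Probes without a gene identifier are skipped.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["probe_id"]: row["gene_id"] for row in reader if row["gene_id"]}


def map_probes(source, target):
    """Map each source probe to the target probes annotated with the same gene."""
    probes_by_gene = defaultdict(list)
    for probe, gene in target.items():
        probes_by_gene[gene].append(probe)
    return {probe: sorted(probes_by_gene.get(gene, []))
            for probe, gene in source.items()}


mapping = map_probes(read_annotation(PLATFORM_A), read_annotation(PLATFORM_B))
```

A cross-species mapping would follow the same pattern with one extra lookup: translating each source gene identifier through an orthology table before matching it against the target platform.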

Relevance:

40.00% 40.00%

Publisher:

Abstract:

We diagnosed a non-small cell lung carcinoma in a 49-year-old female patient with the histopathological diagnosis of stage IIIB mixed bronchioloalveolar and papillary adenocarcinoma with extensive micropapillary feature, which was not visualized on the preoperative multimodality imaging with positron emission tomography (PET) and computed tomography (CT). The micropapillary component characterized by a unique growth pattern with particular morphological features can be observed in all subtypes of lung adenocarcinoma. Micropapillary component is increasingly recognized as a distinct entity associated with higher aggressiveness. Even the most modern multimodality PET/CT imaging technology may fail to adequately visualize this important component with highly relevant prognostic implications. Thus, the pathologist needs to consciously look for a micropapillary component in the surgical specimen or in preoperative biopsies or cytology. This may have potential future treatment implications, as adjuvant or neoadjuvant chemotherapy may be of relevance, even in the early stages of the disease.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Background: To enhance our understanding of complex biological systems like diseases, we need to put all of the available data into context and use this to detect relations, patterns and rules which allow predictive hypotheses to be defined. Life science has become a data-rich science with information about the behaviour of millions of entities like genes, chemical compounds, diseases, cell types and organs, which are organised in many different databases and/or spread throughout the literature. Existing knowledge such as genotype-phenotype relations or signal transduction pathways must be semantically integrated and dynamically organised into structured networks that are connected with clinical and experimental data. Different approaches to this challenge exist, but so far none has proven entirely satisfactory. Results: To address this challenge we previously developed a generic knowledge management framework, BioXM™, which allows the dynamic, graphic generation of domain-specific knowledge representation models based on specific objects and their relations, supporting annotations and ontologies. Here we demonstrate the utility of BioXM for knowledge management in systems biology as part of the EU FP6 BioBridge project on translational approaches to chronic diseases. From clinical and experimental data, text-mining results and public databases we generate a chronic obstructive pulmonary disease (COPD) knowledge base and demonstrate its use by mining specific molecular networks together with integrated clinical and experimental data. Conclusions: We generate the first semantically integrated COPD-specific public knowledge base and find that, for the integration of clinical and experimental data with pre-existing knowledge, the configuration-based set-up enabled by BioXM reduced implementation time and effort for the knowledge base compared to similar systems implemented as classical software development projects. The knowledge base enables the retrieval of sub-networks including protein-protein interaction, pathway, gene-disease and gene-compound data, which are used for subsequent data analysis, modelling and simulation. Pre-structured queries and reports enhance usability; establishing their use in everyday clinical settings requires further simplification with a browser-based interface, which is currently under development.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Abstract: Understanding how biodiversity is distributed is central to any conservation effort and has traditionally been based on niche modeling and the causal relationship between spatial distribution of organisms and their environment. More recently, the study of species' evolutionary history and relatedness has permeated the fields of ecology and conservation and, coupled with spatial predictions, provides useful insights into the origin of current biodiversity patterns, community structuring and potential vulnerability to extinction. This thesis explores several key ecological questions by combining the fields of niche modeling and phylogenetics and using important components of southern African biodiversity. The aims of this thesis are to provide comparisons of biodiversity measures, to assess how climate change will affect evolutionary history loss, to ask whether there is a clear link between evolutionary history and morphology and to investigate the potential role of relatedness in macro-climatic niche structuring. The first part of my thesis provides a fine scale comparison and spatial overlap quantification of species richness and phylogenetic diversity predictions for one of the most diverse plant families in the Cape Floristic Region (CFR), the Proteaceae. In several of the measures used, patterns do not match sufficiently to argue that species relatedness information is implicit in species richness patterns. The second part of my thesis predicts how climate change may affect threat and potential extinction of southern African animal and plant taxa. I compare present and future niche models to assess whether predicted species extinction will result in higher or lower phylogenetic diversity survival than what would be experienced under random extinction processes. I find that predicted extinction will result in lower phylogenetic diversity survival but that this non-random pattern will be detected only after a substantial proportion of the taxa in each group has been lost. The third part of my thesis explores the relationship between phylogenetic and morphological distance in southern African bats to assess whether long evolutionary histories correspond to equally high levels of morphological variation, as predicted by a neutral model of character evolution. I find no such evidence; on the contrary, weak negative trends are detected for this group, as well as in simulations of both neutral and convergent character evolution. Finally, I ask whether spatial and climatic niche occupancy in southern African bats is influenced by evolutionary history or not. I relate divergence time between species pairs to climatic niche and range overlap and find no evidence for clear phylogenetic structuring. I argue that this may be due to particularly high levels of micro-niche partitioning.

Summary: Understanding the distribution of biodiversity is a major challenge for nature conservation. Analyses are most often based on ecological niche modeling, through the study of causal relationships between the spatial distribution of organisms and their environment. Recently, the study of the evolutionary history of organisms has also come into use in the fields of ecology and conservation. Combined with modeling of the spatial distribution of organisms, this new approach provides relevant information for better understanding the origin of current biodiversity patterns, community structuring, and potential extinction risks. This thesis explores several major ecological questions by combining the fields of niche modeling and phylogenetics. It applies to important components of the biodiversity of southern Africa. The objectives of this thesis were 1) to compare different measures of biodiversity, 2) to assess the impact of future climate change on the loss of phylogenetic diversity, 3) to analyze the potential link between phylogenetic diversity and morphological diversity, and 4) to study the potential role of phylogeny in structuring the macro-climatic niches of species. The first part of this thesis provides a spatial comparison, and a quantification of overlap, between predictions of species richness and predictions of phylogenetic diversity for one of the most species-rich plant families of the Cape Floristic Region (CFR), the Proteaceae. The analyses show that several measures of phylogenetic diversity had spatial distributions different from that of species richness, which is usually used to set conservation measures. The second part assesses the potential effects of expected climate change on the extinction rates of animals and plants of southern Africa. To this end, current and future species distribution models were used to determine whether species extinction will result in a greater or smaller loss of phylogenetic diversity compared with a random extinction process. The results did show that species extinction linked to climate change could lead to a greater loss of phylogenetic diversity. However, this loss would exceed that of a random extinction process only after a substantial loss of taxa in each group. The third part of this thesis explores the relationship between phylogenetic and morphological distances in bat species of southern Africa. More precisely, the aim is to determine whether a long evolutionary history also corresponds to greater morphological variation in this group. This relationship is in fact predicted by a neutral model of character evolution. No evidence of this relationship emerged from the analyses. On the contrary, negative trends were detected, which would reflect convergent evolution between clades and high levels of partitioning within each clade. Finally, the last part presents a study of climatic niche partitioning in southern African bats. In this study I relate evolutionary divergence time (the point at which two species diverged from a common ancestor) to the degree of overlap of their climatic niches. The results could not demonstrate a link between these two parameters. Instead, the results support the idea that this could be due to particularly high levels of fine-scale niche partitioning.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The present work, entitled Data and Text Mining Techniques for the Annotation of a Digital Archive, aims to test the feasibility of using automatic text processing techniques to annotate the sessions of the parliamentary debates of the Assembly of the Republic of Portugal. Throughout the work, concepts such as knowledge discovery technologies (KDD), the process of knowledge discovery in text, the characterization of the various stages of text processing, and the description of some open source tools for text mining are addressed. The methodology was based on experimenting with several text processing techniques using the open source R/tm. The results presented are the influence of pre-processing, document size, and corpus size on the outcome of processing with the knnflex algorithm.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in various fields, e.g., remote sensing, biometry, and speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty related to the use of such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was taken, based on the presence and absence of products. Methods from graph theory were used to extract patterns from data consisting of links between products and the place and date of seizure. A data mining process carried out with graph-based techniques is called "graph mining". Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevance. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used a priori to compare structures from preliminary and post-detection patterns. This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.
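The presence/absence approach described above can be pictured as a co-occurrence graph in which products are nodes and an edge weight counts the seizures where two products appear together. The sketch below is one simple way to build such a graph and keep its recurring edges. It is not the study's method: the seizure identifiers, product lists, function names, and the `min_count` threshold are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented records: products detected (present) in each hypothetical seizure.
seizures = {
    "S1": {"caffeine", "paracetamol"},
    "S2": {"caffeine", "paracetamol", "lidocaine"},
    "S3": {"lidocaine", "phenacetin"},
    "S4": {"caffeine", "paracetamol"},
}


def cooccurrence_edges(records):
    """Count, for every pair of products, the seizures in which both occur."""
    edges = Counter()
    for products in records.values():
        # Sorting gives each undirected edge a single canonical key.
        for pair in combinations(sorted(products), 2):
            edges[pair] += 1
    return edges


def frequent_pairs(edges, min_count):
    """Keep only edges seen in at least `min_count` seizures."""
    return {pair: n for pair, n in edges.items() if n >= min_count}


edges = cooccurrence_edges(seizures)
recurring = frequent_pairs(edges, min_count=2)
```

Attaching the place and date of each seizure to the records would let the same counting be restricted to a region or time window, which is where a repeated structure could start to carry intelligence value.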