37 results for Discovery Tools
in Helda - Digital Repository of University of Helsinki
Abstract:
The present challenge in drug discovery is to synthesize new compounds efficiently in minimal time. The trend is towards carefully designed and well-characterized compound libraries, because fast and effective synthesis methods easily produce thousands of new compounds. The need for rapid and reliable analysis methods increases at the same time. Quality assessment, including identification and purity tests, is highly important, since false (negative or positive) results, for instance in tests of biological activity or in the in vitro determination of early-ADME parameters (the pharmacokinetic study of drug absorption, distribution, metabolism, and excretion), must be avoided. This thesis summarizes the principles of classical planar chromatographic separation combined with ultraviolet (UV) and mass spectrometric (MS) detection, and introduces powerful, rapid, easy, and low-cost alternative tools and techniques for the qualitative and quantitative analysis of small drug or drug-like molecules. High-performance thin-layer chromatography (HPTLC) was introduced and evaluated for fast semi-quantitative assessment of the purity of synthesis target compounds, and the HPTLC methods were compared with liquid chromatography (LC) methods. Electrospray ionization mass spectrometry (ESI MS) and atmospheric pressure matrix-assisted laser desorption/ionization MS (AP MALDI MS) were used to identify and confirm the product zones on the plate. AP MALDI MS was rapid and easy to carry out directly on the plate, without scraping. The PLC method was used to isolate target compounds from crude synthesis products and purify them for bioactivity and preliminary ADME tests. Ultra-thin-layer chromatography (UTLC) with AP MALDI MS and desorption electrospray ionization mass spectrometry (DESI MS) was introduced and studied for the first time. Because of its thinner adsorbent layer, the monolithic UTLC plate provided 10–100 times better sensitivity in MALDI analysis than did HPTLC plates.
Limits of detection (LODs) down to the low-picomole range were demonstrated for UTLC AP MALDI MS and UTLC DESI MS. In a comparison of AP and vacuum MALDI MS detection for UTLC plates, desorption from the irregular surface of the plates with the combination of an external AP MALDI ion source and an ion trap instrument provided markedly less variation in mass accuracy than the vacuum MALDI time-of-flight (TOF) instrument. The performance of two-dimensional (2D) UTLC separation with AP MALDI MS detection was studied for the first time. The influence of the urine matrix on the separation and on repeatability was evaluated with benzodiazepines as model substances in human urine. The applicability of 2D UTLC AP MALDI MS was demonstrated by detecting metabolites in an authentic urine sample.
Abstract:
The work is based on the assumption, proposed by Zellig S. Harris (1954, 1968), that words with similar syntactic usage have similar meaning. We study this assumption from two aspects: firstly, different meanings (word senses) of a word should manifest themselves in different usages (contexts), and secondly, similar usages (contexts) should lead to similar meanings (word senses). If we start with the different meanings of a word, we should be able to find distinct contexts for the meanings in text corpora. We separate the meanings by grouping and labeling contexts in an unsupervised or weakly supervised manner (Publications 1, 2 and 3). We are confronted with the question of how best to represent contexts in order to induce effective classifiers of contexts, because differences in context are the only means we have to separate word senses. If we start with words in similar contexts, we should be able to discover similarities in meaning. We can do this monolingually or multilingually. In the monolingual material, we find synonyms and other related words in an unsupervised way (Publication 4). In the multilingual material, we find translations by supervised learning of transliterations (Publication 5). In both the monolingual and the multilingual case, we first discover words with similar contexts, i.e., synonym or translation lists. In the monolingual case we also aim at finding structure in the lists by discovering groups of similar words, e.g., synonym sets. In this introduction to the publications of the thesis, we consider the larger background issues of how meaning arises, how it is quantized into word senses, and how it is modeled. We also consider how to define, collect and represent contexts. We discuss how to evaluate the trained context classifiers and the discovered word sense classifications, and finally we present the word sense discovery and disambiguation methods of the publications.
This work supports Harris's hypothesis by implementing three new methods modeled on it. The methods have practical consequences for creating thesauri and translation dictionaries, e.g., for information retrieval and machine translation purposes. Keywords: word senses, context, evaluation, word sense disambiguation, word sense discovery.
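As an illustration of the distributional idea behind Harris's hypothesis (a generic sketch, not the thesis's own methods), words can be represented by vectors of the words occurring near them, and similarity of meaning estimated by cosine similarity of those vectors. The toy corpus below is invented:

```python
from collections import Counter
import math

def context_vectors(tokens, window=2):
    """Build bag-of-words context vectors: for each word, count the
    words occurring within +/- `window` positions of it."""
    vectors = {}
    for i, word in enumerate(tokens):
        ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        vectors.setdefault(word, Counter()).update(ctx)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = ("the cat sat on the mat the dog sat on the rug "
          "a cat ate the fish a dog ate the bone").split()
vecs = context_vectors(corpus)
# 'cat' and 'dog' share contexts (both precede 'sat'/'ate' and follow
# 'the'/'a'), so their similarity exceeds that of unrelated pairs.
print(cosine(vecs["cat"], vecs["dog"]))
```

On a realistic corpus the same principle yields synonym candidates by ranking all words against a target word.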
Abstract:
The purpose of this study is to analyze and develop various forms of abduction as a means of conceptualizing processes of discovery. Abduction was originally presented by Charles S. Peirce (1839-1914) as a "weak" third main mode of inference -- besides deduction and induction -- one which, he proposed, is closely related to many kinds of cognitive processes, such as instincts, perception, practices and mediated activity in general. Both abduction and discovery are controversial issues in the philosophy of science. It is often claimed that discovery cannot be a proper subject area for conceptual analysis and that, accordingly, abduction cannot serve as a "logic of discovery". I argue, however, that abduction provides essential means for understanding processes of discovery, although it cannot give rise to a manual or algorithm for making discoveries. In the first part of the study, I briefly present how the main trend in the philosophy of science has long been critical of any systematic account of discovery; various models have nevertheless been suggested. I outline a short history of abduction: first the evolving forms of Peirce's theory, and then later developments. Although abduction has not been a major area of research until quite recently, I review some critiques of it and look at the ways it has been analyzed, developed and used in various fields of research. Peirce's own writings and later developments, I argue, leave room for various subsequent interpretations of abduction. The second part of the study consists of six research articles. First I treat the "classical" arguments against abduction as a logic of discovery, and show that by developing the strategic aspects of abductive inference these arguments can be countered. Nowadays the term 'abduction' is often used as a synonym for the Inference to the Best Explanation (IBE) model.
I argue, however, that it is useful to distinguish between IBE ("Harmanian abduction") and "Hansonian abduction", the latter concentrating on analyzing processes of discovery. The distinctions between loveliness and likeliness, and between potential and actual explanations, are more fruitful within Hansonian abduction. I clarify the nature of abduction by using Peirce's distinction between three areas of "semeiotic": grammar, critic, and methodeutic. Grammar (emphasizing "Firstnesses" and iconicity) and methodeutic (i.e., a processual approach) especially give new means for understanding abduction. Peirce himself held the controversial view that new abductive ideas are products of an instinct and an inference at the same time. I maintain that it is beneficial to make a clear distinction between abductive inference and abductive instinct, on the basis of which both can be developed further. Besides these, I analyze abduction as a part of distributed cognition, which emphasizes long-term interaction with the material, social and cultural environment as a source of abductive ideas. This approach suggests a "trialogical" model in which inquirers are fundamentally connected both to other inquirers and to the objects of inquiry. As for the classical Meno paradox about discovery, I show that abduction provides more than one answer. As my main example of abductive methodology, I analyze the process of Ignaz Semmelweis' research on childbed fever. A central basis for abduction is the claim that discovery is not a sequence of events governed only by chance. Abduction treats those processes which both constrain and instigate the search for new ideas, beginning with the use of clues as a starting point for discovery and continuing to considerations such as elegance and 'loveliness'. The study thus continues a Peircean-Hansonian research programme by developing abduction as a way of analyzing processes of discovery.
Abstract:
Populations in developed countries are ageing fast. The elderly have the greatest incidence of dementia, and the resulting increase in the number of demented individuals raises the immediate costs to governments for healthcare and hospital treatment. Attention is being paid to the disorders behind cognitive impairment with behavioural and psychological symptoms, which are enormous contributors to the hospital care required by the elderly. The greatest hopes lie in prevention; however, before tools for preventing dementia can be discovered, the pathogenesis behind dementia disorders needs to be understood. Dementia with Lewy bodies (DLB), a relatively recently discovered dementia disorder compared to Alzheimer’s disease (AD), is estimated to account for up to one third of primary degenerative dementia, making it the second most common cause of dementia in the elderly. Nevertheless, the impact of neuropathological and genetic findings on the clinical syndrome of DLB is not fully established. In the present series of studies, the frequency of neuropathological findings of DLB and their relation to the clinical findings were evaluated in a cohort of subjects with primary degenerative dementia and in a population-based prospective cohort study of individuals aged 85 years or older. α-synuclein (αS) immunoreactive pathology classifiable according to the DLB consensus criteria was found in one fourth of the primary degenerative dementia subjects. In the population-based study, the corresponding figure was one third of the population: 38% of the demented and one fifth of the non-demented very elderly Finns. However, in spite of the frequent finding of αS pathology, its association with the clinical symptoms was quite poor.
Indeed, the common clinical features of DLB, hypokinesia and visual hallucinations, associated better with severe neurofibrillary AD-type pathology than with extensive (diffuse neocortical) αS pathology when both types of pathology were taken into account. The severity of the neurofibrillary AD-type pathology (Braak stage) associated with the extent of αS pathology in the brain. In addition, the genetic study showed an interaction between tau and αS: common variation in the αS gene (SNCA) associated significantly with the severity of the neurofibrillary AD-type pathology and nominally significantly with the extensive αS pathology. Further, the relevance and temporal course of substantia nigra (SN) degeneration and of spinal cord αS pathology were studied in relation to αS pathology in the brain. The linear association between the extent of αS pathology in the brain and the neuron loss in SN suggests that in DLB the degeneration of SN proceeds as the αS pathology extends from SN to the neocortex, rather than through the early destruction of SN seen in Parkinson’s disease (PD). Furthermore, the extent of αS pathology in the brain associated with the severity of αS pathology in the thoracic and sacral autonomic nuclei of the spinal cord. The thoracic αS pathology was more common and more severe than that in the sacral cord, suggesting that αS pathology progresses downwards from the brainstem towards the sacral spinal cord.
Abstract:
Telecommunications network management is based on huge amounts of data that are continuously collected from elements and devices all around the network. The data is monitored and analysed to provide information for decision making in all operation functions. Knowledge discovery and data mining methods can support fast-paced decision making in network operations. In this thesis, I analyse decision making on different levels of network operations. I identify the requirements that decision making sets for knowledge discovery and data mining tools and methods, and I study the resources that are available to them. I then propose two methods for augmenting and applying frequent sets to support everyday decision making. The proposed methods are Comprehensive Log Compression for log data summarisation and Queryable Log Compression for semantic compression of log data. Finally, I suggest a model for a continuous knowledge discovery process and outline how it can be implemented and integrated into the existing network operations infrastructure.
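To give a rough feel for how frequent sets can summarise log data, the sketch below counts recurring field-value combinations in an invented alarm log and keeps those above a support threshold. This is a simplified stand-in for the idea, not the thesis's Comprehensive Log Compression method; all field names and values are hypothetical:

```python
from collections import Counter
from itertools import combinations

def frequent_sets(rows, min_support):
    """Count field=value combinations (up to triples here) and keep
    those that occur in at least `min_support` rows."""
    counts = Counter()
    for row in rows:
        items = sorted(row.items())
        for size in (1, 2, 3):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Hypothetical alarm log: each row is one network event.
log = [
    {"node": "bsc01", "alarm": "LINK_DOWN", "sev": "major"},
    {"node": "bsc01", "alarm": "LINK_DOWN", "sev": "major"},
    {"node": "bsc01", "alarm": "LINK_DOWN", "sev": "major"},
    {"node": "bsc02", "alarm": "POWER",     "sev": "minor"},
]
patterns = frequent_sets(log, min_support=3)
# The largest frequent combination summarises 3 of the 4 lines
# with a single pattern plus a repetition count.
best = max(patterns, key=lambda s: (len(s), patterns[s]))
print(dict(best), patterns[best])
```

A summariser built on this idea reports each frequent pattern once, with its count, instead of every matching log line.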
Abstract:
This thesis studies the human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurement of the expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, such data has been largely unavailable, and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability, and the minimisation of the systematic measurement errors characteristic of each laboratory in this large cross-laboratory integrated dataset, were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of the global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology.
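As a minimal sketch of the kind of exploration described above, principal component analysis of a sample-by-gene matrix can be done via the singular value decomposition. The data here is synthetic (two invented sample groups with a shifted block of genes); this illustrates the technique only, not the thesis's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical expression matrix: 20 samples x 50 genes, with two
# groups whose first genes are shifted apart (a stand-in for two
# tissue or disease states).
X = rng.normal(size=(20, 50))
X[:10, :5] += 4.0   # group A over-expresses the first 5 genes

def pca(X, n_components=2):
    """Project samples onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)            # centre each gene
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T    # sample coordinates

coords = pca(X)
# The first component separates the two sample groups, which is the
# kind of global structure a transcriptome map makes visible.
print(coords[:10, 0].mean(), coords[10:, 0].mean())
```

Hierarchical clustering of the same centred matrix (e.g. on pairwise sample distances) would group the samples the same way.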
A preface and motivation for the construction and analysis of a global map of human gene expression is given by the analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. This computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for locationing in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications, called ContextPhone, is described and released as open source. Finally, a set of methodological findings on the use of smartphones in social-scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of its privacy implications. Drawing on several long-term field studies, the usage of the system is analyzed in terms of how users make inferences about others based on real-time contextual cues mediated by the system.
The analysis of the privacy implications draws together the social-psychological theory of self-presentation and research on privacy in ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows. The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems, but also to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but such self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to users can be used to let them work with the data themselves, rather than being merely passive subjects of data gathering.
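The locationing principle, resolving location without revealing it beyond the terminal, can be caricatured as a purely on-device lookup: the phone learns which places its observed cell IDs correspond to and answers "where am I" locally. The class and cell IDs below are hypothetical; this is not the thesis's actual algorithm, only the privacy property it preserves:

```python
class LocalLocator:
    """All learned cell-to-place data stays on the user's terminal;
    locating never sends the observed cell ID to any server."""

    def __init__(self):
        self._places = {}              # cell_id -> place label

    def learn(self, cell_id, label):
        """The user labels the currently observed cell on the device."""
        self._places[cell_id] = label

    def locate(self, cell_id):
        """Purely local lookup; nothing leaves the terminal."""
        return self._places.get(cell_id, "unknown")

phone = LocalLocator()
phone.learn("244-91-10021-5034", "home")   # invented cell identifier
print(phone.locate("244-91-10021-5034"))   # -> home
```

The design choice is that inference happens where the data is gathered, so privacy does not depend on trusting a network operator or service provider.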