24 results for (hyper)text
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Helsingfors : Wasenius & Co. [1853]
Abstract:
[Åbo] : [Victor Forselius] c. 1900
Abstract:
The Internet is the basic infrastructure of electronic mail and has long been an important source of information for academic users. It has also become a significant source of information for commercial companies as they seek to keep in contact with their customers and to monitor their competitors. The growth of the WWW, both in volume and in diversity, has created a growing demand for advanced information management services. Such services include clustering and classification, information discovery and filtering, and the personalization and tracking of source usage. Although the amount of scientifically and commercially valuable information available on the WWW has grown considerably in recent years, searching for and finding it still relies on conventional Internet search engines. Satisfying the growing and changing needs of information retrieval has become a complex task for Internet search engines. Classification and indexing play a significant part in searching for and finding reliable and precise information. This Master's thesis presents the most common methods used in classification and indexing, together with applications and projects that use them to address these information retrieval problems.
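As a minimal illustration of the indexing side of such services, the sketch below builds a toy inverted index and answers a conjunctive keyword query. The documents and the whitespace tokenization are invented placeholders, not methods taken from the thesis.

```python
# A toy inverted index: map each term to the set of documents containing it,
# then answer a multi-word query as the intersection of those sets.
from collections import defaultdict

docs = {
    1: "classification and indexing of web documents",
    2: "search engines index the web",
    3: "personalization of information sources",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Return the ids of documents containing every query term."""
    hits = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*hits) if hits else set()

print(search("web index"))       # {2}
print(search("classification"))  # {1}
```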
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious to program by human experts. The tasks for which such tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. The machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers the development of kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take into account the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. We also design a fast cross-validation algorithm for regularized least-squares type learning algorithms. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in the algorithms.
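To make the kernel-based setting concrete, the sketch below trains a regularized least-squares (RLS) learner in its dual form with a Gaussian kernel. The kernel choice, regularization value and toy data are illustrative assumptions, not the task-specific kernels or cost functions developed in the thesis.

```python
# A minimal sketch of kernel regularized least-squares (RLS):
# solve (K + reg*I) a = y for the dual coefficients a, then
# predict with f(x) = sum_i a_i k(x_i, x).
import numpy as np

def gaussian_kernel(X1, X2, gamma=1.0):
    """Pairwise Gaussian (RBF) kernel between the rows of X1 and X2."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def train_rls(K, y, reg=1.0):
    """Return the dual coefficients of the regularized least-squares solution."""
    return np.linalg.solve(K + reg * np.eye(K.shape[0]), y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=50))   # noisy binary labels

K = gaussian_kernel(X, X)
a = train_rls(K, y, reg=0.5)
predictions = np.sign(K @ a)                        # predict on the training points
print("training accuracy:", np.mean(predictions == y))
```

A well-known property of RLS is that leave-one-out residuals can be computed in closed form from the hat matrix H = K(K + reg*I)^{-1} without retraining the model, which is why fast cross-validation is feasible for this family of learners.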
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken uses full parsing, the syntactic analysis of the entire structure of sentences, together with machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time. To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unifying diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships.
The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6,000 entities, 2,500 relationships and 28,000 syntactic dependencies in 1,100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships. Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
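As an illustration of the dependency-based analysis underlying this line of work, the sketch below extracts the shortest syntactic path between two entity mentions in a parsed sentence, the kind of feature commonly used for protein-protein interaction extraction. spaCy and its small English model stand in for a domain-adapted parser, and the sentence and entity names are invented for illustration; this is not the thesis's own pipeline or the BioInfer representation.

```python
# Shortest dependency path between two tokens via their lowest common head.
import spacy

def shortest_dependency_path(tok_a, tok_b):
    """Return the tokens (with dependency labels) on the path from tok_a to tok_b."""
    ancestors_a = [tok_a] + list(tok_a.ancestors)
    ancestors_b = [tok_b] + list(tok_b.ancestors)
    common = next(t for t in ancestors_a if t in ancestors_b)   # lowest common head
    up = ancestors_a[:ancestors_a.index(common) + 1]            # tok_a -> common head
    down = ancestors_b[:ancestors_b.index(common)][::-1]        # common head -> tok_b
    return [f"{t.text}/{t.dep_}" for t in up + down]

nlp = spacy.load("en_core_web_sm")
doc = nlp("ProteinA interacts directly with ProteinB in the nucleus.")
a = next(t for t in doc if t.text == "ProteinA")
b = next(t for t in doc if t.text == "ProteinB")
print(" -> ".join(shortest_dependency_path(a, b)))
```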
Abstract:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text, and the inability to utilize it creates risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim in this doctoral dissertation is to study machine learning and clinical text in order to support health information flow. First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications. The contributions are a model of the ideal information flow, a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases, the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking. The third and fourth applications are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling. These four applications are tested with Finnish intensive care patient records. The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports. The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated. The associations between performance evaluation measures and methods are addressed, and a new hold-out method is introduced. This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration. Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics, machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user feedback.
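As a concrete illustration of the fifth application, the sketch below trains a multi-label text classifier that assigns diagnosis-style labels to short reports using scikit-learn. The miniature reports and label set are invented placeholders; a real system would be trained on annotated radiology reports and their actual diagnosis codes.

```python
# Multi-label classification: one binary classifier per label over tf-idf features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

reports = [
    "chest x-ray shows right lower lobe pneumonia",
    "no acute findings, heart size normal",
    "pneumonia with a small pleural effusion on the left",
    "normal study, lungs clear",
]
codes = [["pneumonia"], [], ["pneumonia", "effusion"], []]   # labels per report

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reports)          # bag-of-words tf-idf features
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(codes)                   # binary indicator matrix, one column per code

clf = OneVsRestClassifier(LogisticRegression()).fit(X, Y)
new = vectorizer.transform(["left-sided pneumonia and effusion suspected"])
print(mlb.inverse_transform(clf.predict(new)))
```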
Abstract:
Of the von Wright artist brothers, Wilhelm v. Wright (1810-1887) moved to Sweden at the age of 18 to assist his elder brother Magnus in preparing the work SKANDINAVIENS FÅGLAR (see item 258). Wilhelm remained permanently in Sweden and began working as a draughtsman for the Royal Academy of Sciences. In 1836 the extensive fish atlas SKANDINAVIENS FISKAR began to appear; he produced its illustrations, while the zoologists C. U. Ekström, B. F. Fries and C. J. Sundevall wrote the texts. This became the most significant and the finest of v. Wright's illustration works. Later v. Wright held the post of fisheries inspector of Bohuslän, but was paralysed for the rest of his life, a period of more than thirty years.
Abstract:
This Master's thesis examines source-text and target-text orientation in drama translation. The objects of study were the vocabulary, syntax, stagecraft, imagery, wordplay, metre and style of the translations. The purpose of the study was to determine whether the theoretical shift of emphasis from source-text orientation to target-text orientation is visible in Finnish drama translation. The assumption was that the shift is reflected in the translation strategies used. The theoretical part of the study first discusses source-text-oriented and target-text-oriented translation theories. Two source-text-oriented theories are presented first: Catford's (1965) formal correspondence and Nida's (1964) dynamic equivalence. Of the target-text-oriented theories, the views of Toury (1980) and Newmark (1981) are discussed, as well as the skopos theory introduced by Reiss and Vermeer (1986). The principles of foreignization and domestication are presented briefly. The theoretical part also deals with drama translation, William Shakespeare's language and the translation problems related to it. In addition, I briefly present the history of translating Shakespeare in Finland and the four translators of Julius Caesar. The material of the study consisted of four Finnish translations of Shakespeare's play Julius Caesar: Paavo Cajander's translation was published in 1883, Eeva-Liisa Manner's in 1983, Lauri Sipari's in 2006 and Jarkko Laine's in 2007. In the analysis, the translations were compared with the source text and with each other, and the translators' solutions were compared. The results were in line with the assumption. Source-text-oriented translation strategies were used less in the newer translations than in the older ones. The target-text-oriented strategies differed considerably from each other, and the newest translation can be called an adaptation. In further research the material should be extended to cover other Finnish translations of Shakespeare's plays. Translations from different periods should be compared with each other in order to reliably describe the change in the use of source-text-oriented and target-text-oriented translation strategies and to map the strategies typical of each period.
Abstract:
My presupposition, that learning at some level deals with life praxis, is expressed in four metaphors: space, time, fable and figure. Relations between learning, knowledge building and meaning making are linked to the concept of personal knowledge. I present a two-part study of learning as text in a reading rooted in drama pedagogy, where learning is framed as the ongoing event and knowledge, as the product of previous processes, is framed as culturally formed utterances. A frame analysis model is constructed as a topological guide for the relations between the two concepts learning and knowledge. It visualises an aesthetic understanding, rooted in drama-pedagogical comprehension. Insight and perception are linked in an inner relationship that is neither external nor identical. This understanding expresses the movement "in between", connecting asymmetrical and nonlinear features of human endeavour and societal issues. The performability of bodily and oral participation in the learning event in a socio-cultural setting is analysed as a dialogised text. In an ethnographical case study I have gathered material with an interest in the particular. The empirical material is based on three problem-based learning situations in a Polytechnic setting. The act of transformation in the polyphony of the event is considered as a turning point in the narrative emplotment. Negotiation and figuration in the situation form patterns of the space for improvisation (flow) and tensions at the boundaries (thresholds), which imply the logical structure of transformation. Learning as a dialogised text of "yes" and "no", of structure and play for the improvised, interrelates in that movement. It is related to both the syntagmatic and the paradigmatic forms of thinking. In the philosophical study, forms of understanding are linked to the logical structure of transformation as a cultural issue. The classical rhetorical concepts of Logos, Pathos, Ethos and Mythos are connected to the multidimensional rationality of the human being. In the Aristotelian form of knowledge, phronesis, a logical structure of inquiry is recognised. The shifting of perspectives between approaches, the construction of knowledge as context and the human project of meaning making as a subtext illuminate multiple layers of the learning text. Arguing that the post-modern apprehension of knowledge, which emphasises contextual and situational values, has an empowering impact on learning, I find pedagogical benefits. The dialogical perspective has opened lenses that manage to hold in aesthetic doubling the individual action of inquiry and the stage with its cultural tools in a three-dimensional reading.