29 results for Online matching
in Helda - Digital Repository of University of Helsinki
Abstract:
This master's thesis uses a corpus to compare the quantitative distribution of proper names across categories in two German online newspapers. The aim of the work is to find out how proper names can be classified and what differences in the newspapers' reporting they reveal. The broader framework is the question of whether proper names can be used to outline the content of newspapers. The corpus was collected from articles in the online editions of Frankfurter Allgemeine Zeitung and Süddeutsche Zeitung, http://www.faz.net (FAZ) and http://www.sueddeutsche.de (SZ), from the period 2-8 November 2004. The selected sites represent the online publications of Germany's most respected national daily newspapers. Of these, FAZ is considered a conservative and SZ a liberal newspaper. Each corpus deals with the US presidential election of autumn 2004 and contains slightly under 30,000 words from about 40 newspaper articles. A topic-bound corpus was chosen because the goal of the study is to find out, by means of proper names, in what respects FAZ and SZ differ when covering the same topic. The theory part reviews the background of German online newspapers, the text-linguistic theories relevant to the work, and the special features of proper names. It also discusses three earlier classifications from German-language onomastic research and one English-language classification from language technology. The shortcomings observed in these motivated combining and modifying the existing classifications for this work. The new classification contains four top-level classes (names of beings, geographical names, names of institutions, and names of things), each of which covers two to nine subclasses. The proper names of both corpora are classified on this basis. The quantitative analysis focuses on comparing the top-level classes and subclasses between the newspapers. It also examines the most frequent words both in each data set and in the main classes.
Although FAZ and SZ mostly used the same proper names in their reporting, clear differences between the newspapers can be shown at the subclass level, and minor differences in the distribution of proper names across the top-level classes. The chi-square test nevertheless showed that the distribution of proper names across the top-level classes depends on the newspaper. It can therefore be argued that the chosen medium, among other things, affects the choice of proper names. The frequencies of proper names suggest that SZ reports more diversely than FAZ, which uses proper names in a more concentrated way. The proper names in the SZ material are united by a European perspective on the election, whereas FAZ seeks to highlight events in the different US states. The names of both the persons and the institutions mentioned in the newspapers support this claim. Geographically, SZ emphasizes the importance of cities, FAZ that of states. The results show that this kind of proper-name research can be applied to newspaper texts. The classified proper names partly reflect the content of the material studied and reveal where the emphases of the reporting lie.
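The chi-square test mentioned in this abstract can be sketched as a test of independence on a newspaper-by-class contingency table. This is only an illustration under stated assumptions: the counts below are hypothetical placeholders, not figures from the thesis, and the English class labels and the `chi_square` helper are names introduced here.

```python
# Hypothetical counts of proper names per top-level class (NOT data from the thesis).
CLASSES = ["beings", "geographical", "institutions", "things"]
observed = {
    "FAZ": [520, 610, 240, 130],
    "SZ": [560, 500, 290, 150],
}

def chi_square(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if class distribution were independent of newspaper.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

stat, dof = chi_square(list(observed.values()))
# 7.815 is the standard critical value for 3 degrees of freedom at the 5% level.
CRITICAL_3DF_05 = 7.815
print(f"chi2 = {stat:.2f}, dof = {dof}, newspaper-dependent: {stat > CRITICAL_3DF_05}")
```

A statistic above the critical value would indicate, as the abstract reports for the real data, that the class distribution is not independent of the newspaper.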
Abstract:
The present thesis discusses relevant issues in education: 1) learning disabilities, including the role of comorbidity in LDs, and 2) the use of research-based interventions. This thesis consists of a series of four studies (three articles), which deepen the knowledge of the field of special education. The intervention studies (N=242) aimed to examine whether training using a nonverbal auditory-visual matching computer program had a remedial effect in different learning disabilities, such as developmental dyslexia, Attention Deficit Disorder (ADD) and Specific Language Impairment (SLI). These studies were conducted in both Finland and Sweden. The intervention's non-verbal character made an international perspective possible. The results of the intervention studies confirmed that the auditory-visual matching computer program, called Audilex, had positive intervention effects. In Study I, of children with developmental dyslexia, there were also improvements in reading skills, specifically in reading nonsense words and reading speed. These improvements in tasks that are thought to rely on phonological processing suggest that such reading difficulties in dyslexia may stem in part from more basic perceptual difficulties, including those required to manage the visual and auditory components of the decoding task. In Study II the intervention had a positive effect on children with dyslexia; older students with dyslexia and, surprisingly, students with ADD also benefited from this intervention. In conclusion, the role of comorbidity was apparent. An intervention effect was also evident in students' school behavior. Study III showed that children with SLI experience difficulties very similar to those of children with dyslexia in auditory-visual matching. Children with language-based learning disabilities, such as dyslexia and SLI, benefited from the auditory-visual matching intervention.
Comorbidity was also evident among these children; in addition to formal diagnoses, comorbidity was explored with an assessment inventory, which was developed for this thesis. Interestingly, an overview of the data of this thesis shows positive intervention effects in all studies regardless of learning disability, language, gender or age. These findings have been described by the concept of inter-modal transpose. These issues clearly need further study. In learning disabilities the aim in the future will also be to identify individuals by risk rather than by deficit; this aim can be achieved by using research-based interventions, intensified support in general education and inclusive special education. Keywords: learning disabilities, developmental dyslexia, attention deficit disorder, specific language impairment, language-based learning disabilities, comorbidity, auditory-visual matching, research-based interventions, inter-modal transpose
Abstract:
Topic detection and tracking (TDT) is an area of information retrieval research the focus of which revolves around news events. The problems TDT deals with relate to segmenting news text into cohesive stories, detecting something new, previously unreported, tracking the development of a previously reported event, and grouping together news that discuss the same event. The performance of the traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems. It has been difficult to make the distinction between same and similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural language temporal expressions occurring in the text, and use them to anchor the rest of the terms onto the time-line. Upon comparing documents for event-based similarity, we look not only at matching terms, but also how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system. 
The news reflects changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experiment results. We run experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10-30% depending on the experimental setup. The gain in spotting new events remained lower, around 3-4%. Anchoring the text to a time-line based on the temporal expressions gave a further 10% increase in the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.
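The class-wise comparison described in this abstract (one similarity per semantic class, combined with weights) might look roughly like the sketch below. It assumes documents are already tagged into classes, uses plain cosine similarity for every class (the abstract allows each class its own measure, such as one based on a geographical taxonomy), and uses hypothetical weights; none of these choices are taken from the thesis.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bags of terms (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical class weights (assumption, not values from the thesis).
WEIGHTS = {"people": 0.3, "organizations": 0.2, "locations": 0.2, "terms": 0.3}

def document_similarity(doc_a, doc_b, weights=WEIGHTS):
    """Weighted combination of class-wise similarities."""
    total = 0.0
    for cls, weight in weights.items():
        total += weight * cosine(Counter(doc_a.get(cls, [])),
                                 Counter(doc_b.get(cls, [])))
    return total

# Two toy documents, each a mapping from semantic class to its terms.
doc_a = {"people": ["bush", "kerry"], "locations": ["ohio"],
         "terms": ["election", "vote"]}
doc_b = {"people": ["bush"], "locations": ["florida"],
         "terms": ["election", "recount"]}
sim = document_similarity(doc_a, doc_b)
print(f"similarity = {sim:.4f}")
```

Keeping the classes separate lets a shared person or place count differently from a shared general term, which is one way to separate "same event" from "similar event".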
Abstract:
Event-based systems are seen as good candidates for supporting distributed applications in dynamic and ubiquitous environments because they support decoupled and asynchronous many-to-many information dissemination. Event systems are widely used, because asynchronous messaging provides a flexible alternative to RPC (Remote Procedure Call). They are typically implemented using an overlay network of routers. A content-based router forwards event messages based on filters that are installed by subscribers and other routers. The filters are organized into a routing table in order to forward incoming events to proper subscribers and neighbouring routers. This thesis addresses the optimization of content-based routing tables organized using the covering relation and presents novel data structures and configurations for improving local and distributed operation. Data structures are needed for organizing filters into a routing table that supports efficient matching and runtime operation. We present novel results on dynamic filter merging and the integration of filter merging with content-based routing tables. In addition, the thesis examines the cost of client mobility using different protocols and routing topologies. We also present a new matching technique called temporal subspace matching. The technique combines two new features. The first feature, temporal operation, supports notifications, or content profiles, that persist in time. The second feature, subspace matching, allows more expressive semantics, because notifications may contain intervals and be defined as subspaces of the content space. We also present an application of temporal subspace matching pertaining to metadata-based continuous collection and object tracking.
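As a rough illustration of the covering relation and of interval-valued notifications, the sketch below models a filter as a conjunction of closed-interval constraints. The representation, the function names, and the overlap semantics chosen for `subspace_matches` are assumptions made for this example; the abstract does not specify the thesis's actual data structures or matching semantics.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]   # closed, non-empty interval [low, high]
Filter = Dict[str, Interval]     # attribute name -> allowed interval

def matches(f: Filter, event: Dict[str, float]) -> bool:
    """A point event matches when every constrained attribute is present and in range."""
    return all(a in event and lo <= event[a] <= hi for a, (lo, hi) in f.items())

def covers(f1: Filter, f2: Filter) -> bool:
    """f1 covers f2 when every event that matches f2 also matches f1."""
    return all(
        a in f2 and lo <= f2[a][0] and f2[a][1] <= hi
        for a, (lo, hi) in f1.items()
    )

def subspace_matches(f: Filter, notification: Filter) -> bool:
    """Assumed semantics: an interval-valued notification matches on overlap."""
    return all(
        a in notification and notification[a][0] <= hi and lo <= notification[a][1]
        for a, (lo, hi) in f.items()
    )

wide = {"price": (0.0, 100.0)}
narrow = {"price": (10.0, 20.0), "volume": (1.0, 5.0)}
print(covers(wide, narrow), covers(narrow, wide))
```

When one filter covers another, a router only needs to forward the covering filter to its neighbours, which is why routing tables organized by this relation can stay small.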
Abstract:
Online content services can greatly benefit from personalisation features that enable delivery of content that is suited to each user's specific interests. This thesis presents a system that applies text analysis and user modeling techniques in an online news service for the purpose of personalisation and user interest analysis. The system creates a detailed thematic profile for each content item and observes the user's actions towards content items to learn the user's preferences. A handcrafted taxonomy of concepts, or ontology, is used in profile formation to extract relevant concepts from the text. User preference learning is automatic and there is no need for explicit preference settings or ratings from the user. Learned user profiles are segmented into interest groups using clustering techniques with the objective of providing a source of information for the service provider. Some theoretical background for the chosen techniques is presented, while the main focus is on finding practical solutions to some of the current information needs, which are not optimally served by traditional techniques.
Abstract:
Marja Heinonen's dissertation "Verkkomedian käyttö ja tutkiminen. Iltalehti Online 1995-2001" describes the usage of the new internet-based news service Iltalehti Online during its first years of existence, 1995-2001. The study focuses on the content of the service and users' attitudes towards the new media and its contents. Heinonen has also analyzed and described the research methods that can be used in the research of any new media phenomenon when there is no historical perspective to draw on. Heinonen has created a process model for the research of net medium, which is based on a multidimensional approach. She has chosen an iterative research method inspired by Sudweeks and Simoff's CEDA-methodology, in which qualitative and quantitative methods take turns both creating results and new research questions. The dissertation discusses and describes the possibilities of combining several research methods in the study of online news media. On a general level it discusses the methodological possibilities of researching a completely new media form when there is no historical perspective. The result of these discussions is in favour of the multidimensional methods. The empirical research was built around three cases of Iltalehti Online among its users: log analysis 1996-1999, interviews 1999 and clustering 2000-2001. Even though the results of the different cases were somewhat conflicting, here are the central results from the analysis of Iltalehti Online 1995-2001:
- Reading was strongly determined by gender.
- The structure of Iltalehti Online guided the reading strongly.
- People did not make a clear distinction in content between news and entertainment.
- Users created new habits in their everyday life during the first years of using Iltalehti Online. These habits were categorized as follows:
  - break between everyday routines
  - established habit
  - new practice within the rhythm of the day
- In the clustering of the users, sports, culture and celebrities were the most distinguishing contents. Users did not move across these borders as much as within them.
The dissertation contributes to the development of multidimensional research methods for emerging phenomena in the media field. It is also a unique description of a phase of development in media history through a unique research material. No such information (logs + demographics) is available for any other Finnish online news media, either from the first years or today.
Abstract:
The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as a task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is multi-view learning, where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements like gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach during exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning for novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.
Abstract:
Recent evidence from adult pronoun comprehension suggests that semantic factors such as verb transitivity affect referent salience and thereby anaphora resolution. We tested whether the same semantic factors influence pronoun comprehension in young children. In a visual world study, 3-year-olds heard stories that began with a sentence containing either a high or a low transitivity verb. Looking behaviour to pictures depicting the subject and object of this sentence was recorded as children listened to a subsequent sentence containing a pronoun. Children showed a stronger preference to look to the subject as opposed to the object antecedent in the low transitivity condition. In addition there were general preferences (1) to look to the subject in both conditions and (2) to look more at both potential antecedents in the high transitivity condition. This suggests that children, like adults, are affected by semantic factors, specifically semantic prominence, when interpreting anaphoric pronouns.
Abstract:
The paper explores the effect of customer satisfaction with online supporting services on loyalty to providers of an offline core service. Supporting services are provided to customers before, during, or after the purchase of a tangible or intangible core product, and have the purpose of enhancing or facilitating the use of this product. The internet has the potential to dominate all other marketing channels when it comes to the interactive and personalised communication that is considered quintessential for supporting services. Our study shows that the quality of online supporting services powerfully affects satisfaction with the provider and customer loyalty through its effect on online value and enjoyment. Managerial implications are provided.
Abstract:
This thesis analyzes how matching takes place in the Finnish labor market from three different angles. The Finnish labor market has undergone severe structural changes following the economic crisis in the early 1990s. The labor market has had problems adjusting to these changes, and high and persistent unemployment has followed. In this thesis I analyze whether matching problems, and in particular changes in matching, can explain some of this persistence. The thesis consists of three essays. In the first essay, "Finnish Evidence of Changes in the Labor Market Matching Process", the matching process in the Finnish labor market is analyzed. The key finding is that the matching process has changed thoroughly between the booming 1980s and the post-crisis period. The importance of the number of unemployed, and in particular long-term unemployed, for the matching process has vanished. More unemployed do not increase matching as theory predicts, but rather the opposite. In the second essay, "The Aggregate Matching Function and Directed Search - Finnish Evidence", stock-flow matching as a potential micro foundation of the aggregate matching function is studied. In the essay I show that the newly unemployed match mainly with the stock of vacancies, while the longer-term unemployed match with the inflow of vacancies. When aggregating, I still find evidence of the traditional aggregate matching function. This could explain the huge support the aggregate matching function has received despite its odd randomness assumption. The third essay, "How do Registered Job Seekers really match? - Finnish occupational level Evidence", studies matching for nine occupational groups and finds that very different matching problems exist for different occupations. In this essay, misspecification stemming from non-corresponding variables is also dealt with through the introduction of a completely new set of variables.
The new outflow measure used is vacancies filled with registered job seekers, and it is matched by the supply-side measure of registered job seekers.
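For readers unfamiliar with the aggregate matching function, a common textbook specification is the Cobb-Douglas form M = A * U^alpha * V^beta, where U is the number of unemployed and V the number of vacancies. The sketch below assumes that form; the abstract does not state which functional form the essays estimate, and the parameter values and the function name `aggregate_matches` are hypothetical.

```python
def aggregate_matches(unemployed, vacancies, scale=0.5, alpha=0.4, beta=0.6):
    """Cobb-Douglas matching function M = scale * U**alpha * V**beta.

    All parameter values are illustrative assumptions, not estimates
    from the thesis.
    """
    return scale * unemployed ** alpha * vacancies ** beta

base = aggregate_matches(100_000, 20_000)
doubled = aggregate_matches(200_000, 40_000)
# With alpha + beta = 1 (constant returns to scale), doubling both inputs
# doubles the number of matches.
print(f"ratio after doubling U and V: {doubled / base:.3f}")

# A negative alpha (purely for illustration) makes matches fall as
# unemployment rises, the direction of the post-crisis finding reported above.
fewer = aggregate_matches(200_000, 20_000, alpha=-0.1)
more = aggregate_matches(100_000, 20_000, alpha=-0.1)
print(f"more unemployed lowers matches: {fewer < more}")
```

Taking logarithms turns this form into a linear equation in log U and log V, which is why it is often estimated by regression.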