Digital Library

16 results for tRNA editing

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland

Voiko totuutta tavoittaa? : direct cinema -tyylisuunnan käyttö dokumenttielokuvassa

Abstract:

The subject of this study is the use of the direct cinema style in documentary film. The main purpose of this thesis was to research the ways in which the direct cinema style attempts to show and achieve truth in documentary films. The following questions were posed: Is it possible to depict reality in a documentary film? How does the choice of this style affect the final documentary? The essential purpose of this study was to see whether the direct cinema style works when trying to achieve truth in a documentary film. This work consists of two elements, the theoretical part and the short documentary. The theoretical part deals with the history, the truth, and the direct cinema style in documentaries. The theoretical information on direct cinema was used when making the short documentary. In the documentary Tuloaula 2 I have studied how the direct cinema style works in practice. The documentary followed the direct cinema style as strictly as possible. I was the director, the cameraman and the editor of my documentary film. In the documentary film Tuloaula 2 it appeared that the direct cinema style works best when filming everyday life. By using this style it is easy for the director to observe and leave his own persona in the background. The strength of the direct cinema style is that it enables the viewer to build his/her own impression of the subject. Even though the direct cinema style aims to achieve objectivity, the director has to make numerous subjective choices during both the filming and the editing process. These subjective choices automatically affect the "truth" of the documentary film. The difficulty with the direct cinema style is the large amount of material. This often leads to a long editing phase, which is often not possible in busy production schedules. The direct cinema style is not at its best when shooting people who are passive because their attention often focuses too much on the camera.
In general, the best way to make a documentary film would be to use many documentary styles in one film and not to concentrate strictly on only one style.

Developing a web portal for managing marketing campaign information

Abstract:

This master's thesis relates to an Accenture project in which a CMS web portal was developed for a client. The portal is intended to provide a mechanism for creating and managing product and campaign information and for managing the related budget processes on the company intranet. The aim of the thesis is to describe the development process of the CMS portal and to collect the lessons learned and the improvement suggestions gathered during the project. A further aim is to present ideas for preventing the observed problems in future projects. The biggest challenges of the portal development project related to the development environments of the information systems, to the integration of the portal and content management sides, and to team development. When a portal project is carried out on the client's premises, full control over the development environments cannot be obtained. If there are problems with the development environments, they should be communicated clearly and professionally to the party responsible for the client's development environments. Cooperation and good personal relationships with the client are important. Unless the portal's content management needs are very limited, it is advisable to use separate content management software to manage the portal's content. Even for smaller projects this allows better extensibility. The portal and content management sides should be integrated according to the software vendors' instructions and following common practices. In this context, common practices mean that the portal receives content from the content management system, but all content editing is done through the content management system's user interface. If custom practices must be used, the time they require must be reserved for developing them. In that case the use of web services should be considered, because web services help in integrating software, especially when the integration is customized.
When a portal is built by a team, a version control system must also be used to prevent conflicting changes. To harmonize the development process, it is highly recommended to write a common development guideline document. In addition, care must be taken that all developers follow the common development guidelines so that the benefits of uniformity are realized as fully as possible.

3D-suunnittelujärjestelmän ja tuotetiedonhallintajärjestelmän integrointi

Abstract:

3D design systems are important tools for creating and editing product data, so their efficient operation together with product data management systems is very important. As 3D design systems have developed, ever more product data can be included in 3D models, which makes efficient storage and management of that data increasingly important. There is also a need to view information contained in 3D models, such as the weight or geometry of a part, without a particular 3D system. Product data management has long been an important part of the product design process. It involves creating, collecting, and editing data throughout the product's life cycle. Such data may include, for example, drawings, 3D models, measurement records, meeting minutes, model range catalogues, strength calculations, and maintenance reports. A product data management system takes care of all this data, and through it product-related data can also be edited and shared efficiently. This master's thesis studied different methods for implementing the integration of these two systems. The purpose of the work was to select the method best suited to Valtra Oy's needs for transferring product data efficiently between the systems. As a result, the thesis gives a recommendation on the methods and tools to be used.

Riskienhallinta erään ohjelmiston muutosten yhteydessä

Abstract:

The purpose of this master's thesis was to present a method for managing the risks arising from changes to be implemented in a certain software system. The software is used daily by several hundred people, and its trouble-free operation is very important to the customer who owns it. From the viewpoint of the software and its development, a risk is a possibility of loss that threatens the stakeholder's goals, or a characteristic, factor, or activity associated with the loss. In this work, the stakeholder is the company that implemented the current software and is responsible for its further development. A solution that meets the company's risk management needs is sought by studying the fundamentals of risk management and two risk management methods intended specifically for software production. For the development of risk management, it is important that the typical mistakes of software production are, as a rule, avoided. Awareness of the most common risk management mistakes is of great benefit when developing one's own risk management. The development organization's systematic way of implementing software changes is based on the use of a product management program intended for software production. In the product management program, the change request is the basic unit of development work, and risk management measures should be targeted at it. A risk management model that meets the company's needs is built by adding a risk management process according to the Riskit method to the systematic handling process of a change request. Risk management according to the resulting model can in practice be carried out in several different ways. Based on the assessments, a diagramming program and a word processor are sufficient tools for implementing risk management in practice. Experience with the new risk management method showed it to be usable. To ensure smooth adoption of the method, however, risk management measures should initially be targeted at an entity larger than a single change request.

Tunnetta myllyssä : dokumenttielokuvaleikkauksen tunnekerronnan keinoja

Abstract:

The thesis is a multiform work whose artistic part is a documentary about a mill and a present-day miller. The mill in question is one of about one hundred mills still in operation. Bringing the mill back into use was preceded by many stages, from cleaning the machinery to learning the milling process. The miller grows the organic grain himself and grinds it into flour in his own mill. His wife bakes from it the breads and pastries sold in the mill shop. The idea of the documentary was to film a working roller mill and its millers while recording them is still possible, and so to preserve a piece of vanishing cultural history. I worked as the documentary's screenwriter, director, producer, second camera operator, and editor. Making the artistic part was the starting point for the reflections in the written part. In the written part I study how emotions are brought out through images and editing in documentary film. My aim is to find out which editing techniques can be used to add emotion to a film and possibly to affect the viewer. I examine and reflect on the topic from the perspective of a director-editor. I go through the making of the Mylly documentary and focus on the different stages of editing. Information gathering was based on interviews with a representative of Myllyliitto ry and with the miller, on Finnish documentary films, and on literature that mainly deals with editing techniques. In addition, I present my own observations as the maker of the documentary. - The thesis includes an artistic part, the documentary film Mylly.

Kevyt videotuotanto kuvaamisesta julkaisuun

Abstract:

The aim of the thesis is to define lightweight video production, to define web-based editing, and to analyse the role of the products of lightweight video production, that is, web videos, in the media field from the viewpoint of publishing. The thesis originated in connection with the video productions of Stadia's Videos project. The topic was narrowed down in connection with the multi-platform production of ESC2007, the Eurovision Song Contest 2007, held in Helsinki. Around the ESC2007 event, Arcada University of Applied Sciences and the cable channel Dina jointly produced material about the events, phenomena, and people of Eurovision using several different production methods. One of these production methods was multimedia phone production. Journalists used so-called camera phones to shoot video material, interviews, and so on. These two productions, the productions of the Videos project and the ESC2007 production, are used in this thesis as the basis for defining lightweight video production. The thesis also examines web video editing, which is likewise part of the Videos project. The purpose of the Videos project is to build a web-based editing tool, Fooga, which can be installed freely on anyone's website. The thesis also presents other web video editors. Finally, lightweight video production is defined, with justifications. In addition, the work discusses the publishing of web videos and the significance of media convergence and divergence in publishing. The thesis concludes that lightweight video production is video production based on quick reaction, in which the idea plays the leading role.

Epäpuhtauksien optinen laskenta sulpusta

Abstract:

The purpose of the work was to find the limit values of the measurement system for an optical, camera-based dirt counting system and to test how the dirt counting system works in practice. The aim was to productize camera-based dirt count analysis as a service product that could be used in screen condition surveys and as a problem-solving tool. The theory part consisted of two entities: on the one hand, pulp impurities, the theory of dirt counting, and impurity measurement methods; on the other, marketing and the productization and launch process from the viewpoint of a service product. The experimental part examined factors affecting camera-based dirt count analysis: camera focus, image sharpness, the colour, grammage, and dirt content of the analysed sheet, impregnation, light source, image editing, file format, and pixel count. The practical applicability of camera-based dirt count analysis was tested with a mill example. It was found that camera-based dirt count analysis could be used for almost all pulp types. The work defined a calibration method for focusing the camera on the plane of the sheet and a shutter speed analysis for determining the shutter speed appropriate to each pulp type. For camera-based dirt count analysis, a sheet grammage of 60 g/m², aperture F5, and sharpness setting 5 were specified. The results showed that the analysed sheets need not be impregnated or post-treated. No correlation with Somerville separation efficiency was found. For the example mill, the dirt contents and separation efficiencies of the primary stage were determined. Based on the results of the mill example, the oxygen stage and the D0 stage were found to have been the most effective at removing impurities.
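
As a rough illustration of camera-based dirt counting in general, and not of the system calibrated in the thesis, the sketch below thresholds a grayscale image of a sheet and counts connected groups of dark pixels. The input format, the threshold value, the 4-connectivity rule, and the function name `count_dirt_specks` are all assumptions made for this example.

```python
from collections import deque


def count_dirt_specks(image, threshold=100):
    """Count connected groups of dark pixels in a grayscale image.

    image: list of rows, each a list of ints in 0..255 (assumed format).
    threshold: pixels darker than this count as dirt (assumed value).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    specks = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] >= threshold or seen[r][c]:
                continue
            # New dark pixel: flood-fill its whole 4-connected region.
            specks += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return specks
```

In a real measurement the threshold would depend on the factors the abstract lists, such as sheet colour, light source, and shutter speed, which is why a calibration step is needed before counts are comparable.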

Test Campaign Parameter Editor for a Graphical TTCN-3 Development Environment

Abstract:

This paper presents the design for a graphical parameter editor for Testing and Test Control Notation 3 (TTCN-3) test suites. This work was done in the context of OpenTTCN IDE, a TTCN-3 development environment built on top of the Eclipse platform. The design presented relies on an additional parameter editing tab added to the launch configurations for test campaigns. This parameter editing tab shows the list of editable parameters and allows opening editing components for the different parameters. Each TTCN-3 primitive type will have a specific editing component providing tools to ease modification of values of that type.
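
The idea of one editing component per primitive type can be sketched as a small registry. This is only an illustration in Python, not the Eclipse-based design of OpenTTCN IDE: the registry `EDITORS`, the function `edit_parameter`, and the parsing rules are hypothetical, while `integer`, `boolean`, `charstring`, and `bitstring` are TTCN-3 basic types and `'0101'B` is the TTCN-3 bitstring literal form.

```python
def parse_integer(text):
    """Accept a decimal integer such as ' 42 '."""
    return int(text.strip())


def parse_boolean(text):
    """Accept the TTCN-3 boolean literals 'true' and 'false'."""
    value = text.strip()
    if value not in ("true", "false"):
        raise ValueError(f"not a boolean literal: {text!r}")
    return value == "true"


def parse_bitstring(text):
    """Accept a bitstring literal such as '0101'B and return its bits."""
    value = text.strip()
    if len(value) < 3 or not (value.startswith("'") and value.endswith("'B")):
        raise ValueError(f"not a bitstring literal: {text!r}")
    bits = value[1:-2]
    if any(ch not in "01" for ch in bits):
        raise ValueError(f"bitstring may contain only 0 and 1: {text!r}")
    return bits


# Hypothetical registry: one validating "editor" per primitive type.
EDITORS = {
    "integer": parse_integer,
    "boolean": parse_boolean,
    "bitstring": parse_bitstring,
    "charstring": lambda text: text,
}


def edit_parameter(type_name, text):
    """Validate user input with the editor registered for the type."""
    if type_name not in EDITORS:
        raise ValueError(f"no editing component for type {type_name!r}")
    return EDITORS[type_name](text)
```

For example, `edit_parameter("bitstring", "'0101'B")` returns the bit sequence, while malformed input raises `ValueError` so that a user interface could flag the field.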

Jatkuvan äänitehojakautuman algoritmi pitkien käytävien äänikenttien mallintamiseen

Abstract:

The JÄKÄLA algorithm (Jatkuvan Äänitehojakautuman algoritmi Käytävien Äänikenttien LAskentaan, "continuous sound power distribution algorithm for calculating the sound fields of corridors") and its NUMO and APPRO calculation equations are based on the symmetry of the image sources of a real sound source in a corridor. NUMO is the calculation equation of the algorithm's numerical solution, and APPRO that of its approximate solution. In deriving the algorithm, it was assumed that the absorption material was evenly distributed over the sound-reflecting surfaces of the corridor. Converting the image source plane of a rectangular corridor into a continuous sound power distribution involves three transformation steps. First, the rectangular image source plane is converted into a square one. Next, the equivalent image sources of the square image source plane are moved onto a coordinate axis to form a discrete row of image sources. Finally, the row of image sources is converted into a continuous sound power distribution, so that the sound pressure level at a receiving point in the corridor can be calculated by integrating over the continuous sound power distribution. To establish the validity of the JÄKÄLA algorithm, the tested commercial AKURI program was used. The AKURI program also gave a good idea of how values calculated with the NUMO and APPRO equations may differ from values measured in real corridors. The NUMO and APPRO equations were also tested by comparing their results with sound pressure level measurements in three different types of corridor. This study has shown that, on the basis of acoustic image theory, it is possible to derive a calculation algorithm that can be applied to quick on-site estimation of the sound fields of long corridors. Both theoretical calculation and practical sound pressure level measurements in real corridors showed that the prediction accuracy of the algorithm's equations was excellent in ideal corridors and good in those real corridors that had no sound-reflecting structures.
The NUMO and APPRO equations appear to work well in corridors whose cross-section was nearly square and in which the largest absorption coefficient of the surfaces was at most ten times the smallest. The greatest shortcoming of the NUMO and APPRO equations is that they take into account neither the different absorption coefficients of the surfaces nor sounds reflected from objects. The NUMO and APPRO equations deviated most from the measured values in corridors where the absorption coefficient of two opposite surfaces was very large and that of the other pair of surfaces very small, and in corridors with massive sound-reflecting pillars and beams. Nevertheless, in the corridors studied, the NUMO and APPRO equations gave clearly more accurate values than Kuttruff's approximate equation and the basic equation of statistical room acoustics. The calculation accuracy of the JÄKÄLA algorithm has been tested in only four real corridors. To develop the algorithm further, the opposite surfaces of the corridor and their absorption coefficients should in future be treated in pairs in the calculation. To confirm the validity of the algorithm, more measurements must be made in corridors whose absorption material distributions differ from one another.
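
The abstract does not give the NUMO or APPRO equations, so they are not reproduced here. As background, the sketch below shows the classical discrete image-source energy sum that such methods start from, under simplifying assumptions chosen for this example: an infinitely long rectangular corridor, a point source and a receiver both on the corridor axis, and a single uniform absorption coefficient `alpha` on all four surfaces. The function name `corridor_spl` and the truncation parameter `max_order` are hypothetical.

```python
import math


def corridor_spl(power_w, distance_m, width_m, height_m, alpha, max_order=50):
    """Sound pressure level (dB re 1e-12 W/m^2 intensity) on the corridor axis.

    Sums the direct sound and the image sources of a source placed at the
    centre of the cross-section. Image (m, n) lies m widths and n heights
    away from the axis and has undergone |m| + |n| reflections.
    distance_m must be greater than zero.
    """
    intensity = 0.0
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            r_squared = (distance_m ** 2
                         + (m * width_m) ** 2
                         + (n * height_m) ** 2)
            reflections = abs(m) + abs(n)
            # Each reflection keeps a fraction (1 - alpha) of the energy.
            intensity += (power_w * (1.0 - alpha) ** reflections
                          / (4.0 * math.pi * r_squared))
    return 10.0 * math.log10(intensity / 1e-12)
```

With `alpha=1.0` only the direct sound remains, which gives a simple sanity check; lower absorption adds reflected energy and raises the level. Replacing this double sum with an integral over a continuous power distribution is the step the abstract describes.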

Making the impact on research and society : a case study: crowdsourcing solutions developed for the linguistic research and citizen science

Abstract:

Can crowdsourcing solutions serve many masters? Can they be beneficial both for laymen or native speakers of minority languages on the one hand and for serious linguistic research on the other? How did an infrastructure that was designed to support linguistics turn out to be a solution for raising awareness of native languages? Since 2012 the National Library of Finland has been developing the Digitisation Project for Kindred Languages, in which the key objective is to support a culture of openness and interaction in linguistic research, but also to promote crowdsourcing as a tool for participation of the language community in research. In the course of the project, over 1,200 monographs and nearly 111,000 pages of newspapers in Finno-Ugric languages will be digitised and made available in the Fenno-Ugrica digital collection. This material was published in the Soviet Union in the 1920s and 1930s, and users have had only sporadic access to the material. The publication of open-access and searchable materials from this period is a goldmine for researchers. Historians, social scientists and laymen with an interest in specific local publications can now find text materials pertinent to their studies. The linguistically-oriented population can also find writings to delight them: (1) lexical items specific to a given publication, and (2) orthographically-documented specifics of phonetics. In addition to the open access collection, we developed an open-source OCR editor that enables the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary since these rare and peripheral prints often include archaic characters, which are neglected by modern OCR software developers but belong to the historical context of kindred languages, and are thus an essential part of the linguistic heritage.
When modelling the OCR editor, it was essential to consider both the needs of researchers and the capabilities of lay citizens, and to have them participate in the planning and execution of the project from the very beginning. By implementing the feedback from both groups iteratively, it was possible to transform the requested changes into tools for research that not only supported the work of linguists but also encouraged the citizen scientists to face the challenge and work with the crowdsourcing tools for the benefit of research. This presentation will not only deal with the technical aspects, developments and achievements of the infrastructure but will highlight the way in which user groups, researchers and lay citizens were engaged in a process as an active and communicative group of users and how their contributions were made to mutual benefit.

SYLI - an external workflow system for DSpace

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

OCR Correction Tool for Linguistic Corpora

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

SimpleREST -RESTful DSpace API

Abstract:

Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Nichesourcing The Uralic Languages For The Benefit Of Linguistic Research And Lingual Societies

Abstract:

The emerging technologies have recently challenged the libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially the nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, and in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, which was the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language.
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in linguistic research. How does the library meet the objectives, which appear to be beyond its traditional playground?
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often include archaic characters, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in the traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Another notable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012).
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of the traditional crowdsourcing become more imminent when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can respond to the needs of research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand in such assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in such fields of vocabularies where the researchers do require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy.
We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing’s perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
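
To make the idea of editing ALTO XML concrete, the sketch below applies word corrections to an ALTO document using only the Python standard library. It is an illustration under stated assumptions, not the National Library's editor or its REST API: the function name `correct_words` and the `corrections` mapping are hypothetical, while the `String` element and its `CONTENT` attribute are the names the ALTO schema uses for recognized words.

```python
import xml.etree.ElementTree as ET


def correct_words(alto_xml, corrections):
    """Replace misrecognized words in an ALTO XML document.

    alto_xml: the document as a string.
    corrections: dict mapping an OCR reading to its corrected form
        (hypothetical input for this sketch).
    Returns the corrected XML string and the number of words changed.
    """
    root = ET.fromstring(alto_xml)
    changed = 0
    for element in root.iter():
        # Strip any namespace so the sketch works across ALTO versions.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "String":
            continue
        word = element.get("CONTENT")
        if word in corrections:
            element.set("CONTENT", corrections[word])
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

A crowdsourced editor would typically apply corrections one word at a time from a user interface rather than from a prepared mapping, but the underlying change to the file is the same attribute update.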

Nichesourcing for the benefit of linguistic research and native speakers

Abstract:

The National Library of Finland is implementing the Digitization Project of Kindred Languages in 2012–16. Within the project we will digitize materials in the Uralic languages as well as develop tools to support linguistic research and citizen science. Through this project, researchers will gain access to new corpora, to which all users will have open access regardless of their place of residence. Our objective is to make sure that the new corpora are made available for the open and interactive use of both the academic community and the language societies as a whole. The project seeks to digitize and publish approximately 1200 monograph titles and more than 100 newspaper titles in various Uralic languages. The digitization will be completed by early 2015, when the Fenno-Ugrica collection will contain around 200 000 pages of editable text. The researchers cannot spend so much time with the material that they could retrieve a satisfactory amount of edited words, so the participation of a crowd in editing work is needed. Often the targets in crowdsourcing have been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguistic research are not necessarily met. Also, the number of pages is too high to deal with. Another notable downside is the lack of a shared goal or social affinity. There is no reward in traditional methods of crowdsourcing. Nichesourcing is a specific type of crowdsourcing where tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for the complex tasks with high-quality product expectations found in nichesourcing.
Communities have purpose and identity, and their regular interactions engender social trust and reputation. These communities can respond to the needs of research more precisely. Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. Some selection must be made, since we are not aiming to correct all 200,000 pages which we have digitized, but to give such assignments to citizen scientists that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in such fields of vocabularies where the researchers do require more information. For instance, there’s a lack of Hill Mari words in anatomy. We have digitized the books in medicine and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing’s perspective, it is essential that altruism plays a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to the online dictionary, which is made freely available for the public, so the society can benefit too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of “two masters”, the research and the society.
