30 results for Data-Driven Behavior Modeling

in Doria (National Library of Finland DSpace Services), National Library of Finland, Finland


Relevance: 100.00%

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014


Most applications of airborne laser scanner data in forestry require that the point cloud be normalized, i.e., that each point represent height above the ground instead of elevation. To normalize the point cloud, a digital terrain model (DTM), derived from the ground returns in the point cloud, is employed. Unfortunately, extracting accurate DTMs from airborne laser scanner data is a challenging task, especially in tropical forests where the canopy is normally very thick (partially closed), so that only a limited number of laser pulses reach the ground. Robust algorithms for extracting accurate DTMs in low-ground-point-density situations are therefore needed in order to realize the full potential of airborne laser scanner data in forestry. The objective of this thesis is to develop algorithms for processing airborne laser scanner data in order to: (1) extract DTMs in demanding forest conditions (complex terrain and a low number of ground points) for applications in forestry; (2) estimate canopy base height (CBH) for forest fire behavior modeling; and (3) assess the robustness of LiDAR-based high-resolution biomass estimation models against different field plot designs. Here, the aim is to find out whether field plot data gathered by professional foresters can be combined with field plot data gathered by professionally trained community foresters and used in LiDAR-based high-resolution biomass estimation modeling without affecting prediction performance. The question of interest is whether the local forest communities can achieve the level of technical proficiency required for accurate forest monitoring.
The algorithms for extracting DTMs from LiDAR point clouds presented in this thesis address the challenges of extracting DTMs in low-ground-point situations and in complex terrain, while the algorithm for CBH estimation addresses variations in the distribution of points in the LiDAR point cloud caused by factors such as tree species and season of data acquisition. These algorithms are adaptive with respect to point cloud characteristics and exhibit a high degree of tolerance to variations in the density and distribution of points. Comparison with existing DTM extraction algorithms showed that the algorithms proposed in this thesis performed better with respect to the accuracy of tree heights estimated from airborne laser scanner data. On the other hand, the proposed DTM extraction algorithms, being mostly based on trend surface interpolation, cannot retain small artifacts in the terrain (e.g., bumps, small hills, and depressions). The DTMs they generate are therefore only suitable for forestry applications whose primary objective is to estimate tree heights from normalized airborne laser scanner data. The algorithm for estimating CBH proposed in this thesis, in turn, is based on the idea of a moving voxel, in which gaps (openings in the canopy) that act as fuel breaks are located and their height is estimated. Test results showed a slight improvement in CBH estimation accuracy over existing methods based on height percentiles of the airborne laser scanner data. Being based on a moving voxel, however, the algorithm has one main advantage over existing CBH estimation methods in the context of forest fire modeling: it has great potential for providing information about vertical fuel continuity.
This information can be used to create vertical fuel continuity maps, which can provide more realistic information on the risk of crown fires than CBH alone.
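The normalization step described above amounts to subtracting the DTM's ground elevation from each return's elevation. A minimal sketch follows; the data structures (a gridded DTM stored as a dictionary) and the sample values are hypothetical, and the thesis' adaptive DTM extraction algorithms are far more involved than this final subtraction step.

```python
def normalize_point_cloud(points, dtm, cell_size):
    """Convert elevations to heights above ground.

    points    -- list of (x, y, z) laser returns, z = elevation
    dtm       -- dict mapping (col, row) grid cells to ground elevation;
                 assumed to cover every cell a point falls into
    cell_size -- DTM grid resolution in metres
    """
    normalized = []
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        ground = dtm[cell]                      # ground elevation of the cell
        normalized.append((x, y, z - ground))   # height above ground
    return normalized

# Toy example: two cells, two returns.
dtm = {(0, 0): 100.0, (1, 0): 101.0}
points = [(0.5, 0.5, 112.0), (1.5, 0.5, 118.5)]
print(normalize_point_cloud(points, dtm, 1.0))
```

In practice the ground elevation under each point would be interpolated from neighbouring DTM cells rather than read from a single cell.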


Collecting and analyzing consumer data is essential in today’s data-driven business environment. However, consumers are becoming more aware of the value of the information they can provide to companies and are therefore more reluctant to share it for free. Companies consequently need to find ways to motivate consumers to disclose personal information. The main research question of the study was formed as “How can companies motivate consumers to disclose personal information?” and it was further divided into two sub-questions: 1) What types of benefits motivate consumers to disclose personal information? 2) How does the disclosure context affect consumers’ information disclosure behavior? The conceptual framework consisted of a classification of extrinsic and intrinsic benefits, and of moderating factors, recognized on the basis of prior research in the field. The study was conducted using qualitative research methods. The primary data was collected by interviewing ten representatives from eight companies, and was analyzed and reported according to predetermined themes. The findings of the study confirm that consumers can be motivated to disclose personal information by offering different types of extrinsic (monetary saving, time saving, self-enhancement, and social adjustment) and intrinsic (novelty, pleasure, and altruism) benefits. However, not all benefits are equally effective in convincing customers to disclose information. Moreover, different factors in the disclosure context can either weaken or strengthen the effectiveness of the benefits and the consumers’ motivation to disclose personal information. Such factors include the consumer’s privacy concerns, perceived trust towards the company, the relevancy of the requested information, personalization, website elements (especially the security, usability, and aesthetics of a website), and the consumer’s shopping motivation. This study has several contributions. 
It is essential that companies recognize the most attractive benefits regarding their business and their customers, and that they understand how the disclosure context affects the consumer’s information disclosure behavior. The likelihood of information disclosure can be increased, for example, by offering benefits that meet the consumers’ needs and preferences, improving the relevancy of the requested information, stating the reasons for data collection, creating and maintaining a trustworthy image of the company, and enhancing the quality of the company’s website.


The objective of this thesis was to study the removal of gases from paper mill circulation waters experimentally and to provide data for CFD modeling. Flow and bubble size measurements were carried out in a laboratory-scale open gas separation channel. The Particle Image Velocimetry (PIV) technique was used to measure the gas and liquid flow fields, while bubble size measurements were conducted using a digital imaging technique with back-light illumination. Samples of paper machine waters as well as a model solution were used for the experiments. The PIV results show that gas bubbles near the feed position tend to escape from the circulation channel at a faster rate than bubbles further away from the feed position. This was attributed to an increased rate of bubble coalescence, which produces relatively larger bubbles near the feed position. Moreover, the measured slip velocities of the paper mill waters agreed closely with literature values. It was found that, due to the dilution of the paper mill waters, the observed average bubble size was considerably larger than the average bubble sizes in real industrial pulp suspensions and circulation waters. Among the studied solutions, the model solution had the highest average drag coefficient value due to its relatively high viscosity. The results were compared to a 2D steady-state CFD simulation model. A standard Euler-Euler k-ε turbulence model was used in the simulations, and the channel free surface was modeled as a degassing boundary. Of the drag models used in the simulations, the Grace drag model gave velocity fields closest to the experimental values. In general, the results obtained from the experiments and the CFD simulations are in good qualitative agreement.
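For orientation on how a drag coefficient relates to a measured slip velocity: for a single bubble rising at its terminal slip velocity, the drag coefficient follows from the balance between buoyancy and drag. This is the textbook force-balance relation, not the Euler-Euler CFD setup used in the thesis, and the fluid properties below are generic values for air bubbles in water, not the thesis' measurements.

```python
G = 9.81  # gravitational acceleration, m/s^2

def drag_coefficient(diameter, slip_velocity, rho_liquid=998.0, rho_gas=1.2):
    """C_d = 4 g d (rho_l - rho_g) / (3 rho_l u^2).

    diameter      -- bubble diameter, m
    slip_velocity -- terminal slip velocity, m/s
    """
    return (4.0 * G * diameter * (rho_liquid - rho_gas)
            / (3.0 * rho_liquid * slip_velocity ** 2))

# A 1 mm bubble rising at 0.1 m/s in water:
print(round(drag_coefficient(1e-3, 0.1), 3))
```

The same relation, solved the other way around, is how slip-velocity measurements such as those above are turned into average drag coefficients for comparison between solutions.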


Robotic grasping has been studied increasingly for a few decades. While progress has been made in this field, robotic hands are still nowhere near the capability of human hands. However, in the past few years, the increase in computational power and the availability of commercial tactile sensors have made it easier to develop techniques that exploit the feedback from the hand itself: the sense of touch. The focus of this thesis lies in the use of this sense. The work described in this thesis approaches robotic grasping from two different viewpoints: robotic systems and data-driven grasping. The robotic systems viewpoint describes a complete architecture for the act of grasping and, to a lesser extent, more general manipulation. Two central properties the architecture was designed for are hardware independence and the use of sensors during grasping; these enable multiple different robotic platforms to be used within the architecture. Secondly, new data-driven methods are proposed that can be incorporated into the grasping process. The first of these methods is a novel way of learning grasp stability from the tactile and haptic feedback of the hand, instead of analytically solving the stability from a set of known contacts between the hand and the object. By learning from the data directly, there is no need to know the properties of the hand, such as its kinematics, enabling the method to be utilized with complex hands. The second novel method, probabilistic grasping, combines the fields of tactile exploration and grasp planning. By employing well-known statistical methods and pre-existing knowledge of an object, object properties, such as pose, can be inferred with an associated uncertainty. This uncertainty is utilized by a grasp planning process which plans for stable grasps under the inferred uncertainty.
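The idea of learning grasp stability directly from tactile data, rather than solving it analytically from known contacts, can be illustrated with a deliberately minimal nearest-neighbour sketch. The three-element "tactile" feature vectors and their labels below are invented for illustration; the thesis learns from real tactile and haptic sensor data with more capable learners.

```python
import math

def predict_stable(train, query):
    """Label a grasp by its nearest labelled example in tactile-feature space.

    train -- list of (feature_vector, is_stable) pairs
    query -- tactile feature vector of the new grasp
    """
    nearest = min(train, key=lambda ex: math.dist(ex[0], query))
    return nearest[1]

# Made-up tactile readings (e.g. normalized pressures on three sensor pads).
train = [((0.9, 0.8, 0.7), True),    # firm, even contact -> stable
         ((0.1, 0.9, 0.0), False)]   # uneven contact     -> unstable
print(predict_stable(train, (0.8, 0.7, 0.6)))
```

The point of the sketch is the data flow: no hand kinematics or contact model appears anywhere, only sensor readings and labels.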


Few people see both the opportunities and the threats arising from IT legacy in today's world. On the one hand, effective legacy management can bring substantial hard savings and a smooth transition to the desired future state. On the other hand, its mismanagement contributes to serious operational business risks, as old systems are not as reliable as business users require. This thesis offers one perspective on dealing with IT legacy: through effective contract management, as a component of achieving Procurement Excellence in IT, thus bridging IT delivery departments, IT procurement, business units, and suppliers. It develops a model for assessing the impact of improvements on the contract management process, along with a set of tools and advice regarding analysis and improvement actions. The thesis conducted a case study to present and justify the implementation of Lean Six Sigma in an IT legacy contract management environment. Lean Six Sigma proved successful, and this thesis presents and discusses all the steps necessary, and the pitfalls to avoid, to achieve breakthrough improvement in IT contract management process performance. For the IT legacy contract management process, two improvements require special attention and can easily be copied to any organization. The first concerns diluted contract ownership, which stops all improvements, as people do not know who is responsible for performing the actions. The second is the contract management performance evaluation tool, which can be used for monitoring and for identifying outlying contracts and opportunities for improvement in the process. The study resulted in valuable insight into the benefits of applying Lean Six Sigma to improve IT legacy contract management, as well as into how Lean Six Sigma can be applied in an IT environment. Managerial implications are discussed. 
It is concluded that the use of data-driven Lean Six Sigma methodology for improving the existing IT contract management processes is a significant addition to the existing best practices in contract management.


The evolution of our society is impossible without constant progress in life-important areas such as chemical engineering and technology. Innovation, creativity, and technology are three main components driving the progress of chemistry towards a sustainable society. Biomass, an attractive renewable feedstock for the production of fine chemicals, energy-rich materials, and even transportation fuels, is progressively capturing new positions in the area of chemical technology. Knowledge of heterogeneous catalysis and chemical technology applied to the transformation of biomass-derived substances will open doors to a sustainable economy and facilitate the discovery of novel environmentally benign processes, which may well replace existing technologies in the era of the biorefinery. Aqueous-phase reforming (APR) is regarded as a promising technology for the production of hydrogen and liquid fuels from biomass-derived substances such as C3-C6 polyols. In the present work, the aqueous-phase reforming of glycerol, xylitol, and sorbitol was investigated in the presence of supported Pt catalysts. The catalysts were deposited on different support materials, including Al2O3, TiO2, and carbons. Catalytic measurements were performed in a laboratory-scale continuous fixed-bed reactor. An advanced analytical approach was developed to identify reaction products and intermediates in the APR of polyols. The influence of the substrate structure on product formation and selectivity in the APR reaction was also investigated, showing that the yields of the desired products varied with substrate chain length. Additionally, the influence of a bioethanol additive in the APR of glycerol and sorbitol was studied. A reaction network was proposed to explain the formation of products and key intermediates. 
The structure sensitivity of the aqueous-phase reforming reaction was demonstrated using a series of platinum catalysts supported on carbon with different Pt cluster sizes in the continuous fixed-bed reactor. Furthermore, a correlation between the textural and physico-chemical properties of the catalysts and the catalytic data was established. The effect of adding a second metal (Re, Cu) to the Pt catalysts was investigated in the APR of xylitol, showing superior hydrocarbon formation on PtRe bimetallic catalysts compared to monometallic Pt. On the basis of the experimental data obtained, mathematical modeling of the reaction kinetics was performed. The developed model was shown to describe the experimental data on the APR of sorbitol with good accuracy.


Despite the fact that the literature on Business Intelligence and managerial decision-making is extensive, relatively little effort has been made to research the relationship between them. This field of study has become important as the amount of data in the world grows every second. Companies require capabilities and resources in order to utilize structured and unstructured data from internal and external data sources. The present Business Intelligence technologies, however, enable managers to utilize data effectively in decision-making. Based on the prior literature, the empirical part of the thesis identifies the enablers and constraints in the computer-aided managerial decision-making process. The theoretical part provides a preliminary understanding of the research area through a literature review. The key concepts, such as Business Intelligence and managerial decision-making, are explored by reviewing the relevant literature; additionally, different data sources as well as data forms are analyzed in further detail. All key concepts are taken into account when the empirical part is carried out. The empirical part obtains an understanding of the real-world situation with respect to the themes covered in the theoretical part. Three selected case companies are analyzed through statements that are considered critical prerequisites for successful computer-aided managerial decision-making. The case study analysis enables the researcher to examine the relationship between Business Intelligence and managerial decision-making. 
Based on the findings of the case study analysis, the researcher identifies the enablers and constraints through the case study interviews. The findings indicate that the constraints have a highly negative influence on the decision-making process. In addition, the managers are aware of the positive implications that Business Intelligence has for decision-making, although not all possibilities are yet utilized. As the main result of this study, a data-driven framework for managerial decision-making is introduced; it can be used when managerial decision-making processes are evaluated and analyzed.


Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014


The purpose of this Master's thesis is to determine what kind of process is needed to define a competence mapping carried out from a resourcing perspective. The study is a qualitative case study in the target organization. The research material was collected from documents and from the meetings and workshops held during the study, and was analyzed using data-driven content analysis. According to the results, the competence mapping process and its success are significantly affected by the company's strategy, the commitment of management to the competence mapping work, an analysis of the current state, and shared concepts, metrics, and goals. The competences required from a resourcing perspective are not necessarily the same as those required from a development perspective. Decisive factors for the success of the definition process are the participation of the right people and their willingness to share information.


As technology has developed, the amount of data produced and collected from the business environment has increased. Over 80% of that data includes some sort of reference to geographical location. Individuals have used such information through Google Maps or various GPS devices; in business, however, it has remained largely unexploited. This thesis studies the use and utilization of geographically referenced data in capital-intensive business, first by providing theoretical insight into how data and data-driven management enable and enhance business, and how geographically referenced data in particular adds value to the company, and then by examining empirical case evidence of how geographical information can truly be exploited in capital-intensive business and what its value-adding elements are. The study contains semi-structured interviews that are used to scan the attitudes and beliefs of an organization towards geographic information and to discover fields of application for a geographic information system within the case company. Additionally, geographical data is tested in order to illustrate how it could be used in practice. Finally, the thesis provides an understanding of which elements the added value of geographical information in business consists of, and of how such data can be utilized in the case company and in capital-intensive business in general.


Human activity recognition in everyday environments is a critical but challenging task in Ambient Intelligence applications to achieve proper Ambient Assisted Living, and key challenges still remain to be dealt with to realize robust methods. One of the major limitations of Ambient Intelligence systems today is the lack of semantic models of the activities in the environment, so that the system can recognize the specific activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics, to learn, infer, and monitor human behaviours in Smart Spaces. Moreover, some aspects of this problem have a high degree of uncertainty, and therefore the developed models must be equipped with mechanisms to manage this type of information. A fuzzy ontology and a semantic hybrid system are presented to allow the modelling and recognition of a set of complex real-life scenarios where vagueness and uncertainty are inherent to the human nature of the users that perform them. The handling of uncertain, incomplete, and vague data (i.e., missing sensor readings and activity execution variations, since human behaviour is non-deterministic) is approached for the first time through a fuzzy ontology validated in real-time settings within a hybrid data-driven and knowledge-based architecture. The semantics of activities, sub-activities, and real-time object interaction are taken into consideration. The proposed framework consists of two main modules: the low-level sub-activity recognizer and the high-level activity recognizer. The first module detects sub-activities (i.e., actions or basic activities) and takes its input data directly from a depth sensor (Kinect). 
The main contribution of this thesis tackles the second component of the hybrid system, which lies on top of the previous one at a higher level of abstraction, acquires its input from the first module's output, and executes ontological inference to provide users, activities, and their influence on the environment with semantics. This component is thus knowledge-based, and a fuzzy ontology was designed to model the high-level activities. Since activity recognition requires context-awareness and the ability to discriminate among activities in different environments, the semantic framework allows common-sense knowledge to be modelled in the form of a rule-based system that supports expressions close to natural language in the form of fuzzy linguistic labels. The framework's advantages have been evaluated on a challenging new public dataset, CAD-120, achieving accuracies of 90.1% and 91.1% for low- and high-level activities, respectively. This constitutes an improvement over both entirely data-driven approaches and merely ontology-based approaches. As an added value, so that the system is sufficiently simple and flexible to be managed by non-expert users, and thus facilitates the transfer of research to industry, a development framework was built, composed of a programming toolbox, a hybrid crisp-and-fuzzy architecture, and graphical models to represent and configure human behaviour in Smart Spaces, in order to make the framework more usable in the final application. As a result, human behaviour recognition can help in assisting people with special needs, for instance in healthcare, independent elderly living, remote rehabilitation monitoring, industrial process guideline control, and many other cases. This thesis shows use cases in these areas.
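A fuzzy linguistic label of the kind mentioned above can be captured by a membership function that maps a crisp sensor value to a degree of truth in [0, 1]. The triangular shape and the "near" label parameters below are illustrative assumptions for the sketch, not the ontology's actual definitions.

```python
def triangular(a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking (1.0) at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)   # rising edge
        return (c - x) / (c - b)       # falling edge
    return mu

# Hypothetical label "near" for the user-object distance, in metres.
near = triangular(0.0, 0.5, 1.0)
print(near(0.5), near(0.25), near(0.9))
```

A rule such as "IF distance is near AND duration is short THEN activity is picking-up" then fires to the degree given by combining such memberships, which is what lets the rule base tolerate noisy, non-deterministic sensor readings.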


Over the last 30 years, new technologies and globalization have radically changed the way in which marketing is conducted. However, whereas their effects on business in general have been widely discussed, their effects on marketing specifically have received little clear recognition. Global research has been conducted to shed light on the issue, but it has largely concentrated on the views of executives and on consumer markets. In addition, a research gap exists in applying the concept of marketing change to a specific business-to-business (B2B) industry. Therefore, the main research question this study seeks to answer is: “How is contemporary marketing conducted in the high-technology industry?” The researcher considers the specific industry of high technology; however, as the industry comprises differing markets, the focus is on one of its prime sectors, the information technology (IT) market, where companies offer other firms products or services manufactured with advanced technology. The growing IT market is considered of critical importance in the economies of technologically ready countries such as Finland, where this research is also conducted. Through multiple case studies, the researcher aims to describe how changes in technology, customer engagement, and future trends have shaped the way in which successful high-tech marketing is conducted in today’s marketplace. Results derived from the empirical research are then presented to the reader with links to the existing literature. As a conclusion, a generalized framework is constructed to depict an ideal marketer-customer relationship, with emphasis on dynamic, two-way communication and its supporting elements of customer analytics, change adaptation, strategic customer communication, and organizational support. 
From a managerial point of view, the research may provide beneficial information, as contemporary marketing can yield profitable outcomes if managed correctly. As a new way to grasp competitive advantage, strategic marketing is much more data-driven and customer-focused than ever before. The study may also prove relevant for academic communities, as its results may inspire a new focus in the education of future marketers. This study was limited to the internal activities of the high-tech industry, leaving out considerations of co-marketing, marketing via business partners, and marketing in other B2B industries.


The aim of this Master's thesis was to standardize and systematically develop procurement processes. Standardizing processes supports the self-direction of the organization and improves organizational capability. The thesis contributes new research to the gap concerning the systematic development of procurement processes. The theoretical part covers process thinking and the effect of the ISO 9001 standard on procurement, identifies procurement processes, and presents various Lean Six Sigma development tools applicable to them. The empirical part was carried out as an assignment for a large industrial company. It combines research methodologies characteristic of both qualitative and quantitative research; combining methodologies adds value to the execution of the study and improves its reliability. The research methods consist of a survey, typical of quantitative research, for establishing the current state of the procurement processes, and a case study, typical of qualitative research. Based on the survey, an analysis of the current state of the procurement processes was formed, and on that basis procurement processes to be developed were proposed to the client company. The process finally selected for development was determined, on the basis of information collected in a brainstorming session, using a four-field matrix that brings out development potential; the selected process also became the case examined in the case study. The process chosen for development was the receipt and visual inspection of incoming goods. The process was developed through modelling and by applying Lean Six Sigma methods, supported by information gathered from brainstorming sessions and theme interviews. 
As a result of the thesis, the client company obtained an overall picture of the current state of its procurement processes; a developed target process and work instruction for the receipt and visual inspection of incoming goods; and a systematic development model for procurement processes that will ease their future standardization and development. The systematic development model for procurement processes is new research positioned in the research gap on procurement process development. The model cannot be generalized on the basis of this thesis, because the study was carried out from the starting points of the client company. As a conclusion, the client company, other organizations, and the research field need more knowledge about the development of procurement processes. The development of procurement processes should be studied further, and in particular the development model presented in this thesis should be tested so that its applicability could be generalized.


Emerging technologies have recently challenged libraries to reconsider their role as mere mediators between collections, researchers, and wider audiences (Sula, 2013), and libraries, especially nationwide institutions like national libraries, have not always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will draw a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages, to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, which was the genesis and consolidation period of the literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical and orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language. 
This was the beginning of a renaissance and a period of enlightenment (Rueter, 2013). Linguistically oriented readers will also find much to their delight, especially lexical items specific to a given publication and orthographically documented specifics of phonetics.

The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation's Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language communities themselves. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists to be raw data for linguists. Our data is used, for instance, for creating morphological analyzers and online dictionaries at the Universities of Helsinki and Tromsø.

In order to reach these targets, we will produce not only the digitized materials but also development tools to support linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with research in language technology. The mission is to improve the usage and usability of the digitized content. During the project, we have developed methods that refine the raw data for further use, especially in linguistic research. How does the library meet these objectives, which appear to lie beyond its traditional playground?
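Since the wordlists are delivered without tokenization or lemmatization, a raw wordlist is in essence a frequency count over surface forms as they appear on the page. The following is a minimal sketch of how such a list could be derived from OCRed text; the function name and the simplistic splitting on non-word characters are my illustrative assumptions, not the project's actual pipeline:

```python
import re
from collections import Counter

def raw_wordlist(text):
    """Count surface word forms in OCRed text.

    No tokenization model and no lemmatization are applied -- the text is
    simply split on non-word characters, mirroring the idea of a 'raw'
    wordlist that linguists can refine further themselves.
    """
    # In Python 3, \w+ matches Unicode word characters, so Cyrillic-script
    # Uralic languages are handled the same way as Latin-script ones.
    return Counter(re.findall(r"\w+", text.lower()))

# Hypothetical example: the same surface form is counted twice.
counts = raw_wordlist("Kirja on hyvä kirja.")
```

A real pipeline would of course add page and frequency metadata, but the point stands: the wordlists deliberately stay close to the printed surface forms.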
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of the languages from a stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded (OCRed) text often contains too many mistakes to be used as such in research, so the mistakes must be corrected. To enhance the OCRed texts, the National Library of Finland developed an open-source OCR editor that enables the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often include characters that have since fallen out of use; such characters are sadly neglected by modern OCR software developers, but they belong to the historical context of the kindred languages and are thus an essential part of the linguistic heritage (van Hemel, 2014).

Our crowdsourcing application is essentially an editor for the ALTO XML format. It consists of a back-end for managing users, permissions, and files, which communicates through a REST API with a front-end interface: the actual editor for correcting the OCRed text. The enhanced XML files can then be retrieved from the Fenno-Ugrica collection for further purposes.

Could the crowd do this work to support academic research? The challenge of crowdsourcing lies in its nature. In traditional crowdsourcing, the targets have often been split into several microtasks that require no special skills from the anonymous participants, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the researchers' point of view there is a danger that the needs of linguists are not met. A further notable downside is the lack of a shared goal or of social affinity; the traditional methods of crowdsourcing offer no such reward (de Boer et al., 2012).
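The editing step itself can be pictured as a small transformation over ALTO XML: each recognized word is a `String` element whose `CONTENT` attribute holds the OCRed form, and a correction simply replaces that attribute value. Below is a minimal, self-contained sketch under the ALTO v2 namespace; the sample fragment, the example words, and the `correct_alto` helper are illustrative assumptions, not the project's actual editor code:

```python
import xml.etree.ElementTree as ET

# ALTO v2 namespace, commonly used by digitization pipelines.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"

# Tiny illustrative ALTO fragment: one text line with two words, the
# first containing a plausible OCR error ("kiijan" for "kirjan").
SAMPLE = (
    f'<alto xmlns="{ALTO_NS}"><Layout><Page><PrintSpace><TextBlock>'
    '<TextLine><String CONTENT="kiijan"/><String CONTENT="lukeminen"/>'
    '</TextLine></TextBlock></PrintSpace></Page></Layout></alto>'
)

def correct_alto(xml_text, corrections):
    """Apply word-level corrections to the CONTENT attributes of ALTO
    String elements and return the corrected document as a string."""
    root = ET.fromstring(xml_text)
    for string_el in root.iter(f"{{{ALTO_NS}}}String"):
        word = string_el.get("CONTENT")
        if word in corrections:
            string_el.set("CONTENT", corrections[word])
    return ET.tostring(root, encoding="unicode")

corrected = correct_alto(SAMPLE, {"kiijan": "kirjan"})
```

In the actual application the corrections come from the front-end editor through the REST API rather than from a dictionary, but the underlying operation on the ALTO file is of this kind, which is why the enhanced XML remains usable in the Fenno-Ugrica collection afterwards.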
There has also been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). On top of that, the downsides of traditional crowdsourcing become more evident once you leave the Anglophone world. Our potential crowd is geographically scattered across Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases the languages are close to extinction or in need of revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to carefully identify the potential niches for completing the needed tasks. When using the help of a crowd in a project that aims to support both linguistic research and the survival of endangered languages, the approach has to be a different one.

In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources from, their specific richness in skill suits the complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and an identity, and their regular interaction engenders social trust and reputation. Such communities can respond to research needs more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we try to utilize the knowledge and skills of citizen scientists to produce qualitative results. In nichesourcing, we hand out assignments that precisely fill the gaps in linguistic research. A typical task would be editing and collecting words in those fields of vocabulary where the researchers require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy.
We have digitized books on medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay in which the language communities benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary that is made freely available to the public, so society can benefit too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of linguistic diversity, but also as a servant of 'two masters': research and society.