80 results for Online services using open-source NLP tools
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
A web service is a software system that provides a machine-processable interface to other machines over a network using different Internet protocols. Web services are increasingly used in industry to automate different tasks and offer services to a wider audience. The REST architectural style aims at producing scalable and extensible web services using technologies that play well with the existing tools and infrastructure of the web. It provides a uniform set of operations that can be used to invoke a CRUD interface (create, retrieve, update and delete) of a web service. The stateless behavior of the service interface requires that every request to a resource is independent of the previous ones, which facilitates scalability. Automated systems, e.g., hotel reservation systems, provide advanced scenarios for stateful services that require a certain sequence of requests that must be followed in order to fulfill the service goals. Designing and developing such services for advanced scenarios with REST constraints requires rigorous approaches that are capable of creating web services that can be trusted for their behavior. Systems that can be trusted for their behavior can be termed dependable systems. This thesis presents an integrated design, analysis and validation approach that helps the service developer create dependable and stateful REST web services. The main contribution of this thesis is that we provide a novel model-driven methodology to design behavioral REST web service interfaces and their compositions. The behavioral interfaces provide information on what methods can be invoked on a service and the pre- and post-conditions of these methods. The methodology uses the Unified Modeling Language (UML) as the modeling language, which has a wide user base and mature tools that are continuously evolving.
We have used UML class diagrams and UML state machine diagrams with additional design constraints to provide resource and behavioral models, respectively, for designing REST web service interfaces. These service design models serve as a specification document, and the information presented in them has manifold applications. The service design models also contain information about the time and domain requirements of the service that can help in requirement traceability, which is an important part of our approach. Requirement traceability helps in capturing faults in the design models and other elements of the software development environment by tracing back and forth the unfulfilled requirements of the service. Information about service actors is also included in the design models; it is required for authenticating service requests by authorized actors, since not all types of users have access to all the resources. In addition, following our design approach, the service developer can ensure that the designed web service interfaces will be REST compliant. The second contribution of this thesis is consistency analysis of the behavioral REST interfaces. To overcome the inconsistency problem and design errors in our service models, we have used semantic technologies. The REST interfaces are represented in the Web Ontology Language, OWL 2, so that they can be part of the semantic web. These interfaces are used with OWL 2 reasoners to check for unsatisfiable concepts, which result in implementations that fail. This work is fully automated thanks to the implemented translation tool and the existing OWL 2 reasoners. The third contribution of this thesis is the verification and validation of REST web services. We have used model checking techniques with the UPPAAL model checker for this purpose.
Timed automata are generated from the UML-based service design models with our transformation tool and are verified for basic properties such as deadlock freedom, liveness, reachability and safety. The implementation of a web service is tested using a black-box testing approach. Test cases are generated from the UPPAAL timed automata and, using the online testing tool UPPAAL TRON, the service implementation is validated at runtime against its specifications. Requirement traceability is also addressed in our validation approach, with which we can see what service goals are met and trace back the unfulfilled service goals to detect the faults in the design models. A final contribution of the thesis is an implementation of behavioral REST interfaces and service monitors from the service design models. The partial code generation tool creates code skeletons of REST web services with method pre- and post-conditions. The pre-conditions of methods constrain the user to invoke the stateful REST service under the right conditions, and the post-conditions constrain the service developer to implement the right functionality. The details of the methods can be manually inserted by the developer as required. We do not target complete automation because we focus only on the interface aspects of the web service. The applicability of the approach is demonstrated with a pedagogical example of a hotel room booking service and a relatively complex worked example of a holiday booking service taken from the industrial context. The former example presents a simple explanation of the approach, and the latter worked example shows how stateful and timed web services offering complex scenarios and involving other web services can be constructed using our approach.
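The idea of method pre- and post-conditions on a stateful REST interface can be illustrated with a minimal Python sketch. It is not the code produced by the thesis tool: the states (`created`, `paid`, `confirmed`, `cancelled`), the resource names and the class `BookingInterface` are all hypothetical, chosen only to echo the hotel room booking example.

```python
# Hypothetical behavioral interface for a room booking resource.
# Each entry maps (HTTP method, resource) to (required state, resulting state).
TRANSITIONS = {
    ("PUT", "payment"): ("created", "paid"),
    ("PUT", "confirmation"): ("paid", "confirmed"),
    ("DELETE", "booking"): ("created", "cancelled"),
}


class PreconditionError(Exception):
    """Raised when a request arrives in a state that does not allow it."""


class BookingInterface:
    """Tracks the booking state and guards every request with its conditions."""

    def __init__(self):
        self.state = "created"

    def invoke(self, method, resource):
        key = (method, resource)
        if key not in TRANSITIONS:
            raise PreconditionError(f"{method} {resource} is not part of the interface")
        required, target = TRANSITIONS[key]
        # Pre-condition: the client must call the method in the right state.
        if self.state != required:
            raise PreconditionError(
                f"{method} {resource} requires state '{required}', got '{self.state}'"
            )
        self.state = target  # A real service would run its business logic here.
        # Post-condition: the implementation must leave the resource in the
        # state promised by the interface.
        assert self.state == target
        return self.state


if __name__ == "__main__":
    booking = BookingInterface()
    print(booking.invoke("PUT", "payment"))
    print(booking.invoke("PUT", "confirmation"))
```

In this sketch the pre-condition rejects out-of-order requests, such as confirming a booking that has not been paid, which is the kind of request sequencing the abstract describes for stateful services.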
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
The emerging technologies have recently challenged the libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially the nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, and in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, which was the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical and orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language.
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also their development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in linguistic research. How does the library meet the objectives, which appear to be beyond its traditional playground?
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often included already obsolete characters, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in the traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Another notable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012).
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of the traditional crowdsourcing become more evident when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can respond to the needs of research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand out assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in those areas of vocabulary where the researchers require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy.
We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the perspective of nichesourcing, it is essential that altruism plays a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
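The editing step described in the abstract, correcting a recognized word in an ALTO XML file and collecting words for a wordlist, can be sketched in a few lines of Python. This is not the National Library of Finland's editor: the sample document, the IDs, the misrecognized word and the helper names `correct_word` and `word_list` are invented for illustration, and the sketch assumes the ALTO v2 namespace, where each recognized word is a `String` element with a `CONTENT` attribute.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the ALTO v2 namespace.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"

# Invented sample: one text line with an OCR error ("w0rd" instead of "word").
SAMPLE = f"""<alto xmlns="{ALTO_NS}">
  <Layout><Page ID="P1"><PrintSpace><TextBlock ID="TB1"><TextLine ID="TL1">
    <String ID="S1" CONTENT="w0rd"/>
    <SP/>
    <String ID="S2" CONTENT="list"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def correct_word(root, string_id, new_content):
    """Replace the CONTENT of the String with the given ID; return the old text."""
    for node in root.iter(f"{{{ALTO_NS}}}String"):
        if node.get("ID") == string_id:
            old = node.get("CONTENT")
            node.set("CONTENT", new_content)
            return old
    return None  # No String element carries that ID.


def word_list(root):
    """Collect the text of every String element in document order."""
    return [node.get("CONTENT") for node in root.iter(f"{{{ALTO_NS}}}String")]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    correct_word(root, "S1", "word")
    print(word_list(root))
```

A real editor would also keep the word coordinates and write the corrected file back through the back-end's REST API; neither is shown here because the abstract does not describe those interfaces.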
Abstract:
This study was part of a project investigating electronic business and wireless applications. Its aim was to determine the role of forecasting in the decision-making and planning process and to identify the most suitable and most frequently used technology forecasting methods. The forecasting methods were examined particularly from the viewpoint of new technology and long-term forecasting. The study was based on an analysis of literature on technological forecasting, long-term planning and innovation processes. Based on this material, technology forecasting is described as a means of acquiring information in support of organizations' planning processes. The work also evaluates the following technological forecasting methods: trend analysis, Delphi, cross-impact analysis, morphological analysis and scenario analysis. The work presents the characteristics, limitations and application possibilities of each forecasting method. Using the methods presented, useful information about future prospects can be gathered, which can then be used in organizations' planning processes.
Abstract:
This thesis describes the stages of a system integration implemented between the Lahti Fenix Kuntalaistili system and the Tekla Xcity system. The Kuntalaistili (citizen account) system is an electronic service platform under development in the City of Lahti's Fenix project, through which various municipal services, such as dentist appointments, are offered to residents. Tekla Xcity is a system intended for municipalities and cities, from which it is possible to retrieve, for example, personal and location data. The thesis first briefly introduces different ways of implementing system integrations. Next, particular attention is paid to so-called web services, whose advantages and disadvantages are assessed through a practical example. The frame of reference here is the Kuntalaistili system and the service-oriented architecture used in it. After the assessment of the architecture and messaging solutions, the thesis moves on to the practical part, in which the system integration itself is implemented. The integration is implemented using an open-source service bus and the messaging frameworks available for it. In the different stages of the integration, various messaging protocols and their use with the selected messaging frameworks are explored. The functioning of each protocol is verified by analyzing the network traffic between the components and endpoints involved in the integration.
Abstract:
Web application performance testing is an emerging and important field of software engineering. As web applications become more commonplace and complex, the need for performance testing will only increase. This paper discusses common concepts, practices and tools that lie at the heart of web application performance testing. A pragmatic, hands-on approach is assumed where applicable; real-life examples of test tooling, execution and analysis are presented right next to the underpinning theory. At the client-side, web application performance is primarily driven by the amount of data transmitted over the wire. At the server-side, selection of programming language and platform, implementation complexity and configuration are the primary contributors to web application performance. Web application performance testing is an activity that requires delicate coordination between project stakeholders, developers, system administrators and testers in order to produce reliable and useful results. Proper test definition, execution, reporting and repeatable test results are of utmost importance. Open-source performance analysis tools such as Apache JMeter, Firebug and YSlow can be used to realise effective web application performance tests. A sample case study using these tools is presented in this paper. The sample application was found to perform poorly even under the moderate load incurred by the sample tests.
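The emphasis on repeatable test results and reporting can be made concrete with a small Python sketch of how response-time samples might be collected and summarized. It does not use Apache JMeter, Firebug or YSlow and is not the paper's case study: the function names `measure`, `percentile` and `summarize` are hypothetical, and `fake_request` is a stand-in that sleeps instead of calling a real web application.

```python
import math
import statistics
import time


def measure(action, repetitions):
    """Call `action` repeatedly and return each call's duration in seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        action()
        samples.append(time.perf_counter() - start)
    return samples


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Reduce raw durations to the figures a test report typically quotes."""
    return {
        "count": len(samples),
        "mean": statistics.mean(samples),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }


def fake_request():
    # Stand-in for one HTTP request to the application under test.
    time.sleep(0.001)


if __name__ == "__main__":
    print(summarize(measure(fake_request, 20)))
```

Reporting a high percentile next to the mean is a common choice because a few slow responses can hide behind an acceptable average.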
Abstract:
The current research addresses various questions raised and deliberated upon by different entrepreneurs. It provides a valuable contribution to understanding the importance of social media and ICT applications. Furthermore, it demonstrates how to support and implement management consulting and business coaching start-ups with the help of social media and ICT tools. The thesis presents a literature review drawing on information systems science, SME and e-business journals, web articles, and survey analysis reports on social media applications. The methodology was a qualitative research method in which social anthropological approaches were used to oversee the case study activities in order to collect data. The collaborative social research approach served as the framework for the action research method. The research discovered that new business start-ups and small businesses do not use social media and ICT tools, unlike most large corporations. At present, open-source ICT technologies and social media applications are as available to new and small businesses as they are to larger companies. Successful implementation of social media and ICT applications can enhance start-up performance and help overcome business hassles. The thesis sheds some light on effective and innovative implementation of social media and ICT applications for new business risk takers and small businesses.
Abstract:
The portfolio as a means of demonstrating personal skills has lately been gaining prominence among technology students. This is partially due to the introduction of electronic portfolios, or e-portfolios. As platforms for e-portfolio management with different approaches have been introduced, the learning cycle, traditional portfolio pedagogy, and learner centricity have sometimes been forgotten, and as a result, the tools have been used for the most part as data depositories. The purpose of this thesis is to show how the construction of e-portfolios of IT students can be supported by institutions through the usage of different tools that relate to study advising, teaching, and learning. The construction process is presented as a cycle based on learning theories. Actions related to the various phases of the e-portfolio construction process are supported by the implementation of software applications. To maximize learner-centricity and minimize the intervention of the institution, the evaluated and controlled actions for these practices can be separated from the e-portfolios, leaving the construction of the e-portfolio to students. The main contributions of this thesis are the implemented applications, which can be considered to support the e-portfolio construction by assisting in planning, organizing, and reflecting activities. Eventually, this supports the students in their construction of better and more extensive e-portfolios. The implemented tools include 1) JobSkillSearcher to help students’ recognition of the demands of the ICT industry regarding skills, 2) WebTUTOR to support students’ personal study planning, 3) Learning Styles to determine students' learning styles, and 4) MyPeerReview to provide a platform on which to carry out anonymous peer review processes in courses. 
The most visible outcome concerning the e-portfolio is its representation, meaning that one can use it to demonstrate personal achievements at the time of seeking a job and gaining employment. Testing the tools and the selected open-source e-portfolio application indicates that the degree of richness of e-portfolio content can be increased by using the implemented applications.
Abstract:
Developing bioimage informatics, from microscopy to software solutions, with α2β1 integrin as an application example. When the human genome was sequenced in 2003, the main task of the life sciences became determining the functions of the various genes, and different bioimaging techniques became central research methods. Technological advances led in particular to an explosive growth in the popularity of fluorescence-based light microscopy techniques, but microscopy had to transform from a qualitative science into a quantitative one. This change gave rise to a new discipline, bioimage informatics, which has been said to have the potential to revolutionize the life sciences. This thesis presents a broad, interdisciplinary body of work in the field of bioimage informatics. The first aim of the thesis was to develop protocols for four-dimensional confocal microscopy of living cells, which was one of the fastest-growing bioimaging methods. The human collagen receptor α2β1 integrin, an important molecule in many physiological and pathological processes, served as the application example. The work achieved clear visualizations of integrin movements, clustering and internalization, but there were no tools for quantitative analysis of the image information. The second aim of the thesis thus became the development of computer software suitable for such analysis. At the same time, bioimage informatics was born, and what the new field needed most urgently was specialized computer software. The most important result of this thesis work therefore became BioImageXD, a new kind of open-source software for the visualization, processing and analysis of multidimensional bioimages. BioImageXD grew into one of the largest and most versatile programs in its field. It was published in a special issue of Nature Methods on bioimage informatics, and it became well known and widely used. The third aim of the thesis was to apply the developed methods to something more practical.
Artificial silica nanoparticles were made that carried antibodies recognizing α2β1 integrin as "address labels". Using BioImageXD, it was shown that the nanoparticles have potential in targeted drug delivery applications. One fundamental aim of this thesis work was to advance the new and unknown discipline of bioimage informatics, and this aim was achieved particularly through BioImageXD and its numerous published applications. The thesis work has significant potential for the future, but bioimage informatics faces serious challenges. The field is too complex for the average biomedical researcher to master, and its most central element, open-source software development work, is undervalued. Several improvements are needed in these respects,
Abstract:
Panel at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014