26 results for Open Source Software


Relevance:

90.00%

Publisher:

Abstract:

The EU Directive harmonising copyright, Directive 2001/29/EC, has been implemented in all META-NORD countries. The licensing schemas of open content/open source and META-SHARE as well as CLARIN are briefly discussed. The status of the licensing of tools and resources available at the consortium partners is outlined. The aim of the article is to compare a set of open content and open source licenses and provide some guidance on the optimal use of licenses provided by META-NET and CLARIN for licensing the tools and resources for the benefit of the language technology community.

Relevance:

90.00%

Publisher:

Abstract:

Language software applications encounter new words, e.g., acronyms, technical terminology, names or compounds of such words. In order to add new words to a lexicon, we need to indicate their inflectional paradigm. We present a new generally applicable method for creating an entry generator, i.e. a paradigm guesser, for finite-state transducer lexicons. As a guesser tends to produce numerous suggestions, it is important that the correct suggestions be among the first few candidates. We prove some formal properties of the method and evaluate it on Finnish, English and Swedish full-scale transducer lexicons. We use the open-source Helsinki Finite-State Technology to create finite-state transducer lexicons from existing lexical resources and automatically derive guessers for unknown words. The method has a recall of 82-87% and a precision of 71-76% for the three test languages. The model needs no external corpus and can therefore serve as a baseline.
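As a rough illustration of what a paradigm guesser does, the sketch below ranks candidate paradigms for an unknown word by its longest word-final suffix seen in a lexicon. This is a plain-Python stand-in, not the finite-state method described in the abstract; the function names, the `max_suffix` parameter and the paradigm labels (`V-ing`, `N`) are hypothetical.

```python
from collections import Counter, defaultdict


def train_guesser(lexicon, max_suffix=4):
    """Count how often each word-final suffix occurs with each paradigm."""
    table = defaultdict(Counter)
    for word, paradigm in lexicon:
        for k in range(1, min(max_suffix, len(word)) + 1):
            table[word[-k:]][paradigm] += 1
    return table


def guess(table, word, max_suffix=4):
    """Rank paradigms: longest matching suffix first, then by frequency."""
    ranked = []
    for k in range(min(max_suffix, len(word)), 0, -1):
        counts = table.get(word[-k:], Counter())
        for paradigm, _ in counts.most_common():
            if paradigm not in ranked:
                ranked.append(paradigm)
    return ranked


# Toy lexicon with hypothetical paradigm labels.
lexicon = [
    ("walking", "V-ing"),
    ("talking", "V-ing"),
    ("king", "N"),
    ("cat", "N"),
    ("hat", "N"),
]
table = train_guesser(lexicon)
candidates = guess(table, "baking")
```

Because only the first few suggestions matter in practice, a ranking like this would be evaluated by checking whether the correct paradigm appears near the top of `candidates`.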

Relevance:

80.00%

Publisher:

Abstract:

Place identification refers to the process of analyzing sensor data in order to detect places, i.e., spatial areas that are linked with activities and associated with meanings. Place information can be used, e.g., to provide awareness cues in applications that support social interactions, to provide personalized and location-sensitive information to the user, and to support mobile user studies by providing cues about the situations the study participant has encountered. Regularities in human movement patterns make it possible to detect personally meaningful places by analyzing location traces of a user. This thesis focuses on providing system level support for place identification, as well as on algorithmic issues related to the place identification process. The move from location to place requires interactions between location sensing technologies (e.g., GPS or GSM positioning), algorithms that identify places from location data and applications and services that utilize place information. These interactions can be facilitated using a mobile platform, i.e., an application or framework that runs on a mobile phone. For the purposes of this thesis, mobile platforms automate data capture and processing and provide means for disseminating data to applications and other system components. The first contribution of the thesis is BeTelGeuse, a freely available, open source mobile platform that supports multiple runtime environments. The actual place identification process can be understood as a data analysis task where the goal is to analyze (location) measurements and to identify areas that are meaningful to the user. The second contribution of the thesis is the Dirichlet Process Clustering (DPCluster) algorithm, a novel place identification algorithm. The performance of the DPCluster algorithm is evaluated using twelve different datasets that have been collected by different users, at different locations and over different periods of time. 
As part of the evaluation we compare the DPCluster algorithm against other state-of-the-art place identification algorithms. The results indicate that the DPCluster algorithm provides improved generalization performance against spatial and temporal variations in location measurements.

Relevance:

80.00%

Publisher:

Abstract:

Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on the one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for positioning in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications called ContextPhone is described and released as open source. Finally, a set of methodological findings for the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of privacy implications. Drawing on several long-term field studies, the usage of the system is analyzed in the light of how users make inferences about others based on real-time contextual cues mediated by the system. 
The analysis of privacy implications draws together the social psychological theory of self-presentation and research in privacy for ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows: The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems but also, more generally, to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but the self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to the users can be used to allow them to use the data themselves, rather than just being passive subjects of data gathering.

Relevance:

80.00%

Publisher:

Abstract:

This work is a case study of applying nonparametric statistical methods to corpus data. We show how to use ideas from permutation testing to answer linguistic questions related to morphological productivity and type richness. In particular, we study the use of the suffixes -ity and -ness in the 17th-century part of the Corpus of Early English Correspondence within the framework of historical sociolinguistics. Our hypothesis is that the productivity of -ity, as measured by type counts, is significantly low in letters written by women. To test such hypotheses, and to facilitate exploratory data analysis, we take the approach of computing accumulation curves for types and hapax legomena. We have developed an open source computer program which uses Monte Carlo sampling to compute the upper and lower bounds of these curves for one or more levels of statistical significance. By comparing the type accumulation from women’s letters with the bounds, we are able to confirm our hypothesis.
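The idea of bounding an accumulation curve by Monte Carlo sampling can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' program: the curve is indexed by number of texts rather than by running words, the bounds are pointwise percentiles, and the function names and the `n_perm`, `alpha` and `seed` parameters are hypothetical.

```python
import random


def accumulation_curve(texts):
    """Cumulative number of distinct types after each text, in the given order."""
    seen = set()
    curve = []
    for tokens in texts:
        seen.update(tokens)
        curve.append(len(seen))
    return curve


def permutation_bounds(texts, n_perm=1000, alpha=0.05, seed=0):
    """Pointwise lower/upper bounds of the curve over random text orderings."""
    rng = random.Random(seed)
    order = list(texts)
    curves = []
    for _ in range(n_perm):
        rng.shuffle(order)
        curves.append(accumulation_curve(order))
    lo_idx = int((alpha / 2) * (n_perm - 1))
    hi_idx = int((1 - alpha / 2) * (n_perm - 1))
    lower, upper = [], []
    for step in range(len(texts)):
        values = sorted(curve[step] for curve in curves)
        lower.append(values[lo_idx])
        upper.append(values[hi_idx])
    return lower, upper


# Toy corpus: each inner list holds the types found in one letter.
corpus = [["ability", "rarity"], ["rarity", "purity"], ["vanity"]]
lower, upper = permutation_bounds(corpus, n_perm=200)
```

A subcorpus (for example, letters by one group of writers) whose observed type count falls below `lower` at the corresponding step would count as significantly low at the chosen level under this simplified scheme.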

Relevance:

80.00%

Publisher:

Abstract:

The main aim of the present study was to develop information and communication technology (ICT) based chemistry education. The goals for the study were to support meaningful chemistry learning, research-based teaching and diffusion of ICT innovations. These goals were used as guidelines that form the theoretical framework for this study. This doctoral dissertation is based on an eight-stage research project that included three design research studies. These three studies were scrutinized as separate case studies in which the different cases were formed according to different design teams: i) one researcher was in charge of the design and teachers were involved in the research process, ii) a research group was in charge of the design and students were involved in the research process, and iii) the design was done by student teams, the research was done collaboratively, and the design process was coordinated by a researcher. The research projects were conducted using a mixed-methods approach, which enabled a comprehensive view of education design. In addition, the three central areas of design research (problem analysis, design solution and design process) were included in the research, which was guided by the main research questions formed according to these central areas: 1) design solution: what kind of elements are included in ICT-based learning environments that support meaningful chemistry learning and diffusion of innovation, 2) problem analysis: what kind of new possibilities do the designed learning environments offer for the support of meaningful chemistry learning, and 3) design process: what kind of opportunities and challenges does collaboration bring to the design of ICT-based learning environments? The main research questions were answered according to the analysis of the survey and observation data, six designed learning environments and ten design narratives from the three case studies. 
Altogether 139 chemistry teachers and teacher students were involved in the design processes. The data was mainly analysed by methods of qualitative content analysis. The first main result of the study gives new information on meaningful chemistry learning and the elements of an ICT-based learning environment that support the diffusion of innovation, which can help in the development of future ICT-education design. When the designed learning environment was examined in the context of chemistry education, it was evident that an ICT-based chemistry learning environment supporting the meaningful learning of chemistry motivates the students and makes the teacher's work easier. In addition, it should enable the simultaneous fulfilment of several pedagogical goals and activate higher-level cognitive processes. A learning environment supporting the diffusion of ICT innovation is suitable for the Finnish school environment, based on open source code, and easy to use, with quality chemistry content. According to the second main result, new information was acquired about the possibilities of ICT-based learning environments in supporting meaningful chemistry learning. This will help in setting the goals for future ICT education. After the analysis of design solutions and their evaluations, it can be said that ICT enables the recognition of all elements that define learning environments (i.e. didactic, physical, technological and social elements). The research particularly demonstrates the significance of ICT in supporting students' motivation and higher-level cognitive processes, as well as the versatile visualization resources for chemistry that ICT makes possible. In addition, a research-based teaching method supports the diffusion of the studied innovation well at the individual level. The third main result brought out new information on the significance of collaboration in design research, which guides the design of ICT education development. 
According to the analysis of design narratives, it can be said that collaboration is important in the execution of scientifically reliable design research. It enables comprehensive requirement analysis and multifaceted development, which improves the reliability and validity of the research. At the same time, it sets reliability challenges by complicating documentation and coordination, for example. In addition, a new method for design research was developed. Its aim is to support the execution of complicated collaborative design projects. To increase the reliability and validity of the research, a model theory was used. It enables time-bound documentation and visualization of design decisions, which clarify the process. This improves the reliability of the research. The validity of the research is improved by requirement definition through models. In this way, learning environments that meet the design goals can be constructed. The designed method can be used in education development from the comprehensive to the higher level. It can be used to recognize the needs of different interest groups and individuals with regard to processes, technology and substance knowledge as well as interfaces and relations between them. The developed method also has commercial potential. It is used to design learning environments for national and international markets.

Relevance:

80.00%

Publisher:

Abstract:

Researchers and developers in academia and industry would benefit from a facility that enables them to easily locate, licence and use the kind of empirical data they need for testing and refining their hypotheses, and to deposit and disseminate their data, e.g. to support replication and validation of reported scientific experiments. To answer these needs initially in Finland, there is an ongoing project at the University of Helsinki and its collaborators to create a user-friendly web service for researchers and developers in Finland and other countries. In our talk, we describe ongoing work to create a palette of extensive but easily available Finnish language resources and technologies for the research community, including lexical resources, wordnets, morphologically tagged corpora, dependency syntactic treebanks and parsebanks, open-source finite-state toolkits and libraries, and language models to support text analysis and processing at the customer site. The first publicly available results are also presented.

Relevance:

80.00%

Publisher:

Abstract:

FinnWordNet is a wordnet for Finnish that complies with the format of the Princeton WordNet (PWN) (Fellbaum, 1998). It was built by having human translators translate the Princeton WordNet 3.0 synsets into Finnish. It is open source and contains 117,000 synsets. The Finnish translations were inserted into the PWN structure, resulting in a bilingual lexical database. In natural language processing (NLP), wordnets have been used for infusing computers with semantic knowledge, assuming that humans already have a sufficient amount of this knowledge. In this paper we present a case study of using wordnets as an electronic dictionary. We tested whether native Finnish speakers benefit from using a wordnet while completing English sentence completion tasks. We found that using either an English wordnet or a bilingual English-Finnish wordnet significantly improves performance in the task. This should be taken into account when setting standards and comparing human and computer performance on these tasks.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.

Relevance:

80.00%

Publisher:

Abstract:

FinnWordNet is a WordNet for Finnish that conforms to the framework given in Fellbaum (1998) and Vossen (ed.) (1998). FinnWordNet is open source and currently contains 117,000 synsets. A classic WordNet consists of synsets, or sets of partial synonyms whose shared meaning is described and exemplified by a gloss, a common part of speech and a hyperonym. Synsets in a WordNet are arranged in hierarchical partial orderings according to semantic relations like hyponymy/hyperonymy. Together the gloss, part of speech and hyperonym fix the meaning of a word and constrain the possible translations of a word in a given synset. The Finnish group has opted for translating Princeton WordNet 3.0 synsets wholesale into Finnish by professional translators, because the translation process can be controlled with regard to quality, coverage, cost and speed of translation. The project was financed by FIN-CLARIN at the University of Helsinki. According to our preliminary evaluation, the translation process was diligent and the quality is on a par with the original Princeton WordNet.

Relevance:

80.00%

Publisher: