53 results for Fuzzy linguistic variable
Abstract:
This study examines the structure of the Russian Reflexive Marker (-ся/-сь) and offers a usage-based model building on Construction Grammar and a probabilistic view of linguistic structure. Traditionally, reflexive verbs are accounted for relative to non-reflexive verbs. These accounts assume that linguistic structures emerge as pairs. Furthermore, these accounts assume directionality where the semantics and structure of a reflexive verb can be derived from the non-reflexive verb. However, this directionality does not necessarily hold diachronically. Additionally, the semantics and the patterns associated with a particular reflexive verb are not always shared with the non-reflexive verb. Thus, a model is proposed that can accommodate the traditional pairs as well as the possible deviations without postulating different systems. A random sample of 2000 instances marked with the Reflexive Marker was extracted from the Russian National Corpus, and the sample used in this study contains 819 unique reflexive verbs. This study moves away from the traditional pair account and introduces the concept of Neighbor Verb. A neighbor verb exists for a reflexive verb if they share the same phonological form excluding the Reflexive Marker. It is claimed here that the Reflexive Marker constitutes a system in Russian and the relation between the reflexive and neighbor verbs constitutes a cross-paradigmatic relation. Furthermore, the relation between the reflexive and the neighbor verb is argued to be of symbolic connectivity rather than directionality. Effectively, the relation holding between particular instantiations can vary. The theoretical basis of the present study builds on this assumption. Several new variables are examined in order to systematically model variability of this symbolic connectivity, specifically the degree and strength of connectivity between items. In usage-based models, the lexicon does not constitute an unstructured list of items. 
Instead, items are assumed to be interconnected in a network. This interconnectedness is defined as Neighborhood in this study. Additionally, each verb carves its own niche within the Neighborhood and this interconnectedness is modeled through rhyme verbs constituting the degree of connectivity of a particular verb in the lexicon. The second component of the degree of connectivity concerns the status of a particular verb relative to its rhyme verbs. The connectivity within the neighborhood of a particular verb varies and this variability is quantified by using the Levenshtein distance. The second property of the lexical network is the strength of connectivity between items. Frequency of use has been one of the primary variables in functional linguistics used to probe this. In addition, a new variable called Constructional Entropy is introduced in this study building on information theory. It is a quantification of the amount of information carried by a particular reflexive verb in one or more argument constructions. The results of the lexical connectivity indicate that the reflexive verbs have statistically greater neighborhood distances than the neighbor verbs. This distributional property can be used to motivate the traditional observation that the reflexive verbs tend to have idiosyncratic properties. A set of argument constructions, generalizations over usage patterns, are proposed for the reflexive verbs in this study. In addition to the variables associated with the lexical connectivity, a number of variables proposed in the literature are explored and used as predictors in the model. The second part of this study introduces the use of a machine learning algorithm called Random Forests. The performance of the model indicates that it is capable, up to a degree, of disambiguating the proposed argument construction types of the Russian Reflexive Marker. Additionally, a global ranking of the predictors used in the model is offered. 
Finally, most construction grammars assume that argument constructions form a network structure. A new method is proposed that establishes a generalization over the argument constructions, referred to as the Linking Construction. In sum, this study explores the structural properties of the Russian Reflexive Marker, and a new model is set forth that can accommodate both the traditional pairs and potential deviations from them in a principled manner.
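The two quantities named in this abstract, Levenshtein distance between verb forms and an information-theoretic measure over argument constructions, can be illustrated with a minimal sketch. The abstract does not give the formula for Constructional Entropy; the code below assumes it behaves like Shannon entropy over the counts of a verb's occurrences in each argument construction, and the function names are hypothetical.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def shannon_entropy(counts):
    """Entropy in bits of a distribution given as raw counts.

    Assumption: used here as a stand-in for Constructional Entropy,
    with one count per argument construction a verb occurs in.
    """
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# A reflexive verb and its neighbor verb differ only by the marker.
print(levenshtein("мыться", "мыть"))
# A verb split evenly over two constructions carries one bit.
print(shannon_entropy([5, 5]))
```

A verb attested in a single construction has entropy zero under this reading, while a verb spread evenly over many constructions has high entropy.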
Abstract:
This work examines the first phase of the innovation process, the "fuzzy front end", referred to here as the front end phase. The front end phase is the preliminary research and planning phase of the innovation process that precedes the technical development phase. It is the least studied part of the innovation process, and for most companies it is fuzzy and hard to grasp. According to research, however, competence in the front end phase is a highly significant factor in a company's innovativeness. The work outlines the content and objectives of the innovation process and compares existing models of the structure of the front end phase. It identifies key factors in the success and efficiency of the front end phase. In addition, it discusses management factors that contribute to success in the front end phase.
Abstract:
Fan systems are responsible for approximately 10% of the electricity consumption in industrial and municipal sectors, and it has been found that there is energy-saving potential in these systems. To this end, variable speed drives (VSDs) are used to enhance the efficiency of fan systems. Usually, fan system operation is optimized based on measurements of the system, but there are seldom readily installed meters in the system that can be used for the purpose. Thus, sensorless methods are needed for the optimization of fan system operation. In this thesis, methods for the fan operating point estimation with a variable speed drive are studied and discussed. These methods can be used for the energy efficient control of the fan system without additional measurements. The operation of these methods is validated by laboratory measurements and data from an industrial fan system. In addition to their energy consumption, condition monitoring of fan systems is a key issue as fans are an integral part of various production processes. Fan system condition monitoring is usually carried out with vibration measurements, which again increase the system complexity. However, variable speed drives can already be used for pumping system condition monitoring. Therefore, it would add to the usability of a variable-speed-driven fan system if the variable speed drive could be used as a condition monitoring device. In this thesis, sensorless detection methods for three lifetime-reducing phenomena are suggested: detection of fan contamination build-up, of the correct rotational direction, and of fan surge. The methods use the variable speed drive monitoring and control options for the detection along with simple signal processing methods, such as power spectral density estimates. The methods have been validated by laboratory measurements. 
The key finding of this doctoral thesis is that a variable speed drive can be used on its own as a monitoring and control device for the fan system energy efficiency, and it can also be used in the detection of certain lifetime-reducing phenomena.
Abstract:
Pumping processes that require a wide range of flow are often equipped with parallel-connected centrifugal pumps. In parallel pumping systems, variable speed control allows the output required by the process to be delivered with a varying number of operating pump units and selected rotational speed references. However, the optimization of parallel-connected, rotational speed controlled pump units often requires adaptive modelling of both the parallel pump characteristics and the surrounding system in varying operating conditions. The information available for system modelling in typical parallel pumping applications, such as waste water treatment and various cooling and water delivery tasks, can be limited, and the lack of real-time operation point monitoring often limits accurate energy efficiency optimization. Hence, easily implementable control strategies that can be adopted with minimum system data are necessary. This doctoral thesis concentrates on methods that allow the energy efficient use of variable speed controlled parallel pumps in system scenarios in which each parallel pump unit consists of a centrifugal pump, an electric motor, and a frequency converter. Firstly, the suitable operating conditions for variable speed controlled parallel pumps are studied. Secondly, methods for determining the output of each parallel pump unit using characteristic curve-based operation point estimation with a frequency converter are discussed. Thirdly, the implementation of a control strategy based on real-time pump operation point estimation and sub-optimization of each parallel pump unit is studied. The findings of the thesis support the idea that the energy efficiency of pumping can be increased without installing new, more efficient components in the systems, simply by adopting suitable control strategies. 
An easily implementable and adaptive control strategy for variable speed controlled parallel pumping systems can be created by utilizing the pump operation point estimation available in modern frequency converters. Hence, additional real-time flow metering, start-up measurements, and a detailed system model are unnecessary, and the pumping task can be fulfilled by determining, for each parallel pump unit, a speed reference that yields energy efficient operation of the pumping system.
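Characteristic curve-based operation point estimation can be sketched as follows. This is a minimal illustration, not the procedure of the thesis: it assumes the frequency converter supplies estimates of shaft power and rotational speed, that the pump obeys the affinity laws, and that the nominal-speed flow-to-power curve is monotonic. All curve values and names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical characteristic curves at nominal speed (not real pump data).
N_NOM = 1450.0                           # nominal speed, rpm
Q_NOM = [0.0, 10.0, 20.0, 30.0, 40.0]    # flow rate, l/s
P_NOM = [4.0, 6.0, 8.0, 10.0, 12.0]      # shaft power, kW (monotonic in Q)
H_NOM = [30.0, 29.0, 26.0, 21.0, 14.0]   # head, m


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def estimate_operating_point(power_kw, speed_rpm):
    """Return (flow, head) estimated from converter power and speed estimates."""
    ratio = speed_rpm / N_NOM
    # Affinity laws: P ~ n^3, Q ~ n, H ~ n^2.
    power_at_nominal = power_kw / ratio ** 3
    flow_at_nominal = interp(power_at_nominal, P_NOM, Q_NOM)
    head_at_nominal = interp(flow_at_nominal, Q_NOM, H_NOM)
    return flow_at_nominal * ratio, head_at_nominal * ratio ** 2


flow, head = estimate_operating_point(power_kw=1.0, speed_rpm=725.0)
print(flow, head)
```

Running the same estimate for each parallel pump unit gives per-unit outputs without a flow meter, which is the information a speed-reference controller of the kind described above would need.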
Abstract:
We introduce a new tool for correcting OCR errors in materials held in a repository of cultural materials. The poster is aimed at all who are interested in digital humanities and who might find our tool useful. The poster will focus on the OCR correction tool and on the background processes. We have started a project on materials published in Finno-Ugric languages in the Soviet Union in the 1920s and 1930s. The materials are digitised in Russia. As they arrive, we publish them in DSpace (fennougrica.kansalliskirjasto.fi). For research purposes, the results of the OCR must be corrected manually. For this we have built a new tool. Although similar tools exist, we found in-house development necessary in order to serve the researchers' needs. The tool enables exporting the corrected text as required by the researchers. It makes it possible to distribute the correction tasks and their supervision. After a supervisor has approved a text as finalised, the new version of the work will replace the old one in DSpace. The project has benefitted the small language communities, opened channels for cooperation in Russia, and increased our capabilities in digital humanities. The OCR correction tool will be available to others.
Abstract:
Can crowdsourcing solutions serve many masters? Can they be beneficial for both, for the layman or native speakers of minority languages on the one hand and serious linguistic research on the other? How did an infrastructure that was designed to support linguistics turn out to be a solution for raising awareness of native languages? Since 2012 the National Library of Finland has been developing the Digitisation Project for Kindred Languages, in which the key objective is to support a culture of openness and interaction in linguistic research, but also to promote crowdsourcing as a tool for participation of the language community in research. In the course of the project, over 1,200 monographs and nearly 111,000 pages of newspapers in Finno-Ugric languages will be digitised and made available in the Fenno-Ugrica digital collection. This material was published in the Soviet Union in the 1920s and 1930s, and users have had only sporadic access to the material. The publication of open-access and searchable materials from this period is a goldmine for researchers. Historians, social scientists and laymen with an interest in specific local publications can now find text materials pertinent to their studies. The linguistically-oriented population can also find writings to delight them: (1) lexical items specific to a given publication, and (2) orthographically-documented specifics of phonetics. In addition to the open access collection, we developed an open source code OCR editor that enables the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary since these rare and peripheral prints often include already archaic characters, which are neglected by modern OCR software developers but belong to the historical context of kindred languages, and are thus an essential part of the linguistic heritage. 
When modelling the OCR editor, it was essential to consider both the needs of researchers and the capabilities of lay citizens, and to have them participate in the planning and execution of the project from the very beginning. By implementing the feedback from both groups iteratively, it was possible to turn the requested changes into tools for research that not only supported the work of linguists but also encouraged the citizen scientists to face the challenge and work with the crowdsourcing tools for the benefit of research. This presentation will not only deal with the technical aspects, developments and achievements of the infrastructure but will also highlight the way in which user groups, researchers and lay citizens were engaged in the process as an active and communicative group of users and how their contributions were made to mutual benefit.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at the 12th Bibliotheca Baltica Symposium at Södertörn University Library
Abstract:
Book review
Abstract:
This master's thesis introduces the fuzzy tolerance/equivalence relation and its application in cluster analysis. The work presents the construction of fuzzy equivalence relations using increasing generators. We investigate the role of increasing generators in the creation of intersection, union and complement operators. The objective is to develop different varieties of fuzzy tolerance/equivalence relations using different varieties of increasing generators. Finally, we perform a comparative study of these fuzzy tolerance/equivalence relations in their application to a clustering method.
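To show how a fuzzy equivalence relation can drive clustering, here is a minimal sketch. It does not use the generator-based construction of the thesis; instead it assumes the classic alternative of taking the max-min transitive closure of a fuzzy tolerance relation (reflexive and symmetric) and then reading clusters off an alpha-cut. The similarity matrix is hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Max-min transitive closure: turns a tolerance into an equivalence relation."""
    while True:
        composed = max_min_compose(r, r)
        nxt = [[max(a, b) for a, b in zip(row_r, row_c)]
               for row_r, row_c in zip(r, composed)]
        if nxt == r:
            return r
        r = nxt


def alpha_cut_clusters(equivalence, alpha):
    """Partition the indices by the crisp relation e[i][j] >= alpha."""
    clusters, assigned = [], set()
    for i in range(len(equivalence)):
        if i in assigned:
            continue
        members = [j for j, v in enumerate(equivalence[i]) if v >= alpha]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Hypothetical tolerance relation over four objects.
tolerance = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.4, 0.1],
    [0.2, 0.4, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
equivalence = transitive_closure(tolerance)
print(alpha_cut_clusters(equivalence, 0.5))
```

Lowering alpha merges clusters and raising it splits them, so sweeping alpha yields a hierarchy of partitions from a single relation.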
Abstract:
This thesis presents an analysis of the recently enacted Russian renewable energy policy based on a capacity mechanism. Considering its novelty and poor coverage in the academic literature, the aim of the thesis is to analyze the influence of the capacity mechanism on investors’ decision-making process. The research introduces a number of approaches to investment analysis. Firstly, a classical financial model was built with Microsoft Excel® and crisp efficiency indicators such as net present value were determined. Secondly, a sensitivity analysis was performed to understand the influence of different factors on project profitability. Thirdly, the Datar-Mathews method was applied, which, by means of a Monte Carlo simulation implemented in Matlab Simulink®, disclosed all possible outcomes of the investment project and enabled real option thinking. Fourthly, the previous analysis was replicated with the fuzzy pay-off method in Microsoft Excel®. Finally, the decision-making process under the capacity mechanism was illustrated with a decision tree. The capacity remuneration paid within 15 years is calculated individually for each RE project as a variable annuity that guarantees a particular return on investment adjusted for changes in national interest rates. The analysis results indicate that the capacity mechanism creates a real option to invest in a renewable energy project by ensuring project profitability regardless of market conditions if project-internal factors are managed properly. The latter include keeping capital expenditures within set limits, keeping production performance above 75% of target indicators, and fulfilling the localization requirement, which implies producing equipment and services within the country. The occurrence of a real option shapes the decision-making process in the following way. Initially, the investor should define an appropriate location for the planned power plant where high production performance can be achieved, and lock in this location in case of competition. 
Afterwards, the investor should wait until the capital cost limit and the localization requirement can be met; after that, the decision to invest can be made without any risk to project profitability. With respect to the kind of technology, investment in a solar PV power plant is more attractive than in wind or small hydro power, since it has a higher weighted net present value and a lower standard deviation. However, this does not change the decision-making strategy, which remains the same for each technology type. The fuzzy pay-off method proved able to disclose the same patterns of information as the Monte Carlo simulation. Being effective in investment analysis under uncertainty and easy to use, it can be recommended as a sufficient analytical tool to investors and researchers. Apart from the described results, this thesis contributes to the academic literature with a detailed description of the capacity price calculation for renewable energy, which was not available in English before. With respect to methodological novelty, advanced approaches such as the Datar-Mathews method and the fuzzy pay-off method are applied on top of an investment profitability model that also incorporates the capacity remuneration calculation. The comparison of the effects of two different RE support schemes, namely the Russian capacity mechanism and the feed-in premium, contributes to comparative policy studies and offers useful inferences for researchers and policymakers. The limitations of this research are the simplification of assumptions to the country-average level, which restricts our ability to analyze renewable energy investment by region, and the restriction of the studied policy to the wholesale power market, which leaves retail markets and remote areas unaddressed and takes medium and small investments in renewable energy out of the research focus. Eliminating these limitations would allow a full picture of the Russian renewable energy investment profile to be drawn.
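The fuzzy pay-off method mentioned in this abstract can be sketched for a triangular fuzzy NPV. The abstract does not give the calculation, so the code assumes the commonly cited formulation: the real option value is the share of the pay-off distribution's area lying above zero multiplied by the possibilistic mean of its positive side. The triangular number is written as peak `a` with left and right widths `alpha` and `beta`; all names and figures are hypothetical.

```python
def fuzzy_payoff_rov(a, alpha, beta):
    """Real option value for a triangular fuzzy NPV (a - alpha, a, a + beta).

    Assumption: ROV = (positive area / total area) * possibilistic mean
    of the positive side of the pay-off distribution.
    """
    low, high = a - alpha, a + beta
    if low >= 0:
        # Whole distribution is non-negative: ratio is 1.
        return a + (beta - alpha) / 6
    if high <= 0:
        # Whole distribution is non-positive: the option is worthless.
        return 0.0
    total_area = (alpha + beta) / 2
    if a > 0:
        # Zero lies on the left slope.
        negative_area = (alpha - a) ** 2 / (2 * alpha)
        positive_area = total_area - negative_area
        mean_positive = (alpha - a) ** 3 / (6 * alpha ** 2) + a + (beta - alpha) / 6
    else:
        # Zero lies at the peak or on the right slope.
        positive_area = (a + beta) ** 2 / (2 * beta)
        mean_positive = (a + beta) ** 3 / (6 * beta ** 2)
    return positive_area / total_area * mean_positive


# Hypothetical project: pessimistic NPV -1, best guess 1, optimistic 3.
print(fuzzy_payoff_rov(a=1.0, alpha=2.0, beta=2.0))
```

Because the inputs are just three scenario NPVs, the calculation fits in a spreadsheet, which is consistent with the abstract's remark that the method is easy to use.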
Abstract:
The shift towards a knowledge-based economy has inevitably prompted the evolution of patent exploitation. Nowadays, a patent is more than just a prevention tool for a company to block its competitors from developing rival technologies; it lies at the very heart of the company's strategy for value creation and is therefore strategically exploited for economic profit and competitive advantage. Along with the evolution of patent exploitation, the demand for reliable and systematic patent valuation has also reached an unprecedented level. However, most of the quantitative approaches in use to assess patents could arguably fall into four categories, and they are based solely on the conventional discounted cash flow analysis, whose usability and reliability in the context of patent valuation are greatly limited by five practical issues: market illiquidity, poor data availability, discriminatory cash-flow estimations, and its incapability to account for changing risk and for managerial flexibility. This dissertation attempts to overcome these barriers by rationalizing the use of two techniques, namely fuzzy set theory (aiming at the first three issues) and real option analysis (aiming at the last two). It commences with an investigation into the nature of the uncertainties inherent in patent cash flow estimation and claims that two levels of uncertainties must be properly accounted for. Further investigation reveals that both levels of uncertainties fall under the categorization of subjective uncertainty, which differs from objective uncertainty originating from inherent randomness in that uncertainties labelled as subjective are highly related to the behavioural aspects of decision making and are usually witnessed whenever human judgement, evaluation or reasoning is crucial to the system under consideration and there exists a lack of complete knowledge on its variables. 
Having clarified their nature, the application of fuzzy set theory in modelling patent-related uncertain quantities is readily justified. The application of real option analysis to patent valuation is prompted by the fact that both the patent application process and the subsequent patent exploitation (or commercialization) are subject to a wide range of decisions at multiple successive stages. In other words, both patent applicants and patentees are faced with a large variety of courses of action as to how their patent applications and granted patents can be managed. Since they have the right to run their projects actively, this flexibility has value and thus must be properly accounted for. Accordingly, this dissertation provides an explicit identification of the types of managerial flexibility inherent in patent-related decision making problems and in patent valuation, and a discussion on how they could be interpreted in terms of real options. Additionally, the use of the proposed techniques in practical applications is demonstrated by three models based on fuzzy real option analysis. In particular, the pay-off method and the extended fuzzy Black-Scholes model are employed to investigate the profitability of a patent application project for a new process for the preparation of a gypsum-fibre composite and to justify the subsequent patent commercialization decision, respectively; a fuzzy binomial model is designed to reveal the economic potential of a patent licensing opportunity.
Abstract:
The emerging technologies have recently challenged the libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially the nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspapers titles in various, and in some cases endangered Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, and it was the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language. 
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also their development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in the linguistic research. How does the library meet the objectives, which appears to be beyond its traditional playground? 
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often include characters that have since fallen out of use, which are sadly neglected by modern OCR software developers but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor for the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Also, a remarkable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012). 
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of the traditional crowdsourcing become more imminent when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced appeasing results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can correspond to research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand in such assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in such fields of vocabularies where the researchers do require more information. For instance, there is lack of Hill Mari words and terminology in anatomy. 
We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing’s perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
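The abstract above describes the editor as working on ALTO XML and mentions raw wordlists as an output. A minimal sketch of both operations follows. It assumes only that recognized words sit in `String` elements with a `CONTENT` attribute, as in the ALTO schema; the sample document, element IDs, tokens, and function names are hypothetical and are not taken from the library's editor.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical ALTO fragment with one OCR error ("ka1a" for "kala").
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String ID="w1" CONTENT="kala"/><SP/>
    <String ID="w2" CONTENT="vezi"/><SP/>
    <String ID="w3" CONTENT="ka1a"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def iter_strings(root):
    """Yield every String element, i.e. every recognized word."""
    return (el for el in root.iter() if local_name(el.tag) == "String")


def correct_word(root, string_id, new_text):
    """Replace the OCR text of one word; return True if the ID was found."""
    for el in iter_strings(root):
        if el.get("ID") == string_id:
            el.set("CONTENT", new_text)
            return True
    return False


def wordlist(root):
    """Raw frequency list of word forms, without tokenization or lemmatization."""
    return Counter(el.get("CONTENT", "") for el in iter_strings(root))


root = ET.fromstring(SAMPLE_ALTO)
correct_word(root, "w3", "kala")
print(wordlist(root))
```

Editing the attribute in place keeps the layout coordinates of each word untouched, so a corrected file can replace the original without re-running OCR.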