994 resultados para Machine-readable Library Cataloguing
Resumo:
A direct-driven permanent magnet synchronous machine for a small urban use electric vehicle is presented. The measured performance of the machine at the test bench as well as the performance over the modified New European Drive Cycle will be given. The effect of optimal current components, maximizing the efficiency and taking into account the iron loss, is compared with the simple id=0 – control. The machine currents and losses during the drive cycle are calculated and compared with each other.
Resumo:
The National Library of Finland realizes the Digitization Project of Kindred Languages in 2012–15. The project is financially supported by the Kone Foundation. During this project the National Library of Finland has digitized and made available approximately 1200 monograph and more than 100 newspaper titles in several Uralic languages. The materials are available to both researchers and citizens in the National Library’s Fenno-Ugrica collection. The project will produce digitized materials in the Uralic languages as well as their development tools to support linguistic research and citizen science. The resulting materials will constitute the largest resource for the Uralic languages in the world. Through this project, researchers will gain access to corpora which they have not been able to study before and to which all users will have open access regardless of their place of residence. In my presentation, I will discuss 1) how we utilized the social media (Facebook, Twitter, VKontakte etc) to gain audience for our collection and 2) how the needs of researchers and laymen were met in crowdsourcing.
Resumo:
The emerging technologies have recently challenged the libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially the nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspapers titles in various, and in some cases endangered Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, and it was the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language. This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also their development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in the linguistic research. How does the library meet the objectives, which appears to be beyond its traditional playground? The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machined-encoded text (OCR) contains often too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source code OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints did often include already perished characters, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of Alto XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface—that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in the traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Also, the remarkable downside is the lack of shared goal or the social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012). Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of the traditional crowdsourcing become more imminent when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced appeasing results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can correspond to research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand in such assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in such fields of vocabularies where the researchers do require more information. For instance, there is lack of Hill Mari words and terminology in anatomy. We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing’s perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
Resumo:
This thesis studies the use of machine vision in RDF quality assurance and manufacturing. Currently machine vision is used in recycling and material detection and some commer- cial products are available in the market. In this thesis an on-line machine vision system is proposed for characterizing particle size. The proposed machine vision system is based on the mapping between image segmenta- tion and the ground truth of the particle size. The results shows that the implementation of such machine vision system is feasible.
Resumo:
Laser cutting implementation possibilities into paper making machine was studied as the main objective of the work. Laser cutting technology application was considered as a replacement tool for conventional cutting methods used in paper making machines for longitudinal cutting such as edge trimming at different paper making process and tambour roll slitting. Laser cutting of paper was tested in 70’s for the first time. Since then, laser cutting and processing has been applied for paper materials with different level of success in industry. Laser cutting can be employed for longitudinal cutting of paper web in machine direction. The most common conventional cutting methods include water jet cutting and rotating slitting blades applied in paper making machines. Cutting with CO2 laser fulfils basic requirements for cutting quality, applicability to material and cutting speeds in all locations where longitudinal cutting is needed. Literature review provided description of advantages, disadvantages and challenges of laser technology when it was applied for cutting of paper material with particular attention to cutting of moving paper web. Based on studied laser cutting capabilities and problem definition of conventional cutting technologies, preliminary selection of the most promising application area was carried out. Laser cutting (trimming) of paper web edges in wet end was estimated to be the most promising area where it can be implemented. This assumption was made on the basis of rate of web breaks occurrence. It was found that up to 64 % of total number of web breaks occurred in wet end, particularly in location of so called open draws where paper web was transferred unsupported by wire or felt. Distribution of web breaks in machine cross direction revealed that defects of paper web edge was the main reason of tearing initiation and consequent web break. The assumption was made that laser cutting was capable of improvement of laser cut edge tensile strength due to high cutting quality and sealing effect of the edge after laser cutting. Studies of laser ablation of cellulose supported this claim. Linear energy needed for cutting was calculated with regard to paper web properties in intended laser cutting location. Calculated linear cutting energy was verified with series of laser cutting. Practically obtained laser energy needed for cutting deviated from calculated values. This could be explained by difference in heat transfer via radiation in laser cutting and different absorption characteristics of dry and moist paper material. Laser cut samples (both dry and moist (dry matter content about 25-40%)) were tested for strength properties. It was shown that tensile strength and strain break of laser cut samples are similar to corresponding values of non-laser cut samples. Chosen method, however, did not address tensile strength of laser cut edge in particular. Thus, the assumption of improving strength properties with laser cutting was not fully proved. Laser cutting effect on possible pollution of mill broke (recycling of trimmed edge) was carried out. Laser cut samples (both dry and moist) were tested on the content of dirt particles. The tests revealed that accumulation of dust particles on the surface of moist samples can take place. This has to be taken into account to prevent contamination of pulp suspension when trim waste is recycled. Material loss due to evaporation during laser cutting and amount of solid residues after cutting were evaluated. Edge trimming with laser would result in 0.25 kg/h of solid residues and 2.5 kg/h of lost material due to evaporation. Schemes of laser cutting implementation and needed laser equipment were discussed. Generally, laser cutting system would require two laser sources (one laser source for each cutting zone), set of beam transfer and focusing optics and cutting heads. In order to increase reliability of system, it was suggested that each laser source would have double capacity. That would allow to perform cutting employing one laser source working at full capacity for both cutting zones. Laser technology is in required level at the moment and do not require additional development. Moreover, capacity of speed increase is high due to availability high power laser sources what can support the tendency of speed increase of paper making machines. Laser cutting system would require special roll to maintain cutting. The scheme of such roll was proposed as well as roll integration into paper making machine. Laser cutting can be done in location of central roll in press section, before so-called open draw where many web breaks occur, where it has potential to improve runability of a paper making machine. Economic performance of laser cutting was done as comparison of laser cutting system and water jet cutting working in the same conditions. It was revealed that laser cutting would still be about two times more expensive compared to water jet cutting. This is mainly due to high investment cost of laser equipment and poor energy efficiency of CO2 lasers. Another factor is that laser cutting causes material loss due to evaporation whereas water jet cutting almost does not cause material loss. Despite difficulties of laser cutting implementation in paper making machine, its implementation can be beneficial. The crucial role in that is possibility to improve cut edge strength properties and consequently reduce number of web breaks. Capacity of laser cutting to maintain cutting speeds which exceed current speeds of paper making machines what is another argument to consider laser cutting technology in design of new high speed paper making machines.
Resumo:
The subject of the thesis is automatic sentence compression with machine learning, so that the compressed sentences remain both grammatical and retain their essential meaning. There are multiple possible uses for the compression of natural language sentences. In this thesis the focus is generation of television program subtitles, which often are compressed version of the original script of the program. The main part of the thesis consists of machine learning experiments for automatic sentence compression using different approaches to the problem. The machine learning methods used for this work are linear-chain conditional random fields and support vector machines. Also we take a look which automatic text analysis methods provide useful features for the task. The data used for machine learning is supplied by Lingsoft Inc. and consists of subtitles in both compressed an uncompressed form. The models are compared to a baseline system and comparisons are made both automatically and also using human evaluation, because of the potentially subjective nature of the output. The best result is achieved using a CRF - sequence classification using a rich feature set. All text analysis methods help classification and most useful method is morphological analysis. Tutkielman aihe on suomenkielisten lauseiden automaattinen tiivistäminen koneellisesti, niin että lyhennetyt lauseet säilyttävät olennaisen informaationsa ja pysyvät kieliopillisina. Luonnollisen kielen lauseiden tiivistämiselle on monta käyttötarkoitusta, mutta tässä tutkielmassa aihetta lähestytään television ohjelmien tekstittämisen kautta, johon käytännössä kuuluu alkuperäisen tekstin lyhentäminen televisioruudulle paremmin sopivaksi. Tutkielmassa kokeillaan erilaisia koneoppimismenetelmiä tekstin automaatiseen lyhentämiseen ja tarkastellaan miten hyvin erilaiset luonnollisen kielen analyysimenetelmät tuottavat informaatiota, joka auttaa näitä menetelmiä lyhentämään lauseita. Lisäksi tarkastellaan minkälainen lähestymistapa tuottaa parhaan lopputuloksen. Käytetyt koneoppimismenetelmät ovat tukivektorikone ja lineaarisen sekvenssin mallinen CRF. Koneoppimisen tukena käytetään tekstityksiä niiden eri käsittelyvaiheissa, jotka on saatu Lingsoft OY:ltä. Luotuja malleja vertaillaan Lopulta mallien lopputuloksia evaluoidaan automaattisesti ja koska teksti lopputuksena on jossain määrin subjektiivinen myös ihmisarviointiin perustuen. Vertailukohtana toimii kirjallisuudesta poimittu menetelmä. Tutkielman tuloksena paras lopputulos saadaan aikaan käyttäen CRF sekvenssi-luokittelijaa laajalla piirrejoukolla. Kaikki kokeillut teksin analyysimenetelmät auttavat luokittelussa, joista tärkeimmän panoksen antaa morfologinen analyysi.
Resumo:
Teema: Kansainvälisyys.
Resumo:
Axial-flux machines tend to have cooling difficulties since it is difficult to arrange continuous heat path between the stator stack and the frame. One important reason for this is that no shrink fitting of the stator is possible in an axial-flux machine. Using of liquid-cooled end shields does not alone solve this issue. Cooling of the rotor and the end windings may also be difficult at least in case of two-stator-single-rotor construction where air circulation in the rotor and in the end-winding areas may be difficult to arrange. If the rotor has significant losses air circulation via the rotor and behind the stator yokes should be arranged which, again, weakens the stator cooling. In this paper we study a novel way of using copper bars as extra heat transfer paths between the stator teeth and liquid cooling pools in the end shields. After this the end windings still suffer of low thermal conductivity and means for improving this by high-heat-conductance material was also studied. The design principle of each cooling system is presented in details. Thermal models based on Computational Fluid Dynamics (CFD) are used to analyse the temperature distribution in the machine. Measurement results are provided from different versions of the machine. The results show that significant improvements in the cooling can be gained by these steps.
Resumo:
Mobile malwares are increasing with the growing number of Mobile users. Mobile malwares can perform several operations which lead to cybersecurity threats such as, stealing financial or personal information, installing malicious applications, sending premium SMS, creating backdoors, keylogging and crypto-ransomware attacks. Knowing the fact that there are many illegitimate Applications available on the App stores, most of the mobile users remain careless about the security of their Mobile devices and become the potential victim of these threats. Previous studies have shown that not every antivirus is capable of detecting all the threats; due to the fact that Mobile malwares use advance techniques to avoid detection. A Network-based IDS at the operator side will bring an extra layer of security to the subscribers and can detect many advanced threats by analyzing their traffic patterns. Machine Learning(ML) will provide the ability to these systems to detect unknown threats for which signatures are not yet known. This research is focused on the evaluation of Machine Learning classifiers in Network-based Intrusion detection systems for Mobile Networks. In this study, different techniques of Network-based intrusion detection with their advantages, disadvantages and state of the art in Hybrid solutions are discussed. Finally, a ML based NIDS is proposed which will work as a subsystem, to Network-based IDS deployed by Mobile Operators, that can help in detecting unknown threats and reducing false positives. In this research, several ML classifiers were implemented and evaluated. This study is focused on Android-based malwares, as Android is the most popular OS among users, hence most targeted by cyber criminals. Supervised ML algorithms based classifiers were built using the dataset which contained the labeled instances of relevant features. These features were extracted from the traffic generated by samples of several malware families and benign applications. These classifiers were able to detect malicious traffic patterns with the TPR upto 99.6% during Cross-validation test. Also, several experiments were conducted to detect unknown malware traffic and to detect false positives. These classifiers were able to detect unknown threats with the Accuracy of 97.5%. These classifiers could be integrated with current NIDS', which use signatures, statistical or knowledge-based techniques to detect malicious traffic. Technique to integrate the output from ML classifier with traditional NIDS is discussed and proposed for future work.
Resumo:
The review of intelligent machines shows that the demand for new ways of helping people in perception of the real world is becoming higher and higher every year. This thesis provides information about design and implementation of machine vision for mobile assembly robot. The work has been done as a part of LUT project in Laboratory of Intelligent Machines. The aim of this work is to create a working vision system. The qualitative and quantitative research were done to complete this task. In the first part, the author presents the theoretical background of such things as digital camera work principles, wireless transmission basics, creation of live stream, methods used for pattern recognition. Formulas, dependencies and previous research related to the topic are shown. In the second part, the equipment used for the project is described. There is information about the brands, models, capabilities and also requirements needed for implementation. Although, the author gives a description of LabVIEW software, its add-ons and OpenCV which are used in the project. Furthermore, one can find results in further section of considered thesis. They mainly represented by screenshots from cameras, working station and photos of the system. The key result of this thesis is vision system created for the needs of mobile assembly robot. Therefore, it is possible to see graphically what was done on examples. Future research in this field includes optimization of the pattern recognition algorithm. This will give less response time for recognizing objects. Presented by author system can be used also for further activities which include artificial intelligence usage.