Abstract:
The teaching-learning process increasingly combines the paradigms of "learning by viewing" and "learning by doing." In this context, educational resources, whether expository or evaluative, play a pivotal role. Both types of resources are interdependent, and sequencing them creates a richer educational experience for the end user. However, there is a lack of tools that support sequencing, essentially because existing specifications are complex. Seqins is a sequencing tool for digital resources with a fairly simple sequencing model. The tool communicates through the IMS LTI specification with a plethora of e-learning systems, such as learning management systems, repositories, and authoring and evaluation systems. In order to validate Seqins we integrated it into an instance of the Ensemble e-learning framework for computer programming learning.
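Seqins itself is not shown in the abstract; the IMS LTI 1.1 coupling it mentions boils down to an OAuth 1.0a-signed form POST from a consumer (e.g., an LMS) to the tool. A minimal sketch of such a basic launch, where the endpoint URL, consumer key and secret are illustrative placeholders rather than anything from Seqins:

```python
# Minimal sketch of an IMS LTI 1.1 basic launch request (the integration
# mechanism the abstract relies on), using oauthlib for the OAuth 1.0a
# body signature. launch_url, CONSUMER_KEY and CONSUMER_SECRET are
# illustrative placeholders.
from urllib.parse import urlencode

import requests
from oauthlib.oauth1 import Client, SIGNATURE_TYPE_BODY

launch_url = "https://seqins.example.org/lti/launch"  # hypothetical endpoint
CONSUMER_KEY, CONSUMER_SECRET = "lms-key", "lms-secret"

params = {
    "lti_message_type": "basic-lti-launch-request",
    "lti_version": "LTI-1p0",
    "resource_link_id": "seq-unit-42",  # required: identifies the launched resource
    "user_id": "student-007",
    "roles": "Learner",
}

client = Client(CONSUMER_KEY, client_secret=CONSUMER_SECRET,
                signature_type=SIGNATURE_TYPE_BODY)
uri, headers, body = client.sign(
    launch_url, http_method="POST", body=urlencode(params),
    headers={"Content-Type": "application/x-www-form-urlencoded"})

response = requests.post(uri, data=body, headers=headers)
print(response.status_code)
```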
Abstract:
To meet the increasing demands of complex inter-organizational processes and the demand for continuous innovation and internationalization, it is evident that new forms of organisation are being adopted, fostering more intensive collaboration processes and sharing of resources, in what can be called collaborative networks (Camarinha-Matos, 2006:03). Information and knowledge are crucial resources in collaborative networks, and their management is a fundamental process to optimize. Knowledge organisation and collaboration systems are thus important instruments for the success of collaborative networks of organisations, and have been researched in the last decade in the areas of computer science, information science, management sciences, terminology and linguistics. Nevertheless, research in this area has paid little attention to multilingual contexts of collaboration, which pose specific and challenging problems. It is clear that access to and representation of knowledge will happen more and more in multilingual settings, which implies overcoming the difficulties inherent in the presence of multiple languages through processes such as the localization of ontologies. Although localization, like other processes that involve multilingualism, is a rather well-developed practice whose methodologies and tools are fruitfully employed by the language industry in the development and adaptation of multilingual content, it has not yet been sufficiently explored as an element of support for the development of knowledge representations, in particular ontologies, expressed in more than one language. Multilingual knowledge representation is thus an open research area calling for cross-contributions from knowledge engineering, terminology, ontology engineering, cognitive sciences, computational linguistics, natural language processing, and management sciences.

This workshop brought together researchers interested in multilingual knowledge representation, in a multidisciplinary environment, to debate the possibilities of cross-fertilization between these disciplines applied to contexts where multilingualism continuously creates new and demanding challenges for current knowledge representation methods and techniques. Six papers dealing with different approaches to multilingual knowledge representation are presented, most of them describing tools, approaches and results obtained in the development of ongoing projects.

In the first paper, Andrés Domínguez Burgos, Koen Kerremans and Rita Temmerman present a software module that is part of a workbench for terminological and ontological mining: Termontospider, a wiki crawler that aims to optimally traverse Wikipedia in search of domain-specific texts for extracting terminological and ontological information. The crawler is part of a tool suite for automatically developing multilingual termontological databases, i.e. ontologically underpinned multilingual terminological databases. The authors describe the basic principles behind the crawler and summarize the research setting in which the tool is currently being tested.

In the second paper, Fumiko Kano presents work comparing four feature-based similarity measures derived from the cognitive sciences.
The purpose of the comparative analysis is to identify the potentially most effective model for mapping independent ontologies in a culturally influenced domain. For that purpose, datasets based on standardized pre-defined feature dimensions and values, obtainable from the UNESCO Institute for Statistics (UIS), have been used to compare the similarity measures on objectively developed data. According to the author, the results demonstrate that the Bayesian Model of Generalization provides the most effective cognitive model for identifying the most similar corresponding concepts for a targeted socio-cultural community.

In another presentation, Thierry Declerck, Hans-Ulrich Krieger and Dagmar Gromann present ongoing work and propose an approach to the automatic extraction of information from multilingual financial Web resources, to provide candidate terms for building ontology elements or instances of ontology concepts. The authors present an approach complementary to the direct localization/translation of ontology labels: acquiring terminologies by accessing and harvesting the multilingual Web presences of structured information providers in the field of finance. This leads to the detection of candidate terms in various multilingual sources in the financial domain that can be used not only as labels of ontology classes and properties but also for the possible generation of (multilingual) domain ontologies themselves.

In the next paper, Manuel Silva, António Lucas Soares and Rute Costa claim that, despite the availability of tools, resources and techniques aimed at the construction of ontological artifacts, developing a shared conceptualization of a given reality still raises questions about the principles and methods that support the initial phases of conceptualization. These questions become, according to the authors, more complex when the conceptualization occurs in a multilingual setting. To tackle these issues the authors present a collaborative platform, conceptME, where terminological and knowledge representation processes support domain experts throughout a conceptualization framework, allowing the inclusion of multilingual data as a way to promote knowledge sharing, enhance conceptualization and support a multilingual ontology specification.

In another presentation, Frieda Steurs and Hendrik J. Kockaert present TermWise, a large project dealing with legal terminology and phraseology for the Belgian public services, i.e. the translation office of the Ministry of Justice. The project aims at developing an advanced tool that embeds expert knowledge in the algorithms that extract specialized language from textual data (legal documents); its outcome is a knowledge database of Dutch/French equivalents for legal concepts, enriched with the phraseology related to the terms under discussion.

Finally, Deborah Grbac, Luca Losito, Andrea Sada and Paolo Sirito report on the preliminary results of a pilot project currently ongoing at the UCSC Central Library, where they propose to adapt, for subject librarians employed in large multilingual academic institutions, the model used by translators working within European Union institutions.
The authors use User Experience (UX) analysis to provide subject librarians with visual support, by means of "ontology tables" depicting conceptual links and connections between words and concepts, presented according to their semantic and linguistic meaning. The organizers hope that the selection of papers presented here will be of interest to a broad audience and will be a starting point for further discussion and cooperation.
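Of the six papers, Kano's comparison of feature-based similarity measures is the most self-contained technically. The four measures are not reproduced in the abstract; as a point of reference, here is a minimal sketch of one classic feature-based measure from the cognitive-science literature (Tversky's ratio model), with invented feature sets:

```python
# Minimal sketch of Tversky's ratio model, a classic feature-based similarity
# measure from cognitive science (the workshop paper's four measures are not
# listed here, so this is illustrative only). alpha and beta weight the
# features distinctive to each concept.
def tversky_similarity(a: set, b: set, alpha: float = 0.5, beta: float = 0.5) -> float:
    common = len(a & b)   # features shared by both concepts
    a_only = len(a - b)   # features distinctive to a
    b_only = len(b - a)   # features distinctive to b
    return common / (common + alpha * a_only + beta * b_only)

# Hypothetical feature dimensions for two education-domain concepts.
primary_school = {"formal", "compulsory", "age_6_11", "national_curriculum"}
lower_secondary = {"formal", "compulsory", "age_12_15", "national_curriculum"}
print(tversky_similarity(primary_school, lower_secondary))  # 0.75
```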
Abstract:
Lunacloud is a cloud service provider with offices in Portugal, Spain, France and the UK that focuses on delivering reliable, elastic and low-cost cloud Infrastructure as a Service (IaaS) solutions. The company currently relies on a proprietary IaaS platform, the Parallels Automation for Cloud Infrastructure (PACI), and wishes to expand and integrate other IaaS solutions seamlessly, namely open source solutions. This is the challenge addressed in this thesis. The proposal, which was fostered by the Eurocloud Portugal Association, contributes to the promotion of interoperability and standardisation in Cloud Computing. The goal is to investigate, propose and develop an interoperable open source solution with standard interfaces for the integrated management of IaaS Cloud Computing resources, based on new as well as existing abstraction libraries or frameworks. The solution should provide both Web and application programming interfaces.

The research conducted consisted of two surveys, covering existing open source IaaS platforms and PACI (features and API) on the one hand, and open source IaaS abstraction solutions on the other. The first study focused on the characteristics of the most popular open source IaaS platforms, namely OpenNebula, OpenStack, CloudStack and Eucalyptus, as well as PACI, and included a thorough inventory of the provided Application Programming Interfaces (API), i.e., offered operations, followed by a comparison of these platforms in order to establish their similarities and dissimilarities. The second study, on existing open source interoperability solutions, included the analysis and comparison of existing abstraction libraries and frameworks.

The approach proposed and adopted, which was grounded in the conclusions of these surveys, reuses an existing open source abstraction solution, the Apache Deltacloud framework. Deltacloud relies on the development of software driver modules to interface with different IaaS platforms; it officially provides and supports drivers for sixteen IaaS platforms, including OpenNebula and OpenStack, and allows the development of new provider drivers. The latter functionality was used to develop a new Deltacloud driver for PACI. Furthermore, Deltacloud provides a Web dashboard and REpresentational State Transfer (REST) API interfaces.

To evaluate the adopted solution, a test bed integrating OpenNebula, OpenStack and PACI nodes was assembled and deployed. The tests conducted involved elapsed-time and data-payload measurements via the Deltacloud framework as well as via the pre-existing IaaS platform APIs. The Deltacloud framework behaved as expected, i.e., it introduced additional delays, but no substantial overheads. Both the Web and the REST interfaces were tested and showed identical measurements. The developed interoperable solution for the seamless integration and provision of IaaS resources from the PACI, OpenNebula and OpenStack IaaS platforms fulfils the specified requirements, i.e., it provides Lunacloud with the ability to expand the range of adopted IaaS platforms and offers a Web dashboard and REST API for their integrated management. The contributions of this work include the surveys and comparisons made, the selection of the abstraction framework and, last but not least, the PACI driver developed.
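Deltacloud's REST front end is what makes the back-end drivers interchangeable from the client's point of view. A minimal sketch of listing instances through it with Python's requests library, assuming a locally running Deltacloud server; the host, port, credentials and chosen driver below are illustrative, and the per-request driver header depends on how the deployment is configured:

```python
# Minimal sketch of querying a Deltacloud server's REST API to list instances.
# Server URL and credentials are placeholders; the same call shape works
# against any back end (PACI, OpenNebula, OpenStack) with a loaded driver.
import requests

DELTACLOUD_URL = "http://localhost:3001/api"  # deltacloudd's default port
AUTH = ("api_user", "api_secret")             # back-end credentials (placeholder)

response = requests.get(
    f"{DELTACLOUD_URL}/instances",
    auth=AUTH,
    headers={
        "Accept": "application/xml",          # Deltacloud also serves JSON/HTML
        "X-Deltacloud-Driver": "opennebula",  # per-request driver selection
    },
    timeout=30,
)
response.raise_for_status()
print(response.text)  # XML listing of instances from the selected back end
```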
Abstract:
In this paper, three approaches for the assessment of credibility in a web environment are presented, namely the checklists model, the cognitive authority model and the contextual model. This theoretical framework was used to conduct a study on the assessment of the credibility of web resources among a sample of 195 students from elementary and secondary schools in a municipality of the Oporto district (Portugal). The practices that young people and children claim to follow regarding the use of criteria for selecting web resources are presented. In addition, these results are discussed and compared with the perceptions these respondents demonstrated regarding the use of criteria to establish or assess authorship, originality, or the structure of information resources, and with their perceptions of the elements that make up each of these criteria.
Abstract:
A Work Project, presented as part of the requirements for the Award of a Master's Degree in Management from the NOVA – School of Business and Economics
Abstract:
Due to advances in information technology (e.g., digital video cameras, ubiquitous sensors), the automatic detection of human behaviors from video has become an active, recent research topic. In this paper, we perform a systematic literature review of this topic from 2000 to 2014, covering a selection of 193 papers retrieved from six major scientific publishers. The selected papers were classified into three main subjects: detection techniques, datasets and applications. The detection techniques were divided into four categories (initialization, tracking, pose estimation and recognition). The list of datasets includes eight examples (e.g., Hollywood action). Finally, several application areas were identified, including human detection, abnormal activity detection, action recognition, player modeling and pedestrian detection. Our analysis provides a road map to guide future research in designing automatic visual human behavior detection systems.
Abstract:
Purpose: Morphometric measurements of the ascending aorta have recently been performed with ECG-gated MDCT to support the development of future endovascular therapies (TCT) [1]. However, the variability of these measurements remains unknown. It would be interesting to know the impact of computer-aided diagnosis (CAD), with automated segmentation of the vessel and automatic diameter measurements, on the management of ascending aorta aneurysms. Methods and Materials: Thirty patients referred for ECG-gated CT thoracic angiography (64-row CT scanner) were evaluated. Measurements of the maximum and minimum ascending aorta diameters were obtained automatically with a commercially available CAD and semi-manually by two observers separately. The CAD algorithms segment the iv-enhanced lumen of the ascending aorta into perpendicular planes along the centreline; the CAD then determines the largest and smallest diameters. Both observers repeated the automatic and semi-manual measurements in a different session at least one month after the first measurements. The Bland-Altman method was used to study inter/intraobserver variability, and a Wilcoxon signed-rank test was used to analyse differences between observers. Results: Interobserver variability for semi-manual measurements between the first and second observers was 1.2 and 1.0 mm for the maximal and minimal diameter, respectively. Intraobserver variability of each observer ranged from 0.8 to 1.2 mm, the lowest variability being produced by the more experienced observer. CAD variability could be as low as 0.3 mm, showing that it can perform better than human observers. However, when used in non-optimal conditions (streak artefacts from contrast in the superior vena cava or weak lumen enhancement), CAD variability can be as high as 0.9 mm, reaching that of semi-manual measurements. Furthermore, there were significant differences between the observers for maximal and minimal diameter measurements (p<0.001). There was also a significant difference between the first observer and CAD for maximal diameter measurements, with the former underestimating the diameter compared to the latter (p<0.001). As for minimal diameters, they were higher when measured by the second observer than when measured by CAD (p<0.001). Neither the difference in mean minimal diameter between the first observer and CAD nor the difference in mean maximal diameter between the second observer and CAD was significant (p=0.20 and 0.06, respectively). Conclusion: CAD algorithms can lessen the variability of diameter measurements in the follow-up of ascending aorta aneurysms. Nevertheless, in non-optimal conditions, it may be necessary to correct the measurements manually. Improvements to the algorithms will help avoid such situations.
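The variability figures above come from Bland-Altman agreement analysis, and the observer comparisons from a Wilcoxon signed-rank test. A minimal sketch of both computations on made-up paired diameter measurements (illustrative stand-ins, not the study's data):

```python
# Minimal sketch of Bland-Altman agreement statistics and the Wilcoxon
# signed-rank test for paired diameter measurements.
# The arrays are illustrative stand-ins, not the study's data.
import numpy as np
from scipy.stats import wilcoxon

observer = np.array([38.1, 41.0, 36.5, 44.2, 39.8])  # mm, semi-manual (fake)
cad      = np.array([38.6, 41.3, 36.9, 44.9, 40.1])  # mm, automatic (fake)

diff = observer - cad
bias = diff.mean()              # systematic difference between the two methods
half = 1.96 * diff.std(ddof=1)  # half-width of the 95% limits of agreement

print(f"bias = {bias:.2f} mm, "
      f"limits of agreement = [{bias - half:.2f}, {bias + half:.2f}] mm")

stat, p = wilcoxon(observer, cad)  # paired nonparametric comparison
print(f"Wilcoxon signed-rank: W = {stat:.1f}, p = {p:.3f}")
```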
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases where the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve the basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of analysing and modelling geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, i.e. when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns taking into account real-space constraints such as geomorphology, networks and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs; this problem is approached using different nonlinear feature selection/feature extraction tools. To demonstrate the application of machine learning algorithms, several case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazard risk analysis (avalanches, landslides); and assessment of renewable resources (wind fields) with SVM and ANN models. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2 and 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional geostatistical models.
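As a concrete instance of the kind of pipeline described above, here is a minimal sketch of SVM regression on geo-features with scikit-learn; the features and target are random stand-ins for, e.g., coordinates plus DEM-derived variables predicting a soil property:

```python
# Minimal sketch of SVM regression on geo-features, in the spirit of the
# report's case studies. Data are random stand-ins for coordinates plus
# DEM-derived variables (slope, curvature, ...) predicting a soil property.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 7))  # x, y + 5 geo-features (synthetic)
y = np.sin(3 * X[:, 0]) + X[:, 2] * X[:, 5] + 0.1 * rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling matters for kernel machines; the RBF kernel handles the nonlinearity.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
model.fit(X_train, y_train)
print(f"R^2 on held-out data: {model.score(X_test, y_test):.3f}")
```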
Abstract:
This study is part of an ongoing collaborative effort between the medical and signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques to the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment, and effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and non-nasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
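The GMM detection step described above amounts to fitting one mixture per class on spectral features and classifying by likelihood. A minimal sketch with scikit-learn on random stand-in features; a real system would use spectral features (e.g., MFCCs) extracted from the speech database:

```python
# Minimal sketch of GMM-based two-class detection (apnoea vs. control) by
# per-class log-likelihood, mirroring the abstract's pattern-recognition setup.
# Feature vectors are random stand-ins for spectral features such as MFCCs.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
apnoea_train  = rng.normal(loc=0.5, scale=1.0, size=(300, 12))   # fake features
control_train = rng.normal(loc=-0.5, scale=1.0, size=(300, 12))  # fake features

gmm_apnoea  = GaussianMixture(n_components=8, random_state=0).fit(apnoea_train)
gmm_control = GaussianMixture(n_components=8, random_state=0).fit(control_train)

def classify(utterance_frames: np.ndarray) -> str:
    # Sum per-frame log-likelihoods under each class model; pick the larger.
    ll_apnoea  = gmm_apnoea.score_samples(utterance_frames).sum()
    ll_control = gmm_control.score_samples(utterance_frames).sum()
    return "apnoea" if ll_apnoea > ll_control else "control"

test_utterance = rng.normal(loc=0.5, scale=1.0, size=(40, 12))
print(classify(test_utterance))  # -> "apnoea" for this fake sample
```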
Abstract:
In this article, we present the current state of our work on a linguistically motivated model for the automatic summarization of medical articles in Spanish. The model takes into account the results of an empirical study which reveals that, on the one hand, domain-specific summarization criteria can often be derived from the summaries of domain specialists and, on the other hand, adequate summarization strategies must be multidimensional, i.e., cover various types of linguistic clues. We take into account the textual, lexical, discursive, syntactic and communicative dimensions; this is novel in the field of summarization. The experiments carried out so far indicate that our model is suitable for producing high-quality summaries.
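The paper's dimension-specific criteria are not spelled out in this abstract. Purely to illustrate the general shape of a multidimensional extractive model, here is a minimal sketch where each dimension contributes a normalized score and a weight; the scorers and weights are invented placeholders, not the authors' model:

```python
# Minimal sketch of multidimensional extractive sentence scoring: each
# linguistic "dimension" yields a score in [0, 1] and the summary keeps the
# top-ranked sentences. Scorers and weights are invented placeholders.
from typing import Callable

Scorer = Callable[[str], float]

def lexical(s: str) -> float:    # placeholder: lexical-richness cue
    words = s.lower().split()
    return min(len(set(words)) / 20, 1.0)

def syntactic(s: str) -> float:  # placeholder: sentence-length cue
    return min(len(s.split()) / 30, 1.0)

DIMENSIONS: list[tuple[Scorer, float]] = [(lexical, 0.6), (syntactic, 0.4)]

def score(sentence: str) -> float:
    return sum(weight * scorer(sentence) for scorer, weight in DIMENSIONS)

def summarize(sentences: list[str], k: int = 2) -> list[str]:
    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return [s for s in sentences if s in top]  # keep original order

doc = ["El estudio analiza pacientes con diabetes tipo 2.",
       "Los resultados muestran una mejora significativa tras el tratamiento.",
       "Se agradece la ayuda del personal."]
print(summarize(doc))
```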
Abstract:
Automatic classification of makams from symbolic data is a rarely studied topic. In this paper, we first review an n-gram based approach using various representations of the symbolic data. While a high degree of precision can be obtained, confusion happens mainly between makams that use (almost) the same scale and pitch hierarchy but differ in overall melodic progression, the seyir. To further improve the system, n-gram based classification is first tested on various sections of the piece, to exploit the seyir feature that the melodic progression starts in a certain region of the scale. In a second test, a hierarchical classification structure is designed that uses n-grams and seyir features at different levels to further improve the system.
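A minimal sketch of the underlying n-gram idea on symbolic pitch sequences, labelling a query by cosine similarity between trigram profiles; the tiny pitch sequences and makam labels are invented toy data, and a real system would use the paper's representations and full corpora:

```python
# Minimal sketch of n-gram classification of symbolic melodies: build a
# trigram profile per makam from training sequences and label a query by
# cosine similarity. Sequences and labels are invented toy data.
from collections import Counter
from math import sqrt

def trigrams(pitches: list[int]) -> Counter:
    return Counter(zip(pitches, pitches[1:], pitches[2:]))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[g] * b[g] for g in a)
    norms = sqrt(sum(v * v for v in a.values())) * \
            sqrt(sum(v * v for v in b.values()))
    return dot / norms if norms else 0.0

# Toy training data: one pitch sequence per makam (invented).
train = {
    "rast":  [[0, 2, 4, 5, 7, 5, 4, 2, 0, 2, 4]],
    "mahur": [[7, 5, 4, 2, 0, 2, 4, 5, 7, 9, 7]],
}
profiles = {m: sum((trigrams(seq) for seq in seqs), Counter())
            for m, seqs in train.items()}

query = [0, 2, 4, 5, 7, 5, 4]
print(max(profiles, key=lambda m: cosine(trigrams(query), profiles[m])))  # rast
```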
Abstract:
This research investigates the phenomenon of translationese in two monolingual comparable corpora of original and translated Catalan texts. Translationese has been defined as the dialect, sub-language or code of translated language. This study aims at giving empirical evidence of translation universals regardless of the source language. Traditionally, research on translation strategies has been mainly intuition-based. Computational Linguistics and Natural Language Processing techniques provide reliable information on lexical frequencies and on morphological and syntactic distributions in corpora; they have therefore been applied to observe which translation strategies occur in these corpora. The results seem to support the simplification, interference and explicitation hypotheses, whereas no sign of normalization has been detected with the methodology used. The data collected and the resources created for identifying lexical, morphological and syntactic patterns of translations can be useful for Translation Studies teachers, scholars and students: teachers will have more tools to help students avoid reproducing translationese patterns, and the resources developed will help in detecting non-genuine or inadequate structures in the target language. This may imply an improvement in the stylistic quality of translations. Translation professionals can also take advantage of these resources to improve their translation quality.
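A minimal sketch of the corpus-comparison idea: computing simple simplification indicators (type-token ratio, mean sentence length) for an "original" versus a "translated" corpus. The two toy corpora are stand-ins; the actual study used far richer lexical, morphological and syntactic features:

```python
# Minimal sketch of comparing simplification indicators between original and
# translated corpora: type-token ratio and mean sentence length.
# The two toy "corpora" below are stand-ins, not the study's data.
def indicators(sentences: list[str]) -> dict[str, float]:
    tokens = [t for s in sentences for t in s.lower().split()]
    return {
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_sentence_len": len(tokens) / len(sentences),
    }

original = ["el gat dorm al sofà tranquil·lament",
            "la pluja queia amb força sobre els camps"]
translated = ["el gat dorm al sofà",
              "la pluja queia sobre els camps"]

for name, corpus in [("original", original), ("translated", translated)]:
    print(name, indicators(corpus))
```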
Abstract:
Most economic interactions happen in a context of sequential exchange in which innocent third parties suffer information asymmetry with respect to previous "originative" contracts. The law reduces transaction costs by protecting these third parties, but preserves some element of consent by property right holders to avoid damaging property enforcement (e.g., it is they, as principals, who authorize agents in originative contracts). Judicial verifiability of these originative contracts is obtained either as an automatic byproduct of transactions or, when these would have remained private, by requiring them to be made public. Protecting third parties produces a sort of legal commodity which is easy to trade impersonally, improving the allocation and specialization of resources. The historical delay in generalizing this legal commoditization paradigm is attributed to path dependency (the law first developed for personal trade) and to an imbalance in vested interests, as luddite legal professionals face weak public bureaucracies.
Abstract:
The problem of Small Island Developing States (SIDS) is quite recent, dating from the end of the 1980s and the 1990s, and is still in search of theoretical consolidation. SIDS, as small developing states formed by one or several geographically dispersed islands, have reduced populations, markets, territories and natural resources, including drinkable water, and, in a great number of cases, low levels of economic activity; together, these factors hinder the achievement of economies of scale. To these diseconomies are added higher transport and communication costs which, combined with lower productivity and the lower quality and diversification of their production, make their integration in the world economy difficult. In some SIDS these factors are inseparable from scarce investment in infrastructure, in the formation of human resources and in productive activities, just as happens in most developing countries. In ecological terms, many of them suffer from a shortage of natural resources yet host ecosystems that are important in national and world terms, although highly fragile in the face of pollution, overfishing and the uncontrolled development of tourism: factors which, conjugated and associated with the greenhouse effect, affect the climate and the rise of the mean sea level and could therefore threaten the very survival of some of these states. The drive to raise the international community's awareness of their problems culminated in the United Nations Barbados Conference of 1994, where the right to development was emphasized through the adoption of appropriate strategies and of the Programme of Action for the Sustainable Development of SIDS. Orienting regional and international cooperation in that direction, by sharing technology (namely clean technology and environmental control and management technology), sharing information, creating capacity-building, supplying means, including financial resources, and creating non-discriminatory and fair trade rules, would lead to the establishment of a more equal world economic system in which production, consumption, pollution levels and demographic policies were guided towards sustainability. The conference constituted an important step in the recognition by the international community of the specificities of those states, and it allowed the definition of a set of norms and policies to be implemented at the national, regional and international levels; it was important that these continue in the direction of sustainable development. But this conference had its origin in previous summits: the Rio de Janeiro Summit on Environment and Development of 1992, which produced an important document, Agenda 21; the Stockholm Conference of 1972; and even the Ramsar Convention of 1971 on wetlands. Later, the Valletta Declaration (Malta, 1998) and the Forum of Small States (2002) again drew the international community's attention to the problems of SIDS, urging action to increase their resilience.
If "vulnerability" is defined as the inability of countries to resist external shocks economically, ecologically and socially, and "resilience" as their potential to absorb and minimize the impact of those shocks by presenting a structure that allows them to be little affected, part of the available studies, dating from the 1990s, indicate that SIDS are more vulnerable than other developing countries. The vulnerability of SIDS results from the fact that they present a set of characteristics that makes them less capable of resisting, or of advancing strategies that would allow greater resilience to, external shocks, whether anthropogenic (economic, financial, environmental) or natural, connected with the vicissitudes of nature. When these vulnerability factors are coupled with the expansion of the capitalist economic system at world level, economic and financial globalisation, the incessant search for growing profits on the part of multinational enterprises and accelerated technological evolution, the result is a situation that disfavours the poorest. Building resilience to external shocks and to the process of globalisation demands from SIDS, and from many other developing countries, the endogenous definition of strategies and of solid but flexible programmes of integrated development. These must be assumed by the instituted powers, but also by the other stakeholders, including companies, civil society organizations and the population in general. That demands strong investment in the formation of human resources, in infrastructure and in research centres; it demands the capacity not only to produce, but also to produce differently and to market internationally; it demands institutional capacity. Cape Verde is on its way to this stage.
Abstract:
The paper deals with the development and application of a generic methodology for the automatic processing (mapping and classification) of environmental data. The General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool for spatial data mapping (regression). The Probabilistic Neural Network (PNN) is considered as an automatic tool for spatial classification. The automatic tuning of isotropic and anisotropic GRNN/PNN models using a cross-validation procedure is presented. Results are compared with the k-Nearest-Neighbours (k-NN) interpolation algorithm using an independent validation data set. Real case studies are based on decision-oriented mapping and classification of radioactively contaminated territories.
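An isotropic GRNN is, in essence, Nadaraya-Watson kernel regression with a single smoothing parameter, which is what makes the automatic cross-validation tuning mentioned above tractable. A minimal sketch with leave-one-out selection of the bandwidth on synthetic data; the data and the bandwidth grid are illustrative, not from the paper's case studies:

```python
# Minimal sketch of an isotropic GRNN (Nadaraya-Watson kernel regression)
# with leave-one-out cross-validation of the smoothing parameter sigma,
# mirroring the paper's automatic tuning. Data are synthetic placeholders.
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    # Gaussian kernel weights between each query point and all training points.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    return (w @ y_train) / (w.sum(axis=1) + 1e-12)

def loo_mse(X, y, sigma):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(w, 0.0)  # leave each point out of its own estimate
    pred = (w @ y) / (w.sum(axis=1) + 1e-12)
    return ((pred - y) ** 2).mean()

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))  # e.g., spatial coordinates
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 1]) + 0.1 * rng.normal(size=200)

sigmas = np.geomspace(0.01, 1.0, 20)
best = min(sigmas, key=lambda s: loo_mse(X, y, s))
print(f"selected sigma = {best:.3f}")
print(grnn_predict(X, y, np.array([[0.5, 0.5]]), best))
```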