848 resultados para computer science and engineering
Resumo:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools have still some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned.
Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Resumo:
Machine vision is an important subject in computer science and engineering degrees. For laboratory experimentation, it is desirable to have a complete and easy-to-use tool. In this work we present a Java library, oriented to teaching computer vision. We have designed and built the library from the scratch with enfasis on readability and understanding rather than on efficiency. However, the library can also be used for research purposes. JavaVis is an open source Java library, oriented to the teaching of Computer Vision. It consists of a framework with several features that meet its demands. It has been designed to be easy to use: the user does not have to deal with internal structures or graphical interface, and should the student need to add a new algorithm it can be done simply enough. Once we sketch the library, we focus on the experience the student gets using this library in several computer vision courses. Our main goal is to find out whether the students understand what they are doing, that is, find out how much the library helps the student in grasping the basic concepts of computer vision. In the last four years we have conducted surveys to assess how much the students have improved their skills by using this library.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
The 2008 National Student Survey revealed that: 44% of full-time students in England did not think that the feedback on their work had been prompt nor did they agree that the feedback on their work helped them clarify things that they did not understand (HEFCE, 2008). Computer Science and Engineering & Technology have been amongst the poorest performers in this aspect as they ranked in the lower quartile (Surridge, 2007, p.32). Five years since the first NSS survey, assessment and feedback remains the biggest concern. Dissatisfaction in any aspect of studies demotivates students and can lead to disengagement and attrition. As the student number grows, the situation can only get worse if nothing is done about it. We have conducted a survey to investigate views on assessment and feedback from Engineering, Mathematics and Computing students. The survey aims at investigating the core issues of dissatisfaction in assessment and feedback and ways in which UK Engineering students can learn better through helpful feedback. The study focuses on collecting students' experiences with feedback received in their coursework, assignments and quizzes in Computing Science modules. The survey reveals the role of feedback in their learning. The results of the survey help to identify the forms of feedback that are considered to be helpful in learning and the time frame for timely feedback. We report on the findings of the survey. We also explore ways to improve assessment and feedback in a bid to better engage engineering students in their studies.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-07
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-08
Resumo:
Early definitions of Smart Building focused almost entirely on the technology aspect and did not suggest user interaction at all. Indeed, today we would attribute it more to the concept of the automated building. In this sense, control of comfort conditions inside buildings is a problem that is being well investigated, since it has a direct effect on users’ productivity and an indirect effect on energy saving. Therefore, from the users’ perspective, a typical environment can be considered comfortable, if it’s capable of providing adequate thermal comfort, visual comfort and indoor air quality conditions and acoustic comfort. In the last years, the scientific community has dealt with many challenges, especially from a technological point of view. For instance, smart sensing devices, the internet, and communication technologies have enabled a new paradigm called Edge computing that brings computation and data storage closer to the location where it is needed, to improve response times and save bandwidth. This has allowed us to improve services, sustainability and decision making. Many solutions have been implemented such as smart classrooms, controlling the thermal condition of the building, monitoring HVAC data for energy-efficient of the campus and so forth. Though these projects provide to the realization of smart campus, a framework for smart campus is yet to be determined. These new technologies have also introduced new research challenges: within this thesis work, some of the principal open challenges will be faced, proposing a new conceptual framework, technologies and tools to move forward the actual implementation of smart campuses. Keeping in mind, several problems known in the literature have been investigated: the occupancy detection, noise monitoring for acoustic comfort, context awareness inside the building, wayfinding indoor, strategic deployment for air quality and books preserving.
Resumo:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility to merge machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code where programming languages are treated instead of natural languages while different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications which can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: the Programming Language Identification (PLI) and the Software Defect Prediction (SDP) for source codes. Programming language identification is commonly needed in program comprehension and it is usually performed directly by developers. However, when it comes at big scales, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text and image-based learning approaches) and different models are created paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve related procedures. Here, two models have been built and analyzed to detect some of the commonest bugs and errors at different code granularity levels (file and method levels). Exploited data and models’ architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both PLI and SDP tasks while differences and similarities concerning other related works are discussed.
Resumo:
Slot and van Emde Boas Invariance Thesis states that a time (respectively, space) cost model is reasonable for a computational model C if there are mutual simulations between Turing machines and C such that the overhead is polynomial in time (respectively, linear in space). The rationale is that under the Invariance Thesis, complexity classes such as LOGSPACE, P, PSPACE, become robust, i.e. machine independent. In this dissertation, we want to find out if it possible to define a reasonable space cost model for the lambda-calculus, the paradigmatic model for functional programming languages. We start by considering an unusual evaluation mechanism for the lambda-calculus, based on Girard's Geometry of Interaction, that was conjectured to be the key ingredient to obtain a space reasonable cost model. By a fine complexity analysis of this schema, based on new variants of non-idempotent intersection types, we disprove this conjecture. Then, we change the target of our analysis. We consider a variant over Krivine's abstract machine, a standard evaluation mechanism for the call-by-name lambda-calculus, optimized for space complexity, and implemented without any pointer. A fine analysis of the execution of (a refined version of) the encoding of Turing machines into the lambda-calculus allows us to conclude that the space consumed by this machine is indeed a reasonable space cost model. In particular, for the first time we are able to measure also sub-linear space complexities. Moreover, we transfer this result to the call-by-value case. Finally, we provide also an intersection type system that characterizes compositionally this new reasonable space measure. This is done through a minimal, yet non trivial, modification of the original de Carvalho type system.
Resumo:
The multi-faced evolution of network technologies ranges from big data centers to specialized network infrastructures and protocols for mission-critical operations. For instance, technologies such as Software Defined Networking (SDN) revolutionized the world of static configuration of the network - i.e., by removing the distributed and proprietary configuration of the switched networks - centralizing the control plane. While this disruptive approach is interesting from different points of view, it can introduce new unforeseen vulnerabilities classes. One topic of particular interest in the last years is industrial network security, an interest which started to rise in 2016 with the introduction of the Industry 4.0 (I4.0) movement. Networks that were basically isolated by design are now connected to the internet to collect, archive, and analyze data. While this approach got a lot of momentum due to the predictive maintenance capabilities, these network technologies can be exploited in various ways from a cybersecurity perspective. Some of these technologies lack security measures and can introduce new families of vulnerabilities. On the other side, these networks can be used to enable accurate monitoring, formal verification, or defenses that were not practical before. This thesis explores these two fields: by introducing monitoring, protections, and detection mechanisms where the new network technologies make it feasible; and by demonstrating attacks on practical scenarios related to emerging network infrastructures not protected sufficiently. The goal of this thesis is to highlight this lack of protection in terms of attacks on and possible defenses enabled by emerging technologies. We will pursue this goal by analyzing the aforementioned technologies and by presenting three years of contribution to this field. In conclusion, we will recapitulate the research questions and give answers to them.
Resumo:
In the framework of industrial problems, the application of Constrained Optimization is known to have overall very good modeling capability and performance and stands as one of the most powerful, explored, and exploited tool to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success is to be found in the remarkable effort put in the last decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, the technological advancements lead to a data wealth never seen before and increasingly push towards methods able to extract useful knowledge from them; among the data-driven methods, Machine Learning techniques appear to be one of the most promising, thanks to its successes in domains like Image Recognition, Natural Language Processes and playing games, but also the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs and constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce a novel and general algorithm devised to inject knowledge into learning models through constraints, Moving Target. In the last part of the thesis, two applications stemming from real-world projects and done in collaboration with Optit will be presented.