902 results for domain knowledge
Abstract:
In this paper the authors present an approach for the semantic annotation of RESTful services in the geospatial domain. Their approach automates some stages of the annotation process, by using a combination of resources and services: a cross-domain knowledge base like DBpedia, two domain ontologies like GeoNames and the WGS84 vocabulary, and suggestion and synonym services. The authors’ approach has been successfully evaluated with a set of geospatial RESTful services obtained from ProgrammableWeb.com, where geospatial services account for a third of the total amount of services available in this registry.
Abstract:
Ontology antipatterns are structures that reflect ontology modelling problems: they lead to inconsistencies, poor reasoning performance or poor formalisation of domain knowledge. Antipatterns normally appear in ontologies developed by those who are not experts in ontology engineering. Based on our experience in ontology design, we previously created a catalogue of such antipatterns, and in this paper we describe how SPARQL-DL can be used to detect them. We conduct experiments to detect them in a large OWL ontology corpus obtained from the Watson ontology search portal. Our results show that each antipattern needs a specialised detection method.
Abstract:
Ontology antipatterns are structures that reflect ontology modelling problems because they lead to inconsistencies, poor reasoning performance or poor formalisation of domain knowledge. We propose four methods for detecting antipatterns using SPARQL queries. We conduct experiments to detect antipatterns in a corpus of OWL ontologies.
Abstract:
El objetivo de esta Tesis es crear un Modelo de Diseño Orientado a Marcos que, intermedio entre el Mundo Externo y el Modelo Interno del Mundo que supone el sistema implementado, disminuya la pérdida de conocimiento que se produce al formalizar la realidad en Bases de Conocimientos. El modelo disminuye la pérdida de conocimiento al formalizar Bases de Conocimiento, acercando el formalismo de Marcos al Mundo Externo, porque: 1. Crea una base teórica que uniformiza el concepto de Marco en el plano de la Formalización, estableciendo un conjunto de restricciones sintácticas y semánticas que impedirán, al Ingeniero del Conocimiento (IC) cuando formaliza, definir elementos no permitidos o el uso indebido de ellos. 2. Se incrementa la expresividad del formalismo al asociar a cada una de las propiedades de un marco clase un parámetro adicional que simboliza la representatividad de la propiedad en el concepto. Este parámetro, y las técnicas de inferencia que trabajan con él, permitirán al IC introducir en el Modelo Formalizado conocimiento que antes no introducía al construir la base de conocimientos y que, sin embargo, sí existía en la realidad. 3. Se propone una técnica de equiparación que trabaja con el conocimiento incierto presente en el dominio. Esta técnica de equiparación, utiliza la representatividad de las propiedades en los marcos clase y el grado de certeza de las propiedades de las entidades para calcular el valor de equiparación y, así, determinar en qué medida los marcos clase seleccionados son consistentes con la descripción de la situación actual dada por una entidad. 4. Proporciona nuevas técnicas de inferencia basadas en la transferencia de propiedades y modifica las ya existentes. 
Las transferencias de propiedades realizadas sobre relaciones "ad hoc" definidas por el IC al construir el sistema, es una nueva técnica de inferencia independiente y complementaria a la transferencia de propiedades llamada tradicionalmente Herencia (cesión de propiedades entre padres e hijos). A esta nueva técnica, se le ha llamado Donación, es decir, cesión de propiedades entre marcos sin parentesco. Como aportación práctica, se ha construido un entorno de construcción de Sistemas Basados en el Conocimiento formalizados en Marcos, donde se han introducido todos los nuevos conceptos del Modelo Teórico de la Tesis. Se trata de una cierta anidación. Es decir, son marcos que permiten formalizar cualquier SBC en marcos. El entorno permitirá al IC formalizar bases de conocimientos automáticamente y éste podrá validar el conocimiento del dominio en la fase de formalización en lugar de tener que esperar a que la BC esté implementada. Todo ello lleva a describir el Modelo de Diseño Orientado a Marcos como un puente que aproxima y comunica el Mundo Externo con el Modelo Interno asociado a la realidad e implementado en una computadora, disminuyendo así las diversas pérdidas de conocimiento que si bien no ocurren simultáneamente al construir Sistemas Basados en el Conocimiento, sí coexisten en él. ABSTRACT The goal of this thesis is to create a Frame-Oriented Design Model that, bridging the Outside World and the implemented system's Internal Model of the World, reduces the amount of knowledge lost when reality is formalized in Knowledge Bases (KB). The model diminishes the loss of knowledge when formalizing a KB and brings the Frame-formalized Model closer to the Outside World because: 1. It creates a theory that standardizes the concept of frame at the formalization level to establish a set of syntactic and semantic constraints that will prevent the Knowledge Engineer (KE) from defining forbidden elements or using them improperly in the formalization process. 2. 
The formalism's expressiveness is increased by associating an additional parameter with each of the properties of a class frame to symbolize the representativeness of the property in the concept. This parameter and the related inference techniques will allow the KE to enter knowledge into the Formalized Model that actually existed but that was not used previously when building the KB. 3. A matching technique is proposed that works with the uncertain knowledge present in the domain. This matching technique takes the representativeness of the properties in the class frames and the degree of certainty of the properties of the entities to calculate the matching value and thus determine to what extent the class frames selected are consistent with the description of the present situation given by an entity. 4. It offers new inference techniques based on property transfer and alters existing ones. Property transfer on ad hoc relations defined by the KE when building a system is a new inference technique independent of and complementary to the property transfer traditionally termed Inheritance (transfer of properties between parents and children). This new technique has been called Donation (transfer of properties between frames without kinship). 5. It improves control of the procedural knowledge defined in the frames by introducing OO concepts. A frame-formalized KBS building environment has been constructed, incorporating all the new concepts of the theoretical model set out in the thesis. There is some nesting: these are frames that allow any KBS to be formalized in frames. The environment will enable the KE to formalize KBs automatically, and the KE will be able to validate the domain knowledge in the formalization stage instead of having to wait until the KB has been implemented. All of this leads to describing the Frame-Oriented Design Model as a bridge that brings the Outside World closer to, and connects it with, the Internal 
Model associated with reality and implemented on a computer, thus reducing the different losses of knowledge that, though they do not occur simultaneously when building a Knowledge-based System, coexist within it.
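The ideas of representativeness-weighted matching, Inheritance and Donation can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `ClassFrame` class, the weighted-average formula in `matching_value`, and the example frames are hypothetical and are not taken from the thesis, which does not give its formulas in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class ClassFrame:
    """Hypothetical class frame: each property has a representativeness in [0, 1]."""
    name: str
    # property name -> (expected value, representativeness of the property)
    properties: dict = field(default_factory=dict)
    parent: "ClassFrame | None" = None           # Inheritance link (parent to child)
    donors: list = field(default_factory=list)   # ad hoc links used for Donation

    def all_properties(self) -> dict:
        """Merge inherited, donated and own properties; own definitions win."""
        merged = {}
        if self.parent is not None:
            merged.update(self.parent.all_properties())   # Inheritance
        for donor in self.donors:
            merged.update(donor.all_properties())         # Donation (no kinship)
        merged.update(self.properties)
        return merged


def matching_value(frame: ClassFrame, entity: dict) -> float:
    """Assumed formula: certainty-weighted share of representativeness that matches.

    entity maps property name -> (observed value, certainty in [0, 1]).
    """
    props = frame.all_properties()
    total = sum(weight for _, weight in props.values())
    if total == 0:
        return 0.0
    score = 0.0
    for prop, (expected, weight) in props.items():
        observed, certainty = entity.get(prop, (None, 0.0))
        if observed == expected:
            score += weight * certainty
    return score / total


# Made-up example frames and entity.
vehicle = ClassFrame("Vehicle", {"wheels": (4, 0.5)})
car = ClassFrame("Car", {"engine": ("combustion", 1.0)}, parent=vehicle)
warranty = ClassFrame("WarrantyPolicy", {"warranty_years": (2, 0.5)})
car.donors.append(warranty)  # Donation: Car receives properties from an unrelated frame

entity = {"wheels": (4, 1.0), "engine": ("combustion", 0.5)}
print(matching_value(car, entity))
```

In this sketch the entity says nothing about `warranty_years`, so that donated property lowers the matching value; a different treatment of unknown properties would be an equally plausible design choice.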
Abstract:
El aprendizaje basado en problemas se lleva aplicando con éxito durante las últimas tres décadas en un amplio rango de entornos de aprendizaje. Este enfoque educacional consiste en proponer problemas a los estudiantes de forma que puedan aprender sobre un dominio particular mediante el desarrollo de soluciones a dichos problemas. Si esto se aplica al modelado de conocimiento, y en particular al basado en Razonamiento Cualitativo, las soluciones a los problemas pasan a ser modelos que representan el comportamiento del sistema dinámico propuesto. Por lo tanto, la tarea del estudiante en este caso es acercar su modelo inicial (su primer intento de representar el sistema) a los modelos objetivo que proporcionan soluciones al problema, a la vez que adquieren conocimiento sobre el dominio durante el proceso. En esta tesis proponemos KaiSem, un método que usa tecnologías y recursos semánticos para guiar a los estudiantes durante el proceso de modelado, ayudándoles a adquirir tanto conocimiento como sea posible sin la directa supervisión de un profesor. Dado que tanto estudiantes como profesores crean sus modelos de forma independiente, estos tendrán diferentes terminologías y estructuras, dando lugar a un conjunto de modelos altamente heterogéneo. Para lidiar con tal heterogeneidad, proporcionamos una técnica de anclaje semántico para determinar, de forma automática, enlaces entre la terminología libre usada por los estudiantes y algunos vocabularios disponibles en la Web de Datos, facilitando con ello la interoperabilidad y posterior alineación de modelos. Por último, proporcionamos una técnica de feedback semántico para comparar los modelos ya alineados y generar feedback basado en las posibles discrepancias entre ellos. Este feedback es comunicado en forma de sugerencias individualizadas que el estudiante puede utilizar para acercar su modelo a los modelos objetivos en cuanto a su terminología y estructura se refiere. 
ABSTRACT Problem-based learning has been successfully applied over the last three decades to a diverse range of learning environments. This educational approach consists of posing problems to learners so that they can learn about a particular domain by developing solutions to them. When applied to conceptual modeling, and particularly to Qualitative Reasoning, the solutions to problems are models that represent the behavior of a dynamic system. The learner's task is therefore to move from their initial model, their first attempt to represent the system, towards the target models that provide solutions to the problem, while acquiring domain knowledge in the process. In this thesis we propose KaiSem, a method for using semantic technologies and resources to scaffold the modeling process, helping learners to acquire as much domain knowledge as possible without direct supervision from the teacher. Since learners and experts create their models independently, these will have different terminologies and structures, giving rise to a highly heterogeneous pool of models. To deal with such heterogeneity, we provide a semantic grounding technique to automatically determine links between the unrestricted terminology used by learners and some online vocabularies of the Web of Data, thus facilitating the interoperability and later alignment of the models. Lastly, we provide a semantic-based feedback technique to compare the aligned models and generate feedback based on the possible discrepancies. This feedback is communicated in the form of individualized suggestions, which the learner can use to bring their model closer in terminology and structure to the target models.
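The grounding step described above, linking free learner terminology to vocabulary entries, can be illustrated with a rough Python sketch. It is only a lexical stand-in and not KaiSem's actual technique: the `VOCABULARY` table, its `example.org` URIs, and the `ground_terms` function are hypothetical, and string similarity from the standard library's `difflib` replaces any real lookup against the Web of Data.

```python
from difflib import get_close_matches

# Hypothetical local vocabulary: label -> URI (placeholder example.org URIs).
VOCABULARY = {
    "population": "http://example.org/vocab/Population",
    "birth rate": "http://example.org/vocab/BirthRate",
    "death rate": "http://example.org/vocab/DeathRate",
}


def ground_terms(terms, vocabulary=VOCABULARY, cutoff=0.6):
    """Link each free-text term to the closest vocabulary label, or None."""
    labels = list(vocabulary)
    links = {}
    for term in terms:
        # Normalise case and separators before comparing strings.
        normalized = term.lower().replace("_", " ").strip()
        match = get_close_matches(normalized, labels, n=1, cutoff=cutoff)
        links[term] = vocabulary[match[0]] if match else None
    return links


# Made-up learner terms, including a misspelling and an unknown term.
for term, uri in ground_terms(["Birth_rate", "populaton", "xyz"]).items():
    print(term, "->", uri)
```

Terms that remain ungrounded (mapped to `None`) are the ones for which a learner would need feedback on terminology before the models can be aligned.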
Abstract:
La presente tesis doctoral contribuye al problema del diagnóstico autonómico de fallos en redes de telecomunicación. En las redes de telecomunicación actuales, las operadoras realizan tareas de diagnóstico de forma manual. Dichas operaciones deben ser llevadas a cabo por ingenieros altamente cualificados que cada vez tienen más dificultades a la hora de gestionar debidamente el crecimiento exponencial de la red tanto en tamaño, complejidad y heterogeneidad. Además, el advenimiento del Internet del Futuro hace que la demanda de sistemas que simplifiquen y automaticen la gestión de las redes de telecomunicación se haya incrementado en los últimos años. Para extraer el conocimiento necesario para desarrollar las soluciones propuestas y facilitar su adopción por los operadores de red, se propone una metodología de pruebas de aceptación para sistemas multi-agente enfocada en simplificar la comunicación entre los diferentes grupos de trabajo involucrados en todo proyecto de desarrollo software: clientes y desarrolladores. Para contribuir a la solución del problema del diagnóstico autonómico de fallos, se propone una arquitectura de agente capaz de diagnosticar fallos en redes de telecomunicación de manera autónoma. Dicha arquitectura extiende el modelo de agente Belief-Desire- Intention (BDI) con diferentes modelos de diagnóstico que gestionan las diferentes sub-tareas del proceso. La arquitectura propuesta combina diferentes técnicas de razonamiento para alcanzar su propósito gracias a un modelo estructural de la red, que usa razonamiento basado en ontologías, y un modelo causal de fallos, que usa razonamiento Bayesiano para gestionar debidamente la incertidumbre del proceso de diagnóstico. Para asegurar la adecuación de la arquitectura propuesta en situaciones de gran complejidad y heterogeneidad, se propone un marco de argumentación que permite diagnosticar a agentes que estén ejecutando en dominios federados. 
Para la aplicación de este marco en un sistema multi-agente, se propone un protocolo de coordinación en el que los agentes dialogan hasta alcanzar una conclusión para un caso de diagnóstico concreto. Como trabajos futuros, se consideran la extensión de la arquitectura para abordar otros problemas de gestión como el auto-descubrimiento o la auto-optimización, el uso de técnicas de reputación dentro del marco de argumentación para mejorar la extensibilidad del sistema de diagnóstico en entornos federados y la aplicación de las arquitecturas propuestas en las arquitecturas de red emergentes, como SDN, que ofrecen mayor capacidad de interacción con la red. ABSTRACT This PhD thesis contributes to the problem of autonomic fault diagnosis of telecommunication networks. In today's telecommunication networks, operators perform diagnosis tasks manually. Those operations must be carried out by highly skilled network engineers, who find it increasingly difficult to manage the growth of those networks in size, complexity and heterogeneity. Moreover, with the advent of the Future Internet, the demand for solutions that simplify and automate telecommunication network management has increased in recent years. To collect the domain knowledge required to develop the proposed solutions and to simplify their adoption by operators, an agile testing methodology is defined for multi-agent systems. This methodology focuses on the communication gap between the different work groups involved in any software development project: stakeholders and developers. To contribute to overcoming the problem of autonomic fault diagnosis, an agent architecture for fault diagnosis of telecommunication networks is defined. That architecture extends the Belief-Desire-Intention (BDI) agent model with different diagnostic models which handle the different subtasks of the process. 
The proposed architecture combines different reasoning techniques to achieve its objective using a structural model of the network, which uses ontology-based reasoning, and a causal model, which uses Bayesian reasoning to properly handle the uncertainty of the diagnosis process. To ensure the suitability of the proposed architecture in complex and heterogeneous environments, an argumentation framework is defined. This framework allows agents to perform fault diagnosis in federated domains. To apply this framework in a multi-agent system, a coordination protocol is defined. Agents use this protocol to engage in dialogue until a reliable conclusion for a specific diagnosis case is reached. Future work comprises the further extension of the agent architecture to address other management problems, such as self-discovery or self-optimisation; the application of reputation techniques in the argumentation framework to improve the extensibility of the diagnostic system in federated domains; and the application of the proposed agent architecture in emergent networking architectures, such as SDN, which offer new capabilities for controlling the network.
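How Bayesian reasoning handles diagnostic uncertainty can be shown with a minimal Python sketch. It is a deliberate simplification: a single application of Bayes' rule over mutually exclusive fault hypotheses, not the causal model of the thesis. The function name `posterior`, the fault names, and all probabilities are made up for illustration.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over mutually exclusive fault hypotheses.

    priors:      fault -> P(fault)
    likelihoods: fault -> P(observed symptom | fault)
    Returns:     fault -> P(fault | observed symptom)
    """
    joint = {fault: priors[fault] * likelihoods[fault] for fault in priors}
    evidence = sum(joint.values())  # P(observed symptom)
    if evidence == 0:
        raise ValueError("the observed symptom is impossible under every hypothesis")
    return {fault: p / evidence for fault, p in joint.items()}


# Hypothetical beliefs before observing anything (illustrative numbers only).
priors = {"link_down": 0.2, "router_misconfigured": 0.3, "no_fault": 0.5}
# Hypothetical probability of observing packet loss under each hypothesis.
packet_loss_given = {"link_down": 0.9, "router_misconfigured": 0.5, "no_fault": 0.1}

beliefs = posterior(priors, packet_loss_given)
most_likely = max(beliefs, key=beliefs.get)
print(most_likely, round(beliefs[most_likely], 3))
```

An agent could feed the returned beliefs back in as priors when a further symptom is observed, which is one simple way to accumulate evidence during a diagnosis.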
Abstract:
La meta de intercambiabilidad de piezas establecida en los sistemas de producción del siglo XIX, es ampliada en el último cuarto del siglo pasado para lograr la capacidad de fabricación de varios tipos de producto en un mismo sistema de manufactura, requerimiento impulsado por la incertidumbre del mercado. Esta incertidumbre conduce a plantear la flexibilidad como característica importante en el sistema de producción. La presente tesis se ubica en el problema de integración del sistema informático (SI) con el equipo de producción (EP) en la búsqueda de una solución que coadyuve a satisfacer los requerimientos de flexibilidad impuestas por las condiciones actuales de mercado. Se describen antecedentes de los sistemas de producción actuales y del concepto de flexibilidad. Se propone una clasificación compacta y práctica de los tipos de flexibilidad relevantes en el problema de integración SI-EP, con la finalidad de ubicar el significado de flexibilidad en el área de interés. Así mismo, las variables a manejar en la solución son clasificadas en cuatro tipos: Medio físico, lenguajes de programación y controlador, naturaleza del equipo y componentes de acoplamiento. Por otra parte, la característica de reusabilidad como un efecto importante y deseable de un sistema flexible, es planteada como meta en la solución propuesta no solo a nivel aplicación del sistema sino también a nivel de reuso de conceptos de diseño. Se propone un esquema de referencia en tres niveles de abstracción, que permita manejar y reutilizar en forma organizada el conocimiento del dominio de aplicación (integración SI-EP), el desarrollo de sistemas de aplicación genérica así como también la aplicación del mismo en un caso particular. Un análisis del concepto de acoplamiento débil (AD) es utilizado como base en la solución propuesta al problema de integración SI-EP. 
El desarrollo inicia identificando condiciones para la existencia del acoplamiento débil, compensadores para soportar la operación del sistema bajo AD y los efectos que ocasionan en el sistema informático los cambios en el conjunto de equipos de producción. Así mismo, se introducen como componentes principales del acoplamiento los componentes tecnológico, tarea y rol, a utilizar en el análisis de los requerimientos para el desarrollo de una solución de AD entre SI-EP. La estructura de tres niveles del esquema de referencia propuesto surge del análisis del significado de conceptos de referencia comúnmente reportados en la literatura, tales como arquitectura de referencia, modelo de referencia, marco de trabajo, entre otros. Se presenta un análisis de su significado como base para la definición de cada uno de los niveles de la estructura del esquema, pretendiendo con ello evitar la ambigüedad existente debido al uso indistinto de tales conceptos en la literatura revisada. Por otra parte, la relación entre niveles es definida tomando como base la estructura de cuatro capas planteada en el área de modelado de datos. La arquitectura de referencia, implementada en el primer nivel del esquema propuesto es utilizada como base para el desarrollo del modelo de referencia o marco de trabajo para el acoplamiento débil entre el SI y el EP. La solución propuesta es validada en la integración de un sistema informático de coordinación de flujo y procesamiento de pieza con un conjunto variable de equipos de diferentes tipos, naturaleza y fabricantes. 
En el ejercicio de validación se abordaron diferentes estándares y técnicas comúnmente empleadas como soporte al problema de integración a nivel componente tecnológico, tales como herramientas de cero configuración (ejemplo: plug and play), estándar OPC-UA, colas de mensajes y servicios web, permitiendo así ubicar el apoyo de estas técnicas en el ámbito del componente tecnológico y su relación con los otros componentes de acoplamiento: tarea y rol. ABSTRACT The interchangeability of parts, a goal of manufacturing systems in the nineteenth century, was extended in the last quarter of the past century to achieve the ability to manufacture various types of products in the same manufacturing system, a requirement driven by market uncertainty. This uncertainty raises flexibility as an important feature of the production system. This thesis addresses the problem of integrating the software system (SS) with the set of production equipment (PE), looking for a solution that helps to satisfy the flexibility requirements that current market conditions impose on manufacturing, particularly on the production floor. The background of current production systems and the concept of flexibility are described and analyzed in detail. A practical and compact classification of the flexibility types relevant to the SS-PE integration problem is proposed with the aim of delimiting the meaning of flexibility in the area of interest. The variables involved in the integration problem are also classified into four types: physical media, programming and controller languages, equipment nature, and coupling components. In addition, reusability, regarded as an important and desirable effect of a flexible system, is taken as a goal of the proposed solution, not only at the system implementation level but also at the system design level. 
To this end, a reference scheme consisting of three abstraction levels is proposed to systematically support the management and reuse of domain knowledge (SS-PE integration), the development of a generic system, and its application in a particular case. The concept of loose coupling is used as the basis for the proposed solution to the SS-PE integration problem. The first step of the development process is an analysis of the loose coupling concept, identifying the conditions for its existence, compensators for system operation under loose coupling conditions, and the effects on the software system of changes in the set of production equipment. In addition, three coupling components (technological, task and role) are introduced as the main components to support the analysis of requirements for loose coupling between SS and PE. The three-tier structure of the proposed reference scheme emerges from the analysis of reference concepts commonly reported in the literature, such as reference architecture, reference model and framework, among others. An analysis of these concepts is used as the basis for defining the levels of the proposed scheme, in an attempt to avoid the ambiguity caused by the indiscriminate use of such concepts in the reviewed literature. In addition, the relation between adjacent levels of the structure is defined based on the four-tier structure commonly used in the data modelling area. The reference architecture is located at the first level of the proposed reference scheme and is used as the basis for developing the reference model, or loose coupling framework, for SS-PE integration. The proposed solution is validated by integrating a software system (a process and piece flow coordination system) with a variable set of production equipment of different types, natures and manufacturers. 
Furthermore, this validation exercise took into account different standards and techniques commonly used to support integration at the level of the technological coupling component, such as zero-configuration tools (e.g. Plug and Play), message queues, the OPC-UA standard, and web services. Through this part of the validation exercise, these integration tools are located within the technological component and related to the role and task components of coupling.
Abstract:
Contexto: La presente tesis doctoral se enmarca en la actividad de educción de los requisitos. La educción de requisitos es generalmente aceptada como una de las actividades más importantes dentro del proceso de Ingeniería de Requisitos, y tiene un impacto directo en la calidad del software. Es una actividad donde la comunicación entre los involucrados (analistas, clientes, usuarios) es primordial. La efectividad y eficacia del analista en la comprensión de las necesidades de clientes y usuarios es un factor crítico para el éxito del desarrollo de software. La literatura se ha centrado principalmente en estudiar y comprender un conjunto específico de capacidades o habilidades personales que debe poseer el analista para realizar de forma efectiva la actividad de educción. Sin embargo, existen muy pocos trabajos que han estudiado dichas capacidades o habilidades empíricamente. Objetivo: La presente investigación tiene por objetivo estudiar el efecto de la experiencia, el conocimiento acerca del dominio y la titulación académica que poseen los analistas en la efectividad del proceso de educción de los requisitos, durante los primeros contactos del analista con el cliente. Método de Investigación: Hemos ejecutado 8 estudios empíricos entre cuasi-experimentos (4) y experimentos controlados (4). Un total de 110 sujetos experimentales han participado en los estudios, entre estudiantes de post-grado de la Escuela Técnica Superior de Ingenieros Informáticos de la Universidad Politécnica de Madrid y profesionales. La tarea experimental consistió en realizar sesiones de educción de requisitos sobre uno o más dominios de problemas (de carácter conocido y desconocido para los sujetos). Las sesiones de educción se realizaron empleando la entrevista abierta. Finalizada la entrevista, los sujetos reportaron por escrito toda la información adquirida. 
Resultados: Para dominios desconocidos, la experiencia (entrevistas, requisitos, desarrollo y profesional) del analista no influye en su efectividad. En dominios conocidos, la experiencia en entrevistas (r = 0.34, p-valor = 0.080) y la experiencia en requisitos (r = 0.22, p-valor = 0.279), ejercen un efecto positivo. Esto es, los analistas con más años de experiencia en entrevistas y/o requisitos tienden a alcanzar mejores efectividades. Por el contrario, la experiencia en desarrollo (r = -0.06, p-valor = 0.765) y la experiencia profesional (r = -0.35, p-valor = 0.077), tienden a ejercer un efecto nulo y negativo, respectivamente. En lo que respecta al conocimiento acerca del dominio del problema que poseen los analistas, ejerce un moderado efecto positivo (r=0.31), estadísticamente significativo (p-valor = 0.029) en la efectividad de la actividad de educción. Esto es, los analistas con conocimiento tienden a ser más efectivos en los dominios de problema conocidos. En lo que respecta a la titulación académica, por falta de diversidad en las titulaciones académicas de los sujetos experimentales no es posible alcanzar una conclusión. Hemos podido explorar el efecto de la titulación académica en sólo dos cuasi-experimentos, sin embargo, nuestros resultados arrojan efectos contradictorios (r = 0.694, p-valor = 0.51 y r = -0.266, p-valor = 0.383). Además de las variables estudiadas indicadas anteriormente, hemos confirmado la existencia de variables moderadoras que afectan a la actividad de educción, tales como el entrevistado o la formación. Nuestros datos experimentales confirman que el entrevistado es un factor clave en la actividad de educción. Estadísticamente ejerce una influencia significativa en la efectividad de los analistas (p-valor= 0.000). La diferencia entre entrevistar a uno u otro entrevistado, en unidades naturales, varía entre un 18% - 23% en efectividad. 
Por otro lado, la formación en requisitos aumenta considerablemente la efectividad de los analistas. Los sujetos que realizaron la educción de requisitos después de recibir una formación específica en requisitos tienden a ser entre un 12% y 20% más efectivos que aquellos que no la recibieron. El efecto es significativo (p-valor = 0.000). Finalmente, hemos observado tres hechos que podrían influir en los resultados de esta investigación. En primer lugar, la efectividad de los analistas es diferencial dependiendo del tipo de elemento del dominio. En dominios conocidos, los analistas con experiencia tienden a adquirir más conceptos que los analistas noveles. En los dominios desconocidos, son los procesos los que se adquieren de forma prominente. En segundo lugar, los analistas llegan a una especie de “techo de cristal” que no les permite adquirir más información. Es decir, el analista sólo reconoce (parte de) los elementos del dominio del problema mencionado. Este hecho se observa tanto en el dominio de problema desconocido como en el conocido, y parece estar relacionado con el modo en que los analistas exploran el dominio del problema. En tercer lugar, aunque los años de experiencia no parecen predecir cuán efectivo será un analista, sí parecen asegurar que un analista con cierta experiencia, en general, tendrá una efectividad mínima que será superior a la efectividad mínima de los analistas con menos experiencia. Conclusiones: Los resultados obtenidos muestran que en dominios desconocidos, la experiencia por sí misma no determina la efectividad de los analistas de requisitos. En dominios conocidos, la efectividad de los analistas se ve influenciada por su experiencia en entrevistas y requisitos, aunque sólo parcialmente. Otras variables influyen en la efectividad de los analistas, como podrían ser las habilidades débiles. 
El conocimiento del dominio del problema por parte del analista ejerce un efecto positivo en la efectividad de los analistas, e interacciona positivamente con la experiencia incrementando aún más la efectividad de los analistas. Si bien no fue posible obtener conclusiones sólidas respecto al efecto de la titulación académica, sí parece claro que la formación específica en requisitos ejerce una importante influencia positiva en la efectividad de los analistas. Finalmente, el analista no es el único factor relevante en la actividad de educción. Los clientes/usuarios (entrevistados) también juegan un rol importante en el proceso de generación de información. ABSTRACT Context: This PhD dissertation addresses the requirements elicitation activity. Requirements elicitation is generally acknowledged as one of the most important activities of the requirements process, having a direct impact on software quality. It is an activity where communication among stakeholders (analysts, customers, users) is paramount. The analyst's ability to effectively understand the needs of customers and users is a critical factor for the success of software development. The literature has focused on studying and understanding a specific set of personal skills that the analyst must have to perform requirements elicitation effectively. However, few studies have explored those skills from an empirical viewpoint. Goal: This research aims to study the effects of experience, domain knowledge and academic qualifications on analysts' effectiveness when performing requirements elicitation during the first stages of analyst-customer interaction. Research method: We have conducted eight empirical studies: four quasi-experiments and four controlled experiments. A total of 110 experimental subjects participated, including graduate students at the Escuela Técnica Superior de Ingenieros Informáticos of the Universidad Politécnica de Madrid, as well as researchers and professionals. 
The experimental tasks consisted in elicitation sessions about one or several problem domains (ignorant and/or aware for the subjects). Elicitation sessions were conducted using unstructured interviews. After each interview, the subjects reported in written all collected information. Results: In ignorant domains, the analyst’s experience (interviews, requirements, development and professional) does not influence her effectiveness. In aware domains, interviewing experience (r = 0.34, p-value = 0.080) and requirements experience (r = 0.22, p-value = 0.279), make a positive effect, i.e.: the analysts with more years of interviewing/requirements experience tend to achieve higher effectiveness. On the other hand, development experience (r = -0.06, p-value = 0.765) and professional experience (r = -0.35, p-value = 0.077) tend to make a null and negative effect, respectively. On what regards the analyst’s problem domain knowledge, it makes a modest positive effect (r=0.31), statistically significant (p-value = 0.029) on the effectiveness of the elicitation activity, i.e.: the analysts with tend to be more effective in problem domains they are aware of. On what regards academic qualification, due to the lack of diversity in the subjects’ academic degrees, we cannot come to a conclusion. We have been able to explore the effect of academic qualifications in two experiments; however, our results show opposed effects (r = 0.694, p-value = 0.51 y r = -0.266, p-value = 0.383). Besides the variables mentioned above, we have confirmed the existence of moderator variables influencing the elicitation activity, such as the interviewee and the training. Our data confirm that the interviewee is a key factor in the elicitation activity; it makes statistically significant effect on analysts’ effectiveness (p-value = 0.000). Interviewing one or another interviewee represents a difference in effectiveness of 18% - 23%, in natural units. 
On the other hand, requirements training substantially increases analysts’ effectiveness. Subjects who performed requirements elicitation after specific training tend to be 12% - 20% more effective than those who did not receive training. The effect is statistically significant (p-value = 0.000). Finally, we have observed three phenomena that could have an influence on the results of this research. First, analysts’ effectiveness differs depending on domain element types. In aware domains, experienced analysts tend to capture more concepts than novices. In ignorant domains, processes are identified more frequently. Second, analysts reach a “glass ceiling” that prevents them from acquiring more information, i.e., analysts only identify (part of) the elements of the problem domain. This can be observed in both ignorant and aware domains. Third, years of experience do not appear to be a good predictor of how effective an analyst will be; however, they seem to guarantee that an analyst with some years of experience will have a higher minimum effectiveness than analysts with fewer years of experience. Conclusions: Our results indicate that experience alone does not explain analysts’ effectiveness in ignorant domains. In aware domains, analysts’ effectiveness is influenced by experience in interviews and requirements, albeit partially. Other variables influence analysts’ effectiveness, e.g., soft skills. The analysts’ problem domain knowledge has a positive effect on analysts’ effectiveness; it interacts positively with experience, increasing analysts’ effectiveness even further. Although we could not obtain solid conclusions on the effect of academic qualifications, it is clear that specific requirements training has a rather positive effect on analysts’ effectiveness. Finally, the analyst is not the only relevant factor in the elicitation activity. 
The customers/users (interviewees) also play an important role in the information generation process.
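The r values reported in the abstract above are correlation coefficients between an analyst characteristic (for example, years of interviewing experience) and elicitation effectiveness. The abstract does not name the statistic, so the following is only a minimal sketch that assumes a Pearson coefficient. The function names and the sample numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing 'no correlation', with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Made-up illustrative values: years of experience vs. effectiveness.
    years = [1, 2, 3, 4, 5]
    effectiveness = [0.30, 0.35, 0.33, 0.42, 0.45]
    r = pearson_r(years, effectiveness)
    print(r, t_statistic(r, len(years)))
```

A p-value such as those quoted in the abstract would be obtained by comparing the t statistic against a Student t distribution with n - 2 degrees of freedom, which a statistics library would normally handle.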
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
The initial aim of this research was to investigate the application of Expert Systems, or Knowledge Base Systems, technology to the automated synthesis of Hazard and Operability Studies. Due to the generic nature of Fault Analysis problems and the way in which Knowledge Base Systems work, this goal has evolved into a consideration of automated support for Fault Analysis in general, covering HAZOP, Fault Tree Analysis, FMEA and Fault Diagnosis in the Process Industries. This thesis describes a proposed architecture for such an Expert System. The purpose of the System is to produce a descriptive model of faults and fault propagation from a description of the physical structure of the plant. From these descriptive models, the desired Fault Analysis may be produced. The way in which this is done reflects the complexity of the problem which, in principle, encompasses the whole of the discipline of Process Engineering. An attempt is made to incorporate the perceived method that an expert uses to solve the problem; keywords, heuristics and guidelines from techniques such as HAZOP and Fault Tree Synthesis are used. In a true Expert System, the performance of the system is strongly dependent on the quality of the knowledge that is incorporated. This expert knowledge takes the form of heuristics or rules of thumb which are used in problem solving. This research has shown that, for the application of fault analysis heuristics, it is necessary to have a representation of the details of fault propagation within a process. This helps to ensure the robustness of the system: a gradual rather than abrupt degradation at the boundaries of the domain knowledge.
Resumo:
This thesis introduces a flexible visual data exploration framework which combines advanced projection algorithms from the machine learning domain with visual representation techniques developed in the information visualisation domain to help a user effectively explore and understand large multi-dimensional datasets. The advantage of such a framework over other techniques currently available to domain experts is that the user is directly involved in the data mining process and advanced machine learning algorithms are employed for better projection. A hierarchical visualisation model guided by a domain expert allows them to obtain an informed segmentation of the input space. Two other components of this thesis exploit properties of these principled probabilistic projection algorithms to develop a guided mixture of local experts algorithm which provides robust prediction and a model to estimate feature saliency simultaneously with the training of a projection algorithm. Local models are useful since a single global model cannot capture the full variability of a heterogeneous data space such as the chemical space. Probabilistic hierarchical visualisation techniques provide an effective soft segmentation of an input space by a visualisation hierarchy whose leaf nodes represent different regions of the input space. We use this soft segmentation to develop a guided mixture of local experts (GME) algorithm which is appropriate for the heterogeneous datasets found in chemoinformatics problems. Moreover, in this approach the domain experts are more involved in the model development process, which is suitable for an intuition- and domain-knowledge-driven task such as drug discovery. We also derive a generative topographic mapping (GTM) based data visualisation approach which estimates feature saliency simultaneously with the training of a visualisation model.
Resumo:
This thesis describes work exploring the application of expert system techniques to the domain of designing durable concrete. The nature of concrete durability design is described and some problems from the domain are discussed. Some related work on expert systems in concrete durability is described. Various implementation languages (PROLOG and OPS5) are considered and rejected in favour of a shell, CRYSTAL3 (later CRYSTAL4). Criteria for useful expert system shells in the domain are discussed. CRYSTAL4 is evaluated in the light of these criteria. Modules in various sub-domains (mix-design, sulphate attack, steel-corrosion and alkali aggregate reaction) are developed and organised under a BLACKBOARD system (called DEX). Extensions to the CRYSTAL4 modules are considered for different knowledge representations. These include LOTUS123 spreadsheets implementing models incorporating some of the mathematical knowledge in the domain. Design databases are used to represent tabular design knowledge. Hypertext representations of the original building standards texts are proposed as a tool for providing a well-structured and extensive justification/help facility. A standardised approach to module development is proposed using hypertext development as a structured basis for expert systems development. Some areas of deficient domain knowledge are highlighted, particularly in the use of data from mathematical models and in gaps and inconsistencies in the original knowledge source Digests.
Resumo:
The primary objective of this research was to understand what kinds of knowledge and skills people use in 'extracting' relevant information from text and to assess the extent to which expert systems techniques could be applied to automate the process of abstracting. The approach adopted in this thesis is based on research in cognitive science, information science, psycholinguistics and textlinguistics. The study addressed the significance of domain knowledge and heuristic rules by developing an information extraction system, called INFORMEX. This system, which was implemented partly in SPITBOL and partly in PROLOG, used a set of heuristic rules to analyse five scientific papers of expository type, to interpret the content in relation to the key abstract elements and to extract a set of sentences recognised as relevant for abstracting purposes. The analysis of these extracts revealed that an adequate abstract could be generated. Furthermore, INFORMEX showed that a rule based system was a suitable computational model to represent experts' knowledge and strategies. This computational technique provided the basis for a new approach to the modelling of cognition. It showed how experts tackle the task of abstracting by integrating formal knowledge as well as experiential learning. This thesis demonstrated that empirical and theoretical knowledge can be effectively combined in expert systems technology to provide a valuable starting approach to automatic abstracting.
Resumo:
In the developed world we are surrounded by man-made objects, but most people give little thought to the complex processes needed for their design. The design of hand knitting is complex because much of the domain knowledge is tacit. The objective of this thesis is to devise a methodology to help designers to work within design constraints, whilst facilitating creativity. A hybrid solution including computer aided design (CAD) and case based reasoning (CBR) is proposed. The CAD system creates designs using domain-specific rules and these designs are employed for initial seeding of the case base and the management of constraints. CBR reuses the designer's previous experience. The key aspects in the CBR system are measuring the similarity of cases and adapting past solutions to the current problem. Similarity is measured by asking the user to rank the importance of features; the ranks are then used to calculate weights for an algorithm which compares the specifications of designs. A novel adaptation operator called rule difference replay (RDR) is created. When the specification for a new design is presented, the CAD program uses it to construct a design constituting an approximate solution. The most similar design from the case base is then retrieved and RDR replays the changes previously made to the retrieved design on the new solution. A measure of solution similarity that can validate subjective success scores is created. Specification similarity can be used as a guide to whether to invoke CBR in a hybrid CAD-CBR system. If the resulting design is sufficiently similar to a previous design, then CBR is invoked; otherwise CAD is used. The application of RDR to knitwear design has demonstrated the flexibility to overcome deficiencies in rules that try to automate creativity, and has the potential to be applied to other domains such as interior design.
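The rank-based similarity measure and the choice between CBR and CAD described in the abstract above could look roughly like the sketch below. The abstract does not give the weighting formula, the comparison algorithm or the threshold, so everything here is an assumption made for illustration: the rank-sum weighting, the exact-match comparison, the 0.7 threshold, and all function and feature names are hypothetical.

```python
def rank_weights(ranks):
    """Convert importance ranks (1 = most important) into weights summing to 1.

    Rank-sum weighting is an assumed scheme, not the one from the thesis.
    """
    n = len(ranks)
    raw = {feature: n - rank + 1 for feature, rank in ranks.items()}
    total = sum(raw.values())
    return {feature: value / total for feature, value in raw.items()}


def specification_similarity(spec_a, spec_b, weights):
    """Sum the weights of the features on which both specifications agree."""
    return sum(
        weight
        for feature, weight in weights.items()
        if feature in spec_a and feature in spec_b
        and spec_a[feature] == spec_b[feature]
    )


def choose_strategy(new_spec, case_base, weights, threshold=0.7):
    """Return ("CBR", best_case) if a stored case is similar enough, else ("CAD", None)."""
    best_case, best_score = None, -1.0
    for case in case_base:
        score = specification_similarity(new_spec, case, weights)
        if score > best_score:
            best_case, best_score = case, score
    if best_case is not None and best_score >= threshold:
        return "CBR", best_case
    return "CAD", None


if __name__ == "__main__":
    # Hypothetical features ranked by a designer.
    weights = rank_weights({"garment": 1, "neckline": 2, "sleeve": 3})
    cases = [{"garment": "sweater", "neckline": "round", "sleeve": "long"}]
    new = {"garment": "sweater", "neckline": "v", "sleeve": "long"}
    print(choose_strategy(new, cases, weights, threshold=0.6))
```

In this sketch a retrieved case would then be handed to an adaptation step such as RDR, which is not modelled here.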
Resumo:
With advances in science and technology, computing and business intelligence (BI) systems are steadily becoming more complex, with an increasing variety of heterogeneous software and hardware components. They are thus becoming progressively more difficult to monitor, manage and maintain. Traditional approaches to system management have largely relied on domain experts through a knowledge acquisition process that translates domain knowledge into operating rules and policies. This process is widely acknowledged to be cumbersome, labor-intensive, and error-prone, and it struggles to keep up with rapidly changing environments. In addition, many traditional business systems deliver primarily pre-defined historic metrics for long-term strategic or mid-term tactical analysis, and lack the necessary flexibility to support evolving metrics or data collection for real-time operational analysis. There is thus a pressing need for automatic and efficient approaches to monitor and manage complex computing and BI systems. To realize the goal of autonomic management and enable self-management capabilities, we propose to mine historical log data generated by computing and BI systems and automatically extract actionable patterns from this data. This dissertation focuses on the development of different data mining techniques to extract actionable patterns from various types of log data in computing and BI systems. Four key problems are studied: log data categorization and event summarization, leading indicator identification, pattern prioritization by exploring link structures, and a tensor model for three-way log data. Case studies and comprehensive experiments on real application scenarios and datasets are conducted to show the effectiveness of our proposed approaches.