18 resultados para Data recovery (Computer science)
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
As the world becomes more technologically advanced and economies become globalized, computer science evolution has become faster than ever before. With this evolution and globalization come the need for sustainable university curricula that adequately prepare graduates for life in the industry. Additionally, behavioural skills or “soft” skills have become just as important as technical abilities and knowledge or “hard” skills. The objective of this study was to investigate the current skill gap that exists between computer science university graduates and actual industry needs as well as the sustainability of current computer science university curricula by conducting a systematic literature review of existing publications on the subject as well as a survey of recently graduated computer science students and their work supervisors. A quantitative study was carried out with respondents from six countries, mainly Finland, 31 of the responses came from recently graduated computer science professionals and 18 from their employers. The observed trends suggest that a skill gap really does exist particularly with “soft” skills and that many companies are forced to provide additional training to newly graduated employees if they are to be successful at their jobs.
Resumo:
The primary goals of this study are to: embed sustainable concepts of energy consumption into certain part of existing Computer Science curriculum for English schools; investigate how to motivate 7-to-11 years old kids to learn these concepts; promote responsive ICT (Information and Communications Technology) use by these kids in their daily life; raise their awareness of today’s ecological challenges. Sustainability-related ICT lessons developed aim to provoke computational thinking and creativity to foster understanding of environmental impact of ICT and positive environmental impact of small changes in user energy consumption behaviour. The importance of including sustainability into the Computer Science curriculum is due to the fact that ICT is both a solution and one of the causes of current world ecological problems. This research follows Agile software development methodology. In order to achieve the aforementioned goals, sustainability requirements, curriculum requirements and technical requirements are firstly analysed. Secondly, the web-based user interface is designed. In parallel, a set of three online lessons (video, slideshow and game) is created for the website GreenICTKids.com taking into account several green design patterns. Finally, the evaluation phase involves the collection of adults’ and kids’ feedback on the following: user interface; contents; user interaction; impacts on the kids’ sustainability awareness and on the kids’ behaviour with technologies. In conclusion, a list of research outcomes is as follows: 92% of the adults learnt more about energy consumption; 80% of the kids are motivated to learn about energy consumption and found the website easy to use; 100% of the kids understood the contents and liked website’s visual aspect; 100% of the kids will try to apply in their daily life what they learnt through the online lessons.
Resumo:
The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle thesedata sets, reliable and efficient automated tools and methods for data processingand result interpretation are required. Bioinformatics, as the field of studying andprocessing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to studyand process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reportingand visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, coveringseveral aspects of high-throughput data analysis, are specifically aimed for gene expression and genotyping data although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus,robust data analysis workflows are also described, putting the developed tools andmethods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data setandthereforeguidelinesforchoosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examplesare included in the thesis. The first study focuses on spermatogenesis in murinetestis and the second one examines cell lineage specification in mouse embryonicstem cells.
Resumo:
Nowadays the used fuel variety in power boilers is widening and new boiler constructions and running models have to be developed. This research and development is done in small pilot plants where more faster analyse about the boiler mass and heat balance is needed to be able to find and do the right decisions already during the test run. The barrier on determining boiler balance during test runs is the long process of chemical analyses of collected input and outputmatter samples. The present work is concentrating on finding a way to determinethe boiler balance without chemical analyses and optimise the test rig to get the best possible accuracy for heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy located in Messukylä in Tampere. The calculation was created in the data management computer of pilot plants automation system. The calculation is made in Microsoft Excel environment, which gives a good base and functions for handling large databases and calculations without any delicate programming. The automation system in pilot plant was reconstructed und updated by Metso Automation Oy during year 2001 and the new system MetsoDNA has good data management properties, which is necessary for big calculations as boiler balance calculation. Two possible methods for calculating boiler balance during test run were found. Either the fuel flow is determined, which is usedto calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of boiler's heat balance. Both of the methods have their own weaknesses, so they were constructed parallel in the calculation and the decision of the used method was left to user. User also needs to define the used fuels and some solid mass flowsthat aren't measured automatically by the automation system. With sensitivity analysis was found that the most essential values for accurate boiler balance determination are flue gas oxygen content, the boiler's measured heat output and lower heating value of the fuel. The theoretical part of this work concentrates in the error management of these measurements and analyses and on measurement accuracy and boiler balance calculation in theory. The empirical part of this work concentrates on the creation of the balance calculation for the boiler in issue and on describing the work environment.
Resumo:
This thesis considers aspects related to the design and standardisation of transmission systems for wireless broadcasting, comprising terrestrial and mobile reception. The purpose is to identify which factors influence the technical decisions and what issues could be better considered in the design process in order to assess different use cases, service scenarios and end-user quality. Further, the necessity of cross-layer optimisation for efficient data transmission is emphasised and means to take this into consideration are suggested. The work is mainly related terrestrial and mobile digital video broadcasting systems but many of the findings can be generalised also to other transmission systems and design processes. The work has led to three main conclusions. First, it is discovered that there are no sufficiently accurate error criteria for measuring the subjective perceived audiovisual quality that could be utilised in transmission system design. Means for designing new error criteria for mobile TV (television) services are suggested and similar work related to other services is recommended. Second, it is suggested that in addition to commercial requirements there should be technical requirements setting the frame work for the design process of a new transmission system. The technical requirements should include the assessed reception conditions, technical quality of service and service functionalities. Reception conditions comprise radio channel models, receiver types and antenna types. Technical quality of service consists of bandwidth, timeliness and reliability. Of these, the thesis focuses on radio channel models and errorcriteria (reliability) as two of the most important design challenges and provides means to optimise transmission parameters based on these. Third, the thesis argues that the most favourable development for wireless broadcasting would be a single system suitable for all scenarios of wireless broadcasting. It is claimed that there are no major technical obstacles to achieve this and that the recently published second generation digital terrestrial television broadcasting system provides a good basis. The challenges and opportunities of a universal wireless broadcasting system are discussed mainly from technical but briefly also from commercial and regulatory aspect
Resumo:
Broadcasting systems are networks where the transmission is received by several terminals. Generally broadcast receivers are passive devices in the network, meaning that they do not interact with the transmitter. Providing a certain Quality of Service (QoS) for the receivers in heterogeneous reception environment with no feedback is not an easy task. Forward error control coding can be used for protection against transmission errors to enhance the QoS for broadcast services. For good performance in terrestrial wireless networks, diversity should be utilized. The diversity is utilized by application of interleaving together with the forward error correction codes. In this dissertation the design and analysis of forward error control and control signalling for providing QoS in wireless broadcasting systems are studied. Control signaling is used in broadcasting networks to give the receiver necessary information on how to connect to the network itself and how to receive the services that are being transmitted. Usually control signalling is considered to be transmitted through a dedicated path in the systems. Therefore, the relationship of the signaling and service data paths should be considered early in the design phase. Modeling and simulations are used in the case studies of this dissertation to study this relationship. This dissertation begins with a survey on the broadcasting environment and mechanisms for providing QoS therein. Then case studies present analysis and design of such mechanisms in real systems. The mechanisms for providing QoS considering signaling and service data paths and their relationship at the DVB-H link layer are analyzed as the first case study. In particular the performance of different service data decoding mechanisms and optimal signaling transmission parameter selection are presented. The second case study investigates the design of signaling and service data paths for the more modern DVB-T2 physical layer. Furthermore, by comparing the performances of the signaling and service data paths by simulations, configuration guidelines for the DVB-T2 physical layer signaling are given. The presented guidelines can prove useful when configuring DVB-T2 transmission networks. Finally, recommendations for the design of data and signalling paths are given based on findings from the case studies. The requirements for the signaling design should be derived from the requirements for the main services. Generally, these requirements for signaling should be more demanding as the signaling is the enabler for service reception.
Resumo:
In this thesis, simple methods have been sought to lower the teacher’s threshold to start to apply constructive alignment in instruction. From the phases of the instructional process, aspects that can be improved with little effort by the teacher have been identified. Teachers have been interviewed in order to find out what students actually learn in computer science courses. A quantitative analysis of the structured interviews showed that in addition to subject specific skills and knowledge, students learn many other skills that should be mentioned in the learning outcomes of the course. The students’ background, such as their prior knowledge, learning style and culture, affects how they learn in a course. A survey was conducted to map the learning styles of computer science students and to see if their cultural background affected their learning style. A statistical analysis of the data indicated that computer science students are different learners than engineering students in general and that there is a connection between the student’s culture and learning style. In this thesis, a simple self-assessment scale that is based on Bloom’s revised taxonomy has been developed. A statistical analysis of the test results indicates that in general the scale is quite reliable, but single students still slightly overestimate or under-estimate their knowledge levels. For students, being able to follow their own progress is motivating, and for a teacher, self-assessment results give information about how the class is proceeding and what the level of the students’ knowledge is.
Resumo:
Tämä tutkielma kuuluu merkkijonoalgoritmiikan piiriin. Merkkijono S on merkkijonojen X[1..m] ja Y[1..n] yhteinen alijono, mikäli se voidaan muodostaa poistamalla X:stä 0..m ja Y:stä 0..n kappaletta merkkejä mielivaltaisista paikoista. Jos yksikään X:n ja Y:n yhteinen alijono ei ole S:ää pidempi, sanotaan, että S on X:n ja Y:n pisin yhteinen alijono (lyh. PYA). Tässä työssä keskitytään kahden merkkijonon PYAn ratkaisemiseen, mutta ongelma on yleistettävissä myös useammalle jonolle. PYA-ongelmalle on sovelluskohteita – paitsi tietojenkäsittelytieteen niin myös bioinformatiikan osa-alueilla. Tunnetuimpia niistä ovat tekstin ja kuvien tiivistäminen, tiedostojen versionhallinta, hahmontunnistus sekä DNA- ja proteiiniketjujen rakennetta vertaileva tutkimus. Ongelman ratkaisemisen tekee hankalaksi ratkaisualgoritmien riippuvuus syötejonojen useista eri parametreista. Näitä ovat syötejonojen pituuden lisäksi mm. syöttöaakkoston koko, syötteiden merkkijakauma, PYAn suhteellinen osuus lyhyemmän syötejonon pituudesta ja täsmäävien merkkiparien lukumäärä. Täten on vaikeaa kehittää algoritmia, joka toimisi tehokkaasti kaikille ongelman esiintymille. Tutkielman on määrä toimia yhtäältä käsikirjana, jossa esitellään ongelman peruskäsitteiden kuvauksen jälkeen jo aikaisemmin kehitettyjä tarkkoja PYAalgoritmeja. Niiden tarkastelu on ryhmitelty algoritmin toimintamallin mukaan joko rivi, korkeuskäyrä tai diagonaali kerrallaan sekä monisuuntaisesti prosessoiviin. Tarkkojen menetelmien lisäksi esitellään PYAn pituuden ylä- tai alarajan laskevia heuristisia menetelmiä, joiden laskemia tuloksia voidaan hyödyntää joko sellaisinaan tai ohjaamaan tarkan algoritmin suoritusta. Tämä osuus perustuu tutkimusryhmämme julkaisemiin artikkeleihin. Niissä käsitellään ensimmäistä kertaa heuristiikoilla tehostettuja tarkkoja menetelmiä. Toisaalta työ sisältää laajahkon empiirisen tutkimusosuuden, jonka tavoitteena on ollut tehostaa olemassa olevien tarkkojen algoritmien ajoaikaa ja muistinkäyttöä. Kyseiseen tavoitteeseen on pyritty ohjelmointiteknisesti esittelemällä algoritmien toimintamallia hyvin tukevia tietorakenteita ja rajoittamalla algoritmien suorittamaa tuloksetonta laskentaa parantamalla niiden kykyä havainnoida suorituksen aikana saavutettuja välituloksia ja hyödyntää niitä. Tutkielman johtopäätöksinä voidaan yleisesti todeta tarkkojen PYA-algoritmien heuristisen esiprosessoinnin lähes systemaattisesti pienentävän niiden suoritusaikaa ja erityisesti muistintarvetta. Lisäksi algoritmin käyttämällä tietorakenteella on ratkaiseva vaikutus laskennan tehokkuuteen: mitä paikallisempia haku- ja päivitysoperaatiot ovat, sitä tehokkaampaa algoritmin suorittama laskenta on.
Resumo:
The ongoing global financial crisis has demonstrated the importance of a systemwide, or macroprudential, approach to safeguarding financial stability. An essential part of macroprudential oversight concerns the tasks of early identification and assessment of risks and vulnerabilities that eventually may lead to a systemic financial crisis. Thriving tools are crucial as they allow early policy actions to decrease or prevent further build-up of risks or to otherwise enhance the shock absorption capacity of the financial system. In the literature, three types of systemic risk can be identified: i ) build-up of widespread imbalances, ii ) exogenous aggregate shocks, and iii ) contagion. Accordingly, the systemic risks are matched by three categories of analytical methods for decision support: i ) early-warning, ii ) macro stress-testing, and iii ) contagion models. Stimulated by the prolonged global financial crisis, today's toolbox of analytical methods includes a wide range of innovative solutions to the two tasks of risk identification and risk assessment. Yet, the literature lacks a focus on the task of risk communication. This thesis discusses macroprudential oversight from the viewpoint of all three tasks: Within analytical tools for risk identification and risk assessment, the focus concerns a tight integration of means for risk communication. Data and dimension reduction methods, and their combinations, hold promise for representing multivariate data structures in easily understandable formats. The overall task of this thesis is to represent high-dimensional data concerning financial entities on lowdimensional displays. The low-dimensional representations have two subtasks: i ) to function as a display for individual data concerning entities and their time series, and ii ) to use the display as a basis to which additional information can be linked. The final nuance of the task is, however, set by the needs of the domain, data and methods. The following ve questions comprise subsequent steps addressed in the process of this thesis: 1. What are the needs for macroprudential oversight? 2. What form do macroprudential data take? 3. Which data and dimension reduction methods hold most promise for the task? 4. How should the methods be extended and enhanced for the task? 5. How should the methods and their extensions be applied to the task? Based upon the Self-Organizing Map (SOM), this thesis not only creates the Self-Organizing Financial Stability Map (SOFSM), but also lays out a general framework for mapping the state of financial stability. This thesis also introduces three extensions to the standard SOM for enhancing the visualization and extraction of information: i ) fuzzifications, ii ) transition probabilities, and iii ) network analysis. Thus, the SOFSM functions as a display for risk identification, on top of which risk assessments can be illustrated. In addition, this thesis puts forward the Self-Organizing Time Map (SOTM) to provide means for visual dynamic clustering, which in the context of macroprudential oversight concerns the identification of cross-sectional changes in risks and vulnerabilities over time. Rather than automated analysis, the aim of visual means for identifying and assessing risks is to support disciplined and structured judgmental analysis based upon policymakers' experience and domain intelligence, as well as external risk communication.
Resumo:
The horse industry is in many ways still operating the same way as it did in the beginning of the 20th century. At the same time the role of the horse has changed dramatically, from a beast of burden to a top athlete, a production animal or a beloved pet. A racehorse or an equestrian sport horse is trained and taken care of like any other athlete, but unlike its human counterpart, it might end up on our plate. According to European and many other countries’ laws, a horse is a production animal. The medical data of a horse should be known if it is to be slaughtered, to ensure that the meat is safe for human consumption. Today this vital medical information should be noted in the horse’s passport, but this paperbased system is not reliable. If a horse gets sold, depending on the country’s laws, the medical records might not be transferred to the new owner, the horse’s passport might get lost etc. Thus the system is not fool proof. It is not only the horse owners who have to struggle with paperwork; veterinarians as well as other officials often use much time on redundant paperwork. The main research question of this thesis is if IS could be used to help the different stakeholders within the horse industry? Veterinarians in particular who travel to stables to treat horses cannot always take with them their computers, since the somewhat unsanitary environment is not suitable for a sensitive technological device. Currently there is no common medical database developed for horses, although such a database with a support system could help with many problems. These include vaccination and disease control, food-safety, as well as export and import problems. The main stakeholders within the horse industry, including equine veterinarians and horse owners, were studied to find out their daily routines and needs for a possible support system. The research showed that there are different aspects within the horse industry where IS could be used to support the stakeholders daily routines. Thus a support system including web and mobile accessibility for the main stakeholders is under development. Since veterinarians will be the main users of this support system, it is very important to make sure that they find it useful and beneficial in their daily work. To ensure a desired result, the research and development of the system has been done iteratively with the stakeholders following the Action Design Research methodology.
Resumo:
With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
Resumo:
In recent decades, business intelligence (BI) has gained momentum in real-world practice. At the same time, business intelligence has evolved as an important research subject of Information Systems (IS) within the decision support domain. Today’s growing competitive pressure in business has led to increased needs for real-time analytics, i.e., so called real-time BI or operational BI. This is especially true with respect to the electricity production, transmission, distribution, and retail business since the law of physics determines that electricity as a commodity is nearly impossible to be stored economically, and therefore demand-supply needs to be constantly in balance. The current power sector is subject to complex changes, innovation opportunities, and technical and regulatory constraints. These range from low carbon transition, renewable energy sources (RES) development, market design to new technologies (e.g., smart metering, smart grids, electric vehicles, etc.), and new independent power producers (e.g., commercial buildings or households with rooftop solar panel installments, a.k.a. Distributed Generation). Among them, the ongoing deployment of Advanced Metering Infrastructure (AMI) has profound impacts on the electricity retail market. From the view point of BI research, the AMI is enabling real-time or near real-time analytics in the electricity retail business. Following Design Science Research (DSR) paradigm in the IS field, this research presents four aspects of BI for efficient pricing in a competitive electricity retail market: (i) visual data-mining based descriptive analytics, namely electricity consumption profiling, for pricing decision-making support; (ii) real-time BI enterprise architecture for enhancing management’s capacity on real-time decision-making; (iii) prescriptive analytics through agent-based modeling for price-responsive demand simulation; (iv) visual data-mining application for electricity distribution benchmarking. Even though this study is from the perspective of the European electricity industry, particularly focused on Finland and Estonia, the BI approaches investigated can: (i) provide managerial implications to support the utility’s pricing decision-making; (ii) add empirical knowledge to the landscape of BI research; (iii) be transferred to a wide body of practice in the power sector and BI research community.
Resumo:
Technological innovations, the development of the internet, and globalization have increased the number and complexity of web applications. As a result, keeping web user interfaces understandable and usable (in terms of ease-of-use, effectiveness, and satisfaction) is a challenge. As part of this, designing userintuitive interface signs (i.e., the small elements of web user interface, e.g., navigational link, command buttons, icons, small images, thumbnails, etc.) is an issue for designers. Interface signs are key elements of web user interfaces because ‘interface signs’ act as a communication artefact to convey web content and system functionality, and because users interact with systems by means of interface signs. In the light of the above, applying semiotic (i.e., the study of signs) concepts on web interface signs will contribute to discover new and important perspectives on web user interface design and evaluation. The thesis mainly focuses on web interface signs and uses the theory of semiotic as a background theory. The underlying aim of this thesis is to provide valuable insights to design and evaluate web user interfaces from a semiotic perspective in order to improve overall web usability. The fundamental research question is formulated as What do practitioners and researchers need to be aware of from a semiotic perspective when designing or evaluating web user interfaces to improve web usability? From a methodological perspective, the thesis follows a design science research (DSR) approach. A systematic literature review and six empirical studies are carried out in this thesis. The empirical studies are carried out with a total of 74 participants in Finland. The steps of a design science research process are followed while the studies were designed and conducted; that includes (a) problem identification and motivation, (b) definition of objectives of a solution, (c) design and development, (d) demonstration, (e) evaluation, and (f) communication. The data is collected using observations in a usability testing lab, by analytical (expert) inspection, with questionnaires, and in structured and semi-structured interviews. User behaviour analysis, qualitative analysis and statistics are used to analyze the study data. The results are summarized as follows and have lead to the following contributions. Firstly, the results present the current status of semiotic research in UI design and evaluation and highlight the importance of considering semiotic concepts in UI design and evaluation. Secondly, the thesis explores interface sign ontologies (i.e., sets of concepts and skills that a user should know to interpret the meaning of interface signs) by providing a set of ontologies used to interpret the meaning of interface signs, and by providing a set of features related to ontology mapping in interpreting the meaning of interface signs. Thirdly, the thesis explores the value of integrating semiotic concepts in usability testing. Fourthly, the thesis proposes a semiotic framework (Semiotic Interface sign Design and Evaluation – SIDE) for interface sign design and evaluation in order to make them intuitive for end users and to improve web usability. The SIDE framework includes a set of determinants and attributes of user-intuitive interface signs, and a set of semiotic heuristics to design and evaluate interface signs. Finally, the thesis assesses (a) the quality of the SIDE framework in terms of performance metrics (e.g., thoroughness, validity, effectiveness, reliability, etc.) and (b) the contributions of the SIDE framework from the evaluators’ perspective.
Resumo:
Human activity recognition in everyday environments is a critical, but challenging task in Ambient Intelligence applications to achieve proper Ambient Assisted Living, and key challenges still remain to be dealt with to realize robust methods. One of the major limitations of the Ambient Intelligence systems today is the lack of semantic models of those activities on the environment, so that the system can recognize the speci c activity being performed by the user(s) and act accordingly. In this context, this thesis addresses the general problem of knowledge representation in Smart Spaces. The main objective is to develop knowledge-based models, equipped with semantics to learn, infer and monitor human behaviours in Smart Spaces. Moreover, it is easy to recognize that some aspects of this problem have a high degree of uncertainty, and therefore, the developed models must be equipped with mechanisms to manage this type of information. A fuzzy ontology and a semantic hybrid system are presented to allow modelling and recognition of a set of complex real-life scenarios where vagueness and uncertainty are inherent to the human nature of the users that perform it. The handling of uncertain, incomplete and vague data (i.e., missing sensor readings and activity execution variations, since human behaviour is non-deterministic) is approached for the rst time through a fuzzy ontology validated on real-time settings within a hybrid data-driven and knowledgebased architecture. The semantics of activities, sub-activities and real-time object interaction are taken into consideration. The proposed framework consists of two main modules: the low-level sub-activity recognizer and the high-level activity recognizer. The rst module detects sub-activities (i.e., actions or basic activities) that take input data directly from a depth sensor (Kinect). The main contribution of this thesis tackles the second component of the hybrid system, which lays on top of the previous one, in a superior level of abstraction, and acquires the input data from the rst module's output, and executes ontological inference to provide users, activities and their in uence in the environment, with semantics. This component is thus knowledge-based, and a fuzzy ontology was designed to model the high-level activities. Since activity recognition requires context-awareness and the ability to discriminate among activities in di erent environments, the semantic framework allows for modelling common-sense knowledge in the form of a rule-based system that supports expressions close to natural language in the form of fuzzy linguistic labels. The framework advantages have been evaluated with a challenging and new public dataset, CAD-120, achieving an accuracy of 90.1% and 91.1% respectively for low and high-level activities. This entails an improvement over both, entirely data-driven approaches, and merely ontology-based approaches. As an added value, for the system to be su ciently simple and exible to be managed by non-expert users, and thus, facilitate the transfer of research to industry, a development framework composed by a programming toolbox, a hybrid crisp and fuzzy architecture, and graphical models to represent and con gure human behaviour in Smart Spaces, were developed in order to provide the framework with more usability in the nal application. As a result, human behaviour recognition can help assisting people with special needs such as in healthcare, independent elderly living, in remote rehabilitation monitoring, industrial process guideline control, and many other cases. This thesis shows use cases in these areas.
Resumo:
Companies require information in order to gain an improved understanding of their customers. Data concerning customers, their interests and behavior are collected through different loyalty programs. The amount of data stored in company data bases has increased exponentially over the years and become difficult to handle. This research area is the subject of much current interest, not only in academia but also in practice, as is shown by several magazines and blogs that are covering topics on how to get to know your customers, Big Data, information visualization, and data warehousing. In this Ph.D. thesis, the Self-Organizing Map and two extensions of it – the Weighted Self-Organizing Map (WSOM) and the Self-Organizing Time Map (SOTM) – are used as data mining methods for extracting information from large amounts of customer data. The thesis focuses on how data mining methods can be used to model and analyze customer data in order to gain an overview of the customer base, as well as, for analyzing niche-markets. The thesis uses real world customer data to create models for customer profiling. Evaluation of the built models is performed by CRM experts from the retailing industry. The experts considered the information gained with help of the models to be valuable and useful for decision making and for making strategic planning for the future.