Biblioteca Digital

6 resultados para 004 - Informatik (Data processing Computer science)

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland

Big Data Processing in the Cloud: a Hydra/Sufia Experience

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Veja mais

Sustainable Computer Science Education

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As the world becomes more technologically advanced and economies become globalized, computer science evolution has become faster than ever before. With this evolution and globalization come the need for sustainable university curricula that adequately prepare graduates for life in the industry. Additionally, behavioural skills or “soft” skills have become just as important as technical abilities and knowledge or “hard” skills. The objective of this study was to investigate the current skill gap that exists between computer science university graduates and actual industry needs as well as the sustainability of current computer science university curricula by conducting a systematic literature review of existing publications on the subject as well as a survey of recently graduated computer science students and their work supervisors. A quantitative study was carried out with respondents from six countries, mainly Finland, 31 of the responses came from recently graduated computer science professionals and 18 from their employers. The observed trends suggest that a skill gap really does exist particularly with “soft” skills and that many companies are forced to provide additional training to newly graduated employees if they are to be successful at their jobs.

Veja mais

Embedding sustainability into the new computer science curriculum for English schools

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The primary goals of this study are to: embed sustainable concepts of energy consumption into certain part of existing Computer Science curriculum for English schools; investigate how to motivate 7-to-11 years old kids to learn these concepts; promote responsive ICT (Information and Communications Technology) use by these kids in their daily life; raise their awareness of today’s ecological challenges. Sustainability-related ICT lessons developed aim to provoke computational thinking and creativity to foster understanding of environmental impact of ICT and positive environmental impact of small changes in user energy consumption behaviour. The importance of including sustainability into the Computer Science curriculum is due to the fact that ICT is both a solution and one of the causes of current world ecological problems. This research follows Agile software development methodology. In order to achieve the aforementioned goals, sustainability requirements, curriculum requirements and technical requirements are firstly analysed. Secondly, the web-based user interface is designed. In parallel, a set of three online lessons (video, slideshow and game) is created for the website GreenICTKids.com taking into account several green design patterns. Finally, the evaluation phase involves the collection of adults’ and kids’ feedback on the following: user interface; contents; user interaction; impacts on the kids’ sustainability awareness and on the kids’ behaviour with technologies. In conclusion, a list of research outcomes is as follows: 92% of the adults learnt more about energy consumption; 80% of the kids are motivated to learn about energy consumption and found the website easy to use; 100% of the kids understood the contents and liked website’s visual aspect; 100% of the kids will try to apply in their daily life what they learnt through the online lessons.

Veja mais

Kahden merkkijonon pisimmän yhteisen alijonon ongelma ja sen ratkaiseminen

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tämä tutkielma kuuluu merkkijonoalgoritmiikan piiriin. Merkkijono S on merkkijonojen X[1..m] ja Y[1..n] yhteinen alijono, mikäli se voidaan muodostaa poistamalla X:stä 0..m ja Y:stä 0..n kappaletta merkkejä mielivaltaisista paikoista. Jos yksikään X:n ja Y:n yhteinen alijono ei ole S:ää pidempi, sanotaan, että S on X:n ja Y:n pisin yhteinen alijono (lyh. PYA). Tässä työssä keskitytään kahden merkkijonon PYAn ratkaisemiseen, mutta ongelma on yleistettävissä myös useammalle jonolle. PYA-ongelmalle on sovelluskohteita – paitsi tietojenkäsittelytieteen niin myös bioinformatiikan osa-alueilla. Tunnetuimpia niistä ovat tekstin ja kuvien tiivistäminen, tiedostojen versionhallinta, hahmontunnistus sekä DNA- ja proteiiniketjujen rakennetta vertaileva tutkimus. Ongelman ratkaisemisen tekee hankalaksi ratkaisualgoritmien riippuvuus syötejonojen useista eri parametreista. Näitä ovat syötejonojen pituuden lisäksi mm. syöttöaakkoston koko, syötteiden merkkijakauma, PYAn suhteellinen osuus lyhyemmän syötejonon pituudesta ja täsmäävien merkkiparien lukumäärä. Täten on vaikeaa kehittää algoritmia, joka toimisi tehokkaasti kaikille ongelman esiintymille. Tutkielman on määrä toimia yhtäältä käsikirjana, jossa esitellään ongelman peruskäsitteiden kuvauksen jälkeen jo aikaisemmin kehitettyjä tarkkoja PYAalgoritmeja. Niiden tarkastelu on ryhmitelty algoritmin toimintamallin mukaan joko rivi, korkeuskäyrä tai diagonaali kerrallaan sekä monisuuntaisesti prosessoiviin. Tarkkojen menetelmien lisäksi esitellään PYAn pituuden ylä- tai alarajan laskevia heuristisia menetelmiä, joiden laskemia tuloksia voidaan hyödyntää joko sellaisinaan tai ohjaamaan tarkan algoritmin suoritusta. Tämä osuus perustuu tutkimusryhmämme julkaisemiin artikkeleihin. Niissä käsitellään ensimmäistä kertaa heuristiikoilla tehostettuja tarkkoja menetelmiä. Toisaalta työ sisältää laajahkon empiirisen tutkimusosuuden, jonka tavoitteena on ollut tehostaa olemassa olevien tarkkojen algoritmien ajoaikaa ja muistinkäyttöä. Kyseiseen tavoitteeseen on pyritty ohjelmointiteknisesti esittelemällä algoritmien toimintamallia hyvin tukevia tietorakenteita ja rajoittamalla algoritmien suorittamaa tuloksetonta laskentaa parantamalla niiden kykyä havainnoida suorituksen aikana saavutettuja välituloksia ja hyödyntää niitä. Tutkielman johtopäätöksinä voidaan yleisesti todeta tarkkojen PYA-algoritmien heuristisen esiprosessoinnin lähes systemaattisesti pienentävän niiden suoritusaikaa ja erityisesti muistintarvetta. Lisäksi algoritmin käyttämällä tietorakenteella on ratkaiseva vaikutus laskennan tehokkuuteen: mitä paikallisempia haku- ja päivitysoperaatiot ovat, sitä tehokkaampaa algoritmin suorittama laskenta on.

Veja mais

Scheduling dynamic dataflow graphs with model checking

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.

Veja mais

Data analysis tools and methods for DNA microarray and high-throughput sequencing data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle thesedata sets, reliable and eﬃcient automated tools and methods for data processingand result interpretation are required. Bioinformatics, as the ﬁeld of studying andprocessing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to studyand process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workﬂows and strong emphasis on result reportingand visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, coveringseveral aspects of high-throughput data analysis, are speciﬁcally aimed for gene expression and genotyping data although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus,robust data analysis workﬂows are also described, putting the developed tools andmethods into a wider context. The choice of the correct analysis method may also depend on the properties of the speciﬁc data setandthereforeguidelinesforchoosing an optimal method are given. The data analysis tools, methods and workﬂows developed within this thesis have been applied to several research studies, of which two representative examplesare included in the thesis. The ﬁrst study focuses on spermatogenesis in murinetestis and the second one examines cell lineage speciﬁcation in mouse embryonicstem cells.

Veja mais

6 resultados para 004 - Informatik (Data processing Computer science)

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland

Filtro por publicador

Big Data Processing in the Cloud: a Hydra/Sufia Experience

Sustainable Computer Science Education

Embedding sustainability into the new computer science curriculum for English schools

Kahden merkkijonon pisimmän yhteisen alijonon ongelma ja sen ratkaiseminen

Scheduling dynamic dataflow graphs with model checking

Data analysis tools and methods for DNA microarray and high-throughput sequencing data