20 results for Thread safe parallel run-time

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

100.00%

Publisher:

Abstract:

Performance optimization of a complex computer system requires understanding the system's run-time behaviour. As software grows in size and complexity, performance optimization becomes an increasingly important part of the product development process. With the use of more powerful processors, energy consumption and heat generation have also become ever larger problems, especially in small, portable devices. To limit thermal and energy problems, performance scaling methods have been developed, which in turn further increase system complexity and the need for performance optimization. In this work, a visualization and analysis tool was developed to make run-time behaviour easier to understand. In addition, a performance metric was developed that enables different scaling methods to be compared and evaluated independently of the execution environment, based either on an execution trace or on theoretical analysis. The tool presents a trace collected at run time in an easily understandable way. Using three-dimensional graphics, it shows, among other things, the processes, the processor load, the operation of the scaling methods and the energy consumption. The tool also produces numerical information for a user-selected part of the execution trace, including several essential performance figures and statistics. The applicability of the tool was examined by analysing an execution trace obtained from a real device as well as a simulation of performance scaling. The effect of the scaling mechanism's parameters on the performance of the simulated device was analysed.

Relevance:

100.00%

Publisher:

Abstract:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult, as the current popular programming languages are inherently sequential and introducing parallelism is typically left to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent data dependencies in the form of queues. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node has sufficient inputs available, it can, independently of any other node, perform calculations, consume inputs, and produce outputs. Dataflow models have existed for several decades and have become popular for describing signal processing applications, as the graph representation is a very natural representation within this field; digital filters are typically described with boxes and arrows in textbooks as well. Dataflow is also becoming more interesting in other domains, and in principle, any application working on an information stream fits the dataflow paradigm. Such applications include, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be used within reconfigurable video coding. Describing a video coder as a dataflow network instead of with conventional programming languages makes the coder more readable, as it describes how the video data flows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for dataflow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables improved utilization of the available processing units; however, the independent nodes also imply that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several dataflow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic, where each firing requires a firing rule to be evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling; however, the strong encapsulation of dataflow nodes enables analysis, and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with finding the few scheduling decisions that must be made at run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to find the actual schedules, a set of scheduling strategies is defined which is able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform dataflow programs to fit many different computer architectures with different types and numbers of cores. This, in turn, enables dataflow to provide a more platform-independent representation, as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is optimized by the development tools, in the context of design space exploration, to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
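A minimal sketch of the dataflow execution model described above, in plain Python rather than RVC-CAL: actors communicate only through FIFO queues and fire only when enough input tokens are available. The actor class, the example graph and the fully dynamic scheduler loop are illustrative assumptions, not part of the thesis's tool chain.

```python
from collections import deque

class Actor:
    """A dataflow node: fires only when every input FIFO holds enough tokens."""
    def __init__(self, name, inputs, outputs, rates, action):
        self.name, self.inputs, self.outputs = name, inputs, outputs
        self.rates = rates          # tokens consumed per input FIFO per firing
        self.action = action        # pure function: consumed tokens -> produced tokens

    def can_fire(self):
        return all(len(q) >= self.rates[i] for i, q in enumerate(self.inputs))

    def fire(self):
        consumed = [[q.popleft() for _ in range(self.rates[i])]
                    for i, q in enumerate(self.inputs)]
        for q, token in zip(self.outputs, self.action(consumed)):
            q.append(token)

# Build a tiny graph: pre-filled source FIFO -> square -> report, connected only by FIFOs.
a_to_b, b_to_c, results = deque(range(8)), deque(), deque()
square = Actor("square", [a_to_b], [b_to_c], [1], lambda c: [c[0][0] ** 2])
report = Actor("report", [b_to_c], [results], [1], lambda c: [c[0][0]])

# A trivial fully dynamic scheduler: evaluate firing rules until nothing can fire.
# Quasi-static scheduling would replace this loop with mostly pre-computed firing sequences.
actors = [square, report]
while any(a.can_fire() for a in actors):
    for a in actors:
        if a.can_fire():
            a.fire()

print(list(results))   # [0, 1, 4, 9, 16, 25, 36, 49]
```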

Relevance:

100.00%

Publisher:

Abstract:

This thesis estimates the long-run, time-varying conditional correlation between stock and bond returns of the CIVETS (Colombia, Indonesia, Vietnam, Egypt, Turkey, and South Africa) nations. Further, it aims to analyse the presence of asymmetric volatility effects in both asset returns, as well as to observe increases or decreases in conditional correlation during the pre-crisis and crisis periods, which leads to more reliable diversification decisions. The Constant Conditional Correlation (CCC) GARCH model of Bollerslev (1990), the Dynamic Conditional Correlation (DCC) GARCH model (Engle 2002), and the Asymmetric Dynamic Conditional Correlation (ADCC) GARCH model of Cappiello, Engle, and Sheppard (2006) were implemented in the study. The analyses present strong evidence of time-varying conditional correlation in the CIVETS markets, excluding Vietnam, during 2005-2013. In addition, negative innovation effects were found in both the conditional variance and the correlation of the asset returns. The results of this study recommend that investors include financial assets from these markets in their portfolios in order to obtain better stock-bond diversification benefits, especially during high-volatility periods.
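The DCC(1,1) correlation recursion of Engle (2002), one of the models named above, can be sketched in a few lines. Here `std_resid` stands for GARCH-standardized stock and bond return residuals, and the parameters `a` and `b` are placeholders rather than estimates from the thesis.

```python
import numpy as np

def dcc_correlations(std_resid, a=0.03, b=0.95):
    """Dynamic conditional correlations from standardized residuals (T x 2 array).

    Q_t = (1 - a - b) * Q_bar + a * e_{t-1} e_{t-1}' + b * Q_{t-1}
    R_t = diag(Q_t)^(-1/2) Q_t diag(Q_t)^(-1/2)
    """
    T, _ = std_resid.shape
    q_bar = np.cov(std_resid, rowvar=False)      # unconditional covariance of residuals
    q = q_bar.copy()
    corr = np.empty(T)
    for t in range(T):
        if t > 0:
            e = std_resid[t - 1][:, None]
            q = (1 - a - b) * q_bar + a * (e @ e.T) + b * q
        d = np.diag(1.0 / np.sqrt(np.diag(q)))
        r = d @ q @ d
        corr[t] = r[0, 1]                         # stock-bond conditional correlation
    return corr

# Example with simulated residuals; real use would take GARCH-filtered return series.
rng = np.random.default_rng(0)
print(dcc_correlations(rng.standard_normal((500, 2)))[:5])
```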

Relevance:

100.00%

Publisher:

Abstract:

This Master's thesis examines ways to brand and vary S60 software dynamically and at run time. S60 is a development platform used by several phone manufacturers, and their phones are used by numerous different operators. Operators want their phones, or some of the phone's applications, to stand out from competitors under their own brand, and therefore there must be means to brand either the whole phone or selected applications. Some applications may need to switch the brand in use according to the resources they access, such as a network server. It must also be possible to share variation data between different applications or parts of applications. The thesis introduces the Symbian operating system and the S60 development environment, and discusses the challenges that Symbian's security policies pose for sharing variation data between applications. Existing variation mechanisms are studied as a possible basis for the work. The thesis includes a description of a project in which a dynamic branding implementation was developed for an S60 application, one that also allows variation data to be shared with other applications.

Relevance:

100.00%

Publisher:

Abstract:

The Nokia Push To Talk system offers a new communication method alongside the ordinary phone call. One of the most important properties of the new system is the speed of call setup. In addition, the system must follow the general principles of telecommunication systems and be as stable and scalable as possible, so that it is as fault-tolerant and extensible as possible. The main goal of this Master's thesis is to present the design and testing of C++ database libraries. First, the problems of database systems are examined, starting from the selection of a database system and paying particular attention to speed criteria. Then, two technical implementations of two C++ database libraries are presented, and some alternative implementation approaches are discussed.

Relevance:

100.00%

Publisher:

Abstract:

This work examines different techniques for implementing a web user interface. From the techniques studied, those best suited to the goals and constraints of the work are selected and used to create the actual user interface layer for an existing web application. The user interfaces themselves are generated automatically by a user interface generator implemented during the work, which makes use of XML description files that describe the user interfaces. Of the techniques, the AJAX approach suited our needs best, as it enables partial page updates and thus, thanks to faster page refreshes, usability closer to that of a desktop application. The description files used by the user interface generator, in turn, allow user interface controls to be modelled once in a general control description file and then easily customized and laid out on a per-page basis. In addition, the user interface layer includes a versatile set of user interface controls.
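A rough illustration of the generator idea described above: a control modelled once in an XML description file is rendered to HTML per page. The element and attribute names below are invented for the example and do not reflect the actual description format used in the work.

```python
import xml.etree.ElementTree as ET

# Hypothetical control description; the real XML schema in the thesis differs.
CONTROL_XML = """
<page name="customer">
  <textfield id="name" label="Name" maxlength="40"/>
  <select id="type" label="Type">
    <option value="basic">Basic</option>
    <option value="premium">Premium</option>
  </select>
</page>
"""

def render_control(elem):
    """Map one XML control element to an HTML fragment."""
    label = f'<label for="{elem.get("id")}">{elem.get("label")}</label>'
    if elem.tag == "textfield":
        field = f'<input type="text" id="{elem.get("id")}" maxlength="{elem.get("maxlength")}"/>'
    elif elem.tag == "select":
        options = "".join(f'<option value="{o.get("value")}">{o.text}</option>'
                          for o in elem.findall("option"))
        field = f'<select id="{elem.get("id")}">{options}</select>'
    else:
        raise ValueError(f"unknown control: {elem.tag}")
    return label + field

page = ET.fromstring(CONTROL_XML)
html = "\n".join(render_control(c) for c in page)
print(html)   # an AJAX response could swap this fragment into the page without a full reload
```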

Relevance:

100.00%

Publisher:

Abstract:

Social tagging evolved in response to a need to tag heterogeneous objects, the automated tagging of which is usually not feasible by current technological means. Social tagging can be used for more flexible competence management within organizations. The profiles of employees can be built in the form of groups of tags, as employees tag each other based on their familiarity with each other's expertise. This can serve as a replacement for more traditional competence management approaches, which usually become outdated due to social and organizational hurdles and obsolete data. These limitations can be overcome by people tagging, as the information revealed by such tags is usually based on the most recent employee interaction and knowledge. Task management, as part of personal information management, aims at supporting users' individual task handling. This can include collaborating with other individuals, sharing one's knowledge, both functional and process-related, and distributing documents and web resources. In this context, task patterns can be used as templates that collect information and experience around the tasks associated with them during run time, facilitating agility. Effective collaboration among contributors necessitates the means to find the appropriate individuals to work with on a task, and this can be made possible by using social tagging to describe individual competencies. The goal of this study is to support finding and tagging people within task management through the effective exploitation of the work/task context. This involves utilizing knowledge of the workers' expertise, the nature of the task/task pattern, and the information available from the documents and web resources attached to the task. Vice versa, task management provides an excellent environment for social tagging, as the task context already provides suitable tags. The study also aims at assisting users of the task management solution with the collaborative construction of a light-weight ontology by inferring semantic relations between tags. The thesis project aims at an implementation of people finding and tagging within the Java application for task management, which consumes web services that provide the required ontology for the organization.
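One simple way to infer the kind of light-weight tag relations mentioned above is a co-occurrence subsumption heuristic: if tag A appears with most occurrences of tag B but not vice versa, A is suggested as the broader term. The sketch below is a generic illustration of that heuristic with invented data, not the inference method implemented in the thesis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tagging data: each tagged person or task is a set of tags.
tagged_items = [
    {"java", "programming", "threads"},
    {"java", "programming", "web"},
    {"programming", "scheduling"},
    {"java", "web"},
]

tag_count = Counter(t for item in tagged_items for t in item)
pair_count = Counter(frozenset(p) for item in tagged_items
                     for p in combinations(sorted(item), 2))

def suggest_broader(threshold=0.7):
    """Yield (broader, narrower) pairs using P(broader | narrower) >= threshold."""
    for pair, n_ab in pair_count.items():
        a, b = tuple(pair)
        if n_ab / tag_count[b] >= threshold and n_ab / tag_count[a] < threshold:
            yield a, b            # a subsumes b
        elif n_ab / tag_count[a] >= threshold and n_ab / tag_count[b] < threshold:
            yield b, a            # b subsumes a

for broader, narrower in suggest_broader():
    print(f"{broader!r} is suggested as broader than {narrower!r}")
```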

Relevance:

100.00%

Publisher:

Abstract:

This work studies the fatigue strength of a generating set consisting of a Wärtsilä Oyj engine and an ABB Oy generator by means of linear elastic fracture mechanics. The goal of the work is to determine the suitability of the ABAQUS/XFEM and FRANC3D programs as design tools for a generator frame with demanding loading and geometry. The loads on the generator frame arise from vibrations generated while the generating set is running and during the start-up and shutdown phases. The study examines the fatigue of the generator frame under running loads. A selected weld detail of the generator frame was modelled with a submodelling technique, so that the boundary conditions of the submodel could be determined from a response analysis performed for the generating set. Two different weld joint types were examined in the submodel, and cracks of different sizes were modelled with the XFEM and FRANC3D programs at the weld toe and on the root side of the joints. Using the stress intensity factors obtained with the programs under study, an equivalent stress intensity factor could be calculated for each crack and compared with an experimentally obtained threshold value of the stress intensity factor. The XFEM and FRANC3D programs were compared in terms of ease of use, accuracy of results and computation times. Based on usability and computation times, the XFEM program was better suited for use in industry as a supporting tool for design and development work. FRANC3D, on the other hand, gave more reliable results than XFEM, but its computation times were many times longer.
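As a rough illustration of the comparison step described above, the sketch below combines the three mode stress intensity factors into an equivalent value and checks it against a threshold. The energy-release-rate-based combination used here is only one common textbook choice; the abstract does not state which definition was applied in the thesis, and the numerical values are placeholders.

```python
import math

def equivalent_sif(k1, k2, k3, poisson=0.3):
    """Equivalent stress intensity factor, energy-release-rate-based combination.

    K_eq = sqrt(K_I^2 + K_II^2 + K_III^2 / (1 - nu))   [MPa*sqrt(m)]
    """
    return math.sqrt(k1**2 + k2**2 + k3**2 / (1.0 - poisson))

def crack_may_grow(delta_k_eq, delta_k_threshold):
    """Crack is taken as non-propagating if the equivalent range stays below the threshold."""
    return delta_k_eq >= delta_k_threshold

# Placeholder SIF ranges for one modelled weld-toe crack and an assumed threshold value.
dk_eq = equivalent_sif(k1=2.1, k2=0.4, k3=0.2)
print(round(dk_eq, 3), crack_may_grow(dk_eq, delta_k_threshold=5.0))
```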

Relevance:

100.00%

Publisher:

Abstract:

In accordance with Moore's law, the increasing number of on-chip integrated transistors has enabled modern computing platforms with not only higher processing power but also more affordable prices. As a result, these platforms, including portable devices, workstations and data centres, are becoming an inevitable part of human society. However, with the demand for portability and the rising cost of power, energy efficiency has emerged as a major concern for modern computing platforms. As the complexity of on-chip systems increases, the Network-on-Chip (NoC) has proved to be an efficient communication architecture which can further improve system performance and scalability while reducing the design cost. Therefore, in this thesis, we study and propose energy optimization approaches based on the NoC architecture, with special focus on the following aspects. As the architectural trend of future computing platforms, 3D systems have many benefits, including higher integration density, smaller footprint, heterogeneous integration, etc. Moreover, 3D technology can significantly improve network communication and effectively avoid long wires, and therefore provide higher system performance and energy efficiency. Given the dynamic nature of on-chip communication in large-scale NoC-based systems, run-time system optimization is of crucial importance in order to achieve higher system reliability and, essentially, energy efficiency. In this thesis, we propose an agent-based system design approach where agents are on-chip components which monitor and control system parameters such as supply voltage, operating frequency, etc. With this approach, we have analysed the implementation alternatives for dynamic voltage and frequency scaling and power gating techniques at different granularities, which reduce both dynamic and leakage energy consumption. Topologies, being one of the key factors for NoCs, are also explored for energy-saving purposes. A Honeycomb NoC architecture is proposed in this thesis with turn-model based deadlock-free routing algorithms. Our analysis and simulation-based evaluation show that Honeycomb NoCs outperform their Mesh-based counterparts in terms of network cost, system performance as well as energy efficiency.
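A toy version of the agent idea described above: a per-region agent monitors utilization and adjusts the voltage/frequency operating point, trading performance for dynamic power (proportional to C·V²·f). The operating points, thresholds and capacitance are invented for the example and are not taken from the thesis.

```python
# Hypothetical DVFS operating points: (voltage in V, frequency in GHz), lowest first.
OPERATING_POINTS = [(0.8, 0.5), (0.9, 1.0), (1.0, 1.5), (1.1, 2.0)]

class DvfsAgent:
    """Monitors the utilization of one NoC region and scales its V/f operating point."""
    def __init__(self, switched_capacitance_nf=1.0, level=len(OPERATING_POINTS) - 1):
        self.c = switched_capacitance_nf
        self.level = level

    def observe(self, utilization):
        # Simple hysteresis: scale down when mostly idle, scale up when near saturation.
        if utilization < 0.3 and self.level > 0:
            self.level -= 1
        elif utilization > 0.8 and self.level < len(OPERATING_POINTS) - 1:
            self.level += 1

    def dynamic_power_w(self):
        v, f_ghz = OPERATING_POINTS[self.level]
        return self.c * 1e-9 * v**2 * f_ghz * 1e9   # P ~ C * V^2 * f

agent = DvfsAgent()
for util in [0.9, 0.7, 0.2, 0.1, 0.25, 0.85]:        # simulated utilization trace
    agent.observe(util)
    v, f = OPERATING_POINTS[agent.level]
    print(f"util={util:.2f} -> {v:.1f} V, {f:.1f} GHz, {agent.dynamic_power_w():.2f} W")
```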

Relevance:

100.00%

Publisher:

Abstract:

The original contribution of this thesis to knowledge is a set of novel digital readout architectures for hybrid pixel readout chips. The thesis presents an asynchronous bus-based architecture, a data-node based column architecture and a network-based pixel matrix architecture for data transportation. It is shown that the data-node architecture achieves a readout efficiency of 99% with half the output rate of a bus-based system. The network-based solution avoids "broken" columns caused by manufacturing errors, and it distributes internal data traffic more evenly across the pixel matrix than column-based architectures. An improvement of > 10% in efficiency is achieved with both uniform and non-uniform hit occupancies. The architectural design has been done using transaction level modeling (TLM) and sequential high-level design techniques to reduce design and simulation time. It has been possible to simulate tens of column and full-chip architectures using the high-level techniques. Run-time decreases by more than a factor of 10 with these techniques compared to the register transfer level (RTL) design approach, and a 50% reduction in lines of code (LoC) is achieved for the high-level models compared to the RTL description. Two architectures are then demonstrated in two hybrid pixel readout chips. The first chip, Timepix3, has been designed for the Medipix3 collaboration. According to the measurements, it consumes < 1 W/cm^2. It also delivers up to 40 Mhits/s/cm^2 with 10-bit time-over-threshold (ToT) and 18-bit time-of-arrival (ToA) with 1.5625 ns resolution. The chip uses a token-arbitrated, asynchronous two-phase handshake column bus for internal data transfer. It has also been successfully used in a multi-chip particle tracking telescope. The second chip, VeloPix, is a readout chip being designed for the upgrade of the Vertex Locator (VELO) of the LHCb experiment at CERN. Based on the simulations, it consumes < 1.5 W/cm^2 while delivering up to 320 Mpackets/s/cm^2, each packet containing up to 8 pixels. VeloPix uses a node-based data fabric to achieve a throughput of 13.3 Mpackets/s from a column to the end-of-column (EoC) logic. By combining Monte Carlo physics data with high-level simulations, it has been demonstrated that the architecture meets the requirements of the VELO (260 Mpackets/s/cm^2 with an efficiency of 99%).
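The kind of pre-RTL, high-level exploration mentioned above can be approximated with a very small Monte Carlo model: hits arrive in a finite column buffer at a given occupancy, are drained at a fixed readout rate, and the readout efficiency is the fraction of hits that find buffer space. This generic sketch is not the TLM model used for Timepix3 or VeloPix; the rates and buffer depth are placeholders.

```python
import random

def readout_efficiency(mean_hits_per_cycle, drain_per_cycle, buffer_depth,
                       cycles=100_000, seed=1):
    """Fraction of generated hits accepted into a finite column buffer."""
    rng = random.Random(seed)
    occupancy, accepted, generated = 0, 0, 0
    for _ in range(cycles):
        # Hits arriving this cycle: approximately Poisson, via a binomial draw.
        hits = sum(1 for _ in range(10) if rng.random() < mean_hits_per_cycle / 10)
        generated += hits
        accepted += min(hits, buffer_depth - occupancy)
        occupancy = min(buffer_depth, occupancy + hits)
        # Drain towards the end-of-column logic at a fixed rate.
        occupancy = max(0, occupancy - drain_per_cycle)
    return accepted / generated if generated else 1.0

for rate in (0.5, 1.0, 2.0):
    print(rate, round(readout_efficiency(rate, drain_per_cycle=1, buffer_depth=4), 4))
```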

Relevance:

30.00%

Publisher:

Abstract:

The goal of this thesis is to study the factors that affect the price of gold in the short and long run. Second, the thesis examines what different investment options are available for investing in gold. The data consist of monthly observations of US and world price indices, US and world inflation and inflation volatility, the gold beta, the gold lease rate, credit risk, and US and world exchange rate indices from December 1972 to August 2006. Cointegration regression techniques were used to build a model with which the main factors affecting the gold price were studied. A literature review was used to establish how one can invest in gold. The empirical results are consistent with previous studies. Support was found for gold being a long-run hedge against inflation and for gold and US inflation moving together in the long run. However, short-run factors affect the gold price more than long-run factors. Gold is also an easy asset for an investor, as it is readily available in the market and numerous different instruments exist.
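The long-run relationship described above is typically tested with an Engle-Granger-style cointegration test; a minimal sketch using statsmodels is shown below. The series names and the CSV file are hypothetical stand-ins for the thesis's monthly 1972-2006 data set.

```python
import pandas as pd
from statsmodels.tsa.stattools import coint

# Hypothetical monthly data with columns 'gold_price' and 'us_cpi' (log levels).
data = pd.read_csv("gold_monthly.csv", parse_dates=["date"], index_col="date")

# Engle-Granger cointegration test: a low p-value suggests a long-run
# equilibrium relationship between the gold price and the US price level.
t_stat, p_value, crit = coint(data["gold_price"], data["us_cpi"])
print(f"t-statistic={t_stat:.2f}, p-value={p_value:.3f}")
print("5% critical value:", crit[1])
```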

Relevance:

30.00%

Publisher:

Abstract:

The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged, and various parallel and distributed environments have been designed and implemented. Each of the environments, including hardware and software, has unique strengths and weaknesses. There is no single parallel environment that can be identified as the best environment for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of different aspects of parallel computing. For the purpose of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify their suitability for different parallel applications. Due to the parallel and distributed nature of the environments, the networks connecting the processors in these environments were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective. A concept of application-specific information scheduling is introduced. The application-specific information is data about the workload extracted from an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize the application-specific information to further refine their scheduling properties. A more accurate description of the workload is especially important in cases where the work units are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications. Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm incorporates the Message Passing Interface (MPI) with threads to provide a methodology for writing parallel applications that efficiently utilize the available resources and minimize the overhead. MPIT allows communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm. The scheduling algorithm is executed by the communication thread, so the scheduling does not affect the execution of the parallel application. Performance results achieved with MPIT show considerable improvements over conventional MPI applications.
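A bare-bones sketch of the overlap idea behind the MPIT paradigm, written with the mpi4py bindings (an assumption for illustration; this is not the MPIT framework itself): one dedicated thread drives non-blocking MPI communication while the main thread keeps computing.

```python
# Run with: mpiexec -n 2 python mpit_sketch.py
import threading
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                      # assumes exactly two ranks

received = {}

def communicate():
    """Dedicated communication thread: exchange boundary data with the peer rank."""
    send_req = comm.isend({"boundary": rank * 100}, dest=peer, tag=7)
    recv_req = comm.irecv(source=peer, tag=7)
    received["peer"] = recv_req.wait()
    send_req.wait()

comm_thread = threading.Thread(target=communicate)
comm_thread.start()

# Main thread: local computation proceeds while communication is in flight.
local = sum(i * i for i in range(1_000_000))

comm_thread.join()                   # overlap ends here
print(f"rank {rank}: local={local}, from peer={received['peer']}")
```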

Relevance:

30.00%

Publisher:

Abstract:

In this thesis, concurrent communication event handling is implemented using a thread pool approach. Concurrent events are handled with the Reactor design pattern, and multithreading is implemented using the Leader/Followers design pattern. The main focus is to evaluate the behaviour of the implemented model with different numbers of concurrent connections and different numbers of threads. Furthermore, the feasibility of the model in the PeerHood middleware is evaluated. The implemented model is evaluated with a purpose-built test environment that enables concurrent message sending from multiple connections to the system under test. Message round-trip times are measured in the tester application. In the evaluation, a processing delay in the system is simulated and the influence of the delay on the average round-trip time is analysed.
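A condensed Python sketch of the evaluated structure: a reactor loop demultiplexes socket events and hands the actual message processing to a pool of worker threads. This simplifies the Leader/Followers pattern of the thesis into one fixed reactor thread plus workers, and the port and payload handling are invented for the example.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor

sel = selectors.DefaultSelector()
pool = ThreadPoolExecutor(max_workers=4)          # worker threads for message handling

def process(conn, data):
    """Runs in a worker thread: 'handle' the message and send the reply."""
    conn.sendall(b"ack:" + data)

def on_readable(conn):
    data = conn.recv(4096)
    if data:
        pool.submit(process, conn, data)          # hand the work to the pool
    else:
        sel.unregister(conn)
        conn.close()

def on_accept(server):
    conn, _ = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_readable)

server = socket.socket()
server.bind(("127.0.0.1", 9009))                  # hypothetical test port
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, on_accept)

while True:                                       # the reactor event loop
    for key, _ in sel.select():
        key.data(key.fileobj)                     # dispatch the registered handler
```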

Relevance:

30.00%

Publisher:

Abstract:

This thesis considers modeling and analysis of noise and interconnects in on-chip communication. Besides transistor count and speed, the capabilities of a modern design are often limited by on-chip communication links. These links typically consist of multiple interconnects that run parallel to each other for long distances between functional or memory blocks. Due to the scaling of technology, the interconnects have considerable electrical parasitics that affect their performance, power dissipation and signal integrity. Furthermore, because of electromagnetic coupling, the interconnects in the link need to be considered as an interacting group instead of as isolated signal paths. There is a need for accurate and computationally efficient models in the early stages of the chip design process to assess and optimize issues affecting these interconnects. For this purpose, a set of analytical models is developed for on-chip data links in this thesis. First, a model is proposed for modeling crosstalk and intersymbol interference. The model takes into account the effects of inductance, initial states and bit sequences. Intersymbol interference is shown to affect crosstalk voltage and propagation delay depending on bus throughput and the amount of inductance. Next, a model is proposed for the switching current of a coupled bus. The model is combined with an existing model to evaluate power supply noise. The model is then applied to reduce both functional crosstalk and power supply noise caused by a bus as a trade-off with time. The proposed reduction method is shown to be effective in reducing long-range crosstalk noise. The effects of process variation on encoded signaling are then modeled. In encoded signaling, the input signals to a bus are encoded using additional signaling circuitry. The proposed model includes variation in both the signaling circuitry and the wires to calculate the total delay variation of a bus. The model is applied to study level-encoded dual-rail and 1-of-4 signaling. In addition to regular voltage-mode and encoded voltage-mode signaling, current-mode signaling is a promising technique for global communication. A model for energy dissipation in RLC current-mode signaling is proposed in the thesis. The energy is derived separately for the driver, wire and receiver termination.
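A first-order flavour of the kind of modeling discussed above: the energy drawn for one bit transition on a coupled bus line depends on the self capacitance and on how the neighbouring lines switch (the Miller factor on the coupling capacitance). This is a generic textbook-style estimate with made-up capacitance values, not one of the analytical models derived in the thesis.

```python
def transition_energy(own, left, right, c_self=0.2e-12, c_couple=0.3e-12, vdd=1.0):
    """Energy (J) drawn from the supply for one wire's rising transition.

    own, left, right: +1 rising, -1 falling, 0 quiet, for the wire and its neighbours.
    A neighbour switching in the opposite direction doubles the effective coupling.
    """
    if own != 1:                       # only rising transitions draw charge from Vdd here
        return 0.0
    c_eff = c_self
    for n in (left, right):
        c_eff += c_couple * (own - n)  # 0x, 1x or 2x coupling, depending on the neighbour
    return c_eff * vdd**2

# Worst case: both neighbours fall while the victim rises.
print(transition_energy(own=1, left=-1, right=-1))   # ~1.4e-12 J
# Best case: neighbours rise together with the victim, so the coupling is not charged.
print(transition_energy(own=1, left=1, right=1))     # 0.2e-12 J
```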

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this Master's thesis is to optimize the calculation of customers' electricity bills by means of distributed computing. As smart, remotely read energy meters arrive in every household, energy companies are obliged to calculate customers' electricity bills based on hourly metering data. The growing amount of data also increases the number of required computation tasks. The thesis evaluates alternatives for implementing distributed computing and takes a closer look at the possibilities offered by cloud computing. In addition, simulations were run to evaluate the differences between parallel and sequential computation. To support the correct calculation of electricity bills, a measurement tree algorithm was developed.
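The parallel-versus-sequential comparison mentioned above can be sketched with standard Python multiprocessing: each customer's bill is the sum of that customer's hourly consumption multiplied by the hourly prices, and the per-customer tasks are independent, so they map naturally onto a worker pool. The readings and prices below are invented placeholders, and this is not the measurement tree algorithm of the thesis.

```python
import random
from multiprocessing import Pool

HOURS = 24 * 31                                     # one month of hourly readings
# Flat night rate, higher day rate (EUR/kWh); purely illustrative prices.
PRICES = [0.05 + 0.03 * ((h % 24) in range(7, 22)) for h in range(HOURS)]

def bill(customer):
    """Bill for one customer: sum of hourly consumption times hourly price."""
    cid, consumption = customer
    return cid, sum(kwh * p for kwh, p in zip(consumption, PRICES))

def fake_readings(n_customers=1000, seed=42):
    rng = random.Random(seed)
    return [(cid, [rng.uniform(0.1, 3.0) for _ in range(HOURS)])
            for cid in range(n_customers)]

if __name__ == "__main__":
    customers = fake_readings()
    sequential = [bill(c) for c in customers]       # sequential baseline
    with Pool(processes=4) as pool:                 # parallel: one task per customer
        parallel = pool.map(bill, customers)
    assert sequential == parallel
    print(parallel[:3])
```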