40 results for Data fusion applications

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

90.00%

Publisher:

Abstract:

Open data refers to publishing data on the web in machine-readable formats for public access. Using open data, innovative applications can be developed to facilitate people's lives. In this thesis, based on the open data cases discussed in the literature review, Open Data Lappeenranta is proposed: a service that publishes open data on the opening hours of shops and stores in Lappeenranta. To demonstrate the feasibility of Open Data Lappeenranta, the thesis presents the implementation of an open data system that publishes specific data about shops and stores, including their opening hours, on the web in a standard format (JSON). The published open data is used to develop web and mobile applications that demonstrate the benefits of open data in practice. The open data system also provides manual and automatic interfaces through which shops and stores can maintain their own data in the system. Finally, the thesis proposes a completed version of Open Data Lappeenranta that publishes open data related to other fields and businesses in Lappeenranta beyond store data alone.
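As an illustration of the kind of machine-readable record such a system might publish, the sketch below builds a store entry with opening hours and serializes it to JSON; the field names are hypothetical and chosen for illustration, not taken from the thesis's schema.

```python
import json

# Hypothetical record for one store; the field names are illustrative,
# not the actual Open Data Lappeenranta schema.
store = {
    "name": "Example Store",
    "city": "Lappeenranta",
    "opening_hours": [
        {"day": "Mon-Fri", "opens": "09:00", "closes": "18:00"},
        {"day": "Sat", "opens": "10:00", "closes": "15:00"},
    ],
}

# Publish the record in a machine-readable standard format (JSON).
print(json.dumps(store, indent=2))
```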

Relevance:

90.00%

Publisher:

Abstract:

This Master's thesis project concerns Big Data transfer over a parallel data link, and my main objective was to assist the Saint-Petersburg National Research University ITMO research team in accomplishing the project and applying Green IT methods to the data transfer system. The goal of the team is to transfer Big Data using parallel data links with an SDN OpenFlow approach. My task as a team member was to compare existing data transfer applications, to determine which achieves the highest transfer speed under which conditions, and to explain the reasons. In the context of this thesis, five utilities were compared: Fast Data Transfer (FDT), BBCP, BBFTP, GridFTP, and FTS3. A number of scripts were developed to create random binary data (incompressible, so that the comparison between utilities is fair), execute the utilities with specified parameters, produce log files, record results and system parameters, and plot graphs comparing the results. Transferring such an enormous variety of data can take a long time, hence the need to reduce energy consumption and make the transfers greener. As part of the Green IT approach, our team used a cloud computing infrastructure called OpenStack; it is more efficient to allocate a specific amount of hardware resources for testing different scenarios than to use all the resources of our testbed. Testing our implementation on the OpenStack infrastructure ensured that the virtual channel carried no other traffic, so we could achieve the highest possible throughput. With the final results, we can identify which utilities transfer data faster in different scenarios with specific TCP parameters, and these can then be used on real network data links.
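A detail worth making concrete is the incompressible test data: if the payload compressed well, utilities with built-in compression would gain an unfair advantage. Below is a minimal sketch of how such a file might be generated; the exact benchmark scripts are not reproduced here.

```python
import os

def make_test_file(path: str, size_mib: int) -> None:
    """Write cryptographically random bytes: the file is effectively
    incompressible, so every transfer utility moves the same workload."""
    chunk = 1024 * 1024  # write 1 MiB at a time
    with open(path, "wb") as f:
        for _ in range(size_mib):
            f.write(os.urandom(chunk))

make_test_file("testdata.bin", 100)  # 100 MiB of random data
```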

Relevance:

80.00%

Publisher:

Abstract:

Multicast is one method of transferring information in IPv4-based communication; the others are unicast and broadcast. Multicast is based on the group concept, where data is sent from one point to a group of receivers, which saves bandwidth remarkably. Group members express an interest in receiving data by using the Internet Group Management Protocol (IGMP), and traffic is received only by those receivers who want it. The most common multicast applications are media streaming, surveillance, and data collection applications. There are many data security methods to protect unicast communication, which is the most common transfer method on the Internet; popular ones are encryption, authentication, access control, and firewalls. Characteristics of multicast such as dynamic membership mean that not all of these security mechanisms can be used to protect multicast traffic. Nowadays the protection of multicast traffic is possible via traffic restrictions, where traffic is allowed to propagate only to certain areas; one way to implement this is packet filters. The methods tested in this thesis are MVR, IGMP filtering, and access control lists, which worked as expected. These methods restrict the propagation of multicast but are laborious to configure on a large scale. There are also a few manufacturer-specific products that make it possible to encrypt multicast traffic, but these separate products are expensive and mainly intended to protect video transmissions via satellite. Multicast security has been investigated for several years, and the resulting security methods are nearing completion. An IETF working group called MSEC is standardizing these methods; its target is to standardize data security protocols for multicast during 2004.
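To make the group concept concrete, here is a minimal Python receiver that joins a multicast group (the group address and port are illustrative). Setting IP_ADD_MEMBERSHIP makes the host send an IGMP membership report, which is precisely the signal that MVR, IGMP filtering, and access control lists act upon.

```python
import socket
import struct

GROUP, PORT = "239.1.1.1", 5000  # illustrative group address and port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Joining the group triggers an IGMP membership report for GROUP.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

data, addr = sock.recvfrom(65535)
print(f"received {len(data)} bytes from {addr}")
```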

Relevance:

80.00%

Publisher:

Abstract:

Virtual screening is a central technique in drug discovery today. Millions of molecules can be tested in silico with the aim of selecting only the most promising ones for experimental testing. The topic of this thesis is ligand-based virtual screening tools, which take existing active molecules as the starting point for finding new drug candidates. One goal of this thesis was to build a model that gives the probability that two molecules are biologically similar as a function of one or more chemical similarity scores. Another important goal was to evaluate how well different ligand-based virtual screening tools are able to distinguish active molecules from inactive ones. A further criterion set for the virtual screening tools was their applicability to scaffold-hopping, i.e. finding new active chemotypes. In the first part of the work, a link was defined between the abstract chemical similarity score given by a screening tool and the probability that the two molecules are biologically similar. These results help to decide objectively which virtual screening hits to test experimentally. The work also resulted in a new type of data fusion method for use with two or more tools. In the second part, five ligand-based virtual screening tools were evaluated, and their performance was found to be generally poor. Three reasons were proposed: false negatives in the benchmark sets, active molecules that do not share the binding mode, and activity cliffs. In the third part of the study, a novel visualization and quantification method is presented for evaluating the scaffold-hopping ability of virtual screening tools.
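The thesis's own fusion method is not reproduced in the abstract, but the general idea of combining per-tool probabilities can be sketched as follows; the noisy-OR rule and its independence assumption are illustrative, not the method developed in the work.

```python
def fuse_probabilities(probs):
    """Combine per-tool probabilities that two molecules are biologically
    similar, assuming the tools err independently (a strong assumption).
    Noisy-OR: the pair is judged dissimilar only if every tool is wrong."""
    p_all_wrong = 1.0
    for p in probs:
        p_all_wrong *= (1.0 - p)
    return 1.0 - p_all_wrong

# e.g. two tools whose raw scores map to probabilities 0.3 and 0.6
print(fuse_probabilities([0.3, 0.6]))  # 0.72
```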

Relevance:

80.00%

Publisher:

Abstract:

Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of upcoming applications. In an MPSoC, the communication platform is both the key enabler and the key differentiator for realizing efficient MPSoCs. It provides product differentiation against a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability, and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by every kind of application; for example, providing high communication bandwidth to applications that are computation-intensive but not data-intensive is often infeasible in a practical implementation. This thesis performs architecture-level design space exploration towards efficient and scalable resource utilization in MPSoC communication architectures. To meet the performance requirements within the design constraints, careful selection of the MPSoC communication platform and resource-aware partitioning and mapping of the application play an important role. To enhance the utilization of communication resources, a variety of techniques can be used, such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing; for implementation, these techniques should be customized to the platform architecture. To address the resource utilization of MPSoC communication platforms, a variety of architectures with different design parameters and performance levels are selected, namely the segmented bus (SegBus), Network-on-Chip (NoC), and three-dimensional NoC (3D-NoC). Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, a fault in one component makes the connected fault-free components inoperative; a resource-sharing approach can utilize the fault-free components to retain system performance by reducing the impact of faults. Design space exploration also helps narrow down the selection of an MPSoC architecture that can meet the performance requirements within the design constraints.
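Of the two evaluation parameters, average packet latency is the simpler to state; the sketch below shows how it might be computed from a simulation trace. The timestamps are invented for illustration.

```python
def average_packet_latency(send_times, recv_times):
    """Mean per-packet latency from matched send/receive timestamps,
    e.g. cycle counts taken from a NoC simulation trace."""
    latencies = [r - s for s, r in zip(send_times, recv_times)]
    return sum(latencies) / len(latencies)

# hypothetical timestamps (in cycles) for three packets
print(average_packet_latency([0, 4, 9], [12, 20, 31]))  # 16.67 cycles
```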

Relevance:

40.00%

Publisher:

Abstract:

Prediction means estimating the future value of an observable quantity. Characteristic of the Bayesian paradigm is that uncertainty about unknown quantities is expressed in the form of probabilities. A Bayesian predictive model is thus a probability distribution over the possible values that an observable, but not yet observed, quantity can take. The articles included in the thesis develop methods that are applied, among other things, to the analysis of chromatographic data in criminal investigations. With the exception of the first article, all the methods are based on Bayesian predictive modelling. The articles mainly consider three types of problems related to chromatographic data: quantification, pairwise matching, and clustering. In the first article, a non-parametric model is developed for the measurement error of chromatographic analyses of blood alcohol content. In the second article, a predictive inference method for the comparison of two samples is developed. The method is applied in the third article to the comparison of oil samples, with the aim of identifying the polluting source in connection with oil spills. In the fourth article, a predictive model is derived for clustering data of mixed discrete and continuous type, which is applied, among other things, to the classification of amphetamine samples with respect to production batches.
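The defining object can be written compactly. The equation below is the standard posterior predictive distribution implied by the summary above, with observed data y, unknown parameters theta, and not-yet-observed quantity y-tilde; it is stated here for clarity rather than quoted from the thesis.

```latex
% A Bayesian predictive model: a probability distribution over the
% possible values of an observable but not yet observed quantity \tilde{y}.
p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, \mathrm{d}\theta
```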

Relevance:

40.00%

Publisher:

Abstract:

A quadcopter is a helicopter with four rotors: a mechanically simple device that nevertheless requires complex electrical control for each motor. The control system needs accurate information about the quadcopter's attitude in order to achieve stable flight. The goal of this bachelor's thesis was to research how this information can be obtained. A literature review revealed that most quadcopters whose source code is available use a complementary filter, or some derivative of it, to fuse data from a gyroscope, an accelerometer, and often also a magnetometer; combined, these sensors are called an inertial measurement unit. This thesis focuses on calculating angles from each sensor's data and fusing them with a complementary filter. On the basis of the literature review and measurements using a quadcopter, the proposed filter provides sufficiently accurate attitude data for the flight control system. However, a simple complementary filter has one significant drawback: it works reliably only when the quadcopter is hovering or moving at constant speed, because an accelerometer cannot be used to measure angles accurately when linear acceleration is present. This problem can be fixed using some derivative of the complementary filter, such as an adaptive complementary filter or a Kalman filter, which are not covered in this thesis.
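The filter described is only a few lines of code. The sketch below fuses a gyroscope rate with an accelerometer angle for a single axis; the gain of 0.98 is a typical illustrative value, not one taken from the thesis.

```python
import math

def complementary_filter(angle, gyro_rate, ax, az, dt, alpha=0.98):
    """One-axis complementary filter: trust the integrated gyro rate on
    short time scales and the accelerometer angle on long time scales."""
    accel_angle = math.atan2(ax, az)      # tilt angle from gravity (rad)
    gyro_angle = angle + gyro_rate * dt   # integrate the angular rate
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle

# one update step: previous angle 0.0 rad, gyro 0.1 rad/s, dt = 10 ms
angle = complementary_filter(0.0, 0.1, ax=0.05, az=9.81, dt=0.01)
```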

Relevance:

30.00%

Publisher:

Abstract:

The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged, and various parallel and distributed environments have been designed and implemented. Each environment, including its hardware and software, has unique strengths and weaknesses; no single parallel environment can be identified as the best for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of different aspects of parallel computing. For the purpose of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify their suitability for different parallel applications. Due to the parallel and distributed nature of the environments, the networks connecting the processors were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective, and a concept of application-specific information scheduling is introduced. The application-specific information is data about the workload extracted from an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize this information to further refine their scheduling properties. A more accurate description of the workload is especially important when the work units are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications. Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm incorporates the Message Passing Interface (MPI) with threads to provide a methodology for writing parallel applications that efficiently utilize the available resources and minimize overhead. MPIT allows communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm, which is executed by the communication thread so that scheduling does not affect the execution of the parallel application. Performance results show that MPIT achieves considerable improvements over conventional MPI applications.
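MPIT itself is not specified in this abstract, but its core idea, a dedicated communication thread that lets message passing overlap with computation, can be sketched with the mpi4py package (assumed available) and standard Python threads; the names and the work loop below are illustrative, not MPIT's API.

```python
import threading
import queue
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
N_UNITS = 10  # work units per worker (illustrative)

if rank == 0:
    # Collector: receive every result from every worker.
    for _ in range((size - 1) * N_UNITS):
        print(comm.recv(source=MPI.ANY_SOURCE, tag=0))
else:
    outbox = queue.Queue()

    def comm_thread():
        # Dedicated communication thread: sending overlaps computation.
        while (item := outbox.get()) is not None:
            comm.send(item, dest=0, tag=0)

    t = threading.Thread(target=comm_thread)
    t.start()
    for unit in range(N_UNITS):
        outbox.put((rank, unit * unit))  # stand-in for computed work
    outbox.put(None)                     # sentinel stops the thread
    t.join()
```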

Relevance:

30.00%

Publisher:

Abstract:

The evaluation of investments in advanced technology is one of the most important decision-making tasks. The importance is even more pronounced considering the huge budgets involved and the strategic, economic, and analytic justification needed to shorten design and development time. Choosing the most appropriate technology requires an accurate and reliable system that can guide decision makers through such a complicated task. Currently, several Information and Communication Technology (ICT) manufacturers that design global products are seeking local firms to act as their sales and service representatives (called distributors) to the end user; at the same time, the end user or customer is searching for the best possible deal for their investment in ICT projects. The objective of this research is therefore to present a holistic decision support system that assists decision makers in small and medium enterprises (SMEs), working either individually or in groups, in evaluating the investment in becoming an ICT distributor or an ICT end user. The model is composed of the Delphi/MAH (Maximising Agreement Heuristic) analysis, a well-known quantitative method in Group Support Systems (GSS), which is applied to gather average ranking data from among the decision makers (DMs). The Analytic Network Process (ANP) is then brought in for a holistic analysis, performing quantitative and qualitative analysis simultaneously. The illustrative data were obtained from industrial entrepreneurs using the Group Support System (GSS) laboratory facilities at Lappeenranta University of Technology, Finland, and in Thailand. The result of the research, currently implemented in Thailand, can benefit industry in evaluating whether to become an ICT distributor or an ICT end user, particularly in the assessment of Enterprise Resource Planning (ERP) programmes. After the model was put to the test in an in-depth collaboration with industrial entrepreneurs in Finland and Thailand, a sensitivity analysis was performed to validate its robustness. The contribution of this research is a new approach and the Delphi/MAH software for analysing the value of becoming an ERP distributor or end user, flexible and applicable to entrepreneurs looking for the most appropriate such investment. The main advantage of this research over others is that the model delivers the value of becoming an ERP distributor or end user as a single number, which makes it easier for DMs to choose the most appropriate ERP vendor. A further advantage is that the model can include qualitative as well as quantitative data, since results from quantitative data alone can be misleading and inadequate; as the case studies show, quantitative and qualitative analysis need to be utilized together.
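The Delphi/MAH software itself is not shown here, but the basic computation underlying ANP-style analyses, deriving priority weights from a pairwise comparison matrix via its principal eigenvector, can be sketched briefly; the comparison values below are invented for illustration.

```python
import numpy as np

def priority_vector(pairwise, iters=100):
    """Principal-eigenvector priorities of a reciprocal pairwise
    comparison matrix, computed by power iteration."""
    w = np.ones(pairwise.shape[0]) / pairwise.shape[0]
    for _ in range(iters):
        w = pairwise @ w
        w /= w.sum()
    return w

# hypothetical Saaty-style comparison of three investment criteria
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
print(priority_vector(A))  # weights summing to 1
```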

Relevance:

30.00%

Publisher:

Abstract:

In the software industry, long and difficult development cycles can be eased by using software frameworks. Frameworks represent a collection of classes that provide general solutions to the needs of a particular problem domain, freeing software developers to concentrate on application-specific requirements. Using well-designed frameworks increases the reuse of design solutions and source code more than any other design approach. Knowledge of a particular domain can be stored in frameworks, from which finished software products can in turn be specialized. This thesis describes the design and implementation of a software framework based on software agents. The emphasis of the work is on describing the design, corresponding to the requirements specification, and the implementation of a framework from which software capable of various kinds of data collection in the Internet environment can be specialized. The experimental part of the work also presents an example application based on the framework developed in the thesis.
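The framework developed in the thesis is not reproduced here, but the specialization idea it relies on can be sketched with hypothetical class names: the framework fixes the overall data collection workflow, and an application specializes only the domain-specific steps.

```python
from abc import ABC, abstractmethod

class CollectorAgent(ABC):
    """Framework base class: the collection workflow is fixed here;
    applications specialize only the abstract steps."""

    def run(self, url: str) -> list:
        raw = self.fetch(url)
        return self.parse(raw)

    @abstractmethod
    def fetch(self, url: str) -> str: ...

    @abstractmethod
    def parse(self, raw: str) -> list: ...

class HeadlineAgent(CollectorAgent):
    """Application-specific specialization of the framework."""
    def fetch(self, url: str) -> str:
        return "<html><h1>news</h1></html>"  # stub in place of real I/O
    def parse(self, raw: str) -> list:
        return [raw[raw.find("<h1>") + 4 : raw.find("</h1>")]]

print(HeadlineAgent().run("http://example.com"))  # ['news']
```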

Relevance:

30.00%

Publisher:

Abstract:

This master's thesis was written for the UPM Net Services department of the UPM-Kymmene group in Brussels and Helsinki. The topic of the work, data communication in a paper sales environment, was defined to cover subjects related to the paper sales system. The current paper sales system is first treated in theory, and the related software products and tools are introduced. Improvements to the current system are considered from the viewpoints of software design, efficiency, data management, data security, and business. The practical part of the thesis presents two programs, written for UPM Net Services in order to gain experience of message-based data transfer. The conclusions state that data transfer in the paper sales system works reliably in the current system, but that future needs and improvements are difficult to implement with the tools in use today. In particular, exploiting the Internet is seen as important but difficult to introduce into the current system. Message-based systems have proven workable in practice, and the most important development proposal is therefore the introduction of a message-passing system.

Relevance:

30.00%

Publisher:

Abstract:

The main objective of the work was to study mobile services and wireless applications in the Finnish healthcare sector. The study illustrates the key areas where mobile services and wireless applications can add value to traditional medical practice, and examines the greatest problems and threats related to this development, as well as the possible services and applications in 5-10 years' time based on the research results. The study was qualitative in nature, and futures research, in particular one of its methods, the Delphi method, was chosen to carry it out. The research material was collected in two semi-structured interview rounds. The empirical part of the work focused on describing the Finnish healthcare sector, its ongoing projects, and the technical obstacles, as well as answering the main research question. The results showed that the important areas on which wireless communication will have a significant impact are first aid, remote monitoring of chronic patients, the development of wireless communication tools to improve home care and create new operating models, and medical collaboration through shared healthcare information sources. On the basis of the results, some recommendations for further research could also be given.

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this master's thesis was to survey the effects, needs, and benefits related to EDI, and to prepare the introduction of the EDI Gateway module of the Oracle Applications ERP system into production use. Information for the needs assessment was obtained through discussions. New initiatives, derived from commercial starting points and developed for business-to-business commerce and the exploitation of Internet technology, were examined from the EDI perspective with the future in mind. The most up-to-date information for this thesis was also found on the Internet. After this, it was possible to carry out a suitably broad but bounded EDI pilot project in order to create an EDI concept. The thesis concentrated more on the effects of EDI on purchasing, and it was decided to apply EDI first to purchase orders. The benefits of EDI are difficult to measure numerically; a large volume of money or product units must be handled with an EDI partner sufficiently often. In the introduction phase of EDI, the main problems are application-related IT problems. The knowledge gained from the surveys and the EDI project can be exploited in further development, and additional measures are needed to create a fully working system.

Relevance:

30.00%

Publisher:

Abstract:

Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious to program by human experts. The tasks for which such tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question, but their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In kernel-based learning methods, prior knowledge is often incorporated by designing appropriate kernel functions; another well-known way is to develop cost functions suited to the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of the positional information and the mutual similarities of words, and show that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions, design a fast cross-validation algorithm for regularized least-squares type learning algorithms, and propose an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in algorithms.
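The thesis's specific algorithms are not given in the abstract, but the regularized least-squares learner it builds on has a well-known closed form: solve (K + lambda*I) a = y for the dual coefficients. A minimal sketch with a toy linear kernel follows; the data are invented for illustration.

```python
import numpy as np

def train_krls(K, y, lam):
    """Kernel regularized least squares: solve (K + lam*I) a = y;
    predictions are then f(x) = sum_i a_i * k(x, x_i)."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

# toy example: linear kernel on three one-dimensional points
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 2.0])
K = X @ X.T
a = train_krls(K, y, lam=0.1)
print(K @ a)  # fitted values, close to y
```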

Relevance:

30.00%

Publisher:

Abstract:

Rosin is a natural product from pine forests and is used as a raw material in resinate syntheses. Resinates are polyvalent metal salts of rosin acids; Ca- and Ca/Mg-resinates in particular find wide application in the printing ink industry. In this thesis, analytical methods were applied to increase general knowledge of resinate chemistry, and the reaction kinetics was studied in order to model the non-linear increase in solution viscosity during resinate syntheses by the fusion method. Solution viscosity in toluene is an important quality factor for resinates used in printing inks. The concept of a critical resinate concentration, c_crit, was introduced to define an abrupt change in the dependence of viscosity on resinate concentration in the solution, and was then used to explain the non-linear viscosity increase during resinate syntheses. A semi-empirical model with two estimated parameters was derived for the viscosity increase on the basis of apparent reaction kinetics; the model was used to control the viscosity and to predict the total reaction time of the resinate process. The kinetic data from the complex reaction media were obtained by acid value titration and by FTIR spectroscopic analyses, using a conventional calibration method to measure the resinate concentration and the concentration of free rosin acids. A multivariate calibration method was successfully applied to build partial least squares (PLS) models for monitoring acid value and solution viscosity in both the mid-infrared (MIR) and near-infrared (NIR) regions during the syntheses; these calibration models can be used for on-line monitoring of the resinate process. In the kinetic studies, two main reaction steps were observed during the syntheses: first a fast, irreversible resination reaction occurs at 235 °C, and then a slow thermal decarboxylation of rosin acids starts to take place at 265 °C. Rosin oil is formed during the decarboxylation step, causing significant mass loss as the rosin oil evaporates from the system while the viscosity increases to the target level. The mass balance of the syntheses was determined from the increase in resinate concentration during the decarboxylation step. A mechanistic study of the decarboxylation reaction was based on the observation that resinate molecules are partly solvated by rosin acids during the syntheses, and different decarboxylation mechanisms were proposed for the free and the solvating rosin acids. The deduced kinetic model agreed with the analytical data of the syntheses over a wide resinate concentration region, over a wide range of viscosity values, and at different reaction temperatures; its application to the modified resinate syntheses also gave a good fit. A novel synthesis method was introduced in which decarboxylated rosin (i.e. rosin oil) is added to the reaction mixture. The conversion of rosin acid to resinate was increased to the level necessary to obtain the target viscosity for the product at 235 °C. Because the reaction temperature is lower than in the traditional fusion synthesis at 265 °C, thermal decarboxylation is avoided; as a consequence, the mass yield of the resinate syntheses can be increased from ca. 70% to almost 100% by recycling the added rosin oil.
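The semi-empirical model itself is not given in the abstract. Purely to illustrate how a two-parameter viscosity model can be estimated from synthesis data, the sketch below fits a hypothetical exponential form with scipy.optimize.curve_fit; both the functional form and the data are invented, not the model derived in the thesis.

```python
import numpy as np
from scipy.optimize import curve_fit

def viscosity_model(c, a, b):
    """Hypothetical two-parameter form for solution viscosity as a
    function of resinate concentration c (illustrative only; the
    thesis derives its own semi-empirical model)."""
    return a * np.exp(b * c)

# synthetic data standing in for measured solution viscosities
c = np.linspace(0.1, 0.6, 8)
eta = 2.0 * np.exp(5.0 * c) + np.random.default_rng(0).normal(0.0, 0.5, c.size)

(a, b), _ = curve_fit(viscosity_model, c, eta, p0=(1.0, 1.0))
print(f"estimated parameters: a = {a:.2f}, b = {b:.2f}")
```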