922 results for Application specific algorithm
Abstract:
This paper presents a modified Particle Swarm Optimization (PSO) methodology to solve the problem of energy resources management with high penetration of distributed generation and Electric Vehicles (EVs) with gridable capability (V2G). The objective of the day-ahead scheduling problem in this work is to minimize operation costs, namely energy costs, regarding the management of these resources in the smart grid context. The modifications applied to the PSO aim to make it better suited to this problem. The proposed Application Specific Modified Particle Swarm Optimization (ASMPSO) includes an intelligent mechanism to adjust velocity limits during the search process, as well as self-parameterization of PSO parameters, making it more user-independent. It presents better robustness and convergence characteristics than the tested PSO variants, as well as better constraint handling. This enables its use for addressing real-world large-scale problems in much shorter times than the deterministic methods, providing system operators with adequate decision support and achieving efficient resource scheduling, even when a significant number of alternative scenarios should be considered. The paper includes two realistic case studies with different penetration of gridable vehicles (1000 and 2000). The proposed methodology is about 2600 times faster than the Mixed-Integer Non-Linear Programming (MINLP) reference technique, reducing the time required from 25 h to 36 s for the scenario with 2000 vehicles, with about one percent of difference in the objective function cost value.
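To make the idea of adjusting velocity limits during the search concrete, the following is a minimal sketch of a plain global-best PSO in which the velocity clamp shrinks as iterations progress. It is not the ASMPSO from the abstract: the linear shrinking rule, the parameter values (`w`, `c1`, `c2`), the function name `pso_minimize` and the sphere test function are all assumptions chosen for illustration.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, n_iters=200, seed=0):
    """Global-best PSO with a velocity limit that shrinks over the run."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max0 = 0.5 * (hi - lo)      # initial velocity limit (assumed value)
    w, c1, c2 = 0.7, 1.5, 1.5     # inertia and acceleration weights (assumed)

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [f(p) for p in pos]          # personal best costs
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for it in range(n_iters):
        # Assumed rule: shrink the limit linearly to 10% of its initial value.
        v_max = v_max0 * (1.0 - 0.9 * it / max(1, n_iters - 1))
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (best_pos[i][d] - pos[i][d])
                     + c2 * r2 * (g_pos[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))              # clamp velocity
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))  # stay in bounds
            val = f(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy cost function with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)


best_x, best_cost = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_x, best_cost)
```

A wide limit early on favours exploration and a tight limit later favours fine convergence; a real scheduling problem would replace `sphere` with the operation-cost function and add constraint handling.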
Abstract:
Recent integrated circuit technologies have opened the possibility to design parallel architectures with hundreds of cores on a single chip. The design space of these parallel architectures is huge, with many architectural options. Exploring the design space gets even more difficult if, beyond performance and area, we also consider extra metrics like performance and area efficiency, where the designer tries to design the architecture with the best performance per chip area and the best sustainable performance. In this paper we present an algorithm-oriented approach to design a many-core architecture. Instead of doing the design space exploration of the many-core architecture based on the experimental execution results of a particular benchmark of algorithms, our approach is to make a formal analysis of the algorithms considering the main architectural aspects and to determine how each particular architectural aspect is related to the performance of the architecture when running an algorithm or set of algorithms. The architectural aspects considered include the number of cores, the local memory available in each core, the communication bandwidth between the many-core architecture and the external memory, and the memory hierarchy. To exemplify the approach we did a theoretical analysis of a dense matrix multiplication algorithm and determined an equation that relates the number of execution cycles with the architectural parameters. Based on this equation a many-core architecture has been designed. The results obtained indicate that a 100 mm² integrated circuit design of the proposed architecture, using a 65 nm technology, is able to achieve 464 GFLOPs (double precision floating-point) for a memory bandwidth of 16 GB/s. This corresponds to a performance efficiency of 71 %.
Considering a 45 nm technology, a 100 mm² chip attains 833 GFLOPs, which corresponds to 84 % of peak performance. These figures are better than those obtained by previous many-core architectures, except for the area efficiency, which is limited by the lower memory bandwidth considered. The results achieved are also better than those of previous state-of-the-art many-core architectures designed specifically to achieve high performance for matrix multiplication.
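The figures quoted above can be related with some simple arithmetic. The sketch below assumes that "performance efficiency" means sustained performance divided by peak performance, and it derives the implied peak values and the number of floating-point operations that must be performed per byte fetched from external memory. The derived numbers are back-of-the-envelope estimates from rounded percentages, not values reported in the abstract; the function name is hypothetical.

```python
def peak_from_efficiency(sustained_gflops, efficiency):
    """Implied peak performance, assuming efficiency = sustained / peak."""
    return sustained_gflops / efficiency


# 65 nm design: 464 GFLOPs sustained at 71 % efficiency.
peak_65nm = peak_from_efficiency(464.0, 0.71)

# 45 nm design: 833 GFLOPs sustained at 84 % of peak.
peak_45nm = peak_from_efficiency(833.0, 0.84)

# Operations needed per byte of external memory traffic to sustain
# 464 GFLOPs over a 16 GB/s link (both taken from the abstract).
flops_per_byte = 464.0 / 16.0

print(round(peak_65nm, 1), round(peak_45nm, 1), flops_per_byte)
```

The ratio of 29 operations per byte shows why data reuse in local memory matters for dense matrix multiplication on such a chip: each byte brought in from external memory has to feed many operations.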
Abstract:
QUESTION UNDER STUDY: Thirty-day readmissions can be classified as potentially avoidable (PARs) or not avoidable (NARs) by following a specific algorithm (SQLape®). We wanted to assess the financial impact of the Swiss-DRG system, which regroups some readmissions occurring within 18 days after discharge within the initial hospital stay, on PARs at our hospital. METHODS: First, PARs were identified from all hospitalisations recorded in 2011 at our university hospital. Second, 2012 Swiss-DRG readmission rules were applied, regrouped readmissions (RRs) were identified, and their financial impact computed. Third, RRs were classified as potentially avoidable (PARRs), not avoidable (NARRs), and other causes (OCRRs). Characteristics of PARR patients and stays were retrieved, and the financial impact of PARRs was computed. RESULTS: A total of 36,777 hospitalisations were recorded in 2011, of which 3,140 were considered as readmissions (8.5%): 1,470 PARs (46.8%) and 1,733 NARs (53.2%). The 2012 Swiss-DRG rules would have resulted in 910 RRs (2.5% of hospitalisations, 29% of readmissions): 395 PARRs (43% of RRs), 181 NARRs (20%), and 334 OCRRs (37%). Loss in reimbursement would have amounted to CHF 3.157 million (0.6% of total reimbursement). As many as 95% of the 395 PARR patients lived at home. In total, 28% of PARRs occurred within 3 days after discharge, and 58% lasted less than 5 days; 79% of the patients were discharged home again. Loss in reimbursement would amount to CHF 1.771 million. CONCLUSION: PARs represent a sizeable number of 30-day readmissions, as do PARRs of 18-day RRs in the 2012 Swiss DRG system. They should be the focus of attention, as the PARRs represent an avoidable loss in reimbursement.
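The breakdown of regrouped readmissions can be recomputed from the counts given above. The sketch below does this, and also derives the share of the total reimbursement loss attributable to PARRs. It assumes the CHF 1.771 million figure is the PARR part of the CHF 3.157 million total, which the abstract implies but does not state outright; the variable names are hypothetical.

```python
hospitalisations = 36_777
readmissions = 3_140
regrouped = {"PARR": 395, "NARR": 181, "OCRR": 334}   # counts from the abstract

total_rr = sum(regrouped.values())

# Percentage of all regrouped readmissions in each category.
rr_share = {name: round(100 * n / total_rr) for name, n in regrouped.items()}

readmission_rate = round(100 * readmissions / hospitalisations, 1)
rr_of_hospitalisations = round(100 * total_rr / hospitalisations, 1)
rr_of_readmissions = round(100 * total_rr / readmissions)

# Assumption: CHF 1.771 million is the PARR part of the CHF 3.157 million loss.
parr_loss_share = round(100 * 1.771 / 3.157)

print(total_rr, rr_share, readmission_rate, rr_of_hospitalisations,
      rr_of_readmissions, parr_loss_share)
```

Under that assumption, PARRs account for roughly 56% of the estimated loss while making up about 43% of regrouped readmissions.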
Abstract:
The past few decades have seen a considerable increase in the number of parallel and distributed systems. With the development of more complex applications, the need for more powerful systems has emerged and various parallel and distributed environments have been designed and implemented. Each of the environments, including hardware and software, has unique strengths and weaknesses. There is no single parallel environment that can be identified as the best environment for all applications with respect to hardware and software properties. The main goal of this thesis is to provide a novel way of performing data-parallel computation in parallel and distributed environments by utilizing the best characteristics of different aspects of parallel computing. For the purpose of this thesis, three aspects of parallel computing were identified and studied. First, three parallel environments (shared memory, distributed memory, and a network of workstations) are evaluated to quantify their suitability for different parallel applications. Due to the parallel and distributed nature of the environments, networks connecting the processors in these environments were investigated with respect to their performance characteristics. Second, scheduling algorithms are studied in order to make them more efficient and effective. A concept of application-specific information scheduling is introduced. The application-specific information is data about the workload extracted from an application, which is provided to a scheduling algorithm. Three scheduling algorithms are enhanced to utilize the application-specific information to further refine their scheduling properties. A more accurate description of the workload is especially important in cases where the work units are heterogeneous and the parallel environment is heterogeneous and/or non-dedicated. The results obtained show that the additional information regarding the workload has a positive impact on the performance of applications.
Third, a programming paradigm for networks of symmetric multiprocessor (SMP) workstations is introduced. The MPIT programming paradigm incorporates the Message Passing Interface (MPI) with threads to provide a methodology to write parallel applications that efficiently utilize the available resources and minimize the overhead. The MPIT allows for communication and computation to overlap by deploying a dedicated thread for communication. Furthermore, the programming paradigm implements an application-specific scheduling algorithm. The scheduling algorithm is executed by the communication thread. Thus, the scheduling does not affect the execution of the parallel application. Performance results achieved from the MPIT show that considerable improvements over conventional MPI applications are achieved.
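The pattern of a dedicated communication thread that also runs the scheduler can be sketched without MPI. The example below is only a stand-in for the MPIT paradigm: a Python thread and a bounded queue replace message passing, the "largest work unit first" policy is an assumed example of application-specific scheduling, and the squaring step is a placeholder for real computation. All names are hypothetical.

```python
import queue
import threading


def run(work_units, n_prefetch=2):
    """Process work units delivered by a separate communication thread."""
    ready = queue.Queue(maxsize=n_prefetch)   # units already "received"
    results = []

    def communicator():
        # Stand-in for the communication thread: it orders the work units
        # (assumed policy: largest first) and delivers them to the worker.
        for unit in sorted(work_units, reverse=True):
            ready.put(unit)       # blocks when the prefetch buffer is full
        ready.put(None)           # sentinel: no more work

    thread = threading.Thread(target=communicator)
    thread.start()

    # The main thread only computes; delivery and scheduling happen
    # concurrently in the communicator thread.
    while True:
        unit = ready.get()
        if unit is None:
            break
        results.append(unit * unit)   # placeholder computation

    thread.join()
    return results


squares = run([3, 1, 2])
print(squares)   # [9, 4, 1]
```

Because the scheduling decision is taken in the communicator thread, the computing thread never pauses to decide what to do next; it only waits if the prefetch buffer runs empty.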
Abstract:
A power electronics device is a control and regulation system that converts electricity from its available form into a desired new form while managing the flow of electrical power from the source to the load. This differs from signal electronics, where electricity is typically used to convey information by means of different states. Power electronics devices are usually compared in terms of reliability, size, efficiency, control accuracy and, of course, price. Typical power electronics devices include frequency converters, UPS (Uninterruptible Power Supply) devices, welding machines, induction heaters and various power supplies. Traditionally, the control of these devices has been implemented using microprocessors, ASICs (Application Specific Integrated Circuits) or ICs (Integrated Circuits), and analog controllers. This study analyzes the suitability of FPGAs (Field Programmable Gate Arrays) for the control of power electronics. An FPGA consists of various logic elements and the interconnect wires between them. The logic elements are gates and flip-flops. The interconnects and logic elements are fixed on the chip, and their composition or number cannot be changed afterwards. Programmability arises from the connections between the elements. The chip contains numerous switches, up to millions, whose state can be set. In this way, a countless number of different functional units can be formed from the basic elements of the chip. FPGAs have long been used in communications products, and their development has therefore been rapid in recent years. At the same time, prices have fallen. As a result, the FPGA has become an attractive option for controlling power electronics devices as well. In the dissertation, the suitability of FPGAs was studied using two demanding and different practical power electronics devices: a frequency converter and a welding machine.
For both test cases, suitable prototypes were built together with Finnish industrial companies in the field, and their control electronics were converted to an FPGA-based design. In addition, new types of control methods that exploit this new technology were developed. The performance of the prototypes was compared with corresponding commercial products controlled by conventional methods, and the benefits of the parallel computation enabled by FPGAs were observed in the operation of both power electronics devices. The work also presents new methods and tools for developing and testing an FPGA-based control system. With the presented methods, product development can be made as fast and efficient as possible. In addition, an internal control and communication bus structure for the FPGA was developed to serve control applications of power electronics devices. The new communication structure also promotes the reusability of existing subsystems in future applications and product generations.
Abstract:
Reusability has become an increasingly important factor in modern software engineering, mainly because object orientation has introduced methods that make reuse easier. More and more application developers now consider how they can reuse existing applications in their work. A developer who wants to use existing components outside the current project can use design patterns, class libraries or frameworks. These provide solutions to specific or general problems that have already been encountered. Application frameworks are collections of classes that provide a base for the developer. They are mostly implementation-phase tools, but can also be used in application design. The main purpose of a framework is to separate domain-specific functionality from application-specific functionality. Frameworks are usually divided into two categories, black box and white box, which differ in the way reuse is done. Application frameworks have properties that can be examined and compared across frameworks: extensibility, reusability, modularity and scalability. These describe how a framework handles different platforms, changes in the framework, increasing demand for resources, and so on. In general, application frameworks exhibit these properties to a good degree. When a general-purpose framework is compared with a more specific-purpose framework, the main difference lies in reusability, mainly because a framework designed for a specific domain can have constraints imposed by external systems and resources. With a general-purpose framework, these constraints are set by the application developed on top of the framework.
Abstract:
BACKGROUND: Previous observations found a high prevalence of obstructive sleep apnea (OSA) in the hemodialysis population, but the best diagnostic approach remains undefined. We assessed OSA prevalence and performance of available screening tools to propose a specific diagnostic algorithm. METHODS: 104 patients from 6 Swiss hemodialysis centers underwent polygraphy and completed 3 OSA screening scores: STOP-BANG, Berlin's Questionnaire, and Adjusted Neck Circumference. The OSA predictors were identified on a derivation population and used to develop the diagnostic algorithm, which was validated on an independent population. RESULTS: We found 56% OSA prevalence (AHI ≥ 15/h), which was largely underdiagnosed. Screening scores showed poor performance for OSA screening (ROC areas 0.538 [SE 0.093] to 0.655 [SE 0.083]). Age, neck circumference, and time on renal replacement therapy were the best predictors of OSA and were used to develop a screening algorithm, with higher discriminatory performance than classical screening tools (ROC area 0.831 [0.066]). CONCLUSIONS: Our study confirms the high OSA prevalence and highlights the low diagnosis rate of this treatable cardiovascular risk factor in the hemodialysis population. Considering the poor performance of OSA screening tools, we propose and validate a specific algorithm to identify hemodialysis patients at risk for OSA for whom further sleep investigations should be considered.
Abstract:
Association rules are a popular knowledge discovery technique for warehouse basket analysis. They indicate which items of the warehouse are frequently bought together. The problem of association rule mining was first stated in 1993. Five years later, several research groups discovered that this problem has a strong connection to Formal Concept Analysis (FCA). In this survey, we first introduce some basic ideas of this connection by means of a specific algorithm, TITANIC, and show how FCA helps in reducing the number of resulting rules without loss of information, before giving a general overview of the history and state of the art of applying FCA to association rule mining.
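The basic quantities behind association rules, and the closure operator that links them to FCA, can be shown on a toy example. The sketch below is not the TITANIC algorithm: it only computes support, confidence and the closure of an itemset by brute force over four made-up transactions, with all names and data being hypothetical.

```python
# Hypothetical toy transactions (each is the set of items bought together).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs):
    """Confidence of the rule lhs -> rhs."""
    return support(lhs | rhs) / support(lhs)


def closure(itemset):
    """Items shared by all transactions containing the itemset (FCA closure)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return set(itemset)
    return set.intersection(*covering)


print(support({"bread", "milk"}))        # 0.5
print(confidence({"bread"}, {"milk"}))   # 2/3
print(closure({"butter"}))               # bread and butter
```

Here `{"butter"}` and its closure `{"bread", "butter"}` have the same support, so the rule butter -> bread holds with confidence 1. Working with closed itemsets instead of all itemsets is what allows the rule set to be reduced without losing information.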
Abstract:
Caches are known to consume up to half of all system power in embedded processors. Co-optimizing performance and power of the cache subsystems is therefore an important step in the design of embedded systems, especially those employing application specific instruction processors. In this project, we propose an analytical cache model that succinctly captures the miss performance of an application over the entire cache parameter space. Unlike exhaustive trace-driven simulation, our model requires that the program be simulated only once so that a few key characteristics can be obtained. Using these application-dependent characteristics, the model can span the entire cache parameter space consisting of cache sizes, associativity and cache block sizes. In our unified model, we are able to cater for direct-mapped, set-associative and fully associative instruction, data and unified caches. Validation against full trace-driven simulations shows that our model has a high degree of fidelity. Finally, we show how the model can be coupled with a power model for caches such that one can very quickly decide on Pareto-optimal performance-power design points for rapid design space exploration.
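Selecting Pareto-optimal design points from a set of candidate cache configurations can be sketched as follows. The configurations and their miss-rate and power numbers are invented for illustration; in the setting described above they would come from the analytical cache model and the coupled power model. The function name is hypothetical.

```python
def pareto_front(points):
    """Keep points not dominated in (miss_rate, power); both are minimized.

    Each point is a tuple (label, miss_rate, power).
    """
    front = []
    for p in points:
        dominated = any(
            q is not p
            and q[1] <= p[1] and q[2] <= p[2]      # q is no worse on both axes
            and (q[1] < p[1] or q[2] < p[2])       # and strictly better on one
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidates: (configuration, miss rate, power in mW).
candidates = [
    ("4KB direct-mapped", 0.10, 5.0),
    ("8KB 2-way", 0.05, 8.0),
    ("8KB direct-mapped", 0.08, 9.0),
    ("16KB 4-way", 0.02, 12.0),
    ("4KB 2-way", 0.10, 6.0),
]

front = pareto_front(candidates)
print([label for label, _, _ in front])
```

A configuration is discarded only when another one is at least as good on both metrics and strictly better on one, so the remaining points are the trade-offs a designer would actually choose between.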
Abstract:
We describe a high-level design method to synthesize multi-phase regular arrays. The method is based on deriving component designs using classical regular (or systolic) array synthesis techniques and composing these separately evolved component designs into a unified global design. Similarity transformations are applied to component designs in the composition stage in order to align data flow between the phases of the computations. Three transformations are considered: rotation, reflection and translation. The technique is aimed at the design of hardware components for high-throughput embedded systems applications, and we demonstrate this by deriving a multi-phase regular array for the 2-D DCT algorithm, which is widely used in many video communications applications.
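One common way to see the 2-D DCT as a multi-phase computation is the row-column decomposition: a 1-D DCT over the rows, then a 1-D DCT over the columns. The sketch below uses an unnormalized DCT-II and an explicit transpose between the phases. Whether the array in the abstract uses exactly this decomposition is an assumption, and the code is a software illustration, not a hardware design.

```python
import math


def dct_1d(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_2d(block):
    """2-D DCT computed as two 1-D phases (rows, then columns)."""
    phase1 = [dct_1d(row) for row in block]               # phase 1: rows
    phase2 = [dct_1d(col) for col in transpose(phase1)]   # phase 2: columns
    return transpose(phase2)                              # restore orientation


# A constant block has all of its energy in the DC coefficient.
coeffs = dct_2d([[1.0] * 4 for _ in range(4)])
print(round(coeffs[0][0], 6))   # 16.0
```

The transpose between the two phases plays the role of re-aligning data between component computations, which is the kind of step the similarity transformations in the abstract are meant to handle at the array level.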
Abstract:
This paper investigates properties of integer programming models for a class of production planning problems. The models are developed within a decision support system to advise a sales team of the products on which to focus their efforts in gaining new orders in the short term. The products generally require processing on several manufacturing cells and involve precedence relationships. The cells are already (partially) committed with products for stock and to satisfy existing orders and therefore only the residual capacities of each cell in each time period of the planning horizon are considered. The determination of production recommendations to the sales team that make use of residual capacities is a nontrivial optimization problem. Solving such models is computationally demanding and techniques for speeding up solution times are highly desirable. An integer programming model is developed and various preprocessing techniques are investigated and evaluated. In addition, a number of cutting plane approaches have been applied. The performance of these approaches which are both general and application specific is examined.
Abstract:
This thesis develops high performance real-time signal processing modules for direction of arrival (DOA) estimation for localization systems. It proposes highly parallel algorithms for performing subspace decomposition and polynomial rooting, which are otherwise traditionally implemented using sequential algorithms. The proposed algorithms address the emerging need for real-time localization for a wide range of applications. As the antenna array size increases, the complexity of signal processing algorithms increases, making it increasingly difficult to satisfy the real-time constraints. This thesis addresses real-time implementation by proposing parallel algorithms that maintain considerable improvement over traditional algorithms, especially for systems with a larger number of antenna array elements. Singular value decomposition (SVD) and polynomial rooting are two computationally complex steps and act as the bottleneck to achieving real-time performance. The proposed algorithms are suitable for implementation on field programmable gate arrays (FPGAs), single instruction multiple data (SIMD) hardware or application specific integrated circuits (ASICs), which offer a large number of processing elements that can be exploited for parallel processing. The designs proposed in this thesis are modular, easily expandable and easy to implement. Firstly, this thesis proposes a fast converging SVD algorithm. The proposed method reduces the number of iterations it takes to converge to correct singular values, thus achieving closer to real-time performance. A general algorithm and a modular system design are provided, making it easy for designers to replicate and extend the design to larger matrix sizes. Moreover, the method is highly parallel, which can be exploited in the various hardware platforms mentioned earlier. A fixed point implementation of the proposed SVD algorithm is presented.
The FPGA design is pipelined to the maximum extent to increase the maximum achievable frequency of operation. The system was developed with the objective of achieving high throughput. Various modern cores available in FPGAs were used to maximize the performance, and these modules are described in detail. Finally, a parallel polynomial rooting technique based on Newton’s method, applicable exclusively to root-MUSIC polynomials, is proposed. Unique characteristics of the root-MUSIC polynomial’s complex dynamics were exploited to derive this polynomial rooting method. The technique exhibits parallelism and converges to the desired root within a fixed number of iterations, making it suitable for rooting polynomials of large degree. We believe this is the first time that the complex dynamics of the root-MUSIC polynomial were analyzed to propose an algorithm. In all, the thesis addresses two major bottlenecks in a direction of arrival estimation system by providing simple, high throughput, parallel algorithms.
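As background for the rooting step, the sketch below shows generic Newton iteration on a complex polynomial with a fixed iteration count, started from several initial points. It is not the root-MUSIC-specific technique proposed in the thesis: the example polynomial, the starting points and the iteration count are assumptions, and the function names are hypothetical.

```python
def horner(coeffs, z):
    """Evaluate a polynomial and its derivative at z.

    coeffs are ordered from the highest degree down to the constant term.
    """
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def newton_root(coeffs, z0, n_iters=20):
    """Run a fixed number of Newton steps from the starting point z0."""
    z = complex(z0)
    for _ in range(n_iters):
        p, dp = horner(coeffs, z)
        if dp == 0:
            break            # flat point: Newton step undefined
        z = z - p / dp
    return z


# Example polynomial z^2 + 1 with roots +1j and -1j.
coeffs = [1, 0, 1]
starts = [0.5 + 0.5j, 0.5 - 0.5j]

# Each start is independent of the others, so these runs could execute
# in parallel on separate processing elements.
roots = [newton_root(coeffs, z0) for z0 in starts]
print(roots)
```

A fixed iteration count gives a predictable latency, which is convenient for pipelined hardware; the price is that the starting points must be chosen so that the count is sufficient.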
Abstract:
This paper proposes the Optimized Power save Algorithm for continuous Media Applications (OPAMA) to improve end-user device energy efficiency. OPAMA enhances the standard legacy Power Save Mode (PSM) of IEEE 802.11 by taking into consideration application specific requirements combined with data aggregation techniques. By establishing a balanced cost/benefit tradeoff between performance and energy consumption, OPAMA is able to improve energy efficiency, while keeping the end-user experience at a desired level. OPAMA was assessed in the OMNeT++ simulator using real traces of variable bitrate video streaming applications. The results showed the capability to enhance energy efficiency, achieving savings up to 44% when compared with the IEEE 802.11 legacy PSM.