53 resultados para computation
Resumo:
As technology geometries have shrunk to the deep submicron regime, the communication delay and power consumption of global interconnections in high performance Multi- Processor Systems-on-Chip (MPSoCs) are becoming a major bottleneck. The Network-on- Chip (NoC) architecture paradigm, based on a modular packet-switched mechanism, can address many of the on-chip communication issues such as performance limitations of long interconnects and integration of large number of Processing Elements (PEs) on a chip. The choice of routing protocol and NoC structure can have a significant impact on performance and power consumption in on-chip networks. In addition, building a high performance, area and energy efficient on-chip network for multicore architectures requires a novel on-chip router allowing a larger network to be integrated on a single die with reduced power consumption. On top of that, network interfaces are employed to decouple computation resources from communication resources, to provide the synchronization between them, and to achieve backward compatibility with existing IP cores. Three adaptive routing algorithms are presented as a part of this thesis. The first presented routing protocol is a congestion-aware adaptive routing algorithm for 2D mesh NoCs which does not support multicast (one-to-many) traffic while the other two protocols are adaptive routing models supporting both unicast (one-to-one) and multicast traffic. A streamlined on-chip router architecture is also presented for avoiding congested areas in 2D mesh NoCs via employing efficient input and output selection. The output selection utilizes an adaptive routing algorithm based on the congestion condition of neighboring routers while the input selection allows packets to be serviced from each input port according to its congestion level. Moreover, in order to increase memory parallelism and bring compatibility with existing IP cores in network-based multiprocessor architectures, adaptive network interface architectures are presented to use multiple SDRAMs which can be accessed simultaneously. In addition, a smart memory controller is integrated in the adaptive network interface to improve the memory utilization and reduce both memory and network latencies. Three Dimensional Integrated Circuits (3D ICs) have been emerging as a viable candidate to achieve better performance and package density as compared to traditional 2D ICs. In addition, combining the benefits of 3D IC and NoC schemes provides a significant performance gain for 3D architectures. In recent years, inter-layer communication across multiple stacked layers (vertical channel) has attracted a lot of interest. In this thesis, a novel adaptive pipeline bus structure is proposed for inter-layer communication to improve the performance by reducing the delay and complexity of traditional bus arbitration. In addition, two mesh-based topologies for 3D architectures are also introduced to mitigate the inter-layer footprint and power dissipation on each layer with a small performance penalty.
Resumo:
In this study, feature selection in classification based problems is highlighted. The role of feature selection methods is to select important features by discarding redundant and irrelevant features in the data set, we investigated this case by using fuzzy entropy measures. We developed fuzzy entropy based feature selection method using Yu's similarity and test this using similarity classifier. As the similarity classifier we used Yu's similarity, we tested our similarity on the real world data set which is dermatological data set. By performing feature selection based on fuzzy entropy measures before classification on our data set the empirical results were very promising, the highest classification accuracy of 98.83% was achieved when testing our similarity measure to the data set. The achieved results were then compared with some other results previously obtained using different similarity classifiers, the obtained results show better accuracy than the one achieved before. The used methods helped to reduce the dimensionality of the used data set, to speed up the computation time of a learning algorithm and therefore have simplified the classification task
Resumo:
The purpose of this master thesis was to perform simulations that involve use of random number while testing hypotheses especially on two samples populations being compared weather by their means, variances or Sharpe ratios. Specifically, we simulated some well known distributions by Matlab and check out the accuracy of an hypothesis testing. Furthermore, we went deeper and check what could happen once the bootstrapping method as described by Effrons is applied on the simulated data. In addition to that, one well known RobustSharpe hypothesis testing stated in the paper of Ledoit and Wolf was applied to measure the statistical significance performance between two investment founds basing on testing weather there is a statistically significant difference between their Sharpe Ratios or not. We collected many literatures about our topic and perform by Matlab many simulated random numbers as possible to put out our purpose; As results we come out with a good understanding that testing are not always accurate; for instance while testing weather two normal distributed random vectors come from the same normal distribution. The Jacque-Berra test for normality showed that for the normal random vector r1 and r2, only 94,7% and 95,7% respectively are coming from normal distribution in contrast 5,3% and 4,3% failed to shown the truth already known; but when we introduce the bootstrapping methods by Effrons while estimating pvalues where the hypothesis decision is based, the accuracy of the test was 100% successful. From the above results the reports showed that bootstrapping methods while testing or estimating some statistics should always considered because at most cases the outcome are accurate and errors are minimized in the computation. Also the RobustSharpe test which is known to use one of the bootstrapping methods, studentised one, were applied first on different simulated data including distribution of many kind and different shape secondly, on real data, Hedge and Mutual funds. The test performed quite well to agree with the existence of statistical significance difference between their Sharpe ratios as described in the paper of Ledoit andWolf.
Resumo:
Memristive computing refers to the utilization of the memristor, the fourth fundamental passive circuit element, in computational tasks. The existence of the memristor was theoretically predicted in 1971 by Leon O. Chua, but experimentally validated only in 2008 by HP Labs. A memristor is essentially a nonvolatile nanoscale programmable resistor — indeed, memory resistor — whose resistance, or memristance to be precise, is changed by applying a voltage across, or current through, the device. Memristive computing is a new area of research, and many of its fundamental questions still remain open. For example, it is yet unclear which applications would benefit the most from the inherent nonlinear dynamics of memristors. In any case, these dynamics should be exploited to allow memristors to perform computation in a natural way instead of attempting to emulate existing technologies such as CMOS logic. Examples of such methods of computation presented in this thesis are memristive stateful logic operations, memristive multiplication based on the translinear principle, and the exploitation of nonlinear dynamics to construct chaotic memristive circuits. This thesis considers memristive computing at various levels of abstraction. The first part of the thesis analyses the physical properties and the current-voltage behaviour of a single device. The middle part presents memristor programming methods, and describes microcircuits for logic and analog operations. The final chapters discuss memristive computing in largescale applications. In particular, cellular neural networks, and associative memory architectures are proposed as applications that significantly benefit from memristive implementation. The work presents several new results on memristor modeling and programming, memristive logic, analog arithmetic operations on memristors, and applications of memristors. The main conclusion of this thesis is that memristive computing will be advantageous in large-scale, highly parallel mixed-mode processing architectures. This can be justified by the following two arguments. First, since processing can be performed directly within memristive memory architectures, the required circuitry, processing time, and possibly also power consumption can be reduced compared to a conventional CMOS implementation. Second, intrachip communication can be naturally implemented by a memristive crossbar structure.
Resumo:
Modern machine structures are often fabricated by welding. From a fatigue point of view, the structural details and especially, the welded details are the most prone to fatigue damage and failure. Design against fatigue requires information on the fatigue resistance of a structure’s critical details and the stress loads that act on each detail. Even though, dynamic simulation of flexible bodies is already current method for analyzing structures, obtaining the stress history of a structural detail during dynamic simulation is a challenging task; especially when the detail has a complex geometry. In particular, analyzing the stress history of every structural detail within a single finite element model can be overwhelming since the amount of nodal degrees of freedom needed in the model may require an impractical amount of computational effort. The purpose of computer simulation is to reduce amount of prototypes and speed up the product development process. Also, to take operator influence into account, real time models, i.e. simplified and computationally efficient models are required. This in turn, requires stress computation to be efficient if it will be performed during dynamic simulation. The research looks back at the theoretical background of multibody dynamic simulation and finite element method to find suitable parts to form a new approach for efficient stress calculation. This study proposes that, the problem of stress calculation during dynamic simulation can be greatly simplified by using a combination of floating frame of reference formulation with modal superposition and a sub-modeling approach. In practice, the proposed approach can be used to efficiently generate the relevant fatigue assessment stress history for a structural detail during or after dynamic simulation. In this work numerical examples are presented to demonstrate the proposed approach in practice. The results show that approach is applicable and can be used as proposed.
Resumo:
Waste incineration plants are increasingly established in China. A low heating value and high moisture content, due to a large proportion of biowaste in the municipal solid waste (MSW), can be regarded as typical characteristics of Chinese MSW. Two incineration technologies have been mainly established in China: stoker grate and circular fluidized bed (CFB). Both of them are designed to incinerate mixed MSW. However, there have been difficulties to reach the sufficient temperature in the combustion process due to the low heating value of the MSW. That is contributed to the usage of an auxiliary fossil fuel, which is often used during the whole incineration process. The objective of this study was to design alternative Waste-to-energy (WTE) scenarios for existing WTE plants with the aim to improve the material and energy efficiency as well as the feasibility of the plants. Moreover, the aim of this thesis was to find the key factors that affect to the feasibility of the scenarios. Five different WTE plants were selected as study targets. The necessary data for calculation was gained from literature as well as received from the operators of the target WTE plants. The created scenarios were based on mechanical-biological treatment (MBT) technologies, in which the produced solid recovered fuel (SRF) was fed as an auxiliary fuel into a WTE plant replacing the fossil fuel. The mechanically separated biowaste was treated either in an anaerobic digestion (AD) plant, a biodrying plant, a thermal drying plant, or a combined AD plant + thermal drying plant. An interactive excel spreadsheet based computation tool was designed to estimate the viability of the scenarios in different WTE cases. The key figures of the improved material and energy efficiency, such as additional electricity generated and avoided waste for landfill, were got as results. Furthermore, economic indicators such as annual profits (or costs), payback period, and internal rate of return (IRR) were gained as results. The results show that the AD scenario was the most profitable in most of the cases. The current heating value of MSW and the tipping fee for the received MSW appeared as the most important factor in terms of feasibility.
Resumo:
Olemassa olevat spektrieromittarit eivät vastaa riittävästi CIEDE2000-värieroa. Tämän työn tavoitteena oli toteuttaa menetelmä, joka laskee värispektrien eron siten, että tulos vastaa CIEDE2000-värieroa. Kehitystyön tuloksena syntyi menetelmä, joka perustuu ennalta laskettuihin eroihin tunnettujen spektrien välillä ja niiden perusteella johdettuihin laskentaparametreihin. menetelmällä pystyy laskemaan spektrieroja vain niiden spektrien välillä, jotka saadaan sekoittamalla tunnettuja spektrejä. Laskentaparametrien laskenta on työläs prosessi ja siksi menetelmään toteutettiin hajautus usealle tietokoneelle. Menetelmä saatiin vastaamaan hyvin CIEDE2000:ia suurimmalle osalle spektrejä harvoja poikkeuksia lukuunottamatta. Ongelmat johtuvat mallissa olevasta matemaattisesta ominaisuudesta. Spektrieromittari näyttää metameerisille spektreille nollasta poikkeavan arvon, vaikka CIEDE2000 näyttää nollaa. Tämä osoittaa spektrieromittarin oikeamman toiminnan CIEDE2000-värieroon verrattuna.
Resumo:
The papermaking industry has been continuously developing intelligent solutions to characterize the raw materials it uses, to control the manufacturing process in a robust way, and to guarantee the desired quality of the end product. Based on the much improved imaging techniques and image-based analysis methods, it has become possible to look inside the manufacturing pipeline and propose more effective alternatives to human expertise. This study is focused on the development of image analyses methods for the pulping process of papermaking. Pulping starts with wood disintegration and forming the fiber suspension that is subsequently bleached, mixed with additives and chemicals, and finally dried and shipped to the papermaking mills. At each stage of the process it is important to analyze the properties of the raw material to guarantee the product quality. In order to evaluate properties of fibers, the main component of the pulp suspension, a framework for fiber characterization based on microscopic images is proposed in this thesis as the first contribution. The framework allows computation of fiber length and curl index correlating well with the ground truth values. The bubble detection method, the second contribution, was developed in order to estimate the gas volume at the delignification stage of the pulping process based on high-resolution in-line imaging. The gas volume was estimated accurately and the solution enabled just-in-time process termination whereas the accurate estimation of bubble size categories still remained challenging. As the third contribution of the study, optical flow computation was studied and the methods were successfully applied to pulp flow velocity estimation based on double-exposed images. Finally, a framework for classifying dirt particles in dried pulp sheets, including the semisynthetic ground truth generation, feature selection, and performance comparison of the state-of-the-art classification techniques, was proposed as the fourth contribution. The framework was successfully tested on the semisynthetic and real-world pulp sheet images. These four contributions assist in developing an integrated factory-level vision-based process control.
Resumo:
Stochastic differential equation (SDE) is a differential equation in which some of the terms and its solution are stochastic processes. SDEs play a central role in modeling physical systems like finance, Biology, Engineering, to mention some. In modeling process, the computation of the trajectories (sample paths) of solutions to SDEs is very important. However, the exact solution to a SDE is generally difficult to obtain due to non-differentiability character of realizations of the Brownian motion. There exist approximation methods of solutions of SDE. The solutions will be continuous stochastic processes that represent diffusive dynamics, a common modeling assumption for financial, Biology, physical, environmental systems. This Masters' thesis is an introduction and survey of numerical solution methods for stochastic differential equations. Standard numerical methods, local linearization methods and filtering methods are well described. We compute the root mean square errors for each method from which we propose a better numerical scheme. Stochastic differential equations can be formulated from a given ordinary differential equations. In this thesis, we describe two kind of formulations: parametric and non-parametric techniques. The formulation is based on epidemiological SEIR model. This methods have a tendency of increasing parameters in the constructed SDEs, hence, it requires more data. We compare the two techniques numerically.
Resumo:
Hydraulic head is distributed through a medium with porous aspect. The analysis of hydraulic head from one point to another is used by the Richard's equation. This equation is equivalent to the groundwater ow equation that predicts the volumetric water contents. COMSOL 3.5 is used for computation applying Richard's equation. A rectangle of 100 meters of length and 10 meters of large (depth) with 0,1 m/s fl ux of inlet as source of our fl uid is simulated. The domain have Richards' equation model in two dimension (2D). Hydraulic head increases proportional with moisture content.
Resumo:
Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of the upcoming applications. In MPSoC, the communication platform is both the key enabler, as well as the key differentiator for realizing efficient MPSoCs. It provides product differentiation to meet a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by all kind of applications, such as the availability of higher communication bandwidth for computation but not data intensive applications is often unfeasible in the practical implementation. This thesis aims to perform the architecture-level design space exploration towards efficient and scalable resource utilization for MPSoC communication architecture. In order to meet the performance requirements within the design constraints, careful selection of MPSoC communication platform, resource aware partitioning and mapping of the application play important role. To enhance the utilization of communication resources, variety of techniques such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing can be used. For implementation, these techniques should be customized according to the platform architecture. To address the resource utilization of MPSoC communication platforms, variety of architectures with different design parameters and performance levels, namely Segmented bus (SegBus), Network-on-Chip (NoC) and Three-Dimensional NoC (3D-NoC), are selected. Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, fault on a component makes the connected fault-free components inoperative. Resource sharing approach can utilize the fault-free components to retain the system performance by reducing the impact of faults. Design space exploration also guides to narrow down the selection of MPSoC architecture, which can meet the performance requirements with design constraints.
Resumo:
Video transcoding refers to the process of converting a digital video from one format into another format. It is a compute-intensive operation. Therefore, transcoding of a large number of simultaneous video streams requires a large amount of computing resources. Moreover, to handle di erent load conditions in a cost-e cient manner, the video transcoding service should be dynamically scalable. Infrastructure as a Service Clouds currently offer computing resources, such as virtual machines, under the pay-per-use business model. Thus the IaaS Clouds can be leveraged to provide a coste cient, dynamically scalable video transcoding service. To use computing resources e ciently in a cloud computing environment, cost-e cient virtual machine provisioning is required to avoid overutilization and under-utilization of virtual machines. This thesis presents proactive virtual machine resource allocation and de-allocation algorithms for video transcoding in cloud computing. Since users' requests for videos may change at di erent times, a check is required to see if the current computing resources are adequate for the video requests. Therefore, the work on admission control is also provided. In addition to admission control, temporal resolution reduction is used to avoid jitters in a video. Furthermore, in a cloud computing environment such as Amazon EC2, the computing resources are more expensive as compared with the storage resources. Therefore, to avoid repetition of transcoding operations, a transcoded video needs to be stored for a certain time. To store all videos for the same amount of time is also not cost-e cient because popular transcoded videos have high access rate while unpopular transcoded videos are rarely accessed. This thesis provides a cost-e cient computation and storage trade-o strategy, which stores videos in the video repository as long as it is cost-e cient to store them. This thesis also proposes video segmentation strategies for bit rate reduction and spatial resolution reduction video transcoding. The evaluation of proposed strategies is performed using a message passing interface based video transcoder, which uses a coarse-grain parallel processing approach where video is segmented at group of pictures level.
Resumo:
Global illumination algorithms are at the center of realistic image synthesis and account for non-trivial light transport and occlusion within scenes, such as indirect illumination, ambient occlusion, and environment lighting. Their computationally most difficult part is determining light source visibility at each visible scene point. Height fields, on the other hand, constitute an important special case of geometry and are mainly used to describe certain types of objects such as terrains and to map detailed geometry onto object surfaces. The geometry of an entire scene can also be approximated by treating the distance values of its camera projection as a screen-space height field. In order to shadow height fields from environment lights a horizon map is usually used to occlude incident light. We reduce the per-receiver time complexity of generating the horizon map on N N height fields from O(N) of the previous work to O(1) by using an algorithm that incrementally traverses the height field and reuses the information already gathered along the path of traversal. We also propose an accurate method to integrate the incident light within the limits given by the horizon map. Indirect illumination in height fields requires information about which other points are visible to each height field point. We present an algorithm to determine this intervisibility in a time complexity that matches the space complexity of the produced visibility information, which is in contrast to previous methods which scale in the height field size. As a result the amount of computation is reduced by two orders of magnitude in common use cases. Screen-space ambient obscurance methods approximate ambient obscurance from the depth bu er geometry and have been widely adopted by contemporary real-time applications. They work by sampling the screen-space geometry around each receiver point but have been previously limited to near- field effects because sampling a large radius quickly exceeds the render time budget. We present an algorithm that reduces the quadratic per-pixel complexity of previous methods to a linear complexity by line sweeping over the depth bu er and maintaining an internal representation of the processed geometry from which occluders can be efficiently queried. Another algorithm is presented to determine ambient obscurance from the entire depth bu er at each screen pixel. The algorithm scans the depth bu er in a quick pre-pass and locates important features in it, which are then used to evaluate the ambient obscurance integral accurately. We also propose an evaluation of the integral such that results within a few percent of the ray traced screen-space reference are obtained at real-time render times.
Resumo:
With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.
Resumo:
One of the main challenges in Software Engineering is to cope with the transition from an industry based on software as a product to software as a service. The field of Software Engineering should provide the necessary methods and tools to develop and deploy new cost-efficient and scalable digital services. In this thesis, we focus on deployment platforms to ensure cost-efficient scalability of multi-tier web applications and on-demand video transcoding service for different types of load conditions. Infrastructure as a Service (IaaS) clouds provide Virtual Machines (VMs) under the pay-per-use business model. Dynamically provisioning VMs on demand allows service providers to cope with fluctuations on the number of service users. However, VM provisioning must be done carefully, because over-provisioning results in an increased operational cost, while underprovisioning leads to a subpar service. Therefore, our main focus in this thesis is on cost-efficient VM provisioning for multi-tier web applications and on-demand video transcoding. Moreover, to prevent provisioned VMs from becoming overloaded, we augment VM provisioning with an admission control mechanism. Similarly, to ensure efficient use of provisioned VMs, web applications on the under-utilized VMs are consolidated periodically. Thus, the main problem that we address is cost-efficient VM provisioning augmented with server consolidation and admission control on the provisioned VMs. We seek solutions for two types of applications: multi-tier web applications that follow the request-response paradigm and on-demand video transcoding that is based on video streams with soft realtime constraints. Our first contribution is a cost-efficient VM provisioning approach for multi-tier web applications. The proposed approach comprises two subapproaches: a reactive VM provisioning approach called ARVUE and a hybrid reactive-proactive VM provisioning approach called Cost-efficient Resource Allocation for Multiple web applications with Proactive scaling. Our second contribution is a prediction-based VM provisioning approach for on-demand video transcoding in the cloud. Moreover, to prevent virtualized servers from becoming overloaded, the proposed VM provisioning approaches are augmented with admission control approaches. Therefore, our third contribution is a session-based admission control approach for multi-tier web applications called adaptive Admission Control for Virtualized Application Servers. Similarly, the fourth contribution in this thesis is a stream-based admission control and scheduling approach for on-demand video transcoding called Stream-Based Admission Control and Scheduling. Our fifth contribution is a computation and storage trade-o strategy for cost-efficient video transcoding in cloud computing. Finally, the sixth and the last contribution is a web application consolidation approach, which uses Ant Colony System to minimize the under-utilization of the virtualized application servers.