414 resultados para runtime bloat
Resumo:
This work was developed with the objective of proposing a simple, fast and versatile methodological routine using near-infrared spectroscopy (NIR) combined with multivariate analysis for the determination of ash content, moisture, protein and total lipids present in the gray shrimp (Litopenaeus vannamei ) which is conventionally performed gravimetrically after ashing at 550 ° C gravimetrically after drying at 105 ° C for the determination of moisture gravimetrically after a Soxhlet extraction using volumetric and after digestion and distillation Kjedhal respectively. Was first collected the spectra of 63 samples processed boiled shrimp Litopenaeus vannamei species. Then, the determinations by conventional standard methods were carried out. The spectra centered average underwent multiplicative scattering correction of light, smoothing Saviztky-Golay 15 points and first derivative, eliminated the noisy region, the working range was from 1100,36 to 2502,37 nm. Thus, the PLS models for predicting ash showed R 0,9471; 0,1017 and RMSEP RMSEC 0,1548; Moisture R was 0,9241; 2,5483 and RMSEP RMSEC 4,1979; R protein to 0,9201; 1,9391 and RMSEP RMSEC 2,7066; for lipids R 0,8801; 0,2827 and RMSEP RMSEC 0,2329 So that the results showed that the relative errors found between the reference method and the NIR were small and satisfactory. These results are an excellent indication that you can use the NIR to these analyzes, which is quite advantageous, since conventional techniques are time consuming, they spend a lot of reagents and involve a number of professionals, which requires a reasonable runtime while after the validation of the methodology execution using NIR reduces all this time to a few minutes, saving reagents, time and without waste generation, and that this is a non-destructive technique.
Resumo:
This work presents an analysis of the behavior of some algorithms usually available in stereo correspondence literature, with full HD images (1920x1080 pixels) to establish, within the precision dilemma versus runtime applications which these methods can be better used. The images are obtained by a system composed of a stereo camera coupled to a computer via a capture board. The OpenCV library is used for computer vision operations and processing images involved. The algorithms discussed are an overall method of search for matching blocks with the Sum of the Absolute Value of the difference (Sum of Absolute Differences - SAD), a global technique based on cutting energy graph cuts, and a so-called matching technique semi -global. The criteria for analysis are processing time, the consumption of heap memory and the mean absolute error of disparity maps generated.
Resumo:
The Artificial Neural Networks (ANN), which is one of the branches of Artificial Intelligence (AI), are being employed as a solution to many complex problems existing in several areas. To solve these problems, it is essential that its implementation is done in hardware. Among the strategies to be adopted and met during the design phase and implementation of RNAs in hardware, connections between neurons are the ones that need more attention. Recently, are RNAs implemented both in application specific integrated circuits's (Application Specific Integrated Circuits - ASIC) and in integrated circuits configured by the user, like the Field Programmable Gate Array (FPGA), which have the ability to be partially rewritten, at runtime, forming thus a system Partially Reconfigurable (SPR), the use of which provides several advantages, such as flexibility in implementation and cost reduction. It has been noted a considerable increase in the use of FPGAs for implementing ANNs. Given the above, it is proposed to implement an array of reconfigurable neurons for topologies Description of artificial neural network multilayer perceptrons (MLPs) in FPGA, in order to encourage feedback and reuse of neural processors (perceptrons) used in the same area of the circuit. It is further proposed, a communication network capable of performing the reuse of artificial neurons. The architecture of the proposed system will configure various topologies MLPs networks through partial reconfiguration of the FPGA. To allow this flexibility RNAs settings, a set of digital components (datapath), and a controller were developed to execute instructions that define each topology for MLP neural network.
Resumo:
The Artificial Neural Networks (ANN), which is one of the branches of Artificial Intelligence (AI), are being employed as a solution to many complex problems existing in several areas. To solve these problems, it is essential that its implementation is done in hardware. Among the strategies to be adopted and met during the design phase and implementation of RNAs in hardware, connections between neurons are the ones that need more attention. Recently, are RNAs implemented both in application specific integrated circuits's (Application Specific Integrated Circuits - ASIC) and in integrated circuits configured by the user, like the Field Programmable Gate Array (FPGA), which have the ability to be partially rewritten, at runtime, forming thus a system Partially Reconfigurable (SPR), the use of which provides several advantages, such as flexibility in implementation and cost reduction. It has been noted a considerable increase in the use of FPGAs for implementing ANNs. Given the above, it is proposed to implement an array of reconfigurable neurons for topologies Description of artificial neural network multilayer perceptrons (MLPs) in FPGA, in order to encourage feedback and reuse of neural processors (perceptrons) used in the same area of the circuit. It is further proposed, a communication network capable of performing the reuse of artificial neurons. The architecture of the proposed system will configure various topologies MLPs networks through partial reconfiguration of the FPGA. To allow this flexibility RNAs settings, a set of digital components (datapath), and a controller were developed to execute instructions that define each topology for MLP neural network.
Resumo:
The objective of this work is to use algorithms known as Boltzmann Machine to rebuild and classify patterns as images. This algorithm has a similar structure to that of an Artificial Neural Network but network nodes have stochastic and probabilistic decisions. This work presents the theoretical framework of the main Artificial Neural Networks, General Boltzmann Machine algorithm and a variation of this algorithm known as Restricted Boltzmann Machine. Computer simulations are performed comparing algorithms Artificial Neural Network Backpropagation with these algorithms Boltzmann General Machine and Machine Restricted Boltzmann. Through computer simulations are analyzed executions times of the different described algorithms and bit hit percentage of trained patterns that are later reconstructed. Finally, they used binary images with and without noise in training Restricted Boltzmann Machine algorithm, these images are reconstructed and classified according to the bit hit percentage in the reconstruction of the images. The Boltzmann machine algorithms were able to classify patterns trained and showed excellent results in the reconstruction of the standards code faster runtime and thus can be used in applications such as image recognition.
Resumo:
This book constitutes the refereed proceedings of the 14th International Conference on Parallel Problem Solving from Nature, PPSN 2016, held in Edinburgh, UK, in September 2016. The total of 93 revised full papers were carefully reviewed and selected from 224 submissions. The meeting began with four workshops which offered an ideal opportunity to explore specific topics in intelligent transportation Workshop, landscape-aware heuristic search, natural computing in scheduling and timetabling, and advances in multi-modal optimization. PPSN XIV also included sixteen free tutorials to give us all the opportunity to learn about new aspects: gray box optimization in theory; theory of evolutionary computation; graph-based and cartesian genetic programming; theory of parallel evolutionary algorithms; promoting diversity in evolutionary optimization: why and how; evolutionary multi-objective optimization; intelligent systems for smart cities; advances on multi-modal optimization; evolutionary computation in cryptography; evolutionary robotics - a practical guide to experiment with real hardware; evolutionary algorithms and hyper-heuristics; a bridge between optimization over manifolds and evolutionary computation; implementing evolutionary algorithms in the cloud; the attainment function approach to performance evaluation in EMO; runtime analysis of evolutionary algorithms: basic introduction; meta-model assisted (evolutionary) optimization. The papers are organized in topical sections on adaption, self-adaption and parameter tuning; differential evolution and swarm intelligence; dynamic, uncertain and constrained environments; genetic programming; multi-objective, many-objective and multi-level optimization; parallel algorithms and hardware issues; real-word applications and modeling; theory; diversity and landscape analysis.
Resumo:
Distributed Computing frameworks belong to a class of programming models that allow developers to
launch workloads on large clusters of machines. Due to the dramatic increase in the volume of
data gathered by ubiquitous computing devices, data analytic workloads have become a common
case among distributed computing applications, making Data Science an entire field of
Computer Science. We argue that Data Scientist's concern lays in three main components: a dataset,
a sequence of operations they wish to apply on this dataset, and some constraint they may have
related to their work (performances, QoS, budget, etc). However, it is actually extremely
difficult, without domain expertise, to perform data science. One need to select the right amount
and type of resources, pick up a framework, and configure it. Also, users are often running their
application in shared environments, ruled by schedulers expecting them to specify precisely their resource
needs. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and
profiling are hard, high dimensional problems that block users from making the right
configuration choices and determining the right amount of resources they need. Paradoxically, the
system is gathering a large amount of monitoring data at runtime, which remains unused.
In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit
monitoring data to learn about workloads, and process user requests into a tailored execution
context. In this work, we study different techniques that have been used to make steps toward
such system awareness, and explore a new way to do so by implementing machine learning
techniques to recommend a specific subset of system configurations for Apache Spark applications.
Furthermore, we present an in depth study of Apache Spark executors configuration, which highlight
the complexity in choosing the best one for a given workload.
Resumo:
Kernel-level malware is one of the most dangerous threats to the security of users on the Internet, so there is an urgent need for its detection. The most popular detection approach is misuse-based detection. However, it cannot catch up with today's advanced malware that increasingly apply polymorphism and obfuscation. In this thesis, we present our integrity-based detection for kernel-level malware, which does not rely on the specific features of malware. We have developed an integrity analysis system that can derive and monitor integrity properties for commodity operating systems kernels. In our system, we focus on two classes of integrity properties: data invariants and integrity of Kernel Queue (KQ) requests. We adopt static analysis for data invariant detection and overcome several technical challenges: field-sensitivity, array-sensitivity, and pointer analysis. We identify data invariants that are critical to system runtime integrity from Linux kernel 2.4.32 and Windows Research Kernel (WRK) with very low false positive rate and very low false negative rate. We then develop an Invariant Monitor to guard these data invariants against real-world malware. In our experiment, we are able to use Invariant Monitor to detect ten real-world Linux rootkits and nine real-world Windows malware and one synthetic Windows malware. We leverage static and dynamic analysis of kernel and device drivers to learn the legitimate KQ requests. Based on the learned KQ requests, we build KQguard to protect KQs. At runtime, KQguard rejects all the unknown KQ requests that cannot be validated. We apply KQguard on WRK and Linux kernel, and extensive experimental evaluation shows that KQguard is efficient (up to 5.6% overhead) and effective (capable of achieving zero false positives against representative benign workloads after appropriate training and very low false negatives against 125 real-world malware and nine synthetic attacks). In our system, Invariant Monitor and KQguard cooperate together to protect data invariants and KQs in the target kernel. By monitoring these integrity properties, we can detect malware by its violation of these integrity properties during execution.
Resumo:
Software engineering researchers are challenged to provide increasingly more pow- erful levels of abstractions to address the rising complexity inherent in software solu- tions. One new development paradigm that places models as abstraction at the fore- front of the development process is Model-Driven Software Development (MDSD). MDSD considers models as first class artifacts, extending the capability for engineers to use concepts from the problem domain of discourse to specify apropos solutions. A key component in MDSD is domain-specific modeling languages (DSMLs) which are languages with focused expressiveness, targeting a specific taxonomy of problems. The de facto approach used is to first transform DSML models to an intermediate artifact in a HLL e.g., Java or C++, then execute that resulting code. Our research group has developed a class of DSMLs, referred to as interpreted DSMLs (i-DSMLs), where models are directly interpreted by a specialized execution engine with semantics based on model changes at runtime. This execution engine uses a layered architecture and is referred to as a domain-specific virtual machine (DSVM). As the domain-specific model being executed descends the layers of the DSVM the semantic gap between the user-defined model and the services being provided by the underlying infrastructure is closed. The focus of this research is the synthesis engine, the layer in the DSVM which transforms i-DSML models into executable scripts for the next lower layer to process. The appeal of an i-DSML is constrained as it possesses unique semantics contained within the DSVM. Existing DSVMs for i-DSMLs exhibit tight coupling between the implicit model of execution and the semantics of the domain, making it difficult to develop DSVMs for new i-DSMLs without a significant investment in resources. At the onset of this research only one i-DSML had been created for the user- centric communication domain using the aforementioned approach. This i-DSML is the Communication Modeling Language (CML) and its DSVM is the Communication Virtual machine (CVM). A major problem with the CVM’s synthesis engine is that the domain-specific knowledge (DSK) and the model of execution (MoE) are tightly interwoven consequently subsequent DSVMs would need to be developed from inception with no reuse of expertise. This dissertation investigates how to decouple the DSK from the MoE and sub- sequently producing a generic model of execution (GMoE) from the remaining appli- cation logic. This GMoE can be reused to instantiate synthesis engines for DSVMs in other domains. The generalized approach to developing the model synthesis com- ponent of i-DSML interpreters utilizes a reusable framework loosely coupled to DSK as swappable framework extensions. This approach involves first creating an i-DSML and its DSVM for a second do- main, demand-side smartgrid, or microgrid energy management, and designing the synthesis engine so that the DSK and MoE are easily decoupled. To validate the utility of the approach, the SEs are instantiated using the GMoE and DSKs of the two aforementioned domains and an empirical study to support our claim of reduced developmental effort is performed.
Resumo:
Modern software applications are becoming more dependent on database management systems (DBMSs). DBMSs are usually used as black boxes by software developers. For example, Object-Relational Mapping (ORM) is one of the most popular database abstraction approaches that developers use nowadays. Using ORM, objects in Object-Oriented languages are mapped to records in the database, and object manipulations are automatically translated to SQL queries. As a result of such conceptual abstraction, developers do not need deep knowledge of databases; however, all too often this abstraction leads to inefficient and incorrect database access code. Thus, this thesis proposes a series of approaches to improve the performance of database-centric software applications that are implemented using ORM. Our approaches focus on troubleshooting and detecting inefficient (i.e., performance problems) database accesses in the source code, and we rank the detected problems based on their severity. We first conduct an empirical study on the maintenance of ORM code in both open source and industrial applications. We find that ORM performance-related configurations are rarely tuned in practice, and there is a need for tools that can help improve/tune the performance of ORM-based applications. Thus, we propose approaches along two dimensions to help developers improve the performance of ORM-based applications: 1) helping developers write more performant ORM code; and 2) helping developers configure ORM configurations. To provide tooling support to developers, we first propose static analysis approaches to detect performance anti-patterns in the source code. We automatically rank the detected anti-pattern instances according to their performance impacts. Our study finds that by resolving the detected anti-patterns, the application performance can be improved by 34% on average. We then discuss our experience and lessons learned when integrating our anti-pattern detection tool into industrial practice. We hope our experience can help improve the industrial adoption of future research tools. However, as static analysis approaches are prone to false positives and lack runtime information, we also propose dynamic analysis approaches to further help developers improve the performance of their database access code. We propose automated approaches to detect redundant data access anti-patterns in the database access code, and our study finds that resolving such redundant data access anti-patterns can improve application performance by an average of 17%. Finally, we propose an automated approach to tune performance-related ORM configurations using both static and dynamic analysis. Our study shows that our approach can help improve application throughput by 27--138%. Through our case studies on real-world applications, we show that all of our proposed approaches can provide valuable support to developers and help improve application performance significantly.
Resumo:
With the emerging prevalence of smart phones and 4G LTE networks, the demand for faster-better-cheaper mobile services anytime and anywhere is ever growing. The Dynamic Network Optimization (DNO) concept emerged as a solution that optimally and continuously tunes the network settings, in response to varying network conditions and subscriber needs. Yet, the DNO realization is still at infancy, largely hindered by the bottleneck of the lengthy optimization runtime. This paper presents the design and prototype of a novel cloud based parallel solution that further enhances the scalability of our prior work on various parallel solutions that accelerate network optimization algorithms. The solution aims to satisfy the high performance required by DNO, preliminarily on a sub-hourly basis. The paper subsequently visualizes a design and a full cycle of a DNO system. A set of potential solutions to large network and real-time DNO are also proposed. Overall, this work creates a breakthrough towards the realization of DNO.
Resumo:
The Mobile Network Optimization (MNO) technologies have advanced at a tremendous pace in recent years. And the Dynamic Network Optimization (DNO) concept emerged years ago, aimed to continuously optimize the network in response to variations in network traffic and conditions. Yet, DNO development is still at its infancy, mainly hindered by a significant bottleneck of the lengthy optimization runtime. This paper identifies parallelism in greedy MNO algorithms and presents an advanced distributed parallel solution. The solution is designed, implemented and applied to real-life projects whose results yield a significant, highly scalable and nearly linear speedup up to 6.9 and 14.5 on distributed 8-core and 16-core systems respectively. Meanwhile, optimization outputs exhibit self-consistency and high precision compared to their sequential counterpart. This is a milestone in realizing the DNO. Further, the techniques may be applied to similar greedy optimization algorithm based applications.
Resumo:
It has been years since the introduction of the Dynamic Network Optimization (DNO) concept, yet the DNO development is still at its infant stage, largely due to a lack of breakthrough in minimizing the lengthy optimization runtime. Our previous work, a distributed parallel solution, has achieved a significant speed gain. To cater for the increased optimization complexity pressed by the uptake of smartphones and tablets, however, this paper examines the potential areas for further improvement and presents a novel asynchronous distributed parallel design that minimizes the inter-process communications. The new approach is implemented and applied to real-life projects whose results demonstrate an augmented acceleration of 7.5 times on a 16-core distributed system compared to 6.1 of our previous solution. Moreover, there is no degradation in the optimization outcome. This is a solid sprint towards the realization of DNO.
Resumo:
This talk explores how the runtime system and operating system can leverage metrics that express the significance and resilience of application components in order to reduce the energy footprint of parallel applications. We will explore in particular how software can tolerate and indeed exploit higher error rates in future processors and memory technologies that may operate outside their safe margins.
Resumo:
Reliability has emerged as a critical design constraint especially in memories. Designers are going to great lengths to guarantee fault free operation of the underlying silicon by adopting redundancy-based techniques, which essentially try to detect and correct every single error. However, such techniques come at a cost of large area, power and performance overheads which making many researchers to doubt their efficiency especially for error resilient systems where 100% accuracy is not always required. In this paper, we present an alternative method focusing on the confinement of the resulting output error induced by any reliability issues. By focusing on memory faults, rather than correcting every single error the proposed method exploits the statistical characteristics of any target application and replaces any erroneous data with the best available estimate of that data. To realize the proposed method a RISC processor is augmented with custom instructions and special-purpose functional units. We apply the method on the proposed enhanced processor by studying the statistical characteristics of the various algorithms involved in a popular multimedia application. Our experimental results show that in contrast to state-of-the-art fault tolerance approaches, we are able to reduce runtime and area overhead by 71.3% and 83.3% respectively.