908 resultados para Data Structure Operations


Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper we present a data structure which improves the average complexity of the operations of updating and a certain type of retrieving information on an array. The data structure is devised from a particular family of digraphs verifying conditions so that they represent solutions for this problem.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The rapid growth of virtualized data centers and cloud hosting services is making the management of physical resources such as CPU, memory, and I/O bandwidth in data center servers increasingly important. Server management now involves dealing with multiple dissimilar applications with varying Service-Level-Agreements (SLAs) and multiple resource dimensions. The multiplicity and diversity of resources and applications are rendering administrative tasks more complex and challenging. This thesis aimed to develop a framework and techniques that would help substantially reduce data center management complexity.^ We specifically addressed two crucial data center operations. First, we precisely estimated capacity requirements of client virtual machines (VMs) while renting server space in cloud environment. Second, we proposed a systematic process to efficiently allocate physical resources to hosted VMs in a data center. To realize these dual objectives, accurately capturing the effects of resource allocations on application performance is vital. The benefits of accurate application performance modeling are multifold. Cloud users can size their VMs appropriately and pay only for the resources that they need; service providers can also offer a new charging model based on the VMs performance instead of their configured sizes. As a result, clients will pay exactly for the performance they are actually experiencing; on the other hand, administrators will be able to maximize their total revenue by utilizing application performance models and SLAs. ^ This thesis made the following contributions. First, we identified resource control parameters crucial for distributing physical resources and characterizing contention for virtualized applications in a shared hosting environment. Second, we explored several modeling techniques and confirmed the suitability of two machine learning tools, Artificial Neural Network and Support Vector Machine, to accurately model the performance of virtualized applications. Moreover, we suggested and evaluated modeling optimizations necessary to improve prediction accuracy when using these modeling tools. Third, we presented an approach to optimal VM sizing by employing the performance models we created. Finally, we proposed a revenue-driven resource allocation algorithm which maximizes the SLA-generated revenue for a data center.^

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

During the SINOPS project, an optimal state of the art simulation of the marine silicon cycle is attempted employing a biogeochemical ocean general circulation model (BOGCM) through three particular time steps relevant for global (paleo-) climate. In order to tune the model optimally, results of the simulations are compared to a comprehensive data set of 'real' observations. SINOPS' scientific data management ensures that data structure becomes homogeneous throughout the project. Practical work routine comprises systematic progress from data acquisition, through preparation, processing, quality check and archiving, up to the presentation of data to the scientific community. Meta-information and analytical data are mapped by an n-dimensional catalogue in order to itemize the analytical value and to serve as an unambiguous identifier. In practice, data management is carried out by means of the online-accessible information system PANGAEA, which offers a tool set comprising a data warehouse, Graphical Information System (GIS), 2-D plot, cross-section plot, etc. and whose multidimensional data model promotes scientific data mining. Besides scientific and technical aspects, this alliance between scientific project team and data management crew serves to integrate the participants and allows them to gain mutual respect and appreciation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Due to the growth of design size and complexity, design verification is an important aspect of the Logic Circuit development process. The purpose of verification is to validate that the design meets the system requirements and specification. This is done by either functional or formal verification. The most popular approach to functional verification is the use of simulation based techniques. Using models to replicate the behaviour of an actual system is called simulation. In this thesis, a software/data structure architecture without explicit locks is proposed to accelerate logic gate circuit simulation. We call thus system ZSIM. The ZSIM software architecture simulator targets low cost SIMD multi-core machines. Its performance is evaluated on the Intel Xeon Phi and 2 other machines (Intel Xeon and AMD Opteron). The aim of these experiments is to: • Verify that the data structure used allows SIMD acceleration, particularly on machines with gather instructions ( section 5.3.1). • Verify that, on sufficiently large circuits, substantial gains could be made from multicore parallelism ( section 5.3.2 ). • Show that a simulator using this approach out-performs an existing commercial simulator on a standard workstation ( section 5.3.3 ). • Show that the performance on a cheap Xeon Phi card is competitive with results reported elsewhere on much more expensive super-computers ( section 5.3.5 ). To evaluate the ZSIM, two types of test circuits were used: 1. Circuits from the IWLS benchmark suit [1] which allow direct comparison with other published studies of parallel simulators.2. Circuits generated by a parametrised circuit synthesizer. The synthesizer used an algorithm that has been shown to generate circuits that are statistically representative of real logic circuits. The synthesizer allowed testing of a range of very large circuits, larger than the ones for which it was possible to obtain open source files. The experimental results show that with SIMD acceleration and multicore, ZSIM gained a peak parallelisation factor of 300 on Intel Xeon Phi and 11 on Intel Xeon. With only SIMD enabled, ZSIM achieved a maximum parallelistion gain of 10 on Intel Xeon Phi and 4 on Intel Xeon. Furthermore, it was shown that this software architecture simulator running on a SIMD machine is much faster than, and can handle much bigger circuits than a widely used commercial simulator (Xilinx) running on a workstation. The performance achieved by ZSIM was also compared with similar pre-existing work on logic simulation targeting GPUs and supercomputers. It was shown that ZSIM simulator running on a Xeon Phi machine gives comparable simulation performance to the IBM Blue Gene supercomputer at very much lower cost. The experimental results have shown that the Xeon Phi is competitive with simulation on GPUs and allows the handling of much larger circuits than have been reported for GPU simulation. When targeting Xeon Phi architecture, the automatic cache management of the Xeon Phi, handles and manages the on-chip local store without any explicit mention of the local store being made in the architecture of the simulator itself. However, targeting GPUs, explicit cache management in program increases the complexity of the software architecture. Furthermore, one of the strongest points of the ZSIM simulator is its portability. Note that the same code was tested on both AMD and Xeon Phi machines. The same architecture that efficiently performs on Xeon Phi, was ported into a 64 core NUMA AMD Opteron. To conclude, the two main achievements are restated as following: The primary achievement of this work was proving that the ZSIM architecture was faster than previously published logic simulators on low cost platforms. The secondary achievement was the development of a synthetic testing suite that went beyond the scale range that was previously publicly available, based on prior work that showed the synthesis technique is valid.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The rapid growth of virtualized data centers and cloud hosting services is making the management of physical resources such as CPU, memory, and I/O bandwidth in data center servers increasingly important. Server management now involves dealing with multiple dissimilar applications with varying Service-Level-Agreements (SLAs) and multiple resource dimensions. The multiplicity and diversity of resources and applications are rendering administrative tasks more complex and challenging. This thesis aimed to develop a framework and techniques that would help substantially reduce data center management complexity. We specifically addressed two crucial data center operations. First, we precisely estimated capacity requirements of client virtual machines (VMs) while renting server space in cloud environment. Second, we proposed a systematic process to efficiently allocate physical resources to hosted VMs in a data center. To realize these dual objectives, accurately capturing the effects of resource allocations on application performance is vital. The benefits of accurate application performance modeling are multifold. Cloud users can size their VMs appropriately and pay only for the resources that they need; service providers can also offer a new charging model based on the VMs performance instead of their configured sizes. As a result, clients will pay exactly for the performance they are actually experiencing; on the other hand, administrators will be able to maximize their total revenue by utilizing application performance models and SLAs. This thesis made the following contributions. First, we identified resource control parameters crucial for distributing physical resources and characterizing contention for virtualized applications in a shared hosting environment. Second, we explored several modeling techniques and confirmed the suitability of two machine learning tools, Artificial Neural Network and Support Vector Machine, to accurately model the performance of virtualized applications. Moreover, we suggested and evaluated modeling optimizations necessary to improve prediction accuracy when using these modeling tools. Third, we presented an approach to optimal VM sizing by employing the performance models we created. Finally, we proposed a revenue-driven resource allocation algorithm which maximizes the SLA-generated revenue for a data center.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the presented thesis work, the meshfree method with distance fields was coupled with the lattice Boltzmann method to obtain solutions of fluid-structure interaction problems. The thesis work involved development and implementation of numerical algorithms, data structure, and software. Numerical and computational properties of the coupling algorithm combining the meshfree method with distance fields and the lattice Boltzmann method were investigated. Convergence and accuracy of the methodology was validated by analytical solutions. The research was focused on fluid-structure interaction solutions in complex, mesh-resistant domains as both the lattice Boltzmann method and the meshfree method with distance fields are particularly adept in these situations. Furthermore, the fluid solution provided by the lattice Boltzmann method is massively scalable, allowing extensive use of cutting edge parallel computing resources to accelerate this phase of the solution process. The meshfree method with distance fields allows for exact satisfaction of boundary conditions making it possible to exactly capture the effects of the fluid field on the solid structure.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The central aim of this dissertation is to introduce innovative methods, models, and tools to enhance the overall performance of supply chains responsible for handling perishable products. This concept of improved performance encompasses several critical dimensions, including enhanced efficiency in supply chain operations, product quality, safety, sustainability, waste generation minimization, and compliance with norms and regulations. The research is structured around three specific research questions that provide a solid foundation for delving into and narrowing down the array of potential solutions. These questions primarily concern enhancing the overall performance of distribution networks for perishable products and optimizing the package hierarchy, extending to unconventional packaging solutions. To address these research questions effectively, a well-defined research framework guides the approach. However, the dissertation adheres to an overarching methodological approach that comprises three fundamental aspects. The first aspect centers on the necessity of systematic data sampling and categorization, including identifying critical points within food supply chains. The data collected in this context must then be organized within a customized data structure designed to feed both cyber-physical and digital twins to quantify and analyze supply chain failures with a preventive perspective.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The validation of an analytical procedure must be certified through the determination of parameters known as figures of merit. For first order data, the acuracy, precision, robustness and bias is similar to the methods of univariate calibration. Linearity, sensitivity, signal to noise ratio, adjustment, selectivity and confidence intervals need different approaches, specific for multivariate data. Selectivity and signal to noise ratio are more critical and they only can be estimated by means of the calculation of the net analyte signal. In second order calibration, some differentes approaches are necessary due to data structure.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The power loss reduction in distribution systems (DSs) is a nonlinear and multiobjective problem. Service restoration in DSs is even computationally hard since it additionally requires a solution in real-time. Both DS problems are computationally complex. For large-scale networks, the usual problem formulation has thousands of constraint equations. The node-depth encoding (NDE) enables a modeling of DSs problems that eliminates several constraint equations from the usual formulation, making the problem solution simpler. On the other hand, a multiobjective evolutionary algorithm (EA) based on subpopulation tables adequately models several objectives and constraints, enabling a better exploration of the search space. The combination of the multiobjective EA with NDE (MEAN) results in the proposed approach for solving DSs problems for large-scale networks. Simulation results have shown the MEAN is able to find adequate restoration plans for a real DS with 3860 buses and 632 switches in a running time of 0.68 s. Moreover, the MEAN has shown a sublinear running time in function of the system size. Tests with networks ranging from 632 to 5166 switches indicate that the MEAN can find network configurations corresponding to a power loss reduction of 27.64% for very large networks requiring relatively low running time.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

O objectivo deste trabalho é a análise da eficiência produtiva e dos efeitos da concentração sobre os custos bancários, tendo por base a indústria bancária portuguesa. O carácter multiproduto da empresa bancária sugere a necessidade de se adoptar formas multiproduto da função custo (tipo Fourier). Introduzimos variáveis de homogeneidade e de estrutura que permitem o recurso a formas funcionais uniproduto (Cobb-Douglas) à banca. A amostra corresponde a 22 bancos que operavam em Portugal entre 1995-2001, base não consolidada e dados em painel. Para o estudo da ineficiência recorreu-se ao modelo estocástico da curva fronteira (SFA), para as duas especificações. Na análise da concentração, introduziram-se variáveis binárias que pretendem captar os efeitos durante quatro anos após a concentração. Tanto no caso da SFA como no da concentração, os resultados encontrados são sensíveis à especificação funcional adoptada. Concluindo, o processo de concentração bancário parece justificar-se pela possibilidade da diminuição da ineficiência-X. This study addresses the productive efficiency and the effects of concentration over the banking costs, stressing its focus on the Portuguese banking market. The multiproduct character of the banking firm suggests the use of functional forms as Fourier. The introduction of variables of structure and of homogeneity allows the association of the banking activity (multiproduct) with a single product function (Cobb-Douglas type). The sample covers 22 banks which operated in Portugal from 1995-2001, non consolidated base with a panel data structure. The study about inefficiency is elaborated through the stochastic frontier model (SFA), for the two specifications selected. As a methodology to analyze the concentration, we introduced binary variables, which intend to catch the effects through four years after the concentration process. The results obtained, through SFA and concentration approach, are influenced by the kind of specifications selected. Summing up, the concentration process of the Banking Industry sounds to be justified by the possibility of the X-inefficiency.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Environmental management is a complex task. The amount and heterogeneity of the data needed for an environmental decision making tool is overwhelming without adequate database systems and innovative methodologies. As far as data management, data interaction and data processing is concerned we here propose the use of a Geographical Information System (GIS) whilst for the decision making we suggest a Multi-Agent System (MAS) architecture. With the adoption of a GIS we hope to provide a complementary coexistence between heterogeneous data sets, a correct data structure, a good storage capacity and a friendly user’s interface. By choosing a distributed architecture such as a Multi-Agent System, where each agent is a semi-autonomous Expert System with the necessary skills to cooperate with the others in order to solve a given task, we hope to ensure a dynamic problem decomposition and to achieve a better performance compared with standard monolithical architectures. Finally, and in view of the partial, imprecise, and ever changing character of information available for decision making, Belief Revision capabilities are added to the system. Our aim is to present and discuss an intelligent environmental management system capable of suggesting the more appropriate land-use actions based on the existing spatial and non-spatial constraints.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Master’s Thesis in Computer Engineering

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Os osciloscópios digitais são utilizados em diversas áreas do conhecimento, assumindo-se no âmbito da engenharia electrónica, como instrumentos indispensáveis. Graças ao advento das Field Programmable Gate Arrays (FPGAs), os instrumentos de medição reconfiguráveis, dadas as suas vantagens, i.e., altos desempenhos, baixos custos e elevada flexibilidade, são cada vez mais uma alternativa aos instrumentos tradicionalmente usados nos laboratórios. Tendo como objectivo a normalização no acesso e no controlo deste tipo de instrumentos, esta tese descreve o projecto e implementação de um osciloscópio digital reconfigurável baseado na norma IEEE 1451.0. Definido de acordo com uma arquitectura baseada nesta norma, as características do osciloscópio são descritas numa estrutura de dados denominada Transducer Electronic Data Sheet (TEDS), e o seu controlo é efectuado utilizando um conjunto de comandos normalizados. O osciloscópio implementa um conjunto de características e funcionalidades básicas, todas verificadas experimentalmente. Destas, destaca-se uma largura de banda de 575kHz, um intervalo de medição de 0.4V a 2.9V, a possibilidade de se definir um conjunto de escalas horizontais, o nível e declive de sincronismo e o modo de acoplamento com o circuito sob análise. Arquitecturalmente, o osciloscópio é constituído por um módulo especificado com a linguagem de descrição de hardware (HDL, Hardware Description Language) Verilog e por uma interface desenvolvida na linguagem de programação Java®. O módulo é embutido numa FPGA, definindo todo o processamento do osciloscópio. A interface permite o seu controlo e a representação do sinal medido. Durante o projecto foi utilizado um conversor Analógico/Digital (A/D) com uma frequência máxima de amostragem de 1.5MHz e 14 bits de resolução que, devido às suas limitações, obrigaram à implementação de um sistema de interpolação multi-estágio com filtros digitais.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

O objectivo deste trabalho é a análise da eficiência produtiva e dos efeitos da concentração sobre os custos bancários, tendo por base a indústria bancária portuguesa. O carácter multiproduto da empresa bancária sugere a necessidade de se adoptar formas multiproduto da função custo (tipo Fourier). Introduzimos variáveis de homogeneidade e de estrutura que permitem o recurso a formas funcionais uniproduto (Cobb-Douglas) à banca. A amostra corresponde a 22 bancos que operavam em Portugal entre 1995-2001, base não consolidada e dados em painel. Para o estudo da ineficiência recorreu-se ao modelo estocástico da curva fronteira (SFA), para as duas especificações. Na análise da concentração, introduziram-se variáveis binárias que pretendem captar os efeitos durante quatro anos após a concentração. Tanto no caso da SFA como no da concentração, os resultados encontrados são sensíveis à especificação funcional adoptada. Concluindo, o processo de concentração bancário parece justificar-se pela possibilidade da diminuição da ineficiência-X.