961 resultados para General-purpose computing on graphics processing units (GPGPU)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Este trabalho apresenta uma nova metodologia para elastografia virtual em imagens simuladas de ultrassom utilizando métodos numéricos e métodos de visão computacional. O objetivo é estimar o módulo de elasticidade de diferentes tecidos tendo como entrada duas imagens da mesma seção transversal obtidas em instantes de tempo e pressões aplicadas diferentes. Esta metodologia consiste em calcular um campo de deslocamento das imagens com um método de fluxo óptico e aplicar um método iterativo para estimar os módulos de elasticidade (análise inversa) utilizando métodos numéricos. Para o cálculo dos deslocamentos, duas formulações são utilizadas para fluxo óptico: Lucas-Kanade e Brox. A análise inversa é realizada utilizando duas técnicas numéricas distintas: o Método dos Elementos Finitos (MEF) e o Método dos Elementos de Contorno (MEC), sendo ambos implementados em Unidades de Processamento Gráfico de uso geral, GpGPUs ( \"General Purpose Graphics Units\" ). Considerando uma quantidade qualquer de materiais a serem determinados, para a implementação do Método dos Elementos de Contorno é empregada a técnica de sub-regiões para acoplar as matrizes de diferentes estruturas identificadas na imagem. O processo de otimização utilizado para determinar as constantes elásticas é realizado de forma semi-analítica utilizando cálculo por variáveis complexas. A metodologia é testada em três etapas distintas, com simulações sem ruído, simulações com adição de ruído branco gaussiano e phantoms matemáticos utilizando rastreamento de ruído speckle. Os resultados das simulações apontam o uso do MEF como mais preciso, porém computacionalmente mais caro, enquanto o MEC apresenta erros toleráveis e maior velocidade no tempo de processamento.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Self-organising neural models have the ability to provide a good representation of the input space. In particular the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation allows the acceleration of the learning process keeping a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. GPU implementation has also been applied to a real application with time constraints: acceleration of 3D scene reconstruction for egomotion, in order to validate the proposal.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tool path generation is one of the most complex problems in Computer Aided Manufacturing. Although some efficient strategies have been developed, most of them are only useful for standard machining. However, the algorithms used for tool path computation demand a higher computation performance, which makes the implementation on many existing systems very slow or even impractical. Hardware acceleration is an incremental solution that can be cleanly added to these systems while keeping everything else intact. It is completely transparent to the user. The cost is much lower and the development time is much shorter than replacing the computers by faster ones. This paper presents an optimisation that uses a specific graphic hardware approach using the power of multi-core Graphic Processing Units (GPUs) in order to improve the tool path computation. This improvement is applied on a highly accurate and robust tool path generation algorithm. The paper presents, as a case of study, a fully implemented algorithm used for turning lathe machining of shoe lasts. A comparative study will show the gain achieved in terms of total computing time. The execution time is almost two orders of magnitude faster than modern PCs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The development of new nano-biocomposites has been one of the main research areas of interest in polymer science in recent years, since they can combine the intrinsic biodegradable nature of matrices with the ability to modify their properties by the addition of selected nano-reinforcements. In this work, the addition of mineral nanoclays (montmorillonites and sepiolites) to a commercial starch-based matrix is proposed. A complete study on their processing by melt-intercalation techniques and further evaluation of the main properties of nano-biocomposites has been carried out. The results reported show an important influence of the nano-biocomposites morphology on their final properties. In particular, the rheological and viscoelastic characteristics of these systems are very sensitive to the dispersion level of the nanofiller, but it is possible to assess that the material processing behaviour is not compromised by the presence of these nano-reinforcements. In general, both nanofillers had a positive influence in the materials final properties. Mechanical performance shows improvements in terms of elastic modulus, without important limitations in terms of ductility. Thermal properties are improved in terms of residual mass after degradation and low improvements are also observed in terms of oxygen barrier properties.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis presents the formal definition of a novel Mobile Cloud Computing (MCC) extension of the Networked Autonomic Machine (NAM) framework, a general-purpose conceptual tool which describes large-scale distributed autonomic systems. The introduction of autonomic policies in the MCC paradigm has proved to be an effective technique to increase the robustness and flexibility of MCC systems. In particular, autonomic policies based on continuous resource and connectivity monitoring help automate context-aware decisions for computation offloading. We have also provided NAM with a formalization in terms of a transformational operational semantics in order to fill the gap between its existing Java implementation NAM4J and its conceptual definition. Moreover, we have extended NAM4J by adding several components with the purpose of managing large scale autonomic distributed environments. In particular, the middleware allows for the implementation of peer-to-peer (P2P) networks of NAM nodes. Moreover, NAM mobility actions have been implemented to enable the migration of code, execution state and data. Within NAM4J, we have designed and developed a component, denoted as context bus, which is particularly useful in collaborative applications in that, if replicated on each peer, it instantiates a virtual shared channel allowing nodes to notify and get notified about context events. Regarding the autonomic policies management, we have provided NAM4J with a rule engine, whose purpose is to allow a system to autonomously determine when offloading is convenient. We have also provided NAM4J with trust and reputation management mechanisms to make the middleware suitable for applications in which such aspects are of great interest. To this purpose, we have designed and implemented a distributed framework, denoted as DARTSense, where no central server is required, as reputation values are stored and updated by participants in a subjective fashion. We have also investigated the literature regarding MCC systems. The analysis pointed out that all MCC models focus on mobile devices, and consider the Cloud as a system with unlimited resources. To contribute in filling this gap, we defined a modeling and simulation framework for the design and analysis of MCC systems, encompassing both their sides. We have also implemented a modular and reusable simulator of the model. We have applied the NAM principles to two different application scenarios. First, we have defined a hybrid P2P/cloud approach where components and protocols are autonomically configured according to specific target goals, such as cost-effectiveness, reliability and availability. Merging P2P and cloud paradigms brings together the advantages of both: high availability, provided by the Cloud presence, and low cost, by exploiting inexpensive peers resources. As an example, we have shown how the proposed approach can be used to design NAM-based collaborative storage systems based on an autonomic policy to decide how to distribute data chunks among peers and Cloud, according to cost minimization and data availability goals. As a second application, we have defined an autonomic architecture for decentralized urban participatory sensing (UPS) which bridges sensor networks and mobile systems to improve effectiveness and efficiency. The developed application allows users to retrieve and publish different types of sensed information by using the features provided by NAM4J's context bus. Trust and reputation is managed through the application of DARTSense mechanisms. Also, the application includes an autonomic policy that detects areas characterized by few contributors, and tries to recruit new providers by migrating code necessary to sensing, through NAM mobility actions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this research was to investigate the effects of Processing Instruction (VanPatten, 1996, 2007), as an input-based model for teaching second language grammar, on Syrian learners’ processing abilities. The present research investigated the effects of Processing Instruction on the acquisition of English relative clauses by Syrian learners in the form of a quasi-experimental design. Three separate groups were involved in the research (Processing Instruction, Traditional Instruction and a Control Group). For assessment, a pre-test, a direct post-test and a delayed post-test were used as main tools for eliciting data. A questionnaire was also distributed to participants in the Processing Instruction group to give them the opportunity to give feedback in relation to the treatment they received in comparison with the Traditional Instruction they are used to. Four hypotheses were formulated on the possible effectivity of Processing Instruction on Syrian learners’ linguistic system. It was hypothesised that Processing Instruction would improve learners’ processing abilities leading to an improvement in learners’ linguistic system. This was expected to lead to a better performance when it comes to the comprehension and production of English relative clauses. The main source of data was analysed statistically using the ANOVA test. Cohen’s d calculations were also used to support the ANOVA test. Cohen’s d showed the magnitude of effects of the three treatments. Results of the analysis showed that both Processing Instruction and Traditional Instruction groups had improved after treatment. However, the Processing Instruction Group significantly outperformed the other two groups in the comprehension of relative clauses. The analysis concluded that Processing Instruction is a useful tool for instructing relative clauses to Syrian learners. This was enhanced by participants’ responses to the questionnaire as they were in favour of Processing Instruction, rather than Traditional Instruction. This research has theoretical and pedagogical implications. Theoretically, the study showed support for the Input hypothesis. That is, it was shown that Processing Instruction had a positive effect on input processing as it affected learners’ linguistic system. This was reflected in learners’ performance where learners were able to produce a structure which they had not been asked to produce. Pedagogically, the present research showed that Processing Instruction is a useful tool for teaching English grammar in the context where the experiment was carried out, as it had a large effect on learners’ performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this study was to determine the effects of participating in an existing study skills course, developed for use with a general college population, on the study strategies and attitudes of college students with learning disabilities. This study further investigated whether there would be differential effectiveness for segregated and mainstreamed sections of the course.^ The sample consisted of 42 students with learning disabilities attending a southeastern university. Students were randomly assigned to either a segregated or mainstreamed section of the study skills course. In addition, a control group consisted of students with learning disabilities who received no study skills instruction.^ All subjects completed the Learning and Study Strategies Inventory (LASSI) before and after the study skills course. The subjects in the segregated group showed significant improvement on six of the 10 scales of the LASSI: Time Management, Concentration, Information Processing, Selecting Main Ideas, Study Aids, and Self Testing. Subjects in the mainstreamed section showed significant improvement on five scales: Anxiety, Selecting Main Ideas, Study Aids, Self Testing, and Test Strategies. The subjects in the control group did not significantly improve on any of the scales.^ This study showed that college students with learning disabilities improved their study strategies and attitudes by participating in a study skills course designed for a general student population. Further, these students benefitted whether by taking the course only with other students with learning disabilities, or by taking the course in a mixed group of students with or without learning disabilities. These results have important practical implications in that it appears that colleges can use existing study skills courses without having to develop special courses and schedules of course offerings targeted specifically for students with learning disabilities. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many-core systems are emerging from the need of more computational power and power efficiency. However there are many issues which still revolve around the many-core systems. These systems need specialized software before they can be fully utilized and the hardware itself may differ from the conventional computational systems. To gain efficiency from many-core system, programs need to be parallelized. In many-core systems the cores are small and less powerful than cores used in traditional computing, so running a conventional program is not an efficient option. Also in Network-on-Chip based processors the network might get congested and the cores might work at different speeds. In this thesis is, a dynamic load balancing method is proposed and tested on Intel 48-core Single-Chip Cloud Computer by parallelizing a fault simulator. The maximum speedup is difficult to obtain due to severe bottlenecks in the system. In order to exploit all the available parallelism of the Single-Chip Cloud Computer, a runtime approach capable of dynamically balancing the load during the fault simulation process is used. The proposed dynamic fault simulation approach on the Single-Chip Cloud Computer shows up to 45X speedup compared to a serial fault simulation approach. Many-core systems can draw enormous amounts of power, and if this power is not controlled properly, the system might get damaged. One way to manage power is to set power budget for the system. But if this power is drawn by just few cores of the many, these few cores get extremely hot and might get damaged. Due to increase in power density multiple thermal sensors are deployed on the chip area to provide realtime temperature feedback for thermal management techniques. Thermal sensor accuracy is extremely prone to intra-die process variation and aging phenomena. These factors lead to a situation where thermal sensor values drift from the nominal values. This necessitates efficient calibration techniques to be applied before the sensor values are used. In addition, in modern many-core systems cores have support for dynamic voltage and frequency scaling. Thermal sensors located on cores are sensitive to the core's current voltage level, meaning that dedicated calibration is needed for each voltage level. In this thesis a general-purpose software-based auto-calibration approach is also proposed for thermal sensors to calibrate thermal sensors on different range of voltages.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract: As time has passed, the general purpose programming paradigm has evolved, producing different hardware architectures whose characteristics differ widely. In this work, we are going to demonstrate, through different applications belonging to the field of Image Processing, the existing difference between three Nvidia hardware platforms: two of them belong to the GeForce graphics cards series, the GTX 480 and the GTX 980 and one of the low consumption platforms which purpose is to allow the execution of embedded applications as well as providing an extreme efficiency: the Jetson TK1. With respect to the test applications we will use five examples from Nvidia CUDA Samples. These applications are directly related to Image Processing, as the algorithms they use are similar to those from the field of medical image registration. After the tests, it will be proven that GTX 980 is both the device with the highest computational power and the one that has greater consumption, it will be seen that Jetson TK1 is the most efficient platform, it will be shown that GTX 480 produces more heat than the others and we will learn other effects produced by the existing difference between the architecture of the devices.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current trends in broadband mobile networks are addressed towards the placement of different capabilities at the edge of the mobile network in a centralised way. On one hand, the split of the eNB between baseband processing units and remote radio headers makes it possible to process some of the protocols in centralised premises, likely with virtualised resources. On the other hand, mobile edge computing makes use of processing and storage capabilities close to the air interface in order to deploy optimised services with minimum delay. The confluence of both trends is a hot topic in the definition of future 5G networks. The full centralisation of both technologies in cloud data centres imposes stringent requirements to the fronthaul connections in terms of throughput and latency. Therefore, all those cells with limited network access would not be able to offer these types of services. This paper proposes a solution for these cases, based on the placement of processing and storage capabilities close to the remote units, which is especially well suited for the deployment of clusters of small cells. The proposed cloud-enabled small cells include a highly efficient microserver with a limited set of virtualised resources offered to the cluster of small cells. As a result, a light data centre is created and commonly used for deploying centralised eNB and mobile edge computing functionalities. The paper covers the proposed architecture, with special focus on the integration of both aspects, and possible scenarios of application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

By providing vehicle-to-vehicle and vehicle-to-infrastructure wireless communications, vehicular ad hoc networks (VANETs), also known as the “networks on wheels”, can greatly enhance traffic safety, traffic efficiency and driving experience for intelligent transportation system (ITS). However, the unique features of VANETs, such as high mobility and uneven distribution of vehicular nodes, impose critical challenges of high efficiency and reliability for the implementation of VANETs. This dissertation is motivated by the great application potentials of VANETs in the design of efficient in-network data processing and dissemination. Considering the significance of message aggregation, data dissemination and data collection, this dissertation research targets at enhancing the traffic safety and traffic efficiency, as well as developing novel commercial applications, based on VANETs, following four aspects: 1) accurate and efficient message aggregation to detect on-road safety relevant events, 2) reliable data dissemination to reliably notify remote vehicles, 3) efficient and reliable spatial data collection from vehicular sensors, and 4) novel promising applications to exploit the commercial potentials of VANETs. Specifically, to enable cooperative detection of safety relevant events on the roads, the structure-less message aggregation (SLMA) scheme is proposed to improve communication efficiency and message accuracy. The scheme of relative position based message dissemination (RPB-MD) is proposed to reliably and efficiently disseminate messages to all intended vehicles in the zone-of-relevance in varying traffic density. Due to numerous vehicular sensor data available based on VANETs, the scheme of compressive sampling based data collection (CS-DC) is proposed to efficiently collect the spatial relevance data in a large scale, especially in the dense traffic. In addition, with novel and efficient solutions proposed for the application specific issues of data dissemination and data collection, several appealing value-added applications for VANETs are developed to exploit the commercial potentials of VANETs, namely general purpose automatic survey (GPAS), VANET-based ambient ad dissemination (VAAD) and VANET based vehicle performance monitoring and analysis (VehicleView). Thus, by improving the efficiency and reliability in in-network data processing and dissemination, including message aggregation, data dissemination and data collection, together with the development of novel promising applications, this dissertation will help push VANETs further to the stage of massive deployment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a general purpose cloud system efficiencies are yet to be had from supporting diverse applications and their requirements within a storage system used for a private cloud. Supporting such diverse requirements poses a significant challenge in a storage system that supports fine grained configuration on a variety of parameters. This paper uses the Ceph distributed file system, and in particular its global parameters, to show how a single changed parameter can effect the performance for a range of access patterns when tested with an OpenStack cloud system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In Part I [""Fast Transforms for Acoustic Imaging-Part I: Theory,"" IEEE TRANSACTIONS ON IMAGE PROCESSING], we introduced the Kronecker array transform (KAT), a fast transform for imaging with separable arrays. Given a source distribution, the KAT produces the spectral matrix which would be measured by a separable sensor array. In Part II, we establish connections between the KAT, beamforming and 2-D convolutions, and show how these results can be used to accelerate classical and state of the art array imaging algorithms. We also propose using the KAT to accelerate general purpose regularized least-squares solvers. Using this approach, we avoid ill-conditioned deconvolution steps and obtain more accurate reconstructions than previously possible, while maintaining low computational costs. We also show how the KAT performs when imaging near-field source distributions, and illustrate the trade-off between accuracy and computational complexity. Finally, we show that separable designs can deliver accuracy competitive with multi-arm logarithmic spiral geometries, while having the computational advantages of the KAT.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mestrado em Engenharia Electrotécnica e de Computadores.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Trabalho Final de Mestrado para obtenção do grau de Mestre em Engenharia de Electrónica e Telecomunicações