888 results for Parallel processing (Electronic computers) - Research
Abstract:
Graduate Program in Agronomy (Energy in Agriculture) - FCA
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Graduate Program in Science Education - FC
Abstract:
Current scientific applications produce large amounts of data. Processing, handling and analyzing such data require large-scale computing infrastructures such as clusters and grids. In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have considered techniques of data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when optimizing data access. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on those properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that this new approach reduces application execution time by about 50 percent, especially when handling large amounts of data.
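The classify-then-predict idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's method: the two classes ("trend" and "stationary"), the R^2 threshold, the moving-average window, and all function names are assumptions chosen only to show how a property of a series can drive the choice of prediction model.

```python
from statistics import mean

def linear_fit(series):
    """Least-squares slope and intercept of the series against indices 0..n-1.
    Assumes at least two observations."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(series))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def classify(series, threshold=0.5):
    """Hypothetical classifier: 'trend' if a straight line explains more than
    `threshold` of the variance (R^2), otherwise 'stationary'."""
    slope, intercept = linear_fit(series)
    y_mean = mean(series)
    ss_tot = sum((y - y_mean) ** 2 for y in series)
    if ss_tot == 0:  # constant series: nothing to explain
        return "stationary"
    ss_res = sum((y - (intercept + slope * i)) ** 2 for i, y in enumerate(series))
    return "trend" if 1 - ss_res / ss_tot > threshold else "stationary"

def predict_next(series, window=3):
    """Select a model from the class label and predict the next observation."""
    kind = classify(series)
    if kind == "trend":
        slope, intercept = linear_fit(series)
        return kind, intercept + slope * len(series)  # linear extrapolation
    return kind, mean(series[-window:])               # moving average

# Example: bytes read per interval by a hypothetical application.
kind, forecast = predict_next([10, 20, 30, 40])
```

A storage system could use such a forecast to decide, for instance, how much data to prefetch or replicate before the next interval; that policy layer is outside this sketch.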
Abstract:
Hybrid technologies, thanks to the convergence of integrated microelectronic devices and a new class of microfluidic structures, could open new perspectives on how nanoscale events are discovered, monitored and controlled. The key point of this thesis is to evaluate the impact of such an approach on applications of ion-channel High Throughput Screening (HTS) platforms. This approach offers promising opportunities for the development of new classes of sensitive, reliable and cheap sensors. There are numerous advantages to embedding microelectronic readout structures tightly coupled to sensing elements. On the one hand, the signal-to-noise ratio is increased as a result of scaling. On the other, readout miniaturization allows sensors to be organized into arrays, increasing the capability of the platform in terms of the amount of acquired data, as required in the HTS approach, to improve sensing accuracy and reliability. However, accurate interface design is required to establish efficient communication between ionic-based and electronic-based signals. The work in this thesis shows a first example of a complete parallel readout system with single ion channel resolution, using a compact and scalable hybrid architecture suitable for interfacing to large arrays of sensors, ensuring simultaneous signal recording and smart control of the trade-off between signal-to-noise ratio and bandwidth. More specifically, an array of microfluidic polymer structures, hosting artificial lipid bilayer blocks in which single ion channel pores are embedded, is coupled with an array of ultra-low noise current amplifiers for signal amplification and data processing. As a working demonstration, the platform was used to acquire ultra-small currents arising from single non-covalent molecular binding events between alpha-hemolysin pores and beta-cyclodextrin molecules in artificial lipid membranes.
Abstract:
The term "Brain Imaging" identifies a set of techniques to analyze the structure and/or functional behavior of the brain in normal and/or pathological situations. These techniques are widely used in the study of brain activity. In addition to clinical usage, analysis of brain activity is gaining popularity in other recent fields, e.g. Brain Computer Interfaces (BCI) and the study of cognitive processes. In this context, the use of classical solutions (e.g. fMRI, PET-CT) can be unfeasible due to their low temporal resolution, high cost and limited portability. For these reasons, alternative low-cost techniques are the object of research, typically based on simple recording hardware and on intensive data processing. Typical examples are ElectroEncephaloGraphy (EEG) and Electrical Impedance Tomography (EIT), where the electric potential at the patient's scalp is recorded by high impedance electrodes. In EEG, potentials are generated directly by neuronal activity, while in EIT they are generated by the injection of small currents at the scalp. To retrieve meaningful insights on brain activity from measurements, EIT and EEG rely on detailed knowledge of the underlying electrical properties of the body. This is obtained from numerical models of the electric field distribution therein. The inhomogeneous and anisotropic electric properties of human tissues make accurate modeling and simulation very challenging, leading to a tradeoff between physical accuracy and technical feasibility, which currently severely limits the capabilities of these techniques. Moreover, processing the recorded data requires computationally intensive regularization techniques, which affects applications with tight timing constraints (such as BCI). This work focuses on the parallel implementation of a workflow for EEG and EIT data processing.
The resulting software is accelerated using multi-core GPUs in order to provide solutions in reasonable time and address the requirements of real-time BCI systems, without over-simplifying the complexity and accuracy of the head models.
Abstract:
This thesis explores the capabilities of heterogeneous multi-core systems based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated deskside computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had previously been considered unfeasible. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing a crucial kernel in the computation of multi-mesh head models for electroencephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself, achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-Hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only a highly optimized multi-CPU reference, but related GPU-based work as well.
Finally, a GPU-accelerated graphical software package for real-time EEG source localization was implemented. Thanks to acceleration, it can meet real-time requirements in unprecedented anatomical detail while running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scans has been included.
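To make the triangular matrix inversion kernel mentioned in the abstract above more concrete, here is a minimal CPU-only Python sketch. It does not reproduce the thesis's GPU-friendly storage scheme or its dual-GPU algorithm, which the abstract does not describe; it assumes the conventional row-major packed layout for a lower triangular matrix and plain forward substitution, and the helper names are hypothetical.

```python
def idx(i, j):
    """Position of element (i, j), with j <= i, in row-major packed storage."""
    return i * (i + 1) // 2 + j

def pack_lower(dense):
    """Keep only the lower triangle: n*(n+1)/2 values instead of n*n."""
    n = len(dense)
    return [dense[i][j] for i in range(n) for j in range(i + 1)]

def invert_packed_lower(packed, n):
    """Invert a non-singular lower triangular matrix held in packed storage.
    The inverse is also lower triangular and is returned in the same layout."""
    inv = [0.0] * len(packed)
    for j in range(n):  # each column depends only on itself
        inv[idx(j, j)] = 1.0 / packed[idx(j, j)]
        for i in range(j + 1, n):
            # Forward substitution: sum over already computed entries of column j.
            s = sum(packed[idx(i, k)] * inv[idx(k, j)] for k in range(j, i))
            inv[idx(i, j)] = -s / packed[idx(i, i)]
    return inv

# Example: L = [[2, 0], [1, 4]] has inverse [[0.5, 0], [-0.125, 0.25]].
packed = pack_lower([[2.0, 0.0], [1.0, 4.0]])
inverse = invert_packed_lower(packed, 2)
```

Because each column of the inverse is computed independently of the others, the outer loop is a natural candidate for parallel execution, which is one reason this kernel maps well onto many-core hardware.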
Abstract:
Ultrasound imaging is widely used in medical diagnostics as it is the fastest, least invasive, and least expensive imaging modality. However, ultrasound images are intrinsically difficult to interpret. In this scenario, Computer Aided Detection (CAD) systems can be used to support physicians during diagnosis by providing them with a second opinion. This thesis discusses efficient ultrasound processing techniques for computer aided medical diagnostics, focusing on two major topics: (i) Ultrasound Tissue Characterization (UTC), aimed at characterizing and differentiating between healthy and diseased tissue; (ii) Ultrasound Image Segmentation (UIS), aimed at detecting the boundaries of anatomical structures to automatically measure organ dimensions and compute clinically relevant functional indices. Research on UTC produced a CAD tool for prostate cancer detection to improve the biopsy protocol. In particular, this thesis contributes: (i) the development of a robust classification system; (ii) the exploitation of parallel computing on GPU for real-time performance; (iii) the introduction of both an innovative Semi-Supervised Learning algorithm and a novel supervised/semi-supervised learning scheme for CAD system training, which improve system performance while reducing data collection effort and avoiding waste of collected data. The tool provides physicians with a risk map highlighting suspect tissue areas, allowing them to perform a lesion-directed biopsy. Clinical validation demonstrated the system's validity as a diagnostic support tool and its effectiveness at reducing the number of biopsy cores required for an accurate diagnosis. For UIS, the research developed a heart disease diagnostic tool based on Real-Time 3D Echocardiography. The thesis contributions to this application are: (i) the development of an automated GPU-based level-set segmentation framework for 3D images; (ii) the application of this framework to myocardium segmentation.
Experimental results showed the high efficiency and flexibility of the proposed framework. Its effectiveness as a tool for quantitative analysis of 3D cardiac morphology and function was demonstrated through clinical validation.
Abstract:
Despite the several issues faced in the past, the evolutionary trend of silicon has kept its constant pace. Today an ever increasing number of cores is integrated onto the same die. Unfortunately, the extraordinary performance achievable by the many-core paradigm is limited by several factors. Memory bandwidth limitation, combined with inefficient synchronization mechanisms, can severely undermine the potential computational capabilities. Moreover, the huge HW/SW design space requires accurate and flexible tools to perform architectural explorations and validation of design choices. In this thesis we focus on the aforementioned aspects: a flexible and accurate Virtual Platform has been developed, targeting a reference many-core architecture. This tool has been used to perform architectural explorations, focusing on the instruction caching architecture and on hybrid HW/SW synchronization mechanisms. Besides architectural implications, another issue of embedded systems is considered: energy efficiency. Near Threshold Computing (NTC) is a key research area in the Ultra-Low-Power domain, as it promises a tenfold improvement in energy efficiency compared to super-threshold operation and it mitigates thermal bottlenecks. The physical implications of modern deep sub-micron technology are severely limiting the performance and reliability of modern designs. Reliability becomes a major obstacle when operating in NTC; in particular, memory operation becomes unreliable and can compromise system correctness. In the present work a novel hybrid memory architecture is devised to overcome reliability issues and at the same time improve energy efficiency by means of aggressive voltage scaling when allowed by workload requirements. Variability is another great drawback of near-threshold operation. The greatly increased sensitivity to threshold voltage variations is today a major concern for electronic devices. We introduce a variation-tolerant extension of the baseline many-core architecture.
By means of micro-architectural knobs and a lightweight runtime control unit, the baseline architecture becomes dynamically tolerant to variations.
Abstract:
This is the first part of a study investigating a model-based transient calibration process for diesel engines. The motivation is to populate hundreds of parameters (which can be calibrated) in a methodical and optimum manner by using model-based optimization in conjunction with the manual process so that, relative to the manual process used by itself, a significant improvement in transient emissions and fuel consumption and a sizable reduction in calibration time and test cell requirements are achieved. Empirical transient modelling and optimization are addressed in the second part of this work, while the required data for model training and generalization are the focus of the current work. Transient and steady-state data from a turbocharged multicylinder diesel engine have been examined from a model training perspective. A single-cylinder engine with external air-handling has been used to expand the steady-state data to encompass the transient parameter space. Based on comparative model performance and differences in the non-parametric space, primarily driven by a high difference between exhaust and intake manifold pressures (engine ΔP) during transients, it has been recommended that transient emission models should be trained with transient training data. It has been shown that electronic control module (ECM) estimates of transient charge flow and the exhaust gas recirculation (EGR) fraction cannot be accurate at the high engine ΔP frequently encountered during transient operation, and that such estimates do not account for cylinder-to-cylinder variation. The effects of high engine ΔP must therefore be incorporated empirically by using transient data generated from a spectrum of transient calibrations. Specific recommendations on how to choose such calibrations, how much data to acquire, and how to specify transient segments for data acquisition have been made. Methods to process transient data to account for transport delays and sensor lags have been developed.
The processed data have then been visualized using statistical means to understand transient emission formation. Two modes of transient opacity formation have been observed and described. The first mode is driven by high engine ΔP and low fresh air flowrates, while the second mode is driven by high engine ΔP and high EGR flowrates. The EGR fraction is inaccurately estimated in both modes, while uneven EGR distribution has been shown to be present but unaccounted for by the ECM. The two modes and associated phenomena are essential to understanding why transient emission models are calibration dependent and, furthermore, how to choose training data that will result in good model generalization.
Abstract:
Small clusters of gallium oxide, a technologically important high-temperature ceramic, together with the interaction of nucleic acid bases with graphene and a small-diameter carbon nanotube, are the focus of the first principles calculations in this work. A high performance parallel computing platform was also developed at Michigan Tech to perform these calculations. The first principles calculations are based on density functional theory, employing either the local density or the gradient-corrected approximation together with plane wave and Gaussian basis sets. Bulk Ga2O3 is known to be a very good candidate for fabricating electronic devices that operate at high temperatures. To explore the properties of Ga2O3 at the nanoscale, we have performed a systematic theoretical study of small polyatomic gallium oxide clusters. The calculated results show that all lowest energy isomers of GamOn clusters are dominated by Ga-O bonds over metal-metal or oxygen-oxygen bonds. Analysis of atomic charges suggests that the clusters are highly ionic, similar to the case of bulk Ga2O3. In the study of sequential oxidation of these clusters starting from Ga2O, it is found that the most stable isomers display up to four different backbones of constituent atoms. Furthermore, the predicted configuration of the ground state of Ga2O was recently confirmed by the experimental result of Neumark's group. Guided by the results of the gallium oxide cluster calculations, the performance-related challenge of computational simulations, namely producing high performance computing platforms, has been addressed. Several engineering aspects were thoroughly studied during the design, development and implementation of the high performance parallel computing platform, rama, at Michigan Tech. In an attempt to stay true to the principles of the Beowulf revolution, the rama cluster was extensively customized to make it easy to understand and use, for administrators as well as end users.
Following the results of benchmark calculations, and to keep up with the complexity of the systems under study, rama has been expanded to a total of sixty-four processors. Interest in the non-covalent interaction of DNA with carbon nanotubes has steadily increased during the past several years. This hybrid system, at the junction of the biological regime and the nanomaterials world, possesses features which make it very attractive for a wide range of applications. Using the in-house computational power available, we have studied details of the interaction of nucleic acid bases with a graphene sheet as well as with a high-curvature, small-diameter carbon nanotube. The calculated trend in the binding energies strongly suggests that the polarizability of the base molecules determines the interaction strength of the nucleic acid bases with graphene. When comparing the results obtained here for physisorption on the small-diameter nanotube with those from the study on graphene, it is observed that the interaction strength of the nucleic acid bases is smaller for the tube. Thus, these results show that the effect of introducing curvature is to reduce the binding energy. The binding energies for the two extreme cases of negligible curvature (i.e. a flat graphene sheet) and of very high curvature (i.e. a small-diameter nanotube) may be considered as upper and lower bounds. This finding represents an important step towards a better understanding of the experimentally observed sequence-dependent interaction of DNA with carbon nanotubes.