992 results for Graphical processing unit
Abstract:
The Optical, Spectroscopic, and Infrared Remote Imaging System (OSIRIS) is the scientific camera system onboard the Rosetta spacecraft (Figure 1). The advanced high-performance imaging system will be pivotal for the success of the Rosetta mission. OSIRIS will detect 67P/Churyumov-Gerasimenko from a distance of more than 10⁶ km, characterise the comet's shape, volume and rotational state, and find a suitable landing spot for Philae, the Rosetta lander. OSIRIS will observe the nucleus, its activity and surroundings down to a scale of ~2 cm px⁻¹. The observations will begin well before the onset of cometary activity and will extend over months until the comet reaches perihelion. During the rendezvous episode of the Rosetta mission, OSIRIS will provide key information about the nature of cometary nuclei and reveal the physics of cometary activity that leads to the gas and dust coma. OSIRIS comprises a high-resolution Narrow Angle Camera (NAC) unit and a Wide Angle Camera (WAC) unit accompanied by three electronics boxes. The NAC is designed to obtain high-resolution images of the surface of comet 67P/Churyumov-Gerasimenko through 12 discrete filters over the wavelength range 250–1000 nm at an angular resolution of 18.6 μrad px⁻¹. The WAC is optimised to provide images of the near-nucleus environment in 14 discrete filters at an angular resolution of 101 μrad px⁻¹. The two units use identical shutter, filter wheel, front door, and detector systems. They are operated by a common Data Processing Unit. The OSIRIS instrument has a total mass of 35 kg and is provided by institutes from six European countries.
Abstract:
Self-organising neural models can provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, the learning process must be adapted so that it completes within a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation accelerates the learning process while keeping a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of the CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. To validate the proposal, the GPU implementation has also been applied to a real application with time constraints: acceleration of 3D scene reconstruction for egomotion.
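As a rough illustration of where the parallelism in GNG learning comes from, the sketch below shows the winner search for one input sample: the distance to every neuron can be computed independently, which is the kind of step that maps naturally onto GPU threads. This is a minimal NumPy sketch, not the paper's CUDA code; the function names and the learning rate `eps_b` are assumptions chosen for illustration.

```python
import numpy as np

def find_two_nearest(neurons, x):
    """Return the indices of the nearest and second-nearest neurons to x.

    neurons: (n_neurons, dim) array of reference vectors (hypothetical layout).
    x:       (dim,) input sample.
    """
    # Squared Euclidean distance to every neuron; each entry is independent,
    # so this is the data-parallel part of the search.
    d2 = np.sum((neurons - x) ** 2, axis=1)
    order = np.argsort(d2)
    return int(order[0]), int(order[1])

def adapt_winner(neurons, x, winner, eps_b=0.05):
    """Move the winning neuron a fraction eps_b towards the sample (in place)."""
    neurons[winner] += eps_b * (x - neurons[winner])
```

On a GPU the distance computation would typically be followed by a parallel reduction to find the two minima; the NumPy `argsort` call stands in for that step here.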
Abstract:
In recent years, graphics processing units (GPUs) have been used increasingly in general-purpose applications, beyond the purpose for which they were created, which was none other than computer graphics rendering. This growth is due in part to the evolution these devices have undergone over this period, which has given them great computing power and extended their use from personal computers to large clusters. This fact, together with the proliferation of low-cost RGB-D sensors, has increased the number of vision applications that use this technology to solve problems, as well as to develop new applications. These improvements have been made not only in hardware, that is, in the devices, but also in software, with the appearance of new development tools that make these GPU devices easier to program. This new paradigm was coined General-Purpose computation on Graphics Processing Units (GPGPU). GPU devices are classified into different families according to their hardware characteristics. Each new family incorporates technological improvements that allow it to achieve better performance than its predecessors. Nevertheless, to obtain optimal performance from a GPU device, it must be configured correctly before use. This configuration is determined by the values assigned to a set of device parameters. Therefore, many of the implementations that currently use GPU devices for dense registration of 3D point clouds could improve their performance with an optimal configuration of these parameters, depending on the device used. 
For this reason, given the lack of a detailed study of how strongly GPU parameters affect the final performance of an implementation, carrying out this study was considered highly worthwhile. The study was performed not only with different GPU parameter configurations but also with different GPU device architectures. Its objective is to provide a decision tool that helps developers when implementing applications for GPU devices. One of the research fields in which the use of these technologies is growing fastest is robotics, since traditionally in robotics, especially mobile robotics, combinations of expensive sensors of different kinds, such as laser, sonar or contact sensors, were used to obtain data from the environment. These data were later used in computer vision applications with a very high computational cost. All of this cost, both the economic cost of the sensors and the computational cost, has been reduced considerably thanks to these new technologies. Among the most widely used computer vision applications is point cloud registration. This process is, in general, the transformation of different point clouds into a known coordinate system. The data may come from photographs, different sensors, etc. It is used in fields such as computer vision, medical imaging, object recognition, and the analysis of satellite images and data. Registration is used to compare or integrate data obtained from different measurements. This work reviews the state of the art of 3D registration methods. It also presents an in-depth study of the most widely used 3D registration method, Iterative Closest Point (ICP), and one of its best-known variants, Expectation-Maximization ICP (EMICP). 
This study covers both their sequential implementation and their parallel implementation on GPU devices, focusing on how the different GPU parameter configurations affect their performance. As a result of this study, a proposal is also presented to improve the use of GPU device memory, allowing work with larger point clouds and reducing the problem of the memory limit imposed by the device. The behaviour of the 3D registration methods used in this work depends largely on the initialization of the problem. Here, that initialization consists of correctly choosing the transformation matrix with which the algorithm starts. Because this aspect is very important in this type of algorithm, since it determines whether the solution is reached sooner, later, or never at all, this work presents a study of the transformation space with the aim of characterizing it and easing the choice of the initial transformation to use in these algorithms.
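To make the role of the initial transformation concrete, the following is a minimal sketch of a point-to-point ICP loop, assuming brute-force nearest-neighbour matching and a closed-form rigid fit via SVD (the Kabsch method). It is not the thesis's sequential or GPU implementation, and it does not cover EMICP; the function names and the `R0`, `t0` and `iters` parameters are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that R @ src[i] + t ≈ dst[i]."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(src, dst, R0=None, t0=None, iters=20):
    """Align src (N, 3) to dst (M, 3), starting from the initial guess (R0, t0)."""
    R = np.eye(3) if R0 is None else R0
    t = np.zeros(3) if t0 is None else t0
    for _ in range(iters):
        moved = src @ R.T + t
        # Brute-force nearest neighbour: every (source, target) pair is
        # independent, which is the step usually offloaded to the GPU.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        # Re-estimate the full transform from the current correspondences.
        R, t = best_rigid_transform(src, dst[nearest])
    return R, t
```

Because the correspondences in each iteration depend on the current estimate, a poor `(R0, t0)` can lead the loop to a wrong local minimum, which is why the choice of initial transformation matters. Note also that the full distance matrix needs N × M entries, so this naive form runs into memory limits for large clouds.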
Abstract:
The focus of this study is the development of parallelised versions of severely sequential and iterative numerical algorithms on a multi-threaded parallel platform such as a graphics processing unit. This requires the design and development of a platform-specific numerical solution that can benefit from the parallel capabilities of the chosen platform. The graphics processing unit was chosen as the parallel platform for the design and development of a numerical solution for a specific physical model in non-linear optics. This problem appears in describing ultra-short pulse propagation in bulk transparent media, which has recently been the subject of several theoretical and numerical studies. The mathematical model describing this phenomenon is challenging and complex, and its numerical modelling is limited on current workstations. Numerical modelling of this problem requires the parallelisation of essentially serial algorithms and the elimination of numerical bottlenecks. The main challenge to overcome is the parallelisation of the globally non-local mathematical model. This thesis presents a numerical solution that eliminates the numerical bottleneck associated with the non-local nature of the mathematical model. The accuracy and performance of the parallel code are assessed by back-to-back testing against a similar serial version.
Abstract:
Implementation of a Monte Carlo simulation for the solution of population balance equations (PBEs) requires choice of initial sample number ($N_0$), number of replicates ($M$), and number of bins for probability distribution reconstruction ($n$). It is found that the squared Hellinger distance, $H^2$, is a useful measure of the accuracy of Monte Carlo (MC) simulation, and can be related directly to $N_0$, $M$, and $n$. Asymptotic approximations of $H^2$ are deduced and tested for both one-dimensional (1-D) and 2-D PBEs with coalescence. The central processing unit (CPU) cost, $C$, is found to follow a power-law relationship, $C = a M N_0^{b}$, with the CPU cost index, $b$, indicating the weighting of $N_0$ in the total CPU cost. $n$ must be chosen to balance accuracy and resolution. For fixed $n$, $M \times N_0$ determines the accuracy of MC prediction; if $b > 1$, then the optimal solution strategy uses multiple replications and small sample size. Conversely, if $0 < b < 1$, one replicate and a large initial sample size is preferred. © 2015 American Institute of Chemical Engineers AIChE J, 61: 2394–2402, 2015
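The trade-off between replicates and sample size can be checked with a few lines of arithmetic. The sketch below assumes the common discrete form of the squared Hellinger distance, H² = ½ Σ (√p_i − √q_i)², applied to binned distributions; the paper may normalise it differently. The cost function simply evaluates the power law quoted in the abstract, and the values of a, b, M and N0 in the usage note are made-up illustrations, not results from the paper.

```python
import numpy as np

def squared_hellinger(p, q):
    """Squared Hellinger distance between two binned distributions (assumed discrete form)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalise bin counts so both histograms sum to one.
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def cpu_cost(a, M, N0, b):
    """CPU cost from the power law C = a * M * N0**b."""
    return a * M * N0 ** b
```

For a fixed product M × N0 = 1000, `cpu_cost(1.0, 10, 100, 2.0)` is smaller than `cpu_cost(1.0, 1, 1000, 2.0)`, so with b > 1 many small replicates are cheaper; with b = 0.5 the ordering reverses and a single large run is cheaper, consistent with the strategy described in the abstract.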
Abstract:
Femtosecond laser microfabrication has emerged over the last decade as a flexible 3D technology in photonics. Numerical simulations provide important insight into spatial and temporal beam and pulse shaping during the course of extremely intricate nonlinear propagation (see e.g. [1,2]). The electromagnetics of such propagation is typically described by the generalized Non-Linear Schrödinger Equation (NLSE) coupled with the Drude model for plasma [3]. In this paper we consider a multi-threaded parallel numerical solution for a specific model which describes femtosecond laser pulse propagation in transparent media [4, 5]. However, our approach can be extended to similar models. The numerical code is implemented on an NVIDIA Graphics Processing Unit (GPU), which provides an efficient hardware platform for multi-threaded computing. We compare the performance of the parallel code described below, implemented for the GPU using the CUDA programming interface [3], with a serial CPU version used in our previous papers [4,5]. © 2011 IEEE.
Abstract:
In the oil industry, natural gas is a vital component of the world energy supply and an important source of hydrocarbons. It is one of the cleanest, safest and most relevant of all energy sources, and helps to meet the world's growing demand for clean energy. With the growing share of natural gas in the Brazilian energy matrix, its main use has been the supply of electricity by thermal power generation. In the current production process, as in a Natural Gas Processing Unit (NGPU), natural gas passes through various separation units aimed at producing liquefied natural gas and fuel gas. The latter must be specified to meet the requirements of the thermal machines. In the case of remote wells, absorption of heavy components is intended to bring the fuel gas within specification for its application and is thereby an alternative for expanding the energy matrix. Currently, due to the high demand for this raw gas, research and development techniques aimed at adjusting natural gas are being studied. Conventional methods employed today, such as physical absorption, show good results. The objective of this dissertation is to evaluate the removal of heavy components of natural gas by absorption. In this research, octyl alcohol (1-octanol) was used as the absorbent. The influence of temperature (5 and 40 °C) and flowrate (25 and 50 ml/min) on the absorption process was studied. Absorption capacity, expressed by the amount absorbed, and kinetic parameters, expressed by the mass transfer coefficient, were evaluated. As expected from the literature, it was observed that the absorption of the heavy hydrocarbon fraction is favored by lowering the temperature. Moreover, both temperature and flowrate favor mass transfer (kinetic effect). The absorption kinetics for removal of heavy components was monitored by chromatographic analysis, and the experimental results demonstrated a high percentage of recovery of heavy components. 
Furthermore, it was observed that the use of octyl alcohol as absorbent was feasible for the requested separation process.
Abstract:
A major weakness of the loading models for pedestrians walking on flexible structures proposed in recent years is the various uncorroborated assumptions made in their development. This applies to the spatio-temporal characteristics of pedestrian loading and the nature of multi-object interactions. To alleviate this problem, a framework for the determination of localised pedestrian forces on full-scale structures is presented using a wireless attitude and heading reference system (AHRS). An AHRS comprises a triad of tri-axial accelerometers, gyroscopes and magnetometers managed by a dedicated data processing unit, allowing motion in three-dimensional space to be reconstructed. A pedestrian loading model based on a single-point inertial measurement from an AHRS is derived and shown to perform well against benchmark data collected on an instrumented treadmill. Unlike other models, the current model does not take any predefined form, nor does it require any extrapolations as to the timing and amplitude of pedestrian loading. In order to assess correctly the influence of the moving pedestrian on the behaviour of a structure, an algorithm for tracking the point of application of pedestrian force is developed based on data from a single AHRS attached to a foot. A set of controlled walking tests with a single pedestrian is conducted on a real footbridge for validation purposes. A remarkably good match between the measured and simulated bridge response is found, confirming the applicability of the proposed framework.
Abstract:
The problem addressed concerns the determination of the average number of successive attempts of guessing a word of a certain length consisting of letters with given probabilities of occurrence. Both first- and second-order approximations to a natural language are considered. The guessing strategy used is guessing words in decreasing order of probability. When word and alphabet sizes are large, approximations are necessary in order to estimate the number of guesses. Several kinds of approximations are discussed demonstrating moderate requirements regarding both memory and central processing unit (CPU) time. When considering realistic sizes of alphabets and words (100), the number of guesses can be estimated within minutes with reasonable accuracy (a few percent) and may therefore constitute an alternative to, e.g., various entropy expressions. For many probability distributions, the density of the logarithm of probability products is close to a normal distribution. For those cases, it is possible to derive an analytical expression for the average number of guesses. The proportion of guesses needed on average compared to the total number decreases almost exponentially with the word length. The leading term in an asymptotic expansion can be used to estimate the number of guesses for large word lengths. Comparisons with analytical lower bounds and entropy expressions are also provided.
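For very small alphabets and word lengths, the quantity being approximated can be computed exactly by brute force, which also shows why approximations are needed at realistic sizes. The sketch below assumes the first-order case, in which letters are independent and a word's probability is the product of its letter probabilities; the function name is hypothetical and the exhaustive enumeration is only an illustration, not one of the paper's approximation methods.

```python
from itertools import product
from math import prod

def average_guesses(letter_probs, word_length):
    """Expected number of guesses when words are tried in decreasing order of probability.

    Assumes independent letters (first-order approximation). The enumeration
    visits len(letter_probs) ** word_length words, so it is only feasible for
    tiny alphabets and short words.
    """
    word_probs = sorted(
        (prod(word) for word in product(letter_probs, repeat=word_length)),
        reverse=True,
    )
    # The k-th most probable word is found on the k-th guess.
    return sum(rank * p for rank, p in enumerate(word_probs, start=1))
```

With two equally likely letters and words of length 2, all four words have probability 0.25 and the average is (1 + 2 + 3 + 4) / 4 = 2.5 guesses.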
Abstract:
Finding rare events in multidimensional data is an important detection problem that has applications in many fields, such as risk estimation in the insurance industry, finance, flood prediction, medical diagnosis, quality assurance, security, or safety in transportation. The occurrence of such anomalies is so infrequent that there is usually not enough training data to learn an accurate statistical model of the anomaly class. In some cases, such events may never have been observed, so the only information available is a set of normal samples and an assumed pairwise similarity function. Such a metric may only be known up to a certain number of unspecified parameters, which would need to be either learned from training data or fixed by a domain expert. Sometimes, the anomalous condition may be formulated algebraically, such as a measure exceeding a predefined threshold, but nuisance variables may complicate the estimation of such a measure. Change detection methods used in time series analysis are not easily extendable to the multidimensional case, where discontinuities are not localized to a single point. On the other hand, in higher dimensions, data exhibit more complex interdependencies, and there is redundancy that could be exploited to adaptively model the normal data. In the first part of this dissertation, we review the theoretical framework for anomaly detection in images and previous anomaly detection work done in the context of crack detection and detection of anomalous components in railway tracks. In the second part, we propose new anomaly detection algorithms. The fact that curvilinear discontinuities in images are sparse with respect to the frame of shearlets allows us to pose this anomaly detection problem as basis pursuit optimization. Therefore, we pose the problem of detecting curvilinear anomalies in noisy textured images as a blind source separation problem under sparsity constraints, and propose an iterative shrinkage algorithm to solve it. 
Taking advantage of the parallel nature of this algorithm, we describe how this method can be accelerated using graphical processing units (GPU). Then, we propose a new method for finding defective components on railway tracks using cameras mounted on a train. We describe how to extract features and use a combination of classifiers to solve this problem. Then, we scale anomaly detection to bigger datasets with complex interdependencies. We show that the anomaly detection problem naturally fits in the multitask learning framework. The first task consists of learning a compact representation of the good samples, while the second task consists of learning the anomaly detector. Using deep convolutional neural networks, we show that it is possible to train a deep model with a limited number of anomalous examples. In sequential detection problems, the presence of time-variant nuisance parameters affects the detection performance. In the last part of this dissertation, we present a method for adaptively estimating the threshold of sequential detectors using Extreme Value Theory in a Bayesian framework. Finally, conclusions on the results obtained are provided, followed by a discussion of possible future work.
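The generic form of an iterative shrinkage algorithm helps show why it parallelises well: each iteration is a matrix-vector product followed by an element-wise soft threshold. The sketch below is the textbook ISTA scheme for minimising ½‖Ax − y‖² + λ‖x‖₁ with a dense matrix `A`; it is an assumption-laden stand-in, not the dissertation's shearlet-based source separation algorithm, and the names `ista`, `lam` and `iters` are illustrative.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise shrinkage: pull each entry towards zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam, iters=200):
    """Iterative shrinkage-thresholding for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # (the squared spectral norm of A).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)                   # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)  # shrinkage step
    return x
```

The shrinkage step touches every coefficient independently, which is the property that makes this family of methods a natural fit for GPU acceleration.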
Abstract:
This work presents a low-cost architecture for the development of synchronized phasor measurement units (PMUs). The device is intended to be connected to the low-voltage grid, which allows the monitoring of transmission and distribution networks. The project covers a complete PMU: an instrumentation module for use in the low-voltage network; a GPS module to provide the synchronization signal and time stamp for the measurements; a processing unit that handles acquisition, phasor estimation and data formatting according to the standard; and a communication module for data transmission. To develop the PMU and evaluate its performance, a set of applications was developed in the LabVIEW environment with specific features that make it possible to analyze the behavior of the measurements, identify the PMU's sources of error, and apply all the tests proposed by the standard. The first application, useful for developing the instrumentation, consists of a function generator integrated with an oscilloscope, which allows signals to be generated and acquired synchronously and samples to be manipulated. The second and main application is the test platform, capable of generating all the tests specified by the synchronized phasor measurement standard IEEE C37.118.1 and of storing data or analyzing the measurements in real time. Finally, a third application was developed to evaluate the test results and generate calibration curves to adjust the PMU. The results include all the tests proposed by the synchrophasor standard and an additional test that evaluates the impact of noise. Moreover, two prototypes connected to the electrical installations of consumers on the same distribution circuit produced monitoring records that allowed the identification of consumer loads and power quality analysis, as well as the detection of events at the distribution and transmission levels.
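As an illustration of the phasor estimation step, the sketch below uses a single-cycle DFT, a common textbook estimator that assumes the sample window spans exactly one cycle of the fundamental. The abstract does not state which estimator the PMU uses, so this is only an assumed example. The error metric is the total vector error (TVE), the normalised magnitude of the difference between the estimated and reference phasors; the function names and signal values are hypothetical.

```python
import numpy as np

def dft_phasor(samples):
    """Estimate the RMS phasor of the fundamental from one full cycle of samples."""
    n = len(samples)
    k = np.arange(n)
    # Correlate with one cycle of a complex exponential; the sqrt(2)/n factor
    # converts the DFT bin to an RMS-scaled phasor.
    return (np.sqrt(2) / n) * np.sum(samples * np.exp(-2j * np.pi * k / n))

def total_vector_error(estimated, reference):
    """Total vector error: |estimated - reference| / |reference|."""
    return abs(estimated - reference) / abs(reference)
```

For example, sampling √2 · 127 · cos(2πk/64 + π/6) over 64 points and passing the result to `dft_phasor` recovers a phasor of magnitude 127 and angle π/6 up to rounding error. Off-nominal frequency breaks the one-cycle assumption and introduces leakage, which is one of the error sources that dedicated tests are meant to expose.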
Abstract:
This thesis develops and tests various transient and steady-state computational models, such as direct numerical simulation (DNS), large eddy simulation (LES), filtered unsteady Reynolds-averaged Navier-Stokes (URANS) and steady Reynolds-averaged Navier-Stokes (RANS), with and without a magnetic field, to investigate turbulent flows in canonical geometries as well as in the nozzle and mold geometries of the continuous casting process. The direct numerical simulations are first performed in channel, square and 2:1 aspect-ratio rectangular ducts to investigate the effect of a magnetic field on turbulent flows. The rectangular duct is a more practical geometry for the continuous casting nozzle and mold and offers the option of applying the magnetic field perpendicular to either the broader or the shorter side. This work forms part of the development of a graphic processing unit (GPU) based CFD code (CU-FLOW) for magnetohydrodynamic (MHD) turbulent flows. The DNS results revealed interesting effects of the magnetic field and its orientation on primary and secondary flows (instantaneous and mean), Reynolds stresses, turbulent kinetic energy (TKE) budgets, momentum budgets and frictional losses, besides providing a DNS database for two-wall bounded square and rectangular duct MHD turbulent flows. Further, the low- and high-Reynolds number RANS models (k-ε and Reynolds stress models) are developed and tested with DNS databases for channel and square duct flows with and without a magnetic field. The MHD sink terms in the k- and ε-equations are implemented as proposed by Kenjereš and Hanjalić using a user defined function (UDF) in FLUENT. This work revealed varying accuracies of different RANS models at different levels. It helps industry, including continuous casting, to understand the accuracies of these models. 
After establishing the accuracy and computational cost of the RANS models, the steady-state k-ε model is combined with particle image velocimetry (PIV) and impeller probe velocity measurements in a 1/3-scale water model to study the flow quality coming out of the well- and mountain-bottom nozzles and the effect of stopper-rod misalignment on fluid flow. The mountain-bottom nozzle was found to be more prone to long-time asymmetries and higher surface velocities. Misaligning the stopper to the left gave higher surface velocity on the right, leading to a significantly larger number of vortices forming behind the nozzle on the left. Later, the transient and steady-state models such as LES, filtered URANS and steady RANS are combined with ultrasonic Doppler velocimetry (UDV) measurements in a GaInSn model of a typical continuous casting process. LES-CU-FLOW is the fastest and the most accurate model owing to a much finer mesh and a smaller timestep. This work provided a good understanding of the performance of these models. The behavior of instantaneous flows, Reynolds stresses and proper orthogonal decomposition (POD) analysis quantified the nozzle bottom swirl and its importance for the turbulent flow in the mold. Afterwards, the aforementioned work in the GaInSn model is extended with electromagnetic braking (EMBr) to help optimize a ruler-type brake and its location for the continuous casting process. The magnetic field suppressed turbulence and promoted vortical structures with their axes aligned with the magnetic field, suggesting a tendency towards 2-D turbulence. The stronger magnetic field at the nozzle well and around the jet region created large-scale, lower-frequency flow behavior by suppressing nozzle bottom swirl and its front-back alternation. Based on this work, it is advised to avoid a stronger magnetic field around the jet and nozzle bottom in order to obtain a more stable and less defect-prone flow.
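Proper orthogonal decomposition of a set of flow snapshots is commonly computed with a singular value decomposition, as in the sketch below. It assumes the snapshots are stored as columns of a matrix (one column per time instant) and that the time mean is removed first; the function name and data layout are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

def pod(snapshots):
    """Snapshot POD via SVD.

    snapshots: (n_points, n_times) array, one flow field per column.
    Returns the spatial modes, the fraction of fluctuation energy captured
    by each mode, and the temporal coefficients of each mode.
    """
    # POD is applied to fluctuations about the time-averaged field.
    mean_field = snapshots.mean(axis=1, keepdims=True)
    fluct = snapshots - mean_field
    modes, s, vt = np.linalg.svd(fluct, full_matrices=False)
    energy = s ** 2 / np.sum(s ** 2)
    coeffs = s[:, None] * vt  # row i is the time history of mode i
    return modes, energy, coeffs
```

The leading modes (largest energy fractions) are the ones usually inspected for coherent structures such as a swirl, and `modes @ coeffs` plus the mean field reconstructs the original snapshots.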
Abstract:
The Solar Intensity X-ray and particle Spectrometer (SIXS) on board BepiColombo's Mercury Planetary Orbiter (MPO) will study solar energetic particles moving towards Mercury and solar X-rays on the dayside of Mercury. The SIXS instrument consists of two detector sub-systems; X-ray detector SIXS-X and particle detector SIXS-P. The SIXS-P subdetector will detect solar energetic electrons and protons in a broad energy range using a particle telescope approach with five outer Si detectors around a central CsI(Tl) scintillator. The measurements made by the SIXS instrument are necessary for other instruments on board the spacecraft. SIXS data will be used to study the Solar X-ray corona, solar flares, solar energetic particles, the Hermean magnetosphere, and solar eruptions. The SIXS-P detector was calibrated by comparing experimental measurement data from the instrument with Geant4 simulation data. Calibration curves were produced for the different side detectors and the core scintillator for electrons and protons, respectively. The side detector energy response was found to be linear for both electrons and protons. The core scintillator energy response to protons was found to be non-linear. The core scintillator calibration for electrons was omitted due to insufficient experimental data. The electron and proton acceptance of the SIXS-P detector was determined with Geant4 simulations. Electron and proton energy channels are clean in the main energy range of the instrument. At higher energies, protons and electrons produce non-ideal response in the energy channels. Due to the limited bandwidth of the spacecraft's telemetry, the particle measurements made by SIXS-P have to be pre-processed in the data processing unit of the SIXS instrument. A lookup table was created for the pre-processing of data with Geant4 simulations, and the ability of the lookup table to provide spectral information from a simulated electron event was analysed. 
The lookup table produces clean electron and proton channels and is able to separate protons and electrons. Based on a simulated solar energetic electron event, the incident electron spectrum cannot be determined from channel particle counts with a standard analysis method.
Abstract:
Various mechanisms have been proposed to explain extreme waves or rogue waves in an oceanic environment, including directional focusing, dispersive focusing, wave-current interaction, and nonlinear modulational instability. The Benjamin-Feir instability (nonlinear modulational instability), however, is considered to be one of the primary mechanisms for rogue-wave occurrence. The nonlinear Schrödinger equation is a well-established approximate model based on the same assumptions as required for the derivation of the Benjamin-Feir theory. Solutions of the nonlinear Schrödinger equation, including new rogue-wave type solutions, are presented in the author's dissertation work. The solutions are obtained by using a predictive eigenvalue map based predictor-corrector procedure developed by the author. Features of the predictive map are explored and the influences of certain parameter variations are investigated. The solutions are rescaled to match the length scales of waves generated in a wave tank. Based on the information provided by the map and the details of physical scaling, a framework is developed that can serve as a basis for experimental investigations into a variety of extreme waves as well as localizations in wave fields. To derive further fundamental insights into the complexity of extreme wave conditions, Smoothed Particle Hydrodynamics (SPH) simulations are carried out on an advanced Graphic Processing Unit (GPU) based parallel computational platform. Free surface gravity wave simulations have successfully characterized water-wave dispersion in the SPH model while demonstrating extreme energy focusing and wave growth in both linear and nonlinear regimes. A virtual wave tank is simulated wherein wave motions can be excited from either side. Focusing of several wave trains and isolated waves has been simulated. With properly chosen parameters, dispersion effects are observed causing a chirped wave train to focus and exhibit growth. 
By using the insights derived from the study of the nonlinear Schrödinger equation, modulational instability or self-focusing has been induced in a numerical wave tank and studied through several numerical simulations. Due to the inherent dissipative nature of SPH models, simulating persistent progressive waves can be problematic. This issue has been addressed and an observation-based solution has been provided. The efficacy of SPH in modeling wave focusing can be critical to furthering our understanding and prediction of extreme wave phenomena through simulations. A deeper understanding of the mechanisms underlying extreme energy localization phenomena can help facilitate energy harnessing and serve as a basis to predict and mitigate the impact of energy focusing.