14 resultados para fault tolerant systems

em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The demand for computational power has been leading the improvement of the High Performance Computing (HPC) area, generally represented by the use of distributed systems like clusters of computers running parallel applications. In this area, fault tolerance plays an important role in order to provide high availability isolating the application from the faults effects. Performance and availability form an undissociable binomial for some kind of applications. Therefore, the fault tolerant solutions must take into consideration these two constraints when it has been designed. In this dissertation, we present a few side-effects that some fault tolerant solutions may presents when recovering a failed process. These effects may causes degradation of the system, affecting mainly the overall performance and availability. We introduce RADIC-II, a fault tolerant architecture for message passing based on RADIC (Redundant Array of Distributed Independent Fault Tolerance Controllers) architecture. RADIC-II keeps as maximum as possible the RADIC features of transparency, decentralization, flexibility and scalability, incorporating a flexible dynamic redundancy feature, allowing to mitigate or to avoid some recovery side-effects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The speed of fault isolation is crucial for the design and reconfiguration of fault tolerant control (FTC). In this paper the fault isolation problem is stated as a constraint satisfaction problem (CSP) and solved using constraint propagation techniques. The proposed method is based on constraint satisfaction techniques and uncertainty space refining of interval parameters. In comparison with other approaches based on adaptive observers, the major advantage of the presented method is that the isolation speed is fast even taking into account uncertainty in parameters, measurements and model errors and without the monotonicity assumption. In order to illustrate the proposed approach, a case study of a nonlinear dynamic system is presented

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Aquest projecte presenta, en primer lloc, un estudi dels protocols de generació de claus criptogràfiques i autoritats de certificació distribuïdes més destacables desenvolupades fins a l'actualitat. Posteriorment, implementem un protocol, que toleri les errades, de generació distribuïda de claus RSA sense servidor de confiança, orientat a xarxes ad-hoc. El protocol necessita la participació conjunta de n nodes per generar un mòdul RSA (N = pq), un exponent d'encriptació públic i les particions de l'exponent privat d, seguint un esquema llindar (t, n).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

El uso intensivo y prolongado de computadores de altas prestaciones para ejecutar aplicaciones computacionalmente intensivas, sumado al elevado número de elementos que los componen, incrementan drásticamente la probabilidad de ocurrencia de fallos durante su funcionamiento. El objetivo del trabajo es resolver el problema de tolerancia a fallos para redes de interconexión de altas prestaciones, partiendo del diseño de polí­ticas de encaminamiento tolerantes a fallos. Buscamos resolver una determinada cantidad de fallos de enlaces y nodos, considerando sus factores de impacto y probabilidad de aparición. Para ello aprovechamos la redundancia de caminos de comunicación existentes, partiendo desde enfoques de encaminamiento adaptativos capaces de cumplir con las cuatro fases de la tolerancia a fallos: detección del error, contención del daño, recuperación del error, y tratamiento del fallo y continuidad del servicio. La experimentación muestra una degradación de prestaciones menor al 5%. En el futuro, se tratará la pérdida de información en tránsito.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

La E/S Paralela es un área de investigación que tiene una creciente importancia en el cómputo de Altas Prestaciones. Si bien durante años ha sido el cuello de botella de los computadores paralelos en la actualidad, debido al gran aumento del poder de cómputo, el problema de la E/S se ha incrementado y la comunidad del Cómputo de Altas Prestaciones considera que se debe trabajar en mejorar el sistema de E/S de los computadores paralelos, para lograr cubrir las exigencias de las aplicaciones científicas que usan HPC. La Configuración de la Entrada/Salida (E/S) Paralela tiene una gran influencia en las prestaciones y disponibilidad, por ello es importante “Analizar configuraciones de E/S paralela para identificar los factores claves que influyen en las prestaciones y disponibilidad de la E/S de Aplicaciones Científicas que se ejecutan en un clúster”. Para realizar el análisis de las configuraciones de E/S se propone una metodología que permite identificar los factores de E/S y evaluar su influencia para diferentes configuraciones de E/S formada por tres fases: Caracterización, Configuración y Evaluación. La metodología permite analizar el computador paralelo a nivel de Aplicación Científica, librerías de E/S y de arquitectura de E/S, pero desde el punto de vista de la E/S. Los experimentos realizados para diferentes configuraciones de E/S y los resultados obtenidos indican la complejidad del análisis de los factores de E/S y los diferentes grados de influencia en las prestaciones del sistema de E/S. Finalmente se explican los trabajos futuros, el diseño de un modelo que de soporte al proceso de Configuración del sistema de E/S paralela para aplicaciones científicas. Por otro lado, para identificar y evaluar los factores de E/S asociados con la disponibilidad a nivel de datos, se pretende utilizar la Arquitectura Tolerante a Fallos RADIC.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

La tolerancia a fallos es una línea de investigación que ha adquirido una importancia relevante con el aumento de la capacidad de cómputo de los súper-computadores actuales. Esto es debido a que con el aumento del poder de procesamiento viene un aumento en la cantidad de componentes que trae consigo una mayor cantidad de fallos. Las estrategias de tolerancia a fallos actuales en su mayoría son centralizadas y estas no escalan cuando se utiliza una gran cantidad de procesos, dado que se requiere sincronización entre todos ellos para realizar las tareas de tolerancia a fallos. Además la necesidad de mantener las prestaciones en programas paralelos es crucial, tanto en presencia como en ausencia de fallos. Teniendo en cuenta lo citado, este trabajo se ha centrado en una arquitectura tolerante a fallos descentralizada (RADIC – Redundant Array of Distributed and Independant Controllers) que busca mantener las prestaciones iniciales y garantizar la menor sobrecarga posible para reconfigurar el sistema en caso de fallos. La implementación de esta arquitectura se ha llevado a cabo en la librería de paso de mensajes denominada Open MPI, la misma es actualmente una de las más utilizadas en el mundo científico para la ejecución de programas paralelos que utilizan una plataforma de paso de mensajes. Las pruebas iniciales demuestran que el sistema introduce mínima sobrecarga para llevar a cabo las tareas correspondientes a la tolerancia a fallos. MPI es un estándar por defecto fail-stop, y en determinadas implementaciones que añaden cierto nivel de tolerancia, las estrategias más utilizadas son coordinadas. En RADIC cuando ocurre un fallo el proceso se recupera en otro nodo volviendo a un estado anterior que ha sido almacenado previamente mediante la utilización de checkpoints no coordinados y la relectura de mensajes desde el log de eventos. Durante la recuperación, las comunicaciones con el proceso en cuestión deben ser retrasadas y redirigidas hacia la nueva ubicación del proceso. Restaurar procesos en un lugar donde ya existen procesos sobrecarga la ejecución disminuyendo las prestaciones, por lo cual en este trabajo se propone la utilización de nodos spare para la recuperar en ellos a los procesos que fallan, evitando de esta forma la sobrecarga en nodos que ya tienen trabajo. En este trabajo se muestra un diseño propuesto para gestionar de un modo automático y descentralizado la recuperación en nodos spare en un entorno Open MPI y se presenta un análisis del impacto en las prestaciones que tiene este diseño. Resultados iniciales muestran una degradación significativa cuando a lo largo de la ejecución ocurren varios fallos y no se utilizan spares y sin embargo utilizándolos se restablece la configuración inicial y se mantienen las prestaciones.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

L'objectiu final d'aquest projecte és realitzar un Sistema Traçador d' Errors, però potser mésimportant és l'objectiu d'aprendre noves tecnologies, que sovint estan a disposició de l'usuari però l'usuari les desconeix.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Fault location has been studied deeply for transmission lines due to its importance in power systems. Nowadays the problem of fault location on distribution systems is receiving special attention mainly because of the power quality regulations. In this context, this paper presents an application software developed in Matlabtrade that automatically calculates the location of a fault in a distribution power system, starting from voltages and currents measured at the line terminal and the model of the distribution power system data. The application is based on a N-ary tree structure, which is suitable to be used in this application due to the highly branched and the non- homogeneity nature of the distribution systems, and has been developed for single-phase, two-phase, two-phase-to-ground, and three-phase faults. The implemented application is tested by using fault data in a real electrical distribution power system

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper focus on the problem of locating single-phase faults in mixed distribution electric systems, with overhead lines and underground cables, using voltage and current measurements at the sending-end and sequence model of the network. Since calculating series impedance for underground cables is not as simple as in the case of overhead lines, the paper proposes a methodology to obtain an estimation of zero-sequence impedance of underground cables starting from previous single-faults occurred in the system, in which an electric arc occurred at the fault location. For this reason, the signal is previously pretreated to eliminate its peaks voltage and the analysis can be done working with a signal as close as a sinus wave as possible

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper deals with fault detection and isolation problems for nonlinear dynamic systems. Both problems are stated as constraint satisfaction problems (CSP) and solved using consistency techniques. The main contribution is the isolation method based on consistency techniques and uncertainty space refining of interval parameters. The major advantage of this method is that the isolation speed is fast even taking into account uncertainty in parameters, measurements, and model errors. Interval calculations bring independence from the assumption of monotony considered by several approaches for fault isolation which are based on observers. An application to a well known alcoholic fermentation process model is presented

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The role of root systems in drought tolerance is a subject of very limited information compared with above-ground responses. Adjustments to the ability of roots to supply water relative to shoot transpiration demand is proposed as a major means for woody perennial plants to tolerate drought, and is often expressed as changes in the ratios of leaf to root area (AL:AR). Seasonal root proliferation in a directed manner could increase the water supply function of roots independent of total root area (AR) and represents a mechanism whereby water supply to demand could be increased. To address this issue, seasonal root proliferation, stomatal conductance (gs) and whole root system hydraulic conductance (kr) were investigated for a drought-tolerant grape root system (Vitis berlandieri×V. rupestris cv. 1103P) and a non-drought-tolerant root system (Vitis riparia×V. rupestris cv. 101-14Mgt), upon which had been grafted the same drought-sensitive clone of Vitis vinifera cv. Merlot. Leaf water potentials (ψL) for Merlot grafted onto the 1103P root system (–0.91±0.02 MPa) were +0.15 MPa higher than Merlot on 101-14Mgt (–1.06±0.03 MPa) during spring, but dropped by approximately –0.4 MPa from spring to autumn, and were significantly lower by –0.15 MPa (–1.43±0.02 MPa) than for Merlot on 101-14Mgt (at –1.28±0.02 MPa). Surprisingly, gs of Merlot on the drought-tolerant root system (1103P) was less down-regulated and canopies maintained evaporative fluxes ranging from 35–20 mmol vine−1 s−1 during the diurnal peak from spring to autumn, respectively, three times greater than those measured for Merlot on the drought-sensitive rootstock 101-14Mgt. The drought-tolerant root system grew more roots at depth during the warm summer dry period, and the whole root system conductance (kr) increased from 0.004 to 0.009 kg MPa−1 s−1 during that same time period. The changes in kr could not be explained by xylem anatomy or conductivity changes of individual root segments. Thus, the manner in which drought tolerance was conveyed to the drought-sensitive clone appeared to arise from deep root proliferation during the hottest and driest part of the season, rather than through changes in xylem structure, xylem density or stomatal regulation. This information can be useful to growers on a site-specific basis in selecting rootstocks for grape clonal material (scions) grafted to them.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Not considered in the analytical model of the plant, uncertainties always dramatically decrease the performance of the fault detection task in the practice. To cope better with this prevalent problem, in this paper we develop a methodology using Modal Interval Analysis which takes into account those uncertainties in the plant model. A fault detection method is developed based on this model which is quite robust to uncertainty and results in no false alarm. As soon as a fault is detected, an ANFIS model is trained in online to capture the major behavior of the occurred fault which can be used for fault accommodation. The simulation results understandably demonstrate the capability of the proposed method for accomplishing both tasks appropriately

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Often practical performance of analytical redundancy for fault detection and diagnosis is decreased by uncertainties prevailing not only in the system model, but also in the measurements. In this paper, the problem of fault detection is stated as a constraint satisfaction problem over continuous domains with a big number of variables and constraints. This problem can be solved using modal interval analysis and consistency techniques. Consistency techniques are then shown to be particularly efficient to check the consistency of the analytical redundancy relations (ARRs), dealing with uncertain measurements and parameters. Through the work presented in this paper, it can be observed that consistency techniques can be used to increase the performance of a robust fault detection tool, which is based on interval arithmetic. The proposed method is illustrated using a nonlinear dynamic model of a hydraulic system

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the techniques used to detect faults in dynamic systems is analytical redundancy. An important difficulty in applying this technique to real systems is dealing with the uncertainties associated with the system itself and with the measurements. In this paper, this uncertainty is taken into account by the use of intervals for the parameters of the model and for the measurements. The method that is proposed in this paper checks the consistency between the system's behavior, obtained from the measurements, and the model's behavior; if they are inconsistent, then there is a fault. The problem of detecting faults is stated as a quantified real constraint satisfaction problem, which can be solved using the modal interval analysis (MIA). MIA is used because it provides powerful tools to extend the calculations over real functions to intervals. To improve the results of the detection of the faults, the simultaneous use of several sliding time windows is proposed. The result of implementing this method is semiqualitative tracking (SQualTrack), a fault-detection tool that is robust in the sense that it does not generate false alarms, i.e., if there are false alarms, they indicate either that the interval model does not represent the system adequately or that the interval measurements do not represent the true values of the variables adequately. SQualTrack is currently being used to detect faults in real processes. Some of these applications using real data have been developed within the European project advanced decision support system for chemical/petrochemical manufacturing processes and are also described in this paper