883 resultados para network-on-chip,deadlock, message-dependent-deadlock,NoC
Resumo:
The increasingly request for processing power during last years has pushed integrated circuit industry to look for ways of providing even more processing power with less heat dissipation, power consumption, and chip area. This goal has been achieved increasing the circuit clock, but since there are physical limits of this approach a new solution emerged as the multiprocessor system on chip (MPSoC). This approach demands new tools and basic software infrastructure to take advantage of the inherent parallelism of these architectures. The oil exploration industry has one of its firsts activities the project decision on exploring oil fields, those decisions are aided by reservoir simulations demanding high processing power, the MPSoC may offer greater performance if its parallelism can be well used. This work presents a proposal of a micro-kernel operating system and auxiliary libraries aimed to the STORM MPSoC platform analyzing its influence on the problem of reservoir simulation
Resumo:
Objective: To evaluate perinatal factors associated with early neonatal death in preterm infants with birth weights (BW) of 400-1,500 g.Methods: A multicenter prospective cohort study of all infants with BW of 400-1,500 g and 23-33 weeks of gestational age (GA), without malformations, who were born alive at eight public university tertiary hospitals in Brazil between June of 2004 and May of 2005. Infants who died within their first 6 days of life were compared with those who did not regarding maternal and neonatal characteristics and morbidity during the first 72 hours of life. Variables associated with the early deaths were identified by stepwise logistic regression.Results: A total of 579 live births met the inclusion criteria. Early deaths occurred in 92 (16%) cases, varying between centers from 5 to 31%, and these differences persisted after controlling for newborn illness severity and mortality risk score (SNAPPE-II). According to the multivariate analysis, the following factors were associated with early intrahospital neonatal deaths: gestational age of 23-27 weeks (odds ratio - OR = 5.0; 95%CI 2.7-9.4), absence of maternal hypertension (OR = 1.9; 95%CI 1.0-3.7), 5th minute Apgar 0-6 (OR = 2.8; 95%CI 1.4-5.4), presence of respiratory distress syndrome (OR = 3.1; 95%CI 1.4-6.6), and network center of birth.Conclusion: Important perinatal factors that are associated with early neonatal deaths in very low birth weight preterm infants can be modified by interventions such as improving fetal vitality at birth and reducing the incidence and severity of respiratory distress syndrome. The heterogeneity of early neonatal rates across the different centers studied indicates that best clinical practices should be identified and disseminated throughout the country.
Resumo:
This paper deals with the design of a network-on-chip reconfigurable pseudorandom number generation unit that can map and execute meta-heuristic algorithms in hardware. The unit can be configured to implement one of the following five linear generator algorithms: a multiplicative congruential, a mixed congruential, a standard multiple recursive, a mixed multiple recursive, and a multiply-with-carry. The generation unit can be used both as a pseudorandom and a message passing-based server, which is able to produce pseudorandom numbers on demand, sending them to the network-on-chip blocks that originate the service request. The generator architecture has been mapped to a field programmable gate array, and showed that millions of numbers in 32-, 64-, 96-, or 128-bit formats can be produced in tens of milliseconds. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Complex biological systems require sophisticated approach for analysis, once there are variables with distinct measure levels to be analyzed at the same time in them. The mouse assisted reproduction, e.g. superovulation and viable embryos production, demand a multidisciplinary control of the environment, endocrinologic and physiologic status of the animals, of the stressing factors and the conditions which are favorable to their copulation and subsequently oocyte fertilization. In the past, analyses with a simplified approach of these variables were not well succeeded to predict the situations that viable embryos were obtained in mice. Thereby, we suggest a more complex approach with association of the Cluster Analysis and the Artificial Neural Network to predict embryo production in superovulated mice. A robust prediction could avoid the useless death of animals and would allow an ethic management of them in experiments requiring mouse embryo.
Resumo:
The transcription process is crucial to life and the enzyme RNA polymerase (RNAP) is the major component of the transcription machinery. The development of single-molecule techniques, such as magnetic and optical tweezers, atomic-force microscopy and single-molecule fluorescence, increased our understanding of the transcription process and complements traditional biochemical studies. Based on these studies, theoretical models have been proposed to explain and predict the kinetics of the RNAP during the polymerization, highlighting the results achieved by models based on the thermodynamic stability of the transcription elongation complex. However, experiments showed that if more than one RNAP initiates from the same promoter, the transcription behavior slightly changes and new phenomenona are observed. We proposed and implemented a theoretical model that considers collisions between RNAPs and predicts their cooperative behavior during multi-round transcription generalizing the Bai et al. stochastic sequence-dependent model. In our approach, collisions between elongating enzymes modify their transcription rate values. We performed the simulations in Mathematica® and compared the results of the single and the multiple-molecule transcription with experimental results and other theoretical models. Our multi-round approach can recover several expected behaviors, showing that the transcription process for the studied sequences can be accelerated up to 48% when collisions are allowed: the dwell times on pause sites are reduced as well as the distance that the RNAPs backtracked from backtracking sites. © 2013 Costa et al.
Resumo:
The rapid (2 min) nongenomic effects of aldosterone (ALDO) and/or spironolactone (MR antagonist), RU 486 (GR antagonist), atrial natriuretic peptide (ANP) and dimethyl-BAPTA (BAPTA) on the intracellular pH recovery rate (pHirr) via NHE1 (basolateral Na+/H+ exchanger isoform), after the acid load induced by NH4Cl, and on the cytosolic free calcium concentration ([Ca2+](i)) were investigated in the proximal S3 segment isolated from rats, by the probes BCECF-AM and FLUO-4-AM, respectively. The basal pHi was 7.15+/-0.008 and the basal pHirr was 0.195+/-0.012 pH units/min (number of tubules/number of tubular areas = 16/96). Our results confirmed the rapid biphasic effect of ALDO on NHE1: ALDO (10(-12) M) increases the pHirr to approximately 59% of control value, and ALDO (10(-6)M) decreases it to approximately 49%. Spironolactone did not change these effects, but RU 486 inhibited the stimulatory effect and maintained the inhibitory effect. ANP (10(-6) M) or BAPTA (5 x 10(-5) M) alone had no significant effect on NHE1 but prevented both effects of ALDO on this exchanger. The basal [Ca2+](i) was 104+/-3 nM (15), and ALDO (10(-12) or 10(-6) M) increased the basal [Ca2+](i) to approximately 50% or 124%, respectively. RU 486, ANP and BAPTA decreased the [Ca2+](i) and inhibited the stimulatory effect of both doses of ALDO. The results suggest the involvement of GR on the nongenomic effects of ALDO and indicate a pHirr-regulating role for [Ca2+](i) that is mediated by NHE1, stimulated/impaired by ALDO, and affected by ANP or BAPTA with ALDO. The observed nongenomic hormonal interaction in the S3 segment may represent a rapid and physiologically relevant regulatory mechanism in the intact animal under conditions of volume alterations. (C) 2011 Elsevier Ltd. All rights reserved.
Resumo:
The digital electronic market development is founded on the continuous reduction of the transistors size, to reduce area, power, cost and increase the computational performance of integrated circuits. This trend, known as technology scaling, is approaching the nanometer size. The lithographic process in the manufacturing stage is increasing its uncertainty with the scaling down of the transistors size, resulting in a larger parameter variation in future technology generations. Furthermore, the exponential relationship between the leakage current and the threshold voltage, is limiting the threshold and supply voltages scaling, increasing the power density and creating local thermal issues, such as hot spots, thermal runaway and thermal cycles. In addiction, the introduction of new materials and the smaller devices dimension are reducing transistors robustness, that combined with high temperature and frequently thermal cycles, are speeding up wear out processes. Those effects are no longer addressable only at the process level. Consequently the deep sub-micron devices will require solutions which will imply several design levels, as system and logic, and new approaches called Design For Manufacturability (DFM) and Design For Reliability. The purpose of the above approaches is to bring in the early design stages the awareness of the device reliability and manufacturability, in order to introduce logic and system able to cope with the yield and reliability loss. The ITRS roadmap suggests the following research steps to integrate the design for manufacturability and reliability in the standard CAD automated design flow: i) The implementation of new analysis algorithms able to predict the system thermal behavior with the impact to the power and speed performances. ii) High level wear out models able to predict the mean time to failure of the system (MTTF). iii) Statistical performance analysis able to predict the impact of the process variation, both random and systematic. The new analysis tools have to be developed beside new logic and system strategies to cope with the future challenges, as for instance: i) Thermal management strategy that increase the reliability and life time of the devices acting to some tunable parameter,such as supply voltage or body bias. ii) Error detection logic able to interact with compensation techniques as Adaptive Supply Voltage ASV, Adaptive Body Bias ABB and error recovering, in order to increase yield and reliability. iii) architectures that are fundamentally resistant to variability, including locally asynchronous designs, redundancy, and error correcting signal encodings (ECC). The literature already features works addressing the prediction of the MTTF, papers focusing on thermal management in the general purpose chip, and publications on statistical performance analysis. In my Phd research activity, I investigated the need for thermal management in future embedded low-power Network On Chip (NoC) devices.I developed a thermal analysis library, that has been integrated in a NoC cycle accurate simulator and in a FPGA based NoC simulator. The results have shown that an accurate layout distribution can avoid the onset of hot-spot in a NoC chip. Furthermore the application of thermal management can reduce temperature and number of thermal cycles, increasing the systemreliability. Therefore the thesis advocates the need to integrate a thermal analysis in the first design stages for embedded NoC design. Later on, I focused my research in the development of statistical process variation analysis tool that is able to address both random and systematic variations. The tool was used to analyze the impact of self-timed asynchronous logic stages in an embedded microprocessor. As results we confirmed the capability of self-timed logic to increase the manufacturability and reliability. Furthermore we used the tool to investigate the suitability of low-swing techniques in the NoC system communication under process variations. In this case We discovered the superior robustness to systematic process variation of low-swing links, which shows a good response to compensation technique as ASV and ABB. Hence low-swing is a good alternative to the standard CMOS communication for power, speed, reliability and manufacturability. In summary my work proves the advantage of integrating a statistical process variation analysis tool in the first stages of the design flow.
Resumo:
The evolution of the electronics embedded applications forces electronics systems designers to match their ever increasing requirements. This evolution pushes the computational power of digital signal processing systems, as well as the energy required to accomplish the computations, due to the increasing mobility of such applications. Current approaches used to match these requirements relies on the adoption of application specific signal processors. Such kind of devices exploits powerful accelerators, which are able to match both performance and energy requirements. On the other hand, the too high specificity of such accelerators often results in a lack of flexibility which affects non-recurrent engineering costs, time to market, and market volumes too. The state of the art mainly proposes two solutions to overcome these issues with the ambition of delivering reasonable performance and energy efficiency: reconfigurable computing and multi-processors computing. All of these solutions benefits from the post-fabrication programmability, that definitively results in an increased flexibility. Nevertheless, the gap between these approaches and dedicated hardware is still too high for many application domains, especially when targeting the mobile world. In this scenario, flexible and energy efficient acceleration can be achieved by merging these two computational paradigms, in order to address all the above introduced constraints. This thesis focuses on the exploration of the design and application spectrum of reconfigurable computing, exploited as application specific accelerators for multi-processors systems on chip. More specifically, it introduces a reconfigurable digital signal processor featuring a heterogeneous set of reconfigurable engines, and a homogeneous multi-core system, exploiting three different flavours of reconfigurable and mask-programmable technologies as implementation platform for applications specific accelerators. In this work, the various trade-offs concerning the utilization multi-core platforms and the different configuration technologies are explored, characterizing the design space of the proposed approach in terms of programmability, performance, energy efficiency and manufacturing costs.
Resumo:
MultiProcessor Systems-on-Chip (MPSoC) are the core of nowadays and next generation computing platforms. Their relevance in the global market continuously increase, occupying an important role both in everydaylife products (e.g. smartphones, tablets, laptops, cars) and in strategical market sectors as aviation, defense, robotics, medicine. Despite of the incredible performance improvements in the recent years processors manufacturers have had to deal with issues, commonly called “Walls”, that have hindered the processors development. After the famous “Power Wall”, that limited the maximum frequency of a single core and marked the birth of the modern multiprocessors system-on-chip, the “Thermal Wall” and the “Utilization Wall” are the actual key limiter for performance improvements. The former concerns the damaging effects of the high temperature on the chip caused by the large power densities dissipation, whereas the second refers to the impossibility of fully exploiting the computing power of the processor due to the limitations on power and temperature budgets. In this thesis we faced these challenges by developing efficient and reliable solutions able to maximize performance while limiting the maximum temperature below a fixed critical threshold and saving energy. This has been possible by exploiting the Model Predictive Controller (MPC) paradigm that solves an optimization problem subject to constraints in order to find the optimal control decisions for the future interval. A fully-distributedMPC-based thermal controller with a far lower complexity respect to a centralized one has been developed. The control feasibility and interesting properties for the simplification of the control design has been proved by studying a partial differential equation thermal model. Finally, the controller has been efficiently included in more complex control schemes able to minimize energy consumption and deal with mixed-criticalities tasks
Resumo:
Despite the several issues faced in the past, the evolutionary trend of silicon has kept its constant pace. Today an ever increasing number of cores is integrated onto the same die. Unfortunately, the extraordinary performance achievable by the many-core paradigm is limited by several factors. Memory bandwidth limitation, combined with inefficient synchronization mechanisms, can severely overcome the potential computation capabilities. Moreover, the huge HW/SW design space requires accurate and flexible tools to perform architectural explorations and validation of design choices. In this thesis we focus on the aforementioned aspects: a flexible and accurate Virtual Platform has been developed, targeting a reference many-core architecture. Such tool has been used to perform architectural explorations, focusing on instruction caching architecture and hybrid HW/SW synchronization mechanism. Beside architectural implications, another issue of embedded systems is considered: energy efficiency. Near Threshold Computing is a key research area in the Ultra-Low-Power domain, as it promises a tenfold improvement in energy efficiency compared to super-threshold operation and it mitigates thermal bottlenecks. The physical implications of modern deep sub-micron technology are severely limiting performance and reliability of modern designs. Reliability becomes a major obstacle when operating in NTC, especially memory operation becomes unreliable and can compromise system correctness. In the present work a novel hybrid memory architecture is devised to overcome reliability issues and at the same time improve energy efficiency by means of aggressive voltage scaling when allowed by workload requirements. Variability is another great drawback of near-threshold operation. The greatly increased sensitivity to threshold voltage variations in today a major concern for electronic devices. We introduce a variation-tolerant extension of the baseline many-core architecture. By means of micro-architectural knobs and a lightweight runtime control unit, the baseline architecture becomes dynamically tolerant to variations.
Resumo:
Il presente lavoro di tesi, svolto presso i laboratori dell'X-ray Imaging Group del Dipartimento di Fisica e Astronomia dell'Università di Bologna e all'interno del progetto della V Commissione Scientifica Nazionale dell'INFN, COSA (Computing on SoC Architectures), ha come obiettivo il porting e l’analisi di un codice di ricostruzione tomografica su architetture GPU installate su System-On-Chip low-power, al fine di sviluppare un metodo portatile, economico e relativamente veloce. Dall'analisi computazionale sono state sviluppate tre diverse versioni del porting in CUDA C: nella prima ci si è limitati a trasporre la parte più onerosa del calcolo sulla scheda grafica, nella seconda si sfrutta la velocità del calcolo matriciale propria del coprocessore (facendo coincidere ogni pixel con una singola unità di calcolo parallelo), mentre la terza è un miglioramento della precedente versione ottimizzata ulteriormente. La terza versione è quella definitiva scelta perché è la più performante sia dal punto di vista del tempo di ricostruzione della singola slice sia a livello di risparmio energetico. Il porting sviluppato è stato confrontato con altre due parallelizzazioni in OpenMP ed MPI. Si è studiato quindi, sia su cluster HPC, sia su cluster SoC low-power (utilizzando in particolare la scheda quad-core Tegra K1), l’efficienza di ogni paradigma in funzione della velocità di calcolo e dell’energia impiegata. La soluzione da noi proposta prevede la combinazione del porting in OpenMP e di quello in CUDA C. Tre core CPU vengono riservati per l'esecuzione del codice in OpenMP, il quarto per gestire la GPU usando il porting in CUDA C. Questa doppia parallelizzazione ha la massima efficienza in funzione della potenza e dell’energia, mentre il cluster HPC ha la massima efficienza in velocità di calcolo. Il metodo proposto quindi permetterebbe di sfruttare quasi completamente le potenzialità della CPU e GPU con un costo molto contenuto. Una possibile ottimizzazione futura potrebbe prevedere la ricostruzione di due slice contemporaneamente sulla GPU, raddoppiando circa la velocità totale e sfruttando al meglio l’hardware. Questo studio ha dato risultati molto soddisfacenti, infatti, è possibile con solo tre schede TK1 eguagliare e forse a superare, in seguito, la potenza di calcolo di un server tradizionale con il vantaggio aggiunto di avere un sistema portatile, a basso consumo e costo. Questa ricerca si va a porre nell’ambito del computing come uno tra i primi studi effettivi su architetture SoC low-power e sul loro impiego in ambito scientifico, con risultati molto promettenti.
Resumo:
Fully controlled liquid injection and flow in hydrophobic polydimethylsiloxane (PDMS) two-dimensional microchannel arrays based on on-chip integrated, low-voltage-driven micropumps are demonstrated. Our architecture exploits the surface-acoustic-wave (SAW) induced counterflow mechanism and the effect of nebulization anisotropies at crossing areas owing to lateral propagating SAWs. We show that by selectively exciting single or multiple SAWs, fluids can be drawn from their reservoirs and moved towards selected positions of a microchannel grid. Splitting of the main liquid flow is also demonstrated by exploiting multiple SAW beams. As a demonstrator, we show simultaneous filling of two orthogonal microchannels. The present results show that SAW micropumps are good candidates for truly integrated on-chip fluidic networks allowing liquid control in arbitrarily shaped two-dimensional microchannel arrays.