964 results for Data Warehousing Systems
Abstract:
This work was carried out as part of the Dissertation/Internship course, Environmental Protection Technology branch, of the Master's degree in Chemical Engineering at the Instituto Superior de Engenharia do Porto, and was developed at Aquatest a.s., headquartered in Prague, Czech Republic. Ore mining in the Czech Republic began in the thirteenth century and continued until the twentieth century. The consequences of this intensive extraction are now evident and include contamination of soil and subsoil by high concentrations of heavy metals. The mountain region of Zlaté Hory was chosen for the implementation of a remediation project, which consisted of the construction of three cells (tanks): the first to raise the pH, the second for the sedimentation of the precipitates formed, and the third to increase process efficiency, in order to reduce high concentrations of metals, with special emphasis on iron, manganese and sulphates. The project was initiated in 2005, was a pioneer in the country, and is still ongoing owing to the complex chemical and biological phenomena inherent to the system. There is a natural lagoon at the project site, which enables a comparative study of the two systems (natural and artificial) regarding their efficiency in the reduction/removal of these pollutants. The study aimed to assist and cooperate in the ongoing investigation at Aquatest, both in the field work conducted in Zlaté Hory and in the research methodologies used. To that end, the available data from 2005 to 2008 were surveyed and analysed, and complemented by the treatment of new data from 2009 to 2010. In addition, a theoretical study of the chemical and biological processes that occur in both systems was performed.
Regarding the field work, the author participated actively in the collection and in situ analysis of water and soil samples from the natural pond, under the supervision of Engineer Irena Šupiková. Laboratory analyses of water and soil were carried out by laboratory technicians. The natural lagoon was found to be more efficient in reducing iron and manganese, with removal percentages of 100%. The artificial lagoon had removal percentages of 90% and 33% for iron and manganese, respectively. Despite the lower efficiency of the constructed wetland, it must be pointed out that this system was designed for the treatment and consequent reduction of iron. In this context, it can be concluded that the main goal has been achieved. In the case of sulphates, optimizing removal is still a goal to be achieved, not only in the Czech Republic but also in other places where this type of contamination persists. In fact, removal efficiencies of 45% and 7% were obtained in the natural lagoon and in the constructed wetland, respectively. It has been speculated that the water entering the two systems comes from different sources. The collected data show, at the entrance of the natural pond, concentrations of 4.6 mg/L of total iron, 14.6 mg/L of manganese and 951 mg/L of sulphates. For the artificial pond, the corresponding concentrations are 27.7 mg/L, 8.1 mg/L and 382 mg/L for iron, manganese and sulphates, respectively. During 2010 the investigation was expanded: the study of soil samples was started, and is still in its early phase, in order to observe and evaluate the contribution of bacteria to the removal of heavy metals. In summary, this technology has proved to be an interesting solution since, in addition to substantially reducing the mentioned contaminants, mostly iron, it combines a low implementation cost with reduced maintenance, and it can also be installed in recreation parks, providing habitats for plants and birds.
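The removal percentages quoted above follow the usual definition, removal = 100 × (C_in − C_out) / C_in. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the concentrations for the artificial pond are assumed to be inlet values, and the outlet concentrations it prints are implied by the reported percentages, not measurements reported in the abstract.

```python
def removal_efficiency(c_in, c_out):
    """Percentage of a pollutant removed between inlet and outlet (same units)."""
    if c_in <= 0:
        raise ValueError("inlet concentration must be positive")
    return 100.0 * (c_in - c_out) / c_in


def outlet_concentration(c_in, removal_pct):
    """Outlet concentration implied by an inlet value and a removal percentage."""
    return c_in * (1.0 - removal_pct / 100.0)


# Inlet concentrations (mg/L) and removal percentages quoted in the abstract.
# Treating the artificial-pond figures as inlet values is an assumption.
inlet = {
    "natural": {"iron": 4.6, "manganese": 14.6, "sulphates": 951.0},
    "artificial": {"iron": 27.7, "manganese": 8.1, "sulphates": 382.0},
}
removal = {
    "natural": {"iron": 100.0, "manganese": 100.0, "sulphates": 45.0},
    "artificial": {"iron": 90.0, "manganese": 33.0, "sulphates": 7.0},
}

for system, pollutants in inlet.items():
    for name, c_in in pollutants.items():
        c_out = outlet_concentration(c_in, removal[system][name])
        print(f"{system:10s} {name:10s} in={c_in:7.1f} mg/L  implied out={c_out:7.2f} mg/L")
```

Read this way, the artificial system receives roughly six times more iron than the natural lagoon, which is consistent with the remark that the two inlets may have different sources.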
Abstract:
This thesis presents the Fuzzy Monte Carlo Model for Transmission Power Systems Reliability based studies (FMC-TRel) methodology, which is based on statistical failure and repair data of the transmission power system components and uses fuzzy-probabilistic modeling for system component outage parameters. Statistical records are used to develop the fuzzy membership functions of the system component outage parameters. The proposed hybrid method, which combines fuzzy sets and Monte Carlo simulation on the basis of the fuzzy-probabilistic models, captures both the randomness and the fuzziness of component outage parameters. Once the system states have been obtained, a network contingency analysis is performed to identify any overloading or voltage violation in the network. This is followed by a remedial action algorithm, based on Optimal Power Flow, which reschedules generation to alleviate constraint violations and, at the same time, to avoid any load curtailment if possible or, otherwise, to minimize the total load curtailment for the states identified by the contingency analysis. For the system states that cause load curtailment, an optimization approach is applied to reduce the probability of occurrence of these states while minimizing the cost of achieving that reduction. This methodology is of the utmost importance for supporting the decision making of the transmission system operator, namely in the identification of critical components and in the planning of future investments in the transmission power system. A case study based on the IEEE 24-bus Reliability Test System (RTS) 1996 is presented to illustrate in detail the application of the proposed methodology.
Abstract:
The main goal of this research study was the removal of Cu(II), Ni(II) and Zn(II) from aqueous solutions using peanut hulls. The work focused mainly on the following aspects: chemical characterization of the biosorbent, kinetic studies, study of the influence of pH in mono-component systems, equilibrium isotherms and column studies, both in mono- and tri-component systems, and with a real industrial effluent from the electroplating industry. The chemical characterization of peanut hulls showed high cellulose (44.8%) and lignin (36.1%) contents, which favour the biosorption of metal cations. The kinetic studies indicate that most of the sorption occurs in the first 30 min for all systems. In general, pseudo-second-order kinetics were followed, both in mono- and tri-component systems. The equilibrium isotherms were better described by the Freundlich model in all systems. Peanut hulls showed a higher affinity for copper than for nickel and zinc when these metals are present together. A pH between 5 and 6 was the most favourable for all systems. The sorbent capacity for copper in the column was 0.028 and 0.025 mmol g⁻¹ in mono- and tri-component systems, respectively. A decrease in capacity for copper (50%) was observed with the real effluent. The Yoon-Nelson, Thomas and Yan models were fitted to the experimental data, with the latter giving the best fit.
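For readers unfamiliar with the two models named above, the following is a minimal sketch of their standard forms: the Freundlich isotherm q_e = K_F · C_e^(1/n), fitted here by linear least squares on its log-log form, and the pseudo-second-order uptake curve q_t = k2 · q_e² · t / (1 + k2 · q_e · t). The function names and the numerical values are illustrative assumptions; the data are synthetic and are not the study's measurements, and the abstract does not state which fitting procedure the authors used.

```python
import math


def fit_freundlich(c_e, q_e):
    """Fit q_e = K_F * c_e**(1/n) via least squares on ln q_e = ln K_F + (1/n) ln c_e.

    Returns (K_F, n). Inputs must be positive and contain at least two distinct c_e values.
    """
    xs = [math.log(c) for c in c_e]
    ys = [math.log(q) for q in q_e]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line equals 1/n
    intercept = mean_y - slope * mean_x    # intercept equals ln K_F
    return math.exp(intercept), 1.0 / slope


def pseudo_second_order(t, q_e, k2):
    """Sorbed amount at time t for the pseudo-second-order kinetic model."""
    return k2 * q_e ** 2 * t / (1.0 + k2 * q_e * t)


# Synthetic equilibrium data generated from K_F = 0.03 and n = 2 (illustrative only).
concentrations = [0.1, 0.5, 1.0, 2.0, 5.0]
uptakes = [0.03 * c ** 0.5 for c in concentrations]
k_f, n = fit_freundlich(concentrations, uptakes)
print(f"K_F = {k_f:.4f}, n = {n:.2f}")
```

The log-log linearization is the simplest way to fit the Freundlich model; a nonlinear fit on the untransformed data weights the points differently and can give slightly different parameters.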
Abstract:
The aim of this paper is to develop models for experimental open-channel water delivery systems and to assess the use of three data-driven modeling tools toward that end. Water delivery canals are nonlinear dynamical systems and thus should be modeled to meet given operational requirements while capturing all relevant dynamics, including transport delays. Typically, the derivation of first-principles models for open-channel systems is based on the Saint-Venant equations for shallow water, which is a time-consuming task and demands specific expertise. The present paper proposes and assesses the use of three data-driven modeling tools: artificial neural networks, composite local linear models and fuzzy systems. The canal of the Hydraulics and Canal Control Nucleus (Évora University, Portugal) is used as a benchmark: the models are identified using data collected from the experimental facility, and their performance is then assessed using a suitable validation criterion. The performance of all models is compared with each other and against the experimental data to show the effectiveness of such tools in capturing all significant dynamics within the canal system and, therefore, in providing accurate nonlinear models that can be used for simulation or control. The models are available from the authors upon request.
Abstract:
Seismic data is difficult to analyze, and classical mathematical tools reveal strong limitations in exposing hidden relationships between earthquakes. In this paper, we study earthquake phenomena from the perspective of complex systems. Global seismic data covering the period from 1962 up to 2011 are analyzed. The events, characterized by their magnitude, geographic location and time of occurrence, are divided into groups, either according to the Flinn-Engdahl (F-E) seismic regions of Earth or using a rectangular grid based on latitude and longitude coordinates. Two methods of analysis are considered and compared in this study. In the first method, the distributions of magnitudes are approximated by Gutenberg-Richter (G-R) distributions and the parameters are used to reveal the relationships among regions. In the second method, the mutual information is calculated and adopted as a measure of similarity between regions. In both cases, using clustering analysis, visualization maps are generated, providing an intuitive and useful representation of the complex relationships that are present among seismic data. Such relationships might not be perceived on classical geographic maps. Therefore, the generated charts are a valid alternative to other visualization tools for understanding the global behavior of earthquakes.
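The Gutenberg-Richter law mentioned in the first method states that the number N of events with magnitude at least M satisfies log10 N = a − b·M. The abstract does not say how the parameters are estimated, so the sketch below is only one common choice: the maximum-likelihood estimate b = log10(e) / (mean(M) − M_min) for magnitudes treated as continuous above a completeness threshold M_min. The function name, the threshold and the synthetic catalogue are all assumptions made for illustration.

```python
import math
import random


def gutenberg_richter_fit(magnitudes, m_min):
    """Estimate (a, b) in log10 N(>=M) = a - b*M from magnitudes at or above m_min.

    Uses the maximum-likelihood b-value for continuous magnitudes; a is then
    fixed by requiring N(>=m_min) to equal the number of retained events.
    """
    sample = [m for m in magnitudes if m >= m_min]
    if len(sample) < 2:
        raise ValueError("need at least two events at or above m_min")
    mean_excess = sum(sample) / len(sample) - m_min
    if mean_excess <= 0:
        raise ValueError("all magnitudes equal m_min; b is undefined")
    b = math.log10(math.e) / mean_excess
    a = math.log10(len(sample)) + b * m_min
    return a, b


# Synthetic catalogue for one region: G-R magnitudes are exponentially
# distributed above m_min with rate b * ln(10). Values are illustrative only.
rng = random.Random(0)
true_b, m_min = 1.0, 4.5
catalogue = [m_min + rng.expovariate(true_b * math.log(10)) for _ in range(20000)]

a, b = gutenberg_richter_fit(catalogue, m_min)
print(f"a = {a:.2f}, b = {b:.3f}")
```

Computing one (a, b) pair per region in this way gives a low-dimensional description of each region that a clustering step could then compare.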
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data drawn from official statistics shows its usefulness.
Abstract:
With the advent of Web 2.0, new kinds of tools became available, which are not seen as novel anymore but are widely used. For instance, according to Eurostat data, in 2010 32% of individuals aged 16 to 74 used the Internet to post messages to social media sites or instant messaging tools, ranging from 17% in Romania to 46% in Sweden (Eurostat, 2012). Web 2.0 applications have been used in technology-enhanced learning environments. Learning 2.0 is a concept that has been used to describe the use of social media for learning. Many Learning 2.0 initiatives have been launched by educational and training institutions in Europe. Web 2.0 applications have also been used for informal learning. Web 2.0 tools can be used in classrooms, virtual or not, not only to engage students but also to support collaborative activities. Many of these tools allow users to use tags to organize resources and facilitate their retrieval at a later date or time. The aim of this chapter is to describe how tagging has been used in systems that support formal or informal learning and to summarize the functionalities that are common to these systems. In addition, common and unusual tagging applications that have been used in some Learning Objects Repositories are analysed.
Abstract:
The goal of this paper is to show that the DGPS data Internet service we designed and developed provides campus-wide real-time access to Differential GPS (DGPS) data and thus supports precise outdoor navigation. First, we describe the distributed system in terms of its architecture (a three-tier client/server application), the services provided (real-time DGPS data transport from remote DGPS sources and campus-wide data dissemination) and the transmission modes implemented (raw and frame mode over TCP and UDP). Then we present and discuss the results obtained and, finally, we draw some conclusions.
Abstract:
Estuaries are perhaps the most threatened environments on the coastal fringe; the coincidence of high natural value and attractiveness for human use has led to conflicts between conservation and development. These conflicts occur in the Sado Estuary because it lies near the industrialised zone of the Setúbal Peninsula while, at the same time, a large part of the estuary is classified as a Natural Reserve owing to its high biodiversity. These facts created the need to implement a model of environmental management and quality assessment, based on methodologies that enable the assessment of the quality of the Sado Estuary and the evaluation of the human pressures on it. These methodologies are based on indicators that best depict the state of the environment, and not necessarily on everything that could be measured or analysed. Sediments have always been considered an important temporary source of some compounds, a sink for other types of material, or an interface where a great diversity of biogeochemical transformations occur. For all these reasons they are of great importance in the formulation of a coastal management system. Many authors have used sediments to monitor aquatic contamination, showing great advantages over traditional sampling of the water column. The main objective of this thesis was to develop an estuary environmental management framework applied to the Sado Estuary using the DPSIR model (EMMSado), including data collection, data processing and data analysis. The support infrastructure of EMMSado was a set of spatially contiguous and homogeneous regions of sediment structure (management units). The environmental quality of the estuary was assessed through sediment quality assessment and integrated, at a preliminary stage, with the human pressure for development.
Besides the advantages explained above, studying the quality of the estuary mainly on the basis of indicators and indices of the sediment compartment also makes the methodology easier, faster, and less demanding in human and financial resources. These are essential factors for efficient environmental management of coastal areas. Data management, visualization, processing and analysis were achieved through the combined use of indicators and indices, sampling optimization techniques, Geographical Information Systems, remote sensing, statistics for spatial data, Global Positioning Systems and best expert judgment. As an overall conclusion, of the nineteen management units delineated and analysed, three showed no ecological risk (18.5% of the study area). The areas of greatest concern (5.6% of the study area) are located in the North Channel and are under strong human pressure, mainly from industrial activities. These areas also have low hydrodynamics and are thus associated with high levels of deposition. In particular, the areas near the Lisnave and Eurominas industries can also accumulate contamination coming from the Águas de Moura Channel, since particles from that channel can settle in that area because of the residual flow. In these areas the contaminants of concern, among those analysed, are the heavy metals and metalloids (Cd, Cu, Zn and As exceeded the PEL guidelines) and the pesticides BHC isomers, heptachlor, isodrin, DDT and its metabolites, endosulfan and endrin. In the remaining management units (76% of the study area) there is a moderate impact potential for adverse ecological effects, and in some of these areas no stress agents could be identified. This emphasizes the need for further research, since unmeasured chemicals may be causing or contributing to these adverse effects. Special attention must be paid to the units with moderate impact potential for adverse ecological effects that are located inside the natural reserve.
Non-point source pollution from agriculture and aquaculture activities also seems to contribute an important pollution load to the estuary, entering from the Águas de Moura Channel. This pressure is expressed as a moderate impact potential for ecological risk in the areas near the entrance of this channel. Pressures may also come from the Alcácer Channel, although they were not quantified in this study. The management framework presented here, including all the methodological tools, may be applied and tested in other estuarine ecosystems, which will also allow comparison between estuarine ecosystems in other parts of the globe.
Abstract:
Embedded real-time applications increasingly have high computation requirements that must be met within specific deadlines but that follow highly variable patterns, depending on the set of data available at a given instant. The current trend towards parallel processing in the embedded domain provides higher processing power; however, it does not address the variability in the processing pattern. Dimensioning each device for its worst-case scenario implies lower average utilization and more processing capacity that is available but unusable in the overall system. A solution to this problem is to extend the parallel execution of the applications, allowing networked nodes to distribute the workload to neighbour nodes in peak situations. In this context, this report proposes a framework for developing parallel and distributed real-time embedded applications, transparently using OpenMP and the Message Passing Interface (MPI), within a programming model based on OpenMP. The technical report also devises an integrated timing model, which enables structured reasoning about the timing behaviour of these hybrid architectures.
Abstract:
Managing the physical and compute infrastructure of a large data center is an embodiment of a Cyber-Physical System (CPS). The physical parameters of the data center (such as power, temperature, pressure, humidity) are tightly coupled with computations, even more so in upcoming data centers, where the location of workloads can vary substantially due, for example, to workloads being moved in a cloud infrastructure hosted in the data center. In this paper, we describe a data collection and distribution architecture that enables gathering physical parameters of a large data center at a very high temporal and spatial resolution of the sensor measurements. We think this is an important characteristic for enabling more accurate heat-flow models of the data center and, with them, opportunities to optimize energy consumption. Having a high-resolution picture of the data center conditions also enables minimizing local hotspots, performing more accurate predictive maintenance (pending failures in cooling and other infrastructure equipment can be detected more promptly) and more accurate billing. We detail this architecture and define the structure of the underlying messaging system that is used to collect and distribute the data. Finally, we show the results of a preliminary study of a typical data center radio environment.
Abstract:
The recent trends in chip architectures, with a higher number of heterogeneous cores and non-uniform memory/non-coherent caches, bring renewed attention to the use of Software Transactional Memory (STM) as a fundamental building block for developing parallel applications. Nevertheless, although STM promises to ease concurrent and parallel software development, it relies on the possibility of aborting conflicting transactions to maintain data consistency, which affects the responsiveness and timing guarantees required by embedded real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we assess the use of STM in the development of embedded real-time software, arguing that the amount of contention can be reduced if read-only transactions access recent consistent data snapshots, progressing in a wait-free manner. We show how the required number of versions of a shared object can be calculated for a set of tasks. We also outline an algorithm to manage conflicts between update transactions that prevents starvation.
Abstract:
It has been shown that, in reality, at least two general scenarios of data structuring are possible: (a) a self-similar (SS) scenario, in which the measured data form an SS structure, and (b) a quasi-periodic (QP) scenario, in which the repeated (strongly correlated) data form random sequences that are almost periodic with respect to each other. In the second case it becomes possible to describe their behavior and to express part of their randomness quantitatively in terms of the deterministic amplitude–frequency response belonging to the generalized Prony spectrum. This possibility allows us to re-examine the conventional concept of measurement and opens a new way of describing a wide range of data. In particular, it concerns complex systems for which no ‘best-fit’ model claiming to describe the measured data is available, but for which there is a clear need to describe these data in terms of a reduced number of quantitative parameters. The possibilities of the proposed approach and of the detection algorithm for QP processes were demonstrated on actual data: spectroscopic data recorded for pure water and acoustic data for a test hole. The suggested methodology allows the accepted classification of different incommensurable and self-affine spatial structures to be revised, and an accurate interpretation to be found for generalized Prony spectroscopy, which includes Fourier spectroscopy as a particular case.
Abstract:
The foreseen evolution of chip architectures towards a higher number of heterogeneous cores, with non-uniform memory and non-coherent caches, brings renewed attention to the use of Software Transactional Memory (STM) as an alternative to lock-based synchronisation. However, STM relies on the possibility of aborting conflicting transactions to maintain data consistency, which affects the responsiveness and timing guarantees required by real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we argue for the role of the transaction contention manager in reducing the number of transaction retries and in helping the real-time scheduler to ensure schedulability. For this purpose, the contention management policy should be aware of on-line scheduling information.
Abstract:
The availability of small, inexpensive sensor elements enables the employment of large wired or wireless sensor networks for feeding control systems. Unfortunately, the need to transmit a large number of sensor measurements over a network negatively affects the timing parameters of the control loop. This paper presents a solution to this problem in which sensor measurements are replaced by an approximate representation: an interpolation of sensor measurements as a function of space coordinates. A priority-based medium access control (MAC) protocol is used to select the sensor messages with high information content. Thus, the information from a large number of sensor measurements is conveyed within a few messages. This approach greatly reduces the time needed to obtain a snapshot of the environment state and therefore supports the real-time requirements of feedback control loops.
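The selection idea described above can be illustrated with a small centralized simulation. This is a sketch under stated assumptions, not the paper's protocol: it assumes that "information content" is the gap between a sensor's reading and the current interpolation at its position, that the interpolation is inverse-distance weighting, and that the sensor with the largest gap wins each round. The call to `max()` stands in for the priority-based MAC arbitration, and the names `idw` and `select_messages` are hypothetical.

```python
def idw(points, x, y):
    """Inverse-distance-weighted estimate at (x, y) from (x, y, value) points.

    Returns 0.0 when no point has been selected yet (an arbitrary starting guess).
    """
    if not points:
        return 0.0
    num = 0.0
    den = 0.0
    for px, py, pv in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 == 0.0:
            return pv          # exact hit on a selected sensor
        weight = 1.0 / d2
        num += weight * pv
        den += weight
    return num / den


def select_messages(sensors, k):
    """Greedily pick up to k sensors whose readings the interpolation predicts worst.

    sensors: list of (x, y, value) tuples at distinct positions.
    """
    selected = []
    for _ in range(min(k, len(sensors))):
        def error(s):
            return abs(s[2] - idw(selected, s[0], s[1]))
        winner = max(sensors, key=error)   # stand-in for MAC priority arbitration
        if error(winner) == 0.0:
            break                          # interpolation already matches every sensor
        selected.append(winner)
    return selected


# Hypothetical 3x2 grid: a flat field of 20.0 with one hotspot of 80.0.
sensors = [(0, 0, 20.0), (1, 0, 20.0), (2, 0, 20.0),
           (0, 1, 20.0), (1, 1, 80.0), (2, 1, 20.0)]
picked = select_messages(sensors, 3)
print(picked)
```

In this toy run the hotspot is transmitted first because it deviates most from the initial guess, which mirrors the claim that a few high-information messages can convey the state of many sensors.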