50 resultados para Data Storage Solutions
Resumo:
An important competence of human data analysts is to interpret and explain the meaning of the results of data analysis to end-users. However, existing automatic solutions for intelligent data analysis provide limited help to interpret and communicate information to non-expert users. In this paper we present a general approach to generating explanatory descriptions about the meaning of quantitative sensor data. We propose a type of web application: a virtual newspaper with automatically generated news stories that describe the meaning of sensor data. This solution integrates a variety of techniques from intelligent data analysis into a web-based multimedia presentation system. We validated our approach in a real world problem and demonstrate its generality using data sets from several domains. Our experience shows that this solution can facilitate the use of sensor data by general users and, therefore, can increase the utility of sensor network infrastructures.
Resumo:
Arquitectura de almacenamiento para imágenes JPEG2000 basado en la fragmentación del fichero para poder almacenar los datos en diferentes discos para optimizar el almacenamiento en función de la calidad de los datos y posibilitar el aumento de transferencia.
Resumo:
In spite of the increasing presence of Semantic Web Facilities, only a limited amount of the available resources in the Internet provide a semantic access. Recent initiatives such as the emerging Linked Data Web are providing semantic access to available data by porting existing resources to the semantic web using different technologies, such as database-semantic mapping and scraping. Nevertheless, existing scraping solutions are based on ad-hoc solutions complemented with graphical interfaces for speeding up the scraper development. This article proposes a generic framework for web scraping based on semantic technologies. This framework is structured in three levels: scraping services, semantic scraping model and syntactic scraping. The first level provides an interface to generic applications or intelligent agents for gathering information from the web at a high level. The second level defines a semantic RDF model of the scraping process, in order to provide a declarative approach to the scraping task. Finally, the third level provides an implementation of the RDF scraping model for specific technologies. The work has been validated in a scenario that illustrates its application to mashup technologies
Resumo:
To date, big data applications have focused on the store-and-process paradigm. In this paper we describe an initiative to deal with big data applications for continuous streams of events. In many emerging applications, the volume of data being streamed is so large that the traditional ‘store-then-process’ paradigm is either not suitable or too inefficient. Moreover, soft-real time requirements might severely limit the engineering solutions. Many scenarios fit this description. In network security for cloud data centres, for instance, very high volumes of IP packets and events from sensors at firewalls, network switches and routers and servers need to be analyzed and should detect attacks in minimal time, in order to limit the effect of the malicious activity over the IT infrastructure. Similarly, in the fraud department of a credit card company, payment requests should be processed online and need to be processed as quickly as possible in order to provide meaningful results in real-time. An ideal system would detect fraud during the authorization process that lasts hundreds of milliseconds and deny the payment authorization, minimizing the damage to the user and the credit card company.
Resumo:
Presenting relevant information via web-based user friendly interfac- es makes the information more accessible to the general public. This is especial- ly useful for sensor networks that monitor natural environments. Adequately communicating this type of information helps increase awareness about the limited availability of natural resources and promotes their better use with sus- tainable practices. In this paper, I suggest an approach to communicating this information to wide audiences based on simulating data journalism using artifi- cial intelligence techniques. I analyze this approach by describing a pioneer knowledge-based system called VSAIH, which looks for news in hydrological data from a national sensor network in Spain and creates news stories that gen- eral users can understand. VSAIH integrates artificial intelligence techniques, including a model-based data analyzer and a presentation planner. In the paper, I also describe characteristics of the hydrological national sensor network and the technical solutions applied by VSAIH to simulate data journalism.
Resumo:
Machine learning techniques are used for extracting valuable knowledge from data. Nowa¬days, these techniques are becoming even more important due to the evolution in data ac¬quisition and storage, which is leading to data with different characteristics that must be exploited. Therefore, advances in data collection must be accompanied with advances in machine learning techniques to solve new challenges that might arise, on both academic and real applications. There are several machine learning techniques depending on both data characteristics and purpose. Unsupervised classification or clustering is one of the most known techniques when data lack of supervision (unlabeled data) and the aim is to discover data groups (clusters) according to their similarity. On the other hand, supervised classification needs data with supervision (labeled data) and its aim is to make predictions about labels of new data. The presence of data labels is a very important characteristic that guides not only the learning task but also other related tasks such as validation. When only some of the available data are labeled whereas the others remain unlabeled (partially labeled data), neither clustering nor supervised classification can be used. This scenario, which is becoming common nowadays because of labeling process ignorance or cost, is tackled with semi-supervised learning techniques. This thesis focuses on the branch of semi-supervised learning closest to clustering, i.e., to discover clusters using available labels as support to guide and improve the clustering process. Another important data characteristic, different from the presence of data labels, is the relevance or not of data features. Data are characterized by features, but it is possible that not all of them are relevant, or equally relevant, for the learning process. A recent clustering tendency, related to data relevance and called subspace clustering, claims that different clusters might be described by different feature subsets. This differs from traditional solutions to data relevance problem, where a single feature subset (usually the complete set of original features) is found and used to perform the clustering process. The proximity of this work to clustering leads to the first goal of this thesis. As commented above, clustering validation is a difficult task due to the absence of data labels. Although there are many indices that can be used to assess the quality of clustering solutions, these validations depend on clustering algorithms and data characteristics. Hence, in the first goal three known clustering algorithms are used to cluster data with outliers and noise, to critically study how some of the most known validation indices behave. The main goal of this work is however to combine semi-supervised clustering with subspace clustering to obtain clustering solutions that can be correctly validated by using either known indices or expert opinions. Two different algorithms are proposed from different points of view to discover clusters characterized by different subspaces. For the first algorithm, available data labels are used for searching for subspaces firstly, before searching for clusters. This algorithm assigns each instance to only one cluster (hard clustering) and is based on mapping known labels to subspaces using supervised classification techniques. Subspaces are then used to find clusters using traditional clustering techniques. The second algorithm uses available data labels to search for subspaces and clusters at the same time in an iterative process. This algorithm assigns each instance to each cluster based on a membership probability (soft clustering) and is based on integrating known labels and the search for subspaces into a model-based clustering approach. The different proposals are tested using different real and synthetic databases, and comparisons to other methods are also included when appropriate. Finally, as an example of real and current application, different machine learning tech¬niques, including one of the proposals of this work (the most sophisticated one) are applied to a task of one of the most challenging biological problems nowadays, the human brain model¬ing. Specifically, expert neuroscientists do not agree with a neuron classification for the brain cortex, which makes impossible not only any modeling attempt but also the day-to-day work without a common way to name neurons. Therefore, machine learning techniques may help to get an accepted solution to this problem, which can be an important milestone for future research in neuroscience. Resumen Las técnicas de aprendizaje automático se usan para extraer información valiosa de datos. Hoy en día, la importancia de estas técnicas está siendo incluso mayor, debido a que la evolución en la adquisición y almacenamiento de datos está llevando a datos con diferentes características que deben ser explotadas. Por lo tanto, los avances en la recolección de datos deben ir ligados a avances en las técnicas de aprendizaje automático para resolver nuevos retos que pueden aparecer, tanto en aplicaciones académicas como reales. Existen varias técnicas de aprendizaje automático dependiendo de las características de los datos y del propósito. La clasificación no supervisada o clustering es una de las técnicas más conocidas cuando los datos carecen de supervisión (datos sin etiqueta), siendo el objetivo descubrir nuevos grupos (agrupaciones) dependiendo de la similitud de los datos. Por otra parte, la clasificación supervisada necesita datos con supervisión (datos etiquetados) y su objetivo es realizar predicciones sobre las etiquetas de nuevos datos. La presencia de las etiquetas es una característica muy importante que guía no solo el aprendizaje sino también otras tareas relacionadas como la validación. Cuando solo algunos de los datos disponibles están etiquetados, mientras que el resto permanece sin etiqueta (datos parcialmente etiquetados), ni el clustering ni la clasificación supervisada se pueden utilizar. Este escenario, que está llegando a ser común hoy en día debido a la ignorancia o el coste del proceso de etiquetado, es abordado utilizando técnicas de aprendizaje semi-supervisadas. Esta tesis trata la rama del aprendizaje semi-supervisado más cercana al clustering, es decir, descubrir agrupaciones utilizando las etiquetas disponibles como apoyo para guiar y mejorar el proceso de clustering. Otra característica importante de los datos, distinta de la presencia de etiquetas, es la relevancia o no de los atributos de los datos. Los datos se caracterizan por atributos, pero es posible que no todos ellos sean relevantes, o igualmente relevantes, para el proceso de aprendizaje. Una tendencia reciente en clustering, relacionada con la relevancia de los datos y llamada clustering en subespacios, afirma que agrupaciones diferentes pueden estar descritas por subconjuntos de atributos diferentes. Esto difiere de las soluciones tradicionales para el problema de la relevancia de los datos, en las que se busca un único subconjunto de atributos (normalmente el conjunto original de atributos) y se utiliza para realizar el proceso de clustering. La cercanía de este trabajo con el clustering lleva al primer objetivo de la tesis. Como se ha comentado previamente, la validación en clustering es una tarea difícil debido a la ausencia de etiquetas. Aunque existen muchos índices que pueden usarse para evaluar la calidad de las soluciones de clustering, estas validaciones dependen de los algoritmos de clustering utilizados y de las características de los datos. Por lo tanto, en el primer objetivo tres conocidos algoritmos se usan para agrupar datos con valores atípicos y ruido para estudiar de forma crítica cómo se comportan algunos de los índices de validación más conocidos. El objetivo principal de este trabajo sin embargo es combinar clustering semi-supervisado con clustering en subespacios para obtener soluciones de clustering que puedan ser validadas de forma correcta utilizando índices conocidos u opiniones expertas. Se proponen dos algoritmos desde dos puntos de vista diferentes para descubrir agrupaciones caracterizadas por diferentes subespacios. Para el primer algoritmo, las etiquetas disponibles se usan para bus¬car en primer lugar los subespacios antes de buscar las agrupaciones. Este algoritmo asigna cada instancia a un único cluster (hard clustering) y se basa en mapear las etiquetas cono-cidas a subespacios utilizando técnicas de clasificación supervisada. El segundo algoritmo utiliza las etiquetas disponibles para buscar de forma simultánea los subespacios y las agru¬paciones en un proceso iterativo. Este algoritmo asigna cada instancia a cada cluster con una probabilidad de pertenencia (soft clustering) y se basa en integrar las etiquetas conocidas y la búsqueda en subespacios dentro de clustering basado en modelos. Las propuestas son probadas utilizando diferentes bases de datos reales y sintéticas, incluyendo comparaciones con otros métodos cuando resulten apropiadas. Finalmente, a modo de ejemplo de una aplicación real y actual, se aplican diferentes técnicas de aprendizaje automático, incluyendo una de las propuestas de este trabajo (la más sofisticada) a una tarea de uno de los problemas biológicos más desafiantes hoy en día, el modelado del cerebro humano. Específicamente, expertos neurocientíficos no se ponen de acuerdo en una clasificación de neuronas para la corteza cerebral, lo que imposibilita no sólo cualquier intento de modelado sino también el trabajo del día a día al no tener una forma estándar de llamar a las neuronas. Por lo tanto, las técnicas de aprendizaje automático pueden ayudar a conseguir una solución aceptada para este problema, lo cual puede ser un importante hito para investigaciones futuras en neurociencia.
Resumo:
Abstract interpretation-based data-flow analysis of logic programs is at this point relatively well understood from the point of view of general frameworks and abstract domains. On the other hand, comparatively little attention has been given to the problems which arise when analysis of a full, practical dialect of the Prolog language is attempted, and only few solutions to these problems have been proposed to date. Such problems relate to dealing correctly with all builtins, including meta-logical and extra-logical predicates, with dynamic predicates (where the program is modified during execution), and with the absence of certain program text during compilation. Existing proposals for dealing with such issues generally restrict in one way or another the classes of programs which can be analyzed if the information from analysis is to be used for program optimization. This paper attempts to fill this gap by considering a full dialect of Prolog, essentially following the recently proposed ISO standard, pointing out the problems that may arise in the analysis of such a dialect, and proposing a combination of known and novel solutions that together allow the correct analysis of arbitrary programs using the full power of the language.
Resumo:
Data grid services have been used to deal with the increasing needs of applications in terms of data volume and throughput. The large scale, heterogeneity and dynamism of grid environments often make management and tuning of these data services very complex. Furthermore, current high-performance I/O approaches are characterized by their high complexity and specific features that usually require specialized administrator skills. Autonomic computing can help manage this complexity. The present paper describes an autonomic subsystem intended to provide self-management features aimed at efficiently reducing the I/O problem in a grid environment, thereby enhancing the quality of service (QoS) of data access and storage services in the grid. Our proposal takes into account that data produced in an I/O system is not usually immediately required. Therefore, performance improvements are related not only to current but also to any future I/O access, as the actual data access usually occurs later on. Nevertheless, the exact time of the next I/O operations is unknown. Thus, our approach proposes a long-term prediction designed to forecast the future workload of grid components. This enables the autonomic subsystem to determine the optimal data placement to improve both current and future I/O operations.
Resumo:
In just a few years cloud computing has become a very popular paradigm and a business success story, with storage being one of the key features. To achieve high data availability, cloud storage services rely on replication. In this context, one major challenge is data consistency. In contrast to traditional approaches that are mostly based on strong consistency, many cloud storage services opt for weaker consistency models in order to achieve better availability and performance. This comes at the cost of a high probability of stale data being read, as the replicas involved in the reads may not always have the most recent write. In this paper, we propose a novel approach, named Harmony, which adaptively tunes the consistency level at run-time according to the application requirements. The key idea behind Harmony is an intelligent estimation model of stale reads, allowing to elastically scale up or down the number of replicas involved in read operations to maintain a low (possibly zero) tolerable fraction of stale reads. As a result, Harmony can meet the desired consistency of the applications while achieving good performance. We have implemented Harmony and performed extensive evaluations with the Cassandra cloud storage on Grid?5000 testbed and on Amazon EC2. The results show that Harmony can achieve good performance without exceeding the tolerated number of stale reads. For instance, in contrast to the static eventual consistency used in Cassandra, Harmony reduces the stale data being read by almost 80% while adding only minimal latency. Meanwhile, it improves the throughput of the system by 45% while maintaining the desired consistency requirements of the applications when compared to the strong consistency model in Cassandra.
Resumo:
In professional video production, users have to access to huge multimedia files simultaneously in an error-free environment, this restriction force the use of expensive disk architectures for video servers. Previous researches proposed different RAID systems for each specific task (ingest, editing, file, play-out, etc.). Video production companies have to acquire different servers with different RAIDs systems in order to support each task in the production workflow. The solution has multiples disadvantages, duplicated material in several RAIDs, duplicated material for different qualities, transfer and transcoding processes, etc. In this work, an architecture for video servers based on the spreading of JPEG200 data in different RAIDs is presented, each individual part of the data structure goes to a specific RAID type depending on the effect that produces the data on the overall image quality, the method provide a redundancy correlated with the data rank. The global storage can be used in all the different tasks of the production workflow saving disk space, redundant files and transfers procedures.
Resumo:
Current methods and tools that support Linked Data publication have mainly focused so far on static data, without considering the growing amount of streaming data available on the Web. In this paper we describe a case study that involves the publication of static and streaming Linked Data for bike sharing systems and related entities. We describe some of the challenges that we have faced, the solutions that we have explored, the lessons that we have learned, and the opportunities that lie in the future for exploiting Linked Stream Data.
Resumo:
In this paper we present a revisited classification of term variation in the light of the Linked Data initiative. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web with the idea of transforming it into a global graph. One of the crucial steps of this initiative is the linking step, in which datasets in one or more languages need to be linked or connected with one another. We claim that the linking process would be facilitated if datasets are enriched with lexical and terminological information. Being that the final aim, we propose a classification of lexical, terminological and semantic variants that will become part of a model of linguistic descriptions that is currently being proposed within the framework of the W3C Ontology-Lexica Community Group to enrich ontologies and Linked Data vocabularies. Examples of modeling solutions of the different types of variants are also provided.
Resumo:
The application of the response of fruits to low energy for mechanical impacts is described, for evaluation of post-harvest ripening of avocadoes of the variety "Hass". An impactor of 50g of weight, provided with an accelerometer, and free-falling from a height of 4 cm, is used; it is interfaced to a computer and uses a special software for retrieving and analyzing the deceleration data. Impact response parameters of individual fruits were compared to firmness of the pulp, measured by the most used method of double-plate puncture, as well as to other physical and physiological parameters: color, skin puncture ethylene production rate and others. Two groups of fruits were carefully selected, stored at 6º C (60 days) and ripened at 20ºC (11 days), and tested during the storage period. It is shown that, as in other types of fruits, impact response can be a good predictor of firmness in avocadoes, obtaining the same accuracy as with destructive firmness measurements. Mathematical and multiple regression models are calculated and compared to measured data, with which a prediction of storage period can be made for these fruits.
Resumo:
RESUMEN Las enfermedades cardiovasculares constituyen en la actualidad la principal causa de mortalidad en el mundo y se prevé que sigan siéndolo en un futuro, generando además elevados costes para los sistemas de salud. Los dispositivos cardiacos implantables constituyen una de las opciones para el diagnóstico y el tratamiento de las alteraciones del ritmo cardiaco. La investigación clínica con estos dispositivos alcanza gran relevancia para combatir estas enfermedades que tanto afectan a nuestra sociedad. Tanto la industria farmacéutica y de tecnología médica, como los propios investigadores, cada día se ven involucrados en un mayor número de proyectos de investigación clínica. No sólo el incremento en su volumen, sino el aumento de la complejidad, están generando mayores gastos en las actividades asociadas a la investigación médica. Esto está conduciendo a las compañías del sector sanitario a estudiar nuevas soluciones que les permitan reducir los costes de los estudios clínicos. Las Tecnologías de la Información y las Comunicaciones han facilitado la investigación clínica, especialmente en la última década. Los sistemas y aplicaciones electrónicos han proporcionado nuevas posibilidades en la adquisición, procesamiento y análisis de los datos. Por otro lado, la tecnología web propició la aparición de los primeros sistemas electrónicos de adquisición de datos, que han ido evolucionando a lo largo de los últimos años. Sin embargo, la mejora y perfeccionamiento de estos sistemas sigue siendo crucial para el progreso de la investigación clínica. En otro orden de cosas, la forma tradicional de realizar los estudios clínicos con dispositivos cardiacos implantables precisaba mejorar el tratamiento de los datos almacenados por estos dispositivos, así como para su fusión con los datos clínicos recopilados por investigadores y pacientes. La justificación de este trabajo de investigación se basa en la necesidad de mejorar la eficiencia en la investigación clínica con dispositivos cardiacos implantables, mediante la reducción de costes y tiempos de desarrollo de los proyectos, y el incremento de la calidad de los datos recopilados y el diseño de soluciones que permitan obtener un mayor rendimiento de los datos mediante la fusión de datos de distintas fuentes o estudios. Con este fin se proponen como objetivos específicos de este proyecto de investigación dos nuevos modelos: - Un modelo de recuperación y procesamiento de datos para los estudios clínicos con dispositivos cardiacos implantables, que permita estructurar y estandarizar estos procedimientos, con el fin de reducir tiempos de desarrollo Modelos de Métrica para Sistemas Electrónicos de Adquisición de Datos y de Procesamiento para Investigación Clínica con Dispositivos Cardiacos Implantables de estas tareas, mejorar la calidad del resultado obtenido, disminuyendo en consecuencia los costes. - Un modelo de métrica integrado en un Sistema Electrónico de Adquisición de Datos (EDC) que permita analizar los resultados del proyecto de investigación y, particularmente del rendimiento obtenido del EDC, con el fin de perfeccionar estos sistemas y reducir tiempos y costes de desarrollo del proyecto y mejorar la calidad de los datos clínicos recopilados. Como resultado de esta investigación, el modelo de procesamiento propuesto ha permitido reducir el tiempo medio de procesamiento de los datos en más de un 90%, los costes derivados del mismo en más de un 85% y todo ello, gracias a la automatización de la extracción y almacenamiento de los datos, consiguiendo una mejora de la calidad de los mismos. Por otro lado, el modelo de métrica posibilita el análisis descriptivo detallado de distintos indicadores que caracterizan el rendimiento del proyecto de investigación clínica, haciendo factible además la comparación entre distintos estudios. La conclusión de esta tesis doctoral es que los resultados obtenidos han demostrado que la utilización en estudios clínicos reales de los dos modelos desarrollados ha conducido a una mejora en la eficiencia de los proyectos, reduciendo los costes globales de los mismos, disminuyendo los tiempos de ejecución, e incrementando la calidad de los datos recopilados. Las principales aportaciones de este trabajo de investigación al conocimiento científico son la implementación de un sistema de procesamiento inteligente de los datos almacenados por los dispositivos cardiacos implantables, la integración en el mismo de una base de datos global y optimizada para todos los modelos de dispositivos, la generación automatizada de un repositorio unificado de datos clínicos y datos de dispositivos cardiacos implantables, y el diseño de una métrica aplicada e integrable en los sistemas electrónicos de adquisición de datos para el análisis de resultados de rendimiento de los proyectos de investigación clínica. ABSTRACT Cardiovascular diseases are the main cause of death worldwide and it is expected to continue in the future, generating high costs for health care systems. Implantable cardiac devices have become one of the options for diagnosis and treatment of cardiac rhythm disorders. Clinical research with these devices has acquired great importance to fight against these diseases that affect so many people in our society. Both pharmaceutical and medical technology companies, and also investigators, are involved in an increasingly number of clinical research projects. The growth in volume and the increase in medical research complexity are contributing to raise the expenditure level associated with clinical investigation. This situation is driving health care sector companies to explore new solutions to reduce clinical trial costs. Information and Communication Technologies have facilitated clinical research, mainly in the last decade. Electronic systems and software applications have provided new possibilities in the acquisition, processing and analysis of clinical studies data. On the other hand, web technology contributed to the appearance of the first electronic data capture systems that have evolved during the last years. Nevertheless, improvement of these systems is still a key aspect for the progress of clinical research. On a different matter, the traditional way to develop clinical studies with implantable cardiac devices needed an improvement in the processing of the data stored by these devices, and also in the merging of these data with the data collected by investigators and patients. The rationale of this research is based on the need to improve the efficiency in clinical investigation with implantable cardiac devices, by means of reduction in costs and time of projects development, as well as improvement in the quality of information obtained from the studies and to obtain better performance of data through the merging of data from different sources or trials. The objective of this research project is to develop the next two models: • A model for the retrieval and processing of data for clinical studies with implantable cardiac devices, enabling structure and standardization of these procedures, in order to reduce the time of development of these tasks, to improve the quality of the results, diminish therefore costs. • A model of metric integrated in an Electronic Data Capture system (EDC) that allow to analyze the results of the research project, and particularly the EDC performance, in order to improve those systems and to reduce time and costs of the project, and to get a better quality of the collected clinical data. As a result of this work, the proposed processing model has led to a reduction of the average time for data processing by more than 90 per cent, of related costs by more than 85 per cent, and all of this, through automatic data retrieval and storage, achieving an improvement of quality of data. On the other hand, the model of metrics makes possible a detailed descriptive analysis of a set of indicators that characterize the performance of each research project, allowing inter‐studies comparison. This doctoral thesis results have demonstrated that the application of the two developed models in real clinical trials has led to an improvement in projects efficiency, reducing global costs, diminishing time in execution, and increasing quality of data collected. The main contributions to scientific knowledge of this research work are the implementation of an intelligent processing system for data stored by implantable cardiac devices, the integration in this system of a global and optimized database for all models of devices, the automatic creation of an unified repository of clinical data and data stored by medical devices, and the design of a metric to be applied and integrated in electronic data capture systems to analyze the performance results of clinical research projects.
Resumo:
Geologic storage of carbon dioxide (CO2) has been proposed as a viable means for reducing anthropogenic CO2 emissions. Once injection begins, a program for measurement, monitoring, and verification (MMV) of CO2 distribution is required in order to: a) research key features, effects and processes needed for risk assessment; b) manage the injection process; c) delineate and identify leakage risk and surface escape; d) provide early warnings of failure near the reservoir; and f) verify storage for accounting and crediting. The selection of the methodology of monitoring (characterization of site and control and verification in the post-injection phase) is influenced by economic and technological variables. Multiple Criteria Decision Making (MCDM) refers to a methodology developed for making decisions in the presence of multiple criteria. MCDM as a discipline has only a relatively short history of 40 years, and it has been closely related to advancements on computer technology. Evaluation methods and multicriteria decisions include the selection of a set of feasible alternatives, the simultaneous optimization of several objective functions, and a decision-making process and evaluation procedures that must be rational and consistent. The application of a mathematical model of decision-making will help to find the best solution, establishing the mechanisms to facilitate the management of information generated by number of disciplines of knowledge. Those problems in which decision alternatives are finite are called Discrete Multicriteria Decision problems. Such problems are most common in reality and this case scenario will be applied in solving the problem of site selection for storing CO2. Discrete MCDM is used to assess and decide on issues that by nature or design support a finite number of alternative solutions. Recently, Multicriteria Decision Analysis has been applied to hierarchy policy incentives for CCS, to assess the role of CCS, and to select potential areas which could be suitable to store. For those reasons, MCDM have been considered in the monitoring phase of CO2 storage, in order to select suitable technologies which could be techno-economical viable. In this paper, we identify techniques of gas measurements in subsurface which are currently applying in the phase of characterization (pre-injection); MCDM will help decision-makers to hierarchy the most suitable technique which fit the purpose to monitor the specific physic-chemical parameter.