25 resultados para downloading of data
em Universidad Politécnica de Madrid
Resumo:
In the paper we report on the results of our experiments on the construction of the opinion ontology. Our aim is to show the benefits of publishing in the open, on the Web, the results of the opinion mining process in a structured form. On the road to achieving this, we attempt to answer the research question to what extent opinion information can be formalized in a unified way. Furthermore, as part of the evaluation, we experiment with the usage of Semantic Web technologies and show particular use cases that support our claims.
Resumo:
Recently, the Semantic Web has experienced signi�cant advancements in standards and techniques, as well as in the amount of semantic information available online. Even so, mechanisms are still needed to automatically reconcile semantic information when it is expressed in di�erent natural languages, so that access to Web information across language barriers can be improved. That requires developing techniques for discovering and representing cross-lingual links on the Web of Data. In this paper we explore the different dimensions of such a problem and reflect on possible avenues of research on that topic.
Resumo:
The Web has witnessed an enormous growth in the amount of semantic information published in recent years. This growth has been stimulated to a large extent by the emergence of Linked Data. Although this brings us a big step closer to the vision of a Semantic Web, it also raises new issues such as the need for dealing with information expressed in different natural languages. Indeed, although the Web of Data can contain any kind of information in any language, it still lacks explicit mechanisms to automatically reconcile such information when it is expressed in different languages. This leads to situations in which data expressed in a certain language is not easily accessible to speakers of other languages. The Web of Data shows the potential for being extended to a truly multilingual web as vocabularies and data can be published in a language-independent fashion, while associated language-dependent (linguistic) information supporting the access across languages can be stored separately. In this sense, the multilingual Web of Data can be realized in our view as a layer of services and resources on top of the existing Linked Data infrastructure adding i) linguistic information for data and vocabularies in different languages, ii) mappings between data with labels in different languages, and iii) services to dynamically access and traverse Linked Data across different languages. In this article we present this vision of a multilingual Web of Data. We discuss challenges that need to be addressed to make this vision come true and discuss the role that techniques such as ontology localization, ontology mapping, and cross-lingual ontology-based information access and presentation will play in achieving this. Further, we propose an initial architecture and describe a roadmap that can provide a basis for the implementation of this vision.
Resumo:
Cross‐lingual link discovery in the Web of Data
Resumo:
The Semantic Web is growing at a fast pace, recently boosted by the creation of the Linked Data initiative and principles. Methods, standards, techniques and the state of technology are becoming more mature and therefore are easing the task of publication and consumption of semantic information on the Web.
Resumo:
Data grid services have been used to deal with the increasing needs of applications in terms of data volume and throughput. The large scale, heterogeneity and dynamism of grid environments often make management and tuning of these data services very complex. Furthermore, current high-performance I/O approaches are characterized by their high complexity and specific features that usually require specialized administrator skills. Autonomic computing can help manage this complexity. The present paper describes an autonomic subsystem intended to provide self-management features aimed at efficiently reducing the I/O problem in a grid environment, thereby enhancing the quality of service (QoS) of data access and storage services in the grid. Our proposal takes into account that data produced in an I/O system is not usually immediately required. Therefore, performance improvements are related not only to current but also to any future I/O access, as the actual data access usually occurs later on. Nevertheless, the exact time of the next I/O operations is unknown. Thus, our approach proposes a long-term prediction designed to forecast the future workload of grid components. This enables the autonomic subsystem to determine the optimal data placement to improve both current and future I/O operations.
Resumo:
Ion beam therapy is a valuable method for the treatment of deep-seated and radio-resistant tumors thanks to the favorable depth-dose distribution characterized by the Bragg peak. Hadrontherapy facilities take advantage of the specific ion range, resulting in a highly conformal dose in the target volume, while the dose in critical organs is reduced as compared to photon therapy. The necessity to monitor the delivery precision, i.e. the ion range, is unquestionable, thus different approaches have been investigated, such as the detection of prompt photons or annihilation photons of positron emitter nuclei created during the therapeutic treatment. Based on the measurement of the induced β+ activity, our group has developed various in-beam PET prototypes: the one under test is composed by two planar detector heads, each one consisting of four modules with a total active area of 10 × 10 cm2. A single detector module is made of a LYSO crystal matrix coupled to a position sensitive photomultiplier and is read-out by dedicated frontend electronics. A preliminary data taking was performed at the Italian National Centre for Oncological Hadron Therapy (CNAO, Pavia), using proton beams in the energy range of 93–112 MeV impinging on a plastic phantom. The measured activity profiles are presented and compared with the simulated ones based on the Monte Carlo FLUKA package.
Resumo:
There are several different standardised and widespread formats to represent emotions. However, there is no standard semantic model yet. This paper presents a new ontology, called Onyx, that aims to become such a standard while adding concepts from the latest Semantic Web models. In particular, the ontology focuses on the representation of Emotion Analysis results. But the model is abstract and inherits from previous standards and formats. It can thus be used as a reference representation of emotions in any future application or ontology. To prove this, we have translated resources from EmotionML representation to Onyx. We also present several ways in which developers could benefit from using this ontology instead of an ad-hoc presentation. Our ultimate goal is to foster the use of semantic technologies for emotion Analysis while following the Linked Data ideals.
Resumo:
Lately, the mobile data market has moved into a growth stage triggered by two facts: affordability of mobile broadband, and availability of data-friendly devices. At this stage, market growth is no longer dependent on push strategies from suppliers; on the contrary, demand is now driving the market. However, it will not be easy for mobile operating companies to cope up with the demand to come in the near future. The infrastructure that is needed to support corresponding demand is far from completion. Operators are forced to make heavy investments to upgrade and expand their networks. To decide how to handle the present and upcoming demand, they need to identify and understand the characteristics of the scenarios they face. This is precisely the aim of this article, which provides figures on the consequences for mobile infrastructures of a generalised mobile media uptake. Data from the Spanish mobile deployment case have been used to arrive at practical figures and illustration of results, but the conclusions are easily extended to other countries and regions
Resumo:
Data centers are easily found in every sector of the worldwide economy. They are composed of thousands of servers, serving millions of users globally and 24-7. In the last years, e-Science applications such e-Health or Smart Cities have experienced a significant development. The need to deal efficiently with the computational needs of next-generation applications together with the increasing demand for higher resources in traditional applications has facilitated the rapid proliferation and growing of Data Centers. A drawback to this capacity growth has been the rapid increase of the energy consumption of these facilities. In 2010, data center electricity represented 1.3% of all the electricity use in the world. In year 2012 alone, global data center power demand grep 63% to 38GW. A further rise of 17% to 43GW was estimated in 2013. Moreover, Data Centers are responsible for more than 2% of total carbon dioxide emissions.
Resumo:
En los últimos años la sociedad está experimentando una serie de cambios. Uno de estos cambios es la datificación (“datafication” en inglés). Este término puede ser definido como la transformación sistemática de aspectos de la vida cotidiana de las personas en datos procesados por ordenadores. Cada día, a cada minuto y a cada segundo, cada vez que alguien emplea un dispositivo digital,hay datos siendo guardados en algún lugar. Se puede tratar del contenido de un correo electrónico pero también puede ser el número de pasos que esa persona ha caminado o su historial médico. El simple almacenamiento de datos no proporciona un valor añadido por si solo. Para extraer conocimiento de los datos, y por tanto darles un valor, se requiere del análisis de datos. La ciencia de los datos junto con el análisis de datos se está volviendo cada vez más popular. Hoy en día, se pueden encontrar millones de web APIs estadísticas; estas APIs ofrecen la posibilidad de analizar tendencias o sentimientos presentes en las redes sociales o en internet en general. Una de las redes sociales más populares, Twitter, es pública. Cada mensaje, o tweet, publicado puede ser visto por cualquier persona en el mundo, siempre y cuando posea una conexión a internet. Esto hace de Twitter un medio interesante a la hora de analizar hábitos sociales o perfiles de consumo. Es en este contexto en que se engloba este proyecto. Este trabajo, combinando el análisis estadístico de datos y el análisis de contenido, trata de extraer conocimiento de tweets públicos de Twitter. En particular tratará de establecer si el género es un factor influyente en las relaciones entre usuarios de Twitter. Para ello, se analizará una base de datos que contiene casi 2.000 tweets. En primer lugar se determinará el género de los usuarios mediante web APIs. En segundo lugar se empleará el contraste de hipótesis para saber si el género influye en los usuarios a la hora de relacionarse con otros usuarios. Finalmente se construirá un modelo estadístico para predecir el comportamiento de los usuarios de Twitter en relación a su género.
Resumo:
Los Centros de Datos se encuentran actualmente en cualquier sector de la economía mundial. Están compuestos por miles de servidores, dando servicio a los usuarios de forma global, las 24 horas del día y los 365 días del año. Durante los últimos años, las aplicaciones del ámbito de la e-Ciencia, como la e-Salud o las Ciudades Inteligentes han experimentado un desarrollo muy significativo. La necesidad de manejar de forma eficiente las necesidades de cómputo de aplicaciones de nueva generación, junto con la creciente demanda de recursos en aplicaciones tradicionales, han facilitado el rápido crecimiento y la proliferación de los Centros de Datos. El principal inconveniente de este aumento de capacidad ha sido el rápido y dramático incremento del consumo energético de estas infraestructuras. En 2010, la factura eléctrica de los Centros de Datos representaba el 1.3% del consumo eléctrico mundial. Sólo en el año 2012, el consumo de potencia de los Centros de Datos creció un 63%, alcanzando los 38GW. En 2013 se estimó un crecimiento de otro 17%, hasta llegar a los 43GW. Además, los Centros de Datos son responsables de más del 2% del total de emisiones de dióxido de carbono a la atmósfera. Esta tesis doctoral se enfrenta al problema energético proponiendo técnicas proactivas y reactivas conscientes de la temperatura y de la energía, que contribuyen a tener Centros de Datos más eficientes. Este trabajo desarrolla modelos de energía y utiliza el conocimiento sobre la demanda energética de la carga de trabajo a ejecutar y de los recursos de computación y refrigeración del Centro de Datos para optimizar el consumo. Además, los Centros de Datos son considerados como un elemento crucial dentro del marco de la aplicación ejecutada, optimizando no sólo el consumo del Centro de Datos sino el consumo energético global de la aplicación. Los principales componentes del consumo en los Centros de Datos son la potencia de computación utilizada por los equipos de IT, y la refrigeración necesaria para mantener los servidores dentro de un rango de temperatura de trabajo que asegure su correcto funcionamiento. Debido a la relación cúbica entre la velocidad de los ventiladores y el consumo de los mismos, las soluciones basadas en el sobre-aprovisionamiento de aire frío al servidor generalmente tienen como resultado ineficiencias energéticas. Por otro lado, temperaturas más elevadas en el procesador llevan a un consumo de fugas mayor, debido a la relación exponencial del consumo de fugas con la temperatura. Además, las características de la carga de trabajo y las políticas de asignación de recursos tienen un impacto importante en los balances entre corriente de fugas y consumo de refrigeración. La primera gran contribución de este trabajo es el desarrollo de modelos de potencia y temperatura que permiten describes estos balances entre corriente de fugas y refrigeración; así como la propuesta de estrategias para minimizar el consumo del servidor por medio de la asignación conjunta de refrigeración y carga desde una perspectiva multivariable. Cuando escalamos a nivel del Centro de Datos, observamos un comportamiento similar en términos del balance entre corrientes de fugas y refrigeración. Conforme aumenta la temperatura de la sala, mejora la eficiencia de la refrigeración. Sin embargo, este incremente de la temperatura de sala provoca un aumento en la temperatura de la CPU y, por tanto, también del consumo de fugas. Además, la dinámica de la sala tiene un comportamiento muy desigual, no equilibrado, debido a la asignación de carga y a la heterogeneidad en el equipamiento de IT. La segunda contribución de esta tesis es la propuesta de técnicas de asigación conscientes de la temperatura y heterogeneidad que permiten optimizar conjuntamente la asignación de tareas y refrigeración a los servidores. Estas estrategias necesitan estar respaldadas por modelos flexibles, que puedan trabajar en tiempo real, para describir el sistema desde un nivel de abstracción alto. Dentro del ámbito de las aplicaciones de nueva generación, las decisiones tomadas en el nivel de aplicación pueden tener un impacto dramático en el consumo energético de niveles de abstracción menores, como por ejemplo, en el Centro de Datos. Es importante considerar las relaciones entre todos los agentes computacionales implicados en el problema, de forma que puedan cooperar para conseguir el objetivo común de reducir el coste energético global del sistema. La tercera contribución de esta tesis es el desarrollo de optimizaciones energéticas para la aplicación global por medio de la evaluación de los costes de ejecutar parte del procesado necesario en otros niveles de abstracción, que van desde los nodos hasta el Centro de Datos, por medio de técnicas de balanceo de carga. Como resumen, el trabajo presentado en esta tesis lleva a cabo contribuciones en el modelado y optimización consciente del consumo por fugas y la refrigeración de servidores; el modelado de los Centros de Datos y el desarrollo de políticas de asignación conscientes de la heterogeneidad; y desarrolla mecanismos para la optimización energética de aplicaciones de nueva generación desde varios niveles de abstracción. ABSTRACT Data centers are easily found in every sector of the worldwide economy. They consist of tens of thousands of servers, serving millions of users globally and 24-7. In the last years, e-Science applications such e-Health or Smart Cities have experienced a significant development. The need to deal efficiently with the computational needs of next-generation applications together with the increasing demand for higher resources in traditional applications has facilitated the rapid proliferation and growing of data centers. A drawback to this capacity growth has been the rapid increase of the energy consumption of these facilities. In 2010, data center electricity represented 1.3% of all the electricity use in the world. In year 2012 alone, global data center power demand grew 63% to 38GW. A further rise of 17% to 43GW was estimated in 2013. Moreover, data centers are responsible for more than 2% of total carbon dioxide emissions. This PhD Thesis addresses the energy challenge by proposing proactive and reactive thermal and energy-aware optimization techniques that contribute to place data centers on a more scalable curve. This work develops energy models and uses the knowledge about the energy demand of the workload to be executed and the computational and cooling resources available at data center to optimize energy consumption. Moreover, data centers are considered as a crucial element within their application framework, optimizing not only the energy consumption of the facility, but the global energy consumption of the application. The main contributors to the energy consumption in a data center are the computing power drawn by IT equipment and the cooling power needed to keep the servers within a certain temperature range that ensures safe operation. Because of the cubic relation of fan power with fan speed, solutions based on over-provisioning cold air into the server usually lead to inefficiencies. On the other hand, higher chip temperatures lead to higher leakage power because of the exponential dependence of leakage on temperature. Moreover, workload characteristics as well as allocation policies also have an important impact on the leakage-cooling tradeoffs. The first key contribution of this work is the development of power and temperature models that accurately describe the leakage-cooling tradeoffs at the server level, and the proposal of strategies to minimize server energy via joint cooling and workload management from a multivariate perspective. When scaling to the data center level, a similar behavior in terms of leakage-temperature tradeoffs can be observed. As room temperature raises, the efficiency of data room cooling units improves. However, as we increase room temperature, CPU temperature raises and so does leakage power. Moreover, the thermal dynamics of a data room exhibit unbalanced patterns due to both the workload allocation and the heterogeneity of computing equipment. The second main contribution is the proposal of thermal- and heterogeneity-aware workload management techniques that jointly optimize the allocation of computation and cooling to servers. These strategies need to be backed up by flexible room level models, able to work on runtime, that describe the system from a high level perspective. Within the framework of next-generation applications, decisions taken at this scope can have a dramatical impact on the energy consumption of lower abstraction levels, i.e. the data center facility. It is important to consider the relationships between all the computational agents involved in the problem, so that they can cooperate to achieve the common goal of reducing energy in the overall system. The third main contribution is the energy optimization of the overall application by evaluating the energy costs of performing part of the processing in any of the different abstraction layers, from the node to the data center, via workload management and off-loading techniques. In summary, the work presented in this PhD Thesis, makes contributions on leakage and cooling aware server modeling and optimization, data center thermal modeling and heterogeneityaware data center resource allocation, and develops mechanisms for the energy optimization for next-generation applications from a multi-layer perspective.
Resumo:
Internet está evolucionando hacia la conocida como Live Web. En esta nueva etapa en la evolución de Internet, se pone al servicio de los usuarios multitud de streams de datos sociales. Gracias a estas fuentes de datos, los usuarios han pasado de navegar por páginas web estáticas a interacturar con aplicaciones que ofrecen contenido personalizado, basada en sus preferencias. Cada usuario interactúa a diario con multiples aplicaciones que ofrecen notificaciones y alertas, en este sentido cada usuario es una fuente de eventos, y a menudo los usuarios se sienten desbordados y no son capaces de procesar toda esa información a la carta. Para lidiar con esta sobresaturación, han aparecido múltiples herramientas que automatizan las tareas más habituales, desde gestores de bandeja de entrada, gestores de alertas en redes sociales, a complejos CRMs o smart-home hubs. La contrapartida es que aunque ofrecen una solución a problemas comunes, no pueden adaptarse a las necesidades de cada usuario ofreciendo una solucion personalizada. Los Servicios de Automatización de Tareas (TAS de sus siglas en inglés) entraron en escena a partir de 2012 para dar solución a esta liminación. Dada su semejanza, estos servicios también son considerados como un nuevo enfoque en la tecnología de mash-ups pero centra en el usuarios. Los usuarios de estas plataformas tienen la capacidad de interconectar servicios, sensores y otros aparatos con connexión a internet diseñando las automatizaciones que se ajustan a sus necesidades. La propuesta ha sido ámpliamante aceptada por los usuarios. Este hecho ha propiciado multitud de plataformas que ofrecen servicios TAS entren en escena. Al ser un nuevo campo de investigación, esta tesis presenta las principales características de los TAS, describe sus componentes, e identifica las dimensiones fundamentales que los defines y permiten su clasificación. En este trabajo se acuña el termino Servicio de Automatización de Tareas (TAS) dando una descripción formal para estos servicios y sus componentes (llamados canales), y proporciona una arquitectura de referencia. De igual forma, existe una falta de herramientas para describir servicios de automatización, y las reglas de automatización. A este respecto, esta tesis propone un modelo común que se concreta en la ontología EWE (Evented WEb Ontology). Este modelo permite com parar y equiparar canales y automatizaciones de distintos TASs, constituyendo un aporte considerable paraa la portabilidad de automatizaciones de usuarios entre plataformas. De igual manera, dado el carácter semántico del modelo, permite incluir en las automatizaciones elementos de fuentes externas sobre los que razonar, como es el caso de Linked Open Data. Utilizando este modelo, se ha generado un dataset de canales y automatizaciones, con los datos obtenidos de algunos de los TAS existentes en el mercado. Como último paso hacia el lograr un modelo común para describir TAS, se ha desarrollado un algoritmo para aprender ontologías de forma automática a partir de los datos del dataset. De esta forma, se favorece el descubrimiento de nuevos canales, y se reduce el coste de mantenimiento del modelo, el cual se actualiza de forma semi-automática. En conclusión, las principales contribuciones de esta tesis son: i) describir el estado del arte en automatización de tareas y acuñar el término Servicio de Automatización de Tareas, ii) desarrollar una ontología para el modelado de los componentes de TASs y automatizaciones, iii) poblar un dataset de datos de canales y automatizaciones, usado para desarrollar un algoritmo de aprendizaje automatico de ontologías, y iv) diseñar una arquitectura de agentes para la asistencia a usuarios en la creación de automatizaciones. ABSTRACT The new stage in the evolution of the Web (the Live Web or Evented Web) puts lots of social data-streams at the service of users, who no longer browse static web pages but interact with applications that present them contextual and relevant experiences. Given that each user is a potential source of events, a typical user often gets overwhelmed. To deal with that huge amount of data, multiple automation tools have emerged, covering from simple social media managers or notification aggregators to complex CRMs or smart-home Hub/Apps. As a downside, they cannot tailor to the needs of every single user. As a natural response to this downside, Task Automation Services broke in the Internet. They may be seen as a new model of mash-up technology for combining social streams, services and connected devices from an end-user perspective: end-users are empowered to connect those stream however they want, designing the automations they need. The numbers of those platforms that appeared early on shot up, and as a consequence the amount of platforms following this approach is growing fast. Being a novel field, this thesis aims to shed light on it, presenting and exemplifying the main characteristics of Task Automation Services, describing their components, and identifying several dimensions to classify them. This thesis coins the term Task Automation Services (TAS) by providing a formal definition of them, their components (called channels), as well a TAS reference architecture. There is also a lack of tools for describing automation services and automations rules. In this regard, this thesis proposes a theoretical common model of TAS and formalizes it as the EWE ontology This model enables to compare channels and automations from different TASs, which has a high impact in interoperability; and enhances automations providing a mechanism to reason over external sources such as Linked Open Data. Based on this model, a dataset of components of TAS was built, harvesting data from the web sites of actual TASs. Going a step further towards this common model, an algorithm for categorizing them was designed, enabling their discovery across different TAS. Thus, the main contributions of the thesis are: i) surveying the state of the art on task automation and coining the term Task Automation Service; ii) providing a semantic common model for describing TAS components and automations; iii) populating a categorized dataset of TAS components, used to learn ontologies of particular domains from the TAS perspective; and iv) designing an agent architecture for assisting users in setting up automations, that is aware of their context and acts in consequence.
Resumo:
Over the last few years, the Data Center market has increased exponentially and this tendency continues today. As a direct consequence of this trend, the industry is pushing the development and implementation of different new technologies that would improve the energy consumption efficiency of data centers. An adaptive dashboard would allow the user to monitor the most important parameters of a data center in real time. For that reason, monitoring companies work with IoT big data filtering tools and cloud computing systems to handle the amounts of data obtained from the sensors placed in a data center.Analyzing the market trends in this field we can affirm that the study of predictive algorithms has become an essential area for competitive IT companies. Complex algorithms are used to forecast risk situations based on historical data and warn the user in case of danger. Considering that several different users will interact with this dashboard from IT experts or maintenance staff to accounting managers, it is vital to personalize it automatically. Following that line of though, the dashboard should only show relevant metrics to the user in different formats like overlapped maps or representative graphs among others. These maps will show all the information needed in a visual and easy-to-evaluate way. To sum up, this dashboard will allow the user to visualize and control a wide range of variables. Monitoring essential factors such as average temperature, gradients or hotspots as well as energy and power consumption and savings by rack or building would allow the client to understand how his equipment is behaving, helping him to optimize the energy consumption and efficiency of the racks. It also would help him to prevent possible damages in the equipment with predictive high-tech algorithms.
Resumo:
The Spanish National Library (Biblioteca Nacional de España1. BNE) and the Ontology Engineering Group2 of Universidad Politécnica de Madrid are working on the joint project ?Preliminary Study of Linked Data?, whose aim is to enrich the Web of Data with the BNE authority and bibliographic records. To this end, they are transforming the BNE information to RDF following the Linked Data principles3 proposed by Tim Berners Lee.