917 results for Web Log Data
Abstract:
The choice of an appropriate family of linear models for the analysis of longitudinal data is often a matter of concern for practitioners. To attenuate such difficulties, we discuss some issues that emerge when analyzing this type of data via a practical example involving pretest-posttest longitudinal data. In particular, we consider log-normal linear mixed models (LNLMM), generalized linear mixed models (GLMM), and models based on generalized estimating equations (GEE). We show how some special features of the data, like a nonconstant coefficient of variation, may be handled in the three approaches and evaluate their performance with respect to the magnitude of standard errors of interpretable and comparable parameters. We also show how different diagnostic tools may be employed to identify outliers and comment on available software. We conclude by noting that the results are similar, but that GEE-based models may be preferable when the goal is to compare the marginal expected responses.
Abstract:
Background: The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for system-level analysis. This procedure tries to summarize the information by focusing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focusing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only significance analysis has been used to deal with the ontology term enrichment problem. Results: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existing pipelines; Windows, to be controlled interactively; and as a web tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion: The Bayesian model accounts for the fact that not all the genes from a given category may be observable in microarray data, owing to low-intensity signal, quality filters, genes that were not spotted, and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.
Abstract:
Background: Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. As with other counting or voting processes, these measurements constitute compositional data exhibiting properties particular to the simplex space, where the summation of the components is constrained. These properties are not present in regular Euclidean spaces, on which hybridization-based microarray data is often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for the data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space. Results: Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly online tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster. Conclusion: Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
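To make the contrast with Euclidean methods concrete, the following minimal sketch computes the Aitchison distance between two count vectors through the centered log-ratio (clr) transform, a standard tool of compositional data analysis. It is not Simcluster's code: the function names and tag counts are hypothetical, and the handling of zero counts (which the logarithm cannot accept) is deliberately left out.

```python
import math

def closure(counts):
    """Scale positive counts so they sum to 1 (a point on the simplex)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    cx, cy = clr(closure(x)), clr(closure(y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

# Hypothetical tag counts for three genes in three libraries.
lib_a = [10, 20, 70]
lib_b = [100, 200, 700]  # same proportions as lib_a, ten times the depth
lib_c = [30, 20, 50]

# Same proportions, so the distance is zero up to floating-point rounding.
print(aitchison_distance(lib_a, lib_b))
# Different proportions give a clearly positive distance.
print(aitchison_distance(lib_a, lib_c))
```

A plain Euclidean distance on the raw counts would report lib_a and lib_b as far apart simply because one library was sequenced more deeply, which is the kind of artifact the abstract warns about.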
Abstract:
Background: The use of the knowledge produced by the sciences to promote human health is the main goal of translational medicine. To make it feasible, we need computational methods to handle the large amount of information that arises from bench to bedside and to deal with its heterogeneity. One computational challenge is the integration of clinical, socio-demographic and biological data. In this effort, ontologies play an essential role as a powerful artifact for knowledge representation. Chado is a modular ontology-oriented database model that gained popularity due to its robustness and flexibility as a generic platform to store biological data; however, it lacks support for representing clinical and socio-demographic information. Results: We have implemented an extension of Chado, the Clinical Module, to allow the representation of this kind of information. Our approach consists of a framework for data integration through the use of a common reference ontology. The design of this framework has four levels: the data level, to store the data; the semantic level, to integrate and standardize the data by the use of ontologies; the application level, to manage clinical databases, ontologies and the data integration process; and the web interface level, to allow interaction between the user and the system. The Clinical Module was built on the Entity-Attribute-Value (EAV) model. We also proposed a methodology to migrate data from legacy clinical databases to the integrative framework. A Chado instance was initialized using a relational database management system. The Clinical Module was implemented and the framework was loaded using data from a real clinical research database. Clinical and demographic data as well as biomaterial data were obtained from patients with head and neck tumors.
We implemented the IPTrans tool, a complete environment for data migration, which comprises: the construction of an ontology-based model to describe the legacy clinical data; the Extraction, Transformation and Load (ETL) process to extract the data from the source clinical database and load it into the Clinical Module of Chado; and the development of a web tool and a Bridge Layer to adapt the web tool, as well as other applications, to Chado. Conclusions: Open-source computational solutions currently available for translational science do not have a model to represent biomolecular information and are not integrated with existing bioinformatics tools. On the other hand, existing genomic data models do not represent clinical patient data. A framework was developed to support translational research by integrating biomolecular information coming from different “omics” technologies with patients' clinical and socio-demographic data. This framework should present some features: flexibility, compression and robustness. The experiments carried out on a use case demonstrated that the proposed system meets the requirements of flexibility and robustness, leading to the desired integration. The Clinical Module can be accessed at http://dcm.ffclrp.usp.br/caib/pg=iptrans.
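The Entity-Attribute-Value layout and the load step of the ETL process mentioned above can be pictured with a small sketch. Everything here is an assumption made for illustration: the table and column names are hypothetical and are not the Chado or Clinical Module schema, and the "legacy" record is invented. It only shows how one wide legacy row becomes several narrow EAV rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, not the actual Chado schema.
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, code TEXT UNIQUE);
    CREATE TABLE attribute (attribute_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE patient_value (
        patient_id   INTEGER REFERENCES patient(patient_id),
        attribute_id INTEGER REFERENCES attribute(attribute_id),
        value        TEXT,
        PRIMARY KEY (patient_id, attribute_id)
    );
""")

def load_record(conn, code, record):
    """Load step: turn one wide legacy row into one EAV row per field."""
    conn.execute("INSERT OR IGNORE INTO patient (code) VALUES (?)", (code,))
    (patient_id,) = conn.execute(
        "SELECT patient_id FROM patient WHERE code = ?", (code,)
    ).fetchone()
    for name, value in record.items():
        if value is None:
            continue  # EAV stores only the attributes that are present
        conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
        (attribute_id,) = conn.execute(
            "SELECT attribute_id FROM attribute WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR REPLACE INTO patient_value VALUES (?, ?, ?)",
            (patient_id, attribute_id, str(value)),
        )

def fetch_record(conn, code):
    """Pivot the EAV rows of one patient back into a dictionary."""
    rows = conn.execute(
        """SELECT a.name, v.value
           FROM patient_value v
           JOIN attribute a ON a.attribute_id = v.attribute_id
           JOIN patient p ON p.patient_id = v.patient_id
           WHERE p.code = ?
           ORDER BY a.name""",
        (code,),
    ).fetchall()
    return dict(rows)

# Invented legacy record; a new attribute needs no schema change.
load_record(conn, "P001", {"age": 57, "tumor_site": "larynx", "smoker": None})
print(fetch_record(conn, "P001"))
```

The appeal of this layout for heterogeneous clinical data is that adding a new kind of observation only adds a row to the attribute table; the cost is that values lose their native types and queries need a pivot.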
Abstract:
This work aims to show how Competitive Intelligence techniques can be adapted to the information services environment. It presents a web monitoring project for university libraries specialized in Chemistry as a strategy for the continuous improvement of these services, based on a comparison of analogous information services selected from the four top-ranked institutions in the Webometrics Ranking Web of World Universities. The project provides data for expanding and updating the informational content available on the web pages of libraries in this field, improving access to and availability of information, and helping to maximize the visibility and evaluation of the university. Keywords: Competitive Intelligence, Web Monitoring, University and Specialized Libraries, Web Page, Information Services
Abstract:
The University of São Paulo (USP) has seen a growing volume of content in electronic and digital formats, distributed by different suppliers and hosted remotely or in the cloud. It faces correspondingly growing difficulties in giving its users access to this digital collection while it coexists with the traditional world of physical collections. A possible solution was identified in the new generation of systems called Web Scale Discovery, which allow better management, data integration and faster searching. To establish whether and how such a system would meet USP's demands and expectations and, if so, which criteria should be used to analyze such a tool, an analytical, essentially document-based study was structured, drawing on a literature review and on data available from official websites and from libraries that use this kind of resource. The conceptual basis of the study was defined after identifying existing software assessment methods, yielding a standard with 40 analysis criteria, covering, among others, the single access interface to information content, web 2.0 characteristics, intuitive interface and faceted navigation. The details of the studies of four of the major systems currently available in this software category are presented, providing input for the decision-making of other libraries interested in such systems.
Abstract:
This final degree project (TFG) consists of an application for full-body person detection. The idea is to apply the detector to the continuous stream of images captured in real time from a webcam, or to a video file stored on the local system. The code is written in C++. To achieve this, we combine two existing detection systems: first, OpenCV, whose histogram of oriented gradients method already provides a person detector, which is applied to each image of the video stream; second, the face detector of the Encara library, which is applied to each candidate person returned by the OpenCV method to check whether the candidate contains a face. If it does, and the face is more or less correctly positioned, we conclude that the candidate really is a person. For each detected person, the position data in the image are stored in a list so that they can later be compared with the data from previous frames in an attempt to track every person. Visually, each person is outlined with a box in a randomly assigned color while the video plays. The time and frame at which each detected person appears and leaves are also recorded, both in a log file and in a database. The results are fairly satisfactory, although there is room for improvement, since the work allows other techniques besides those described to be combined. Because of the complexity of the methods used, high computing capacity is needed to run the application in real time without slowdowns.
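The step of comparing each detection with the positions stored from previous frames can be sketched as follows. The abstract does not say how the comparison is done, so this sketch assumes a greedy match on box overlap (intersection over union) with a hypothetical threshold of 0.3; it is written in Python rather than the project's C++, takes the boxes as given instead of calling OpenCV or Encara, and all class and method names are invented for illustration.

```python
from itertools import count

def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

class Tracker:
    """Greedy frame-to-frame matcher (an assumed strategy, see lead-in)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.tracks = {}      # person id -> box in the previous frame
        self.first_seen = {}  # person id -> frame of first appearance
        self.last_seen = {}   # person id -> last frame in which it was matched
        self._ids = count(1)

    def update(self, frame, detections):
        """Match this frame's boxes to known people and return {id: box}."""
        unmatched = dict(self.tracks)
        current = {}
        for box in detections:
            best = max(unmatched,
                       key=lambda pid: iou(unmatched[pid], box),
                       default=None)
            if best is not None and iou(unmatched[best], box) >= self.threshold:
                pid = best
                del unmatched[best]  # each previous box is used at most once
            else:
                pid = next(self._ids)  # no good overlap: treat as a new person
                self.first_seen[pid] = frame
            current[pid] = box
            self.last_seen[pid] = frame
        self.tracks = current  # people not detected in this frame are dropped
        return current

tracker = Tracker()
tracker.update(0, [(10, 10, 50, 100)])
tracker.update(1, [(14, 12, 50, 100), (200, 20, 40, 90)])
tracker.update(2, [(205, 22, 40, 90)])
print(tracker.first_seen, tracker.last_seen)
```

The `first_seen` and `last_seen` dictionaries correspond to the entry and exit frames that the project writes to its log file and database; a more robust tracker would tolerate a few missed frames before dropping a person.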
Abstract:
This final degree project aimed to develop a restaurant menu manager as a web application for a company that offers menu hosting and advertising by publishing those menus on screens and web portals. Partner businesses (bars and restaurants) can create menus consisting of two courses (first and second), dessert and drinks to be ‘sent’ to the publishing service. The application provides a management system for these menus that makes it easy to reuse dishes across menus, to customize the image representing each dish, and to perform various copy, view and edit operations on menus and dishes. Registered users can recover their password automatically if they forget it. Information about dishes, menus and registered users is stored automatically in a purpose-built database. In addition, the web application has a page accessible only to the administrator for managing users, for example editing, adding, removing, enabling and disabling user accounts. Finally, the technologies and tools used in this work include PHP, MySQL, jQuery, CSS, HTML and, above all, the Twitter Bootstrap framework, which was of great help in developing the project.
Abstract:
This final degree project is a service based on web technologies. Its main goal is to provide a service for creating and managing meeting minutes for the Las Palmas de Gran Canaria City Council. It consists of two main modules, one to “create minutes” and another to “edit minutes”. A further module, called templates, generates a PDF from a predefined template. The application was developed in several parts. The first part consisted of setting up all the database configuration needed for the application to work. We then generated all the HTML files and the links between them. Finally, we gave those static HTML files a much clearer and more organized style, giving the application a more attractive appearance. Once the front end was finished, we began implementing the logic behind the application. The “create” and “edit” modules were built using HTML forms, combining the information obtained from those forms with HTML templates that we generated. All the information obtained from the forms is saved in .txt files so that it can be used by the edit module. The templates module shows an HTML editor filled with a template previously selected by the user. The PDF files from this module cannot be edited afterwards, so no .txt files are generated for them. Lastly, two modules let us view all the minutes generated by the application. The first is the search module, which lets us search for a keyword across all the PDF files. The other module shows all the minutes that have been marked as “closed”. The application was designed in a modular way, so that modules can be added or removed easily.
Abstract:
To understand a city and its urban structure, it is necessary to study its history. This is feasible through GIS (Geographical Information Systems) and their by-products on the web. Starting from a cartographic view, they allow an initial understanding of, and a comparison between, present and past data, together with easy and intuitive access to database information. The research led to the creation of a GIS for the city of Bologna, based on varied data such as historical maps and vector and alphanumeric historical data. After providing information about GIS, we considered how to disseminate and share the collected data on the Web, studying two solutions available on the market: Web Mapping and WebGIS. In this study we discuss the stages, beginning with the development of the Historical GIS of Bologna, that led to the creation of an open-source WebGIS (MapServer and Chameleon) and of Web Mapping services (Google Earth, Google Maps and OpenLayers).
Abstract:
With the growing spread of the web and of IT services offered over the internet, the use of data centers has increased in recent years and, consequently, so has their electricity consumption. The environmental problem posed by this high energy demand leads data center operators to adopt low-consumption techniques and efficient systems. Environmental organizations have found that in 2011 consumption by data centers will reach 100 million kWh, at a total cost of 7.4 million dollars in the United States alone, with a similar projection at the global level. This thesis sets out to evaluate the techniques in use for reducing energy consumption in data centers, and to establish which of them are used most for this purpose. It begins with an overview of data centers, to explain how they work and to show their fundamental components. It then shows which parts weigh most heavily on consumption, and how measurements must be taken to obtain reliable values through PUE, a metric that assesses the efficiency of a data center. From the third chapter onward, the various existing techniques for addressing energy efficiency are listed, ending with a brief analysis of the methods that the major companies in the sector have used to address consumption in their data centers. The aim of this work is to understand which techniques and strategies can reduce consumption and increase the energy efficiency of data centers.
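For readers unfamiliar with PUE (Power Usage Effectiveness), the metric is the ratio of the total energy drawn by the facility to the energy delivered to the IT equipment, so a value of 1.0 would mean no overhead at all. The short sketch below only illustrates that ratio; the meter readings are invented, and the function name and the breakdown into cooling, distribution losses and lighting are assumptions made for the example, not figures from the thesis.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be below IT energy")
    return total_facility_kwh / it_equipment_kwh

# Hypothetical readings over the same measurement period, in kWh.
it_load = 1000
cooling = 600
distribution_losses = 150
lighting = 50

total = it_load + cooling + distribution_losses + lighting
value = pue(total, it_load)
overhead_share = 1 - 1 / value  # fraction of energy not reaching IT equipment

print(f"PUE = {value:.2f}, overhead share = {overhead_share:.0%}")
```

Both readings must cover the same period and the same boundary (what counts as "the facility"), which is why the abstract stresses how the measurements are taken.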
Abstract:
Ontology design and population, core aspects of semantic technologies, have recently become fields of great interest due to the increasing need for domain-specific knowledge bases that can boost the use of the Semantic Web. For building such knowledge resources, state-of-the-art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task, even more so if the task consists of modelling knowledge at web scale. The primary aim of this work is to investigate a novel and flexible methodology for automatically learning ontologies from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, and speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automatically identifying facts in natural language and extracting frames of relations among recognized entities, to producing linked data with which to extend existing knowledge bases or create new ones. In the state of the art, automatic ontology learning systems are mainly based on plain pipelined linguistic classifiers performing tasks such as Named Entity recognition, Entity resolution, and Taxonomy and Relation extraction [11]. These approaches present some weaknesses, especially in capturing the structures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In the literature, these structures have been called Semantic Frames by Fillmore [20] or, more recently, Knowledge Patterns [23]. Some NLP studies have recently shown the possibility of performing more accurate deep parsing with the ability to logically understand the structure of discourse [7].
In this work, some of these technologies have been investigated and employed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by humans to organize their knowledge.
Abstract:
Semantic Web technologies are strategic for fulfilling the openness requirement of Self-Aware Pervasive Service Ecosystems. They give agents the ability to cope with distributed data, using RDF to represent information, ontologies to describe relations between concepts from any domain (e.g. equivalence, specialization/extension, and so on) and reasoners to extract implicit knowledge. The aim of this thesis is to study these technologies and to design an extension of a pervasive service ecosystems middleware capable of exploiting this semantic power, while examining the performance implications.
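What "extracting implicit knowledge" means can be shown with a toy example. The sketch below stores RDF-style statements as plain (subject, predicate, object) tuples and applies two standard RDFS entailment rules (subclass transitivity and type inheritance) until nothing new is derived. It is only an illustration: the `ex:` resources are invented, no RDF library is used, and the thesis's middleware and reasoner are not represented here.

```python
def infer(triples):
    """Forward-chain two RDFS rules until no new triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p != "rdfs:subClassOf":
                continue
            # Rule 1: subClassOf is transitive (s < o and o < o2 give s < o2).
            new |= {(s, p, o2) for s2, p2, o2 in facts
                    if p2 == "rdfs:subClassOf" and s2 == o}
            # Rule 2: an instance of the subclass s is an instance of o.
            new |= {(x, "rdf:type", o) for x, p2, c in facts
                    if p2 == "rdf:type" and c == s}
        if new <= facts:
            return facts
        facts |= new

# Hypothetical statements an agent might receive from different sources.
explicit = {
    ("ex:TemperatureSensor", "rdfs:subClassOf", "ex:Sensor"),
    ("ex:Sensor", "rdfs:subClassOf", "ex:Device"),
    ("ex:t1", "rdf:type", "ex:TemperatureSensor"),
}

for triple in sorted(infer(explicit) - explicit):
    print(triple)  # the statements that were implicit in the input
```

Repeating the rules to a fixed point is the simplest strategy and can be costly on large graphs, which is one reason the performance implications mentioned in the abstract matter.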
Abstract:
This work is concerned with the increasing relationships between two distinct multidisciplinary research fields, Semantic Web technologies and scholarly publishing, that in this context converge into one precise research topic: Semantic Publishing. In the spirit of the original aim of Semantic Publishing, i.e. the improvement of scientific communication by means of semantic technologies, this thesis proposes theories, formalisms and applications for opening up semantic publishing to an effective interaction between scholarly documents (e.g., journal articles) and their related semantic and formal descriptions. The main aim of this work is to increase users' comprehension of documents and to allow document enrichment, discovery and linkage to document-related resources and contexts, such as other articles and raw scientific data. To achieve these goals, this thesis investigates and proposes solutions for three of the main issues that semantic publishing promises to address, namely: the need for tools that link document text to a formal representation of its meaning, the lack of complete metadata schemas for describing documents according to the publishing vocabulary, and the absence of effective user interfaces for easily acting on semantic publishing models and theories.