21 results for Blog datasets
Abstract:
Although blogs have existed since the beginning of the Internet, their use has increased considerably in the last decade. Nowadays, they are ready to be used by a broad range of people: from teenagers to multinationals, everyone can have a global communication space. Companies know that blogs are a valuable publicity tool for sharing information with participants, and they know the importance of creating consumer communities around them: participants come together to exchange ideas, review and recommend new products, and even support each other. Companies can also use blogs for different purposes, such as a content management system to manage website content, a bulletin board to support communication and document sharing in teams, a marketing instrument to communicate with Internet users, or a knowledge management tool. However, an increasing share of blog content does not find its source in the personal experiences of the writer. Instead, the information may currently be kept in the user's desktop documents, in companies' catalogues, or in other blogs. Although the gap between blog and data source can be traversed by manual coding, this is a cumbersome task that defeats the blog's easiness principle. Moreover, depending on the quantity of information and its characterisation (i.e., structured content, unstructured content, etc.), an automatic approach can be more effective. Based on these observations, the aim of this dissertation is to assist blog publication through annotation, model transformation and crossblogging techniques. These techniques have been implemented to give rise to Blogouse, Catablog, and BlogUnion, tools that strive to improve the publication process considering the aforementioned data sources.
Abstract:
More and more users aim to take advantage of the existing Linked Open Data environment by formulating a query over one dataset and then processing the same query over different datasets, one after another, in order to obtain a broader set of answers. However, the heterogeneity of the vocabularies used in the datasets, on the one hand, and the scarcity of alignments among those datasets, on the other, make that querying task difficult. Considering this scenario, we present in this paper a proposal that allows on-demand translation of queries formulated over an original dataset into queries expressed using the vocabulary of a targeted dataset. Our approach relieves users from knowing the vocabulary used in the targeted datasets and, moreover, considers situations where alignments do not exist or are not suitable for the formulated query. Therefore, in order to favour the possibility of obtaining answers, a semantically equivalent translation is sometimes not guaranteed. The core component of our proposal is a query rewriting model based on a set of transformation rules devised from a pragmatic point of view. The feasibility of our scheme has been validated with queries defined in well-known benchmarks and SPARQL endpoint logs, as the obtained results confirm.
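As a rough illustration of alignment-driven rewriting (not the paper's actual rule set), the sketch below maps the predicate of a SPARQL-style triple pattern through a hypothetical alignment table and relaxes it to a variable when no alignment exists, trading semantic equivalence for the chance of obtaining answers. All names here (ALIGNMENTS, rewrite_triple, the dbo:/schema: terms) are illustrative assumptions.

# Minimal sketch of vocabulary-driven query rewriting over a toy
# alignment table; the paper's transformation rules are richer.
ALIGNMENTS = {  # hypothetical source-term -> target-term mappings
    "dbo:author": "schema:author",
    "dbo:birthPlace": "schema:birthPlace",
}

def rewrite_triple(s, p, o, fallback="?anyPredicate"):
    """Translate one triple pattern into the target vocabulary.

    If no alignment exists for the predicate, relax it to a variable:
    the pragmatic, non-equivalent behaviour described in the abstract.
    """
    return (s, ALIGNMENTS.get(p, fallback), o)

if __name__ == "__main__":
    print(rewrite_triple("?book", "dbo:author", "?person"))  # mapped
    print(rewrite_triple("?x", "dbo:genre", "?g"))           # relaxed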
Abstract:
Educational level: Undergraduate. Duration (in hours): 41 to 50 hours
Abstract:
[EU] Before starting this work I was clear that I wanted it to be about the Internet; moreover, I have been writing in different blogs since 2008 and have always been a self-taught learner, so from the beginning I wanted to channel that experience towards marketing. Accordingly, throughout the work I highlight the importance the Internet has today, from both an academic and a professional perspective. I begin by discussing the different learning methods in use today, especially those developed after the Bologna Plan, such as participatory learning, learning by doing, and so on. I then write about being self-taught: what it is, its advantages and disadvantages, and the limits the university places on self-directed learning. Next, I examine marketing, above all to show its relevance to the 2.0 world, since along with the evolution of that world another way of looking at marketing has developed. And since blogs are tools of ever-growing importance in marketing, I decided to write about them: what they are and the relevance they could have from an academic perspective. Finally, I wanted to give the work a personal touch. To that end, I considered it very important to know the response of the Sarriko students, so I developed three types of surveys: one scored from 1 to 10, one ranked from 1 to 7, and an open questionnaire on what students thought about creating a marketing blog to work on degree competences autonomously.
Abstract:
[ES] The aim of this work is to measure the advertising effectiveness of two on-line ad formats, specifically the large rectangle ("robapáginas") versus the contextual ad, as well as the factors that, according to the reviewed literature, influence that effectiveness. Its application context is blog-type web pages which, despite the surge in their use, have received little attention in advertising research.
Abstract:
This paper explores how audio chord estimation could improve if information about chord boundaries or beat onsets were revealed by an oracle. Chord estimation at the frame level is compared with three simulations, each using an oracle of increasing power. The beat and chord segments revealed by an oracle are used to compute a chord ranking at the segment level, and to compute the cumulative probability of finding the correct chord among the top-ranked chords. Oracle results on two different audio datasets demonstrate the substantial potential of segment-based over frame-based approaches for audio chord estimation. This paper also compares the oracle results on the Beatles dataset, the standard dataset in this area, with those on the new Billboard Hot 100 chord dataset.
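To make the segment-versus-frame comparison concrete, here is a minimal sketch (our illustration, not the paper's implementation) of pooling frame-level chord probabilities into a segment-level ranking using oracle boundaries, and of scoring how often the correct chord appears among the top-ranked candidates; the function names and the pooling rule (a plain mean) are assumptions.

# Segment-level chord ranking from frame probabilities and oracle boundaries.
import numpy as np

def segment_ranking(frame_probs, boundaries):
    """frame_probs: (n_frames, n_chords) per-frame chord probabilities.
    boundaries: frame indices where segments start, e.g. [0, 40, 90].
    Returns, per segment, chord indices sorted from most to least likely."""
    rankings = []
    edges = list(boundaries) + [len(frame_probs)]
    for start, end in zip(edges[:-1], edges[1:]):
        seg_score = frame_probs[start:end].mean(axis=0)  # pool frames
        rankings.append(np.argsort(seg_score)[::-1])     # best chord first
    return rankings

def top_k_accuracy(rankings, truth, k):
    """Fraction of segments whose correct chord is among the top k."""
    return np.mean([t in r[:k] for r, t in zip(rankings, truth)])

probs = np.random.dirichlet(np.ones(24), size=100)  # 100 frames, 24 chords
ranks = segment_ranking(probs, [0, 40, 90])
print(top_k_accuracy(ranks, [3, 7, 11], 5))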
Abstract:
[ES] The main objective of this master's thesis is the study of the thermal behaviour of the TriboLAB instrument during its stay on the International Space Station, together with the comparison of that behaviour with the one predicted by the mathematical thermal models used in the design of its thermal control system. The work carried out has substantially deepened knowledge of this behaviour. It will make real information about the thermal behaviour of an instrument with TriboLAB's characteristics available to other experimenters interested in placing their instruments on the external balconies of the International Space Station; such information is of great interest for the thermal control design of their instruments, especially now that the service life of the International Space Station has been extended to 2020. The thermal control of space equipment is a key aspect in ensuring its survival and correct operation under the extreme conditions of space. Its mission is to keep the various components within their admissible temperature ranges, since otherwise they could not operate, or might not even survive, beyond those temperatures. Additionally, it has been possible to verify the applicability of different functional data analysis techniques to the study of the type of data considered here. Likewise, the results of the thermal test campaign have been compared with the mathematical thermal models that guided the thermal control design, which are a fundamental piece in the thermal control design of any space instrument. This has made it possible to verify both the validity of the thermal control system designed for TriboLAB and the adequate similarity between the results of the mathematical thermal models and the temperatures recorded on the equipment. All of this has been carried out from the perspective of functional data analysis.
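As a toy illustration of the kind of model-versus-telemetry comparison described (not the thesis code), the following sketch computes an L2 distance between a measured temperature curve and the curve predicted by a thermal model on a common time grid; the orbit period, model shape and noise level are made up.

# L2 distance between a measured and a model-predicted temperature curve.
import numpy as np

def l2_distance(t, measured, predicted):
    """Square root of the integrated squared difference between curves."""
    return np.sqrt(np.trapz((measured - predicted) ** 2, t))

t = np.linspace(0.0, 5400.0, 200)                # one ~90-minute orbit, in s
model = 20 + 15 * np.sin(2 * np.pi * t / 5400)   # hypothetical model output
measured = model + np.random.normal(0, 0.5, t.size)  # noisy telemetry
print(f"L2 distance: {l2_distance(t, measured, model):.2f}")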
Abstract:
[ES] As part of this research project, the following final degree project was carried out:
Abstract:
[EN] This paper is an outcome of the ERASMUS IP program called TOPCART; more information about this project can be accessed from the following item:
Abstract:
[EN] Measuring semantic similarity and relatedness between textual items (words, sentences, paragraphs or even documents) is a very important research area in Natural Language Processing (NLP). In fact, it has many practical applications in other NLP tasks, for instance Word Sense Disambiguation, Textual Entailment, Paraphrase Detection, Machine Translation, Summarization, and related tasks such as Information Retrieval or Question Answering. In this master's thesis we study different approaches to computing the semantic similarity between textual items. In the framework of the European PATHS project, we also evaluate a knowledge-based method on a dataset of cultural item descriptions. Additionally, we describe the work carried out for the Semantic Textual Similarity (STS) shared task of SemEval-2012. This work has involved supporting the creation of datasets for similarity tasks, as well as the organization of the task itself.
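As a concrete baseline for the task (not the knowledge-based method evaluated in the thesis), the sketch below computes cosine similarity between the bag-of-words vectors of two sentences; it is only meant to show the shape of a similarity measure.

# Cosine similarity between word-count vectors of two texts.
from collections import Counter
import math

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the texts' word-count vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(cosine_similarity("a cat sat on the mat", "the cat lay on a mat"))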
Abstract:
Background: Colorectal cancer (CRC) is a disease of complex aetiology, with much of the expected inherited risk being due to several common low-risk variants. Genome-Wide Association Studies (GWAS) have identified 20 CRC risk variants. Nevertheless, these have only been able to explain part of the missing heritability. Moreover, these signals have only been inspected in populations of Northern European origin. Results: Thus, we followed the same approach in a Spanish cohort of 881 cases and 667 controls. Sixty-four variants at 24 loci were found to be associated with CRC at p-values < 10⁻⁵. We therefore evaluated the 24 loci in another Spanish replication cohort (1481 cases and 1850 controls). Two of these SNPs, rs12080929 at 1p33 (P-replication = 0.042; P-pooled = 5.523×10⁻³; OR (95% CI) = 0.866 (0.782–0.959)) and rs11987193 at 8p12 (P-replication = 0.039; P-pooled = 6.985×10⁻⁵; OR (95% CI) = 0.786 (0.705–0.878)), were replicated in the second phase, although they did not reach genome-wide statistical significance. Conclusions: We have performed the first CRC GWAS in a Southern European population, and by these means we were able to identify two new susceptibility variants at the 1p33 and 8p12 loci. These two SNPs are located near the SLC5A9 and DUSP4 loci, respectively, which could be good functional candidates for the association signals. We therefore believe that these two markers constitute good candidates for CRC susceptibility loci and should be further evaluated in other, larger datasets. Moreover, we highlight that, were these two SNPs true susceptibility variants, they would reduce the missing heritability fraction of CRC.
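For readers unfamiliar with the statistics reported above, the sketch below shows how an odds ratio and its 95% confidence interval are computed from a 2x2 table of allele counts; the counts in the example are invented and are not the study's data.

# Odds ratio and 95% CI from a 2x2 table of allele counts.
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and 95% CI for a table [[a, b], [c, d]]:
    a/b = risk-allele / other-allele counts in cases,
    c/d = the same counts in controls."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo, hi = (math.exp(math.log(or_) + s * z * se) for s in (-1, 1))
    return or_, lo, hi

print(odds_ratio_ci(420, 1342, 510, 1350))  # hypothetical counts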
Abstract:
In spite of over a century of research on cortical circuits, it is still unknown how many classes of cortical neurons exist. Neuronal classification has been a difficult problem because it is unclear what a neuronal cell class actually is and which characteristics best define one. Recently, unsupervised classification using cluster analysis based on morphological, physiological or molecular characteristics, when applied to selected datasets, has provided quantitative and unbiased identification of distinct neuronal subtypes. However, better and more robust classification methods are needed for increasingly complex and larger datasets. We explored the use of affinity propagation, a recently developed unsupervised classification algorithm imported from machine learning, which gives a representative example, or exemplar, for each cluster. As a case study, we applied affinity propagation to a test dataset of 337 interneurons belonging to four subtypes, previously identified based on morphological and physiological characteristics. We found that affinity propagation correctly classified most of the neurons in a blind, non-supervised manner. In fact, using a combined anatomical/physiological dataset, our algorithm differentiated parvalbumin from somatostatin interneurons in 49 out of 50 cases. Affinity propagation could therefore be used in future studies to validly classify neurons, as a first step towards reverse engineering neural circuits.
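Affinity propagation is available off the shelf; the sketch below (toy two-dimensional features, not the 337-interneuron dataset) shows the property the abstract relies on: each cluster is summarised by an exemplar, i.e. a representative real data point.

# Affinity propagation clustering on toy "cell feature" data.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
# two hypothetical "subtypes" in a 2-D morphological/physiological space
cells = np.vstack([rng.normal(0, 0.3, (25, 2)),
                   rng.normal(2, 0.3, (25, 2))])

ap = AffinityPropagation(random_state=0).fit(cells)
print("clusters found:", len(ap.cluster_centers_indices_))
print("exemplar rows:", ap.cluster_centers_indices_)  # indices of exemplars
print("labels:", ap.labels_[:10])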
Abstract:
265 p.
Abstract:
Singular Value Decomposition (SVD) is a key linear algebraic operation in many scientific and engineering applications. In particular, many computational intelligence systems rely on machine learning methods involving high-dimensionality datasets that have to be processed quickly for real-time adaptability. In this paper we describe a practical FPGA (Field Programmable Gate Array) implementation of an SVD processor for accelerating the solution of large LSE problems. The design approach has been comprehensive, from algorithmic refinement through numerical analysis to customization for an efficient hardware realization. The processing scheme rests on an adaptive vector-rotation evaluator for error regularization that enhances convergence speed with no penalty on solution accuracy. The proposed architecture, which follows a data-transfer scheme, is scalable and based on the interconnection of simple rotation units, allowing a trade-off between occupied area and processing acceleration in the final implementation. This permits the SVD processor to be implemented on both low-cost and high-end FPGAs, according to the final application requirements.
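As a software reference for the operation the FPGA design accelerates (not the hardware architecture itself), the sketch below solves a least-squares problem through the SVD, including the regularisation of tiny singular values that motivates the error-regularisation discussion; matrix sizes and tolerances are arbitrary.

# Minimum-norm least-squares solution of A x = b via the SVD.
import numpy as np

def svd_solve(A, b, rcond=1e-10):
    """Invert only the singular values above a relative threshold."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.where(s > rcond * s.max(), 1.0 / s, 0.0)  # regularise tiny sv
    return Vt.T @ (s_inv * (U.T @ b))

A = np.random.randn(100, 10)   # hypothetical overdetermined system
x_true = np.random.randn(10)
b = A @ x_true + 0.01 * np.random.randn(100)
print(np.allclose(svd_solve(A, b), x_true, atol=0.05))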
Abstract:
Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used to detect novel viral pathogens and to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered; rather, several distinct contigs derived from a single entity are, some of which have no sequence homology to any known proteins. De novo assembly of a single virus from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with relatively large genomes, divergent from previously known viruses of the families Rhabdoviridae and Coronaviridae. Depending on the specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap-closure strategies were successful in obtaining near-complete viral genomes.
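One of the strategies named above, k-mer frequency profiling, can be sketched compactly (a toy illustration, not the study's pipeline): compute normalised 4-mer profiles for contigs and compare them, since contigs from the same viral genome tend to have nearby profiles. The sequences and the distance choice here are assumptions.

# 4-mer frequency profiles for binning contigs by sequence composition.
from collections import Counter
from itertools import product

def kmer_profile(seq, k=4):
    """Normalised k-mer frequency vector over a fixed k-mer ordering."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[m] / total for m in kmers]

def profile_distance(p, q):
    """Manhattan distance between two profiles; smaller = more similar."""
    return sum(abs(a - b) for a, b in zip(p, q))

contig_a = "ATGCGATTACAGGCTA" * 40   # toy contigs, not real viral data
contig_b = "ATGCGATTACAGGCTT" * 40
print(profile_distance(kmer_profile(contig_a), kmer_profile(contig_b)))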