970 results for DATASETS
Abstract:
Thesis (Master's)--University of Washington, 2014
Abstract:
Doctoral thesis, Biology (Developmental Biology), Universidade de Lisboa, Faculdade de Ciências, 2015
Abstract:
Thesis (Ph.D.)--University of Washington, 2015
Abstract:
In the context of monolingual and bilingual retrieval, Simple Knowledge Organisation System (SKOS) datasets can play a dual role as knowledge bases for semantic annotations and as language-independent resources for translation. With no existing track of formal evaluations of these aspects for datasets in SKOS format, we describe a case study on the usage of the Thesaurus for the Social Sciences in SKOS format for a retrieval setup based on the CLEF 2004-2006 Domain-Specific Track topics, documents and relevance assessments. Results showed a mixed picture with significant system-level improvements in terms of mean average precision in the bilingual runs. Our experiments set a new and improved baseline for using SKOS-based datasets with the GIRT collection and are an example of component-based evaluation.
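The abstract above reports improvements in mean average precision (MAP). As a reminder of how that metric is usually computed, here is a minimal sketch; the function names and the toy document identifiers are illustrative assumptions and are not taken from the study or from the CLEF tooling.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list.

    ranked   -- document ids in the order the system returned them
    relevant -- ids judged relevant for the topic
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            # precision at the rank where a relevant document appears
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of the per-topic average precision values.

    runs -- list of (ranked, relevant) pairs, one per topic
    """
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

Relevant documents that are never retrieved still count in the denominator, so a run is penalised for missing them.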
Abstract:
For two reasons, our capacity for systematic comparison of innovative participatory democratic processes remains limited. First, the category of participatory democratic innovations remains relatively vague when compared to more traditional democratic institutions and practices. Second, until recently there existed no large-sample databases that captured relevant variables in the practice of democratic innovation. The lone exception to these patterns is the Participedia database, located online. Participedia is well placed to respond to the two obstacles to systematic comparative research on democratic innovation. First, its crowdsourced data collection strategy means that many of the cases on the platform are not well known and have not been the subject of sustained academic analysis. Second, the data captured in the articles provides the basis for systematic comparative analysis of democratic innovations both within type (e.g., participatory budgeting, mini-publics) and across types. The platform allows for systematic content analysis of text descriptions and/or statistical analysis of the datasets generated from the structured data fields. This article describes the data about innovative participatory democratic processes available from Participedia, and furnishes examples of the kinds of quantitative and qualitative insights about those processes that Participedia enables.
Abstract:
Combined heredity of surnames and physique, coupled with past marriage patterns and trade-specific physical aptitude and selection factors, may have led to differential assortment of bodily characteristics among present-day men with specific trade-reflecting surnames (Tailor vs. Smith). Two studies reported here were partially consistent with this genetic-social hypothesis, first proposed by Bäumler (1980). Study 1 (N = 224) indicated significantly higher self-rated physical aptitude for prototypically strength-related activities (professions, sports, hobbies) in a random sample of Smiths. The counterpart effect (higher aptitude for dexterity-related activities among Tailors) was directionally correct, but not significant, and Tailor-Smith differences in basic physique variables were not significant. Study 2 examined two large datasets (Austria/Germany combined, and UK: N = 7001 and 20532) of men’s national high-score lists for track-and-field events requiring different physiques. In both datasets, proportions of Smiths significantly increased from light-stature over medium-stature to heavy-stature sports categories. The predicted counterpart effect (decreasing prevalences of Tailors along these categories) was not supported. Related prior findings, implicit egotism as an alternative interpretation of the evidence, and directions for further inquiry are discussed in conclusion.
Abstract:
Data registration refers to a series of techniques for matching or bringing similar objects or datasets together into alignment. These techniques are widely used in a variety of applications, such as video coding, tracking, object and face detection and recognition, surveillance and satellite imaging, medical image analysis and structure from motion. Registration methods are as numerous as their manifold uses, from pixel-level and block- or feature-based methods to Fourier domain methods. This book is focused on providing algorithms and image and video techniques for registration and quality performance metrics. The authors provide various assessment metrics for measuring registration quality alongside analyses of registration techniques, introducing and explaining both familiar and state-of-the-art registration methodologies used in a variety of targeted applications.
Abstract:
In order to accelerate computing the convex hull on a set of n points, a heuristic procedure is often applied to reduce the number of points to a set of s points, s ≤ n, which also contains the same hull. We present an algorithm to precondition 2D data with integer coordinates bounded by a box of size p × q before building a 2D convex hull, with three distinct advantages. First, we prove that under the condition min(p, q) ≤ n the algorithm executes in time within O(n); second, no explicit sorting of data is required; and third, the reduced set of s points forms a simple polygonal chain and thus can be directly pipelined into an O(n) time convex hull algorithm. This paper empirically evaluates and quantifies the speed up gained by preconditioning a set of points by a method based on the proposed algorithm before using common convex hull algorithms to build the final hull. A speedup factor of at least four is consistently found from experiments on various datasets when the condition min(p, q) ≤ n holds; the smaller the ratio min(p, q)/n is in the dataset, the greater the speedup factor achieved.
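The abstract above does not spell out the preconditioning procedure, so the following is only a sketch of one simple reduction in the same spirit, not the paper's algorithm. It assumes integer x coordinates in the range 0 to p - 1 and keeps the lowest and highest point of each x column, which takes O(n + p) time without sorting and preserves the hull. The function names are hypothetical. A standard monotone chain hull, which does sort, is included only to check that the reduced set yields the same hull; it is not the O(n) chain-based hull algorithm the abstract mentions.

```python
def precondition(points, p):
    """Keep only the lowest and highest point in each integer x column.

    points -- iterable of (x, y) integer pairs with 0 <= x < p (assumed)
    Returns at most 2*p points ordered as a closed chain: the lower
    envelope from left to right, then the upper envelope right to left.
    """
    lo = [None] * p
    hi = [None] * p
    for x, y in points:
        if lo[x] is None or y < lo[x]:
            lo[x] = y
        if hi[x] is None or y > hi[x]:
            hi[x] = y
    lower = [(x, lo[x]) for x in range(p) if lo[x] is not None]
    upper = [(x, hi[x]) for x in range(p - 1, -1, -1)
             if hi[x] is not None and hi[x] != lo[x]]
    return lower + upper


def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for pt in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], pt) <= 0:
            lower.pop()
        lower.append(pt)
    upper = []
    for pt in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], pt) <= 0:
            upper.pop()
        upper.append(pt)
    return lower[:-1] + upper[:-1]
```

A point that is neither the minimum nor the maximum of its column lies on a vertical segment between two kept points, so it cannot be a hull vertex; that is why discarding it is safe.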
Abstract:
Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
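The first step described in the abstract above, building relative frequency histograms of fixed-length amino acid words, can be sketched as follows. This is a minimal illustration only: the use of overlapping words (a sliding window of step 1) and the function name are assumptions, since the abstract does not say how the words are extracted, and the later transformation steps are not reproduced.

```python
from collections import Counter


def word_histogram(sequence, k):
    """Relative frequency of each length-k word in an amino acid sequence.

    Words are taken with a sliding window of step 1 (an assumption).
    Returns a dict mapping each observed word to its relative frequency.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}
```

For example, `word_histogram("MKMKA", 2)` counts the words MK, KM, MK and KA, so MK gets a relative frequency of 0.5 and the other two get 0.25 each.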
Abstract:
Final Master's project report submitted for the degree of Master in Electronics and Telecommunications Engineering