967 results for Theoretical Computer Science
Abstract:
We present a brief historical note on the evolution of line transect sampling and its theoretical developments. We describe line transect sampling theory as proposed by Buckland (1992) and present the most relevant issues in modeling the detection function. We describe the MDL (minimum description length) principle (Rissanen, 1978) and its application to histogram density estimation (Kontkanen and Myllymäki, 2006), with a practical example using a mixture of densities.
We then apply it to estimate the probability of detection, and hence animal population density, in the context of line transect sampling. Two classical examples from the distance sampling literature are analyzed and compared. In order to evaluate the proposed methodology, we carry out a simulation study based on the wooden stakes example, using half-normal, hazard-rate, exponential, and uniform-with-cosine detection functions. The results were obtained using the program DISTANCE (Thomas et al., in press) and an algorithm written in C, kindly provided by Professor Petri Kontkanen (Department of Computer Science, University of Helsinki). We developed programs to compute confidence intervals using the bootstrap technique (Efron, 1978). Finally, the results are discussed and suggestions for future developments are presented.
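For orientation, a standard parametric line transect estimator can be sketched in a few lines. The sketch assumes an untruncated half-normal detection function g(x) = exp(-x^2 / (2 sigma^2)), for which the maximum-likelihood estimate is sigma^2 = sum(x^2) / n, the effective strip half-width is mu = sigma * sqrt(pi / 2), and the density estimate is D = n / (2 L mu). The function name and the sample distances are illustrative assumptions; the thesis itself estimates the detection probability with an MDL histogram rather than with this parametric form.

```python
import math


def halfnormal_density(distances, line_length):
    """Estimate animal density from perpendicular distances (same units as line_length).

    Assumes an untruncated half-normal detection function; this is an
    illustrative sketch, not the MDL histogram estimator of the thesis.
    """
    n = len(distances)
    # Maximum-likelihood estimate of sigma^2 for the half-normal model.
    sigma2 = sum(x * x for x in distances) / n
    # Effective strip half-width: integral of g(x) from 0 to infinity.
    mu = math.sqrt(math.pi * sigma2 / 2.0)
    # Animals per unit area: n detections over an effective area of 2 * L * mu.
    return n / (2.0 * line_length * mu)


if __name__ == "__main__":
    # Hypothetical perpendicular distances (metres) along a 1000 m transect.
    sample = [2.0, 5.5, 1.2, 8.3, 3.7, 0.4, 6.1]
    print(halfnormal_density(sample, 1000.0))
```

Replacing `mu` with an effective half-width derived from a histogram density estimate at distance zero would give the nonparametric variant; that step is not shown here.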
Abstract:
In the study of complex networks, vertex centrality measures are used to identify the most important vertices within a graph. A related problem is that of measuring the centrality of an edge. In this paper, we propose a novel edge centrality index rooted in quantum information. More specifically, we measure the importance of an edge in terms of its contribution to the von Neumann entropy of the graph. We show that this can be computed in terms of the Holevo quantity, a well-known quantum information theoretical measure. While computing the von Neumann entropy, and hence the Holevo quantity, requires computing the spectrum of the graph Laplacian, we show how to obtain a simplified measure through a quadratic approximation of the Shannon entropy. This in turn shows that the proposed centrality measure is strongly correlated with the negative degree centrality on the line graph. We evaluate our centrality measure through an extensive set of experiments on real-world as well as synthetic networks, and we compare it against commonly used alternative measures.
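The quadratic approximation can be illustrated with a small sketch. It rests on assumptions that the abstract does not spell out: the graph state is taken as rho = L / Tr(L), the entropy is approximated by 1 - Tr(rho^2), and the Holevo quantity for an edge e is taken over the two-part ensemble {e, G minus e} with weights 1/m and (m - 1)/m. A single edge is a pure state with zero entropy, so the score reduces to S(G) - ((m - 1) / m) S(G minus e). For a simple graph, Tr(L^2) equals the sum of squared degrees plus 2m, so no eigenvalues are needed. The function names are illustrative.

```python
from collections import Counter


def quadratic_entropy(edges):
    """Approximate entropy 1 - Tr(rho^2) with rho = L / Tr(L) for a simple graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Tr(L^2) = sum of squared degrees + 2m, and Tr(L) = 2m.
    sum_sq = sum(d * d for d in degree.values())
    return 1.0 - (sum_sq + 2 * m) / (4.0 * m * m)


def edge_centrality(edges):
    """Score each edge by the (approximate) Holevo quantity of {e, G minus e}."""
    m = len(edges)
    s_full = quadratic_entropy(edges)
    scores = {}
    for i, edge in enumerate(edges):
        rest = edges[:i] + edges[i + 1:]
        scores[edge] = s_full - (m - 1) / m * quadratic_entropy(rest)
    return scores


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (1, 3), (3, 4)]
    for edge, score in edge_centrality(graph).items():
        print(edge, round(score, 4))
```

In this toy graph the edge whose endpoints have the smallest total degree, (3, 4), gets the highest score, and the edge with the largest total degree, (1, 3), gets the lowest. That is consistent with the stated negative correlation with degree on the line graph, where the degree of an edge is the sum of its endpoint degrees minus two.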
Abstract:
This thesis presents a cloud-based software platform for sharing publicly available scientific datasets. The proposed platform leverages NoSQL databases and asynchronous I/O technologies, such as Node.js, to achieve high performance and flexibility. The platform serves two main groups of users: dataset providers, the researchers responsible for sharing and maintaining datasets, and dataset users, those who want to access the public data. The former are given tools to easily publish and maintain large volumes of data, whereas the latter are given tools to preview the original data and create subsets of it through filter and aggregation operations. The choice of NoSQL over a more traditional RDBMS emerged from an extended benchmark between a relational database (MySQL) and a NoSQL database (MongoDB), which is also presented in this thesis. The results confirm the theoretical expectation that NoSQL databases are more suitable for the kind of data that our users will be handling, i.e., non-homogeneous data structures that can grow very quickly. It is envisioned that a platform like this can lead the way to a new era of scientific data sharing in which researchers can easily share and access all kinds of datasets and, in more advanced scenarios, be presented with recommended datasets and with existing research results built on them.
Abstract:
The optical constants were computed using Wolfe's method. These constants, namely the absorption coefficient (α), the refractive index (n), and the thickness of a thin film (d), are important in the optical characterization of the material. Wolfe's method was compared with the method used by R. Swanepoel. A constrained nonlinear programming model was developed that makes it possible to estimate the optical constants of semiconductor thin films from known transmission data alone. A solution of the nonlinear programming model by quadratic programming was presented. The reliability of the proposed method was demonstrated through numerical experiments with spectral transmittance measurements on Cu3BiS3 thin films, yielding α = 10378.34 cm⁻¹, n = 2.4595, d = 989.71 nm, and Eg = 1.39 eV.
Abstract:
Neural representations (NR) have emerged in the last few years as a powerful tool to represent signals from several domains, such as images, 3D shapes, or audio. Indeed, deep neural networks have been shown to be capable of approximating continuous functions that describe a given signal with theoretically infinite resolution. This finding makes it possible to obtain representations whose memory footprint is fixed and decoupled from the resolution at which the underlying signal can be sampled, something that is not possible with traditional discrete representations, e.g., grids of pixels for images or voxels for 3D shapes. During the last two years, many techniques have been proposed to improve the capability of NR to approximate high-frequency details and to make the optimization procedures required to obtain NR less demanding in terms of both time and data, motivating many researchers to deploy NR as the main form of data representation for complex pipelines. Following this line of research, we first show that NR can precisely approximate unsigned distance functions, providing an effective way to represent garments that feature open 3D surfaces and unknown topology. Then, we present a pipeline to obtain, in a few minutes, a compact Neural Twin® for a given object by exploiting recent advances in modeling neural radiance fields. Furthermore, we move a step toward adopting NR as a standalone representation by considering the possibility of performing downstream tasks directly on the NR weights. We first show that deep neural networks can be compressed into compact latent codes. Then, we show how this technique can be exploited to perform deep learning on implicit neural representations (INR) of 3D shapes by looking only at the weights of the networks.
Abstract:
The discovery of new materials and their functions has always been a fundamental component of technological progress. Nowadays, the quest for new materials is stronger than ever: sustainability, medicine, robotics and electronics are all key areas that depend on the ability to create specifically tailored materials. However, designing materials with desired properties is a difficult task, and the complexity of the discipline makes it difficult to identify general criteria. While scientists have developed a set of best practices (often based on experience and expertise), this is still a trial-and-error process. It becomes even more complex when dealing with advanced functional materials. Their properties depend on structural and morphological features, which in turn depend on fabrication procedures and environment, and subtle alterations lead to dramatically different results. Because of this, materials modeling and design is one of the most prolific research fields. Many techniques and instruments are continuously developed to enable new possibilities, in both the experimental and computational realms. Scientists strive to adopt cutting-edge technologies in order to make progress. However, the field is strongly affected by unorganized file management and by the proliferation of custom data formats and storage procedures, in both experimental and computational research. Results are difficult to find, interpret and reuse, and a huge amount of time is spent interpreting and reorganizing data. This also strongly limits the application of data-driven and machine learning techniques. This work introduces possible solutions to the problems described above.
Specifically, it covers developing features for specific classes of advanced materials and using them to train machine learning models and accelerate computational predictions for molecular compounds; developing methods for organizing non-homogeneous materials data; automating the use of device simulations to train machine learning models; and dealing with scattered experimental data and using them to discover new patterns.
Abstract:
The world currently faces a paradox in terms of accessibility for people with disabilities. While digital technologies hold immense potential to improve their quality of life, the majority of web content still exhibits critical accessibility issues. This PhD thesis addresses this challenge through two interconnected research branches. The first introduces a groundbreaking approach to improving web accessibility by rethinking how it is practiced, making accessibility work itself more accessible. It involves the development of: 1. AX, a declarative framework of web components that enforces the generation of accessible markup by means of static analysis. 2. An innovative accessibility testing and evaluation methodology, which communicates test results by exploiting concepts that developers are already familiar with (visual rendering and mouse operability) to convey the accessibility of a page. This methodology is implemented through the SAHARIAN browser extension. 3. A11A, a categorized and structured collection of curated accessibility resources, aimed at helping their intended audiences discover and use them. The second branch focuses on unleashing the full potential of digital technologies to improve accessibility in the physical world. The thesis proposes the SCAMP methodology to make scientific artifacts accessible to blind and visually impaired individuals as well as to the general public. It enhances the natural characteristics of objects, making them more accessible through interactive, multimodal, and multisensory experiences. Additionally, the prototype of \gls{a11yvt}, a system supporting accessible virtual tours, is presented. It provides blind and visually impaired individuals with the features necessary to explore unfamiliar indoor environments, while maintaining universal design principles that make it suitable for use by the general public.
The thesis extensively discusses the theoretical foundations, design, development, and unique characteristics of these innovative tools. Usability tests with the intended target audiences demonstrate the effectiveness of the proposed artifacts, suggesting their potential to significantly improve the current state of accessibility.
Abstract:
This thesis project is framed in the research field of Physics Education and aims to contribute to the reflection on the importance of disciplinary identities in addressing interdisciplinarity through the lens of the Nature of Science (NOS). In particular, the study focuses on the module on the parabola and parabolic motion, which was designed within the EU project IDENTITIES. The project aims to design modules that innovate pre-service teacher education in response to contemporary challenges, focusing on interdisciplinarity in curricular and STEM topics (especially between physics, mathematics and computer science). The modules are designed according to a model of disciplines and interdisciplinarity that the IDENTITIES project has been building on two main theoretical frameworks: the Family Resemblance Approach (FRA), reconceptualized for the Nature of Science (Erduran & Dagher, 2014), and the boundary crossing and boundary objects framework by Akkerman and Bakker (2011). The main aim of the thesis is to explore the impact of this interdisciplinary model in the specific case of the implementation of the parabola and parabolic motion module in a pre-service teacher education context. To this end, we analyzed data collected during the implementation in order to investigate, in particular, the role of the FRA as a learning tool to: a) elaborate on the concept of “discipline”, within the broader problem of defining interdisciplinarity; b) compare the epistemic core of physics and mathematics; c) develop epistemic skills and interdisciplinary competences in student-teachers. The analysis of the data led us to recognize three different roles played by the FRA: FRA as epistemological activator, FRA as scaffolding for reasoning and navigating (inhabiting) the complexity, and FRA as lens to investigate the relationship between physics and mathematics in the historical case.
Abstract:
Due to both the widespread, multipurpose use of document images and the current availability of a large number of document image repositories, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes the content of document images, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents with that of the relationships created among their respective document images. Using those same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of common OCR misrecognition, which reinforces the feasibility of using LinkDI to relate highly degraded OCR output.
Abstract:
In symbolic Natural Language Processing (NLP) systems, several linguistic phenomena, for instance the thematic role relationships between sentence constituents, such as AGENT, PATIENT, and LOCATION, can be accounted for by a rule-based grammar. Another approach to NLP uses the connectionist model, which has the benefits of learning, generalization and fault tolerance, among others. A third option merges the two previous approaches into a hybrid one: a symbolic thematic theory is used to supply the connectionist network with initial knowledge. Inspired by neuroscience, a symbolic-connectionist hybrid system called BIO theta PRED (BIOlogically plausible thematic (theta) symbolic-connectionist PREDictor) is proposed, designed to reveal the thematic grid assigned to a sentence. Its connectionist architecture takes, as input, a featural representation of the words (based on the verb/noun WordNet classification and on the classical semantic microfeature representation) and produces, as output, the thematic grid assigned to the sentence. BIO theta PRED is designed to "predict" the thematic (semantic) roles assigned to words in a sentence context, employing a biologically inspired training algorithm and architecture and adopting a psycholinguistic view of thematic theory.
Abstract:
This paper presents SMarty, a variability management approach for UML-based software product lines (PL). SMarty is supported by a UML profile, the SMartyProfile, and a process for managing variabilities, the SMartyProcess. SMartyProfile aims at representing variabilities, variation points, and variants in UML models by applying a set of stereotypes. SMartyProcess consists of a set of activities that is systematically executed to trace, identify, and control variabilities in a PL based on SMarty. It also identifies variability implementation mechanisms and analyzes specific product configurations. In addition, a more comprehensive application of SMarty is presented using SEI's Arcade Game Maker PL. An evaluation of SMarty and related work are discussed.
Abstract:
Objective: We carry out a systematic assessment of a suite of kernel-based learning machines on the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. Methods and materials: The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely, Gaussian and exponential radial basis functions) were considered, as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. Results: We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations, whereby one can visually inspect their levels of sensitivity to the type of feature and to the kernel function/parameter value.
Conclusions: Overall, the results show that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value, as well as the choice of the feature extractor, are critical decisions, although the choice of the wavelet family seems to be less relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). (C) 2011 Elsevier B.V. All rights reserved.
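The two kernel functions named above can be written down compactly. The sketch assumes the common textbook forms, k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for the Gaussian RBF and k(x, y) = exp(-||x - y|| / (2 sigma^2)) for the exponential RBF, with sigma playing the role of the radius parameter; the abstract does not give the exact parameterization used in the study, and the function names are illustrative.

```python
import math


def gaussian_rbf(x, y, sigma):
    """Gaussian RBF kernel: decays with the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def exponential_rbf(x, y, sigma):
    """Exponential RBF kernel: decays with the (unsquared) Euclidean distance."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Hypothetical two-dimensional feature vectors.
    u, v = [0.0, 0.0], [3.0, 4.0]
    for radius in (1.0, 5.0, 25.0):
        print(radius, gaussian_rbf(u, v, radius), exponential_rbf(u, v, radius))
```

Sweeping the radius as in the loop above is the kind of grid over kernel parameter values that a sensitivity profile is built from: small radii make both kernels nearly zero for distinct points, while large radii make them nearly one.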
Abstract:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied to clustering pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications to the Independent Component Analysis Mixture Model (ICAMM). These improvements were proposed by considering some of the model's limitations and by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which combines Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied to segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
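The Sobel step of the pre-processing stage can be sketched as follows. This is a minimal illustration of the standard 3x3 Sobel operator applied to a grayscale image stored as a list of rows; the function name, the choice to compute only interior pixels, and the toy image are assumptions, and the SCS denoising stage is not shown.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude for every interior pixel of a 2D image."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out_row.append(math.hypot(gx, gy))
        result.append(out_row)
    return result


if __name__ == "__main__":
    # Toy image with a vertical step edge between the second and third columns.
    step = [[0, 0, 1, 1] for _ in range(4)]
    print(sobel_magnitude(step))
```

The output is two rows and two columns smaller than the input because border pixels lack a full 3x3 neighborhood; padding the image first is a common alternative.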
Abstract:
We present a model in which agents show discrete behavior in their actions but hold continuous opinions that are updated by interacting with other agents. This new updating rule is applied to both the voter and Sznajd models for interaction between neighbors, and its consequences are discussed. The appearance of extremists is naturally observed and seems to be a characteristic of this model.
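A voter-style version of such a model can be sketched under explicit assumptions that the abstract does not state: each agent stores its opinion as a real number (read here as log-odds in favor of one of two choices), its visible action is the sign of that number, and observing a neighbor's action shifts the observer's opinion by a fixed step toward that action. The ring topology, the step size, and all names are illustrative.

```python
import random


def action(opinion):
    """Discrete action shown to neighbors: the sign of the continuous opinion."""
    return 1 if opinion >= 0 else -1


def simulate(opinions, steps, step_size=0.5, seed=0):
    """Run voter-style updates on a ring and return the final opinions."""
    rng = random.Random(seed)
    state = list(opinions)
    n = len(state)
    for _ in range(steps):
        i = rng.randrange(n)                  # agent that updates
        j = (i + rng.choice((-1, 1))) % n     # one of its two ring neighbors
        # Only the neighbor's action is observed, never its opinion.
        state[i] += step_size * action(state[j])
    return state


if __name__ == "__main__":
    rng = random.Random(1)
    start = [rng.uniform(-1.0, 1.0) for _ in range(50)]
    final = simulate(start, steps=20000)
    print(max(abs(x) for x in start), max(abs(x) for x in final))
```

Inside a domain of agents who all act the same way, every update pushes opinions further from zero, which is one way extreme opinions can arise even though the visible actions stay binary.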