933 results for non-trivial data structures


Relevance:

100.00%

Publisher:

Abstract:

Software repositories have been getting a lot of attention from researchers in recent years. In order to analyze software repositories, it is necessary to first extract raw data from the version control and problem tracking systems. This poses two challenges: (1) extraction requires a non-trivial effort, and (2) the results depend on the heuristics used during extraction. These challenges burden researchers who are new to the community and make it difficult to benchmark software repository mining, since it is almost impossible to reproduce experiments done by another team. In this paper we present the TA-RE corpus. TA-RE collects extracted data from software repositories in order to build a collection of projects that will simplify the extraction process. Additionally, the collection can be used for benchmarking. As a first step, we propose an exchange language that makes sharing and reusing data as simple as possible.
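The abstract does not specify the exchange language itself; as a loose illustration of what an interchange record for extracted repository data could look like, here is a sketch assuming a JSON serialization. The field names (commit_id, linked_issues, extraction_heuristic, ...) are invented for the example and are not taken from the TA-RE specification.

```python
# Hypothetical interchange record for one extracted commit; field names are
# illustrative only, not the TA-RE format.
import json

commit_record = {
    "commit_id": "a3f9c2e",
    "author": "alice@example.org",
    "timestamp": "2006-05-17T14:32:00Z",
    "message": "Fix null check in parser",
    "changed_files": ["src/Parser.java"],
    "linked_issues": ["BUG-142"],                 # heuristic link, kept explicit
    "extraction_heuristic": "message-id-regex",   # records how the link was derived
}

print(json.dumps(commit_record, indent=2))
```

Recording which heuristic produced each derived field is the kind of provenance that would make extracted datasets comparable across teams, which is the reproducibility concern raised above.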

Relevance:

100.00%

Publisher:

Abstract:

Non-monotone incentive structures, which, according to theory, are able to induce optimal behavior, are often regarded as empirically less relevant for labor relationships. We compare the performance of a theoretically optimal non-monotone contract with a monotone one under controlled laboratory conditions. Implementing some features relevant to real-world employment relationships, our paper demonstrates that, in fact, the frequency of income-maximizing decisions made by agents is higher under the monotone contract. Although this observed behavior does not change the superiority of the non-monotone contract for principals, they do not choose this contract type to any significant degree. This is what we call the monotonicity puzzle. Detailed investigations of decisions provide a clue to solving the puzzle and a possible explanation for the popularity of monotone contracts.

Relevance:

100.00%

Publisher:

Abstract:

Interactive ray tracing of non-trivial scenes is just becoming feasible on a single graphics processing unit (GPU). Recent work in this area focuses on building effective acceleration structures, which work well under the constraints of current GPUs. Most approaches are targeted at static scenes and only allow navigation in the virtual scene. So far, support for dynamic scenes has not been considered in GPU implementations. We have developed a GPU-based ray tracing system for dynamic scenes consisting of a set of individual objects. Each object may move around independently, but its geometry and topology are static.
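A minimal CPU-side sketch of the idea behind per-object structures for rigidly moving objects follows; the paper's GPU pipeline and real acceleration structures are not reproduced. Geometry stays fixed in each object's local space, and the ray is transformed into that space instead, so nothing has to be rebuilt when an object moves. The single-sphere objects and translation-only transforms are simplifying assumptions.

```python
# Sketch: rays are moved into each object's static local space rather than
# moving (and re-building structures over) the geometry itself.
import math

def intersect_sphere(origin, direction, center, radius):
    """Nearest positive ray parameter t for a unit-length direction, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

class RigidObject:
    def __init__(self, local_center, radius, translation):
        self.local_center = local_center   # static, local-space geometry
        self.radius = radius
        self.translation = translation     # per-frame rigid motion (translation only here)

    def intersect(self, origin, direction):
        # Transform the ray into object space instead of transforming the geometry.
        local_origin = [o - t for o, t in zip(origin, self.translation)]
        return intersect_sphere(local_origin, direction, self.local_center, self.radius)

scene = [RigidObject([0, 0, 5], 1.0, [0, 0, 0]),
         RigidObject([0, 0, 5], 1.0, [2.5, 0, 0])]   # same local geometry, moved
ray_origin, ray_dir = [0, 0, 0], [0, 0, 1]
hits = [(t, i) for i, obj in enumerate(scene)
        if (t := obj.intersect(ray_origin, ray_dir)) is not None]
print(min(hits) if hits else "no hit")   # nearest hit as (t, object index)
```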

Relevance:

100.00%

Publisher:

Abstract:

In order to analyze software systems, it is necessary to model them. Static software models are commonly imported by parsing source code and related data. Unfortunately, building custom parsers for most programming languages is a non-trivial endeavour. This poses a major bottleneck for analyzing software systems programmed in languages for which importers do not already exist. Luckily, initial software models do not require detailed parsers, so it is possible to start analysis with a coarse-grained importer, which is then gradually refined. In this paper we propose an approach to "agile modeling" that exploits island grammars to extract initial coarse-grained models, parser combinators to enable gradual refinement of model importers, and various heuristics to recognize language structure, keywords and other language artifacts.
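As a rough illustration of coarse-grained importing with an island grammar, the sketch below recognizes only class and method headers (the "islands") in Java-like source and skips everything else (the "water"). The regular expressions and model shape are assumptions made for the example; the authors' importer would refine such a coarse model further, e.g. with parser combinators.

```python
# Island-style coarse-grained model extraction: only class and method headers
# are parsed; all other text is treated as water and ignored.
import re

CLASS_RE = re.compile(r'^\s*(?:public\s+)?class\s+(\w+)')
METHOD_RE = re.compile(r'^\s*(?:public|private|protected)\s+\w+\s+(\w+)\s*\(')

def extract_coarse_model(source: str) -> dict:
    model, current_class = {}, None
    for line in source.splitlines():
        if m := CLASS_RE.match(line):
            current_class = m.group(1)
            model[current_class] = []
        elif (m := METHOD_RE.match(line)) and current_class:
            model[current_class].append(m.group(1))
        # anything else is water: ignored at this level of refinement
    return model

java_snippet = """
public class Parser {
    private int pos;
    public Token nextToken() { ... }
    public Node parseExpression() { ... }
}
"""
print(extract_coarse_model(java_snippet))
# {'Parser': ['nextToken', 'parseExpression']}
```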

Relevance:

100.00%

Publisher:

Abstract:

The combination of scaled analogue experiments, material mechanics, X-ray computed tomography (XRCT) and digital volume correlation (DVC) techniques is a powerful new tool not only to examine the three-dimensional structure and kinematic evolution of complex deformation structures in scaled analogue experiments, but also to fully quantify their spatial strain distribution and complete strain history. Digital image correlation (DIC) is an important advance in quantitative physical modelling and helps to understand non-linear deformation processes. Optical, non-intrusive DIC techniques enable the quantification of localised and distributed deformation in analogue experiments based either on images taken through transparent sidewalls (2D DIC) or on surface views (3D DIC). XRCT analysis permits the non-destructive visualisation of the internal structure and kinematic evolution of scaled analogue experiments simulating the tectonic evolution of complex geological structures. The combination of XRCT sectional image data of analogue experiments with 2D DIC only allows quantification of 2D displacement and strain components in the section direction. This leaves the potential of CT experiments for full 3D strain analysis of complex, non-cylindrical deformation structures untapped. In this study, we apply DVC techniques to XRCT scan data of “solid” analogue experiments to fully quantify the internal displacement and strain in three dimensions over time. Our first results indicate that DVC techniques applied to XRCT volume data can successfully quantify the 3D spatial and temporal strain patterns inside analogue experiments. We demonstrate the potential of combining DVC techniques and XRCT volume imaging for 3D strain analysis of a contractional experiment simulating the development of a non-cylindrical pop-up structure. Furthermore, we discuss various options for optimisation of granular materials, pattern generation, and data acquisition for increased resolution and accuracy of the strain results. Three-dimensional strain analysis of analogue models is of particular interest for geological and seismic interpretations of complex, non-cylindrical geological structures. The volume strain data enable the analysis of the large-scale and small-scale strain history of geological structures.
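The principle behind DVC can be sketched in a few lines, assuming numpy and synthetic data; real DVC adds subvoxel interpolation, normalised correlation criteria and strain computation, all omitted here. A subvolume of the reference volume is matched against the deformed volume over a small search window, and the best-matching offset is taken as the local displacement.

```python
# Toy digital volume correlation: recover a known rigid voxel shift by
# maximising a plain correlation score over a small integer search window.
import numpy as np

rng = np.random.default_rng(0)
reference = rng.random((40, 40, 40))
deformed = np.roll(reference, shift=(2, 1, 0), axis=(0, 1, 2))   # synthetic shift

def local_displacement(ref, defo, corner, size=10, search=4):
    z, y, x = corner
    sub = ref[z:z+size, y:y+size, x:x+size]
    best, best_shift = -np.inf, (0, 0, 0)
    for dz in range(-search, search + 1):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                cand = defo[z+dz:z+dz+size, y+dy:y+dy+size, x+dx:x+dx+size]
                score = np.sum(sub * cand)   # plain score; real DVC normalises
                if score > best:
                    best, best_shift = score, (dz, dy, dx)
    return best_shift

print(local_displacement(reference, deformed, corner=(15, 15, 15)))   # expect (2, 1, 0)
```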

Relevance:

100.00%

Publisher:

Abstract:

PURPOSE: We aimed at further elucidating whether aphasic patients' difficulties in understanding non-canonical sentence structures, such as Passive or Object-Verb-Subject sentences, can be attributed to impaired morphosyntactic cue recognition, and to problems in integrating competing interpretations. METHODS: A sentence-picture matching task with canonical and non-canonical spoken sentences was performed using concurrent eye tracking. Accuracy, reaction time, and eye tracking data (fixations) of 50 healthy subjects and 12 aphasic patients were analysed. RESULTS: Patients showed increased error rates and reaction times, as well as delayed fixation preferences for target pictures in non-canonical sentences. Patients' fixation patterns differed from healthy controls and revealed deficits in recognizing and immediately integrating morphosyntactic cues. CONCLUSION: Our study corroborates the notion that difficulties in understanding syntactically complex sentences are attributable to a processing deficit encompassing delayed and therefore impaired recognition and integration of cues, as well as increased competition between interpretations.

Relevance:

100.00%

Publisher:

Abstract:

We present a real-world staff-assignment problem that was reported to us by a provider of an online workforce scheduling software. The problem consists of assigning employees to work shifts subject to a large variety of requirements related to work laws, work shift compatibility, workload balancing, and personal preferences of employees. A target value is given for each requirement, and all possible deviations from these values are associated with acceptance levels. The objective is to minimize the total number of deviations in ascending order of the acceptance levels. We present an exact lexicographic goal programming MILP formulation and an MILP-based heuristic. The heuristic consists of two phases: in the first phase a feasible schedule is built and in the second phase parts of the schedule are iteratively re-optimized by applying an exact MILP model. A major advantage of such MILP-based approaches is the flexibility to account for additional constraints or modified planning objectives, which is important as the requirements may vary depending on the company or planning period. The applicability of the heuristic is demonstrated for a test set derived from real-world data. Our computational results indicate that the heuristic is able to devise optimal solutions to non-trivial problem instances, and outperforms the exact lexicographic goal programming formulation on medium- and large-sized problem instances.
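The lexicographic structure of the objective can be illustrated with a toy model, sketched below under the assumption that the PuLP library (with its bundled CBC solver) is available. The shifts, requirements and acceptance levels are invented for illustration and are not the paper's formulation: level-1 deviations are minimised first, that optimum is fixed as a constraint, and only then are level-2 deviations minimised.

```python
# Toy lexicographic goal programming in two solver passes.
import pulp

def build_and_solve(level1_cap=None):
    prob = pulp.LpProblem("staffing_toy", pulp.LpMinimize)
    # x[(e, s)] = 1 if employee e works shift s (3 employees, 2 shifts).
    x = pulp.LpVariable.dicts("x", [(e, s) for e in range(3) for s in range(2)],
                              cat=pulp.LpBinary)
    # Hard requirement: every shift needs exactly two employees.
    for s in range(2):
        prob += pulp.lpSum(x[(e, s)] for e in range(3)) == 2
    # Acceptance level 1: employee 0 should work at most one shift.
    dev1 = pulp.LpVariable("dev1", lowBound=0)
    prob += pulp.lpSum(x[(0, s)] for s in range(2)) - dev1 <= 1
    # Acceptance level 2: employees 1 and 2 each prefer at most one shift.
    dev2 = pulp.LpVariable.dicts("dev2", [1, 2], lowBound=0)
    for e in (1, 2):
        prob += pulp.lpSum(x[(e, s)] for s in range(2)) - dev2[e] <= 1
    if level1_cap is None:
        prob += 1 * dev1                       # phase 1: minimise level-1 deviations
    else:
        prob += dev1 <= level1_cap             # fix the level-1 optimum ...
        prob += dev2[1] + dev2[2]              # ... then minimise level-2 deviations
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(dev1), pulp.value(dev2[1]) + pulp.value(dev2[2])

best_dev1, _ = build_and_solve()
dev1, dev2 = build_and_solve(level1_cap=best_dev1)
print("level-1 deviations:", dev1, "level-2 deviations:", dev2)   # expected: 0.0 and 1.0
```

The same two-pass pattern extends to more acceptance levels by fixing each optimum before optimising the next, which is the essence of the lexicographic MILP formulation described above.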

Relevance:

100.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric ($\mathrm{MSEP_m}$) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.

Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP_m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP_m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
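As a pointer to the leave-one-out statistic recommended in the last sentence, the sketch below computes PRESS for a fixed (unselected) regression model from a single least-squares fit, using the standard hat-matrix identity $e_i/(1-h_{ii})$ for the deleted residuals. The simulated data and the numpy dependence are assumptions; this is not the dissertation's simulation code.

```python
# PRESS from one fit via the hat matrix, on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # intercept + 4 predictors
beta = np.array([1.0, 0.5, -0.3, 0.0, 0.2])
y = X @ beta + rng.normal(scale=0.5, size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T                      # hat matrix of the full model
residuals = y - H @ y
press = np.sum((residuals / (1.0 - np.diag(H))) ** 2)     # leave-one-out residuals
resid_ms = np.sum(residuals ** 2) / (n - X.shape[1])      # ordinary residual mean square

print(f"PRESS = {press:.3f}, residual mean square = {resid_ms:.3f}")
```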

Relevance:

100.00%

Publisher:

Abstract:

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, showing that MTM’s performance does not rely on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. With better error correction by MTM, the quality of downstream analyses, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (
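MTM's algorithm is not described in the abstract; purely as an illustration of the k-mer-spectrum idea that tools in this family (e.g., Quake and Hammer) build on, the sketch below flags read positions covered only by low-frequency k-mers as likely sequencing errors. It is not MTM, and the cutoff is an arbitrary assumption.

```python
# Generic k-mer-spectrum error flagging (illustrative only, not MTM).
from collections import Counter

def kmer_counts(reads, k):
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def flag_errors(read, counts, k, cutoff=2):
    """Return positions covered only by low-frequency (suspect) k-mers."""
    suspect, solid = set(), set()
    for i in range(len(read) - k + 1):
        target = suspect if counts[read[i:i + k]] < cutoff else solid
        target.update(range(i, i + k))
    return sorted(suspect - solid)

reads = ["ACGTACGTAC"] * 5 + ["ACGTTCGTAC"]   # last read has an error at position 4
counts = kmer_counts(reads, k=4)
print(flag_errors(reads[-1], counts, k=4))     # -> [4], the erroneous position
```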

Relevance:

100.00%

Publisher:

Abstract:

Nowadays, the Internet is a place where social networks have had an important impact on how people all over the world collaborate. This article proposes a new paradigm for building CSCW business tools following the novel ideas provided by the social web to collaborate and generate awareness. An implementation of these concepts is described, including the components we provide to collaborate in workspaces (such as videoconferencing, chat, desktop sharing, forums or temporal events), and the way we generate awareness from these complex social data structures. Figures and validation results are also presented to show that this architecture has been defined to support awareness generation by joining current and future social data from the business and social network worlds, based on the idea of using social data stored in the cloud.

Relevance:

100.00%

Publisher:

Abstract:

Managing large medical image collections is an increasingly important and demanding issue in many hospitals and other medical settings. A huge amount of this information is generated daily, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store it in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data access and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on the data structures in the database.
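A minimal sketch of the tiling-plus-relational-storage idea follows, with SQLite standing in for the relational database. The table layout, column names and round-robin assignment of tiles to agents are assumptions made for illustration, not the system's actual schema.

```python
# Split an image into sub-images (tiles) and record each tile as its own row,
# so that different agents can manage tiles separately.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE sub_image (
                    image_id   TEXT,
                    agent_id   INTEGER,
                    row_off    INTEGER,  -- tile origin within the full image
                    col_off    INTEGER,
                    height     INTEGER,
                    width      INTEGER,
                    mean_gray  REAL      -- example of extracted low-level data
                )""")

def register_tiles(image_id, img_height, img_width, tile=256, agents=4):
    rows, tile_index = [], 0
    for r in range(0, img_height, tile):
        for c in range(0, img_width, tile):
            h = min(tile, img_height - r)
            w = min(tile, img_width - c)
            rows.append((image_id, tile_index % agents, r, c, h, w, None))
            tile_index += 1
    conn.executemany("INSERT INTO sub_image VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

register_tiles("ct_scan_001", 1024, 768)
print(conn.execute("SELECT agent_id, COUNT(*) FROM sub_image GROUP BY agent_id").fetchall())
```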

Relevance:

100.00%

Publisher:

Abstract:

Precise modeling of the program heap is fundamental for understanding the behavior of a program, and is thus of signiflcant interest for many optimization applications. One of the fundamental properties of the heap that can be used in a range of optimization techniques is the sharing relationships between the elements in an array or collection. If an analysis can determine that the memory locations pointed to by different entries of an array (or collection) are disjoint, then in many cases loops that traverse the array can be vectorized or transformed into a thread-parallel versión. This paper introduces several novel sharing properties over the concrete heap and corresponding abstractions to represent them. In conjunction with an existing shape analysis technique, these abstractions allow us to precisely resolve the sharing relations in a wide range of heap structures (arrays, collections, recursive data structures, composite heap structures) in a computationally efflcient manner. The effectiveness of the approach is evaluated on a set of challenge problems from the JOlden and SPECjvm98 suites. Sharing information obtained from the analysis is used to achieve substantial thread-level parallel speedups.
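The sharing property itself is easy to illustrate with a small dynamic check, even though the paper establishes it statically via shape analysis: if the heap cells reachable from different array entries are pairwise disjoint, a loop over the entries can safely be distributed, e.g. one thread per entry. The linked-list nodes and the id-based check below are illustrative assumptions.

```python
# Dynamic illustration of entry-wise disjointness over a list of linked structures.
class Node:
    def __init__(self, value, next_node=None):
        self.value, self.next = value, next_node

def reachable_ids(node):
    seen = set()
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        node = node.next
    return seen

def entries_are_disjoint(array):
    covered = set()
    for entry in array:
        ids = reachable_ids(entry)
        if ids & covered:        # some heap cell is shared with an earlier entry
            return False
        covered |= ids
    return True

shared_tail = Node(0)
disjoint = [Node(1, Node(2)), Node(3, Node(4))]
sharing = [Node(1, shared_tail), Node(3, shared_tail)]
print(entries_are_disjoint(disjoint))   # True  -> loop can be parallelised
print(entries_are_disjoint(sharing))    # False -> entries share heap cells
```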

Relevance:

100.00%

Publisher:

Abstract:

Resource Analysis (a.k.a. Cost Analysis) tries to approximate the cost of executing programs as functions on their input data sizes, without actually having to execute the programs. While a powerful resource analysis framework for object-oriented programs existed before this thesis, advanced aspects that improve the efficiency, the accuracy and the reliability of the analysis results still need to be investigated further. This thesis tackles this need from four different perspectives.

(1) Shared mutable data structures are the bane of formal reasoning and static analysis. Analyses which keep track of heap-allocated data are referred to as heap-sensitive. Recent work proposes locality conditions for soundly tracking field accesses by means of ghost non-heap-allocated variables. In this thesis we present two extensions to this approach: the first is to consider array accesses (in addition to object fields), while the second focuses on handling cases for which the locality conditions cannot be proven unconditionally, by finding aliasing preconditions under which tracking such heap locations is feasible.

(2) The aim of incremental analysis is, given a program, its analysis results and a series of changes to the program, to obtain the new analysis results as efficiently as possible and, ideally, without having to (re-)analyze fragments of code that are not affected by the changes. During software development, programs are permanently modified, but most analyzers still read and analyze the entire program at once in a non-incremental way. This thesis presents an incremental resource usage analysis which, after a change in the program is made, is able to reconstruct the upper bounds of all affected methods in an incremental way. To this purpose, we propose (i) a multi-domain incremental fixed-point algorithm which can be used by all global analyses required to infer the cost, and (ii) a novel form of cost summaries that allows us to incrementally reconstruct only those components of cost functions affected by the change.

(3) Resource guarantees that are automatically inferred by static analysis tools are generally not considered completely trustworthy unless the tool implementation or the results are formally verified. Performing full-blown verification of such tools is a daunting task, since they are large and complex. In this thesis we focus on the development of a formal framework for the verification of the resource guarantees obtained by the analyzers, instead of verifying the tools. We have implemented this idea using COSTA, a state-of-the-art cost analyzer for Java programs, and KeY, a state-of-the-art verification tool for Java source code. COSTA is able to derive upper bounds for Java programs, while KeY proves the validity of these bounds and provides a certificate. The main contribution of our work is to show that the proposed cooperation between the tools can be used to automatically produce verified resource guarantees.

(4) Distribution and concurrency are mainstream today. Concurrent objects form a well-established model for distributed concurrent systems. In this model, objects are the concurrency units and communicate via asynchronous method calls. Distribution suggests that the analysis must infer the cost of the diverse distributed components separately. In this thesis we propose a novel object-sensitive cost analysis which, by using the results gathered by a points-to analysis, can keep the cost of the diverse distributed components separate.
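The multi-domain incremental fixed-point algorithm of point (2) is not spelled out in the abstract; the sketch below only illustrates the generic worklist idea behind incremental re-analysis, with an invented toy cost domain. After an edit, only the changed method and the callers whose summaries become stale are re-analysed; all other stored summaries are reused.

```python
# Generic incremental worklist fixed point over a call graph (illustration only).
def incremental_fixpoint(callers, analyze, summaries, changed):
    """callers: method -> set of direct callers; analyze: (method, summaries) -> summary."""
    worklist = set(changed)
    while worklist:
        method = worklist.pop()
        new_summary = analyze(method, summaries)
        if summaries.get(method) != new_summary:
            summaries[method] = new_summary
            worklist |= callers.get(method, set())   # callers may now be stale too
    return summaries

call_graph = {"main": ["helper"], "helper": ["leaf"], "leaf": []}
callers = {"helper": {"main"}, "leaf": {"helper"}}
summaries = {"leaf": 1, "helper": 2, "main": 3}      # results of the previous analysis

# Toy cost domain: a method costs its own base cost plus its callees' costs.
# Suppose "leaf" was edited so that its own cost contribution becomes 2:
def analyze(method, summaries):
    base = 2 if method == "leaf" else 1
    return base + sum(summaries.get(callee, 0) for callee in call_graph[method])

print(incremental_fixpoint(callers, analyze, dict(summaries), changed={"leaf"}))
# {'leaf': 2, 'helper': 3, 'main': 4}
```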

Relevance:

100.00%

Publisher:

Abstract:

Cognitive neuroscience boils down to describing the ways in which cognitive function results from brain activity. In turn, brain activity shows complex fluctuations, with structure at many spatio-temporal scales. Exactly how cognitive function inherits the physical dimensions of neural activity, though, is highly non-trivial, and so, in general, are the corresponding dimensions of cognitive phenomena. As for any physical phenomenon, when studying cognitive function, the first conceptual step should be that of establishing its dimensions. Here, we provide a systematic presentation of the temporal aspects of task-related brain activity, from the smallest scale of the brain imaging technique's resolution, through the characteristic time scales of the process under study, to the observation time of a given experiment. We first review some standard assumptions on the temporal scales of cognitive function. In spite of their general use, these assumptions hold true to a high degree of approximation for many cognitive (viz. fast perceptual) processes, but have their limitations for others (e.g., thinking or reasoning). We define in a rigorous way the temporal quantifiers of cognition at all scales, and illustrate how they qualitatively vary as a function of the properties of the cognitive process under study. We propose that each phenomenon should be approached with its own set of theoretical, methodological and analytical tools. In particular, we show that when treating cognitive processes such as thinking or reasoning, complex properties of ongoing brain activity, which can be drastically simplified when considering fast (e.g., perceptual) processes, start playing a major role, and not only characterize the temporal properties of task-related brain activity, but also determine the conditions for proper observation of the phenomena. Finally, some implications for the design of experiments, data analyses, and the choice of recording parameters are discussed.

Relevance:

100.00%

Publisher:

Abstract:

Receptive fields of retinal and other sensory neurons show a large variety of spatiotemporal linear and non-linear types of responses to local stimuli. In visual neurons, these responses present either asymmetric sensitive zones or center-surround organization. In most cases, the nature of the responses suggests the existence of a kind of distributed computation prior to integration by the final cell, which is evidently supported by the anatomy. We describe a new kind of discrete and continuous filter to model the computations taking place in the receptive fields of retinal cells. To show their performance in the analysis of different non-trivial neuron-like structures, we use a computer tool specifically programmed by the authors to that effect. This tool is also extended to study the effect of lesions on the overall performance of our model nets.
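Center-surround organization is commonly modelled with a difference-of-Gaussians (DoG) filter; the sketch below is that generic textbook model (assuming numpy), given only to make the notion of a center-surround receptive-field filter concrete. It is not the authors' discrete/continuous filter family.

```python
# ON-center, OFF-surround receptive field as a difference of Gaussians.
import numpy as np

def dog_kernel(size=15, sigma_center=1.0, sigma_surround=3.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_center**2)) / (2 * np.pi * sigma_center**2)
    surround = np.exp(-r2 / (2 * sigma_surround**2)) / (2 * np.pi * sigma_surround**2)
    return center - surround          # excitatory center minus inhibitory surround

def respond(stimulus, kernel):
    """Response of one model cell: dot product of kernel and local stimulus patch."""
    return float(np.sum(stimulus * kernel))

kernel = dog_kernel()
uniform = np.ones((15, 15))                        # full-field stimulus
spot = np.zeros((15, 15)); spot[6:9, 6:9] = 1.0    # small bright spot on the center
print(respond(uniform, kernel), respond(spot, kernel))   # weak vs. strong response
```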