935 resultados para source code querying
Resumo:
The development of increasingly powerful computers, which has enabled the use of windowing software, has also opened the way for the computer study, via simulation, of very complex physical systems. In this study, the main issues related to the implementation of interactive simulations of complex systems are identified and discussed. Most existing simulators are closed in the sense that there is no access to the source code and, even if it were available, adaptation to interaction with other systems would require extensive code re-writing. This work aims to increase the flexibility of such software by developing a set of object-oriented simulation classes, which can be extended, by subclassing, at any level, i.e., at the problem domain, presentation or interaction levels. A strategy, which involves the use of an object-oriented framework, concurrent execution of several simulation modules, use of a networked windowing system and the re-use of existing software written in procedural languages, is proposed. A prototype tool which combines these techniques has been implemented and is presented. It allows the on-line definition of the configuration of the physical system and generates the appropriate graphical user interface. Simulation routines have been developed for the chemical recovery cycle of a paper pulp mill. The application, by creation of new classes, of the prototype to the interactive simulation of this physical system is described. Besides providing visual feedback, the resulting graphical user interface greatly simplifies the interaction with this set of simulation modules. This study shows that considerable benefits can be obtained by application of computer science concepts to the engineering domain, by helping domain experts to tailor interactive tools to suit their needs.
Resumo:
Most parametric software cost estimation models used today evolved in the late 70's and early 80's. At that time, the dominant software development techniques being used were the early 'structured methods'. Since then, several new systems development paradigms and methods have emerged, one being Jackson Systems Development (JSD). As current cost estimating methods do not take account of these developments, their non-universality means they cannot provide adequate estimates of effort and hence cost. In order to address these shortcomings two new estimation methods have been developed for JSD projects. One of these methods JSD-FPA, is a top-down estimating method, based on the existing MKII function point method. The other method, JSD-COCOMO, is a sizing technique which sizes a project, in terms of lines of code, from the process structure diagrams and thus provides an input to the traditional COCOMO method.The JSD-FPA method allows JSD projects in both the real-time and scientific application areas to be costed, as well as the commercial information systems applications to which FPA is usually applied. The method is based upon a three-dimensional view of a system specification as opposed to the largely data-oriented view traditionally used by FPA. The method uses counts of various attributes of a JSD specification to develop a metric which provides an indication of the size of the system to be developed. This size metric is then transformed into an estimate of effort by calculating past project productivity and utilising this figure to predict the effort and hence cost of a future project. The effort estimates produced were validated by comparing them against the effort figures for six actual projects.The JSD-COCOMO method uses counts of the levels in a process structure chart as the input to an empirically derived model which transforms them into an estimate of delivered source code instructions.
Resumo:
The focus of our work is the verification of tight functional properties of numerical programs, such as showing that a floating-point implementation of Riemann integration computes a close approximation of the exact integral. Programmers and engineers writing such programs will benefit from verification tools that support an expressive specification language and that are highly automated. Our work provides a new method for verification of numerical software, supporting a substantially more expressive language for specifications than other publicly available automated tools. The additional expressivity in the specification language is provided by two constructs. First, the specification can feature inclusions between interval arithmetic expressions. Second, the integral operator from classical analysis can be used in the specifications, where the integration bounds can be arbitrary expressions over real variables. To support our claim of expressivity, we outline the verification of four example programs, including the integration example mentioned earlier. A key component of our method is an algorithm for proving numerical theorems. This algorithm is based on automatic polynomial approximation of non-linear real and real-interval functions defined by expressions. The PolyPaver tool is our implementation of the algorithm and its source code is publicly available. In this paper we report on experiments using PolyPaver that indicate that the additional expressivity does not come at a performance cost when comparing with other publicly available state-of-the-art provers. We also include a scalability study that explores the limits of PolyPaver in proving tight functional specifications of progressively larger randomly generated programs. © 2014 Springer International Publishing Switzerland.
Resumo:
Internationalization of software as a previous step for localization is usually taken into account during early phases of the life-cycle of software development. However, the need to adapt software applications into different languages and cultural settings can appear once the application is finished and even in the market. In these cases, software localization implies a high cost of time and resources. This paper shows a real case of a existent software application, designed and developed without taking into account future necessities of localization, whose architecture and source code were modified to include the possibility of straightforward adaptation into new languages. The use of standard languages and advanced programming languages has permitted the authors to adapt the software in a simple and straightforward mode.
Resumo:
The paper has been presented at the 12th International Conference on Applications of Computer Algebra, Varna, Bulgaria, June, 2006
Resumo:
Two-dimensional 'Mercedes Benz' (MB) or BN2D water model (Naim, 1971) is implemented in Molecular Dynamics. It is known that the MB model can capture abnormal properties of real water (high heat capacity, minima of pressure and isothermal compressibility, negative thermal expansion coefficient) (Silverstein et al., 1998). In this work formulas for calculating the thermodynamic, structural and dynamic properties in microcanonical (NVE) and isothermal-isobaric (NPT) ensembles for the model from Molecular Dynamics simulation are derived and verified against known Monte Carlo results. The convergence of the thermodynamic properties and the system's numerical stability are investigated. The results qualitatively reproduce the peculiarities of real water making the model a visually convenient tool that also requires less computational resources, thus allowing simulations of large (hydrodynamic scale) molecular systems. We provide the open source code written in C/C++ for the BN2D water model implementation using Molecular Dynamics.
Resumo:
Motivation: In molecular biology, molecular events describe observable alterations of biomolecules, such as binding of proteins or RNA production. These events might be responsible for drug reactions or development of certain diseases. As such, biomedical event extraction, the process of automatically detecting description of molecular interactions in research articles, attracted substantial research interest recently. Event trigger identification, detecting the words describing the event types, is a crucial and prerequisite step in the pipeline process of biomedical event extraction. Taking the event types as classes, event trigger identification can be viewed as a classification task. For each word in a sentence, a trained classifier predicts whether the word corresponds to an event type and which event type based on the context features. Therefore, a well-designed feature set with a good level of discrimination and generalization is crucial for the performance of event trigger identification. Results: In this article, we propose a novel framework for event trigger identification. In particular, we learn biomedical domain knowledge from a large text corpus built from Medline and embed it into word features using neural language modeling. The embedded features are then combined with the syntactic and semantic context features using the multiple kernel learning method. The combined feature set is used for training the event trigger classifier. Experimental results on the golden standard corpus show that >2.5% improvement on F-score is achieved by the proposed framework when compared with the state-of-the-art approach, demonstrating the effectiveness of the proposed framework. © 2014 The Author 2014. The source code for the proposed framework is freely available and can be downloaded at http://cse.seu.edu.cn/people/zhoudeyu/ETI_Sourcecode.zip.
Resumo:
High-quality software documentation is a substantial issue for understanding software systems. Shorter time-to-market software cycles increase the importance of automatism for keeping the documentation up to date. In this paper, we describe the automatic support of the software documentation process using semantic technologies. We introduce a software documentation ontology as an underlying knowledge base. The defined ontology is populated automatically by analysing source code, software documentation and code execution. Through selected results we demonstrate that the use of such semantic systems can support software documentation processes efficiently.
Resumo:
GitHub is the most popular repository for open source code (Finley 2011). It has more than 3.5 million users, as the company declared in April 2013, and more than 10 million repositories, as of December 2013. It has a publicly accessible API and, since March 2012, it also publishes a stream of all the events occurring on public projects. Interactions among GitHub users are of a complex nature and take place in different forms. Developers create and fork repositories, push code, approve code pushed by others, bookmark their favorite projects and follow other developers to keep track of their activities. In this paper we present a characterization of GitHub, as both a social network and a collaborative platform. To the best of our knowledge, this is the first quantitative study about the interactions happening on GitHub. We analyze the logs from the service over 18 months (between March 11, 2012 and September 11, 2013), describing 183.54 million events and we obtain information about 2.19 million users and 5.68 million repositories, both growing linearly in time. We show that the distributions of the number of contributors per project, watchers per project and followers per user show a power-law-like shape. We analyze social ties and repository-mediated collaboration patterns, and we observe a remarkably low level of reciprocity of the social connections. We also measure the activity of each user in terms of authored events and we observe that very active users do not necessarily have a large number of followers. Finally, we provide a geographic characterization of the centers of activity and we investigate how distance influences collaboration.
Resumo:
FDI is believed to be a conduit of new technologies between countries. The first chapter of this dissertation studies the advantages of outward FDI for the home country of multinationals conducting research and development abroad. We use patent citations as a proxy for technology spillovers and we bring empirical evidence that supports the hypothesis that a U.S. subsidiary conducting research and development overseas facilitates the flow of knowledge between its host and home countries.^ The second chapter examines the impact of intellectual property rights (IPR) reforms on the technology flows between the U.S. and host countries of U.S. affiliates. We again use patent citations to examine whether the diffusion of new technology between the host countries and the U.S. is accelerated by the reforms. Our results suggest that the reforms favor innovative efforts of domestic firms in the reforming countries rather than U.S. affiliates efforts. In other words, reforms mediate the technology flows from the U.S. to the reforming countries.^ The third chapter deals with another form of IPR, open source (OS) licenses. These differ in the conditions under which licensors and OS contributors are allowed to modify and redistribute the source code. We measure OS project quality by the speed with which programming bugs are fixed and test whether the license chosen by project leaders influences bug resolution rates. In initial regressions, we find a strong correlation between the hazard of bug resolution and the use of highly restrictive licenses. However, license choices are likely to be endogenous. We instrument license choice using (i) the human language in which contributors operate and (ii) the license choice of the project leaders for a previous project. We then find weak evidence that restrictive licenses adversely affect project success.^
Resumo:
The increasing use of model-driven software development has renewed emphasis on using domain-specific models during application development. More specifically, there has been emphasis on using domain-specific modeling languages (DSMLs) to capture user-specified requirements when creating applications. The current approach to realizing these applications is to translate DSML models into source code using several model-to-model and model-to-code transformations. This approach is still dependent on the underlying source code representation and only raises the level of abstraction during development. Experience has shown that developers will many times be required to manually modify the generated source code, which can be error-prone and time consuming. ^ An alternative to the aforementioned approach involves using an interpreted domain-specific modeling language (i-DSML) whose models can be directly executed using a Domain Specific Virtual Machine (DSVM). Direct execution of i-DSML models require a semantically rich platform that reduces the gap between the application models and the underlying services required to realize the application. One layer in this platform is the domain-specific middleware that is responsible for the management and delivery of services in the specific domain. ^ In this dissertation, we investigated the problem of designing the domain-specific middleware of the DSVM to facilitate the bifurcation of the semantics of the domain and the model of execution (MoE) while supporting runtime adaptation and validation. We approached our investigation by seeking solutions to the following sub-problems: (1) How can the domain-specific knowledge (DSK) semantics be separated from the MoE for a given domain? (2) How do we define a generic model of execution (GMoE) of the middleware so that it is adaptable and realizes DSK operations to support delivery of services? (3) How do we validate the realization of DSK operations at runtime? ^ Our research into the domain-specific middleware was done using an i-DSML for the user-centric communication domain, Communication Modeling Language (CML), and for microgrid energy management domain, Microgrid Modeling Language (MGridML). We have successfully developed a methodology to separate the DSK and GMoE of the middleware of a DSVM that supports specialization for a given domain, and is able to perform adaptation and validation at runtime. ^
Resumo:
We present a data set of 738 planktonic foraminiferal species counts from sediment surface samples of the eastern North Atlantic and the South Atlantic between 87°N and 40°S, 35°E and 60°W including published Climate: Long-Range Investigation, Mapping, and Prediction (CLIMAP) data. These species counts are linked to Levitus's [1982] modern water temperature data for the four caloric seasons, four depth ranges (0, 30, 50, and 75 m), and the combined means of those depth ranges. The relation between planktonic foraminiferal assemblages and sea surface temperature (SST) data is estimated using the newly developed SIMMAX technique, which is an acronym for a modern analog technique (MAT) with a similarity index, based on (1) the scalar product of the normalized faunal percentages and (2) a weighting procedure of the modern analog's SSTs according to the inverse geographical distances of the most similar samples. Compared to the classical CLIMAP transfer technique and conventional MAT techniques, SIMMAX provides a more confident reconstruction of paleo-SSTs (correlation coefficient is 0.994 for the caloric winter and 0.993 for caloric summer). The standard deviation of the residuals is 0.90°C for caloric winter and 0.96°C for caloric summer at 0-m water depth. The SST estimates reach optimum stability (standard deviation of the residuals is 0.88°C) at the average 0- to 75-m water depth. Our extensive database provides SST estimates over a range of -1.4 to 27.2°C for caloric winter and 0.4 to 28.6°C for caloric summer, allowing SST estimates which are especially valuable for the high-latitude Atlantic during glacial times.
Resumo:
The software product line engineering brings advantages when compared with the traditional software development regarding the mass customization of the system components. However, there are scenarios that to maintain separated clones of a software system seems to be an easier and more flexible approach to manage their variabilities of a software product line. This dissertation evaluates qualitatively an approach that aims to support the reconciliation of functionalities between cloned systems. The analyzed approach is based on mining data about the issues and source code of evolved cloned web systems. The next step is to process the merge conflicts collected by the approach and not indicated by traditional control version systems to identify potential integration problems from the cloned software systems. The results of the study show the feasibility of the approach to perform a systematic characterization and analysis of merge conflicts for large-scale web-based systems.
Resumo:
A manutenção e evolução de sistemas de software tornou-se uma tarefa bastante crítica ao longo dos últimos anos devido à diversidade e alta demanda de funcionalidades, dispositivos e usuários. Entender e analisar como novas mudanças impactam os atributos de qualidade da arquitetura de tais sistemas é um pré-requisito essencial para evitar a deterioração de sua qualidade durante sua evolução. Esta tese propõe uma abordagem automatizada para a análise de variação do atributo de qualidade de desempenho em termos de tempo de execução (tempo de resposta). Ela é implementada por um framework que adota técnicas de análise dinâmica e mineração de repositório de software para fornecer uma forma automatizada de revelar fontes potenciais – commits e issues – de variação de desempenho em cenários durante a evolução de sistemas de software. A abordagem define quatro fases: (i) preparação – escolher os cenários e preparar os releases alvos; (ii) análise dinâmica – determinar o desempenho de cenários e métodos calculando seus tempos de execução; (iii) análise de variação – processar e comparar os resultados da análise dinâmica para releases diferentes; e (iv) mineração de repositório – identificar issues e commits associados com a variação de desempenho detectada. Estudos empíricos foram realizados para avaliar a abordagem de diferentes perspectivas. Um estudo exploratório analisou a viabilidade de se aplicar a abordagem em sistemas de diferentes domínios para identificar automaticamente elementos de código fonte com variação de desempenho e as mudanças que afetaram tais elementos durante uma evolução. Esse estudo analisou três sistemas: (i) SIGAA – um sistema web para gerência acadêmica; (ii) ArgoUML – uma ferramenta de modelagem UML; e (iii) Netty – um framework para aplicações de rede. Outro estudo realizou uma análise evolucionária ao aplicar a abordagem em múltiplos releases do Netty, e dos frameworks web Wicket e Jetty. Nesse estudo foram analisados 21 releases (sete de cada sistema), totalizando 57 cenários. Em resumo, foram encontrados 14 cenários com variação significante de desempenho para Netty, 13 para Wicket e 9 para Jetty. Adicionalmente, foi obtido feedback de oito desenvolvedores desses sistemas através de um formulário online. Finalmente, no último estudo, um modelo de regressão para desempenho foi desenvolvido visando indicar propriedades de commits que são mais prováveis a causar degradação de desempenho. No geral, 997 commits foram minerados, sendo 103 recuperados de elementos de código fonte degradados e 19 de otimizados, enquanto 875 não tiveram impacto no tempo de execução. O número de dias antes de disponibilizar o release e o dia da semana se mostraram como as variáveis mais relevantes dos commits que degradam desempenho no nosso modelo. A área de característica de operação do receptor (ROC – Receiver Operating Characteristic) do modelo de regressão é 60%, o que significa que usar o modelo para decidir se um commit causará degradação ou não é 10% melhor do que uma decisão aleatória.
Resumo:
A manutenção e evolução de sistemas de software tornou-se uma tarefa bastante crítica ao longo dos últimos anos devido à diversidade e alta demanda de funcionalidades, dispositivos e usuários. Entender e analisar como novas mudanças impactam os atributos de qualidade da arquitetura de tais sistemas é um pré-requisito essencial para evitar a deterioração de sua qualidade durante sua evolução. Esta tese propõe uma abordagem automatizada para a análise de variação do atributo de qualidade de desempenho em termos de tempo de execução (tempo de resposta). Ela é implementada por um framework que adota técnicas de análise dinâmica e mineração de repositório de software para fornecer uma forma automatizada de revelar fontes potenciais – commits e issues – de variação de desempenho em cenários durante a evolução de sistemas de software. A abordagem define quatro fases: (i) preparação – escolher os cenários e preparar os releases alvos; (ii) análise dinâmica – determinar o desempenho de cenários e métodos calculando seus tempos de execução; (iii) análise de variação – processar e comparar os resultados da análise dinâmica para releases diferentes; e (iv) mineração de repositório – identificar issues e commits associados com a variação de desempenho detectada. Estudos empíricos foram realizados para avaliar a abordagem de diferentes perspectivas. Um estudo exploratório analisou a viabilidade de se aplicar a abordagem em sistemas de diferentes domínios para identificar automaticamente elementos de código fonte com variação de desempenho e as mudanças que afetaram tais elementos durante uma evolução. Esse estudo analisou três sistemas: (i) SIGAA – um sistema web para gerência acadêmica; (ii) ArgoUML – uma ferramenta de modelagem UML; e (iii) Netty – um framework para aplicações de rede. Outro estudo realizou uma análise evolucionária ao aplicar a abordagem em múltiplos releases do Netty, e dos frameworks web Wicket e Jetty. Nesse estudo foram analisados 21 releases (sete de cada sistema), totalizando 57 cenários. Em resumo, foram encontrados 14 cenários com variação significante de desempenho para Netty, 13 para Wicket e 9 para Jetty. Adicionalmente, foi obtido feedback de oito desenvolvedores desses sistemas através de um formulário online. Finalmente, no último estudo, um modelo de regressão para desempenho foi desenvolvido visando indicar propriedades de commits que são mais prováveis a causar degradação de desempenho. No geral, 997 commits foram minerados, sendo 103 recuperados de elementos de código fonte degradados e 19 de otimizados, enquanto 875 não tiveram impacto no tempo de execução. O número de dias antes de disponibilizar o release e o dia da semana se mostraram como as variáveis mais relevantes dos commits que degradam desempenho no nosso modelo. A área de característica de operação do receptor (ROC – Receiver Operating Characteristic) do modelo de regressão é 60%, o que significa que usar o modelo para decidir se um commit causará degradação ou não é 10% melhor do que uma decisão aleatória.