Existing distributed hydrologic models are too complex and computationally demanding to be used as a rapid-forecasting policy-decision tool, or even as a classroom educational tool. In addition, platform dependence, specific input/output data structures, and non-dynamic data interaction with pluggable software components inside the existing proprietary frameworks restrict these models to specialized user groups. RWater is a web-based hydrologic analysis and modeling framework that uses the widely used R software within the HUBzero cyberinfrastructure of Purdue University. RWater is designed as an integrated framework for distributed hydrologic simulation, along with subsequent parameter optimization and visualization schemes. RWater provides a platform-independent web-based interface, flexible data integration capacity, grid-based simulations, and user extensibility. RWater uses RStudio to simulate hydrologic processes on raster-based data obtained through conventional GIS pre-processing. The program integrates the Shuffled Complex Evolution (SCE) algorithm for parameter optimization. Moreover, RWater enables users to produce descriptive statistics and visualizations of the outputs at different temporal resolutions. The applicability of RWater will be demonstrated by applying it to two watersheds in Indiana for multiple rainfall events.
Java Card technology allows the development and execution of small applications embedded in smart cards. A Java Card application is composed of an external card client and an application in the card that implements the services available to the client by means of an Application Programming Interface (API). Usually, these applications manipulate and store important information, such as cash and confidential data of their owners. Thus, smart card applications must be developed rigorously to improve their quality and trustworthiness. The use of formal methods in the development of these applications is one way to meet these quality requirements. The B method is one of the many formal methods for system specification. Development in B starts with the functional specification of the system and continues with the application of optional refinements to the specification; from the last level of refinement, code can be generated for some programming language. The B formalism has good tool support, and its application to Java Card is appropriate since the specification and development of APIs is one of the major applications of B. The BSmart method proposed here aims to promote the rigorous development of Java Card applications up to the generation of their code, based on the refinement of a formal specification described in the B notation. This development is supported by the BSmart tool, which is composed of programs that automate each stage of the method, and by a library of B modules and Java Card classes that model primitive types, essential Java Card API classes and reusable data structures.
Programs manipulate information. However, information is abstract in nature and must be represented, usually by data structures, before it can be manipulated. This work presents AGraphs, a data representation and exchange format that uses typed directed graphs with a simulation of hyperedges and hierarchical graphs. Associated with the AGraphs format is a manipulation library with a simple programming interface, tailored to the language being represented. The AGraphs format had been used in an ad hoc manner as a representation format in tools developed at UFRN; to make it more usable in other tools, an accurate description and the development of support tools were necessary. This description and these tools have been developed and are described in this work. This work also compares the AGraphs format with other representation and exchange formats (e.g., ATerms, GDL, GraphML, GraX, GXL and XML). The main objective of this comparison is to capture important characteristics and to identify where the AGraphs concepts can still evolve.
The increase in the computing power of microcomputers has stimulated the building of direct manipulation interfaces that allow graphical representation of Linear Programming (LP) models. This work discusses the components of such a graphical interface as the basis for a system to assist users in the process of formulating LP problems. In essence, this work proposes a methodology that divides the modelling task into three stages: specification of the Data Model, the Conceptual Model, and the LP Model. The need for Artificial Intelligence techniques in problem conceptualisation and in support of the model formulation task is illustrated.
Data structures were simulated under mixed models to represent the testing of 100 sires, each mated to 10 dams (1,000 dams in total), with each mating producing 2 progeny, for a total of 2,000 progeny (twenty per sire). Of the twenty progeny of each sire, ten had their phenotype expressed in the low-production environment (Stratum 1) and the other half in the high-production environment (Stratum 2). The simulation was designed to represent different situations of heterogeneity of variance, combining the genetic and environmental origins of the heterogeneity. In the presence of residual heterogeneity, the residual variance component estimated under the assumption of homogeneous variances was close to the mean of the variances across strata. The additive genetic variance component was also overestimated. When heterogeneity of variance of genetic origin was simulated, the estimate of this component fell between the simulated values. In this situation, the estimated residual variance component was close to the simulated value, indicating that heterogeneity of variance arising from genetic factors does not substantially affect the estimation of the residual variance component. Simulating data with heterogeneity of both genetic and environmental origin (data structure 4) led to variance component estimates intermediate between the values simulated in each stratum. Thus, even when sires have progeny well distributed across both strata, heterogeneity of variance arising from non-genetic factors distorts the estimation of the additive genetic variance.
On the other hand, when heterogeneity of variance arises from genetic factors, there is little interference with the estimate of the residual variance. This behavior can be explained by the incorporation of the relationship matrix in the estimation of the additive genetic variance component, which makes it possible to better discriminate the origin of the differences between variances. In the structure where the residual variance was heterogeneous, the heritability estimate was lower than in the structure with homogeneous variances. In contrast, when only the additive genetic variance was heterogeneous, the heritability estimate considering only the stratum of high genetic variability was inflated by the overestimation of the additive genetic variance. However, the heritability estimate obtained when this source of heterogeneity was ignored was close to that of the homogeneous-variance situation, indicating that, when sires have a good distribution of progeny across different environments, the estimates related to the genetic effect are weighted by the performance of the animals in each environment. The Spearman and Pearson correlations between the predicted breeding values of the sires were greater than 0.90 in all situations. This result indicates that, even in the presence of genetic and/or environmental heterogeneity of variance, if sires have progeny well distributed across environments (heterogeneous strata), the ranking of genetic merit does not change. This was expected because, in single-trait analyses, a source of bias in the genetic evaluation is common to all individuals. Imposing residual heterogeneity of variance together with an unequal number of progeny per sire across strata on the data structure led to overestimation of the variance components.
However, even though the magnitude of the predicted breeding values of the sires changed, heterogeneity of variance did not alter the ranking of the sires: all rank correlations were close to unity. Heterogeneity of variance arising from environmental factors causes greater distortions in animal genetic evaluation than heterogeneity arising from genetic causes. Genetic connectedness between different environments dilutes the effect of heterogeneity of variance, of both genetic and environmental origin, on the prediction of the sires' breeding values.
Topographical surfaces can be represented with a good degree of accuracy by means of maps. However, maps are not always the best tools for understanding more complex reliefs. In this sense, the main contribution of this work is to specify and implement the architecture of an open-source software system capable of representing TIN (Triangular Irregular Network) based digital terrain models. The system implementation follows the object-oriented and generic programming paradigms, enabling the integration of various open-source tools such as GDAL, OGR, OpenGL, OpenSceneGraph and Qt. Furthermore, the representation core of the system can work with multiple topological data structures from which all connectivity relations between the vertices, edges and faces of a planar triangulation can be extracted in constant time, which greatly helps the implementation of real-time applications. This is an important capability, for example, when using laser survey data (Lidar, ALS, TLS), as it allows the generation of triangular mesh models on the order of millions of points.
This thesis presents Bayesian solutions to inference problems for three types of social network data structures: a single observation of a social network, repeated observations of the same social network, and repeated observations of a social network developing through time. A social network is conceived as a structure consisting of actors and their social interactions with each other. A common conceptualisation of social networks is to represent the actors by nodes in a graph, with edges between pairs of nodes that are relationally tied to each other according to some definition. Statistical analysis of social networks is to a large extent concerned with modelling these relational ties, which lends itself to empirical evaluation. The first paper deals with a family of statistical models for social networks called exponential random graphs, which take various structural features of the network into account. In general, the likelihood functions of exponential random graphs are only known up to a constant of proportionality. A procedure for performing Bayesian inference using Markov chain Monte Carlo (MCMC) methods is presented. The algorithm consists of two basic steps: an ordinary Metropolis-Hastings updating step, and an importance sampling scheme that is used to calculate the acceptance probability of the Metropolis-Hastings step. The second paper investigates, in a Bayesian framework, a method for modelling reports given by actors (or other informants) on their social interaction with others. The model contains two basic ingredients: the unknown network structure and functions that link this unknown network structure to the reports given by the actors. These functions take the form of probit link functions.
An intrinsic problem is that the model is not identified, meaning that there are combinations of values of the unknown structure and the parameters in the probit link functions that are observationally equivalent. Instead of using restrictions to achieve identification, it is proposed that the different observationally equivalent combinations of parameters and unknown structure be investigated a posteriori. Estimation of parameters is carried out using Gibbs sampling with a switching device that enables transitions between posterior modal regions. The main goal of the procedures is to provide tools for comparing different model specifications. Papers 3 and 4 propose Bayesian methods for longitudinal social networks. The premise of the models investigated is that overall change in social networks occurs as a consequence of sequences of incremental changes. Models for the evolution of social networks using continuous-time Markov chains are meant to capture these dynamics. Paper 3 presents an MCMC algorithm for exploring the posteriors of parameters for such Markov chains. More specifically, the unobserved evolution of the network between observations is explicitly modelled, thereby avoiding the need to deal with explicit formulas for the transition probabilities. This enables likelihood-based parameter inference in a wider class of network evolution models than has been available before. Paper 4 builds on the inference procedure proposed in Paper 3 and demonstrates how to perform model selection for a class of network evolution models.
The aim of this work is to configure the system, implement the data structures, and program the applications needed to enable the exchange of information between two software environments, SAP R/3 and Knapp, both leaders in their fields. Applying these changes will allow the organization not only to centralize information in the ERP but also to improve its business processes and speed up decision-making by those responsible. The current situation is studied and, after a detailed analysis, a solution is proposed to achieve the stated objectives. Once the proposal has been designed, presented, and approved, SAP R/3 is configured, the IDOC segments and types are defined, and the functions and programs that handle the information sent by Knapp are coded. When these tasks are complete, data sets for the commercial processes are prepared and run in a test environment, in collaboration with key users, to verify the soundness of the implemented solution. The results are analyzed and any deficiencies are corrected. Finally, all changes are transported to the production system and the correct execution of the organization's business processes is verified.
This dissertation analyzes the middleware technologies CORBA (Common Object Request Broker Architecture), COM/DCOM (Component Object Model/Distributed Component Object Model), J2EE (Java 2 Enterprise Edition) and Web Services (including .NET) with respect to their suitability for tightly and loosely coupled distributed applications. In addition, primarily for CORBA, the dynamic CORBA components DII (Dynamic Invocation Interface) and IFR (Interface Repository) and the generic data types Any and DynAny (dynamic Any) are examined in detail. The goals are: a. to arrive at concrete statements about these components and to determine in which settings these generic approaches are justified; b. to analyze the timing behavior of the dynamic components with respect to obtaining information about unknown objects; c. to measure the timing behavior of the dynamic components with respect to their communication; d. to measure and analyze the timing behavior of creating generic data types and inserting data into them; e. to measure and analyze the timing behavior of creating, at run time, unknown data types, i.e. types not described in IDL; f. to show the advantages and disadvantages of the dynamic components, to define their areas of application, and to compare their capabilities with those of other technologies such as COM/DCOM, J2EE and Web Services; g. to make statements about tight and loose coupling. CORBA is chosen as a standardized and complete distribution platform for investigating the problems listed above. With respect to dynamic behavior, which at the time of writing had been studied not at all or only insufficiently, CORBA and Web Services are trend-setting with regard to: a. working with unknown objects, which may well have implications for the development of intelligent software agents; b. the integration of legacy applications; c. the possibilities in connection with B2B (business-to-business).
These problems also include general questions about the marshalling/unmarshalling of data and the effort this requires, as well as general statements about the real-time capability of CORBA-based distributed applications. The results are then transferred, as far as is permissible, to other technologies such as COM/DCOM, J2EE and Web Services. The comparisons of CORBA with DCOM, CORBA with J2EE, and CORBA with Web Services show in detail the suitability of these technologies for loose and tight coupling. Furthermore, general concepts concerning the architecture and the optimization of communication are derived from the results obtained. These recommendations apply without restriction to all of the technologies examined in the context of distributed processing.
A permutation is said to avoid a pattern if it does not contain any subsequence that is order-isomorphic to it. Donald Knuth, in the first volume of his celebrated book "The Art of Computer Programming", observed that the permutations that can be computed (or, equivalently, sorted) by some particular data structures can be characterized in terms of pattern avoidance. In more recent years, the topic has been reopened several times, though often in terms of sortable permutations rather than computable ones. The idea of sorting permutations with one of Knuth's devices suggests looking for a deterministic procedure that decides, in linear time, whether there exists a sequence of operations that converts a given permutation into the identity. In this thesis we show that, for the stack and the restricted deques, there exists a unique way to implement such a procedure. Moreover, we use these sorting procedures to create new sorting algorithms, and we prove some unexpected commutation properties between these procedures and the base step of bubblesort. We also show that the permutations that can be sorted by a combination of the base steps of bubblesort and its dual can be expressed, once again, in terms of pattern avoidance. In the final chapter we give an alternative proof of some enumerative results, in particular for the classes of permutations that can be sorted by the two restricted deques. It is well known that the permutations that can be sorted through a restricted deque are counted by the Schröder numbers. In the thesis, we show how the deterministic sorting procedures yield a bijection between sortable permutations and Schröder paths.
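As an illustration of the stack case only, the following minimal sketch shows a greedy one-pass procedure that decides whether a permutation of 1..n can be sorted with a single stack, and checks it against Knuth's classical characterization (a permutation is stack-sortable exactly when it avoids the pattern 231). The function names are hypothetical, and this is not claimed to be the procedure developed in the thesis.

```python
from itertools import permutations


def stack_sortable(perm):
    """Greedy one-pass test: can `perm` (a permutation of 1..n) be sorted
    to the identity with a single stack? Runs in linear time because each
    element is pushed once and popped at most once."""
    stack = []
    next_out = 1  # the next value that must be written to the output
    for value in perm:
        stack.append(value)
        # Pop as long as the top of the stack is the value the output needs.
        while stack and stack[-1] == next_out:
            stack.pop()
            next_out += 1
    # Anything left on the stack is blocked by a larger element above it.
    return not stack


def contains_231(perm):
    """Brute-force check for a subsequence order-isomorphic to 231."""
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if perm[k] < perm[i] < perm[j]:
                    return True
    return False


def count_stack_sortable(n):
    """Number of stack-sortable permutations of 1..n (exhaustive search)."""
    return sum(stack_sortable(p) for p in permutations(range(1, n + 1)))
```

For small n, exhaustive search reproduces the Catalan numbers (for example, 14 sortable permutations of length 4), which is the enumerative counterpart of the pattern-avoidance characterization.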
Questionnaire data may contain missing values because certain questions do not apply to all respondents. For instance, questions addressing particular attributes of a symptom, such as frequency, triggers or seasonality, are only applicable to those who have experienced the symptom, while for those who have not, responses to these items will be missing. This missing information does not fall into the category 'missing by design'; rather, the features of interest do not exist and cannot be measured regardless of survey design. Analysis of responses to such conditional items is therefore typically restricted to the subpopulation to which they apply. This article is concerned with joint multivariate modelling of responses to both unconditional and conditional items without restricting the analysis to this subpopulation. Such an approach is of interest when the distributions of both types of responses are thought to be determined by common parameters affecting the whole population. By integrating the conditional item structure into the model, inference can be based both on unconditional data from the entire population and on conditional data from subjects for whom they exist. This approach opens new possibilities for multivariate analysis of such data. We apply this approach to latent class modelling and provide an example using data on respiratory symptoms (wheeze and cough) in children. Conditional data structures such as that considered here are common in medical research settings and, although our focus is on latent class models, the approach can be applied to other multivariate models.
In this paper we present several contributions to the optimization of collision detection, centered on hardware performance. We focus on the broad phase, which is the first step of the collision detection process, and propose three new ways of parallelizing the well-known Sweep and Prune algorithm. We first developed a multi-core model that takes into account the number of available cores. A multi-core architecture enables us to distribute geometric computations through multi-threading. Critical write sections and thread idling have been minimized by introducing new data structures for each thread. Programming with directives, as in OpenMP, appears to be a good compromise for code portability. We then proposed a new GPU-based algorithm, also based on Sweep and Prune, that has been adapted to multi-GPU architectures. Our technique is based on a spatial subdivision method used to distribute computations among GPUs. Results show that a significant speed-up can be obtained by going from 1 to 4 GPUs in a large-scale environment.
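For readers unfamiliar with the broad phase, the following is a minimal sequential sketch of a sort-based Sweep and Prune over axis-aligned bounding boxes in 2D. It is only meant to show the basic idea that the paper parallelizes; it does not reproduce the multi-core or GPU variants, and the box layout `(min_x, max_x, min_y, max_y)` and function names are assumptions made for this example.

```python
def sweep_and_prune(boxes):
    """Return the set of index pairs (i, j), i < j, whose boxes overlap.

    Each box is a tuple (min_x, max_x, min_y, max_y). Boxes are swept in
    order of min_x; a box stays in the active list only while its x-interval
    can still overlap the box currently being swept.
    """
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0])
    active = []   # indices of boxes whose x-interval is still open
    pairs = set()
    for i in order:
        min_x, max_x, min_y, max_y = boxes[i]
        # Prune: drop boxes that end before the current box starts on x.
        active = [j for j in active if boxes[j][1] >= min_x]
        # Remaining boxes overlap on x; confirm the overlap on y.
        for j in active:
            if boxes[j][2] <= max_y and min_y <= boxes[j][3]:
                pairs.add((min(i, j), max(i, j)))
        active.append(i)
    return pairs


def brute_force_pairs(boxes):
    """Reference O(n^2) test of every pair, used to check the sweep."""
    pairs = set()
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            a, b = boxes[i], boxes[j]
            if a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]:
                pairs.add((i, j))
    return pairs
```

The candidate pairs returned here would then be handed to a narrow phase for exact tests; the sweep only reduces how many pairs reach that stage.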
In this paper, we describe dynamic unicast, an approach to increase communication efficiency in opportunistic information-centric networks (ICN). The approach is based on broadcasting requests to quickly find content and dynamically creating unicast links to content sources without the need for neighbor discovery. The links are kept temporarily, as long as they deliver content, and are quickly removed otherwise. Evaluations in mobile networks show that this approach maintains ICN flexibility to support seamless mobile communication and achieves up to 56.6% shorter transmission times compared to broadcast in the case of multiple concurrent requesters. In addition, dynamic unicast relieves listener nodes of processing unwanted content, resulting in lower processing overhead and power consumption at these nodes. The approach can easily be included in existing ICN architectures using only available data structures.
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, MSEP$_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples.
Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
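To make the leave-one-out statistic concrete, the sketch below computes PRESS for an ordinary least-squares line in two ways: by refitting the model n times, and by the standard shortcut that divides each full-sample residual by one minus its leverage. It is a simplified illustration with a single predictor (the study above concerns multiple stochastic regressors), and the function names and example data are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def press_by_refitting(xs, ys):
    """PRESS computed literally: drop each point, refit, predict it."""
    total = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total


def press_by_leverage(xs, ys):
    """PRESS from one fit, using e_i / (1 - h_i) as the deleted residual,
    where h_i = 1/n + (x_i - x_bar)^2 / Sxx is the leverage of point i."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    intercept, slope = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - x_bar) ** 2 / sxx
        residual = y - (intercept + slope * x)
        total += (residual / (1.0 - leverage)) ** 2
    return total


def residual_sum_of_squares(xs, ys):
    """In-sample residual sum of squares of the full fit."""
    intercept, slope = fit_line(xs, ys)
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
```

Both routes give the same value up to rounding, and PRESS is never smaller than the in-sample residual sum of squares because every deleted residual is the ordinary residual inflated by 1 / (1 - h_i).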