12 resultados para Data Migration Processes Modeling

em Universidad Politécnica de Madrid


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Digital atlases of animal development provide a quantitative description of morphogenesis, opening the path toward processes modeling. Prototypic atlases offer a data integration framework where to gather information from cohorts of individuals with phenotypic variability. Relevant information for further theoretical reconstruction includes measurements in time and space for cell behaviors and gene expression. The latter as well as data integration in a prototypic model, rely on image processing strategies. Developing the tools to integrate and analyze biological multidimensional data are highly relevant for assessing chemical toxicity or performing drugs preclinical testing. This article surveys some of the most prominent efforts to assemble these prototypes, categorizes them according to salient criteria and discusses the key questions in the field and the future challenges toward the reconstruction of multiscale dynamics in model organisms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Los Centros de Datos se encuentran actualmente en cualquier sector de la economía mundial. Están compuestos por miles de servidores, dando servicio a los usuarios de forma global, las 24 horas del día y los 365 días del año. Durante los últimos años, las aplicaciones del ámbito de la e-Ciencia, como la e-Salud o las Ciudades Inteligentes han experimentado un desarrollo muy significativo. La necesidad de manejar de forma eficiente las necesidades de cómputo de aplicaciones de nueva generación, junto con la creciente demanda de recursos en aplicaciones tradicionales, han facilitado el rápido crecimiento y la proliferación de los Centros de Datos. El principal inconveniente de este aumento de capacidad ha sido el rápido y dramático incremento del consumo energético de estas infraestructuras. En 2010, la factura eléctrica de los Centros de Datos representaba el 1.3% del consumo eléctrico mundial. Sólo en el año 2012, el consumo de potencia de los Centros de Datos creció un 63%, alcanzando los 38GW. En 2013 se estimó un crecimiento de otro 17%, hasta llegar a los 43GW. Además, los Centros de Datos son responsables de más del 2% del total de emisiones de dióxido de carbono a la atmósfera. Esta tesis doctoral se enfrenta al problema energético proponiendo técnicas proactivas y reactivas conscientes de la temperatura y de la energía, que contribuyen a tener Centros de Datos más eficientes. Este trabajo desarrolla modelos de energía y utiliza el conocimiento sobre la demanda energética de la carga de trabajo a ejecutar y de los recursos de computación y refrigeración del Centro de Datos para optimizar el consumo. Además, los Centros de Datos son considerados como un elemento crucial dentro del marco de la aplicación ejecutada, optimizando no sólo el consumo del Centro de Datos sino el consumo energético global de la aplicación. Los principales componentes del consumo en los Centros de Datos son la potencia de computación utilizada por los equipos de IT, y la refrigeración necesaria para mantener los servidores dentro de un rango de temperatura de trabajo que asegure su correcto funcionamiento. Debido a la relación cúbica entre la velocidad de los ventiladores y el consumo de los mismos, las soluciones basadas en el sobre-aprovisionamiento de aire frío al servidor generalmente tienen como resultado ineficiencias energéticas. Por otro lado, temperaturas más elevadas en el procesador llevan a un consumo de fugas mayor, debido a la relación exponencial del consumo de fugas con la temperatura. Además, las características de la carga de trabajo y las políticas de asignación de recursos tienen un impacto importante en los balances entre corriente de fugas y consumo de refrigeración. La primera gran contribución de este trabajo es el desarrollo de modelos de potencia y temperatura que permiten describes estos balances entre corriente de fugas y refrigeración; así como la propuesta de estrategias para minimizar el consumo del servidor por medio de la asignación conjunta de refrigeración y carga desde una perspectiva multivariable. Cuando escalamos a nivel del Centro de Datos, observamos un comportamiento similar en términos del balance entre corrientes de fugas y refrigeración. Conforme aumenta la temperatura de la sala, mejora la eficiencia de la refrigeración. Sin embargo, este incremente de la temperatura de sala provoca un aumento en la temperatura de la CPU y, por tanto, también del consumo de fugas. Además, la dinámica de la sala tiene un comportamiento muy desigual, no equilibrado, debido a la asignación de carga y a la heterogeneidad en el equipamiento de IT. La segunda contribución de esta tesis es la propuesta de técnicas de asigación conscientes de la temperatura y heterogeneidad que permiten optimizar conjuntamente la asignación de tareas y refrigeración a los servidores. Estas estrategias necesitan estar respaldadas por modelos flexibles, que puedan trabajar en tiempo real, para describir el sistema desde un nivel de abstracción alto. Dentro del ámbito de las aplicaciones de nueva generación, las decisiones tomadas en el nivel de aplicación pueden tener un impacto dramático en el consumo energético de niveles de abstracción menores, como por ejemplo, en el Centro de Datos. Es importante considerar las relaciones entre todos los agentes computacionales implicados en el problema, de forma que puedan cooperar para conseguir el objetivo común de reducir el coste energético global del sistema. La tercera contribución de esta tesis es el desarrollo de optimizaciones energéticas para la aplicación global por medio de la evaluación de los costes de ejecutar parte del procesado necesario en otros niveles de abstracción, que van desde los nodos hasta el Centro de Datos, por medio de técnicas de balanceo de carga. Como resumen, el trabajo presentado en esta tesis lleva a cabo contribuciones en el modelado y optimización consciente del consumo por fugas y la refrigeración de servidores; el modelado de los Centros de Datos y el desarrollo de políticas de asignación conscientes de la heterogeneidad; y desarrolla mecanismos para la optimización energética de aplicaciones de nueva generación desde varios niveles de abstracción. ABSTRACT Data centers are easily found in every sector of the worldwide economy. They consist of tens of thousands of servers, serving millions of users globally and 24-7. In the last years, e-Science applications such e-Health or Smart Cities have experienced a significant development. The need to deal efficiently with the computational needs of next-generation applications together with the increasing demand for higher resources in traditional applications has facilitated the rapid proliferation and growing of data centers. A drawback to this capacity growth has been the rapid increase of the energy consumption of these facilities. In 2010, data center electricity represented 1.3% of all the electricity use in the world. In year 2012 alone, global data center power demand grew 63% to 38GW. A further rise of 17% to 43GW was estimated in 2013. Moreover, data centers are responsible for more than 2% of total carbon dioxide emissions. This PhD Thesis addresses the energy challenge by proposing proactive and reactive thermal and energy-aware optimization techniques that contribute to place data centers on a more scalable curve. This work develops energy models and uses the knowledge about the energy demand of the workload to be executed and the computational and cooling resources available at data center to optimize energy consumption. Moreover, data centers are considered as a crucial element within their application framework, optimizing not only the energy consumption of the facility, but the global energy consumption of the application. The main contributors to the energy consumption in a data center are the computing power drawn by IT equipment and the cooling power needed to keep the servers within a certain temperature range that ensures safe operation. Because of the cubic relation of fan power with fan speed, solutions based on over-provisioning cold air into the server usually lead to inefficiencies. On the other hand, higher chip temperatures lead to higher leakage power because of the exponential dependence of leakage on temperature. Moreover, workload characteristics as well as allocation policies also have an important impact on the leakage-cooling tradeoffs. The first key contribution of this work is the development of power and temperature models that accurately describe the leakage-cooling tradeoffs at the server level, and the proposal of strategies to minimize server energy via joint cooling and workload management from a multivariate perspective. When scaling to the data center level, a similar behavior in terms of leakage-temperature tradeoffs can be observed. As room temperature raises, the efficiency of data room cooling units improves. However, as we increase room temperature, CPU temperature raises and so does leakage power. Moreover, the thermal dynamics of a data room exhibit unbalanced patterns due to both the workload allocation and the heterogeneity of computing equipment. The second main contribution is the proposal of thermal- and heterogeneity-aware workload management techniques that jointly optimize the allocation of computation and cooling to servers. These strategies need to be backed up by flexible room level models, able to work on runtime, that describe the system from a high level perspective. Within the framework of next-generation applications, decisions taken at this scope can have a dramatical impact on the energy consumption of lower abstraction levels, i.e. the data center facility. It is important to consider the relationships between all the computational agents involved in the problem, so that they can cooperate to achieve the common goal of reducing energy in the overall system. The third main contribution is the energy optimization of the overall application by evaluating the energy costs of performing part of the processing in any of the different abstraction layers, from the node to the data center, via workload management and off-loading techniques. In summary, the work presented in this PhD Thesis, makes contributions on leakage and cooling aware server modeling and optimization, data center thermal modeling and heterogeneityaware data center resource allocation, and develops mechanisms for the energy optimization for next-generation applications from a multi-layer perspective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Self-OrganizingMap (SOM) is a neural network model that performs an ordered projection of a high dimensional input space in a low-dimensional topological structure. The process in which such mapping is formed is defined by the SOM algorithm, which is a competitive, unsupervised and nonparametric method, since it does not make any assumption about the input data distribution. The feature maps provided by this algorithm have been successfully applied for vector quantization, clustering and high dimensional data visualization processes. However, the initialization of the network topology and the selection of the SOM training parameters are two difficult tasks caused by the unknown distribution of the input signals. A misconfiguration of these parameters can generate a feature map of low-quality, so it is necessary to have some measure of the degree of adaptation of the SOM network to the input data model. The topologypreservation is the most common concept used to implement this measure. Several qualitative and quantitative methods have been proposed for measuring the degree of SOM topologypreservation, particularly using Kohonen's model. In this work, two methods for measuring the topologypreservation of the Growing Cell Structures (GCSs) model are proposed: the topographic function and the topology preserving map

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La presente tesis doctoral describe los desarrollos realizados, y finalmente materializados en patentes con registro de la propiedad intelectual, para la integración de las nuevas tecnologías de documentación fotogramétrica y las bases de datos de los barredores láser terrestres, en los procesos de elaboración, redacción y ejecución de proyectos de restauración y rehabilitación arquitectónicos. Los avances tecnológicos aparecidos en control métrico, junto con las técnicas de imagen digital y los desarrollos fotogramétricos, pueden aportar mejoras significativas en el proceso proyectual y permiten aplicar nuevos procedimientos de extracción de datos para generar de forma sencilla, bajo el control directo y supervisión de los responsables del proyecto, la información métrica y documental más adecuada. Se establecen como principios, y por tanto como base para el diseño de dicha herramienta, que los desarrollos aparecidos sí han producido el uso extendido del sistema CAD (como instrumento de dibujo) así como el uso de la imagen digital como herramienta de documentación. La herramienta a diseñar se fundamenta por tanto en la imagen digital (imágenes digitales, imágenes rectificadas, ortofotografías, estéreo- modelos, estereo- ortofotografías) así como su integración en autocad para un tratamiento interactivo. En la aplicación de la fotogrametría a la disciplina arquitectónica, se considera de interés estructurar aplicaciones con carácter integrador que, con mayores capacidades de interactuación y a partir de información veraz y rigurosa, permitan completar o elaborar documentos de interés proyectual, ABSTRACT This doctoral thesis explains the developments carried out, and finally patented with intellectual property rights, for the integration of the new photogrammetric technology documentation and terrestrial scanner databases in the preparation, documentation and implementation processes of restoration projects and architectural renovation. The technological advances in metric control, as well as the digital image techniques and photogrammetric developments, can together bring a significant improvement to the projecting process, and, under the direct control and supervision of those in charge of the project, can allow new data extraction processes to be applied in order to easily generate the most appropriate metric information and documentation. The principles and, therefore, the basis for the design of this tool are that the developments have indeed produced the widespread use of the CAD system (as a drawing instrument) and the use of digital images as a documentation tool. The tool to be designed is therefore based on digital images (rectify images, orthophotos, stereomodels, stereo-orthophotos) as well as its integration in Autocad for interactive processing. In the application of photogrammetry to the architectural discipline, what interests us is to structure applications of an integrative nature which, with a greater capacity for interaction and from accurate and thorough information, enable the completion or elaboration of documents that are of interest to the project.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the advent of cloud computing model, distributed caches have become the cornerstone for building scalable applications. Popular systems like Facebook [1] or Twitter use Memcached [5], a highly scalable distributed object cache, to speed up applications by avoiding database accesses. Distributed object caches assign objects to cache instances based on a hashing function, and objects are not moved from a cache instance to another unless more instances are added to the cache and objects are redistributed. This may lead to situations where some cache instances are overloaded when some of the objects they store are frequently accessed, while other cache instances are less frequently used. In this paper we propose a multi-resource load balancing algorithm for distributed cache systems. The algorithm aims at balancing both CPU and Memory resources among cache instances by redistributing stored data. Considering the possible conflict of balancing multiple resources at the same time, we give CPU and Memory resources weighted priorities based on the runtime load distributions. A scarcer resource is given a higher weight than a less scarce resource when load balancing. The system imbalance degree is evaluated based on monitoring information, and the utility load of a node, a unit for resource consumption. Besides, since continuous rebalance of the system may affect the QoS of applications utilizing the cache system, our data selection policy ensures that each data migration minimizes the system imbalance degree and hence, the total reconfiguration cost can be minimized. An extensive simulation is conducted to compare our policy with other policies. Our policy shows a significant improvement in time efficiency and decrease in reconfiguration cost.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Stream-mining approach is defined as a set of cutting-edge techniques designed to process streams of data in real time, in order to extract knowledge. In the particular case of classification, stream-mining has to adapt its behaviour to the volatile underlying data distributions, what has been called concept drift. Moreover, it is important to note that concept drift may lead to situations where predictive models become invalid and have therefore to be updated to represent the actual concepts that data poses. In this context, there is a specific type of concept drift, known as recurrent concept drift, where the concepts represented by data have already appeared in the past. In those cases the learning process could be saved or at least minimized by applying a previously trained model. This could be extremely useful in ubiquitous environments that are characterized by the existence of resource constrained devices. To deal with the aforementioned scenario, meta-models can be used in the process of enhancing the drift detection mechanisms used by data stream algorithms, by representing and predicting when the change will occur. There are some real-world situations where a concept reappears, as in the case of intrusion detection systems (IDS), where the same incidents or an adaptation of them usually reappear over time. In these environments the early prediction of drift by means of a better knowledge of past models can help to anticipate to the change, thus improving efficiency of the model regarding the training instances needed. By means of using meta-models as a recurrent drift detection mechanism, the ability to share concepts representations among different data mining processes is open. That kind of exchanges could improve the accuracy of the resultant local model as such model may benefit from patterns similar to the local concept that were observed in other scenarios, but not yet locally. This would also improve the efficiency of training instances used during the classification process, as long as the exchange of models would aid in the application of already trained recurrent models, that have been previously seen by any of the collaborative devices. Which it is to say that the scope of recurrence detection and representation is broaden. In fact the detection, representation and exchange of concept drift patterns would be extremely useful for the law enforcement activities fighting against cyber crime. Being the information exchange one of the main pillars of cooperation, national units would benefit from the experience and knowledge gained by third parties. Moreover, in the specific scope of critical infrastructures protection it is crucial to count with information exchange mechanisms, both from a strategical and technical scope. The exchange of concept drift detection schemes in cyber security environments would aid in the process of preventing, detecting and effectively responding to threads in cyber space. Furthermore, as a complement of meta-models, a mechanism to assess the similarity between classification models is also needed when dealing with recurrent concepts. In this context, when reusing a previously trained model a rough comparison between concepts is usually made, applying boolean logic. The introduction of fuzzy logic comparisons between models could lead to a better efficient reuse of previously seen concepts, by applying not just equal models, but also similar ones. This work faces the aforementioned open issues by means of: the MMPRec system, that integrates a meta-model mechanism and a fuzzy similarity function; a collaborative environment to share meta-models between different devices; a recurrent drift generator that allows to test the usefulness of recurrent drift systems, as it is the case of MMPRec. Moreover, this thesis presents an experimental validation of the proposed contributions using synthetic and real datasets.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

En la actualidad, el seguimiento de la dinámica de los procesos medio ambientales está considerado como un punto de gran interés en el campo medioambiental. La cobertura espacio temporal de los datos de teledetección proporciona información continua con una alta frecuencia temporal, permitiendo el análisis de la evolución de los ecosistemas desde diferentes escalas espacio-temporales. Aunque el valor de la teledetección ha sido ampliamente probado, en la actualidad solo existe un número reducido de metodologías que permiten su análisis de una forma cuantitativa. En la presente tesis se propone un esquema de trabajo para explotar las series temporales de datos de teledetección, basado en la combinación del análisis estadístico de series de tiempo y la fenometría. El objetivo principal es demostrar el uso de las series temporales de datos de teledetección para analizar la dinámica de variables medio ambientales de una forma cuantitativa. Los objetivos específicos son: (1) evaluar dichas variables medio ambientales y (2) desarrollar modelos empíricos para predecir su comportamiento futuro. Estos objetivos se materializan en cuatro aplicaciones cuyos objetivos específicos son: (1) evaluar y cartografiar estados fenológicos del cultivo del algodón mediante análisis espectral y fenometría, (2) evaluar y modelizar la estacionalidad de incendios forestales en dos regiones bioclimáticas mediante modelos dinámicos, (3) predecir el riesgo de incendios forestales a nivel pixel utilizando modelos dinámicos y (4) evaluar el funcionamiento de la vegetación en base a la autocorrelación temporal y la fenometría. Los resultados de esta tesis muestran la utilidad del ajuste de funciones para modelizar los índices espectrales AS1 y AS2. Los parámetros fenológicos derivados del ajuste de funciones permiten la identificación de distintos estados fenológicos del cultivo del algodón. El análisis espectral ha demostrado, de una forma cuantitativa, la presencia de un ciclo en el índice AS2 y de dos ciclos en el AS1 así como el comportamiento unimodal y bimodal de la estacionalidad de incendios en las regiones mediterránea y templada respectivamente. Modelos autorregresivos han sido utilizados para caracterizar la dinámica de la estacionalidad de incendios y para predecir de una forma muy precisa el riesgo de incendios forestales a nivel pixel. Ha sido demostrada la utilidad de la autocorrelación temporal para definir y caracterizar el funcionamiento de la vegetación a nivel pixel. Finalmente el concepto “Optical Functional Type” ha sido definido, donde se propone que los pixeles deberían ser considerados como unidades temporales y analizados en función de su dinámica temporal. ix SUMMARY A good understanding of land surface processes is considered as a key subject in environmental sciences. The spatial-temporal coverage of remote sensing data provides continuous observations with a high temporal frequency allowing the assessment of ecosystem evolution at different temporal and spatial scales. Although the value of remote sensing time series has been firmly proved, only few time series methods have been developed for analyzing this data in a quantitative and continuous manner. In the present dissertation a working framework to exploit Remote Sensing time series is proposed based on the combination of Time Series Analysis and phenometric approach. The main goal is to demonstrate the use of remote sensing time series to analyze quantitatively environmental variable dynamics. The specific objectives are (1) to assess environmental variables based on remote sensing time series and (2) to develop empirical models to forecast environmental variables. These objectives have been achieved in four applications which specific objectives are (1) assessing and mapping cotton crop phenological stages using spectral and phenometric analyses, (2) assessing and modeling fire seasonality in two different ecoregions by dynamic models, (3) forecasting forest fire risk on a pixel basis by dynamic models, and (4) assessing vegetation functioning based on temporal autocorrelation and phenometric analysis. The results of this dissertation show the usefulness of function fitting procedures to model AS1 and AS2. Phenometrics derived from function fitting procedure makes it possible to identify cotton crop phenological stages. Spectral analysis has demonstrated quantitatively the presence of one cycle in AS2 and two in AS1 and the unimodal and bimodal behaviour of fire seasonality in the Mediterranean and temperate ecoregions respectively. Autoregressive models has been used to characterize the dynamics of fire seasonality in two ecoregions and to forecasts accurately fire risk on a pixel basis. The usefulness of temporal autocorrelation to define and characterized land surface functioning has been demonstrated. And finally the “Optical Functional Types” concept has been proposed, in this approach pixels could be as temporal unities based on its temporal dynamics or functioning.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Due to the advancement of both, information technology in general, and databases in particular; data storage devices are becoming cheaper and data processing speed is increasing. As result of this, organizations tend to store large volumes of data holding great potential information. Decision Support Systems, DSS try to use the stored data to obtain valuable information for organizations. In this paper, we use both data models and use cases to represent the functionality of data processing in DSS following Software Engineering processes. We propose a methodology to develop DSS in the Analysis phase, respective of data processing modeling. We have used, as a starting point, a data model adapted to the semantics involved in multidimensional databases or data warehouses, DW. Also, we have taken an algorithm that provides us with all the possible ways to automatically cross check multidimensional model data. Using the aforementioned, we propose diagrams and descriptions of use cases, which can be considered as patterns representing the DSS functionality, in regard to DW data processing, DW on which DSS are based. We highlight the reusability and automation benefits that this can be achieved, and we think this study can serve as a guide in the development of DSS.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The need to refine models for best-estimate calculations, based on good-quality experimental data, has been expressed in many recent meetings in the field of nuclear applications. The modeling needs arising in this respect should not be limited to the currently available macroscopic methods but should be extended to next-generation analysis techniques that focus on more microscopic processes. One of the most valuable databases identified for the thermalhydraulics modeling was developed by the Nuclear Power Engineering Corporation (NUPEC), Japan. From 1987 to 1995, NUPEC performed steady-state and transient critical power and departure from nucleate boiling (DNB) test series based on the equivalent full-size mock-ups. Considering the reliability not only of the measured data, but also other relevant parameters such as the system pressure, inlet sub-cooling and rod surface temperature, these test series supplied the first substantial database for the development of truly mechanistic and consistent models for boiling transition and critical heat flux. Over the last few years the Pennsylvania State University (PSU) under the sponsorship of the U.S. Nuclear Regulatory Commission (NRC) has prepared, organized, conducted and summarized the OECD/NRC Full-size Fine-mesh Bundle Tests (BFBT) Benchmark. The international benchmark activities have been conducted in cooperation with the Nuclear Energy Agency/Organization for Economic Co-operation and Development (NEA/OECD) and Japan Nuclear Energy Safety (JNES) organization, Japan. Consequently, the JNES has made available the Boiling Water Reactor (BWR) NUPEC database for the purposes of the benchmark. Based on the success of the OECD/NRC BFBT benchmark the JNES has decided to release also the data based on the NUPEC Pressurized Water Reactor (PWR) subchannel and bundle tests for another follow-up international benchmark entitled OECD/NRC PWR Subchannel and Bundle Tests (PSBT) benchmark. This paper presents an application of the joint Penn State University/Technical University of Madrid (UPM) version of the well-known subchannel code COBRA-TF, namely CTF, to the critical power and departure from nucleate boiling (DNB) exercises of the OECD/NRC BFBT and PSBT benchmarks

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Provenance models are crucial for describing experimental results in science. The W3C Provenance Working Group has recently released the PROV family of specifications for provenance on the Web. While provenance focuses on what is executed, it is important in science to publish the general methods that describe scientific processes at a more abstract and general level. In this paper, we propose P-PLAN, an extension of PROV to represent plans that guid-ed the execution and their correspondence to provenance records that describe the execution itself. We motivate and discuss the use of P-PLAN and PROV to publish scientific workflows as Linked Data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

There is now an emerging need for an efficient modeling strategy to develop a new generation of monitoring systems. One method of approaching the modeling of complex processes is to obtain a global model. It should be able to capture the basic or general behavior of the system, by means of a linear or quadratic regression, and then superimpose a local model on it that can capture the localized nonlinearities of the system. In this paper, a novel method based on a hybrid incremental modeling approach is designed and applied for tool wear detection in turning processes. It involves a two-step iterative process that combines a global model with a local model to take advantage of their underlying, complementary capacities. Thus, the first step constructs a global model using a least squares regression. A local model using the fuzzy k-nearest-neighbors smoothing algorithm is obtained in the second step. A comparative study then demonstrates that the hybrid incremental model provides better error-based performance indices for detecting tool wear than a transductive neurofuzzy model and an inductive neurofuzzy model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Purely data-driven approaches for machine learning present difficulties when data are scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data-driven modeling with a physical model of the system. We show how different, physically inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology, and geostatistics.