49 results for Towards Seamless Integration of Geoscience Models and Data
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
This line of research of my group intends to establish a silicon technological platform in the field of photonics, allowing the development of a wide set of applications. In particular, what is still lacking in silicon photonics is an efficient and integrable light source such as an LED or laser. Nanocrystals in silicon oxide or nitride matrices have recently been demonstrated as competitive materials for both active components (electrically and optically driven light emitters and optical amplifiers) and passive ones (waveguides and modulators). The final goal is the complete integration of electronic and optical functions in the same CMOS chip. The first part of this paper will introduce the structural and optical properties of LEDs fabricated from silicon nanostructures. The second will treat the interaction of such nanocrystals with rare-earth elements (Er), which leads to an efficient hybrid system emitting in the third window of optical fibers. I will present the fabrication and assessment of optical waveguide amplifiers at 1.54 µm, for which we have recently been able to demonstrate optical gain in waveguides made from sputtered silicon suboxide materials.
Abstract:
Approximate Quickselect, a simple modification of the well known Quickselect algorithm for selection, can be used to efficiently find an element with rank k in a given range [i..j], out of n given elements. We study basic cost measures of Approximate Quickselect by computing exact and asymptotic results for the expected number of passes, comparisons and data moves during the execution of this algorithm. The key element appearing in the analysis of Approximate Quickselect is a trivariate recurrence that we solve in full generality. The general solution of the recurrence proves to be very useful, as it allows us to tackle several related problems, besides the analysis that originally motivated us. In particular, we have been able to carry out a precise analysis of the expected number of moves of the ith element when selecting the jth smallest element with standard Quickselect, where we are able to give both exact and asymptotic results. Moreover, we can apply our general results to obtain exact and asymptotic results for several parameters in binary search trees, namely the expected number of common ancestors of the nodes with rank i and j, the expected size of the subtree rooted at the least common ancestor of the nodes with rank i and j, and the expected distance between the nodes of ranks i and j.
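The selection problem described in this abstract can be sketched as follows. This is a minimal illustration, assuming a uniformly random pivot and a Lomuto-style partition; the function name `approximate_quickselect` and these implementation choices are assumptions and not necessarily the variant analysed in the paper.

```python
import random


def approximate_quickselect(items, i, j, rng=random):
    """Return an element whose 1-based rank lies in the range [i, j].

    Sketch only: the random pivot and Lomuto partition are assumptions.
    """
    if not 1 <= i <= j <= len(items):
        raise ValueError("need 1 <= i <= j <= len(items)")
    data = list(items)  # work on a copy so the input is left untouched
    lo, hi = 0, len(data) - 1
    while True:
        # Move a random pivot to the end of the current subarray.
        p = rng.randint(lo, hi)
        data[p], data[hi] = data[hi], data[p]
        pivot = data[hi]
        # Partition data[lo..hi]: smaller elements go to the left.
        store = lo
        for k in range(lo, hi):
            if data[k] < pivot:
                data[k], data[store] = data[store], data[k]
                store += 1
        data[store], data[hi] = data[hi], data[store]
        rank = store + 1  # 1-based rank of the pivot in the whole input
        if i <= rank <= j:
            return pivot  # any rank inside [i, j] is acceptable, so stop
        if rank < i:
            lo = store + 1  # the sought ranks are all to the right
        else:
            hi = store - 1  # the sought ranks are all to the left


values = [9, 2, 7, 4, 1, 8, 3, 6, 5, 10]
print(approximate_quickselect(values, 4, 6))  # prints one of 4, 5 or 6
```

Setting `i == j` reduces this to standard Quickselect for a single rank; widening the range lets the loop stop as soon as any pivot lands inside it.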
Abstract:
Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs on the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations for the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example of the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways.
This is of particular interest when the SNP or mutation is known to be associated with disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.
Abstract:
The increasing volume of data describing human disease processes and the growing complexity of understanding, managing, and sharing such data present a huge challenge for clinicians and medical researchers. This paper presents the @neurIST system, which provides an infrastructure for biomedical research while aiding clinical care, by bringing together heterogeneous data and complex processing and computing services. Although @neurIST targets the investigation and treatment of cerebral aneurysms, the system’s architecture is generic enough that it could be adapted to the treatment of other diseases. Innovations in @neurIST include confining the patient data pertaining to aneurysms inside a single environment that offers clinicians the tools to analyze and interpret patient data and make use of knowledge-based guidance in planning their treatment. Medical researchers gain access to a critical mass of aneurysm-related data due to the system’s ability to federate distributed information sources. A semantically mediated grid infrastructure ensures that both clinicians and researchers are able to seamlessly access and work on data that is distributed across multiple sites in a secure way, in addition to providing computing resources on demand for performing computationally intensive simulations for treatment planning and research.
Abstract:
In this paper we analyse, using Monte Carlo simulation, the possible consequences of incorrect assumptions on the true structure of the random effects covariance matrix and the true correlation pattern of residuals, over the performance of an estimation method for nonlinear mixed models. The procedure under study is the well-known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Ignoring other criteria like the convenience of avoiding over-parameterised models, it seems worse to erroneously assume some structure than not to assume any structure when this would be adequate.
A priori parameterisation of the CERES soil-crop models and tests against several European data sets
Abstract:
Mechanistic soil-crop models have become indispensable tools to investigate the effect of management practices on the productivity or environmental impacts of arable crops. Ideally these models may claim to be universally applicable because they simulate the major processes governing the fate of inputs such as fertiliser nitrogen or pesticides. However, because they deal with complex systems and uncertain phenomena, site-specific calibration is usually a prerequisite to ensure their predictions are realistic. This statement implies that some experimental knowledge on the system to be simulated should be available prior to any modelling attempt, and raises a tremendous limitation to practical applications of models. Because the demand for more general simulation results is high, modellers have nevertheless taken the bold step of extrapolating a model tested within a limited sample of real conditions to a much larger domain. While methodological questions are often disregarded in this extrapolation process, they are specifically addressed in this paper, and in particular the issue of a priori model parameterisation. We thus implemented and tested a standard procedure to parameterise the soil components of a modified version of the CERES models. The procedure converts routinely available soil properties into functional characteristics by means of pedo-transfer functions. The resulting predictions of soil water and nitrogen dynamics, as well as crop biomass, nitrogen content and leaf area index, were compared to observations from trials conducted in five locations across Europe (southern Italy, northern Spain, northern France and northern Germany). In three cases, the model’s performance was judged acceptable when compared to experimental errors on the measurements, based on a test of the model’s root mean squared error (RMSE). Significant deviations between observations and model outputs were however noted in all sites, and could be ascribed to various model routines.
In decreasing importance, these were: water balance, the turnover of soil organic matter, and crop N uptake. A better match to field observations could therefore be achieved by visually adjusting related parameters, such as field-capacity water content or the size of soil microbial biomass. As a result, model predictions fell within the measurement errors in all sites for most variables, and the model’s RMSE was within the range of published values for similar tests. We conclude that the proposed a priori method yields acceptable simulations with only a 50% probability, a figure which may be greatly increased through a posteriori calibration. Modellers should thus exercise caution when extrapolating their models to a large sample of pedo-climatic conditions for which they have only limited information.
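The RMSE criterion mentioned in this abstract can be illustrated with a generic calculation. This is a sketch only, not the authors' code: the function name `rmse`, the sample values and the `measurement_error` threshold are hypothetical, and the statistical test used in the paper may differ from the simple comparison shown here.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between paired observations and simulations."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values for illustration only (not data from the paper).
observed = [0.21, 0.25, 0.30, 0.28]
simulated = [0.20, 0.27, 0.29, 0.31]
measurement_error = 0.03  # assumed experimental error, in the same units

error = rmse(observed, simulated)
print(f"RMSE = {error:.4f}; within measurement error: {error <= measurement_error}")
```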
Abstract:
In this paper I explore the issue of nonlinearity (both in the data generation process and in the functional form that establishes the relationship between the parameters and the data) regarding the poor performance of the Generalized Method of Moments (GMM) in small samples. To this purpose I build a sequence of models starting with a simple linear model and enlarging it progressively until I approximate a standard (nonlinear) neoclassical growth model. I then use simulation techniques to find the small sample distribution of the GMM estimators in each of the models.
Abstract:
Although it is commonly accepted that most macroeconomic variables are nonstationary, it is often difficult to identify the source of the non-stationarity. In particular, it is well known that integrated and short memory models containing trending components that may display sudden changes in their parameters share some statistical properties that make their identification a hard task. The goal of this paper is to extend the classical testing framework for I(1) versus I(0) + breaks by considering a more general class of models under the null hypothesis: non-stationary fractionally integrated (FI) processes. A similar identification problem holds in this broader setting, which is shown to be a relevant issue from both a statistical and an economic perspective. The proposed test is developed in the time domain and is very simple to compute. The asymptotic properties of the new technique are derived and it is shown by simulation that it is very well-behaved in finite samples. To illustrate the usefulness of the proposed technique, an application using inflation data is also provided.
Abstract:
We study theoretical and empirical aspects of the mean exit time (MET) of financial time series. The theoretical modeling is done within the framework of continuous time random walk. We empirically verify that the mean exit time follows a quadratic scaling law and has an associated prefactor which is specific to the analyzed stock. We perform a series of statistical tests to determine which kinds of correlation are responsible for this specificity. The main contribution is associated with the autocorrelation property of stock returns. We introduce and solve analytically both two-state and three-state Markov chain models. The analytical results obtained with the two-state Markov chain model allow us to obtain a data collapse of the 20 measured MET profiles in a single master curve.
Abstract:
In this correspondence, we propose applying the hidden Markov models (HMM) theory to the problem of blind channel estimation and data detection. The Baum–Welch (BW) algorithm, which is able to estimate all the parameters of the model, is enriched by introducing some linear constraints emerging from a linear FIR hypothesis on the channel. Additionally, a version of the algorithm that is suitable for time-varying channels is also presented. Performance is analyzed in a GSM environment using standard test channels and is found to be close to that obtained with a nonblind receiver.
Abstract:
Membrane bioreactors (MBRs) are a combination of activated sludge bioreactors and membrane filtration, enabling high quality effluent with a small footprint. However, they can be beset by fouling, which causes an increase in transmembrane pressure (TMP). Modelling and simulation of changes in TMP could be useful to describe fouling through the identification of the most relevant operating conditions. Using experimental data from an MBR pilot plant operated for 462 days, two different models were developed: a deterministic model using Activated Sludge Model No. 2d (ASM2d) for the biological component and a resistance-in-series model for the filtration component, as well as a data-driven model based on multivariable regressions. Once validated, these models were used to describe membrane fouling (as changes in TMP over time) under different operating conditions. The deterministic model performed better at higher temperatures (>20 °C), constant operating conditions (DO set-point, membrane air-flow, pH and ORP), and high mixed liquor suspended solids (>6.9 g L⁻¹) and flux changes. At low pH (<7) or periods with higher pH changes, the data-driven model was more accurate. Changes in the DO set-point of the aerobic reactor that affected the TMP were also better described by the data-driven model. By combining the use of both models, a better description of fouling can be achieved under different operating conditions.
Abstract:
Expectations are central to behaviour. Despite the existence of subjective expectations data, the standard approach is to ignore these, to hypothecate a model of behaviour and to infer expectations from realisations. In the context of income models, we reveal the informational gain obtained from using both a canonical model and subjective expectations data. We propose a test for this informational gain, and illustrate our approach with an application to the problem of measuring income risk.
Abstract:
Objective: This study examines health care utilization of immigrants relative to the native-born populations aged 50 years and older in eleven European countries. Methods: We analyzed data from the Survey of Health, Aging and Retirement in Europe (SHARE) from 2004 for a sample of 27,444 individuals in 11 European countries. Negative binomial regression was conducted to examine the difference in number of doctor visits, visits to General Practitioners (GPs), and hospital stays between immigrants and native-born individuals. Results: We find evidence that immigrants above age 50 use health services on average more than native-born populations with the same characteristics. Our models show immigrants have between 6% and 27% more expected visits to the doctor, GP or hospital stays when compared to native-born populations in a number of European countries. Discussion: Elderly immigrant populations might be using health services more intensively due to cultural reasons.
Abstract:
The application of compositional data analysis through log ratio transformations corresponds to a multinomial logit model for the shares themselves. This model is characterized by the property of Independence of Irrelevant Alternatives (IIA). IIA states that the odds ratio (in this case the ratio of shares) is invariant to the addition or deletion of outcomes to the problem. It is exactly this invariance of the ratio that underlies the commonly used zero replacement procedure in compositional data analysis. In this paper we investigate using the nested logit model that does not embody IIA and an associated zero replacement procedure and compare its performance with that of the more usual approach of using the multinomial logit model. Our comparisons exploit a data set that combines voting data by electoral division with corresponding census data for each division for the 2001 Federal election in Australia.
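The correspondence between log-ratio transformations and the multinomial logit form, and the ratio invariance behind IIA, can be checked numerically. This is a minimal sketch, assuming the additive log-ratio (alr) transform with the last part as reference; the helper names `closure`, `alr` and `alr_inverse` and the example shares are hypothetical and are not taken from the paper.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]


def alr(shares):
    """Additive log-ratio: log of each share relative to the last share."""
    ref = shares[-1]
    return [math.log(s / ref) for s in shares[:-1]]


def alr_inverse(coords):
    """Recover shares from alr coordinates (the multinomial logit form)."""
    expo = [math.exp(c) for c in coords] + [1.0]
    return closure(expo)


# Hypothetical three-part composition, for illustration only.
shares = closure([0.5, 0.3, 0.2])

# Round trip: the inverse alr reproduces the original shares.
recovered = alr_inverse(alr(shares))
print(all(math.isclose(a, b) for a, b in zip(shares, recovered)))

# IIA: deleting the third outcome and re-closing leaves the ratio of the
# first two shares unchanged.
sub = closure(shares[:2])
print(math.isclose(shares[0] / shares[1], sub[0] / sub[1]))
```

A nested logit model, as investigated in the paper, would not satisfy the second check in general, which is why it calls for a different zero replacement procedure.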
Abstract:
Developments in the statistical analysis of compositional data over the last two decades have made possible a much deeper exploration of the nature of variability, and the possible processes associated with compositional data sets from many disciplines. In this paper we concentrate on geochemical data sets. First we explain how hypotheses of compositional variability may be formulated within the natural sample space, the unit simplex, including useful hypotheses of subcompositional discrimination and specific perturbational change. Then we develop through standard methodology, such as generalised likelihood ratio tests, statistical tools to allow the systematic investigation of a complete lattice of such hypotheses. Some of these tests are simple adaptations of existing multivariate tests but others require special construction. We comment on the use of graphical methods in compositional data analysis and on the ordination of specimens. The recent development of the concept of compositional processes is then explained together with the necessary tools for a staying-in-the-simplex approach, namely compositional singular value decompositions. All these statistical techniques are illustrated for a substantial compositional data set, consisting of 209 major-oxide and rare-element compositions of metamorphosed limestones from the Northeast and Central Highlands of Scotland. Finally we point out a number of unresolved problems in the statistical analysis of compositional processes.