821 results for distributed programming abstractions





Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data, which frequently requires cross-platform studies. Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles. Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments.
Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data. Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model. Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.





The Computational Biophysics Group at the Universitat Pompeu Fabra (GRIB-UPF) hosts two unique computational resources dedicated to the execution of large-scale molecular dynamics (MD) simulations: (a) the ACMD molecular-dynamics software, used on standard personal computers with graphical processing units (GPUs); and (b) the GPUGRID.net computing network, supported by users distributed worldwide who volunteer GPUs for biomedical research. We leveraged these resources and developed studies, protocols and open-source software to elucidate the energetics and pathways of a number of biomolecular systems, with a special focus on flexible proteins with many degrees of freedom. First, we characterized ion permeation through the bactericidal model protein Gramicidin A, conducting one of the largest studies to date with the steered MD biasing methodology. Next, we addressed an open problem in structural biology, the determination of drug-protein association kinetics; we reconstructed the binding free energy and the association and dissociation rates of a drug-like model system through a spatial decomposition and a Markov-chain analysis. The work was published in the Proceedings of the National Academy of Sciences and became one of the few landmark papers elucidating a ligand-binding pathway. Furthermore, we investigated the unstructured Kinase Inducible Domain (KID), a 28-peptide central to signalling and transcriptional response; the kinetics of this challenging system was modelled with a Markovian approach in collaboration with Frank Noe’s group at the Freie Universität Berlin. The impact of the funding includes three peer-reviewed publications in high-impact journals; three more papers under review; four MD analysis components, released as open-source software; MD protocols; didactic material; and code for the hosting group.





In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model simply specifies which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
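
The idea of storing and updating the best gene while scanning exons by acceptor and by donor position can be illustrated with a minimal sketch. It is not the authors' implementation: the `Exon` tuple layout, the function name `best_gene`, and the compatibility rule (a preceding exon is compatible when its donor lies strictly before the current acceptor) are assumptions made for illustration, and frame compatibility and the Gene Model are ignored. After the two initial sorts, the scan itself touches each exon a constant number of times.

```python
from typing import List, Optional, Tuple

# Hypothetical exon representation: (acceptor, donor, score), acceptor <= donor.
Exon = Tuple[int, int, float]


def best_gene(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the highest additive score and the exon chain achieving it."""
    n = len(exons)
    if n == 0:
        return 0.0, []

    # Two orderings of the same exon set: by acceptor and by donor position.
    by_acceptor = sorted(range(n), key=lambda i: exons[i][0])
    by_donor = sorted(range(n), key=lambda i: exons[i][1])

    best = [0.0] * n                          # best gene score ending at exon i
    prev: List[Optional[int]] = [None] * n    # back-pointer for reconstruction

    top_score = 0.0                 # best gene among exons already passed
    top_idx: Optional[int] = None   # None means "start a new gene here"
    j = 0                           # cursor into by_donor

    for i in by_acceptor:
        acceptor, _, score = exons[i]
        # Fold in every exon whose donor precedes this acceptor. Each such
        # exon has a smaller acceptor, so its best[] value is already final.
        while j < n and exons[by_donor[j]][1] < acceptor:
            k = by_donor[j]
            if best[k] > top_score:
                top_score, top_idx = best[k], k
            j += 1
        best[i] = score + top_score
        prev[i] = top_idx

    # Walk the back-pointers from the best final exon.
    end = max(range(n), key=lambda i: best[i])
    chain: List[Exon] = []
    cur: Optional[int] = end
    while cur is not None:
        chain.append(exons[cur])
        cur = prev[cur]
    chain.reverse()
    return best[end], chain


if __name__ == "__main__":
    candidates = [(1, 10, 5.0), (5, 20, 8.0), (12, 30, 4.0), (25, 40, 3.0)]
    print(best_gene(candidates))
```

The single running maximum `top_score` replaces the inner loop over all preceding exons that a quadratic dynamic program would use; handling frames or a Gene Model would require one such running maximum per compatible class.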





In a distributed key distribution scheme, a set of servers helps a set of users in a group to securely obtain a common key. Security means that an adversary who corrupts some servers and some users has no information about the key of a noncorrupted group. In this work, we formalize the security analysis of one such scheme which was not considered in the original proposal. We prove the scheme is secure in the random oracle model, assuming that the Decisional Diffie-Hellman (DDH) problem is hard to solve. We also detail a possible modification of that scheme and the one in which allows us to prove the security of the schemes without assuming that a specific hash function behaves as a random oracle. As usual, this improvement in the security of the schemes is at the cost of an efficiency loss.





Aim: Recently developed parametric methods in historical biogeography allow researchers to integrate temporal and palaeogeographical information into the reconstruction of biogeographical scenarios, thus overcoming a known bias of parsimony-based approaches. Here, we compare a parametric method, dispersal-extinction-cladogenesis (DEC), against a parsimony-based method, dispersal-vicariance analysis (DIVA), which does not incorporate branch lengths but accounts for phylogenetic uncertainty through a Bayesian empirical approach (Bayes-DIVA). We analyse the benefits and limitations of each method using the cosmopolitan plant family Sapindaceae as a case study. Location: World-wide. Methods: Phylogenetic relationships were estimated by Bayesian inference on a large dataset representing generic diversity within Sapindaceae. Lineage divergence times were estimated by penalized likelihood over a sample of trees from the posterior distribution of the phylogeny to account for dating uncertainty in biogeographical reconstructions. We compared biogeographical scenarios between Bayes-DIVA and two different DEC models: one with no geological constraints and another that employed a stratified palaeogeographical model in which dispersal rates were scaled according to area connectivity across four time slices, reflecting the changing continental configuration over the last 110 million years. Results: Despite differences in the underlying biogeographical model, Bayes-DIVA and DEC inferred similar biogeographical scenarios. The main differences were: (1) in the timing of dispersal events, which in Bayes-DIVA sometimes conflicts with palaeogeographical information, and (2) in the lower frequency of terminal dispersal events inferred by DEC.
Uncertainty in divergence time estimations influenced both the inference of ancestral ranges and the decisiveness with which an area can be assigned to a node. Main conclusions: By considering lineage divergence times, the DEC method gives more accurate reconstructions that are in agreement with palaeogeographical evidence. In contrast, Bayes-DIVA showed the highest decisiveness in unequivocally reconstructing ancestral ranges, probably reflecting its ability to integrate phylogenetic uncertainty. Care should be taken in defining the palaeogeographical model in DEC because of the possibility of overestimating the frequency of extinction events, or of inferring ancestral ranges that are outside the extant species ranges, owing to dispersal constraints enforced by the model. The wide-spanning spatial and temporal model proposed here could prove useful for testing large-scale biogeographical patterns in plants.





In this project I evaluated a range of platforms to determine which was best suited to integrating the tools that provide the services of the TENCompetence project. I begin by setting out the context of the project: how this final-year project fits within the TENCompetence project, in which it was carried out. Next, I review the tools available for accessing the various services the project provides. I discuss the scenarios in which the chosen technology will be applied, and finally the different web platforms into which the tools will be integrated. A following chapter presents the requirements analysis for the application scenario of each pilot. Each scenario applies particular tools to a particular context, so there are specific needs that had to be captured; I documented them in the requirements analysis. Once all the data had been gathered, I was able to select the container platform best suited to each pilot. With the requirements and the selected platform, I produced a design for each pilot. After refining the design, I carried out the implementation to cover the pilots' needs. I also took the opportunity to examine which technology can be used to integrate the tools into the platform. With the implementation complete, I ran a series of tests to assess the results achieved. I then began an iterative process to refine the design and improve the implementation.





The integrity of the cornea, the most anterior part of the eye, is indispensable for vision. Forty-five million individuals worldwide are bilaterally blind and another 135 million have severely impaired vision in both eyes because of loss of corneal transparency; treatments range from local medications to corneal transplants, and more recently to stem cell therapy. The corneal epithelium is a squamous epithelium that is constantly renewing, with a vertical turnover of 7 to 14 days in many mammals. Identification of slow cycling cells (label-retaining cells) in the limbus of the mouse has led to the notion that the limbus is the niche for the stem cells responsible for the long-term renewal of the cornea; hence, the corneal epithelium is supposedly renewed by cells generated at and migrating from the limbus, in marked opposition to other squamous epithelia in which each resident stem cell is in charge of a limited area of epithelium. Here we show that the corneal epithelium of the mouse can be serially transplanted, is self-maintained and contains oligopotent stem cells with the capacity to generate goblet cells if provided with a conjunctival environment. Furthermore, the entire ocular surface of the pig, including the cornea, contains oligopotent stem cells (holoclones) with the capacity to generate individual colonies of corneal and conjunctival cells. Therefore, the limbus is not the only niche for corneal stem cells and corneal renewal is not different from other squamous epithelia. We propose a model that unifies our observations with the literature and explains why the limbal region is enriched in stem cells.





Models incorporating more realistic models of customer behavior, as customers choosing from an offer set, have recently become popular in assortment optimization and revenue management. The dynamic program for these models is intractable and approximated by a deterministic linear program called the CDLP, which has an exponential number of columns. However, when the segment consideration sets overlap, the CDLP is difficult to solve. Column generation has been proposed, but finding an entering column has been shown to be NP-hard. In this paper we propose a new approach called SDCP to solving CDLP based on segments and their consideration sets. SDCP is a relaxation of CDLP and hence forms a looser upper bound on the dynamic program, but coincides with CDLP for the case of non-overlapping segments. If the number of elements in a consideration set for a segment is not very large, SDCP can be applied to any discrete-choice model of consumer behavior. We tighten the SDCP bound by (i) simulations, called the randomized concave programming (RCP) method, and (ii) by adding cuts to a recent compact formulation of the problem for a latent multinomial-choice model of demand (SBLP+). This latter approach turns out to be very effective, essentially obtaining the CDLP value and excellent revenue performance in simulations, even for overlapping segments. By formulating the problem as a separation problem, we give insight into why CDLP is easy for the MNL with non-overlapping consideration sets and why generalizations of MNL pose difficulties. We perform numerical simulations to determine the revenue performance of all the methods on reference data sets in the literature.





The choice network revenue management model incorporates customer purchase behavior as a function of the offered products, and is the appropriate model for airline and hotel network revenue management, dynamic sales of bundles, and dynamic assortment optimization. The optimization problem is a stochastic dynamic program and is intractable. A certainty-equivalence relaxation of the dynamic program, called the choice deterministic linear program (CDLP), is usually used to generate dynamic controls. Recently, a compact linear programming formulation of this linear program was given for the multi-segment multinomial-logit (MNL) model of customer choice with non-overlapping consideration sets. Our objective is to obtain a tighter bound than this formulation while retaining the appealing properties of a compact linear programming representation. To this end, it is natural to consider the affine relaxation of the dynamic program. We first show that the affine relaxation is NP-complete even for a single-segment MNL model. Nevertheless, by analyzing the affine relaxation we derive a new compact linear program that approximates the dynamic programming value function better than CDLP, provably between the CDLP value and the affine relaxation, and often coming close to the latter in our numerical experiments. When the segment consideration sets overlap, we show that some strong equalities called product cuts developed for the CDLP remain valid for our new formulation. Finally, we perform extensive numerical comparisons on the various bounds to evaluate their performance.





We present a new unifying framework for investigating throughput-WIP (Work-in-Process) optimal control problems in queueing systems, based on reformulating them as linear programming (LP) problems with special structure: We show that if a throughput-WIP performance pair in a stochastic system satisfies the Threshold Property we introduce in this paper, then we can reformulate the problem of optimizing a linear objective of throughput-WIP performance as a (semi-infinite) LP problem over a polygon with special structure (a threshold polygon). The strong structural properties of such polygons explain the optimality of threshold policies for optimizing linear performance objectives: their vertices correspond to the performance pairs of threshold policies. We analyze in this framework the versatile input-output queueing intensity control model introduced by Chen and Yao (1990), obtaining a variety of new results, including (a) an exact reformulation of the control problem as an LP problem over a threshold polygon; (b) an analytical characterization of the Min WIP function (giving the minimum WIP level required to attain a target throughput level); (c) an LP Value Decomposition Theorem that relates the objective value under an arbitrary policy with that of a given threshold policy (thus revealing the LP interpretation of Chen and Yao's optimality conditions); (d) diminishing returns and invariance properties of throughput-WIP performance, which underlie threshold optimality; (e) a unified treatment of the time-discounted and time-average cases.





This paper introduces the approach of using Total Unduplicated Reach and Frequency analysis (TURF) to design a product line through a binary linear programming model. This improves the efficiency of the search for the solution to the problem compared to the algorithms that have been used to date. The results obtained through our exact algorithm are presented, and the method proves to be extremely efficient both in obtaining optimal solutions and in computing time for very large instances of the problem at hand. Furthermore, the proposed technique enables the model to be improved in order to overcome the main drawbacks presented by TURF analysis in practice.
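
To make the TURF objective concrete, the sketch below evaluates unduplicated reach on a tiny instance by exhaustive enumeration. It is deliberately not the binary linear programming model of the paper, whose formulation is not given in the abstract: enumeration is exactly the kind of search an exact binary program is meant to replace, and it is only practical for toy sizes. The 0/1 `appeal` matrix (rows are respondents, columns are candidate products) and the function names are assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def turf_reach(appeal: Sequence[Sequence[int]], chosen: Sequence[int]) -> int:
    """Count respondents who accept at least one product in `chosen`."""
    return sum(1 for row in appeal if any(row[j] for j in chosen))


def best_line(appeal: Sequence[Sequence[int]], k: int) -> Tuple[Tuple[int, ...], int]:
    """Brute-force the k-product line with maximum unduplicated reach."""
    n_products = len(appeal[0])
    line = max(combinations(range(n_products), k),
               key=lambda c: turf_reach(appeal, c))
    return line, turf_reach(appeal, line)


if __name__ == "__main__":
    # Hypothetical survey: 6 respondents, 3 candidate products.
    survey: List[List[int]] = [
        [1, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 1],
    ]
    print(best_line(survey, 2))
```

The number of candidate lines grows combinatorially with the number of products, which is why an exact optimization model matters for very large instances.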





We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of n (where n is the number of bandits) increasingly stronger linear programming relaxations, the last of which is exact and corresponds to the (exponential size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and are efficiently computed. We also propose a priority-index heuristic scheduling policy from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we propose a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.





Hydrological models developed for extreme precipitation of PMP type are difficult to calibrate because of the scarcity of available data for these events. This article presents the process and results of calibration for a fine-scale distributed hydrological model developed for the estimation of probable maximum floods in the case of a PMP. The calibration is carried out on two Swiss catchments for two summer storm events. The calculation focuses on estimating the model parameters, which are divided into two groups: the first is needed to compute flow speeds, while the second is required to determine the initial and final infiltration capacities for each terrain type. The results, validated with the Nash equation, show a good correlation between the simulated and observed flows. We also apply this model to two Romanian catchments, showing the river network and estimated flow.
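
The abstract does not define the "Nash equation". Assuming it refers to the Nash-Sutcliffe efficiency commonly used to compare simulated and observed flows, a minimal sketch of that criterion is given below; the function name and the sample discharge values are hypothetical. A value of 1 indicates a perfect match, and 0 means the simulation is no better than predicting the mean observed flow.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series is constant; efficiency is undefined")
    return 1.0 - residual / variance


if __name__ == "__main__":
    # Hypothetical discharge series, for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    print(nash_sutcliffe(obs, obs))         # perfect simulation
    print(nash_sutcliffe(obs, [2.5] * 4))   # simulation equal to the observed mean
```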





Understanding the factors that drive geographic variation in life history is an important challenge in evolutionary ecology. Here, we analyze what predicts geographic variation in life-history traits of the common lizard, Zootoca vivipara, which has the globally largest distribution range of all terrestrial reptile species. Variation in body size was predicted by differences in the length of activity season, while we found no effects of environmental temperature per se. Females experiencing relatively short activity seasons mature at a larger size and remain larger on average than females in populations with relatively long activity seasons. Interpopulation variation in fecundity was largely explained by mean body size of females and reproductive mode, with viviparous populations having larger clutch size than oviparous populations. Finally, the body size-fecundity relationship differs between viviparous and oviparous populations, with relatively lower reproductive investment for a given body size in oviparous populations. While the phylogenetic signal was weak overall, the patterns of variation showed spatial effects, perhaps reflecting genetic divergence or geographic variation in additional biotic and abiotic factors. Our findings emphasize that time constraints imposed by the environment rather than ambient temperature play a major role in shaping life histories in the common lizard. This might be attributed to the fact that lizards can attain their preferred body temperature via behavioral thermoregulation across different thermal environments. Length of activity season, defining the maximum time available for lizards to maintain optimal performance, is thus the main environmental factor constraining growth rate and annual rates of mortality. Our results suggest that this factor may partly explain variation in the extent to which different taxa follow ecogeographic rules.