922 results for Data replication processes
Abstract:
The proportion of the population living in or around cities is higher than ever. Urban sprawl and car dependence have taken over from the pedestrian-friendly compact city. Environmental problems such as air pollution, waste of land and noise, as well as health problems, are the result of this still-continuing process. Urban planners have to find solutions to these complex problems while at the same time ensuring the economic performance of the city and its surroundings. Meanwhile, an increasing quantity of socio-economic and environmental data is being acquired. In order to gain a better understanding of the processes and phenomena taking place in the complex urban environment, these data should be analysed. Numerous methods for modelling and simulating such a system exist, are still under development, and can be exploited by urban geographers to improve our understanding of the urban metabolism. Modern and innovative visualisation techniques help in communicating the results of such models and simulations. This thesis covers several methods for the analysis, modelling, simulation and visualisation of problems related to urban geography. The analysis of high-dimensional socio-economic data using artificial neural network techniques, especially self-organising maps, is shown using two examples at different scales. The problem of spatiotemporal modelling and data representation is treated and some possible solutions are shown. The simulation of urban dynamics, and more specifically of the traffic generated by commuting to work, is illustrated using multi-agent micro-simulation techniques. A section on visualisation methods presents cartograms for transforming the geographic space into a feature space, and the distance circle map, a centre-based map representation particularly useful for urban agglomerations. Some issues concerning the importance of scale in urban analysis and the clustering of urban phenomena are discussed. A new approach to defining urban areas at different scales is developed, and the link with percolation theory is established. Fractal statistics, especially the lacunarity measure, and scale laws are used for characterising urban clusters. In a final section, population evolution is modelled using a model close to the well-established gravity model. The work covers quite a wide range of methods useful in urban geography. These methods should be developed further and, at the same time, find their way into the daily work and decision processes of urban planners.
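As an illustrative aside, not taken from the thesis: a self-organising map of the kind mentioned above can be sketched in a few lines of numpy. The grid size, learning-rate schedule and synthetic data below are placeholder assumptions.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal self-organising map for high-dimensional data (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    w = rng.normal(size=(rows, cols, data.shape[1]))  # codebook vectors
    coords = np.dstack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"))
    n_steps = epochs * len(data)
    for t, x in enumerate(data[rng.permutation(n_steps) % len(data)]):
        frac = t / n_steps
        lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
        # best-matching unit: node whose codebook vector is closest to the sample
        bmu = np.unravel_index(np.argmin(((w - x) ** 2).sum(-1)), (rows, cols))
        # Gaussian neighbourhood pulls nearby nodes towards the sample
        d2 = ((coords - np.array(bmu)) ** 2).sum(-1)
        h = np.exp(-d2 / (2 * sigma ** 2))[..., None]
        w += lr * h * (x - w)
    return w

# toy usage: 200 synthetic "communes" described by 5 socio-economic indicators
data = np.random.default_rng(1).normal(size=(200, 5))
codebook = train_som(data)
```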
Abstract:
Tests for bioaccessibility are useful in human health risk assessment. No studies aiming to determine bioaccessible arsenic (As) in areas affected by gold mining and smelting activities have been published in Brazil so far. Samples were collected from four areas: a private natural land reserve of Cerrado; mine tailings; overburden; and refuse from gold smelting of a mining company in Paracatu, Minas Gerais. The total, bioaccessible and Mehlich-1-extractable As levels were determined. Based on the reproducibility and the accuracy/precision of the in vitro gastrointestinal (IVG) method for determining bioaccessible As in the reference material NIST 2710, it was concluded that this procedure is adequate for determining bioaccessible As in soil and tailing samples from gold mining areas in Brazil. All samples from the studied mining area contained low percentages of bioaccessible As.
Abstract:
An equation for mean first-passage times of non-Markovian processes driven by colored noise is derived through an appropriate backward integro-differential equation. The equation is solved in a Bourret-like approximation. In a weak-noise bistable situation, non-Markovian effects are taken into account by an effective diffusion coefficient. In this situation, our results compare satisfactorily with other approaches and experimental data.
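For orientation, not from the paper itself: in the Markovian (white-noise) limit the mean first-passage time T(x) obeys the standard backward equation, and small-correlation-time colored noise is commonly absorbed into an effective diffusion coefficient, one leading-order form of which is shown below.

```latex
% Markovian MFPT backward equation for \dot{x} = f(x) + \xi(t), noise intensity D,
% with a reflecting boundary at a and an absorbing boundary at b:
\[
  D\,T''(x) + f(x)\,T'(x) = -1, \qquad T(b) = 0, \quad T'(a) = 0,
\]
% colored noise of correlation time \tau enters, to leading order in \tau,
% through an effective diffusion coefficient:
\[
  D_{\mathrm{eff}}(x) \simeq \frac{D}{1 - \tau f'(x)}.
\]
```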
Influence of age on retinochoroidal healing processes after argon photocoagulation in C57BL/6J mice.
Abstract:
PURPOSE: To analyze the influence of age on retinochoroidal wound healing processes and on glial growth factor and cytokine mRNA expression profiles observed after argon laser photocoagulation. METHODS: A cellular and morphometric study was performed using 44 C57BL/6J mice: 4-week-old mice (group I, n=8), 6-week-old mice (group II, n=8), 10-12-week-old mice (group III, n=14), and 1-year-old mice (group IV, n=14). All mice in these groups underwent standard argon laser photocoagulation (50 µm, 400 mW, 0.05 s). Two separate lesions were created in each retina using a slit lamp delivery system. At 1, 3, 7, 14, and 60 days and 4 months after photocoagulation, mice from each of the four groups were sacrificed by carbon dioxide inhalation. Groups III and IV were also studied at 6, 7, and 8 months after photocoagulation. At each time point the enucleated eyes were either mounted in Tissue Tek (OCT), snap frozen, and processed for immunohistochemistry, or flat mounted (left eyes of groups III and IV). To determine, by RT-PCR, the time course of glial fibrillary acidic protein (GFAP), vascular endothelial growth factor (VEGF), and monocyte chemotactic protein-1 (MCP-1) gene expression, we delivered ten laser burns (50 µm, 400 mW, 0.05 s) to each retina in 10-12-week-old mice (group III', n=10) and 1-year-old mice (group IV', n=10). Animals from groups III' and IV' were the same age as those from groups III and IV, but they received ten laser impacts in each eye and served for the molecular analysis. Mice from groups III and IV received only two laser impacts per eye and served for the cellular and morphologic study. Retinal and choroidal tissues from these treated mice were collected at 16 h and at 1, 2, 3, and 7 days after photocoagulation. Two mice of each group did not receive photocoagulation and were used as controls. RESULTS: In the cellular and morphologic study, the extent of the resultant retinal pigment epithelium interruption differed significantly between the four groups. It was more concise and smaller in the oldest group IV (112.1±11.4 µm versus 219.1±12.2 µm in group III; p<0.0001 between groups III and IV). By contrast, while choroidal neovascularization (CNV) was mild and not readily identifiable in group I at all time points studied, CNV was more prominent in the 1-year-old group IV than in the other groups. For instance, up to 14 days after photocoagulation, the CNV reaction was statistically larger in group IV than in group III (p=0.0049 between groups III and IV on slide sections and p<0.0001 between the same groups on flat mounts). Moreover, four months after photocoagulation, the CNV area (on slide sections) was 1,282±90 µm² for group III and 2,999±115 µm² for group IV (p<0.0001 between groups III and IV). Accordingly, GFAP, VEGF, and MCP-1 mRNA expression profiles, determined by RT-PCR at 16 h and at 1, 2, 3, and 7 days postphotocoagulation, were modified with aging. In 1-year-old mice (group IV), GFAP mRNA expression was already significantly higher than in the younger (10-12-week-old) group III before photocoagulation. After laser burns, GFAP mRNA expression peaked at 16-24 h and on day 7, decreasing thereafter. VEGF mRNA expression was markedly increased after photocoagulation in old mouse eyes, reaching 2.7 times its basal level at day 3, while it was only slightly increased in young mice (1.3 times its level in untreated young mice 3 days postphotocoagulation).
At all time points after photocoagulation, MCP-1 mRNA expression was elevated in old mice, peaking at 16 h and day 3. CONCLUSIONS: Our results were based on the study of four different age groups and included not only data from morphological observations but also a molecular analysis of the various alterations of cytokine signaling and expression. One-year-old mice demonstrated more extensive CNV formation and a slower pace of regression after laser photocoagulation than younger mice. These changes were accompanied by differences in growth factor and cytokine expression profiles, indicating that aging is a factor that aggravates CNV. The above results may provide some insight into possible therapeutic strategies in the future.
Abstract:
Background: Chronic disease management initiatives emphasize patient-centered care, and quality of life (QoL) is increasingly considered a representative outcome in that context. In this study we evaluated the association between receipt of processes of diabetic care and QoL. Methods: This cross-sectional population-based study (2011) used self-reported data from non-institutionalized adult diabetics recruited from randomly selected community pharmacies in Vaud. Outcomes included the physical and mental composites of the SF-36 (PCS, MCS) and the disease-specific Audit of Diabetes-Dependent QoL (ADDQoL). The main exposure variables were receipt of six diabetes processes of care in the past 12 months. We also evaluated whether the association between care received and QoL was congruent with the chronic care model, as assessed by the Patient Assessment of Chronic Illness Care (PACIC). We used linear regressions to examine the association between process measures and the three composites of health-related QoL. Analyses were adjusted for age, gender, socioeconomic status, living companion, BMI, alcohol, smoking, physical activity, co-morbidities and diabetes mellitus (DM) characteristics (type, insulin use, complications, duration). Results: The mean age of the 519 diabetic patients was 64.4 years (SD 11.3), 60% were male and 73% had a living companion; 87% reported type 2 DM, half of the respondents required insulin treatment, 48% had at least one DM complication, and 48% had had DM for over 10 years. Crude overall mean QoL scores were PCS: 43.4 (SD 10.5), MCS: 47.0 (SD 11.2) and ADDQoL: -1.56 (SD 1.6). In bivariate analyses, patients who received the influenza vaccine had lower ADDQoL and PCS scores than those who did not; there were no differences for the other indicators. In adjusted models including all processes, receipt of the influenza vaccine was associated with lower ADDQoL (β = -0.41, p = 0.01); there were no other associations between process indicators and QoL composites. No association emerged even when the processes were combined into composite measures of care. The PACIC score was associated only with the MCS (β = 1.57, p = 0.004). Conclusions: Process indicators for diabetes care did not show an association with QoL. This may reflect a lag between receipt of a process of care and its effect on quality of life, or the possibility that treatment brings inconvenience and patient worry. Further research is needed to explore these unexpected findings.
Abstract:
The graphical representation of spatial soil properties in a digital environment is complex because it requires converting data collected in discrete form onto a continuous surface. The objective of this study was to apply three-dimensional interpolation and visualization techniques to soil texture and fertility properties and to establish relationships with pedogenetic factors and processes in a slope area. The GRASS Geographic Information System was used to generate three-dimensional models and the ParaView software to visualize soil volumes. Samples of the A, AB, BA, and B horizons were collected in a regular 122-point grid in an area of 13 ha in Pinhais, PR, in southern Brazil. Geoprocessing and graphic computing techniques were effective in identifying and delimiting soil volumes of distinct ranges of fertility properties confined within the soil matrix. Both the three-dimensional interpolation and the visualization tool facilitated the interpretation, in a continuous space (volumes), of the cause-effect relationships between soil texture and fertility properties and pedological factors and processes, such as higher clay contents following the drainage lines of the area. The flattest part, with more weathered soils (Oxisols), had the highest pH values and lower Al³⁺ concentrations. These techniques of data interpolation and visualization have great potential for use in diverse areas of soil science, such as the identification of soil volumes occurring side by side but exhibiting different physical, chemical, and mineralogical conditions for plant root growth, and the monitoring of plumes of organic and inorganic pollutants in soils and sediments, among other applications. The methodological details for the interpolation and three-dimensional viewing of soil data are presented here.
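As a hedged sketch of the general workflow, not the study's GRASS/ParaView pipeline: scattered soil observations can be interpolated onto a regular 3-D grid and thresholded into "soil volumes". The sample data and property range below are invented for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# toy scattered soil samples: (x, y, depth) locations and a clay-content value
rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(300, 3))              # x, y in metres; depth in cm
clay = 20 + 0.3 * pts[:, 2] + rng.normal(0, 2, 300)   # clay increases with depth

# regular 3-D grid onto which the scattered observations are interpolated
xi, yi, zi = np.mgrid[0:100:25j, 0:100:25j, 0:100:10j]
vol = griddata(pts, clay, (xi, yi, zi), method="linear")  # NaN outside the hull

# voxels in a given property range delimit a "soil volume", as in the paper
mask = (vol > 40) & (vol < 50)
print(f"{np.count_nonzero(mask)} voxels fall in the 40-50% clay range")
```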
Abstract:
Because of the increase in workplace automation and the diversification of industrial processes, workplaces have become more and more complex. The classical approaches used to address workplace hazard concerns, such as checklists or sequence models, are therefore of limited use in such complex systems. Moreover, because of the multifaceted nature of workplaces, the use of single-oriented methods, such as AEA (man oriented), FMEA (system oriented), or HAZOP (process oriented), is not satisfactory. The use of a dynamic modeling approach allowing multiple-oriented analyses may constitute an alternative to overcome this limitation. The qualitative modeling aspects of the MORM (man-machine occupational risk modeling) model are discussed in this article. The model, realized with an object-oriented Petri net tool (CO-OPN), has been developed to simulate and analyze industrial processes from an OH&S perspective. The industrial process is modeled as a set of interconnected subnets (state spaces) that describe its constitutive machines. Process-related factors are introduced, in an explicit way, through machine interconnections and flow properties. Man-machine interactions are modeled as triggering events for the state spaces of the machines, and the CREAM cognitive behavior model is used to establish the relevant triggering events. In the CO-OPN formalism, the model is expressed as a set of interconnected CO-OPN objects defined over data types expressing the measure attached to the flow of entities transiting through the machines. Constraints on the measures assigned to these entities are used to determine the state changes in each machine. Interconnecting machines implies the composition of such flows and consequently the interconnection of the measure constraints. This is reflected in the construction of constraint enrichment hierarchies, which can be used for simulation and analysis optimization in a clear mathematical framework. The use of Petri nets to perform multiple-oriented analysis opens perspectives in the field of industrial risk management. It may significantly reduce the duration of the assessment process, but most of all it opens perspectives in the fields of risk comparison and integrated risk management. Moreover, because of the generic nature of the model and tool used, the same concepts and patterns may be used to model a wide range of systems and application fields.
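A minimal place/transition Petri net, far simpler than the CO-OPN formalism used by MORM, can make the token-flow mechanics concrete; the places, transitions and marking below are illustrative assumptions.

```python
# Minimal place/transition Petri net: a machine cycles idle -> busy -> idle
# while consuming parts from an input buffer (illustrative, not CO-OPN).
marking = {"buffer": 3, "idle": 1, "busy": 0}

transitions = {
    # name: (consumed tokens, produced tokens)
    "start": ({"buffer": 1, "idle": 1}, {"busy": 1}),
    "finish": ({"busy": 1}, {"idle": 1}),
}

def enabled(name):
    return all(marking[p] >= n for p, n in transitions[name][0].items())

def fire(name):
    pre, post = transitions[name]
    assert enabled(name), f"{name} is not enabled"
    for p, n in pre.items():
        marking[p] -= n
    for p, n in post.items():
        marking[p] = marking.get(p, 0) + n

# simulate: process every part in the buffer
while enabled("start"):
    fire("start")
    fire("finish")
print(marking)   # {'buffer': 0, 'idle': 1, 'busy': 0}
```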
Abstract:
Extreme-times techniques, generally applied to nonequilibrium statistical mechanical processes, are also useful for a better understanding of financial markets. We present a detailed study of the mean first-passage time for the volatility of return time series. The empirical results extracted from daily data of major indices seem to follow the same law regardless of the kind of index, thus suggesting a universal pattern. The empirical mean first-passage time to a certain level L is fairly different from that of the Wiener process, showing a dissimilar behavior depending on whether L is higher or lower than the average volatility. All of this indicates a more complex dynamics in which a reverting force drives volatility toward its mean value. We thus present the mean first-passage time expressions of the most common stochastic volatility models, whose approach is comparable to the random diffusion description. We discuss asymptotic approximations of these models and compare them with empirical results, finding good agreement with the exponential Ornstein-Uhlenbeck model.
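A minimal Monte Carlo sketch of the quantity studied above, assuming an Ornstein-Uhlenbeck process as the volatility proxy; all parameter values are arbitrary placeholders.

```python
import numpy as np

def mfpt_ou(L, x0=0.0, theta=1.0, mu=0.0, sigma=0.5, dt=1e-3, n_paths=2000, t_max=200.0):
    """Monte Carlo mean first-passage time of an OU process to level L."""
    rng = np.random.default_rng(0)
    x = np.full(n_paths, x0)
    t_hit = np.full(n_paths, np.nan)   # paths that never hit within t_max stay NaN
    alive = np.ones(n_paths, dtype=bool)
    for k in range(int(t_max / dt)):
        x[alive] += theta * (mu - x[alive]) * dt + sigma * np.sqrt(dt) * rng.normal(size=alive.sum())
        hit = alive & (x >= L if L >= x0 else x <= L)
        t_hit[hit] = (k + 1) * dt
        alive &= ~hit
        if not alive.any():
            break
    return np.nanmean(t_hit)

# crossing a level above the mean takes much longer than one below it,
# reflecting the mean-reverting force described in the abstract
print(mfpt_ou(L=1.0), mfpt_ou(L=-0.2))
```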
Abstract:
Preface: The starting point for this work, and eventually the subject of the whole thesis, was the question of how to estimate the parameters of affine stochastic volatility jump-diffusion models. These models are very important for contingent claim pricing. Their major advantage, the availability of analytical solutions for characteristic functions, has made them the models of choice for many theoretical constructions and practical applications. At the same time, estimating the parameters of stochastic volatility jump-diffusion models is not a straightforward task. The problem comes from the variance process, which is non-observable. There are several estimation methodologies that deal with the estimation problems of latent variables. One appeared particularly interesting: it proposes an estimator that, in contrast to the other methods, requires neither discretization nor simulation of the process, namely the continuous empirical characteristic function (ECF) estimator based on the unconditional characteristic function. However, the procedure had been derived only for stochastic volatility models without jumps. Thus, it became the subject of my research. This thesis consists of three parts. Each is written as an independent and self-contained article. At the same time, the questions answered by the second and third parts of this work arise naturally from the issues investigated and the results obtained in the first one. The first chapter is the theoretical foundation of the thesis. It proposes an estimation procedure for stochastic volatility models with jumps in both the asset price and variance processes. The estimation procedure is based on the joint unconditional characteristic function of the stochastic process. The major analytical result of this part, as well as of the whole thesis, is the closed-form expression for the joint unconditional characteristic function of stochastic volatility jump-diffusion models. The empirical part of the chapter suggests that, besides stochastic volatility, jumps in both the mean and the volatility equation are relevant for modelling returns of the S&P500 index, which was chosen as a general representative of the stock asset class. Hence, the next question is: what jump process should be used to model the returns of the S&P500? The decision about the jump process in the framework of affine jump-diffusion models boils down to defining the intensity of the compound Poisson process, a constant or some function of the state variables, and to choosing the distribution of the jump size. While the jump in the variance process is usually assumed to be exponential, there are at least three distributions of the jump size currently used for the asset log-prices: normal, exponential and double exponential. The second part of this thesis shows that normal jumps in the asset log-returns should be used if we are to model the S&P500 index by a stochastic volatility jump-diffusion model. This is a surprising result: the exponential distribution has fatter tails, and for this reason either the exponential or the double exponential jump size was expected to provide the best fit of the stochastic volatility jump-diffusion models to the data. The idea of testing the efficiency of the continuous ECF estimator on simulated data had already appeared when the first estimation results of the first chapter were obtained. In the absence of a benchmark or any ground for comparison, it is unreasonable to be sure that our parameter estimates and the true parameters of the models coincide.
The conclusion of the second chapter provides one more reason to perform that kind of test. Thus, the third part of this thesis concentrates on the estimation of the parameters of stochastic volatility jump-diffusion models on the basis of asset price time series simulated from various "true" parameter sets. The goal is to show that the continuous ECF estimator based on the joint unconditional characteristic function is capable of finding the true parameters, and the third chapter proves that our estimator indeed has the ability to do so. Once it is clear that the continuous ECF estimator based on the unconditional characteristic function works, the next question arises immediately: can the computational effort be reduced without affecting the efficiency of the estimator, or can the efficiency of the estimator be improved without dramatically increasing the computational burden? The efficiency of the continuous ECF estimator depends on the number of dimensions of the joint unconditional characteristic function used for its construction. Theoretically, the more dimensions there are, the more efficient the estimation procedure. In practice, however, this relationship is not so straightforward, owing to the increasing computational difficulties. The second chapter, for example, in addition to the choice of the jump process, discusses the possibility of using the marginal, i.e. one-dimensional, unconditional characteristic function in the estimation instead of the joint, bi-dimensional, unconditional characteristic function. As a result, the preference for one or the other depends on the model to be estimated; thus, the computational effort can be reduced in some cases without affecting the efficiency of the estimator. Improving the estimator's efficiency by increasing its dimensionality faces more difficulties. The third chapter of this thesis, in addition to what was discussed above, compares the performance of the estimators with bi- and three-dimensional unconditional characteristic functions on simulated data. It shows that the theoretical efficiency of the continuous ECF estimator based on the three-dimensional unconditional characteristic function is not attainable in practice, at least for the moment, owing to the limitations of the computing power and optimization toolboxes available to the general public. Thus, the continuous ECF estimator based on the joint, bi-dimensional, unconditional characteristic function has every reason to exist and to be used for the estimation of the parameters of stochastic volatility jump-diffusion models.
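A toy illustration of the ECF idea in its simplest form, far from the joint-CF jump-diffusion estimator developed in the thesis: parameters are fitted by matching the empirical characteristic function to a model characteristic function on a frequency grid. The Gaussian model, grid and data below are placeholder assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
returns = rng.normal(0.05, 0.2, size=5000)             # synthetic i.i.d. returns

u = np.linspace(0.1, 10.0, 60)                         # frequency grid
ecf = np.exp(1j * np.outer(u, returns)).mean(axis=1)   # empirical CF

def loss(params):
    mu, sigma = params
    model_cf = np.exp(1j * u * mu - 0.5 * (u * sigma) ** 2)  # Gaussian CF
    return np.sum(np.abs(ecf - model_cf) ** 2)         # squared CF distance

fit = minimize(loss, x0=[0.0, 0.1], method="Nelder-Mead")
print(fit.x)    # should recover roughly (0.05, 0.2)
```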
Abstract:
The in vitro adenovirus (Ad) DNA replication system provides an assay to study the interaction of viral and host replication proteins with the DNA template in the formation of the preinitiation complex. This initiation system requires, in addition to the origin DNA sequences: 1) Ad DNA polymerase (Pol); 2) Ad preterminal protein (pTP), the covalent acceptor for protein-primed DNA replication; and 3) nuclear factor I (NFI), a host cell protein identical to the CCAAT box-binding transcription factor. The interactions of these proteins were studied by coimmunoprecipitation and Ad origin DNA binding assays. Ad Pol can bind to origin sequences only in the presence of another protein, which can be either pTP or NFI. While NFI alone can bind to its origin recognition sequence, pTP does not specifically recognize DNA unless Ad Pol is present. Thus, protein-protein interactions are necessary for the targeting of either Ad Pol or pTP to the preinitiation complex. DNA footprinting demonstrated that the Ad DNA site recognized by the pTP.Pol complex lies within the first 18 bases at the end of the template, which constitute the minimal origin of replication. Mutagenesis studies have defined the Ad Pol interaction site on NFI between amino acids 68 and 150, which overlaps the DNA binding and replication activation domain of this factor. A putative zinc finger on the Ad Pol has been mutated to a product that fails to bind the Ad origin sequences but still interacts with pTP. These results indicate that both protein-protein and protein-DNA interactions mediate specific recognition of the replication origin by Ad DNA polymerase.
Abstract:
AIM: Phylogenetic diversity patterns are increasingly being used to better understand the role of ecological and evolutionary processes in community assembly. Here, we quantify how these patterns are influenced by scale choices in terms of spatial and environmental extent and organismic scales. LOCATION: European Alps. METHODS: We applied 42 sampling strategies differing in their combination of focal scales. For each resulting sub-dataset, we estimated the phylogenetic diversity of the species pools, phylogenetic α-diversities of local communities, and statistics commonly used together with null models in order to infer non-random diversity patterns (i.e. phylogenetic clustering versus over-dispersion). Finally, we studied the effects of scale choices on these measures using regression analyses. RESULTS: Scale choices were decisive for revealing signals in diversity patterns. Notably, changes in focal scales sometimes reversed a pattern of over-dispersion into clustering. Organismic scale had a stronger effect than spatial and environmental extent. However, we did not find general rules for the direction of change from over-dispersion to clustering with changing scales. Importantly, these scale issues had only a weak influence when focusing on regional diversity patterns that change along abiotic gradients. MAIN CONCLUSIONS: Our results call for caution when combining phylogenetic data with distributional data to study how and why communities differ from random expectations of phylogenetic relatedness. These analyses seem to be robust when the focus is on relating community diversity patterns to variation in habitat conditions, such as abiotic gradients. However, if the focus is on identifying relevant assembly rules for local communities, the uncertainty arising from a certain scale choice can be immense. In the latter case, it becomes necessary to test whether emerging patterns are robust to alternative scale choices.
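A minimal sketch of the kind of null-model statistic referred to above (the standardized effect size of mean pairwise distance, where negative values suggest phylogenetic clustering and positive values over-dispersion); the distance matrix, community and species pool below are synthetic placeholders.

```python
import numpy as np

def ses_mpd(dist, community, pool, n_null=999, seed=0):
    """Standardized effect size of mean pairwise distance vs. a random-draw null."""
    rng = np.random.default_rng(seed)
    def mpd(idx):
        sub = dist[np.ix_(idx, idx)]
        return sub[np.triu_indices(len(idx), k=1)].mean()
    obs = mpd(community)
    null = np.array([mpd(rng.choice(pool, size=len(community), replace=False))
                     for _ in range(n_null)])
    return (obs - null.mean()) / null.std()

# toy data: 50-species pool, random "phylogenetic" distances, 10-species community
rng = np.random.default_rng(1)
d = rng.uniform(0, 1, size=(50, 50)); d = (d + d.T) / 2; np.fill_diagonal(d, 0)
print(ses_mpd(d, community=np.arange(10), pool=np.arange(50)))
```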
Abstract:
Background: Molecular tools may help to uncover closely related and still diverging species from a wide variety of taxa and provide insight into the mechanisms, pace and geography of marine speciation. There is some controversy over the phylogeography and speciation modes of species groups with an Eastern Atlantic-Western Indian Ocean distribution, with previous studies suggesting that older (Miocene) events and/or more recent (Pleistocene) oceanographic processes could have influenced the phylogeny of marine taxa. The spiny lobster genus Palinurus allows for testing among speciation hypotheses, since it has a particular distribution, with two groups of three species each in the Northeastern Atlantic (P. elephas, P. mauritanicus and P. charlestoni) and the Southeastern Atlantic and Southwestern Indian Oceans (P. gilchristi, P. delagoae and P. barbarae). In the present study, we obtain a more complete understanding of the phylogenetic relationships among these species through a combined dataset with both nuclear and mitochondrial markers, by testing alternative hypotheses on both the mutation rate and the tree topology under recently developed approximate Bayesian computation (ABC) methods. Results: Our analyses support a North-to-South speciation pattern in Palinurus, with all the South African species forming a monophyletic clade nested within the Northern Hemisphere species. Coalescent-based ABC methods allowed us to reject the previously proposed hypothesis of a Middle Miocene speciation event related to the closure of the Tethyan Seaway. Instead, divergence times obtained for Palinurus species using the combined mtDNA-microsatellite dataset and standard mutation rates for mtDNA agree with known glaciation-related processes occurring during the last 2 million years. Conclusion: The Palinurus speciation pattern is a typical example of a series of rapid speciation events occurring within a group, with very short branches separating different species. Our results support the hypothesis that recent climate change-related oceanographic processes have influenced the phylogeny of marine taxa, with most Palinurus species originating during the last two million years. The present study highlights the value of new coalescent-based statistical methods such as ABC for testing different speciation hypotheses using molecular data.
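A hedged, minimal rejection-ABC sketch showing the mechanics of the method named above; the toy model, mutation rate and summary statistic are invented and bear no relation to the paper's coalescent analyses.

```python
import numpy as np

rng = np.random.default_rng(0)
MU = 1e-6            # assumed per-site per-generation mutation rate (placeholder)
N_SITES = 10_000
observed_diffs = 40  # toy observed pairwise differences between two species

def simulate(t_div):
    """Differences accumulate roughly as Poisson(2 * mu * t * sites)."""
    return rng.poisson(2 * MU * t_div * N_SITES)

# rejection ABC: keep prior draws whose simulated statistic is close to the data
prior = rng.uniform(0, 100_000, size=100_000)   # divergence-time prior (generations)
accepted = [t for t in prior if abs(simulate(t) - observed_diffs) <= 2]
print(f"posterior mean divergence time: {np.mean(accepted):,.0f} generations")
```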
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is the first important step that can thus affect the life of a cell, modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization of the binding of TFs to explain the expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1. C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected in the conservation of the binding sites across mammals and in the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERα) and we study the influence of hERα in regulating transcription. hERα, upon hormone estrogen signaling, binds to DNA to regulate the transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERα, we conduct an in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERα partners, the TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTag (ChIP-pet) data about hERα binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies of the response of cells to estrogen. We confirm that hERα binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERα and a high occurrence of SP1 motifs, in particular near estrogen-responsive genes.
The second group shows strong binding of hERα and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERα-mediated induction of gene expression. Our work supports the model of hERα activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERα has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERα can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantifications of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models, where we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator, or as a repressor, on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method was tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
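A hedged sketch of one way to implement the sign-coherence constraint, not the thesis's algorithm: estimate a global sign per TF from an unconstrained fit, flip the corresponding columns, and run a positively constrained lasso so that each TF acts coherently across all its targets. All data and parameters below are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
n_genes, n_tfs, n_times = 40, 5, 30
X = rng.normal(size=(n_times, n_tfs))             # TF activities over time
true_w = np.zeros((n_tfs, n_genes))
true_w[0], true_w[1] = 2.0, -1.5                  # TF0 activator, TF1 repressor
Y = X @ true_w + 0.1 * rng.normal(size=(n_times, n_genes))

# 1) estimate a global sign per TF from an unconstrained fit over all targets
signs = np.sign(LinearRegression().fit(X, Y).coef_.sum(axis=0))
signs[signs == 0] = 1.0

# 2) flip columns so that, after fitting with w >= 0, each TF keeps one sign
W = np.zeros((n_tfs, n_genes))
for g in range(n_genes):
    model = Lasso(alpha=0.05, positive=True).fit(X * signs, Y[:, g])
    W[:, g] = model.coef_ * signs                 # flip back to original signs

print(np.round(W[:, :3], 2))  # TF0 row positive, TF1 row negative, rest ~0
```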
Abstract:
The temporal dynamics of species diversity are shaped by variations in the rates of speciation and extinction, and there is a long history of inferring these rates using first and last appearances of taxa in the fossil record. Understanding diversity dynamics critically depends on unbiased estimates of the unobserved times of speciation and extinction for all lineages, but the inference of these parameters is challenging due to the complex nature of the available data. Here, we present a new probabilistic framework to jointly estimate species-specific times of speciation and extinction and the rates of the underlying birth-death process based on the fossil record. The rates are allowed to vary through time independently of each other, and the probability of preservation and sampling is explicitly incorporated in the model to estimate the true lifespan of each lineage. We implement a Bayesian algorithm to assess the presence of rate shifts by exploring alternative diversification models. Tests on a range of simulated data sets reveal the accuracy and robustness of our approach against violations of the underlying assumptions and various degrees of data incompleteness. Finally, we demonstrate the application of our method with the diversification of the mammal family Rhinocerotidae and reveal a complex history of repeated and independent temporal shifts of both speciation and extinction rates, leading to the expansion and subsequent decline of the group. The estimated parameters of the birth-death process implemented here are directly comparable with those obtained from dated molecular phylogenies. Thus, our model represents a step towards integrating phylogenetic and fossil information to infer macroevolutionary processes.
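As an illustrative aside, not the authors' full Bayesian framework: one ingredient of such models is that a lineage with lifespan s and preservation rate q leaves k ~ Poisson(q·s) fossil occurrences, and only lineages with k ≥ 1 are ever observed. The sketch below recovers q and the extinction rate μ by maximum likelihood from simulated lineages, treating lifespans as known for simplicity; all values are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
MU_TRUE, Q_TRUE, N = 0.1, 0.5, 500           # extinction and preservation rates
lifespans = rng.exponential(1 / MU_TRUE, N)  # true (unobserved) lineage durations
counts = rng.poisson(Q_TRUE * lifespans)     # fossil occurrences per lineage

# keep only sampled lineages (k >= 1), as in a real fossil data set
obs = counts[counts >= 1]
span = lifespans[counts >= 1]                # stand-in for estimated lifespans

def negloglik(params):
    mu, q = np.exp(params)                   # log scale keeps rates positive
    # joint density of (lifespan, count) for sampled lineages:
    # mu*exp(-mu*s) * Poisson(k; q*s), conditioned on being sampled at all,
    # where P(sampled) = q / (mu + q); the constant log(k!) term is dropped
    ll = (np.log(mu) - mu * span
          + obs * np.log(q * span) - q * span
          + np.log(mu + q) - np.log(q))
    return -ll.sum()

fit = minimize(negloglik, x0=np.log([0.5, 0.1]), method="Nelder-Mead")
print(np.exp(fit.x))   # roughly (0.1, 0.5)
```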