889 results for Heterogeneous regression
Abstract:
Distance-based regression is a prediction method consisting of two steps: from the distances between observations we obtain latent variables, which then become the regressors in an ordinary least squares linear model. The distances are computed from the original predictors using a suitable dissimilarity function. Since, in general, the regressors are related to the response in a nonlinear way, their selection with the usual F test is not possible. In this work we propose a solution to this predictor selection problem by defining generalized test statistics and adapting a nonparametric bootstrap method for the estimation of their p-values. A numerical example with automobile insurance data is included.
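The two-step procedure lends itself to a compact sketch. The following is a minimal illustration, not the authors' implementation: it assumes Euclidean distances and classical multidimensional scaling (double-centering plus eigendecomposition) to extract the latent variables, then fits them by ordinary least squares; all names are illustrative.

```python
import numpy as np

def distance_based_regression(X, y, k=3):
    """Sketch of distance-based regression:
    (1) latent variables from pairwise distances via classical MDS,
    (2) ordinary least squares on the latent coordinates."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances (stand-in for any suitable dissimilarity)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # Double-centering turns squared distances into an inner-product matrix
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]                         # k largest eigenvalues
    Z = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))  # latent coordinates
    # OLS of the response on the latent variables (with intercept)
    Z1 = np.column_stack([np.ones(n), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    return Z1 @ beta, beta

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=50)   # nonlinear in the predictors
fitted, beta = distance_based_regression(X, y)
```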
Abstract:
This paper analyzes a continuous-time stochastic model in which the decision-maker discounts the instantaneous utilities and the final function at constant but different rates of time preference. In this setting one can model problems in which, as time approaches the final moment, the valuation of the final function increases relative to that of the instantaneous utilities. This kind of asymmetry cannot be described with either standard or variable discounting. In order to obtain time-consistent solutions, the stochastic dynamic programming equation is derived, whose solutions are Markovian equilibria. For this kind of time preferences, the classical consumption and investment model (Merton, 1971) is studied for CRRA and CARA utility functions, comparing the Markovian equilibria with the time-inconsistent solutions. Finally, the introduction of a random final time is discussed.
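In the notation such models typically use (illustrative here, not necessarily the paper's), the objective functional would take a form like the following, with the running utility discounted at rate ρ and the final function at a different rate δ:

```latex
% Heterogeneous constant discounting (illustrative notation):
% \rho discounts instantaneous utility, \delta the final function.
\max_{c}\; \mathbb{E}\!\left[ \int_{t}^{T} e^{-\rho (s-t)}\, u(c_s)\, ds
    \;+\; e^{-\delta (T-t)}\, F(W_T) \right], \qquad \rho \neq \delta .
```

When δ < ρ, the final function F is discounted more mildly than the instantaneous utilities, so its relative weight grows as t approaches T, which is exactly the asymmetry described above.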
Abstract:
Abstract: The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is thus the first important step that can affect the life of a cell, modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This Thesis presents new computational methodologies to study gene regulation. In addition we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys like chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization of TF binding to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1. C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. hERa, upon hormone estrogen signaling, binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, the TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-pet) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies about the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and a high occurrence of SP1 motifs, in particular near estrogen-responsive genes.
The second group shows strong binding of hERa and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression. Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantifications of mRNA levels or protein concentrations. Our approach uses well-established penalized linear regression models in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator, or as a repressor, on all its targets. This requirement is implemented as constraints on the signs of the regression coefficients in the penalized linear model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method was tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. These bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
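The sign-coherence constraint described above can be illustrated compactly. The sketch below is not the thesis code: it assumes the activator/repressor sign pattern is given, and uses the standard reduction of a sign-constrained lasso to a non-negative lasso (flip each column by its sign, then require non-negative coefficients), solved with scikit-learn's `positive=True` option; all variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sign_constrained_lasso(X, y, signs, alpha=0.1):
    """Sparse linear regression with fixed coefficient signs.

    Each regulator j is forced to act coherently: signs[j] = +1
    (activator) or -1 (repressor). Substituting w = signs * u with
    u >= 0 turns the problem into a non-negative lasso."""
    model = Lasso(alpha=alpha, positive=True)
    model.fit(X * signs, y)        # flip columns so the solution u is >= 0
    return signs * model.coef_     # map back to the sign-constrained coefficients

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))                    # e.g. TF activities over time points
w_true = np.array([2.0, -1.5, 0.0, 0.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=100)
signs = np.array([1.0, -1.0, 1.0, 1.0, 1.0])     # assumed known coherence pattern
w_hat = sign_constrained_lasso(X, y, signs)
```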
Abstract:
A Knudsen flow reactor has been used to quantify surface functional groups on aerosols collected in the field. This technique is based on a heterogeneous titration reaction between a probe gas and a specific functional group on the particle surface. In the first part of this work, the reactivity of different probe gases on laboratory-generated aerosols (limonene SOA, Pb(NO3)2, Cd(NO3)2) and diesel reference soot (SRM 2975) has been studied. Five probe gases have been selected for the quantitative determination of important functional groups: N(CH3)3 (for the titration of acidic sites), NH2OH (for carbonyl functions), CF3COOH and HCl (for basic sites of different strength), and O3 (for oxidizable groups). The second part describes a field campaign undertaken in several bus depots in Switzerland, where ambient fine and ultrafine particles were collected on suitable filters and quantitatively investigated using the Knudsen flow reactor. Results point to important differences in the surface reactivity of ambient particles, depending on the sampling site and season. The particle surface appears to be multi-functional, with the simultaneous presence of antagonistic functional groups which do not undergo internal chemical reactions, such as acid-base neutralization. Results also indicate that the surface of ambient particles was characterized by a high density of carbonyl functions (reactivity towards the NH2OH probe in the range 0.26-6 formal molecular monolayers) and a low density of acidic sites (reactivity towards the N(CH3)3 probe in the range 0.01-0.20 formal molecular monolayers). Kinetic parameters point to fast redox reactions (uptake coefficient γ0 > 10^-3 for the O3 probe) and slow acid-base reactions (γ0 < 10^-4 for the N(CH3)3 probe) on the particle surface. [Authors]
Abstract:
Six probe gases (N(CH3)3, NH2OH, CF3COOH, HCl, NO2, O3) were selected to probe the surface of seven combustion aerosols (amorphous carbon, flame soot) and three types of TiO2 nanoparticles using heterogeneous, that is gas-surface, reactions. The gas uptake to saturation of the probes was measured under molecular flow conditions in a Knudsen flow reactor and expressed as a density of surface functional groups on a particular aerosol, namely acidic (carboxylic) and basic (conjugated oxides such as pyrones, N-heterocycles) sites, carbonyl (R1-C(O)-R2) and oxidizable (olefinic, -OH) groups. The limit of detection was generally well below 1% of a formal monolayer of adsorbed probe gas. With few exceptions, most investigated aerosol samples interacted with all probe gases, which points to the coexistence of different functional groups, such as acidic and basic groups, on the same aerosol surface. Generally, the carbonaceous particles displayed significant differences in surface group density: Printex 60 amorphous carbon had the lowest density of surface functional groups throughout, whereas Diesel soot recovered from a Diesel particulate filter had the largest. The presence of basic oxides on carbonaceous aerosol particles was inferred from the ratio of the uptakes of CF3COOH and HCl, owing to the larger stability of the acetate compared to the chloride counterion in the resulting pyrylium salt. Both soots, generated from a rich and a lean hexane diffusion flame, had a large density of oxidizable groups, similar to amorphous carbon FS 101. TiO2 15 had the lowest density of functional groups among the three studied TiO2 nanoparticles for all probe gases, despite the smallest size of its primary particles. The technique used enabled the measurement of the uptake probability of the probe gases on the various supported aerosol samples. The initial uptake probability, γ0, of the probe gas onto the supported nanoparticles differed significantly among the various investigated aerosol samples but was roughly correlated with the density of surface groups, as expected. [Authors]
Abstract:
Robust estimators for accelerated failure time models with asymmetric (or symmetric) error distributions and censored observations are proposed. It is assumed that the error model belongs to a log-location-scale family of distributions and that the mean response is the parameter of interest. Since scale is a main component of the mean, scale is not treated as a nuisance parameter. A three-step procedure is proposed. In the first step, an initial high breakdown point S-estimate is computed. In the second step, observations that are unlikely under the estimated model are rejected or downweighted. Finally, a weighted maximum likelihood estimate is computed. To define the estimates, functions of censored residuals are replaced by their estimated conditional expectation given that the response is larger than the observed censored value. The rejection rule in the second step is based on an adaptive cut-off that, asymptotically, does not reject any observation when the data are generated according to the model. Therefore, the final estimate attains full efficiency at the model, with respect to the maximum likelihood estimate, while maintaining the breakdown point of the initial estimator. Asymptotic results are provided. The new procedure is evaluated with the help of Monte Carlo simulations. Two examples with real data are discussed.
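A toy version of the three-step logic (not the authors' estimator, which additionally handles censoring through conditional expectations of residuals) can be sketched for an uncensored log-normal sample; the cut-off quantile and all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def three_step_toy(logt, cutoff_q=0.999):
    """Toy three-step estimate on log-failure-times (no censoring).

    Step 1: high-breakdown initial estimate (median / MAD).
    Step 2: reject observations whose standardized residuals exceed a
            high quantile, so that almost nothing is rejected when the
            data follow the model.
    Step 3: maximum likelihood on the retained observations
            (a 0/1-weighted ML step)."""
    mu0 = np.median(logt)
    s0 = 1.4826 * np.median(np.abs(logt - mu0))   # MAD, consistent at the normal
    keep = np.abs((logt - mu0) / s0) <= norm.ppf(cutoff_q)
    return logt[keep].mean(), logt[keep].std(ddof=1)

rng = np.random.default_rng(2)
logt = np.concatenate([rng.normal(1.0, 0.5, 95),   # data from the model
                       rng.normal(8.0, 0.5, 5)])   # gross outliers
mu, s = three_step_toy(logt)
```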
Abstract:
The relationship between hypoxic stress, autophagy, and specific cell-mediated cytotoxicity remains unknown. This study shows that hypoxia-induced resistance of lung tumors to cytolytic T lymphocyte (CTL)-mediated lysis is associated with autophagy induction in the target cells. In turn, this correlates with STAT3 phosphorylation on the tyrosine 705 residue (pSTAT3) and HIF-1α accumulation. Inhibition of autophagy by siRNA targeting of either beclin1 or Atg5 resulted in impairment of pSTAT3 and restoration of hypoxic tumor cell susceptibility to CTL-mediated lysis. Furthermore, inhibition of pSTAT3 in hypoxic Atg5- or beclin1-targeted tumor cells was found to be associated with the inhibition of Src kinase (pSrc). Autophagy-induced pSTAT3 and pSrc regulation seemed to involve the ubiquitin proteasome system and p62/SQSTM1. In vivo experiments using B16-F10 melanoma tumor cells indicated that depletion of beclin1 resulted in an inhibition of B16-F10 tumor growth and increased tumor apoptosis. Moreover, in vivo inhibition of autophagy by hydroxychloroquine in B16-F10 tumor-bearing mice and in mice vaccinated with tyrosinase-related protein-2 peptide dramatically increased tumor growth inhibition. Collectively, this study establishes a novel functional link between hypoxia-induced autophagy and the regulation of antigen-specific T-cell lysis and points to a major role of autophagy in the control of in vivo tumor growth.
Abstract:
The determination of line crossing sequences between rollerball pens and laser printers presents difficulties that may not be overcome using traditional techniques. This research aimed to study the potential of digital microscopy and 3-D laser profilometry to determine line crossing sequences between a toner and an aqueous ink line. Different paper types, rollerball pens, and writing pressures were tested. Correct opinions of the sequence were given for all case scenarios using both techniques. When the toner was printed before the ink, a light reflection was observed in all crossing specimens, while this was never observed in the other sequence types. 3-D laser profilometry, though more time-consuming, presented the main advantage of providing quantitative results. The findings confirm the potential of 3-D laser profilometry and demonstrate the efficiency of digital microscopy as a new technique for determining the sequence of line crossings involving rollerball pen ink and toner. With the mass marketing of laser printers and the popularity of rollerball pens, the determination of line crossing sequences between such instruments is frequently encountered by forensic document examiners. This type of crossing presents difficulties for the optical microscopy techniques used for line crossings involving ballpoint or gel pens and toner (1-4). Indeed, the rollerball's aqueous ink penetrates through the toner and is absorbed by the fibers of the paper, leaving the examiner with the impression that the toner is above the ink even when it is not (5). Novotny and Westwood (3) investigated the possibility of determining aqueous ink and toner crossing sequences by microscopic observation of the intersection before and after toner removal. A major disadvantage of their study resides in the destruction of the sample by scraping off the toner line to see what was underneath. The aim of this research was to investigate ways to overcome these difficulties through digital microscopy and three-dimensional (3-D) laser profilometry. The former has been used as a technique for determining sequences between gel pen and toner printing strokes, but provided less conclusive results than an optical stereomicroscope (4). 3-D laser profilometry, which allows one to observe and measure the topography of a surface, has been the subject of a number of recent studies in this area. Berx and De Kinder (6) and Schirripa Spagnolo (7,8) have tested the application of laser profilometry to determine the sequence of intersections of several lines. The results obtained in these studies overcome disadvantages of other methods applied in this area, such as the scanning electron microscope or the atomic force microscope. The main advantages of 3-D laser profilometry include the ease of implementation of the technique and its nondestructive nature, which does not require sample preparation (8-10). Moreover, the technique is reproducible and offers a large measurement range on the vertical axis (up to 1000 μm). However, when the paper surface presents a given roughness, if the pen impressions alter the paper to a depth similar to the roughness of the medium, the results are not always conclusive (8). In this case it becomes difficult to distinguish which characteristics can be attributed to the pen impressions and which to the quality of the paper surface. This important limitation is assessed by testing different types of paper of variable quality (of different grammage and finishing) and different writing pressures.
The authors therefore assess the limits of the 3-D laser profilometry technique and determine whether the method can overcome such constraints. Second, the authors investigate the use of digital microscopy because it presents a number of advantages: it is efficient, user-friendly, and provides an objective evaluation and interpretation.
Abstract:
When researchers introduce a new test, they have to demonstrate that it is valid, using unbiased designs and suitable statistical procedures. In this article we use Monte Carlo analyses to highlight how incorrect statistical procedures (e.g., stepwise regression, extreme-scores analyses) or ignoring regression assumptions (e.g., heteroscedasticity) contribute to wrong validity estimates. Beyond these demonstrations, and as an example, we re-examined the results reported by Warwick, Nettelbeck, and Ward (2010) concerning the validity of the Ability Emotional Intelligence Measure (AEIM). Warwick et al. used the wrong statistical procedures to conclude that the AEIM was incrementally valid beyond intelligence and personality traits in predicting various outcomes. In our re-analysis, we found that the reliability-corrected multiple correlation of their measures with personality and intelligence was up to .69. Using robust statistical procedures and appropriate controls, we also found that the AEIM did not predict incremental variance in GPA, stress, loneliness, or well-being, demonstrating the importance of testing for validity rather than merely looking for it.
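One of the Monte Carlo demonstrations can be reproduced in miniature. The sketch below (ours, not the authors' code) shows why picking the best of many null predictors, as stepwise selection does, and then validating it on the same sample inflates apparent validity; sample sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, n_sim = 100, 20, 2000
max_abs_r = np.empty(n_sim)
for i in range(n_sim):
    X = rng.normal(size=(n, p))
    y = rng.normal(size=n)          # the criterion is unrelated to every predictor
    # sample correlation of each predictor with y
    r = (X * (y - y.mean())[:, None]).mean(axis=0) / (X.std(axis=0) * y.std())
    max_abs_r[i] = np.abs(r).max()  # stepwise-style pick: best single predictor

# The selected predictor's |r| is far larger than what a single
# preplanned test would produce under the null.
print("mean selected |r| under the null:", round(max_abs_r.mean(), 3))
```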
Abstract:
The effect of heterogeneous environments on the dynamics of invasion and on the eradication or control of invasive species is poorly understood, although it is a major challenge for biodiversity conservation. Here, we first investigate how the probability of invasion and the time to invasion are affected by spatial heterogeneity. Then, we study the effect of control program strategies (e.g. species specificity, spatial scale of action, detection and eradication efficiency) on the success and duration of eradication. We find that heterogeneity increases both the invasion probability and the time to invasion. Heterogeneity also reduces the probability of eradication but does not change the time taken for successful eradication. We confirm that early detection of invasive species reduces the time until eradication, but we also demonstrate that this is true only if the local control action is sufficiently efficient. The criterion of removal efficiency is even more important for an eradication program than simply ensuring control effort while the invasive species is not yet abundant.
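As a rough illustration of the kind of question asked above (a toy model of ours, not the authors'), one can simulate stochastic spread along a chain of patches whose suitabilities are either uniform or heterogeneous with the same mean, and compare times to full invasion; all parameters are illustrative.

```python
import numpy as np

def time_to_invasion(suitability, p_spread=0.3, max_steps=10_000, rng=None):
    """Each step, every occupied patch tries to colonize each empty
    neighbour; success probability is p_spread scaled by the
    neighbour's suitability. Heterogeneity = variance in suitability."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(suitability)
    occupied = np.zeros(n, dtype=bool)
    occupied[0] = True                       # introduction at one edge
    for t in range(1, max_steps + 1):
        for i in np.flatnonzero(occupied):
            for j in (i - 1, i + 1):
                if 0 <= j < n and not occupied[j]:
                    if rng.random() < p_spread * suitability[j]:
                        occupied[j] = True
        if occupied.all():
            return t
    return np.inf                            # invasion did not complete

rng = np.random.default_rng(4)
homogeneous = np.full(20, 0.5)
heterogeneous = rng.uniform(0.0, 1.0, 20)    # same mean, higher variance
t_hom = [time_to_invasion(homogeneous, rng=rng) for _ in range(200)]
t_het = [time_to_invasion(heterogeneous, rng=rng) for _ in range(200)]
```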
Abstract:
Logistic regression is among the analysis techniques that are valid for observational methodology. However, its presence at the heart of this methodology, and more specifically in physical activity and sports studies, is scarce. With a view to highlighting the possibilities this technique offers within the scope of observational methodology applied to physical activity and sports, an application of the logistic regression model is presented. The model is applied in the context of an observational design which aims to determine, from the analysis of the use of the playing area, which football discipline (7-a-side, 9-a-side or 11-a-side football) is best adapted to the child's possibilities. A multiple logistic regression model can provide an effective prognosis of the probability of a move being successful (reaching the opposing goal area) depending on the sector in which the move commenced and the football discipline being played.
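A minimal sketch of such a model with scikit-learn, one-hot encoding the two categorical predictors (the column names and data are illustrative, not the study's):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Probability that a move reaches the opposing goal area, from the
# sector where the move commenced and the discipline being played.
moves = pd.DataFrame({
    "sector":     ["defence", "midfield", "attack", "midfield", "attack", "defence"],
    "discipline": ["F7", "F9", "F11", "F7", "F9", "F11"],
    "success":    [0, 0, 1, 1, 1, 0],
})
model = make_pipeline(OneHotEncoder(handle_unknown="ignore"),
                      LogisticRegression(max_iter=1000))
model.fit(moves[["sector", "discipline"]], moves["success"])
p_success = model.predict_proba(moves[["sector", "discipline"]])[:, 1]
```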
Abstract:
Cuscuta spp. are holoparasitic plants that can simultaneously parasitise several host plants. It has been suggested that Cuscuta has evolved a foraging strategy based on a positive relationship between pre-uptake investment and subsequent reward on different host species. Here we establish reliable parasite size measures and show that parasitism on individuals of different host species alters the biomass of C. campestris, but that within host species, size and age also contribute to the heterogeneous resource landscape. We then performed two additional experiments to test whether C. campestris achieves greater resource acquisition by parasitising two host species rather than one, and whether C. campestris forages in communities of hosts offering different rewards (a choice experiment). There was no evidence in either experiment for direct benefits of a mixed host diet. Cuscuta campestris foraged by parasitising the most rewarding hosts the fastest and then investing the most in them. We conclude that our data present strong evidence for foraging in the parasitic plant C. campestris.
Abstract:
This paper investigates the use of ensembles of predictors in order to improve the performance of spatial prediction methods. Support vector regression (SVR), a popular method from the field of statistical machine learning, is used. Several instances of SVR are combined using different data sampling schemes (bagging and boosting). Bagging shows good performance and proves to be more computationally efficient than training a single SVR model, while also reducing error. Boosting, however, does not improve results on this specific problem.
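The comparison described above can be sketched with scikit-learn's standard ensemble wrappers on synthetic data (the paper's spatial dataset and settings are not reproduced here; all hyperparameters are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, BaggingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

X, y = make_regression(n_samples=500, n_features=2, noise=10.0, random_state=0)

models = {
    "single SVR": SVR(kernel="rbf", C=10.0),
    # Bagging: each SVR sees a random half of the data, so training is
    # cheaper than one SVR on the full set, and predictions are averaged.
    "bagged SVR": BaggingRegressor(SVR(kernel="rbf", C=10.0),
                                   n_estimators=20, max_samples=0.5,
                                   random_state=0),
    # Boosting: SVRs are fit sequentially, reweighting hard examples.
    "boosted SVR": AdaBoostRegressor(SVR(kernel="rbf", C=10.0),
                                     n_estimators=20, random_state=0),
}
for name, model in models.items():
    mse = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    print(f"{name}: CV MSE = {mse:.1f}")
```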