87 results for Missing data
Abstract:
DNA microarrays are among the most widely used technologies for measuring gene expression. However, there are several distinct microarray platforms from different manufacturers, each with its own measurement protocol, which yields data that can hardly be compared or directly integrated. Integrating data from multiple sources aims to improve the reliability of statistical tests and to mitigate the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprises a set of tasks that range from the re-annotation of the features used to measure gene expression to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, comprising a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases such as The Cancer Genome Atlas and Gene Expression Omnibus.
Abstract:
Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several interactions that occur within a cell. Recent studies have provided information that helps researchers build and understand these networks. One of the major sources of information for building TRNs is the biomedical literature. However, because the number of scientific papers is increasing rapidly, it is difficult to analyse the large volume of papers published on this subject, which has heightened the importance of Biomedical Text Mining approaches for this task. Also, owing to the lack of adequate standards, inconsistencies in gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques to this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.
Abstract:
Dijet events produced in LHC proton-proton collisions at a center-of-mass energy $\sqrt{s} = 8$ TeV are studied with the ATLAS detector using the full 2012 data set, with an integrated luminosity of 20.3 fb$^{-1}$. Dijet masses up to about 4.5 TeV are probed. No resonance-like features are observed in the dijet mass spectrum. Limits on the cross section times acceptance are set at the 95% credibility level for various hypotheses of new phenomena in terms of mass or energy scale, as appropriate. This analysis excludes excited quarks with a mass below 4.09 TeV, color-octet scalars with a mass below 2.72 TeV, heavy $W'$ bosons with a mass below 2.45 TeV, chiral $W^*$ bosons with a mass below 1.75 TeV, and quantum black holes with six extra space-time dimensions with threshold mass below 5.82 TeV.
Abstract:
This Letter presents a search at the LHC for s-channel single top-quark production in proton-proton collisions at a centre-of-mass energy of 8 TeV. The analyzed data set was recorded by the ATLAS detector and corresponds to an integrated luminosity of 20.3 fb$^{-1}$. Selected events contain one charged lepton, large missing transverse momentum and exactly two b-tagged jets. A multivariate event classifier based on boosted decision trees is developed to discriminate s-channel single top-quark events from the main background contributions. The signal extraction is based on a binned maximum-likelihood fit of the output classifier distribution. The analysis leads to an upper limit on the s-channel single top-quark production cross-section of 14.6 pb at the 95% confidence level. The fit gives a cross-section of $\sigma_s = 5.0 \pm 4.3$ pb, consistent with the Standard Model expectation.
Abstract:
Simultaneous measurements of the $t\bar{t}$, $W^+W^-$, and $Z/\gamma^* \to \tau\tau$ production cross-sections using an integrated luminosity of 4.6 fb$^{-1}$ of pp collisions at $\sqrt{s} = 7$ TeV collected by the ATLAS detector at the LHC are presented. Events are selected with two high transverse momentum leptons consisting of an oppositely charged electron and muon pair. The three processes are separated using the distributions of the missing transverse momentum of events with zero and greater than zero jet multiplicities. Measurements of the fiducial cross-section are presented along with results that quantify for the first time the underlying correlations in the predicted and measured cross-sections due to proton parton distribution functions. These results indicate that the correlated NLO predictions for $t\bar{t}$ and $Z/\gamma^* \to \tau\tau$ significantly underestimate the data, while those at NNLO generally describe the data well. The full cross-sections are measured to be $\sigma(t\bar{t}) = 181.2 \pm 2.8\,^{+9.7}_{-9.5} \pm 3.3 \pm 3.3$ pb, $\sigma(W^+W^-) = 53.3 \pm 2.7\,^{+7.3}_{-8.0} \pm 1.0 \pm 0.5$ pb, and $\sigma(Z/\gamma^* \to \tau\tau) = 1174 \pm 24\,^{+72}_{-87} \pm 21 \pm 9$ pb, where the cited uncertainties are due to statistics, systematic effects, luminosity and the LHC beam energy measurement, respectively.
Abstract:
A search for the pair-production of heavy leptons ($N^0$, $L^\pm$) predicted by the type-III seesaw theory formulated to explain the origin of small neutrino masses is presented. The decay channels $N^0 \to W^\pm \ell^\mp$ ($\ell = e, \mu, \tau$) and $L^\pm \to W^\pm \nu$ ($\nu = \nu_e, \nu_\mu, \nu_\tau$) are considered. The analysis is performed using the final state that contains two leptons (electrons or muons), two jets from a hadronically decaying W boson, and large missing transverse momentum. The data used in the measurement correspond to an integrated luminosity of 20.3 fb$^{-1}$ of pp collisions at $\sqrt{s} = 8$ TeV collected by the ATLAS detector at the LHC. No evidence of heavy lepton pair-production is observed. Heavy leptons with masses below 325-540 GeV are excluded at the 95% confidence level, depending on the theoretical scenario considered.
Abstract:
A search for pair production of vector-like quarks, both up-type (T) and down-type (B), as well as for four-top-quark production, is presented. The search is based on pp collisions at $\sqrt{s} = 8$ TeV recorded in 2012 with the ATLAS detector at the CERN Large Hadron Collider and corresponding to an integrated luminosity of 20.3 fb$^{-1}$. Data are analysed in the lepton-plus-jets final state, characterised by an isolated electron or muon with high transverse momentum, large missing transverse momentum and multiple jets. Dedicated analyses are performed targeting three cases: a T quark with significant branching ratio to a W boson and a b-quark ($T\bar{T} \to Wb+X$), and both a T quark and a B quark with significant branching ratio to a Higgs boson and a third-generation quark ($T\bar{T} \to Ht+X$ and $B\bar{B} \to Hb+X$ respectively). No significant excess of events above the Standard Model expectation is observed, and 95% CL lower limits are derived on the masses of the vector-like T and B quarks under several branching ratio hypotheses assuming contributions from $T \to Wb$, $Zt$, $Ht$ and $B \to Wt$, $Zb$, $Hb$ decays. The 95% CL observed lower limits on the T quark mass range between 715 GeV and 950 GeV for all possible values of the branching ratios into the three decay modes, and are the most stringent constraints to date. Additionally, the most restrictive upper bounds on four-top-quark production are set in a number of new physics scenarios.
Abstract:
Doctoral Programme in Mathematics and Applications.
Abstract:
A measurement of the top-antitop ($t\bar{t}$) charge asymmetry is presented using data corresponding to an integrated luminosity of 4.6 fb$^{-1}$ of LHC pp collisions at a centre-of-mass energy of 7 TeV collected by the ATLAS detector. Events with two charged leptons, at least two jets and large missing transverse momentum are selected. Two observables are studied: $A_C^{\ell\ell}$, based on the identified charged leptons, and $A_C^{t\bar{t}}$, based on the reconstructed $t\bar{t}$ final state. The asymmetries are measured to be $A_C^{\ell\ell} = 0.024 \pm 0.015$ (stat.) $\pm 0.009$ (syst.) and $A_C^{t\bar{t}} = 0.021 \pm 0.025$ (stat.) $\pm 0.017$ (syst.). The measured values are in agreement with the Standard Model predictions.
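For orientation, charge asymmetries of this kind are conventionally defined as counting asymmetries. The sketch below follows that common convention. The event counts $N$ and the differences $\Delta|\eta|$ and $\Delta|y|$ are notation assumed here, not given in the abstract, so the paper's exact definitions may differ in detail.

```latex
% Lepton-based asymmetry: difference of absolute pseudorapidities of the two leptons
A_C^{\ell\ell} = \frac{N(\Delta|\eta| > 0) - N(\Delta|\eta| < 0)}
                      {N(\Delta|\eta| > 0) + N(\Delta|\eta| < 0)},
\qquad \Delta|\eta| = |\eta_{\ell^+}| - |\eta_{\ell^-}|

% Top-based asymmetry: difference of absolute rapidities of the reconstructed quarks
A_C^{t\bar{t}} = \frac{N(\Delta|y| > 0) - N(\Delta|y| < 0)}
                      {N(\Delta|y| > 0) + N(\Delta|y| < 0)},
\qquad \Delta|y| = |y_t| - |y_{\bar{t}}|
```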
Abstract:
The inclusive jet cross-section is measured in proton-proton collisions at a centre-of-mass energy of 7 TeV using a data set corresponding to an integrated luminosity of 4.5 fb$^{-1}$ collected with the ATLAS detector at the Large Hadron Collider in 2011. Jets are identified using the anti-$k_t$ algorithm with radius parameter values of 0.4 and 0.6. The double-differential cross-sections are presented as a function of the jet transverse momentum and the jet rapidity, covering jet transverse momenta from 100 GeV to 2 TeV. Next-to-leading-order QCD calculations corrected for non-perturbative effects and electroweak effects, as well as Monte Carlo simulations with next-to-leading-order matrix elements interfaced to parton showering, are compared to the measured cross-sections. A quantitative comparison of the measured cross-sections to the QCD calculations using several sets of parton distribution functions is performed.
Abstract:
A search for the production of single-top-quarks in association with missing energy is performed in proton-proton collisions at a centre-of-mass energy of $\sqrt{s} = 8$ TeV with the ATLAS experiment at the Large Hadron Collider using data collected in 2012, corresponding to an integrated luminosity of 20.3 fb$^{-1}$. In this search, the W boson from the top quark is required to decay into an electron or a muon and a neutrino. No deviation from the Standard Model prediction is observed, and upper limits are set on the production cross-section for resonant and non-resonant production of an invisible exotic state in association with a right-handed top quark. In the case of resonant production, for a spin-0 resonance with a mass of 500 GeV, an effective coupling strength above 0.15 is excluded at 95% confidence level for the top quark and an invisible spin-1/2 state with mass between 0 GeV and 100 GeV. In the case of non-resonant production, an effective coupling strength above 0.2 is excluded at 95% confidence level for the top quark and an invisible spin-1 state with mass between 0 GeV and 657 GeV.
Abstract:
The mass of the top quark is measured in a data set corresponding to 4.6 fb$^{-1}$ of proton-proton collisions with centre-of-mass energy $\sqrt{s} = 7$ TeV collected by the ATLAS detector at the LHC. Events consistent with hadronic decays of top-antitop quark pairs with at least six jets in the final state are selected. The substantial background from multijet production is modelled with data-driven methods that utilise the number of identified b-quark jets and the transverse momentum of the sixth leading jet, which have minimal correlation. The top-quark mass is obtained from template fits to the ratio of three-jet to dijet mass. The three-jet mass is calculated from the three jets of a top-quark decay. Using these three jets the dijet mass is obtained from the two jets of the W boson decay. The top-quark mass obtained from this fit is thus less sensitive to the uncertainty in the energy measurement of the jets. A binned likelihood fit yields a top-quark mass of $m_t = 175.1 \pm 1.4$ (stat.) $\pm 1.2$ (syst.) GeV.
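The reduced sensitivity to the jet energy measurement can be illustrated with a short worked relation. This is a sketch under a simplifying assumption: the symbol $R_{3/2}$ and the common scale shift $\epsilon$ are notation introduced here, and a real jet energy scale uncertainty is not a single common factor, so the cancellation is only partial in practice.

```latex
% Ratio of the three-jet (top candidate) mass to the dijet (W candidate) mass
R_{3/2} = \frac{m_{jjj}}{m_{jj}}

% If every jet four-momentum is rescaled by a common factor, p_i \to (1+\epsilon)\,p_i,
% each invariant mass scales by the same factor and the ratio is unchanged:
m_{jjj} \to (1+\epsilon)\, m_{jjj}, \qquad
m_{jj} \to (1+\epsilon)\, m_{jj}
\quad\Longrightarrow\quad
R_{3/2} \to \frac{(1+\epsilon)\, m_{jjj}}{(1+\epsilon)\, m_{jj}} = R_{3/2}
```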
Abstract:
Extreme value models are widely used in different areas. The Birnbaum–Saunders distribution is receiving considerable attention owing to its physical justification and its good properties. We propose a methodology based on extreme value Birnbaum–Saunders regression models, which includes model formulation, estimation, inference and checking. We further conduct a simulation study to evaluate its performance. A statistical analysis of real-world extreme value environmental data using the methodology is provided as an illustration.
Abstract:
In longitudinal studies of disease, patients may experience several events during a follow-up period. In these studies, the sequentially ordered events are often of interest and lead to problems that have received much attention recently. Issues of interest include the estimation of bivariate survival, marginal distributions and the conditional distribution of gap times. In this work we consider the estimation of the survival function conditional on a previous event. Different nonparametric approaches will be considered for estimating these quantities, all based on the Kaplan-Meier estimator of the survival function. We explore the finite sample behavior of the estimators through simulations. The different methods proposed in this article are applied to a data set from a German Breast Cancer Study. The methods are used to obtain predictors for the conditional survival probabilities as well as to study the influence of recurrence on overall survival.
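As a point of reference, the Kaplan-Meier estimator on which these approaches build can be sketched in a few lines of Python. This is the plain product-limit estimator only, not the conditional estimators proposed in the article; the function name `kaplan_meier` and the toy data in the usage note are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- observed follow-up times (event or censoring)
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    # Number of observed events at each distinct event time
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Multiply by the conditional probability of surviving past t
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve
```

For example, `kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])` returns one pair for each of the event times 2, 3 and 5. The two censored observations only reduce the number at risk and never produce a step of their own.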
Abstract:
Nitrogen dioxide is a primary pollutant, considered in the estimation of the air quality index, whose excessive presence may cause significant environmental and health problems. In the current work, we suggest characterizing the evolution of NO2 levels by using geostatistical approaches that deal with both the space and time coordinates. To develop our proposal, a first exploratory analysis was carried out on daily values of the target variable, measured in Portugal from 2004 to 2012, which led to the identification of three influential covariates (type of site, environment and month of measurement). In a second step, appropriate geostatistical tools were applied to model the trend and the space-time variability, thus enabling us to use kriging techniques for prediction without requiring data from a dense monitoring network. This methodology has valuable applications, as it can provide an accurate assessment of nitrogen dioxide concentrations at sites where either data have been lost or there is no monitoring station nearby.