Digital Library

996 results for data summarization

Discrimination of Brazilian propolis according to the seasoning using chemometrics and machine learning based on UV-Vis scanning data

Relevance:

20.00%

Publisher:

Abstract:

Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins mixed with salivary enzymes, beeswax, and pollen. The biological activities described for propolis have also been identified in the resins of the donor plants, but standardizing the chemical composition and biological effects of propolis still requires a better understanding of how seasonality influences the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora, which is strongly influenced by (a)biotic factors over the seasons, unraveling the effect of harvest season on the propolis chemical profile is an issue of recognized importance. Fast, cheap, and robust analytical techniques seem to be the best choice for large-scale quality control in the most demanding markets, e.g., human health applications. To that end, UV-Visible (UV-Vis) scanning spectrophotometry was applied to hydroalcoholic extracts (HE) of seventy-three propolis samples collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil. Machine learning and chemometric techniques were then applied to the UV-Vis dataset to gain insight into the effect of seasonality on the claimed chemical heterogeneity of propolis samples, which is determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e., principal component analysis (PCA) and hierarchical clustering analysis (HCA), supported by scripts written in the R language. The UV-Vis profiles, combined with chemometric analysis, made it possible to identify a typical pattern in propolis samples collected in the summer. Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280-400 nm), suggesting that, besides their biological activities, those secondary metabolites also play a relevant role in the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least squares-discriminant analysis (PLS-DA), k-Nearest Neighbors (kNN), and Decision Trees, proved complementary to PCA and HCA, providing relevant information on sample discrimination.

Integrating data from heterogeneous DNA microarray platforms

Relevance:

20.00%

Publisher:

Abstract:

DNA microarrays are one of the most widely used technologies for gene expression measurement. However, there are several distinct microarray platforms, from different manufacturers, each with its own measurement protocol, resulting in data that can hardly be compared or directly integrated. Data integration from multiple sources aims to improve the assertiveness of statistical tests, reducing the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprises a set of tasks that range from the re-annotation of the features used in gene expression to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, which comprises a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases such as The Cancer Genome Atlas and Gene Expression Omnibus.

Reconstructing transcriptional regulatory networks using data integration and text mining

Relevance:

20.00%

Publisher:

Abstract:

Transcriptional Regulatory Networks (TRNs) are powerful tools for representing several interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, inconsistencies concerning gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored on PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from scientific literature.

Search for new phenomena in the dijet mass distribution using pp collision data at √s = 8 TeV with the ATLAS detector

Relevance:

20.00%

Publisher:

Abstract:

Dijet events produced in LHC proton-proton collisions at a center-of-mass energy √s = 8 TeV are studied with the ATLAS detector using the full 2012 data set, with an integrated luminosity of 20.3 fb⁻¹. Dijet masses up to about 4.5 TeV are probed. No resonance-like features are observed in the dijet mass spectrum. Limits on the cross section times acceptance are set at the 95% credibility level for various hypotheses of new phenomena in terms of mass or energy scale, as appropriate. This analysis excludes excited quarks with a mass below 4.09 TeV, color-octet scalars with a mass below 2.72 TeV, heavy W′ bosons with a mass below 2.45 TeV, chiral W* bosons with a mass below 1.75 TeV, and quantum black holes with six extra space-time dimensions with threshold mass below 5.82 TeV.

Spatio-temporal modelling of environmental data

Relevance:

20.00%

Publisher:

Abstract:

Doctoral Programme in Mathematics and Applications.

Search for charged Higgs bosons decaying via H± → τ±ν in fully hadronic final states using pp collision data at √s = 8 TeV with the ATLAS detector

Relevance:

20.00%

Publisher:

Abstract:

The results of a search for charged Higgs bosons decaying to a τ lepton and a neutrino, H± → τ±ν, are presented. The analysis is based on 19.5 fb⁻¹ of proton-proton collision data at √s = 8 TeV collected by the ATLAS experiment at the Large Hadron Collider. Charged Higgs bosons are searched for in events consistent with top-quark pair production or in associated production with a top quark. The final state is characterised by the presence of a hadronic τ decay, missing transverse momentum, b-tagged jets, a hadronically decaying W boson, and the absence of any isolated electrons or muons with high transverse momenta. The data are consistent with the expected background from Standard Model processes. A statistical analysis leads to 95% confidence-level upper limits on the product of branching ratios B(t → bH±) × B(H± → τ±ν), between 0.23% and 1.3% for charged Higgs boson masses in the range 80-160 GeV. It also leads to 95% confidence-level upper limits on the production cross section times branching ratio, σ(pp → tH± + X) × B(H± → τ±ν), between 0.76 pb and 4.5 fb, for charged Higgs boson masses ranging from 180 GeV to 1000 GeV. In the context of different scenarios of the Minimal Supersymmetric Standard Model, these results exclude nearly all values of tan β above one for charged Higgs boson masses between 80 GeV and 160 GeV, and exclude a region of parameter space with high tan β for H± masses between 200 GeV and 250 GeV.

Measurement of the inclusive jet cross-section in proton-proton collisions at √s = 7 TeV using 4.5 fb⁻¹ of data with the ATLAS detector

Relevance:

20.00%

Publisher:

Abstract:

The inclusive jet cross-section is measured in proton-proton collisions at a centre-of-mass energy of 7 TeV using a data set corresponding to an integrated luminosity of 4.5 fb⁻¹ collected with the ATLAS detector at the Large Hadron Collider in 2011. Jets are identified using the anti-kt algorithm with radius parameter values of 0.4 and 0.6. The double-differential cross-sections are presented as a function of the jet transverse momentum and the jet rapidity, covering jet transverse momenta from 100 GeV to 2 TeV. Next-to-leading-order QCD calculations corrected for non-perturbative effects and electroweak effects, as well as Monte Carlo simulations with next-to-leading-order matrix elements interfaced to parton showering, are compared to the measured cross-sections. A quantitative comparison of the measured cross-sections to the QCD calculations using several sets of parton distribution functions is performed.

Measurement of the top-quark mass in the fully hadronic decay channel from ATLAS data at √s = 7 TeV

Relevance:

20.00%

Publisher:

Abstract:

The mass of the top quark is measured in a data set corresponding to 4.6 fb⁻¹ of proton-proton collisions with centre-of-mass energy √s = 7 TeV collected by the ATLAS detector at the LHC. Events consistent with hadronic decays of top-antitop quark pairs with at least six jets in the final state are selected. The substantial background from multijet production is modelled with data-driven methods that utilise the number of identified b-quark jets and the transverse momentum of the sixth leading jet, which have minimal correlation. The top-quark mass is obtained from template fits to the ratio of three-jet to dijet mass. The three-jet mass is calculated from the three jets of a top-quark decay. Using these three jets, the dijet mass is obtained from the two jets of the W boson decay. The top-quark mass obtained from this fit is thus less sensitive to the uncertainty in the energy measurement of the jets. A binned likelihood fit yields a top-quark mass of m_t = 175.1 ± 1.4 (stat.) ± 1.2 (syst.) GeV.
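The reasoning behind fitting the ratio of three-jet to dijet mass can be illustrated with a small sketch. It is not the ATLAS analysis code: the jet momenta are hypothetical, jets are treated as massless, and the energy mismeasurement is assumed to be a single common scale factor applied to every jet. Under those assumptions the factor cancels exactly in the ratio, while in a real detector the cancellation is only partial.

```python
import math


def massless_jet(px, py, pz):
    """Build a massless four-vector (E, px, py, pz) from momentum components."""
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)


def invariant_mass(jets):
    """Invariant mass of the summed four-vectors: sqrt(E^2 - |p|^2)."""
    e = sum(j[0] for j in jets)
    px = sum(j[1] for j in jets)
    py = sum(j[2] for j in jets)
    pz = sum(j[3] for j in jets)
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def scale_energy(jets, k):
    """Apply a common energy-scale factor k to every jet (an assumption)."""
    return [tuple(k * c for c in j) for j in jets]


def mass_ratio(three_jets, two_jets):
    """Ratio of three-jet mass to dijet mass."""
    return invariant_mass(three_jets) / invariant_mass(two_jets)


# Hypothetical jets in GeV: two from the W decay plus one b-quark jet.
w_jets = [massless_jet(60.0, 10.0, 5.0), massless_jet(-20.0, 45.0, -15.0)]
top_jets = w_jets + [massless_jet(-35.0, -50.0, 20.0)]

k = 1.03  # assumed 3% common shift in the jet energy scale
ratio_nominal = mass_ratio(top_jets, w_jets)
ratio_shifted = mass_ratio(scale_energy(top_jets, k), scale_energy(w_jets, k))

print(f"three-jet mass, nominal: {invariant_mass(top_jets):.3f} GeV")
print(f"three-jet mass, shifted: {invariant_mass(scale_energy(top_jets, k)):.3f} GeV")
print(f"ratio, nominal: {ratio_nominal:.6f}")
print(f"ratio, shifted: {ratio_shifted:.6f}")
```

The three-jet mass alone moves in proportion to the scale factor, whereas the ratio is unchanged, which is the sense in which the fitted quantity is less sensitive to the jet energy measurement.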

Extreme value Birnbaum-Saunders regression models applied to environmental data

Relevance:

20.00%

Publisher:

Abstract:

Extreme value models are widely used in different areas. The Birnbaum–Saunders distribution is receiving considerable attention due to its physical arguments and its good properties. We propose a methodology based on extreme value Birnbaum–Saunders regression models, which includes model formulation, estimation, inference and checking. We further conduct a simulation study for evaluating its performance. A statistical analysis with real-world extreme value environmental data using the methodology is provided as illustration.

Nonparametric estimation of the survival function for ordered multivariate failure time data: a comparative study

Relevance:

20.00%

Publisher:

Abstract:

In longitudinal studies of disease, patients may experience several events through a follow-up period. In these studies, the sequentially ordered events are often of interest and lead to problems that have received much attention recently. Issues of interest include the estimation of bivariate survival, marginal distributions and the conditional distribution of gap times. In this work we consider the estimation of the survival function conditional to a previous event. Different nonparametric approaches will be considered for estimating these quantities, all based on the Kaplan-Meier estimator of the survival function. We explore the finite sample behavior of the estimators through simulations. The different methods proposed in this article are applied to a data set from a German Breast Cancer Study. The methods are used to obtain predictors for the conditional survival probabilities as well as to study the influence of recurrence in overall survival.
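As a point of reference for the estimators mentioned above, the following is a minimal sketch of the ordinary Kaplan-Meier estimator on which they are based. It does not implement the conditional (gap-time) estimators compared in the study; the function name and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- follow-up time of each subject
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # subjects leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six subjects, two of them censored (event = 0).
toy_times = [2, 3, 3, 5, 7, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects contribute to the risk set up to their censoring time but never trigger a step in the curve, which is why the estimate drops only at the four observed event times in this example.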

Spatial-temporal modellization of the NO2 concentration data through geostatistical tools

Relevance:

20.00%

Publisher:

Abstract:

Nitrogen dioxide is a primary pollutant, considered in the estimation of the air quality index, whose excessive presence may cause significant environmental and health problems. In the current work, we propose characterizing the evolution of NO2 levels using geostatistical approaches that deal with both the space and time coordinates. To develop our proposal, a first exploratory analysis was carried out on daily values of the target variable, measured in Portugal from 2004 to 2012, which led to the identification of three influential covariates (type of site, environment and month of measurement). In a second step, appropriate geostatistical tools were applied to model the trend and the space-time variability, thus enabling us to use kriging techniques for prediction without requiring data from a dense monitoring network. This methodology has valuable applications, as it can provide accurate assessment of nitrogen dioxide concentrations at sites where either data have been lost or there is no monitoring station nearby.

Data quality in biofilm high-throughput routine analysis: intralaboratory protocol adaptation and experiment reproducibility

Relevance:

20.00%

Publisher:

Abstract:

Biofilm research is growing more diverse and more dependent on high-throughput technologies, and the large-scale production of results makes data substantiation harder. In particular, it is often the case that experimental protocols are adapted to meet the needs of a particular laboratory and no statistical validation of the modified method is provided. This paper discusses the impact of intra-laboratory adaptation and non-rigorous documentation of experimental protocols on biofilm data interchange and validation. The case study is a non-standard, but widely used, workflow for Pseudomonas aeruginosa biofilm development, considering three analysis assays: the crystal violet (CV) assay for biomass quantification, the XTT assay for respiratory activity assessment, and the colony forming units (CFU) assay for determination of cell viability. The ruggedness of the protocol was assessed by introducing small changes in the biofilm growth conditions, which simulate minor protocol adaptations and non-rigorous protocol documentation. Results show that even minor variations in the biofilm growth conditions may affect the results considerably, and that the biofilm analysis assays lack repeatability. Intra-laboratory validation of non-standard protocols is found to be critical to ensure data quality and enable the comparison of results within and among laboratories.

Local non-negative initial data scalar characterization of the Kerr solution

Relevance:

20.00%

Publisher:

Abstract:

For any vacuum initial data set, we define a local, non-negative scalar quantity which vanishes at every point of the data hypersurface if and only if the data are Kerr initial data. Our scalar quantity only depends on the quantities used to construct the vacuum initial data set which are the Riemannian metric defined on the initial data hypersurface and a symmetric tensor which plays the role of the second fundamental form of the embedded initial data hypersurface. The dependency is algorithmic in the sense that given the initial data one can compute the scalar quantity by algebraic and differential manipulations, being thus suitable for an implementation in a numerical code. The scalar could also be useful in studies of the non-linear stability of the Kerr solution because it serves to measure the deviation of a vacuum initial data set from the Kerr initial data in a local and algorithmic way.

Discussing Chevalier’s data on the efficiency of tariffs for American and French canals in the 1830s

Relevance:

20.00%

Publisher:

Abstract:

This article revisits Michel Chevalier’s work and discussions of tariffs. Chevalier shifted from Saint-Simonism to economic liberalism during his life in the 19th century. His influence was soon perceived in the political world and economic debates, mainly because of his discussion of tariffs as instruments of efficient transport policies. This work discusses Chevalier’s thoughts on tariffs by revisiting his masterpiece, Le Cours d’Économie Politique. Data Envelopment Analysis (DEA) was conducted to test Chevalier’s hypothesis on the inefficiency of French tariffs. This work showed that Chevalier’s claims on French tariffs are not validated by DEA.

Using balanced scorecards to evaluate the data warehouse system utility

Relevance:

20.00%

Publisher:

Abstract:

Master's dissertation in Systems Engineering
