Digital Library

971 results for DATA SET

Fitting a mixture model to three-mode three-way data with missing information

Relevance:

70.00%

Abstract:

When the data consist of certain attributes measured on the same set of items in different situations, they can be described as a three-mode three-way array. A mixture likelihood approach can be implemented to cluster the items (i.e., one of the modes) on the basis of both of the other modes simultaneously (i.e., the attributes measured in different situations). In this paper, it is shown that this approach can be extended to handle three-mode three-way arrays where some of the data values are missing at random in the sense of Little and Rubin (1987). The methodology is illustrated by clustering the genotypes in a three-way soybean data set where various attributes were measured on genotypes grown in several environments.
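As a rough illustration of mixture-based clustering with values missing at random, the sketch below fits a Gaussian mixture by EM while using only the observed entries of each item. It is a simplification and not the method of the paper: it assumes the three-way array has been flattened to an items-by-features matrix, that each component has a diagonal covariance, that missing values are coded as NaN, and that every item has at least one observed value. The function name `fit_diag_gmm_with_missing` and the quantile-based initialization are hypothetical choices made for this example.

```python
import numpy as np

def fit_diag_gmm_with_missing(X, n_components, n_iter=100):
    """EM for a diagonal-covariance Gaussian mixture; NaN entries are treated as missing."""
    X = np.asarray(X, dtype=float)
    obs = ~np.isnan(X)                      # True where a value was observed
    obs_f = obs.astype(float)
    Xz = np.where(obs, X, 0.0)              # missing entries zeroed (always masked below)
    n, _ = X.shape

    # Deterministic start (an assumption of this sketch): sort items by the mean
    # of their observed values and split them into equally sized groups.
    order = np.argsort(np.nanmean(X, axis=1))
    resp = np.zeros((n, n_components))
    for k, chunk in enumerate(np.array_split(order, n_components)):
        resp[chunk, k] = 1.0

    for _ in range(n_iter):
        # M-step: weighted moments computed from observed entries only.
        weights = resp.sum(axis=0) / n
        denom = resp.T @ obs_f + 1e-12                      # shape (K, d)
        means = (resp.T @ Xz) / denom
        sq = (Xz[:, None, :] - means[None, :, :]) ** 2 * obs_f[:, None, :]
        variances = np.einsum("ik,ikd->kd", resp, sq) / denom + 1e-6

        # E-step: each item's log-density sums over its observed features only.
        log_pdf = -0.5 * (
            np.log(2.0 * np.pi * variances)[None, :, :] * obs_f[:, None, :]
            + sq / variances[None, :, :]
        )
        log_post = np.log(weights + 1e-300)[None, :] + log_pdf.sum(axis=2)
        log_post -= log_post.max(axis=1, keepdims=True)     # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

    return weights, means, variances, resp
```

Cluster labels can be read off as `resp.argmax(axis=1)`. Because missing features simply drop out of each item's likelihood, no imputation step is needed under the diagonal-covariance assumption.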

An EM-based Semi-Parametric Mixture Model Approach to the Regression Analysis of Competing-Risks Data

Relevance:

70.00%

Abstract:

We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is by maximum likelihood based on the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonically increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonically increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.

Effect of changing data collection parameters on statistical motor unit number estimates

Relevance:

70.00%

Abstract:

The effect of the number of samples and the selection of data for analysis on the calculation of surface motor unit potential (SMUP) size in the statistical method of motor unit number estimates (MUNE) was determined in 10 normal subjects and 10 with amyotrophic lateral sclerosis (ALS). We recorded 500 sequential compound muscle action potentials (CMAPs) at three different stable stimulus intensities (10–50% of maximal CMAP). Estimated mean SMUP sizes were calculated using Poisson statistical assumptions from the variance of the 500 sequential CMAPs obtained at each stimulus intensity. The results with the 500 data points were compared with smaller subsets from the same data set. The results using a range of 50–80% of the 500 data points were compared with the full 500. The effect of restricting analysis to data within 5–20% of the CMAP and to standard deviation limits was also assessed. No differences in mean SMUP size were found with stimulus intensity or use of different ranges of data. Consistency was improved with a greater sample number. Data within 5% of CMAP size gave both increased consistency and reduced mean SMUP size in many subjects, but excluded valid responses present at that stimulus intensity. These changes were more prominent in ALS patients, in whom the presence of isolated SMUP responses was a striking difference from normal subjects. Noise, spurious data, and large SMUPs limited the Poisson assumptions. When these factors are considered, consistent statistical MUNE can be calculated from a continuous sequence of data points. A 2 to 2.5 SD window or a 10% window is a reasonable method of limiting data for analysis. Muscle Nerve 27: 320–331, 2003
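The Poisson reasoning behind the SMUP estimate can be sketched as follows. If each response is the sum of k equally sized unit potentials of size s, with k Poisson-distributed, then the mean is λs and the variance is λs², so s is approximately variance divided by mean. The sketch assumes idealized, noise-free responses measured relative to a known baseline. The function names, the baseline handling, and the simulated values are hypothetical and are not taken from the study.

```python
import numpy as np

def estimate_smup_size(cmap_amplitudes, baseline=0.0):
    """Mean SMUP size under Poisson assumptions: variance / mean of the responses."""
    x = np.asarray(cmap_amplitudes, dtype=float) - baseline
    return x.var(ddof=1) / x.mean()

def estimate_mune(max_cmap, smup_size):
    """Motor unit number estimate: maximal CMAP divided by the mean SMUP size."""
    return max_cmap / smup_size

# Simulated example (all numbers are illustrative assumptions):
rng = np.random.default_rng(0)
true_smup = 40.0                               # hypothetical SMUP size, in microvolts
units_firing = rng.poisson(lam=5, size=500)    # units activated on each of 500 stimuli
cmaps = units_firing * true_smup               # idealized, noise-free responses

smup_hat = estimate_smup_size(cmaps)
mune_hat = estimate_mune(max_cmap=8000.0, smup_size=smup_hat)
```

With real recordings, noise and occasional large or spurious responses inflate the variance, which is why the abstract discusses restricting the analysis to a window of the data.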

Geostatistical Mapping of Outfall Plume Dispersion Data gathered with an autonomous underwater vehicle

Relevance:

70.00%

Abstract:

The main purpose of this study was to examine the applicability of geostatistical modeling to obtain valuable information for assessing the environmental impact of sewage outfall discharges. The data set used was obtained with an AUV in a monitoring campaign at the S. Jacinto outfall, located off the Portuguese west coast near the Aveiro region. Matheron's classical estimator was used to compute the experimental semivariogram, which was fitted to three theoretical models: spherical, exponential and Gaussian. The cross-validation procedure suggested the best semivariogram model, and ordinary kriging was used to obtain the predictions of salinity at unknown locations. The generated map clearly shows the plume dispersion in the studied area, indicating that the effluent does not reach the nearby beaches. Our study suggests that an optimal design for the AUV sampling trajectory, from a geostatistical prediction point of view, can help to compute more precise predictions and hence to quantify dilution more accurately. Moreover, since accurate measurements of plume dilution are rare, these studies might be very helpful in the future for validation of dispersion models.
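Matheron's classical estimator averages squared differences between pairs of observations grouped by separation distance: γ(h) = (1 / (2 N(h))) Σ (z_i − z_j)², where N(h) is the number of pairs in the lag bin around h. A minimal sketch follows, assuming isotropy and user-chosen lag bins. The function name, the binning scheme, and the toy data are hypothetical and do not reproduce the study's processing.

```python
import numpy as np

def matheron_semivariogram(coords, values, bin_edges):
    """Experimental semivariogram via Matheron's classical estimator.

    coords: (n, dim) array of sample locations
    values: (n,) array of measurements (e.g. salinity)
    bin_edges: increasing lag-distance edges; bin b covers [edge_b, edge_b+1)
    Returns (gamma, counts); gamma is NaN for bins that contain no pairs.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # All unordered pairs (i < j), their distances and squared differences.
    i, j = np.triu_indices(len(values), k=1)
    dists = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (dists >= bin_edges[b]) & (dists < bin_edges[b + 1])
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            gamma[b] = sq_diff[in_bin].sum() / (2.0 * counts[b])
    return gamma, counts

# Toy example: three points on a line with a linear trend in the measured value.
gamma, counts = matheron_semivariogram(
    coords=[[0.0], [1.0], [2.0]],
    values=[0.0, 1.0, 2.0],
    bin_edges=[0.5, 1.5, 2.5],
)
```

The resulting `gamma` values per lag are what a spherical, exponential, or Gaussian model would then be fitted to before kriging.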

Modelo de data mining para detecção de tumores em exames de rastreio [Data mining model for tumor detection in screening exams]

Relevance:

70.00%

Abstract:

Dissertation submitted for the degree of Master in Informatics Engineering

On clustering interval data with different scales of measures: experimental results

Relevance:

70.00%

Abstract:

This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The Attribution-NonCommercial (CC BY-NC) license lets others remix, tweak, and build upon the work non-commercially; the new works must also acknowledge the original and be non-commercial.

On cluster analysis of complex and heterogeneous data

Relevance:

70.00%

Abstract:

3rd SMTDA Conference Proceedings, 11-14 June 2014, Lisbon, Portugal.

Modelo de data mining para deteção de embolias pulmonares [Data mining model for the detection of pulmonary embolisms]

Relevance:

70.00%

Abstract:

Project work submitted for the degree of Master in Informatics and Computer Engineering

Discovery and retrieval of Geographic data using Google

Relevance:

70.00%

Abstract:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

The impact of driving styles on fuel consumption: a data-warehouse-and-data-mining-based discovery process

Relevance:

70.00%

Abstract:

This paper discusses the results of applied research in the eco-driving domain based on a huge data set produced from a fleet of Lisbon's public transportation buses over a three-year period. This data set is based on events automatically extracted from the controller area network (CAN) bus and enriched with GPS coordinates, weather conditions, and road information. We apply online analytical processing (OLAP) and knowledge discovery (KD) techniques to deal with the high volume of this data set and to determine the major factors that influence the average fuel consumption, and then classify the drivers involved according to their driving efficiency. Consequently, we identify the most appropriate driving practices and styles. Our findings show that introducing simple practices, such as optimal clutch, engine rotation, and engine running in idle, can reduce fuel consumption by 3 to 5 l/100 km on average, meaning a saving of 30 l per bus in one day. These findings have been strongly considered in the drivers' training sessions.

On the applicability of joint inversion of gravity and resistivity data to the study of a tectonic sedimentary basin in Northern Portugal

Relevance:

70.00%

Abstract:

The Chaves basin is a pull-apart tectonic depression implanted on granites, schists, and graywackes, and filled with a sedimentary sequence of variable thickness. It is a rather complex structure, as it includes an intricate network of faults and hydrogeological systems. The topography of the basement of the Chaves basin still remains unclear, as no drill hole has ever intersected the bottom of the sediments, and resistivity surveys suffer from severe equivalence issues resulting from the geological setting. In this work, a joint inversion approach of 1D resistivity and gravity data designed for layered environments is used to combine the consistent spatial distribution of the gravity data with the depth sensitivity of the resistivity data. A comparison between the results from the inversion of each data set individually and the results from the joint inversion shows that, although the joint inversion has more difficulty adjusting to the observed data, it provides more realistic and geologically meaningful models than the ones calculated by the inversion of each data set individually. This work contributes to a better understanding of the Chaves basin, while using the opportunity to study further both the advantages and difficulties of applying the method of joint inversion of gravity and resistivity data.

Search for new phenomena in the dijet mass distribution using pp collision data at √s = 8 TeV with the ATLAS detector

Relevance:

70.00%

Abstract:

Dijet events produced in LHC proton–proton collisions at a center-of-mass energy √s = 8 TeV are studied with the ATLAS detector using the full 2012 data set, with an integrated luminosity of 20.3 fb⁻¹. Dijet masses up to about 4.5 TeV are probed. No resonance-like features are observed in the dijet mass spectrum. Limits on the cross section times acceptance are set at the 95% credibility level for various hypotheses of new phenomena in terms of mass or energy scale, as appropriate. This analysis excludes excited quarks with a mass below 4.09 TeV, color-octet scalars with a mass below 2.72 TeV, heavy W′ bosons with a mass below 2.45 TeV, chiral W* bosons with a mass below 1.75 TeV, and quantum black holes with six extra space-time dimensions with threshold mass below 5.82 TeV.

Spatio-temporal modelling of environmental data

Relevance:

70.00%

Abstract:

Doctoral Programme in Mathematics and Applications.

Measurement of the inclusive jet cross-section in proton–proton collisions at √s = 7 TeV using 4.5 fb⁻¹ of data with the ATLAS detector

Relevance:

70.00%

Abstract:

The inclusive jet cross-section is measured in proton–proton collisions at a centre-of-mass energy of 7 TeV using a data set corresponding to an integrated luminosity of 4.5 fb⁻¹ collected with the ATLAS detector at the Large Hadron Collider in 2011. Jets are identified using the anti-kt algorithm with radius parameter values of 0.4 and 0.6. The double-differential cross-sections are presented as a function of the jet transverse momentum and the jet rapidity, covering jet transverse momenta from 100 GeV to 2 TeV. Next-to-leading-order QCD calculations corrected for non-perturbative effects and electroweak effects, as well as Monte Carlo simulations with next-to-leading-order matrix elements interfaced to parton showering, are compared to the measured cross-sections. A quantitative comparison of the measured cross-sections to the QCD calculations using several sets of parton distribution functions is performed.

Measurement of the top-quark mass in the fully hadronic decay channel from ATLAS data at √s = 7 TeV

Relevance:

70.00%

Abstract:

The mass of the top quark is measured in a data set corresponding to 4.6 fb⁻¹ of proton–proton collisions with centre-of-mass energy √s = 7 TeV collected by the ATLAS detector at the LHC. Events consistent with hadronic decays of top–antitop quark pairs with at least six jets in the final state are selected. The substantial background from multijet production is modelled with data-driven methods that utilise the number of identified b-quark jets and the transverse momentum of the sixth leading jet, which have minimal correlation. The top-quark mass is obtained from template fits to the ratio of three-jet to dijet mass. The three-jet mass is calculated from the three jets of a top-quark decay. Using these three jets, the dijet mass is obtained from the two jets of the W boson decay. The top-quark mass obtained from this fit is thus less sensitive to the uncertainty in the energy measurement of the jets. A binned likelihood fit yields a top-quark mass of m_t = 175.1 ± 1.4 (stat.) ± 1.2 (syst.) GeV.
