980 results for Missing values structures


Relevance:

80.00% 80.00%

Publisher:

Abstract:

Dissertation submitted for the degree of Doctor in Statistics and Risk Management, specialization in Statistics

Relevance:

80.00% 80.00%

Publisher:

Abstract:

OBJECTIVE: The European Surgical Outcomes Study described mortality following in-patient surgery. Several factors were identified that were able to predict poor outcomes in a multivariate analysis. These included age, procedure urgency, severity and type and the American Association of Anaesthesia score. This study describes in greater detail the relationship between the American Association of Anaesthesia score and postoperative mortality. METHODS: Patients in this 7-day cohort study were enrolled in April 2011. Consecutive patients aged 16 years and older undergoing inpatient non-cardiac surgery with a recorded American Association of Anaesthesia score in 498 hospitals across 28 European nations were included and followed up for a maximum of 60 days. The primary endpoint was in-hospital mortality. Decision tree analysis with the CHAID (SPSS) system was used to delineate nodes associated with mortality. RESULTS: The study enrolled 46,539 patients. Due to missing values, 873 patients were excluded, resulting in the analysis of 45,666 patients. Increasing American Association of Anaesthesia scores were associated with increased admission rates to intensive care and higher mortality rates. Despite a progressive relationship with mortality, discrimination was poor, with an area under the ROC curve of 0.658 (95% CI 0.642 - 0.6775). Using regression trees (CHAID), we identified four discrete American Association of Anaesthesia nodes associated with mortality, with American Association of Anaesthesia 1 and American Association of Anaesthesia 2 compressed into the same node. CONCLUSION: The American Association of Anaesthesia score can be used to determine higher risk groups of surgical patients, but clinicians cannot use the score to discriminate between grades 1 and 2. Overall, the discriminatory power of the model was less than acceptable for widespread use.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Various methodologies in economic literature have been used to analyse the international hydrocarbon retail sector. Nevertheless at a Spanish level these studies are much more recent and most conclude that generally there is no effective competition present in this market, regardless of the approach used. In this paper, in order to analyse the price levels in the Spanish petrol market, our starting hypothesis is that in uncompetitive markets the prices are higher and the standard deviation is lower. We use weekly retail petrol price data from the ten biggest Spanish cities, and apply Markov chains to fill the missing values for petrol 95 and diesel, and we also employ a variance filter. We conclude that this market demonstrates reduced price dispersion, regardless of brand or city.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

R (http://www.r-project.org/) is ‘GNU S’, a language and environment for statistical computing and graphics. Many classical and modern statistical techniques have been implemented in this environment, but many are supplied as packages. There are 8 standard packages, and many more are available through the CRAN family of Internet sites (http://cran.r-project.org). We started to develop a library of functions in R to support the analysis of mixtures, and our goal is a MixeR package for compositional data analysis that provides support for: (1) operations on compositions: perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing Aitchison’s, Euclidean and Bhattacharyya distances, compositional Kullback-Leibler divergence, etc.; (2) graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features: barycenter, geometric mean of the data set, the percentile lines, marking and coloring of subsets of the data set, their geometric means, notation of individual data in the set, and so on; (3) dealing with zeros and missing values in compositional data sets with R procedures for simple and multiplicative replacement strategies; (4) the time series analysis of compositional data. We’ll present the current status of MixeR development and illustrate its use on selected data sets.
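The operations on compositions listed in this abstract can be illustrated with a minimal sketch. It is written in Python rather than R, it is not the MixeR code, and all function names are hypothetical. It assumes strictly positive parts and uses the standard definitions: perturbation as a closed component-wise product, power multiplication as a closed component-wise power, and the Aitchison distance as the Euclidean distance between centred log-ratio (clr) vectors.

```python
import math

def closure(x, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Power multiplication: component-wise power followed by closure."""
    return closure([a ** alpha for a in x])

def subcomposition(x, parts):
    """Keep the parts at the given indices and re-close them."""
    return closure([x[i] for i in parts])

def clr(x):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

def geometric_center(data):
    """Closed component-wise geometric mean of a list of compositions."""
    n, d = len(data), len(data[0])
    return closure(
        [math.exp(sum(math.log(row[j]) for row in data) / n) for j in range(d)]
    )
```

One property worth noting is that the Aitchison distance is unchanged when both compositions are perturbed by the same vector, which is why centering a data set by perturbation does not alter distances between observations.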

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The R package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, now containing some facilities to work with and represent ilr bases built from balances, and an elaborate subsystem for dealing with several kinds of irregular data: (rounded or structural) zeroes, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with those unobserved, zero or rounded parts, or else where the datum shows an erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros), focusing on the nature of irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore, the package provides special plots helping to understand the nature of outliers in the dataset.
Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros

Relevance:

80.00% 80.00%

Publisher:

Abstract:

As stated in Aitchison (1986), a proper study of relative variation in a compositional data set should be based on logratios, and dealing with logratios excludes dealing with zeros. Nevertheless, it is clear that zero observations might be present in real data sets, either because the corresponding part is completely absent (essential zeros) or because it is below the detection limit (rounded zeros). Because the second kind of zero is usually understood as “a trace too small to measure”, it seems reasonable to replace such zeros by a suitable small value, and this has been the traditional approach. As stated, e.g., by Tauber (1999) and by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000), the principal problem in compositional data analysis is related to rounded zeros. One should be careful to use a replacement strategy that does not seriously distort the general structure of the data. In particular, the covariance structure of the involved parts, and thus the metric properties, should be preserved, as otherwise further analysis on subpopulations could be misleading. Following this point of view, a non-parametric imputation method is introduced in Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000). This method is analyzed in depth by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2003), where it is shown that the theoretical drawbacks of the additive zero replacement method proposed in Aitchison (1986) can be overcome using a new multiplicative approach on the non-zero parts of a composition. The new approach has reasonable properties from a compositional point of view. In particular, it is “natural” in the sense that it recovers the “true” composition if replacement values are identical to the missing values, and it is coherent with the basic operations on the simplex. This coherence implies that the covariance structure of subcompositions with no zeros is preserved. As a generalization of the multiplicative replacement, in the same paper a substitution method for missing values on compositional data sets is introduced.
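The multiplicative approach described in this abstract can be sketched as follows. This is a hedged illustration in Python, not code from the cited papers: the function name is hypothetical, and using one common replacement value `delta` for every zero is a simplifying assumption (the method also allows a different value per part). Each zero is replaced by `delta`, and every non-zero part is multiplied by the same factor so that the composition still sums to `total`.

```python
def multiplicative_replacement(x, delta, total=1.0):
    """Replace zeros in composition `x` by `delta` and rescale the rest.

    Every non-zero part is multiplied by (1 - zero_mass / total), where
    zero_mass is the sum of all imputed values, so the result sums to
    `total` and ratios between non-zero parts are unchanged.
    """
    zero_mass = sum(delta for v in x if v == 0)
    if zero_mass >= total:
        raise ValueError("imputed values leave no mass for the observed parts")
    factor = 1.0 - zero_mass / total
    return [delta if v == 0 else v * factor for v in x]
```

Because all non-zero parts share one scaling factor, their ratios are untouched, which is the mechanism behind the preserved covariance structure of zero-free subcompositions mentioned above.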

Relevance:

80.00% 80.00%

Publisher:

Abstract:

INTRODUCTION Chronic low-grade inflammation and immune activation may persist in HIV patients despite effective antiretroviral therapy (ART). These abnormalities are associated with increased oxidative stress (OS). Bilirubin (BR) may have a beneficial role in counteracting OS. Atazanavir (ATV) inhibits UGT1A1, thus increasing unconjugated BR levels, a distinctive feature of this drug. We compared changes in OS markers in HIV patients on ATV/r versus efavirenz (EFV)-based first-line therapies. MATERIALS AND METHODS Cohort of the Spanish Research Network (CoRIS) is a multicentre, open, prospective cohort of HIV-infected patients naïve to ART at entry and linked to a biobank. We identified hepatitis C virus/hepatitis B virus (HCV/HBV) negative patients who started first-line ART with either ATV/r or EFV, had a baseline biobank sample and a follow-up sample after at least nine months of ART while maintaining initial regimen and being virologically suppressed. Lipoprotein-associated Phospholipase A2 (Lp-PLA2), Myeloperoxidase (MPO) and Oxidized LDL (OxLDL) were measured in paired samples. Marker values at one year were interpolated from available data. Multiple imputations using chained equations were used to deal with missing values. Change in the OS markers was modelled using multiple linear regressions adjusting for baseline marker values and baseline confounders. Correlations between continuous variables were explored using Pearson's correlation tests. RESULTS 145 patients (97 EFV; 48 ATV/r) were studied. Mean (SD) baseline values for OS markers in EFV and ATV/r groups were: Lp-PLA2 [142.2 (72.8) and 150.1 (92.8) ng/mL], MPO [74.3 (48.2) and 93.9 (64.3) µg/L] and OxLDL [76.3 (52.3) and 82.2 (54.4) µg/L]. 
After adjustment for baseline variables, patients on ATV/r had a significant decrease in Lp-PLA2 (estimated difference -16.3 [CI 95%: -31.4, -1.25; p=0.03]) and a significantly lower increase in OxLDL (estimated difference -21.8 [-38.0, -5.6; p<0.01]) relative to those on EFV, whereas no differences in MPO were found. Adjusted changes in BR were significantly higher for the ATV/r group (estimated difference 1.33 [1.03, 1.52; p<0.01]). Changes in BR and changes in OS markers were significantly correlated. CONCLUSIONS In virologically suppressed patients on stable ART, OS was lower in ATV/r-based regimens compared to EFV. We hypothesize these changes could be in part attributable to increased BR plasma levels.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

OBJECTIVE: To better understand the structure of the Patient Assessment of Chronic Illness Care (PACIC) instrument. More specifically, to test all published validation models, using one single data set and appropriate statistical tools. DESIGN: Validation study using data from a cross-sectional survey. PARTICIPANTS: A population-based sample of non-institutionalized adults with diabetes residing in Switzerland (canton of Vaud). MAIN OUTCOME MEASURE: French version of the 20-item PACIC instrument (5-point response scale). We conducted validation analyses using confirmatory factor analysis (CFA). The original five-dimension model and other published models were tested with three types of CFA: based on (i) a Pearson estimator of variance-covariance matrix, (ii) a polychoric correlation matrix and (iii) a likelihood estimation with a multinomial distribution for the manifest variables. All models were assessed using loadings and goodness-of-fit measures. RESULTS: The analytical sample included 406 patients. Mean age was 64.4 years and 59% were men. Median of item responses varied between 1 and 4 (range 1-5), and range of missing values was between 5.7 and 12.3%. Strong floor and ceiling effects were present. Even though loadings of the tested models were relatively high, the only model showing acceptable fit was the 11-item single-dimension model. PACIC was associated with the expected variables of the field. CONCLUSIONS: Our results showed that the model considering 11 items in a single dimension exhibited the best fit for our data. A single score, in complement to the consideration of single-item results, might be used instead of the five dimensions usually described.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Current methods for constructing house price indices are based on comparisons of sale prices of residential properties sold two or more times and on regression of the sale prices on the attributes of the properties and of their locations. The two methods have well recognised deficiencies, selection bias and model assumptions, respectively. We introduce a new method based on propensity score matching. The average house prices for two periods are compared by selecting pairs of properties, one sold in each period, that are as similar on a set of available attributes (covariates) as is feasible to arrange. The uncertainty associated with such matching is addressed by multiple imputation, framing the problem as involving missing values. The method is applied to a register of transactions of residential properties in New Zealand and compared with the established alternatives.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

INTRODUCTION: This study describes the characteristics of the metabolic syndrome in HIV-positive patients in the Data Collection on Adverse Events of Anti-HIV Drugs study and discusses the impact of different methodological approaches on estimates of the prevalence of metabolic syndrome over time. METHODS: We described the prevalence of the metabolic syndrome in patients under follow-up at the end of six calendar periods from 2000 to 2007. We modified the definition used for the metabolic syndrome to take account of the use of lipid-lowering and antihypertensive medication, measurement variability and missing values, and assessed the impact of these modifications on the estimated prevalence. RESULTS: For all definitions considered, there was an increasing prevalence of the metabolic syndrome over time, although the prevalence estimates themselves varied widely. Using our primary definition, we found an increase in prevalence from 19.4% in 2000/2001 to 41.6% in 2006/2007. Modification of the definition to incorporate antihypertensive and lipid-lowering medication had relatively little impact on the prevalence estimates, as did modification to allow for missing data. In contrast, modification to allow the metabolic syndrome to be reversible and to allow for measurement variability lowered prevalence estimates substantially. DISCUSSION: The prevalence of the metabolic syndrome in cohort studies is largely based on the use of nonstandardized measurements as they are captured in daily clinical care. As a result, bias is easily introduced, particularly when measurements are both highly variable and may be missing. We suggest that the prevalence of the metabolic syndrome in cohort studies should be based on two consecutive measurements of the laboratory components in the syndrome definition.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Decisions taken in modern organizations are often multi-dimensional, involving multiple decision makers and several criteria measured on different scales. Multiple Criteria Decision Making (MCDM) methods are designed to analyze and give recommendations in this kind of situation. Among the numerous MCDM methods, two large families are the methods based on multi-attribute utility theory and the outranking methods. Traditionally, both families require exact values for technical parameters and criteria measurements, as well as for preferences expressed as weights. Often it is hard, if not impossible, to obtain exact values. Stochastic Multicriteria Acceptability Analysis (SMAA) is a family of methods designed to help in situations where exact values are not available. Different variants of SMAA allow handling all types of MCDM problems. They support defining the model through uncertain, imprecise, or completely missing values. The methods are based on simulation that is applied to obtain descriptive indices characterizing the problem. In this thesis we present new advances in the SMAA methodology. We present and analyze algorithms for the SMAA-2 method and its extension to handle ordinal preferences. We then present an application of SMAA-2 to an area where MCDM models have not been applied before: planning elevator groups for high-rise buildings. Following this, we introduce two new methods to the family: SMAA-TRI, which extends ELECTRE TRI for sorting problems with uncertain parameter values, and SMAA-III, which extends ELECTRE III in a similar way. Efficient software implementing these two methods has been developed in conjunction with this work, and is briefly presented in this thesis. The thesis closes with a comprehensive survey of SMAA methodology, including a definition of a unified framework.
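The simulation idea behind SMAA can be illustrated with a small Monte Carlo sketch. It is not the algorithm or the software presented in the thesis. The assumptions are: a linear additive utility, criteria that are all maximized and already scaled to [0, 1], uncertain criteria values given as (low, high) intervals sampled uniformly, and completely missing weight information represented by uniform sampling from the weight simplex. The function names are hypothetical. The output is a table of rank acceptability indices, meaning the share of simulation runs in which each alternative obtains each rank.

```python
import random

def sample_weights(n, rng):
    """Draw weights uniformly from the simplex via normalised exponentials."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def rank_acceptability(alternatives, iterations=10000, seed=0):
    """Estimate rank acceptability indices by simulation.

    `alternatives` is a list with one entry per alternative; each entry is a
    list of (low, high) intervals, one per criterion. Returns a matrix where
    result[i][r] is the share of runs in which alternative i had rank r
    (r = 0 is the best rank).
    """
    rng = random.Random(seed)
    m = len(alternatives)
    n = len(alternatives[0])
    counts = [[0] * m for _ in range(m)]
    for _ in range(iterations):
        weights = sample_weights(n, rng)
        utilities = []
        for intervals in alternatives:
            # Sample one realisation of the uncertain criteria values.
            values = [rng.uniform(low, high) for low, high in intervals]
            utilities.append(sum(w * v for w, v in zip(weights, values)))
        order = sorted(range(m), key=lambda i: utilities[i], reverse=True)
        for rank, alt in enumerate(order):
            counts[alt][rank] += 1
    return [[c / iterations for c in row] for row in counts]
```

In this sketch a completely missing criterion measurement could be entered as the interval (0.0, 1.0), so the alternative is neither favoured nor penalised beyond what the sampling implies.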

Relevance:

80.00% 80.00%

Publisher:

Abstract:

BACKGROUND: The assessment of Health Related Quality of Life (HRQL) is important in people with dementia as it could influence their care and support plan. Many studies on dementia do not specifically set out to measure dementia-specific HRQL but do include related items. The aim of this study is to explore the distribution of HRQL by functional and socio-demographic variables in a population-based setting. METHODS: Domains of DEMQOL's conceptual framework were mapped in the Cambridge City over 75's Cohort (CC75C) Study. HRQL was estimated in 110 participants aged 80+ years with a confirmed diagnosis of dementia with mild/moderate severity. Acceptability (missing values and normality of the total score), internal consistency (Cronbach's alpha), convergent, discriminant and known group differences validity (Spearman correlations, Wilcoxon Mann-Whitney and Kruskal-Wallis tests) were assessed. The distribution of HRQL by socio-demographic and functional descriptors was explored. RESULTS: The HRQL score ranged from 0 to 16 and showed an internal consistency alpha of 0.74. Validity of the instrument was found to be acceptable. Men had higher HRQL than women. Marital status had a greater effect on HRQL for men than it did for women. The HRQL of those with good self-reported health was higher than that of those with fair/poor self-reported health. HRQL was not associated with dementia severity. CONCLUSIONS: To our knowledge this is the first study to examine the distribution of dementia-specific HRQL in a population sample of the very old. We have mapped an existing conceptual framework of dementia-specific HRQL onto an existing study and demonstrated the feasibility of this approach. Findings in this study suggest that, whereas there is a strong emphasis on dementia severity, characteristics such as gender should be taken into account when assessing and implementing programmes to improve HRQL.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Raw measurement data does not always immediately convey useful information, but applying statistical analysis tools to measurement data can improve the situation. Data analysis can offer benefits such as acquiring meaningful insight from the dataset, basing critical decisions on the findings, and ruling out human bias through proper statistical treatment. In this thesis we analyze data from an industrial mineral processing plant with the aim of studying the possibility of forecasting the quality of the final product, given by one variable, with a model based on the other variables. For the study, mathematical tools such as Qlucore Omics Explorer (QOE) and Sparse Bayesian regression (SB) are used. Later on, linear regression is used to build a model based on a subset of variables that seem to have the most significant weights in the SB model. The results obtained from QOE show that the variable representing the desired final product does not correlate with other variables. For SB and linear regression, the results show that both models built on 1-day averaged data seriously underestimate the variance of the true data, whereas the two models built on 1-month averaged data are reliable and able to explain a larger proportion of variability in the available data, making them suitable for prediction purposes. However, it is concluded that no single model can fit the whole available dataset well. Future work should therefore either build piecewise nonlinear regression models if the same dataset is used, or have the plant provide another dataset, collected in a more systematic fashion than the present data, for further analysis.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This study discusses the importance of facilitating and enabling corporate entrepreneurial behavior (CEB) as a key dimension within companies in a rapidly changing business environment. The research target is a large finance company in Finland, where regulations, compliance and processes extensively restrict and shape the business approach. The purpose of this study is to foster the understanding of corporate entrepreneurial behavior and its requirements, and to identify the factors that support and inhibit its facilitation. Furthermore, this study examines who should drive the implementation, and offers a concrete outcome that the company can use to start the facilitation and anchor it in the organizational culture and values. The theoretical background is constructed from literature on the concept of corporate entrepreneurial behavior, on the factors supporting and hindering its facilitation identified in previous studies, and on innovation management. The theoretical framework of middle managers' entrepreneurial behavior in the facilitation process was also examined. In addition, the core literature covered top-down and bottom-up approaches to building conversational space within the organization in order to foster innovation and an involving mindset and behavior. The empirical research conducted for the study consists of three parts: an innovation audit questionnaire, semi-structured interviews, and secondary data from earlier research within the case company. The questionnaire and interviews were targeted at eight middle managers within the company, the heads of branch regions in the corporate segment. The secondary data was collected from over 300 employees in the case company by an external company. Research results were analyzed mainly by theme and by source, in line with the theoretical framework. The study finds that facilitation of CEB should be a strategic choice and requires strong management support and examples. The behavior should be embedded in organizational culture, values, structures and processes. At the core is the company's willingness to take risks and to encourage employees at all levels to participate and be involved by taking ownership and responsibility. CEB is found to be a key dimension in increasing employee satisfaction and engagement, competitive advantage and the economic growth of companies. There is increased interest in CEB in the case company, but no mutual consensus on it. CEB is not in the strategy, although the mindset and support from management are in place. There is no concrete enablement or space for innovation and CEB, although the platform would be receptive. Further research is needed to build a shared vision of CEB and to determine how to make it a part of the organizational culture and values, in addition to building the conversational space.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The study was inspired when the researcher noticed a need for business-oriented, up-to-date and realistic research on the North Korean market that would describe the market's existing and missing structures and explore possibilities for overcoming the missing structures. Institutional theory was chosen as a suitable framework for describing and examining the market structure. The research question was formulated as follows: “How can foreign companies respond to missing market structures in North Korea?” The research question was divided into three sub-questions: (1) What is the institutional environment of the North Korean market like? (2) What are the most significant missing market structures in North Korea? (3) What possibilities might foreign companies have to respond to the missing market structures? The study was carried out qualitatively because the research question is descriptive. The data were collected by means of expert interviews and qualitative content analysis. The primary data consist of 2 expert interviews and the secondary data of 95 articles collected from 40 sources. The data were analysed by means of qualitative content analysis. The data were coded, classified and presented as wholes using a classification framework drawn up according to the theoretical framework formed for the study. The results and conclusions can be summarised as follows. (1) The institutions of the North Korean market are affected by a dual structure in which a formal, socialist structure and an informal, market-driven structure operate on top of one another. (2) There are missing structures both in the market context and at the market level. The deficiencies are partly a consequence of old structures being replaced by new ones that have not become institutionalised. (3) Companies can use the same possibilities to respond to missing market structures in North Korea that have been proposed in connection with emerging markets. This was interpreted as weakening the perception that the North Korean market is too peculiar for business. (4) The growing middle class and the increasingly significant role of entrepreneurship and of women in business life are causing bottom-up development in the market. These are signs of recent development that have not received wide attention in Western media. This underlines the need for business-oriented, up-to-date further research on the North Korean market.