194 results for imputation


Relevance:

10.00% 10.00%

Abstract:

This thesis investigates the strategies by which novice researchers in Linguistics decline enunciative responsibility and inscribe an authorial voice in their scientific articles. Specifically, it seeks to identify, describe and interpret: i) the linguistic marks that assign enunciative responsibility; ii) the positions taken by the first speaker-enunciator (L1/E1) toward points of view (PoV) imputed to second enunciators (e2); and iii) the linguistic marks that signal the formulation of the writers' own PoV. As a practical application, it discusses how teaching can take into account textual-discursive strategies for enunciative responsibility and authorship in academic and scientific texts. The research corpus consists of eight scientific essays selected from a renowned Linguistics journal that is highly rated by Qualis/CAPES (Brazil Science Agency). The methodology is qualitative and interpretative, although it is also supported by a quantitative approach. Theoretically, the research draws on Textual Analysis of Speech and on linguistic theories of enunciation. The results show two kinds of movement in PoV management: imputation and responsibility. In imputation contexts, the most recurrent linguistic marks were reported speech, indirect speech, reported speech with “that”, and modalization in reported speech (in utterances with “according to”, “in agreement with”, “for”); we also found certain points of non-coincidence of speech, specifically the non-coincidence of the speech itself. The way these linguistic marks occur in the texts points to three enunciative positions assumed by L1/E1 toward the PoV of e2: agreement, disagreement and pseudo-neutrality. Imputation followed by agreement (explicit or not) was clearly recurrent; in this configuration, others' voices are brought in to defend a discourse that the writer assumes as his or her own. In contexts of responsibility, we observed the formulation of the writers' own PoV, resulting either from the novice researchers' theoretical findings (revealing how they interpreted the concepts of the theory) or from their research data, which allowed them to express themselves with more autonomy and without recourse to the discourse of e2. Based on these data, we can say that, in texts by novice researchers, authorship is strongly built upon, and dependent on, the PoV and words of others (the theory and the scholars quoted), given the many contexts in which we observe: a position of agreement; PoV formulated with words taken from e2 and assumed as the writer's own through syntactic integration; comments on what the other says; the absence of explanations and additions; and data analysis that tends to show agreement with the theory supporting the work. These results allow us to see how novice researchers engage in dialogue with the theoretical sources they rely on, and how they display the status of a subject who does research and positions himself or herself as a researcher/author in the scientific field. Treating reported speech in quotation as a resource for handling enunciative responsibility, and for making evident the speaker-enunciator's position toward the reported PoV, suggests a textual-discursive treatment of quotation in academic and scientific texts, in a teaching context that attends to the development of novice researchers' communication skills and that can help students enter and interact in the scientific field.

Abstract:

This paper provides a method for constructing a new historical global nitrogen fertilizer application map (0.5° × 0.5° resolution) for the period 1961-2010 based on country-specific information from Food and Agriculture Organization statistics (FAOSTAT) and various global datasets. This new map incorporates the fraction of NH₄⁺ (and NO₃⁻) in N fertilizer inputs by utilizing fertilizer species information in FAOSTAT, in which species can be categorized as NH₄⁺- and/or NO₃⁻-forming N fertilizers. During data processing, we applied a statistical data imputation method for the missing data (19 % of national N fertilizer consumption) in FAOSTAT. The multiple imputation method enabled us to fill gaps in the time-series data with plausible values using covariate information (year, population, GDP, and crop area). After the imputation, we downscaled the national consumption data to a gridded cropland map. We also applied the multiple imputation method to the available chemical fertilizer species consumption, allowing for the estimation of the NH₄⁺/NO₃⁻ ratio in national fertilizer consumption. In this study, the synthetic N fertilizer inputs in 2000 showed a general consistency with the existing N fertilizer map (Potter et al., 2010, doi:10.1175/2009EI288.1) in relation to the ranges of N fertilizer inputs. Globally, the estimated N fertilizer inputs based on the sum of filled data increased from 15 Tg-N to 110 Tg-N during 1961-2010. On the other hand, the global NO₃⁻ input started to decline after the late 1980s and the fraction of NO₃⁻ in global N fertilizer decreased consistently from 35 % to 13 % over a 50-year period. NH₄⁺-based fertilizers are dominant in most countries; however, the NH₄⁺/NO₃⁻ ratio in N fertilizer inputs shows clear differences temporally and geographically. This new map can be utilized as input data for global model studies and bring new insights for the assessment of historical terrestrial N cycling changes.
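The downscaling step in the abstract above can be illustrated with a minimal sketch. The abstract does not state the allocation rule, so this example assumes the simplest one: each grid cell receives a share of the national total proportional to its cropland area. The function name `downscale_to_grid` and the cell identifiers are hypothetical.

```python
def downscale_to_grid(national_total, cropland_area_by_cell):
    """Split a national fertilizer total across grid cells.

    Assumption (not taken from the abstract): allocation is proportional
    to each cell's cropland area.
    """
    total_area = sum(cropland_area_by_cell.values())
    if total_area <= 0:
        raise ValueError("country has no cropland area to allocate to")
    return {
        cell: national_total * area / total_area
        for cell, area in cropland_area_by_cell.items()
    }


# Hypothetical country with two 0.5-degree cells; units are arbitrary.
grid = downscale_to_grid(100.0, {"cell_a": 1.0, "cell_b": 3.0})
```

Under this rule the gridded values always sum back to the national figure, so the imputed national totals are preserved on the map.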

Abstract:

Continuous variables are one of the major data types collected by survey organizations. They can be incomplete, so that data collectors need to fill in the missing values, or they can contain sensitive information that needs protection from re-identification. One approach to protecting continuous microdata is to sum them within cells defined by different features. In this thesis, I present novel methods of multiple imputation (MI) that can be applied to impute missing values and synthesize confidential values for continuous and magnitude data.

The first method is for limiting the disclosure risk of the continuous microdata whose marginal sums are fixed. The motivation for developing such a method comes from the magnitude tables of non-negative integer values in economic surveys. I present approaches based on a mixture of Poisson distributions to describe the multivariate distribution so that the marginals of the synthetic data are guaranteed to sum to the original totals. At the same time, I present methods for assessing disclosure risks in releasing such synthetic magnitude microdata. The illustration on a survey of manufacturing establishments shows that the disclosure risks are low while the information loss is acceptable.

The second method is for releasing synthetic continuous microdata by a nonstandard MI method. Traditionally, MI fits a model on the confidential values and then generates multiple synthetic datasets from this model. Its disclosure risk tends to be high, especially when the original data contain extreme values. I present a nonstandard MI approach conditioned on the protective intervals. Its basic idea is to estimate the model parameters from these intervals rather than the confidential values. The encouraging results of simple simulation studies suggest the potential of this new approach in limiting the posterior disclosure risk.

The third method is for imputing missing values in continuous and categorical variables. It is extended from a hierarchically coupled mixture model with local dependence. However, the new method separates the variables into non-focused (e.g., almost-fully-observed) and focused (e.g., missing-a-lot) ones. The sub-model structure of focused variables is more complex than that of non-focused ones. At the same time, their cluster indicators are linked together by tensor factorization and the focused continuous variables depend locally on non-focused values. The model properties suggest that moving the strongly associated non-focused variables to the side of focused ones can help to improve estimation accuracy, which is examined in several simulation studies. The method is applied to data from the American Community Survey.

Abstract:

Previously developed models for predicting absolute risk of invasive epithelial ovarian cancer have included a limited number of risk factors and have had low discriminatory power (area under the receiver operating characteristic curve (AUC) < 0.60). Because of this, we developed and internally validated a relative risk prediction model that incorporates 17 established epidemiologic risk factors and 17 genome-wide significant single nucleotide polymorphisms (SNPs) using data from 11 case-control studies in the United States (5,793 cases; 9,512 controls) from the Ovarian Cancer Association Consortium (data accrued from 1992 to 2010). We developed a hierarchical logistic regression model for predicting case-control status that included imputation of missing data. We randomly divided the data into an 80% training sample and used the remaining 20% for model evaluation. The AUC for the full model was 0.664. A reduced model without SNPs performed similarly (AUC = 0.649). Both models performed better than a baseline model that included age and study site only (AUC = 0.563). The best predictive power was obtained in the full model among women younger than 50 years of age (AUC = 0.714); however, the addition of SNPs increased the AUC the most for women older than 50 years of age (AUC = 0.638 vs. 0.616). Adapting this improved model to estimate absolute risk and evaluating it in prospective data sets is warranted.
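The evaluation design described above (a random 80%/20% split, with discrimination summarized by the AUC) can be sketched in a few lines. This is an illustration only, not the study's code: the function names are hypothetical, the split uses an arbitrary seed, and the AUC is computed with the standard rank-based definition, i.e. the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half.

```python
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Randomly partition range(n) into training and evaluation indices."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


def auc(scores, labels):
    """Rank-based AUC for binary labels (1 = case, 0 = control)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties contribute half a win
    return wins / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.563 to 0.714.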

Abstract:

Over the last two decades social vulnerability has emerged as a major area of study, with increasing attention to the study of vulnerable populations. Generally, the elderly are among the most vulnerable members of any society, and widespread population aging has led to greater focus on elderly vulnerability. However, the absence of a valid and practical measure constrains the ability of policy-makers to address this issue in a comprehensive way. This study developed a composite indicator, The Elderly Social Vulnerability Index (ESVI), and used it to undertake a comparative analysis of the availability of support for elderly Jamaicans based on their access to human, material and social resources. The results of the ESVI indicated that while the elderly are more vulnerable overall, certain segments of the population appear to be at greater risk. Females had consistently lower scores than males, and the oldest-old had the highest scores of all groups of older persons. Vulnerability scores also varied according to place of residence, with more rural parishes having higher scores than their urban counterparts. These findings support the political economy framework which locates disadvantage in old age within political and ideological structures. The findings also point to the pervasiveness and persistence of gender inequality as argued by feminist theories of aging. Based on the results of the study it is clear that there is a need for policies that target specific population segments, in addition to universal policies that could make the experience of old age less challenging for the majority of older persons. Overall, the ESVI has displayed usefulness as a tool for theoretical analysis and demonstrated its potential as a policy instrument to assist decision-makers in determining where to target their efforts as they seek to address the issue of social vulnerability in old age. 
Data for this study came from the 2001 population and housing census of Jamaica, with multiple imputation for missing data. The index was derived from the linear aggregation of three equally weighted domains, comprised of eleven unweighted indicators which were normalized using z-scores. Indicators were selected based on theoretical relevance and data availability.
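The index construction described above (z-score normalization, unweighted indicators, and linear aggregation of equally weighted domains) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it assumes the population standard deviation is used for the z-scores, that indicators are averaged within each domain before the domains are averaged, and that every indicator is already oriented in the same direction. The function names and data layout are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one indicator across all units (population SD assumed)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # a constant indicator carries no information
    return [(v - mu) / sd for v in values]


def composite_index(domains):
    """Compute one composite score per unit.

    domains maps a domain name to a dict of indicator name -> list of raw
    values, one value per unit, in the same unit order everywhere.
    """
    domain_scores = []
    for indicators in domains.values():
        normalized = [z_scores(vals) for vals in indicators.values()]
        # Unweighted mean of the indicators within this domain, per unit.
        domain_scores.append([mean(unit) for unit in zip(*normalized)])
    # Equal weight for every domain: plain mean across domains, per unit.
    return [mean(unit) for unit in zip(*domain_scores)]
```

Averaging within domains first means a domain with many indicators does not outweigh a domain with few, which is what equal domain weighting requires.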

Abstract:

Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease where the heart muscle is partially thickened and blood flow is - potentially fatally - obstructed. It is one of the leading causes of sudden cardiac death in young people. Electrocardiography (ECG) and Echocardiography (Echo) are the standard tests for identifying HCM and other cardiac abnormalities. The American Heart Association has recommended using a pre-participation questionnaire for young athletes instead of ECG or Echo tests due to considerations of cost and time involved in interpreting the results of these tests by an expert cardiologist. Initially we set out to develop a classifier for automated prediction of young athletes’ heart conditions based on the answers to the questionnaire. Classification results and further in-depth analysis using computational and statistical methods indicated significant shortcomings of the questionnaire in predicting cardiac abnormalities. Automated methods for analyzing ECG signals can help reduce cost and save time in the pre-participation screening process by detecting HCM and other cardiac abnormalities. Therefore, the main goal of this dissertation work is to identify HCM through computational analysis of 12-lead ECG. ECG signals recorded on one or two leads have been analyzed in the past for classifying individual heartbeats into different types of arrhythmia as annotated primarily in the MIT-BIH database. In contrast, we classify complete sequences of 12-lead ECGs to assign patients into two groups: HCM vs. non-HCM. The challenges and issues we address include missing ECG waves in one or more leads and the dimensionality of a large feature-set. We address these by proposing imputation and feature-selection methods. We develop heartbeat-classifiers by employing Random Forests and Support Vector Machines, and propose a method to classify full 12-lead ECGs based on the proportion of heartbeats classified as HCM. 
The results from our experiments show that the classifiers developed using our methods perform well in identifying HCM. Thus the two contributions of this thesis are the utilization of computational and statistical methods for discovering shortcomings in a current screening procedure and the development of methods to identify HCM through computational analysis of 12-lead ECG signals.
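The patient-level decision rule mentioned above, which classifies a full 12-lead ECG from the proportion of its heartbeats labelled HCM, can be sketched as below. The abstract does not give the cut-off, so the 0.5 threshold here is an assumption, and the function name and label strings are hypothetical. The per-beat labels would come from a heartbeat classifier such as a Random Forest or SVM, which is outside this sketch.

```python
def classify_ecg(beat_labels, threshold=0.5):
    """Classify a whole recording from its per-heartbeat labels.

    beat_labels is a list of "HCM" / "non-HCM" strings, one per heartbeat.
    The 0.5 threshold is an illustrative assumption, not the thesis's value.
    Returns the patient-level label and the proportion of HCM beats.
    """
    if not beat_labels:
        raise ValueError("recording contains no classified heartbeats")
    proportion = sum(1 for b in beat_labels if b == "HCM") / len(beat_labels)
    label = "HCM" if proportion >= threshold else "non-HCM"
    return label, proportion
```

Aggregating over many beats makes the patient-level call less sensitive to a few misclassified heartbeats than any single-beat decision would be.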

Abstract:

BACKGROUND: Moderate-to-vigorous physical activity (MVPA) is an important determinant of children’s physical health, and is commonly measured using accelerometers. A major limitation of accelerometers is non-wear time, which is the time the participant did not wear their device. Given that non-wear time is traditionally discarded from the dataset prior to estimating MVPA, final estimates of MVPA may be biased. Therefore, alternate approaches should be explored. OBJECTIVES: The objectives of this thesis were to 1) develop and describe an imputation approach that uses the socio-demographic, time, health, and behavioural data from participants to replace non-wear time accelerometer data, 2) determine the extent to which imputation of non-wear time data influences estimates of MVPA, and 3) determine if imputation of non-wear time data influences the associations between MVPA, body mass index (BMI), and systolic blood pressure (SBP). METHODS: Seven days of accelerometer data were collected using Actical accelerometers from 332 children aged 10-13. Three methods for handling missing accelerometer data were compared: 1) the “non-imputed” method wherein non-wear time was deleted from the dataset, 2) imputation dataset I, wherein the imputation of MVPA during non-wear time was based upon socio-demographic factors of the participant (e.g., age), health information (e.g., BMI), and time characteristics of the non-wear period (e.g., season), and 3) imputation dataset II wherein the imputation of MVPA was based upon the same variables as imputation dataset I, plus organized sport information. Associations between MVPA and health outcomes in each method were assessed using linear regression. RESULTS: Non-wear time accounted for 7.5% of epochs during waking hours. The average minutes/day of MVPA was 56.8 (95% CI: 54.2, 59.5) in the non-imputed dataset, 58.4 (95% CI: 55.8, 61.0) in imputed dataset I, and 59.0 (95% CI: 56.3, 61.5) in imputed dataset II. 
Estimates between datasets were not significantly different. The strength of the relationships of MVPA with BMI and SBP was comparable across all three datasets. CONCLUSION: These findings suggest that studies that achieve high accelerometer compliance with unsystematic patterns of missing data can use the traditional approach of deleting non-wear time from the dataset to obtain MVPA measures without substantial bias.

Abstract:

Estimates of HIV prevalence are important for policy in order to establish the health status of a country's population and to evaluate the effectiveness of population-based interventions and campaigns. However, participation rates in testing for surveillance conducted as part of household surveys, on which many of these estimates are based, can be low. HIV positive individuals may be less likely to participate because they fear disclosure, in which case estimates obtained using conventional approaches to deal with missing data, such as imputation-based methods, will be biased. We develop a Heckman-type simultaneous equation approach which accounts for non-ignorable selection, but unlike previous implementations, allows for spatial dependence and does not impose a homogeneous selection process on all respondents. In addition, our framework addresses the issue of separation, where for instance some factors are severely unbalanced and highly predictive of the response, which would ordinarily prevent model convergence. Estimation is carried out within a penalized likelihood framework where smoothing is achieved using a parametrization of the smoothing criterion which makes estimation more stable and efficient. We provide the software for straightforward implementation of the proposed approach, and apply our methodology to estimating national and sub-national HIV prevalence in Swaziland, Zimbabwe and Zambia.

Abstract:

Background: Primary total knee replacement is a common operation that is performed to provide pain relief and restore functional ability. Inpatient physiotherapy is routinely provided after surgery to enhance recovery prior to hospital discharge. However, international variation exists in the provision of outpatient physiotherapy after hospital discharge. While evidence indicates that outpatient physiotherapy can improve short-term function, the longer term benefits are unknown. The aim of this randomised controlled trial is to evaluate the long-term clinical effectiveness and cost-effectiveness of a 6-week group-based outpatient physiotherapy intervention following knee replacement. Methods/design: Two hundred and fifty-six patients waiting for knee replacement because of osteoarthritis will be recruited from two orthopaedic centres. Participants randomised to the usual-care group (n = 128) will be given a booklet about exercise and referred for physiotherapy if deemed appropriate by the clinical care team. The intervention group (n = 128) will receive the same usual care and additionally be invited to attend a group-based outpatient physiotherapy class starting 6 weeks after surgery. The 1-hour class will be run on a weekly basis over 6 weeks and will involve task-orientated and individualised exercises. The primary outcome will be the Lower Extremity Functional Scale at 12 months post-operative. Secondary outcomes include: quality of life, knee pain and function, depression, anxiety and satisfaction. Data collection will be by questionnaire prior to surgery and 3, 6 and 12 months after surgery and will include a resource-use questionnaire to enable a trial-based economic evaluation. Trial participation and satisfaction with the classes will be evaluated through structured telephone interviews. The primary statistical and economic analyses will be conducted on an intention-to-treat basis with and without imputation of missing data. 
The primary economic result will estimate the incremental cost per quality-adjusted life year gained from this intervention from a National Health Services (NHS) and personal social services perspective. Discussion: This research aims to benefit patients and the NHS by providing evidence on the long-term effectiveness and cost-effectiveness of outpatient physiotherapy after knee replacement. If the intervention is found to be effective and cost-effective, implementation into clinical practice could lead to improvement in patients’ outcomes and improved health care resource efficiency.

Abstract:

The widespread efforts to incorporate the economic values of oceans into national income accounts have reached a stage where coordination of national efforts is desirable. A symposium held in 2015 began this process by bringing together representatives from ten countries. The symposium concluded that a definition of core ocean industries was possible, but that beyond that core the definition of ocean industries is in flux. Better coordination of ocean income accounts will require addressing issues of aggregation, geography, partial ocean industries, confidentiality, and imputation. Beyond the standard national income accounts, a need to incorporate environmental resource and ecosystem service values to gain a complete picture of the economic role of the oceans was identified. The U.N. System of Environmental and Economic Accounts and the Experimental Ecosystem Service Accounts provide frameworks for this expansion. This will require the development of physical accounts of environmental assets linked to the economic accounts as well as the adaptation of transaction and welfare based economic valuation methods to environmental resources and ecosystem services. The future development of ocean economic data is most likely to require cooperative efforts at development of metadata standards and the use of multiple platforms of opportunity created by policy analysis, economic development, and conservation projects, both to collect new economic data and to sustain ocean economy data collection into the future by building capacity in economic data collection and use.

Abstract:

Because the results on the relationship between the use of capitalization and a firm's level of indebtedness are difficult to generalize, they do not support the conclusion that such a relationship exists. Yet the accounting literature has shown that, in the absence of standards, indebted firms favor the capitalization method. This suggests that the criteria set out by the ICCA limit the use of capitalization. The results on the proportion of development costs capitalized suggest that no relationship exists between the proportion capitalized and the firm's level of indebtedness. This suggests that the criteria limit the amount of development costs capitalized. In addition, a negative association was observed between the use of expensing (imputation) and firm size. This result is surprising, because large firms are presumably successful and have a good chance of meeting the criteria set out by the ICCA. It suggests that large firms avoid the obligation to capitalize their development costs and that the ICCA presumably tolerates this.

Abstract:

Snapper (Pagrus auratus) is widely distributed throughout subtropical and temperate southern oceans and forms a significant recreational and commercial fishery in Queensland, Australia. Using data from government reports, media sources, popular publications and a government fisheries survey carried out in 1910, we compiled information on individual snapper fishing trips that took place prior to the commencement of fishery-wide organized data collection, from 1871 to 1939. In addition to extracting all available quantitative data, we translated qualitative information into bounded estimates and used multiple imputation to handle missing values, forming 287 records for which catch rate (snapper fisher⁻¹ h⁻¹) could be derived. Uncertainty was handled through a parametric maximum likelihood framework (a transformed trivariate Gaussian), which facilitated statistical comparisons between data sources. No statistically significant differences in catch rates were found among media sources and the government fisheries survey. Catch rates remained stable throughout the time series, averaging 3.75 snapper fisher⁻¹ h⁻¹ (95% confidence interval, 3.42–4.09) as the fishery expanded into new grounds. In comparison, a contemporary (1993–2002) south-east Queensland charter fishery produced an average catch rate of 0.4 snapper fisher⁻¹ h⁻¹ (95% confidence interval, 0.31–0.58). These data illustrate the productivity of a fishery during its earliest years of development and represent the earliest catch rate data globally for this species. By adopting a formalized approach to address issues common to many historical records – missing data, a lack of quantitative information and reporting bias – our analysis demonstrates the potential for historical narratives to contribute to contemporary fisheries management.
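The catch-rate unit used above (snapper per fisher per hour) implies a simple calculation for each trip record: total catch divided by fisher-hours. The sketch below shows that calculation, plus one naive way to carry bounded estimates through it with interval arithmetic. The interval step is an assumption added for illustration; the study itself handled uncertainty with a maximum likelihood framework, which this sketch does not reproduce. Function names are hypothetical.

```python
def catch_rate(total_catch, fishers, hours):
    """Catch rate in fish per fisher per hour for a single trip."""
    if fishers <= 0 or hours <= 0:
        raise ValueError("fishers and hours must be positive")
    return total_catch / (fishers * hours)


def catch_rate_bounds(catch_range, fishers_range, hours_range):
    """Propagate (low, high) bounds on each quantity to the catch rate.

    Illustrative interval arithmetic only: the lowest rate pairs the lowest
    catch with the highest effort, and the highest rate does the reverse.
    """
    lowest = catch_rate(catch_range[0], fishers_range[1], hours_range[1])
    highest = catch_rate(catch_range[1], fishers_range[0], hours_range[0])
    return lowest, highest
```

For example, a trip on which 4 fishers landed 60 snapper in 5 hours gives `catch_rate(60, 4, 5)`, i.e. 3.0 snapper per fisher per hour.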

Abstract:

This paper proposes arithmetic and geometric Paasche quality-adjusted price indexes that combine micro data from the base period with macro data on the averages of asset prices and characteristics at the index period.
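For orientation, the textbook (not quality-adjusted) arithmetic and geometric Paasche price indexes are given below. The paper's quality-adjusted versions are not specified in this abstract, so these are background definitions only. The notation is an assumption: $p_i^{0}$ and $p_i^{t}$ are the prices of item $i$ in the base and index periods, $q_i^{t}$ is the index-period quantity, and $s_i^{t}$ is the index-period expenditure share.

```latex
P_{P} = \frac{\sum_i p_i^{t} q_i^{t}}{\sum_i p_i^{0} q_i^{t}},
\qquad
P_{GP} = \prod_i \left( \frac{p_i^{t}}{p_i^{0}} \right)^{s_i^{t}},
\qquad
s_i^{t} = \frac{p_i^{t} q_i^{t}}{\sum_j p_j^{t} q_j^{t}}
```

Both forms weight price changes by index-period information, which is what distinguishes a Paasche index from its Laspeyres counterpart.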