802 results for multiple linear regression
Abstract:
BACKGROUND Multiple breath washout (MBW) derived Scond is an established index of ventilation inhomogeneity. Time-consuming post hoc calculations of the expirogram's slope of alveolar phase III (SIII) and the lack of available software have hampered widespread application of Scond. METHODS Seventy-two school-aged children (45 with cystic fibrosis; CF) performed 3 nitrogen MBW. We tested a new automated algorithm for Scond analysis (Scondauto) which comprised breath selection for SIII detection, calculation and reporting of test quality. We compared Scondauto to (i) standard Scond analysis (Scondmanual) with manual breath selection and to (ii) pragmatic Scond analysis including all breaths (Scondall). Primary outcomes were success rate and agreement between different Scond protocols, and Scond fitting quality (linear regression R²). RESULTS Average Scondauto (0.06 for CF and 0.01 for controls) was not different from Scondmanual (0.06 for CF and 0.01 for controls) and showed comparable fitting quality (R² 0.53 for CF and 0.13 for controls vs. R² 0.54 for CF and 0.13 for controls). Scondall was similar in CF and controls but with inferior fitting quality compared to Scondauto and Scondmanual. CONCLUSIONS Automated Scond calculation is feasible and produces robust results comparable to the standard manual way of Scond calculation. This algorithm provides a valid, fast and objective tool for regular use, even in children. Pediatr Pulmonol. © 2014 Wiley Periodicals, Inc.
Abstract:
robreg provides a number of robust estimators for linear regression models. Among them are the high breakdown-point and high efficiency MM-estimator, the Huber and bisquare M-estimator, and the S-estimator, each supporting classic or robust standard errors. Furthermore, basic versions of the LMS/LQS (least median of squares) and LTS (least trimmed squares) estimators are provided. Note that the moremata package, also available from SSC, is required.
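To illustrate what a Huber M-estimator does, here is a minimal Python sketch of iteratively reweighted least squares for a single predictor. It is not robreg's implementation or syntax (robreg is a Stata command). The function names, the toy data, the tuning constant 1.345, and the simplified MAD-based scale estimate are all assumptions chosen for illustration.

```python
from statistics import median


def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line(x, y, k=1.345, iterations=50):
    """Huber M-estimate of a line via iteratively reweighted least squares.

    Assumptions: k = 1.345 is a conventional tuning constant, and the scale
    is a simplified MAD (median absolute residual / 0.6745).
    """
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(iterations):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        scale = median([abs(r) for r in residuals]) / 0.6745
        if scale == 0.0:
            break  # at least half the points are fitted exactly
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside it
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]
        a, b = weighted_line(x, y, w)
    return a, b


# Hypothetical toy data: y = 2 + 3x with one gross outlier at the last point.
x = [float(i) for i in range(10)]
y = [2.0 + 3.0 * xi for xi in x]
y[9] = 100.0

ols_a, ols_b = weighted_line(x, y, [1.0] * len(x))
rob_a, rob_b = huber_line(x, y)
print("OLS slope:", ols_b, "Huber slope:", rob_b)
```

On this toy data the single outlier pulls the ordinary least squares slope well away from 3, while the reweighted fit stays close to the line followed by the other nine points.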
Abstract:
Parameter estimates from commonly used multivariable parametric survival regression models do not directly quantify differences in years of life expectancy. Gaussian linear regression models give results in terms of absolute mean differences, but are not appropriate for modeling life expectancy, because in many situations time to death has a negatively skewed distribution. A regression approach using a skew-normal distribution would be an alternative to parametric survival models in the modeling of life expectancy, because parameter estimates can be interpreted in terms of survival time differences while allowing for skewness of the distribution. In this paper we show how to use skew-normal regression so that censored and left-truncated observations are accounted for. With this we model differences in life expectancy using data from the Swiss National Cohort Study and from official life expectancy estimates, and compare the results with those derived from commonly used survival regression models. We conclude that a censored skew-normal survival regression approach for left-truncated observations can be used to model differences in life expectancy across covariates of interest.
Abstract:
The main objective of this preliminary study was to further clarify the association between testosterone (T) levels and depression by investigating symptom-based depression subtypes in a sample of 64 men. The data were taken from the ZInEP epidemiology survey. Gonadal hormones of a melancholic (n = 25) and an atypical (n = 14) depression subtype, derived from latent class analysis, were compared with those of healthy controls (n = 18). Serum T was assayed using an enzyme-linked immunosorbent assay procedure. Analysis of variance, analysis of covariance, nonparametric tests, and generalized linear regression models were performed to examine group differences. The atypical depressive subtype showed significantly lower T levels than the melancholic subtype. While accumulating evidence indicates that, beyond psychosocial characteristics, the melancholic and atypical depressive subtypes are also distinguishable by biological correlates, the current study expanded this knowledge to include gonadal hormones. Further longitudinal research is warranted to establish causality by linking the multiple processes in the pathogenesis of depression.
Abstract:
More than a quarter of patients with HIV in the United States are diagnosed in hospital settings, most often with advanced HIV-related conditions.(1) There has been little research on the causes of hospitalization when patients are first diagnosed with HIV. The aim of this study was to determine whether these patients are hospitalized because of an HIV-related cause or because of some other co-morbidity. Reduced access to care could be one reason why patients are diagnosed late in the course of the disease. This study compared the access to care of patients diagnosed with HIV in hospital and outpatient settings. The data used for the study were part of the ongoing study “Attitudes and Beliefs and Steps of HIV Care”. The participants were newly diagnosed with HIV and recruited from both inpatient and outpatient settings. The primary and secondary diagnoses were extracted from hospital discharge reports and a primary reason for hospitalization was ascertained. These reasons were classified as HIV-related, other infectious causes, non-infectious causes, other systemic causes, and miscellaneous causes. Access to care was determined by a score based on responses to a set of questions derived from the HIV Cost and Services Utilization Study (HCSUS) on a 6-point scale. The mean score of the hospitalized patients was compared with the mean score of the patients diagnosed in an outpatient setting. We used multiple linear regression to compare mean differences between the two groups after adjusting for age, sex, race, household income, educational level and health insurance at the time of diagnosis. There were 185 participants in the study, including 78 who were diagnosed in hospital settings and 107 who were diagnosed in outpatient settings. We found that HIV-related conditions were the leading cause of hospitalization, accounting for 60% of admissions, followed by non-infectious causes (20%) and then other infectious causes (17%).
The inpatient-diagnosed group did not have greater perceived access to care than the outpatient group. Regression analysis demonstrated a statistically significant improvement in access to care with advancing education level (p=0.04) and with better health insurance (p=0.004). HIV-related causes account for many hospitalizations when patients are first diagnosed with HIV. Many of these HIV-related hospitalizations could have been prevented if patients had been diagnosed early and linked to medical care. Programs to increase HIV awareness need to be an integral part of activities aimed at controlling the spread of HIV in the community. Routine testing for HIV infection to promote early HIV diagnosis can prevent significant morbidity and mortality.
Abstract:
In recent years, disaster preparedness through assessment of medical and special needs persons (MSNP) has taken center stage in the public eye because of frequent natural disasters such as hurricanes, storm surges and tsunamis due to climate change and increased human activity on our planet. Statistical methods for complex survey design and analysis have gained significance as a consequence. However, many challenges remain in generalizing such assessments to the target population for policy-level advocacy and implementation. Objective. This study discusses the use of several statistical methods for disaster preparedness and medical needs assessment, to support local and state governments in policy-level decision making and logistics and so avoid loss of life and property in future calamities. Methods. To obtain precise and unbiased estimates of Medical Special Needs Persons (MSNP) and of disaster preparedness for evacuation in the Rio Grande Valley (RGV) of Texas, a stratified, cluster-randomized multi-stage sampling design was implemented. US School of Public Health, Brownsville surveyed 3088 households in three counties: Cameron, Hidalgo, and Willacy. Multiple statistical methods were implemented, and estimates were obtained taking into account the probability of selection and clustering effects. The statistical methods discussed were Multivariate Linear Regression (MLR), Survey Linear Regression (Svy-Reg), Generalized Estimation Equation (GEE) and Multilevel Mixed Models (MLM), all with and without sampling weights. Results. The estimated population of the RGV was 1,146,796. Of these, 51.5% were female, 90% Hispanic, 73% married, 56% unemployed and 37% had personal transport. 40% had attained education up to elementary school, another 42% had reached high school and only 18% had gone to college. Median household income was less than $15,000/year. MSNP were estimated at 44,196 (3.98%) [95% CI: 39,029; 51,123].
All statistical models are in concordance, with MSNP estimates ranging from 44,000 to 48,000. MSNP estimates by statistical method are: MLR (47,707; 95% CI: 42,462; 52,999), MLR with weights (45,882; 95% CI: 39,792; 51,972), Bootstrap Regression (47,730; 95% CI: 41,629; 53,785), GEE (47,649; 95% CI: 41,629; 53,670), GEE with weights (45,076; 95% CI: 39,029; 51,123), Svy-Reg (44,196; 95% CI: 40,004; 48,390) and MLM (46,513; 95% CI: 39,869; 53,157). Conclusion. The RGV is a flood zone, highly susceptible to hurricanes and other natural disasters. People in the region are mostly Hispanic and under-educated, with the lowest income levels in the U.S. In the event of a disaster most people would be unable to cope, since only 37% have personal transport with which to take care of MSNP. Intervention by local and state governments in planning, preparation and support for evacuation is necessary in any such disaster to avoid loss of human life. Key words: Complex Surveys, statistical methods, multilevel models, cluster randomized, sampling weights, raking, survey regression, generalized estimation equations (GEE), random effects, Intracluster correlation coefficient (ICC).
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric ($\mathrm{MSEP}_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP}_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP}_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples.
Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
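The leave-one-out statistic recommended above can be sketched in a few lines of Python. This is an illustrative brute-force version that refits the model once per observation. The helper names (`solve`, `fit_ols`, `predict`, `press`) and the toy data are hypothetical and not taken from the study.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares coefficients (intercept first) from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def predict(beta, row):
    """Prediction for one observation from coefficients with intercept first."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def press(rows, y):
    """PRESS: sum of squared errors when each point is predicted without itself."""
    total = 0.0
    for i in range(len(y)):
        beta = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - predict(beta, rows[i])) ** 2
    return total


# Hypothetical toy data with two predictors.
rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
y_exact = [1.0 + 2.0 * a - b for a, b in rows]   # exactly linear response
y_noisy = [1.1, 2.9, 0.2, 2.0, 4.1, 0.8]         # same trend plus small errors

beta_full = fit_ols(rows, y_noisy)
rss = sum((yi - predict(beta_full, r)) ** 2 for r, yi in zip(rows, y_noisy))
print("RSS:", rss, "PRESS:", press(rows, y_noisy))
```

Because each held-out point is predicted by a model that never saw it, PRESS is never smaller than the in-sample residual sum of squares. For ordinary least squares the same quantity can also be obtained from a single fit using the leverages, which avoids the repeated refitting.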
Abstract:
This study described the relationship of sexual maturation and blood pressure in a sample (n = 361) of white females, ages seven through 18, attending public schools in a defined area of Central Texas during October through December, 1984. Other correlates of blood pressure were also described for this sample. A survey was performed to obtain the data on height, weight, body mass, pulse rate, upper arm circumference and length, and blood pressure. Each subject self-assessed her secondary sex characteristics (breast and pubic hair) according to drawings of the Tanner stages of maturation. The subjects were interviewed to obtain data on personal health habits and menstrual status. Student age, ethnic group and place of residence were abstracted from school records. Parents or guardians of the subjects responded to a questionnaire pertaining to parental and subject health history and parents' occupation and educational attainment. In the simple linear regression analysis, sexual maturation and variables of body size were significantly (p < 0.001) and positively associated with systolic and fourth- and fifth-phase diastolic blood pressure. The demographic and socioeconomic variables were not sufficiently variant in this population to have differential effects on the relation between blood pressure and maturation. Stepwise multiple regression was used to assess the contribution of sexual maturation to the variance of blood pressure after accounting for the variables of body size. Sexual maturation (breast stage) along with weight, height and body mass remained in the multiple regression models for fourth- and fifth-phase diastolic blood pressure. Only height and body mass remained in the regression model for systolic blood pressure; sexual maturation did not contribute more to the explanation of the systolic blood pressure variance. The association of sexual maturation with blood pressure level was established in this sample of young white females.
More research is needed first, to determine if this relationship prevails in other populations of young females, and second, to determine the relationship of sexual maturation sequence and change with the change of blood pressure during childhood and adolescence.
Abstract:
Ovarian cancer is the leading cause of cancer-related death for females owing to the lack of a specific early detection method. It is of great interest to find molecular biomarkers that are sensitive and specific to ovarian cancer for early diagnosis, prognosis and therapeutics. miRNAs have been proposed as potential biomarkers that could be used in cancer prevention and therapeutics. The current study analyzed miRNA and mRNA expression data extracted from the Cancer Genome Atlas (TCGA) database. Using simple linear regression and multiple regression models, we found 71 negatively associated miRNA-mRNA pairs between 56 miRNAs and 24 genes of the PI3K/AKT pathway. Among these miRNA and mRNA target pairs, 9 were in agreement with the predictions from the most commonly used target prediction programs, including miRGen, miRDB, miRTarbase and miR2Disease. These shared miRNA-mRNA pairs were considered the most likely to be involved in ovarian cancer. Furthermore, 4 of the 9 target genes encode cell cycle or apoptosis related proteins including Cyclin D1, p21, FOXO1 and Bcl2, suggesting that their regulator miRNAs, including miR-16, miR-96 and miR-21, most likely play important roles in promoting tumor growth through dysregulated cell cycle or apoptosis. miR-96 was also found to directly target IRS-1. In addition, the results showed that miR-17 and miR-9 may be involved in ovarian cancer through targeting JAK1. This study might provide evidence for using miRNAs or miRNA profiles as biomarkers.
Abstract:
Aim: To investigate shell size variation among gastropod faunas of fossil and recent long-lived European lakes and discuss potential underlying processes. Location: 23 long-lived lakes of the Miocene to Recent of Europe. Methods: Based on a dataset of 1412 species of both fossil and extant lacustrine gastropods, we assessed differences in shell size in terms of characteristics of the faunas (species richness, degree of endemism, differences in family composition) and the lakes (surface area, latitude and longitude of lake centroid, distance to closest neighbouring lake) using multiple and linear regression models. Because of a strong species-area relationship, we used resampling to determine whether any observed correlation is driven by that relationship. Results: The regression models indicated size range expansion rather than unidirectional increase or decrease as the dominant pattern of size evolution. The multiple regression models for size range and maximum and minimum size were statistically significant, while the model with mean size was not. Individual contributions and linear regressions indicated species richness and lake surface area as the best predictors of size changes. Resampling analysis revealed no significant effects of species richness on the observed patterns. The correlations are comparable across families of different size classes, suggesting a general pattern. Main conclusions: Among the chosen variables, species richness and lake surface area are the most robust predictors of shell size in long-lived lake gastropods. Although the most outstanding and attractive examples of size evolution in lacustrine gastropods derive from lakes with extensive durations, shell size appears to be independent of the duration of the lake as well as the longevity of a species. The analogy of long-lived lakes as 'evolutionary islands' does not hold for developments of shell size because different sets of parameters predict size changes.
Abstract:
Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, the lasso is a shrinkage and selection method for linear regression. We present an algorithm that embeds the lasso in an iterative procedure that alternately computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models for several kinds of scenarios.
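The basic idea of locally weighted regression described above, without the lasso step the paper adds, can be sketched as follows. This is a hedged illustration and not the authors' algorithm: the Gaussian kernel, the `bandwidth` parameter, the single-predictor setting and the toy data are all assumptions.

```python
import math


def local_linear_predict(x_train, y_train, x0, bandwidth=1.0):
    """Predict the response at x0 with a kernel-weighted straight-line fit.

    Training points close to x0 get weights near 1, distant points get
    weights near 0 (Gaussian kernel, an assumption of this sketch).
    """
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x_train]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x_train)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y_train)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x_train))
    sxy = sum(wi * (xi - mx) * (yi - my)
              for wi, xi, yi in zip(w, x_train, y_train))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical toy data: a curved response that one global line cannot follow.
x_train = [i * 0.1 for i in range(41)]      # 0.0, 0.1, ..., 4.0
y_curved = [xi ** 2 for xi in x_train]

print(local_linear_predict(x_train, y_curved, 2.0, bandwidth=0.3))
```

A small bandwidth tracks curvature closely but uses fewer effective neighbors, which is one way the overfitting mentioned above can arise. The paper's proposal additionally applies lasso shrinkage inside each local fit, which this sketch does not attempt.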
Abstract:
Time domain laser reflectance spectroscopy (TDRS) was applied for the first time to evaluate internal fruit quality. This technique, known in medicine-related knowledge areas, has not been used before in agricultural or food research. It allows the simultaneous non-destructive measuring of two optical characteristics of the tissues: light scattering and absorption. Models to measure firmness, sugar & acid contents in kiwifruit, tomato, apple, peach, nectarine and other fruits were built using sequential statistical techniques: principal component analysis, multiple stepwise linear regression, clustering and discriminant analysis. Consistent correlations were established between the two parameters measured with TDRS, i.e. absorption & transport scattering coefficients, with chemical constituents (sugars and acids) and firmness, respectively. Classification models were built to sort fruits into three quality grades, according to their firmness, soluble solids and acidity.
Abstract:
The objective of this study was to propose a multi-criteria optimization and decision-making technique to solve food engineering problems. This technique was demonstrated using experimental data obtained on osmotic dehydration of carrot cubes in a sodium chloride solution. The Aggregating Functions Approach, the Adaptive Random Search Algorithm, and the Penalty Functions Approach were used in this study to compute the initial set of non-dominated or Pareto-optimal solutions. Multiple non-linear regression analysis was performed on a set of experimental data in order to obtain particular multi-objective functions (responses), namely water loss, solute gain, rehydration ratio, three different colour criteria of rehydrated product, and sensory evaluation (organoleptic quality). Two multi-criteria decision-making approaches, the Analytic Hierarchy Process (AHP) and the Tabular Method (TM), were used simultaneously to choose the best alternative among the set of non-dominated solutions. The multi-criteria optimization and decision-making technique proposed in this study can facilitate the assessment of criteria weights, giving rise to a fairer, more consistent, and adequate final compromise solution or food process. This technique can be useful to food scientists in research and education, as well as to engineers involved in the improvement of a variety of food engineering processes.
Abstract:
Introduction: Characterizing adolescents' dietary patterns makes it possible to analyze the effects of the diet as a whole on health. Objectives: To identify in the scientific literature the multiple solutions adopted in multivariate techniques for deriving dietary patterns; to analyze the relationship between the main dietary patterns of Brazilian adolescents and overweight and obesity; and to analyze the influence of socioeconomic factors on the main dietary patterns of a multiethnic group of adolescents. Methods: This thesis comprises three articles. The first is a literature review on dietary patterns estimated by different multivariate techniques. For the other articles, two databases were used: the 2008-09 Pesquisa de Orçamentos Familiares (POF, Household Budget Survey) and the Healthy Lifestyle in Europe by Nutrition in Adolescence (HELENA) study conducted in 2006-07. Exploratory factor analysis was used to derive the dietary patterns. The second article used a logistic regression model to assess the association between dietary pattern scores and overweight and obesity, adjusted for socioeconomic variables. The third article used a linear regression model to assess the association between indicators of income and education and the dietary pattern scores. Results: The literature review found great heterogeneity in the criteria adopted across the multiple steps of the multivariate techniques. In the second manuscript, greater adherence to the "Lanches" pattern and to the "Snacks" pattern was associated with a greater likelihood of overweight and obesity. In the third manuscript, 10 dietary patterns were identified among adolescents from urban areas in Brazil and Europe.
Among Brazilian adolescents, higher socioeconomic and educational levels of the household reference person were positively associated with the pattern composed of cheese, breakfast cereals, fruit and fruit juices, and milk and dairy products. Among European adolescents, higher socioeconomic levels and higher maternal education were positively associated with the pattern composed of dairy drinks, breakfast cereals, milk and dairy products, and butter and margarine. In addition, higher socioeconomic levels were negatively associated with the pattern composed of vegetable oils, nuts, seeds, bread, meat, legumes, vegetables and tubers, and eggs, and higher levels of maternal education were negatively associated with the pattern composed of bread, meat, sugar-sweetened beverages and savory snacks. Conclusion: The findings showed a high prevalence of dietary patterns based on foods with high concentrations of fats and sugars, which are responsible for the increase in overweight and obesity among Brazilian adolescents. Overall, adolescents with higher income or more material assets, and whose responsible adult had a higher level of education, followed somewhat healthier dietary patterns. However, in Brazil higher education of the household reference person is not by itself directly associated with better dietary practices among adolescents, in contrast to what happens in Europe. Thus, greater access to income and higher education of the responsible adults play an important role in the adoption of healthier dietary patterns among adolescents.