9 resultados para Cross-validation

em DigitalCommons@The Texas Medical Center


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Despite many researches on development in education and psychology, not often is the methodology tested with real data. A major barrier to test the growth model is that the design of study includes repeated observations and the nature of the growth is nonlinear. The repeat measurements on a nonlinear model require sophisticated statistical methods. In this study, we present mixed effects model in a negative exponential curve to describe the development of children's reading skills. This model can describe the nature of the growth on children's reading skills and account for intra-individual and inter-individual variation. We also apply simple techniques including cross-validation, regression, and graphical methods to determine the most appropriate curve for data, to find efficient initial values of parameters, and to select potential covariates. We illustrate with an example that motivated this research: a longitudinal study of academic skills from grade 1 to grade 12 in Connecticut public schools. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Mistreatment and self-neglect significantly increase the risk of dying in older adults. It is estimated that 1 to 2 million older adults experience elder mistreatment and self-neglect every year in the United States. Currently, there are no elder mistreatment and self-neglect assessment tools with construct validity and measurement invariance testing and no studies have sought to identify underlying latent classes of elder self-neglect that may have differential mortality rates. Using data from 11,280 adults with Texas APS substantiated elder mistreatment and self-neglect 3 studies were conducted to: (1) test the construct validity and (2) the measurement invariance across gender and ethnicity of the Texas Adult Protective Services (APS) Client Assessment and Risk Evaluation (CARE) tool and (3) identify latent classes associated with elder self-neglect. Study 1 confirmed the construct validity of the CARE tool following adjustments to the initial hypothesized CARE tool. This resulted in the deletion of 14 assessment items and a final assessment with 5 original factors and 43 items. Cross-validation for this model was achieved. Study 2 provided empirical evidence for factor loading and item-threshold invariance of the CARE tool across gender and between African-Americans and Caucasians. The financial status domain of the CARE tool did not function properly for Hispanics and thus, had to be deleted. Subsequent analyses showed factor loading and item-threshold invariance across all 3 ethnic groups with the exception of some residual errors. Study 3 identified 4-latent classes associated with elder self-neglect behaviors which included individuals with evidence of problems in the areas of (1) their environment, (2) physical and medical status, (3) multiple domains and (4) finances. Overall, these studies provide evidence supporting the use of APS CARE tool for providing unbiased and valid investigations of mistreatment and neglect in older adults with different demographic characteristics. Furthermore, the findings support the underlying notion that elder self-neglect may not only occur along a continuum, but that differential types may exist. All of which, have very important potential implications for social and health services distributed to vulnerable mistreated and neglected older adults.^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard of care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This is important since tumor shrinkage is an important cancer treatment endpoint that is correlated with probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with . The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Cervical cancer is the leading cause of death and disease from malignant neoplasms among women in developing countries. Even though the Pap smear has significantly decreased the number of deaths from cervical cancer in the past years, it has its limitations. Researchers have developed an automated screening machine which can potentially detect abnormal cases that are overlooked by conventional screening. The goal of quantitative cytology is to classify the patient's tissue sample based on quantitative measurements of the individual cells. It is also much cheaper and potentially can take less time. One of the major challenges of collecting cells with a cytobrush is the possibility of not sampling any existing dysplastic cells on the cervix. Being able to correctly classify patients who have disease without the presence of dysplastic cells could improve the accuracy of quantitative cytology algorithms. Subtle morphologic changes in normal-appearing tissues adjacent to or distant from malignant tumors have been shown to exist, but a comparison of various statistical methods, including many recent advances in the statistical learning field, has not previously been done. The objective of this thesis is to use different classification methods applied to quantitative cytology data for the detection of malignancy associated changes (MACs). In this thesis, Elastic Net is the best algorithm. When we applied the Elastic Net algorithm to the test set, we combined the training set and validation set as "training" set and used 5-fold cross validation to choose the parameter for Elastic Net. It has a sensitivity of 47% at 80% specificity, an AUC 0.52, and a partial AUC 0.10 (95% CI 0.09-0.11).^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, some models have been developed using published exposure databases with their corresponding exposure determinants. These models are designed to be applied to reported exposure determinants obtained from study subjects or exposure levels assigned by an industrial hygienist, so quantitative exposure estimates can be obtained. ^ In an effort to improve the prediction accuracy and generalizability of these models, and taking into account that the limitations encountered in previous studies might be due to limitations in the applicability of traditional statistical methods and concepts, the use of computer science- derived data analysis methods, predominantly machine learning approaches, were proposed and explored in this study. ^ The goal of this study was to develop a set of models using decision trees/ensemble and neural networks methods to predict occupational outcomes based on literature-derived databases, and compare, using cross-validation and data splitting techniques, the resulting prediction capacity to that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines and the continuous case, where the result of the exposure is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1 trichloroethane, methylene dichloride and, trichloroethylene were used. ^ When compared to regression estimations, results showed better accuracy of decision trees/ensemble techniques for the categorical case while neural networks were better for estimation of continuous exposure values. Overrepresentation of classes and overfitting were the main causes for poor neural network performance and accuracy. Estimations based on literature-based databases using machine learning techniques might provide an advantage when they are applied to other methodologies that combine `expert inputs' with current exposure measurements, like the Bayesian Decision Analysis tool. The use of machine learning techniques to more accurately estimate exposures from literature-based exposure databases might represent the starting point for the independence from the expert judgment.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: Many users search the Internet for answers to health questions. Complementary and alternative medicine (CAM) is a particularly common search topic. Because many CAM therapies do not require a clinician's prescription, false or misleading CAM information may be more dangerous than information about traditional therapies. Many quality criteria have been suggested to filter out potentially harmful online health information. However, assessing the accuracy of CAM information is uniquely challenging since CAM is generally not supported by conventional literature. OBJECTIVE: The purpose of this study is to determine whether domain-independent technical quality criteria can identify potentially harmful online CAM content. METHODS: We analyzed 150 Web sites retrieved from a search for the three most popular herbs: ginseng, ginkgo and St. John's wort and their purported uses on the ten most commonly used search engines. The presence of technical quality criteria as well as potentially harmful statements (commissions) and vital information that should have been mentioned (omissions) was recorded. RESULTS: Thirty-eight sites (25%) contained statements that could lead to direct physical harm if acted upon. One hundred forty five sites (97%) had omitted information. We found no relationship between technical quality criteria and potentially harmful information. CONCLUSIONS: Current technical quality criteria do not identify potentially harmful CAM information online. Consumers should be warned to use other means of validation or to trust only known sites. Quality criteria that consider the uniqueness of CAM must be developed and validated.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Symptoms has been shown to predict quality of life, treatment course and survival in solid tumor patients. Currently, no instrument exists that measures both cancer-related symptoms and the neurologic symptoms that are unique to persons with primary brain tumors (PBT). The aim of this study was to develop and validate an instrument to measure symptoms in patients who have PBT. A conceptual analysis of symptoms and symptom theories led to defining the symptoms experience as the perception of the frequency, intensity, distress, and meaning that occurs as symptoms are produced, perceived, and expressed. The M.D. Anderson Symptom Inventory (MDASI) measures both symptoms and how they interfere with daily functioning in patients with cancer, which is similar to the situational meaning defined in the analysis. A list of symptoms pertinent to the PBT population was added to the core MDASI and reviewed by a group of experts for validity. As a result, 18 items were added to the core MDASI (the MDASI-BT) for the next phase of instrument development, establishing validity and reliability through a descriptive, cross-sectional approach with PBT patients. Data were collected with a patient completed demographic data sheet, an investigator completed clinician checklist, and the MDASI-BT. Analysis evaluated the reliability and validity of the MDASI-BT in PBT patients. Data were obtained from 201 patients. The number of items was reduced to 22 by evaluation of symptom severity as well as cluster analysis. Regression analysis showed more than half (56%) of the variability in symptom severity was explained by the brain tumor module items. Factor analysis confirmed that the 22 item MDASI-BT measured six underlying constructs: (a) affective; (b) cognitive; (c) focal neurologic deficits; (d) constitutional symptoms; (e) treatment-related symptoms; and (f) gastrointestinal symptoms. The MDASI-BT was sensitive to disease severity and if the patient was hospitalized. The MDASI-BT is the first instrument to measure symptoms in PBT patients that has demonstrated reliability and validity. It is the first step in a program of research to evaluate the occurrence of symptoms and plan and evaluate interventions for PBT patients. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background/significance. The scarcity of reliable and valid Spanish language instruments for health related research has hindered research with the Hispanic population. Research suggests that fatalistic attitudes are related to poor cancer screening behaviors and may be one reason for low participation of Mexican-Americans in cancer screening. This problem is of major concern because Mexican-Americans constitute the largest Hispanic subgroup in the U.S.^ Purpose. The purposes of this study were: (1) To translate the Powe Fatalism Inventory, (PFI) into Spanish, and culturally adapt the instrument to the Mexican-American culture as found along the U.S.-Mexico border and (2) To test the equivalence between the Spanish translated, culturally adapted version of the PFI and the English version of the PFI to include clarity, content validity, reading level and reliability.^ Design. Descriptive, cross-sectional.^ Methods. The Spanish language translation used a translation model which incorporates a cultural adaptation process. The SPFI was administered to 175 bilingual participants residing in a midsize, U.S-Mexico border city. Data analysis included estimation of Cronbach's alpha, factor analysis, paired samples t-test comparison and multiple regression analysis using SPSS software, as well as measurement of content validity and reading level of the SPFI. ^ Findings. A reliability estimate using Cronbach's alpha coefficient was 0.81 for the SPFI compared to 0.80 for the PFI in this study. Factor Analysis extracted four factors which explained 59% of the variance. Paired t-test comparison revealed no statistically significant differences between the SPFI and PFI total or individual item scores. Content Validity Index was determined to be 1.0. Reading Level was assessed to be less than a 6th grade reading level. The correlation coefficient between the SPFI and PFI was 0.95.^ Conclusions. This study provided strong psychometric evidence that the Spanish translated, culturally adapted SPFI is an equivalent tool to the English version of the PFI in measuring cancer fatalism. This indicates that the two forms of the instrument can be used interchangeably in a single study to accommodate reading and speaking abilities of respondents. ^