56 results for Regression imputation


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present an independent calibration model for the determination of biogenic silica (BSi) in sediments, developed from analysis of synthetic sediment mixtures and application of Fourier transform infrared spectroscopy (FTIRS) and partial least squares regression (PLSR) modeling. In contrast to current FTIRS applications for quantifying BSi, this new calibration is independent from conventional wet-chemical techniques and their associated measurement uncertainties. This approach also removes the need for developing internal calibrations between the two methods for individual sediment records. For the independent calibration, we produced six series of different synthetic sediment mixtures using two purified diatom extracts, with one extract mixed with quartz sand, calcite, 60/40 quartz/calcite and two different natural sediments, and a second extract mixed with one of the natural sediments. A total of 306 samples (51 samples per series) yielded BSi contents ranging from 0 to 100%. The resulting PLSR calibration model between the FTIR spectral information and the defined BSi concentration of the synthetic sediment mixtures exhibits a strong cross-validated correlation (R²cv = 0.97) and a low root-mean-square error of cross-validation (RMSECV = 4.7%). Application of the independent calibration to natural lacustrine and marine sediments yields robust BSi reconstructions. At present, the synthetic mixtures do not include the variation in organic matter that occurs in natural samples, which may explain the somewhat lower prediction accuracy of the calibration model for organic-rich samples.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Histopathologic determination of tumor regression provides important prognostic information for locally advanced gastroesophageal carcinomas after neoadjuvant treatment. Regression grading systems mostly refer to the amount of therapy-induced fibrosis in relation to residual tumor or the estimated percentage of residual tumor in relation to the former tumor site. Although these methods are generally accepted, currently there is no common standard for reporting tumor regression in gastroesophageal cancers. We compared the application of these 2 major principles for assessment of tumor regression: hematoxylin and eosin-stained slides from 89 resection specimens of esophageal adenocarcinomas following neoadjuvant chemotherapy were independently reviewed by 3 pathologists from different institutions. Tumor regression was determined by the 5-tiered Mandard system (fibrosis/tumor relation) and the 4-tiered Becker system (residual tumor in %). Interobserver agreement for the Becker system showed better weighted κ values compared with the Mandard system (0.78 vs. 0.62). Evaluation of the whole embedded tumor site showed improved results (Becker: 0.83; Mandard: 0.73) as compared with only 1 representative slide (Becker: 0.68; Mandard: 0.71). Modification into simplified 3-tiered systems showed comparable interobserver agreement but better prognostic stratification for both systems (log rank Becker: P=0.015; Mandard: P=0.03), with independent prognostic impact for overall survival (modified Becker: P=0.011, hazard ratio=3.07; modified Mandard: P=0.023, hazard ratio=2.72). In conclusion, both systems provide substantial to excellent interobserver agreement for estimation of tumor regression after neoadjuvant chemotherapy in esophageal adenocarcinomas. A simple 3-tiered system with the estimation of residual tumor in % (complete regression/1% to 50% residual tumor/>50% residual tumor) maintains the highest reproducibility and prognostic value.
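The weighted κ values quoted in this abstract measure agreement on an ordinal grading scale, where near-misses count as partial agreement. The abstract does not say which weighting scheme was used or how the three pathologists were combined, so the following is only a minimal sketch under two assumptions: linear weights and a single pair of raters. The function name and the integer encoding of grades (0 to n_categories - 1) are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for two raters.

    Grades are assumed to be integers 0 .. n_categories - 1.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of grade pairs
    marginal_a = Counter(rater_a)              # grade counts for rater A
    marginal_b = Counter(rater_b)              # grade counts for rater B

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Linear disagreement weight: 0 on the diagonal, 1 at maximal distance.
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * marginal_a[i] * marginal_b[j] / (n * n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical grade")
    return 1.0 - observed_disagreement / expected_disagreement
```

Identical gradings give 1.0, and gradings that agree no more often than chance would predict give 0.0.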

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: Recently, histopathological tumour regression, prevalence of signet ring cells, and localisation were reported as prognostic factors in neoadjuvantly treated oesophagogastric (junctional and gastric) cancer. This exploratory retrospective study analyses independent prognostic factors within a large patient cohort after preoperative chemotherapy including clinical and histopathological factors. METHODS: In all, 850 patients presenting with oesophagogastric cancer staged cT3/4 Nany cM0/x were treated with neoadjuvant chemotherapy followed by resection in two academic centres. Patient data were documented in a prospective database and retrospectively analysed. RESULTS: Of all factors prognostic on univariate analysis, only clinical response, complications, ypTNM stage, and R category were independently prognostic (P<0.01) on multivariate analysis. Tumour localisation and signet ring cells were independently prognostic only when investigator-dependent clinical response evaluation was excluded from the multivariate model. Histopathological tumour regression correlates with tumour grading, Laurén classification, clinical response, ypT, ypN, and R categories but was not identified as an independent prognostic factor. Within R0-resected patients only surgical complications and ypTNM stage were independent prognostic factors. CONCLUSIONS: Only established prognostic factors like ypTNM stage, R category, and complications were identified as independent prognostic factors in resected patients after neoadjuvant chemotherapy. In contrast, histopathological tumour regression was not found as an independent prognostic marker.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Graphical display of regression results has become increasingly popular in presentations and in scientific literature because graphs are often much easier to read than tables. Such plots can be produced in Stata by the marginsplot command (see [R] marginsplot). However, while marginsplot is versatile and flexible, it has two major limitations: it can only process results left behind by margins (see [R] margins), and it can handle only one set of results at a time. In this article, I introduce a new command called coefplot that overcomes these limitations. It plots results from any estimation command and combines results from several models into one graph. The default behavior of coefplot is to plot markers for coefficients and horizontal spikes for confidence intervals. However, coefplot can also produce other types of graphs. I illustrate the capabilities of coefplot by using a series of examples.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Sequence analysis and optimal matching are useful heuristic tools for the descriptive analysis of heterogeneous individual pathways such as educational careers, job sequences or patterns of family formation. However, to date it remains unclear how to handle the inevitable problems caused by missing values with regard to such analysis. Multiple Imputation (MI) offers a possible solution for this problem but it has not been tested in the context of sequence analysis. Against this background, we contribute to the literature by assessing the potential of MI in the context of sequence analyses using an empirical example. Methodologically, we draw upon the work of Brendan Halpin and extend it to additional types of missing value patterns. Our empirical case is a sequence analysis of panel data with substantial attrition that examines the typical patterns and the persistence of sex segregation in school-to-work transitions in Switzerland. The preliminary results indicate that MI is a valuable methodology for handling missing values due to panel mortality in the context of sequence analysis. MI is especially useful in facilitating a sound interpretation of the resulting sequence types.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The counterfactual decomposition technique popularized by Blinder (1973, Journal of Human Resources, 436–455) and Oaxaca (1973, International Economic Review, 693–709) is widely used to study mean outcome differences between groups. For example, the technique is often used to analyze wage gaps by sex or race. This article summarizes the technique and addresses several complications, such as the identification of effects of categorical predictors in the detailed decomposition or the estimation of standard errors. A new command called oaxaca is introduced, and examples illustrating its usage are given.
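To make the decomposition concrete, here is a minimal sketch of a two-fold split of a mean outcome gap into an "explained" part (differences in predictor means) and an "unexplained" part (differences in coefficients). It is not the oaxaca command: it assumes a single predictor, plain OLS, and group B's coefficients as the reference, and all function names are hypothetical.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of a one-predictor OLS fit."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def twofold_decomposition(x_a, y_a, x_b, y_b):
    """Split mean(y_a) - mean(y_b) into explained and unexplained parts.

    Assumption: group B's coefficients serve as the reference structure.
    """
    intercept_a, slope_a = ols_line(x_a, y_a)
    intercept_b, slope_b = ols_line(x_b, y_b)

    gap = mean(y_a) - mean(y_b)
    # Part of the gap due to different predictor means, priced at B's slope.
    explained = (mean(x_a) - mean(x_b)) * slope_b
    # Part due to different coefficients, evaluated at A's predictor mean.
    unexplained = (intercept_a - intercept_b) + mean(x_a) * (slope_a - slope_b)
    return gap, explained, unexplained
```

Because an OLS line passes through the point of means, the two parts add up to the gap exactly. Choosing group A's coefficients, or pooled coefficients, as the reference would shift the split between the two parts, which is one of the identification issues the abstract alludes to.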

Relevance:

20.00% 20.00%

Publisher:

Abstract:

estout, introduced by Jann (Stata Journal 5: 288–308), is a useful tool for producing regression tables from stored estimates. However, its syntax is relatively complex and commands can become long even for simple tables. Furthermore, having to store the estimates beforehand can be cumbersome. To facilitate the production of regression tables, I therefore present here two new commands called eststo and esttab. eststo is a wrapper for official Stata’s estimates store and simplifies the storing of estimation results for tabulation. esttab, on the other hand, is a wrapper for estout and simplifies compiling nice-looking tables from the stored estimates without much typing. I also provide updates to estout and estadd.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Organizing and archiving statistical results and processing a subset of those results for publication are important and often underestimated issues in conducting statistical analyses. Because automation of these tasks is often poor, processing results produced by statistical packages is quite laborious and vulnerable to error. I will therefore present a new package called estout that facilitates and automates some of these tasks. This new command can be used to produce regression tables for use with spreadsheets, LaTeX, HTML, or word processors. For example, the results for multiple models can be organized in spreadsheets and can thus be archived in an orderly manner. Alternatively, the results can be directly saved as a publication-ready table for inclusion in, for example, a LaTeX document. estout is implemented as a wrapper for estimates table but has many additional features, such as support for mfx. However, despite its flexibility, estout is—I believe—still very straightforward and easy to use. Furthermore, estout can be customized via so-called defaults files. A tool to make available supplementary statistics called estadd is also provided.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we propose a new fully automatic method for localizing and segmenting 3D intervertebral discs from MR images, where the two problems are solved in a unified data-driven regression and classification framework. We estimate the output (image displacements for localization, or foreground/background labels for segmentation) of image points by exploiting both training data and geometric constraints simultaneously. The problem is formulated in a unified objective function which is then solved globally and efficiently. We validate our method on MR images of 25 patients. Taking manually labeled data as the ground truth, our method achieves a mean localization error of 1.3 mm, a mean Dice metric of 87%, and a mean surface distance of 1.3 mm. Our method can be applied to other localization and segmentation tasks.
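The Dice metric reported here is a standard overlap score between a predicted segmentation and a reference one. As a minimal sketch, assuming each segmentation is represented as a collection of foreground voxel coordinates (a representation chosen for illustration, not taken from the paper), it can be computed as follows; the function name is hypothetical.

```python
def dice_coefficient(predicted, reference):
    """Dice overlap between two segmentations given as foreground voxel sets."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # two empty segmentations agree trivially
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))
```

A value of 1.0 means the two voxel sets coincide, and 0.0 means they share no voxel.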

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In clinical practice, traditional X-ray radiography is widely used, and knowledge of landmarks and contours in anteroposterior (AP) pelvis X-rays is invaluable for computer-aided diagnosis, hip surgery planning and image-guided interventions. This paper presents a fully automatic approach for landmark detection and shape segmentation of both pelvis and femur in conventional AP X-ray images. Our approach is based on the framework of landmark detection via Random Forest (RF) regression and shape regularization via hierarchical sparse shape composition. We propose a visual feature, FL-HoG (Flexible-Level Histogram of Oriented Gradients), and a feature selection algorithm based on trace ratio optimization to improve the robustness and the efficacy of RF-based landmark detection. The landmark detection result is then used in a hierarchical sparse shape composition framework for shape regularization. Finally, the extracted shape contour is fine-tuned by a post-processing step based on low-level image features. The experimental results demonstrate that our feature selection algorithm reduces the feature dimension by a factor of 40 and improves both training and test efficiency. Further experiments conducted on 436 clinical AP pelvis X-rays show that our approach achieves an average point-to-curve error of around 1.2 mm for femur and 1.9 mm for pelvis.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

robreg provides a number of robust estimators for linear regression models. Among them are the high breakdown-point and high efficiency MM-estimator, the Huber and bisquare M-estimator, and the S-estimator, each supporting classic or robust standard errors. Furthermore, basic versions of the LMS/LQS (least median of squares) and LTS (least trimmed squares) estimators are provided. Note that the moremata package, also available from SSC, is required.
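To illustrate what a Huber M-estimator does, the sketch below fits a one-predictor line by iteratively reweighted least squares, downweighting observations with large residuals. It is not robreg's implementation: the single predictor, the tuning constant k = 1.345, the MAD-based scale re-estimated at every iteration, and the fixed iteration count are all assumptions made for illustration, and the function names are hypothetical.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Return (intercept, slope) of a weighted least-squares line."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def huber_line_fit(x, y, k=1.345, iterations=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    weights = [1.0] * len(x)  # the first pass is ordinary least squares
    for _ in range(iterations):
        intercept, slope = weighted_line_fit(x, y, weights)
        residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual, normal-consistent.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break  # at least half the points are fitted exactly
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]
    return intercept, slope
```

With points on the line y = 1 + 2x and one gross vertical outlier in the middle of the x range, ordinary least squares shifts the intercept noticeably, while this fit returns to an intercept near 1 and a slope near 2. Huber weighting guards against outlying responses but not against outlying predictor values, which is one reason high breakdown-point estimators such as those listed in the abstract exist.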

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: A cost-effective strategy to increase the density of available markers within a population is to sequence a small proportion of the population and impute whole-genome sequence data for the remaining population. Increased densities of typed markers are advantageous for genome-wide association studies (GWAS) and genomic predictions. METHODS: We obtained genotypes for 54 602 SNPs (single nucleotide polymorphisms) in 1077 Franches-Montagnes (FM) horses and Illumina paired-end whole-genome sequencing data for 30 FM horses and 14 Warmblood horses. After variant calling, the sequence-derived SNP genotypes (~13 million SNPs) were used for genotype imputation with the software programs Beagle, Impute2 and FImpute. RESULTS: The mean imputation accuracy of FM horses using Impute2 was 92.0%. Imputation accuracy using Beagle and FImpute was 74.3% and 77.2%, respectively. In addition, for Impute2 we determined the imputation accuracy of all individual horses in the validation population, which ranged from 85.7% to 99.8%. The subsequent inclusion of Warmblood sequence data further increased the correlation between true and imputed genotypes for most horses, especially for horses with a high level of admixture. The final imputation accuracy of the horses ranged from 91.2% to 99.5%. CONCLUSIONS: Using Impute2, the imputation accuracy was higher than 91% for all horses in the validation population, which indicates that direct imputation of 50k SNP-chip data to sequence level genotypes is feasible in the FM population. The individual imputation accuracy depended mainly on the applied software and the level of admixture.