3 results for regression algorithm

at Duke University


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Knowledge-based radiation treatment is an emerging concept in radiotherapy. It mainly refers to techniques that guide or automate clinical treatment planning by learning from prior knowledge. Different models have been developed to realize it, one of which was proposed by Yuan et al. at Duke for lung IMRT planning. This model can automatically determine both the beam configuration and the optimization objectives with non-coplanar beams based on patient-specific anatomical information. Although plans automatically generated by this model demonstrate equivalent or better dosimetric quality compared to clinically approved plans, its validity and generality are limited by the empirical assignment of a coefficient called the angle spread constraint, which is defined in the beam efficiency index used for beam ranking. To eliminate these limitations, a systematic study of this coefficient is needed to acquire evidence for its optimal value.

To achieve this purpose, eleven lung cancer patients with complex tumor shapes, whose clinically approved plans used non-coplanar beams, were retrospectively studied within the framework of the automatic lung IMRT treatment algorithm. The primary and boost plans used in three patients were treated as different cases because of their different target sizes and shapes. A total of 14 lung cases were thus re-planned using the knowledge-based automatic lung IMRT planning algorithm by varying the angle spread constraint from 0 to 1 in increments of 0.2. A modified beam angle efficiency index, used to navigate the beam selection, was adopted. Great efforts were made to ensure that the quality of the plans associated with every angle spread constraint was as good as possible. Important dosimetric parameters for the PTV and OARs, quantitatively reflecting the plan quality, were extracted from the DVHs and analyzed as a function of the angle spread constraint for each case. Comparisons of these parameters between clinical plans and model-based plans were evaluated by two-sample Student's t-tests, and a regression analysis was performed on a composite index, built on the percentage errors between dosimetric parameters in the model-based plans and those in the clinical plans, as a function of the angle spread constraint.
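
The regression step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the composite index or the regression model, so the unweighted mean of signed percentage errors and the quadratic fit used here are hypothetical choices, and the function names are invented for this example.

```python
import numpy as np

def percentage_error(model_value, clinical_value):
    """Signed percentage error of a model-based plan relative to the clinical plan."""
    return 100.0 * (model_value - clinical_value) / clinical_value

def composite_index(model_params, clinical_params):
    """ASSUMPTION: unweighted mean of the signed percentage errors over all
    dosimetric parameters. The study's actual definition is not given."""
    model = np.asarray(model_params, dtype=float)
    clinical = np.asarray(clinical_params, dtype=float)
    return float(np.mean(percentage_error(model, clinical)))

def fit_quadratic(constraints, index_values):
    """ASSUMPTION: least-squares quadratic regression of the composite index
    on the angle spread constraint. Returns the coefficients (a, b, c) of
    a*x**2 + b*x + c and the location of the fitted extremum (needs a != 0)."""
    a, b, c = np.polyfit(constraints, index_values, deg=2)
    vertex = -b / (2.0 * a)
    return (a, b, c), vertex

# The constraint grid used in the study: 0 to 1 in increments of 0.2.
constraints = np.linspace(0.0, 1.0, 6)
```

Given one composite index value per constraint setting, `fit_quadratic(constraints, values)` returns the fitted curve and the constraint value at its extremum. Whether that extremum is a minimum or a maximum depends on the sign of the leading coefficient.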

Results show that model-based plans generally have equivalent or better quality than clinically approved plans, both qualitatively and quantitatively. All dosimetric parameters in the automatically generated plans, except those for the lungs, are statistically better than or comparable to those in the clinical plans. On average, a reduction of more than 15% is observed in the conformity index and homogeneity index for the PTV and in V40 and V60 for the heart, while V5 and V20 for the lungs increase by 8% and 3%, respectively. The intra-plan comparison among model-based plans demonstrates that plan quality does not change much for angle spread constraints larger than 0.4. Further examination of the variation curve of the composite index as a function of the angle spread constraint shows that 0.6 is the optimal value, yielding statistically the best achievable plans.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to usual competitors.
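
The four steps of message listed above can be sketched as follows. This is a minimal serial illustration, not the thesis implementation: the lasso solver (plain coordinate descent), the tie rule for the median with an even number of subsets, and the names `lasso_cd` and `message_fit` are assumptions made for this example, and the per-subset loops stand in for parallel workers.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent:
    minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)              # copy; equals y - X @ beta throughout
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]   # remove feature j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]   # add feature j back
    return beta

def message_fit(X, y, n_subsets, lam):
    """Sample-space partitioning: select per subset, take the median
    inclusion, refit per subset, then average."""
    n, p = X.shape
    row_blocks = np.array_split(np.arange(n), n_subsets)
    # Step 1: feature selection on each subset (here: lasso support).
    inclusion = np.array([lasso_cd(X[idx], y[idx], lam) != 0 for idx in row_blocks])
    # Step 2: median inclusion index. ASSUMPTION: a tie (median 0.5) excludes.
    selected = np.median(inclusion.astype(float), axis=0) > 0.5
    # Step 3: least-squares coefficients for the selected features, per subset.
    coefs = np.zeros((n_subsets, p))
    if selected.any():
        for k, idx in enumerate(row_blocks):
            sol, *_ = np.linalg.lstsq(X[idx][:, selected], y[idx], rcond=None)
            coefs[k, selected] = sol
    # Step 4: average the subset estimates.
    return selected, coefs.mean(axis=0)
```

Only the inclusion vectors and the coefficient vectors would need to be exchanged between workers, which is where the low communication cost comes from.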

While sample space partitioning is useful in handling datasets with large sample sizes, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
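
A rough sketch of the partition, decorrelate and fit pipeline is given below. The abstract does not state the decorrelation operator, so the sketch assumes F = sqrt(p) (X X^T + r I)^(-1/2) applied to both the features and the response. The ridge parameter `ridge`, the lasso solver, and the function names are likewise assumptions for illustration, and the loop over column groups stands in for the m distributed workers.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent:
    minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)              # copy; equals y - X @ beta throughout
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta

def decorrelate(X, y, ridge=1.0):
    """ASSUMED operator: F = sqrt(p) * (X X^T + ridge * I)^(-1/2).
    Returns the transformed pair (F X, F y)."""
    n, p = X.shape
    gram = X @ X.T + ridge * np.eye(n)           # symmetric positive definite
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Inverse matrix square root through the eigendecomposition.
    F = np.sqrt(p) * (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    return F @ X, F @ y

def deco_fit(X, y, n_workers, lam, ridge=1.0):
    """Feature-space partitioning: each worker fits only its own columns
    of the decorrelated design against the decorrelated response."""
    n, p = X.shape
    X_t, y_t = decorrelate(X, y, ridge)
    beta = np.zeros(p)
    for cols in np.array_split(np.arange(p), n_workers):
        beta[cols] = lasso_cd(X_t[:, cols], y_t, lam)
    return beta
```

When p exceeds n and `ridge` is small, the rows of the transformed design become nearly orthogonal. This is the property that lets each worker fit its own columns without seeing the others.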

For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), which leverages both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted on a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
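
The block structure described above can be illustrated with a short sketch. It shows only the two-level partitioning of the data matrix (rows first, then columns), not the fitting or the synthesis steps; the function name `partition_blocks` and the use of near-equal splits are assumptions made for this example.

```python
import numpy as np

def partition_blocks(data, n_row_blocks, n_col_blocks):
    """Split a data matrix into row cubes (sample space), then split each
    cube by columns (feature space). Returns a nested list of blocks."""
    row_cubes = np.array_split(data, n_row_blocks, axis=0)
    return [np.array_split(cube, n_col_blocks, axis=1) for cube in row_cubes]

# A toy 6 x 4 matrix split into 3 row cubes of 2 column blocks each.
data = np.arange(24).reshape(6, 4)
blocks = partition_blocks(data, n_row_blocks=3, n_col_blocks=2)
```

Each entry `blocks[i][j]` is small enough to be handled by one worker, and reassembling the nested list with `np.block(blocks)` gives back the original matrix.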