990 results for Computerized adaptive testing
Abstract:
The aim of this paper is to present a new class of smoothness testing strategies in the context of hp-adaptive refinements based on continuous Sobolev embeddings. In addition to deriving a modified form of the 1d smoothness indicators introduced in [26], they will be extended and applied to a higher dimensional framework. A few numerical experiments in the context of the hp-adaptive FEM for a linear elliptic PDE will be performed.
Abstract:
BACKGROUND Long-term hormone therapy has been the standard of care for advanced prostate cancer since the 1940s. STAMPEDE is a randomised controlled trial using a multiarm, multistage platform design. It recruits men with high-risk, locally advanced, metastatic or recurrent prostate cancer who are starting first-line long-term hormone therapy. We report primary survival results for three research comparisons testing the addition of zoledronic acid, docetaxel, or their combination to standard of care versus standard of care alone. METHODS Standard of care was hormone therapy for at least 2 years; radiotherapy was encouraged for men with N0M0 disease to November, 2011, then mandated; radiotherapy was optional for men with node-positive non-metastatic (N+M0) disease. Stratified randomisation (via minimisation) allocated men 2:1:1:1 to standard of care only (SOC-only; control), standard of care plus zoledronic acid (SOC + ZA), standard of care plus docetaxel (SOC + Doc), or standard of care with both zoledronic acid and docetaxel (SOC + ZA + Doc). Zoledronic acid (4 mg) was given for six 3-weekly cycles, then 4-weekly until 2 years, and docetaxel (75 mg/m²) for six 3-weekly cycles with prednisolone 10 mg daily. There was no blinding to treatment allocation. The primary outcome measure was overall survival. Pairwise comparisons of research versus control had 90% power at 2·5% one-sided α for hazard ratio (HR) 0·75, requiring roughly 400 control arm deaths. Statistical analyses were undertaken with standard log-rank-type methods for time-to-event data, with hazard ratios (HRs) and 95% CIs derived from adjusted Cox models. This trial is registered at ClinicalTrials.gov (NCT00268476) and ControlledTrials.com (ISRCTN78818544). FINDINGS 2962 men were randomly assigned to four groups between Oct 5, 2005, and March 31, 2013. Median age was 65 years (IQR 60-71). 1817 (61%) men had M+ disease, 448 (15%) had N+/X M0, and 697 (24%) had N0M0.
165 (6%) men were previously treated with local therapy, and median prostate-specific antigen was 65 ng/mL (IQR 23-184). Median follow-up was 43 months (IQR 30-60). There were 415 deaths in the control group (347 [84%] prostate cancer). Median overall survival was 71 months (IQR 32 to not reached) for SOC-only, not reached (32 to not reached) for SOC + ZA (HR 0·94, 95% CI 0·79-1·11; p=0·450), 81 months (41 to not reached) for SOC + Doc (0·78, 0·66-0·93; p=0·006), and 76 months (39 to not reached) for SOC + ZA + Doc (0·82, 0·69-0·97; p=0·022). There was no evidence of heterogeneity in treatment effect (for any of the treatments) across prespecified subsets. Grade 3-5 adverse events were reported for 399 (32%) patients receiving SOC, 197 (32%) receiving SOC + ZA, 288 (52%) receiving SOC + Doc, and 269 (52%) receiving SOC + ZA + Doc. INTERPRETATION Zoledronic acid showed no evidence of survival improvement and should not be part of standard of care for this population. Docetaxel chemotherapy, given at the time of long-term hormone therapy initiation, showed evidence of improved survival accompanied by an increase in adverse events. Docetaxel treatment should become part of standard of care for adequately fit men commencing long-term hormone therapy. FUNDING Cancer Research UK, Medical Research Council, Novartis, Sanofi-Aventis, Pfizer, Janssen, Astellas, NIHR Clinical Research Network, Swiss Group for Clinical Cancer Research.
Abstract:
A Monte Carlo simulation was conducted to investigate parameter estimation and hypothesis testing in some well-known adaptive randomization procedures. The four urn models studied are Randomized Play-the-Winner (RPW), Randomized Pólya Urn (RPU), Birth and Death Urn with Immigration (BDUI), and Drop-the-Loser Urn (DL). Two sequential estimation methods, the sequential maximum likelihood estimation (SMLE) and the doubly adaptive biased coin design (DABC), are simulated at three optimal allocation targets that minimize the expected number of failures under the assumption of constant variance of simple difference (RSIHR), relative risk (ORR), and odds ratio (OOR) respectively. The log likelihood ratio test and three Wald-type tests (simple difference, log of relative risk, log of odds ratio) are compared in different adaptive procedures. Simulation results indicate that although RPW is slightly better in assigning more patients to the superior treatment, the DL method is considerably less variable and the test statistics have better normality. When compared with SMLE, DABC has a slightly higher overall response rate with lower variance, but has larger bias and variance in parameter estimation. Additionally, the test statistics in SMLE have better normality and a lower type I error rate, and the power of hypothesis testing is more comparable with that of equal randomization. Usually, RSIHR has the highest power among the three optimal allocation ratios. However, the ORR allocation has better power and a lower type I error rate when the log of relative risk is the test statistic. The number of expected failures in ORR is smaller than in RSIHR. It is also shown that the simple difference of response rates has the worst normality among all four test statistics. The power of the hypothesis test is always inflated when the simple difference is used. On the other hand, the normality of the log likelihood ratio test statistic is robust against the change of adaptive randomization procedures.
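As an illustration of the first urn model named above, the following is a minimal sketch of a Randomized Play-the-Winner urn in its commonly described form: a ball is drawn with replacement to pick the arm, a success adds balls for the assigned arm, and a failure adds balls for the opposite arm. The function name, the two-arm labels "A" and "B", and the parameters `initial_balls` and `added_balls` are assumptions made for this example and are not taken from the study.

```python
import random


def simulate_rpw(n_patients, p_success, initial_balls=1, added_balls=1, seed=0):
    """Simulate one two-arm trial under a Randomized Play-the-Winner urn.

    p_success maps each arm label ("A", "B") to an assumed true success
    probability. Returns (patients assigned per arm, final urn composition).
    """
    rng = random.Random(seed)
    urn = {"A": initial_balls, "B": initial_balls}
    assigned = {"A": 0, "B": 0}
    for _ in range(n_patients):
        # Draw a ball with replacement: arm A is chosen with its share of the urn.
        arm = "A" if rng.random() < urn["A"] / (urn["A"] + urn["B"]) else "B"
        other = "B" if arm == "A" else "A"
        assigned[arm] += 1
        success = rng.random() < p_success[arm]
        # A success rewards the assigned arm; a failure rewards the opposite arm.
        urn[arm if success else other] += added_balls
    return assigned, urn


if __name__ == "__main__":
    # Hypothetical response rates, chosen only for demonstration.
    print(simulate_rpw(100, {"A": 0.7, "B": 0.4}))
```

Repeating the call over many seeds would give the kind of Monte Carlo estimate of allocation proportion and variability that the abstract compares across procedures.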
Abstract:
Tuberculosis (TB) is an infectious disease of great public health importance, particularly to institutions that provide health care to large numbers of TB patients such as Parkland Hospital in Dallas, TX. The purpose of this retrospective chart review was to analyze differences in TB positive and TB negative patients to better understand whether or not there were variables that could be utilized to develop a predictive model for use in the emergency department to reduce the overall number of suspected TB patients being sent to respiratory isolation for TB testing. This study included patients who presented to the Parkland Hospital emergency department between November 2006 and December 2007 and were isolated and tested for TB. Outcome of TB was defined as a positive sputum AFB test or a positive M. tuberculosis culture result. Data were collected utilizing the UT Southwestern Medical Center computerized database OACIS and included demographic information, TB risk factors, physical symptoms, and clinical results. Only two variables were significantly (P<0.05) related to TB outcome: dyspnea (shortness of breath) (P<0.001) and abnormal x-ray (P<0.001). Marginally significant variables included hemoptysis (P=0.06), weight loss (P=0.11), night sweats (P=0.20), history of homelessness or incarceration (P=0.15), and history of positive skin PPD (P=0.19). Using a combination of significant and marginally significant variables, a predictive model was designed which demonstrated a specificity of 24% and a sensitivity of 70%. In conclusion, a predictive model for TB outcome based on patients who presented to the Parkland Hospital emergency department between November 2006 and December 2007 was unsuccessful given the limited number of variables that differed significantly between TB positive and TB negative patients. It is suggested that a future prospective cohort study should be implemented to collect data on TB positive and TB negative patients. 
A more thorough prospective collection of data may lead to clearer comparisons between TB positive and TB negative patients and ultimately to the design of a more sensitive predictive model for TB outcome.
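For readers less familiar with the two performance measures quoted above, the sketch below shows how sensitivity and specificity follow from a 2x2 table of predictions against outcomes. The counts are hypothetical and were picked only so that the arithmetic lands on the reported 70% and 24%; they are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of TB-positive patients the model flags
    specificity = tn / (tn + fp)  # share of TB-negative patients the model clears
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the chart review).
sens, spec = sensitivity_specificity(tp=7, fn=3, tn=24, fp=76)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A specificity of 24% means roughly three of every four TB-negative patients would still be flagged, so a model at that level would not noticeably reduce isolation.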
Abstract:
My dissertation focuses mainly on Bayesian adaptive designs for phase I and phase II clinical trials. It includes three specific topics: (1) proposing a novel two-dimensional dose-finding algorithm for biological agents, (2) developing Bayesian adaptive screening designs to provide more efficient and ethical clinical trials, and (3) incorporating missing late-onset responses to make an early stopping decision. Treating patients with novel biological agents is becoming a leading trend in oncology. Unlike cytotoxic agents, for which toxicity and efficacy monotonically increase with dose, biological agents may exhibit non-monotonic patterns in their dose-response relationships. Using a trial with two biological agents as an example, we propose a phase I/II trial design to identify the biologically optimal dose combination (BODC), which is defined as the dose combination of the two agents with the highest efficacy and tolerable toxicity. A change-point model is used to reflect the fact that the dose-toxicity surface of the combinational agents may plateau at higher dose levels, and a flexible logistic model is proposed to accommodate the possible non-monotonic pattern for the dose-efficacy relationship. During the trial, we continuously update the posterior estimates of toxicity and efficacy and assign patients to the most appropriate dose combination. We propose a novel dose-finding algorithm to encourage sufficient exploration of untried dose combinations in the two-dimensional space. Extensive simulation studies show that the proposed design has desirable operating characteristics in identifying the BODC under various patterns of dose-toxicity and dose-efficacy relationships. Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. 
To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multi-arm balanced factorial trial design. The proposed design yields a significantly higher probability for selecting the best treatment while at the same time allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for the trials combining multiple agents and screening out the efficacious combination to be further investigated. The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while at the same time providing higher power to identify the best treatment at the end of the trial. Phase II trials are usually single-arm trials conducted to test the efficacy of experimental agents and to decide whether the agents are promising enough to be sent to phase III trials. Interim monitoring is employed to stop the trial early for futility, to avoid assigning an unacceptable number of patients to inferior treatments. We propose a Bayesian single-arm phase II design with continuous monitoring for estimating the response rate of the experimental drug.
To address the issue of late-onset responses, we use a piece-wise exponential model to estimate the hazard function of time to response data and handle the missing responses using the multiple imputation approach. We evaluate the operating characteristics of the proposed method through extensive simulation studies. We show that the proposed method reduces the total length of the trial duration and yields desirable operating characteristics for different physician-specified lower bounds of response rate with different true response rates.
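The idea of allocating patients according to current posterior probabilities, described for the screening design above, can be sketched with a simple stand-in model. The code below is not the dissertation's hypothesis-testing formulation: it assumes binary responses, independent arms, and a uniform Beta(1, 1) prior per arm, and it estimates by Monte Carlo the posterior probability that each arm has the highest response rate. The function name and the example counts are hypothetical.

```python
import random


def prob_each_arm_is_best(outcomes, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(arm has the highest response rate | data).

    outcomes is a list of (successes, failures), one pair per arm. Each arm is
    given an independent Beta(1 + successes, 1 + failures) posterior, which is
    an assumption of this sketch.
    """
    rng = random.Random(seed)
    wins = [0] * len(outcomes)
    for _ in range(n_draws):
        # One joint posterior draw of the response rates of all arms.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in outcomes]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


if __name__ == "__main__":
    # Hypothetical interim data for three treatment combinations.
    print(prob_each_arm_is_best([(2, 18), (15, 5), (8, 12)]))
```

The returned probabilities could serve directly as allocation weights for the next patient, which is one simple way to send more patients to arms that currently look better.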
Abstract:
The AUTOFLY-Aid project aims to develop and demonstrate novel automation support algorithms and tools for the flight crew for flight-critical collision avoidance using “dynamic 4D trajectory management”. The automation support system is envisioned to improve on the primary shortcomings of TCAS, and to aid the pilot through add-on avionics/head-up displays and reality augmentation devices in dynamically evolving collision avoidance scenarios. The main innovative theoretical concepts to be developed by the AUTOFLY-Aid project are a) design and development of the mathematical models of the full composite airspace picture from the flight deck’s perspective, as seen/measured/informed by the aircraft flying in SESAR 2020, b) design and development of a dynamic trajectory planning algorithm that can generate in real time (on the order of seconds) flyable (i.e. dynamically and performance-wise feasible) alternative trajectories across the evolving stochastic composite airspace picture (which includes new conflicts, blunder risks, terrain and weather limitations) and c) development and testing of the Collision Avoidance Automation Support System on a Boeing 737 NG FNPT II Flight Simulator with synthetic vision and reality augmentation, while providing the flight crew with a quantified and visual understanding of collision risks in terms of time and directions and countermeasures.
Abstract:
The choice value and the testing process against the vigilance parameter, both characteristic of the ART neural network, are merged. Only a single test is required to determine whether a committed category node can represent the current input. The advantages of APT over ART are: (1) it avoids testing every committed category node before deciding whether to train a committed category node or commit a new node, (2) the vigilance parameter is fixed during training, and (3) the choice value parameter is eliminated.
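For context, the conventional two-step search that the abstract says is merged can be sketched as follows. The abstract does not state which ART variant is meant, so this example assumes the widely used Fuzzy ART formulas (choice value |x ∧ w| / (α + |w|), match value |x ∧ w| / |x| compared with the vigilance ρ). It shows the baseline ART-style search only, not the APT method, and all function names are hypothetical.

```python
def fuzzy_and(x, w):
    """Component-wise minimum, the fuzzy AND used in Fuzzy ART."""
    return [min(a, b) for a, b in zip(x, w)]


def norm1(v):
    """L1 norm for vectors with non-negative components."""
    return sum(v)


def choice_value(x, w, alpha=0.001):
    """Step 1: how strongly category weights w respond to input x."""
    return norm1(fuzzy_and(x, w)) / (alpha + norm1(w))


def passes_vigilance(x, w, rho):
    """Step 2: does the category match x well enough (x must be non-zero)?"""
    return norm1(fuzzy_and(x, w)) / norm1(x) >= rho


def find_category(x, weights, rho, alpha=0.001):
    """Try committed categories in order of choice value until one passes.

    Returns the index of the first category that passes the vigilance test,
    or None when a new category node would have to be committed.
    """
    order = sorted(range(len(weights)),
                   key=lambda j: choice_value(x, weights[j], alpha),
                   reverse=True)
    for j in order:
        if passes_vigilance(x, weights[j], rho):
            return j
    return None
```

In the worst case this loop tests every committed node before giving up, which is the cost that the first listed advantage refers to.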
Abstract:
HTTP adaptive streaming (HAS) has become a reliable distribution technology offering significant advantages in terms of both user-perceived Quality of Experience (QoE) and resource utilization for content and network service providers. By trading off video quality, HAS is able to adapt to the available bandwidth and display requirements so that it can deliver video content to a variety of devices over the Internet. However, there is still not enough knowledge of how the adaptation techniques affect the end user's visual experience. Therefore, this paper presents a comparative analysis of different bitrate adaptation strategies in adaptive streaming of monoscopic and stereoscopic video. This has been done through a subjective experiment testing the end-user response to video quality variations, taking visual comfort into account. The experimental outcomes provide good insight into the factors that can influence the QoE of different adaptation strategies.
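To make the notion of a bitrate adaptation strategy concrete, here is a minimal sketch of one simple, generic rule: pick the highest representation whose bitrate fits within a safety fraction of the measured throughput. This is not one of the strategies evaluated in the paper; the function name, the bitrate ladder, and the `safety` margin are assumptions for illustration.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a share of the throughput.

    Falls back to the lowest representation when nothing fits, so playback
    can continue at reduced quality instead of stalling.
    """
    budget = safety * throughput_kbps
    affordable = [b for b in sorted(ladder_kbps) if b <= budget]
    return affordable[-1] if affordable else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [500, 1000, 2500, 5000]  # hypothetical representations in kbit/s
    for measured in (6500, 3000, 900, 100):
        print(measured, "->", select_bitrate(ladder, measured))
```

Applying the rule once per segment to a varying throughput trace produces the quality switches whose perceptual effect the subjective experiment studies.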
Abstract:
Optimizing the Quality of Experience (QoE) of HTTP adaptive video streaming (HAS) is receiving increasing attention nowadays. The growth of interest is mainly caused by the fact that current HAS solutions are not QoE-driven, i.e. end-user quality perception is not an integral part of the adaptation logic.
However, obtaining the necessary reliable ground truths on HAS QoE faces substantial challenges, since the subjective video quality assessment methodologies as proposed by current standards are not well-suited for dealing with the time-varying quality properties that are characteristic for HAS. This thesis investigates the influence of dynamic quality adaptation on the QoE of streaming video by means of subjective evaluation approaches. Based on a comprehensive survey of related work on subjective HAS QoE assessment, the related challenges and open research questions are highlighted and discussed. As a result, two main research directions are selected for further investigation: analysis of the QoE impact of different technical adaptation parameters, and investigation of testing methodologies suitable for HAS QoE evaluation. In order to investigate related research issues and questions, a set of laboratory experiments have been conducted using different subjective testing methodologies. Our statistical analysis demonstrates that not all assumptions and claims reported in the literature are robust, particularly as regards the QoE impact of switching frequency, smooth vs. abrupt switching, and quality oscillation. On the other hand, our results confirm the influence of some other parameters such as chunk length and switching amplitude on perceived quality. We also show that taking the objective characteristics of the content into account can be beneficial to improve the adaptation viewing experience. In addition, all aforementioned findings are validated by means of an extensive cross-experimental analysis that involves external laboratory and crowdsourcing studies. Finally, to address the methodological aspects of subjective QoE testing, a comparison between the experimental results obtained from a (short stimuli-based) ACR standardized method and a semi-continuous method (developed for assessment of long video sequences) has been performed. 
Although some differences were observed, the statistical analysis does not show any significant effect of testing methodology. Similarly, although the influence of audio presence on the evaluation of video-related degradations is perceived, no statistically significant effect of audio presence could be found. Motivated by this finding (no effect of testing method and audio presence), a subsequent analysis has been performed investigating the impact of performing multiple statistical comparisons on statistical levels of significance, which increases the likelihood of Type-I errors (false positives). Our results show that in order to obtain a strong effect from the statistical analysis of the subjective results, it is necessary to increase the number of test subjects well beyond the sample sizes proposed by current quality assessment standards and recommendations.
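The point about multiple comparisons inflating Type-I errors can be illustrated with a short calculation. The sketch below assumes independent tests, which is a simplification, and uses the textbook Bonferroni correction as an example remedy; the thesis does not state which correction it applied.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate <= alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for m in (1, 5, 10, 20):
        print(m, round(familywise_error_rate(0.05, m), 3), bonferroni_alpha(0.05, m))
```

Because the corrected per-test level shrinks as comparisons are added, detecting the same effect requires more test subjects, consistent with the conclusion above.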
Abstract:
HTTP adaptive streaming (HAS) has become widespread in multimedia services because it allows service providers to improve network resource utilization and the user's Quality of Experience (QoE). With this technology, video playback interruptions are reduced, since the HAS client takes the network and server status, as well as the capability of the user device, into account to adapt the quality to the current conditions. Adaptation can be done using different strategies. In order to provide optimal QoE, the perceptual impact of adaptation strategies from the user's point of view should be studied. However, the time-varying video quality due to adaptation, which usually takes place over a long interval, introduces a new type of impairment that makes the subjective evaluation of adaptive streaming systems challenging. The contribution of this paper is two-fold. First, it investigates the testing methodology for evaluating HAS QoE by comparing the subjective experimental outcomes obtained from the standardized ACR method and a semi-continuous method developed to evaluate long sequences. In addition, the influence of using audiovisual stimuli to evaluate video-related impairments is examined. Second, the impact of some technical adaptation factors, including the quality switching amplitude and chunk size, in combination with a wide range of commercial content types is investigated. The results of this study provide good insight toward achieving an appropriate testing method to evaluate HAS QoE, in addition to designing switching strategies with optimal visual quality.
Abstract:
The behaviour of self-adaptive systems can be emergent. The difficulty in predicting the system's behaviour means that there is scope for the system to surprise its customers and its developers. Because its behaviour is emergent, a self-adaptive system needs to garner confidence in its customers and it needs to resolve any surprise on the part of the developer during testing and maintenance. We believe that these two functions can only be achieved if a self-adaptive system is also capable of self-explanation. We argue a self-adaptive system's behaviour needs to be explained in terms of satisfaction of its requirements. Since self-adaptive system requirements may themselves be emergent, a means needs to be found to explain the current behaviour of the system and the reasons that brought that behaviour about. We propose the use of goal-based models during runtime to offer self-explanation of how a system is meeting its requirements, and why the means of meeting these were chosen. We discuss the results of early experiments in self-explanation, and set out future work. © 2012 C.E.S.A.M.E.S.
Abstract:
The behaviour of self adaptive systems can be emergent, which means that the system’s behaviour may be seen as unexpected by its customers and its developers. Therefore, a self-adaptive system needs to garner confidence in its customers and it also needs to resolve any surprise on the part of the developer during testing and maintenance. We believe that these two functions can only be achieved if a self-adaptive system is also capable of self-explanation. We argue a self-adaptive system’s behaviour needs to be explained in terms of satisfaction of its requirements. Since self-adaptive system requirements may themselves be emergent, we propose the use of goal-based requirements models at runtime to offer self-explanation of how a system is meeting its requirements. We demonstrate the analysis of run-time requirements models to yield a self-explanation codified in a domain specific language, and discuss possible future work.
Abstract:
Distance teaching (DT) in technical education has a number of distinctive features: complex informational content, the need to develop simulation models and trainers for practical and laboratory classes, knowledge diagnostics based on mathematical algorithms, and the organization of collective applied projects. To develop the teaching of the fundamentals of the control systems discipline Theory of Automatic Control (TAC), a combined approach was chosen that optimally combines existing DT support software with the authors' own developments. The DT TAC system includes a distance course (DC) on TAC, a site of virtual laboratory practical work (LAB.TAC), and a remote diagnostic system for student knowledge (d-tester).
Abstract:
A new LIBS quantitative analysis method based on adaptive selection of analytical lines and a Relevance Vector Machine (RVM) regression model is proposed. First, a scheme for adaptively selecting analytical lines is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines that will be used as input variables of the regression model are determined adaptively according to the samples for both training and testing. Second, a LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentrations are given in the form of a confidence interval of a probability distribution, which is helpful for evaluating the uncertainty contained in the measured spectra. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experimental results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness than the methods based on partial least squares regression, artificial neural networks and the standard support vector machine.
Abstract:
Previous work has demonstrated that planning behaviours may be more adaptive than avoidance strategies in driving self-regulation, but ways of encouraging planning have not been investigated. The efficacy of an intervention based on an extended theory of planned behaviour (TPB) plus implementation intentions to promote planning self-regulation in drivers across the lifespan was tested. An age-stratified group of participants (N=81, aged 18-83 years) was randomly assigned to an experimental or control condition. The intervention prompted specific goal setting with action planning and barrier identification. Goal setting was carried out using an agreed behavioural contract. Baseline and follow-up measures of TPB variables, self-reported driving self-regulation behaviours (avoidance and planning) and mobility goal achievements were collected using postal questionnaires. Like many previous efforts to change planned behaviour by changing its predictors using models of planned behaviour such as the TPB, results showed that the intervention did not significantly change any of the model components. However, more than 90% of participants achieved their primary driving goal, and self-regulation planning as measured on a self-regulation inventory was marginally improved. The study demonstrates the role of pre-decisional, or motivational, components as contrasted with post-decisional goal enactment, and offers promise for the role of self-regulation planning and implementation intentions in assisting drivers in achieving their mobility goals and promoting safer driving across the lifespan, even in the context of unchanging beliefs such as perceived risk or driver anxiety.