985 results for sequential-tests
Abstract:
In clinical trials, situations often arise where more than one response from each patient is of interest, and it is required that any decision to stop the study be based upon some or all of these measures simultaneously. Theory for the design of sequential experiments with simultaneous bivariate responses is described by Jennison and Turnbull (Jennison, C., Turnbull, B. W. (1993). Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints. Biometrics 49:741-752) and Cook and Farewell (Cook, R. J., Farewell, V. T. (1994). Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50:1146-1152) in the context of one efficacy and one safety response. These expositions are in terms of normally distributed data with known covariance. The methods proposed require specification of the correlation, ρ, between the test statistics monitored as part of the sequential test. It can be difficult to quantify ρ, and previous authors have suggested simply taking the lowest plausible value, as this will guarantee power. This paper begins with an illustration of the effect that inappropriate specification of ρ can have on the preservation of trial error rates. It is shown that both the type I error and the power can be adversely affected. As a possible solution to this problem, formulas are provided for the calculation of correlation from data collected as part of the trial. An adaptive approach that makes use of these formulas is proposed and evaluated, and an example is provided to illustrate the method. Attention is restricted to the bivariate case for ease of computation, although the formulas derived are applicable in the general multivariate case.
Abstract:
The paper studies stochastic approximation as a technique for bias reduction. The proposed method does not require approximating the bias explicitly, nor does it rely on having independent identically distributed (i.i.d.) data. The method always removes the leading bias term, under very mild conditions, as long as auxiliary samples from distributions with given parameters are available. Expectation and variance of the bias-corrected estimate are given. Examples in sequential clinical trials (non-i.i.d. case), curved exponential models (i.i.d. case) and length-biased sampling (where the estimates are inconsistent) are used to illustrate the applications of the proposed method and its small sample properties.
Abstract:
Designing and implementing thread-safe multithreaded libraries can be a daunting task, as developers of these libraries need to ensure that their implementations are free from concurrency bugs, including deadlocks. The usual practice involves employing software testing and/or dynamic analysis to detect deadlocks. The effectiveness of these techniques depends on well-designed multithreaded test cases. Unsurprisingly, developing multithreaded tests is significantly harder than developing sequential tests. In this paper, we address the problem of automatically synthesizing multithreaded tests that can induce deadlocks. The key insight of our approach is that a subset of the properties observed when a deadlock manifests in a concurrent execution can also be observed in a single-threaded execution. We design a novel, automatic, scalable and directed approach that identifies these properties and synthesizes a deadlock-revealing multithreaded test. The input to our approach is the library implementation under consideration, and the output is a set of deadlock-revealing multithreaded tests. We have implemented our approach as part of a tool named OMEN. OMEN is able to synthesize multithreaded tests on many multithreaded Java libraries. Applying a dynamic deadlock detector to the execution of the synthesized tests results in the detection of a number of deadlocks, including 35 real deadlocks in classes documented as thread-safe. Moreover, our experimental results show that dynamic analysis on multithreaded tests that are either synthesized randomly or developed by third-party programmers is ineffective in detecting the deadlocks.
Abstract:
Order relations can be documented through behavioral tests by means of the properties of asymmetry, transitivity, and connectivity. The emergence of sequential classes can be established in different ways, including through matching to sample with consistent pairing of stimuli and without immediate consequences. The present study sought to verify the effect of training with consistent pairing between visual stimuli on emergent performances. Five university students of both sexes underwent training of the conditional relations AB, AC, and AD. The participants' task was to respond ordinally to digits and abstract geometric shapes. The participants were then exposed to tests for ordering three different sequences of five stimuli. Three participants reached the accuracy criterion and showed consistent responding in the tests. The results indicated that the training was effective in establishing order relations between stimuli, and they replicate data from the literature on the establishment of sequential performance after matching-to-sample training.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
STUDY QUESTION Does intrauterine application of diluted seminal plasma (SP) at the time of ovum pick-up improve the pregnancy rate by ≥14% in IVF treatment? SUMMARY ANSWER Intrauterine instillation of diluted SP at the time of ovum pick-up is unlikely to increase the pregnancy rate by ≥14% in IVF. WHAT IS KNOWN ALREADY SP modulates endometrial function, and sexual intercourse around the time of embryo transfer has been suggested to increase the likelihood of pregnancy. A previous randomized double-blind pilot study demonstrated a strong trend towards increased pregnancy rates following the intracervical application of undiluted SP. As this study was not conclusive and as the finding could have been confounded by sexual intercourse, the intrauterine application of diluted SP was investigated in the present trial. STUDY DESIGN, SIZE, DURATION A single-centre, prospective, double-blind, placebo-controlled, randomized, superiority trial on women undergoing IVF was conducted from April 2007 until February 2012 at the University Department of Gynaecological Endocrinology and Reproductive Medicine, Heidelberg, Germany. PARTICIPANTS/MATERIALS, SETTING, METHODS The study was powered to detect a 14% increase in the clinical pregnancy rate, and two sequential tests were planned using the Pocock spending function. At the first interim analysis, 279 women had been randomly assigned to intrauterine diluted SP (20% SP in saline from the patients' partner) (n = 138) or placebo (n = 141) at the time of ovum pick-up. MAIN RESULTS AND THE ROLE OF CHANCE The clinical pregnancy rate per randomized patient was 37/138 (26.8%) in the SP group and 41/141 (29.1%) in the placebo group (difference: -2.3%, 95% confidence interval of the difference: -12.7 to +8.2%; P = 0.69). The live birth rate per randomized patient was 28/138 (20.3%) in the SP group and 33/141 (23.4%) in the placebo group (difference: -3.1%, 95% confidence interval of the difference: -12.7 to +6.6%; P = 0.56).
It was decided to terminate the trial due to futility at the first interim analysis, at a conditional power of 62%. LIMITATIONS, REASONS FOR CAUTION The confidence interval of the difference remains wide, thus clinically relevant differences cannot reliably be excluded based on this single study. WIDER IMPLICATIONS OF THE FINDINGS The results of this study cast doubt on the validity of the concept that SP increases endometrial receptivity and thus implantation in humans. STUDY FUNDING/COMPETING INTEREST(S) Funding was provided by the department's own research facilities. TRIAL REGISTRATION NUMBER DRKS00004615.
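The pregnancy-rate arithmetic reported in this abstract can be rechecked with a short sketch. The abstract does not say which interval method the authors used, so the normal-approximation (Wald) interval below is an assumption and may differ from the reported bounds in the last digit. The spending function is the Lan-DeMets Pocock-type form; the overall alpha of 0.05 and the information fraction of 0.5 are hypothetical values chosen for illustration, not figures from the trial.

```python
from math import e, log, sqrt
from statistics import NormalDist


def wald_diff_ci(x1, n1, x2, n2, level=0.95):
    """Difference in proportions (group 1 minus group 2) with a Wald interval.

    Assumption: the abstract does not name its interval method.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


def pocock_type_spend(t, alpha=0.05):
    """Cumulative type I error spent at information fraction t (0 <= t <= 1).

    Lan-DeMets Pocock-type spending function; alpha=0.05 is a hypothetical default.
    """
    return alpha * log(1 + (e - 1) * t)


if __name__ == "__main__":
    # Clinical pregnancy counts taken from the abstract: SP 37/138, placebo 41/141.
    diff, lo, hi = wald_diff_ci(37, 138, 41, 141)
    print(f"difference = {diff:+.1%}, 95% CI {lo:+.1%} to {hi:+.1%}")
    # Hypothetical information fraction for a first of two looks.
    print(f"alpha spent at t = 0.5: {pocock_type_spend(0.5):.4f}")
```

The same function applies to the live birth counts (28/138 versus 33/141).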
Abstract:
2000 Mathematics Subject Classification: 62L10, 62L15.
Abstract:
This thesis investigates how web search evaluation can be improved using historical interaction data. Modern search engines combine offline and online evaluation approaches in a sequence of steps that a tested change needs to pass through to be accepted as an improvement and subsequently deployed. We refer to such a sequence of steps as an evaluation pipeline. In this thesis, we consider the evaluation pipeline to contain three sequential steps: an offline evaluation step, an online evaluation scheduling step, and an online evaluation step. In this thesis we show that historical user interaction data can aid in improving the accuracy or efficiency of each of the steps of the web search evaluation pipeline. As a result of these improvements, the overall efficiency of the entire evaluation pipeline is increased. Firstly, we investigate how user interaction data can be used to build accurate offline evaluation methods for query auto-completion mechanisms. We propose a family of offline evaluation metrics for query auto-completion that represents the effort the user has to spend in order to submit their query. The parameters of our proposed metrics are trained against a set of user interactions recorded in the search engine’s query logs. From our experimental study, we observe that our proposed metrics are significantly more correlated with an online user satisfaction indicator than the metrics proposed in the existing literature. Hence, fewer changes will pass the offline evaluation step to be rejected after the online evaluation step. As a result, this would allow us to achieve a higher efficiency of the entire evaluation pipeline. Secondly, we state the problem of the optimised scheduling of online experiments. We tackle this problem by considering a greedy scheduler that prioritises the evaluation queue according to the predicted likelihood of success of a particular experiment. 
This predictor is trained on a set of online experiments, and uses a diverse set of features to represent an online experiment. Our study demonstrates that a higher number of successful experiments per unit of time can be achieved by deploying such a scheduler on the second step of the evaluation pipeline. Consequently, we argue that the efficiency of the evaluation pipeline can be increased. Next, to improve the efficiency of the online evaluation step, we propose the Generalised Team Draft interleaving framework. Generalised Team Draft considers both the interleaving policy (how often a particular combination of results is shown) and click scoring (how important each click is) as parameters in a data-driven optimisation of the interleaving sensitivity. Further, Generalised Team Draft is applicable beyond domains with a list-based representation of results, i.e. in domains with a grid-based representation, such as image search. Our study using datasets of interleaving experiments performed both in document and image search domains demonstrates that Generalised Team Draft achieves the highest sensitivity. A higher sensitivity indicates that the interleaving experiments can be deployed for a shorter period of time or use a smaller sample of users. Importantly, Generalised Team Draft optimises the interleaving parameters w.r.t. historical interaction data recorded in the interleaving experiments. Finally, we propose to apply the sequential testing methods to reduce the mean deployment time for the interleaving experiments. We adapt two sequential tests for the interleaving experimentation. We demonstrate that one can achieve a significant decrease in experiment duration by using such sequential testing methods. The highest efficiency is achieved by the sequential tests that adjust their stopping thresholds using historical interaction data recorded in diagnostic experiments. 
Our further experimental study demonstrates that cumulative gains in the online experimentation efficiency can be achieved by combining the interleaving sensitivity optimisation approaches, including Generalised Team Draft, and the sequential testing approaches. Overall, the central contributions of this thesis are the proposed approaches to improve the accuracy or efficiency of the steps of the evaluation pipeline: the offline evaluation frameworks for the query auto-completion, an approach for the optimised scheduling of online experiments, a general framework for the efficient online interleaving evaluation, and a sequential testing approach for the online search evaluation. The experiments in this thesis are based on massive real-life datasets obtained from Yandex, a leading commercial search engine. These experiments demonstrate the potential of the proposed approaches to improve the efficiency of the evaluation pipeline.
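The greedy scheduling idea described in this abstract (ordering the evaluation queue by the predicted likelihood of success) can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the `Experiment` class, its field names, and the example probabilities are all hypothetical, and the predictor that would produce the scores is assumed to exist elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Experiment:
    name: str
    # Hypothetical output of a trained success predictor, in [0, 1].
    predicted_success: float


def greedy_schedule(queue: List[Experiment]) -> List[Experiment]:
    """Return the queue ordered so the most promising experiment runs first."""
    return sorted(queue, key=lambda exp: exp.predicted_success, reverse=True)


if __name__ == "__main__":
    # Made-up experiments and scores, for illustration only.
    queue = [
        Experiment("ranker-tweak", 0.20),
        Experiment("snippet-change", 0.65),
        Experiment("new-feature", 0.40),
    ]
    for exp in greedy_schedule(queue):
        print(exp.name, exp.predicted_success)
```

Because `sorted` is stable, experiments with equal predicted scores keep their original queue order.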
Abstract:
We consider the problem of detecting statistically significant sequential patterns in multineuronal spike trains. These patterns are characterized by ordered sequences of spikes from different neurons with specific delays between spikes. We have previously proposed a data-mining scheme to efficiently discover such patterns, which occur often enough in the data. Here we propose a method to determine the statistical significance of such repeating patterns. The novelty of our approach is that we use a compound null hypothesis that not only includes models of independent neurons but also models where neurons have weak dependencies. The strength of interaction among the neurons is represented in terms of certain pair-wise conditional probabilities. We specify our null hypothesis by putting an upper bound on all such conditional probabilities. We construct a probabilistic model that captures the counting process and use this to derive a test of significance for rejecting such a compound null hypothesis. The structure of our null hypothesis also allows us to rank-order different significant patterns. We illustrate the effectiveness of our approach using spike trains generated with a simulator.
Abstract:
Oral nutrition supplements (ONS) are routinely prescribed to those with, or at risk of, malnutrition. Previous research identified poor compliance due to taste and sweetness. This paper investigates taste and hedonic liking of ONS, of varying sweetness and metallic levels, over consumption volume; an important consideration as patients are prescribed large volumes of ONS daily. A sequential descriptive profile was developed to determine the perception of sensory attributes over repeat consumption of ONS. Changes in liking of ONS following repeat consumption were characterised by a boredom test. Certain flavour (metallic taste, soya milk flavour) and mouthfeel (mouthdrying, mouthcoating) attributes built up over increased consumption volume (p 0.002). Hedonic liking data from two cohorts, healthy older volunteers (n = 32, median age 73) and patients (n = 28, median age 85), suggested such build-up was disliked. Efforts made to improve the palatability of ONS must take account of the build up of taste and mouthfeel characteristics over increased consumption volume.
Abstract:
The use of Wireless Sensor Networks (WSNs) for Structural Health Monitoring (SHM) has become a promising approach due to many advantages such as low cost and fast, flexible deployment. However, inherent technical issues such as data synchronization error and data loss have prevented these distinct systems from being extensively used. Recently, several SHM-oriented WSNs have been proposed and are believed to be able to overcome a large number of technical uncertainties. Nevertheless, there is limited research verifying the applicability of those WSNs with respect to demanding SHM applications like modal analysis and damage identification. This paper first presents a brief review of the most inherent uncertainties of the SHM-oriented WSN platforms and then investigates their effects on the outcomes and performance of the most robust Output-only Modal Analysis (OMA) techniques when employing merged data from multiple tests. The two OMA families selected for this investigation are Frequency Domain Decomposition (FDD) and Data-driven Stochastic Subspace Identification (SSI-data), because both have been widely applied in the past decade. Experimental accelerations collected by a wired sensory system on a large-scale laboratory bridge model are initially used as clean data before being contaminated by different data pollutants in a sequential manner to simulate practical SHM-oriented WSN uncertainties. The results of this study show the robustness of FDD and the precautions needed for the SSI-data family when dealing with SHM-WSN uncertainties. Finally, the use of the measurement channel projection for the time-domain OMA techniques and the preferred combination of the OMA techniques to cope with the SHM-WSN uncertainties are recommended.
Abstract:
Summary. Interim analysis is important in a large clinical trial for ethical and cost considerations. Sometimes, an interim analysis needs to be performed at an earlier than planned time point. In that case, methods using stochastic curtailment are useful in examining the data for early stopping while controlling the inflation of type I and type II errors. We consider a three-arm randomized study of treatments to reduce perioperative blood loss following major surgery. Owing to slow accrual, an unplanned interim analysis was required by the study team to determine whether the study should be continued. We distinguish two different cases: when all treatments are under direct comparison and when one of the treatments is a control. We used simulations to study the operating characteristics of five different stochastic curtailment methods. We also considered the influence of timing of the interim analyses on the type I error and power of the test. We found that the type I error and power between the different methods can be quite different. The analysis for the perioperative blood loss trial was carried out at approximately a quarter of the planned sample size. We found that there is little evidence that the active treatments are better than a placebo and recommended closure of the trial.
Abstract:
Thirty-seven surface (0-0.10 or 0-0.20 m) soils covering a wide range of soil types (16 Vertosols, 6 Ferrosols, 6 Dermosols, 4 Hydrosols, 2 Kandosols, 1 Sodosol, 1 Rudosol, and 1 Chromosol) were exhaustively cropped in 2 glasshouse experiments. The test species were Panicum maximum cv. Green Panic in Experiment A and Avena sativa cv. Barcoo in Experiment B. Successive forage harvests were taken until the plants could no longer grow in most soils because of severe potassium (K) deficiency. Soil samples were taken prior to cropping and after the final harvest in both experiments, and also after the initial harvest in Experiment B. Samples were analysed for solution K, exchangeable K (Exch K), tetraphenyl borate extractable K for extraction periods of 15 min (TBK15) and 60 min (TBK60), and boiling nitric acid extractable K (Nitric K). Inter-correlations between the initial levels of the various soil K parameters indicated that the following pools were in sequential equilibrium: solution K, Exch K, fast release fixed K [estimated as (TBK15-Exch K)], and slow release fixed K [estimated as (TBK60-TBK15)]. Structural K [estimated as (Nitric K-TBK60)] was not correlated with any of the other pools. However, following exhaustive drawdown of soil K by cropping, structural K became correlated with solution K, suggesting dissolution of K minerals when solution K was low. The change in the various K pools following cropping was correlated with K uptake at Harvest 1 (Experiment B only) and cumulative K uptake (both experiments). The change in Exch K for 30 soils was linearly related to cumulative K uptake (r = 0.98), although on average, K uptake was 35% higher than the change in Exch K. For the remaining 7 soils, K uptake considerably exceeded the change in Exch K. However, the changes in TBK15 and TBK60 were both highly linearly correlated with K uptake across all soils (r = 0.95 and 0.98, respectively).
The slopes of the regression lines were not significantly different from unity, and the y-axis intercepts were very small. These results indicate that the plant is removing K from the TBK pool. Although the change in Exch K did not consistently equate with K uptake across all soils, initial Exch K was highly correlated with K uptake (r = 0.99) if one Vertosol was omitted. Exchangeable K is therefore a satisfactory diagnostic indicator of soil K status for the current crop. However, the change in Exch K following K uptake is soil-dependent, and many soils with large amounts of TBK relative to Exch K were able to buffer changes in Exch K. These soils tended to be Vertosols occurring on floodplains. In contrast, 5 soils (a Dermosol, a Rudosol, a Kandosol, and 2 Hydrosols) with large amounts of TBK did not buffer decreases in Exch K caused by K uptake, indicating that the TBK pool in these soils was unavailable to plants under the conditions of these experiments. It is likely that K fertiliser recommendations will need to take account of whether the soil has TBK reserves, and the availability of these reserves, when deciding rates required to raise exchangeable K status to adequate levels.
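The pool definitions given in this abstract are simple differences between extractable K measurements, and a minimal sketch makes the bookkeeping explicit. The function name, dictionary keys, and example values below are hypothetical; the abstract states no units or measured values, so the numbers are placeholders in arbitrary units.

```python
def k_pools(exch_k, tbk15, tbk60, nitric_k):
    """Estimate soil K pools from extractable K values, as defined in the abstract.

    All inputs must share one unit; the abstract does not state which.
    """
    return {
        "fast_release_fixed_k": tbk15 - exch_k,   # TBK15 - Exch K
        "slow_release_fixed_k": tbk60 - tbk15,    # TBK60 - TBK15
        "structural_k": nitric_k - tbk60,         # Nitric K - TBK60
    }


if __name__ == "__main__":
    # Placeholder values in arbitrary units, for illustration only.
    pools = k_pools(exch_k=1.0, tbk15=1.5, tbk60=2.5, nitric_k=6.0)
    for name, value in pools.items():
        print(name, value)
```

Applying the function to samples taken before and after cropping gives the change in each pool, which is the quantity the abstract relates to K uptake.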
Abstract:
One of the foremost design considerations in microelectronics miniaturization is the use of embedded passives, which provide a practical solution. In a typical circuit, over 80 percent of the electronic components are passives such as resistors, inductors, and capacitors, which can take up almost 50 percent of the entire printed circuit board area. By integrating passive components within the substrate instead of placing them on the surface, embedded passives reduce the system real estate, eliminate the need for discrete components and their assembly, enhance electrical performance and reliability, and potentially reduce the overall cost. Moreover, the technology is lead-free. Even with these advantages, embedded passive technology is at a relatively immature stage, and more characterization and optimization are needed for practical applications leading to its commercialization. This paper presents an entire process from design and fabrication to electrical characterization and reliability testing of embedded passives on a multilayered microvia organic substrate. Two test vehicles focusing on resistors and capacitors have been designed and fabricated. Embedded capacitors in this study are made with a polymer/ceramic nanocomposite (BaTiO3) material to take advantage of the low processing temperature of polymers and the relatively high dielectric constant of ceramics. The values of these capacitors range from 50 pF to 1.5 nF, with a capacitance per area of approximately 1.5 nF/cm². Limited high-frequency measurement of these capacitors was performed. Furthermore, reliability assessments of thermal shock and temperature humidity tests based on JEDEC standards were carried out. Resistors used in this work have been of three types: 1) carbon ink based polymer thick film (PTF), 2) resistor foils with known sheet resistivities, which are laminated to the printed wiring board (PWB) during a sequential build-up (SBU) process, and 3) thin-film resistors plated by an electroless method.
Realization of embedded resistors on conventional board-level high-loss epoxy (~0.015 at 1 GHz) and the proposed low-loss BCB dielectric (~0.0008 at >40 GHz) has been explored in this study. Ni-P and Ni-W-P alloys were plated using conventional electroless plating, and NiCr and NiCrAlSi foils were used for the foil transfer process. For the first time, Benzocyclobutene (BCB) has been proposed as a board-level dielectric for advanced System-on-Package (SOP) modules, primarily due to its attractive low-loss (for RF application) and thin-film (for high-density wiring) properties. Although embedded passives are more reliable because they eliminate solder joint interconnects, they also introduce other concerns such as cracks, delamination and component instability. More layers may be needed to accommodate the embedded passives, and various materials within the substrate may cause significant thermo-mechanical stress due to coefficient of thermal expansion (CTE) mismatch. In this work, numerical models of embedded capacitors have been developed to qualitatively examine the effects of process conditions and electrical performance due to thermo-mechanical deformations. Also, a prototype working product with the board-level design, including features of embedded resistors and capacitors, is underway. Preliminary results of these are presented.