860 results for two-stage sampling


Abstract:

Our media is saturated with claims of "facts" made from data. Database research has in the past focused on how to answer queries, but has not devoted much attention to discerning more subtle qualities of the resulting claims, e.g., is a claim "cherry-picking"? This paper proposes a Query Response Surface (QRS) based framework that models claims based on structured data as parameterized queries. A key insight is that we can learn a lot about a claim by perturbing its parameters and seeing how its conclusion changes. This framework lets us formulate and tackle practical fact-checking tasks, such as reverse-engineering vague claims and countering questionable claims, as computational problems. Within the QRS-based framework, we take one step further and propose a problem, along with efficient algorithms, for finding high-quality claims of a given form from data, i.e., raising good questions in the first place. This is achieved by using a limited number of high-valued claims to represent high-valued regions of the QRS. Besides the general-purpose high-quality claim finding problem, lead-finding can be tailored towards specific claim quality measures, also defined within the QRS framework. An example of uniqueness-based lead-finding is presented for "one-of-the-few" claims, yielding interpretable high-quality claims and an adjustable mechanism for ranking objects, e.g., NBA players, based on what claims can be made for them. Finally, we study the use of visualization as a powerful way of conveying the results of a large number of claims. An efficient two-stage sampling algorithm is proposed for generating the input of a 2D scatter plot with heatmap by evaluating a limited amount of data while preserving the two essential visual features, namely outliers and clusters. For all the problems, we present real-world examples and experiments that demonstrate the power of our model, the efficiency of our algorithms, and the usefulness of their results.
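The idea of perturbing a claim's parameters can be illustrated with a small sketch. It is not the paper's algorithm: the claim form ("the average of the last `width` values rose compared with the `width` values before them"), the function names, and the data are all assumptions made up for illustration.

```python
from statistics import mean


def window_change(series, end, width):
    """Parameterized claim: mean of the `width` values ending at index `end`
    minus the mean of the `width` values immediately before them."""
    recent = series[end - width:end]
    previous = series[end - 2 * width:end - width]
    return mean(recent) - mean(previous)


def response_surface(series, ends, widths):
    """Evaluate the claim at every (end, width) pair that fits in the data."""
    return {
        (end, width): window_change(series, end, width)
        for end in ends
        for width in widths
        if end - 2 * width >= 0 and end <= len(series)
    }


# Made-up yearly counts; the "original" claim uses end=8, width=2.
series = [10, 12, 11, 13, 12, 9, 14, 15]
surface = response_surface(series, ends=range(4, 9), widths=range(1, 4))

original = surface[(8, 2)]
# Share of perturbed parameter settings under which the claim still holds.
support = sum(1 for value in surface.values() if value > 0) / len(surface)
print(f"original claim: {original:+.2f}; perturbations showing an increase: {support:.0%}")
```

A claim whose conclusion holds only at its stated parameters and reverses under small perturbations is a candidate for cherry-picking; the share computed here is only one crude way to summarize the surface.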

Abstract:

Asthma is known to be one of the most common chronic diseases in pregnant women, with a prevalence of 4 to 8%. Because of this high prevalence, there is concern about the impact of pregnancy on asthma and of asthma on pregnancy outcomes. The literature reports conflicting results on the impact of maternal asthma on perinatal outcomes such as preterm birth, low birth weight, and small-for-gestational-age (SGA) infants. Moreover, scientific data are scarce on the impact of asthma severity and control during pregnancy on perinatal outcomes. We therefore conducted five studies with the following objectives: 1. to develop and validate two indexes measuring asthma severity and control; 2. to evaluate the impact of fetal sex on the risk of maternal asthma exacerbation and on the use of asthma medications during pregnancy; 3. to evaluate the impact of maternal asthma on perinatal outcomes; 4. to evaluate the impact of maternal asthma severity during pregnancy on perinatal outcomes; 5. to evaluate the impact of maternal asthma control during pregnancy on perinatal outcomes. To carry out these research projects, we worked with a large pregnancy cohort reconstructed by linking three administrative databases from Québec covering the period from 1990 to 2002. For the last three studies, we used a two-stage sampling cohort design to obtain, by means of a mailed questionnaire, additional information not available in the databases, such as cigarette and alcohol consumption during pregnancy.
We found no significant difference between mothers of female and male fetuses in asthma exacerbations during pregnancy (aRR = 1.02; 95% CI: 0.92-1.14). However, we found that the risk of an SGA infant (OR: 1.27; 95% CI: 1.14-1.41), a low-birth-weight infant (OR: 1.41; 95% CI: 1.22-1.63), and preterm birth (OR: 1.64; 95% CI: 1.46-1.83) was significantly higher among asthmatic women than among non-asthmatic women. We also showed that the risk of an SGA infant was significantly higher among women with severe (OR: 1.48; 95% CI: 1.15-1.91) and moderate (OR: 1.30; 95% CI: 1.10-1.55) asthma than among women with mild asthma. We further observed that women whose asthma was well controlled during pregnancy were significantly more at risk of having an SGA infant (OR: 1.28; 95% CI: 1.15-1.43), a low-birth-weight infant (OR: 1.42; 95% CI: 1.22-1.66), and a preterm infant (OR: 1.63; 95% CI: 1.46-1.83) than non-asthmatic women. Our results indicate that all asthmatic women, even those with well-controlled asthma, should be closely monitored during pregnancy because they are at higher risk of adverse pregnancy outcomes for their newborn.

Abstract:

Maternal asthma complicates approximately 3.4% to 12.4% of pregnancies in developed countries, making it one of the most common chronic diseases that can cause serious medical problems for the mother and the fetus. In addition, a relatively large proportion of pregnant women, 4 to 7%, use asthma medications. Stillbirth, neonatal mortality, and/or perinatal mortality are the most tragic pregnancy outcomes for the child and the family. However, the effect of asthma and of inhaled corticosteroid (ICS) use during pregnancy on these complications has been inadequately evaluated. Most studies that evaluated these associations lacked statistical power and/or had no or inadequate adjustment for potential confounders. The work presented in this thesis therefore aims to evaluate the risk of perinatal mortality in asthmatic women compared with non-asthmatic women. This thesis also aims to evaluate whether asthmatic women exposed to ICS are at greater risk of perinatal mortality than unexposed asthmatic women, and whether the risk of perinatal mortality varies with the daily dose of ICS used by the mother during pregnancy. By linking three administrative databases from Québec, a large cohort of asthmatic and non-asthmatic women who had at least one pregnancy between 1990 and 2002 was constructed (n = 41,142). From this cohort, two pregnancy cohorts were formed. The first two studies presented in this thesis are based on the entire cohort, whereas the last study is based only on pregnancies of asthmatic women.
A cohort study was first conducted to evaluate the effect of maternal asthma on the risk of perinatal mortality, allowing adjustment for variables from the administrative databases. To better estimate the risk of perinatal mortality in asthmatic women, a cohort study with two-stage sampling was then conducted using additional information on smoking, illicit drug use, and history of stillbirth, collected from the mother's medical chart. Finally, the risk of perinatal mortality among asthmatic women who used ICS during pregnancy, and the risk of perinatal mortality according to the mean daily ICS dose taken by the mother during pregnancy, were investigated using a two-stage sampling cohort study restricted to asthmatic women. We first observed that asthma during pregnancy could increase the risk of perinatal mortality because of the increased risk of low-birth-weight and preterm infants among asthmatic women (OR = 1.30; 95% CI: 1.05-1.57). However, after adjustment for smoking during pregnancy, the increase in the risk of perinatal mortality fell to 12% and the association was no longer statistically significant (OR = 1.12; 95% CI: 0.87-1.45). Finally, ICS use during pregnancy, when dose was not considered, was not associated with a significant increase in the risk of perinatal mortality (OR = 1.07; 95% CI: 0.70-1.61), and a non-significant protective effect of ICS doses of 250 µg or less per day was observed (OR = 0.89; 95% CI: 0.55-1.44). However, women who took doses >250 µg/day had a 52% higher risk of perinatal mortality, although this association was not statistically significant (OR = 1.52; 95% CI: 0.62-3.76).
This increased risk could, however, result from imperfect adjustment for asthma severity and control (asthmatic women who used >250 µg/day are likely to have more severe or inadequately controlled asthma). The conclusions of our work, which are rather reassuring, may contribute to better management of pregnant asthmatic women, help physicians prescribe ICS during pregnancy, and reassure pregnant women with asthma and pregnant women who must use ICS. However, further studies are needed before it can be concluded that the use of higher ICS doses (>250 µg/day) during pregnancy is safe.

Abstract:

Thesis digitized by the Records Management and Archives Division (Division de la gestion de documents et des archives) of the Université de Montréal

Abstract:

Prediction of random effects is an important problem with expanding applications. In the simplest context, the problem corresponds to prediction of the latent value (the mean) of a realized cluster selected via two-stage sampling. Recently, Stanek and Singer [Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 119-130] developed best linear unbiased predictors (BLUP) under a finite population mixed model that outperform BLUPs from mixed models and superpopulation models. Their setup, however, does not allow for unequally sized clusters. To overcome this drawback, we consider an expanded finite population mixed model based on a larger set of random variables that span a higher dimensional space than those typically applied to such problems. We show that BLUPs for linear combinations of the realized cluster means derived under such a model have considerably smaller mean squared error (MSE) than those obtained from mixed models, superpopulation models, and finite population mixed models. We motivate our general approach by an example developed for two-stage cluster sampling and show that it faithfully captures the stochastic aspects of sampling in the problem. We also consider simulation studies to illustrate the increased accuracy of the BLUP obtained under the expanded finite population mixed model. (C) 2007 Elsevier B.V. All rights reserved.

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Abstract:

In this article, we consider the synthetic control chart with two-stage sampling (SyTS chart) to control the process mean and variance. During the first stage, one item of the sample is inspected; if its value $X_1$ is close to the target value of the process mean, then the sampling is interrupted. Otherwise, the sampling goes on to the second stage, where the remaining items are inspected and the statistic $T = \sum_i [x_i - \mu_0 + \xi\sigma_0]^2$ is computed taking into account all items of the sample. The design parameter $\xi$ is a function of $X_1$. When the statistic $T$ is larger than a specified value, the sample is classified as nonconforming. According to the synthetic procedure, the signal is based on the Conforming Run Length (CRL). The CRL is the number of samples taken from the process since the previous nonconforming sample until the occurrence of the next nonconforming sample. If the CRL is sufficiently small, then a signal is generated. A comparative study shows that the SyTS chart and the joint X and S charts with double sampling are very similar in performance. However, from the practical viewpoint, the SyTS chart is more convenient to administer than the joint charts.
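The procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' design: the abstract does not give the stage-one closeness rule, the form of the design parameter as a function of the first item, or how the first CRL is counted, so `warning_width`, `sign_based_xi`, the control limits, and the starting convention are all hypothetical.

```python
def sign_based_xi(x1, mu0=0.0, shift=0.5):
    """Hypothetical design-parameter rule: the sign follows the first item."""
    return shift if x1 > mu0 else -shift


def classify_sample(sample, mu0, sigma0, warning_width, xi_of_x1, t_limit):
    """Return True if the sample is nonconforming under the two-stage rule."""
    x1 = sample[0]
    # Stage 1: only the first item is inspected (assumed closeness rule).
    if abs(x1 - mu0) <= warning_width * sigma0:
        return False  # sampling is interrupted; the sample is conforming
    # Stage 2: all items enter T; the design parameter depends on x1.
    xi = xi_of_x1(x1)
    t = sum((x - mu0 + xi * sigma0) ** 2 for x in sample)
    return t > t_limit


def synthetic_signals(samples, crl_limit, **chart):
    """Return the indices of the samples at which the chart signals."""
    signals = []
    last_nonconforming = -1  # assumption: counting starts at the first sample
    for index, sample in enumerate(samples):
        if classify_sample(sample, **chart):
            crl = index - last_nonconforming  # conforming run length
            if crl <= crl_limit:
                signals.append(index)
            last_nonconforming = index
    return signals


# Made-up samples of three items each, with target mean 0 and sigma 1.
samples = [
    [0.2, 0.1, -0.3],
    [-0.4, 0.3, 0.0],
    [0.5, -0.2, 0.1],
    [2.0, 2.5, 1.5],
    [0.0, 0.4, -0.2],
    [-2.0, -1.8, -2.4],
]
signals = synthetic_signals(
    samples, crl_limit=2, mu0=0.0, sigma0=1.0,
    warning_width=1.0, xi_of_x1=sign_based_xi, t_limit=10.0,
)
print(signals)
```

Most samples stop after one inspected item, which is the practical saving the two-stage scheme is meant to provide; the signal depends only on how closely nonconforming samples follow one another.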

Abstract:

ABSTRACT: BACKGROUND: Sierra Leone underwent a decade of civil war from 1991 to 2001. Few data on immunization coverage are available from this period, and conflict-related delays in immunization according to the Expanded Programme on Immunization (EPI) schedule have not been investigated. We aimed to study delays in childhood immunization in the context of civil war in a Sierra Leonean community. METHODS: We conducted an immunization survey in Kissy Mess-Mess in the Greater Freetown area in 1998/99 using a two-stage sampling method. Based on immunization cards and verbal history, we collected data on immunization for tuberculosis, diphtheria, tetanus, pertussis, polio, and measles by age group (0-8/9-11/12-23/24-35 months). We studied differences between age groups and explored temporal associations with war-related hostilities taking place in the community. RESULTS: We included 286 children who received 1690 vaccine doses; card retention was 87%. In 243 children (85%, 95% confidence interval (CI): 80-89%) immunization was up-to-date. In 161 children (56%, 95% CI: 50-62%) full age-appropriate immunization was achieved; in 82 (29%, 95% CI: 24-34%) immunization was not appropriate for age. In the remaining 43 children immunization was partial in 37 (13%, 95% CI: 9-17%) and absent in 6 (2%, 95% CI: 1-5%). Immunization status varied across age groups. In children aged 9-11 months the proportion with age-inappropriate (delayed) immunization was higher than in other age groups, suggesting an association with war-related hostilities in the community. CONCLUSION: Only about half of children under three years received full age-appropriate immunization. In children born during a period of increased hostilities, immunization was mostly inappropriate for age, but recommended immunizations were not completely abandoned. Missing or delayed immunization represents an additional threat to the health of children living in conflict areas.

Abstract:

Background: The aim of this study was to assess the quality of rapid HIV testing in South Africa. Method: A two-stage sampling procedure was used to select HCT sites in eight provinces of South Africa. The study employed both semi-structured interviews with HIV testers and observation of testing sessions as a means of data collection. In total, 63 HCT sites (one HIV tester per site) were included in the survey assessing qualification, training, testing practices and attitudes towards rapid tests. Quantitative data were analysed using descriptive statistics and qualitative data were content analysed. Results: Of the 63 HIV testers, 20.6% had a nursing qualification, 14.3% were professional counsellors, 58.7% were lay HIV counsellors and testers and 6.4% were from other professions. Most HIV testers (87.3%) had received formal training in testing, which lasted from 10 to 14 days, while 6 (9.5%) had none. Findings revealed sub-standard practices in relation to testing. These were mainly related to non-adherence to testing algorithms, poor external quality control practices, and poor handling and communication of discordant results. Conclusion: Quality of HIV rapid testing may be highly compromised through poor adherence to guidelines as observed in our study.

Abstract:

Atlantic menhaden, Brevoortia tyrannus, the object of a major purse-seine fishery along the U.S. east coast, are landed at plants from northern Florida to central Maine. The National Marine Fisheries Service has sampled these landings since 1955 for length, weight, and age. Together with records of landings at each plant, the samples are used to estimate numbers of fish landed at each age. This report analyzes the sampling design in terms of probability sampling theory. The design is classified as two-stage cluster sampling, the first stage consisting of purse-seine sets randomly selected from the population of all sets landed, and the second stage consisting of fish randomly selected from each sampled set. Implicit assumptions of this design are discussed with special attention to current sampling procedures. Methods are developed for estimating mean fish weight, numbers of fish landed, and age composition of the catch, with approximate 95% confidence intervals. Based on specific results from three ports (Port Monmouth, N.J., Reedville, Va., and Beaufort, N.C.) for the 1979 fishing season, recommendations are made for improving sampling procedures to comply more exactly with assumptions of the sampling design. These recommendations include adopting more formal methods for randomizing set and fish selection, increasing the number of sets sampled, considering the bias introduced by unequal set sizes, and developing methods to optimize the use of funds and personnel. (PDF file contains 22 pages.)
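A two-stage cluster estimate of mean fish weight with an approximate 95% confidence interval can be sketched as below. This is a textbook simplification, not the report's estimator: it assumes sets of equal size, the same number of fish sampled per set, and a first-stage sampling fraction small enough that the variance among set means dominates. The function name and the weights are made up.

```python
from math import sqrt
from statistics import mean, stdev


def two_stage_mean(sets):
    """Estimate a population mean from two-stage cluster samples.

    `sets` holds one list of fish weights per sampled purse-seine set.
    The standard error uses only the variance among set means, which is a
    common approximation when few of the landed sets are sampled.
    """
    set_means = [mean(fish) for fish in sets]
    estimate = mean(set_means)
    std_error = stdev(set_means) / sqrt(len(set_means))
    half_width = 1.96 * std_error  # normal approximation for the 95% interval
    return estimate, std_error, (estimate - half_width, estimate + half_width)


# Made-up weights (arbitrary units) for two fish from each of three sets.
sampled_sets = [[1.0, 3.0], [2.0, 4.0], [6.0, 8.0]]
estimate, std_error, interval = two_stage_mean(sampled_sets)
print(estimate, std_error, interval)
```

With unequal set sizes, the bias the abstract warns about arises because an unweighted mean of set means treats small and large sets alike; a size-weighted or ratio estimator would be the usual remedy.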

Abstract:

The Inter-American Tropical Tuna Commission (IATTC) staff has been sampling the size distributions of tunas in the eastern Pacific Ocean (EPO) since 1954, and the species composition of the catches since 2000. The IATTC staff use the data from the species composition samples, in conjunction with observer and/or logbook data, and unloading data from the canneries to estimate the total annual catches of yellowfin (Thunnus albacares), skipjack (Katsuwonus pelamis), and bigeye (Thunnus obesus) tunas. These sample data are collected based on a stratified sampling design. I propose an update of the stratification of the EPO into more homogeneous areas in order to reduce the variance in the estimates of the total annual catches and incorporate the geographical shifts resulting from the expansion of the floating-object fishery during the 1990s. The sampling model used by the IATTC is a stratified two-stage (cluster) random sampling design with first-stage units varying (unequal) in size. The strata are month, area, and set type. Wells, the first cluster stage, are selected to be sampled only if all of the fish were caught in the same month, same area, and same set type. Fish, the second cluster stage, are sampled for lengths, and independently, for species composition of the catch. The EPO is divided into 13 sampling areas, which were defined in 1968, based on the catch distributions of yellowfin and skipjack tunas. This area stratification does not reflect the multi-species, multi-set-type fishery of today. In order to define more homogeneous areas, I used agglomerative cluster analysis to look for groupings of the size data and the catch and effort data for 2000–2006. I plotted the results from both datasets against the IATTC Sampling Areas, and then created new areas. I also used the results of the cluster analysis to update the substitution scheme for strata with catch, but no sample.
I then calculated the total annual catch (and variance) by species by stratifying the data into new Proposed Sampling Areas and compared the results to those reported by the IATTC. Results showed that re-stratifying the areas produced smaller variances of the catch estimates for some species in some years, but the results were not significant.

Abstract:

Ichthyoplankton surveys have been used to provide an independent estimate of adult spawning biomass of commercially exploited species and to further our understanding of the recruitment processes in the early life stages. However, predicting recruitment has been difficult because of the complex interaction of physical and biological processes operating at different spatial and temporal scales that can occur at the different life stages. A model of first-year life-stage recruitment was applied to Georges Bank Atlantic cod (Gadus morhua) and haddock (Melanogrammus aeglefinus) stocks over the years 1977–2004 by using environmental and density-dependent relationships. The best life-stage mortality relationships for eggs, larvae, pelagic juveniles, and demersal juveniles were first determined by hindcasting recruitment estimates based on egg and larval abundance and mortality rates derived from two intensive sampling periods, 1977–87 and 1995–99. A wind-driven egg mortality relationship was used to estimate losses due to transport off the bank, and a wind-stress larval mortality relationship was derived from feeding and survival studies. A simple metric for the density-dependent effects of Atlantic cod was used for both Atlantic cod and haddock. These life-stage proxies were then applied to the virtual population analysis (VPA) derived annual egg abundances to predict age-1 recruitment. Best models were determined from the correlation of predicted and VPA-derived age-1 abundance. The larval stage was the most quantifiable of any stage from surveys, whereas abundance estimates of the demersal juvenile stage were not available because of undersampling. Attempts to forecast recruitment from spawning stock biomass or egg abundance, however, will always be poor because of variable egg survival.

Abstract:

Sampling is a key element in the assessment of any fish stock. It is often one of the most expensive activities of the management process; thus, improved efficiency can result in significant cost savings. In most cases a two-phase sampling strategy is employed. Two commonly used versions of such stratified random schemes were simulated using a test population based on Atlantic cod, Gadus morhua: a scheme of 1 otolith per 1 cm length-frequency interval, currently used for many flatfish and some smaller gadoids, and a scheme of 3 otoliths per 3 cm length-frequency interval, currently used for many of the larger gadoids. No difference was detected in the age composition or mean length at age for either scheme; however, 10 percent fewer otoliths were collected in 1 for 1 sampling than in 3 for 3. There was an improvement of between 30 and 60 percent in the coefficient of variation of the estimated catch numbers at age using the 1 for 1 compared with the 3 for 3 stratified sampling. For these reasons and other operational considerations, the 1 for 1 stratified random design of sampling appears to be superior.

Abstract:

The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
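The accumulation step mentioned in this abstract can be sketched in a few lines. The lag distances and variance components below are made up, and the function name is an assumption; estimating the components themselves (by hierarchical analysis of variance or REML) is not shown.

```python
from itertools import accumulate


def rough_variogram(lags, components):
    """Accumulate stage-wise variance components into rough semivariances.

    `lags` are the separating distances of the stages and `components` the
    variance component estimated for each stage. Summing the components from
    the shortest lag upward gives one semivariance estimate per lag.
    """
    if len(lags) != len(components):
        raise ValueError("one variance component is needed per lag")
    ordered = sorted(zip(lags, components))  # shortest lag first
    distances = [lag for lag, _ in ordered]
    semivariances = list(accumulate(component for _, component in ordered))
    return list(zip(distances, semivariances))


# Made-up four-stage design with distances in geometric progression (ratio 3).
points = rough_variogram(lags=[270, 10, 90, 30], components=[0.25, 0.5, 0.25, 1.0])
for distance, semivariance in points:
    print(distance, semivariance)
```

Each point is cumulative, so an error in a short-lag component carries through to every longer lag, which is one reason the result is described as only a rough variogram.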