896 results for large sample distributions


Relevance: 90.00%

Abstract:

This paper takes a first step toward a methodology for quantifying the influence of regulation on short-run earnings dynamics. It also provides evidence on the patterns of wage adjustment adopted during the recent high-inflation experience in Brazil. The large variety of official wage indexation rules adopted in Brazil in recent years, combined with the availability of monthly labor market surveys, makes the Brazilian case a good laboratory for testing how regulation affects earnings dynamics. In particular, the combination of large sample sizes with the possibility of following the same worker over short periods of time makes it possible to estimate the cross-sectional distribution of longitudinal statistics based on observed earnings (e.g., monthly and annual rates of change). The empirical strategy adopted here is to compare the distributions of longitudinal statistics extracted from actual earnings data with simulations generated from the minimum adjustment requirements imposed by the Brazilian Wage Law. The analysis provides statistics on how binding the wage regulation schemes were. The visual analysis of the distribution of wage adjustments proves useful for highlighting stylized facts that may guide future empirical work.
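The comparison strategy can be illustrated with a minimal sketch (not the paper's actual implementation): simulate the wage changes implied by a stylized minimum-indexation rule and compare the resulting distribution with observed changes. The rule parameters, inflation path, and "observed" data below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical monthly inflation path and a stylized indexation rule:
# wages must be adjusted by at least the accumulated inflation every 4 months.
inflation = rng.normal(0.10, 0.03, size=12)   # monthly inflation rates (illustrative)

def simulate_minimum_adjustments(inflation, window=4):
    """Wage log-changes implied by adjusting exactly by accumulated
    inflation once every `window` months (the minimum required)."""
    changes = np.zeros_like(inflation)
    acc = 0.0
    for t, pi in enumerate(inflation):
        acc += pi
        if (t + 1) % window == 0:
            changes[t] = acc
            acc = 0.0
    return changes

simulated = simulate_minimum_adjustments(inflation)

# "Observed" monthly wage changes for a panel of workers (placeholder data).
observed = rng.normal(0.03, 0.08, size=(5000, 12))

# Compare cross-sectional distributions of a longitudinal statistic,
# e.g. the annual rate of change, between the simulation and the data.
annual_obs = observed.sum(axis=1)
annual_sim = simulated.sum()
share_below_minimum = (annual_obs < annual_sim).mean()
print(f"Share of workers adjusting below the simulated minimum: {share_below_minimum:.2%}")
```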

Relevance: 90.00%

Abstract:

This study aimed to investigate the phenomenology of obsessive-compulsive disorder (OCD), addressing specific questions about the nature of obsessions and compulsions, and to contribute to the World Health Organization's (WHO) revision of the OCD diagnostic guidelines. Data from 1001 patients from the Brazilian Research Consortium on Obsessive Compulsive Spectrum Disorders were used. Patients were evaluated by trained clinicians using validated instruments, including the Dimensional Yale-Brown Obsessive Compulsive Scale, the University of Sao Paulo Sensory Phenomena Scale, and the Brown Assessment of Beliefs Scale. The aims were to compare the types of sensory phenomena (SP, subjective experiences that precede or accompany compulsions) in OCD patients with and without tic disorders and to determine the frequency of mental compulsions, the co-occurrence of obsessions and compulsions, and the range of insight. SP were common in the whole sample, but patients with tic disorders were more likely to have physical sensations and urges only. Mental compulsions occurred in the majority of OCD patients. It was extremely rare for OCD patients to have obsessions without compulsions. A wide range of insight into OCD beliefs was observed, with a small subset presenting no insight. The data generated from this large sample will help practicing clinicians appreciate the full range of OCD symptoms and confirm, in line with prior smaller studies, the degree to which insight varies. These findings also support specific revisions to the WHO's diagnostic guidelines for OCD, such as describing sensory phenomena, mental compulsions, and level of insight, so as to increase worldwide recognition of this disabling disorder. (C) 2014 Elsevier Ltd. All rights reserved.

Relevance: 90.00%

Abstract:

Thanks to the Chandra and XMM-Newton surveys, the hard X-ray sky is now probed down to a flux limit where the bulk of the X-ray background is almost completely resolved into discrete sources, at least in the 2-8 keV band. Extensive programs of multiwavelength follow-up observations showed that the large majority of hard X-ray selected sources are identified with Active Galactic Nuclei (AGN) spanning a broad range of redshifts, luminosities, and optical properties. A sizable fraction of relatively luminous X-ray sources hosting an active, presumably obscured, nucleus would not have been easily recognized as such on the basis of optical observations because they are characterized by "peculiar" optical properties. In my PhD thesis, I focus on the nature of two classes of hard X-ray selected "elusive" sources: those characterized by high X-ray-to-optical flux ratios and red optical-to-near-infrared colors, a fraction of which are associated with Type 2 quasars, and the X-ray bright optically normal galaxies, also known as XBONGs. In order to characterize the properties of these classes of elusive AGN, the datasets of several deep and large-area surveys have been fully exploited. The first class of "elusive" sources is characterized by X-ray-to-optical flux ratios (X/O) significantly higher than what is generally observed from unobscured quasars and Seyfert galaxies. The properties of well-defined samples of high X/O sources detected at bright X-ray fluxes suggest that X/O selection is highly efficient in sampling high-redshift obscured quasars. At the limits of deep Chandra surveys (∼10^−16 erg cm^−2 s^−1), high X/O sources are generally characterized by extremely faint optical magnitudes; hence their spectroscopic identification is hardly feasible even with the largest telescopes. In this framework, a detailed investigation of their X-ray properties may provide useful information on the nature of this important component of the X-ray source population. The X-ray data of the deepest X-ray observations ever performed, the Chandra deep fields, allow us to characterize the average X-ray properties of the high X/O population. The results of spectral analysis clearly indicate that the high X/O sources represent the most obscured component of the X-ray background. Their spectra are harder (Γ ∼ 1) than those of any other class of sources in the deep fields and also than the XRB spectrum (Γ ≈ 1.4). In order to better understand AGN physics and evolution, a much better knowledge of the redshift, luminosity, and spectral energy distributions (SEDs) of elusive AGN is of paramount importance. The recent COSMOS survey provides the necessary multiwavelength database to characterize the SEDs of a statistically robust sample of obscured sources. The combination of high X/O and red colors offers a powerful tool to select obscured luminous objects at high redshift. A large sample of X-ray emitting extremely red objects (R − K > 5) has been collected and their optical-infrared properties have been studied. In particular, using an appropriate SED-fitting procedure, the nuclear and host-galaxy components have been deconvolved over a large range of wavelengths, and optical nuclear extinctions, black hole masses, and Eddington ratios have been estimated. It is important to remark that the combination of hard X-ray selection and extreme red colors is highly efficient in picking up highly obscured, luminous sources at high redshift.
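For reference, the X-ray-to-optical flux ratio used in this kind of selection is conventionally defined against an optical magnitude (here the R band); the additive constant depends on the adopted band and zero point, and the value of roughly 5.5 below is a common convention rather than a number taken from this thesis:

```latex
X/O \;\equiv\; \log\frac{f_X}{f_R}
  \;=\; \log f_X\,[\mathrm{erg\,cm^{-2}\,s^{-1}}] \;+\; \frac{R}{2.5} \;+\; C,
\qquad C \approx 5.5 \ \text{(R band, assumed zero point)}
```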
Although XBONGs do not represent a new source population, interest in the nature of these sources has been renewed after the discovery of several examples in recent Chandra and XMM-Newton surveys. Even though several possibilities were proposed in the recent literature to explain why a relatively luminous (L_X = 10^42−10^43 erg s^−1) hard X-ray source does not leave any significant signature of its presence in terms of optical emission lines, the very nature of XBONGs is still a subject of debate. Good-quality photometric near-infrared data (ISAAC/VLT) of 4 low-redshift XBONGs from the HELLAS2XMM survey have been used to search for the presence of the putative nucleus, applying the surface-brightness decomposition technique. In two out of the four sources, the presence of a weak nuclear component hosted by a bright galaxy has been revealed. The results indicate that moderate amounts of gas and dust, covering a large solid angle (possibly 4π) at the nuclear source, may explain the lack of optical emission lines. A weak nucleus not able to produce sufficient UV photons may provide an alternative or additional explanation. On the basis of an admittedly small sample, we conclude that XBONGs constitute a mixed bag rather than a new source population. When the presence of a nucleus is revealed, it turns out to be mildly absorbed and hosted by a bright galaxy.

Relevance: 90.00%

Abstract:

Suppose that we are interested in establishing simple, but reliable rules for predicting future t-year survivors via censored regression models. In this article, we present inference procedures for evaluating such binary classification rules based on various prediction precision measures quantified by the overall misclassification rate, sensitivity and specificity, and positive and negative predictive values. Specifically, under various working models we derive consistent estimators for the above measures via substitution and cross validation estimation procedures. Furthermore, we provide large sample approximations to the distributions of these nonsmooth estimators without assuming that the working model is correctly specified. Confidence intervals, for example, for the difference of the precision measures between two competing rules can then be constructed. All the proposals are illustrated with two real examples and their finite sample properties are evaluated via a simulation study.
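A rough numerical sketch of the quantities involved, under much simpler assumptions than the paper's (a complete-case evaluation that simply drops subjects censored before t, which ignores the censoring bias the paper's estimators are built to handle; the rule and data are simulated):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 2000, 5.0

# Simulated covariate, true event times, and independent censoring times.
x = rng.normal(size=n)
event_time = rng.exponential(scale=np.exp(0.8 * x))
censor_time = rng.exponential(scale=8.0, size=n)
time = np.minimum(event_time, censor_time)
event = event_time <= censor_time

# A simple classification rule for "survives beyond t": predict survival if x < 0.
predicted_survivor = x < 0.0

# Complete-case t-year status: known only if the event occurred before t
# or follow-up extends beyond t; subjects censored before t are dropped.
known = (time >= t) | (event & (time < t))
true_survivor = time >= t            # valid only on the "known" subset

ps, ts = predicted_survivor[known], true_survivor[known]
sensitivity = np.mean(ps[ts])        # P(predict survivor | survivor)
specificity = np.mean(~ps[~ts])      # P(predict non-survivor | non-survivor)
ppv = np.mean(ts[ps])                # positive predictive value
misclassification = np.mean(ps != ts)
print(sensitivity, specificity, ppv, misclassification)
```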

Relevance: 90.00%

Abstract:

In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying recurrent event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the recurrent event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and the observation times are treated as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximizing a conditional likelihood function of the observed event counts and by solving estimating equations. Large sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumor study is presented to illustrate the use of the proposed methods.
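To make the data structure concrete, here is an illustrative simulation (not the paper's estimation procedure) in which a shared gamma frailty multiplies the recurrent-event rate and also drives how often a subject is observed, inducing the informative observation times described above:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
beta = 0.5                                           # regression coefficient

x = rng.integers(0, 2, size=n)                       # binary covariate
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)    # mean-1 subject-specific frailty

records = []
for i in range(n):
    # Observation (visit) times: their number also depends on the frailty,
    # so panel counts and visit times are correlated through the shared frailty.
    n_visits = rng.poisson(3 * frailty[i]) + 1
    visit_times = np.sort(rng.uniform(0, 1, size=n_visits))
    # Recurrent events follow rate = frailty * exp(beta * x) * baseline (baseline = 2 here).
    rate = frailty[i] * np.exp(beta * x[i]) * 2.0
    counts = rng.poisson(rate * np.diff(np.concatenate([[0.0], visit_times])))
    records.append((x[i], visit_times, counts))

# Each record holds the covariate, the panel observation times, and the
# event counts accumulated between consecutive observation times.
print(records[0])
```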

Relevance: 90.00%

Abstract:

Simulation-based assessment is a popular and frequently necessary approach to the evaluation of statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations, and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation, using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are otherwise available only by simulation. We consider evaluating and comparing a variety of ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.
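A toy version of the idea, under assumptions not taken from the paper: per-SNP effect estimates are drawn directly from their large-sample normal approximation rather than simulated from individual-level genotype data, and SNPs are then ranked by p-value to see where the truly associated SNPs fall.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
m, n_assoc, n_study = 10_000, 50, 2000

# True effects: a few associated SNPs, the rest null.
beta = np.zeros(m)
beta[:n_assoc] = rng.normal(0, 0.15, size=n_assoc)

# Large-sample approximation: beta_hat ~ N(beta, se^2) with se ~ 1/sqrt(n_study),
# instead of simulating genotypes and running m regressions.
se = 1.0 / np.sqrt(n_study)
beta_hat = rng.normal(beta, se)
z = beta_hat / se
pval = 2 * stats.norm.sf(np.abs(z))

# Rank SNPs by p-value and ask where the truly associated ones fall.
ranks = np.empty(m, dtype=int)
ranks[np.argsort(pval)] = np.arange(1, m + 1)
print("median rank of associated SNPs:", np.median(ranks[:n_assoc]))
print("associated SNPs in the top 100:", np.sum(ranks[:n_assoc] <= 100))
```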

Relevance: 90.00%

Abstract:

The current study tested two competing models of Attention-Deficit/Hyperactivity Disorder (AD/HD), the inhibition and state regulation theories, by conducting fine-grained analyses of the Stop-Signal Task and another putative measure of behavioral inhibition, the Gordon Continuous Performance Test (G-CPT), in a large sample of children and adolescents. The inhibition theory posits that performance on these tasks reflects increased difficulties for AD/HD participants to inhibit prepotent responses. The model predicts that putative stop-signal reaction time (SSRT) group differences on the Stop-Signal Task will be primarily related to AD/HD participants requiring more warning than control participants to inhibit to the stop-signal and emphasizes the relative importance of commission errors, particularly "impulsive" type commissions, over other error types on the G-CPT. The state regulation theory, on the other hand, proposes response variability due to difficulties maintaining an optimal state of arousal as the primary deficit in AD/HD. This model predicts that SSRT differences will be more attributable to slower and/or more variable reaction time (RT) in the AD/HD group, as opposed to reflecting inhibitory deficits. State regulation assumptions also emphasize the relative importance of omission errors and "slow processing" type commissions over other error types on the G-CPT. Overall, results of Stop-Signal Task analyses were more supportive of state regulation predictions and showed that greater response variability (i.e., SDRT) in the AD/HD group was not reducible to slow mean reaction time (MRT) and that response variability made a larger contribution to increased SSRT in the AD/HD group than inhibitory processes. Examined further, ex-Gaussian analyses of Stop-Signal Task go-trial RT distributions revealed that increased variability in the AD/HD group was not due solely to a few excessively long RTs in the tail of the AD/HD distribution (i.e., tau), but rather indicated the importance of response variability throughout AD/HD group performance on the Stop-Signal Task, as well as the notable sensitivity of ex-Gaussian analyses to variability in data screening procedures. Results of G-CPT analyses indicated some support for the inhibition model, although error type analyses failed to further differentiate the theories. Finally, inclusion of primary variables of interest in exploratory factor analysis with other neurocognitive predictors of AD/HD indicated response variability as a separable construct and further supported its role in Stop-Signal Task performance. Response variability did not, however, make a unique contribution to the prediction of AD/HD symptoms beyond measures of motor processing speed in multiple deficit regression analyses. Results have implications for the interpretation of the processes reflected in widely-used variables in the AD/HD literature, as well as for the theoretical understanding of AD/HD.
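For readers unfamiliar with the ex-Gaussian decomposition mentioned above, a minimal illustration of fitting one to reaction-time data: scipy parameterizes the exponentially modified normal by a shape K = tau/sigma, so mu, sigma, and tau are recovered from the fitted loc, scale, and shape. The data here are simulated, not from this study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Simulate RTs (ms) from an ex-Gaussian: Normal(mu, sigma) + Exponential(tau).
mu, sigma, tau = 450.0, 60.0, 150.0
rt = rng.normal(mu, sigma, size=1000) + rng.exponential(tau, size=1000)

# scipy's exponnorm uses shape K = tau / sigma, loc = mu, scale = sigma.
K_hat, loc_hat, scale_hat = stats.exponnorm.fit(rt)
mu_hat, sigma_hat, tau_hat = loc_hat, scale_hat, K_hat * scale_hat
print(f"mu={mu_hat:.0f}  sigma={sigma_hat:.0f}  tau={tau_hat:.0f}")
```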

Relevance: 90.00%

Abstract:

Thesis (Master's)--University of Washington, 2016-06

Relevance: 90.00%

Abstract:

The three-parameter lognormal distribution is the extension of the two-parameter lognormal distribution to meet the needs of the biological, sociological, and other fields. Numerous research papers have been published on parameter estimation problems for the lognormal distributions. The inclusion of the location parameter brings in some technical difficulties for the parameter estimation problems, especially for interval estimation. This paper proposes a method for constructing exact confidence intervals and exact upper confidence limits for the location parameter of the three-parameter lognormal distribution. The point estimation problem is discussed as well. The performance of the point estimator is compared with that of the maximum likelihood estimator, which is widely used in practice. Simulation results show that the proposed method is less biased in estimating the location parameter. The large-sample-size case is also discussed in the paper.
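A quick illustration of the parameterization (not the paper's exact-interval method): scipy's lognorm treats loc as the threshold (location) parameter, so a three-parameter maximum likelihood fit can be obtained from the generic fit routine. Note that the threshold MLE is known to be unstable for small samples, which is part of the motivation for alternative estimators.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Three-parameter lognormal: X = gamma + exp(mu + sigma * Z).
gamma, mu, sigma = 10.0, 1.0, 0.5
x = gamma + rng.lognormal(mean=mu, sigma=sigma, size=200)

# scipy parameterization: lognorm(s=sigma, loc=gamma, scale=exp(mu)).
# The generic MLE estimates all three parameters, including the threshold.
s_hat, loc_hat, scale_hat = stats.lognorm.fit(x)
print(f"gamma_hat={loc_hat:.2f}  mu_hat={np.log(scale_hat):.2f}  sigma_hat={s_hat:.2f}")
```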

Relevance: 90.00%

Abstract:

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees, and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using a regularized regression or Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves very minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments showing excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
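A condensed sketch of the message idea under simplifying assumptions: a plain cross-validated lasso does the per-subset selection, the 'median inclusion index' reduces to a majority vote on the inclusion indicators, and ordinary least-squares refits on the selected features are averaged. Tuning details and the Bayesian variable selection variant are omitted.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def message_fit(X, y, n_subsets=10, rng=None):
    """Sample-space partitioning: lasso selection per subset, median
    (majority) vote on inclusion, then averaged least-squares refits."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(y))
    subsets = np.array_split(idx, n_subsets)

    # Step 1: feature selection on each subset (conceptually in parallel).
    inclusion = np.zeros((n_subsets, X.shape[1]))
    for k, s in enumerate(subsets):
        lasso = LassoCV(cv=5).fit(X[s], y[s])
        inclusion[k] = lasso.coef_ != 0

    # Step 2: median of the inclusion indicators = majority vote across subsets.
    selected = np.median(inclusion, axis=0) >= 0.5
    if not selected.any():
        return selected, np.zeros(X.shape[1])

    # Step 3: refit on the selected features in each subset and average.
    coefs = np.zeros((n_subsets, selected.sum()))
    for k, s in enumerate(subsets):
        coefs[k] = LinearRegression().fit(X[s][:, selected], y[s]).coef_
    beta = np.zeros(X.shape[1])
    beta[selected] = coefs.mean(axis=0)
    return selected, beta

# Illustration on synthetic data.
rng = np.random.default_rng(6)
X = rng.normal(size=(5000, 50))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + rng.normal(size=5000)
selected, beta = message_fit(X, y, rng=rng)
print(np.flatnonzero(selected), beta[selected].round(2))
```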

While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
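A rough sketch of only the decorrelation step, with an assumed form of the whitening matrix, sqrt(p)(XX^T + rI)^{-1/2}, followed by a per-block lasso; the actual DECO algorithm, including its refinement stage and regularization choices, should be taken from the thesis.

```python
import numpy as np
from numpy.linalg import eigh
from sklearn.linear_model import LassoCV

def deco_decorrelate(X, y, ridge=1.0):
    """Whiten the rows so that feature blocks held by different workers are
    (approximately) decorrelated: X_tilde = sqrt(p) (XX^T + r I)^{-1/2} X.
    This particular scaling is an assumption made for illustration."""
    n, p = X.shape
    G = X @ X.T + ridge * np.eye(n)
    w, V = eigh(G)                                   # G = V diag(w) V^T
    F = np.sqrt(p) * (V @ np.diag(1.0 / np.sqrt(w)) @ V.T)
    return F @ X, F @ y

rng = np.random.default_rng(7)
n, p = 300, 1000
# Strongly correlated design: features share a common latent factor.
latent = rng.normal(size=(n, 1))
X = 0.8 * latent + 0.6 * rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

Xt, yt = deco_decorrelate(X, y)

# Feature-space partitioning: each "worker" runs the lasso on its own block
# of decorrelated columns; selections are then concatenated.
blocks = np.array_split(np.arange(p), 4)
selected = []
for b in blocks:
    fit = LassoCV(cv=5).fit(Xt[:, b], yt)
    selected.extend(b[fit.coef_ != 0])
print(sorted(selected)[:10])
```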

For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), leveraging both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted on a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.

Relevance: 90.00%

Abstract:

Aims. The large- and small-scale (pc) structure of the Galactic interstellar medium can be investigated by utilising spectra of early-type stellar probes of known distances in the same region of the sky. This paper determines the variation in line strength of Ca ii at 3933.661 Å as a function of probe separation for a large sample of stars, including a number of sightlines in the Magellanic Clouds.

Methods. FLAMES-GIRAFFE Ca ii data taken with the Very Large Telescope towards early-type stars in 3 Galactic and 4 Magellanic open clusters are used to obtain the velocity, equivalent width, column density, and line width of interstellar Galactic calcium for a total of 657 stars, of which 443 are Magellanic Cloud sightlines. Between 43 and 111 stars are observed in each cluster. Additionally, FEROS and UVES Ca ii K and Na i D spectra of 21 Galactic and 154 Magellanic early-type stars are presented and combined with data from the literature to study the calcium column density–parallax relationship.

Results. For the four Magellanic clusters studied with FLAMES, the strength of the Galactic interstellar Ca ii K equivalent width on transverse scales from ∼0.05-9 pc is found to vary by factors of ∼1.8-3.0, corresponding to column density variations of ∼0.3-0.5 dex in the optically-thin approximation. Using FLAMES, FEROS, and UVES archive spectra, the minimum and maximum reduced equivalent widths for Milky Way gas are found to lie in the range ∼35-125 mÅ and ∼30-160 mÅ for Ca ii K and Na i D, respectively. The range is consistent with a previously published simple model of the interstellar medium consisting of spherical cloudlets of filling factor ∼0.3, although other geometries are not ruled out. Finally, the derived functional form for parallax (π) and Ca ii column density (N_CaII) is found to be π(mas) = 1 / (2.39 × 10^−13 × N_CaII (cm^−2) + 0.11). Our derived parallax is ∼25 per cent lower than predicted by Megier et al. (2009, A&A, 507, 833) at a distance of ∼100 pc and ∼15 per cent lower at a distance of ∼200 pc, reflecting inhomogeneity in the Ca ii distribution in the different sightlines studied.
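The quoted relation can be inverted to obtain a rough Ca ii-based distance estimate; the helper below uses exactly the published coefficients, while the example column density is arbitrary.

```python
def parallax_from_caii(n_caii_cm2: float) -> float:
    """Parallax in mas from the Ca II K column density (cm^-2), using
    pi(mas) = 1 / (2.39e-13 * N_CaII + 0.11) from the abstract above."""
    return 1.0 / (2.39e-13 * n_caii_cm2 + 0.11)

# Example: N(Ca II) = 2e12 cm^-2  ->  parallax and distance in pc.
pi_mas = parallax_from_caii(2e12)
print(f"parallax = {pi_mas:.2f} mas, distance ≈ {1000.0 / pi_mas:.0f} pc")
```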

Relevance: 90.00%

Abstract:

This report discusses analytic second-order bias-correction techniques for the maximum likelihood estimates (MLEs for short) of the unknown parameters of distributions used in quality and reliability analysis. It is well known that MLEs are widely used to estimate the unknown parameters of probability distributions due to their various desirable properties; for example, the MLEs are asymptotically unbiased, consistent, and asymptotically normal. However, many of these properties depend on extremely large sample sizes. Those properties, such as unbiasedness, may not be valid for small or even moderate sample sizes, which are more common in real data applications. Therefore, bias-corrected techniques for the MLEs are desired in practice, especially when the sample size is small. Two commonly used techniques to reduce the bias of the MLEs are the 'preventive' and 'corrective' approaches. Both can reduce the bias of the MLEs to order O(n^−2), whereas the 'preventive' approach does not have an explicit closed-form expression. Consequently, we mainly focus on the 'corrective' approach in this report. To illustrate the importance of bias correction in practice, we apply the bias-corrected method to two popular lifetime distributions: the inverse Lindley distribution and the weighted Lindley distribution. Numerical studies based on the two distributions show that the considered bias-corrected technique is highly recommended over the commonly used estimators without bias correction. Therefore, special attention should be paid when we estimate the unknown parameters of probability distributions in scenarios where the sample size is small or moderate.
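The analytic Cox-Snell-type corrective formulas are distribution-specific and not reproduced here. As a generic stand-in, the sketch below illustrates small-sample MLE bias and a parametric-bootstrap bias correction (a related but different corrective device) for the shape parameter of a gamma distribution, which is only a placeholder for the Lindley-type models studied in the report.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
true_shape, true_scale, n, B = 2.0, 1.5, 20, 200

# MLE of the gamma shape parameter (location fixed at 0) from a small sample.
x = rng.gamma(true_shape, true_scale, size=n)
a_hat, _, scale_hat = stats.gamma.fit(x, floc=0)

# Parametric bootstrap estimate of the MLE's bias at the fitted parameters,
# then subtract it: a 'corrective'-style adjustment, though not the analytic
# second-order formula discussed in the report.
boot = np.array([
    stats.gamma.fit(rng.gamma(a_hat, scale_hat, size=n), floc=0)[0]
    for _ in range(B)
])
a_corrected = a_hat - (boot.mean() - a_hat)
print(f"MLE: {a_hat:.3f}   bias-corrected: {a_corrected:.3f}   true shape: {true_shape}")
```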

Relevance: 80.00%

Abstract:

The cranial base, composed of the midline and lateral basicranium, is a structurally important region of the skull associated with several key traits, which has been extensively studied in anthropology and primatology. In particular, most studies have focused on the association between midline cranial base flexion and relative brain size, or encephalization. However, variation in lateral basicranial morphology has been studied less thoroughly. Platyrrhines are a group of primates that experienced a major evolutionary radiation accompanied by extensive morphological diversification in Central and South America over a large temporal scale. Previous studies have also suggested that they underwent several evolutionarily independent processes of encephalization. Given these characteristics, platyrrhines present an excellent opportunity to study, on a large phylogenetic scale, the morphological correlates of primate diversification in brain size. In this study we explore the pattern of variation in basicranial morphology and its relationship with phylogenetic branching and with encephalization in platyrrhines. We quantify variation in the 3D shape of the midline and lateral basicranium and endocranial volumes in a large sample of platyrrhine species, employing high-resolution CT-scans and geometric morphometric techniques. We investigate the relationship between basicranial shape and encephalization using phylogenetic regression methods and calculate a measure of phylogenetic signal in the datasets. The results showed that phylogenetic structure is the most important dimension for understanding platyrrhine cranial base diversification; only Aotus species do not show concordance with our molecular phylogeny. Encephalization was only correlated with midline basicranial flexion, and species that exhibit convergence in their relative brain size do not display convergence in lateral basicranial shape. The evolution of basicranial variation in primates is probably more complex than previously believed, and understanding it will require further studies exploring the complex interactions between encephalization, brain shape, cranial base morphology, and ecological dimensions acting along the species divergence process.

Relevance: 80.00%

Abstract:

Background: The alpha1A-adrenergic receptor (alpha(1A)-AR) regulates the cardiac and peripheral vascular system through sympathetic activation. Due to its important role in the regulation of vascular tone and blood pressure, we aimed to investigate the association between the Arg347Cys polymorphism in the alpha(1A)-AR gene and blood pressure phenotypes in a large sample of Brazilians from an urban population. Methods: A total of 1568 individuals were randomly selected from the general population of the Vitoria City metropolitan area. Genetic analysis of the Arg347Cys polymorphism was conducted by polymerase chain reaction/restriction fragment length polymorphism. We compared cardiovascular risk variables across genotypes using ANOVA and chi-square tests for univariate comparisons and logistic regression for multivariate comparisons. Results: Association analysis indicated a significant difference between genotype groups with respect to diastolic blood pressure (p = 0.04), but not systolic blood pressure (p = 0.12). In addition, presence of the Cys/Cys genotype was marginally associated with hypertension in our population (p = 0.06). Significant interaction effects were observed between the studied genetic variant, age, and physical activity. Presence of the Cys/Cys genotype was associated with hypertension only in individuals with regular physical activity (odds ratio = 1.86; p = 0.03) or younger than 45 years (odds ratio = 1.27; p = 0.04). Conclusion: Physical activity and age may play a role by revealing the effects of the Cys allele on blood pressure. According to our data, it is possible that the Arg347Cys polymorphism can be used as a biomarker of disease risk in a selected group of individuals.
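A schematic of the kind of multivariate comparison described, with a genotype-by-physical-activity interaction term; the variable names, coding, and data below are hypothetical, not the study's dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
n = 1568

# Hypothetical analysis dataset: Cys/Cys carrier status (1/0), regular
# physical activity (1/0), age in years, and hypertension status (1/0).
df = pd.DataFrame({
    "cys_cys": rng.integers(0, 2, n),
    "active": rng.integers(0, 2, n),
    "age": rng.normal(50, 12, n).round(),
})
logit_p = -2.0 + 0.04 * (df.age - 50) + 0.6 * df.cys_cys * df.active
df["hypertension"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Logistic regression with a genotype-by-physical-activity interaction term.
model = smf.logit("hypertension ~ cys_cys * active + age", data=df).fit(disp=0)
print(np.exp(model.params))        # odds ratios
```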

Relevance: 80.00%

Abstract:

Introduction and Purpose: Bimatoprost and the fixed combination of latanoprost with timolol maleate are two medications widely used to treat glaucoma and ocular hypertension (OHT). The aim of the study is to compare the efficacy of these two drugs in reducing intraocular pressure (IOP) after 8 weeks of treatment in patients with primary open-angle glaucoma (POAG) or OHT. Methods: In this randomized, open-label trial, 44 patients with POAG or OHT were allocated to receive either bimatoprost (1 drop QD) or latanoprost/timolol (1 drop QD). The primary outcome was the mean diurnal IOP at the 8th week, calculated as the mean of the IOP measurements taken at 8:00 AM, 10:00 AM, and 12:00 PM. Secondary outcomes included the change from baseline in IOP measured 3 times a day, the IOP response to the water-drinking test (performed after the last IOP measurement), and the assessment of side effects of each therapy. Results: The mean IOP level in the latanoprost/timolol group (13.83, SD = 2.54) was significantly lower than in the bimatoprost group (16.16, SD = 3.28; P < 0.0001) at week 8. Also, the change in mean IOP values was significantly greater in the latanoprost/timolol group at 10:00 AM (P = 0.013) and 12:00 PM (P = 0.01), but not at 8:00 AM (P = ns). During the water-drinking test, there was no significant difference in IOP increase (absolute or percentage) between groups; however, there was a significant decrease in mean heart rate in the latanoprost/timolol group. Finally, no significant changes in blood pressure or lung spirometry were observed in either group. Conclusions: The fixed combination of latanoprost/timolol was significantly superior to bimatoprost alone in reducing IOP in patients with POAG or OHT. Further studies with large sample sizes should be conducted to support the superior efficacy of latanoprost/timolol, as well as to better assess its profile of side effects.