Abstract:
Introduction: It is unclear whether patients diagnosed according to International Classification of Headache Disorders criteria for migraine with aura (MA) and migraine without aura (MO) experience distinct disorders or whether their migraine subtypes are genetically related. Aim: Using a novel gene-based (statistical) approach, we aimed to identify individual genes and pathways associated with both MA and MO. Methods: Gene-based tests were performed using genome-wide association summary statistic results from the most recent International Headache Genetics Consortium study comparing 4505 MA cases with 34,813 controls and 4038 MO cases with 40,294 controls. After accounting for non-independence of gene-based test results, we examined the significance of the proportion of shared genes associated with MA and MO. Results: We found a significant overlap in genes associated with MA and MO. Of the total 1514 genes with a nominally significant gene-based p value (p_gene-based ≤ 0.05) in the MA subgroup, 107 also produced p_gene-based ≤ 0.05 in the MO subgroup. The proportion of overlapping genes is almost double the empirically derived null expectation, producing significant evidence of gene-based overlap (pleiotropy) (p_binomial-test = 1.5 × 10^-4). Combining results across MA and MO, six genes produced genome-wide significant gene-based p values. Four of these genes (TRPM8, UFL1, FHL5 and LRP1) were located in close proximity to previously reported genome-wide significant SNPs for migraine, while two genes, TARBP2 and NPFF, separated by just 259 bp on chromosome 12q13.13, represent a novel risk locus. The genes overlapping in both migraine types were enriched for functions related to inflammation, the cardiovascular system and connective tissue.
Conclusions: Our results provide novel insight into the likely genes and biological mechanisms that underlie both MA and MO, and when combined with previous data, highlight the neuropeptide FF-amide peptide encoding gene (NPFF) as a novel candidate risk gene for both types of migraine.
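The reported overlap can be sanity-checked with a one-sided binomial test. This is an illustrative sketch only: the paper uses an empirically derived null that accounts for non-independence of gene-based tests, whereas here a naive null proportion of 0.05 is assumed.

```python
from scipy.stats import binomtest

# Reported counts: 1514 genes with p_gene-based <= 0.05 in MA,
# of which 107 also reached p_gene-based <= 0.05 in MO.
n_ma_genes, n_overlap = 1514, 107

# Naive null (assumption): under independence, ~5% of the MA genes
# would overlap by chance. The paper instead uses an empirically
# derived null accounting for non-independence of gene-based tests.
result = binomtest(n_overlap, n_ma_genes, p=0.05, alternative="greater")
print(result.pvalue < 0.001)  # True: overlap is well beyond chance
```

Even under this simplistic null, the overlap is highly significant, consistent in magnitude with the reported p_binomial-test.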
Abstract:
Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide polymorphisms across the genome is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ~4000 and ~133,000 individuals.
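The genomic inflation factor described here is conventionally estimated as the median observed association chi-square statistic divided by the median of the chi-squared distribution with one degree of freedom (~0.455). A minimal sketch, assuming chi-square association statistics are already available:

```python
import numpy as np
from scipy import stats

def genomic_inflation_factor(chi2_stats):
    """Genomic control lambda: median observed chi-square statistic
    divided by the median of the chi-squared distribution with 1 df."""
    expected_median = stats.chi2.ppf(0.5, df=1)  # ~0.4549
    return np.median(chi2_stats) / expected_median

# Under the null (no association, no structure), lambda should be ~1
rng = np.random.default_rng(0)
null_chi2 = rng.chisquare(df=1, size=100_000)
print(round(genomic_inflation_factor(null_chi2), 2))  # close to 1
```

Test statistics at individual loci are then deflated by dividing by lambda; the abstract's point is that polygenic inheritance alone pushes lambda above 1 even without confounding.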
Abstract:
The impact of erroneous genotypes having passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNPs). To detect such genotyping errors, a simple two-locus QC method, based on the difference in the test statistic of association between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance even when standard single-SNP QC analyses failed to detect them in real data. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single-SNP QC but detected by the proposed approach varied from a few hundred to thousands. Using simulated data, it was shown that the proposed method was powerful and performed better than other tested existing methods. The power of the proposed approach to detect erroneous genotypes was approximately 80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to a better quality of genotypes for subsequent genotype-phenotype investigations.
Design and testing of stand-specific bucking instructions for use on modern cut-to-length harvesters
Abstract:
This study addresses three important issues in tree bucking optimization in the context of cut-to-length harvesting. (1) Would the fit between the log demand and log output distributions be better if the price and/or demand matrices controlling the bucking decisions on modern cut-to-length harvesters were adjusted to the unique conditions of each individual stand? (2) In what ways can we generate stand and product specific price and demand matrices? (3) What alternatives do we have to measure the fit between the log demand and log output distributions, and what would be an ideal goodness-of-fit measure? Three iterative search systems were developed for seeking stand-specific price and demand matrix sets: (1) A fuzzy logic control system for calibrating the price matrix of one log product for one stand at a time (the stand-level one-product approach); (2) a genetic algorithm system for adjusting the price matrices of one log product in parallel for several stands (the forest-level one-product approach); and (3) a genetic algorithm system for dividing the overall demand matrix of each of the several log products into stand-specific sub-demands simultaneously for several stands and products (the forest-level multi-product approach). The stem material used for testing the performance of the stand-specific price and demand matrices against that of the reference matrices was comprised of 9 155 Norway spruce (Picea abies (L.) Karst.) sawlog stems gathered by harvesters from 15 mature spruce-dominated stands in southern Finland. The reference price and demand matrices were either direct copies or slightly modified versions of those used by two Finnish sawmilling companies. Two types of stand-specific bucking matrices were compiled for each log product. One was from the harvester-collected stem profiles and the other was from the pre-harvest inventory data. 
Four goodness-of-fit measures were analyzed for their appropriateness in determining the similarity between the log demand and log output distributions: (1) the apportionment degree (index), (2) the chi-square statistic, (3) Laspeyres quantity index, and (4) the price-weighted apportionment degree. The study confirmed that any improvement in the fit between the log demand and log output distributions can only be realized at the expense of log volumes produced. Stand-level pre-control of price matrices was found to be advantageous, provided the control is done with perfect stem data. Forest-level pre-control of price matrices resulted in no improvement in the cumulative apportionment degree. Cutting stands under the control of stand-specific demand matrices yielded a better total fit between the demand and output matrices at the forest level than was obtained by cutting each stand with non-stand-specific reference matrices. The theoretical and experimental analyses suggest that none of the three alternative goodness-of-fit measures clearly outperforms the traditional apportionment degree measure. Keywords: harvesting, tree bucking optimization, simulation, fuzzy control, genetic algorithms, goodness-of-fit
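The apportionment degree used above as the baseline goodness-of-fit measure is commonly computed as the sum of cell-wise minima of the demand and output proportions. A minimal sketch of one common formulation (the exact matrix layout and any weighting used in the study may differ):

```python
import numpy as np

def apportionment_degree(demand, output):
    """Apportionment degree: overlap between two distributions,
    computed as the sum of cell-wise minima of their proportions.
    Returns a value in [0, 1]; 1 means a perfect fit."""
    d = np.asarray(demand, dtype=float)
    o = np.asarray(output, dtype=float)
    return np.minimum(d / d.sum(), o / o.sum()).sum()

# Hypothetical demand vs. produced log volumes over three log classes
demand = [40, 35, 25]
output = [30, 45, 25]
print(round(apportionment_degree(demand, output), 2))  # 0.9
```

In practice the demand and output matrices are indexed by log length and diameter class; flattening them into vectors, as here, leaves the measure unchanged.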
Abstract:
Objective To investigate the epidemic characteristics of human cutaneous anthrax (CA) in China, detect the spatiotemporal clusters at the county level for preemptive public health interventions, and evaluate the differences in the epidemiological characteristics within and outside clusters. Methods CA cases reported during 2005–2012 from the national surveillance system were evaluated at the county level using the space-time scan statistic. Comparative analysis of the epidemic characteristics within and outside identified clusters was performed using the χ2 test or Kruskal-Wallis test. Results The group aged 30–39 years had the highest incidence of CA, and the fatality rate increased with age, with persons ≥70 years showing a fatality rate of 4.04%. Seasonality analysis showed that most CA cases occurred between May/June and September/October of each year. The primary spatiotemporal cluster contained 19 counties from June 2006 to May 2010, and was mainly located straddling the borders of Sichuan, Gansu, and Qinghai provinces. In these high-risk areas, CA cases were predominantly younger local male shepherds living on agriculture and stockbreeding, and were characterized by high morbidity, low mortality and a shorter period from illness onset to diagnosis. Conclusion CA was geographically and persistently clustered in southwestern China during 2005–2012, with notable differences in the epidemic characteristics within and outside spatiotemporal clusters; this demonstrates the necessity for CA interventions such as enhanced surveillance, health education, and mandatory and standard decontamination or disinfection procedures to be geographically targeted to the areas identified in this study.
Abstract:
OBJECTIVES Based on self-reported measures, sedentary time has been associated with chronic disease and mortality. This study examined the validity of the wrist-worn GENEActiv accelerometer for measuring sedentary time (i.e. sitting and lying) by posture classification, during waking hours in free-living adults. DESIGN Fifty-seven participants (age 18-55 years; 52% male) were recruited using convenience sampling from a large metropolitan Australian university. METHODS Participants wore a GENEActiv accelerometer on their non-dominant wrist and an activPAL device attached to their right thigh for 24 h (00:00 to 23:59:59). Pearson's Correlation Coefficient was used to examine the convergent validity of the GENEActiv and the activPAL for estimating total sedentary time during waking hours. Agreement was illustrated using Bland and Altman plots, and intra-individual agreement for posture was assessed with the Kappa statistic. RESULTS Estimates of average total sedentary time over 24 h were 623 (SD 103) min/day from the GENEActiv, and 626 (SD 123) min/day from the activPAL, with an Intraclass Correlation Coefficient of 0.80 (95% confidence interval 0.68-0.88). Bland and Altman plots showed slight underestimation of mean total sedentary time for the GENEActiv relative to the activPAL (mean difference: -3.44 min/day), with moderate limits of agreement (-144 to 137 min/day). Mean Kappa for posture was 0.53 (SD 0.12), indicating moderate agreement for this sample at the individual level. CONCLUSIONS The estimation of sedentary time by posture classification with the wrist-worn GENEActiv accelerometer was comparable to the activPAL. The GENEActiv may provide an alternative, easy-to-wear device-based measure for descriptive estimates of sedentary time in population samples.
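Epoch-level posture agreement of the kind summarised here by the Kappa statistic can be sketched as follows; the data are hypothetical, not the study's:

```python
import numpy as np

def cohens_kappa(a, b):
    """Cohen's kappa for two binary posture classifications
    (1 = sedentary, 0 = upright), compared epoch by epoch."""
    a, b = np.asarray(a), np.asarray(b)
    p_obs = np.mean(a == b)                        # observed agreement
    p_a1, p_b1 = a.mean(), b.mean()                # marginal proportions
    p_exp = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)  # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical 10 epochs classified by two devices
geneactiv = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
activpal  = [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
print(round(cohens_kappa(geneactiv, activpal), 2))  # 0.58
```

Values around 0.4-0.6 are conventionally read as moderate agreement, matching the mean Kappa of 0.53 reported above.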
Abstract:
Having the ability to work with complex models can be highly beneficial, but the computational cost of doing so is often large. Complex models often have intractable likelihoods, so methods that directly use the likelihood function are infeasible. In these situations, the benefits of working with likelihood-free methods become apparent. Likelihood-free methods, such as parametric Bayesian indirect likelihood that uses the likelihood of an alternative parametric auxiliary model, have been explored throughout the literature as a good alternative when the model of interest is complex. One of these methods is called the synthetic likelihood (SL), which assumes a multivariate normal approximation to the likelihood of a summary statistic of interest. This paper explores the accuracy and computational efficiency of the Bayesian version of the synthetic likelihood (BSL) approach in comparison to a competitor known as approximate Bayesian computation (ABC), as well as its sensitivity to tuning parameters and assumptions. We relate BSL to pseudo-marginal methods and propose to use an alternative SL that uses an unbiased estimator of the exact working normal likelihood when the summary statistic has a multivariate normal distribution. Several applications of varying complexity are considered to illustrate the findings of this paper.
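The standard synthetic likelihood can be sketched in a few lines: simulate under a candidate parameter, fit a multivariate normal to the simulated summary statistics, and evaluate the observed summary under that normal. A toy illustration (not the BSL implementation used in the paper):

```python
import numpy as np
from scipy.stats import multivariate_normal

def synthetic_log_likelihood(theta, s_obs, simulate, summarise,
                             n_sims=200, rng=None):
    """Standard synthetic likelihood: a multivariate normal
    approximation to the likelihood of the summary statistic."""
    rng = rng if rng is not None else np.random.default_rng()
    sims = np.array([summarise(simulate(theta, rng)) for _ in range(n_sims)])
    mu, cov = sims.mean(axis=0), np.cov(sims, rowvar=False)
    return multivariate_normal.logpdf(s_obs, mean=mu, cov=cov)

# Toy model: data ~ Normal(theta, 1); summaries are sample mean and sd
simulate = lambda theta, rng: rng.normal(theta, 1.0, size=100)
summarise = lambda x: np.array([x.mean(), x.std()])

rng = np.random.default_rng(1)
s_obs = summarise(simulate(0.0, rng))          # "observed" summary
ll_near = synthetic_log_likelihood(0.0, s_obs, simulate, summarise, rng=rng)
ll_far = synthetic_log_likelihood(2.0, s_obs, simulate, summarise, rng=rng)
print(ll_near > ll_far)  # True: SL peaks near the true parameter
```

In BSL this log-likelihood estimate is plugged into an MCMC sampler over theta, which is where the connection to pseudo-marginal methods arises.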
Abstract:
This thesis studies empirically whether measurement errors in aggregate production statistics affect sentiment and future output. Initial announcements of aggregate production are subject to measurement error, because many of the data required to compile the statistics are produced with a lag. This measurement error can be gauged as the difference between the latest revised statistic and its initial announcement. Assuming aggregate production statistics help forecast future aggregate production, these measurement errors are expected to affect macroeconomic forecasts. Assuming agents' macroeconomic forecasts affect their production choices, these measurement errors should affect future output through sentiment. This thesis is primarily empirical, so the theoretical basis, strategic complementarity, is discussed quite briefly. However, it is a model in which higher aggregate production increases each agent's incentive to produce. In this circumstance a statistical announcement which suggests aggregate production is high would increase each agent's incentive to produce, thus resulting in higher aggregate production. In this way the existence of strategic complementarity provides the theoretical basis for output fluctuations caused by measurement mistakes in aggregate production statistics. Previous empirical studies suggest that measurement errors in gross national product affect future aggregate production in the United States. Additionally, it has been demonstrated that measurement errors in the Index of Leading Indicators affect forecasts by professional economists as well as future industrial production in the United States. This thesis aims to verify the applicability of these findings to other countries, and to study the link between measurement errors in gross domestic product, sentiment and future output.
Professional forecasts and consumer sentiment in the United States and Finland, as well as producer sentiment in Finland, are used as the measures of sentiment. Using statistical techniques it is found that measurement errors in gross domestic product affect forecasts and producer sentiment. The effect on consumer sentiment is ambiguous. The relationship between measurement errors and future output is explored using data from Finland, the United States, the United Kingdom, New Zealand and Sweden. It is found that measurement errors have affected aggregate production or investment in Finland, the United States, the United Kingdom and Sweden. Specifically, it was found that overly optimistic statistics announcements are associated with higher output, and vice versa.
Abstract:
The paper presents a geometry-free approach to assess the variation of covariance matrices of undifferenced triple-frequency GNSS measurements and its impact on positioning solutions. Four independent geometry-free/ionosphere-free (GFIF) models formed from original triple-frequency code and phase signals allow for effective computation of variance-covariance matrices using real data. Variance Component Estimation (VCE) algorithms are implemented to obtain the covariance matrices for three pseudorange and three carrier-phase signals epoch by epoch. Covariance results from triple-frequency BeiDou System (BDS) and GPS data sets demonstrate that the estimated standard deviation varies consistently with the amplitude of the actual GFIF error time series. The single point positioning (SPP) results from BDS ionosphere-free measurements at four MGEX stations demonstrate an improvement of up to about 50% in the Up direction relative to results based on mean square statistics. Additionally, a more extensive SPP analysis at 95 global MGEX stations based on GPS ionosphere-free measurements shows an average improvement of about 10% relative to the traditional results. This finding provides preliminary confirmation that adequate consideration of the variation of covariance leads to improved GNSS state solutions.
Abstract:
A test for time-varying correlation is developed within the framework of a dynamic conditional score (DCS) model for both Gaussian and Student t-distributions. The test may be interpreted as a Lagrange multiplier test and modified to allow for the estimation of models for time-varying volatility in the individual series. Unlike standard moment-based tests, the score-based test statistic includes information on the level of correlation under the null hypothesis and local power arguments indicate the benefits of doing so. A simulation study shows that the performance of the score-based test is strong relative to existing tests across a range of data generating processes. An application to the Hong Kong and South Korean equity markets shows that the new test reveals changes in correlation that are not detected by the standard moment-based test.
Abstract:
In this paper, we consider the design and bit-error performance analysis of linear parallel interference cancellers (LPIC) for multicarrier (MC) direct-sequence code division multiple access (DS-CDMA) systems. We propose an LPIC scheme where we estimate and cancel the multiple access interference (MAI) based on the soft decision outputs on individual subcarriers, and the interference-cancelled outputs on different subcarriers are combined to form the final decision statistic. We scale the MAI estimate on individual subcarriers by a weight before cancellation. In order to choose these weights optimally, we derive exact closed-form expressions for the bit-error rate (BER) at the output of different stages of the LPIC, which we minimize to obtain the optimum weights for the different stages. In addition, using an alternate approach involving the characteristic function of the decision variable, we derive BER expressions for the weighted LPIC scheme, matched filter (MF) detector, decorrelating detector, and minimum mean square error (MMSE) detector for the considered multicarrier DS-CDMA system. We show that the proposed BER-optimized weighted LPIC scheme performs better than the MF detector and the conventional LPIC scheme (where the weights are taken to be unity), and close to the decorrelating and MMSE detectors.
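The weighted cancellation step can be illustrated on a toy synchronous single-carrier system. This sketch uses a hypothetical correlation matrix and a fixed weight rather than the BER-optimized weights derived in the paper, and approximates the noise at the matched-filter output as white:

```python
import numpy as np

rng = np.random.default_rng(2)
K, n_bits, sigma = 4, 20_000, 0.5   # users, bits per user, noise std
w = 0.7                             # hypothetical cancellation weight
R = 0.7 * np.eye(K) + 0.3           # signature cross-correlations (rho = 0.3)

b = rng.choice([-1.0, 1.0], size=(K, n_bits))          # BPSK bits
y = R @ b + sigma * rng.standard_normal((K, n_bits))   # MF outputs

mai = (R - np.eye(K)) @ y   # soft MAI estimate from the MF outputs
z = y - w * mai             # weighted parallel interference cancellation

ber_mf = np.mean(np.sign(y) != b)
ber_pic = np.mean(np.sign(z) != b)
print(ber_pic < ber_mf)     # True: weighted cancellation lowers the BER
```

In the multicarrier scheme of the paper this cancellation happens per subcarrier, with the cancelled outputs then combined into the final decision statistic.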
Abstract:
In this thesis we deal with the concept of risk. The objective is to bring together and conclude on some normative information regarding quantitative portfolio management and risk assessment. The first essay concentrates on return dependency. We propose an algorithm for classifying markets into rising and falling. Given the algorithm, we derive a statistic, the Trend Switch Probability, for detecting long-term return dependency in the first moment. The empirical results suggest that the Trend Switch Probability is robust over various volatility specifications. Serial dependency, however, behaves differently in bear and bull markets: it is strongly positive in rising markets, whereas in bear markets returns are closer to a random walk. Realized volatility, a technique for estimating volatility from high-frequency data, is investigated in essays two and three. In the second essay we find, when measuring realized variance on a set of German stocks, that the second-moment dependency structure is highly unstable and changes randomly. Results also suggest that volatility is non-stationary from time to time. In the third essay we examine the impact of market microstructure on the error between estimated realized volatility and the volatility of the underlying process. With simulation-based techniques we show that autocorrelation in returns leads to biased variance estimates and that lower sampling frequency and non-constant volatility increase the error variation between the estimated variance and the variance of the underlying process. From these essays we can conclude that volatility is not easily estimated, even from high-frequency data. It is neither very well behaved in terms of stability nor dependency over time. Based on these observations, we would recommend the use of simple, transparent methods that are likely to be more robust over differing volatility regimes than models with a complex parameter universe.
In analyzing long-term return dependency in the first moment we find that the Trend Switch Probability is a robust estimator. This is an interesting area for further research, with important implications for active asset allocation.
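Realized volatility, as investigated in essays two and three, is in its simplest form the sum of squared intraday log returns. A toy sketch with simulated 5-minute prices (all parameters illustrative):

```python
import numpy as np

def realized_variance(prices):
    """Realized variance: sum of squared intraday log returns."""
    log_returns = np.diff(np.log(prices))
    return np.sum(log_returns ** 2)

# Simulate one trading day of 5-minute prices (78 returns in 6.5 h)
rng = np.random.default_rng(3)
true_sigma = 0.01                      # daily volatility of the toy process
n = 78
returns = rng.normal(0.0, true_sigma / np.sqrt(n), size=n)
prices = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(returns))))

rv = realized_variance(prices)
print(round(np.sqrt(rv), 4))  # daily vol estimate, close to true_sigma
```

The third essay's point is visible even in this toy: with microstructure noise or autocorrelated returns the simple sum of squares becomes biased, and the bias worsens at lower sampling frequencies.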
Abstract:
The problem of detecting an unknown transient signal in noise is considered. The SNR of the observed data is first enhanced using a wavelet-domain filter. The output of the wavelet-domain filter is then transformed using a Wigner-Ville transform, which separates the spectrum of the observed signal into narrow frequency bands. Each subband signal at the output of the Wigner-Ville block is subjected to wavelet-based level-dependent denoising (WBLDD) to suppress colored noise. A weighted sum of the absolute values of the outputs of WBLDD is passed through an energy detector, whose output is used as the test statistic for the final decision. By assigning weights proportional to the energy of the corresponding subband signals, the proposed detector approximates a frequency-domain matched filter. Simulation results are presented to show that the performance of the proposed detector is better than that of the wavelet packet transform based detector.
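The weighted energy statistic can be sketched loosely as follows; the wavelet filtering, Wigner-Ville transform and WBLDD stages are omitted, and the subband signals here are simply simulated arrays:

```python
import numpy as np

def weighted_energy_statistic(subbands):
    """Weighted sum of absolute subband outputs, with weights
    proportional to subband energies (normalised to sum to 1)."""
    energies = np.array([np.sum(s ** 2) for s in subbands])
    weights = energies / energies.sum()
    return sum(w * np.sum(np.abs(s)) for w, s in zip(weights, subbands))

rng = np.random.default_rng(4)
n = 512
noise_only = [rng.normal(0.0, 1.0, n) for _ in range(4)]
# A transient concentrated in subband 0, on top of the same noise level
signal = [rng.normal(0.0, 1.0, n) + (3.0 if k == 0 else 0.0) * np.hanning(n)
          for k in range(4)]

t_noise = weighted_energy_statistic(noise_only)
t_signal = weighted_energy_statistic(signal)
print(t_signal > t_noise)  # True: the transient raises the statistic
```

Weighting by subband energy emphasises the bands where the transient lives, which is why the detector behaves like an approximate frequency-domain matched filter.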
Abstract:
This study is part of Metsäklusteri Oy's Future Biorefinery research programme, which explores possibilities for utilising forest-industry raw materials more fully and in new products. The aim of the study is to determine the structure and properties of the roots and stump wood of Scots pine (Pinus sylvestris L.) and Norway spruce (Picea abies [L.] Karst.). The study examines whether reaction wood occurs in pine and spruce roots, and determines the proportion of acetone-soluble extractives in stump and root wood. The material consisted of five pines and five spruces of different ages. The root and stump material was collected by the Finnish Forest Research Institute from the Parkano region (62.017°N, 23.017°E) after harvesting. Samples were taken from the below-ground roots at three distances from the root collar. No actual reaction wood was found in the roots of either species, although mild reaction wood was observed in some samples. Mild reaction wood was more common in pine roots than in spruce roots, and none was found in the thinnest root sections, about 2 cm in diameter. The extractive content was higher in pine stumps than in spruce stumps. In pine roots the extractive percentage increased towards the root tip; in spruce it first decreased but increased again near the root tip. In both species the bark had a higher extractive content than the wood. The material was limited, and the study did not aim at statistical generalisability. A study based on more extensive material is needed, because the supply of wood from drained peatlands is increasing in Finland, yet studies of root extractives and reaction wood have so far been based only on material collected from mineral soils.
Abstract:
We revise and extend the extreme value statistic, introduced in Gupta et al., to study direction dependence in the high-redshift supernova data, arising either from departures from the cosmological principle or from direction-dependent statistical systematics in the data. We introduce a likelihood function that analytically marginalizes over the Hubble constant and use it to extend our previous statistic. We also introduce a new statistic that is sensitive to direction dependence arising from living off-centre inside a large void, as well as from the previously mentioned sources of anisotropy. We show that for large data sets, this statistic has a limiting form that can be computed analytically. We apply our statistics to the gold data sets from Riess et al., as in our previous work. Our revision and extension of the previous statistic show that the effect of marginalizing over the Hubble constant, instead of using its best-fitting value, on our results is only marginal. However, correction of errors in our previous work reduces the level of non-Gaussianity in the 2004 gold data that was found in our earlier work. The revised results for the 2007 gold data show that the data are consistent with isotropy and Gaussianity. Our second statistic confirms these results.