999 results for data snooping


Relevance:

100.00%

Publisher:

Abstract:

It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. In this paper, we suggest a stepwise multiple testing procedure which asymptotically controls the familywise error rate at a desired level. Compared to related single-step methods, our procedure is more powerful in the sense that it often will reject more false hypotheses. In addition, we advocate the use of studentization when it is feasible. Unlike some stepwise methods, our method implicitly captures the joint dependence structure of the test statistics, which results in increased ability to detect alternative hypotheses. We prove our method asymptotically controls the familywise error rate under minimal assumptions. We present our methodology in the context of comparing several strategies to a common benchmark and deciding which strategies actually beat the benchmark. However, our ideas can easily be extended and/or modified to other contexts, such as making inference for the individual regression coefficients in a multiple regression framework. Some simulation studies show the improvements of our methods over previous proposals. We also provide an application to a set of real data.
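
As a rough illustration of the step-down idea described in this abstract, the following Python sketch compares several strategies to a benchmark. It is not the authors' implementation: the function name `stepdown_max_test`, the use of unstudentized sample means, and the plain i.i.d. bootstrap are all assumptions made to keep the example short (time-series data would normally call for a block-type bootstrap). Resampling the same periods for every strategy is what lets the critical value reflect the dependence between test statistics.

```python
import random
from statistics import mean


def stepdown_max_test(diffs, alpha=0.05, n_boot=500, seed=0):
    """Step-down max-statistic test (illustrative sketch, hypothetical name).

    diffs[s] holds the per-period performance of strategy s minus the
    benchmark. Returns the indices of strategies judged to beat it.
    """
    rng = random.Random(seed)
    n_obs = len(diffs[0])
    stats = [mean(d) for d in diffs]

    # Centered bootstrap statistics. The same resampled periods are used for
    # every strategy so the joint dependence structure is preserved.
    boot = []
    for _ in range(n_boot):
        idx = [rng.randrange(n_obs) for _ in range(n_obs)]
        boot.append([mean(d[i] for i in idx) - stats[s]
                     for s, d in enumerate(diffs)])

    rejected = set()
    while len(rejected) < len(diffs):
        active = [s for s in range(len(diffs)) if s not in rejected]
        # Critical value: (1 - alpha) quantile of the bootstrap maximum,
        # taken only over hypotheses that have not been rejected yet.
        maxima = sorted(max(row[s] for s in active) for row in boot)
        crit = maxima[min(n_boot - 1, int((1 - alpha) * n_boot))]
        newly = {s for s in active if stats[s] > crit}
        if not newly:
            break  # no further rejections: stop stepping down
        rejected |= newly
    return sorted(rejected)
```

The first pass corresponds to a single-step procedure; later passes recompute the critical value over a smaller set, which can only lower it, and that is where the extra rejections come from.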

Relevance:

70.00%

Publisher:

Abstract:

This thesis addresses the problem of information hiding in low-dimensional digital data, focussing on issues of privacy and security in Electronic Patient Health Records (EPHRs). The thesis proposes a new security protocol based on data hiding techniques for EPHRs. It contends that embedding sensitive patient information inside the EPHR is the most appropriate solution currently available to resolve the issues of security in EPHRs. Watermarking techniques are applied to one-dimensional time series data such as the electroencephalogram (EEG) to show that they add a level of confidence (in terms of privacy and security) in an individual’s diverse bio-profile (the digital fingerprint of an individual’s medical history), ensure belief that the data being analysed does indeed belong to the correct person, and ensure that it is not being accessed by unauthorised personnel. Embedding information inside single-channel biomedical time series data is more difficult than the standard application for images because of the reduced redundancy. A data hiding approach with a built-in capability to protect against illegal data snooping is developed. The capability of this secure method is enhanced by embedding not just a single message but multiple messages into an example one-dimensional EEG signal. Embedding multiple messages of similar characteristics, for example the identities of clinicians accessing the medical record, helps in creating a log of access, while embedding multiple messages of dissimilar characteristics into an EPHR enhances confidence in the use of the EPHR. The novel method of embedding multiple messages of both similar and dissimilar characteristics into a single-channel EEG demonstrated in this thesis shows how this embedding of data supports the secure implementation and use of the EPHR.

Relevance:

60.00%

Publisher:

Abstract:

This paper proposes new methodologies for evaluating out-of-sample forecasting performance that are robust to the choice of the estimation window size. The methodologies involve evaluating the predictive ability of forecasting models over a wide range of window sizes. We show that the tests proposed in the literature may lack the power to detect predictive ability and might be subject to data snooping across different window sizes if used repeatedly. An empirical application shows the usefulness of the methodologies for evaluating exchange rate models' forecasting ability.

Relevance:

60.00%

Publisher:

Abstract:

This work tests the validity of the efficient market hypothesis in the Ibovespa index futures market using so-called technical analysis strategies. Predictive ability tests are used to assess whether these decision rules are superior as a form of investment. These tests have the advantage of accounting for the possibility of data snooping in the choice of the best strategy, making it possible to identify whether the apparent predictive capacity of these models is genuinely significant or merely the product of chance. The results indicate that technical analysis strategies are not able to generate statistically significant returns once the effects of data snooping are taken into account. These results are in line with the prediction of the weak form of the efficient market hypothesis.

Relevance:

60.00%

Publisher:

Abstract:

This work studies the profitability of technical analysis models in the Brazilian foreign exchange market. Using the methodology of White (2000) to test 1712 rules generated from four technical analysis models, we find that the best rule has no significant predictive power once the effects of data snooping are considered. The results indicate that the Brazilian foreign exchange market is consistent with the efficient market hypothesis suggested in the literature.
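
The kind of test referred to here (White, 2000) asks whether the best of many trading rules still looks good once the search over rules is taken into account. The Python sketch below shows the general shape of such a bootstrap p-value under stated assumptions. The function names, the mean block length of 10, and the number of bootstrap draws are illustrative choices, not values from the study. Blocks of random length are resampled so that some serial dependence in the returns is retained.

```python
import math
import random
from statistics import mean


def stationary_bootstrap_indices(n_obs, mean_block, rng):
    """Resampled time indices made of blocks with random (geometric) length."""
    idx = [rng.randrange(n_obs)]
    while len(idx) < n_obs:
        if rng.random() < 1.0 / mean_block:
            idx.append(rng.randrange(n_obs))      # start a new block
        else:
            idx.append((idx[-1] + 1) % n_obs)     # continue the block, wrapping
    return idx


def reality_check_pvalue(excess, n_boot=500, mean_block=10, seed=0):
    """Bootstrap p-value for the best rule (illustrative sketch).

    excess[k] holds the per-period return of rule k minus the benchmark.
    The null hypothesis is that no rule beats the benchmark.
    """
    rng = random.Random(seed)
    n_obs = len(excess[0])
    means = [mean(f) for f in excess]
    observed = math.sqrt(n_obs) * max(means)

    exceed = 0
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n_obs, mean_block, rng)
        # Maximum over ALL rules of the centered bootstrap statistic: this is
        # what corrects for having searched across many rules.
        boot_max = max(math.sqrt(n_obs) * (mean(f[i] for i in idx) - m)
                       for f, m in zip(excess, means))
        if boot_max > observed:
            exceed += 1
    return exceed / n_boot
```

A large p-value for the best rule, as reported in the abstract, means its apparent performance is within what the search over rules alone could produce by chance.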

Relevance:

60.00%

Publisher:

Abstract:

Image orientation is a basic problem in Digital Photogrammetry. While interior and relative orientation have been successfully automated, the same cannot be said of absolute orientation. This process can be automated by using an approach based on relational matching and a heuristic that uses the analytical relation between straight features in the object space and their homologous features in the image space. The method also includes a built-in self-diagnosis, based on applying the data snooping statistical test during spatial resection with Iterated Extended Kalman Filtering (IEKF). The aim of this paper is to present the basic principles of the proposed approach and results based on real data.
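
In the geodetic sense used in this abstract, data snooping means testing each observation's standardized residual for a gross error. The Python sketch below illustrates the idea on a much simpler problem than spatial resection with IEKF: a straight-line least-squares fit with uncorrelated observations of known standard deviation. The function name, the line-fit setting, and the significance level of 0.001 are assumptions for illustration only.

```python
import math
from statistics import NormalDist, mean


def data_snooping_line_fit(x, y, sigma, alpha=0.001):
    """Flag observations whose standardized residual is too large.

    Fits y = intercept + slope * x by least squares, assuming uncorrelated
    observations with known standard deviation sigma (illustrative sketch).
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Two-sided critical value of the standard normal distribution.
    critical = NormalDist().inv_cdf(1 - alpha / 2)

    flagged = []
    for i, (xi, yi) in enumerate(zip(x, y)):
        residual = yi - (intercept + slope * xi)
        # Diagonal of the hat matrix; the residual's variance is
        # sigma**2 * (1 - leverage).
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        w = residual / (sigma * math.sqrt(1 - leverage))
        if abs(w) > critical:
            flagged.append(i)
    return flagged
```

In practice the test is usually applied iteratively: only the observation with the largest statistic is removed, and the adjustment is repeated, because one gross error also distorts the residuals of its neighbours.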

Relevance:

60.00%

Publisher:

Abstract:

The identification of ground control on photographs or images is usually carried out by a human operator, who uses natural skills to make interpretations. In Digital Photogrammetry, which uses digital image processing techniques, the extraction of ground control can be automated by using an approach based on relational matching and a heuristic that uses the analytical relation between straight features in the object space and their homologous features in the image space. The method also includes a built-in self-diagnosis, based on applying the data snooping statistical test during spatial resection with Iterated Extended Kalman Filtering (IEKF). The aim of this paper is to present the basic principles of the proposed approach and results based on real data.

Relevance:

60.00%

Publisher:

Abstract:

We provide a comprehensive study of out-of-sample forecasts for the EUR/USD exchange rate based on multivariate macroeconomic models and forecast combinations. We use profit maximization measures based on directional accuracy and trading strategies in addition to standard loss minimization measures. When comparing predictive accuracy and profit measures, we use tests that are free of data snooping bias. The results indicate that forecast combinations, in particular those based on principal components of forecasts, help to improve over benchmark trading strategies, although the excess return per unit of deviation is limited.

Relevance:

20.00%

Publisher:

Abstract:

High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled into contigs/singlets, or for lists of proteins/genes, metabolites and drugs of interest, and add them to the project; (iii) Annotation module, which assigns annotations from several databases for the contigs/singlets or lists of proteins/genes, generating tables with automatic annotation that can be manually curated; and (iv) Interactome module, which maps the contigs/singlets or the uploaded lists to entries in our integrated database, building networks that gather novel identified interactions, protein and metabolite expression/concentration levels, subcellular localization and computed topological metrics, GO biological processes and KEGG pathways enrichment. This module generates an XGMML file that can be imported into Cytoscape or visualized directly on the web. We developed IIS by integrating diverse databases, in response to the need for appropriate tools for a systematic analysis of physical, genetic and chemical-genetic interactions.
IIS was validated with yeast two-hybrid, proteomics and metabolomics datasets, but it is also extendable to other datasets. IIS is freely available online at: http://www.lge.ibi.unicamp.br/lnbio/IIS/.

Relevance:

20.00%

Publisher:

Abstract:

This article investigates patterns of performance in grip strength and gait speed and their relationships with self-rated health, considering gender, age and family income. The study was conducted in a probabilistic sample of community-dwelling elderly people aged 65 and over who were participants in a population study on frailty. A total of 689 elderly people without cognitive deficit suggestive of dementia underwent tests of gait speed and grip strength. Comparisons between groups were based on low, medium and high speed and strength. Self-rated health was assessed using a 5-point scale. Males and the younger elderly individuals scored significantly higher on grip strength and gait speed than females and the oldest individuals did; the richest scored higher than the poorest on grip strength and gait speed; females and men aged over 80 had weaker grip strength and lower gait speed; slow gait speed and low income emerged as risk factors for a worse health evaluation. Lower muscular strength affects the self-rated assessment of health because it results in a reduction in functional capacity, especially in the presence of poverty and a lack of compensatory factors.

Relevance:

20.00%

Publisher:

Abstract:

Obstructive sleep apnea syndrome has a high prevalence among adults. Cephalometric analysis can be a valuable method for evaluating patients with this syndrome. This study aimed to correlate cephalometric data with the apnea-hypopnea index. We performed a retrospective, cross-sectional study that analyzed the cephalometric data of patients followed in the Sleep Disorders Outpatient Clinic of the Discipline of Otorhinolaryngology of a university hospital from June 2007 to May 2012. Ninety-six patients were included, 45 men and 51 women, with a mean age of 50.3 years. A total of 11 patients had snoring, 20 had mild apnea, 26 had moderate apnea, and 39 had severe apnea. The distance from the hyoid bone to the mandibular plane was the only variable that showed a statistically significant correlation with the apnea-hypopnea index. Cephalometric variables are useful tools for understanding obstructive sleep apnea syndrome.

Relevance:

20.00%

Publisher:

Abstract:

In acquired immunodeficiency syndrome (AIDS) studies it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits, depending on the quantification assays. A complication arises when these continuous repeated measures have a heavy-tailed behavior. For such data structures, we propose a robust structure for a censored linear model based on the multivariate Student's t-distribution. To compensate for the autocorrelation existing among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation maximization type algorithm is developed for computing the maximum likelihood estimates, obtaining as a by-product the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student's t-distribution. The methodology is illustrated through an application to a Human Immunodeficiency Virus-AIDS (HIV-AIDS) study and several simulation studies.
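
The abstract does not spell out the damped exponential correlation structure. A common way of writing it is corr(t_i, t_j) = phi1 ** (|t_i - t_j| ** phi2), with 0 <= phi1 < 1 and phi2 >= 0, and the Python sketch below assumes that form. The function name and parameter names are illustrative, not taken from the paper. Because the structure depends only on the time gaps, it handles irregularly spaced measurements directly.

```python
def dec_correlation_matrix(times, phi1, phi2):
    """Damped exponential correlation matrix for observation times.

    Assumes corr(t_i, t_j) = phi1 ** (|t_i - t_j| ** phi2); this
    parametrization is an assumption, not quoted from the abstract.
    """
    return [
        [1.0 if i == j else phi1 ** (abs(ti - tj) ** phi2)
         for j, tj in enumerate(times)]
        for i, ti in enumerate(times)
    ]
```

Under this form, phi2 = 1 gives a continuous-time autoregressive decay, while phi2 = 0 makes every pair equally correlated regardless of the gap between measurements.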

Relevance:

20.00%

Publisher:

Abstract:

This study aimed to assess the completeness and reliability of the Information System on Live Births (Sinasc) data. A cross-sectional analysis of the reliability and completeness of Sinasc's data was performed using a sample of Live Birth Certificates (LBCs) from 2009, related to births in Campinas, Southeast Brazil. For data analysis, hospitals were grouped according to category of service (Unified National Health System, private or both); 600 LBCs were randomly selected, and the data were collected in LBC copies from mothers' and newborns' hospital records and by telephone interviews. The completeness of LBCs was evaluated by calculating the percentage of blank fields, and the agreement between the original LBCs and the copies was evaluated by Kappa and intraclass correlation coefficients. The percentage of completeness of LBCs ranged from 99.8% to 100%. For most items, the agreement was excellent. However, the agreement was acceptable for marital status, maternal education and newborn infants' race/color, low for prenatal visits and presence of birth defects, and very low for the number of deceased children. The results showed that the municipal Sinasc data are reliable for most of the studied variables. Investments in training of the professionals are suggested in an attempt to improve the system's capacity to support planning and implementation of health activities for the benefit of the maternal and child population.
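
For categorical items such as marital status, agreement between an original certificate and its re-collected copy can be summarized with the Kappa coefficient, which corrects the raw agreement rate for agreement expected by chance. The Python sketch below is the textbook unweighted Cohen's kappa for two sets of labels. The function name and example labels are illustrative, and the abstract does not say which Kappa variant the study used.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    n = len(ratings_a)
    # Proportion of items on which the two sources agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each source's label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

For example, `cohen_kappa(["single", "single", "married", "married"], ["single", "single", "married", "single"])` has an observed agreement of 0.75 and a chance agreement of 0.5, which gives a kappa of 0.5.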

Relevance:

20.00%

Publisher:

Abstract:

Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.

Relevance:

20.00%

Publisher:

Abstract:

Patients with obstructive sleep apnea syndrome usually present with changes in upper airway morphology and/or body fat distribution, which may occur throughout life and increase the severity of obstructive sleep apnea syndrome with age. This study aimed to correlate cephalometric and anthropometric measures with obstructive sleep apnea syndrome severity in different age groups. Cephalometric and anthropometric measures of 102 patients with obstructive sleep apnea syndrome were analyzed retrospectively. Patients were divided into three age groups (≥20 and <40 years, ≥40 and <60 years, and ≥60 years). Pearson's correlation was performed for these measures with the apnea-hypopnea index in the full sample, and subsequently by age group. The cephalometric measures MP-H (distance between the mandibular plane and the hyoid bone) and PNS-P (distance between the posterior nasal spine and the tip of the soft palate) and the neck and waist circumferences showed a statistically significant correlation with the apnea-hypopnea index in both the full sample and the ≥40 and <60 years age group. These variables did not show any significant correlation in the other two age groups (<40 and ≥60 years). Cephalometric measurements MP-H and PNS-P and cervical and waist circumferences correlated with obstructive sleep apnea syndrome severity in patients in the ≥40 and <60 years age group.