53 results for methods: data analysis


Abstract:

We present a Bayesian-odds-ratio-based algorithm for detecting stellar flares in light-curve data. We assume flares are described by a model with a rapid, half-Gaussian rise followed by an exponential decay. Our signal model also contains a polynomial background component, required to fit underlying light-curve variations that could otherwise partially mimic a flare. We characterize the false alarm probability and efficiency of this method under the assumption that any unmodelled noise in the data is Gaussian, and compare it with a simpler thresholding method based on that used by Walkowicz et al. We find that our method yields a significant increase in detection efficiency for low signal-to-noise ratio (S/N) flares: at a conservative false alarm probability, it reaches 95 per cent detection efficiency for flares at S/N ≈ 20, compared with S/N ≈ 25 for the simpler method. We also test how well the assumption of Gaussian noise holds by applying the method to a selection of 'quiet' Kepler stars. As an example, we have applied our method to a selection of stars in Kepler Quarter 1 data. The method finds 687 flaring stars with a total of 1873 flares after vetoes have been applied, and for these flares we have made preliminary characterizations of their durations and S/N.
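
The flare shape assumed above is simple to write down. Below is a minimal sketch of that signal model (not the authors' code); the parameter names t0, tau_rise and tau_decay are our own.

    import numpy as np

    def flare_model(t, t0, amp, tau_rise, tau_decay, poly_coeffs):
        # Half-Gaussian rise up to the peak time t0, exponential decay after it,
        # on top of a low-order polynomial background in the light curve.
        t = np.asarray(t, dtype=float)
        flare = np.where(
            t <= t0,
            amp * np.exp(-0.5 * ((t - t0) / tau_rise) ** 2),  # rapid rise
            amp * np.exp(-(t - t0) / tau_decay),              # exponential decay
        )
        return flare + np.polyval(poly_coeffs, t - t0)        # background model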

Abstract:

Efficient identification and follow-up of astronomical transients is hindered by the need for humans to manually select promising candidates from data streams that contain many false positives. These artefacts arise in the difference images produced by most major ground-based time-domain surveys with large-format CCD cameras. This dependence on humans to reject bogus detections is unsustainable for next-generation all-sky surveys, and significant effort is now being invested to solve the problem computationally. In this paper, we explore a simple machine learning approach to real-bogus classification by constructing a training set from the image data of ~32 000 real astrophysical transients and bogus detections from the Pan-STARRS1 Medium Deep Survey. We derive our feature representation from the pixel intensity values of a 20 × 20 pixel stamp around the centre of each candidate. This differs from previous work in that it operates directly on the pixels rather than relying on catalogued domain knowledge for feature design or selection. Three machine learning algorithms are trained (artificial neural networks, support vector machines and random forests) and their performances are tested on a held-out subset of 25 per cent of the training data. We find the best results from the random forest classifier and demonstrate that, accepting a false positive rate of 1 per cent, the classifier initially suggests a missed detection rate of around 10 per cent. However, we also find that a combination of bright star variability, nuclear transients and uncertainty in human labelling means that our best estimate of the missed detection rate is approximately 6 per cent.
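
The pixel-based feature representation is straightforward to reproduce in outline. The sketch below (with randomly generated stand-in data, not the PS1 images) flattens each 20 × 20 stamp into a 400-element vector and trains a random forest on a 75/25 split:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Stand-in data: (N, 20, 20) difference-image stamps, labels 1=real, 0=bogus.
    rng = np.random.default_rng(0)
    stamps = rng.normal(size=(1000, 20, 20))
    labels = rng.integers(0, 2, size=1000)

    X = stamps.reshape(len(stamps), -1)   # each stamp -> 400 pixel features
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25,
                                              random_state=0)

    clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]  # threshold trades false positives
                                            # against missed detections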

Abstract:

When a planet transits its host star, it blocks regions of the stellar surface from view; this causes a distortion of the spectral lines and a change in the line-of-sight (LOS) velocities, known as the Rossiter-McLaughlin (RM) effect. Since the LOS velocities depend, in part, on the stellar rotation, the RM waveform is sensitive to the star-planet alignment (which provides information on the system's dynamical history). We present a new RM modelling technique that directly measures the spatially resolved stellar spectrum behind the planet. This is done by scaling the continuum flux of the (HARPS) spectra by the transit light curve, and then subtracting the in-transit from the out-of-transit spectra to isolate the starlight behind the planet. This technique does not assume any shape for the intrinsic local profiles, and it allows for differential stellar rotation and centre-to-limb variations in the convective blueshift. We apply this technique to HD 189733 and compare with 3D magnetohydrodynamic (MHD) simulations. We reject rigid-body rotation with high confidence (>99% probability), which allows us to determine the occulted stellar latitudes and measure the stellar inclination. In turn, we determine both the sky-projected obliquity (λ ≈ −0.4° ± 0.2°) and the true 3D obliquity (ψ ≈ 7° (+12°/−4°)). We also find good agreement with the MHD simulations, with no significant centre-to-limb variations detectable in the local profiles. Hence, this technique provides a powerful new tool that can probe stellar photospheres and differential rotation, determine 3D obliquities, and remove sky-projection biases in planet migration theories. It can be implemented with existing instrumentation, but will become even more powerful with the next generation of high-precision radial velocity spectrographs.
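
The isolation step is essentially one subtraction per in-transit exposure. A schematic sketch (our own variable names, not the authors' pipeline):

    import numpy as np

    def local_profile(master_out, in_transit, lc_flux):
        # master_out: mean out-of-transit spectrum (continuum-normalised).
        # in_transit: spectrum taken while the planet occults part of the star.
        # lc_flux:    transit light-curve flux at that epoch (e.g. 0.98).
        # Scaling the in-transit continuum by the light curve puts both spectra
        # on a common flux scale; the residual is the spectrum of the small
        # patch of stellar disc hidden behind the planet.
        return np.asarray(master_out) - np.asarray(in_transit) * lc_flux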

Abstract:

Photometry of moving sources typically suffers from a reduced signal-to-noise ratio (S/N), or from flux measurements biased to incorrectly low values, through the use of circular apertures. To address this issue we present the software package TRIPPy: TRailed Image Photometry in Python. TRIPPy introduces the pill aperture, the natural extension of the circular aperture for linearly trailed sources. The pill shape is a rectangle with two semicircular end-caps, described by three parameters: the trail length, the trail angle, and the radius. The TRIPPy package also includes a new technique to generate accurate model point-spread functions (PSFs) and trailed PSFs (TSFs) from stationary background sources in sidereally tracked images. The TSF is simply the model PSF, which consists of a Moffat profile and a super-sampled lookup table, convolved along the line defined by the source's trail length and angle. From the TSF, accurate pill aperture corrections can be estimated as a function of pill radius, with an accuracy of 10 mmag for highly trailed sources. Analogous to the use of small circular apertures and associated aperture corrections, small-radius pill apertures can be used to preserve the S/N of low-flux sources, with the appropriate aperture correction applied to provide an accurate, unbiased flux measurement at all S/Ns.
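
The pill geometry itself reduces to a point-to-segment distance test: a pixel is inside the aperture if it lies within one radius of the trail segment. A toy illustration (not the TRIPPy API):

    import numpy as np

    def pill_mask(shape, x0, y0, length, angle, radius):
        # Rectangle of the given trail length/angle, capped by semicircles:
        # a pixel is inside iff its distance to the trail segment centred
        # on (x0, y0) is <= radius.
        yy, xx = np.indices(shape, dtype=float)
        dx, dy = np.cos(angle), np.sin(angle)     # unit vector along the trail
        # Project each pixel onto the trail, clamped to the segment ends.
        t = np.clip((xx - x0) * dx + (yy - y0) * dy, -length / 2, length / 2)
        dist = np.hypot(xx - (x0 + t * dx), yy - (y0 + t * dy))
        return dist <= radius

    # flux = image[pill_mask(image.shape, 50.0, 50.0, 12.0,
    #                        np.deg2rad(30.0), 4.0)].sum()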

Abstract:

Retrospective clinical datasets are often characterized by a relatively small sample size and many missing values. In this case, a common way of handling the missingness is to discard patients with missing covariates from the analysis, further reducing the sample size. Alternatively, if the mechanism that generated the missingness allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and permitting complete-data methods to be applied afterwards. Moreover, methodologies for data imputation may depend on the particular purpose and may achieve better results by considering specific characteristics of the domain. We study the problem of missing-data treatment in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem with surrogate splits, that is, splitting rules that use other variables to yield results similar to the original ones. Instead, our methodology consists in modeling the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, allowing the survival tree to be applied to the completed dataset. The Bayesian network is learned directly from the incomplete data using a structural expectation-maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is the EM formulation itself. On both simulated and real data, our proposed methodology usually outperformed several existing methods for data imputation, and the resulting imputation improved the stratification estimated by the survival tree (especially with respect to using surrogate splits).
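
The authors' structural-EM Bayesian network is beyond a short snippet, but the impute-then-analyse pattern it enables can be illustrated with a much simpler model-based imputer (scikit-learn's IterativeImputer, used here purely as a stand-in for the Bayesian network):

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    # Hypothetical patients-by-covariates matrix with missing entries.
    X = np.array([[63.0, 1.0, np.nan],
                  [71.0, np.nan, 2.3],
                  [np.nan, 0.0, 1.8],
                  [58.0, 1.0, 2.9]])

    # Round-robin regression imputation: each incomplete covariate is
    # modelled from the others and refined iteratively.
    X_complete = IterativeImputer(max_iter=25, random_state=0).fit_transform(X)
    # A survival tree would then be fitted on the completed dataset.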

Abstract:

The predominant fear in capital markets is that of a price spike. Commodity markets differ in that there is a fear of both upward and downward jumps, which results in implied volatility curves displaying distinct shapes compared with equity markets. A novel functional data analysis (FDA) approach provides a framework to produce and interpret functional objects that characterise the underlying dynamics of oil future options. We use the FDA framework to examine implied volatility, jump risk, and pricing dynamics within crude oil markets. Examining a WTI crude oil sample for the 2007–2013 period, which includes the global financial crisis and the Arab Spring, we find strong evidence of converse jump dynamics during periods of demand- and supply-side weakness. This is used as the basis for an FDA-derived Merton (1976) jump-diffusion optimised delta-hedging strategy, which exhibits superior portfolio management results over traditional methods.
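
The hedging strategy builds on standard Merton (1976) jump-diffusion pricing, which is easy to sketch: the option price is a Poisson-weighted sum of Black-Scholes prices, one for each possible number of jumps over the option's life. The parameter names below are our own, and the paper's FDA calibration of the jump parameters is not shown:

    import math
    from scipy.stats import norm

    def bs_call(S, K, T, r, sigma):
        # Black-Scholes European call price.
        d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
        d2 = d1 - sigma * math.sqrt(T)
        return S * norm.cdf(d1) - K * math.exp(-r * T) * norm.cdf(d2)

    def merton_call(S, K, T, r, sigma, lam, mu_j, delta_j, n_terms=40):
        # lam: jump intensity; mu_j, delta_j: mean/std of log jump sizes.
        k = math.exp(mu_j + 0.5 * delta_j**2) - 1.0   # mean relative jump size
        lam_p = lam * (1.0 + k)
        price = 0.0
        for n in range(n_terms):   # condition on n jumps occurring
            sigma_n = math.sqrt(sigma**2 + n * delta_j**2 / T)
            r_n = r - lam * k + n * (mu_j + 0.5 * delta_j**2) / T
            w = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
            price += w * bs_call(S, K, T, r_n, sigma_n)
        return price

    # merton_call(S=100, K=100, T=0.5, r=0.02, sigma=0.3,
    #             lam=0.8, mu_j=-0.05, delta_j=0.15)

A delta hedge then holds the (numerical) derivative of this price with respect to the underlying price S.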

Abstract:

Perfect information is seldom available to man or machine owing to the uncertainties inherent in real-world problems. Uncertainties in geographic information systems (GIS) stem from vague/ambiguous or imprecise/inaccurate/incomplete information, and GIS therefore need tools and techniques to manage these uncertainties. There is widespread agreement in the GIS community that although GIS has the potential to support a wide range of spatial data analysis problems, this potential is often hindered by a lack of consistency and uniformity. Uncertainties come in many shapes and forms, and processing uncertain spatial data requires a practical taxonomy to aid decision makers in choosing the most suitable data modeling and analysis method. In this paper, we: (1) review important developments in handling uncertainties when working with spatial data and GIS applications; (2) propose a taxonomy of models for dealing with uncertainties in GIS; and (3) identify current challenges and future research directions in spatial data analysis and GIS for managing uncertainties.

Abstract:

PURPOSE. To examine internal consistency, refine the response scale, and obtain a linear scoring system for the visual function instrument, the Daily Living Tasks Dependent on Vision (DLTV). METHODS. Data were available from 186 participants with a clinical diagnosis of age-related macular degeneration (AMD) who completed the 22-item DLTV (DLTV-22) on a four-point ordinal response scale. An independent group of 386 participants with AMD was administered a reduced version of the DLTV with 11 items (DLTV-11) on a five-point response scale. Rasch analysis was performed on both datasets and used to generate item statistics for measure order, response odds ratios per item and per person, and infit and outfit mean-square statistics. The Rasch output from the DLTV-22 was examined to identify redundant items, and for factorial validity and person and item measure separation reliabilities. RESULTS. The average rating for the DLTV-22 changed monotonically with the magnitude of the latent person trait. The expected and observed average measures were extremely close, with step calibrations evenly separated for the four-point ordinal scale. For the DLTV-11, step calibrations were not as evenly separated, suggesting that the five-point scale should be reduced to either a four- or three-point scale. Five items in the DLTV-22 were removed, and all 17 remaining items had good infit and outfit mean squares. Principal components analysis of the residuals from the Rasch analysis identified two domains containing 7 and 10 items. The domains had high person separation reliabilities (0.86 and 0.77 for domains 1 and 2, respectively) and item measure reliabilities (0.99 and 0.98 for domains 1 and 2, respectively). CONCLUSIONS. With its improved internal consistency, the established accuracy and precision of its rating scale, and a validated domain structure, the DLTV constitutes a useful instrument for assessing visual function in older adults with AMD.
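
The step calibrations discussed above come from a polytomous Rasch model. As a minimal sketch (our own notation, not the study's software), the Andrich rating-scale model gives the probability of each response category from a person ability, an item difficulty, and a set of step thresholds:

    import math

    def rasch_category_probs(theta, difficulty, thresholds):
        # Andrich rating-scale model: P(category k) is proportional to
        # exp( sum_{j<=k} (theta - difficulty - tau_j) ), with P(0) ~ exp(0).
        logits, total = [0.0], 0.0
        for tau in thresholds:
            total += theta - difficulty - tau
            logits.append(total)
        exps = [math.exp(v) for v in logits]
        z = sum(exps)
        return [e / z for e in exps]

    # A four-point scale has three step thresholds ("step calibrations"):
    # rasch_category_probs(theta=0.5, difficulty=0.0, thresholds=[-1.2, 0.0, 1.2])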

Abstract:

Aim: Determination of the main directions of variance in an extensive database of annual pollen deposition, and of the relationship between pollen data from modified Tauber traps and palaeoecological data.

Location: Northern Finland and Norway.

Methods: Pollen analysis of annual samples from pollen traps and of contiguous high-resolution samples from a peat sequence, followed by numerical analysis (principal components analysis) of the resulting data.

Results: The main direction of variation in the trap data is due to the vegetation region in which each trap is located. A secondary direction of variation is due to the annual variability in pollen production of some of the tree taxa, especially Betula and Pinus. This annual variability is more conspicuous in 'absolute' data than in percentage data, which at this annual resolution become more random. There are systematic differences, with respect to peat-forming taxa, between pollen data from traps and pollen data from a peat profile collected over the same period of time.

Main conclusions: Annual variability in pollen production is rarely visible in fossil pollen samples because these cannot be sampled at precisely 12-month resolution. At near-annual sampling resolution it produces erratic percentage values that do not reflect changes in vegetation; profiles sampled at near-annual resolution are better analysed in terms of pollen accumulation rates, with the realization that even these record changes in pollen abundance rather than plant abundance. However, at the coarser temporal resolution common to most fossil samples, annual variability does not mask the origin of the pollen in terms of its vegetation region. Climate change may not be recognizable from pollen assemblages until the change has persisted in the same direction long enough to alter the flowering (pollen production) pattern of the dominant trees.
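
The numerical step is a standard PCA on a traps-by-taxa matrix; a minimal sketch with stand-in data (the counts are invented for illustration):

    import numpy as np
    from sklearn.decomposition import PCA

    # Stand-in matrix: one row per trap-year, one column per pollen taxon
    # (Betula, Pinus, ...), holding annual pollen-accumulation values.
    rng = np.random.default_rng(1)
    counts = rng.poisson(lam=50, size=(40, 12)).astype(float)

    # Log-transform to tame large year-to-year swings before extracting
    # the leading directions of variation.
    pca = PCA(n_components=2)
    scores = pca.fit_transform(np.log1p(counts))
    print(pca.explained_variance_ratio_)  # axis 1 ~ vegetation region (in the paper)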

Abstract:

Objective: To assess the role of plasma total homocysteine (tHcy) concentrations and homozygosity for the thermolabile variant of the methylenetetrahydrofolate reductase (MTHFR) C677T gene as risk factors for retinal vascular occlusive disease.

Design: Retinal vein occlusion (RVO) is an important cause of vision loss. Early meta-analyses showed that tHcy was associated with an increased risk of RVO, but a significant number of new studies have since been published.

Participants and/or Controls: RVO patients and controls.

Methods: Data sources included searches of MEDLINE, Web of Science, and PubMed, together with the reference lists of relevant articles and reviews. Reviewers searched the databases, selected the studies, and then extracted data. Results were pooled quantitatively using meta-analytic methods.

Main Outcome Measures: tHcy concentrations and MTHFR genotype.

Results: There were 25 case-control studies for tHcy (1533 cases and 1708 controls) and 18 case-control studies for MTHFR (1082 cases and 4706 controls). The mean tHcy was on average 2.8 µmol/L (95% confidence interval [CI], 1.8–3.7) greater in the RVO cases compared with controls, but there was evidence of between-study heterogeneity (P < 0.001, I² = 93%). There was funnel plot asymmetry suggesting publication bias. There was no evidence of association between homozygosity for the MTHFR C677T genotype and RVO (odds ratio [OR], 1.20; 95% CI, 0.84–1.71), but again marked heterogeneity (P = 0.004, I² = 53%) was observed.
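
The pooling referred to above is conventionally done with a random-effects (DerSimonian-Laird) estimator; a generic sketch with our own function names, not the authors' code:

    import numpy as np

    def dersimonian_laird(effects, variances):
        # Pool per-study effects (e.g. mean tHcy differences) allowing for
        # between-study heterogeneity; also report the I^2 statistic.
        e = np.asarray(effects, dtype=float)
        v = np.asarray(variances, dtype=float)
        w = 1.0 / v                                    # fixed-effect weights
        fe = np.sum(w * e) / np.sum(w)
        q = np.sum(w * (e - fe) ** 2)                  # Cochran's Q
        df = len(e) - 1
        tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
        w_re = 1.0 / (v + tau2)                        # random-effects weights
        pooled = np.sum(w_re * e) / np.sum(w_re)
        se = np.sqrt(1.0 / np.sum(w_re))
        i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
        return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2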

Conclusions: There was some evidence that elevated tHcy was associated with RVO, but none for homozygosity for the MTHFR C677T genotype. Both analyses should be interpreted cautiously because of the marked heterogeneity between the study estimates and the possible effect of publication bias on the tHcy findings.

Financial Disclosure(s): The author(s) have no proprietary or commercial interest in any materials discussed in this article.

Abstract:

BACKGROUND: Obesity has emerged as a risk factor for the development of asthma, and it may also influence asthma control and airways inflammation. However, the role of obesity in severe asthma remains unclear. OBJECTIVE: To explore the association between obesity (defined by BMI) and severe asthma. METHODS: Data from the National Registry for dedicated UK Difficult Asthma Services were used to compare patient demographics, disease characteristics and healthcare utilisation across three body mass index (BMI) categories (normal weight: 18.5–24.99; overweight: 25–29.99; obese: ≥30) in a well-characterised group of severe asthmatic adults. RESULTS: The study population consisted of 666 severe asthmatics with a median BMI of 29.8 (interquartile range 22.5–34.0). The obese group exhibited greater asthma medication requirements in terms of maintenance corticosteroid therapy (48.9% versus 40.4% and 34.5% in the overweight and normal-weight groups, respectively), steroid burst therapy and short-acting β2-agonist (SABA) use per day. Significant differences were also seen in gastro-oesophageal reflux disease (GORD) (53.9% versus 48.1% and 39.7% in the overweight and normal-weight groups, respectively) and proton pump inhibitor (PPI) use. Bone density scores were higher in the obese group, whilst pulmonary function testing revealed a reduced FVC and a raised KCO. Serum IgE levels decreased with increasing BMI, and the obese group were more likely to report eczema but less likely to have a history of nasal polyps. CONCLUSIONS: Severe asthmatics display particular characteristics according to BMI that support the view that obesity-associated severe asthma may represent a distinct clinical phenotype.

Abstract:

Increases in food production and the ever-present threat of food contamination from microbiological and chemical sources have led the food industry and regulators to pursue rapid, inexpensive methods of analysis to safeguard the health and safety of the consumer. Although sophisticated techniques such as chromatography and spectrometry provide more accurate and conclusive results, screening tests allow a much higher throughput of samples at lower cost and with less operator training, so larger numbers of samples can be analysed. Biosensors combine a biological recognition element (enzyme, antibody, receptor) with a transducer to produce a measurable signal proportional to the extent of interaction between the recognition element and the analyte. The uses of the biosensing instrumentation available today are extremely varied, with food analysis an emerging and growing application. For food analysis, the advantages biosensors offer over other screening methods such as radioimmunoassay, enzyme-linked immunosorbent assay, fluorescence immunoassay and luminescence immunoassay include automation, improved reproducibility, speed of analysis and real-time analysis. This article provides a brief historical footing before reviewing the latest developments in biosensor applications for the analysis of food contaminants (January 2007 to December 2010), focusing on the detection of pathogens, toxins, pesticides and veterinary drug residues, with emphasis on articles showing data in food matrices. The main areas of development common to these groups of contaminants are multiplexing (the ability to analyse a sample for more than one contaminant simultaneously) and portability. Biosensors currently play an important role in food safety; further advances in the technology, reagents and sample handling will surely reinforce this position.

Abstract:

Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understanding the pathogenesis of psoriasis. Analysing more than one dataset to identify commonly upregulated genes reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objective was to identify genes consistently upregulated in lesional skin across three published microarray datasets. We carried out a reanalysis of gene expression data extracted from three experiments on samples from psoriatic and non-lesional skin using the same stringency threshold and software, and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets, representing 57 genes, commonly upregulated in lesional skin in all datasets. These included 26 probe sets representing 20 genes with no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets, though they will need more rigorous experimental testing to be validated. Our analysis also identified 12 of the 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways as differentially expressed in the lesional skin samples.
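
Once each dataset has been thresholded with the same criteria, the core reanalysis step reduces to a set intersection. A toy sketch (the gene symbols are placeholders, not the study's results):

    # Upregulated-gene calls from three datasets, same thresholds throughout.
    dataset_hits = [
        {"IL17A", "DEFB4A", "S100A7", "KRT16"},   # dataset 1
        {"IL17A", "DEFB4A", "S100A7", "CCL20"},   # dataset 2
        {"IL17A", "DEFB4A", "S100A7", "KRT6A"},   # dataset 3
    ]
    common_upregulated = set.intersection(*dataset_hits)
    print(sorted(common_upregulated))  # genes upregulated in all three datasets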

Abstract:

Background: Previous research demonstrates various associations between depression, cardiovascular disease (CVD) incidence and mortality, possibly as a result of the different methodologies used to measure depression and analyse relationships. This analysis investigated the association between depression and CVD incidence (CVDI), and mortality from CVD (MCVD), from smoking-related conditions (MSRC), and from all causes (MALL), in a sample dataset where depression was measured both with items from a validated questionnaire and with items derived from the factor analysis of a larger questionnaire, and where analyses were conducted on both continuous and grouped data.

Methods: Data from the PRIME Study (N=9798 men) on depression and 10-year CVD incidence and mortality were analysed using Cox proportional hazards models.
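
For readers unfamiliar with the modelling step, a minimal Cox proportional hazards fit looks like the following (the lifelines package and the toy data frame are our choices, not the PRIME Study's analysis code):

    import pandas as pd
    from lifelines import CoxPHFitter

    # Hypothetical frame: follow-up time (years), event indicator
    # (1 = death/CVD event) and a continuous depression score.
    df = pd.DataFrame({
        "time": [9.1, 10.0, 4.3, 10.0, 7.6],
        "event": [1, 0, 1, 0, 1],
        "depression": [2.1, 0.4, 3.3, 0.9, 2.8],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="time", event_col="event")
    cph.print_summary()  # hazard ratio per unit of the depression measure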

Results: Using continuous data, both measures of depression showed positive associations with mortality (MCVD, MSRC, MALL). Using grouped data, however, the associations between the validated measure of depression and MCVD, and between the factor-analysis-derived measure of depression and all measures of mortality, were lost.

Limitations: Low levels of depression, low numbers of individuals with high depression, and low numbers of outcome events may limit these analyses, but such levels are typical of the population studied.

Conclusions: These data demonstrate a possible association between depression and mortality, but detecting this association depends on the measurement used and the method of analysis. Findings that differ with methodology present clear problems for the elucidation and determination of relationships. The differences here argue for the use of validated scales where possible, and caution against over-reduction via factor analysis and grouping.