750 results for zero-inflated data


Relevance:

30.00%

Publisher:

Abstract:

Big data is certainly the buzz term in executive networking circles at the moment. Heralded by management consultancies and research organisations alike as the next big thing in business efficiency, it is shooting up the Gartner hype cycle to the giddy heights of the peak of inflated expectations before it tumbles down into the trough of disillusionment.

Relevance:

30.00%

Publisher:

Abstract:

Background: Small RNA sequencing is commonly used to identify novel miRNAs and to determine their expression levels in plants. There are several miRNA identification tools for animals such as miRDeep, miRDeep2 and miRDeep*. miRDeep-P was developed to identify plant miRNA using miRDeep’s probabilistic model of miRNA biogenesis, but it depends on several third party tools and lacks a user-friendly interface. The objective of our miRPlant program is to predict novel plant miRNA, while providing a user-friendly interface with improved accuracy of prediction. Results: We have developed a user-friendly plant miRNA prediction tool called miRPlant. We show using 16 plant miRNA datasets from four different plant species that miRPlant has at least a 10% improvement in accuracy compared to miRDeep-P, which is the most popular plant miRNA prediction tool. Furthermore, miRPlant uses a Graphical User Interface for data input and output, and identified miRNA are shown with all RNAseq reads in a hairpin diagram. Conclusions: We have developed miRPlant, which extends miRDeep* to various plant species by adopting suitable strategies to identify hairpin excision regions and hairpin structure filtering for plants. miRPlant does not require any third party tools such as mapping or RNA secondary structure prediction tools. miRPlant is also the first plant miRNA prediction tool that dynamically plots miRNA hairpin structure with small reads for identified novel miRNAs. This feature will enable biologists to visualize novel pre-miRNA structure and the location of small RNA reads relative to the hairpin. Moreover, miRPlant can be easily used by biologists with limited bioinformatics skills.

Relevance:

30.00%

Publisher:

Abstract:

Background: Historically, the paper hand-held record (PHR) has been used for sharing information between hospital clinicians, general practitioners and pregnant women in a maternity shared-care environment. Recently, in alignment with a national e-health agenda, an electronic health record (EHR) was introduced at an Australian tertiary maternity service to replace the PHR for collection and transfer of data. The aim of this study was to examine and compare the completeness of clinical data collected in a PHR and an EHR. Methods: We undertook a comparative cohort design study to determine differences in completeness between data collected from maternity records in two phases. Phase 1 data were collected from the PHR and Phase 2 data from the EHR. Records were compared for completeness of best practice variables collected. The primary outcome was the presence of best practice variables and the secondary outcomes were the differences in individual variables between the records. Results: Ninety-four percent of paper medical charts were available in Phase 1 and 100% of records from an obstetric database in Phase 2. No PHR or EHR had a complete dataset of best practice variables. The variables with significant improvement in completeness of data documented in the EHR, compared with the PHR, were urine culture, glucose tolerance test, nuchal screening, morphology scans, folic acid advice, tobacco smoking, illicit drug assessment and domestic violence assessment (p = 0.001). Additionally, the documentation of immunisations (pertussis, hepatitis B, varicella, fluvax) was markedly improved in the EHR (p = 0.001). The variables of blood pressure, proteinuria, blood group, antibody, rubella and syphilis status showed no significant differences in completeness of recording. Conclusion: This is the first paper to report on the comparison of clinical data collected on a PHR and EHR in a maternity shared-care setting.
The use of an EHR demonstrated significant improvements to the collection of best practice variables. Additionally, the data in an EHR were more available to relevant clinical staff with the appropriate log-in and more easily retrieved than from the PHR. This study contributes to an under-researched area of determining data quality collected in patient records.

Relevance:

30.00%

Publisher:

Abstract:

Problem: The Manchester Driver Behaviour Questionnaire (DBQ) is the most commonly used self-report tool in traffic safety research and applied settings. It has been claimed that the violation factor of this instrument predicts accident involvement, which was supported by a previous meta-analysis. However, that analysis did not test for methodological effects, or include contacting researchers to obtain unpublished results. Method: The present study re-analysed studies on prediction of accident involvement from DBQ factors, including lapses, and many unpublished effects. Tests of various types of dissemination bias and common method variance were undertaken. Results: Outlier analysis showed that some effects were probably not reliable data, but excluding them did not change the results. For correlations between violations and crashes, tendencies for published effects to be larger than unpublished ones and for effects to decrease over time were observed, but were not significant. Also, analysis using the proxy of the mean of accidents in studies indicated that studies where effects for violations are unknown have smaller effect sizes. These differences indicate dissemination bias. Studies using self-reported accidents as dependent variables had much larger effects than those using recorded accident data. Also, zero-order correlations were larger than partial correlations that controlled for exposure. Similarly, violations/accidents effects were strong only when there was also a strong correlation between accidents and exposure. Overall, the true effect is probably very close to zero (r<.07) for violations versus traffic accident involvement, depending upon which systematic tendencies in the data are controlled for. Conclusions: Methodological factors and dissemination bias have inflated the mean effect size of the DBQ in the published literature. Strong evidence of various artefactual effects is apparent.
Practical Applications: A greater level of care should be taken if the DBQ continues to be used in traffic safety research. Also, validation of self-reports should be more comprehensive in the future, taking into account the possibility of common method variance.
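
The contrast drawn above between zero-order correlations and partial correlations that control for exposure can be illustrated with the standard first-order partial correlation formula. The following is a minimal sketch only: the function name and all numeric values are hypothetical and are not taken from the study.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    denom = math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))
    if denom == 0.0:
        raise ValueError("undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical values: x = violations score, y = accidents, z = exposure (e.g. mileage).
r_zero_order = 0.13
r_partial = partial_correlation(r_zero_order, r_xz=0.20, r_yz=0.30)
print(f"zero-order r = {r_zero_order:.3f}, partial r = {r_partial:.3f}")
```

When exposure correlates positively with both variables, as assumed here, the partial correlation comes out smaller than the zero-order one, which is the pattern the abstract describes.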

Relevance:

30.00%

Publisher:

Abstract:

Background: Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. Methods: We present a cross-validation approach to select between three imputation methods for health survey data with correlated lifestyle covariates, using type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs) as a case study. We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. Results: Choice of imputation method depends upon the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Conclusions: Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease with more confidence in the results to inform public policy decision-making.
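
The general idea of choosing an imputation method by cross-validation can be sketched as follows: hide values that were actually observed, impute them with each candidate method, and compare the prediction error. This is an illustrative sketch under stated assumptions, not the authors' procedure. The data are synthetic, and the simple regression imputer is a stand-in for a model-based method; it is not the multivariate normal or conditional autoregressive prior used in the paper.

```python
import random
import statistics


def impute_mean(train_x, train_y, x_new):
    """Mean imputation: ignore covariates and predict the observed mean."""
    return statistics.fmean(train_y)


def impute_regression(train_x, train_y, x_new):
    """Least-squares imputation on one correlated covariate (illustrative stand-in)."""
    mx, my = statistics.fmean(train_x), statistics.fmean(train_y)
    sxx = sum((x - mx) ** 2 for x in train_x)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x_new - mx)


def cv_rmse(method, xs, ys, k=5, seed=0):
    """k-fold cross-validation: hide each fold's y values, impute them, return RMSE."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = []
    for fold in folds:
        held = set(fold)
        train_x = [xs[i] for i in idx if i not in held]
        train_y = [ys[i] for i in idx if i not in held]
        for i in fold:
            sq_err.append((method(train_x, train_y, xs[i]) - ys[i]) ** 2)
    return (sum(sq_err) / len(sq_err)) ** 0.5


# Synthetic data for 71 hypothetical areas: one covariate weakly related to the outcome.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(71)]
ys = [0.2 * x + rng.gauss(0, 1) for x in xs]

scores = {m.__name__: cv_rmse(m, xs, ys) for m in (impute_mean, impute_regression)}
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

The method with the lowest held-out error is selected, so a simple method such as mean imputation can win when the covariates carry little information about the missing variable.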

Relevance:

30.00%

Publisher:

Abstract:

Introduction: Two symposia on “cardiovascular diseases and vulnerable plaques”. Cardiovascular disease (CVD) is the leading cause of death worldwide. Considerable effort has been made in many disciplines including medical imaging, computational modeling, biomechanics, bioengineering, medical devices, animal and clinical studies, population studies as well as genomic, molecular, cellular and organ-level studies seeking improved methods for early detection, diagnosis, prevention and treatment of these diseases [1-14]. However, the mechanisms governing the initiation, progression and the occurrence of final acute clinical CVD events are still poorly understood. A large number of victims of these diseases who are apparently healthy die suddenly without prior symptoms. Available screening and diagnostic methods are insufficient to identify the victims before the event occurs [8,9]. Most cardiovascular diseases are associated with vulnerable plaques. A grand challenge here is to develop new imaging techniques, predictive methods and patient screening tools to identify vulnerable plaques and patients who are more vulnerable to plaque rupture and associated clinical events such as stroke and heart attack, and recommend proper treatment plans to prevent those clinical events from happening. Articles in this special issue came from two symposia held recently focusing on “Cardiovascular Diseases and Vulnerable Plaques: Data, Modeling, Predictions and Clinical Applications.” One was held at Worcester Polytechnic Institute (WPI), Worcester, MA, USA, July 13-14, 2014, right after the 7th World Congress of Biomechanics. This symposium was endorsed by the World Council of Biomechanics, and partially supported by a grant from the NIH National Institute of Biomedical Imaging and Bioengineering. The other was held at Southeast University (SEU), Nanjing, China, April 18-20, 2014.

Relevance:

30.00%

Publisher:

Abstract:

Deriving an estimate of optimal fishing effort, or even an approximate estimate, is very valuable for managing fisheries with multiple target species. The most challenging task associated with this is allocating effort to individual species when only the total effort is recorded. Spatial information on the distribution of each species within a fishery can be used to justify the allocations, but often such information is not available. To determine the long-term overall effort required to achieve maximum sustainable yield (MSY) and maximum economic yield (MEY), we consider three methods for allocating effort: (i) optimal allocation, which optimally allocates effort among target species; (ii) fixed proportions, which chooses proportions based on past catch data; and (iii) economic allocation, which splits effort based on the expected catch value of each species. Determining the overall fishing effort required to achieve these management objectives is a maximization problem subject to constraints due to economic and social considerations. We illustrated the approaches using a case study of the Moreton Bay Prawn Trawl Fishery in Queensland (Australia). The results were consistent across the three methods. Importantly, our analysis demonstrated that the optimal total effort was very sensitive to daily fishing costs: the effort was 9500-11 500, 6000-7000, 4000 and 2500 boat-days for daily cost estimates of $0, $500, $750 and $950, respectively. The zero daily cost corresponds to the MSY, while a daily cost of $750 most closely represents the actual present fishing cost. Given the recent debate on which costs should be factored into the analyses for deriving MEY, our findings highlight the importance of including an appropriate cost function for practical management advice. The approaches developed here could be applied to other multispecies fisheries where only aggregated fishing effort data are recorded, as the literature on this type of modelling is sparse.
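
As a rough illustration of allocation methods (ii) and (iii) above, the sketch below splits a recorded total effort among species in proportion to a set of weights. It assumes a plain proportional split, which the abstract does not spell out. The species names, catches and prices are hypothetical, and the optimal allocation of method (i) is not attempted.

```python
def allocate(total_effort, weights):
    """Split total effort among species in proportion to non-negative weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {species: total_effort * w / total for species, w in weights.items()}


# Hypothetical inputs (not from the Moreton Bay case study).
past_catch_t = {"species_a": 300.0, "species_b": 100.0}  # past catch, tonnes
price_per_t = {"species_a": 5.0, "species_b": 15.0}      # expected price per tonne
total_effort = 4000.0                                    # boat-days

# (ii) Fixed proportions: weights are past catches.
fixed = allocate(total_effort, past_catch_t)

# (iii) Economic allocation: weights are expected catch values (catch times price).
catch_value = {s: past_catch_t[s] * price_per_t[s] for s in past_catch_t}
economic = allocate(total_effort, catch_value)

print(fixed)
print(economic)
```

With these made-up numbers the two rules disagree: the species with the smaller catch but higher price receives a larger share of effort under the economic rule.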

Relevance:

30.00%

Publisher:

Abstract:

Robust methods are useful in making reliable statistical inferences when there are small deviations from the model assumptions. The widely used method of the generalized estimating equations can be "robustified" by replacing the standardized residuals with the M-residuals. If the Pearson residuals are assumed to be unbiased from zero, parameter estimators from the robust approach are asymptotically biased when error distributions are not symmetric. We propose a distribution-free method for correcting this bias. Our extensive numerical studies show that the proposed method can reduce the bias substantially. Examples are given for illustration.
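
The source of the bias described above can be seen in a small simulation. The sketch assumes Huber's psi function with the conventional tuning constant c = 1.345 as the M-residual (the abstract does not name a specific psi) and uses shifted exponential errors as an arbitrary example of a skewed distribution. It only demonstrates the bias; it does not implement the authors' correction.

```python
import math
import random


def huber_psi(r: float, c: float = 1.345) -> float:
    """Huber's psi: identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, r))


rng = random.Random(1)
n = 200_000
# Skewed errors with mean 0 and variance 1: Exp(1) shifted by its mean.
residuals = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

mean_raw = sum(residuals) / n
mean_psi = sum(huber_psi(r) for r in residuals) / n

# For this distribution only the upper tail is clipped (residuals never fall
# below -1), so E[psi(e)] = -exp(-(1 + c)) in closed form.
expected_psi = -math.exp(-(1.0 + 1.345))
print(mean_raw, mean_psi, expected_psi)
```

The raw residuals average to approximately zero, but the clipped residuals do not, because clipping removes more mass from the long right tail than from the left. An estimating equation built on the clipped residuals is therefore no longer centred at the true parameter when the errors are asymmetric.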

Relevance:

30.00%

Publisher:

Abstract:

Background: Standard methods for quantifying IncuCyte ZOOM™ assays involve measurements that quantify how rapidly the initially-vacant area becomes re-colonised with cells as a function of time. Unfortunately, these measurements give no insight into the details of the cellular-level mechanisms acting to close the initially-vacant area. We provide an alternative method enabling us to quantify the role of cell motility and cell proliferation separately. To achieve this we calibrate standard data available from IncuCyte ZOOM™ images to the solution of the Fisher-Kolmogorov model. Results: The Fisher-Kolmogorov model is a reaction-diffusion equation that has been used to describe collective cell spreading driven by cell migration, characterised by a cell diffusivity, D, and carrying-capacity-limited proliferation with proliferation rate, λ, and carrying capacity density, K. By analysing temporal changes in cell density in several subregions located well behind the initial position of the leading edge we estimate λ and K. Given these estimates, we then apply automatic leading edge detection algorithms to the images produced by the IncuCyte ZOOM™ assay and match these data with a numerical solution of the Fisher-Kolmogorov equation to provide an estimate of D. We demonstrate this method by applying it to interpret a suite of IncuCyte ZOOM™ assays using PC-3 prostate cancer cells and obtain estimates of D, λ and K. Comparing estimates of D, λ and K for a control assay with estimates of D, λ and K for assays where epidermal growth factor (EGF) is applied in varying concentrations confirms that EGF enhances the rate of scratch closure and that this stimulation is driven by an increase in D and λ, whereas K is relatively unaffected by EGF. Conclusions: Our approach for estimating D, λ and K from an IncuCyte ZOOM™ assay provides more detail about cellular-level behaviour than standard methods for analysing these assays.
In particular, our approach can be used to quantify the balance of cell migration and cell proliferation and, as we demonstrate, allow us to quantify how the addition of growth factors affects these processes individually.
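
In one spatial dimension the Fisher-Kolmogorov model reads ∂C/∂t = D ∂²C/∂x² + λ C (1 − C/K), with C the cell density. The sketch below is one simple way to compute a numerical solution and track a leading edge, using an explicit finite-difference scheme with zero-flux boundaries. The scheme, the threshold-based edge definition and all parameter values are assumptions chosen for illustration; they are not the authors' solver, edge-detection algorithm or estimates.

```python
def fisher_kolmogorov_step(c, D, lam, K, dx, dt):
    """One explicit Euler step of dC/dt = D*d2C/dx2 + lam*C*(1 - C/K), zero-flux ends."""
    n = len(c)
    new = [0.0] * n
    for i in range(n):
        left = c[i - 1] if i > 0 else c[1]            # mirrored node gives zero flux
        right = c[i + 1] if i < n - 1 else c[n - 2]
        diffusion = D * (left - 2.0 * c[i] + right) / dx**2
        growth = lam * c[i] * (1.0 - c[i] / K)
        new[i] = c[i] + dt * (diffusion + growth)
    return new


def front_position(c, dx, threshold):
    """Rightmost grid position where the density exceeds the threshold."""
    hits = [i for i, v in enumerate(c) if v > threshold]
    return hits[-1] * dx if hits else 0.0


# Illustrative values only.
D, lam, K = 500.0, 0.05, 1.0e-3   # um^2/h, 1/h, cells/um^2
dx, dt = 10.0, 0.05               # um, h; explicit scheme needs dt <= dx^2 / (2*D) = 0.1
c = [0.5 * K if i < 50 else 0.0 for i in range(200)]  # cells occupy the left quarter

x0 = front_position(c, dx, 0.1 * K)
steps = round(24 / dt)            # simulate 24 h
for _ in range(steps):
    c = fisher_kolmogorov_step(c, D, lam, K, dx, dt)
x1 = front_position(c, dx, 0.1 * K)
print(f"leading edge moved from {x0:.0f} um to {x1:.0f} um")
```

Far behind the leading edge the diffusion term is negligible, so the density there follows logistic growth governed by λ and K alone. This is why those two parameters can be estimated first and D afterwards, by matching the simulated edge position to the observed one.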

Relevance:

30.00%

Publisher:

Abstract:

Background: Several prospective studies have suggested that gait and plantar pressure abnormalities secondary to diabetic peripheral neuropathy contribute to foot ulceration. There are many different methods by which gait and plantar pressures are assessed and currently there is no agreed standardised approach. This study aimed to describe the methods and reproducibility of three-dimensional gait and plantar pressure assessments in a small subset of participants using pre-existing protocols. Methods: Fourteen participants were conveniently sampled prior to a planned longitudinal study; four patients with diabetes and plantar foot ulcers, five patients with diabetes but no foot ulcers and five healthy controls. The repeatability of measuring key biomechanical data was assessed including the identification of 16 key anatomical landmarks, the measurement of seven leg dimensions, the processing of 22 three-dimensional gait parameters and the analysis of four different plantar pressure measures at 20 foot regions. Results: The mean inter-observer differences were within the pre-defined acceptable level (<7 mm) for 100 % (16 of 16) of key anatomical landmarks measured for gait analysis. The intra-observer assessment concordance correlation coefficients were > 0.9 for 100 % (7 of 7) of leg dimensions. The coefficients of variation (CVs) were within the pre-defined acceptable level (<10 %) for 100 % (22 of 22) of gait parameters. The CVs were within the pre-defined acceptable level (<30 %) for 95 % (19 of 20) of the contact area measures, 85 % (17 of 20) of mean plantar pressures, 70 % (14 of 20) of pressure time integrals and 55 % (11 of 20) of maximum sensor plantar pressure measures. Conclusion: Overall, the findings of this study suggest that important gait and plantar pressure measurements can be reliably acquired. Nearly all measures contributing to three-dimensional gait parameter assessments were within predefined acceptable limits.
Most plantar pressure measurements were also within predefined acceptable limits; however, reproducibility was not as good for assessment of the maximum sensor pressure. To our knowledge, this is the first study to investigate the reproducibility of several biomechanical methods in a heterogeneous cohort.
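
The two reproducibility statistics reported above can be computed as in the following sketch. It assumes the usual definitions: the coefficient of variation as the sample standard deviation over the mean, and Lin's concordance correlation coefficient in its 1/n variance form. The abstract does not state which variants were used, and the measurements below are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient (1/n variance form)."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical repeated measurements.
peak_pressure_trials = [412.0, 398.0, 430.0]        # three trials of one foot region
leg_length_first = [41.2, 38.5, 44.0, 39.8, 42.1]   # same observer, first measurement
leg_length_second = [41.0, 38.9, 43.6, 40.1, 42.4]  # same observer, repeat measurement

cv = coefficient_of_variation(peak_pressure_trials)
ccc = concordance_correlation(leg_length_first, leg_length_second)
print(f"CV = {cv:.1f} %, CCC = {ccc:.3f}")
```

Unlike Pearson's correlation, the concordance coefficient is reduced by a systematic offset between the two sets of measurements, which makes it suitable for agreement studies of this kind.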

Relevance:

30.00%

Publisher:

Abstract:

Background: Traffic offences have been considered an important predictor of crash involvement, and have often been used as a proxy safety variable for crashes. However, the association between crashes and offences has never been meta-analysed and the population effect size never established. Research is yet to determine the extent to which this relationship may be spuriously inflated through systematic measurement error, with obvious implications for researchers endeavouring to accurately identify salient factors predictive of crashes. Methodology and Principal Findings: Studies yielding a correlation between crashes and traffic offences were collated and a meta-analysis of 144 effects drawn from 99 road safety studies conducted. Potential impact of factors such as age, time period, crash and offence rates, crash severity and data type, sourced from either self-report surveys or archival records, were considered and discussed. After weighting for sample size, an average correlation of r = .18 was observed over the mean time period of 3.2 years. Evidence emerged suggesting the strength of this correlation is decreasing over time. Stronger correlations between crashes and offences were generally found in studies involving younger drivers. Consistent with common method variance effects, a within-country analysis found stronger effect sizes in self-reported data even controlling for crash mean. Significance: The effectiveness of traffic offences as a proxy for crashes may be limited. Inclusion of elements such as independently validated crash and offence histories or accurate measures of exposure to the road would facilitate a better understanding of the factors that influence crash involvement.
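
Weighting correlations by sample size, as mentioned above, can be illustrated with the simplest scheme: the mean of the study correlations weighted by each study's n. Whether the authors used exactly this estimator is an assumption, and the (r, n) pairs below are made up for the example.

```python
def weighted_mean_correlation(effects):
    """Sample-size-weighted mean of correlations: sum(n_i * r_i) / sum(n_i)."""
    total_n = sum(n for _, n in effects)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(r * n for r, n in effects) / total_n


# Hypothetical (correlation, sample size) pairs from three studies.
effects = [(0.25, 200), (0.10, 1000), (0.30, 300)]

r_bar = weighted_mean_correlation(effects)
r_unweighted = sum(r for r, _ in effects) / len(effects)
print(f"weighted mean r = {r_bar:.3f}, unweighted mean r = {r_unweighted:.3f}")
```

In this example the large study with the small correlation pulls the weighted mean below the unweighted one, which shows why the choice of weighting matters when small studies report larger effects.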

Relevance:

20.00%

Publisher: