957 results for Imbalanced datasets
Abstract:
These guidelines provide a practical and evidence-based resource for the management of patients with Barrett's oesophagus and related early neoplasia. The Appraisal of Guidelines for Research and Evaluation (AGREE II) instrument was followed to provide a methodological strategy for the guideline development. A systematic review of the literature was performed for English language articles published up until December 2012 in order to address controversial issues in Barrett's oesophagus including definition, screening and diagnosis, surveillance, pathological grading for dysplasia, management of dysplasia, and early cancer including training requirements. The rigour and quality of the studies were evaluated using the SIGN checklist system. Recommendations on each topic were scored by each author using a five-tier system (A+, strong agreement, to D+, strongly disagree). Statements that failed to reach substantial agreement among authors, defined as >80% agreement (A or A+), were revisited and modified until substantial agreement (>80%) was reached. In formulating these guidelines, we took into consideration benefits and risks for the population and national health system, as well as patient perspectives. For the first time, we have suggested stratification of patients according to their estimated cancer risk based on clinical and histopathological criteria. In order to improve communication between clinicians, we recommend the use of minimum datasets for reporting endoscopic and pathological findings. We advocate endoscopic therapy for high-grade dysplasia and early cancer, which should be performed in high-volume centres. We hope that these guidelines will standardise and improve management for patients with Barrett's oesophagus and related neoplasia.
Abstract:
Clade V nematodes comprise several parasitic species that include the cyathostomins, primary helminth pathogens of horses. Next generation transcriptome datasets are available for eight parasitic clade V nematodes, although no equine parasites are included in this group. Here, we report next generation transcriptome sequencing analysis for the common cyathostomin species, Cylicostephanus goldi. A cDNA library was generated from RNA extracted from 17 C. goldi male and female adult parasites. Following sequencing using a 454 GS FLX pyrosequencer, a total of 475,215 sequencing reads were generated, which were assembled into 26,910 contigs. Using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, 27% of the transcriptome was annotated. Further in-depth analysis was carried out by comparing the C. goldi dataset with the next generation transcriptomes and genomes of other clade V nematodes, with the Oesophagostomum dentatum transcriptome and the Haemonchus contortus genome showing the highest levels of sequence identity with the cyathostomin dataset (45%). The C. goldi transcriptome was mined for genes associated with anthelmintic mode of action and/or resistance. Sequences encoding proteins previously associated with the three major anthelmintic classes used in horses were identified, with the exception of the P-glycoprotein group. Targeted resequencing of the glutamate gated chloride channel α4 subunit (glc-3), one of the primary targets of the macrocyclic lactone anthelmintics, was performed for several cyathostomin species. We believe this study reports the first transcriptome dataset for an equine helminth parasite, providing the opportunity for in-depth analysis of these important parasites at the molecular level. Sequences encoding enzymes involved in key processes and genes associated with levamisole/pyrantel and macrocyclic lactone resistance, in particular the glutamate gated chloride channels, were identified. 
These novel data will inform future studies of cyathostomin biology and anthelmintic resistance.
Abstract:
We present the latest analysis and results from SEPPCoN (Survey of Ensemble Physical Properties of Cometary Nuclei). This ongoing survey involves studying 100 JFCs - about 25% of the known population - at both mid-infrared and visible wavelengths to constrain the distributions of sizes, shapes, spins, and albedos of this population. Having earlier reported results from measuring thermal emissions of our sample nuclei [1,2,3,4], we report here progress on the visible-wavelength observations that we have obtained at many ground-based facilities in Chile, Spain, and the United States. To date we have attempted observations of 91% of our sample of 100 JFCs, and at least 64 of those were successfully detected. In most cases the comets were at heliocentric distances between 3.0 and 6.5 AU so as to decrease the odds of a comet having a coma. Of the 64 detected comets, 48 were apparently bare, having no extended emission. Our datasets are further augmented by archival data and photometry from the NEAT program [5]. An important goal of SEPPCoN is to accumulate a large, comprehensive set of high-quality physical data on cometary nuclei in order to make accurate statistical comparisons with other minor-body populations such as Trojans, Centaurs, and Kuiper-belt objects. Information on the size, shape, spin-rate, albedo and color distributions is critical for understanding their origins and the evolutionary processes affecting them.
Abstract:
We present new results from SEPPCoN, a Survey of Ensemble Physical Properties of Cometary Nuclei. This project is currently surveying 100 Jupiter-family comets (JFCs) to measure the mid-infrared thermal emission and visible reflected sunlight of the nuclei. The scientific goal is to determine the distributions of radius, geometric albedo, thermal inertia, axial ratio, and color among the JFC nuclei. In the past we have presented results from the completed mid-IR observations of our sample [1]; here we present preliminary results from ongoing, broadband visible-wavelength observations of nuclei obtained from a variety of ground-based facilities (Mauna Kea, Cerro Pachon, La Silla, La Palma, Apache Point, Table Mtn., and Palomar Mtn.), including contributions from the Near Earth Asteroid Telescope project (NEAT) archive. The nuclei were observed at high heliocentric distance (usually over 4 AU), so many comets show little or no contamination from dust coma. While several nuclei have been observed only as snapshots, we have multiepoch photometry for many of our targets. With our datasets we are building a large database of photometry; such a database is essential to deriving the albedo and shape of a large number of nuclei, and to understanding biases in the survey. Support for this work was provided by NSF and the NASA Planetary Astronomy program. Reference: [1] Fernandez, Y.R., et al. 2007, BAAS 39, 827.
Abstract:
Wind power in the British Isles is currently dominated by onshore wind farms, but both the United Kingdom and the Republic of Ireland have high renewable energy targets, which are expected to be met mostly by wind power. As the demand for wind power grows, to ensure security of energy supply, as a potentially cheaper alternative to fossil fuels, and to meet greenhouse gas emissions reduction targets, offshore wind power will grow rapidly as the availability of suitable onshore sites decreases. However, wind is variable and stochastic by nature and thus difficult to schedule. In order to plan for these uncertainties, market operators use wind forecasting tools, reserve plant and ancillary service agreements. Onshore wind power forecasting techniques have improved dramatically and continue to advance, but offshore wind power forecasting is more difficult due to limited datasets and knowledge. As the amount of offshore wind power in the British Isles increases, robust forecasting and planning techniques become even more critical. This paper presents a methodology to investigate the impacts of better offshore wind forecasting on the operation and management of the single wholesale electricity market in the Republic of Ireland and Northern Ireland using PLEXOS for Power Systems. © 2013 IEEE.
Abstract:
Globally, the amount of installed terrestrial wind power, both onshore and offshore, has grown rapidly over the last twenty years. Most large onshore and offshore wind turbines are designed to harvest winds within the atmospheric boundary layer, which can be very variable due to terrain and weather effects. The height of the neutral atmospheric boundary layer is estimated at above 1300 m. A relatively new concept is to harvest more consistent wind conditions above the atmospheric boundary layer using high altitude wind harvesting devices such as tethered kites, airfoils and dirigible rotors. This paper presents a techno-economic feasibility study of high altitude wind power in Northern Ireland. First, this research involved a state-of-the-art review of the resource and the technologies proposed for high altitude wind power. Next, the techno-economic analysis, involving four steps, is presented. In step one, the potential of high altitude wind power in Northern Ireland is estimated using online datasets (e.g. Earth System Research Laboratory). In step two, a map is developed for easier visualisation of geographical limitations (e.g. airports, areas of scenic beauty, flight paths, military training areas, settlements etc.) that could impact on high altitude wind power. In step three, the actual feasible resource available is recalculated using the visualisation map to determine the ‘optimal’ high altitude wind power locations in Northern Ireland. In step four, the list of equipment, resources and budget needed to build a demonstrator is provided in the form of a concise techno-economic appraisal using the findings of the previous three steps.
Abstract:
In this paper we propose a graph stream clustering algorithm with a unified similarity measure on both structural and attribute properties of vertices, with each attribute being treated as a vertex. Unlike other approaches, ours does not require an input parameter for the number of clusters; instead, it dynamically creates new sketch-based clusters and periodically merges existing similar clusters. Experiments on two publicly available datasets reveal the advantages of our approach in detecting vertex clusters in the graph stream. We provide a detailed investigation into how parameters affect the algorithm's performance. We also provide a quantitative evaluation and comparison with a well-known offline community detection algorithm, which shows that our streaming algorithm can achieve comparable or better average cluster purity.
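The abstract above reports average cluster purity without defining it. As a minimal sketch, assuming the common definition (for each cluster, the fraction of members that share the cluster's most frequent ground-truth label, averaged over non-empty clusters), it could be computed as follows. The function name and the input layout are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def average_cluster_purity(clusters, labels):
    """Average purity over non-empty clusters.

    clusters: dict mapping a cluster id to a list of vertex ids (assumed layout).
    labels:   dict mapping a vertex id to its ground-truth label (assumed layout).
    """
    purities = []
    for members in clusters.values():
        if not members:
            continue  # an empty cluster has no defined purity
        counts = Counter(labels[v] for v in members)
        majority_size = counts.most_common(1)[0][1]
        purities.append(majority_size / len(members))
    return sum(purities) / len(purities) if purities else 0.0


# Toy example: cluster "a" is 2/3 pure, cluster "b" is fully pure.
toy_clusters = {"a": [1, 2, 3], "b": [4, 5]}
toy_labels = {1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}
toy_purity = average_cluster_purity(toy_clusters, toy_labels)
```

Note that this variant weights every cluster equally; a size-weighted purity is also widely used, and the abstract does not say which one the authors report.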
Abstract:
BACKGROUND: The impact of bronchiectasis on sedentary behaviour and physical activity is unknown. It is important to explore this to identify the need for physical activity interventions and how to tailor interventions to this patient population. We aimed to explore the patterns and correlates of sedentary behaviour and physical activity in bronchiectasis.
METHODS: Physical activity was assessed in 63 patients with bronchiectasis using an ActiGraph GT3X+ accelerometer over seven days. Patients completed: questionnaires on health-related quality-of-life and attitudes to physical activity (questions based on an adaptation of the transtheoretical model (TTM) of behaviour change); spirometry; and the modified shuttle test (MST). Multiple linear regression analysis using forward selection based on likelihood ratio statistics explored the correlates of sedentary behaviour and physical activity dimensions. Between-group analyses using independent-sample t-tests explored differences for selected variables.
RESULTS: Fifty-five patients had complete datasets. Mean (standard deviation) daily time spent in sedentary behaviour was 634 (77) min, in light-lifestyle physical activity 207 (63) min and in moderate-vigorous physical activity (MVPA) 25 (20) min. Only 11% of patients met recommended guidelines. Forced expiratory volume in one-second percentage predicted (FEV1% predicted) and disease severity were not correlates of sedentary behaviour or physical activity. For sedentary behaviour, decisional balance 'pros' score was the only correlate. Performance on the MST was the strongest correlate of physical activity. In addition to the MST, there were other important correlate variables for MVPA accumulated in ≥10-minute bouts (QOL-B Social Functioning) and for activity energy expenditure (Body Mass Index and QOL-B Respiratory Symptoms).
CONCLUSIONS: Patients with bronchiectasis demonstrated a largely inactive lifestyle and few met the recommended physical activity guidelines. Exercise capacity was the strongest correlate of physical activity, and dimensions of the QOL-B were also important. FEV1% predicted and disease severity were not correlates of sedentary behaviour or physical activity. The inclusion of a range of physical activity dimensions could facilitate in-depth exploration of patterns of physical activity. This study demonstrates the need for interventions targeted at reducing sedentary behaviour and increasing physical activity, and provides information to tailor interventions to the bronchiectasis population.
Abstract:
Objective: Diabetic nephropathy (DN) is a microvascular complication of diabetes. Members of the WNT/β-catenin pathways have been implicated in interstitial fibrosis and glomerular sclerosis, characteristic hallmarks of DN. These processes are controlled, in part, by transcription factors (TFs), proteins which bind to gene promoter regions attenuating their regulation. We sought to identify predicted cis-acting transcription factor binding sites (TFBS) over-represented within the promoter regions of WNT pathway members compared to genes across the genome. Methods: We assessed the frequency of 62 TFBS motifs from the JASPAR databases on 65 WNT pathway genes. P-values were estimated on the hypergeometric distribution for each TF. Gene expression profiles of enriched motifs were examined from DN-related datasets to assess clinical significance. Results: TFBS motifs transcription factor AP-2 alpha (TFAP2A), myeloid zinc finger 1 (MZF1), and specificity protein 1 (SP1) were significantly enriched within WNT pathway genes (P-values < 6.83 × 10⁻²⁹, 1.34 × 10⁻¹¹ and 3.01 × 10⁻⁶, respectively). MZF1 gene expression was significantly increased in DN in a whole kidney dataset (fold change = 1.16; 16% increase; P = 0.03). TFAP2A gene expression was decreased in an independent dataset (fold change = -1.02; P = 0.03). SP1 was not differentially expressed in any datasets examined. Conclusions: Three TFBS profiles are significantly enriched within the WNT pathway genes examined, highlighting the use of in silico analyses for identifying key regulators of this pathway. Modification of TF binding to gene promoter regions involved in DN pathology may limit progression, making refinement of targeted therapeutic strategies possible through clearer delineation of their role.
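The enrichment P-values in the abstract above are estimated on the hypergeometric distribution. A minimal sketch of such an over-representation test, using only the Python standard library, is given below. It assumes the usual upper-tail formulation, P(X ≥ observed); the function name, the argument names and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: number of genes in the background set (e.g. the genome)
    successes:  background genes whose promoter carries the motif
    draws:      number of genes in the set of interest (e.g. a pathway)
    observed:   genes in the set of interest that carry the motif
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    # comb(n, k) returns 0 when k > n, so impossible terms vanish on their own.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only: 20,000 background genes,
# 2,000 with the motif, a 65-gene pathway, 20 pathway genes with the motif.
p_example = hypergeom_enrichment_p(20000, 2000, 65, 20)
```

Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product of factorials would hit for very small tail probabilities; the final division is the only floating-point step.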
Abstract:
Immunotherapy is a promising strategy for the treatment of various types of cancer. An antibody that targets the programmed death ligand-1 (PD-L1) pathway has been shown to be active towards various types of cancer, including melanoma and lung cancer. MPDL3280A, an anti-PD-L1 antibody, has shown clear clinical activity in PD-L1-overexpressing bladder cancer with an objective response rate of 40-50%, resulting in a breakthrough therapy designation granted by the FDA. These events underscore the importance of targeting the PD-L1 pathway in the treatment of bladder cancer. In the present study, we investigated the prognostic significance of the expression of three genes in the PD-L1 pathway, PD-L1, B7.1 and PD-1, in three independent bladder cancer datasets in the Gene Expression Omnibus database. PD-L1, B7.1 and PD-1 were significantly associated with clinicopathological parameters indicative of a more aggressive phenotype of bladder cancer, such as a more advanced stage and a higher tumor grade. In addition, high-level expression of PD-L1 was associated with reduced patient survival. Of note, the combination of PD-L1 and B7.1 expression, but not other combinations of the three genes, was also able to predict patient survival. Our findings support the development of anti-PD-L1, which blocks PD-L1-PD-1 and B7.1-PD-L1 interactions, in the treatment of bladder cancer. The observations were consistent in the three independent bladder cancer datasets consisting of a total of 695 human bladder specimens. The datasets were then assessed and it was found that the expression levels of the chemokine CC-motif ligands (CCL) CCL3, CCL8 and CCL18 were correlated with the PD-L1 expression level, while ADAMTS13 was differentially expressed in patients with a different survival status (alive or deceased). Additional investigations are required to elucidate the role of these genes in PD-L1-mediated immune system suppression and bladder cancer progression.
In conclusion, the findings of this study suggest that PD-L1 is an important prognostic marker and a therapeutic target for bladder cancer.
Abstract:
Targeting angiogenesis through inhibition of the vascular endothelial growth factor (VEGF) pathway has been successful in the treatment of late stage colorectal cancer. However, not all patients benefit from inhibition of VEGF. Ras status is a powerful biomarker for response to anti-epidermal growth factor receptor therapy; however, an appropriate biomarker for response to anti-VEGF therapy is yet to be identified. VEGF and its receptors, FLT1 and KDR, play a crucial role in colon cancer progression; individually, these factors have been shown to be prognostic in colon cancer; however, expression of none of these factors alone was predictive of tumor response to anti-VEGF therapy. In the present study, we analyzed the expression levels of VEGFA, FLT1, and KDR in two independent colon cancer datasets and found that high expression levels of all three factors afforded a very poor prognosis. The observation was further confirmed in another independent colon cancer dataset, wherein high levels of expression of this three-gene signature were predictive of poor prognosis in patients with proficient mismatch repair, a wild-type KRas status, or mutant p53 status. Most importantly, this signature also predicted tumor response to bevacizumab, an antibody targeting VEGFA, in a cohort of bevacizumab-treated patients. Since bevacizumab has been proven to be an important drug in the treatment of advanced stage colon cancer, our results suggest that the three-gene signature approach is valuable in terms of its prognostic value, and that it should be further evaluated in a prospective clinical trial to investigate its predictive value for anti-VEGF treatment.
Abstract:
Modern cancer research on prognostic and predictive biomarkers demands the integration of established and emerging high-throughput technologies. However, these data are meaningless unless carefully integrated with patient clinical outcome and epidemiological information. Integrated datasets hold the key to discovering new biomarkers and therapeutic targets in cancer. We have developed a novel approach and set of methods for integrating and interrogating phenomic, genomic and clinical datasets to facilitate cancer biomarker discovery and patient stratification. Applied to a known paradigm, the biological and clinical relevance of TP53, PICan was able to recapitulate the known biomarker status and prognostic significance at the DNA, RNA and protein levels.
Abstract:
In this paper we explore ways to address the issue of dataset bias in person re-identification by using data augmentation to increase the variability of the available datasets, and we introduce a novel data augmentation method for re-identification based on changing the image background. We show that use of data augmentation can improve the cross-dataset generalisation of convolutional network based re-identification systems, and that changing the image background yields further improvements.
Abstract:
A brief historical overview is given of 10 popular proposed reaction mechanisms and their associated rate equations, which are apparently different although, in some cases, closely related upon inspection; the rate expression for each mechanism is derived from basic principles in Appendix A. In Appendix B, each of the 5 main mechanisms is tested using datasets comprising initial reaction rate vs. organic pollutant concentration, [P], and incident irradiance, ρ, data reported previously for TiO2, where P is phenol, 4-chlorophenol or formic acid. The best of those tested, in terms of overall fit, simplicity, usefulness and versatility, is the disrupted adsorption kinetic model proposed by Ollis. The usual basic assumptions made in constructing these mechanisms are reported and the main underlying concerns explored.
Abstract:
Slow release drugs must be manufactured to meet target specifications with respect to dissolution curve profiles. In this paper we consider the problem of identifying the drivers of dissolution curve variability of a drug from historical manufacturing data. Several data sources are considered: raw material parameters, coating data, loss on drying and pellet size statistics. The methodology employed is to develop predictive models using LASSO, a powerful machine learning algorithm for regression with high-dimensional datasets. LASSO provides sparse solutions facilitating the identification of the most important causes of variability in the drug fabrication process. The proposed methodology is illustrated using manufacturing data for a slow release drug.
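To illustrate why LASSO yields the sparse solutions mentioned above, here is a minimal sketch of LASSO regression fitted by cyclic coordinate descent with soft-thresholding, in plain Python. The abstract does not say which solver or software the authors used, so this is an assumed textbook formulation (minimising (1/2n)·||y − Xw||² + alpha·||w||₁, no intercept, columns assumed centred). All names and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; values within the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=200):
    """Fit LASSO weights by cyclic coordinate descent.

    X: list of rows (each a list of feature values), assumed centred.
    y: list of targets, assumed centred (no intercept is fitted).
    alpha: L1 penalty strength; larger values zero out more coefficients.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw for the initial w = 0
    for _ in range(n_iters):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # a constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes j's own contribution.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                w[j] = new_w
    return w


# Toy data: the target depends only on the first feature; the second is irrelevant.
X_toy = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y_toy = [2.0, -2.0, 2.0, -2.0]
w_toy = lasso_coordinate_descent(X_toy, y_toy, alpha=0.5)
```

On the toy data the irrelevant second feature receives a coefficient of exactly zero, while the first is shrunk below its unpenalised value of 2. This is the behaviour that makes the non-zero coefficients usable as a shortlist of candidate drivers of variability.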