108 results for EXTRACTION CHROMATOGRAPHY
Abstract:
We present an overview of the QUT plant classification system submitted to LifeCLEF 2014. This system uses generic features extracted from a convolutional neural network previously used to perform general object classification. We examine the effectiveness of these features to perform plant classification when used in combination with an extremely randomised forest. Using this system, with minimal tuning, we obtained relatively good results with a score of 0.249 on the test set of LifeCLEF 2014.
Abstract:
Erythropoietin (EPO), a glycoprotein hormone of ∼34 kDa, is an important hematopoietic growth factor that is mainly produced in the kidney and controls the number of red blood cells circulating in the bloodstream. Sensitive and rapid recombinant human EPO (rHuEPO) detection tools that improve on the current laborious EPO detection techniques are in high demand in both the clinical and sports industries. A sensitive aptamer-functionalized biosensor (aptasensor) has been developed by controlled growth of gold nanostructures (AuNS) over a gold substrate (pAu/AuNS). The aptasensor selectively binds to rHuEPO and, therefore, was used to extract and detect the drug from horse plasma by surface enhanced Raman spectroscopy (SERS). Due to the nanogap separation between the nanostructures and the high population and distribution of hot spots on the pAu/AuNS substrate surface, strong signal enhancement was acquired. By using a wide area illumination (WAI) setting for the Raman detection, a low RSD of 4.92% over 150 SERS measurements was achieved. The significant reproducibility of the new biosensor addresses the serious problem of SERS signal inconsistency that hampers the use of the technique in the field. The WAI setting is compatible with handheld Raman devices. Therefore, the new aptasensor can be used for the selective extraction of rHuEPO from biological fluids, which can subsequently be screened with a handheld Raman spectrometer for SERS-based in-field protein detection.
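The reproducibility figure quoted above is a relative standard deviation (RSD), that is, the sample standard deviation expressed as a percentage of the mean. A minimal sketch of that calculation follows; the intensity values are hypothetical placeholders, not data from the study, and the function name is illustrative.

```python
import statistics

def relative_standard_deviation(values):
    """Return the RSD in percent: sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Hypothetical SERS peak intensities (arbitrary units), for illustration only.
intensities = [100, 102, 98, 101, 99]
rsd = relative_standard_deviation(intensities)
print(f"RSD = {rsd:.2f}%")
```

A lower RSD across repeated measurements indicates a more consistent signal.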
Abstract:
Objective: This paper presents an automatic active learning-based system for the extraction of medical concepts from clinical free-text reports. Specifically, (1) the contribution of active learning in reducing the annotation effort, and (2) the robustness of the incremental active learning framework across different selection criteria and datasets are determined. Materials and methods: The comparative performance of an active learning framework and a fully supervised approach was investigated to study how active learning reduces the annotation effort while achieving the same effectiveness as a supervised approach. Conditional Random Fields were used as the supervised method, and least confidence and information density as the two selection criteria for the active learning framework. The effect of incremental learning vs. standard learning on the robustness of the models within the active learning framework with different selection criteria was also investigated. Two clinical datasets were used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Results: The annotation effort saved by active learning to achieve the same effectiveness as supervised learning is up to 77%, 57%, and 46% of the total number of sequences, tokens, and concepts, respectively. Compared to the random sampling baseline, the saving is at least doubled. Discussion: Incremental active learning guarantees robustness across all selection criteria and datasets. The reduction of annotation effort is always above the random sampling and longest sequence baselines. Conclusion: Incremental active learning is a promising approach for building effective and robust medical concept extraction models, while significantly reducing the burden of manual annotation.
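The least confidence criterion mentioned above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it scores each unlabelled sample by one minus the model's highest predicted probability, whereas a CRF-based system would typically use the probability of the most likely label sequence. The sample identifiers and probabilities are hypothetical.

```python
def least_confidence(probabilities):
    """Return 1 - max probability; higher means the model is less certain."""
    return 1.0 - max(probabilities)

def select_batch(pool, batch_size):
    """Pick the batch_size samples the model is least confident about.

    pool maps a (hypothetical) sample id to the model's predicted
    probability distribution for that sample.
    """
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:batch_size]

# Hypothetical unlabelled pool with made-up model outputs.
pool = {
    "report_a": [0.95, 0.03, 0.02],
    "report_b": [0.40, 0.35, 0.25],
    "report_c": [0.70, 0.20, 0.10],
}
selected = select_batch(pool, 2)
print(selected)
```

In an active learning loop, the selected samples would be sent for annotation, added to the training set, and the model retrained before the next selection round.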
Abstract:
This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular with respect to the reduction in annotation effort that the new query strategy makes possible, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrated that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to manually annotate than with state-of-the-art query strategies.
Abstract:
An automated method for extracting brain volumes from three commonly acquired three-dimensional (3D) MR images (proton density, T1-weighted, and T2-weighted) of the human head is described. The procedure is divided into four levels: preprocessing, segmentation, scalp removal, and postprocessing. A user-provided reference point is the sole operator-dependent input required. The method's parameters were first optimized and then fixed and applied to 30 repeat data sets from 15 normal older adult subjects to investigate its reproducibility. Percent differences between total brain volumes (TBVs) for the subjects' repeated data sets ranged from 0.5% to 2.2%. We conclude that the method is both robust and reproducible and has the potential for wide application.
Abstract:
We are currently facing an overwhelming growth in the number of reliable information sources on the Internet. The quantity of information available to everyone via the Internet is growing dramatically each year [15]. At the same time, the temporal and cognitive resources of human users are not changing, which causes the phenomenon of information overload. The World Wide Web is one of the main sources of information for decision makers (reference to my research). However, our studies show that, at least in Poland, decision makers see some important problems when turning to the Internet as a source of decision information. One of the most common obstacles raised is the distribution of relevant information among many sources, and therefore the need to visit different Web sources in order to collect all important content and analyze it. A few research groups have recently turned to the problem of information extraction from the Web [13]. Most effort so far has been directed toward collecting data from dispersed databases accessible via web pages (referred to as data extraction or information extraction from the Web) and toward understanding natural language texts by means of fact, entity, and association recognition (referred to as information extraction). Data extraction efforts show some interesting results; however, proper integration of web databases is still beyond us. The information extraction field has recently been very successful in retrieving information from natural language texts; however, it still lacks the ability to understand more complex information, which requires the use of common sense knowledge, discourse analysis and disambiguation techniques.
Abstract:
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative XPath expressions, although not widely used, should be used in preference to absolute XPath expressions in extracting content from human-created Web documents. Evaluation of robustness covers four thousand queries executed on several hundred webpages. We show that in referencing parts of real world dynamic HTML documents, relative XPath expressions are on average significantly more robust than absolute XPath ones.
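The robustness difference described above can be illustrated with a small sketch. The two page snippets, the `id="price"` attribute, and both path expressions are hypothetical and are not taken from the study. The example reads "relative" as a path anchored on a distinguishing attribute rather than on positions counted from the document root, and it uses Python's `xml.etree.ElementTree`, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a redesign.
ORIGINAL = """<html><body>
<div><h1>Shop</h1></div>
<div><p id="price">42 EUR</p></div>
</body></html>"""

# Same page after a hypothetical redesign: a banner div is inserted first.
REDESIGNED = """<html><body>
<div><p>Sale banner</p></div>
<div><h1>Shop</h1></div>
<div><p id="price">42 EUR</p></div>
</body></html>"""

ABSOLUTE = "./body/div[2]/p"    # position-based path from the root element
RELATIVE = ".//p[@id='price']"  # anchored on an attribute, at any depth

def extract(page, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(page).find(path)
    return node.text if node is not None else None

for name, page in (("original", ORIGINAL), ("redesigned", REDESIGNED)):
    print(name, extract(page, ABSOLUTE), extract(page, RELATIVE))
```

Both expressions find the price on the original page, but only the attribute-anchored one still finds it after the extra `div` shifts the element positions.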
Abstract:
Purified proteins are mandatory for molecular, immunological and cellular studies. However, purification of proteins from complex mixtures requires specialised chromatography methods (e.g., gel filtration, ion exchange) using fast protein liquid chromatography (FPLC) or high-performance liquid chromatography (HPLC) systems. Such systems are expensive, and certain proteins require two or more different steps to reach sufficient purity, which generally results in low recovery. The aim of this study was to develop a rapid, inexpensive and efficient gel-electrophoresis-based protein purification method using basic and readily available laboratory equipment. We used crude rye grass pollen extract to purify the major allergens Lol p 1 and Lol p 5 as the model protein candidates. Total proteins were resolved on a large primary gel, and Coomassie Brilliant Blue (CBB)-stained Lol p 1/5 allergens were excised and purified on a secondary "mini"-gel. Purified proteins were extracted from unstained separating gels and subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and immunoblot analyses. Silver-stained SDS-PAGE gels resolved pure proteins (i.e., 875 μg of Lol p 1 recovered from 8 mg of crude starting material) while immunoblot analysis confirmed immunological reactivity of the purified proteins. Such a purification method is rapid, inexpensive, and efficient in generating proteins of sufficient purity for use in monoclonal antibody (mAb) production, protein sequencing and general molecular, immunological, and cellular studies.
Abstract:
Flos Chrysanthemum is a generic name for a particular group of edible plants, which also have medicinal properties. There are, in fact, twenty to thirty different cultivars, which are commonly used in beverages and for medicinal purposes. In this work, four Flos Chrysanthemum cultivars, Hangju, Taiju, Gongju, and Boju, were collected, and chromatographic fingerprints were used to distinguish and assess these cultivars for quality control purposes. Chromatographic fingerprints contain chemical information but also often have baseline drifts and peak shifts, which complicate data processing. Adaptive iteratively reweighted penalized least squares and correlation optimized warping were therefore applied to correct the fingerprint peaks. The adjusted data were submitted to unsupervised and supervised pattern recognition methods. Principal component analysis was used to qualitatively differentiate the Flos Chrysanthemum cultivars. Partial least squares, continuum power regression, and K-nearest neighbors were used to predict the unknown samples. Finally, the elliptic joint confidence region method was used to evaluate the prediction ability of these models. The partial least squares and continuum power regression methods were shown to best represent the experimental results.
Abstract:
Many protocols have been used for extraction of DNA from Thraustochytrids. These generally involve the use of CTAB, phenol/chloroform and ethanol. They also feature mechanical grinding, sonication, N2 freezing or bead beating. However, the resulting chemical and physical damage to extracted DNA reduces its quality. The methods are also unsuitable for large numbers of samples. Commercially available DNA extraction kits give better quality and yields but are expensive. Therefore, an optimized DNA extraction protocol suitable for Thraustochytrids was developed, both to minimise expensive and time-consuming steps prior to DNA extraction and to improve the yield. The most effective method is a combination of a single bead in a TissueLyser (Qiagen) and Proteinase K. Results were conclusive: both the quality and the yield of extracted DNA were higher than with any other method, giving an average yield of 8.5 µg/100 mg biomass.
Abstract:
Scientists have injected endotoxin into animals for several decades to investigate and understand various pathologies and novel therapies. Recent observations have shown that there is selective susceptibility to Escherichia coli lipopolysaccharide (LPS) endotoxin in sheep, despite similar breed characteristics. The reason behind this difference is unknown, and has prompted studies aiming to explain the variation by proteogenomic characterisation of circulating acute phase biomarkers. It is hypothesised that genetic traits and biochemical, immunological and inflammation marker patterns contribute to defining and predicting the mammalian response to LPS. This review discusses the effects of endotoxin and host responses, the genetic basis of innate defences, activation of the acute phase response (APR) following experimental LPS challenge, and the current approaches employed in detecting novel biomarkers including acute phase proteins (APP) and micro-ribonucleic acids (miRNAs) in serum or plasma. miRNAs are novel targets for elucidating molecular mechanisms of disease because of their differential expression in pathological and healthy states. Changes in miRNA profiles during a disease challenge may be reflected in plasma. Studies show that gel-based two-dimensional electrophoresis (2-DE) coupled with either matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) or liquid chromatography-mass spectrometry (LC-MS/MS) is currently the most widely used method for proteome characterisation. Further evidence suggests that proteomic investigations are preferentially shifting from 2-DE to non-gel based LC-MS/MS coupled with data extraction by sequential window acquisition of all theoretical fragment-ion spectra (SWATH) approaches, which are able to identify a wider range of proteins.
Enzyme-linked immunosorbent assay (ELISA), quantitative real-time polymerase chain reaction (qRT-PCR), and most recently proteomic methods have been used to quantify low abundance proteins such as cytokines. qRT-PCR and next generation sequencing (NGS) are used for the characterisation of miRNA. Proteogenomic approaches for detecting APP and novel miRNA profiling are essential in understanding the selective resistance to endotoxin in sheep. The results of these methods could help in understanding similar pathology in humans. They might also be helpful in the development of physiological and diagnostic screening assays for determining experimental inclusion and endpoints, and in future clinical trials.
Abstract:
Frog protection has become increasingly essential due to the rapid decline in frog biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With the k-means clustering algorithm, the calculated dominant frequencies of all frog species are clustered into k parts, which produces a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experimental results show that the proposed features can achieve an average classification accuracy of 97.45%, which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficient (MFCC) features (90.80%).
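The clustering step described above can be sketched in a few lines. This is an illustrative one-dimensional k-means under stated assumptions, not the authors' implementation: the dominant frequency values are hypothetical, the deterministic initialisation is chosen for reproducibility, and placing band edges at the midpoints between adjacent cluster centres is an assumption about how a frequency scale could be derived from the clusters.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values into k groups (k >= 2); return sorted centres."""
    data = sorted(values)
    # Deterministic start: k values spread evenly through the sorted data.
    centers = [float(data[i * (len(data) - 1) // (k - 1)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def band_edges(centers):
    """Assumed rule: split the spectrum at midpoints between centres."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

# Hypothetical dominant frequencies in Hz, for illustration only.
dominant_hz = [500, 520, 540, 2000, 2100, 2050, 4000, 4100]
centers = kmeans_1d(dominant_hz, k=3)
edges = band_edges(centers)
print(centers, edges)
```

The resulting edges divide the spectrum more finely where the calls concentrate, which is the idea behind adapting the decomposition to the data rather than using a fixed scale.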
Abstract:
This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings indicating that the use of word embeddings significantly improves the effectiveness of concept extraction models; however, we further determine the influence of the corpora used to generate such features. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.
Abstract:
Dialkyl phthalate esters (phthalates) are ubiquitous chemicals used extensively as plasticizers, solvents and adhesives in a range of industrial and consumer products. 1,2-Cyclohexane dicarboxylic acid, diisononyl ester (DINCH) is a phthalate alternative introduced due to a more favourable toxicological profile, but exposure is largely uncharacterised. The aim of this study was to provide the first assessment of exposure to phthalates and DINCH in the general Australian population. De-identified urine specimens stratified by age and sex were obtained from a community-based pathology laboratory and pooled (n = 24 pools of 100). Concentrations of free and total species were measured using online solid phase extraction isotope dilution high performance liquid chromatography tandem mass spectrometry. Concentrations ranged from 2.4 to 71.9 ng/mL for metabolites of di(2-ethylhexyl)phthalate, and from < 0.5 to 775 ng/mL for all other metabolites. Our data suggest that phthalate metabolite concentrations in Australia were at least two times higher than in the United States and Germany, which may be related to legislative differences among countries. DINCH metabolite concentrations were comparatively low and consistent with the limited data available. Ongoing biomonitoring among the general Australian population may help assess temporal trends in exposure and the effectiveness of actions aimed at reducing exposures.
Abstract:
Fluorinated surfactant-based aqueous film-forming foams (AFFFs) are made up of per- and polyfluorinated alkyl substances (PFAS) and are used to extinguish fires involving highly flammable liquids. The use of perfluorooctanesulfonic acid (PFOS) and other perfluoroalkyl acids (PFAAs) in some AFFF formulations has been linked to substantial environmental contamination. Recent studies have identified a large number of novel and infrequently reported fluorinated surfactants in different AFFF formulations. In this study, a strategy based on a case-control approach using quadrupole time-of-flight tandem mass spectrometry (QTOF-MS/MS) and advanced statistical methods has been used to extract and identify known and unknown PFAS in human serum associated with AFFF-exposed firefighters. Two target sulfonic acids [PFOS and perfluorohexanesulfonic acid (PFHxS)], three non-target acids [perfluoropentanesulfonic acid (PFPeS), perfluoroheptanesulfonic acid (PFHpS), and perfluorononanesulfonic acid (PFNS)], and four unknown sulfonic acids (Cl-PFOS, ketone-PFOS, ether-PFHxS, and Cl-PFHxS) were exclusively or significantly more frequently detected at higher levels in firefighters compared to controls. The application of this strategy has allowed for identification of previously unreported fluorinated chemicals in a timely and cost-efficient way.