930 resultados para extraction methods
Resumo:
Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular existing term-based feature selection methods suffer from noisy feature extraction, which is irrelevant to the user needs (noisy). One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. The method then uses an extended random set to accurately weight n-grams based on their distribution in the documents and their terms distribution in n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves the performance. The experimental results on Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms the state-of-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
Resumo:
RATIONALE: Polymer-based surface coatings in outdoor applications experience accelerated degradation due to exposure to solar radiation, oxygen and atmospheric pollutants. These deleterious agents cause undesirable changes to the aesthetic and mechanical properties of the polymer, reducing its lifetime. The use of antioxidants such as hindered amine light stabilisers (HALS) retards these degradative processes; however, mechanisms for HALS action and polymer degradation are poorly understood. METHODS: Detection of the HALS TINUVINW123 (bis(1-octyloxy-2,2,6,6-tetramethyl-4-piperidyl) sebacate) and the polymer degradation products directly from a polyester-based coil coating was achieved by liquid extraction surface analysis (LESA) coupled to a triple quadrupole QTRAPW 5500 mass spectrometer. The detection of TINUVINW123 and melamine was confirmed by the characteristic fragmentation pattern observed in LESA-MS/MS spectra that was identical to that reported for authentic samples. RESULTS: Analysis of an unstabilised coil coating by LESA-MS after exposure to 4 years of outdoor field testing revealed the presence of melamine (1,3,5-triazine-2,4,6-triamine) as a polymer degradation product at elevated levels. Changes to the physical appearance of the coil coating, including powder-like deposits on the coating's surface, were observed to coincide with melamine deposits and are indicative of the phenomenon known as polymer ' blooming'. CONCLUSIONS: For the first time, in situ detection of analytes from a thermoset polymer coating was accomplished without any sample preparation, providing advantages over traditional extraction-analysis approaches and some contemporary ambient MS methods. Detection of HALS and polymer degradation products such as melamine provides insight into the mechanisms by which degradation occurs and suggests LESA-MS is a powerful new tool for polymer analysis. Copyright (C) 2012 John Wiley & Sons, Ltd.
Resumo:
The strain data acquired from structural health monitoring (SHM) systems play an important role in the state monitoring and damage identification of bridges. Due to the environmental complexity of civil structures, a better understanding of the actual strain data will help filling the gap between theoretical/laboratorial results and practical application. In the study, the multi-scale features of strain response are first revealed after abundant investigations on the actual data from two typical long-span bridges. Results show that, strain types at the three typical temporal scales of 10^5, 10^2 and 10^0 sec are caused by temperature change, trains and heavy trucks, and have their respective cut-off frequency in the order of 10^-2, 10^-1 and 10^0 Hz. Multi-resolution analysis and wavelet shrinkage are applied for separating and extracting these strain types. During the above process, two methods for determining thresholds are introduced. The excellent ability of wavelet transform on simultaneously time-frequency analysis leads to an effective information extraction. After extraction, the strain data will be compressed at an attractive ratio. This research may contribute to a further understanding of actual strain data of long-span bridges; also, the proposed extracting methodology is applicable on actual SHM systems.
Resumo:
BACKGROUND: The use of salivary diagnostics is increasing because of its noninvasiveness, ease of sampling, and the relatively low risk of contracting infectious organisms. Saliva has been used as a biological fluid to identify and validate RNA targets in head and neck cancer patients. The goal of this study was to develop a robust, easy, and cost-effective method for isolating high yields of total RNA from saliva for downstream expression studies. METHODS: Oral whole saliva (200 mu L) was collected from healthy controls (n = 6) and from patients with head and neck cancer (n = 8). The method developed in-house used QIAzol lysis reagent (Qiagen) to extract RNA from saliva (both cell-free supernatants and cell pellets), followed by isopropyl alcohol precipitation, cDNA synthesis, and real-time PCR analyses for the genes encoding beta-actin ("housekeeping" gene) and histatin (a salivary gland-specific gene). RESULTS: The in-house QIAzol lysis reagent produced a high yield of total RNA (0.89 -7.1 mu g) from saliva (cell-free saliva and cell pellet) after DNase treatment. The ratio of the absorbance measured at 260 nm to that at 280 nm ranged from 1.6 to 1.9. The commercial kit produced a 10-fold lower RNA yield. Using our method with the QIAzol lysis reagent, we were also able to isolate RNA from archived saliva samples that had been stored without RNase inhibitors at -80 degrees C for >2 years. CONCLUSIONS: Our in-house QIAzol method is robust, is simple, provides RNA at high yields, and can be implemented to allow saliva transcriptomic studies to be translated into a clinical setting.
Resumo:
This paper examines the feasibility of using vertical light pipes to naturally illuminate the central core of a multilevel building not reached by window light. The challenges addressed were finding a method to extract and distribute equal amounts of light at each level and designing collectors to improve the effectiveness of vertical light pipes in delivering low elevation sunlight to the interior. Extraction was achieved by inserting partially reflecting cones within transparent sections of the pipes at each floor level. Theory was formulated to estimate the partial reflectance necessary to provide equal light extraction at each level. Designs for daylight collectors formed from laser cut panels tilted above the light pipe were developed and the benefits and limitations of static collectors as opposed to collectors that follow the sun azimuth investigated. Performance was assessed with both basic and detailed mathematical simulation and by observations made with a five level model building under clear sky conditions.
Resumo:
Objective This paper presents an automatic active learning-based system for the extraction of medical concepts from clinical free-text reports. Specifically, (1) the contribution of active learning in reducing the annotation effort, and (2) the robustness of incremental active learning framework across different selection criteria and datasets is determined. Materials and methods The comparative performance of an active learning framework and a fully supervised approach were investigated to study how active learning reduces the annotation effort while achieving the same effectiveness as a supervised approach. Conditional Random Fields as the supervised method, and least confidence and information density as two selection criteria for active learning framework were used. The effect of incremental learning vs. standard learning on the robustness of the models within the active learning framework with different selection criteria was also investigated. Two clinical datasets were used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Results The annotation effort saved by active learning to achieve the same effectiveness as supervised learning is up to 77%, 57%, and 46% of the total number of sequences, tokens, and concepts, respectively. Compared to the Random sampling baseline, the saving is at least doubled. Discussion Incremental active learning guarantees robustness across all selection criteria and datasets. The reduction of annotation effort is always above random sampling and longest sequence baselines. Conclusion Incremental active learning is a promising approach for building effective and robust medical concept extraction models, while significantly reducing the burden of manual annotation.
Resumo:
Frog protection has become increasingly essential due to the rapid decline of its biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With k-means clustering algorithm, the calculated dominant frequency of all frog species is clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experiment results show that the proposed features can achieve an average classification accuracy of 97.45% which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficients (MFCCs) feature (90.80%).
Resumo:
The feasibility of different modern analytical techniques for the mass spectrometric detection of anabolic androgenic steroids (AAS) in human urine was examined in order to enhance the prevalent analytics and to find reasonable strategies for effective sports drug testing. A comparative study of the sensitivity and specificity between gas chromatography (GC) combined with low (LRMS) and high resolution mass spectrometry (HRMS) in screening of AAS was carried out with four metabolites of methandienone. Measurements were done in selected ion monitoring mode with HRMS using a mass resolution of 5000. With HRMS the detection limits were considerably lower than with LRMS, enabling detection of steroids at low 0.2-0.5 ng/ml levels. However, also with HRMS, the biological background hampered the detection of some steroids. The applicability of liquid-phase microextraction (LPME) was studied with metabolites of fluoxymesterone, 4-chlorodehydromethyltestosterone, stanozolol and danazol. Factors affecting the extraction process were studied and a novel LPME method with in-fiber silylation was developed and validated for GC/MS analysis of the danazol metabolite. The method allowed precise, selective and sensitive analysis of the metabolite and enabled simultaneous filtration, extraction, enrichment and derivatization of the analyte from urine without any other steps in sample preparation. Liquid chromatographic/tandem mass spectrometric (LC/MS/MS) methods utilizing electrospray ionization (ESI), atmospheric pressure chemical ionization (APCI) and atmospheric pressure photoionization (APPI) were developed and applied for detection of oxandrolone and metabolites of stanozolol and 4-chlorodehydromethyltestosterone in urine. All methods exhibited high sensitivity and specificity. ESI showed, however, the best applicability, and a LC/ESI-MS/MS method for routine screening of nine 17-alkyl-substituted AAS was thus developed enabling fast and precise measurement of all analytes with detection limits below 2 ng/ml. The potential of chemometrics to resolve complex GC/MS data was demonstrated with samples prepared for AAS screening. Acquired full scan spectral data (m/z 40-700) were processed by the OSCAR algorithm (Optimization by Stepwise Constraints of Alternating Regression). The deconvolution process was able to dig out from a GC/MS run more than the double number of components as compared with the number of visible chromatographic peaks. Severely overlapping components, as well as components hidden in the chromatographic background could be isolated successfully. All studied techniques proved to be useful analytical tools to improve detection of AAS in urine. Superiority of different procedures is, however, compound-dependent and different techniques complement each other.
Resumo:
This thesis discusses the use of sub- and supercritical fluids as the medium in extraction and chromatography. Super- and subcritical extraction was used to separate essential oils from herbal plant Angelica archangelica. The effect of extraction parameters was studied and sensory analyses of the extracts were done by an expert panel. The results of the sensory analyses were compared to the analytically determined contents of the extracts. Sub- and supercritical fluid chromatography (SFC) was used to separate and purify high-value pharmaceuticals. Chiral SFC was used to separate the enantiomers of racemic mixtures of pharmaceutical compounds. Very low (cryogenic) temperatures were applied to substantially enhance the separation efficiency of chiral SFC. The thermodynamic aspects affecting the resolving ability of chiral stationary phases are briefly reviewed. The process production rate which is a key factor in industrial chromatography was optimized by empirical multivariate methods. General linear model was used to optimize the separation of omega-3 fatty acid ethyl esters from esterized fish oil by using reversed-phase SFC. Chiral separation of racemic mixtures of guaifenesin and ferulic acid dimer ethyl ester was optimized by using response surface method with three variables per time. It was found that by optimizing four variables (temperature, load, flowate and modifier content) the production rate of the chiral resolution of racemic guaifenesin by cryogenic SFC could be increased severalfold compared to published results of similar application. A novel pressure-compensated design of industrial high pressure chromatographic column was introduced, using the technology developed in building the deep-sea submersibles (Mir 1 and 2). A demonstration SFC plant was built and the immunosuppressant drug cyclosporine A was purified to meet the requirements of US Pharmacopoeia. A smaller semi-pilot size column with similar design was used for cryogenic chiral separation of aromatase inhibitor Finrozole for use in its development phase 2.
Resumo:
In this work, separation methods have been developed for the analysis of anthropogenic transuranium elements plutonium, americium, curium and neptunium from environmental samples contaminated by global nuclear weapons testing and the Chernobyl accident. The analytical methods utilized in this study are based on extraction chromatography. Highly varying atmospheric plutonium isotope concentrations and activity ratios were found at both Kurchatov (Kazakhstan), near the former Semipalatinsk test site, and Sodankylä (Finland). The origin of plutonium is almost impossible to identify at Kurchatov, since hundreds of nuclear tests were performed at the Semipalatinsk test site. In Sodankylä, plutonium in the surface air originated from nuclear weapons testing, conducted mostly by USSR and USA before the sampling year 1963. The variation in americium, curium and neptunium concentrations was great as well in peat samples collected in southern and central Finland in 1986 immediately after the Chernobyl accident. The main source of transuranium contamination in peats was from global nuclear test fallout, although there are wide regional differences in the fraction of Chernobyl-originated activity (of the total activity) for americium, curium and neptunium.
Resumo:
Pressurised hot water extraction (PHWE) exploits the unique temperature-dependent solvent properties of water minimising the use of harmful organic solvents. Water is environmentally friendly, cheap and easily available extraction medium. The effects of temperature, pressure and extraction time in PHWE have often been studied, but here the emphasis was on other parameters important for the extraction, most notably the dimensions of the extraction vessel and the stability and solubility of the analytes to be extracted. Non-linear data analysis and self-organising maps were employed in the data analysis to obtain correlations between the parameters studied, recoveries and relative errors. First, pressurised hot water extraction (PHWE) was combined on-line with liquid chromatography-gas chromatography (LC-GC), and the system was applied to the extraction and analysis of polycyclic aromatic hydrocarbons (PAHs) in sediment. The method is of superior sensitivity compared with the traditional methods, and only a small 10 mg sample was required for analysis. The commercial extraction vessels were replaced by laboratory-made stainless steel vessels because of some problems that arose. The performance of the laboratory-made vessels was comparable to that of the commercial ones. In an investigation of the effect of thermal desorption in PHWE, it was found that at lower temperatures (200ºC and 250ºC) the effect of thermal desorption is smaller than the effect of the solvating property of hot water. At 300ºC, however, thermal desorption is the main mechanism. The effect of the geometry of the extraction vessel on recoveries was studied with five specially constructed extraction vessels. In addition to the extraction vessel geometry, the sediment packing style and the direction of water flow through the vessel were investigated. The geometry of the vessel was found to have only minor effect on the recoveries, and the same was true of the sediment packing style and the direction of water flow through the vessel. These are good results because these parameters do not have to be carefully optimised before the start of extractions. Liquid-liquid extraction (LLE) and solid-phase extraction (SPE) were compared as trapping techniques for PHWE. LLE was more robust than SPE and it provided better recoveries and repeatabilities than did SPE. Problems related to blocking of the Tenax trap and unrepeatable trapping of the analytes were encountered in SPE. Thus, although LLE is more labour intensive, it can be recommended over SPE. The stabilities of the PAHs in aqueous solutions were measured using a batch-type reaction vessel. Degradation was observed at 300ºC even with the shortest heating time. Ketones and quinones and other oxidation products were observed. Although the conditions of the stability studies differed considerably from the extraction conditions in PHWE, the results indicate that the risk of analyte degradation must be taken into account in PHWE. The aqueous solubilities of acenaphthene, anthracene and pyrene were measured, first below and then above the melting point of the analytes. Measurements below the melting point were made to check that the equipment was working, and the results were compared with those obtained earlier. Good agreement was found between the measured and literature values. A new saturation cell was constructed for the solubility measurements above the melting point of the analytes because the flow-through saturation cell could not be used above the melting point. An exponential relationship was found between the solubilities measured for pyrene and anthracene and temperature.
Resumo:
Organochlorine pesticides (OCPs) are ubiquitous environmental contaminants with adverse impacts on aquatic biota, wildlife and human health even at low concentrations. However, conventional methods for their determination in river sediments are resource intensive. This paper presents an approach that is rapid and also reliable for the detection of OCPs. Accelerated Solvent Extraction (ASE) with in-cell silica gel clean-up followed by Triple Quadrupole Gas Chromatograph Mass Spectrometry (GCMS/MS) was used to recover OCPs from sediment samples. Variables such as temperature, solvent ratio, adsorbent mass and extraction cycle were evaluated and optimised for the extraction. With the exception of Aldrin, which was unaffected by any of the variables evaluated, the recovery of OCPs from sediment samples was largely influenced by solvent ratio and adsorbent mass and, to some extent, the number of cycles and temperature. The optimised conditions for OCPs extraction in sediment with good recoveries were determined to be 4 cycles, 4.5 g of silica gel, 105 ᴼC, and 4:3 v/v DCM: hexane mixture. With the exception of two compounds (α-BHC and Aldrin) whose recoveries were low (59.73 and 47.66 % respectively), the recovery of the other pesticides were in the range 85.35 – 117.97% with precision < 10 % RSD. The method developed significantly reduces sample preparation time, the amount of solvent used, matrix interference, and is highly sensitive and selective.
Resumo:
We report a measurement of the ratio of the tt̅ to Z/γ* production cross sections in √s=1.96 TeV pp̅ collisions using data corresponding to an integrated luminosity of up to 4.6 fb-1, collected by the CDF II detector. The tt̅ cross section ratio is measured using two complementary methods, a b-jet tagging measurement and a topological approach. By multiplying the ratios by the well-known theoretical Z/γ*→ll cross section predicted by the standard model, the extracted tt̅ cross sections are effectively insensitive to the uncertainty on luminosity. A best linear unbiased estimate is used to combine both measurements with the result σtt̅ =7.70±0.52 pb, for a top-quark mass of 172.5 GeV/c2.
Resumo:
We report a measurement of the ratio of the tt̅ to Z/γ* production cross sections in √s=1.96 TeV pp̅ collisions using data corresponding to an integrated luminosity of up to 4.6 fb-1, collected by the CDF II detector. The tt̅ cross section ratio is measured using two complementary methods, a b-jet tagging measurement and a topological approach. By multiplying the ratios by the well-known theoretical Z/γ*→ll cross section predicted by the standard model, the extracted tt̅ cross sections are effectively insensitive to the uncertainty on luminosity. A best linear unbiased estimate is used to combine both measurements with the result σtt̅ =7.70±0.52 pb, for a top-quark mass of 172.5 GeV/c2.
Resumo:
We report a measurement of the ratio of the top-antitop to Z/gamma* production cross sections in sqrt(s) = 1.96 TeV proton-antiproton collisions using data corresponding to an integrated luminosity of up to 4.6 fb-1, collected by the CDF II detector. The top-antitop cross section ratio is measured using two complementary methods, a b-jet tagging measurement and a topological approach. By multiplying the ratios by the well-known theoretical Z/gamma*->ll cross section, the extracted top-antitop cross sections are effectively insensitive to the uncertainty on luminosity. A best linear unbiased estimate is used to combine both measurements with the result sigma_(top-antitop) = 7.70 +/- 0.52 pb, for a top-quark mass of 172.5 GeV/c^2.