903 results for Sentence extraction


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular the reduction in annotation effort that the new query strategy makes possible, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrate that 1) amongst existing query strategies, those based on the classification model’s confidence are a better choice for clinical data, as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to annotate manually than with state-of-the-art query strategies.
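The confidence-based family of query strategies mentioned in this abstract can be illustrated with a minimal least-confidence sketch. It is not the DKI strategy, whose details the abstract does not give; the function name, the `batch_size` parameter, and the probability values are assumptions made for illustration.

```python
def least_confidence_query(probabilities, batch_size):
    """Return indices of the samples whose top class probability is lowest.

    `probabilities` holds one list of class probabilities per unlabelled
    sample, as produced by some classifier (hypothetical input here).
    """
    # Confidence of a sample = probability of its most likely class.
    confidence = [max(p) for p in probabilities]
    # Rank samples from least to most confident.
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabelled samples.
pool = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.99, 0.01]]
to_annotate = least_confidence_query(pool, batch_size=2)
```

In an active learning loop, the selected samples would be sent for annotation, added to the training set, and the model retrained before the next query.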

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An automated method for extracting brain volumes from three commonly acquired three-dimensional (3D) MR images (proton density, T1-weighted, and T2-weighted) of the human head is described. The procedure is divided into four levels: preprocessing, segmentation, scalp removal, and postprocessing. A user-provided reference point is the sole operator-dependent input required. The method's parameters were first optimized and then fixed and applied to 30 repeat data sets from 15 normal older adult subjects to investigate its reproducibility. Percent differences between total brain volumes (TBVs) for the subjects' repeated data sets ranged from 0.5% to 2.2%. We conclude that the method is both robust and reproducible and has the potential for wide application.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We are currently facing overwhelming growth in the number of reliable information sources on the Internet. The quantity of information available to everyone via the Internet grows dramatically each year [15]. At the same time, the temporal and cognitive resources of human users do not change, which causes the phenomenon of information overload. The World Wide Web is one of the main sources of information for decision makers (reference to my research). However, our studies show that, at least in Poland, decision makers see some important problems when turning to the Internet as a source of decision information. One of the most commonly raised obstacles is the distribution of relevant information among many sources, and hence the need to visit different Web sources in order to collect and analyze all important content. A few research groups have recently turned to the problem of information extraction from the Web [13]. Most effort so far has been directed toward collecting data from dispersed databases accessible via web pages (referred to as data extraction or information extraction from the Web) and toward understanding natural language texts by means of fact, entity, and association recognition (referred to as information extraction). Data extraction efforts show some interesting results; however, proper integration of web databases is still beyond us. The information extraction field has recently been very successful in retrieving information from natural language texts; however, it still lacks the ability to understand more complex information that requires common sense knowledge, discourse analysis, and disambiguation techniques.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative XPath expressions, although not widely used, should be used in preference to absolute XPath expressions in extracting content from human-created Web documents. Evaluation of robustness covers four thousand queries executed on several hundred webpages. We show that in referencing parts of real world dynamic HTML documents, relative XPath expressions are on average significantly more robust than absolute XPath ones.
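The robustness difference described in this abstract can be sketched with Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath. The two page versions and both path strings are hypothetical; the positional path rooted at the document element stands in for an absolute expression such as `/html/body/div[2]/span`, and the attribute-anchored path stands in for a relative one such as `//span[@class='price']`.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a layout change (a banner div is added).
PAGE_V1 = ("<html><body><div><h1>Shop</h1></div>"
           "<div><span class='price'>42</span></div></body></html>")
PAGE_V2 = ("<html><body><div>Banner</div><div><h1>Shop</h1></div>"
           "<div><span class='price'>42</span></div></body></html>")

# Positional path from the root element (absolute-style).
ABSOLUTE_STYLE = "./body/div[2]/span"
# Path anchored on an attribute rather than on position (relative-style).
RELATIVE_STYLE = ".//span[@class='price']"


def extract(page, path):
    """Return the text of the first node matching `path`, or None."""
    root = ET.fromstring(page)
    node = root.find(path)
    return None if node is None else node.text


for page in (PAGE_V1, PAGE_V2):
    print(extract(page, ABSOLUTE_STYLE), extract(page, RELATIVE_STYLE))
```

After the layout change the positional path no longer reaches the price element, while the attribute-anchored path still does. This is a toy case only and says nothing about the magnitude of the effect measured in the paper.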

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A method for the determination of tricyclazole in water using solid phase extraction (SPE) and high performance liquid chromatography (HPLC) with UV detection at 230 nm and a mobile phase of acetonitrile:water (20:80, v/v) was developed. The performance of two types of solid phase sorbents was compared: the C18 sorbent of the Supelclean ENVI-18 cartridge and the styrene-divinyl benzene copolymer sorbent of the Sep-Pak PS2-Plus cartridge. The Sep-Pak PS2-Plus cartridges were found to be more suitable for extracting tricyclazole from water samples than the Supelclean ENVI-18 cartridges. For this cartridge, both methanol and ethyl acetate produced good results. The method was validated with good linearity and with a limit of detection of 0.008gL-1 for a 500-fold concentration through the SPE procedure. The recoveries of the method were stable at 80% and the precision was 1.1-6.0% within the range of fortified concentrations. The validated method was also applied to measure the concentrations of tricyclazole in real paddy water.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semi-rigid molecular tweezers 1, 3 and 4 bind picric acid with more than tenfold increment in tetrachloromethane as compared to chloroform.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Frog protection has become increasingly essential due to the rapid decline of its biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With k-means clustering algorithm, the calculated dominant frequency of all frog species is clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experiment results show that the proposed features can achieve an average classification accuracy of 97.45% which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficients (MFCCs) feature (90.80%).
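The clustering step in this abstract, where dominant frequencies are grouped into k parts to produce a frequency scale, might look roughly like the following one-dimensional k-means sketch. The frequency values are hypothetical, and taking midpoints between neighbouring cluster centres as sub-band boundaries is an assumption; the abstract does not say how the scale is derived from the clusters.

```python
def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalars; returns the sorted cluster centres."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the centres across the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in ordered:
            # Assign each value to its nearest centre.
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def band_edges(centres):
    """Midpoints between neighbouring centres (assumed sub-band boundaries)."""
    return [(a + b) / 2 for a, b in zip(centres, centres[1:])]


# Hypothetical dominant frequencies in Hz for calls from several species.
freqs = [500, 520, 480, 2000, 2100, 1900, 4000, 4100, 3900]
centres = kmeans_1d(freqs, k=3)
edges = band_edges(centres)
```

The resulting edges would then define where the wavelet packet tree is split more finely, so that resolution follows the frequencies the species actually use.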

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Dendrocalamus strictus and Bambusa arundinacea are monocarpic, gregariously flowering species of bamboo, common in the deciduous forests of the State of Karnataka in India. Their populations have significantly declined, especially since the last flowering. This decline parallels the increasing incidence of grazing, fire and extraction in recent decades. Results of an experiment in which the intensities of grazing and fire were varied indicate that while grazing significantly depresses the survival of seedlings and the recruitment of new culms of bamboo clumps, fire appeared to enhance seedling survival, presumably by reducing competition from less fire-resistant species. New shoots of bamboo are destroyed by insects and a variety of herbivorous mammals. In areas of intense herbivore pressure, a bamboo clump initiates the production of a much larger number of new culms, but this results in many fewer and shorter intact culms. Extraction renders the new shoots more susceptible to herbivore pressure by removing the protective covering of branches at the base of a bamboo clump. Hence, regular and extensive extraction by the paper mills in conjunction with intense grazing pressure strongly depresses the addition of new culms to bamboo clumps. Regulation of grazing in the forest by domestic livestock, along with maintenance of the cover at the base of the clumps by extracting the culms at a higher level, should reduce the rate of decline of the bamboo stocks.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings indicating that the use of word embeddings significantly improves the effectiveness of concept extraction models; in addition, we determine the influence of the corpora used to generate such features. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A method of ion extraction from plasmas is reported in which the interference of field lines due to the extraction system in the plasma region is avoided by proper shaping of the extractor electrode and is supported by field plots.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis discusses the use of sub- and supercritical fluids as the medium in extraction and chromatography. Super- and subcritical extraction was used to separate essential oils from the herbal plant Angelica archangelica. The effect of extraction parameters was studied, and sensory analyses of the extracts were performed by an expert panel. The results of the sensory analyses were compared to the analytically determined contents of the extracts. Sub- and supercritical fluid chromatography (SFC) was used to separate and purify high-value pharmaceuticals. Chiral SFC was used to separate the enantiomers of racemic mixtures of pharmaceutical compounds. Very low (cryogenic) temperatures were applied to substantially enhance the separation efficiency of chiral SFC. The thermodynamic aspects affecting the resolving ability of chiral stationary phases are briefly reviewed. The process production rate, which is a key factor in industrial chromatography, was optimized by empirical multivariate methods. A general linear model was used to optimize the separation of omega-3 fatty acid ethyl esters from esterified fish oil using reversed-phase SFC. Chiral separation of racemic mixtures of guaifenesin and ferulic acid dimer ethyl ester was optimized using the response surface method with three variables at a time. It was found that by optimizing four variables (temperature, load, flow rate and modifier content) the production rate of the chiral resolution of racemic guaifenesin by cryogenic SFC could be increased severalfold compared to published results for a similar application. A novel pressure-compensated design of an industrial high-pressure chromatographic column was introduced, using technology developed in building the deep-sea submersibles (Mir 1 and 2). A demonstration SFC plant was built and the immunosuppressant drug cyclosporine A was purified to meet the requirements of the US Pharmacopoeia. A smaller semi-pilot size column of similar design was used for cryogenic chiral separation of the aromatase inhibitor Finrozole for use in its phase 2 development.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An apparatus is described that facilitates the determination of incorporation levels of isotope-labelled, gaseous precursors into volatile insect-derived metabolites. Atmospheres of varying gas compositions can be generated by evacuation of a working chamber followed by admission of the required levels of component gases, using a precision, digitised pressure read-out system. Insects such as fruit-flies are located initially in a small introduction chamber, from which migration can occur downwards into the working chamber. The level of incorporation of labelled precursors is continuously assayed by the Solid Phase Micro Extraction (SPME) technique and GC-MS analyses. Experiments with both Bactrocera species (fruit-flies) and a parasitoid wasp, Megarhyssa nortoni nortoni (Cresson), using oxygen-18 labelled dioxygen illustrate the utility of this system. The isotope effects of oxygen-18 on the carbon-13 NMR spectra of 1,7-dioxaspiro[5.5]undecane are also described.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

It is shown that lithium can be oxidatively extracted from Li2MoO3 at room temperature using Br2 in CHCl3. The delithiated oxides, Li2−xMoO3 (0 < x ≤ 1.5), retain the parent ordered rocksalt structure. Complete removal of lithium from Li2MoO3 using Br2 in CH3CN results in a poorly crystalline MoO3 that transforms to the stable structure at 280 °C. Li2MoO3 undergoes topotactic ion-exchange in aqueous H2SO4 to yield a new protonated oxide, H2MoO3.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Salinity, sodicity, acidity, and phytotoxic levels of chloride (Cl) in subsoils are major constraints to crop production in many soils of north-eastern Australia because they reduce the ability of crop roots to extract water and nutrients from the soil. The complex interactions and correlations among soil properties result in multicollinearity between soil properties and crop yield, which makes it difficult to determine which constraint is the major limitation. We used ridge-regression analysis to overcome collinearity and evaluate the contribution of soil factors and water supply to the variation in the yields of 5 winter crops on soils with various levels and combinations of subsoil constraints in the region. Subsoil constraints measured were soil Cl, electrical conductivity of the saturation extract (ECse), and exchangeable sodium percentage (ESP). The ridge regression procedure selected several of the variables used in a descriptive model, which included in-crop rainfall, plant-available soil water at sowing in the 0.90-1.10 m soil layer, and soil Cl in the 0.90-1.10 m soil layer, and accounted for 77-85% of the variation in the grain yields of the 5 winter crops. Inclusion of ESP of the topsoil (0.0-0.10 m soil layer) marginally increased the descriptive capability of the models for bread wheat, barley and durum wheat. Subsoil Cl concentration was found to be an effective substitute for subsoil water extraction. The estimates of the critical levels of subsoil Cl for a 10% reduction in the grain yield were 492 mg Cl/kg for chickpea, 662 mg Cl/kg for durum wheat, 854 mg Cl/kg for bread wheat, 980 mg Cl/kg for canola, and 1012 mg Cl/kg for barley, thus suggesting that chickpea and durum wheat were more sensitive to subsoil Cl than bread wheat, barley, and canola.
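Why ridge regression helps with collinear predictors can be seen in a minimal two-predictor sketch. It uses the standard closed form (XᵀX + λI)⁻¹Xᵀy on centred data; the function name, the penalty values, and all numbers are hypothetical, and the study's actual models used more variables than shown here.

```python
def ridge_2d(x1, x2, y, lam):
    """Closed-form ridge coefficients for two centred predictors (no intercept)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    t = [v - my for v in y]
    # Entries of X^T X + lam * I and of X^T y.
    s11 = sum(u * u for u in a) + lam
    s22 = sum(v * v for v in b) + lam
    s12 = sum(u * v for u, v in zip(a, b))
    r1 = sum(u * w for u, w in zip(a, t))
    r2 = sum(v * w for v, w in zip(b, t))
    # Solve the 2x2 system by Cramer's rule.
    det = s11 * s22 - s12 * s12
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)


# Hypothetical, nearly collinear predictors (e.g. two correlated soil measures).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
yield_obs = [2.0, 4.1, 5.9, 8.2, 9.9]

ols = ridge_2d(x1, x2, yield_obs, lam=0.0)    # ordinary least squares
ridge = ridge_2d(x1, x2, yield_obs, lam=1.0)  # penalised estimate
```

With λ = 0 the near-singular system gives coefficients that are sensitive to small changes in the data; a positive λ shrinks them toward zero and stabilises the estimate at the cost of some bias.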