879 results for topic extraction
Abstract:
News blog hot topics are important for information recommendation services and marketing. However, information overload and personalized management make organizing this information more difficult. Moreover, little attention has been paid to what influences the formation and development of blog hot topics. To detect news blog hot topics correctly, the paper first analyzes the development of topics from a new perspective based on the W2T (Wisdom Web of Things) methodology: the characteristics of blog users, the context of topic propagation, and information granularity are unified to analyze the related problems. Factors such as user behavior patterns, network opinion, and opinion leaders are then identified as important for the development of topics. Next, a topic model based on the view of event reports is constructed. Finally, hot topics are identified by their duration, topic novelty, degree of topic growth, and degree of user attention. The experimental results show that the proposed method is feasible and effective.
Abstract:
The Beauty Leaf tree (Calophyllum inophyllum) is a potential source of non-edible vegetable oil for producing future generation biodiesel because of its ability to grow in a wide range of climate conditions, its easy cultivation, its high fruit production rate, and the high oil content of its seed. This plant occurs naturally in the coastal areas of Queensland and the Northern Territory in Australia, and is also widespread in south-east Asia, India and Sri Lanka. Although Beauty Leaf is traditionally used as a source of timber and as an ornamental plant, its potential as a source of second generation biodiesel is yet to be exploited. In this study, the extraction process for Beauty Leaf oil seed has been optimised in terms of seed preparation, moisture content and oil extraction method. The two methods considered for extracting oil from the seed kernel are mechanical extraction using an electric powered screw press and chemical extraction using n-hexane as the oil solvent. The study found that seed preparation has a significant impact on oil yields, especially for the screw press method. Kernels prepared to 15% moisture content provided the highest oil yields for both extraction methods. Mechanical extraction using the screw press can produce oil from correctly prepared product at a low cost; overall, however, this method is ineffective, with relatively low oil yields. Chemical extraction was found to be a very effective method because of its consistent performance and high oil yield, but the cost of production was relatively higher due to the high cost of the solvent. A solvent recycling system could, however, be implemented to reduce the production cost of Beauty Leaf biodiesel. The findings of this study are expected to serve as a basis for industrial scale biodiesel production from Beauty Leaf.
Abstract:
Topic modeling has been widely used in fields such as information retrieval, text mining, and text classification. Most existing statistical topic modeling methods, such as LDA and pLSA, generate a term-based representation of a topic by selecting single words from the multinomial word distribution over that topic. This has two main shortcomings: first, popular or common words occur very often across different topics, which makes topics ambiguous to interpret; second, single words lack the coherent semantic meaning needed to represent topics accurately. To overcome these problems, in this paper we propose a two-stage model that combines text mining and pattern mining with statistical modeling to generate more discriminative and semantically rich topic representations. Experiments show that the optimized topic representations generated by the proposed methods outperform the typical statistical topic modeling method LDA in terms of accuracy and certainty.
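The first shortcoming named in the abstract above can be illustrated with a minimal sketch. It is not the paper's two-stage model: it only shows a term-based representation (the top-k words of each topic's word distribution) and how common words end up shared between topics. The function names and the toy probabilities are invented for illustration.

```python
def top_words(topic_word_probs, k=3):
    """Term-based representation: the k most probable words of each topic.

    topic_word_probs maps a topic name to a {word: probability} dict.
    Ties are broken alphabetically so the output is deterministic.
    """
    return {
        topic: [w for w, _ in sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for topic, dist in topic_word_probs.items()
    }


def shared_words(representations):
    """Words that appear in the representation of more than one topic."""
    seen, shared = set(), set()
    for words in representations.values():
        shared.update(w for w in words if w in seen)
        seen.update(words)
    return shared


# Toy word distributions (hypothetical values, not taken from any paper).
toy_topics = {
    "topic_0": {"data": 0.30, "model": 0.25, "retrieval": 0.20, "query": 0.15, "index": 0.10},
    "topic_1": {"data": 0.28, "model": 0.22, "cluster": 0.20, "pattern": 0.18, "mining": 0.12},
}

representations = top_words(toy_topics, k=3)
ambiguous = shared_words(representations)
```

In this toy case the two topics share most of their top words, so only a single word distinguishes each one.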
Abstract:
Text categorisation is challenging because documents have a complex structure with heterogeneous, changing topics. Its performance relies on the quality of samples, the effectiveness of document features, and the topic coverage of categories, and depends on the strategy employed: supervised or unsupervised, single-labelled or multi-labelled. To deal with these reliability issues, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise the categorisation result. The conceptual framework of the approach consists of three modules: pattern mining for feature extraction, feature-subject mapping for categorisation, and concept generalisation for optimised categorisation. The approach has been evaluated, with promising results, by comparison with typical text categorisation methods, based on the ground truth encoded by human experts.
Abstract:
A high-performance liquid chromatography (HPLC) method coupled with solid phase extraction was developed for the determination of isofraxidin in rat plasma after oral administration of Acanthopanax senticosus extract (ASE), and the pharmacokinetic parameters of isofraxidin, either in ASE or as the pure compound, were measured. The HPLC analysis was performed on a Dikma Diamonsil RP(18) column (4.6 mm x 150 mm, 5 µm) with isocratic elution using solvent A (acetonitrile) and solvent B (0.1% aqueous phosphoric acid, v/v) (A : B = 22 : 78), and the detection wavelength was set at 343 nm. The calibration curve was linear over the range of 0.156-15.625 µg/ml. The limit of detection was 60 ng/ml. The intra-day precision was 5.8%, and the inter-day precision was 6.0%. The recovery was 87.30 ± 1.73%. When the dosage of ASE was equal to that of the pure compound, calculated by the amount of isofraxidin, ASE showed two maximum concentrations in plasma, while the pure compound showed only one peak in the plasma concentration-time curve. The content of isofraxidin determined in plasma after oral administration of ASE is the total content of free isofraxidin and its precursors in ASE in vitro. The pharmacokinetic characteristics of ASE showed the priority of the extract and the properties of traditional Chinese medicine.
Abstract:
A high performance liquid chromatography (HPLC) method coupled with solid phase extraction was developed for determining cimifugin (a coumarin derivative; one of Saposhnikovia divaricatae's constituents) in rat plasma after oral administration of Saposhnikovia divaricatae extract (SDE), and the pharmacokinetics of cimifugin, either in SDE or as a single compound, was investigated. The HPLC analysis was performed on a commercially available column (4.6 mm x 200 mm, 5 µm) with isocratic elution using solvent A (methanol) and solvent B (water) (A:B = 60:40), and the detection wavelength was set at 250 nm. The calibration curve was linear over the range of 0.100-10.040 µg/mL. The limit of detection was 30 ng/mL. At rat plasma concentrations of 0.402, 4.016, and 10.040 µg/mL, the intra-day precision was 6.21%, 3.98%, and 2.23%, and the inter-day precision was 7.59%, 4.26%, and 2.09%, respectively. The absolute recovery was 76.58%, 76.61%, and 77.67%, respectively. When the dosage of SDE was equal to that of the pure compound, calculated by the amount of cimifugin, SDE showed two maximum peaks, while the pure compound showed only one peak in the plasma concentration-time curve. The pharmacokinetic characteristics of SDE showed the superiority of the extract and the properties of traditional Chinese medicine.
Abstract:
Genomic DNA obtained from patient whole blood samples is a key element for genomic research. The advantages and disadvantages of the procedures available to isolate nucleic acids, in terms of time-efficiency, cost-effectiveness and laboratory requirements, need to be considered before choosing any particular method. These characteristics have not been fully evaluated for some laboratory techniques, such as the salting out method for DNA extraction, which has been excluded from comparison in the studies published to date. We compared three different protocols (a traditional salting out method, a modified salting out method and a commercially available kit method) to determine the most cost-effective and time-efficient method to extract DNA. We extracted genomic DNA from whole blood samples obtained from breast cancer patient volunteers and compared the product obtained in terms of quantity (concentration of DNA extracted and DNA obtained per ml of blood used) and quality (260/280 ratio and polymerase chain reaction product amplification). On average, the three methods showed no statistically significant differences in the final result, but when we accounted for the time and cost of each method, they showed very significant differences. The modified salting out method resulted in a seven- and twofold reduction in cost compared to the commercial kit and the traditional salting out method, respectively, and reduced the time from 3 days to 1 hour compared to the traditional salting out method. This highlights the modified salting out method as a suitable choice for laboratories and research centres, particularly when dealing with a large number of samples.
Abstract:
One of the main objectives of law schools, beyond educating students, is to produce viable legal research. The comments in this paper are largely confined to the Australian context, and examining this topic effectively requires a brief review of the current tertiary research agenda in Australia. This paper argues that there is a need for recognition of and support for an expanded legal research framework, along with additional research training for legal academics. There also need to be more effective methods of measuring and recognising quality in legal research, and these methods need to engender respect in an interdisciplinary context.
Abstract:
The rapid development of the World Wide Web has created massive amounts of information, leading to the information overload problem. Under these circumstances, personalization techniques have been introduced to help users find content that meets their personal interests or needs within this massively increasing information. User profiling techniques play a core role in this research. Traditionally, most user profiling techniques create user representations in a static way. However, in real-world applications user interests may change over time. In this research we develop algorithms for mining user interests by integrating time decay mechanisms into topic-based user interest profiling. Time forgetting functions are integrated into the calculation of topic interest measurements at an in-depth level. The experimental study shows that accounting for the temporal effects of user interests by integrating time forgetting mechanisms improves recommendation performance.
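As a minimal sketch of what a time forgetting function can look like, the snippet below weights each user interaction with a topic by an exponential decay of its age. The abstract does not specify the functional form, so the exponential half-life form, the function name `decayed_topic_interest`, and the 30-day half-life are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def decayed_topic_interest(events, now, half_life_days=30.0):
    """Sum interaction weights per topic, discounted by age.

    events: iterable of (topic, time_in_days, weight) tuples.
    now: current time, in the same day units as the event times.
    half_life_days: assumed age at which an interaction counts half as much.
    """
    decay_rate = math.log(2) / half_life_days
    scores = defaultdict(float)
    for topic, time_in_days, weight in events:
        age = max(0.0, now - time_in_days)  # events in the future are not boosted
        scores[topic] += weight * math.exp(-decay_rate * age)
    return dict(scores)


# Hypothetical interaction log: two old "sports" events, two recent "politics" events.
events = [
    ("sports", 0.0, 1.0),
    ("sports", 10.0, 1.0),
    ("politics", 55.0, 1.0),
    ("politics", 60.0, 1.0),
]
profile = decayed_topic_interest(events, now=60.0)
```

A static profile would rank the two topics equally (two interactions each); with decay, the recent topic receives the higher interest score.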
Abstract:
Opinion mining is now widely used to identify the strengths and weaknesses of products (e.g., cameras) or services (e.g., services in medical clinics or hospitals) based on people's feedback, such as user reviews. Feature extraction is a crucial step in opinion mining and is used to collect useful information from user reviews. Most existing approaches find only individual features of a product, without the structural relationships that usually exist between the features. In this paper, we propose an approach to extract features and feature relationships, represented as a tree structure called a feature hierarchy, based on frequent patterns and associations between patterns derived from user reviews. The generated feature hierarchy profiles the product at multiple levels and provides more detailed information about the product. Our experimental results on several widely used review datasets show that the proposed feature extraction approach can identify more correct features than the baseline model. Although the datasets used in the experiment are about cameras, our work can be applied to generate features for a service, such as the services in hospitals or clinics.
Abstract:
Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular term-based feature selection methods suffer from noisy feature extraction, that is, they extract features that are irrelevant to the user's needs. One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. It then uses an extended random set to accurately weight n-grams based on their distribution in the documents and the distribution of their terms in n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves performance. The experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms the state-of-the-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
Abstract:
Topic modelling has been widely used in the fields of information retrieval, text mining, machine learning, etc. In this paper, we propose a novel model, the Pattern Enhanced Topic Model (PETM), which improves topic modelling by semantically representing topics with discriminative patterns. It also contributes to information filtering by using PETM to determine document relevance based on the topic distribution and the maximum matched patterns proposed in this paper. Extensive experiments are conducted to evaluate the effectiveness of PETM using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models.
Abstract:
Changes in the molecular structure of polymer antioxidants such as hindered amine light stabilisers (HALS) are central to their efficacy in retarding polymer degradation and therefore require careful monitoring during their in-service lifetime. The HALS bis-(1-octyloxy-2,2,6,6-tetramethyl-4-piperidinyl) sebacate (TIN123) and bis-(1,2,2,6,6-pentamethyl-4-piperidinyl) sebacate (TIN292) were formulated in different polymer systems and then exposed to various curing and ageing treatments to simulate in-service use. Samples of these coatings were then analysed directly using liquid extraction surface analysis (LESA) coupled with a triple quadrupole mass spectrometer. Analysis of TIN123 formulated in a cross-linked polyester revealed that the polymer matrix protected TIN123 from the extensive thermal degradation that would normally occur at 292 degrees C, specifically, changes at the 1- and 4-positions of the piperidine groups. The effects of thermal and photo-oxidative degradation were also compared for TIN292 formulated in polyacrylate films by monitoring the in situ conversion of N-CH3 substituted piperidines to N-H. The analysis confirmed that UV light was required for the conversion of N-CH3 moieties to N-H, a major pathway in the antioxidant protection of polymers, whereas this conversion was not observed with thermal degradation. The use of tandem mass spectrometric techniques, including precursor-ion scanning, is shown to be highly sensitive and specific for detecting molecular-level changes in HALS compounds and, when coupled with LESA, able to monitor these changes in situ with speed and reproducibility. (C) 2013 Elsevier B.V. All rights reserved.
Abstract:
Ranking documents according to the Probability Ranking Principle has been theoretically shown to guarantee optimal retrieval effectiveness in tasks such as ad hoc document retrieval. This ranking strategy assumes independence among document relevance assessments. This assumption, however, often does not hold, for example in scenarios where redundancy in retrieved documents is of major concern, as is the case in the sub-topic retrieval task. In this chapter, we propose a new ranking strategy for sub-topic retrieval that builds upon interdependent document relevance and topic-oriented models. With respect to the topic-oriented model, we investigate both static and dynamic clustering techniques, aiming to group topically similar documents. Evidence from clusters is then combined with information about document dependencies to form a new document ranking. We compare and contrast the proposed method against state-of-the-art approaches, such as Maximal Marginal Relevance, Portfolio Theory for Information Retrieval, and standard cluster-based diversification strategies. The empirical investigation is performed on the ImageCLEF 2009 Photo Retrieval collection, where images are assessed with respect to sub-topics of a more general query topic. The experimental results show that our approaches outperform the state-of-the-art strategies with respect to a number of diversity measures.
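For readers unfamiliar with Maximal Marginal Relevance, one of the baselines named in the abstract above, a minimal greedy sketch follows. It is the standard MMR criterion, not the chapter's proposed cluster-based strategy. The function name `mmr_rank`, the trade-off value `lam=0.5`, and the toy relevance and sub-topic data are assumptions chosen for illustration.

```python
def mmr_rank(relevance, similarity, k, lam=0.5):
    """Greedy Maximal Marginal Relevance re-ranking.

    relevance: dict mapping doc id -> relevance score to the query.
    similarity: function (doc_a, doc_b) -> similarity score.
    k: number of documents to select.
    lam: trade-off; 1.0 ranks by relevance only, lower values penalise redundancy.
    """
    candidates = set(relevance)
    selected = []
    while candidates and len(selected) < k:

        def mmr_score(doc):
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1 - lam) * redundancy

        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (hypothetical): d1 and d2 cover the same sub-topic, d3 a different one.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6}
subtopic = {"d1": "A", "d2": "A", "d3": "B"}


def same_subtopic(a, b):
    return 1.0 if subtopic[a] == subtopic[b] else 0.0


diversified = mmr_rank(relevance, same_subtopic, k=3, lam=0.5)
by_relevance = mmr_rank(relevance, same_subtopic, k=3, lam=1.0)
```

With `lam=0.5` the less relevant but novel `d3` is promoted above the redundant `d2`; with `lam=1.0` the ranking falls back to relevance order, which is the independence assumption the abstract questions.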
Abstract:
Two sources of uncertainty in the X-ray computed tomography imaging of polymer gel dosimeters are investigated in the paper. The first cause is a change in postirradiation density, which is proportional to the computed tomography signal and is associated with a volume change. The second cause of uncertainty is reconstruction noise. A simple technique that increases the residual signal-to-noise ratio by almost two orders of magnitude is examined.