996 results for approximate membership extraction
Abstract:
Text is the main method of communicating information in the digital age. Messages, blogs, news articles, reviews, and opinionated information abound on the Internet. People commonly purchase products online and post their opinions about purchased items. This feedback is displayed publicly to assist others with their purchasing decisions, creating the need for a mechanism with which to extract and summarize useful information for enhancing the decision-making process. Our contribution is to improve the accuracy of extraction by combining different techniques from three major areas: data mining, natural language processing, and ontologies. The proposed framework sequentially mines product aspects and users’ opinions, groups representative aspects by similarity, and generates an output summary. This paper focuses on the task of extracting product aspects and users’ opinions by extracting all possible aspects and opinions from reviews using natural language, ontology, and frequent “tag” sets. The proposed framework, when compared with an existing baseline model, yielded promising results.
Abstract:
An engineering workshop was held from 21 to 24 November 2006 in Veracruz, Mexico. Forty delegates from 12 countries attended the workshop on the theory and practice of milling and diffusion extraction. This report provides a general overview of the activities undertaken during that workshop, which consisted of five technical sessions over two days, with presentations and discussions, plus two days of field and factory visits. Topics covered during the technical sessions included power transmissions, cane preparation, diffusers, mills, and a comparison of milling and diffusion.
Abstract:
BACKGROUND: The use of salivary diagnostics is increasing because of its noninvasiveness, ease of sampling, and the relatively low risk of contracting infectious organisms. Saliva has been used as a biological fluid to identify and validate RNA targets in head and neck cancer patients. The goal of this study was to develop a robust, easy, and cost-effective method for isolating high yields of total RNA from saliva for downstream expression studies. METHODS: Oral whole saliva (200 µL) was collected from healthy controls (n = 6) and from patients with head and neck cancer (n = 8). The method developed in-house used QIAzol lysis reagent (Qiagen) to extract RNA from saliva (both cell-free supernatants and cell pellets), followed by isopropyl alcohol precipitation, cDNA synthesis, and real-time PCR analyses for the genes encoding beta-actin ("housekeeping" gene) and histatin (a salivary gland-specific gene). RESULTS: The in-house QIAzol lysis reagent produced a high yield of total RNA (0.89–7.1 µg) from saliva (cell-free saliva and cell pellet) after DNase treatment. The ratio of the absorbance measured at 260 nm to that at 280 nm ranged from 1.6 to 1.9. The commercial kit produced a 10-fold lower RNA yield. Using our method with the QIAzol lysis reagent, we were also able to isolate RNA from archived saliva samples that had been stored without RNase inhibitors at −80 °C for >2 years. CONCLUSIONS: Our in-house QIAzol method is robust and simple, provides RNA at high yields, and can be implemented to allow saliva transcriptomic studies to be translated into a clinical setting.
Abstract:
Empirical evidence shows that repositories of business process models used in industrial practice contain significant amounts of duplication. This duplication arises, for example, when the repository covers multiple variants of the same processes or as a result of copy-pasting. Previous work has addressed the problem of efficiently retrieving exact clones that can be refactored into shared subprocess models. This article studies the broader problem of approximate clone detection in process models. The article proposes techniques for detecting clusters of approximate clones based on two well-known clustering algorithms: DBSCAN and Hierarchical Agglomerative Clustering (HAC). The article also defines a measure of standardizability of an approximate clone cluster, meaning the potential benefit of replacing the approximate clones with a single standardized subprocess. Experiments show that both techniques, in conjunction with the proposed standardizability measure, accurately retrieve clusters of approximate clones that originate from copy-pasting followed by independent modifications to the copied fragments. Additional experiments show that both techniques produce clusters that match those produced by human subjects and that are perceived to be standardizable.
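As a rough illustration of the DBSCAN side of this idea, the sketch below clusters fragments from a precomputed pairwise distance matrix. It is a plain textbook DBSCAN, not the article's algorithm: the distance values, the `eps` and `min_pts` settings, and the six example fragments are all hypothetical, and the abstract does not say how distances between process fragments are computed.

```python
def dbscan(dist, eps, min_pts):
    """Textbook DBSCAN over a square, symmetric distance matrix.

    Returns one label per item: 0, 1, ... for clusters and -1 for noise.
    """
    n = len(dist)
    NOISE = -1
    labels = [None] * n  # None means "not visited yet"

    def neighbours(p):
        # Includes p itself, because dist[p][p] is 0.
        return [q for q in range(n) if dist[p][q] <= eps]

    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        nb = neighbours(p)
        if len(nb) < min_pts:
            labels[p] = NOISE  # may later be absorbed as a border point
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in nb if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point joins the cluster
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_nb = neighbours(q)
            if len(q_nb) >= min_pts:  # q is a core point: keep expanding
                queue.extend(q_nb)
    return labels


# Hypothetical distances between six process fragments (0.0 = identical).
groups = [0, 0, 0, 1, 1, 2]  # fragments sharing a group id are near-duplicates
dist = [[0.0 if i == j else (0.1 if groups[i] == groups[j] else 0.9)
         for j in range(6)] for i in range(6)]
labels = dbscan(dist, eps=0.2, min_pts=2)
```

With these made-up distances, the first three fragments form one cluster, the next two form another, and the last fragment is labelled as noise.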
Abstract:
Double-pulse tests are commonly used to assess the switching performance of power semiconductor switches in a clamped inductive switching application. Data generated from these tests are typically in the form of sampled waveform data captured using an oscilloscope. Where it is of interest to explore a multi-dimensional parameter space and the corresponding result space, it is necessary to reduce the data to key performance metrics via feature extraction. This paper presents techniques for the extraction of switching performance metrics from sampled double-pulse waveform data. The reported techniques are applied to experimental data from the characterisation of a cascode gate drive circuit applied to power MOSFETs.
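To make the notion of reducing sampled waveforms to metrics concrete, here is a minimal sketch of two common metrics: a 10% to 90% rise time and a switching energy obtained by integrating instantaneous power. The abstract does not define the paper's metrics, so the function names, the choice of the waveform minimum and maximum as the 0% and 100% references, and the sample values are all assumptions.

```python
def crossing_time(t, y, level):
    """Linearly interpolated time of the first upward crossing of `level`."""
    for k in range(1, len(y)):
        if y[k - 1] < level <= y[k]:
            frac = (level - y[k - 1]) / (y[k] - y[k - 1])
            return t[k - 1] + frac * (t[k] - t[k - 1])
    raise ValueError("level is never crossed on a rising edge")


def rise_time(t, y, lo=0.1, hi=0.9):
    """Time between the lo and hi fractions of the waveform's span.

    Assumption: the 0% and 100% references are min(y) and max(y).
    """
    y_min, y_max = min(y), max(y)
    span = y_max - y_min
    return (crossing_time(t, y, y_min + hi * span)
            - crossing_time(t, y, y_min + lo * span))


def switching_energy(t, v, i):
    """Trapezoidal integral of v(t) * i(t) over the sampled window."""
    p = [a * b for a, b in zip(v, i)]
    return sum(0.5 * (p[k] + p[k - 1]) * (t[k] - t[k - 1])
               for k in range(1, len(t)))


# Made-up samples: current ramps up while voltage collapses (turn-on).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
i = [0.0, 0.0, 5.0, 10.0, 10.0]
v = [10.0, 10.0, 5.0, 0.0, 0.0]
t_rise = rise_time(t, i)
e_on = switching_energy(t, v, i)
```

Real oscilloscope captures are noisy, so a practical version would likely filter the samples or fit the reference levels before searching for crossings.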
Abstract:
It is well established that the traditional taxonomy and nomenclature of Chironomidae relies on adult males whose usually characteristic genitalia provide evidence of species distinction. In the early days some names were based on female adults of variable distinctiveness – but females are difficult to identify (Ekrem et al. 2010) and many of these names remain dubious. In Russia especially, a system based on larval morphology grew in parallel to the conventional adult-based system. The systems became reconciled with the studies that underlay the production of the Holarctic generic keys to Chironomidae, commencing notably with the larval volume (Wiederholm, 1983). Ever since Thienemann’s pioneering studies, it has been evident that the pupa, notably the cast skins (exuviae), provides a wealth of features that can aid in identification (e.g. Wiederholm, 1986). Furthermore, the pupae can be readily associated with name-bearing adults when a pharate (‘cloaked’) adult stage is visible within the pupa. Association of larvae with the name-bearing later stages has been much more difficult, time-consuming and fraught with risk of failure. Yet it is identification of the larval stage that is needed by most applied researchers, owing to the value of the immature stages of the family in aquatic monitoring for water quality, although the pupal stage also has advocates (reviewed by Sinclair & Gresens, 2008). Few use the adult stage for such purposes, as their provenance and association with the water body can be verified only by emergence trapping, and sampling of adults lies outside regular aquatic monitoring protocols.
Abstract:
The number of bike share programs has increased rapidly in recent years and there are currently over 700 programs in operation globally. Australia’s two bike share programs have been in operation since 2010 and have significantly lower usage rates than those in Europe, North America and China. This study sets out to understand and quantify the factors influencing bike share membership in Australia’s two bike share programs, located in Melbourne and Brisbane. An online survey was administered to members of both programs as well as a group with no known association with bike share. A logistic regression model revealed several significant predictors of membership, including reactions to mandatory helmet legislation, riding activity over the previous month, and the degree to which convenience motivated private bike riding. In addition, being aged 18–34 and having a docking station within 250 m of the workplace were found to be statistically significant predictors of bike share membership. Finally, a relatively high income increased the odds of membership. These results provide insight into the relative influence of various factors on bike share membership in Australia. The findings may assist bike share operators to maximize membership potential and help achieve the primary goal of bike share – to increase the sustainability of the transport system.
Abstract:
Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high-dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real-world applications can have millions of pixels; therefore, scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale.
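For context, the following is a minimal sketch of plain rejection ABC, the baseline in which pseudo-data are simulated at every iteration. It is not the synthetic-likelihood or binding-function method described in the abstract. The toy model (a Gaussian with unknown mean), the uniform prior, the tolerance, and all function names are assumptions chosen only to keep the example small.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulate, summary,
                  tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lies close to the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        # One full simulation per draw: this is the step that becomes
        # prohibitive when each pseudo-dataset is a large image.
        pseudo = simulate(theta, len(observed), rng)
        if abs(summary(pseudo) - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy problem (hypothetical): infer the mean of a unit-variance Gaussian.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

Most prior draws are rejected here, so the cost is dominated by the simulations; that is the expense a precomputed mapping from parameters to summary statistics is meant to avoid.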
Abstract:
This paper discusses the following key messages. Taxonomy is (and taxonomists are) more important than ever in times of global change. Taxonomic endeavour is not occurring fast enough: in 250 years since the creation of the Linnean Systema Naturae, only about 20% of Earth's species have been named. We need fundamental changes to the taxonomic process and paradigm to increase taxonomic productivity by orders of magnitude. Currently, taxonomic productivity is limited principally by the rate at which we capture and manage morphological information to enable species discovery. Many recent (and welcomed) initiatives in managing and delivering biodiversity information and accelerating the taxonomic process do not address this bottleneck. Development of computational image analysis and feature extraction methods is a crucial missing capacity needed to enable taxonomists to overcome the taxonomic impediment in a meaningful time frame. Copyright © 2009 Magnolia Press.
Abstract:
We present an overview of the QUT plant classification system submitted to LifeCLEF 2014. This system uses generic features extracted from a convolutional neural network previously used to perform general object classification. We examine the effectiveness of these features for plant classification when used in combination with an extremely randomised forest. Using this system, with minimal tuning, we obtained relatively good results, with a score of 0.249 on the test set of LifeCLEF 2014.
Abstract:
Group membership is central to social interaction. Within peer groups, social hierarchies and affiliations are matters to which members seriously attend (Corsaro, 2014). Studies of peer groups highlight how status is achieved through oppositional actions. This paper examines the way in which competition and collaboration in a children’s peer group accomplish status during the production and management of “second stories” (Sacks 1992). We present an analysis of the interaction of young boys in a preparatory year playground who are engaged in a single instance of storytelling “rounds”. The analysis highlights the pivotal role of members’ contributions, assessments and receipts in a series of second stories that enact a simultaneously competitive and collaborative local order.
Abstract:
Wound healing and tumour growth involve collective cell spreading, which is driven by individual motility and proliferation events within a population of cells. Mathematical models are often used to interpret experimental data and to estimate the parameters so that predictions can be made. Existing methods for parameter estimation typically assume that these parameters are constants and often ignore any uncertainty in the estimated values. We use approximate Bayesian computation (ABC) to estimate the cell diffusivity, D, and the cell proliferation rate, λ, from a discrete model of collective cell spreading, and we quantify the uncertainty associated with these estimates using Bayesian inference. We use a detailed experimental data set describing the collective cell spreading of 3T3 fibroblast cells. The ABC analysis is conducted for different combinations of initial cell densities and experimental times in two separate scenarios: (i) where collective cell spreading is driven by cell motility alone, and (ii) where collective cell spreading is driven by combined cell motility and cell proliferation. We find that D can be estimated precisely, with a small coefficient of variation (CV) of 2–6%. Our results indicate that D appears to depend on the experimental time, which is a feature that has been previously overlooked. Assuming that the values of D are the same in both experimental scenarios, we use the information about D from the first experimental scenario to obtain reasonably precise estimates of λ, with a CV between 4 and 12%. Our estimates of D and λ are consistent with previously reported values; however, our method is based on a straightforward measurement of the position of the leading edge whereas previous approaches have involved expensive cell counting techniques. Additional insights gained using a fully Bayesian approach justify the computational cost, especially since it allows us to accommodate information from different experiments in a principled way.