77 results for Information Filtering, Pattern Mining, Relevance Feature Discovery, Text Mining


Relevance:

100.00%

Abstract:

In this paper we discuss a special case of knowledge creation (pattern mining) that was studied using a hermeneutic circle. Hermeneutics assists the oscillation between tacit and explicit knowledge through the process of knowledge qualification, combination, socialization, externalization, internalization and introspection. Our investigation of the knowledge creation process led to the enrichment of the knowledge creation framework proposed by Wickramasinghe and Lichtenstein, allowing us to reflect on activities and aspects of knowledge elicitation across an application domain and involving practitioners who do not communicate directly.

Relevance:

100.00%

Abstract:

In this paper we present preliminary work implementing dynamic privacy in public surveillance. The aim is to maximise the privacy of those under surveillance, while giving an observer access to sufficient information to perform their duties. As these aspects are in conflict, a dynamic approach to privacy is required to balance the system's purpose with the system's privacy. Dynamic privacy is achieved by accounting for the situation, or context, within the environment. The context is determined by a number of visual features that are combined and then used to determine an appropriate level of privacy.
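The abstract above describes combining visual context features into a dynamic privacy level. A minimal sketch of that idea follows; the feature names, weights, thresholds, and privacy levels are illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of dynamic privacy: normalised context features are
# combined into a single score, which selects how much detail the observer
# may see. All names and thresholds here are invented for illustration.

def context_score(features, weights):
    """Weighted combination of context features, each assumed to lie in [0, 1]."""
    return sum(weights[name] * value for name, value in features.items())

def privacy_level(score):
    """Map a context score to a privacy level. A higher score means the
    observer has more reason to see detail, so less obfuscation is applied."""
    if score < 0.3:
        return "silhouette_only"  # maximum privacy in routine contexts
    if score < 0.7:
        return "blur_faces"       # intermediate privacy
    return "raw_footage"          # minimum privacy, e.g. during an incident
```

A calm scene (little motion, sparse crowd) would then yield a low score and maximum privacy, while an incident raises the score and relaxes the obfuscation.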

Relevance:

100.00%

Abstract:

Soft Computing is an interdisciplinary area that encompasses a variety of computing paradigms. Examples of some popular soft computing paradigms include fuzzy computing, neural computing, evolutionary computing, and probabilistic computing. Soft computing paradigms, in general, aim to produce computing systems/machines that exhibit some useful properties, e.g. making inference with vague and/or ambiguous information, learning from noisy and/or incomplete data, adapting to changing environments, and reasoning with uncertainties. These properties are important for the systems/machines to be useful in assisting humans in our daily activities. Indeed, soft computing paradigms have been demonstrated to be capable of tackling a wide range of problems, e.g. optimization, decision making, information processing, pattern recognition, and intelligent data analysis. A number of papers pertaining to some recent advances in theoretical development and practical application of different soft computing paradigms are highlighted in this special issue.

Relevance:

100.00%

Abstract:

The medial prefrontal cortex (mPFC) and the right temporo-parietal junction (rTPj) are highly involved in social understanding, a core area of impairment in autism spectrum disorder (ASD). We used fMRI to investigate sex differences in the neural correlates of social understanding in 27 high-functioning adults with ASD and 23 matched controls. There were no differences in neural activity in the mPFC or rTPj between groups during social processing. Whole brain analysis revealed decreased activity in the posterior superior temporal sulcus in males with ASD compared to control males while processing social information. This pattern was not observed in the female sub-sample. The current study indicates that sex mediates the neurobiology of ASD, particularly with respect to processing social information.

Relevance:

70.00%

Abstract:

Background: Continuous content management of health information portals is vital for their sustainability and widespread acceptance. The knowledge and experience of a domain expert is essential for content management in the health domain. Online health resources are generated at an exponential rate, so manually examining them for relevance to a specific topic and audience is a formidable challenge for domain experts. Intelligent content discovery for effective content management is an under-researched topic. An existing expert-endorsed content repository can provide the necessary leverage to automatically identify relevant resources and evaluate qualitative metrics.

Objective: This paper reports on design research towards an intelligent technique for automated content discovery and ranking for health information portals. The proposed technique aims to improve the efficiency of the current, mostly manual process of portal content management by utilising an existing expert-endorsed content repository as a supporting base and as a benchmark to evaluate the suitability of new content.

Methods: A model for content management was established based on a field study of potential users. The proposed technique is integral to this content management model and executes in several phases (ie, query construction, content search, text analytics and fuzzy multi-criteria ranking). The construction of multi-dimensional search queries with input from WordNet, the use of multi-word and single-word terms as representative semantics for text analytics, and the use of fuzzy multi-criteria ranking for subjective evaluation of quality metrics are original contributions reported in this paper.

Results: The feasibility of the proposed technique was examined with experiments conducted on an actual health information portal, the BCKOnline portal. Both intermediary and final results generated by the technique are presented in the paper; these help to establish the benefits of the technique and its contribution towards effective content management.

Conclusions: The prevalence of large numbers of online health resources is a key obstacle for domain experts involved in content management of health information portals and websites. The proposed technique has proven successful at searching for and identifying resources and measuring their relevance. It can be used to support the domain expert in content management and thereby ensure the health portal remains current.
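The abstract names fuzzy multi-criteria ranking as one phase of the technique. A common way to realise such ranking, sketched below under the assumption of triangular fuzzy numbers and centroid defuzzification (the paper's exact formulation is not given here), maps linguistic quality ratings to fuzzy numbers, aggregates them per resource with criterion weights, and ranks by the defuzzified score:

```python
# Illustrative fuzzy multi-criteria ranking, NOT the paper's exact method:
# linguistic labels -> triangular fuzzy numbers (a, b, c), weighted
# aggregation across criteria, centroid defuzzification, then sorting.

FUZZY = {"low": (0.0, 0.0, 0.5), "medium": (0.25, 0.5, 0.75), "high": (0.5, 1.0, 1.0)}

def fuzzy_score(ratings, weights):
    """ratings: criterion -> linguistic label; weights: criterion -> weight (sum to 1)."""
    a = b = c = 0.0
    for crit, label in ratings.items():
        lo, mid, hi = FUZZY[label]
        w = weights[crit]
        a += w * lo
        b += w * mid
        c += w * hi
    # centroid of the aggregated triangular fuzzy number
    return (a + b + c) / 3.0

def rank(resources, weights):
    """Sort candidate resources (dicts with a 'ratings' entry) best-first."""
    return sorted(resources, key=lambda r: fuzzy_score(r["ratings"], weights), reverse=True)
```

The criterion names and label-to-fuzzy-number mapping are assumptions; a domain expert would normally supply both the vocabulary and the weights.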

Relevance:

50.00%

Abstract:

Modeling probabilistic data is an important issue in databases because real-world data is often uncertain. It is therefore necessary to identify potentially useful patterns in probabilistic databases. Because probabilistic data in 1NF relations is redundant, previous mining techniques do not work well on probabilistic databases. For this reason, this paper proposes a new model for mining probabilistic databases. A partitioning method is developed for preprocessing probabilistic data in a probabilistic database. We evaluated the proposed technique, and the experimental results demonstrate that our approach is effective and efficient.
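A common building block for pattern mining over uncertain data (not necessarily the model this paper proposes) is expected support: assuming items occur independently with the stated probabilities, the expected support of an itemset is the sum over transactions of the product of its items' probabilities. A minimal sketch:

```python
# Expected support in an uncertain transaction database, under the usual
# independence assumption. Each transaction maps item -> existence probability.

def expected_support(db, itemset):
    """Sum over transactions of the product of the itemset's item probabilities.
    An item absent from a transaction has probability 0 there."""
    total = 0.0
    for txn in db:
        p = 1.0
        for item in itemset:
            p *= txn.get(item, 0.0)
        total += p
    return total
```

For example, with transactions {a: 0.8, b: 0.5} and {a: 1.0}, the itemset {a, b} has expected support 0.8 × 0.5 = 0.4, while {a} has 0.8 + 1.0 = 1.8.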

Relevance:

50.00%

Abstract:

Over the past decade, advances in the Internet and media technology have literally brought people closer than ever before. It is interesting to note that traditional sociological definitions of a community have become outmoded, as community now extends far beyond the geographical boundaries assumed by those definitions (Wellman & Gulia, 1999). Virtual or online community was defined in such a context to describe various forms of computer-mediated communication (CMC). Although virtual communities do not necessarily arise from the Internet, the overwhelming popularity of the Internet is one of the main reasons that virtual communities receive so much attention (Rheingold, 1999). The beginning of virtual communities is attributed to scientists who exchanged information and cooperatively conducted research during the 1970s. There are four needs of participants in a virtual community: member interest, social interaction, imagination, and transaction (Hagel & Armstrong, 1997). The first two focus more on information exchange and knowledge discovery; imagination is for entertainment; and transaction is for commerce strategy. In this article, we investigate the function of information exchange and knowledge discovery in virtual communities. There are two important inherent properties embedded in virtual communities (Wellman, 2001):

Relevance:

50.00%

Abstract:

Textural image classification technologies have been extensively explored and widely applied in many areas. It is advantageous to combine both the occurrence and the spatial distribution of local patterns to describe a texture. However, most existing state-of-the-art approaches for textural image classification only employ the occurrence histogram of local patterns to describe textures, without considering their co-occurrence information. They are also usually very time-consuming because of the vector quantization involved. Moreover, those feature extraction paradigms are implemented at a single scale. In this paper we propose a novel multi-scale local pattern co-occurrence matrix (MS_LPCM) descriptor to characterize textural images through four major steps. Firstly, Gaussian filtering pyramid preprocessing is employed to obtain multi-scale images; secondly, a local binary pattern (LBP) operator is applied to each textural image to create an LBP image; thirdly, the gray-level co-occurrence matrix (GLCM) is utilized to extract the local pattern co-occurrence matrix (LPCM) from the LBP images as the features; finally, all LPCM features from the same textural image at different scales are concatenated as the final feature vector for classification. The experimental results on three benchmark databases show higher classification accuracy and lower computing cost compared with other state-of-the-art algorithms.
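The middle two steps of the pipeline (an LBP image, then a co-occurrence matrix computed over the LBP codes) can be sketched in NumPy as follows. This is a minimal illustration of the general LBP and GLCM operations, not the paper's implementation, and it omits the Gaussian pyramid and the concatenation across scales:

```python
import numpy as np

def lbp_image(img):
    """8-neighbour local binary pattern codes for the interior pixels of img."""
    c = img[1:-1, 1:-1]
    codes = np.zeros_like(c, dtype=np.uint8)
    # neighbour offsets in clockwise order; each contributes one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= (nb >= c).astype(np.uint8) << bit
    return codes

def lpcm(codes, dy=0, dx=1, levels=256):
    """Co-occurrence matrix of LBP codes at a spatial offset (dy, dx) >= 0,
    i.e. a GLCM computed on the LBP image rather than on raw gray levels."""
    a = codes[:codes.shape[0] - dy, :codes.shape[1] - dx]
    b = codes[dy:, dx:]
    m = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(m, (a.ravel(), b.ravel()), 1)  # count each code pair
    return m
```

In the full descriptor, one such LPCM would be computed per pyramid level and the (possibly flattened) matrices concatenated into the final feature vector.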

Relevance:

50.00%

Abstract:

Social networks have become a convenient and effective means of communication in recent years. Many people use social networks to communicate, lead and manage activities, and express their opinions in support of or opposition to different causes. This has brought forward the issue of verifying the owners of social accounts, in order to eliminate the effect of any fake accounts on people. This study aims to authenticate genuine accounts versus fake accounts using writeprint, the writing-style biometric. We first extract a set of features using text mining techniques. A supervised machine learning algorithm is then trained to build the knowledge base. The recognition procedure starts by extracting the relevant features and then measuring the similarity of the feature vector with respect to all feature vectors in the knowledge base. The most similar vector identifies the verified account.
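The verification step above (extract features, then return the most similar stored vector) can be sketched with a toy stand-in for the paper's feature extraction: character-trigram frequency vectors compared by cosine similarity. The feature choice and account names here are illustrative assumptions:

```python
import numpy as np
from collections import Counter

def trigram_vector(text, vocab):
    """L2-normalised character-trigram frequency vector over a fixed vocabulary.
    A toy substitute for the paper's text-mining feature extraction."""
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    v = np.array([counts[g] for g in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def verify(sample, knowledge_base, vocab):
    """Return the account whose stored style vector is most similar (cosine)
    to the sample's vector; vectors in knowledge_base are pre-normalised."""
    v = trigram_vector(sample, vocab)
    return max(knowledge_base, key=lambda acct: float(v @ knowledge_base[acct]))
```

A real writeprint system would use a much richer feature set (lexical, syntactic, structural and idiosyncratic features) and a trained classifier rather than raw nearest-vector lookup.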

Relevance:

50.00%

Abstract:

Streams of short text, such as news titles, enable us to effectively and efficiently learn about real-world events that occur anywhere and anytime. Short text messages, which are accompanied by timestamps and typically describe events in only a few words, differ from longer text documents such as web pages, news stories, blogs, technical papers and books. For example, few words repeat within a news title, so term frequency (i.e., TF) is less important in a short text corpus than in a longer text corpus. Analysis of short text therefore faces new challenges. Moreover, detecting and tracking events through short text analysis requires reliably identifying events from stable topic clusters; however, existing methods such as Latent Dirichlet Allocation (LDA) generate different topic results for the same corpus at different executions. In this paper, we provide a Finding Topic Clusters using Co-occurring Terms (FTCCT) algorithm to automatically generate topics from a short text corpus, and develop an Event Evolution Mining (EEM) algorithm to discover hot events and their evolutions (i.e., how the popularity of events changes over time). In FTCCT, a term (i.e., a single word or a multi-word phrase) belongs to only one topic in a corpus. Experiments on news titles from 157 countries over 4 months (July to October 2013) demonstrate that our FTCCT-based method (combining FTCCT and EEM) achieves far higher quality of event content and description words than the LDA-based method (combining LDA and EEM) for analysing streams of short text. Our method also visualizes the evolutions of hot events. The discovered world-wide event evolutions reveal some interesting correlations among world-wide events; for example, successive extreme weather phenomena occurred in different locations: a typhoon in Hong Kong and the Philippines followed a hurricane and storm flood in Mexico in September 2013. © 2014 Springer Science+Business Media New York.
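The core FTCCT property stated in the abstract, that each term belongs to exactly one topic cluster, can be illustrated with a deterministic toy clustering over term co-occurrence (this is a sketch in the spirit of the idea, not the published algorithm): terms that co-occur in enough titles are linked, and each connected component of the resulting graph becomes one topic cluster.

```python
from collections import defaultdict
from itertools import combinations

def topic_clusters(titles, min_cooccur=2):
    """Toy co-occurrence clustering: link word pairs appearing together in at
    least `min_cooccur` titles, then return the connected components, so
    every term lands in exactly one cluster. Deterministic, unlike LDA."""
    cooccur = defaultdict(int)
    for title in titles:
        for a, b in combinations(sorted(set(title.split())), 2):
            cooccur[(a, b)] += 1
    # union-find over the thresholded co-occurrence graph
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for (a, b), n in cooccur.items():
        if n >= min_cooccur:
            parent[find(a)] = find(b)
    clusters = defaultdict(set)
    for term in {w for t in titles for w in t.split()}:
        clusters[find(term)].add(term)
    return list(clusters.values())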

Relevance:

50.00%

Abstract:

The low accuracy rates of text/shape dividers for digital ink diagrams are hindering their use in real-world applications. While recognition of handwriting is well advanced and many recognition approaches have been proposed for hand-drawn sketches, less attention has been paid to dividing text ink from drawing ink. Feature-based recognition is a common approach to text/shape division; however, the choice of features and algorithms is critical to the success of the recognition. We propose the use of data mining techniques to build more accurate text/shape dividers. A comparative study is used to systematically identify the algorithms best suited to the specific problem. We have generated dividers using data mining with diagrams from three domains and a comprehensive ink feature library. An extensive evaluation on diagrams from six different domains has shown that our resulting dividers, using LADTree and LogitBoost, are significantly more accurate than three existing dividers.
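Feature-based division, as described above, starts by computing numeric features per ink stroke. The sketch below shows a tiny illustrative feature set and a hand-set threshold rule; the real dividers instead learn the decision boundary from a comprehensive feature library with algorithms such as LADTree or LogitBoost, and the feature names and threshold here are assumptions:

```python
import math

def stroke_features(points):
    """points: list of (x, y) samples along one ink stroke. Illustrative
    features: total path length, bounding-box diagonal, and their ratio
    ('curviness'), a tiny stand-in for a full ink feature library."""
    path = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    diag = math.dist((min(xs), min(ys)), (max(xs), max(ys)))
    return {"path_length": path, "bbox_diag": diag,
            "curviness": path / diag if diag else 1.0}

def is_text(points, curviness_threshold=1.5):
    """Toy divider rule: text strokes tend to wiggle (path much longer than
    the bounding-box diagonal), shape strokes tend to be straighter."""
    return stroke_features(points)["curviness"] >= curviness_threshold
```

A learned divider would replace the single threshold with a classifier trained on labelled strokes from several diagram domains.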