998 results for Information granularity


Relevance:

60.00%

Publisher:

Abstract:

News blog hot topics are important for information recommendation services and marketing. However, information overload and personalized management make information arrangement more difficult. Moreover, the factors that influence the formation and development of blog hot topics have received little attention. In order to correctly detect news blog hot topics, the paper first analyzes the development of topics from a new perspective based on the W2T (Wisdom Web of Things) methodology: the characteristics of blog users, the context of topic propagation and information granularity are unified to analyze the related problems. Several factors, such as the user behavior pattern, network opinion and opinion leaders, are subsequently identified as important for the development of topics. Then a topic model based on the view of event reports is constructed. Finally, hot topics are identified by their duration, topic novelty, degree of topic growth and degree of user attention. The experimental results show that the proposed method is feasible and effective.

Relevance:

60.00%

Publisher:

Abstract:

Although topic detection and tracking techniques have made great progress, most researchers pay little attention to the following two aspects. First, the construction of a topic model does not take the characteristics of different topics into consideration. Second, the factors that determine the formation and development of hot topics are not analyzed further. In order to correctly extract news blog hot topics, the paper views the above problems from a new perspective based on the W2T (Wisdom Web of Things) methodology, in which the characteristics of blog users, the context of topic propagation and information granularity are investigated in a unified way. The motivations and features of blog users are first analyzed to understand the characteristics of news blog topics. Then the context of topic propagation is decomposed into the blog community, topic network and opinion network, respectively. Important factors such as the user behavior pattern, opinion leaders and network opinion are identified to track the development trends of news blog topics. Moreover, a blog hot topic detection algorithm is proposed, in which news blog hot topics are identified by measuring the duration, topic novelty, attention degree of users and topic growth. Experimental results show that the proposed method is feasible and effective. These results are also useful for further studying the formation mechanism of opinion leaders in blogspace.
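The four measures used to identify hot topics (duration, topic novelty, user attention and topic growth) lend themselves to a weighted score. A minimal sketch, assuming illustrative weights, threshold and field names that are not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class Topic:
    duration_days: float  # how long the topic has been active
    novelty: float        # dissimilarity to earlier topics, in [0, 1]
    growth: float         # relative growth in posts per day, in [0, 1]
    attention: float      # normalized user attention (views, comments)

def hot_topic_score(t: Topic, weights=(0.2, 0.3, 0.3, 0.2)) -> float:
    """Weighted combination of the four measures named in the abstract.
    The weights are illustrative assumptions, not the paper's values."""
    w_d, w_n, w_g, w_a = weights
    return (w_d * min(t.duration_days / 30.0, 1.0)  # cap duration at a month
            + w_n * t.novelty
            + w_g * t.growth
            + w_a * t.attention)

def detect_hot_topics(topics, threshold=0.5):
    """Return topics whose combined score exceeds the threshold."""
    return [t for t in topics if hot_topic_score(t) > threshold]
```

In practice each component would be estimated from the blog data (e.g. attention from comment counts), but the combination step stays this simple.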

Relevance:

60.00%

Publisher:

Abstract:

Automatic generation of classification rules has been an increasingly popular technique in commercial applications such as Big Data analytics, rule-based expert systems and decision-making systems. However, a principal problem that arises with most methods for generating classification rules is the overfitting of training data. When Big Data is dealt with, this may result in the generation of a large number of complex rules. This may not only increase computational cost but also lower the accuracy in predicting further unseen instances. This has led to the necessity of developing pruning methods for the simplification of rules. In addition, classification rules are used further to make predictions after the completion of their generation. Where efficiency is concerned, it is expected to find the first rule that fires as soon as possible when searching through a rule set. Thus a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for construction of rule-based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They introduce some novel methods and techniques developed by them recently. These methods and techniques are also discussed in comparison to existing ones with respect to efficient processing of Big Data.
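The third operation, rule representation, concerns finding the first rule that fires as quickly as possible. A minimal sketch using the simplest representation, an ordered list of rules; the attribute names, values and default rule are hypothetical:

```python
# Each rule: (list of (attribute, value) equality conditions, class label).
def first_firing_rule(rules, instance):
    """Linear scan for the first rule whose conditions all hold.

    A flat list is the baseline rule-set representation; the chapter's
    point is that smarter structures (e.g. trees sharing common
    conditions) can find the firing rule with fewer tests.
    """
    for conditions, label in rules:
        if all(instance.get(attr) == val for attr, val in conditions):
            return label
    return None  # no rule fires

rules = [
    ([("outlook", "sunny"), ("humidity", "high")], "stay_in"),
    ([("outlook", "rainy")], "stay_in"),
    ([], "go_out"),  # empty condition set: default rule, always fires
]
```

Because prediction stops at the first match, rule order matters, which is why the representation and ordering of the rule set affect both efficiency and the predicted class.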

Relevance:

60.00%

Publisher:

Abstract:

This chapter presents fuzzy cognitive maps (FCM) as a vehicle for Web knowledge aggregation, representation, and reasoning. The corresponding Web KnowARR framework incorporates findings from fuzzy logic. The first emphasis is therefore on the Web KnowARR framework itself; a stakeholder management use case then illustrates the framework's usefulness as a second focal point. Stakeholder management helps projects gain acceptance and assertiveness by actively involving those with claims on company decisions in the management process. Stakeholder maps visually (re)present these claims. On the one hand they draw on non-public content, and on the other on content that is available to the public (mostly on the Web). The Semantic Web offers opportunities not only to present public content descriptively but also to show relationships. The proposed framework can serve as the basis for the public content of stakeholder maps.
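The FCM reasoning the chapter builds on can be sketched in a few lines: concepts hold activation levels, and one inference step propagates them through a weighted causal matrix. The sigmoid squashing function and the toy weights below are common choices, not taken from the chapter:

```python
import math

def fcm_step(state, weights):
    """One inference step of a fuzzy cognitive map.

    Each concept's new activation is a squashed weighted sum of the
    other concepts' activations; weights[j][i] is the causal influence
    of concept j on concept i. Sigmoid squashing is one common choice.
    """
    n = len(state)
    return [1.0 / (1.0 + math.exp(-sum(weights[j][i] * state[j]
                                       for j in range(n))))
            for i in range(n)]
```

Iterating `fcm_step` until the state stabilizes yields the map's inferred activation levels, which is what makes FCMs usable for reasoning over aggregated Web knowledge.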

Relevance:

30.00%

Publisher:

Abstract:

Search technologies are critical to enable clinical staff to rapidly and effectively access patient information contained in free-text medical records. Medical search is challenging as terms in the query are often general but those in relevant documents are very specific, leading to granularity mismatch. In this paper we propose to tackle granularity mismatch by exploiting subsumption relationships defined in formal medical domain knowledge resources. In symbolic reasoning, a subsumption (or 'is-a') relationship is a parent-child relationship where one concept is a subset of another concept. Subsumed concepts are included in the retrieval function. In addition, we investigate a number of initial methods for combining weights of query concepts and those of subsumed concepts. Subsumption relationships were found to provide strong indication of relevant information; their inclusion in retrieval functions yields performance improvements. This result motivates the development of formal models of relationships between medical concepts for retrieval purposes.
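A minimal sketch of including subsumed concepts in the retrieval function, assuming a toy 'is-a' hierarchy and one simple weight-combination scheme. The concept names and the discount factor are illustrative, not real medical ontology codes or values from the paper:

```python
# Toy 'is-a' hierarchy mapping child concept -> parent concept.
IS_A = {
    "bacterial pneumonia": "pneumonia",
    "viral pneumonia": "pneumonia",
    "pneumonia": "lung disease",
}

def is_subsumed_by(child, parent):
    """True if `child` is transitively a kind of `parent`."""
    while child in IS_A:
        child = IS_A[child]
        if child == parent:
            return True
    return False

def expand_query(query_weights, discount=0.5):
    """Add subsumed concepts to the query at a discounted weight.

    This is one of the simple combination schemes the abstract alludes
    to; the discount factor is an assumption.
    """
    expanded = dict(query_weights)
    for concept, w in query_weights.items():
        for child in IS_A:
            if is_subsumed_by(child, concept):
                # keep the larger weight if the child is already present
                expanded[child] = max(expanded.get(child, 0.0), w * discount)
    return expanded
```

A general query concept such as "lung disease" then also retrieves documents that mention only the more specific subsumed concepts, which is exactly the granularity mismatch being addressed.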

Relevance:

30.00%

Publisher:

Abstract:

The Australian e-Health Research Centre and Queensland University of Technology recently participated in the TREC 2012 Medical Records Track. This paper reports on our methods, results and experience using an approach that exploits the concept and inter-concept relationships defined in the SNOMED CT medical ontology. Our concept-based approach is intended to overcome specific challenges in searching medical records, namely vocabulary mismatch and granularity mismatch. Queries and documents are transformed from their term-based originals into medical concepts as defined by the SNOMED CT ontology; this is done to tackle vocabulary mismatch. In addition, we make use of the SNOMED CT parent-child 'is-a' relationships between concepts to weight documents that contain concepts subsumed by the query concepts; this is done to tackle the problem of granularity mismatch. Finally, we experiment with other SNOMED CT relationships besides the is-a relationship to weight concepts related to query concepts. Results show our concept-based approach performed significantly above the median in all four performance metrics. Further improvements are achieved by the incorporation of weighting subsumed concepts, overall leading to improvements above the median of 28% infAP, 10% infNDCG, 12% R-prec and 7% Prec@10. The incorporation of relations other than is-a demonstrated mixed results; more research is required to determine which SNOMED CT relationships are best employed when weighting related concepts.

Relevance:

30.00%

Publisher:

Abstract:

This paper conceptualizes a framework for bridging the BIM (building information modelling)-specifications divide by augmenting objects within BIM with specification parameters derived from a product library. We demonstrate how model information, enriched with data at various LODs (levels of development), can evolve simultaneously with design and construction, using different representations of a window object embedded in a wall as lifecycle phase exemplars at different levels of granularity. The conceptual standpoint is informed by the need to explore a methodological approach that extends beyond the limitations of current modelling platforms in enhancing the information content of BIM models. This work therefore demonstrates that BIM objects can be augmented with construction specification parameters by leveraging product libraries.

Relevance:

30.00%

Publisher:

Abstract:

The task of word-level confidence estimation (CE) for automatic speech recognition (ASR) systems stands to benefit from the combination of suitably defined input features from multiple information sources. However, the information sources of interest may not necessarily operate at the same level of granularity as the underlying ASR system. The research described here builds on previous work on confidence estimation for ASR systems using features extracted from word-level recognition lattices, by incorporating information at the sub-word level. Furthermore, the use of Conditional Random Fields (CRFs) with hidden states is investigated as a technique to combine information for word-level CE. Performance improvements are shown using the sub-word-level information in linear-chain CRFs with appropriately engineered feature functions, as well as when applying the hidden-state CRF model at the word level.
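A sketch of how sub-word-level scores might be folded into the word-level input features of a linear-chain CRF; the particular features below are illustrative, not the paper's engineered feature functions:

```python
def word_features(word_posterior, subword_confs):
    """Combine a word-level lattice posterior with sub-word-level
    confidence scores into one feature dict for a CRF.

    The sub-word units (e.g. phones or graphemes) operate at a finer
    granularity than the ASR word lattice; these aggregates expose
    that information at the word level. Feature choices are assumptions.
    """
    return {
        "word_posterior": word_posterior,
        "min_subword": min(subword_confs),   # weakest sub-word unit
        "mean_subword": sum(subword_confs) / len(subword_confs),
        "n_subwords": len(subword_confs),    # word length in sub-word units
    }
```

A word with a high lattice posterior but one very weak sub-word unit would then look suspicious to the CRF, which is the kind of cross-granularity signal the paper exploits.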

Relevance:

30.00%

Publisher:

Abstract:

The study of information requirements for e-business systems reveals that applications require information of a broad range of types, at varying levels of detail and granularity, and in different presentation formats. The provision of relevant information affects how efficiently e-business systems can support business goals and processes. This paper presents an approach for determining information requirements for e-business systems (DIRES) which allows the user to describe the core business processes, whose specification maps onto a business activity space. It further aids the configuration of information requirements into an information space. A case study of a logistics company in China demonstrates the use of DIRES techniques and assesses the validity of the research.

Relevance:

30.00%

Publisher:

Abstract:

In group decision making (GDM) problems, it is natural for decision makers (DMs) to provide different preferences and evaluations owing to varying domain knowledge and cultural values. When the number of DMs is large, a higher degree of heterogeneity is expected, and it is difficult to translate heterogeneous information into one unified preference without loss of context. In this aspect, the current GDM models face two main challenges, i.e., handling the complexity pertaining to the unification of heterogeneous information from a large number of DMs, and providing optimal solutions based on unification methods. This paper presents a new consensus-based GDM model to manage heterogeneous information. In the new GDM model, an aggregation of individual priority (AIP)-based aggregation mechanism, which is able to employ flexible methods for deriving each DM's individual priority and to avoid information loss caused by unifying heterogeneous information, is utilized to aggregate the individual preferences. To reach a consensus more efficiently, different revision schemes are employed to reward/penalize the cooperative/non-cooperative DMs, respectively. The temporary collective opinion used to guide the revision process is derived by aggregating only those non-conflicting opinions at each round of revision. In order to measure the consensus in a robust manner, a position-based dissimilarity measure is developed. Compared with the existing GDM models, the proposed GDM model is more effective and flexible in processing heterogeneous information. It can be used to handle different types of information with different degrees of granularity. Six types of information are exemplified in this paper, i.e., ordinal, interval, fuzzy number, linguistic, intuitionistic fuzzy set, and real number. The results indicate that the position-based consensus measure is able to overcome possible distortions of the results in large-scale GDM problems.
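One simple reading of a position-based dissimilarity measure between two DMs' priority orderings, normalized so that a complete reversal scores 1. This is a sketch of the idea, not necessarily the paper's exact formula:

```python
def position_dissimilarity(rank_a, rank_b):
    """Dissimilarity between two priority orderings of the same
    alternatives, based on how far each alternative's position shifts.

    The total displacement is normalized by its maximum (a full
    reversal), giving a value in [0, 1]. Working on positions rather
    than raw preference values is what lets the measure compare DMs
    who expressed preferences in heterogeneous formats.
    """
    n = len(rank_a)
    pos_b = {alt: i for i, alt in enumerate(rank_b)}
    total_shift = sum(abs(i - pos_b[alt]) for i, alt in enumerate(rank_a))
    return total_shift / (n * n // 2)  # n*n//2 = shift of a full reversal
```

Because each heterogeneous preference format (ordinal, interval, fuzzy, linguistic, ...) can be reduced to a priority ordering first, the same measure applies across all of them.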

Relevance:

30.00%

Publisher:

Abstract:

We propose an analysis for detecting procedures and goals that are deterministic (i.e. that produce at most one solution), or predicates whose clause tests are mutually exclusive (which implies that at most one of their clauses will succeed) even if they are not deterministic (because they call other predicates that can produce more than one solution). Applications of such determinacy information include detecting programming errors, performing certain high-level program transformations for improving search efficiency, optimizing low-level code generation and parallel execution, and estimating tighter upper bounds on the computational costs of goals and data sizes, which can be used for program debugging, resource consumption and granularity control, etc. We have implemented the analysis and integrated it in the CiaoPP system, which also infers automatically the mode and type information that our analysis takes as input. Experiments performed on this implementation show that the analysis is fairly accurate and efficient.
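The weaker notion of determinacy, mutually exclusive clause tests, can be sketched by modeling each clause guard as a list of equality tests on input arguments (a deliberate simplification of the mode/type information the analysis actually consumes):

```python
def mutually_exclusive(guard_a, guard_b):
    """Two guards are mutually exclusive if they test some variable
    against different values, so at most one of them can succeed.
    Guards are lists of (variable, value) equality tests."""
    tests_a = dict(guard_a)
    return any(var in tests_a and tests_a[var] != val
               for var, val in guard_b)

def at_most_one_clause_succeeds(guards):
    """A predicate has mutually exclusive clause tests (the abstract's
    weaker determinacy property) if its guards are pairwise exclusive."""
    return all(mutually_exclusive(guards[i], guards[j])
               for i in range(len(guards))
               for j in range(i + 1, len(guards)))
```

The real analysis must also reason about unification, arithmetic tests and inferred types, but the pairwise-exclusion structure is the same.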

Relevance:

30.00%

Publisher:

Abstract:

Web document cluster analysis plays an important role in information retrieval by organizing large amounts of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two-level (document and term) knowledge granularity but ignores the bridging paragraph granularity. However, this two-level granularity may lead to unsatisfactory clustering results with "false correlation". In order to deal with the problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of a five-layer representation of data and a two-phase clustering process, is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problem resulting from the sparse term-paragraph matrix, an ontology-based strategy and a tolerance-rough-set based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be more efficiently and effectively captured in HRMM and thus web document clusters with higher quality can be generated. Extensive experiments show that HRMM, HRMM with the tolerance-rough-set strategy, and HRMM with ontology all outperform VSM and a representative non-VSM-based algorithm, WFP, significantly in terms of the F-Score.
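The benefit of the bridging paragraph granularity can be sketched by scoring document similarity through best-matching paragraphs rather than whole-document vectors; this illustrates the idea, not the HRMM algorithm itself:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def doc_similarity(paras_a, paras_b):
    """Paragraph-level document similarity: match each paragraph of one
    document to its best counterpart in the other and average.

    Two documents that share terms only across unrelated paragraphs
    ('false correlation') score lower here than under a flat
    document-level VSM comparison, which pools all terms together.
    """
    scores = [max(cosine(p, q) for q in paras_b) for p in paras_a]
    return sum(scores) / len(scores)
```

HRMM generalizes this to five representation layers and a two-phase clustering process, but the paragraph layer is where the "false correlation" of plain VSM is broken.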

Relevance:

30.00%

Publisher:

Abstract:

This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore data partitioning and replication management problems need to be addressed in the development of an efficient and scalable solution. Part of the research is to study the patterns of geographical raster data processing and to propose algorithms to improve the availability of such data. These algorithms and approaches target the granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service (QoS), considering distributed data delivery and processing. To achieve this goal a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to the user's requirements such as resolution, temporal range, and target bands, to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. Another part of the research pursued methods for efficient acquisition of GIS data from external heterogeneous databases and Web services, as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. There are vast numbers of computing, network, and storage resources idling or not fully utilized on the Internet. The proposed "Crawling Distributed Operating System" (CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in the GIS database context. The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database.
The approach developed in this dissertation has resulted in the creation of the TerraFly GIS database, which is used by the US government, researchers, and the general public to facilitate Web access to remotely-sensed imagery and GIS vector information.
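The on-demand tiling idea can be sketched as enumerating only the tile indices that cover a requested region, so that mosaicking work and storage are spent only where a user actually looks. The tile size and indexing scheme below are assumptions:

```python
def tiles_for_region(xmin, ymin, xmax, ymax, tile_size=256):
    """Enumerate (tx, ty) tile indices covering a requested region.

    Only these tiles need to be mosaicked (or fetched from cache),
    which is the core of generating mosaicked tiles on demand rather
    than precomputing a full mosaic. Coordinates are in pixels of a
    hypothetical fixed-resolution grid.
    """
    x0, y0 = int(xmin // tile_size), int(ymin // tile_size)
    x1, y1 = int(xmax // tile_size), int(ymax // tile_size)
    return [(tx, ty) for ty in range(y0, y1 + 1)
                     for tx in range(x0, x1 + 1)]
```

In the dissertation's setting each tile would additionally be keyed by resolution, temporal range and target bands, but the region-to-tiles mapping is the part that bounds the work to the requested area.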

Relevance:

30.00%

Publisher:

Abstract:

Organizations apply information security risk assessment (ISRA) methodologies to systematically and comprehensively identify information assets and related security risks. We review the ISRA literature and identify three key deficiencies in current methodologies that stem from their traditional accountancy-based perspective and a limited view of organizational "assets". In response, we propose a novel rich description method (RDM) that adopts a less formal and more holistic view of information and knowledge assets that exist in modern work environments. We report on an in-depth case study to explore the potential for improved asset identification enabled by the RDM compared to traditional ISRAs. The comparison shows how the RDM addresses the three key deficiencies of current ISRAs by providing: 1) a finer level of granularity for identifying assets, 2) a broader coverage of assets that reflects the informal aspects of business practices, and 3) the identification of critical knowledge assets.