918 results for Strongly Semantic Information
Abstract:
A search query, being a very concise expression of user intent, can have many possible interpretations. Search engines hedge their bets by diversifying top results to cover multiple such possibilities so that the user is likely to be satisfied, whatever her intended interpretation. Diversified Query Expansion is the problem of diversifying query expansion suggestions, so that the user can specialize the query to better suit her intent, even before perusing search results. We propose a method, Select-Link-Rank (SLR), that exploits semantic information from Wikipedia to generate diversified query expansions. SLR processes terms and Wikipedia entities collectively in an integrated framework, simultaneously diversifying query expansions and entity recommendations. SLR starts by selecting informative terms from the search results of the initial query, links them to Wikipedia entities, performs a diversity-conscious entity scoring, and transfers that scoring to the term space to arrive at query expansion suggestions. Through an extensive empirical analysis and user study, we show that our method outperforms state-of-the-art diversified query expansion and diversified entity recommendation techniques.
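The idea of diversity-conscious ranking of expansion terms can be illustrated with a small sketch. It is not the SLR algorithm from the abstract: it swaps in a generic greedy Maximal Marginal Relevance (MMR) selection, and the function names, the `trade_off` parameter, the Jaccard overlap, and the toy relevance scores and entity links are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of linked entities (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversify(relevance: Dict[str, float],
              entities: Dict[str, Set[str]],
              k: int,
              trade_off: float = 0.5) -> List[str]:
    """Greedy MMR: pick up to k terms, balancing relevance against redundancy.

    `relevance` maps each candidate expansion term to a hypothetical score;
    `entities` maps each term to the (hypothetical) entities it links to.
    """
    selected: List[str] = []
    remaining = set(relevance)
    while remaining and len(selected) < k:
        def mmr(term: str) -> float:
            # Redundancy is the highest entity overlap with any chosen term.
            redundancy = max(
                (jaccard(entities.get(term, set()), entities.get(s, set()))
                 for s in selected),
                default=0.0,
            )
            return trade_off * relevance[term] - (1.0 - trade_off) * redundancy

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data, invented for this example only.
toy_relevance = {"jaguar car": 0.9, "jaguar price": 0.85, "jaguar animal": 0.6}
toy_entities = {
    "jaguar car": {"Jaguar_Cars"},
    "jaguar price": {"Jaguar_Cars"},
    "jaguar animal": {"Jaguar"},
}
suggestions = diversify(toy_relevance, toy_entities, k=2)
```

With the toy data, the second suggestion is the less relevant term that links to a different entity, which is the behaviour a diversified expansion aims for. Setting `trade_off=1.0` reduces the sketch to plain relevance ranking.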
Abstract:
The Portable Document Format (PDF), defined by Adobe Systems Inc. as the basis of its Acrobat product range, is discussed in some detail. Particular emphasis is given to its flexible object-oriented structure, which has yet to be fully exploited. It is currently used to represent not logical structure but simply a series of pages and associated resources. A definition of an Encapsulated PDF (EPDF) is presented, in which EPDF blocks carry with them their own resource requirements, together with geometrical and logical information. A block formatter called Juggler is described which can lay out EPDF blocks from various sources onto new pages. Future revisions of PDF supporting uniquely-named EPDF blocks tagged with semantic information would assist in composite-pagemakeup and could even lead to fully revisable PDF.
Abstract:
Increasing the size of training data has been shown to be very effective in many computer vision tasks. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large-scale image sets is demanding in itself: they take up so much memory that it becomes impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be key to attacking this problem. Efficiency alone, however, is not enough for image understanding; the representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques for learning the binary features from supervised image sets with different types of semantic supervision (class labels, textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
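To make the efficiency argument concrete, the sketch below shows why binary codes are cheap to store and compare. It does not reproduce the learned, supervised codes the abstract proposes: it swaps in random-hyperplane hashing (a standard locality-sensitive hashing scheme), and the function names, code length, and feature vector are assumptions for illustration.

```python
import random
from typing import List, Sequence


def make_hyperplanes(n_bits: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Draw n_bits random hyperplanes (Gaussian normals) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(features: Sequence[float],
                hyperplanes: Sequence[Sequence[float]]) -> int:
    """Pack one bit per hyperplane into an int: 1 if the dot product is >= 0."""
    code = 0
    for plane in hyperplanes:
        dot = sum(w * x for w, x in zip(plane, features))
        code = (code << 1) | (1 if dot >= 0.0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")


# Hypothetical 4-dimensional image descriptor, hashed to a 16-bit code.
planes = make_hyperplanes(n_bits=16, dim=4, seed=42)
descriptor = [0.2, -1.3, 0.7, 2.1]
code = binary_code(descriptor, planes)
```

Each image is reduced to a single integer, and comparing two images costs one XOR and a bit count, which is what makes binary features attractive at large scale.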
Abstract:
Eye-tracking was used to examine how younger and older adults use syntactic and semantic information to disambiguate noun/verb (NV) homographs (e.g., park). We find that young adults exhibit inflated first fixations to NV-homographs when only syntactic cues are available for disambiguation (i.e., in syntactic prose). This effect is eliminated with the addition of disambiguating semantic information. Older adults (60+) as a group fail to show the first fixation effect in syntactic prose; they instead reread NV homographs longer. This pattern mirrors that in prior event-related potential work (Lee & Federmeier, 2009, 2011), which reported a sustained frontal negativity to NV-homographs in syntactic prose for young adults, which was eliminated by semantic constraints. The frontal negativity was not observed in older adults as a group, although older adults with high verbal fluency showed the young-like pattern. Analyses of individual differences in eye-tracking patterns revealed a similar effect of verbal fluency in both young and older adults: high verbal fluency groups of both ages show larger first fixation effects, while low verbal fluency groups show larger downstream costs (rereading and/or refixating NV homographs). Jointly, the eye-tracking and ERP data suggest that effortful meaning selection recruits frontal brain areas important for suppressing contextually inappropriate meanings, which also slows eye movements. Efficacy of fronto-temporal circuitry, as captured by verbal fluency, predicts the success of engaging these mechanisms in both young and older adults. Failure to recruit these processes requires compensatory rereading or leads to comprehension failures (Lee & Federmeier, in press).
Abstract:
Thanks to advanced technologies and social networks that allow data to be widely shared across the Internet, there is an explosion of pervasive multimedia data, generating high demand for multimedia services and applications in various areas so that people can easily access and manage multimedia data. To meet such demand, multimedia big data analysis has become an emerging hot topic in both industry and academia, ranging from basic infrastructure, management, search, and mining to security, privacy, and applications. Within the scope of this dissertation, a multimedia big data analysis framework is proposed for semantic information management and retrieval, with a focus on rare event detection in videos. The proposed framework is able to explore hidden semantic feature groups in multimedia data and incorporate temporal semantics, especially for video event detection. First, a hierarchical semantic data representation is presented to alleviate the semantic gap issue, and the Hidden Coherent Feature Group (HCFG) analysis method is proposed to capture the correlation between features and separate the original feature set into semantic groups, seamlessly integrating multimedia data in multiple modalities. Next, an Importance Factor based Temporal Multiple Correspondence Analysis (IF-TMCA) approach is presented for effective event detection. Specifically, the HCFG algorithm is integrated with the Hierarchical Information Gain Analysis (HIGA) method to generate the Importance Factor (IF) for producing the initial detection results. Then, the TMCA algorithm is proposed to efficiently incorporate temporal semantics for re-ranking and improving the final performance. Finally, a sampling-based ensemble learning mechanism is applied to further accommodate imbalanced datasets. In addition to the multimedia semantic representation and class imbalance problems, lack of organization is another critical issue for multimedia big data analysis.
In this framework, an affinity propagation-based summarization method is also proposed to transform the unorganized data into a better structure with clean and well-organized information. The whole framework has been thoroughly evaluated across multiple domains, such as soccer goal event detection and disaster information management.
Abstract:
Business practices vary from one company to another, and they often need to change in response to changes in the business environment. To satisfy different business practices, enterprise systems need to be customized. To keep up with ongoing business practice changes, enterprise systems need to be adapted. Because of their rigidity and complexity, the customization and adaptation of enterprise systems often take excessive time, with potential failures and budget shortfalls. Moreover, enterprise systems often hold business back because they cannot be rapidly adapted to support business practice changes. Extensive literature has addressed this issue by identifying success or failure factors, implementation approaches, and project management strategies. Those efforts were aimed at learning lessons from post-implementation experiences to help future projects. This research looks into this issue from a different angle. It attempts to address this issue by delivering a systematic method for developing flexible enterprise systems which can be easily tailored for different business practices or rapidly adapted when business practices change. First, this research examines the role of system models in the context of enterprise system development, and the relationship of system models with software programs in the contexts of computer aided software engineering (CASE), model driven architecture (MDA) and workflow management systems (WfMS). Then, by applying the analogical reasoning method, this research introduces the concept of model driven enterprise systems. The novelty of model driven enterprise systems is that they extract system models from software programs and allow system models to stay independent of software programs. In the paradigm of model driven enterprise systems, system models act as instructors to guide and control the behavior of software programs. Software programs function by interpreting instructions in system models.
This mechanism exposes the opportunity to tailor such a system by changing system models. To make this possible, system models should be represented in a language which can be easily understood by human beings and can also be effectively interpreted by computers. In this research, various semantic representations are investigated to support model driven enterprise systems. The significance of this research is 1) the transplantation of the successful structure for flexibility in modern machines and WfMS to enterprise systems; and 2) the advancement of MDA by extending the role of system models from guiding system development to controlling system behaviors. This research contributes to the area relevant to enterprise systems from three perspectives: 1) a new paradigm of enterprise systems, in which enterprise systems consist of two essential elements: system models and software programs. These two elements are loosely coupled and can exist independently; 2) semantic representations, which can effectively represent business entities, entity relationships, business logic and information processing logic in a semantic manner. Semantic representations are the key enabling techniques of model driven enterprise systems; and 3) a brand new role for system models; traditionally the role of system models is to guide developers in writing system source code. This research promotes the role of system models to controlling the behaviors of enterprise systems.
Abstract:
For more than a decade, research in the field of context-aware computing has aimed to find ways to exploit situational information that can be detected by mobile computing and sensor technologies. The goal is to provide people with new and improved applications, enhanced functionality and a better user experience (Dey, 2001). Early applications focused on representing or computing on physical parameters, such as showing your location and the location of people or things around you. Such applications might show where the next bus is, which of your friends are in the vicinity, and so on. With the advent of social networking software and microblogging sites such as Facebook and Twitter, recommender systems, and so on, context-aware computing is moving towards mining the social web in order to provide better representations and understanding of context, including social context. In this paper we begin by recapping different theoretical framings of context. We then discuss the problem of context-aware computing from a design perspective.
Abstract:
Background: This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching. Aim: The concept-based approach is intended to overcome specific challenges we identified in searching medical records. Method: Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology. Results: Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision. Conclusion: The concept-based approach provides a framework for further development of inference-based search systems for dealing with medical data.
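The reported gain is stated in Mean Average Precision (MAP). As a reminder of what that metric measures, here is a minimal sketch of the standard definition. The function names, document identifiers, and relevance sets are invented for illustration and are not data from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@rank taken at each rank where a relevant doc appears.

    Relevant documents that are never retrieved contribute zero, because the
    sum is divided by the total number of relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
        runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over (ranked list, relevant set) query pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries with made-up rankings and judgements.
toy_runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),              # AP = (1/2) / 1
]
toy_map = mean_average_precision(toy_runs)
```

A relative improvement such as the 25% in the abstract would then be computed as the difference between the two systems' MAP values divided by the baseline's MAP.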
Abstract:
Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities or information about the specific entities instead of documents. In this paper, we study the problem of finding entity related information, referred to as attribute-value pairs, that play a significant role in searching target entities. We propose a novel decomposition framework combining reduced relations and the discriminative model, Conditional Random Field (CRF), for automatically finding entity-related attribute-value pairs from free text documents. This decomposition framework allows us to locate potential text fragments and identify the hidden semantics, in the form of attribute-value pairs for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches due to its capability of effective integration of syntactic and semantic features.
Abstract:
Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human-judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approximately 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.
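The evaluation described here compares similarity scores from a measure against human ratings for the same concept pairs. The sketch below shows one way to compute such an agreement figure. The abstract does not say which correlation coefficient was used, so Pearson's r is an assumption, and the function name and sample scores are invented for illustration.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five concept pairs: one list from a corpus-driven
# measure, one from human raters. The numbers are made up.
measure_scores = [0.91, 0.15, 0.60, 0.33, 0.78]
human_ratings = [3.8, 1.2, 2.9, 2.4, 3.1]
agreement = pearson(measure_scores, human_ratings)
```

A rank-based coefficient such as Spearman's rho could be substituted by replacing the raw scores with their ranks before calling `pearson`.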
Abstract:
The increasing amount of information that is annotated against standardised semantic resources offers opportunities to incorporate sophisticated levels of reasoning, or inference, into the retrieval process. In this position paper, we reflect on the need to incorporate semantic inference into retrieval (in particular for medical information retrieval) as well as on previous attempts, which have met with mixed success. Medical information retrieval is fertile ground for testing inference mechanisms to augment retrieval. The medical domain offers a plethora of carefully curated, structured, semantic resources, along with well-established entity extraction and linking tools, and search topics that intuitively require a number of different inferential processes (e.g., conceptual similarity, conceptual implication, etc.). We argue that integrating semantic inference in information retrieval has the potential to uncover a large amount of information that otherwise would be inaccessible; but inference is also risky and, if not used cautiously, can harm retrieval.
Abstract:
The Supreme Court of the United States in Feist v. Rural (Feist, 1991) specified that compilations or databases, and other works, must have a minimal degree of creativity to be copyrightable. The significance and global diffusion of the decision is only matched by the difficulties it has posed for interpretation. The judgment does not specify what is to be understood by creativity, although it does give a full account of the negative of creativity, as ‘so mechanical or routine as to require no creativity whatsoever’ (Feist, 1991, p.362). The negative of creativity as highly mechanical has particularly diffused globally.
A recent interpretation has correlated ‘so mechanical’ (Feist, 1991) with an automatic mechanical procedure or computational process, using a rigorous exegesis to correlate fully the two uses of mechanical. The negative of creativity is then understood as an automatic computation and as a highly routine process. Creativity itself is conversely understood as non-computational activity, above a certain level of routinicity (Warner, 2013).
The distinction between the negative of creativity and creativity is strongly analogous to an independently developed distinction between forms of mental labour, between semantic and syntactic labour. Semantic labour is understood as human labour motivated by considerations of meaning and syntactic labour as concerned solely with patterns. Semantic labour is distinctively human while syntactic labour can be directly humanly conducted or delegated to machine, as an automatic computational process (Warner, 2005; 2010, pp.33-41).
The value of the analogy is to greatly increase the intersubjective scope of the distinction between semantic and syntactic mental labour. The global diffusion of the standard for extreme absence of copyrightability embodied in the judgment also indicates the possibility that the distinction fully captures the current transformation in the distribution of mental labour, where syntactic tasks which were previously humanly performed are now increasingly conducted by machine.
The paper has substantive and methodological relevance to the conference themes. Substantively, it is concerned with human creativity, with rationality as not reducible to computation, and has relevance to the language myth, through its indirect endorsement of a non-computable or not mechanical semantics. These themes are supported by the underlying idea of technology as a human construction. Methodologically, it is rooted in the humanities and conducts critical thinking through exegesis and empirically tested theoretical development.
References
Feist. (1991). Feist Publications, Inc. v. Rural Tel. Service Co., Inc. 499 U.S. 340.
Warner, J. (2005). Labor in information systems. Annual Review of Information Science and Technology. 39, 2005, pp.551-573.
Warner, J. (2010). Human Information Retrieval (History and Foundations of Information Science Series). Cambridge, MA: MIT Press.
Warner, J. (2013). Creativity for Feist. Journal of the American Society for Information Science and Technology. 64, 6, 2013, pp.1173-1192.