118 resultados para query extraction


Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular with respect to the reduction in annotation effort that the new query strategy allows to achieve, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the necessary high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrated that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% less tokens and concepts to manually annotate than with state-of-the-art query strategies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The growing importance and need of data processing for information extraction is vital for Web databases. Due to the sheer size and volume of databases, retrieval of relevant information as needed by users has become a cumbersome process. Information seekers are faced by information overloading - too many result sets are returned for their queries. Moreover, too few or no results are returned if a specific query is asked. This paper proposes a ranking algorithm that gives higher preference to a user’s current search and also utilizes profile information in order to obtain the relevant results for a user’s query.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The quality of discovered features in relevance feedback (RF) is the key issue for effective search query. Most existing feedback methods do not carefully address the issue of selecting features for noise reduction. As a result, extracted noisy features can easily contribute to undesirable effectiveness. In this paper, we propose a novel feature extraction method for query formulation. This method first extract term association patterns in RF as knowledge for feature extraction. Negative RF is then used to improve the quality of the discovered knowledge. A novel information filtering (IF) model is developed to evaluate the proposed method. The experimental results conducted on Reuters Corpus Volume 1 and TREC topics confirm that the proposed model achieved encouraging performance compared to state-of-the-art IF models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A building information model (BIM) provides a rich representation of a building's design. However, there are many challenges in getting construction-specific information from a BIM, limiting the usability of BIM for construction and other downstream processes. This paper describes a novel approach that utilizes ontology-based feature modeling, automatic feature extraction based on ifcXML, and query processing to extract information relevant to construction practitioners from a given BIM. The feature ontology generically represents construction-specific information that is useful for a broad range of construction management functions. The software prototype uses the ontology to transform the designer-focused BIM into a construction-specific feature-based model (FBM). The formal query methods operate on the FBM to further help construction users to quickly extract the necessary information from a BIM. Our tests demonstrate that this approach provides a richer representation of construction-specific information compared to existing BIM tools.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Building information modeling (BIM) is an emerging technology and process that provides rich and intelligent design information models of a facility, enabling enhanced communication, coordination, analysis, and quality control throughout all phases of a building project. Although there are many documented benefits of BIM for construction, identifying essential construction-specific information out of a BIM in an efficient and meaningful way is still a challenging task. This paper presents a framework that combines feature-based modeling and query processing to leverage BIM for construction. The feature-based modeling representation implemented enriches a BIM by representing construction-specific design features relevant to different construction management (CM) functions. The query processing implemented allows for increased flexibility to specify queries and rapidly generate the desired view from a given BIM according to the varied requirements of a specific practitioner or domain. Central to the framework is the formalization of construction domain knowledge in the form of a feature ontology and query specifications. The implementation of our framework enables the automatic extraction and querying of a wide-range of design conditions that are relevant to construction practitioners. The validation studies conducted demonstrate that our approach is significantly more effective than existing solutions. The research described in this paper has the potential to improve the efficiency and effectiveness of decision-making processes in different CM functions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We identify relation completion (RC) as one recurring problem that is central to the success of novel big data applications such as Entity Reconstruction and Data Enrichment. Given a semantic relation, RC attempts at linking entity pairs between two entity lists under the relation. To accomplish the RC goals, we propose to formulate search queries for each query entity α based on some auxiliary information, so that to detect its target entity β from the set of retrieved documents. For instance, a pattern-based method (PaRE) uses extracted patterns as the auxiliary information in formulating search queries. However, high-quality patterns may decrease the probability of finding suitable target entities. As an alternative, we propose CoRE method that uses context terms learned surrounding the expression of a relation as the auxiliary information in formulating queries. The experimental results based on several real-world web data collections demonstrate that CoRE reaches a much higher accuracy than PaRE for the purpose of RC.

Relevância:

20.00% 20.00%

Publicador: