Enrichment of the Phenotypic and Genotypic Data Warehouse analysis using Question Answering systems to facilitate the decision making process in cereal breeding programs


Author(s): Peral Cortés, Jesús; Ferrández, Antonio; Gregorio Medrano, Elisa de; Trujillo Mondéjar, Juan Carlos; Maté Morga, Alejandro; Ferrández, Luis José
Contributor(s)

Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos

Procesamiento del Lenguaje y Sistemas de Información (GPLSI)

Lucentia

Date(s)

20/05/2014

20/05/2014

15/05/2014

Abstract

There is currently an overwhelming number of scientific publications in the Life Sciences, especially in Genetics and Biotechnology. This huge amount of information is structured in corporate Data Warehouses (DW) or in Biological Databases (e.g. UniProt, RCSB Protein Data Bank, CEREALAB or GenBank), whose main drawback is the cost of updating them, which means they easily become obsolete. Nevertheless, these Databases are the main tool for enterprises that want to update their internal information, for example when a plant breeding enterprise needs to enrich its genetic information (internal structured Database) with recently discovered genes related to specific phenotypic traits (external unstructured data) in order to choose the desired parental lines for breeding programs. In this paper, we propose to complement the internal information with external data from the Web using Question Answering (QA) techniques. We go a step further by providing a complete framework for integrating unstructured and structured information by combining traditional Database and DW architectures with QA systems. The great advantage of our framework is that decision makers can instantly compare internal data with external data from competitors, which allows them to make quick strategic decisions based on richer data.

This paper has been partially supported by the MESOLAP (TIN2010-14860) and GEODAS-BI (TIN2012-37493-C03-03) projects from the Spanish Ministry of Education and Competitiveness. Alejandro Maté is funded by the Generalitat Valenciana under an ACIF grant (ACIF/2010/298).

Identifier

Ecological Informatics. 2014, Accepted Manuscript. doi:10.1016/j.ecoinf.2014.05.003

1574-9541 (Print)

1878-0512 (Online)

http://hdl.handle.net/10045/37202

10.1016/j.ecoinf.2014.05.003

Language(s)

eng

Publisher

Elsevier

Relation

http://dx.doi.org/10.1016/j.ecoinf.2014.05.003

Rights

info:eu-repo/semantics/openAccess

Keywords #Business intelligence #Data warehouse #Question answering #Information extraction #Information retrieval #Genetic information #Lenguajes y Sistemas Informáticos
Type

info:eu-repo/semantics/article