Nowadays, structured documents are marked up using XML. XML is the W3C standard that gives meaning to the stored content of a document by defining its logical structure. A logical structure can be exploited to provide focused access to structured documents. For instance, in XML Information Retrieval, the logical structure is used to retrieve the most relevant fragments within documents as answers to queries, instead of whole documents. A problem arises when the logical structure of a document cannot be defined automatically using the methodologies presented in the literature. This position paper addresses this situation and presents a possible solution adopted at the Enel SpA energy company.
Dominoni, M., Calegari, S. (2015). Structuring documents from short texts: The Enel SpA case study. In DATA 2015 - 4th International Conference on Data Management Technologies and Applications, Proceedings 2015 (pp.63-68). SciTePress [10.5220/0005498800630068].
Structuring documents from short texts: The Enel SpA case study
DOMINONI, MATTEO ALESSANDRO; CALEGARI, SILVIA
2015