Donia Scott and Cecile Paris
From a generation point of view, our goal is to identify the appropriate mappings between the semantics to be conveyed and expressions in language, in the context of multilingual instruction generation. We approach this problem by identifying how the relationships among the various components of the task the reader is being instructed about are realised in text. Using corpus analysis to study this issue is tricky because there is a real danger of circularity: if the underlying semantic relations (or styles) are identified on the basis of surface features of the text, any conclusions about how these semantics are then expressed in text are invalid. In this paper, we explain why it is necessary to go beyond the text to address this problem, and show how we have applied this method in our work.