In recent years, the scientific activities of our organisation in large research projects have shown a shift in priority from the integration of models to the integration of the data itself. Our work in several large projects on integrated modelling for impact assessment studies has clearly shown the importance of data availability for integrated modelling, but of no less importance is the integration, or alignment, of the required input data. Having moved from the fairly technical model integration in OpenMI and OpenMI-related projects towards basic semantic integration in the SEAMLESS and SENSOR projects, our focus is now shifting towards researching and applying techniques such as Semantic Web technologies to improve the discoverability and integration of data and, in the future, to support reasoning about the resulting integrated knowledge. This paper presents an overview of the ongoing work in our European 7th Framework Programme (FP7) project TREES4FUTURE, focussing on the automated harvesting of forestry-related data sets and the enrichment of their metadata to improve searchability; the FP7 LIAISE Network of Excellence, on linking impact assessment instruments such as models and data to sustainability expertise; and the FP7 research project SEMAGROW, on developing visions for processing and querying large RDF triple stores of integrated agricultural data. Ultimately, we aim to bring the results of all these projects together to achieve a next step in integrated modelling, and to present ways of using Natural Language Processing-based methods to help provide metadata.