Working Group on Big Data Harvesting
Large amounts of footprint-relevant data are available in the public domain, but not in a ready-to-use and validated form for product footprints. Automated procedures can harvest the data from the current disparate and incompatible databases and raw data sources, and place each piece of harvested data in the relevant database context.
This Working Group is tasked with developing, maintaining and improving the necessary data harvesting algorithms. Considerable experience is already available from the ecoinvent and EXIOBASE database projects on the relationship between the footprinting context and different raw data sources, as well as on the interdependencies between different data sources and formats. However, this experience needs a more explicit conceptual description before it can be made available for general use and for automated data harvesting.
The data harvesting algorithms can be used not only on raw data, but also on data in existing footprint databases. They thereby provide an interface for interoperability between existing databases and avoid the current lock-in to specific data formats.
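As an illustration only, the idea of harvesting from incompatible sources into one shared context can be sketched in Python. Everything named here is a hypothetical assumption made for the sketch: the `Flow` record, the two source layouts and their field names are not taken from ecoinvent, EXIOBASE or any BONSAI specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical common record that every harvested datum is mapped into."""
    activity: str
    product: str
    amount: float
    unit: str
    source: str  # keeps a reference to where the datum came from


def from_source_a(row: dict) -> Flow:
    # Assumed layout of source A: a dict with 'process', 'flow', 'value', 'unit'.
    return Flow(row["process"], row["flow"], float(row["value"]), row["unit"], "source_a")


def from_source_b(row: tuple) -> Flow:
    # Assumed layout of source B: (sector, commodity, amount in tonnes).
    sector, commodity, tonnes = row
    # Convert tonnes to kg so both sources share one unit.
    return Flow(sector, commodity, float(tonnes) * 1000.0, "kg", "source_b")


def harvest(rows_a, rows_b) -> list:
    """Map rows from both assumed sources into the common record type."""
    return [from_source_a(r) for r in rows_a] + [from_source_b(r) for r in rows_b]


flows = harvest(
    [{"process": "steel making", "flow": "steel", "value": "2.5", "unit": "kg"}],
    [("steel sector", "steel", 2)],
)
```

Because each source needs only one adapter into the common record, an existing footprint database can be treated as just another source, which is the sense in which such adapters could serve as an interoperability interface.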
Some further development work is also needed to ensure adequate and flexible procedures for a number of issues: source referencing; inclusion of model uncertainty; inclusion of data on product taxes and product losses; mass balancing; data forecasting; handling of activity and product hierarchy issues and of disaggregation of activity datasets when more detailed data becomes available; modelling of the internal relationships between inputs and outputs of activity datasets; and propagation of data between activity datasets.
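To make one of these issues concrete, a mass balance check on a single activity dataset might look like the following minimal Python sketch. The function names, the 1 % relative tolerance and the example figures are assumptions chosen for illustration, not an agreed procedure of the Working Group.

```python
def mass_balance_residual(inputs_kg: dict, outputs_kg: dict) -> float:
    """Return total input mass minus total output mass (both in kg)."""
    return sum(inputs_kg.values()) - sum(outputs_kg.values())


def is_balanced(inputs_kg: dict, outputs_kg: dict, rel_tol: float = 0.01) -> bool:
    """Check whether the residual is within rel_tol of the total input mass.

    The 1 % default tolerance is an arbitrary illustrative choice.
    """
    total_in = sum(inputs_kg.values())
    residual = mass_balance_residual(inputs_kg, outputs_kg)
    if total_in == 0:
        # With no inputs, the dataset balances only if there are no outputs either.
        return residual == 0
    return abs(residual) <= rel_tol * total_in


# Illustrative (made-up) activity dataset, including product losses as an output.
inputs = {"iron ore": 1.5, "coke": 0.5}
outputs = {"steel": 1.0, "slag": 0.75, "losses": 0.25}

balanced = is_balanced(inputs, outputs)
unbalanced = is_balanced(inputs, {"steel": 1.0})
```

A non-zero residual would flag the dataset for correction, for example by attributing the missing mass to unrecorded losses or emissions; how that attribution should be done is exactly the kind of procedure that still needs to be developed.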
The data harvesting algorithms are to be described and implemented, one at a time, using science-based prioritisation (cf. the Working Group on Uncertainty Assessment).
This Working Group is expected to be composed of experts from existing database projects, experts from different industry domains, and software experts with experience in “Big Data” harvesting.
This work may initially be formulated and financed as a publicly funded research project. This will allow the Working Group to start independently of the funding of the overall BONSAI organisation.