K-span: Open and reproducible spatial analytics using scientific workflows

Abstract

This paper describes the design, development, and testing of a general-purpose scientific-workflows tool for spatial analytics. Spatial analytics processes are frequently complex, both conceptually and computationally. Adaptation, documention, and reproduction of bespoke spatial analytics procedures represents a growing challenge today, particularly in this era of big spatial data. Scientific workflow systems hold the promise of increased openness and transparency with improved automation of spatial analytics processes. In this work, we built and implemented a KNIME spatial analytics (“K-span”) software tool, an extension to the general-purpose open-source KNIME scientific workflow platform. The tool augments KNIME with new spatial analytics nodes by linking to and integrating a range of existing open-source spatial software and libraries. The implementation of the K-span system is demonstrated and evaluated with a case study associated with the original process of construction of the Australian national DEM (Digital Elevation Model) in the Greater Brisbane area of Queensland, Australia by Geoscience Australia (GA). The outcomes of translating example spatial analytics process into a an open, transparent, documented, automated, and reproducible scientific workflow highlights the benefits of using our system and our general approach. These benefits may help in increasing users’ assurance and confidence in spatial data products and in understanding of the provenance of foundational spatial data sets across diverse uses and user groups.

Publication
Frontiers in Earth Science