One-stop-shop workflow repository


Background

BioExcel has developed several workflows and workflow building blocks. These are exemplified for multiple workflow systems, including Common Workflow Language (CWL), PyCOMPSs and Jupyter Notebook, and are also available as containers and virtual machine images. The workflow definitions are currently available as a series of GitHub repositories, but there is not yet a user-level presentation that lets users browse them easily before deciding whether to try executing a workflow.

Workflow repositories like myExperiment could be used to publish the BioExcel workflows, as they include a graphical view of each workflow and rich annotations of its use. However, the design of the BioExcel workflows has highlighted a new modality, with multiple languages and containers representing the same conceptual workflow. In addition, the aging myExperiment code base, managed by BioExcel partner The University of Manchester, does not have any particular support for CWL or PyCOMPSs.

The eScience Lab at The University of Manchester has in other projects developed the SEEK4Science platform, which provides a structured digital asset repository for projects and collaborations. This platform is successfully used by projects like FAIRDOM.

Mini-project

In this BioExcel mini-project we are extending the SEEK platform to add initial support for workflows. Unlike myExperiment, this support will work more like a catalogue (e.g. with links to GitHub and Docker Hub) than a repository where workflow definition files are deposited.

This will then form the basis for a BioExcel-customized SEEK installation for presenting the workflow building blocks and examples. The installation will allow multiple platform representations of the same workflow, which can be organized alongside related SEEK entries for the BioExcel pilot use cases and their results.
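The catalogue idea described above can be sketched as a small data model: one entry per conceptual workflow, holding links to its platform-specific representations rather than the workflow files themselves. This is only an illustrative sketch and not SEEK's actual data model; the class names, field names and example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Representation:
    """One concrete form of a workflow, hosted elsewhere and only linked to."""
    platform: str  # e.g. "CWL", "PyCOMPSs", "Jupyter Notebook", "Docker"
    url: str       # external location (placeholder URLs are used below)


@dataclass
class CatalogueEntry:
    """A conceptual workflow with links to its platform-specific forms."""
    title: str
    description: str = ""
    representations: List[Representation] = field(default_factory=list)

    def add(self, platform: str, url: str) -> None:
        """Register a link to one more representation of this workflow."""
        self.representations.append(Representation(platform, url))

    def platforms(self) -> List[str]:
        """Return the distinct platforms this workflow is available for."""
        return sorted({r.platform for r in self.representations})

    def links_for(self, platform: str) -> List[str]:
        """Return all links registered for the given platform."""
        return [r.url for r in self.representations if r.platform == platform]


# Hypothetical entry: the same conceptual workflow in three forms.
entry = CatalogueEntry(title="Example workflow")
entry.add("CWL", "https://example.org/repo/workflow.cwl")
entry.add("PyCOMPSs", "https://example.org/repo/workflow.py")
entry.add("Docker", "https://example.org/images/workflow")

print(entry.platforms())
```

Because an entry stores only links, the workflow definitions stay in their existing GitHub repositories and container registries, and the catalogue groups them under a single browsable record.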

Collaborations

The initial development, which adds workflow support to SEEK, will be further extended from 2019 onwards in emerging EU projects:

  • IBISBA (H2020 730976) is developing functionality for executing CWL and KNIME workflows for its SEEK-based IBISBAHub.
  • EOSC-Life (H2020 824087) will use SEEK as a starting point to develop a mechanism for federated publishing and sharing of workflows across the European Open Science Cloud for execution on shared computational resources.

This mini-project is a stepping stone for publicizing BioExcel workflows to the larger community of European scientific workflow cloud infrastructure.