Computational workflows are widely used in data analysis, enabling innovation and decision-making. In many domains (such as bioinformatics, image analysis, and radio astronomy) the analysis components are numerous and written in several different programming languages by third parties.

However, many competing workflow systems exist, which severely limits the portability of such workflows. This hinders their transfer between systems, projects, and settings, leads to vendor lock-in, and limits their general reusability.

Here we present the Common Workflow Language (CWL) project, which produces free and open standards for describing workflows based on command-line tools. The CWL standards provide a common but reduced set of abstractions that are both used in practice and implemented in many popular workflow systems.

CWL is declarative: it expresses computational workflows constructed from diverse software tools, each executed through its command-line interface. Being explicit about the runtime environment and any use of software containers enables portability and reuse.
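
To make the declarative idea concrete, the following Python sketch shows how a tool could be described as plain data and turned into a command line by a generic runner. It is an illustration under stated assumptions only: the `TOOL` dictionary, its field names (`baseCommand`, `container`, `position`, `flag`), and the helpers `build_command` and `wrap_in_container` are hypothetical and are not the CWL schema or any CWL implementation. Real CWL descriptions are written in YAML or JSON and interpreted by a CWL-compliant workflow system.

```python
from __future__ import annotations

from typing import Any

# Hypothetical, simplified tool description; NOT the real CWL schema.
# It declares what to run and in which environment, not how to run it.
TOOL: dict[str, Any] = {
    "baseCommand": ["grep"],
    "container": "debian:stable-slim",  # assumed image name, for illustration
    "inputs": {
        "ignore_case": {"position": 0, "flag": "-i"},  # boolean switch
        "pattern": {"position": 1},
        "file": {"position": 2},
    },
}


def build_command(tool: dict[str, Any], job: dict[str, Any]) -> list[str]:
    """Combine a declarative tool description with concrete input values."""
    argv = list(tool["baseCommand"])
    # Order the declared inputs by their position on the command line.
    ordered = sorted(tool["inputs"].items(), key=lambda item: item[1]["position"])
    for name, binding in ordered:
        if name not in job:
            continue  # input not supplied for this run
        value = job[name]
        if "flag" in binding:
            if value:  # emit the switch only when the boolean is true
                argv.append(binding["flag"])
        else:
            argv.append(str(value))
    return argv


def wrap_in_container(tool: dict[str, Any], argv: list[str]) -> list[str]:
    """Prefix the command with a container invocation if an image is declared."""
    image = tool.get("container")
    if image is None:
        return argv
    return ["docker", "run", "--rm", image, *argv]


if __name__ == "__main__":
    job = {"pattern": "workflow", "file": "notes.txt", "ignore_case": True}
    command = build_command(TOOL, job)
    print(command)
    print(wrap_in_container(TOOL, command))
```

Because the description is data rather than a script, the same description could be handed to different runners (local, containerized, or cluster-based) without changing it, which is the portability argument made above.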

A workflow written according to the CWL standards is a reusable description of an analysis that is runnable on a diverse set of computing environments. These descriptions contain enough information for advanced optimization without additional input from workflow authors.

The CWL standards support polylingual workflows and enable their portability and reuse. This eases, for example, scholarly publication, the fulfilment of regulatory requirements, and collaboration within and between academic research and industry, while reducing implementation costs.

CWL has been taken up by a wide variety of domains and industries, and support has been implemented in many major workflow systems.

Citation

Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, The CWL Community (2022):
Methods Included: Standardizing Computational Reuse and Portability with the Common Workflow Language.
Communications of the ACM 65(6)
https://doi.org/10.1145/3486897
preprint: arXiv:2105.07028

About the author

Stian works in the School of Computer Science at the University of Manchester, in Carole Goble’s eScience Lab, as a technical software architect and researcher. In addition to BioExcel, Stian’s involvements include Open PHACTS (pharmacological data warehouse), Common Workflow Language (CWL), Apache Taverna (scientific workflow system), Linked Data and identifiers, research objects (open science) and digital preservation, myExperiment (sharing scientific workflows), provenance (where did things come from and who did it) and annotations (who said what). ORCID: orcid.org/0000-0001-9842-9718