KNIME is an open-source workflow system with an intuitive drag-and-drop graphical user interface (GUI). It is popular in cheminformatics for data analysis, statistics and visualization, and is heavily used by pharmaceutical companies. Its functionality is based on a set of modules (KNIME nodes) for data integration, which can be interconnected to build custom pipelines (workflows). KNIME currently offers more than 1500 free nodes, organized in sections, each dedicated to a specific task: data access, data manipulation, data analytics, etc. The list of nodes available in the GUI can also be extended by importing KNIME extensions. Much as the Galaxy workflow manager is biased towards genomics tools, KNIME's list of nodes is biased towards cheminformatics and data analytics. Only one implementation of Molecular Dynamics is available as KNIME nodes, and it is a commercial extension of the Schrödinger software suite.
BioExcel's workflow building blocks (biobb) are workflow manager-agnostic. Because they are implemented as Python wrappers, they can be used from any workflow manager. The only requirement is to integrate them into the tool by generating the building block description and its call specification/configuration. This has already been done for workflow managers suited to the HPC field (PyCOMPSs, Toil) and to the genomics field (Galaxy). BioExcel is now starting to port the biobb functionality to KNIME.
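The idea of a manager-agnostic building block can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the biobb API: the `BuildingBlock` class, its `describe` and `launch` methods, the argument names and the `fake_minimize` placeholder are all hypothetical. The point is that a block exposes a description (what a workflow manager needs to display and wire it) and a single call entry point (what the manager invokes).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BuildingBlock:
    """Hypothetical building block description (not the real biobb API)."""
    name: str
    inputs: List[str]    # names of input file arguments
    outputs: List[str]   # names of output file arguments
    run: Callable[[Dict[str, str], Dict[str, object]], int]
    properties: Dict[str, object] = field(default_factory=dict)

    def describe(self) -> Dict[str, object]:
        """Return the metadata a workflow manager needs to expose the block."""
        return {
            "name": self.name,
            "inputs": list(self.inputs),
            "outputs": list(self.outputs),
            "properties": dict(self.properties),
        }

    def launch(self, paths: Dict[str, str]) -> int:
        """Check that every declared file argument is given, then run."""
        missing = [key for key in self.inputs + self.outputs if key not in paths]
        if missing:
            raise ValueError(f"missing file arguments: {missing}")
        return self.run(paths, self.properties)


def fake_minimize(paths: Dict[str, str], properties: Dict[str, object]) -> int:
    # Placeholder standing in for a call to a real simulation tool.
    return 0


block = BuildingBlock(
    name="energy_minimization",
    inputs=["input_structure_path"],
    outputs=["output_structure_path"],
    run=fake_minimize,
    properties={"steps": 500},
)
```

Under this sketch, integrating the block into a given workflow manager would amount to translating the dictionary returned by `describe()` into that manager's own tool description format and having the manager call `launch()` with the file paths it controls.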
The KNIME GUI workspace is divided into three main blocks:
- Node Repository
- KNIME projects (workflows)
- Node Description
This project is creating a new biobb category in the KNIME Node Repository and populating it with all the building blocks generated by BioExcel. This requires Java code wrapping each building block and a complete description of each one, which will be shown in the Node Description section of the GUI. The newly developed nodes will then be available for KNIME users to build their own biomolecular simulation workflows, combining BioExcel biobb nodes with the nodes (>1500) already available in the Node Repository, including the chemistry-related ones (RDKit, Marvin, Vernalis, Schrödinger, 3D-e-Chem).
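As a rough illustration of the description-generation step, the Python sketch below turns a block's metadata into an XML fragment. It is a simplified example under stated assumptions: the function `node_description_xml`, the element names (`node`, `name`, `description`, `inPort`, `outPort`) and the `category` attribute are chosen for illustration and are not claimed to match the node description schema KNIME actually expects. The block name and port names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def node_description_xml(
    name: str, summary: str, in_ports: List[str], out_ports: List[str]
) -> str:
    """Build an illustrative XML description for one building block.

    Element and attribute names are assumptions made for this sketch.
    """
    root = ET.Element("node", {"category": "biobb"})
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "description").text = summary
    # One element per port, numbered in declaration order.
    for index, port in enumerate(in_ports):
        ET.SubElement(root, "inPort", {"index": str(index), "name": port})
    for index, port in enumerate(out_ports):
        ET.SubElement(root, "outPort", {"index": str(index), "name": port})
    return ET.tostring(root, encoding="unicode")


xml_text = node_description_xml(
    name="energy_minimization",
    summary="Hypothetical block that minimizes the energy of a structure.",
    in_ports=["input_structure_path"],
    out_ports=["output_structure_path"],
)
```

Generating the descriptions from the building block metadata, rather than writing them by hand, would keep the text shown in the Node Description panel consistent with the blocks themselves as they evolve.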