To import a Galaxy workflow into Refinery, you first have to annotate the workflow. Only minimal annotation is required, and you can add it conveniently in the Galaxy workflow editor.
In a nutshell, you have to provide simple Python dictionaries (see examples below if you are not familiar with Python) in the “annotation” text fields for the workflow and corresponding tools. These fields can be found on the right side of the workflow editor.
Annotation fields must either be empty or contain correctly formatted annotation dictionaries as described below. If an annotation field contains anything else, you will not be able to import the workflow into Refinery.
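As a rough pre-check before importing, an annotation string can be tested outside Galaxy. The sketch below assumes that annotations are plain Python literals and parses them with the standard library's `ast.literal_eval`. The helper name `is_acceptable_annotation` is hypothetical, and Refinery's own parser may apply different or stricter rules.

```python
import ast


def is_acceptable_annotation(text):
    """Return True if text is empty or parses to a Python dictionary.

    Hypothetical helper: it only mirrors the rule stated above and is
    not part of Refinery or Galaxy.
    """
    stripped = text.strip()
    if stripped == "":
        # Empty annotation fields are allowed.
        return True
    try:
        value = ast.literal_eval(stripped)
    except (ValueError, SyntaxError, TypeError):
        # Free-text notes or malformed dictionaries end up here.
        return False
    return isinstance(value, dict)
```

Free-text notes such as "my notes about this step" would be rejected by this check, which matches the behavior described above for fields containing other information.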
For Refinery to recognize a Galaxy workflow as a Refinery Workflow, you need to provide a set of simple annotations in the workflow annotation field in the Galaxy workflow editor. The annotation field is listed under “Edit Attributes” on the right side of the workflow editor.
Note
The annotation fields in the Galaxy workflow editor behave slightly differently for workflow-level and tool-level annotations. In order to confirm changes to a workflow-level annotation, move the cursor to the end of the input field and hit the Return key. This is not required in tool-level annotation fields. Be sure to save the workflow after editing an annotation field.
The workflow-level annotation is a Python dictionary with the following keys:
"refinery_type": the type of the workflow, for example "analysis" or "download".
"refinery_relationships" (optional): describes relationships between inputs of the workflow. For example, a workflow that performs peak calling on ChIP-seq data requires that each ChIP file is associated with one input file (= genomic background). Each relationship is described using a dictionary with three fields: "category", "set1", and "set2".
Schematic workflow annotation (indentation only for better readability):
{
    "refinery_type": "<workflow_type>",
    "refinery_relationships": [
        {
            "category": "<relationship_type>",
            "set1": "<name_of_input_1>",
            "set2": "<name_of_input_2>"
        }
    ]
}
A standard analysis workflow with a single input would be annotated as follows:
{
    "refinery_type": "analysis"
}
A download workflow would be annotated like this:
{
    "refinery_type": "download"
}
A more complex analysis workflow with two inputs and a 1-1 relationship between them would be annotated as follows (the name fields of the two input datasets are set to “ChIP file” and “input file”, respectively):
{
    "refinery_type": "analysis",
    "refinery_relationships": [
        {
            "category": "1-1",
            "set1": "ChIP file",
            "set2": "input file"
        }
    ]
}
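The structure of a workflow-level annotation can also be checked with a few lines of Python before pasting it into Galaxy. This is a minimal sketch, not Refinery's actual validation: the function name `check_workflow_annotation` is hypothetical, and treating "refinery_type" as mandatory is an assumption based on the examples above.

```python
# Fields that every relationship dictionary carries in the examples above.
RELATIONSHIP_FIELDS = {"category", "set1", "set2"}


def check_workflow_annotation(annotation):
    """Return a list of problems found in a workflow-level annotation.

    Hypothetical helper; an empty list means no problem was detected.
    """
    problems = []
    # Assumption: every Refinery workflow declares its type.
    if "refinery_type" not in annotation:
        problems.append("missing 'refinery_type'")
    # "refinery_relationships" is optional, so default to no relationships.
    for index, relationship in enumerate(
            annotation.get("refinery_relationships", [])):
        missing = RELATIONSHIP_FIELDS - set(relationship)
        if missing:
            problems.append(
                f"relationship {index} lacks: {', '.join(sorted(missing))}")
    return problems


chip_seq = {
    "refinery_type": "analysis",
    "refinery_relationships": [
        {"category": "1-1", "set1": "ChIP file", "set2": "input file"}
    ],
}
print(check_workflow_annotation(chip_seq))
```

The check does not verify that "set1" and "set2" match the names of the input datasets in the workflow; that still has to be confirmed in the workflow editor.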
To import output files generated by a tool in the workflow into Refinery, the tool has to be annotated. To access the annotation field for a tool, click on the tool representation in the workflow editor. The annotation field is named “Annotation / Notes”.
Note
You have to annotate at least one tool and one output file. Workflows that do not declare outputs for import into Refinery will not be imported.
As in workflow-level annotations, the annotation needs to be provided as a Python dictionary. To import output files of the tool back into Refinery, the tool-level annotation dictionary needs to contain a key that matches the name of the output declared by the tool, for example "output_file".
This key must be associated with a further dictionary that provides a name, which will be used to import the file into Refinery. Optionally, you can provide a description to further explain the content of the output file, as well as a file type if the file extension provided by Galaxy is not sufficient to detect the actual file type automatically. This is typically the case when Galaxy uses “data” as the file extension.
Schematic tool annotation (indentation only for better readability):
{
    "<tool_output_1>": {
        "name": "<filename_1>",
        "description": "<description_1>",
        "type": "<extension_1>"
    },
    "<tool_output_2>": {
        "name": "<filename_2>",
        "description": "<description_2>",
        "type": "<extension_2>"
    }
}
The following example uses indentation for better readability. Indentation is not required.
{
    "output_narrow_peak": {
        "name": "spp_narrow_peak",
        "description": "",
        "type": "bed"
    },
    "output_region_peak": {
        "name": "spp_region_peak",
        "description": "",
        "type": "bed"
    },
    "output_plot_file": {
        "name": "spp_plot_file",
        "description": "",
        "type": "pdf"
    }
}
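Since every declared output needs a name while the description and type are optional, a tool-level annotation can be checked along the following lines. This is only a sketch under those stated rules: `check_tool_annotation` is a hypothetical helper, and whether Refinery accepts an empty name string is not stated above, so the sketch assumes it does not.

```python
def check_tool_annotation(annotation):
    """Return a list of problems found in a tool-level annotation.

    Hypothetical helper; it only checks that each declared output maps
    to a dictionary with a non-empty "name". The optional "description"
    and "type" entries are not inspected.
    """
    problems = []
    for output, details in annotation.items():
        if not isinstance(details, dict):
            problems.append(f"output '{output}' is not mapped to a dictionary")
        elif not details.get("name"):
            # Assumption: an empty name is treated like a missing one.
            problems.append(f"output '{output}' has no name")
    return problems


spp_outputs = {
    "output_narrow_peak": {"name": "spp_narrow_peak", "description": "", "type": "bed"},
    "output_region_peak": {"name": "spp_region_peak", "description": "", "type": "bed"},
    "output_plot_file": {"name": "spp_plot_file", "description": "", "type": "pdf"},
}
print(check_tool_annotation(spp_outputs))
```

The check cannot confirm that the keys match the outputs the tool actually declares; those names have to be read from the tool in the Galaxy workflow editor.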