Some "must know" concepts

Note: Part of this content has been extracted from the Galaxy project documentation.

What is a Workflow in Galaxy?

A workflow is a series of tools and dataset actions that run in sequence as a batch operation. Workflows are analyses intended to be executed (one or more times) with different user-provided input datasets. Workflows can be reused over and over, which not only reduces tedious work but also enhances reproducibility by applying exactly the same methods to all of your data.

  • Workflows can be created from scratch using the workflow editor.
  • Workflows can be annotated, viewed, shared, published, and imported - just like other Galaxy objects.
  • Any workflow that you have permissions to import, you can modify with the workflow editor.
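
The idea of a workflow as a reusable sequence of steps can be illustrated with a minimal Python sketch. This is only a conceptual model, not Galaxy's or Galaksio's implementation: the `run_workflow` function and the toy tools `trim` and `to_upper` are hypothetical names invented for this example, and datasets are represented as plain strings.

```python
from typing import Callable, List

# Assumption: a "tool" is modeled as a function from one dataset to another.
Tool = Callable[[str], str]


def run_workflow(steps: List[Tool], dataset: str) -> str:
    """Apply each step in order, feeding each output into the next step."""
    for step in steps:
        dataset = step(dataset)
    return dataset


# Toy stand-ins for real tools (illustrative only).
def trim(seq: str) -> str:
    """Remove leading and trailing 'N' characters."""
    return seq.strip("N")


def to_upper(seq: str) -> str:
    """Convert the sequence to upper case."""
    return seq.upper()


# The workflow is defined once...
workflow = [trim, to_upper]

# ...and reused unchanged on different user-provided inputs.
results = [run_workflow(workflow, d) for d in ["NNacgtNN", "ttgaN"]]
print(results)
```

Because the list of steps is fixed, every input goes through exactly the same methods, which is the reproducibility benefit described above.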

From the perspective of Galaksio, workflows must be designed by skilled users (e.g. a bioinformatician) using Galaxy's workflow editing tools.

Learn more about workflows.

Figure 1. Creating a workflow in Galaxy using the workflow editor.

Datasets in Galaxy

Datasets are the inputs and outputs of each step in an analysis project in Galaxy. The tracking information associated with the Datasets in a History represents an experimental record of the methods, parameters, and other inputs.

More info

Histories in Galaxy

When data is uploaded from your computer or analysis is done on existing data using Galaxy, each output from those steps generates a dataset. These datasets (and the output datasets from subsequent analysis on them) are stored by Galaxy in Histories.

The Current History

All users have one 'current' history, which can be thought of as a workspace or a current working directory in bioinformatics terms. Your current history is displayed on the right-hand side of the main 'Analyze Data' Galaxy interface, in what is called the history panel.

The history panel displays output datasets in the order in which they were created with the oldest/first shown at the bottom. As new analyses are done and new output datasets are generated, the newest datasets are added to the top of the history panel. In this way, the history panel displays the history of your analysis over time.
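
The ordering described above can be sketched with a small Python model. This is an illustration under stated assumptions, not Galaxy's actual data model: the `History` class, its `add` and `panel_view` methods, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class History:
    """Toy model of a history: datasets are kept in creation order."""
    name: str
    datasets: List[str] = field(default_factory=list)

    def add(self, dataset: str) -> None:
        """Record a new output dataset (the most recent one goes last)."""
        self.datasets.append(dataset)

    def panel_view(self) -> List[str]:
        """Return datasets newest first, as the history panel lists them."""
        return list(reversed(self.datasets))


history = History("example analysis")
history.add("reads.fastq")      # uploaded first, shown at the bottom
history.add("trimmed.fastq")
history.add("aligned.bam")      # created last, shown at the top

print(history.panel_view())
```

Reading the panel view from bottom to top therefore replays the analysis in the order it was performed.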

Users who have registered an account and logged in can have many histories, and the history panel allows switching between them and creating new ones. This can be useful for organizing different analyses.

More info

Figure 2. Galaxy history is simply the right panel of the interface.

Dataset collections

Collections are designed to simplify complex, multi-sample analyses. In Galaxy you perform data analyses and organize your data simply by clicking on things. In real-world analyses you never have just a few datasets; instead you have many (sometimes thousands), and Collections help you manage your data to minimize the amount of clicking you have to do.
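
The benefit of treating many datasets as one unit can be sketched in Python. This is a conceptual illustration only and does not reflect how Galaxy implements collections: the `map_over_collection` helper, the `gc_count` tool, and the sample names are hypothetical, and a collection is modeled as a plain dictionary.

```python
from typing import Callable, Dict


def map_over_collection(tool: Callable[[str], int],
                        collection: Dict[str, str]) -> Dict[str, int]:
    """Apply one tool to every dataset in the collection in a single call."""
    return {name: tool(data) for name, data in collection.items()}


def gc_count(seq: str) -> int:
    """Toy tool: count the G and C bases in a sequence."""
    return seq.count("G") + seq.count("C")


# Assumption: a collection is modeled as a mapping of sample name to data.
samples = {"sample_A": "ACGT", "sample_B": "GGCCA", "sample_C": "TT"}

# One operation on the collection replaces one operation per dataset.
gc_per_sample = map_over_collection(gc_count, samples)
print(gc_per_sample)
```

With thousands of samples the call stays the same, which mirrors how a collection saves the user from repeating the same clicks for each dataset.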

More info