Operators
Last updated on 2025-07-05
Overview
Questions
- How do I perform operations, such as filtering, on channels?
- What are the different kinds of operations I can perform on channels?
- How do I combine operations?
- How can I use a CSV file to process data into a Channel?
Objectives
- Understand what Nextflow operators are.
- Modify the contents/elements of a channel using operators.
- Perform filtering and combining operations on a channel object.
- Use the splitCsv operator to parse the contents of a CSV file into a channel.
Operators
In the Channels episode we learnt how to create Nextflow channels to
enable us to pass data and values around our workflow. If we want to
modify the contents or behaviour of a channel, Nextflow provides methods
called operators. We have previously used the view operator to view the
contents of a channel. There are many more operator methods that can be
applied to Nextflow channels; they can be usefully separated into several groups:
- Filtering operators: reduce the number of elements in a channel.
- Transforming operators: transform the value/data in a channel.
- Splitting operators: split items in a channel into smaller chunks.
- Combining operators: join channels together.
- Maths operators: apply simple math functions on channels.
- Other: such as the view operator.
In this episode you will see examples, and get to use different types of operators.
Using Operators
To use an operator, the syntax is the channel name, followed by a dot (.),
followed by the operator name and brackets ().
view
The view operator prints the items emitted by a channel to the console,
appending a newline character to each item. We can also chain together the
channel factory method .of and the operator .view() using the dot notation.
Note: the view() operator doesn’t change the contents of the channel object.
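As a minimal sketch of this chaining, the script below creates a channel with the .of factory and prints it with view. The three values are placeholders chosen for illustration.

```groovy
workflow {
    // Channel.of creates a queue channel; view prints each item on its own line.
    Channel.of(1, 2, 3).view()
}
```

Run with nextflow run, this should print 1, 2 and 3 on separate lines.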
Task 8.1
To make code more readable we can split the operators over several lines. The blank space between the operators is ignored and is solely for readability.
Closures
An optional closure {} parameter can be specified to customise how items
are printed.
Briefly, a closure is a block of code that can be passed as an argument to
a function. In this way you can define a chunk of code and then pass it
around as if it were a string or an integer. By default a closure takes a
single implicit parameter named it (‘it’ is for ‘item’); inside a
double-quoted string its value is inserted by writing $it.
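A short sketch of a closure passed to view, using the implicit it parameter. The school names are hypothetical values. Newer Nextflow releases may prefer an explicitly named parameter such as { name -> "ID: $name" }, which behaves the same way here.

```groovy
workflow {
    Channel
        .of('school1', 'school2', 'school3')
        // The closure is called once per item; $it is replaced by the item.
        .view { "ID: $it" }
}
```

This should print ID: school1, ID: school2 and ID: school3, one per line.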
Task 8.2
For example, here we apply a closure to the queue channel to separate the first two columns of the CSV file as separate parameters and to group all remaining columns into a single list of parameters.
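The idea can be sketched as follows. This is an illustration only, not the code from 08_operators.nf: the row contents are made up, and each row is assumed to arrive as a list of column values.

```groovy
workflow {
    Channel
        // Each list stands in for one parsed CSV row (hypothetical data).
        .of(['school1', 'maths', 10, 20, 30],
            ['school2', 'maths', 40, 50, 60])
        // Keep columns 0 and 1 as separate elements; group the rest into one list.
        .map { row -> [row[0], row[1], row[2..-1]] }
        .view()
}
```

The first row is printed as [school1, maths, [10, 20, 30]]: two values followed by one list.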
Filtering operators
We can reduce the number of items in a channel by using filtering operators.
The filter operator allows you to get only the items emitted by a channel
that satisfy a condition and discard all the others. The filtering
condition can be specified by using either:
- a regular expression
- a literal value
- a data type qualifier, e.g. Number (any integer, float, …), String, Boolean
- or any boolean statement.
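The sketch below shows three of these condition types side by side. The channel contents are hypothetical, and the labels in the view closures are only there to tell the three outputs apart, since lines from different chains may interleave.

```groovy
workflow {
    // Hypothetical sample values, not taken from 08_operators.nf.
    def names = Channel.of('school1', 'school2', 'class10')
    def numbers = Channel.of(1, 2, 3, 4)

    // Regular expression: keeps 'school1' and 'school2'.
    names.filter(~/^school.*/).view { v -> "regex: $v" }

    // Literal value: keeps only 'class10'.
    names.filter('class10').view { v -> "literal: $v" }

    // Boolean statement in a closure: keeps the even numbers 2 and 4.
    numbers.filter { n -> n % 2 == 0 }.view { v -> "closure: $v" }
}
```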
Data type qualifier
Here we use the filter operator, specifying the data type qualifier Number,
so that only numeric items are returned. The Number data type includes both
integers and floating point numbers. We will then use the view operator to
print the contents.
To simplify the code we can chain multiple operators together, such as
filter and view, using a dot (.).
Chaining lets the previous example be written as a single expression. The blank space between the operators is ignored and is used for readability.
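A minimal sketch of the data type qualifier with filter and view chained over several lines. The mixed values are placeholders.

```groovy
workflow {
    Channel
        .of(1, 'school1', 2.5, 'school2', 3)
        // Only items that are instances of Number pass the filter.
        .filter(Number)
        .view()
}
```

Only 1, 2.5 and 3 should be printed; the two strings are discarded.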
Regular expression
We chain the .split() function to extract the school ID from the input
file name.
Task 8.3
Based on the example code in the 08_operators.nf file, explain the purpose
of the .split() function and its intended output. Use
nextflow run 08_operators.nf to run the workflow and inspect the process
output using the .view() operator.
Note that we pass the regular expression "_|\\." to .split() in order to
split the string at each underscore “_” or period “.”. The first piece is
then used to derive an input variable based on the school ID. This is
where generating file names dynamically as part of the workflow becomes
relevant, as file names can play an important role in managing the stream
of data.
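To make the effect of the regular expression concrete, here is a small sketch. The file name school123_period1.csv is a hypothetical example, not a name taken from the lesson data.

```groovy
workflow {
    Channel
        .of('school123_period1.csv')
        // Splitting on "_" or "." gives the pieces school123, period1 and csv;
        // index [0] selects the first piece.
        .map { name -> name.split("_|\\.")[0] }
        .view()
}
```

This should print school123.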
Modifying the contents of a channel
If we want to modify the items in a channel, we can use transforming operators.
Applying a function to items in a channel
The map operator applies a function of your choosing to every item in a
channel, and returns the items so obtained as a new channel. The function
applied is called the mapping function and is expressed with a closure {}.
We can also use the map operator to transform each element into a tuple.
We can change the default name of the closure parameter from it to a more
meaningful name, such as file, using ->. When we have multiple parameters
we can specify their names at the start of the closure,
e.g. file, key ->.
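A brief sketch of both ideas: a named closure parameter in map, and two named parameters when each item is a tuple. The values and the parameter names name and len are chosen for illustration.

```groovy
workflow {
    Channel
        .of('school1', 'school2')
        // One named parameter; each item becomes a two-element tuple.
        .map { name -> [name, name.length()] }
        // Two named parameters receive the two elements of each tuple.
        .view { name, len -> "$name has $len characters" }
}
```

This should print school1 has 7 characters and school2 has 7 characters.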
Task 8.4
Inspect the code in the file 08_operators.nf and explain the purpose of
the map operator on estimation_out.simulation_ch. How is it used to
transform the contents into a tuple with the file and the file’s name?
Write additional comments within the script. (Hint: Use the view operator
to inspect the channel contents.)
The simulation_ch output emits a tuple of elements as part of the
simulation output from the ESTIMATION process. The map operator takes the
first of these elements, indexed by [0], and uses a regular expression to
split the string on the first _ it encounters. For example, it takes
school123_period1 and returns school123, which allows us to generate a
school identifier.
Converting a list into multiple items
The flatten operator transforms a channel in such a way that every item in
a list or tuple is flattened, so that each single entry is emitted as a
sole element by the resulting channel.
This is similar to the channel factory Channel.fromList.
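A minimal sketch of flatten on a channel that emits two lists. The numbers are placeholders.

```groovy
workflow {
    Channel
        // Two items, each of which is a list.
        .of([1, 2, 3], [4, 5])
        // After flatten the channel emits five separate items.
        .flatten()
        .view()
}
```

This should print 1, 2, 3, 4 and 5 on separate lines.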
Converting the contents of a channel to a single list item
The reverse of the flatten operator is collect. The collect operator
collects all the items emitted by a channel into a list and returns the
resulting object as a sole emission. This can be extremely useful when
combining the results from the output of multiple processes, or a single
process run multiple times.
The result of the collect operator is a value channel and can be used
multiple times.
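As a small illustration with placeholder values, collect turns four separate items into one list:

```groovy
workflow {
    Channel
        .of(1, 2, 3, 4)
        // All items are gathered and emitted once, as a single list.
        .collect()
        .view()
}
```

This should print [1, 2, 3, 4] on a single line.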
Grouping contents of a channel by a key
The groupTuple operator collects tuples or lists of values by grouping
together the channel elements that share the same key. Finally it emits a
new tuple object for each distinct key collected.
If we know the number of items to be grouped we can use groupTuple with
the size parameter. When the specified size is reached, the tuple is
emitted. By default, incomplete tuples (i.e. with fewer than size grouped
items) are discarded.
This operator is useful for processing together all elements that share a common property or grouping key.
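The following sketch groups hypothetical [key, value] pairs by their first element and uses size: 2, so a group is emitted as soon as it holds two values.

```groovy
workflow {
    Channel
        // Hypothetical [key, value] pairs; school3 appears only once.
        .of(['school1', 'a'], ['school2', 'b'], ['school1', 'c'],
            ['school2', 'd'], ['school3', 'e'])
        // Group on the first element; emit each group once it has 2 values.
        .groupTuple(size: 2)
        .view()
}
```

This should print [school1, [a, c]] and [school2, [b, d]]. The school3 group never reaches two items, so it is discarded.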
Task 8.5
Inspect the code in the file 08_operators.nf and explain the purpose of
the groupTuple operator. How is it used to transform the contents into a
tuple? Write additional comments within the script. (Hint: Use the view
operator to inspect the channel contents.)
Merging Channels
Combining operators allow you to merge channels together. This can be useful when you want to combine the output channels from multiple processes.
mix
The mix operator combines the items emitted by two (or more) channels into
a single channel.
The items emitted by the resulting mixed channel may appear in any order, regardless of which source channel they came from. The same workflow can therefore print its items in a different order from one run to the next.
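A minimal sketch of mix with two placeholder channels. No expected output is given, because the order of the five items is not guaranteed.

```groovy
workflow {
    def numbers = Channel.of(1, 2, 3)
    def letters = Channel.of('a', 'b')

    // The mixed channel emits all five items; numbers and letters may interleave.
    numbers.mix(letters).view()
}
```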
Maths operators
The maths operators allow you to apply simple maths functions to channels.
The maths operators are:
- count
- min
- max
- sum
- toInteger
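The listed operators can be tried on a small channel as sketched below. The values are placeholders, and the labels in the view closures are added only to identify each result, since the five lines may print in any order.

```groovy
workflow {
    // Hypothetical values; in DSL2 the same channel can be read several times.
    def numbers = Channel.of(4, 1, 7, 3)

    numbers.count().view { n -> "count: $n" }   // count: 4
    numbers.min().view { n -> "min: $n" }       // min: 1
    numbers.max().view { n -> "max: $n" }       // max: 7
    numbers.sum().view { n -> "sum: $n" }       // sum: 15

    // toInteger converts string items to integers so that they can be summed.
    Channel.of('10', '20').toInteger().sum().view { n -> "converted sum: $n" }  // converted sum: 30
}
```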
Splitting items in a channel
Sometimes you want to split the content of an individual item in a channel, like a file or string, into smaller chunks that can be processed by downstream operators or processes, e.g. items stored in a CSV file.
Nextflow has a number of splitting operators that can achieve this:
- splitCsv: The splitCsv operator allows you to parse text items emitted by a channel that are formatted using the CSV format, and split them into records or group them into lists of records with a specified length.
- splitText: The splitText operator allows you to split multi-line strings or text file items, emitted by a source channel into chunks containing n lines, which will be emitted by the resulting channel.
splitCsv
The splitCsv operator allows you to parse text items emitted by a channel
that are formatted using the CSV format, and split them into records or
group them into lists of records with a specified length. This is useful
when you want to use a sample sheet.
In the simplest case, just apply the splitCsv operator to a channel
emitting CSV-formatted text files or text entries, for example the CSV
file effects.csv.
Applying the splitCsv() operator to a channel containing the CSV file
effects.csv parses the file and splits it into three elements, one per row.
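Because the contents of effects.csv are not shown here, the sketch below uses a made-up three-row CSV passed as inline text, which splitCsv can also parse. With a real file you would typically start from Channel.fromPath with the path to the file instead.

```groovy
workflow {
    Channel
        // Hypothetical CSV text standing in for a file such as effects.csv.
        .of('school1,0.5,0.1\nschool2,0.3,0.2\nschool3,0.7,0.4')
        // Each row becomes one item: a list of column values (as strings).
        .splitCsv()
        .view()
}
```

This should print three lists, starting with [school1, 0.5, 0.1].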
Accessing values
Values can be accessed by their positional indexes using the square
bracket syntax [index]. So to access the first column you would use [0].
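A short sketch of positional access, again using made-up inline CSV text rather than the lesson's file:

```groovy
workflow {
    Channel
        .of('school1,0.5,0.1\nschool2,0.3,0.2\nschool3,0.7,0.4')
        .splitCsv()
        // row[0] is the first column of each parsed row.
        .view { row -> row[0] }
}
```

This should print school1, school2 and school3.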
Column headers
When the CSV begins with a header line defining the column names, you can
specify the parameter header: true, which allows you to reference each
value by its name.
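The sketch below assumes a hypothetical CSV whose header line names the columns school and effect. These names are illustrative and are not taken from effects.csv.

```groovy
workflow {
    Channel
        // The first line is the header; the column names are assumptions.
        .of('school,effect\nschool1,0.5\nschool2,0.3')
        // With header: true each row is a map keyed by column name.
        .splitCsv(header: true)
        .view { row -> "${row.school}: ${row.effect}" }
}
```

This should print school1: 0.5 and school2: 0.3.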
Task 8.6
Inspect 08_operators.nf. How is params/effects.csv being parsed?
Each row of the CSV is read as a separate input. The closure passed to the map operator organises the inputs by column index. The first two columns are stored as separate elements, while the remaining columns are grouped into a list. The resulting input is a tuple of three elements: two values and one list.
More resources
See the operators documentation on the Nextflow web site.
Key Points
- Nextflow operators are methods that allow you to modify, set or view channels.
- Operators can be separated into several groups: filtering, transforming, splitting, combining, forking and maths operators.
- To use an operator, use the dot notation after the Channel object,
e.g. ESTIMATION.simulation_ch.view().
- You can parse text items emitted by a channel that are formatted using
the CSV format with the splitCsv operator.