Join

The Join operation combines multiple input datasets into one output dataset by matching them on one or more common keys (columns). The platform supports the following join types:

- Inner
- Left
- Right
- Cross
- Natural
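
To make the difference between these join types concrete, here is a minimal sketch in plain Python. It is not the platform's implementation: the `customers` and `orders` datasets, the `join` and `natural_join` helpers, and the `left.`/`right.` column prefixes are all hypothetical, and the natural join follows the usual SQL meaning (an inner join on every column name the two datasets share), which is assumed rather than confirmed for this platform.

```python
from itertools import product

# Hypothetical sample datasets (lists of row dicts); values are illustrative only.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cleo"},
]
orders = [
    {"customer_id": 2, "amount": 50},
    {"customer_id": 3, "amount": 75},
    {"customer_id": 4, "amount": 20},
]


def join(left, right, keys, how="inner"):
    """Join two lists of row dicts on equality of the given key columns."""
    if how == "right":
        # A right join is a left join with the inputs swapped.
        return join(right, left, keys, how="left")
    if how == "cross":
        # Cross join: every left row paired with every right row; keys are ignored.
        # Columns are prefixed so that same-named columns do not overwrite each other.
        return [
            {
                **{f"left.{k}": v for k, v in l.items()},
                **{f"right.{k}": v for k, v in r.items()},
            }
            for l, r in product(left, right)
        ]
    # Columns that exist only on the right side, used to pad unmatched left rows.
    right_only = [c for c in (right[0] if right else {}) if c not in keys]
    rows = []
    for l in left:
        matches = [r for r in right if all(l[k] == r[k] for k in keys)]
        if matches:
            rows.extend({**l, **r} for r in matches)
        elif how == "left":
            # Keep the unmatched left row and fill right-side columns with None.
            rows.append({**l, **{c: None for c in right_only}})
    return rows


def natural_join(left, right):
    """Inner join on every column name the two datasets share (SQL-style assumption)."""
    if not left or not right:
        return []
    shared = [c for c in left[0] if c in right[0]]
    return join(left, right, shared, how="inner")


inner = join(customers, orders, ["customer_id"], how="inner")
left_rows = join(customers, orders, ["customer_id"], how="left")
right_rows = join(customers, orders, ["customer_id"], how="right")
cross = join(customers, orders, [], how="cross")
natural = natural_join(customers, orders)
```

With this sample data, the inner and natural joins keep only customers 2 and 3, the left join also keeps customer 1 with an empty `amount`, the right join also keeps the order for customer 4 with an empty `name`, and the cross join returns every customer paired with every order.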

Each join is also defined by one or more conditions, each matching a column from the first dataset to a column from the second dataset:
- is equal to
- is not equal to
- is greater than / is greater than or equal to
- is less than / is less than or equal to
- Contains
- Starts with
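
The conditions listed above can be thought of as predicates applied to a pair of column values. The sketch below illustrates that idea in plain Python; it is not the platform's API. The `CONDITIONS` mapping, the `conditional_join` helper, and the sample data are hypothetical, and two behaviors are assumptions: conditions are combined with a logical AND, and "contains" / "starts with" test whether the first dataset's value contains or starts with the second dataset's value.

```python
import operator

# Hypothetical mapping from condition labels to Python predicates.
# Each predicate receives (value from first dataset, value from second dataset).
CONDITIONS = {
    "is equal to": operator.eq,
    "is not equal to": operator.ne,
    "is greater than": operator.gt,
    "is greater than or equal to": operator.ge,
    "is less than": operator.lt,
    "is less than or equal to": operator.le,
    # Assumed direction: the first value contains / starts with the second.
    "contains": lambda a, b: str(b) in str(a),
    "starts with": lambda a, b: str(a).startswith(str(b)),
}


def conditional_join(left, right, conditions):
    """Inner join returning (left_row, right_row) pairs that satisfy every condition.

    Each condition is a (left_column, label, right_column) tuple.
    Conditions are combined with AND, which is an assumption of this sketch.
    """
    if not conditions:
        raise ValueError("at least one join condition is required")
    return [
        (l, r)
        for l in left
        for r in right
        if all(CONDITIONS[label](l[lc], r[rc]) for lc, label, rc in conditions)
    ]


# Hypothetical sample data.
products = [{"sku": "AB-100", "price": 30}, {"sku": "CD-200", "price": 80}]
rules = [{"prefix": "AB", "max_price": 50}, {"prefix": "CD", "max_price": 50}]

matches = conditional_join(
    products,
    rules,
    [
        ("sku", "starts with", "prefix"),
        ("price", "is less than or equal to", "max_price"),
    ],
)
```

Here only the `AB-100` product matches: its `sku` starts with `AB` and its price does not exceed `max_price`, while `CD-200` satisfies the prefix condition but fails the price condition.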

To access this option, select a dataset, then select the Join operation in the left sidebar. Once the icon is highlighted, select the other dataset you want to join with the first one; a pop-up then appears automatically on your screen.

Join operation in the flow

This pop-up lets you configure all of the operation's settings:

  • Select the join type to apply
  • Add as many conditions as you need, or remove them, as long as at least one remains
  • Select the columns needed from both datasets
  • Define the output name, the operation name and description
  • Define the persistence settings

Join operation settings tab

Tip

The Columns tab lets you select the columns to keep in the output dataset.

Join operation column selection tab

After setting up your join operation, select the Create recipe and Run it now button. A green gear, linked to the input datasets and the newly created output dataset, then appears in your project's flow.

Join operation submitting button

Join operation gear and output dataset on the flow

Here is a video showcasing the join operation