Join
The Join operation combines multiple input datasets into one output dataset through one or more common keys (columns). The platform supports the following join types:
- Inner
- Left
- Right
- Cross
- Natural
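As a rough illustration of how these join types differ, here is a minimal sketch using pandas rather than the platform itself. The `orders` and `customers` tables and their columns are made up for the example, and the platform's exact semantics (in particular for the Natural join) may differ from what pandas does here.

```python
import pandas as pd

# Hypothetical sample data, not taken from the platform.
orders = pd.DataFrame({"customer_id": [1, 2, 4], "amount": [50, 20, 70]})
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Bo", "Cy"]})

# Inner: only keys present in both datasets.
inner = orders.merge(customers, on="customer_id", how="inner")

# Left: every row of the first dataset, with missing values where no match exists.
left = orders.merge(customers, on="customer_id", how="left")

# Right: every row of the second dataset, with missing values where no match exists.
right = orders.merge(customers, on="customer_id", how="right")

# Cross: every row of the first dataset paired with every row of the second.
cross = orders.merge(customers, how="cross")

# Natural (approximation): with no key given, pandas joins on all shared column names.
natural = orders.merge(customers)

print(len(inner), len(left), len(right), len(cross), len(natural))
```

With this sample data, the inner and natural joins keep only the two customers present on both sides, the left and right joins keep three rows each, and the cross join produces all nine pairings.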
Each join is also defined by one or more conditions that match a column from the first dataset to a column from the second dataset:
- is equal to
- is not equal to
- is greater than / is greater than or equal to
- is less than / is less than or equal to
- Contains
- Starts with
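Conditions other than equality can be pictured as a filter applied to candidate row pairs. The sketch below uses pandas with made-up `products` and `rules` tables to combine a "starts with" condition and an "is greater than or equal to" condition. It only illustrates the idea and assumes that all conditions must hold at once; it says nothing about how the platform evaluates conditions internally.

```python
import pandas as pd

# Hypothetical sample data, not taken from the platform.
products = pd.DataFrame({"sku": ["AB-100", "AB-200", "CD-300"], "price": [10, 25, 40]})
rules = pd.DataFrame({"prefix": ["AB", "CD"], "min_price": [20, 30]})

# Build every candidate pair of rows, then keep the pairs that satisfy the conditions.
pairs = products.merge(rules, how="cross")

# Condition 1: "sku" starts with "prefix" (evaluated row by row).
starts_with = pairs.apply(lambda row: row["sku"].startswith(row["prefix"]), axis=1)

# Condition 2: "price" is greater than or equal to "min_price".
greater_or_equal = pairs["price"] >= pairs["min_price"]

# Assumption: both conditions must hold for a pair to be kept.
matched = pairs[starts_with & greater_or_equal].reset_index(drop=True)
print(matched)
```

Here only `AB-200` and `CD-300` survive: `AB-100` matches its prefix but falls below the minimum price. Pairing every row first is easy to read but grows quickly with dataset size, so it is only suitable for small illustrations.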
To access this option, select a dataset, then select the Join operation in the left sidebar. Once the icon is highlighted, select the other dataset you want to join with the first one; a pop-up then appears automatically on your screen.
This pop-up lets you configure all the settings offered by this operation:
- Select the join type to apply
- Add as many conditions as you need and remove any of them, as long as at least one remains
- Select the columns needed on both datasets
- Define the output name, the operation name and description
- Define the persistence settings
Tip
The Columns tab lets you select the columns to keep in the output dataset.
After setting up your join operation, select the Create recipe and Run it now button. A green gear linked to the input datasets and the newly created output dataset then appears in your project's flow.
Here is a video showcasing the join operation