Data Integration using Airbyte

Airbyte is an open-source data integration platform that enables organizations to ingest and synchronize data from diverse sources into centralized storage systems such as data warehouses, data lakes, and databases. Airbyte uses prebuilt connectors to create scheduled synchronization jobs that extract data from configured sources and replicate it to destinations.

Calibo Accelerate integrates with Airbyte to run existing data integration jobs without any additional configuration. The connections are created in Airbyte, and their sources and destinations are defined there. You can select the source and destination combination that suits your use case and run an integration job for it through the Calibo Accelerate platform.
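For orientation, running a job for an existing Airbyte connection might be sketched as follows. This is a minimal sketch and does not describe how Calibo Accelerate calls Airbyte internally. It assumes the Airbyte public API jobs endpoint (a POST to /jobs with a connectionId and a jobType). The base URL, the function name build_sync_request, the token, and the connection ID are illustrative placeholders.

```python
import json
import urllib.request

# Assumption: base URL of the Airbyte API; a self-hosted instance will differ.
AIRBYTE_API_URL = "https://api.airbyte.com/v1"


def build_sync_request(connection_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking Airbyte to sync one connection.

    The connection, with its source and destination, must already exist in
    Airbyte; only its ID is referenced here.
    """
    payload = {"connectionId": connection_id, "jobType": "sync"}
    return urllib.request.Request(
        url=f"{AIRBYTE_API_URL}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would submit the job. The sketch stops short of sending it so that it can be inspected without network access or credentials.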

Prerequisites

To run a data integration job using Airbyte, you must complete the following prerequisites:

  • Get access to an Airbyte configuration listed under Configuration > Cloud Platform Tools & Technologies > Data Integration and Data Transformation.

  • Identify the combination of source and destination that you want to use for the data integration job using Airbyte.

To create a data integration job using Airbyte

  1. Add a data integration stage to your pipeline.

  2. Add an Airbyte node to the stage, provide the following information, and click Save:

    • Technology Title - Provide a name for the technology that you are adding. The title will be visible on the added node.

    • Airbyte Instance - Select an Airbyte configuration from the dropdown list.

  3. Click the Airbyte node. The node shows a Not Configured icon in the top left corner.

  4. Click Create a job. This creates a job using an existing Airbyte connection.

  5. Complete the following steps to create the job:

  6. To view the implicit connections, do the following:

    1. On the Airbyte node, click the expand icon to show the implicit nodes. The source and target nodes are displayed automatically, joined to the Airbyte node by dotted-line connectors.

    2. In the pipeline, click the implicit source node. The configuration opens in read-only mode and cannot be edited.

    3. In the pipeline, click the implicit target node. The configuration opens in read-only mode and cannot be edited.

    Implicit Nodes

    The supported source and target nodes that are automatically added to an Airbyte integration node are called implicit nodes. The connections to the implicit nodes (implicit source node to Airbyte, and Airbyte to implicit target node) are displayed as dotted lines.

    Explicit Nodes

    The Airbyte node in a data integration pipeline is an explicit node. Unlike other data integration nodes, this node cannot be connected to any nodes other than its implicit source and target nodes.
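The relationship between the explicit node and its implicit nodes can be pictured with a small data model. This is an illustrative sketch only, not the platform's actual implementation. The class names ImplicitNode and AirbyteNode, the field names, and the example source and target names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplicitNode:
    """Source or target added automatically; frozen to mirror read-only mode."""
    role: str  # "source" or "target"
    name: str


@dataclass
class AirbyteNode:
    """Explicit node that connects only to its own implicit source and target."""
    title: str
    source: ImplicitNode
    target: ImplicitNode

    def connections(self) -> list[tuple[str, str, str]]:
        # Each tuple is (from, to, line style); implicit links are dotted.
        return [
            (self.source.name, self.title, "dotted"),
            (self.title, self.target.name, "dotted"),
        ]


# Hypothetical example: the source and target come from the Airbyte connection.
node = AirbyteNode(
    title="Airbyte sync",
    source=ImplicitNode("source", "Postgres"),
    target=ImplicitNode("target", "Snowflake"),
)
```

Calling node.connections() yields the two dotted links, and attempting to assign to a field of an implicit node raises an error, which mirrors the read-only behavior of the implicit node configuration.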