1. Add your JSON access

  1. In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the JSON option from the list of connectors.

  2. Click Next and you’ll be prompted to add the following configs:

    1. Website URL: the URL from which the JSON payload will be extracted. The URL must return a single-page JSON payload.
    2. JSON path: the JSON path indicating which information should be extracted from the source. If you’re unsure how JSON path works, check the example below.
    3. Maximum number of samples for schema: the maximum number of records sampled to infer the schema. Set a higher number of samples for highly variable payloads; if the records are consistent across the payload, a small number of samples is enough.
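    To see why the sample size matters, here is a minimal sketch of schema inference from the first N records. This is plain Python for illustration only, not the connector’s actual implementation; the function name and behavior are assumptions:

    ```python
    def infer_schema(records, max_samples=100):
        """Illustrative only: map each field to the set of value types seen
        in up to max_samples records. Fields that never appear in the
        sampled records are missed entirely."""
        schema = {}
        for record in records[:max_samples]:
            for key, value in record.items():
                schema.setdefault(key, set()).add(type(value).__name__)
        return {key: sorted(types) for key, types in schema.items()}

    records = [
        {"id": 26747, "type": "House", "value": 1800},
        {"id": 774, "type": "Apartment", "value": 2400},
        {"id": 1553, "type": "Studio", "value": 1800, "parking": True},
    ]

    # With too few samples, the optional "parking" field is missed:
    print(infer_schema(records, max_samples=2))
    print(infer_schema(records, max_samples=3))
    ```

    This is why highly variable payloads (optional or rarely present fields) need a larger sample count.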

    As an example to understand how JSON path works, consider the following payload obtained from the Website URL:

        {
            "total_listings": 2069,
            "listings": [
                {
                    "id": 26747,
                    "type": "House",
                    "purpose": "commercial",
                    "transaction": "rent",
                    "value": 1800
                },
                {
                    "id": 774,
                    "type": "Apartment",
                    "purpose": "residential",
                    "transaction": "rent",
                    "value": 2400
                },
                {
                    "id": 1553,
                    "type": "Studio",
                    "purpose": "residential",
                    "transaction": "rent",
                    "value": 1800
                }
            ]
        }

    If you want to extract all listings from it, your JSON path should be $.listings[*]. This way, each listing inside the array represents a row in the extracted table, such as:

    id    | type      | purpose     | transaction | value
    26747 | House     | commercial  | rent        | 1800
    774   | Apartment | residential | rent        | 2400
    1553  | Studio    | residential | rent        | 1800

    For more information about how JSON path works, please check this link.
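    As an illustration, here is what the $.listings[*] extraction above amounts to, written as a plain-Python sketch (this is not the connector’s internals; for a top-level array field, the JSON path simply selects each element as a row):

    ```python
    import json

    # The sample payload from above, abridged to two listings.
    payload = json.loads("""
    {
        "total_listings": 2069,
        "listings": [
            {"id": 26747, "type": "House", "purpose": "commercial",
             "transaction": "rent", "value": 1800},
            {"id": 774, "type": "Apartment", "purpose": "residential",
             "transaction": "rent", "value": 2400}
        ]
    }
    """)

    # "$.listings[*]" selects every element of the top-level "listings"
    # array; each element becomes one row of the extracted table.
    rows = payload["listings"]
    for row in rows:
        print(row["id"], row["type"], row["value"])
    ```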

  3. Click Next.

2. Configure your JSON data streams

  1. Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix and make this process faster!
  • Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: each extraction fetches only the new data, which is useful if, for example, you want to keep every record ever fetched.
    • Full table: each extraction fetches the current state of the data, which is useful if, for example, you don’t want deleted data in your catalog.
  2. Click Next.
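A rough sketch of the difference between the two sync types, using hypothetical in-memory tables (plain Python; the names and merge logic are illustrative assumptions, not the product’s implementation):

```python
# Previously fetched catalog and the source's current state.
# Record id=1 was deleted at the source; id=3 is new.
catalog = [{"id": 1, "value": 100}, {"id": 2, "value": 200}]
source_now = [{"id": 2, "value": 200}, {"id": 3, "value": 300}]

def incremental_sync(catalog, source_now):
    """Append records not seen before; records deleted at the source
    stay in the catalog."""
    known = {row["id"] for row in catalog}
    return catalog + [row for row in source_now if row["id"] not in known]

def full_table_sync(catalog, source_now):
    """Replace the catalog with the current source state; deletions
    at the source propagate to the catalog."""
    return list(source_now)

print(incremental_sync(catalog, source_now))  # keeps id=1, adds id=3
print(full_table_sync(catalog, source_now))   # drops id=1
```

In short: incremental preserves history (including records later deleted at the source), while full table mirrors the source exactly.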

3. Configure your JSON data source

  1. Describe your data source for easy identification within your organization. You can include details such as what data it contains, which team it belongs to, etc.

  2. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times).

Check your new source!

  1. Click Next to finalize the setup. Once completed, you’ll receive confirmation that your new source is set up!

  2. You can view your new source on the Sources page. To see it in your Catalog, wait for the pipeline to run; you can monitor its execution and completion on the Sources page. If needed, trigger the pipeline manually by clicking the refresh icon. Once it has run, your new table will appear in the Catalog section.

If you encounter any issues, reach out to us via Slack, and we’ll gladly assist you!