Firestore as a data source
Bring your data from Firestore to your catalog.
1. Important notes
- Since Firestore is a NoSQL database, there may be cases where a consistent schema can’t be generated from a sample of your collections and documents. In such cases, we recommend setting discovery_mode to envelope so that your data can be extracted “as is” and stored in a document column in string format.
- We currently support only one nested level of collections inside documents. If you need more than that, please reach out to us.
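To make the envelope idea more concrete, here is a minimal sketch of what storing a document “as is” in a single string column could look like, and how it might be read back downstream. It assumes the string is JSON, which this guide does not state; the sample documents, field names, and the helper to_envelope_rows are hypothetical and only serve as illustration.

```python
import json

# Hypothetical documents from one collection; note that they do not share a schema.
docs = [
    {"id": "a1", "name": "Ana", "age": 31},
    {"id": "b2", "name": "Bob", "tags": ["vip"], "address": {"city": "Lisbon"}},
]

def to_envelope_rows(documents):
    """Serialize each document whole into a single string column named 'document'.

    Assumption: JSON is used as the string format (not confirmed by this guide).
    """
    return [{"document": json.dumps(doc, sort_keys=True)} for doc in documents]

rows = to_envelope_rows(docs)

# Downstream, the string is parsed back and missing fields are handled explicitly.
parsed = [json.loads(row["document"]) for row in rows]
cities = [doc.get("address", {}).get("city") for doc in parsed]
print(cities)
```

The trade-off is that no schema is enforced at extraction time, so every consumer of the table has to parse the string and deal with absent fields itself.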
2. Add your Firestore access
- In the Sources tab, click the “add source” button at the top right of your screen. Then, select the Firestore option from the list of connectors.
- Click Next and you’ll be prompted to add your access. Check the instructions next to each configuration option to find out where to get the required connection parameters.
- Click Next.
3. Select your Firestore streams
- The next step is letting us know which streams you want to bring. Each stream in the list corresponds to a top-level collection in Firestore. You can select entire groups of streams or only a subset of them.
  Tip: You can find a stream more easily by typing its name.
- Click Next.
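As a side note on what “top-level collection” means: Firestore paths alternate between collection and document segments, so a collection path always has an odd number of segments. The sketch below classifies a collection path by how deeply it is nested. The function name collection_depth and the sample paths are hypothetical, and reading “one nested level” as depth 1 is an interpretation of this guide, not a confirmed rule.

```python
def collection_depth(path: str) -> int:
    """Return 0 for a top-level collection, 1 for a subcollection, and so on.

    Firestore paths alternate collection/document/collection/...,
    so a collection path has an odd number of segments.
    """
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) % 2 == 0:
        raise ValueError(f"not a collection path: {path!r}")
    return (len(segments) - 1) // 2

# Hypothetical paths for illustration.
top_level = collection_depth("users")                       # listed as a stream
one_level = collection_depth("users/u1/orders")             # one nested level
two_levels = collection_depth("users/u1/orders/o1/items")   # deeper nesting
print(top_level, one_level, two_levels)
```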
4. Configure your Firestore data streams
- Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
  - Table name: we suggest using the same name as the collection, but feel free to customize it. You can also add a prefix to make this process faster.
  - Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
    - Incremental: every time the extraction runs, we’ll get only the new data, which is good if, for example, you want to keep every record ever fetched. For this to work, your documents must contain a valid incremental key (either a date or an integer). This option is not available when discovery_mode is envelope.
    - Full table: every time the extraction runs, we’ll get the current state of the data, which is good if, for example, you don’t want to keep deleted data in your catalog.
- Click Next.
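The difference between the two sync types can be sketched conceptually as follows. This is not the platform's actual implementation: the functions, the updated_at key, and the sample documents are all hypothetical, and the sketch only illustrates why incremental syncs keep deleted records while full-table syncs do not.

```python
from datetime import date

def full_table_sync(source_docs):
    """Replace the table with the current state of the source."""
    return list(source_docs)

def incremental_sync(table, source_docs, key, last_seen):
    """Append only documents whose incremental key is greater than last_seen."""
    new_docs = [d for d in source_docs if d[key] > last_seen]
    new_last = max((d[key] for d in new_docs), default=last_seen)
    return table + new_docs, new_last

# Hypothetical source state at two extraction runs: document 1 is deleted
# and document 3 is created between the runs.
run1 = [
    {"id": 1, "updated_at": date(2024, 1, 1)},
    {"id": 2, "updated_at": date(2024, 1, 2)},
]
run2 = [
    {"id": 2, "updated_at": date(2024, 1, 2)},
    {"id": 3, "updated_at": date(2024, 1, 3)},
]

table, bookmark = incremental_sync([], run1, "updated_at", date.min)
table, bookmark = incremental_sync(table, run2, "updated_at", bookmark)

incremental_ids = [d["id"] for d in table]          # keeps the deleted document
full_ids = [d["id"] for d in full_table_sync(run2)]  # mirrors the current state
print(incremental_ids, full_ids)
```

In this sketch the incremental table ends up with every record ever fetched, while the full table contains only what currently exists in the source. It also shows why a comparable key (date or integer) is needed: without one there is no way to tell which documents are new.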
5. Configure your Firestore data source
- Describe your data source for easy identification within your organization. You can include details such as what data it brings and which team it belongs to.
- To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the table data updated (every day, once a week, or only at specific times).
Check your new source!
- Click Next to finalize the setup. Once completed, you’ll receive confirmation that your new source is set up!
- You can view your new source on the Sources page. To see it in your Catalog, you have to wait for the pipeline to run; you can monitor its execution and completion on the Sources page. If needed, trigger the pipeline manually by clicking the refresh icon. Once it has run, your new table will appear in the Catalog section.
If you encounter any issues, reach out to us via Slack, and we’ll gladly assist you!