Parquet as a data source
Bring your data from a Parquet file to your catalog.
1. Select your file
- In the Sources tab, click the Add source button at the top right of your screen. Then select the Parquet option from the list of connectors.
- Click Next and you’ll be prompted to upload your file.
Once the file is uploaded, you can optionally edit its name. Two files can’t share the same name, but you can upload the same file more than once under different names if you ever need to.
- Click Next.
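Before uploading, it can help to confirm that the file really is a Parquet file and not, say, a CSV with the wrong extension. The sketch below is a quick local sanity check, not part of the product: it relies only on the fact that a standard Parquet file starts and ends with the 4-byte marker `PAR1`. The function name `looks_like_parquet` is a hypothetical helper chosen for illustration, and passing this check does not guarantee the file contents are valid.

```python
from pathlib import Path

# 4-byte marker found at both the start and the end of a standard Parquet file.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    Hypothetical helper for a local pre-upload check; it does not parse
    or validate the file's metadata or data pages.
    """
    data_path = Path(path)
    # Header magic (4) + footer length (4) + footer magic (4) = 12 bytes minimum.
    if data_path.stat().st_size < 12:
        return False
    with data_path.open("rb") as fh:
        head = fh.read(4)
        fh.seek(-4, 2)  # 2 == os.SEEK_END: jump to the last 4 bytes
        tail = fh.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Calling `looks_like_parquet("my_file.parquet")` on a file exported by a Parquet writer should return `True`; a renamed text file returns `False`.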
2. Define your new table
Now that you’ve specified your file, define how your table should appear in the catalog.
- Choose a name for your table. Optionally, select one column to serve as the primary key.
- Click Next. The last step is adding a description to your new source. Describe it so it can be easily identified within your organization.
- Click Done to finalize.
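When picking a primary key column, it may be worth checking the candidate locally first. This guide does not state what the platform requires of a primary key, so the sketch below assumes the usual convention that key values are unique and never null. `primary_key_problems` is a hypothetical helper that works on a plain list of values; one way to obtain such a list is `pyarrow.parquet.read_table(path, columns=[name]).column(name).to_pylist()`, if PyArrow is installed.

```python
from collections import Counter


def primary_key_problems(values):
    """Report nulls and repeated values in a candidate key column.

    Assumption (not stated by the product docs): a primary key should
    contain no nulls and no duplicates. Values must be hashable.
    """
    counts = Counter(values)
    nulls = counts.pop(None, 0)  # number of missing values
    duplicates = [v for v, n in counts.items() if n > 1]
    return {"nulls": nulls, "duplicates": duplicates}


print(primary_key_problems([1, 2, 3]))        # {'nulls': 0, 'duplicates': []}
print(primary_key_problems([1, 2, 2, None]))  # {'nulls': 1, 'duplicates': [2]}
```

A column that reports zero nulls and an empty duplicates list is a reasonable primary key candidate under that assumption.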
3. Check your new source!
- Your Parquet source was added! To see it in your datalake, you have to Trigger the source pipeline and wait for the run to complete.
- Once the pipeline has run successfully for the first time, you’ll be able to work with your new table in the Catalog.
Let us know through our chat if you run into any blockers and we’ll be happy to help!