SFTP (SSH File Transfer Protocol) is a secure file transfer protocol that provides encrypted data transfer over SSH connections. It’s commonly used for secure file exchange between systems, allowing businesses to safely transfer data files while maintaining data integrity and confidentiality. The SFTP connector supports CSV and Excel (.xlsx, .xls) files, with automatic schema detection and type inference.

Configuring SFTP as a Source

In the Sources tab, click the “Add source” button at the top right of your screen. Then select the SFTP option from the list of connectors. Click Next and you’ll be prompted to add your access credentials.

1. Add account access

Configure your SFTP server connection details:
  • Hostname: The host for accessing your SFTP server.
  • Port: The port for accessing your SFTP server. Defaults to 22.
  • Username: The username to access your SFTP server.
  • Password: The password to access your SFTP server. Use either password or private key.
  • Private key: The private key for accessing your SFTP server. Use either password or private key.
  • SFTP folder path: The folder on the SFTP server where the files are located.
  • Delimiter: (For CSV files) Character that separates columns in the CSV file.
  • Sheet Name: (For Excel files) Name of the sheet to read from Excel files. Defaults to the first (active) sheet if not provided.
Click Next.
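The fields above can be sketched as a small configuration object. This is an illustrative sketch only, not the connector's implementation: the class name `SftpSourceConfig`, its attribute names, the comma as the default delimiter, and the reading of "either password or private key" as "exactly one of the two" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SftpSourceConfig:
    """Hypothetical container mirroring the form fields listed above."""

    hostname: str
    username: str
    folder_path: str
    port: int = 22                      # the form defaults to 22
    password: Optional[str] = None
    private_key: Optional[str] = None
    delimiter: str = ","                # assumed default; applies to CSV only
    sheet_name: Optional[str] = None    # None means the first (active) sheet

    def validate(self) -> None:
        """Raise ValueError if the combination of fields is unusable."""
        if not self.hostname or not self.username:
            raise ValueError("hostname and username are required")
        # Assumption: exactly one authentication method must be supplied.
        if bool(self.password) == bool(self.private_key):
            raise ValueError("provide either a password or a private key")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")
```

Checking the values this way before submitting the form can catch the most common mistake, which is leaving both authentication fields empty or filling in both.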

2. Select streams

Choose which data streams you want to sync. The connector automatically detects CSV (.csv) and Excel (.xls, .xlsx) files in your specified folder path and maps them to streams. You can select entire groups of streams or only a subset of them.
Tip: You can find a stream more quickly by typing its name.
Select the streams and click Next.
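To illustrate how files in a folder might map to streams, here is a minimal sketch that filters a listing by the supported extensions. The function name `discover_streams` and the choice to use the file name as the stream name are assumptions; the connector's actual naming scheme is not documented here.

```python
from pathlib import PurePosixPath

# Extensions the connector is documented to detect.
SUPPORTED_EXTENSIONS = {".csv": "csv", ".xls": "excel", ".xlsx": "excel"}


def discover_streams(filenames):
    """Map each supported file name to its file kind, skipping other files."""
    streams = {}
    for name in filenames:
        path = PurePosixPath(name)
        kind = SUPPORTED_EXTENSIONS.get(path.suffix.lower())
        if kind is not None:
            # Assumption: the stream is keyed by the file name.
            streams[path.name] = kind
    return streams


listing = ["/data/orders.csv", "/data/report.XLSX", "/data/notes.txt"]
print(discover_streams(listing))
```

Files with any other extension, such as `notes.txt` in the listing above, are ignored in this sketch.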

3. Configure data streams

Customize how you want your data to appear in your catalog. Select the desired layer where the data will be placed, a folder to organize it inside the layer, a name for each table (which will effectively contain the fetched data) and the type of sync.
  • Layer: choose among the existing layers in your catalog. This is where you will find your newly extracted tables once the extraction runs successfully.
  • Folder: a folder can be created inside the selected layer to group all tables being created from this new data source.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix to all tables at once and make this process faster!
  • Sync Type: for the SFTP data source, the syncs will always be Full Sync. Read more about sync types here.
Once you are done configuring, click Next.

4. Configure data source

Describe your data source in 140 characters or fewer so that it is easy to identify within your organization. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the table data updated (every day, once a week, or only at specific times). Once you are ready, click Next to finalize the setup.

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
The source needs at least one successful run before its data is visible in your Catalog.

Streams and Fields

The SFTP connector dynamically generates streams based on the files found in the specified folder path. It supports reading both CSV and Excel files.
Stream generated from .csv files found on the SFTP server.
Notes:
  • The schema and fields are dynamically discovered based on the header row of the CSV files.
  • Fields are parsed according to their inferred types (string, integer, float, boolean, etc.).
  • If a specific delimiter is used, it should be configured in the source settings.
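The header-based discovery and type inference described in these notes can be approximated with the standard library. This is a rough sketch under stated assumptions, not the connector's algorithm: the function names, the order in which types are tried, and the choice to inspect every row (instead of a sample) are all illustrative.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "string"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    for name, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def infer_csv_schema(text, delimiter=","):
    """Read the header row, then infer one type per column."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = list(reader)
    # Transpose rows into columns; zip() truncates to the shortest row.
    columns = zip(*rows) if rows else [() for _ in header]
    return {name: infer_type(col) for name, col in zip(header, columns)}


sample = "id;price;active;name\n1;9.5;true;Ann\n2;10;false;Bob\n"
print(infer_csv_schema(sample, delimiter=";"))
```

Passing the wrong delimiter to a sketch like this yields a single column containing the whole line, which is why the delimiter must match the one configured in the source settings.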
Stream generated from .xls and .xlsx files found on the SFTP server.
Notes:
  • The schema and fields are dynamically discovered based on the header row of the specified sheet.
  • If no sheet name is provided in the configuration, the connector defaults to reading the first available sheet in the workbook.
  • Fields are parsed according to their inferred types (string, number, boolean) based on sample rows.
  • Column headers are sanitized automatically to ensure they are safe for downstream tables.
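The exact sanitization rules are not listed here. As a purely hypothetical illustration of what making headers "safe for downstream tables" can involve, the sketch below lowercases each header, replaces runs of non-alphanumeric characters with underscores, names empty headers by position, and adds a suffix to duplicates. The function name `sanitize_headers` and every rule in it are assumptions, so the connector's real output may differ.

```python
import re


def sanitize_headers(headers):
    """Return table-safe column names (hypothetical rules, see lead-in)."""
    seen = {}
    result = []
    for index, raw in enumerate(headers):
        # Collapse anything that is not a letter or digit into one underscore.
        name = re.sub(r"[^0-9a-zA-Z]+", "_", str(raw).strip()).strip("_").lower()
        if not name:
            name = f"column_{index + 1}"   # empty header: name it by position
        elif name[0].isdigit():
            name = f"_{name}"              # avoid a leading digit
        count = seen.get(name, 0)
        seen[name] = count + 1
        # Repeated names get a numeric suffix starting at 2.
        result.append(name if count == 0 else f"{name}_{count + 1}")
    return result


print(sanitize_headers(["Order ID", "Total ($)", "", "2024 Sales", "order id"]))
# ['order_id', 'total', 'column_3', '_2024_sales', 'order_id_2']
```

When a table's column names differ from the spreadsheet's header row, a transformation of this kind is the likely reason.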