Nekt SDK provides a simple interface to read and write data from your Lakehouse. It supports two processing engines:
  • Python — Uses Pandas DataFrames. Best for smaller datasets and general-purpose data processing.
  • Spark — Uses PySpark DataFrames. Best for large-scale distributed data processing.
Explore the SDK using Google Colab, or browse all templates by navigating to Notebooks > Templates.

Getting started

1. Generate an access token

Generate a token with access to the resources you need (tables, volumes, secrets).
2. Install the SDK

The SDK is already included in Nekt Notebook templates. To install it in your own environment:
pip install git+https://github.com/nektcom/nekt-sdk-py.git#egg=nekt-sdk
3. Configure the SDK

Set your access token and choose an engine before calling any SDK method:
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"  # or "spark"
4. Load your first table

df = nekt.load_table(layer_name="Bronze", table_name="pipedrive_deals")

Configuration

You must set data_access_token and engine before using any SDK method.
| Attribute | Required | Description |
| --- | --- | --- |
| nekt.data_access_token | Yes | Access token for authenticating with the Nekt API. Generate one here. |
| nekt.engine | Yes | Processing engine: "python" (Pandas) or "spark" (PySpark). |
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"
Configuration is locked after the first SDK operation. If you need to change the engine or token, restart your Python session.

Reading data

Load a table from your Lakehouse as a DataFrame using the .load_table() method.

Parameters:
  • layer_name (str): The name of the layer where the table is located
  • table_name (str): The name of the table to load

Layer and table names must match the exact capitalization used in your Catalog.

Returns: Pandas DataFrame (engine="python") or Spark DataFrame (engine="spark")

Example with Python engine:
import nekt
import pandas as pd

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"

df: pd.DataFrame = nekt.load_table(
    layer_name="Bronze",
    table_name="pipedrive_deals"
)
Example with Spark engine:
import nekt
from pyspark.sql import DataFrame

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

df: DataFrame = nekt.load_table(
    layer_name="Bronze",
    table_name="pipedrive_deals"
)
Load a secret value from your organization’s secrets vault using the .load_secret() method.

Parameters:
  • key (str): The secret key to retrieve

The key name must match the exact capitalization used in your Catalog.

Returns: str

Example:
import nekt

api_key = nekt.load_secret(key="MY_SECRET_API_KEY")
Your access token must have permission to access the secret.
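A secret loaded this way is typically used as a credential for an external service. A minimal sketch of that pattern (the placeholder value and the bearer-token header format are illustrative, not part of the SDK — in practice api_key would come from nekt.load_secret()):

```python
# Illustrative placeholder standing in for nekt.load_secret(key="MY_SECRET_API_KEY")
api_key = "sk-example-123"

# Common pattern: pass the secret as a bearer token in HTTP request headers
headers = {"Authorization": f"Bearer {api_key}"}
print(headers["Authorization"])  # → Bearer sk-example-123
```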
List files in a volume from your Lakehouse using the .load_volume() method.

Parameters:
  • layer_name (str): The name of the layer where the volume is located
  • volume_name (str): The name of the volume

Layer and volume names must match the exact capitalization used in your Catalog.

Returns: List of dictionaries containing file paths

Example:
import nekt

files = nekt.load_volume(
    layer_name="Raw",
    volume_name="csv_uploads"
)

for file in files:
    print(file["path"])
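Because the listing is a plain list of dictionaries, it can be filtered with ordinary Python. A sketch assuming each entry carries a "path" key, as in the example above (the sample paths themselves are made up):

```python
# Sample listing in the same shape as the example above (paths are illustrative)
files = [
    {"path": "csv_uploads/2024/deals.csv"},
    {"path": "csv_uploads/2024/notes.txt"},
    {"path": "csv_uploads/2025/contacts.csv"},
]

# Keep only the CSV files
csv_files = [f for f in files if f["path"].endswith(".csv")]
print(len(csv_files))  # → 2
```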
Load a Delta table object from your Lakehouse using the .load_delta_table() method. This gives you access to Delta Lake features like ACID transactions and time travel.

Parameters:
  • layer_name (str): The name of the layer where the table is located
  • table_name (str): The name of the table to load

This method requires engine="spark". It raises an error if the engine is set to "python". Layer and table names must match the exact capitalization used in your Catalog.

Returns: DeltaTable object

Example:
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

delta_table = nekt.load_delta_table(
    layer_name="Transformation",
    table_name="customer_data"
)

# View table history
delta_table.history().show()

Writing data

save_table is only available when running inside Nekt Notebooks. When running locally, this method prints a message indicating it is not available.
Save a DataFrame as a table in your Lakehouse using the .save_table() method. It supports overwrite, append, and merge (upsert) modes with automatic schema evolution.

Parameters:
  • df (DataFrame): The DataFrame to save (Pandas or Spark, depending on your engine)
  • layer_name (str): The name of the target layer
  • table_name (str): The name of the table to create or update
  • folder_name (str, optional): Folder within the layer. Defaults to the layer root
  • mode (str, optional): Write mode — "overwrite", "append", or "merge". Defaults to "overwrite"
  • merge_keys (list[str], optional): Column names used as keys for merge mode. Required when mode="merge"
  • schema_evolution (str, optional): Schema evolution strategy — "merge", "strict", or "overwrite". Defaults to "merge"
Layer, table, and folder names must match the exact capitalization used in your Catalog.
Example — overwrite (default):
import nekt

nekt.save_table(
    df=transformed_df,
    layer_name="Transformation",
    table_name="customer_metrics"
)
Example — append:
nekt.save_table(
    df=new_records,
    layer_name="Raw",
    table_name="events",
    mode="append"
)
Example — merge (upsert):
nekt.save_table(
    df=updated_records,
    layer_name="Transformation",
    table_name="users",
    mode="merge",
    merge_keys=["user_id"]
)
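Conceptually, merge mode matches incoming rows against existing rows on the merge keys, updating the matches and inserting the rest. A plain-Python sketch of that upsert semantics (the rows are illustrative; the real work is done by Delta Lake inside save_table):

```python
# Existing table rows, keyed by the merge key user_id
existing = {1: {"user_id": 1, "plan": "free"}, 2: {"user_id": 2, "plan": "pro"}}

# Incoming DataFrame rows: one match (user_id=2) and one new row (user_id=3)
updates = [{"user_id": 2, "plan": "enterprise"}, {"user_id": 3, "plan": "free"}]

# Upsert: matched rows are updated, unmatched rows are inserted
for row in updates:
    existing[row["user_id"]] = row

print(sorted(existing))     # → [1, 2, 3]
print(existing[2]["plan"])  # → enterprise
```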
Schema evolution strategies:
| Strategy | Behavior |
| --- | --- |
| "merge" | New columns are added automatically. Existing columns are preserved. |
| "strict" | Schema must match exactly. Raises an error if columns differ. |
| "overwrite" | The table schema is replaced with the DataFrame schema. |
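To see how the three strategies differ, here is a plain-Python sketch of what happens to the column set when an incoming DataFrame adds a column (column names are illustrative; the actual behavior is applied by save_table):

```python
# Columns of the existing table vs. the incoming DataFrame
table_cols = {"user_id", "name"}
df_cols = {"user_id", "name", "signup_date"}

# "merge": union of both schemas — new columns added, existing ones preserved
merged = table_cols | df_cols

# "strict": schemas must match exactly, otherwise an error is raised
strict_ok = table_cols == df_cols

# "overwrite": the DataFrame schema replaces the table schema
overwritten = df_cols

print(sorted(merged))  # → ['name', 'signup_date', 'user_id']
print(strict_ok)       # → False
```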

Logging

The SDK provides a built-in logger for tracking execution progress. Inside Nekt Notebooks, logs are visible in the Nekt UI. Outside Nekt, logs go to standard error output.

Default logger

Use nekt.logger to log messages directly:
import nekt

nekt.logger.info("Starting transformation")
nekt.logger.warning("Skipping 5 invalid rows")

Named loggers

Use nekt.get_logger(name) to create a named sub-logger. Messages are automatically prefixed with the logger name, making it easier to identify which part of your code produced each log entry.
import nekt

validation_logger = nekt.get_logger("validation")
validation_logger.info("Checking schema")    # → "[validation] Checking schema"
validation_logger.error("Missing columns")   # → "[validation] Missing columns"

etl_logger = nekt.get_logger("etl")
etl_logger.info("Processing batch 1")       # → "[etl] Processing batch 1"
Use named loggers to organize logs by pipeline stage or domain. This makes it much easier to filter and debug issues in the Nekt UI.
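If you want the same "[name] message" format when running code outside Nekt, the prefixing behavior can be reproduced with Python's standard logging module. A sketch of the pattern (not the SDK's internal implementation):

```python
import logging

class PrefixAdapter(logging.LoggerAdapter):
    """Prepend a [name] prefix to every message, like nekt.get_logger(name)."""
    def process(self, msg, kwargs):
        return f"[{self.extra['name']}] {msg}", kwargs

base = logging.getLogger("pipeline")
validation_logger = PrefixAdapter(base, {"name": "validation"})

# process() is what LoggerAdapter applies before each log call
msg, _ = validation_logger.process("Checking schema", {})
print(msg)  # → [validation] Checking schema
```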

Utilities

Access the shared Spark session using the .get_spark_session() method. Useful for creating DataFrames manually or running Spark operations directly.
This method requires engine="spark". It raises an error if the engine is set to "python".
Returns: SparkSession object

Example:
import nekt
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

spark = nekt.get_spark_session()

# Create a DataFrame with custom schema
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True)
])

data = [(1, "Alice"), (2, "Bob")]
custom_df = spark.createDataFrame(data, schema)

# Read files directly
csv_df = spark.read.csv("path/to/file.csv", header=True)

Need Help?

If you encounter any issues with the SDK or have feedback, reach out to our support team.