Nekt SDK provides a simple interface to read and write data from your Lakehouse. It supports two processing engines:
  • Python — Uses Pandas DataFrames. Best for smaller datasets and general-purpose data processing.
  • Spark — Uses PySpark DataFrames. Best for large-scale distributed data processing.
Explore the SDK using Google Colab, or browse all templates by navigating to Notebooks > Templates.

Getting started

1. Generate an access token

Generate a token with access to the resources you need (tables, volumes, secrets).
2. Install the SDK

The SDK is already included in Nekt Notebook templates. To install it in your own environment:
pip install git+https://github.com/nektcom/nekt-sdk-py.git#egg=nekt-sdk
3. Configure the SDK

Set your access token and choose an engine before calling any SDK method:
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"  # or "spark"
4. Load your first table

df = nekt.load_table(layer_name="Bronze", table_name="pipedrive_deals")

Configuration

You must set data_access_token and engine before using any SDK method.
| Attribute | Required | Description |
| --- | --- | --- |
| nekt.data_access_token | Yes | Access token for authenticating with the Nekt API. Generate one here. |
| nekt.engine | Yes | Processing engine: "python" (Pandas) or "spark" (PySpark). |
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"
Configuration is locked after the first SDK operation. If you need to change the engine or token, restart your Python session.

Reading data

Load a table from your Lakehouse as a DataFrame using the .load_table() method.

Parameters:
  • layer_name (str): The name of the layer where the table is located
  • table_name (str): The name of the table to load

Layer and table names must match the exact capitalization used in your Catalog.

Returns: Pandas DataFrame (engine="python") or Spark DataFrame (engine="spark")

Example with Python engine:
import nekt
import pandas as pd

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "python"

df: pd.DataFrame = nekt.load_table(
    layer_name="Bronze",
    table_name="pipedrive_deals"
)
Example with Spark engine:
import nekt
from pyspark.sql import DataFrame

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

df: DataFrame = nekt.load_table(
    layer_name="Bronze",
    table_name="pipedrive_deals"
)
Load a secret value from your organization’s secrets vault using the .load_secret() method.

Parameters:
  • key (str): The secret key to retrieve

The key name must match the exact capitalization used in your Catalog.

Returns: str

Example:
import nekt

api_key = nekt.load_secret(key="MY_SECRET_API_KEY")
Your access token must have permission to access the secret.
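A secret loaded this way is typically used as a credential for an external service. A minimal sketch of that pattern (the placeholder value and the bearer-token header format are illustrative, not part of the SDK — in practice api_key would come from nekt.load_secret()):

```python
# Illustrative placeholder standing in for nekt.load_secret(key="MY_SECRET_API_KEY")
api_key = "sk-example-123"

# Common pattern: pass the secret as a bearer token in HTTP request headers
headers = {"Authorization": f"Bearer {api_key}"}
print(headers["Authorization"])  # → Bearer sk-example-123
```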
List files in a volume from your Lakehouse using the .load_volume() method.

Parameters:
  • layer_name (str): The name of the layer where the volume is located
  • volume_name (str): The name of the volume

Layer and volume names must match the exact capitalization used in your Catalog.

Returns: List of dictionaries containing file paths

Example:
import nekt

files = nekt.load_volume(
    layer_name="Raw",
    volume_name="csv_uploads"
)

for file in files:
    print(file["path"])
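Because the listing is a plain list of dictionaries, it can be filtered with ordinary Python. A sketch assuming each entry carries a "path" key, as in the example above (the sample paths themselves are made up):

```python
# Sample listing in the same shape as the example above (paths are illustrative)
files = [
    {"path": "csv_uploads/2024/deals.csv"},
    {"path": "csv_uploads/2024/notes.txt"},
    {"path": "csv_uploads/2025/contacts.csv"},
]

# Keep only the CSV files
csv_files = [f for f in files if f["path"].endswith(".csv")]
print(len(csv_files))  # → 2
```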
Load a Delta table object from your Lakehouse using the .load_delta_table() method. This gives you access to Delta Lake features like ACID transactions and time travel.

Parameters:
  • layer_name (str): The name of the layer where the table is located
  • table_name (str): The name of the table to load

This method requires engine="spark". It raises an error if the engine is set to "python". Layer and table names must match the exact capitalization used in your Catalog.

Returns: DeltaTable object

Example:
import nekt

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

delta_table = nekt.load_delta_table(
    layer_name="Transformation",
    table_name="customer_data"
)

# View table history
delta_table.history().show()

Writing data

save_table is only available when running inside Nekt Notebooks. When running locally, this method prints a message indicating it is not available.
Save a DataFrame as a table in your Lakehouse using the .save_table() method. It supports overwrite, append, and merge (upsert) modes with automatic schema evolution.

Parameters:
  • df (DataFrame): The DataFrame to save (Pandas or Spark, depending on your engine)
  • layer_name (str): The name of the target layer
  • table_name (str): The name of the table to create or update
  • folder_name (str, optional): Folder within the layer. Defaults to the layer root
  • mode (str, optional): Write mode — "overwrite", "append", or "merge". Defaults to "overwrite"
  • merge_keys (list[str], optional): Column names used as keys for merge mode. Required when mode="merge"
  • schema_evolution (str, optional): Schema evolution strategy — "merge", "strict", or "overwrite". Defaults to "merge"
Layer, table, and folder names must match the exact capitalization used in your Catalog.
Example — overwrite (default):
import nekt

nekt.save_table(
    df=transformed_df,
    layer_name="Transformation",
    table_name="customer_metrics"
)
Example — append:
nekt.save_table(
    df=new_records,
    layer_name="Raw",
    table_name="events",
    mode="append"
)
Example — merge (upsert):
nekt.save_table(
    df=updated_records,
    layer_name="Transformation",
    table_name="users",
    mode="merge",
    merge_keys=["user_id"]
)
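Conceptually, merge mode matches incoming rows against existing rows on the merge keys, updating the matches and inserting the rest. A plain-Python sketch of that upsert semantics (the rows are illustrative; the real work is done by Delta Lake inside save_table):

```python
# Existing table rows, keyed by the merge key user_id
existing = {1: {"user_id": 1, "plan": "free"}, 2: {"user_id": 2, "plan": "pro"}}

# Incoming DataFrame rows: one match (user_id=2) and one new row (user_id=3)
updates = [{"user_id": 2, "plan": "enterprise"}, {"user_id": 3, "plan": "free"}]

# Upsert: matched rows are updated, unmatched rows are inserted
for row in updates:
    existing[row["user_id"]] = row

print(sorted(existing))     # → [1, 2, 3]
print(existing[2]["plan"])  # → enterprise
```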
Schema evolution strategies:
| Strategy | Behavior |
| --- | --- |
| "merge" | New columns are added automatically. Existing columns are preserved. |
| "strict" | Schema must match exactly. Raises an error if columns differ. |
| "overwrite" | The table schema is replaced with the DataFrame schema. |
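To see how the three strategies differ, here is a plain-Python sketch of what happens to the column set when an incoming DataFrame adds a column (column names are illustrative; the actual behavior is applied by save_table):

```python
# Columns of the existing table vs. the incoming DataFrame
table_cols = {"user_id", "name"}
df_cols = {"user_id", "name", "signup_date"}

# "merge": union of both schemas — new columns added, existing ones preserved
merged = table_cols | df_cols

# "strict": schemas must match exactly, otherwise an error is raised
strict_ok = table_cols == df_cols

# "overwrite": the DataFrame schema replaces the table schema
overwritten = df_cols

print(sorted(merged))  # → ['name', 'signup_date', 'user_id']
print(strict_ok)       # → False
```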

Logging

The SDK provides a built-in logger for tracking execution progress. Inside Nekt Notebooks, logs are visible in the Nekt UI. Outside Nekt, logs go to standard error output.

Default logger

Use nekt.logger to log messages directly:
import nekt

nekt.logger.info("Starting transformation")
nekt.logger.warning("Skipping 5 invalid rows")

Named loggers

Use nekt.get_logger(name) to create a named sub-logger. Messages are automatically prefixed with the logger name, making it easier to identify which part of your code produced each log entry.
import nekt

validation_logger = nekt.get_logger("validation")
validation_logger.info("Checking schema")    # → "[validation] Checking schema"
validation_logger.error("Missing columns")   # → "[validation] Missing columns"

etl_logger = nekt.get_logger("etl")
etl_logger.info("Processing batch 1")       # → "[etl] Processing batch 1"
Use named loggers to organize logs by pipeline stage or domain. This makes it much easier to filter and debug issues in the Nekt UI.
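If you want the same "[name] message" format when running code outside Nekt, the prefixing behavior can be reproduced with Python's standard logging module. A sketch of the pattern (not the SDK's internal implementation):

```python
import logging

class PrefixAdapter(logging.LoggerAdapter):
    """Prepend a [name] prefix to every message, like nekt.get_logger(name)."""
    def process(self, msg, kwargs):
        return f"[{self.extra['name']}] {msg}", kwargs

base = logging.getLogger("pipeline")
validation_logger = PrefixAdapter(base, {"name": "validation"})

# process() is what LoggerAdapter applies before each log call
msg, _ = validation_logger.process("Checking schema", {})
print(msg)  # → [validation] Checking schema
```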

Utilities

Access the shared Spark session using the .get_spark_session() method. Useful for creating DataFrames manually or running Spark operations directly.
This method requires engine="spark". It raises an error if the engine is set to "python".
Returns: SparkSession object

Example:
import nekt
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

nekt.data_access_token = "YOUR_TOKEN"
nekt.engine = "spark"

spark = nekt.get_spark_session()

# Create a DataFrame with custom schema
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True)
])

data = [(1, "Alice"), (2, "Bob")]
custom_df = spark.createDataFrame(data, schema)

# Read files directly
csv_df = spark.read.csv("path/to/file.csv", header=True)

Need Help?

If you encounter any issues with the SDK or have feedback, reach out to our support team.