Skip to main content
Instagram is a social media platform for sharing photos, videos, and stories. The Instagram Business API provides access to insights and analytics about your business account’s performance and audience engagement.

Requirements

Before setting up Instagram as a data source, ensure your account meets these requirements:
  • The Instagram account must be a Business account (personal accounts are not supported)
  • The Instagram profile must have at least 100 followers
  • The Instagram profile must be linked to a Facebook page
  • The user connecting the account must have one of these roles on the Facebook page:
    • Owner
    • Administrator
    • Editor
Instagram Limitations:
  1. Media insights (such as views and engagement metrics) are only available for content that has received at least 5 views.
  2. Stories data is only available for 24 hours after posting. If your pipeline runs during this window, the data will be captured and stored in your destination, but new pipelines won’t be able to fetch stories older than 24 hours.

Configuring Instagram as a Source

In the Sources tab, click on the “Add source” button located on the top right of your screen. Then, select the Instagram option from the list of connectors. Click Next and you’ll be prompted to add your access.

1. Add account access

You’ll need to add the following information to connect to Instagram:
  1. Complete Facebook authorization with an account that has access to the page you want to get data from.
  2. Select the Instagram Page you’d like to extract data from.
  3. Define a date to be the starting point of the data retrival.
  4. The lookback window parameter allows you to define the the number of days to look in the past for new updates in Media (such as Comments and Likes). What the lookback does is check data from X days ago to ensure it captures all new updates. You can use default value if you are not sure about it.
Click Next.

2. Select streams

Choose which data streams you want to sync - you can select all streams or pick specific ones that matter most to you.
Tip: The stream can be found more easily by typing its name.
Select the streams and click Next.

3. Configure data streams

Customize how you want your data to appear in your catalog. Select a name for each table (which will contain the fetched data) and the type of sync.
  • Table name: we suggest a name, but feel free to customize it. You have the option to add a prefix and make this process faster!
  • Sync Type: you can choose between INCREMENTAL and FULL_TABLE.
    • Incremental: every time the extraction happens, we’ll get only the new data - which is good if, for example, you want to keep every record ever fetched.
    • Full table: every time the extraction happens, we’ll get the current state of the data - which is good if, for example, you don’t want to have deleted data in your catalog.
Once you are done configuring, click Next.

4. Configure data source

Describe your data source for easy identification within your organization, not exceeding 140 characters. To define your Trigger, consider how often you want data to be extracted from this source. This decision usually depends on how frequently you need the new table data updated (every day, once a week, or only at specific times). Optionally, you can determine when to execute a full sync. This will complement the incremental data extractions, ensuring that your data is completely synchronized with your source every once in a while. Once you are ready, click Next to finalize the setup.

5. Check your new source

You can view your new source on the Sources page. If needed, manually trigger the source extraction by clicking on the arrow button. Once executed, your data will appear in your Catalog.
For you to be able to see it on your Catalog, you need at least one successful source run.

Streams and Fields

Below you’ll find all available data streams from Instagram and their corresponding fields:
Stream for basic account information about your Instagram Business account.Key Fields:
  • id - Account ID (Primary Key)
  • ig_id - Instagram internal numeric ID
  • username - Account username
  • name - Account display name
Profile Information:
  • biography - Account bio text
  • website - Account website URL
  • profile_picture_url - Profile picture URL
Metrics:
  • follows_count - Number of accounts being followed
  • followers_count - Number of followers
  • media_count - Total number of media posts
Stream for posts, stories, and reels published on your account.Key Fields:
  • id - Media ID (Primary Key)
  • ig_id - Instagram internal ID
  • caption - Post caption text
  • media_type - Type of media (IMAGE, VIDEO, CAROUSEL_ALBUM)
  • media_product_type - Product type (FEED, STORY, REELS)
  • timestamp - When the media was posted (Replication Key)
URLs and References:
  • media_url - URL to the media content
  • permalink - Permanent link to the post
  • thumbnail_url - Thumbnail image URL (for videos)
  • shortcode - Short code for the media
Engagement Metrics:
  • comments_count - Number of comments
  • like_count - Number of likes
Additional Information:
  • is_comment_enabled - Whether comments are enabled
  • is_shared_to_feed - Whether shared to feed (for Reels)
  • username - Username who posted
  • owner - Owner object containing:
    • id - Owner account ID
  • boost_eligibility_info - Boost eligibility object:
    • eligible_to_boost - Whether the post can be boosted
Stream for media performance metrics. Insights are only available for media with at least 5 views.Key Fields:
  • media_id - Media ID (Primary Key)
  • media_type - Type of media (IMAGE, VIDEO, CAROUSEL_ALBUM)
  • media_product_type - Product type (FEED, STORY, REELS)
  • timestamp - When the media was posted (Replication Key)
Engagement Metrics:
  • comments - Number of comments
  • likes - Number of likes
  • shares - Number of shares
  • saved - Number of saves
  • total_interactions - Total engagement count
  • replies - Number of replies (Stories)
Reach & Views:
  • reach - Number of unique accounts that saw the post
  • views - Number of views
  • follows - Number of new follows from this media
  • profile_visits - Profile visits from this media
Reels-Specific Metrics:
  • ig_reels_avg_watch_time - Average watch time in milliseconds
  • ig_reels_video_view_total_time - Total view time in milliseconds
Stream for comments on your posts. Comments are extracted as a child stream of Media.Key Fields:
  • id - Comment ID (Primary Key)
  • media_id - ID of the media being commented on
  • text - Comment text content
  • timestamp - When the comment was posted
Engagement:
  • like_count - Number of likes on the comment
  • hidden - Whether the comment is hidden
Stream for replies to comments. Replies are extracted as a child stream of Comments.Key Fields:
  • id - Reply ID (Primary Key)
  • parent_comment_id - ID of the parent comment
  • text - Reply text content
  • timestamp - When the reply was posted
Engagement:
  • like_count - Number of likes on the reply
  • hidden - Whether the reply is hidden
Daily account-level insights with a 1-day rolling window (insights_rolling_day).Key Fields:
  • date - Date of the metrics (Primary Key, Replication Key)
Engagement Metrics:
  • reach - Number of unique accounts reached
  • total_interactions - Total engagement count
  • accounts_engaged - Number of accounts that engaged
  • likes - Number of likes received
  • shares - Number of shares
  • replies - Number of replies
Profile Metrics:
  • follows_and_unfollows - Net follow/unfollow activity
  • profile_links_taps - Profile link clicks
  • website_clicks - Website clicks
  • profile_views - Profile views
Account-level insights with a 7-day rolling window (insights_rolling_week).Key Fields:
  • date - Date of the metrics (Primary Key, Replication Key)
Metrics: Same metrics as Rolling Day, but aggregated over a 7-day rolling window:
  • reach, total_interactions, accounts_engaged, likes, shares, replies
  • follows_and_unfollows, profile_links_taps, website_clicks, profile_views
Account-level insights with a 28-day rolling window (insights_rolling_days28).Key Fields:
  • date - Date of the metrics (Primary Key, Replication Key)
Metrics: Same metrics as Rolling Day, but aggregated over a 28-day rolling window:
  • reach, total_interactions, accounts_engaged, likes, shares, replies
  • follows_and_unfollows, profile_links_taps, website_clicks, profile_views
Lifetime follower demographics broken down by country (follower_demographics_country_lifetime).Key Fields:
  • dimension_value - Country code (e.g., “US”, “BR”, “GB”) (Primary Key)
  • value - Number of followers from this country
Lifetime follower demographics broken down by age range (follower_demographics_age_lifetime).Key Fields:
  • dimension_value - Age range (e.g., “18-24”, “25-34”, “35-44”) (Primary Key)
  • value - Number of followers in this age range
Lifetime follower demographics broken down by gender (follower_demographics_gender_lifetime).Key Fields:
  • dimension_value - Gender (e.g., “M”, “F”, “U”) (Primary Key)
  • value - Number of followers of this gender
Lifetime follower demographics broken down by city (follower_demographics_city_lifetime).Key Fields:
  • dimension_value - City name (e.g., “São Paulo, Brazil”) (Primary Key)
  • value - Number of followers from this city

Data Model

The following diagram illustrates the relationships between the core data streams in Instagram. The arrows indicate the join keys that link the different entities.

Use Cases for Data Analysis

This guide outlines valuable business intelligence use cases when consolidating Instagram data, along with ready-to-use SQL queries that you can run on Explorer.

Content Performance Analysis

1. Media Performance Overview

Track the overall performance of your posts, stories, and reels to identify top-performing content. Business Value:
  • Identify which content types drive the most engagement
  • Understand what content resonates best with your audience
  • Optimize your content strategy based on performance data
  • Track reach and engagement trends over time

SQL query

  • AWS
  • GCP
WITH
    media_performance AS (
        SELECT
            m.id AS media_id,
            m.media_type,
            m.media_product_type,
            m.caption,
            m.timestamp AS posted_at,
            m.like_count,
            m.comments_count,
            mi.reach,
            mi.shares,
            mi.saved,
            mi.views,
            mi.total_interactions,
            mi.ig_reels_avg_watch_time,
            CAST(mi.total_interactions AS DOUBLE) * 100.0 / NULLIF(mi.reach, 0) AS engagement_rate
        FROM
            nekt_raw.instagram_media m
            LEFT JOIN nekt_raw.instagram_media_insights mi ON m.id = mi.media_id
        WHERE
            DATE(m.timestamp) >= CURRENT_DATE - INTERVAL '30' DAY
    )
SELECT
    media_product_type,
    COUNT(*) AS total_posts,
    SUM(like_count) AS total_likes,
    SUM(comments_count) AS total_comments,
    SUM(reach) AS total_reach,
    SUM(shares) AS total_shares,
    SUM(saved) AS total_saves,
    SUM(views) AS total_views,
    SUM(total_interactions) AS total_interactions,
    ROUND(AVG(engagement_rate), 2) AS avg_engagement_rate,
    ROUND(AVG(ig_reels_avg_watch_time) / 1000.0, 2) AS avg_watch_time_seconds
FROM
    media_performance
GROUP BY
    media_product_type
ORDER BY
    total_interactions DESC

2. Top Performing Content

Identify your best-performing posts based on engagement metrics to replicate successful content strategies. Business Value:
  • Discover content patterns that drive high engagement
  • Identify optimal content formats and topics
  • Inform future content creation decisions
  • Benchmark new content against top performers

SQL query

  • AWS
  • GCP
WITH
    ranked_content AS (
        SELECT
            m.id AS media_id,
            m.media_product_type,
            SUBSTRING(m.caption, 1, 100) AS caption_preview,
            m.permalink,
            DATE(m.timestamp) AS posted_date,
            m.like_count,
            m.comments_count,
            mi.reach,
            mi.shares,
            mi.saved,
            mi.total_interactions,
            CAST(mi.total_interactions AS DOUBLE) * 100.0 / NULLIF(mi.reach, 0) AS engagement_rate,
            ROW_NUMBER() OVER (
                PARTITION BY m.media_product_type 
                ORDER BY mi.total_interactions DESC
            ) AS rank_by_type
        FROM
            nekt_raw.instagram_media m
            LEFT JOIN nekt_raw.instagram_media_insights mi ON m.id = mi.media_id
        WHERE
            DATE(m.timestamp) >= CURRENT_DATE - INTERVAL '90' DAY
            AND mi.reach > 0
    )
SELECT
    media_product_type,
    caption_preview,
    posted_date,
    like_count,
    comments_count,
    reach,
    shares,
    saved,
    total_interactions,
    ROUND(engagement_rate, 2) AS engagement_rate_pct
FROM
    ranked_content
WHERE
    rank_by_type <= 5
ORDER BY
    media_product_type,
    total_interactions DESC

Audience Engagement Analysis

Track your account’s growth and engagement metrics over time to monitor performance trends. Business Value:
  • Monitor account health and growth trajectory
  • Identify periods of high or low engagement
  • Correlate engagement with content posting patterns
  • Track progress toward growth goals

SQL query

  • AWS
  • GCP
WITH
    daily_insights AS (
        SELECT
            DATE(date) AS insight_date,
            reach,
            total_interactions,
            accounts_engaged,
            likes,
            shares,
            replies,
            follows_and_unfollows,
            profile_views,
            website_clicks
        FROM
            nekt_raw.instagram_insights_rolling_day
        WHERE
            DATE(date) >= CURRENT_DATE - INTERVAL '30' DAY
    ),
    with_moving_avg AS (
        SELECT
            *,
            AVG(reach) OVER (ORDER BY insight_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS reach_7day_avg,
            AVG(total_interactions) OVER (ORDER BY insight_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS interactions_7day_avg,
            SUM(follows_and_unfollows) OVER (ORDER BY insight_date) AS cumulative_net_followers
        FROM
            daily_insights
    )
SELECT
    insight_date,
    reach,
    ROUND(reach_7day_avg, 0) AS reach_7day_avg,
    total_interactions,
    ROUND(interactions_7day_avg, 0) AS interactions_7day_avg,
    accounts_engaged,
    follows_and_unfollows AS net_followers,
    cumulative_net_followers,
    profile_views,
    website_clicks
FROM
    with_moving_avg
ORDER BY
    insight_date DESC

4. Comment Engagement Analysis

Analyze comment activity on your posts to understand audience sentiment and engagement patterns. Business Value:
  • Measure audience engagement depth
  • Identify posts that spark conversations
  • Track comment response rates
  • Monitor community health and sentiment

SQL query

  • AWS
  • GCP
WITH
    comment_stats AS (
        SELECT
            m.id AS media_id,
            m.media_product_type,
            SUBSTRING(m.caption, 1, 80) AS caption_preview,
            DATE(m.timestamp) AS posted_date,
            m.comments_count,
            COUNT(c.id) AS actual_comments,
            SUM(c.like_count) AS total_comment_likes,
            AVG(c.like_count) AS avg_likes_per_comment,
            SUM(CASE WHEN c.hidden = false THEN 1 ELSE 0 END) AS visible_comments,
            SUM(CASE WHEN c.hidden = true THEN 1 ELSE 0 END) AS hidden_comments
        FROM
            nekt_raw.instagram_media m
            LEFT JOIN nekt_raw.instagram_comments c ON m.id = c.media_id
        WHERE
            DATE(m.timestamp) >= CURRENT_DATE - INTERVAL '30' DAY
        GROUP BY
            m.id,
            m.media_product_type,
            m.caption,
            m.timestamp,
            m.comments_count
    )
SELECT
    media_product_type,
    COUNT(*) AS total_posts,
    SUM(actual_comments) AS total_comments,
    ROUND(AVG(actual_comments), 1) AS avg_comments_per_post,
    SUM(total_comment_likes) AS total_comment_likes,
    ROUND(AVG(avg_likes_per_comment), 2) AS avg_likes_per_comment,
    SUM(hidden_comments) AS total_hidden_comments,
    ROUND(
        CAST(SUM(hidden_comments) AS DOUBLE) * 100.0 / NULLIF(SUM(actual_comments), 0),
        2
    ) AS hidden_comment_rate_pct
FROM
    comment_stats
GROUP BY
    media_product_type
ORDER BY
    total_comments DESC

Audience Demographics Analysis

5. Follower Geographic Distribution

Understand where your followers are located to optimize content timing and targeting. Business Value:
  • Identify key markets and geographic audiences
  • Optimize posting times for primary time zones
  • Inform localization and content strategies
  • Support geo-targeted campaigns

SQL query

  • AWS
  • GCP
WITH
    country_data AS (
        SELECT
            dimension_value AS country_code,
            value AS follower_count
        FROM
            nekt_raw.instagram_follower_demographics_country_lifetime
    ),
    total_followers AS (
        SELECT SUM(follower_count) AS total FROM country_data
    )
SELECT
    cd.country_code,
    cd.follower_count,
    ROUND(
        CAST(cd.follower_count AS DOUBLE) * 100.0 / tf.total,
        2
    ) AS percentage,
    SUM(cd.follower_count) OVER (ORDER BY cd.follower_count DESC) AS cumulative_followers,
    ROUND(
        CAST(SUM(cd.follower_count) OVER (ORDER BY cd.follower_count DESC) AS DOUBLE) * 100.0 / tf.total,
        2
    ) AS cumulative_percentage
FROM
    country_data cd
    CROSS JOIN total_followers tf
ORDER BY
    cd.follower_count DESC
LIMIT 20

6. Audience Demographics Summary

Get a complete view of your audience demographics including age, gender, and location. Business Value:
  • Build comprehensive audience personas
  • Align content with audience preferences
  • Support ad targeting decisions
  • Identify growth opportunities in demographics

SQL query

  • AWS
  • GCP
WITH
    age_data AS (
        SELECT 'age' AS dimension_type, dimension_value, value AS count
        FROM nekt_raw.instagram_follower_demographics_age_lifetime
    ),
    gender_data AS (
        SELECT 'gender' AS dimension_type, dimension_value, value AS count
        FROM nekt_raw.instagram_follower_demographics_gender_lifetime
    ),
    city_data AS (
        SELECT 'city' AS dimension_type, dimension_value, value AS count
        FROM nekt_raw.instagram_follower_demographics_city_lifetime
    ),
    all_demographics AS (
        SELECT * FROM age_data
        UNION ALL
        SELECT * FROM gender_data
        UNION ALL
        SELECT * FROM city_data
    ),
    totals AS (
        SELECT dimension_type, SUM(count) AS total
        FROM all_demographics
        GROUP BY dimension_type
    )
SELECT
    ad.dimension_type,
    ad.dimension_value,
    ad.count AS follower_count,
    ROUND(
        CAST(ad.count AS DOUBLE) * 100.0 / t.total,
        2
    ) AS percentage,
    RANK() OVER (
        PARTITION BY ad.dimension_type 
        ORDER BY ad.count DESC
    ) AS rank_in_category
FROM
    all_demographics ad
    JOIN totals t ON ad.dimension_type = t.dimension_type
ORDER BY
    ad.dimension_type,
    ad.count DESC

Time-Based Analysis

7. Best Days and Times for Posting

Identify the best days of the week for engagement to optimize your posting schedule. Business Value:
  • Optimize posting schedule for maximum reach
  • Identify peak engagement periods
  • Improve content planning and scheduling
  • Maximize return on content creation efforts

SQL query

  • AWS
  • GCP
WITH
    daily_performance AS (
        SELECT
            DATE(date) AS insight_date,
            DATE_FORMAT(DATE(date), '%W') AS day_of_week,
            reach,
            total_interactions,
            accounts_engaged,
            profile_views
        FROM
            nekt_raw.instagram_insights_rolling_day
        WHERE
            DATE(date) >= CURRENT_DATE - INTERVAL '90' DAY
    ),
    day_averages AS (
        SELECT
            day_of_week,
            AVG(reach) AS avg_reach,
            AVG(total_interactions) AS avg_interactions,
            AVG(accounts_engaged) AS avg_engaged,
            AVG(profile_views) AS avg_profile_views,
            COUNT(*) AS days_counted
        FROM
            daily_performance
        GROUP BY
            day_of_week
    )
SELECT
    day_of_week,
    ROUND(avg_reach, 0) AS avg_daily_reach,
    ROUND(avg_interactions, 0) AS avg_daily_interactions,
    ROUND(avg_engaged, 0) AS avg_accounts_engaged,
    ROUND(avg_profile_views, 0) AS avg_profile_views,
    days_counted,
    CASE
        WHEN avg_interactions >= (SELECT MAX(avg_interactions) * 0.8 FROM day_averages) THEN 'Peak Day'
        WHEN avg_interactions <= (SELECT MIN(avg_interactions) * 1.2 FROM day_averages) THEN 'Low Day'
        ELSE 'Average'
    END AS day_classification
FROM
    day_averages
ORDER BY
    CASE day_of_week
        WHEN 'Monday' THEN 1
        WHEN 'Tuesday' THEN 2
        WHEN 'Wednesday' THEN 3
        WHEN 'Thursday' THEN 4
        WHEN 'Friday' THEN 5
        WHEN 'Saturday' THEN 6
        WHEN 'Sunday' THEN 7
    END

Implementation Notes

Data Quality Considerations

  • Media insights are only available for content with at least 5 views
  • Stories data is only available for 24 hours after posting - ensure your pipeline runs frequently to capture this data
  • User insights have a maximum lookback of 2 years from the Instagram API
  • Follower demographics require a minimum of 100 followers

API Limitations

  • The Instagram API has rate limits that may affect extraction times for accounts with large amounts of content
  • The lookback window parameter helps re-fetch updated metrics for recent content
  • Some metrics may not be available for all media types (e.g., Reels-specific metrics only apply to Reels)

Best Practices

  • Schedule extractions at least daily to capture Stories data before it expires
  • Use incremental sync for media and comments to efficiently track updates
  • Monitor the reach metric alongside engagement to understand true content performance
  • Consider the 28-day rolling insights for trend analysis over the daily insights for day-to-day fluctuations