# Blogger Tracking Architecture


# System Overview

graph TB
    subgraph "bloggers-worker"
        A1[POST /blogger/add]
        A2[POST /blogger/add-from-tracking]
        A3[Cron: 0 0,12 * * *]
    end

    subgraph "BLOGGER_VIDEO_QUEUE"
        Q1["type: blogger_initial_fetch"]
        Q2["no type = midnight/noon batch"]
    end

    subgraph "blogger-video-consumer"
        C1[handleInitialFetch]
        C2[handleMidnightBatch]
    end

    subgraph "Platform APIs"
        P1[TikTok]
        P2[Instagram]
        P3[YouTube]
    end

    subgraph "Supabase"
        DB1[blogger_information]
        DB2[tracked_accounts]
        DB3[videos]
        DB4[video_statistics]
    end

    A1 -->|queueInitialVideoFetch| Q1
    A2 -->|queueInitialVideoFetch| Q1
    A3 -->|scheduleMidnightBloggerFetches| Q2

    Q1 --> C1
    Q2 --> C2

    C1 & C2 --> P1 & P2 & P3
    C1 & C2 -->|saveBloggerVideos| DB3 & DB4
    A1 -->|upsert| DB1
    A1 -->|insert| DB2

    C1 & C2 -->|YouTube only| HIGH_PRIORITY_QUEUE

# Flow 1: Initial Tracking

# Trigger

User adds a new blogger via Blogger Management UI:

  • POST /blogger/add — new username, fetches profile from platform API
  • POST /blogger/add-from-tracking — blogger already exists in org

# Sequence

sequenceDiagram
    participant UI as Frontend
    participant BW as bloggers-worker
    participant BMS as bloggerManagement.js
    participant PAPI as Platform API
    participant DB as Supabase
    participant Q as BLOGGER_VIDEO_QUEUE
    participant BVC as blogger-video-consumer

    UI->>BW: POST /blogger/add { username, platform, ... }
    BW->>BMS: addBloggerWithTracking()
    BMS->>BMS: Validate (limits, permissions, platform)
    BMS->>PAPI: getUserInfoFromPlatform()
    PAPI-->>BMS: { channel_name, subscribers, external_id, sec_id, thumbnail }
    BMS->>DB: UPSERT blogger_information
    BMS->>DB: INSERT tracked_accounts (is_active=true)
    BMS->>DB: UPSERT blogger_campaign (optional)
    BMS->>DB: INSERT blogger_tags (optional)
    BMS->>Q: send { type: 'blogger_initial_fetch', ... }
    BW-->>UI: { success: true, blogger: {...} }

    Note over Q,BVC: Async

    Q->>BVC: handleInitialFetch(message)
    BVC->>BVC: Check pending_initial_fetch flag (dedup)
    BVC->>BVC: Check subscription limits
    BVC->>PAPI: getUserVideosBatch({ externalId, targetCount })
    PAPI-->>BVC: CommonVideo[]
    BVC->>DB: saveBloggerVideos() → videos + video_statistics
    BVC->>BVC: Queue thumbnails

    opt YouTube only
        BVC->>HIGH_PRIORITY_QUEUE: video_batch (batches of 50)
    end

    BVC->>DB: Set initial_videos_fetched = true

# Initial Fetch Queue Message

{
  type: 'blogger_initial_fetch',
  bloggerId: string,        // UUID
  secUid: string | null,    // TikTok only
  externalId: string,       // Platform user ID
  platform: 'TikTok' | 'Instagram' | 'YouTube',
  videoCount: number,       // initial_videos_count setting
  organizationId: string,
  userId: string,
  requestId: string,
  videoStatus: 'tracking',
  timestamp: string         // ISO8601
}

# DB Tables Written During Initial Add

Step Table Operation
Blogger profile blogger_information UPSERT (channel_name, subscribers, external_id, sec_id, thumbnail)
Tracking record tracked_accounts INSERT (user_id, org_id, blogger_id, initial/daily_videos_count, is_active=true)
Campaign blogger_campaign UPSERT (optional, sets campaign_tag via BEFORE INSERT trigger)
Tags blogger_tags INSERT (optional, triggers propagate to video_tags)
Videos videos INSERT (via blogger-video-consumer)
Video stats video_statistics INSERT (via blogger-video-consumer, skipped for YouTube)

# Flow 2: Midnight + Noon Batch

# Trigger

Cloudflare cron 0 0,12 * * * — fires at 00:00 UTC (midnight) and 12:00 UTC (noon). Both runs use the same function.

# Sequence

sequenceDiagram
    participant CRON as Cloudflare Cron
    participant BW as bloggers-worker
    participant MBF as midnightBloggerFetch.ts
    participant DB as Supabase
    participant Q as BLOGGER_VIDEO_QUEUE
    participant BVC as blogger-video-consumer
    participant PAPI as Platform API

    CRON->>BW: scheduled() event
    BW->>MBF: scheduleMidnightBloggerFetches()
    MBF->>DB: SELECT * FROM tracked_bloggers_view WHERE is_active = true
    DB-->>MBF: All active tracking records

    loop Per organization
        MBF->>MBF: Check subscription limits
        loop Per blogger
            MBF->>MBF: Apply MIN_PARTIAL_FETCH logic
            MBF->>MBF: Add to pendingMessages[]
        end
    end

    MBF->>Q: sendBatch() in chunks of 100

    Q->>BVC: handleMidnightBatch(message)
    BVC->>BVC: Check subscription limits
    BVC->>PAPI: getUserVideosBatch({ externalId, daily_videos_count })
    PAPI-->>BVC: CommonVideo[]
    BVC->>DB: saveBloggerVideos() → videos + video_statistics
    BVC->>DB: UPDATE tracked_accounts SET last_scraped_at = NOW()

    opt YouTube only
        BVC->>HIGH_PRIORITY_QUEUE: video_batch messages
    end

# Batch Queue Message

// No 'type' field — consumer uses this to route to handleMidnightBatch()
{
  trackingId: string,           // UUID - tracked_accounts.id
  userId: string,
  externalId: string,
  daily_videos_count: number,   // May be reduced by partial fetch logic
  organization_id: string,
  blogger_id: string,
  platform: 'TikTok' | 'Instagram' | 'YouTube'
}

# MIN_PARTIAL_FETCH Logic

When an organization is near its subscription limit:

remaining = subscription_limit - videos_fetched_this_period

remaining >= daily_videos_count  →  full fetch (daily_videos_count videos)
remaining >= 10                  →  partial fetch (10 videos minimum)
remaining < 10                   →  skip this blogger entirely

# Platform API Services

Factory: shared/services/apiServiceFactory.jscreateService(platform, config)

All platforms implement: getUserVideosBatch({ externalId, targetCount, maxPages })CommonVideo[]

// CommonVideo structure
{
  external_video_id: string,
  platform: string,
  video_url: string,
  title: string,
  thumbnail_url: string,
  published_at: string,       // ISO8601
  statistics?: {
    external_video_id: string,
    views: number,
    likes: number,
    comments: number,
    reposts: number
  }
}

Retry: timeout 15s, max 2 retries, base delay 1500ms, backoff factor 4×.

# Instagram URL Construction

extractVideoUrl() in instagram.js uses media.code (shortcode) as the primary source for the URL. If code is absent, it derives the shortcode from the numeric media.id using Instagram's base64-like alphabet:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_

If both are missing, the video is skipped and a warning is sent to Sentry.