Debugging Postiz & Temporal: A Production Runbook for Self-Hosted Social Media Orchestration

Self-hosting your social media scheduling infrastructure lets you keep control of your data, build bespoke automation pipelines, and avoid hefty SaaS subscription fees. Tools like Postiz represent the cutting edge of this movement. However, when you combine a modern Next.js/NestJS application with an enterprise-grade workflow engine like Temporal, the operational complexity rises sharply.

If you are running this stack in a production environment under a reverse proxy (like Apache or Nginx), you will eventually hit edge cases where posts get stuck in the queue, backend containers crash-loop, or social API failures cascade.

Here is the engineering runbook to configure, troubleshoot, and patch this stack for production resilience.

The Infrastructure Blueprint

A production-grade Postiz deployment is not a single container; it is an orchestration of nine separate services running in concert. When deploying via Docker Compose, the services divide into two primary realms:

1. The Postiz Application Layer

postiz: The main container housing the Next.js frontend (port 5000), the NestJS backend API (port 3000), and the worker orchestrator (port 3002).
postiz-postgres: PostgreSQL 17 database storing user data, accounts, configurations, and scheduled post metadata.
postiz-redis: Redis 7.2 serving as the high-performance caching layer.

2. The Temporal Workflow Layer

temporal: The core Temporal orchestration engine (port 7233). It manages the state, retries, and timing of post publication workflows.
temporal-postgresql: PostgreSQL 16 database storing Temporal internal states and histories.
temporal-elasticsearch: Elasticsearch 7.17 for advanced visibility, listing, and filtering of workflows.
temporal-admin-tools: Command Line Interface (CLI) tools for managing namespaces and workflows.
temporal-ui: A visual dashboard (port 8080) for auditing active and failed workflows.
Monitoring: An optional spotlight container (port 8969) is often integrated for Sentry-based local debugging.
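
To confirm that every service in the blueprint is actually up, a small shell helper can compare the running services against the list above. This is a minimal sketch: it assumes the Compose service keys match the names used in this article, and the function name missing_services is hypothetical. The optional spotlight container is left out on purpose.

```shell
# Service names as listed in this article (assumed to match your compose keys).
EXPECTED_SERVICES="postiz postiz-postgres postiz-redis temporal temporal-postgresql temporal-elasticsearch temporal-admin-tools temporal-ui"

# Reads running service names on stdin (one per line) and prints every
# expected service that is absent from that list.
missing_services() {
  running=$(cat)
  for svc in $EXPECTED_SERVICES; do
    # -x matches whole lines only, so "temporal" does not match "temporal-ui".
    printf '%s\n' "$running" | grep -qxF -- "$svc" || echo "$svc"
  done
}

# Usage (run from the directory that holds docker-compose.yml):
#   docker compose ps --services --filter status=running | missing_services
```

An empty result means all eight core services are running; any name printed is a service to inspect first.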

1. The Startup Race: Resolving 502 Bad Gateway

One of the most common issues in a fresh self-hosted setup is the frontend loading successfully but all API requests failing with 502 Bad Gateway or 111: Connection refused.

The Root Cause

The Postiz NestJS backend requires a live connection to the Temporal cluster (temporal:7233) the moment it boots. If the temporal container is offline, or if Docker's internal DNS cannot resolve the hostname, the NestJS process exits immediately.

Because Temporal relies on its own PostgreSQL database and Elasticsearch instance, it takes significantly longer to start than the Postiz backend. This creates a startup race condition.

The Fix

To prevent the backend from permanently crashing during initialization:

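Enforce Strict Dependency Order: In your docker-compose.yml, ensure the postiz service specifies depends_on with service_healthy conditions for its database and cache services and, if your Temporal service defines a health check, for temporal as well, since that is the connection the backend needs at boot.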
Absolute Configuration Paths: In the Temporal service definition, ensure the DYNAMIC_CONFIG_FILE_PATH is defined as an absolute container path:
```
environment:
  - DYNAMIC_CONFIG_FILE_PATH=/etc/temporal/config/dynamicconfig/development-sql.yaml
```
If this path is relative, the auto-setup script may fail silently, preventing the Temporal server from exposing port 7233.
Controlled Stack Restarts: Avoid rebooting single services during initialization. Use a clean, full stack boot sequence:
```
docker compose down && docker compose up -d
```
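
Where a full restart is not practical, another option is to wait until Temporal answers before starting the backend. The sketch below is a generic retry helper, not something Postiz ships. The function name retry_until is hypothetical, and the probe in the usage comment assumes port 7233 is published to the host and that nc is installed there.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry_until() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Probe output is discarded; only the exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "ready after $attempt attempt(s)"
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempt(s)" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (assumes 7233 is reachable from the host):
#   retry_until 30 5 nc -z localhost 7233 && docker compose up -d postiz
```

Thirty attempts five seconds apart gives Temporal roughly two and a half minutes to come up; adjust both numbers to your hardware.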

2. Stuck Posts & Orchestrator Crash Loops

You schedule a post in the calendar. The time passes, but the post remains permanently in the QUEUE state. There is no error history, and the Temporal UI shows the task queue is not being polled.

The Root Cause

This occurs when the Postiz worker orchestrator crashes or spawns duplicate ghost processes. The orchestrator compile phase takes roughly 90 seconds upon container boot. If a system administrator executes manual PM2 restarts (e.g., pm2 restart orchestrator) within that boot window, the initial process does not terminate cleanly.

The resulting duplicate Node.js processes fight over port 3002, causing port collisions (EADDRINUSE) and triggering an ELIFECYCLE crash loop. With the orchestrator dead or detached, the Temporal worker queue goes unpolled, leaving posts orphaned in QUEUE.

The Action Plan

If you encounter stuck posts, follow this precise troubleshooting path:

Check PM2 Process Health:
```
docker exec postiz npx pm2 list
```
If you see the orchestrator process with high restart counts or status errored, check the error logs:
```
docker exec postiz tail -n 100 /root/.pm2/logs/orchestrator-error.log
```
Kill Ghost Processes: Inspect running processes inside the container to find orphaned node PIDs:
```
docker exec postiz ps aux | grep node
```
If multiple instances of /app/apps/orchestrator are running, terminate the container entirely to clear the process table:
```
docker compose stop postiz && sleep 5 && docker compose start postiz
```
Allow Compilation to Complete: Once started, do not touch the orchestrator for 150 seconds. Let the compiler finish. Run docker exec postiz npx pm2 list to verify the process shows an uptime of several minutes and a restart count of 0.
Reschedule Stuck Posts: Orphaned database rows will not auto-heal. Query your database to find stuck posts and delete/re-create them:
```
SELECT id, state FROM "Post" WHERE state = 'QUEUE';
```
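
The ghost-process and log checks above can be scripted so they give a yes/no answer. This is a rough sketch with hypothetical function names; it only pattern-matches text piped into it and assumes each orchestrator instance shows up as one line containing /app/apps/orchestrator, which may not hold if a wrapper process mentions the same path.

```shell
# Reads `ps aux` output on stdin and prints how many lines mention
# the orchestrator path.
count_orchestrators() {
  # grep -c prints 0 but exits non-zero when nothing matches,
  # so force a clean exit status.
  grep -c '/app/apps/orchestrator' || true
}

# Reads log text on stdin; succeeds if it contains a port collision.
has_port_collision() {
  grep -q 'EADDRINUSE'
}

# Usage:
#   docker exec postiz ps aux | count_orchestrators
#   docker exec postiz tail -n 100 /root/.pm2/logs/orchestrator-error.log \
#     | has_port_collision && echo "port collision found in orchestrator log"
```

A count above 1 together with a port collision in the log is the pattern described in this section and a reason to stop and start the container rather than restart the process through PM2.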

3. Persistent Next.js Frontend Hot-Patching

When self-hosting, you occasionally need to modify the UI behavior—such as altering sorting orders, adding localization, or modifying design details—directly in the containerized Next.js frontend.

However, running pnpm run build:frontend inside a running Docker container is resource-heavy, and any compiled output will be wiped out the moment the container is recreated.

The Volume Mount Solution

To apply persistent patches to compiled frontend code:

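Save the Patched Source: Save your updated source file (e.g., calendar.tsx) to a patches directory next to your docker-compose.yml on the host server (e.g., /opt/postiz/patches), so that it matches the ./patches path mounted in the compose file.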

Extract Compiled Assets: Build the frontend once inside the container, then copy the compiled .next directory to the host:

```
mkdir -p /opt/postiz/frontend-next
docker exec postiz tar -cf - -C /app/apps/frontend .next | tar -xf - -C /opt/postiz/frontend-next
```

Mount Host Volumes: Mount both the source patch and the compiled production assets back into the container via docker-compose.yml:

```
services:
  postiz:
    image: ghcr.io/gitroomhq/postiz-app:latest
    volumes:
      - ./patches/calendar.tsx:/app/apps/frontend/src/components/launches/calendar.tsx
      - ./frontend-next/.next:/app/apps/frontend/.next
```

Recreate Containers: Execute docker compose up -d. The new container will immediately serve the patched, pre-compiled Next.js assets without requiring a compile phase on boot.
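
Because a bind mount hides whatever the image holds at the same path, mounting an empty or half-copied directory over .next would leave the frontend with no build to serve. A small guard can check the extracted assets before the containers are recreated. This is a sketch under one assumption: it relies on the BUILD_ID file that a Next.js production build writes into .next. The function name next_build_ready is hypothetical.

```shell
# next_build_ready HOST_DIR
# Succeeds only if HOST_DIR/.next contains a non-empty BUILD_ID file,
# which indicates a completed Next.js production build was copied out.
next_build_ready() {
  build_dir=$1
  [ -s "$build_dir/.next/BUILD_ID" ]
}

# Usage:
#   next_build_ready /opt/postiz/frontend-next && docker compose up -d
```

If the check fails, repeat the extraction step before recreating the containers.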

4. API & SDK Integration Gotchas

Even with a healthy container stack, the social network APIs themselves present strict validation rules that trigger cryptic failures.

TikTok Sandbox Restrictions

If you are testing your integration using a TikTok developer app in Sandbox mode:

URL Ownership Verification: Ensure your Postiz domain (e.g., social-hub.example.com) is verified in the TikTok Developer Portal under URL properties to avoid url_ownership_unverified errors.
Sandbox Privacy Constraints: TikTok sandbox accounts can only publish videos with privacy set to Self Only (SELF_ONLY). Attempting to publish a Public post will fail with unaudited_client_can_only_post_to_private_accounts.
Media Formats: The TikTok Direct Post API only accepts MP4 video files. Static images or JPEG posts will error out immediately.
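
The privacy and media rules above can be checked before a post is ever queued. The following is an illustrative pre-flight check, not part of Postiz or of any TikTok SDK; the function name tiktok_sandbox_check is hypothetical, and it only inspects the file extension rather than the actual container format.

```shell
# tiktok_sandbox_check PRIVACY_LEVEL MEDIA_FILE
# Prints "ok" when the post satisfies the sandbox rules described above,
# otherwise prints the reason and returns non-zero.
tiktok_sandbox_check() {
  privacy=$1
  media=$2
  if [ "$privacy" != "SELF_ONLY" ]; then
    echo "privacy must be SELF_ONLY in sandbox mode"
    return 1
  fi
  case "$media" in
    *.mp4 | *.MP4) ;;
    *)
      echo "media must be an MP4 video"
      return 1
      ;;
  esac
  echo "ok"
}

# Usage:
#   tiktok_sandbox_check SELF_ONLY launch-teaser.mp4
```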

The Meta Cascade Failure (Facebook & Instagram)

One of the most elusive bugs occurs when an Instagram publish workflow fails with a deleted_object error:

```
{
  "error": {
    "message": "Unsupported post request. Object with ID 'deleted_...' does not exist...",
    "code": 100,
    "error_subcode": 33
  }
}
```

At first glance, this looks like an Instagram account authentication issue. However, the root cause is often on the Facebook side of the Meta integration.

The Media Container Dependency

To publish an image to Instagram, Postiz performs a two-step process:

It uploads the image to Meta's servers under the associated Facebook Page's assets to create a temporary Media Container.
It takes the resulting Media Container ID and instructs Instagram to publish it.

If Meta triggers an Identity Checkpoint (verification check) on your Facebook Page, the Facebook API will block the creation of new assets. While the Instagram connection itself remains active and healthy, the temporary Media Container is immediately deleted or denied access on Meta's end.

Instagram then attempts to read the container ID, fails to find it, and returns a misleading deleted_object error.

The Resolution: Open the Facebook mobile app on the account owner's registered phone, complete the identity verification checkpoint prompt, and the cascade block on both Facebook and Instagram will resolve automatically.
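
To spot this failure quickly in saved API responses or workflow logs, the error signature can be matched with a few lines of shell. This is a crude sketch with a hypothetical function name: it uses grep rather than a JSON parser, and code 100 with subcode 33 can have other causes, so a match is a prompt to check the Facebook Page for a checkpoint, not proof of one.

```shell
# Reads a Meta API error body on stdin; succeeds if it carries
# error code 100 together with error_subcode 33.
is_meta_missing_object_error() {
  body=$(cat)
  # The trailing group prevents 100 from matching 1000, or 33 matching 330.
  printf '%s' "$body" | grep -Eq '"code"[[:space:]]*:[[:space:]]*100([^0-9]|$)' &&
    printf '%s' "$body" | grep -Eq '"error_subcode"[[:space:]]*:[[:space:]]*33([^0-9]|$)'
}

# Usage (response.json is a saved error body):
#   is_meta_missing_object_error < response.json \
#     && echo "Check the linked Facebook Page for an identity checkpoint"
```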

Summary Checklist for Production Self-Hosting

When maintaining a production Postiz stack, keep these operations principles close:

| Operational Dimension | Strategy |
| --- | --- |
| Startup Order | Ensure database and cache health checks pass, and that Temporal is reachable, before letting the NestJS API start. |
| Process Control | Never run manual PM2 commands against the orchestrator during the first 150 seconds after boot, while the worker compiles (roughly 90 seconds) and settles. |
| Patch Management | Map compiled frontend folders (`.next`) and patched components from the host so that patches persist across container re-creations without a rebuild. |
| Identity Health | Monitor Meta Developer portal alerts; Facebook identity verification blocks will cascade and break Instagram publishing. |

By mastering these architectural behaviors, you can turn a complex container ecosystem into a reliable content distribution engine that is straightforward to recover when something breaks.

If you have questions about setting up webhook pipelines or connecting additional self-hosted channels, feel free to reach out via the Contact Page.

Michael K. Laweh