FastAPI has become a popular choice for building Python APIs thanks to its performance, automatic documentation, and developer experience. Getting an application running locally is one thing; deploying it reliably to production is another matter entirely. This post covers the essential practices for taking your FastAPI application from development to production-ready, using Podman as our container runtime.
Why Podman?
Podman offers several advantages over Docker for production workloads. It runs containers rootless by default, eliminating an entire class of security vulnerabilities. There's no daemon process that could become a single point of failure. Podman also generates systemd unit files directly, making it straightforward to integrate with Linux service management. For security-conscious organisations, the daemonless architecture means a smaller attack surface to worry about.
Project Structure
A well-organised project makes deployment and maintenance considerably easier. Here's a structure that works well for production FastAPI applications:
myapi/
├── src/
│   └── myapi/
│       ├── __init__.py
│       ├── main.py
│       ├── config.py
│       ├── routes/
│       ├── models/
│       └── services/
├── tests/
├── Containerfile
├── pyproject.toml
├── uv.lock
└── .env.example
Keeping your application code under src/ prevents import confusion and makes packaging cleaner. The Containerfile (Podman's preferred name for Dockerfile) lives at the project root alongside your dependency definitions.
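For reference, a minimal pyproject.toml matching this layout might look something like the following — the dependency list and the choice of hatchling as build backend are illustrative, not prescriptive:

```toml
[project]
name = "myapi"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "fastapi",
    "uvicorn[standard]",
    "pydantic-settings",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/myapi"]
```

The explicit packages entry tells the build backend where to find the code under the src/ layout.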
Configuration Management
Production applications need flexible configuration. FastAPI works beautifully with Pydantic's settings management:
from pydantic_settings import BaseSettings
from functools import lru_cache


class Settings(BaseSettings):
    app_name: str = "My API"
    debug: bool = False
    database_url: str
    allowed_origins: list[str] = []
    log_level: str = "INFO"

    model_config = {"env_file": ".env", "env_file_encoding": "utf-8"}


@lru_cache
def get_settings() -> Settings:
    return Settings()
This approach validates environment variables at startup, failing fast if required configuration is missing. The lru_cache decorator ensures settings are only parsed once.
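The same fail-fast, parse-once idea can be seen with nothing but the standard library. This is a hypothetical sketch — the Env class and get_env function are illustrative, not part of FastAPI or Pydantic:

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Env:
    database_url: str   # required: reading it has no fallback, so we fail fast
    debug: bool = False


@lru_cache
def get_env() -> Env:
    try:
        database_url = os.environ["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Env(
        database_url=database_url,
        debug=os.environ.get("DEBUG", "false").lower() == "true",
    )
```

Pydantic does this validation (plus type coercion and .env parsing) for you; the sketch just shows why caching the parsed settings behind lru_cache is cheap and safe.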
Building a Production Container
A well-crafted Containerfile makes a significant difference to security and image size. Here's a multi-stage build that produces a minimal production image:
FROM python:3.12-slim AS builder
WORKDIR /app
# Install uv for fast dependency resolution
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
# Copy dependency files first for better layer caching
COPY pyproject.toml uv.lock ./
# Install dependencies into a virtual environment
RUN uv sync --frozen --no-dev --no-install-project
# Copy application code
COPY src ./src
# Install the project itself; --no-editable bakes the code into .venv,
# so the runtime stage only needs to copy the virtual environment
RUN uv sync --frozen --no-dev --no-editable
FROM python:3.12-slim AS runtime
# Create non-root user
RUN useradd --create-home --shell /bin/bash appuser
WORKDIR /app
# Copy virtual environment from builder
COPY --from=builder /app/.venv /app/.venv
# Set environment variables
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Switch to non-root user
USER appuser
EXPOSE 8000
CMD ["uvicorn", "myapi.main:app", "--host", "0.0.0.0", "--port", "8000"]
A few things worth noting here. The multi-stage build keeps the final image small by excluding build tools. Running as a non-root user is essential even though Podman already provides rootless containers—defence in depth matters. Setting PYTHONUNBUFFERED ensures logs appear immediately rather than being buffered.
ASGI Server Configuration
Uvicorn is the standard choice for running FastAPI in production, but it needs proper configuration. For production workloads, you'll want multiple workers to utilise available CPU cores:
# gunicorn.conf.py
import multiprocessing
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
bind = "0.0.0.0:8000"
keepalive = 120
timeout = 30
graceful_timeout = 30
max_requests = 1000
max_requests_jitter = 50
Using Gunicorn with Uvicorn workers gives you process management, graceful restarts, and better handling of worker crashes. The max_requests setting causes workers to restart periodically, which helps prevent memory leaks from accumulating.
Update your container command accordingly:
CMD ["gunicorn", "-c", "gunicorn.conf.py", "myapi.main:app"]
Health Checks and Observability
Production services need health endpoints for orchestration and monitoring:
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

from myapi.db import get_db  # your session dependency

app = FastAPI()


@app.get("/health/live")
async def liveness():
    """Kubernetes liveness probe - is the process running?"""
    return {"status": "alive"}


@app.get("/health/ready")
async def readiness(db: AsyncSession = Depends(get_db)):
    """Kubernetes readiness probe - can we serve traffic?"""
    try:
        await db.execute(text("SELECT 1"))
        return {"status": "ready"}
    except Exception:
        raise HTTPException(status_code=503, detail="Database unavailable")
Separating liveness from readiness allows orchestrators to make intelligent decisions. A failing readiness check removes the instance from load balancing; a failing liveness check triggers a restart.
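That decision logic can be made concrete in a few lines — a toy model of the orchestrator's reaction, not any scheduler's actual implementation:

```python
def orchestrator_action(live: bool, ready: bool) -> str:
    """Toy model: map probe results to the orchestrator's response."""
    if not live:
        return "restart"         # liveness failure: the process is wedged
    if not ready:
        return "remove-from-lb"  # readiness failure: stop routing traffic
    return "serve"
```

A temporarily unreachable database should flip readiness, not liveness — restarting the app wouldn't fix the database.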
For observability, structured logging makes life much easier:
import logging

import structlog


def configure_logging(log_level: str = "INFO"):
    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.make_filtering_bound_logger(
            getattr(logging, log_level.upper())
        ),
    )
JSON logs are essential for production—they're parseable by log aggregation systems and carry structured context that plain text logs cannot.
Running with Podman
Build your image with Podman just as you would with Docker:
podman build -t myapi:latest .
For local testing, run the container with appropriate settings. Note that python:3.12-slim doesn't ship curl, so the health check below uses a Python one-liner instead:
podman run -d \
  --name myapi \
  -p 8000:8000 \
  --env-file .env \
  --health-cmd="python -c 'import urllib.request; urllib.request.urlopen(\"http://localhost:8000/health/live\")'" \
  --health-interval=30s \
  myapi:latest
Systemd Integration
One of Podman's strengths is native systemd integration. Generate a unit file for your container (on Podman 4.4 and later, Quadlet is the recommended successor to this command, which still works but is deprecated):
podman generate systemd --new --name myapi > ~/.config/systemd/user/myapi.service
For a rootless container running as a regular user, enable lingering so the service starts at boot:
loginctl enable-linger $USER
systemctl --user enable myapi.service
systemctl --user start myapi.service
This gives you automatic restarts, proper logging via journald, and familiar service management commands.
Security Considerations
Beyond running rootless, consider these additional hardening measures. Use read-only root filesystems where possible by adding --read-only to your run command and mounting a tmpfs for any directories that need writes. Drop unnecessary Linux capabilities with --cap-drop=ALL and add back only what you need. Set resource limits to prevent runaway containers from affecting the host.
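Put together, a hardened run command might look like the following — the memory and CPU limits are illustrative values, not recommendations:

```shell
podman run -d \
  --name myapi \
  -p 8000:8000 \
  --read-only \
  --tmpfs /tmp \
  --cap-drop=ALL \
  --memory=512m \
  --cpus=1.0 \
  myapi:latest
```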
For secrets, avoid baking them into images or passing them as environment variables visible in process listings. Podman supports secrets management:
echo "mysecretvalue" | podman secret create db_password -
podman run --secret db_password myapi:latest
Inside the container, the secret appears as a file at /run/secrets/db_password.
Database Connections
Connection pooling deserves attention in production. With async SQLAlchemy:
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker

engine = create_async_engine(
    settings.database_url,
    pool_size=5,
    max_overflow=10,
    pool_pre_ping=True,
    pool_recycle=3600,
)
The pool_pre_ping setting checks connections before use, which handles database restarts gracefully. pool_recycle prevents issues with connections being closed by firewalls or the database after long idle periods.
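The idea behind pool_pre_ping can be sketched in plain Python — a toy pool that validates a connection before handing it out and discards it if the check fails. The names are illustrative; SQLAlchemy's real pool is far more involved:

```python
from collections import deque
from typing import Callable


class PingingPool:
    """Toy connection pool: ping before checkout, discard dead connections."""

    def __init__(self, factory: Callable[[], object],
                 ping: Callable[[object], bool]):
        self._factory = factory
        self._ping = ping
        self._idle: deque = deque()

    def checkout(self):
        while self._idle:
            conn = self._idle.popleft()
            if self._ping(conn):   # pre-ping: reuse only live connections
                return conn
            # dead connection (e.g. closed by a firewall): drop it silently
        return self._factory()     # pool empty or all dead: open a fresh one

    def checkin(self, conn):
        self._idle.append(conn)
```

The cost is one cheap round-trip per checkout; the benefit is that a database restart produces a transparent reconnect instead of an error surfaced to a request.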
Graceful Shutdown
FastAPI applications should handle shutdown signals properly to avoid dropping requests:
from contextlib import asynccontextmanager

from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    await database.connect()
    yield
    # Shutdown
    await database.disconnect()


app = FastAPI(lifespan=lifespan)
The lifespan context manager ensures cleanup code runs when the application receives SIGTERM. Combined with Gunicorn's graceful timeout, this allows in-flight requests to complete before the process exits.
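The ordering guarantee is just the asynccontextmanager protocol, which can be demonstrated with the standard library alone — a stdlib-only sketch of the same pattern, with the serving step faked:

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def lifespan():
    events.append("startup")       # runs before the app serves traffic
    try:
        yield
    finally:
        events.append("shutdown")  # runs even if serving raised


async def serve():
    async with lifespan():
        events.append("handling requests")


asyncio.run(serve())
```

The try/finally is what makes cleanup run on a SIGTERM-triggered cancellation as well as on a clean exit.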
Putting It Together
A production deployment typically involves a reverse proxy in front of your application. With Podman, you might run this as a pod—a group of containers sharing a network namespace:
podman pod create --name myapi-pod -p 8080:80
podman run -d --pod myapi-pod --name nginx nginx:alpine
podman run -d --pod myapi-pod --name api myapi:latest
Alternatively, use Podman Compose with a compose file for more complex setups. The syntax is nearly identical to Docker Compose.
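A minimal compose file for the same pair of containers might look like this — service names and the port mapping are illustrative:

```yaml
services:
  api:
    image: myapi:latest
    env_file: .env
  nginx:
    image: nginx:alpine
    ports:
      - "8080:80"
    depends_on:
      - api
```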
Taking a FastAPI application to production requires attention to configuration, security, observability, and operational concerns. Podman provides a secure foundation with its rootless architecture and systemd integration. Combined with proper application structure and configuration management, you'll have a deployment that's both secure and maintainable.
The key is treating production deployment as a first-class concern from the start rather than an afterthought. The practices outlined here—multi-stage builds, health checks, structured logging, proper signal handling—aren't particularly difficult to implement, but they make an enormous difference when you're debugging an issue at 3am.