1 change: 0 additions & 1 deletion .dockerignore

This file was deleted.

1 change: 0 additions & 1 deletion Procfile

This file was deleted.

248 changes: 248 additions & 0 deletions RENDER_DEPLOYMENT.md
@@ -0,0 +1,248 @@
# Render Deployment Guide

This guide explains how to deploy OcotilloAPI to Render using the `render.yaml` blueprint.

## Overview

The `render.yaml` file defines two complete environments:
- **Staging**: Auto-deploys from `staging` branch
- **Production**: Manual deploys from `main` branch (requires approval)

## Architecture

### Services Created

1. **Web Services** (2)
- `ocotillo-api-staging` - Staging environment
- `ocotillo-api-production` - Production environment

2. **Databases** (2)
- `ocotillo-db-staging` - PostgreSQL 17 with PostGIS
- `ocotillo-db-production` - PostgreSQL 17 with PostGIS

3. **Environment Variable Groups** (3)
- `ocotillo-shared` - Common configuration for all environments
- `ocotillo-staging` - Staging-specific settings
- `ocotillo-production` - Production-specific settings
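The groups above map onto the `envVarGroups` section of `render.yaml`. As a rough sketch (the group names match the list above; the individual keys shown here are illustrative, not the project's actual settings):

```yaml
# Illustrative sketch only - see render.yaml for the real definitions.
envVarGroups:
  - name: ocotillo-shared
    envVars:
      - key: LOG_LEVEL        # hypothetical shared setting
        value: info
  - name: ocotillo-staging
    envVars:
      - key: AUTHENTIK_URL
        sync: false           # set manually in the dashboard
  - name: ocotillo-production
    envVars:
      - key: AUTHENTIK_URL
        sync: false
```

Variables marked `sync: false` are declared in the blueprint but never committed to git; Render prompts for their values in the dashboard.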

## Initial Deployment

### Step 1: Connect Repository

1. Log in to [Render Dashboard](https://dashboard.render.com)
2. Click **"New Blueprint Instance"**
3. Connect your GitHub repository
4. Select the branch containing `render.yaml` (e.g., `render-deploy`)

### Step 2: Configure Environment Variables

Render will automatically create the environment variable groups, but you need to set values for `sync: false` variables:

#### Staging Environment (`ocotillo-staging` group)

Set these in the Render dashboard:

```bash
# Authentik OAuth Configuration
AUTHENTIK_URL=https://your-staging-authentik-instance.com
AUTHENTIK_CLIENT_ID=your_staging_client_id
AUTHENTIK_AUTHORIZE_URL=https://your-staging-authentik-instance.com/application/o/authorize/
AUTHENTIK_TOKEN_URL=https://your-staging-authentik-instance.com/application/o/token/

# Google Cloud Storage (if using assets)
GCS_BUCKET_NAME=your-staging-bucket
GOOGLE_APPLICATION_CREDENTIALS=/path/to/staging/credentials.json
```

#### Production Environment (`ocotillo-production` group)

```bash
# Authentik OAuth Configuration
AUTHENTIK_URL=https://your-production-authentik-instance.com
AUTHENTIK_CLIENT_ID=your_production_client_id
AUTHENTIK_AUTHORIZE_URL=https://your-production-authentik-instance.com/application/o/authorize/
AUTHENTIK_TOKEN_URL=https://your-production-authentik-instance.com/application/o/token/

# Google Cloud Storage (if using assets)
GCS_BUCKET_NAME=your-production-bucket
GOOGLE_APPLICATION_CREDENTIALS=/path/to/production/credentials.json
```

**Note**: `SESSION_SECRET_KEY` is auto-generated by Render (`generateValue: true`).
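In blueprint terms, an auto-generated secret like that is declared roughly as follows (a sketch of the relevant `envVars` entry):

```yaml
envVars:
  - key: SESSION_SECRET_KEY
    generateValue: true   # Render generates a random value at first deploy
```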

### Step 3: Review and Deploy

1. Review the services that will be created
2. Click **"Apply"** to create all resources
3. Render will:
- Create PostgreSQL databases with PostGIS extension
- Build Docker images
- Run database migrations
- Deploy the applications

## Database Configuration

### PostGIS Extension

The `preDeployCommand` automatically installs PostGIS:
```bash
psql $DATABASE_URL -c "CREATE EXTENSION IF NOT EXISTS postgis;"
```
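In the blueprint, this hook lives on the web service definition, roughly like so (a sketch; the real service entry in `render.yaml` has more fields):

```yaml
services:
  - type: web
    name: ocotillo-api-staging
    preDeployCommand: psql $DATABASE_URL -c "CREATE EXTENSION IF NOT EXISTS postgis;"
```

Because it runs before each deploy, the `IF NOT EXISTS` clause makes it safe to repeat on every release.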

### Connection Details

- **DATABASE_URL** is automatically provided by Render
- Connections are internal-only by default (`ipAllowList: []`)
- To allow external connections, add IP addresses to `ipAllowList` in `render.yaml`
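If you do need external access, the allow-list entries look roughly like this (the CIDR below is a documentation example, not a recommendation):

```yaml
ipAllowList:
  - source: 203.0.113.0/24   # example office range (TEST-NET-3)
    description: office
```

Leaving `ipAllowList: []` keeps the database reachable only from services in the same Render account and region.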

## Health Checks

Both environments use the `/health/ready` endpoint:
- Checks application responsiveness
- Verifies database connectivity
- Returns 200 if healthy, 503 if not ready

## Deployment Workflow

### Staging Deployments

```bash
git checkout staging
git merge render-deploy # or your feature branch
git push origin staging
```

**Result**: Automatic deployment to staging environment

### Production Deployments

```bash
git checkout main
git merge staging
git push origin main
```

**Result**: Build triggered, but requires manual approval in Render dashboard

## Environment-Specific Configuration

### Staging
- `MODE=development`
- Auto-deploy enabled
- Authentication enabled (Authentik)
- Smaller instance size (starter plan)

### Production
- `MODE=production`
- Manual deploy (requires approval)
- Authentication enabled (Authentik)
- Larger instance size (standard plan)

## Service Plans

Current configuration in `render.yaml`:

| Service | Plan | Notes |
|---------|------|-------|
| Staging Web | Starter | Suitable for testing |
| Production Web | Standard | Recommended for production traffic |
| Staging DB | Standard | Can downgrade to Starter if needed |
| Production DB | Standard | Recommended for production data |

To change plans, edit the `plan` field in `render.yaml`.

## Scaling Configuration

The application uses **Gunicorn with 4 workers** by default (configured in `entrypoint.sh`).

To adjust workers:
1. Edit `entrypoint.sh`
2. Change `--workers 4` to desired number
3. Commit and push changes

**Recommended**: 2-4 workers per container, scale horizontally by adding more instances.

## Monitoring

### Health Check Endpoints

- `/health/live` - Basic liveness check
- `/health/ready` - Readiness check with DB connectivity
- `/health/status` - Detailed status with pool metrics
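For dashboards consuming `/health/status`, the payload shape is defined in `api/health.py` in this PR. A small sketch of parsing it (the `summarize` helper is a hypothetical convenience, not part of the API; the sample mirrors the fields the endpoint returns):

```python
def summarize(payload: dict) -> str:
    """Return a one-line summary of a /health/status response body."""
    db = payload.get("database", {})
    db_state = "up" if db.get("connected") else "down"
    return f"app={payload['status']} db={db_state}"

# Sample response body, matching the structure built in api/health.py.
sample = {
    "status": "healthy",
    "timestamp": "2025-01-01T00:00:00",
    "database": {"status": "connected", "connected": True},
    "services": {"admin": "available", "cors": "enabled"},
}
print(summarize(sample))  # → app=healthy db=up
```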

### Logs

Access logs in Render dashboard:
1. Navigate to your service
2. Click **"Logs"** tab
3. Filter by severity or search terms

## Troubleshooting

### Database Connection Issues

Check if PostGIS extension is installed:
```bash
# Via Render Shell
psql $DATABASE_URL -c "SELECT PostGIS_Version();"
```

### Migration Failures

Run migrations manually via Render Shell:
```bash
alembic upgrade head
```

### Port Binding Issues

Ensure the `$PORT` environment variable is used (already configured in `entrypoint.sh`).

## Updating the Blueprint

After modifying `render.yaml`:

1. Commit and push changes
2. Go to Render dashboard
3. Navigate to Blueprint instance
4. Click **"Sync"** to apply changes

**Warning**: Some changes may cause service restarts.

## Cost Optimization

### Staging Environment

Consider downgrading if budget-constrained:
- Web service: `starter` → `free` (with limitations)
- Database: `standard` → `starter` (1GB storage, no point-in-time recovery)

### Production Environment

Recommended minimum:
- Web service: `standard` (for multiple workers)
- Database: `standard` (for backups and PITR)

## Security Considerations

1. **Authentication**: Authentik OAuth is enabled by default
2. **Database**: Internal-only access (no public IP by default)
3. **Secrets**: All sensitive values use `sync: false` (not in git)
4. **HTTPS**: Automatically provided by Render

## Next Steps

After deployment:

1. ✅ Verify health checks are passing
2. ✅ Test API endpoints
3. ✅ Configure custom domains (optional)
4. ✅ Set up monitoring/alerting
5. ✅ Configure backup schedule
6. ✅ Review and optimize database performance

## Support

- [Render Documentation](https://render.com/docs)
- [Render Status Page](https://status.render.com)
- [Community Forum](https://community.render.com)
12 changes: 11 additions & 1 deletion alembic/env.py
@@ -56,7 +56,17 @@ def build_database_url():
return f"postgresql+pg8000://{user}@/{database}"
return f"postgresql+pg8000://{user}:{password}@/{database}"

# Default/Postgres
# Check for DATABASE_URL first (Render/Heroku standard)
database_url = os.environ.get("DATABASE_URL", "")
if database_url:
# Convert to psycopg2 driver for alembic
if database_url.startswith("postgres://"):
return database_url.replace("postgres://", "postgresql+psycopg2://", 1)
elif database_url.startswith("postgresql://"):
return database_url.replace("postgresql://", "postgresql+psycopg2://", 1)
return database_url

# Fall back to individual env vars (backward compatible)
user = os.environ.get("POSTGRES_USER", "")
password = os.environ.get("POSTGRES_PASSWORD", "")
db = os.environ.get("POSTGRES_DB", "")
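The scheme rewrite added above can be isolated as a small helper for illustration: Render and Heroku expose `DATABASE_URL` with a `postgres://` or `postgresql://` scheme, while Alembic here needs the `psycopg2` driver spelled out explicitly. A sketch mirroring the diff's logic:

```python
def to_psycopg2_url(database_url: str) -> str:
    """Rewrite a Render/Heroku-style DATABASE_URL for SQLAlchemy + psycopg2."""
    if database_url.startswith("postgres://"):
        return database_url.replace("postgres://", "postgresql+psycopg2://", 1)
    if database_url.startswith("postgresql://"):
        return database_url.replace("postgresql://", "postgresql+psycopg2://", 1)
    return database_url  # leave non-Postgres URLs untouched

print(to_psycopg2_url("postgres://user:pw@host:5432/db"))
# → postgresql+psycopg2://user:pw@host:5432/db
```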
121 changes: 121 additions & 0 deletions api/health.py
@@ -0,0 +1,121 @@
"""Health check endpoints for deployment monitoring and container orchestration."""

from datetime import datetime
from typing import Any

from fastapi import APIRouter, HTTPException, status
from sqlalchemy import text

from core.dependencies import session_dependency

router = APIRouter(prefix="/health", tags=["Health"])


@router.get("/live", summary="Liveness probe")
async def liveness_check() -> dict[str, str]:
"""
Basic liveness probe - checks if the application is running.

This endpoint should always return 200 if the app is alive.
No dependencies checked - just confirms the web server is responding.

Use this for Kubernetes/Docker liveness probes.
"""
return {"status": "ok"}


@router.get("/ready", summary="Readiness probe")
async def readiness_check(session: session_dependency) -> dict[str, Any]:
"""
Readiness probe - checks if the application is ready to serve traffic.

Verifies:
- Application is running
- Database connection is available

Returns 200 if ready, 503 if not ready.
Use this for Kubernetes/Docker readiness probes and load balancer health checks.
"""
try:
# Test database connectivity with a simple query
result = await session.execute(text("SELECT 1"))
result.scalar()

return {
"status": "ready",
"database": "connected",
"checks": {"db_connection": True},
}
except Exception as e:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail={
"status": "not_ready",
"database": "disconnected",
"error": str(e),
"checks": {"db_connection": False},
},
)


@router.get("/status", summary="Detailed status information")
async def status_check(session: session_dependency) -> dict[str, Any]:
"""
Detailed status endpoint - provides comprehensive health information.

Returns:
- Application status
- Database connectivity and pool information
- Timestamp
- Additional service status

Use this for monitoring dashboards and detailed health reports.
"""
timestamp = datetime.utcnow().isoformat()

# Check database connectivity
db_status = "disconnected"
db_connected = False
pool_info = {}

try:
result = await session.execute(text("SELECT 1"))
result.scalar()
db_status = "connected"
db_connected = True

# Get connection pool information
pool = session.get_bind().pool
pool_info = {
"size": pool.size(),
"overflow": pool.overflow() if hasattr(pool, "overflow") else None,
"checked_in": pool.checkedin() if hasattr(pool, "checkedin") else None,
"checked_out": pool.checkedout() if hasattr(pool, "checkedout") else None,
}
# Remove None values
pool_info = {k: v for k, v in pool_info.items() if v is not None}

except Exception as e:
db_status = f"error: {str(e)}"

response = {
"status": "healthy" if db_connected else "degraded",
"timestamp": timestamp,
"database": {
"status": db_status,
"connected": db_connected,
},
"services": {"admin": "available", "cors": "enabled"},
}

# Add pool info if available
if pool_info:
response["database"]["pool_info"] = pool_info

# Return 503 if database is not connected
if not db_connected:
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE, detail=response
)

return response

**Check warning** — Code scanning / CodeQL

**Information exposure through an exception** (Medium)

Stack trace information flows to this location and may be exposed to an external user.

**Copilot Autofix** (AI, 6 days ago)

In general, to fix this class of problem you should avoid including raw exception messages or stack traces in API responses. Instead, log the detailed error server-side (using your logging framework) and return a generic, non-sensitive status or error description to the client.

For this specific code:

- In `readiness_check`, the `detail` field currently includes `"error": str(e)`. This should be replaced with a generic message such as `"error": "database connectivity check failed"` to avoid leaking `str(e)` to the user. Optionally, log the real exception with a logger.
- In `status_check`, the except block sets `db_status = f"error: {str(e)}"`, which is then returned to the client in `response["database"]["status"]`. Replace this with a generic description like `"error: database connectivity check failed"` and, again, optionally log the exception.

We can introduce a logging import at the top of `api/health.py` and a module logger via `logger = logging.getLogger(__name__)`. In both except blocks, call `logger.exception(...)` so developers still get full tracebacks in the logs, while the API only returns sanitized messages. No changes to function signatures or overall behavior (HTTP status codes, JSON structure, keys like `"status"`/`"database"`/`"checks"`) are needed — only the content of the error strings changes.

Suggested changeset 1 — `api/health.py`

**Autofix patch**

Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/api/health.py b/api/health.py
--- a/api/health.py
+++ b/api/health.py
@@ -2,12 +2,15 @@
 
 from datetime import datetime
 from typing import Any
+import logging
 
 from fastapi import APIRouter, HTTPException, status
 from sqlalchemy import text
 
 from core.dependencies import session_dependency
 
+logger = logging.getLogger(__name__)
+
 router = APIRouter(prefix="/health", tags=["Health"])
 
 
@@ -47,12 +50,13 @@
             "checks": {"db_connection": True},
         }
     except Exception as e:
+        logger.exception("Readiness database connectivity check failed")
         raise HTTPException(
             status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
             detail={
                 "status": "not_ready",
                 "database": "disconnected",
-                "error": str(e),
+                "error": "database connectivity check failed",
                 "checks": {"db_connection": False},
             },
         )
@@ -96,7 +100,8 @@
         pool_info = {k: v for k, v in pool_info.items() if v is not None}
 
     except Exception as e:
-        db_status = f"error: {str(e)}"
+        logger.exception("Status database connectivity check failed")
+        db_status = "error: database connectivity check failed"
 
     response = {
         "status": "healthy" if db_connected else "degraded",
EOF