RES-2060: Test deployment on Fly.io with group tags branch #830

Open
edwh wants to merge 23 commits into upgrade-laravel-10x-restart from RES-2060_fly_io_deployment
@edwh (Collaborator) commented Feb 2, 2026

Summary

  • Test deployment of the application on Fly.io
  • Built off the group tags branch (RES-2054) to validate deployment with that feature

Related

Test plan

  • Application deploys successfully to Fly.io
  • Core functionality works in Fly.io environment
  • Group tags feature accessible in deployed environment

🤖 Generated with Claude Code

if [[ -n "$DB_PASS" ]]; then
    log_step "Setting MYSQL_PASSWORD and MYSQL_ROOT_PASSWORD on ${FLY_DB_APP}..."
    if [[ "$DRY_RUN" = true ]]; then
        log_dry "echo 'MYSQL_PASSWORD=***\nMYSQL_ROOT_PASSWORD=***' | fly secrets import -a ${FLY_DB_APP}"

Check failure: Code scanning / SonarCloud

MySQL database passwords should not be disclosed (High)

Make sure this MySQL password gets changed and removed from the code.
    if [[ "$DRY_RUN" = true ]]; then
        log_dry "echo 'MYSQL_PASSWORD=***\nMYSQL_ROOT_PASSWORD=***' | fly secrets import -a ${FLY_DB_APP}"
    else
        printf "MYSQL_PASSWORD=%s\nMYSQL_ROOT_PASSWORD=%s\n" "$DB_PASS" "$DB_PASS" \

Check failure: Code scanning / SonarCloud

MySQL database passwords should not be disclosed (High)

Make sure this MySQL password gets changed and removed from the code.
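
The flagged lines log a masked command in dry-run mode and otherwise appear to pipe the real password to `fly secrets import` over stdin. A minimal sketch of that pattern follows, assuming the masking is done with `sed`. The helper names `build_secrets_payload`, `mask_secrets_payload`, and `import_db_secrets` are hypothetical and do not come from the PR.

```shell
# Sketch only: these helper names are hypothetical, not from the PR.

# Emit NAME=VALUE lines in the form `fly secrets import` reads from stdin.
build_secrets_payload() {
    printf 'MYSQL_PASSWORD=%s\nMYSQL_ROOT_PASSWORD=%s\n' "$1" "$1"
}

# Replace every value with *** so the payload is safe to log.
mask_secrets_payload() {
    sed 's/=.*/=***/'
}

# import_db_secrets APP PASSWORD [DRY_RUN]
# A dry run prints the masked payload; otherwise the real one is piped to flyctl.
import_db_secrets() {
    if [ "${3:-false}" = true ]; then
        build_secrets_payload "$2" | mask_secrets_payload
    else
        build_secrets_payload "$2" | fly secrets import -a "$1"
    fi
}
```

Calling `import_db_secrets "$FLY_DB_APP" "$DB_PASS" true` prints the two masked lines and contacts nothing. Without the third argument, the payload goes to flyctl over stdin, so the password stays out of the command line and the log.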
@edwh changed the base branch from develop to RES-2054_group_tags on March 19, 2026 08:09
edwh and others added 17 commits March 19, 2026 12:00
- Dockerfile.fly: multi-stage build with PHP 8.2-FPM, nginx, supervisord
- fly.toml / fly-mysql.toml: app and MySQL machine configs
- nginx-fly.conf: nginx config with unix socket and Tigris proxy
- supervisord-fly.conf: process manager for nginx, php-fpm, cron
- startup.sh: background DB migrations, immediate supervisord start
- TrustProxies: trust all proxies (Fly terminates TLS)
- Disable Discourse/Wiki features when URLs not configured
- Add try-catch around Discourse notification fetch
- Remove zz-docker.conf override in Dockerfile

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nx proxy

- Fix startup.sh to chown storage after artisan cache commands run as root,
  preventing file_put_contents errors on group/fixometer pages
- Fix DiscourseAnonymiseUser and DiscourseChangeSetting constructor bugs that
  called $this->error() before output was initialized, crashing all artisan
  commands during Docker build
- Bump PHP memory_limit from 256M to 512M for heavy query pages
- Fix nginx Tigris proxy: add resolver, rewrite rule, and SNI support
- Update fly-migrate.sh: remove DB credentials from secrets list, add Mailgun
  secrets, conditional mysqldump flags, idempotent DB import, progress indicator

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…course commands)

Move env validation and $this->error() calls from __construct() to handle()
to prevent crashing all artisan commands during Docker build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Memory bump needed for larger operations. Deploy workflow is disabled
by default with instructions for activation via GitHub Actions or CircleCI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…certificate details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…xes, Croppa/WP checks

- Document actual DNS TTLs (3600s) and subdomains from iwantmyname.com
- Verified test email sends from Fly via Mailgun (mg.restarters.net SPF/DKIM ok)
- Fixed FEATURE__DISCOURSE_INTEGRATION and FEATURE__WIKI_INTEGRATION to true
- Confirmed WordPress XML-RPC reachable from Fly (HTTP 200)
- Confirmed Croppa dynamic resizing not used (Croppa::url never called)
- Added MAIL_FROM_ADDRESS/MAIL_FROM_NAME to missing secrets list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… collapsible sections, CI deploy option

- Reorganised to lead with What Changes / What Stays the Same
- Added Metabase direct DB access as known risk
- Moved Fly config setup to pre-cutover (days before, not during maintenance window)
- Added CircleCI automated deploy option
- Clarified DNS is at iwantmyname.com (not Cloudflare, no CNAME flattening)
- Made all sections collapsible with <details> tags
- Fixed section numbering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… title are on same line

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On Fly.io, nginx serves /uploads/ from Tigris (S3), not local disk.
FixometerFile writes to local disk for Intervention Image processing,
but those files were never synced to Tigris, so images were broken.

After local processing (orientation fix, thumbnail/mid generation),
files are now synced to S3 via Storage facade. The syncToCloud method
is a no-op when FILESYSTEM_DISK is not 's3', so existing production
and test environments are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, hauts-de-france)

All three point to current production server and need DNS + TLS certs on Fly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… DNS cutover

Wildcard cert already created on Fly. Needs _acme-challenge CNAME at
iwantmyname.com for DNS-01 validation (can be done before cutover).
Covers repairtogether, repairshare, hauts-de-france subdomains.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@edwh changed the base branch from RES-2054_group_tags to upgrade-laravel-10x-restart on March 19, 2026 12:04
@edwh force-pushed the RES-2060_fly_io_deployment branch from 728031a to 44b2142 on March 19, 2026 12:04
edwh and others added 4 commits March 19, 2026 12:12
- Deploy L10+Fly first, group tags later as preview
- Scale up to shared-cpu-2x/4GB for launch
- Fix MAIL_FROM_ADDRESS to noreply@mg.restarters.net
- Hourly backups to Google Drive, Metabase pulls from there
- Simplified rollback: just switch DNS back
- Add API compatibility check for third parties
- Add ERES preview deployment plan
- Concrete timeline: migration ~Apr 9, group tags ~Apr 30
- Note production branch cleanup needed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Mount /var/log as persistent volume so logs survive redeploys
- Add request timing to nginx access log (rt, uct, uht, urt fields)
- Enable PHP-FPM slow log (5s threshold) and request_terminate_timeout (60s)
- Install sysstat and run sar collection every 60s via supervisord
- Create log directories on volume in startup.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The health check was hitting the homepage, which runs Fixometer::loginRegisterStats(),
causing 60s+ timeouts and saturating PHP-FPM workers. It now targets /robots.txt,
a static file served by nginx without touching PHP.

Added logrotate for nginx and PHP-FPM slow logs (14 day retention).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Symlinks /var/www/storage/framework/cache/data to /var/log/cache/data
on the persistent volume. Previously the cache was on ephemeral container
storage and blown away on every deploy, causing 15s+ homepage loads
while Fixometer stats recalculated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
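
The cache symlink described in the commit above can be sketched as a small idempotent step for startup.sh. This is an illustration only, assuming the step runs on every boot. The function name `link_cache_to_volume` is hypothetical, and the two paths in the usage note are the ones the commit message names.

```shell
# Sketch only: link_cache_to_volume is a hypothetical helper name.
# link_cache_to_volume CACHE_DIR VOLUME_DIR
link_cache_to_volume() {
    cache_dir="$1"
    volume_dir="$2"

    # Make sure the target on the volume and the link's parent both exist.
    mkdir -p "$volume_dir" "$(dirname "$cache_dir")"

    # A real directory shipped in the image would block the symlink, so drop it.
    if [ -d "$cache_dir" ] && [ ! -L "$cache_dir" ]; then
        rm -rf "$cache_dir"
    fi

    # -f replaces an existing link, -n avoids descending into it.
    ln -sfn "$volume_dir" "$cache_dir"
}
```

A call such as `link_cache_to_volume /var/www/storage/framework/cache/data /var/log/cache/data` can be repeated on every boot without losing cached entries, because only the link is recreated and the data on the volume is left alone.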
@edwh force-pushed the RES-2060_fly_io_deployment branch from 87dcfaf to 9903115 on March 19, 2026 17:30
Homepage request causes PHP-FPM to spike to ~400MB, exhausting all
available memory (985MB total, ~400MB free at idle). Without swap,
the OOM killer terminates PHP-FPM workers causing 502 errors.

Creates the swap file with fallocate (instant), falling back to dd. The file
lives on the root filesystem, so it is recreated on each boot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
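
The fallocate-with-dd-fallback approach from the swap commit above might look like the following sketch, assuming a Linux host with util-linux and coreutils. The function names `create_swap_file` and `enable_swap` are hypothetical, as are the path and size in the usage note, because the PR text does not state them.

```shell
# Sketch only: function names, path and size are hypothetical.
# create_swap_file PATH SIZE_MB
create_swap_file() {
    swap_path="$1"
    size_mb="$2"

    # fallocate reserves the blocks without writing them, so it is near-instant.
    if ! fallocate -l "${size_mb}M" "$swap_path" 2>/dev/null; then
        # Fallback when fallocate is missing or unsupported: write zeros with dd.
        dd if=/dev/zero of="$swap_path" bs=1048576 count="$size_mb" 2>/dev/null
    fi

    # Swap files must not be readable by other users.
    chmod 600 "$swap_path"
}

# enable_swap PATH: format and activate the file. Requires root.
enable_swap() {
    mkswap "$1" >/dev/null && swapon "$1"
}
```

On a machine with root access, `create_swap_file /swapfile 512 && enable_swap /swapfile` would allocate and activate 512 MiB. Because the file sits on the root filesystem, startup.sh would need to run this on every boot.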
@edwh force-pushed the RES-2060_fly_io_deployment branch from 9903115 to 6988670 on March 19, 2026 17:46
The all_stats cache was storing 17,291 full Party model objects (96MB
serialized). Every homepage load deserialized this, taking 5+ seconds.

Only the count was ever used. Now caches just the integer count.
Cache file drops from 96MB to 112KB, read time from 5s to <1ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sonarqubecloud

Quality Gate failed

Failed conditions
4 Security Hotspots
5.0% Duplication on New Code (required ≤ 3%)
E Security Rating on New Code (required ≥ A)


1 participant