Conversation
Instead of making the app aware of local vs deployed file storage (filesystem fallback, local upload/download endpoints, file:// URLs), use Adobe S3Mock as a drop-in S3 replacement. Same SDK, different endpoint — one code path for both environments. - FileService: S3Client accepts LOCAL_S3_ENDPOINT_URL + forcePathStyle - Remove localFilePath, localStorageRoot, localExecutorStorageRoot - Remove local upload/download controller endpoints and service methods - Remove file:// URL generation in earthbeam-api.service - Always use s3 protocol and presigned URLs for file operations - Add S3Mock service to docker-compose - Executor (Python): boto3 accepts S3_ENDPOINT_URL - Local executors pass S3_ENDPOINT_URL + dummy AWS creds to subprocess - Simplify main.ts listen (0.0.0.0 is the default anyway) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mermaid sequence diagram for the job execution flow, AWS dependencies table, key file index for AWS touchpoints and app-executor communication, executor lifecycle steps, and S3 path structure. Gives AI agents (and developers) a quick map of how the pieces connect without reading every source file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the ExecutorService abstract class with a TypeScript interface and a string-based DI token (EXECUTOR_SERVICE). Extract the shared envVars helper as a standalone function (executorEnvVars) that each implementation calls directly. Lighter weight than inheritance — each implementation owns its constructor and dependencies explicitly rather than inheriting them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collapse the abstract EventEmitterService + three subclasses (AwsEventBridge, Logger, Noop) into a single service that conditionally creates the EventBridge client. No factory needed in the module — the service checks isLocalExecutor() at construction time and skips EventBridge in local mode (log-only). Remove LOCAL_EVENTS env var since the behavior is now derived from LOCAL_EXECUTOR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
585d720 to
8190932
Compare
Remove executorEnvVars helper — each implementation now calls apiAuth and appConfig directly. Three lines of logic don't warrant a shared function with three parameters. Rename executor.abstract.service.ts → executor.service.ts since it's an interface, not an abstract class. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove LOCAL_EXECUTOR_CALLBACK_BASE_URL — it was an unused escape hatch. The docker executor already passes --add-host=host.docker.internal:host-gateway, so the hardcoded host.docker.internal path works reliably. Also clean up getExternalApiConfig to use its local variables. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… mode isLocalExecutor now explicitly checks for 'python' | 'docker' instead of truthy coercion. Bundle refetch on every request is now controlled by a dedicated BUNDLE_CACHE_DISABLED flag rather than piggybacking on executor mode — local dev doesn't inherently need cache-busting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…HANDLER Replace the single conditional EventEmitterService with three implementations (eventbridge, log, noop) behind an interface + DI token, selected by the LOCAL_EVENT_HANDLER env var. EventBridge is the default when unset (prod). This follows the pattern of specific feature flags rather than a bundled "local mode" — you can mix and match, e.g. run executor locally but still emit to EventBridge if you have AWS creds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove TODO comments for executor interface approach (keeping inlined env vars) and docker networking (keeping host.docker.internal). Remove empty LOCAL_EXECUTOR= from .env.test and startJob TODO comment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Route all config through AppConfigService instead of direct process.env reads (FileService, ExecutorAwsService, EventEmitterEventBridgeService, local executor services) - Guard FileService S3Mock override with isDevEnvironment() check so LOCAL_S3_ENDPOINT_URL cannot accidentally activate in prod - Add startup warnings when LOCAL_EXECUTOR or LOCAL_EVENT_EMITTER are set outside of development (NODE_ENV !== 'development') - Rename BUNDLE_CACHE_DISABLED → LOCAL_BUNDLE_CACHE_DISABLED for consistent LOCAL_ prefix convention - Rename LOCAL_EVENT_HANDLER → LOCAL_EVENT_EMITTER to match service name Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NX_LOAD_DOT_ENV_FILES=false on test commands prevents NX from auto-loading
.env into process.env. ConfigModule.forRoot({ ignoreEnvFile: true }) in the
test harness prevents NestJS from loading .env when creating the test app.
Tests now rely solely on .env.test, loaded by loadEnvVars() in global-setup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NestJS v10+ defaults to listening on localhost (127.0.0.1), which makes the API unreachable from the executor Docker container via host.docker.internal. Made-with: Cursor
- check all dependencies upfront (node, npm, docker, docker compose, git) and report all missing at once - non-destructive env file setup (skip if exists, backfill LOCAL_EVENT_EMITTER if missing from older .env) - replace fragile sleep 3 with docker compose --wait backed by a postgres healthcheck in docker-compose.yml - executor mode prompt with docker/python choice, detects current setting as default on re-run, updates api/.env in-place via sed - python mode: skip venv creation if exists, git pull bundles if already cloned - docker mode: always rebuild image to pick up code changes - remove npm install -g prisma; fix prisma scripts in package.json to use npx (local devDependency) - fix executor setup.py: explicit find_packages() so setuptools auto-discovery doesn't choke on the output/ directory - add executor/.dockerignore to exclude output/ and local-run/ from docker build context Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract the manual seed SQL from README into api/seed.sql with ON CONFLICT DO NOTHING on every insert (IdP config, partner, school year, bundles, partner-bundle associations). init.sh runs it automatically after migrations via docker compose exec. Simplify README setup instructions — the three manual steps (docker compose up, wait, run seed SQL) are now a single ./init.sh invocation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the hardcoded bundle list in seed.sql with a new sync-bundles.sh script that fetches the earthmover_edfi_bundles registry.json from GitHub and upserts all bundle paths into the DB. This keeps local dev in sync with the published registry without manual seed updates. - add app/api/sync-bundles.sh: standalone, idempotent bundle sync - remove bundle inserts from seed.sql, add \echo labels for clarity - init.sh: add curl/jq dep checks, call sync-bundles.sh after seed, cd to own directory for portability, fix summary ANSI rendering - update README bundle instructions to reference sync-bundles.sh Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- add AWS_REGION to .env.test and .env.copyme so fresh clones don't fail on missing region when AWS SDK clients are constructed - add 23-24 and 24-25 school years to seed data - remove stale TODOs from integration tests - update init.sh summary: remove IdP config step, add dev/dev login - update README: S3Mock replaces storage directory for local files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Pushed some commits with the following goals:
The big change was using S3Mock rather than the file system for managing files. This worked fine out of the box. There are alternatives (e.g. LocalStack) that also support AWS SDKs and we can switch if we discover we don't like S3Mock for some reason and that should be a minor change. I originally was using minio until I saw that it stopped being maintained just a couple weeks ago, and swithing to S3Mock was basically swapping out a few lines in docker compose. There was also various cleanup stuff -- e.g. env vars from .env were leaking through when running tests, so I tightened up how the test harness loads env vars to just source from .env.test. Because tests already mock the components that are injected differently in local vs. deployed mode, there didn't seem to be a meaningful set of tests for this, though I'm open to doing more here. @bhaugeea I'd like your take. I think the main point of review should be that we're not introducing regressions or undue risk for deployed envs. |
|
Confirmed that once I fixed my .env file this is working great for me locally |
bhaugeea
left a comment
There was a problem hiding this comment.
I did some git stuff and copying and pasting to check the various code chunks that were more moved than edited, and they all check out. The modularity and DI work well and give a nice DX too. Both local options run fine on the example file that Andy gave me. I ran into two hiccups when running the init script, neither a bug per se but only a local case that wasn't explicitly handled (one was that the script asks which local mode you want to run, but doesn't edit the .env file based on that, and the other is a python thing commented below). Beyond that, a couple typos which can be ignored per your judgement, a typescript quirk which is easy to fix, and an executor logging question.
Only not ticking the Approve box because I don't know what the plan for approval is given the joint nature of the PR.
| import { Run } from '@prisma/client'; | ||
|
|
||
| export interface ExecutorService { | ||
| start(run: Run): Promise<void>; |
There was a problem hiding this comment.
I don't know why this is, but you seem to have to use the arrow function syntax if you want good enforcement of this type signature.
start: (run: Run) => Promise<void>;
app/README.md
Outdated
|
|
||
| #### 4. Configure an ODS | ||
|
|
||
| You'll need a valid (even if nono-prod) place to send data in order to run local jobs. |
| }); | ||
| proc.stderr?.on('data', (data) => { | ||
| this.logger.error(`Executor stderr: ${data}`); | ||
| }); |
There was a problem hiding this comment.
It seems like all the executor logs - other than the ones from git that circumvent the main logger service - are going here instead of the stdout handler. Maybe there's a good reason for that but I thought I'd mention it.
There was a problem hiding this comment.
Yeah, I think we might want to adjust when the executor sends logs to stdoout vs. stderr. I'd like to make these adjustments over time. We've talked about doing some re-work on the executor logs to make them easier to navigate and that'd be a good time to take a look at this.
| fail "Python 3.10+ required, found ${py_version}" | ||
| exit 1 | ||
| fi | ||
| ok "python3 ${py_version}" |
There was a problem hiding this comment.
I also had to install python3.12-venv, and had to manually delete the venv that was originally created because it lacked pip.
There was a problem hiding this comment.
Added a check for this and I think also addressed the issue you and @johncmerfeld ran into with not having LOCAL_EXECUTOR in a existing .env
app/README.md
Outdated
|
|
||
| ### Running the executor locally | ||
|
|
||
| In deployed environments, the executor runs as a Task in Elastic Container Service (ECS). Locally, the app initiate the executor based on the `LOCAL_EXECUTOR` environment variable: |
- check for ensurepip before creating python venv, with a clear message pointing to the python3.x-venv apt package - detect and recreate broken venvs that are missing pip - wait for postgres to accept host connections after docker compose up, avoiding ECONNREFUSED on migrations - backfill LOCAL_EXECUTOR in api/.env so the sed in step 7 always works - fix typos in README, fix executor service interface type Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@edandylytics did most of the brainwork for this feature; I think it's finally ready to go! I had to transfer this over from the old Runway repo, so please be on the lookout for any rollbacks of code changes that have come in since the switch.
How to try this
Follow the instructions in app/README.md. It should be pretty quick to get up and running jobs in either local mode.
What this does
When running locally:
What's changed
DEPLOYMENT_MODEenv var and by sendingfile://sources instead ofs3://ones. We could couple those more tightly but they are nominally independenthost.docker.internalFor reviewers
I think we achieved pretty good demarcation between the local stuff and production code, but this is a ratchet upward in complexity so we'll want to make sure none of the deployed functionality is impacted. There are a couple small TODOs that are really just opinion questions. New Typescript code should be scrutinized :)