This project aims to finalize and stabilize the VMR (ViewMyRecords) crawler for production.
- SPA-aware navigation (breadcrumbs, back-button, ".." folder)
- Session conflict handling ("Login Here" button)
- Resilient folder and file detection
- Recursive discovery with duplicate prevention
- Metadata extraction (sidecar JSON files)
- Batch processing and error recovery
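
As an illustration of recursive discovery with duplicate prevention, here is a minimal sketch. It does not reproduce the crawler's actual code: `discover`, `list_children`, and the in-memory `TREE` are hypothetical names, and a real run would list folders through the browser instead of a dictionary. The idea is to keep a set of visited paths so that parent links (such as a resolved ".." entry) and repeated links are never crawled twice.

```python
from typing import Callable, Dict, Iterator, List, Set


def discover(root: str, list_children: Callable[[str], List[str]]) -> Iterator[str]:
    """Yield every reachable folder path exactly once, depth-first."""
    seen: Set[str] = set()
    stack: List[str] = [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # already visited via another link
        seen.add(path)
        yield path
        # Push children in reverse so they are visited in listing order.
        for child in reversed(list_children(path)):
            if child not in seen:
                stack.append(child)


# Hypothetical folder tree standing in for the real folder listing.
# "/a" links back to its parent, and "/b" links to an already-known folder.
TREE: Dict[str, List[str]] = {
    "/": ["/a", "/b"],
    "/a": ["/a/x", "/"],
    "/a/x": ["/a"],
    "/b": ["/a/x"],
}

if __name__ == "__main__":
    for folder in discover("/", lambda p: TREE.get(p, [])):
        print(folder)
```

An explicit stack is used instead of recursion so that deeply nested folder trees cannot hit Python's recursion limit.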
- Python 3.8+
- Playwright (`pip install playwright`, then `playwright install`)
- Set environment variables: `VMR_CORPORATE_ID`, `VMR_USERNAME`, `VMR_PASSWORD`.
- Run the script: `python production_migration_engine.py`.
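
Before launching a long run, it can help to confirm that the three environment variables are actually set. The following is a minimal sketch of such a check, not the script's own startup logic: `load_credentials` and `REQUIRED_VARS` are hypothetical names introduced here for illustration.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the setup steps above.
REQUIRED_VARS = ("VMR_CORPORATE_ID", "VMR_USERNAME", "VMR_PASSWORD")


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required variables, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


if __name__ == "__main__":
    creds = load_credentials()
    # Print only the variable names, never the secret values.
    print("Loaded:", ", ".join(sorted(creds)))
```

Failing fast with the full list of missing names avoids discovering a bad configuration only after the browser has started.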
This project is fully dockerized to avoid version conflicts. See DOCKER_INSTRUCTIONS.md for details.
- Create a `.env` file (see above).
- Run `docker-compose up -d`.
- Execute the downloader: `docker-compose exec vmr-migration python production_migration_engine_new.py`.