jkzilla/Price-Grab

Price Grabber

Capture grocery shelf prices with your phone camera. Roboflow AI extracts the data → review & save → sync to Price Scout.

Two interfaces: a mobile app (Expo React Native, iOS App Store ready) and Python CLI scripts for bulk processing and dataset management.


Architecture

                          ┌─ Mobile App (Expo) ─────────────────────┐
                          │                                         │
   iPhone Camera ───────▶ │  Camera → Roboflow API → Review → Save │
                          │                    │                    │
                          │              AsyncStorage               │
                          │                    │                    │
                          │            Sync to Price Scout          │
                          └─────────────────────────────────────────┘

                          ┌─ Python CLI ────────────────────────────┐
                          │                                         │
   iPhone Photos ───────▶ │  Convert → Resize → Upload → Roboflow  │
   (HEIC / MOV)           │                    │                    │
                          │         Dataset (Universe)              │
                          │                    │                    │
                          │         Train → Custom Model            │
                          └─────────────────────────────────────────┘

Price extraction pipeline

Image ──▶ Object Detection ──▶ Crop tags ──▶ GPT-4o (structured) ──▶ JSON
          (find price tags)     (isolate)     (extract fields)        (output)

| Step | What happens |
|------|--------------|
| Capture | Take a photo or pick from library |
| Detect | Roboflow object detection finds every price tag |
| Extract | GPT-4o reads each cropped tag → product name, brand, price, sale price, unit, UPC |
| Review | Edit/confirm extracted records before saving |
| Store | Saved on-device (AsyncStorage), organized by store and date |
| Sync | One-tap sync to Price Scout GraphQL API, or export as JSON |
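
The "Crop tags" step between detection and extraction can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each detection is a center-based box in pixels (`x`, `y`, `width`, `height`), and the `Prediction` type, `crop_box` helper, and `pad` margin are hypothetical names introduced here.

```python
from typing import TypedDict


class Prediction(TypedDict):
    # Assumed shape: a center-based bounding box in pixels.
    x: float
    y: float
    width: float
    height: float


def crop_box(pred: Prediction, img_w: int, img_h: int, pad: int = 8) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom), padded and clamped to the image."""
    half_w = pred["width"] / 2
    half_h = pred["height"] / 2
    left = max(0, int(pred["x"] - half_w) - pad)
    top = max(0, int(pred["y"] - half_h) - pad)
    right = min(img_w, int(pred["x"] + half_w) + pad)
    bottom = min(img_h, int(pred["y"] + half_h) + pad)
    return left, top, right, bottom
```

The resulting tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` expects. A few pixels of padding keeps text at the edge of a tag from being clipped before it is sent for extraction.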

Quick start

Mobile app

npm install
npx expo start

Open Settings tab → enter API keys → select store → start capturing.

Python scripts

cd scripts
pip install -r requirements.txt
cp ../.env.example ../.env   # fill in API keys

Image upload pipeline

The problem with raw iPhone photos

iPhones shoot HEIC at 5-7MB per image. Uploading these directly to Roboflow is painfully slow and wasteful:

| Issue | Impact |
|-------|--------|
| HEIC format | Roboflow must decode on server side → slower ingestion, occasional failures |
| 6MB per image | Upload time scales linearly with size → 452 images = 2.5GB = 30+ min |
| Serial uploads | One-by-one API calls → most time spent waiting for network round-trips |
| No deduplication | Re-running uploads creates duplicates |

The solution: convert + resize + parallel upload

Our pipeline fixes all of this:

iPhone HEIC/MOV
      ↓
  Convert (ImageMagick / ffmpeg)
      ↓
  Resize to 1280px max, JPEG quality 80
      ↓
  Parallel upload (8 workers)
      ↓
  Roboflow dataset ✓

Benchmarks

| Metric | Before (raw) | After (optimized) | Improvement |
|--------|--------------|-------------------|-------------|
| Image size | 5-7MB each | ~270KB each | 95% smaller |
| Total payload | 2,563 MB | 122 MB | 21x smaller |
| Upload speed | ~6 imgs/min (serial) | ~320 imgs/min (8 workers) | 53x faster |
| 452 images | 30+ minutes | 84 seconds | 21x faster |
| Failures | Occasional HEIC decode errors | 0 failures | 100% clean |
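
The derived figures in the benchmark can be checked from the raw numbers. The short calculation below is only a sanity check using values copied from the table; it assumes 1 MB = 1000 KB.

```python
# Figures copied from the benchmark table.
images = 452
raw_mb, optimized_mb = 2563, 122
optimized_seconds = 84

payload_ratio = raw_mb / optimized_mb            # about 21x smaller
avg_kb = optimized_mb * 1000 / images            # about 270 KB per image
imgs_per_min = images / optimized_seconds * 60   # a little over 320 images per minute
serial_rate_at_30_min = images / 30              # about 15 images per minute

print(f"payload ratio: {payload_ratio:.1f}x")
print(f"average size:  {avg_kb:.0f} KB")
print(f"throughput:    {imgs_per_min:.0f} imgs/min")
print(f"serial rate if the run took exactly 30 min: {serial_rate_at_30_min:.0f} imgs/min")
```

Note that 452 images in 30 minutes works out to about 15 imgs/min, so the ~6 imgs/min serial figure would correspond to a run closer to 75 minutes. The "30+ minutes" entry is best read as a lower bound.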

Why 1280px is the right size

  • Roboflow resizes internally for training anyway (typically 640x640)
  • 1280px preserves enough detail for price tag text
  • Smaller images = faster training, less compute cost
  • Better model generalization (less noise from ultra-high-res)
  • JPEG quality 80 is near-lossless to the eye, which is sufficient for training purposes
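
To make the "1280px max" rule concrete, here is a small sketch of the size calculation. `fit_within` is a hypothetical helper, not a function from the scripts; it scales the longer side down to the limit, preserves aspect ratio, and never upscales.

```python
def fit_within(width: int, height: int, max_side: int = 1280) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side.

    Images already within the limit are returned unchanged (no upscaling).
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A 4032x3024 iPhone photo maps to 1280x960 under this rule. With Pillow, `Image.thumbnail((1280, 1280))` followed by `save(path, "JPEG", quality=80)` applies the same kind of shrink-only resize, though its rounding may differ by a pixel.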

Scripts

upload_to_roboflow.py — Primary upload tool

Auto-resizes and parallel-uploads images to Roboflow. Handles JPG, PNG, HEIC, and optionally extracts video frames.

# Upload a folder of images (auto-resize + 8 parallel workers)
python upload_to_roboflow.py ~/Downloads/roboflow_ready_images/

# Include video frame extraction
python upload_to_roboflow.py ~/Downloads/ --include-videos --recursive

# Custom batch name and more workers
python upload_to_roboflow.py ~/photos/ --batch-name "safeway-apr-4" --workers 10

# Skip resize (upload originals)
python upload_to_roboflow.py ~/photos/ --no-resize

What it does:

  1. Finds all JPG/PNG/HEIC images (and MOV/MP4 with --include-videos)
  2. Converts HEIC → JPG via ImageMagick
  3. Resizes to 1280px max dimension, JPEG quality 80
  4. Uploads with 8 parallel workers (configurable)
  5. Tracks progress every 25 images
  6. Skips duplicates (Roboflow rejects them automatically)
  7. Logs failures cleanly (up to 5 shown)
  8. Cleans up temp files on exit
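
The upload, progress, skip, and failure-logging steps above can be sketched with a thread pool. This is an illustration under stated assumptions, not the script's implementation: `upload_all` is a hypothetical name, and `upload_one` is a hypothetical callable that returns "uploaded" or "skipped" and raises on failure. `fake_upload` stands in for the real HTTP call so the sketch runs offline.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def upload_all(paths: Iterable[str], upload_one: Callable[[str], str], workers: int = 8) -> Counter:
    """Run upload_one over paths with a thread pool and tally the outcomes."""
    tally: Counter = Counter()
    failures: list[tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(upload_one, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                tally[future.result()] += 1
            except Exception as exc:  # one bad image should not stop the batch
                tally["failed"] += 1
                failures.append((path, str(exc)))
            if done % 25 == 0:  # progress report every 25 images
                print(f"{done}/{len(futures)} processed")
    for path, reason in failures[:5]:  # show at most five failures
        print(f"FAILED {path}: {reason}")
    return tally


def fake_upload(path: str) -> str:
    # Stand-in for the real HTTP call (hypothetical behavior).
    if path.endswith("dup.jpg"):
        return "skipped"
    if path.endswith("bad.jpg"):
        raise RuntimeError("simulated upload error")
    return "uploaded"
```

Threads suit this workload because each upload spends most of its time waiting on the network rather than using the CPU.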

convert_and_upload_to_roboflow.py — One-shot from Downloads

Scans ~/Downloads for iPhone HEIC/MOV files, converts, resizes, and uploads in one command.

# Process everything in ~/Downloads
python convert_and_upload_to_roboflow.py

# Custom source directory
python convert_and_upload_to_roboflow.py --source ~/Photos/store-trip

# Process ALL files, not just "* 2.*" pattern
python convert_and_upload_to_roboflow.py --all-files

run_workflow.py — Single image inference

Run the price extraction workflow on one image.

python run_workflow.py ~/photos/shelf.jpg --store store-safeway
python run_workflow.py https://example.com/shelf.jpg --store store-raleys

video_inference.py — Video inference via WebRTC

Process an entire video through the Roboflow workflow using WebRTC streaming.

python video_inference.py ~/videos/store_walk.mp4 --store store-safeway
python video_inference.py ~/videos/walk.mp4 --store store-raleys --save-frames

grab_prices.py — Original CLI tool

python grab_prices.py photo.jpg --store store-safeway
python grab_prices.py ./photos/ --store store-grocery-outlet

transform_to_price_scout.py — Convert to Price Scout format

python transform_to_price_scout.py output/capture.json --price-scout-path ../../price-scout

matcher.py — Fuzzy match to Price Scout catalog

Matches extracted products to existing Price Scout items via UPC (exact) or name+brand (Levenshtein).
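
A minimal sketch of that two-stage match, assuming catalog items use the same `upc`, `brand`, and `productName` field names as capture records. The `match_item` helper and the `max_ratio` threshold are hypothetical; the real matcher.py may normalize and score differently.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def match_item(record: dict, catalog: list[dict], max_ratio: float = 0.25) -> Optional[dict]:
    """Exact UPC match first, then the closest name+brand within max_ratio."""
    if record.get("upc"):
        for item in catalog:
            if item.get("upc") == record["upc"]:
                return item
    key = f'{record.get("brand") or ""} {record["productName"]}'.strip().lower()
    best, best_ratio = None, max_ratio
    for item in catalog:
        cand = f'{item.get("brand") or ""} {item["productName"]}'.strip().lower()
        # Normalize by length so long names are not penalized for small typos.
        ratio = levenshtein(key, cand) / max(len(key), len(cand), 1)
        if ratio <= best_ratio:
            best, best_ratio = item, ratio
    return best
```

Trying the UPC first avoids fuzzy comparisons whenever a barcode was read, and returning `None` for weak matches leaves room to create a new catalog item instead of forcing a wrong one.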


Gotchas and lessons learned

Image size is the #1 bottleneck

If uploads are slow, it's almost certainly because your images are too large. iPhone photos at full resolution are 5-7MB each. Always resize before uploading.

# Quick resize with ImageMagick (if you want to do it manually)
magick mogrify -resize 1280x1280\> -quality 80 *.jpg

HEIC causes server-side issues

Roboflow can accept HEIC but has to decode it server-side, which is slower and occasionally fails. Always convert to JPEG first. Our scripts handle this automatically.
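
For reference, a HEIC-to-JPEG conversion can be driven from Python roughly like this. It is a sketch, not the scripts' code: `heic_to_jpeg_cmd` and `convert` are hypothetical helpers, and it assumes ImageMagick 7 (the `magick` binary) built with HEIC support.

```python
import subprocess
from pathlib import Path


def heic_to_jpeg_cmd(src: Path, dst_dir: Path, max_side: int = 1280, quality: int = 80) -> list[str]:
    """Build an ImageMagick 7 command that converts and downsizes in one pass."""
    dst = dst_dir / (src.stem + ".jpg")
    return [
        "magick", str(src),
        "-auto-orient",                          # apply EXIF rotation before resizing
        "-resize", f"{max_side}x{max_side}>",    # ">" means shrink only, never enlarge
        "-quality", str(quality),
        str(dst),
    ]


def convert(src: Path, dst_dir: Path) -> None:
    # check=True raises CalledProcessError if ImageMagick exits non-zero.
    subprocess.run(heic_to_jpeg_cmd(src, dst_dir), check=True)
```

Passing the command as a list avoids shell quoting, so the `>` in the geometry needs no escaping here.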

Serial uploads are the #2 bottleneck

The Roboflow API accepts one image per request, so uploading 500 images serially means 500 sequential HTTP requests, each waiting on a network round-trip. With 8 parallel workers, 8 requests are in flight at once, for up to roughly 8x throughput from parallelism alone.

Don't go above ~10 workers or you'll hit rate limits.

Duplicates are auto-skipped

Roboflow rejects images that are already in the dataset. Our scripts detect this and count them as "skipped" rather than "failed." Safe to re-run.

Video frames need deduplication

When extracting frames from store walk videos, many consecutive frames are nearly identical. We extract at 1 FPS (not 30 FPS) to avoid flooding the dataset with duplicates. You can adjust VIDEO_FRAME_FPS in the scripts.
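
Sampling at a fixed rate can be done with ffmpeg's `fps` filter. The sketch below builds such a command and estimates the frame count. `frame_extract_cmd` and `expected_frames` are hypothetical helpers, and the 30 FPS source rate is an assumption about the recording.

```python
from pathlib import Path

VIDEO_FRAME_FPS = 1  # name taken from the text; 1 matches the stated default


def frame_extract_cmd(video: Path, out_dir: Path, fps: float = VIDEO_FRAME_FPS) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    pattern = out_dir / f"{video.stem}_%05d.jpg"
    return [
        "ffmpeg", "-i", str(video),
        "-vf", f"fps={fps}",   # the fps filter drops frames to the requested rate
        "-q:v", "2",           # high JPEG quality (lower values mean better quality)
        str(pattern),
    ]


def expected_frames(duration_s: float, source_fps: float = 30, fps: float = VIDEO_FRAME_FPS) -> tuple[int, int]:
    """Return (frames at the source rate, frames after sampling) for a clip."""
    return int(duration_s * source_fps), int(duration_s * fps)
```

For a two-minute walk-through, this is the difference between 3,600 near-identical frames and 120 sampled ones.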

API keys belong in .env, not in code

All scripts read from .env in the project root. The .gitignore excludes .env and all variants (.env.*, .env copy*). API keys configured in the mobile app are stored in iOS Keychain via expo-secure-store.
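
As an illustration of reading keys from .env rather than hard-coding them, here is a stdlib-only sketch. The scripts may well use a library for this instead. `parse_env` and `require` are hypothetical helpers, and the key names `ROBOFLOW_API_KEY` and `OPENAI_API_KEY` are assumptions; check .env.example for the real ones.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require(values: dict[str, str], name: str) -> str:
    """Prefer the real environment, fall back to .env, fail loudly if absent."""
    value = os.environ.get(name) or values.get(name)
    if not value:
        raise SystemExit(f"{name} is not set; add it to .env")
    return value
```

Typical use would be `values = parse_env(Path("../.env").read_text())` followed by `require(values, "ROBOFLOW_API_KEY")`. Failing early with a clear message is friendlier than an authentication error midway through a batch.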


Project structure

price-grabber-roboflow/
├── App.js                           ← Root: tab navigator + stack nav
├── app.json                         ← Expo config, iOS permissions
├── eas.json                         ← EAS Build profiles
├── package.json                     ← JS dependencies
├── index.js                         ← Entry point
├── src/
│   ├── constants/
│   │   ├── theme.js                 ← Colors, spacing, typography
│   │   └── stores.js                ← Ukiah store list
│   ├── screens/
│   │   ├── CameraScreen.js          ← Camera capture + photo library
│   │   ├── ProcessingScreen.js      ← Roboflow API call + progress
│   │   ├── ReviewScreen.js          ← Edit/confirm prices before save
│   │   ├── HistoryScreen.js         ← Past captures, sync, export
│   │   └── SettingsScreen.js        ← API keys, store, sync config
│   └── services/
│       ├── roboflowService.js       ← Roboflow Workflow API client
│       ├── captureStorage.js        ← AsyncStorage CRUD
│       ├── secureStorage.js         ← SecureStore for keys
│       └── syncService.js           ← Price Scout GraphQL sync
├── scripts/
│   ├── upload_to_roboflow.py        ← Resize + parallel upload
│   ├── convert_and_upload_to_roboflow.py  ← HEIC/MOV → resize → upload
│   ├── run_workflow.py              ← Single image inference
│   ├── video_inference.py           ← Video inference (WebRTC)
│   ├── grab_prices.py               ← Original CLI price grabber
│   ├── transform_to_price_scout.py  ← Convert to Price Scout format
│   ├── matcher.py                   ← Fuzzy match to item catalog
│   ├── resize_and_zip.sh            ← Manual resize + zip for UI upload
│   ├── convert_selected.sh          ← HEIC/MOV shell converter
│   ├── workflow_definition.json     ← Importable Roboflow workflow
│   └── requirements.txt             ← Python dependencies
├── assets/                          ← App icon, splash screen
└── .skills/                         ← Apollo GraphQL agent skill

Data storage

Mobile app

| Data | Storage | Encrypted |
|------|---------|-----------|
| API keys (Roboflow, OpenAI) | expo-secure-store | ✅ iOS Keychain |
| Settings (store, URLs) | expo-secure-store | ✅ iOS Keychain |
| Captures (prices, photos) | AsyncStorage | ❌ (on-device only) |

Capture record format

{
  "id": "uuid",
  "storeId": "store-safeway",
  "storeName": "Safeway",
  "imageUri": "file:///path/to/photo.jpg",
  "records": [
    {
      "productName": "Whole Milk, 1 Gallon",
      "brand": "Store Brand",
      "price": 4.29,
      "salePrice": null,
      "unit": "gallon",
      "upc": null
    }
  ],
  "createdAt": "2026-04-03T19:00:00.000Z",
  "synced": false,
  "syncedAt": null
}
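
On the Python side, an exported capture in this format could be loaded into typed records along these lines. This is a sketch: `PriceRecord`, `effective_price`, and `load_records` are hypothetical names, and only the field names come from the format above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PriceRecord:
    productName: str
    brand: Optional[str]
    price: float
    salePrice: Optional[float]
    unit: Optional[str]
    upc: Optional[str]

    @property
    def effective_price(self) -> float:
        """Sale price when present, otherwise the regular shelf price."""
        return self.salePrice if self.salePrice is not None else self.price


def load_records(capture_json: str) -> list[PriceRecord]:
    """Parse one capture document and return its typed price records."""
    capture = json.loads(capture_json)
    # Unknown or missing keys raise TypeError, which acts as a basic schema check.
    return [PriceRecord(**r) for r in capture["records"]]
```

Keeping `salePrice` and `upc` nullable mirrors the example, where a tag may show neither a sale price nor a barcode.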

Syncing to Price Scout

1. In-app sync (GraphQL)

Set Price Scout GraphQL URL in Settings → tap Sync on History screen.

2. JSON export

Tap Export on History → share via AirDrop/Messages → process with Python:

cd scripts
python transform_to_price_scout.py exported_data.json --price-scout-path ../../price-scout

Build for iOS App Store

npm install -g eas-cli
eas login
eas build --platform ios --profile production
eas submit --platform ios

| Field | Value |
|-------|-------|
| Bundle ID | com.pricescout.grabber |
| App Name | Price Grabber |
| Category | Utilities / Shopping |
| Privacy | Camera, Photo Library |

Training a custom model

Current dataset

  • 582 images uploaded to Roboflow Universe
  • 452 store shelf photos (3 Ukiah stores: Safeway, Raley's, Grocery Outlet)
  • 130 video frames from store walk-throughs
  • All resized to 1280px max, JPEG quality 80

Workflow

  1. Annotate at app.roboflow.com — draw bounding boxes around price tags, class: price-tag
  2. Generate a dataset version (auto train/valid/test split)
  3. Train on Roboflow (free on Public plan, ~15-30 min)
  4. Update workflow to use your model: your-workspace/find-3stores-shelf-prices/N

Tips for annotation

  • Draw tight boxes around the full price tag (product name + price + any sale info)
  • Include all visible tags, even partially obscured ones
  • Be consistent across stores — same class for all tag styles
  • Aim for 50+ annotated images minimum before first training

Prerequisites

  • Node.js 18+ (for Expo app)
  • Python 3.10+ (for scripts)
  • ImageMagick (brew install imagemagick) — for HEIC conversion
  • ffmpeg (brew install ffmpeg) — for video frame extraction
  • Roboflow account (free Public plan) — app.roboflow.com
  • OpenAI API key (platform.openai.com)

Next steps

  • Annotate 50+ images → train first custom model
  • Add barcode scanner for UPC auto-fill (expo-barcode-scanner, or expo-camera's built-in barcode scanning on newer Expo SDKs)
  • Offline queue — retry sync when connection is restored
  • Price history charts per item
  • Multi-photo batch capture mode
  • Auto-upload from mobile app directly to Roboflow dataset
