Click here to try it out.
- About
- Overview
- Architecture
- Quick Start
- Project Structure
- Development
- Deployment
- Configuration
- Infrastructure as Code
- API Endpoints
- License
- Feedback
PaperNavigator helps you discover and analyze academic papers through intelligent search, ranking, and automated report generation.
This was built mainly as a portfolio project and is completely free to use (as long as my Azure credits last). If you have any feedback, please go to this form.
If you're interested in working with me, you can reach out via my website.
PaperNavigator helps researchers discover relevant academic papers through the following steps:
- Query Profiling: LLM creates a structured query profile from your research question
- Query Augmentation: LLM augments the query into multiple search queries for broader coverage
- Snowball Sampling: Starting from seed papers, it explores citations and references to find related work
- LLM-Based Filtering: Uses GPT to filter papers for relevance to your research query
- ELO Ranking: Compares papers head-to-head using LLM judgment to create a quality ranking
- Report Generation: Synthesizes findings into a structured research report
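
As a rough illustration of the ELO ranking step, the sketch below applies the standard Elo update to a round of pairwise outcomes. It is not PaperNavigator's implementation: the `judge` callable stands in for the LLM comparison, and the names `elo_rank`, `K_FACTOR`, `START_RATING`, and the toy paper data are assumptions made for the example.

```python
import itertools
from typing import Callable

K_FACTOR = 32.0        # assumed step size; the project's actual value is not documented here
START_RATING = 1000.0  # assumed initial rating for every paper


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_rank(papers: list[str], judge: Callable[[str, str], str]) -> list[tuple[str, float]]:
    """Rate papers from head-to-head comparisons; judge returns the winning paper."""
    ratings = {paper: START_RATING for paper in papers}
    for a, b in itertools.combinations(papers, 2):
        score_a = 1.0 if judge(a, b) == a else 0.0
        delta = K_FACTOR * (score_a - expected_score(ratings[a], ratings[b]))
        ratings[a] += delta  # the winner gains exactly what the loser gives up
        ratings[b] -= delta
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical stand-in for the LLM judge: prefer the paper with the higher toy score.
toy_scores = {"paper-a": 3, "paper-b": 9, "paper-c": 5}


def toy_judge(a: str, b: str) -> str:
    return a if toy_scores[a] >= toy_scores[b] else b


ranking = elo_rank(list(toy_scores), toy_judge)
print([name for name, _ in ranking])
```

Because each comparison moves the two ratings by equal and opposite amounts, the total rating mass stays constant, which makes the scores comparable across papers after any number of rounds.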
PaperNavigator runs as a serverless backend on Azure:
- Frontend (`frontend/`): React/Vite app for the web interface
- Azure Functions (`azure-functions/`): HTTP API + Service Bus worker
- Core Library (`papernavigator/`): Shared domain logic (search, ranking, reports)
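
The split between the HTTP API and the Service Bus worker can be pictured with an in-memory stand-in. The sketch below is illustrative only: `queue.Queue` replaces the Service Bus queue, a plain dict replaces the Cosmos DB job store, and the function names, job fields, and status values are hypothetical rather than taken from the codebase.

```python
import queue
import uuid

job_store: dict[str, dict] = {}                # stand-in for the Cosmos DB container
job_queue: "queue.Queue[str]" = queue.Queue()  # stand-in for the Service Bus queue


def submit_job(query: str) -> str:
    """What an HTTP handler might do: persist the job, enqueue its id, return quickly."""
    job_id = uuid.uuid4().hex
    job_store[job_id] = {"query": query, "status": "queued", "events": ["queued"]}
    job_queue.put(job_id)
    return job_id


def run_pipeline(query: str) -> str:
    """Placeholder for the core library's search, ranking, and report steps."""
    return f"report for: {query}"


def process_next_job() -> None:
    """What a queue-triggered worker might do for a single message."""
    job_id = job_queue.get()
    job = job_store[job_id]
    job["status"] = "running"
    job["events"].append("running")
    try:
        job["result"] = run_pipeline(job["query"])
        job["status"] = "completed"
    except Exception as exc:  # record the failure instead of crashing the worker
        job["status"] = "failed"
        job["error"] = str(exc)
    job["events"].append(job["status"])


job_id = submit_job("graph neural networks for drug discovery")
status_before = job_store[job_id]["status"]  # the API call returns before any work runs
process_next_job()
status_after = job_store[job_id]["status"]
print(status_before, "->", status_after)
```

Keeping the long-running pipeline out of the request path is what lets the status endpoints be polled while the job runs in the background.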
cd frontend
npm install
npm run dev

If you have Azure Functions Core Tools installed:
cd azure-functions
func start

PaperNavigator/
├── azure-functions/ # Azure Functions entrypoint + modules
├── frontend/ # React/Vite frontend
├── papernavigator/ # Core library (domain logic)
│ ├── elo_ranker/ # ELO ranking system
│ └── report/ # Report generation
├── tests/ # Test suites
└── results/ # Example outputs (historical)
- Python 3.13+
- uv for Python package management
- Node.js 18+ (for frontend)
- Azure Functions Core Tools (optional, for local Functions)
# Install Python dependencies (including test/dev extras)
uv sync --extra test --extra dev
# Install pre-commit hooks
pre-commit install
# Install frontend dependencies
cd frontend && npm install

# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=papernavigator

# Format code
uv run ruff format papernavigator tests azure-functions
# Lint
uv run ruff check papernavigator tests azure-functions
# Type check
uv run pyright

- Infra (backend): Bicep + GitHub Actions (`infra/`, `.github/workflows/azure-infra-deploy.yml`)
- Backend code: GitHub Actions deploys Azure Functions (`.github/workflows/azure-functions-deploy.yml`)

| Variable | Description | Required |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key for LLM operations | Yes (or Key Vault) |
| `AZURE_COSMOS_ENDPOINT` | Cosmos DB endpoint | Yes |
| `AZURE_COSMOS_KEY` | Cosmos DB key | Yes |
| `AZURE_COSMOS_DATABASE` | Cosmos DB database name | Yes |
| `AZURE_COSMOS_CONTAINER` | Cosmos DB container name | Yes |
| `AZURE_SERVICE_BUS_CONNECTION_STRING` | Service Bus connection string | Yes |
| `AZURE_SERVICE_BUS_QUEUE_NAME` | Service Bus queue name | Yes |
| `AZURE_STORAGE_CONNECTION_STRING` | Blob storage connection string | Yes |
| `AZURE_RESULTS_CONTAINER` | Blob container for results | Yes |
| `AZURE_RESULTS_PREFIX` | Blob prefix for results | No |
| `AZURE_KEY_VAULT_URL` | Key Vault URL (optional) | No |
| `OPENAI_API_KEY_SECRET_NAME` | Key Vault secret name (optional) | No |
| `PAPERPILOT_FALLBACK_SEED_COUNT` | When strict filtering yields 0 papers, use this many OpenAlex fallback seeds (default: 8) | No |
| `JOB_QUEUED_SECONDS` | Seconds before queued-job watchdog runs the pipeline directly (default: 20) | No |
| `JOB_STALE_MINUTES` | Minutes before stale-job watchdog fails a stuck running job (default: 30) | No |
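
A minimal sketch of how these settings could be read at startup, assuming they are plain environment variables. The variable names and defaults come from the table; the `Settings` class, `load_settings`, `watchdog_action`, and the `queued`/`running` status strings are hypothetical and only show one way the two watchdog thresholds might be applied. Treating `AZURE_KEY_VAULT_URL` as a substitute for `OPENAI_API_KEY` is this sketch's reading of "Yes (or Key Vault)", not confirmed behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "AZURE_COSMOS_ENDPOINT", "AZURE_COSMOS_KEY", "AZURE_COSMOS_DATABASE",
    "AZURE_COSMOS_CONTAINER", "AZURE_SERVICE_BUS_CONNECTION_STRING",
    "AZURE_SERVICE_BUS_QUEUE_NAME", "AZURE_STORAGE_CONNECTION_STRING",
    "AZURE_RESULTS_CONTAINER",
)


@dataclass(frozen=True)
class Settings:
    values: Mapping[str, str]
    fallback_seed_count: int
    job_queued_seconds: int
    job_stale_minutes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required variables and apply the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    # Assumption: the OpenAI key may come from Key Vault instead of the environment.
    if not env.get("OPENAI_API_KEY") and not env.get("AZURE_KEY_VAULT_URL"):
        missing.append("OPENAI_API_KEY (or AZURE_KEY_VAULT_URL)")
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        values={name: env[name] for name in REQUIRED},
        fallback_seed_count=int(env.get("PAPERPILOT_FALLBACK_SEED_COUNT", "8")),
        job_queued_seconds=int(env.get("JOB_QUEUED_SECONDS", "20")),
        job_stale_minutes=int(env.get("JOB_STALE_MINUTES", "30")),
    )


def watchdog_action(status: str, age_seconds: float, settings: Settings) -> str:
    """Decide what a watchdog might do, given seconds since the job last changed state."""
    if status == "queued" and age_seconds > settings.job_queued_seconds:
        return "run_directly"  # the queue did not pick the job up in time
    if status == "running" and age_seconds > settings.job_stale_minutes * 60:
        return "fail"          # the job looks stuck
    return "wait"


demo_env = {name: "placeholder" for name in REQUIRED}
demo_env["OPENAI_API_KEY"] = "placeholder"
demo_env["JOB_QUEUED_SECONDS"] = "45"  # override one default for the demo
settings = load_settings(demo_env)
print(settings.job_queued_seconds, settings.job_stale_minutes)
print(watchdog_action("queued", 60, settings))
```

Failing fast on missing variables surfaces a misconfigured deployment at startup rather than midway through a pipeline run.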
The backend Azure infrastructure is being migrated to Bicep under infra/ so the repo becomes the source of truth (PR-reviewed), reducing portal drift.
- Start here: `infra/README.md`
- Resource Group (prod): `PaperPilot`

- `POST /api/pipeline` - Start a pipeline job
- `GET /api/pipeline/{job_id}` - Get job status
- `GET /api/jobs/{job_id}` - Raw job status
- `GET /api/jobs/{job_id}/events` - Job event log
- `GET /api/results` - List completed queries
- `GET /api/results/{query}/all` - Get all results for a query

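
A hedged client sketch for the endpoints listed above, using only the Python standard library. The paths are the ones listed; the base URL, the `query` field in the request body, the URL-encoding of the `{query}` path segment, and the helper names are assumptions, since the request and response schemas are not documented here.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:7071"  # assumption: a local `func start` on its default port


def pipeline_url(job_id: Optional[str] = None) -> str:
    """URL for starting a job (no id) or reading its status (with id)."""
    if job_id is None:
        return f"{BASE_URL}/api/pipeline"
    return f"{BASE_URL}/api/pipeline/{urllib.parse.quote(job_id, safe='')}"


def job_events_url(job_id: str) -> str:
    """URL for a job's event log."""
    return f"{BASE_URL}/api/jobs/{urllib.parse.quote(job_id, safe='')}/events"


def results_url(query: Optional[str] = None) -> str:
    """URL for listing completed queries, or all results for one query."""
    if query is None:
        return f"{BASE_URL}/api/results"
    return f"{BASE_URL}/api/results/{urllib.parse.quote(query, safe='')}/all"


def build_start_request(query: str) -> urllib.request.Request:
    """Build (but do not send) the POST that starts a pipeline job."""
    body = json.dumps({"query": query}).encode("utf-8")  # field name is an assumption
    return urllib.request.Request(
        pipeline_url(),
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def fetch_json(url_or_request):
    """Send a request and decode a JSON response; needs a running backend."""
    with urllib.request.urlopen(url_or_request, timeout=30) as response:
        return json.load(response)


request = build_start_request("sparse attention in transformers")
print(request.get_method(), request.full_url)
print(results_url("sparse attention in transformers"))
```

With a backend running, `fetch_json(request)` would submit the job and `fetch_json(pipeline_url(job_id))` could then be polled for status, provided the response carries a job id under whatever field name the API actually uses.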
See LICENSE. Use it freely; this is a portfolio project.
Found a bug or have a feature idea? Submit feedback here.



