ArXiv ChatGuru

ArXiv ChatGuru is a Streamlit app that builds a topic-scoped Redis vector index from arXiv papers. It fetches papers on a topic you choose, splits them into chunks, stores the chunk embeddings in Redis, and lets you ask questions that are answered from the papers you loaded.

This app is a learning project for academic retrieval-augmented generation (RAG). It is intentionally simple: the goal is to show how Redis fits into a paper Q&A workflow, not to serve as a production-ready research assistant.

What Redis does in this app

Stores topic-specific paper chunks and embeddings
Powers vector search for retrieval
Lets you inspect the active index from the built-in stats page

How it works

Enter a topic and choose how many papers to load.
The app pulls papers from arXiv and splits them into chunks.
OpenAI generates embeddings for those chunks.
Redis stores the chunks and embeddings in a topic-scoped index.
LangChain retrieves the closest chunks for each user question and sends that context to the chat model.
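
The chunk-and-retrieve steps above can be sketched in plain Python. This is a toy illustration, not the app's code: toy_embed is a hypothetical bag-of-words stand-in for the OpenAI embedding call, and the in-memory ranking in top_k stands in for Redis vector search. The function names, chunk size, and overlap are all assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of words (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (in-memory stand-in for vector search)."""
    query_vec = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, toy_embed(c)), reverse=True)
    return ranked[:k]
```

In the real app the retrieved chunks would be passed to the chat model as context; here top_k simply returns them.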

Prerequisites

Python 3.13 for local development
Docker Desktop if you want the Docker-first flow
An OpenAI API key

Environment setup

Create a .env file from the template:

cp .env.template .env

Then set at least:

OPENAI_API_KEY=your_key_here

The default template uses:

OPENAI_CHAT_MODEL=gpt-4.1-mini
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
REDIS_INDEX_BASENAME=arxiv
REDIS_URL=redis://arxivchatguru-redis:6379
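
As a rough sketch of how these variables might be consumed, the snippet below reads them from the environment and falls back to the template defaults listed above. The Settings class, load_settings, and index_name_for are hypothetical names, and the basename-plus-slug naming scheme is only an assumption about what "topic-scoped" could look like; the app's actual scheme is not documented here.

```python
import os
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    chat_model: str
    embedding_model: str
    index_basename: str
    redis_url: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), using the template defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        openai_api_key=api_key,
        chat_model=env.get("OPENAI_CHAT_MODEL", "gpt-4.1-mini"),
        embedding_model=env.get("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"),
        index_basename=env.get("REDIS_INDEX_BASENAME", "arxiv"),
        redis_url=env.get("REDIS_URL", "redis://arxivchatguru-redis:6379"),
    )


def index_name_for(topic: str, basename: str) -> str:
    """Hypothetical naming scheme: basename plus a lowercase, underscore-separated topic slug."""
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    return f"{basename}_{slug}"
```

For example, index_name_for("Graph Neural Networks", "arxiv") yields arxiv_graph_neural_networks under this assumed scheme.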

Run with Docker

Docker is the primary way to run the app locally.

make docker-up

Then open:

http://localhost:8501

To stop the stack:

make docker-down

Run locally

Install Poetry if you do not already have it:

python3 -m pip install --user poetry

Use Python 3.13 for the project environment, install dependencies, and start the app:

python3 -m poetry env use python3.13
make install
make dev

Then open:

http://localhost:8501

If you run locally outside Docker, make sure REDIS_URL points at a reachable Redis instance, such as redis://localhost:6379, rather than the template default.
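
A quick way to sanity-check REDIS_URL before starting the app is to confirm that something is listening at that host and port. The sketch below uses only the standard library; redis_host_port and is_reachable are hypothetical helpers, not part of the app, and a successful TCP connection does not prove the server is actually Redis.

```python
import socket
from urllib.parse import urlparse


def redis_host_port(url: str) -> tuple[str, int]:
    """Extract host and port from a redis:// or rediss:// URL (port defaults to 6379)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss"):
        raise ValueError(f"not a Redis URL: {url!r}")
    return parsed.hostname or "localhost", parsed.port or 6379


def is_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = redis_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_reachable("redis://localhost:6379") returns True only when a server is accepting connections on that port.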

Developer commands

make format: formats the app and tests
make test: runs the test suite
make build: builds the Docker image
make dev: starts Streamlit locally
make docker-up: starts the app with Docker Compose

Stats page

After you load a topic from the main page, open the Streamlit stats page to inspect the active Redis index. It shows:

Index metadata
Indexed fields
Query Engine stats for the active topic
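
Index metadata of this kind is typically available from the Redis FT.INFO command. The sketch below assumes you already have the dictionary that redis-py returns from client.ft(index_name).info(); summarize_index_info is a hypothetical helper, and the fields it picks are not necessarily the ones the stats page displays.

```python
def summarize_index_info(info: dict) -> dict:
    """Pick a few headline fields out of an FT.INFO-style dictionary.

    Values are passed through unchanged; missing fields come back as None.
    """
    fields = ("index_name", "num_docs", "num_records")
    return {name: info.get(name) for name in fields}
```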

Planned follow-ups

Add better metadata filters such as year or author
Improve chunking strategy for long papers
Add chat history or memory features only if the tutorial needs them
