
aurelle-py


Unofficial Python client library for the alphaxiv.org Assistant API — the aurelle-1 model that provides AI-powered Q&A over arXiv papers.


Disclaimer: This is an unofficial, community-developed client. aurelle-py is not affiliated with, endorsed by, or sponsored by alphaXiv. This client uses an undocumented internal API that may change without notice. Please respect alphaXiv's Terms of Service. The API key used is generated via alphaxiv.org's account settings.


How it was discovered

The API endpoint was found by inspecting network traffic in browser DevTools while using alphaxiv.org. The payload schema, authentication method, SSE event format, and rate-limit headers were all reverse-engineered from real browser requests. There is no official documentation.

Features

  • Sync and async clients (AurelleClient, AsyncAurelleClient)
  • Streaming support (stream() method)
  • ask_paper() convenience method for arXiv paper Q&A
  • Pydantic v2 request validation
  • Typed exceptions (AuthError, RateLimitError, AurelleAPIError)
  • MCP server for use with Claude Desktop and other LLM agents
  • Python 3.12+

Installation

pip install aurelle-py

With MCP server support:

pip install "aurelle-py[mcp]"

Or with uv:

uv add aurelle-py

Getting an API key

  1. Create an account at alphaxiv.org
  2. Go to Account → API Keys
  3. Generate a new key

Set it as an environment variable:

export AURELLE_API_KEY=your_key_here

Quick start

Sync client

from aurelle import AurelleClient

client = AurelleClient()  # reads AURELLE_API_KEY from environment

# One-shot question
response = client.chat("What is the attention mechanism in Transformers?")
print(response.text)
print(f"chat_id: {response.chat_id}")

# Ask about a specific arXiv paper
response = client.ask_paper(
    arxiv_id="1706.03762",
    question="What BLEU scores were reported?",
)
print(response.text)

Streaming

for chunk in client.stream("Summarize the Transformer architecture"):
    print(chunk.delta, end="", flush=True)

Multi-turn conversation

r1 = client.chat("Summarize arxiv 1706.03762")
r2 = client.chat("What BLEU scores were reported?", chat_id=r1.chat_id)

Note: Multi-turn context is limited — the API does not expose parentMessageId, so the model may not reliably recall earlier messages in the same session.
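
Given that limitation, one workaround is to restate earlier turns in the prompt instead of relying on chat_id alone. The helper below is a local sketch, not part of aurelle-py:

```python
# Hypothetical workaround for the limited multi-turn context noted above:
# carry earlier Q&A pairs verbatim in the new prompt. `build_prompt` is a
# local helper sketch, not an aurelle-py API.

def build_prompt(history: list[tuple[str, str]], question: str) -> str:
    """Prepend earlier question/answer pairs so the model sees them directly."""
    lines: list[str] = []
    for q, a in history:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    return "\n".join(lines)

# Usage (sketch):
# r1 = client.chat("Summarize arxiv 1706.03762")
# r2 = client.chat(
#     build_prompt([("Summarize arxiv 1706.03762", r1.text)],
#                  "What BLEU scores were reported?"),
#     chat_id=r1.chat_id,
# )
```

This trades tokens for reliability: the prompt grows with each turn, but the model no longer depends on server-side session recall.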

Async client

import asyncio
from aurelle import AsyncAurelleClient

async def main():
    async with AsyncAurelleClient() as client:
        response = await client.chat("Explain O(n²) attention complexity")
        print(response.text)

        async for chunk in client.stream("Summarize arxiv 1706.03762"):
            print(chunk.delta, end="", flush=True)

asyncio.run(main())

Web search

response = client.chat("Latest diffusion model papers 2025", web_search=True)

Thinking mode

response = client.chat("Why does self-attention scale quadratically?", thinking=True)

API reference

AurelleClient(api_key=None, *, timeout=90.0, max_retries=3)

| Parameter | Type | Description |
| --- | --- | --- |
| `api_key` | `str \| None` | alphaXiv API key. Falls back to the `AURELLE_API_KEY` env var. |
| `timeout` | `float` | HTTP timeout in seconds (default 90 — responses can be slow). |
| `max_retries` | `int` | Retry attempts on timeout / 5xx (default 3). |

.chat(message, *, chat_id=None, paper_version_id=None, web_search=False, thinking=False) → ChatResponse

Send a message and return the fully assembled response.

.stream(message, ...) → Iterator[StreamChunk]

Stream the response as an iterator of chunks. Same parameters as chat().

.ask_paper(arxiv_id, question, *, chat_id=None) → ChatResponse

Convenience wrapper. Builds "arxiv:{id} — {question}" and calls chat().
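
Per the description above, the two calls are interchangeable; this sketch shows the message string ask_paper() builds before delegating to chat():

```python
# Sketch of the message ask_paper() assembles before calling chat(),
# based on the "arxiv:{id} — {question}" format described above.
arxiv_id = "1706.03762"
question = "What BLEU scores were reported?"
message = f"arxiv:{arxiv_id} — {question}"

# client.ask_paper(arxiv_id, question)  ≈  client.chat(message)
print(message)
```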

AsyncAurelleClient

Identical interface with await. Use async for chunk in client.stream(...).

Response types

@dataclass
class ChatResponse:
    text: str            # assembled answer text
    chat_id: str         # session UUID (use in subsequent calls)
    tool_uses: list[ToolUse]
    raw_events: list[dict]

@dataclass
class StreamChunk:
    delta: str           # text fragment (empty for tool events)
    event_type: str      # "delta_output_text" | "tool_use" | "tool_result_text"

@dataclass
class ToolUse:
    tool_use_id: str
    kind: str            # e.g. "Answer PDF Queries", "Fetch arXiv Abstract"
    content: str         # raw JSON string
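
Since `ToolUse.content` is a raw JSON string, it can be decoded with the stdlib before inspection. The payload shape below is a made-up example; the real structure of each tool's output is undocumented:

```python
import json

# ToolUse.content is a raw JSON string (per the dataclass above), so decode
# it with json.loads before inspecting. This sample payload is illustrative
# only; actual tool output structure varies by tool and is undocumented.
content = '{"query": "BLEU scores", "pages": [7, 8]}'  # e.g. tool_use.content
payload = json.loads(content)
print(payload["query"])  # -> BLEU scores
```

In practice you would iterate `response.tool_uses` and decode each `content` field the same way.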

Exceptions

| Exception | When raised |
| --- | --- |
| `AuthError` | Missing API key, HTTP 401 or 403 |
| `RateLimitError` | HTTP 429. Has a `.reset_in` (seconds) attribute. |
| `AurelleAPIError` | Other HTTP 4xx/5xx. Has a `.status_code` attribute. |
| `AurelleError` | Base class for all aurelle exceptions. |

Rate limits

alphaXiv enforces 40 requests per 60 seconds.

On hitting the limit, the client raises RateLimitError immediately (no silent retry):

from aurelle.exceptions import RateLimitError
import time

try:
    response = client.chat("...")
except RateLimitError as exc:
    print(f"Rate limited. Retry in {exc.reset_in}s")
    time.sleep(exc.reset_in or 60)
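
Because the client never retries on 429 itself, a retry loop can be layered on top. The wrapper below is a generic sketch (not part of aurelle-py); `exc_type` is parameterized so it stays library-agnostic, and in real use you would pass `RateLimitError`:

```python
import time

# Minimal retry wrapper layered on top of the client, since the client
# itself raises RateLimitError immediately with no silent retry. This is
# a local sketch, not an aurelle-py API; pass exc_type=RateLimitError.
def with_rate_limit_retry(fn, *, exc_type, attempts=3, default_wait=60,
                          sleep=time.sleep):
    for attempt in range(attempts):
        try:
            return fn()
        except exc_type as exc:
            if attempt == attempts - 1:
                raise  # out of attempts: propagate the last error
            # Honor the server-reported reset window when available.
            sleep(getattr(exc, "reset_in", None) or default_wait)

# Usage (sketch):
# from aurelle.exceptions import RateLimitError
# response = with_rate_limit_retry(lambda: client.chat("..."),
#                                  exc_type=RateLimitError)
```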

MCP server

aurelle-py ships an MCP server that exposes aurelle tools to LLM agents.

Setup

Install with MCP support and start the server:

pip install "aurelle-py[mcp]"
AURELLE_API_KEY=your_key aurelle-mcp

Claude Desktop configuration

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "aurelle": {
      "command": "aurelle-mcp",
      "env": {
        "AURELLE_API_KEY": "your_key_here"
      }
    }
  }
}

Claude Code configuration

Create .mcp.json in the root of your project (this is the correct location — not .claude/settings.json, which is ignored for MCP):

{
  "mcpServers": {
    "aurelle": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "python", "-m", "aurelle.mcp.server"],
      "cwd": "/absolute/path/to/your/project",
      "env": {
        "AURELLE_API_KEY": "your_key_here"
      }
    }
  }
}

If aurelle-mcp is installed as a script (via pip install "aurelle-py[mcp]"), you can use the entry point directly:

{
  "mcpServers": {
    "aurelle": {
      "type": "stdio",
      "command": "aurelle-mcp",
      "env": {
        "AURELLE_API_KEY": "your_key_here"
      }
    }
  }
}

Restart Claude Code after creating or modifying .mcp.json — the config is read only at startup.

Available tools

| Tool | Description |
| --- | --- |
| `ask_arxiv_paper(arxiv_id, question)` | Ask a question about a specific arXiv paper |
| `research_question(query, web_search)` | General research question with optional web search |

Known limitations

  • Multi-turn context is unreliable: The API assigns a llmChatId for session continuity, but parentMessageId is never returned, so the model may not recall earlier messages in the same session.
  • assistantVariant="homepage" times out: This variant requires a browser session context. The client always uses "paper" which works for all request types.
  • web_search=True with a specific paper may hang: Combining web search with paper_version_id can cause the request to hang indefinitely. Use one or the other, not both.
  • Undocumented API: This API was reverse-engineered and may change without notice. Pin your aurelle-py version and test after alphaXiv updates.

Development

# Clone and install dev dependencies
git clone https://github.com/center4aai/aurelle-py.git
cd aurelle-py
uv sync --extra dev

# Run unit tests (no API key required)
uv run pytest

# Run integration tests (requires AURELLE_API_KEY)
AURELLE_API_KEY=your_key uv run pytest -m integration

License

MIT — see LICENSE.
