Demo: https://t.me/aturretrss_bot
A social media content fetching service with a Telegram Bot client, built as a monorepo with two microservices.
Send a social media URL to the bot, and it fetches and archives the content for you. Supports most mainstream social media platforms.
Roadmap:

- Support more social media platforms
  - Douyin
  - TikTok
  - Threads
  - Bluesky
- Enhance the scraping for existing platforms
  - WeChat Public Account Articles
  - Douban
- More general content scraping and enhanced features
  - General webpage scraping (with more third-party platforms...)
  - LLM content translation
  - LLM content summarization
  - Podcast feeds
  - Audio transcription
  - Image OCR
- Support for more integrations
  - Inoreader
  - Notion
- Architecture refactoring
  - Apply a message queue to tgbot <-> API communication
- Code refactoring
  - Unified API response format
  - Better error handling
  - Better logging
- Database support
  - Persistent storage for scraped content
  - Persistent storage for user settings
- More user interface options
  - Web UI
  - Discord Bot integration
FastFetchBot is organized as a UV workspace monorepo with three packages:
```
FastFetchBot/
├── packages/shared/     # fastfetchbot-shared: common models, utilities, logger
├── apps/api/            # FastAPI server: scrapers, storage, routing
├── apps/telegram-bot/   # Telegram Bot: webhook/polling, message handling
├── app/                 # Legacy re-export wrappers (backward compatibility)
├── pyproject.toml       # Root workspace configuration
└── uv.lock              # Lockfile for the entire workspace
```
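The packages are wired together through uv's workspace support. A minimal sketch of what the root `pyproject.toml` workspace section might contain (member paths taken from the tree above; the actual file may differ):

```toml
# Root pyproject.toml (illustrative fragment)
[tool.uv.workspace]
members = ["packages/shared", "apps/api", "apps/telegram-bot"]

# In a consuming package's pyproject.toml, the shared library is
# resolved from the local workspace rather than from an index:
[tool.uv.sources]
fastfetchbot-shared = { workspace = true }
```

With `workspace = true`, both apps install `fastfetchbot-shared` directly from `packages/shared/`, so a single `uv sync` and one `uv.lock` cover the whole repository.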
| Service | Port | Description |
|---|---|---|
| API Server (`apps/api/`) | 10450 | FastAPI app with all platform scrapers, file export, and storage |
| Telegram Bot (`apps/telegram-bot/`) | 10451 | Receives messages via webhook or long polling, calls the API server |
The Telegram Bot communicates with the API server over HTTP. In Docker, this is `http://api:10450`.
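For illustration, the bot-to-API call is plain HTTP with JSON. The sketch below builds such a request with the standard library; the `/scrape` endpoint path, payload shape, and `X-API-Key` header are assumptions for illustration, not the project's actual contract:

```python
import json
import os
import urllib.request

# Resolved the same way the table below describes: env var with a local default.
API_SERVER_URL = os.getenv("API_SERVER_URL", "http://localhost:10450")


def build_scrape_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking the API server to scrape `url`.

    The endpoint path and JSON body here are illustrative guesses,
    not FastFetchBot's real API surface.
    """
    payload = json.dumps({"url": url}).encode()
    return urllib.request.Request(
        f"{API_SERVER_URL}/scrape",  # hypothetical endpoint
        data=payload,
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # hypothetical auth header
        },
        method="POST",
    )
```

The actual request would then be sent with `urllib.request.urlopen(...)` (or, more likely in an async bot, an async HTTP client).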
1. Copy `docker-compose.template.yml` to `docker-compose.yml`.
2. Create a `.env` file from `template.env` and fill in the environment variables.
3. If you need large file support (>50 MB), fill in `TELEGRAM_API_ID` and `TELEGRAM_API_HASH` in the compose file for the local Telegram Bot API server. Otherwise, comment out the `telegram-bot-api` service.
4. Start the stack:

   ```shell
   docker-compose up -d
   ```

The compose file pulls pre-built images from GitHub Container Registry:

- `ghcr.io/aturret/fastfetchbot-api:latest`
- `ghcr.io/aturret/fastfetchbot-telegram-bot:latest`

To build locally instead, uncomment the `build:` blocks and comment out the `image:` lines in `docker-compose.yml`.
Requires Python 3.12 and uv.
```shell
# Install all dependencies (including dev)
uv sync

# Run the API server
cd apps/api
uv run gunicorn -k uvicorn.workers.UvicornWorker src.main:app --preload

# Run the Telegram Bot (in a separate terminal)
cd apps/telegram-bot
uv run python -m core.main
```

The bot supports two modes, controlled by the `TELEGRAM_BOT_MODE` environment variable:
| Mode | Value | Use Case |
|---|---|---|
| Long Polling | `polling` (default) | Local development, simple deployments without a reverse proxy |
| Webhook | `webhook` | Production with a public HTTPS URL |
In both modes, the bot runs an HTTP server on port 10451 that serves the `/send_message` callback endpoint (used by the Inoreader integration) and `/health`.
```shell
uv sync            # Install all dependencies
uv run pytest      # Run tests
uv run pytest -v   # Run tests with verbose output
uv run black .     # Format code
```

To add a new scraper:

1. Create a new scraper module in `apps/api/src/services/scrapers/<platform>/`
2. Implement the scraper class following existing patterns
3. Add a platform-specific router in `apps/api/src/routers/`
4. Register the scraper in `ScraperManager`
5. Add configuration variables in `apps/api/src/config.py`
6. Create tests in `tests/cases/`
```shell
# Build both services locally
docker-compose build

# Or build individually
docker build -f apps/api/Dockerfile -t fastfetchbot-api .
docker build -f apps/telegram-bot/Dockerfile -t fastfetchbot-telegram-bot .
```

Note: Both Dockerfiles use the repository root as the build context (`.`) because they need access to `pyproject.toml`, `uv.lock`, and `packages/shared/`.
Many scrapers require authentication cookies. You can extract cookies using the browser extension Get cookies.txt LOCALLY.
See `template.env` for a complete reference with comments.
| Variable | Description |
|---|---|
| `BASE_URL` | Public domain of the server (e.g. `example.com`). Used for webhook URL construction. |
| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather |
| `TELEGRAM_CHAT_ID` | Default chat ID for the bot |
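To make the `BASE_URL` role concrete, webhook URL construction might look like the sketch below; the `/telegram/webhook/<token>` path is a made-up example, not the bot's real route:

```python
def build_webhook_url(base_url: str, bot_token: str) -> str:
    """Combine the public domain and bot token into an HTTPS webhook URL.

    Accepts either a bare domain or a full URL; the path below is
    illustrative, so check the bot's code for the actual route.
    """
    host = (
        base_url.rstrip("/")
        .removeprefix("https://")
        .removeprefix("http://")
    )
    return f"https://{host}/telegram/webhook/{bot_token}"
```

Telegram requires webhook endpoints to be HTTPS, which is why the scheme is forced here regardless of input.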
| Variable | Default | Description |
|---|---|---|
| `API_SERVER_URL` | `http://localhost:10450` | URL the Telegram Bot uses to call the API server. Set to `http://api:10450` in Docker. |
| `TELEGRAM_BOT_CALLBACK_URL` | `http://localhost:10451` | URL the API server uses to call the Telegram Bot. Set to `http://telegram-bot:10451` in Docker. |
| `TELEGRAM_BOT_MODE` | `polling` | `polling` or `webhook` |
| Variable | Default | Description |
|---|---|---|
| `PORT` | `10450` | API server port |
| `API_KEY` | auto-generated | API key for authentication |
| Variable | Default | Description |
|---|---|---|
| `TELEBOT_API_SERVER_HOST` | `None` | Local Telegram Bot API server host |
| `TELEBOT_API_SERVER_PORT` | `None` | Local Telegram Bot API server port |
| `TELEGRAM_CHANNEL_ID` | `None` | Channel ID(s) for the bot, comma-separated |
| `TELEGRAM_CHANNEL_ADMIN_LIST` | `None` | User IDs allowed to post to the channel, comma-separated |
| Platform | Variables |
|---|---|
| Twitter | `TWITTER_CT0`, `TWITTER_AUTH_TOKEN` |
| Reddit | `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USERNAME`, `REDDIT_PASSWORD` |
| Weibo | `WEIBO_COOKIES` |
| Xiaohongshu | See Xiaohongshu Setup below |
| | `X_RAPIDAPI_KEY` |
| Zhihu | Store cookies in `conf/zhihu_cookies.json` |
Xiaohongshu (XHS) API requests require a cryptographic signature (`x-s`, `x-t`, etc.) that must be computed by a dedicated signing proxy. FastFetchBot delegates this to an external sign server.

Note: We currently use a closed-source sign server. You will need to run your own compatible signing proxy and point `SIGN_SERVER_URL` at it.
The sign server must accept `POST /signsrv/v1/xhs/sign` with a JSON body:

```json
{"uri": "/api/sns/web/v1/feed", "data": {...}, "cookies": "a1=..."}
```

and return:

```json
{"isok": true, "data": {"x_s": "...", "x_t": "...", "x_s_common": "...", "x_b3_traceid": "..."}}
```

Cookie configuration (two options; the file takes priority):
1. File (recommended): Create `apps/api/conf/xhs_cookies.txt` containing your XHS cookies as a single line:

   ```
   a1=xxxxxxxx; web_id=xxxxxxxx; web_session=xxxxxxxx
   ```

   Log in to xiaohongshu.com in your browser, then copy the cookie values from DevTools → Application → Cookies, or use the Get cookies.txt LOCALLY extension.

2. Environment variables (legacy fallback): Set `XIAOHONGSHU_A1`, `XIAOHONGSHU_WEBID`, and `XIAOHONGSHU_WEBSESSION` individually. Used only when the cookie file is absent.
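The sign exchange described above can be sketched from the client side with the standard library. This builds the documented request and unpacks a successful response into headers (the real scraper presumably uses an async HTTP client; the header mapping is inferred from the field names):

```python
import json
import urllib.request

SIGN_SERVER_URL = "http://localhost:8989"  # documented default


def build_sign_request(uri: str, data: dict, cookies: str) -> urllib.request.Request:
    """Build the POST /signsrv/v1/xhs/sign request with the documented body."""
    body = json.dumps({"uri": uri, "data": data, "cookies": cookies}).encode()
    return urllib.request.Request(
        f"{SIGN_SERVER_URL}/signsrv/v1/xhs/sign",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_sign_headers(response_body: dict) -> dict:
    """Map a successful sign response onto XHS request headers.

    Raises if `isok` is false; the header-name mapping (x_s -> x-s, etc.)
    is an assumption based on the response field names.
    """
    if not response_body.get("isok"):
        raise RuntimeError("sign server refused the request")
    d = response_body["data"]
    return {
        "x-s": d["x_s"],
        "x-t": d["x_t"],
        "x-s-common": d["x_s_common"],
        "x-b3-traceid": d["x_b3_traceid"],
    }
```

The returned headers are then attached to the actual request against the XHS API.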
| Variable | Default | Description |
|---|---|---|
| `SIGN_SERVER_URL` | `http://localhost:8989` | URL of the XHS signing proxy |
| `XHS_COOKIE_PATH` | `conf/xhs_cookies.txt` | Path to cookie file (overrides default location) |
| `XIAOHONGSHU_A1` | `None` | `a1` cookie value (legacy fallback) |
| `XIAOHONGSHU_WEBID` | `None` | `web_id` cookie value (legacy fallback) |
| `XIAOHONGSHU_WEBSESSION` | `None` | `web_session` cookie value (legacy fallback) |
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key for audio transcription |
| `AWS_ACCESS_KEY_ID` | Amazon S3 access key |
| `AWS_SECRET_ACCESS_KEY` | Amazon S3 secret key |
| `AWS_S3_BUCKET_NAME` | S3 bucket name |
| `AWS_S3_REGION_NAME` | S3 region |
| `AWS_DOMAIN_HOST` | Custom domain bound to the S3 bucket |
| Variable | Default | Description |
|---|---|---|
| `GENERAL_SCRAPING_ON` | `false` | Enable scraping for unrecognized URLs |
| `GENERAL_SCRAPING_API` | `FIRECRAWL` | Backend: `FIRECRAWL` or `ZYTE` |
| `FIRECRAWL_API_URL` | | Firecrawl API server URL |
| `FIRECRAWL_API_KEY` | | Firecrawl API key |
| `ZYTE_API_KEY` | | Zyte API key |
Currently supported platforms:

- Bluesky
- Threads
- Reddit (beta; only some types of posts are supported)
- WeChat Public Account Articles
- Zhihu
- Douban
- Xiaohongshu
- YouTube
- Bilibili
The GitHub Actions pipeline (`.github/workflows/ci.yml`) automatically builds and pushes both microservice images to GitHub Container Registry on every push to `main`:

- `ghcr.io/aturret/fastfetchbot-api:latest`
- `ghcr.io/aturret/fastfetchbot-telegram-bot:latest`
The HTML to Telegra.ph converter function is based on html-telegraph-poster. I separated it from this project as an independent Python package: html-telegraph-poster-v2.
The original Xiaohongshu scraper was based on MediaCrawler. The current implementation uses a custom httpx-based adapter with an external signing proxy.
The Weibo scraper is based on weiboSpider.
The Twitter scraper is based on twitter-api-client.
The Zhihu scraper is based on fxzhihu.
All of this code is licensed under the MIT License. I either used the above projects' code as-is or modified it to implement certain features, and I want to express my gratitude to those projects for their contributions.