BRIC-6: fix .txt/.md indexing via PDF pre-conversion by Kaiohz · Pull Request #8 · SoluDevTech/mcp-raganything

Kaiohz · 2026-04-09T06:44:52Z

Pre-convert .txt/.md files to PDF before passing to DoclingParser. See Jira BRIC-6.

…mpatibility (BRIC-6)

RAGAnything's DoclingParser rejects .txt/.md files with ValueError. Instead of monkey-patching, the adapter now pre-converts text files to PDF via Parser.convert_text_to_pdf() before passing them to process_document_complete(). Temp PDFs are cleaned up after processing.

- Add _convert_text_to_pdf() and _process_with_pdf_fallback() helpers
- _TEXT_EXTENSIONS = {'.txt', '.md'} gate the conversion
- Unique temp directory per conversion prevents collisions
- Cleanup uses contextlib.suppress(OSError) in finally blocks
- index_document() and index_folder() both use the fallback
- 8 new tests covering conversion, cleanup, .md, .pdf passthrough
- Update README format table to reflect actual support
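The fallback described in the commit message could look roughly like the sketch below. This is not the code from the PR: the function name `process_with_pdf_fallback` and the `process` and `convert` callables are hypothetical stand-ins for the adapter's call to process_document_complete() and for Parser.convert_text_to_pdf(), whose real signatures are not shown here. Only the `_TEXT_EXTENSIONS` gate, the unique temp directory per conversion, and the `contextlib.suppress(OSError)` cleanup in a `finally` block are taken from the description.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path
from typing import Callable

# Extensions that are converted to PDF before parsing (from the commit message).
_TEXT_EXTENSIONS = {".txt", ".md"}


def process_with_pdf_fallback(
    file_path: Path,
    process: Callable[[Path], None],
    convert: Callable[[Path, Path], Path],
) -> None:
    """Process file_path, converting .txt/.md to a temporary PDF first.

    `process` and `convert` are hypothetical injection points:
    `convert(src, out_dir)` must return the path of the PDF it wrote.
    """
    if file_path.suffix.lower() not in _TEXT_EXTENSIONS:
        # Other formats (e.g. .pdf) pass straight through.
        process(file_path)
        return

    # A fresh directory per conversion avoids name collisions between files.
    tmp_dir = Path(tempfile.mkdtemp(prefix="txt2pdf_"))
    try:
        pdf_path = convert(file_path, tmp_dir)
        process(pdf_path)
    finally:
        # Best-effort cleanup: a failed removal must not mask a parser error.
        with contextlib.suppress(OSError):
            shutil.rmtree(tmp_dir)
```

Passing the two operations in as callables keeps the sketch independent of the parser library and makes the cleanup path easy to test with fakes.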

Kaiohz force-pushed the BRIC-6/fix-txt-indexing-via-pdf-preconversion branch from eb1d2e7 to 1aafe8e on April 9, 2026 06:59

Kaiohz closed this Apr 9, 2026

Kaiohz wants to merge 1 commit into main from BRIC-6/fix-txt-indexing-via-pdf-preconversion
