PyPI package publishing roadmap #239

Open
smirnovlad wants to merge 1 commit into main from docs/pypi-publishing-roadmap

@smirnovlad
Collaborator

Summary

  • Adds docs/pypi-publishing-roadmap.md documenting all obstacles blocking PyPI publication of thinkbooster
  • Includes a step-by-step checklist to track progress

Key obstacles identified

  1. lm-polygraph dependency — dev branch 0.0.0 vs PyPI 0.5.0, missing features we need
  2. latex2sympy2 — antlr4/Hydra conflict, --no-deps workaround
  3. Hard imports without guards — ~15 files crash without lm_polygraph
  4. Missing llm_tts/__init__.py
  5. Wrong package-data paths in pyproject.toml
  6. Broken console script entry point
  7. Placeholder author metadata
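
Obstacle 3 is the usual case for an optional-import guard. A minimal sketch, assuming hypothetical helper names (`optional_import`, `require`) that do not appear in this thread; the idea is to defer the failure from import time to the moment a feature that needs the dependency is actually used:

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def require(module_name: str, feature: str) -> ModuleType:
    """Import a module on demand, failing with an actionable message."""
    module = optional_import(module_name)
    if module is None:
        raise ImportError(
            f"{feature} requires '{module_name}', which is not installed."
        )
    return module
```

A scorer that depends on lm_polygraph could then call `require("lm_polygraph", "uncertainty scoring")` inside its constructor instead of importing at module top level, so `import llm_tts` succeeds without the dependency. The feature label here is illustrative only.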

Checklist (from the doc)

  • Publish lm-polygraph>=0.6.0 to PyPI (or add import guards)
  • Create llm_tts/__init__.py with __version__
  • Add try/except ImportError guards for optional deps
  • Fix pyproject.toml: author metadata, package-data, console script
  • Add service_app package-data for static files
  • Create MANIFEST.in for sdist
  • Test: python -m build → install wheel in clean venv → import llm_tts works
  • Publish to PyPI with twine upload

Document current obstacles (lm-polygraph dep, missing __init__.py,
broken package-data, etc.) and step-by-step plan to get thinkbooster
published on PyPI.
@smirnovlad
Collaborator Author

Root cause: setup.sh does things pip install can't

The main blocker is that setup.sh performs install-time hacks that aren't expressible in a standard pyproject.toml:

  1. Patches lm-polygraph at install time: sed loosens the transformers and spacy upper bounds and removes unbabel-comet from requirements.txt before installing. A PyPI package can't patch its own dependencies like this.

  2. Installs lm-polygraph from git dev branch — the PyPI release (0.5.0) is missing classes we depend on (VLLMWithUncertainty, VLLMLogprobsCalculator, api_with_uncertainty).

  3. Installs llm-uncertainty-head from git — another unpublished dependency (needed for UHead scorer).

  4. Installs latex2sympy2 --no-deps — to dodge the antlr4 conflict with Hydra. A normal pip install would pull antlr4 and break Hydra's config resolution.

  5. Post-install numpy pinning — forces numpy>=2.0,<2.3 and upgrades thinc/spacy after everything else, because install order matters (lm-polygraph pulls numpy 1.x, vLLM needs 2.x).
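
Because the numpy pin in step 5 depends on install order, it can help to verify the resulting environment rather than trust the order. A minimal sketch of a range check for the `>=2.0,<2.3` window; `release_tuple` and `in_range` are hypothetical helpers, and the comparison deliberately ignores pre-release and local-version ordering, so it is not a full PEP 440 implementation:

```python
import re
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Leading numeric release segment padded to three parts.

    '2.1.3rc1' -> (2, 1, 3); '2.3' -> (2, 3, 0).
    Pre-release and local suffixes are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    parts = [int(p) for p in match.group().split(".")]
    padded = (parts + [0, 0, 0])[:3]
    return (padded[0], padded[1], padded[2])


def in_range(version: str, lower: str, upper: str) -> bool:
    """True when lower <= version < upper on the release segment."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)
```

In an installed environment, `in_range(importlib.metadata.version("numpy"), "2.0", "2.3")` would confirm that the post-install pin took effect.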

Bottom line

The primary blocker is lm-polygraph: it's not on PyPI in the version we need, and even its dev branch requires patching. Until lm-polygraph>=0.6.0 is published to PyPI with corrected dependency bounds, a clean pip install thinkbooster isn't feasible.

The latex2sympy2 and llm-uncertainty-head issues are secondary but follow the same pattern — dependencies that can't be expressed as standard PyPI requirements.

@smirnovlad
Collaborator Author

Analysis: Merging lm-polygraph dev → main

The gap is huge

Tested all 12 lm_polygraph imports that ThinkBooster uses:

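Import | main (PyPI 0.5.0) | dev
VLLMWithUncertainty | missing | available
APIWithUncertainty | missing | available
VLLMLogprobsCalculator | missing | available
WhiteboxModelvLLM | missing | available
BlackboxModel | missing | available
WhiteboxModel | missing | available
MeanTokenEntropy | missing | available
MaximumTokenProbability, Perplexity | missing | available
GenerationParameters | missing | available
EntropyCalculator | missing | available
Categorical | missing | available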

Every single import ThinkBooster needs is missing from main. The divergence: 113 commits on dev not in main, 42 on main not in dev.
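
A check like the one behind this table can be scripted so it is repeatable after a merge. A minimal sketch, assuming a hypothetical helper `probe_imports`; the thread does not give the lm_polygraph module paths for these names, so the usage example below uses a standard-library module instead:

```python
import importlib
from typing import Dict, Iterable, Tuple


def probe_imports(targets: Iterable[Tuple[str, str]]) -> Dict[str, bool]:
    """Map 'module:name' to True when the module imports and exposes name."""
    results: Dict[str, bool] = {}
    for module_name, attr in targets:
        key = f"{module_name}:{attr}"
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The module itself is missing or fails to import.
            results[key] = False
            continue
        results[key] = hasattr(module, attr)
    return results


if __name__ == "__main__":
    # Stand-in targets; replace with the real (module, class) pairs.
    for key, ok in probe_imports([("json", "loads"), ("json", "no_such")]).items():
        print(f"{key}: {'available' if ok else 'missing'}")
```

Running the same target list once against the PyPI release and once against the dev branch, each in its own virtual environment, gives one column of the table per run.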

Merging is necessary but not sufficient

Even the dev branch has dependency issues that setup.sh patches at install time:

Dev requirements.txt | setup.sh patches to | Why
spacy>=3.4.0,<3.8.0 | spacy>=3.8.0 | spacy <3.8 uses thinc compiled against numpy 1.x
unbabel-comet==2.2.1 | removed | Pins numpy<2.0, conflicts with vLLM
numpy>=1.23.5 | (post-install pin to >=2.0,<2.3) | vLLM needs numpy 2.x
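
To make the first two rows concrete, here is a rough Python equivalent of that rewrite. This is an illustration only, not the actual sed commands from setup.sh (which are not quoted in this thread); `patch_requirements` is a hypothetical name, and the transformers bound mentioned earlier is left out because its values are not given:

```python
import re
from typing import List


def patch_requirements(lines: List[str]) -> List[str]:
    """Apply the spacy and unbabel-comet changes to requirement lines."""
    patched: List[str] = []
    for line in lines:
        # Package name is everything before the first specifier character.
        name = re.split(r"[<>=!~\[;\s]", line.strip(), maxsplit=1)[0].lower()
        if name == "unbabel-comet":
            continue  # dropped: it pins numpy<2.0
        if name == "spacy":
            patched.append("spacy>=3.8.0")  # upper bound removed
            continue
        patched.append(line)
    return patched


original = ["spacy>=3.4.0,<3.8.0", "unbabel-comet==2.2.1", "numpy>=1.23.5"]
print(patch_requirements(original))
```

The numpy line passes through unchanged here, which matches the table: that pin is applied after installation rather than by editing requirements.txt.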

Recommended plan

  1. Create PR dev → main in lm-polygraph — review merge conflicts (42 main-only commits to reconcile)
  2. Fix requirements.txt in that PR:
    • spacy>=3.8.0 (or remove upper bound)
    • Remove unbabel-comet or make it optional
    • numpy>=1.23.5 → allow numpy 2.x (remove implicit <2.0 constraint)
  3. Run lm-polygraph test suite to verify nothing breaks
  4. Publish lm-polygraph==0.6.0 to PyPI
  5. Update ThinkBooster: add lm-polygraph>=0.6.0 to pyproject.toml, simplify setup.sh (remove sed patches)

Risks

  • 42 main-only commits may conflict with dev changes
  • unbabel-comet removal is breaking if any lm-polygraph feature depends on it
  • Downstream users of 0.5.0 may need migration notes
