61 changes: 47 additions & 14 deletions .github/workflows/python-wheel.yml
Original file line number Diff line number Diff line change
@@ -4,6 +4,16 @@ on:
push:
branches:
- main # Rebuild wheels on every commit to main
workflow_dispatch:
inputs:
channel:
description: "Wheel channel to publish"
required: true
default: "staging"
type: choice
options:
- staging
- production

permissions:
contents: write # Needed for GITHUB_TOKEN to push
@@ -13,6 +23,7 @@ jobs:
runs-on: ubuntu-latest
env:
PYTHON_VERSION: 3.12
PUBLISH_CHANNEL: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.channel || 'production' }}

steps:
# Checkout the repository
@@ -38,7 +49,7 @@ jobs:
- name: Upload built wheels (optional)
uses: actions/upload-artifact@v4
with:
name: wheels
name: wheels-${{ env.PUBLISH_CHANNEL }}
path: ./dist/*.whl

# Publish wheels to orphan `wheels` branch
@@ -47,14 +58,24 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -eu

# Abort if no wheels were built
if [ -z "$(ls -A ./dist/*.whl 2>/dev/null)" ]; then
echo "No wheels found, skipping push"
exit 0
fi

# Prepare fresh working directory for orphan branch
channel="${PUBLISH_CHANNEL}"
if [ "$channel" = "production" ]; then
channel_dir="wheels"
else
channel_dir="wheels-staging"
fi
echo "Publishing channel: $channel"
echo "Target directory: $channel_dir"

# Prepare working directory for the published branch
rm -rf wheels-branch
mkdir wheels-branch
cd wheels-branch
@@ -64,24 +85,36 @@ jobs:
git remote add origin https://x-access-token:${GITHUB_TOKEN}@github.com/${{ github.repository }}.git
git fetch origin wheels || true

# Create orphan branch (separate history)
git checkout --orphan wheels
git reset --hard
# Reuse the existing published branch when present so multiple channels can coexist
if git ls-remote --exit-code --heads origin wheels >/dev/null 2>&1; then
git checkout -B wheels origin/wheels
else
git checkout --orphan wheels
git reset --hard
fi

# Copy wheels from main repo build output
mkdir -p wheels
cp ../dist/*.whl wheels/
# Replace only the selected channel contents
mkdir -p "$channel_dir"
find "$channel_dir" -maxdepth 1 -type f -name '*.whl' -delete
cp ../dist/*.whl "$channel_dir"/
echo "Wheels to publish:"
ls -lh wheels/
ls -lh "$channel_dir"/

# Generate latest.txt (name of newest wheel)
latest_wheel=$(ls -1 wheels/*.whl | sort | tail -n 1)
echo "$(basename $latest_wheel)" > wheels/latest.txt
echo "Latest wheel: $(cat wheels/latest.txt)"
latest_wheel=$(ls -1 "$channel_dir"/*.whl | sort | tail -n 1)
echo "$(basename "$latest_wheel")" > "$channel_dir/latest.txt"
echo "${{ github.sha }}" > "$channel_dir/commit.txt"
echo "${{ github.ref_name }}" > "$channel_dir/ref.txt"
echo "$channel" > "$channel_dir/channel.txt"
echo "Latest wheel: $(cat "$channel_dir/latest.txt")"

# Commit and push
git config user.name "GitHub Actions"
git config user.email "actions@github.com"
git add wheels
git commit -m "Update wheels for commit ${{ github.sha }}"
git add "$channel_dir"
if git diff --cached --quiet; then
echo "No changes to publish"
exit 0
fi
git commit -m "Update ${channel} wheels for commit ${{ github.sha }}"
git push origin wheels --force
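One subtlety in the publish script above: `ls -1 | sort | tail -n 1` picks the "latest" wheel lexicographically, which mis-orders versions such as `0.9` vs `0.10`. If that ever matters, a version-aware pick could look like the following sketch (`pick_latest_wheel` is a hypothetical helper, not part of the workflow):

```python
import re

def pick_latest_wheel(names):
    """Pick the newest wheel by comparing parsed version tuples.

    A plain lexicographic sort (as `sort | tail -n 1` does) ranks
    "0.9" above "0.10"; comparing integer version components does not.
    """
    def version_key(name):
        # Wheel filenames look like <pkg>-<version>-<tags>.whl
        m = re.match(r"[^-]+-([0-9][^-]*)-", name)
        version = m.group(1) if m else "0"
        parts = []
        for piece in version.split("."):
            digits = re.match(r"\d+", piece)
            # Non-numeric pieces (e.g. "dev0") fall back to 0
            parts.append(int(digits.group()) if digits else 0)
        return tuple(parts)
    return max(names, key=version_key)
```

This only changes which wheel `latest.txt` points at when multiple wheels with differently-shaped version numbers coexist in the channel directory; the current script replaces the channel's wheels on every run, so the lexicographic sort is usually sufficient.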
5 changes: 5 additions & 0 deletions README-DEVELOPERS.md
@@ -128,3 +128,8 @@ if sys.platform == "emscripten":
```

This code is automatically loaded into Jupyter notebooks via changes implemented in https://github.com/ironArray/Caterva2/commit/882d9fa930e573fdbc65d62b8dc90722670b8e9a.

For pre-release browser testing, a separate staging wheel channel is also
available; its newest wheel is listed at
`https://ironarray.github.io/Caterva2/wheels-staging/latest.txt`. The staging
publishing flow and the recommended notebook override are documented in
`RELEASING.rst`.
69 changes: 69 additions & 0 deletions RELEASING.rst
@@ -44,6 +44,75 @@ And experiment a bit with uploading, browsing and downloading files.
If the tests pass, you are ready to release.


Staging wheel channel
---------------------

Before publishing a production wheel for all JupyterLite users, you can publish
a staging wheel to a separate GitHub-hosted channel. This is useful for testing
changes that affect the browser-side wheel installation, such as new Pyodide
functionality or notebook helpers, without changing the production
``wheels/latest.txt`` pointer.

The wheel publishing workflow supports two channels:

- production: ``https://ironarray.github.io/Caterva2/wheels/``
- staging: ``https://ironarray.github.io/Caterva2/wheels-staging/``

Each channel gets its own ``latest.txt`` file:

- production: ``https://ironarray.github.io/Caterva2/wheels/latest.txt``
- staging: ``https://ironarray.github.io/Caterva2/wheels-staging/latest.txt``

The staging channel is published by manually running the
``Build and Publish Python Wheels for Caterva2`` workflow with
``channel=staging``.

To do a staging release:

- Push the branch you want to test to GitHub.

- Open the workflow page for
``Build and Publish Python Wheels for Caterva2``.

- Click ``Run workflow``.

- Select the branch to build.

- Select ``channel=staging``.

- Run the workflow.

After it finishes, the built wheel is published under ``wheels-staging/``; the
production ``wheels/`` channel is left untouched.

The workflow also publishes these helper files in the selected channel:

- ``latest.txt``: latest wheel filename in that channel
- ``commit.txt``: commit SHA used to build the wheel
- ``ref.txt``: Git ref name used to build the wheel
- ``channel.txt``: published channel name

Testing a staging wheel from JupyterLite
----------------------------------------

For notebook testing, point the Pyodide install to the staging channel instead
of the production one. For example::

import sys
if sys.platform == "emscripten":
import requests
import micropip

caterva_latest_url = "https://ironarray.github.io/Caterva2/wheels-staging/latest.txt"
caterva_wheel_name = requests.get(caterva_latest_url).text.strip()
caterva_wheel_url = f"https://ironarray.github.io/Caterva2/wheels-staging/{caterva_wheel_name}"
await micropip.install(caterva_wheel_url)
print(f"Installed staging wheel: {caterva_wheel_name}")

Use a fresh browser tab or kernel when testing a new staging wheel, so Pyodide
does not reuse a previously installed package from the same session.


Check documentation
-------------------

174 changes: 174 additions & 0 deletions TESTING-LLM.md
@@ -0,0 +1,174 @@
# Testing the Server-Side LLM Integration from JupyterLite

This document describes how to test the new server-side LLM feature from a
JupyterLite notebook, using a staging Caterva2 wheel first so the production
wheel channel is not affected.

## 1. Publish a staging Caterva2 wheel

1. Push the branch you want to test to GitHub.

2. Open the GitHub Actions workflow:
`Build and Publish Python Wheels for Caterva2`

3. Click `Run workflow`.

4. Select the branch you want to test.

5. Select `channel=staging`.

6. Run the workflow and wait for it to finish successfully.

## 2. Verify the staging wheel was published

Check the staging wheel channel:

- `https://ironarray.github.io/Caterva2/wheels-staging/latest.txt`

Optional metadata checks:

- `https://ironarray.github.io/Caterva2/wheels-staging/commit.txt`
- `https://ironarray.github.io/Caterva2/wheels-staging/ref.txt`
- `https://ironarray.github.io/Caterva2/wheels-staging/channel.txt`

Make sure the commit and ref match the branch you intended to test.
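The checks above can also be scripted. The sketch below only builds the helper-file URLs for a channel (`channel_metadata_urls` is an illustrative helper, not part of Caterva2); actually fetching them needs network access, so that part is left as a comment:

```python
BASE = "https://ironarray.github.io/Caterva2"

def channel_metadata_urls(channel="staging"):
    """Build the helper-file URLs published for a wheel channel."""
    # Production lives under wheels/, other channels under wheels-<name>/
    channel_dir = "wheels" if channel == "production" else f"wheels-{channel}"
    return {
        name: f"{BASE}/{channel_dir}/{name}.txt"
        for name in ("latest", "commit", "ref", "channel")
    }

# To actually inspect a published channel (needs network access):
#   from urllib.request import urlopen
#   for name, url in channel_metadata_urls("staging").items():
#       print(name, urlopen(url).read().decode().strip())
```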

## 3. Start the Caterva2 server with the new backend

The notebook wheel only provides the client-side Python package. The server
must also be running the new backend code from the same branch.

Before testing, ensure:

- the Caterva2 server is started from this branch
- LLM support is enabled in the server configuration
- the desired provider is configured
- the required provider API key is available in the server environment if using
a real provider such as Groq

If you only want a lightweight backend smoke test, you can also configure the
server to use the `mock` provider.

## 4. Point the notebook to the staging wheel

In your JupyterLite test notebook, replace the Caterva2 production wheel
bootstrap with the staging URL.

Example:

```python
import sys

if sys.platform == "emscripten":
import requests
import micropip

caterva_latest_url = (
"https://ironarray.github.io/Caterva2/wheels-staging/latest.txt"
)
caterva_wheel_name = requests.get(caterva_latest_url).text.strip()
caterva_wheel_url = (
f"https://ironarray.github.io/Caterva2/wheels-staging/{caterva_wheel_name}"
)
await micropip.install(caterva_wheel_url)
print(f"Installed staging wheel: {caterva_wheel_name}")
```

## 5. Open a fresh JupyterLite session

Use a fresh browser tab or a fresh notebook kernel before testing. This avoids
reusing a previously installed `caterva2` wheel from the same Pyodide session.

## 6. Open the LLM test notebook

Open:

- `_caterva2/state/personal/cd46395a-3517-4c48-baba-186d14b0fd94/prova3.ipynb`

This notebook contains helper code for:

- creating a server-side LLM session
- sending prompts with `ask(...)`
- resetting the session
- deleting the session

## 7. Run the notebook cells

1. Run the bootstrap cell and confirm the staging Caterva2 wheel installs.

2. Run the LLM setup cell and confirm it prints an LLM session id.

## 8. Run smoke-test prompts

Use the helper functions from the notebook to test the main flow:

```python
ask("List the available roots")
ask("List datasets under @public/dir1")
ask("Show metadata for @public/ds-1d.b2nd")
ask("Show stats for @public/ds-1d.b2nd", show_trace=True)
```

Check that:

- the response text is returned
- the trace output lists the expected tool calls
- metadata and stats look correct

## 9. Test session lifecycle

From the notebook, test:

```python
reset_agent_session()
ask("List the available roots")
delete_agent_session()
new_agent_session()
```

Confirm that:

- reset keeps the session usable
- delete removes the current session
- a new session can be created afterward

## 10. Test authentication behavior

If login is enabled on the server:

- test from an authenticated JupyterLite session
- confirm the LLM session can be created and used
- confirm anonymous access is rejected when `llm_allow_public_access` is false

## 11. Check server-side behavior

While exercising the notebook, inspect the Caterva2 server logs and verify:

- requests are reaching `/api/llm-agent/...`
- the expected provider is being used
- tool failures, auth failures, or provider errors are visible and readable

## 12. After the staging test

If the staging test passes:

1. restore the notebook bootstrap to production URLs, unless you want to keep a
staging-only notebook
2. publish the production wheel channel
3. rerun the same notebook smoke tests against the production wheel

## Quick checklist

- branch pushed to GitHub
- staging wheel published
- staging wheel URLs verified
- server started from the tested branch
- LLM backend enabled on the server
- provider config and API key verified
- fresh JupyterLite session opened
- notebook installs Caterva2 from `wheels-staging`
- session creation works
- prompts work
- reset/delete/new session works
- auth behavior is correct
- server logs look good
9 changes: 9 additions & 0 deletions caterva2-server.sample.toml
@@ -19,3 +19,12 @@ urlbase = "http://localhost:8000"
quota = "10G"
maxusers = 5
register = true # allow users to register

[server.llm]
enabled = true
# provider = "mock"
provider = "groq"
model = "openai/gpt-oss-20b"
#model = "openai/gpt-oss-120b"
allow_public_access = false
session_ttl_seconds = 1800