Add Phonic Plugin to LiveKit agents #4980
Conversation
```python
        "Phonic does not support updating instructions mid-session."
    )
    return
self._opts.instructions = instructions
```
🔴 Mutating shared _opts object in update_instructions affects the model and all future sessions
When update_instructions is called, it mutates self._opts.instructions at line 258. However, self._opts is assigned as a direct reference to realtime_model._opts at line 198 (self._opts = realtime_model._opts). This means the mutation leaks into the parent RealtimeModel and any future sessions created from it.
Root Cause and Impact
The RealtimeSession.__init__ at livekit-plugins/livekit-plugins-phonic/livekit/plugins/phonic/realtime/realtime_model.py:198 stores a reference to the model's _opts:
```python
self._opts = realtime_model._opts
```
Then update_instructions at line 258 mutates that shared object:
```python
self._opts.instructions = instructions
```
Since RealtimeModel.session() can be called multiple times (e.g., when the agent activity is updated at livekit-agents/livekit/agents/voice/agent_activity.py:556), the second session would inherit the instructions set by the first session's update_instructions call, rather than the original NOT_GIVEN default. This can lead to stale or incorrect instructions being sent to new Phonic sessions.
Expected: Each session should have its own copy of options, or instructions should be stored in a session-local variable.
Actual: Instructions set in one session mutate the shared model-level options object.
Prompt for agents
In livekit-plugins/livekit-plugins-phonic/livekit/plugins/phonic/realtime/realtime_model.py, line 198, change `self._opts = realtime_model._opts` to create a copy of the options (e.g., using dataclasses.replace or copy.copy) so that mutations in update_instructions (line 258) don't affect the parent RealtimeModel or other sessions. For example, change line 198 from:
```python
self._opts = realtime_model._opts
```
to:
```python
import copy
self._opts = copy.copy(realtime_model._opts)
```
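The aliasing bug and the fix can be demonstrated in isolation. This is a minimal sketch; the `_Options` dataclass and the `Model`/`Session` classes below are hypothetical stand-ins for the plugin's real types, not the plugin code itself:

```python
import copy
from dataclasses import dataclass


@dataclass
class _Options:
    # Stand-in for the plugin's options object; "NOT_GIVEN" mimics the default sentinel.
    instructions: str = "NOT_GIVEN"


class Model:
    def __init__(self) -> None:
        self._opts = _Options()

    def session(self, *, shared: bool) -> "Session":
        # shared=True mirrors the buggy `self._opts = realtime_model._opts`;
        # shared=False mirrors the proposed `copy.copy(...)` fix.
        return Session(self._opts if shared else copy.copy(self._opts))


class Session:
    def __init__(self, opts: _Options) -> None:
        self._opts = opts

    def update_instructions(self, instructions: str) -> None:
        self._opts.instructions = instructions


model = Model()
buggy = model.session(shared=True)
buggy.update_instructions("be terse")
print(model._opts.instructions)   # "be terse" -- the mutation leaked into the model

model2 = Model()
fixed = model2.session(shared=False)
fixed.update_instructions("be terse")
print(model2._opts.instructions)  # "NOT_GIVEN" -- the model is unaffected
```

A shallow copy is enough here because only the top-level `instructions` field is reassigned; if options held mutable nested state that sessions modify, `dataclasses.replace` or `copy.deepcopy` would be safer.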
This is the pattern that the Google realtime and other realtime models use.
```python
self._opts.instructions = instructions
self._instructions_ready.set()


async def update_chat_ctx(self, chat_ctx: llm.ChatContext) -> None:
```
at the start of the session, update_chat_ctx is called with a fresh chat context and I see this log:
```
WARNI… livekit.…ns.phonic update_chat_ctx called but no new tool call outputs to send. Phonic does not support general chat context updates.
```
perhaps we could add a check if the session just started?
Yup, just added a check to log this only if the config message has already been sent to Phonic.
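The guard described above can be sketched as follows. This is a rough illustration, assuming a session-local `_config_sent` flag; the actual flag name and method signatures in the plugin may differ:

```python
import logging

logger = logging.getLogger("phonic-sketch")


class Session:
    def __init__(self) -> None:
        # Assumed flag: set once the initial config message has gone to Phonic.
        self._config_sent = False

    def mark_config_sent(self) -> None:
        self._config_sent = True

    def update_chat_ctx(self, has_new_tool_outputs: bool) -> bool:
        """Returns True if the 'no new tool call outputs' warning would fire."""
        if has_new_tool_outputs:
            return False  # tool outputs are forwarded normally, no warning
        if not self._config_sent:
            return False  # initial update at session start: stay quiet
        logger.warning(
            "update_chat_ctx called but no new tool call outputs to send."
        )
        return True


s = Session()
print(s.update_chat_ctx(False))  # False: session just started, no warning
s.mark_config_sent()
print(s.update_chat_ctx(False))  # True: config already sent, warning fires
```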
```python
def interrupt(self) -> None:
    logger.warning(
        "interrupt() is not supported by Phonic realtime model. "
        "User interruptions are automatically handled by Phonic."
```
we call interrupt() when input speech is detected (relevant lines), so this log appears even if there is no ongoing speech to interrupt. Adding this check should filter out those instances:
Suggested change, from:
```python
def interrupt(self) -> None:
    logger.warning(
        "interrupt() is not supported by Phonic realtime model. "
        "User interruptions are automatically handled by Phonic."
```
to:
```python
def interrupt(self) -> None:
    if self._current_generation:
        logger.warning(
            "interrupt() is not supported by Phonic realtime model. "
            "User interruptions are automatically handled by Phonic."
        )
```
```python
if self._current_generation is None and message.text:
    logger.debug("Starting new generation due to text in audio chunk")
    self._start_new_assistant_turn()

gen = self._current_generation
if gen is None:
    return
```
🟡 Audio-only chunks silently dropped when no active generation exists
In _handle_audio_chunk, a new generation is only created when message.text is truthy (line 566). If an audio chunk arrives with audio data but no text, and there is no active generation (self._current_generation is None), the audio is silently discarded at line 571-572.
Root Cause and Impact
The guard at line 566 checks:
```python
if self._current_generation is None and message.text:
```
But it should also account for message.audio. When an audio-only chunk (no text) arrives without an active generation (for example, if Phonic sends audio before assistant_started_speaking, or if the event was missed), the code falls through to:
```python
gen = self._current_generation  # None
if gen is None:
    return  # <-- audio silently dropped
```
The audio data in message.audio is never decoded or forwarded, causing audio loss for that chunk. The condition should be `message.text or message.audio` to also start a generation for audio-only chunks.
Impact: Potential audio loss/gaps if audio chunks arrive without an active generation. In the normal protocol flow (assistant_started_speaking → audio_chunk), this won't trigger, but any deviation from that ordering causes silent audio drops.
Suggested change, from:
```python
if self._current_generation is None and message.text:
    logger.debug("Starting new generation due to text in audio chunk")
    self._start_new_assistant_turn()

gen = self._current_generation
if gen is None:
    return
```
to:
```python
if self._current_generation is None and (message.text or message.audio):
    logger.debug("Starting new generation due to content in audio chunk")
    self._start_new_assistant_turn()

gen = self._current_generation
if gen is None:
    return
```
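The proposed guard condition can be checked in isolation. The `AudioChunk` dataclass below is a hypothetical stand-in for the real Phonic message type; only the truthiness logic is the point:

```python
from dataclasses import dataclass


@dataclass
class AudioChunk:
    # Stand-in for Phonic's audio chunk message; real field types may differ.
    text: str = ""
    audio: bytes = b""


def should_start_generation(current_generation, message: AudioChunk) -> bool:
    # The reviewer's proposed condition: start a generation when there is no
    # active one and the chunk carries either text or audio.
    return current_generation is None and bool(message.text or message.audio)


print(should_start_generation(None, AudioChunk(text="hi")))          # True
print(should_start_generation(None, AudioChunk(audio=b"\x00\x01")))  # True: audio-only no longer dropped
print(should_start_generation(None, AudioChunk()))                   # False: empty chunk
```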
This is intentional. Phonic can send silent audio chunks (or audio chunks mixed in with background noise), but we want to ignore these chunks in our integration with the LiveKit agent SDKs.
Thanks @tinalenguyen. I made some quick changes. I also just reset. Also, I think the ruff CI is having some issues due to GitHub being partially down? We might need to re-run it. https://github.com/livekit/agents/actions/runs/22639600339/job/65611716048?pr=4980
Just tested again: I consistently see this log when a tool is called. I'm checking if this is related to any plugin changes; I don't recall seeing that initially.
@qionghuang6 I think marking the generation as done upon

Is there an event similar to
EDIT: What I said below assumes the assistant speaks before calling the tool. In that case, we don't get the warning; the warning happens when the assistant doesn't say anything before calling the tool, so nothing makes its way into the text channel.

@tinalenguyen Are you saying that when a tool is called, it is in its own generation without audio or text? I think the current implementation uses the same generation as the existing audio and text. (Unless
The current flow of events when a tool call happens is:

For turns without tool calls, it is:

I tried logging when

Continuing to look into this.
```python
)

if not gen.text_ch.closed:
    gen.text_ch.send_nowait("")
```
I investigated more deeply, and I think the reason we were seeing
```
_SegmentSynchronizerImpl.playback_finished called before text/audio input is done {"text_done": false, "audio_done": true}
```
is because of this chain of events:

User says -> "Please turn on light A 5 for me"
Phonic -> assistant_started_speaking (Plugin -> starts a new generation)
Phonic -> tool_call (Plugin -> closes the current generation, but no text chunks were put in the text_ch by the time it closes)
(We need to close the current generation so that tool results actually get sent back to the plugin through `update_chat_ctx`)

However, because no text ever made it into the text channel, text_done never becomes true.

A quick fix for this is to just put an empty string in the text channel, so that the Text/Audio Synchronizer can change the state from false to true.
Thank you for the detailed overview! Just to confirm, the tool call results are received after the "assistant_started_speaking" event and that's why there's something like an in-between empty generation?
Yup, the assistant_started_speaking fires before the tool_call, because assistant_started_speaking is basically our version of saying a turn has started, and tool calls need to happen within an assistant turn.
Ahh okay, I see now! Thanks again
tinalenguyen left a comment:

lgtm and works great, thanks!
Implements the Phonic plugin in the LiveKit Agents Python SDK.
(Original LiveKit Agents JS implementation: https://github.com/Phonic-Co/livekit-agents-js/tree/main/plugins/phonic)
Changes
- `RealtimeSession` and `RealtimeModel`, supporting STS and tool calls
- `examples/voice-agents`

Testing
Demo video