
feat: PoC - An AiCore wrapper for OpenAi implementation #806

Open
rpanackal wants to merge 5 commits into feat/poc-openai-responses-apache from feat/poc-aicore-openai-wrapper

Conversation


@rpanackal rpanackal commented Mar 30, 2026

Context

AI/ai-sdk-java-backlog#364.

This follows up on a comment in the related PR.

Builds on the HTTP adapter approach and adds a service-level interception layer to handle the "model" field transparently.

Problem: the OpenAI SDK's request params require a model field in the body, so users are forced to specify the model twice: once when resolving the deployment and once in the request parameters. Omitting it causes a server-side error.

More generally, this highlights that we have had limited control over how the OpenAI SDK is used until now.

Solution: SDK service-level delegation
AiCoreOpenAiClient.forModel(...) returns a standard com.openai.client.OpenAIClient. Internally, its service accessors (e.g. responses()) return AI Core specific wrapper implementations (AiCoreResponsesService) that intercept each create()/createStreaming() call before serialization at the Java API layer.

The model resolved at client creation time is injected automatically into the request params if absent. If the user explicitly sets a model that does not match the deployed model, a descriptive error is thrown.
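The interception described above can be sketched as a thin delegating service. This is a simplified illustration with hypothetical stand-in types (`ResponsesService`, `Params`, `AiCoreResponsesServiceSketch`); the actual PR wraps the SDK's real service interfaces:

```java
import java.util.Optional;

// Hypothetical simplified stand-ins for the SDK's service and params types.
interface ResponsesService {
    String create(Params params);
}

record Params(Optional<String> model) {}

// The AI Core wrapper intercepts create() before delegating: it injects the
// model resolved at client creation when absent, and rejects mismatches.
final class AiCoreResponsesServiceSketch implements ResponsesService {
    private final ResponsesService delegate;
    private final String deployedModel;

    AiCoreResponsesServiceSketch(ResponsesService delegate, String deployedModel) {
        this.delegate = delegate;
        this.deployedModel = deployedModel;
    }

    @Override
    public String create(Params params) {
        final String model = params.model().orElse(deployedModel);
        if (!model.equals(deployedModel)) {
            throw new IllegalArgumentException(
                "Model '" + model + "' does not match deployed model '" + deployedModel + "'");
        }
        return delegate.create(new Params(Optional.of(model)));
    }
}
```

Because the wrapper implements the same interface as the wrapped service, callers keep using the standard SDK surface and never see the interception.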

Definition of Done

  • Functionality scope stated & covered
  • Tests cover the scope above
  • Error handling created / updated & covered by the tests above
  • Aligned changes with the JavaScript SDK
  • Documentation updated
  • Release notes updated

@rpanackal rpanackal changed the base branch from main to feat/poc-openai-responses-apache March 30, 2026 16:27
Contributor

@newtork newtork left a comment


It's a good idea!

Member Author

@rpanackal rpanackal left a comment


I am halting development on the solution proposed in this PR, since it doesn't scale well for the value it currently provides.

Wrapping the OpenAI SDK API is not feasible, given the size of its public API surface.

Comment on lines +109 to +124
@Test
@Disabled("Fails as Async client needs additional wrappers. Maintenance wall.")
void testAsyncChatCompletion() {
  final var params =
      ChatCompletionCreateParams.builder()
          .model(ChatModel.GPT_5)
          .addUserMessage("Say this is a test")
          .build();

  final CompletableFuture<ChatCompletion> future =
      client.async().chat().completions().create(params);

  final ChatCompletion response = future.join();

  assertThat(response).isNotNull();
  assertThat(response.choices()).isNotEmpty();
}
Member Author


Limitation 1:

The async client requires its own set of wrappers. The test fails because AI Core expects the query parameter api-version. Currently, AiCoreChatService can handle the synchronous case, but merely enabling the async scenario would double the number of wrappers.
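The doubling can be illustrated abstractly: every blocking service has an async twin with CompletableFuture-based signatures, so each AI Core wrapper must exist twice. The types, names, and api-version value below are illustrative, not the actual SDK interfaces:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical minimal stand-ins: the SDK exposes each service twice,
// once blocking and once async.
interface ChatService {
    String create(String params);
}

interface AsyncChatService {
    CompletableFuture<String> create(String params);
}

// Both variants need their own AI Core wrapper (here shown appending an
// illustrative api-version query parameter), doubling the wrapper count.
final class AiCoreChatServiceSketch implements ChatService {
    private final ChatService delegate;

    AiCoreChatServiceSketch(ChatService delegate) {
        this.delegate = delegate;
    }

    @Override
    public String create(String params) {
        return delegate.create(params + "?api-version=2024-02-01");
    }
}

final class AiCoreAsyncChatServiceSketch implements AsyncChatService {
    private final AsyncChatService delegate;

    AiCoreAsyncChatServiceSketch(AsyncChatService delegate) {
        this.delegate = delegate;
    }

    @Override
    public CompletableFuture<String> create(String params) {
        return delegate.create(params + "?api-version=2024-02-01");
    }
}
```

Neither wrapper can reuse the other, because the interfaces differ in return type; that is the maintenance wall.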

Comment on lines +53 to +60
@Override
@Nonnull
public ChatCompletion create(
@Nonnull final ChatCompletionCreateParams params,
@Nonnull final RequestOptions requestOptions) {
throwOnModelMismatch(params.model().asString());
return delegate.create(params, requestOptions);
}
Member Author

@rpanackal rpanackal Apr 1, 2026


Limitation 2:

In AiCoreResponsesService, we are able to plug in the model ourselves, sparing the user from having to specify it twice: once for deployment resolution and once in ResponseCreateParams (otherwise the server returns 400).

client = AiCoreOpenAiClient.fromDestination(destination, OpenAiModel.GPT_5);
var params =
    ResponseCreateParams.builder()
        .input("What is the capital of France?")
        .model(ChatModel.GPT_5)            // Users can leave this out
        .build();
Response response = client.responses().create(params);

The same is not true for AiCoreChatCompletionService: ChatCompletionCreateParams.builder().build() throws right away when no model is declared, so we cannot provide the same convenience by plugging in the model downstream.

final var params =
    ChatCompletionCreateParams.builder()
        .model(ChatModel.GPT_5)            // without this line, build() throws
        .addUserMessage("Say this is a test")
        .build();

Why not switch to dynamic deployment resolution for the model in params?

AI Core supports multiple operations on the following endpoints:

"/chat/completions": POST
"/responses": GET, POST
"/responses/{response_id}": GET, DELETE
"/responses/compact": POST

Except for the POST operations, there is no model available in the payload. So it is unavoidable to ask the user for a model in order to resolve the deployment URL.
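The constraint can be sketched as follows. The model-to-deployment mapping, URL scheme, and class name below are illustrative, not the actual AI Core resolution logic; the point is that the base URL has to be fixed at client creation, because payload-less operations offer no model to resolve dynamically:

```java
import java.util.Map;

// Illustration only: the deployment mapping and URL scheme are made up.
final class AiCoreBaseUrlSketch {
    private static final Map<String, String> DEPLOYMENTS = Map.of("gpt-5", "d123456");
    private final String baseUrl;

    // The model must be known here, at client creation time ...
    AiCoreBaseUrlSketch(String model) {
        final String deploymentId = DEPLOYMENTS.get(model);
        if (deploymentId == null) {
            throw new IllegalArgumentException("No deployment for model " + model);
        }
        this.baseUrl = "/v2/inference/deployments/" + deploymentId;
    }

    // ... because payload-less operations like GET /responses/{response_id}
    // or DELETE /responses/{response_id} carry no model from which a
    // deployment could be resolved per request.
    String urlFor(String path) {
        return baseUrl + path;
    }
}
```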
