feat: PoC - An AiCore wrapper for OpenAi implementation #806
rpanackal wants to merge 5 commits into feat/poc-openai-responses-apache from
Conversation
rpanackal left a comment
I am halting development on the solution proposed in this PR since it doesn't scale well for the value it currently provides.
Wrapping the OpenAI SDK API is not feasible, considering the size of its public API surface.
```java
@Test
@Disabled("Fails as Async client needs additional wrappers. Maintenance wall.")
void testAsyncChatCompletion() {
  final var params =
      ChatCompletionCreateParams.builder()
          .model(ChatModel.GPT_5)
          .addUserMessage("Say this is a test")
          .build();

  final CompletableFuture<ChatCompletion> future =
      client.async().chat().completions().create(params);

  final ChatCompletion response = future.join();

  assertThat(response).isNotNull();
  assertThat(response.choices()).isNotEmpty();
}
```
Limitation 1:
The async client requires its own set of wrappers. The test fails because AI Core expects the query parameter `api-version`. Currently, `AiCoreChatService` can handle the synchronous cases, but merely enabling the async scenario would double the number of wrappers.
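To illustrate the maintenance wall: every interception we write for the sync service needs an async twin with identical logic. A minimal sketch, using hypothetical stand-in interfaces rather than the real OpenAI SDK types (the `api-version` value below is illustrative, not the real one):

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for one sync/async service pair of the OpenAI SDK;
// request params are reduced to a plain String for brevity.
interface ChatService {
    String create(String request);
}

interface ChatServiceAsync {
    CompletableFuture<String> create(String request);
}

// Sync wrapper: appends the api-version query parameter AI Core expects.
class AiCoreChatServiceSketch implements ChatService {
    private final ChatService delegate;
    AiCoreChatServiceSketch(ChatService delegate) { this.delegate = delegate; }
    @Override public String create(String request) {
        return delegate.create(request + "?api-version=2024-02-01"); // illustrative version
    }
}

// ...and its unavoidable async twin: the same interception, duplicated.
class AiCoreChatServiceAsyncSketch implements ChatServiceAsync {
    private final ChatServiceAsync delegate;
    AiCoreChatServiceAsyncSketch(ChatServiceAsync delegate) { this.delegate = delegate; }
    @Override public CompletableFuture<String> create(String request) {
        return delegate.create(request + "?api-version=2024-02-01"); // illustrative version
    }
}
```

Multiply this duplication across the SDK's full service surface and the wrapper count doubles for no new functionality.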
```java
@Override
@Nonnull
public ChatCompletion create(
    @Nonnull final ChatCompletionCreateParams params,
    @Nonnull final RequestOptions requestOptions) {
  throwOnModelMismatch(params.model().asString());
  return delegate.create(params, requestOptions);
}
```
Limitation 2:
In `AiCoreResponsesService` we are able to plug in the model ourselves, removing the burden of the user having to mention it twice: once for deployment resolution and once in `ChatCompletionCreateParams` (otherwise the server returns 400).
```java
client = AiCoreOpenAiClient.fromDestination(destination, OpenAiModel.GPT_5);

var params =
    ResponseCreateParams.builder()
        .input("What is the capital of France?")
        .model(ChatModel.GPT_5) // Users can leave this out
        .build();

Response response = client.responses().create(params);
```

The same is not true for `AiCoreChatCompletionService`: `ChatCompletionCreateParams` throws right away when no model is declared, so we cannot offer the user the convenience of plugging in the model downstream.
```java
final var params =
    ChatCompletionCreateParams.builder()
        .model(ChatModel.GPT_5) // -> build() throws without model
        .addUserMessage("Say this is a test")
        .build();
```

Why not switch to dynamic deployment resolution for the model in the params?
AI Core supports multiple operations for the following endpoints:

- `"/chat/completions"`: POST
- `"/responses"`: GET, POST
- `"/responses/{response_id}"`: GET, DELETE
- `"/responses/compact"`: POST

Except for the POST operations, there is no model available in the payload, so asking the user for a model to resolve the deployment URL becomes unavoidable.
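Since only the POST bodies carry a model at all, the interception has to happen at the params level. A minimal sketch of that inject-if-absent / fail-on-mismatch logic, using hypothetical stand-in types rather than the PR's actual classes or the real SDK builders:

```java
import java.util.Optional;

// Hypothetical stand-in for ResponseCreateParams, where the model is optional
// at build time (unlike ChatCompletionCreateParams, which throws without one).
record ResponseParams(Optional<String> model, String input) {
    ResponseParams withModel(String m) { return new ResponseParams(Optional.of(m), input); }
}

// Sketch of the interception performed before the request is serialized:
// inject the deployment's model when absent, fail fast on a mismatch.
class AiCoreResponsesServiceSketch {
    private final String deployedModel;
    AiCoreResponsesServiceSketch(String deployedModel) { this.deployedModel = deployedModel; }

    ResponseParams intercept(ResponseParams params) {
        if (params.model().isEmpty()) {
            return params.withModel(deployedModel); // model resolved at client creation
        }
        if (!params.model().get().equals(deployedModel)) {
            throw new IllegalArgumentException(
                "Model mismatch: params declare " + params.model().get()
                    + " but the deployment serves " + deployedModel);
        }
        return params;
    }
}
```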
Context
AI/ai-sdk-java-backlog#364.
This is in reference to the comment in the related PR.
Builds on the HTTP adapter approach and adds a service-level interception layer to handle the `"model"` field transparently.

Problem: the OpenAI SDK's request params require a `model` field in the body, meaning users are forced to specify the model twice: once when resolving the deployment and once in the request parameters. Omitting it causes a server-side error. More generally, this illustrates how limited our control over OpenAI SDK usage has been so far.
Solution: SDK service-level delegation
`AiCoreOpenAiClient.forModel(...)` returns a standard `com.openai.client.OpenAIClient`. Internally, its service accessors (e.g. `responses()`) return AI Core specific wrapper implementations (`AiCoreResponsesService`) that intercept each `create()`/`createStreaming()` call before serialization at the Java API layer. The model resolved at client creation time is injected automatically into the request params if absent. If the user explicitly sets a model that does not match the deployed model, a descriptive error is thrown.
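The delegation shape described above can be sketched as follows. All names are hypothetical, heavily simplified stand-ins (the real `com.openai.client.OpenAIClient` exposes far more services, and params are reduced here to just the model string):

```java
// Hypothetical, minimal slice of the client surface.
interface ResponsesServiceSurface {
    String create(String requestedModel); // stand-in: params reduced to the model field
}

interface OpenAiClientSurface {
    ResponsesServiceSurface responses();
}

final class AiCoreOpenAiClientSketch {
    // forModel(...) hands back the standard interface, but the accessor returns
    // a wrapper that intercepts the call before it reaches the delegate.
    static OpenAiClientSurface forModel(OpenAiClientSurface delegate, String deployedModel) {
        return () -> requestedModel -> {
            final String effective;
            if (requestedModel.isEmpty()) {
                effective = deployedModel; // inject model resolved at client creation
            } else if (!requestedModel.equals(deployedModel)) {
                throw new IllegalArgumentException(
                    "Model mismatch: request declares " + requestedModel
                        + " but deployment serves " + deployedModel);
            } else {
                effective = requestedModel;
            }
            return delegate.responses().create(effective);
        };
    }
}
```

Callers keep programming against the standard client surface; the AI Core specifics stay hidden behind the accessor.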
Definition of Done
Aligned changes with the JavaScript SDK