prompt-executor-openai-client

A client implementation for executing prompts using OpenAI's GPT models, with support for images, audio, and custom parameters. Includes support for both the Chat Completions and Responses APIs.

Overview

This module provides a client implementation for the OpenAI API, allowing you to execute prompts using GPT models. It handles authentication, request formatting, response parsing, and multimodal content encoding specific to OpenAI's API requirements.

Supported Models

Reasoning Models

| Model   | Speed   | Context | Input Support       | Output Support | Pricing (per 1M tokens, input / output) | APIs Support    |
|---------|---------|---------|---------------------|----------------|-----------------------------------------|-----------------|
| o4-mini | Medium  | 200K    | Text, Images, Tools | Text, Tools    | $1.1 / $4.4                             | Chat, Responses |
| o3-mini | Medium  | 200K    | Text, Tools         | Text, Tools    | $1.1 / $4.4                             | Chat, Responses |
| o1-mini | Slow    | 128K    | Text                | Text           | $1.1 / $4.4                             | Chat            |
| o3      | Slowest | 200K    | Text, Images, Tools | Text, Tools    | $10 / $40                               | Chat, Responses |
| o1      | Slowest | 200K    | Text, Images, Tools | Text, Tools    | $15 / $60                               | Chat, Responses |

Chat Models

| Model      | Speed     | Context | Input Support           | Output Support | Pricing (per 1M tokens, input / output) | APIs Support    |
|------------|-----------|---------|-------------------------|----------------|-----------------------------------------|-----------------|
| GPT-4o     | Medium    | 128K    | Text, Images, Tools     | Text, Tools    | $2.5 / $10                              | Chat, Responses |
| GPT-4.1    | Medium    | 1M      | Text, Images, Tools     | Text, Tools    | $2 / $8                                 | Chat, Responses |
| GPT-5      | Medium    | 400K    | Text, Images, Documents | Text, Tools    | $1.25 / $10                             | Chat, Responses |
| GPT-5 Mini | Fast      | 400K    | Text, Images, Documents | Text, Tools    | $0.25 / $2                              | Chat, Responses |
| GPT-5 Nano | Very fast | 400K    | Text, Images, Documents | Text, Tools    | $0.05 / $0.4                            | Chat, Responses |

Audio Models

| Model             | Speed  | Context | Input Support      | Output Support     | Pricing (per 1M tokens, input / output) |
|-------------------|--------|---------|--------------------|--------------------|------------------------------------------|
| GPT Audio         | Medium | 128K    | Text, Audio, Tools | Text, Audio, Tools | $2.5 / $10                               |
| GPT-4o Mini Audio | Fast   | 128K    | Text, Audio, Tools | Text, Audio, Tools | $0.15 / $0.6 (text), $10 / $20 (audio)   |
| GPT-4o Audio      | Medium | 128K    | Text, Audio, Tools | Text, Audio, Tools | $2.5 / $10 (text), $40 / $80 (audio)     |

Cost-Optimized Models

| Model        | Speed     | Context | Input Support       | Output Support | Pricing (per 1M tokens, input / output) | APIs Support    |
|--------------|-----------|---------|---------------------|----------------|-----------------------------------------|-----------------|
| o4-mini      | Medium    | 200K    | Text, Images, Tools | Text, Tools    | $1.1 / $4.4                             | Chat, Responses |
| GPT-4o Mini  | Medium    | 128K    | Text, Images, Tools | Text, Tools    | $0.15 / $0.6                            | Chat, Responses |
| GPT-4.1-nano | Very fast | 1M      | Text, Images, Tools | Text, Tools    | $0.1 / $0.4                             | Chat, Responses |
| GPT-4.1-mini | Fast      | 1M      | Text, Images, Tools | Text, Tools    | $0.4 / $1.6                             | Chat, Responses |
| o3-mini      | Medium    | 200K    | Text, Tools         | Text, Tools    | $1.1 / $4.4                             | Chat, Responses |

Embedding Models

| Model                  | Speed  | Dimensions | Input Support | Pricing (per 1M tokens) |
|------------------------|--------|------------|---------------|-------------------------|
| text-embedding-3-small | Medium | 1536       | Text          | $0.02                   |
| text-embedding-3-large | Slow   | 3072       | Text          | $0.13                   |
| text-embedding-ada-002 | Slow   | 1536       | Text          | $0.1                    |

Media Content Support

| Content Type | Supported Formats    | Max Size | Notes                                |
|--------------|----------------------|----------|--------------------------------------|
| Images       | PNG, JPEG, WebP, GIF | 20MB     | Base64 encoded or URL                |
| Audio        | WAV, MP3             | 25MB     | Base64 encoded only (audio models)   |
| Documents    | PDF                  | 20MB     | Base64 encoded only (vision models)  |
| Video        | ❌ Not supported      | -        | -                                    |

Important Details:

  • Images: Both URL and base64 supported

  • Audio: Only WAV and MP3 formats, base64 only

  • PDF Documents: Only PDF format, requires vision capability

  • Model Requirements: audio input requires the Audio capability, and PDF input requires the Vision.Image capability (see the sketch after this list)
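
A minimal capability guard before sending multimodal content, sketched under the assumption that Koog's LLModel exposes a capabilities list of LLMCapability values (property and constant names may differ between versions):

fun requireCapability(model: LLModel, capability: LLMCapability) {
    // Assumption: model.capabilities lists the capabilities the model
    // supports; verify the property name against your Koog version.
    require(capability in model.capabilities) {
        "Model ${model.id} lacks required capability: $capability"
    }
}

// Usage: guard before attaching audio or PDF content.
requireCapability(OpenAIModels.Audio.GPT4oAudio, LLMCapability.Audio)
requireCapability(OpenAIModels.Chat.GPT5, LLMCapability.Vision.Image)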

Model-Specific Parameters Support

OpenAI Chat Parameters

The client supports OpenAI-specific parameters through the OpenAIChatParams class:

val chatParams = OpenAIChatParams(
    temperature = 0.7,
    maxTokens = 1000,
    frequencyPenalty = 0.5,
    presencePenalty = 0.5,
    topP = 0.9,
    stop = listOf("\n", "END"),
    logprobs = true,
    topLogprobs = 5,
    reasoningEffort = ReasoningEffort.MEDIUM,
    parallelToolCalls = true,
    audio = OpenAIAudioConfig(voice = "alloy", format = "mp3"),
    webSearchOptions = OpenAIWebSearchOptions(enabled = true)
)
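
These parameters are passed through the params argument of execute, as the usage examples further below show.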

OpenAI Responses API Parameters

For the Responses API, use OpenAIResponsesParams:

val responsesParams = OpenAIResponsesParams(
    temperature = 0.7,
    maxTokens = 1000,
    background = true,
    include = listOf("sources", "citations"),
    maxToolCalls = 10,
    reasoning = ReasoningConfig(effort = ReasoningEffort.HIGH),
    truncation = Truncation(type = "auto")
)

API Endpoints Support

The client supports both OpenAI API endpoints:

  • Chat Completions API: Traditional chat completions, with streaming support (see the sketch after this list)

  • Responses API: Enhanced API with background processing, built-in tools, and structured outputs
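
Streaming on the Chat Completions path can be consumed as a Kotlin Flow. A minimal sketch, assuming the client exposes an executeStreaming method returning a Flow of text chunks (the exact name and return type may differ between Koog versions):

import kotlinx.coroutines.flow.collect

suspend fun streamCompletion(client: OpenAILLMClient) {
    // Assumption: executeStreaming(prompt, model) returns Flow<String>;
    // check your Koog version for the actual streaming API.
    client.executeStreaming(
        prompt = prompt {
            system("You are a helpful assistant")
            user("Write a haiku about Kotlin")
        },
        model = OpenAIModels.Chat.GPT4o,
    ).collect { chunk -> print(chunk) }
}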

Using in your project

Add the dependency to your project:

dependencies {
    implementation("ai.koog.prompt:prompt-executor-openai-client:$version")
}

Configure the client with your API key:

val openaiClient = OpenAILLMClient(
    apiKey = "your-openai-api-key",
)
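
If you need to route requests through a proxy or an OpenAI-compatible endpoint, the client can also take a settings object. A hedged sketch, assuming an OpenAIClientSettings type with a baseUrl parameter (verify the exact name and constructor against your Koog version):

val proxiedClient = OpenAILLMClient(
    apiKey = System.getenv("OPENAI_API_KEY"),
    // Assumption: OpenAIClientSettings(baseUrl = ...) exists in your version.
    settings = OpenAIClientSettings(baseUrl = "https://my-proxy.example.com"),
)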

Usage example

suspend fun main() {
    val client = OpenAILLMClient(
        apiKey = System.getenv("OPENAI_API_KEY"),
    )

    // Text-only example with the Chat Completions API
    val response = client.execute(
        prompt = prompt {
            system("You are a helpful assistant")
            user("What time is it now?")
        },
        model = OpenAIModels.Chat.GPT5,
        params = OpenAIChatParams(
            temperature = 0.7,
            reasoningEffort = ReasoningEffort.MEDIUM
        )
    )
    println(response)

    // Using the Responses API
    val responsesResponse = client.execute(
        prompt = prompt {
            system("You are a helpful assistant")
            user("Research the latest developments in AI")
        },
        model = OpenAIModels.Chat.GPT5,
        params = OpenAIResponsesParams(
            background = true,
            include = listOf("sources", "citations"),
            maxToolCalls = 5
        )
    )
    println(responsesResponse)
}

Multimodal Examples

// Image analysis with GPT-5
val imageResponse = client.execute(
    prompt = prompt {
        user {
            text("What do you see in this image?")
            image("/path/to/image.jpg")
        }
    },
    model = OpenAIModels.Chat.GPT5,
    params = OpenAIChatParams(
        temperature = 0.3,
        reasoningEffort = ReasoningEffort.HIGH
    )
)

// Audio transcription (requires an audio model)
val audioData = File("/path/to/audio.wav").readBytes()
val transcriptionResponse = client.execute(
    prompt = prompt {
        user {
            text("Transcribe this audio")
            audio(audioData, "wav")
        }
    },
    model = OpenAIModels.Audio.GPT4oAudio,
    params = OpenAIChatParams(
        audio = OpenAIAudioConfig(voice = "alloy", format = "mp3")
    )
)

// PDF document processing with the Responses API
val pdfResponse = client.execute(
    prompt = prompt {
        user {
            text("Summarize this PDF document with citations")
            document("/path/to/document.pdf")
        }
    },
    model = OpenAIModels.Chat.GPT5,
    params = OpenAIResponsesParams(
        include = listOf("sources", "citations"),
        reasoning = ReasoningConfig(effort = ReasoningEffort.MEDIUM)
    )
)

// Embedding example
val embedding = client.embed(
    text = "This is a sample text for embedding",
    model = OpenAIModels.Embeddings.TextEmbedding3Small
)
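
// Embeddings are typically compared with cosine similarity. The helper
// below is an illustrative sketch, not part of the client API; it assumes
// embed() returns the vector as a List<Double>.
fun cosineSimilarity(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "Embedding dimensions must match" }
    val dot = a.zip(b).sumOf { (x, y) -> x * y }
    val normA = kotlin.math.sqrt(a.sumOf { it * it })
    val normB = kotlin.math.sqrt(b.sumOf { it * it })
    return dot / (normA * normB)
}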

// Mixed content with custom parameters
val mixedResponse = client.execute(
    prompt = prompt {
        user {
            text("Compare this image with the PDF:")
            image("/path/to/chart.png")
            document("/path/to/report.pdf")
            text("What insights can you provide?")
        }
    },
    model = OpenAIModels.Chat.GPT5,
    params = OpenAIChatParams(
        temperature = 0.5,
        maxTokens = 4000,
        reasoningEffort = ReasoningEffort.HIGH,
        parallelToolCalls = true
    )
)

// Background processing with the Responses API
val backgroundResponse = client.execute(
    prompt = prompt {
        user("Research and analyze market trends for renewable energy")
    },
    model = OpenAIModels.Chat.GPT5,
    params = OpenAIResponsesParams(
        background = true,
        include = listOf("sources", "citations", "steps"),
        maxToolCalls = 20,
        reasoning = ReasoningConfig(effort = ReasoningEffort.HIGH)
    )
)
