DocumentExtractor

fun interface DocumentExtractor

Extracts memory records from messages during ingestion.

This is a functional interface (SAM) that defines how a list of messages should be transformed into a list of TextDocuments for storage. It provides flexibility in how messages are filtered, transformed, and converted into TextDocuments while maintaining type safety.

Pre-built implementations, such as MessagePassingDocumentExtractor, are available for common ingestion patterns.

Usage Examples

Using pre-built extractors (Kotlin):

// Extract User and Assistant messages (default)
val extractor = MessagePassingDocumentExtractor()

// Extract only User messages
val userOnlyExtractor = MessagePassingDocumentExtractor(
    messageRolesToExtract = setOf(Message.Role.User)
)

Custom implementation as lambda (Kotlin):

val customExtractor = DocumentExtractor { messages ->
    messages
        .filter { it.role == Message.Role.Assistant }
        .map { MemoryRecord(content = it.content) }
}

Custom implementation as lambda (Java):

DocumentExtractor customExtractor = (messages) ->
    messages.stream()
        .filter(m -> m.getRole() == Message.Role.Assistant)
        .map(m -> new MemoryRecord(m.getContent(), null, Collections.emptyMap()))
        .collect(Collectors.toList());

Inheritors

Types

object Companion

Companion object with a builder method.

Functions

abstract suspend fun extract(messages: List<Message>): List<TextDocument>

Transforms a list of messages into a list of TextDocuments for storage.
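
Because extract is a suspend function, it has to be called from a coroutine or another suspend function. The following is a minimal, self-contained sketch of that call path. The Message and TextDocument classes and the DocumentExtractor declaration here are simplified stand-ins written only so the example compiles on its own; the library's real classes are assumed to have different constructors and fields. The userOnly extractor and its trimming behavior are likewise hypothetical.

```kotlin
// Stand-in types (assumptions): the real library classes will differ.
data class Message(val role: Role, val content: String) {
    enum class Role { User, Assistant, System }
}

data class TextDocument(val text: String)

// Mirrors the signature documented above: a single abstract suspend function.
fun interface DocumentExtractor {
    suspend fun extract(messages: List<Message>): List<TextDocument>
}

// Hypothetical extractor: keeps non-blank User messages and trims whitespace.
val userOnly = DocumentExtractor { messages ->
    messages
        .filter { it.role == Message.Role.User && it.content.isNotBlank() }
        .map { TextDocument(it.content.trim()) }
}

// A suspend main lets us call extract without any coroutine library.
suspend fun main() {
    val docs = userOnly.extract(
        listOf(
            Message(Message.Role.User, "  My favorite color is green.  "),
            Message(Message.Role.Assistant, "Noted."),
            Message(Message.Role.User, "   "),
        )
    )
    println(docs) // prints [TextDocument(text=My favorite color is green.)]
}
```

The lambda form relies on SAM conversion, so the extractor stays a plain value that can be passed wherever a DocumentExtractor is expected. In application code that already uses kotlinx.coroutines, the call would typically sit inside an existing coroutine scope rather than a suspend main.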