ContentChunker

class ContentChunker(maxChunkSize: Int = 1500, overlapSize: Int = 200, minChunkSize: Int = 2000)

Converts MaterializedContainerSection objects into Chunk objects, splitting long text into overlapping chunks.

When the total content of a container section (aggregated from its leaves) is small, a single chunk containing all leaf content is created. Large leaf sections within a container are split individually into multiple chunks.
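
The decision described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: LeafSketch, ContainerSketch, ChunkSketch and chunkContainer are hypothetical stand-ins, since the fields of MaterializedContainerSection and Chunk are not shown on this page. Using minChunkSize as the "small" threshold is also an assumption, and String.chunked is used as a plain non-overlapping placeholder for the real splitting step.

```kotlin
// Hypothetical stand-in types; the real MaterializedContainerSection and
// Chunk classes are not documented on this page.
data class LeafSketch(val id: String, val text: String)
data class ContainerSketch(val id: String, val leaves: List<LeafSketch>)
data class ChunkSketch(val sourceId: String, val text: String)

// Hypothetical helper illustrating the small-container / large-leaf decision.
fun chunkContainer(
    container: ContainerSketch,
    maxChunkSize: Int = 1500,
    minChunkSize: Int = 2000,
): List<ChunkSketch> {
    val combined = container.leaves.joinToString("\n\n") { it.text }

    // Assumption: a container counts as "small" when its aggregated text is
    // shorter than minChunkSize. It then becomes one chunk with all leaf content.
    if (combined.length < minChunkSize) {
        return listOf(ChunkSketch(container.id, combined))
    }

    // Otherwise each leaf is handled on its own; only large leaves are split.
    return container.leaves.flatMap { leaf ->
        if (leaf.text.length < minChunkSize) {
            listOf(ChunkSketch(leaf.id, leaf.text))
        } else {
            // Placeholder split without overlap (stdlib String.chunked).
            leaf.text.chunked(maxChunkSize).map { ChunkSketch(leaf.id, it) }
        }
    }
}
```

Under these assumptions, a container with two 100-character leaves yields one chunk, while a container holding a 3000-character leaf and a 100-character leaf yields three.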

Parameters

maxChunkSize

Maximum characters per chunk (default: 1500)

overlapSize

Characters of overlap between chunks (default: 200)

minChunkSize

Minimum number of characters a section must contain before it is split (default: 2000)
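
To show how the three parameters could interact, here is a minimal sliding-window sketch. splitWithOverlap is a hypothetical helper, not part of the documented API. It assumes that text shorter than minChunkSize is kept whole and that longer text is cut into windows of maxChunkSize characters, each starting maxChunkSize - overlapSize characters after the previous one. The actual splitter may well break on sentence or paragraph boundaries instead.

```kotlin
// Hypothetical helper, not part of ContentChunker's documented API.
fun splitWithOverlap(
    text: String,
    maxChunkSize: Int = 1500,
    overlapSize: Int = 200,
    minChunkSize: Int = 2000,
): List<String> {
    require(maxChunkSize > 0) { "maxChunkSize must be positive" }
    require(overlapSize in 0 until maxChunkSize) {
        "overlapSize must be in [0, maxChunkSize)"
    }

    // Assumed reading of minChunkSize: shorter text is not split at all.
    if (text.length < minChunkSize) return listOf(text)

    // Each window starts `step` characters after the previous one, so
    // consecutive windows share `overlapSize` characters.
    val step = maxChunkSize - overlapSize
    val chunks = mutableListOf<String>()
    var start = 0
    while (start < text.length) {
        val end = minOf(start + maxChunkSize, text.length)
        chunks += text.substring(start, end)
        if (end == text.length) break
        start += step
    }
    return chunks
}
```

With the defaults, a 3000-character input produces three chunks of 1500, 1500 and 400 characters. Note that under this reading an 1800-character input stays in one chunk even though it exceeds maxChunkSize, because the default minChunkSize is larger than the default maxChunkSize.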

Constructors

constructor(maxChunkSize: Int = 1500, overlapSize: Int = 200, minChunkSize: Int = 2000)

Types

object Companion
data class SplitterConfig(val maxChunkSize: Int = 1500, val overlapSize: Int = 200, val minChunkSize: Int = 2000)

Configuration for the splitter

Functions

Split a MaterializedContainerSection into one or more Chunks

Split multiple MaterializedContainerSections into Chunks