Corpus Workspace

Topic Clusters

Automatic grouping of related content by similarity and theme across your entire corpus.

Automatic grouping of related material across your corpus by thematic similarity — documents, conversation segments, extracted outputs, and notes that share concepts or subjects are clustered together without manual tagging.


Core Capabilities

Emergent

Clusters form from the content itself, with no manual tagging or folder organisation required.

Dynamic

Clusters evolve as new material enters the corpus, splitting and merging naturally.

Cross-Source

Documents, conversations, and extractions land in the same cluster when they share themes.

The Problem We Solve

"My folders don't reflect how I actually think"

Folder hierarchies impose a single, rigid structure on material that belongs in multiple categories. Topic Clusters allow content to appear in every cluster where it's relevant — no duplication, no forced hierarchy.

"I organise things but can never maintain it"

Manual tagging is inconsistent and incomplete. You tag diligently for a week, then stop. Topic Clusters are automatic and continuous — they maintain themselves as your corpus grows.

"I know these ideas connect but I can't see how"

Two conversations from different months, a document from a different project, and a code snippet from a tutorial can all relate to the same underlying concept. Topic Clusters surface the connections that manual organisation misses.


How It Works

  1. Analyse — Every source entering the corpus is analysed for thematic content using n-gram patterns, structural similarity, and semantic closeness
  2. Cluster — Sources with sufficient thematic overlap are grouped into clusters automatically
  3. Evolve — As new material enters, clusters grow, split, or merge based on the evolving landscape of themes
  4. Navigate — Browse the cluster hierarchy from broad themes to narrow sub-topics, or use clusters as filters in the Corpus Index
  5. Connect — Clusters with strong inter-connections surface on the Knowledge Graph as structural relationships
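
To make the Analyse and Cluster steps concrete, here is a minimal sketch, not the product's actual implementation. It assumes word bigrams as the only signal (the real analysis also draws on structural and semantic similarity), Jaccard overlap as the similarity measure, and a hypothetical threshold of 0.2. The names word_ngrams, jaccard, and cluster_sources are illustrative only.

```python
from itertools import combinations


def word_ngrams(text: str, n: int = 2) -> set[tuple[str, ...]]:
    """Return the set of word n-grams in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sources(sources: dict[str, str], threshold: float = 0.2) -> list[set[str]]:
    """Group source ids whose n-gram overlap meets the threshold.

    Single-link grouping: two sources share a cluster when a chain of
    sufficiently similar pairs connects them. The threshold is an assumption.
    """
    grams = {sid: word_ngrams(text) for sid, text in sources.items()}
    parent = {sid: sid for sid in sources}

    def find(x: str) -> str:
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sources, 2):
        if jaccard(grams[a], grams[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, set[str]] = {}
    for sid in sources:
        groups.setdefault(find(sid), set()).add(sid)
    return list(groups.values())


# Hypothetical corpus: a document, a conversation segment, and a note.
sources = {
    "doc-1": "vector search index tuning for large corpora",
    "chat-7": "notes on vector search index tuning",
    "note-3": "recipe for sourdough bread starter",
}
clusters = cluster_sources(sources)
```

Here the document and the conversation share three bigrams and end up in one cluster, while the unrelated note stays on its own. Re-running the grouping after adding a source is one simple way to picture the Evolve step: a new source that overlaps two existing clusters would merge them.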

What We Deliver

  • Automatic clustering based on semantic similarity, not just keyword matching
  • Dynamic cluster evolution — clusters grow, split, and merge as new content arrives
  • Cross-source clustering — documents, conversations, and extractions in the same cluster
  • Cluster hierarchy — sub-clusters for granular thematic navigation
  • Cluster-level analytics showing coverage, density, and temporal development
  • Inter-cluster connections showing how different themes relate
  • Cluster-based filtering across the Corpus Index and Source Browser
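
As a rough illustration of cluster-based filtering and two of the analytics listed above, the sketch below assumes a hypothetical Source record that carries a set of cluster names. It reads "coverage" as the share of the corpus that belongs to a cluster and "temporal development" as the number of sources added per month. Both readings are assumptions, as are all names in the code.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    """Hypothetical corpus record; a source may belong to several clusters."""
    id: str
    added: date
    clusters: frozenset[str]


def in_cluster(sources: list[Source], cluster: str) -> list[Source]:
    """Filter the corpus down to the members of one cluster."""
    return [s for s in sources if cluster in s.clusters]


def coverage(sources: list[Source], cluster: str) -> float:
    """Fraction of the corpus that belongs to the cluster (assumed definition)."""
    return len(in_cluster(sources, cluster)) / len(sources) if sources else 0.0


def monthly_growth(sources: list[Source], cluster: str) -> dict[str, int]:
    """Count cluster members by the month they entered the corpus."""
    counts = Counter(s.added.strftime("%Y-%m") for s in in_cluster(sources, cluster))
    return dict(sorted(counts.items()))


corpus = [
    Source("doc-1", date(2024, 1, 12), frozenset({"search"})),
    Source("chat-7", date(2024, 1, 30), frozenset({"search", "infrastructure"})),
    Source("note-3", date(2024, 3, 2), frozenset({"baking"})),
    Source("doc-9", date(2024, 3, 18), frozenset({"search"})),
]

search_ids = [s.id for s in in_cluster(corpus, "search")]
search_coverage = coverage(corpus, "search")
search_growth = monthly_growth(corpus, "search")
```

Because membership is a set, chat-7 appears under both "search" and "infrastructure" without being duplicated.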

Integration with Other Features

  • Corpus Index — Filter search results by cluster membership
  • Node Graph — Clusters appear as navigable regions on the visual graph
  • Source Browser — Each source shows its cluster membership in the contextual sidebar
  • Entity Links — Entities spanning multiple clusters highlight thematic bridges
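
The idea of a thematic bridge can be sketched as follows, assuming two hypothetical lookups: one from each entity to the sources that mention it, and one from each source to its clusters. An entity counts as a bridge here when its sources span at least two clusters. That cutoff and the function name bridge_entities are assumptions for illustration.

```python
def bridge_entities(
    entity_sources: dict[str, set[str]],
    source_clusters: dict[str, set[str]],
    min_clusters: int = 2,
) -> dict[str, set[str]]:
    """Return entities whose mentioning sources span at least min_clusters clusters."""
    bridges: dict[str, set[str]] = {}
    for entity, source_ids in entity_sources.items():
        spanned: set[str] = set()
        for sid in source_ids:
            # Sources with no known cluster contribute nothing.
            spanned |= source_clusters.get(sid, set())
        if len(spanned) >= min_clusters:
            bridges[entity] = spanned
    return bridges


# Hypothetical data: which sources mention each entity, and where sources sit.
entity_sources = {
    "PostgreSQL": {"doc-1", "chat-7"},
    "sourdough": {"note-3"},
}
source_clusters = {
    "doc-1": {"search"},
    "chat-7": {"infrastructure"},
    "note-3": {"baking"},
}

bridges = bridge_entities(entity_sources, source_clusters)
```

In this example "PostgreSQL" links the "search" and "infrastructure" clusters, while "sourdough" stays inside a single cluster and is not reported.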
