
A VS Code extension that renders your codebase's identifiers as a word cloud — so you can see, at a glance, whether your ubiquitous language actually made it into the code.

Why Lexicon?

Open any codebase and read the type names, the method names, the variables. If what you see is OrderManager, PaymentHandler, DataProcessor, UserUtils — a wall of Manager, Handler, Service, Data — then your code isn't speaking your domain. It's speaking plumbing. The ubiquitous language your team argues about in the standup never made it into the source.

Lexicon makes that visible.

Reading the cloud

Same car-rental codebase, two different states. On the left, the loudest words are the domain — rental, reservation, vehicle, booking, fleet. On the right, mechanism has taken over — manager, service, handler, data — and the actual domain has shrunk to the bottom row.

[Figure, left: a word cloud where car-rental domain words dominate. Caption: The language made it in.]
[Figure, right: a word cloud where Manager, Service, Handler and Data dominate. Caption: The language got lost.]

Every word is the same colour. The signal is size — the bigger the word, the more your code says it. There's no built-in verdict on which words are "good" or "bad", because that's domain-specific: data is plumbing in a payments service and the whole point in an analytics product; model is a smell in one codebase and the core noun in an ML one. Lexicon won't pretend to know which is which — it shows you what your code actually says and trusts you to read it.

Identifier-aware

Extracts every identifier and splits it on camelCase, PascalCase, and snake_case boundaries, then strips language noise like err, ctx, string, and return.
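
As a rough illustration of the splitting step, here is a minimal Python sketch. The regular expression, the function names, and the tiny noise list are assumptions made for this example (the noise list holds only the four words named above); Lexicon's actual rules may differ, particularly around acronyms and digits.

```python
import re

# Assumed word pattern (not Lexicon's actual rule): an acronym run that ends
# before a capitalised word, a capitalised or lowercase word with trailing
# digits, an all-caps run, or a bare number. Underscores never match, so
# snake_case splits without extra handling.
_WORD = re.compile(
    r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+[0-9]*|[A-Z]+[0-9]*|[0-9]+"
)

# Only the examples named in the text; the full built-in list is not shown here.
LANGUAGE_NOISE = {"err", "ctx", "string", "return"}


def split_identifier(identifier: str) -> list[str]:
    """Split one identifier into lowercased words."""
    return [m.group(0).lower() for m in _WORD.finditer(identifier)]


def words_from(identifiers: list[str]) -> list[str]:
    """Split every identifier and drop the assumed language noise."""
    return [
        word
        for identifier in identifiers
        for word in split_identifier(identifier)
        if word not in LANGUAGE_NOISE
    ]


print(split_identifier("OrderManager"))  # ['order', 'manager']
print(split_identifier("rental_rate"))   # ['rental', 'rate']
print(words_from(["reservationErr", "ctx", "VehicleFleet"]))
# ['reservation', 'vehicle', 'fleet']
```

Keeping trailing digits attached to a word (so base64 stays one term) is a choice made here so that the sketch agrees with the sample ignore file further down.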

Size is the signal

The biggest words are the words your code says most. That's your real, de-facto ubiquitous language — for better or worse.
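
One way to picture "size is the signal" is to count each word and scale its font size between a minimum and a maximum. The linear scale, the pixel bounds, and the name font_sizes are assumptions for illustration only; nothing here says how Lexicon itself maps counts to sizes.

```python
from collections import Counter


def font_sizes(words: list[str], min_px: int = 12, max_px: int = 72) -> dict[str, int]:
    """Map each word to a font size that grows linearly with its count.

    Hypothetical scaling: the rarest word gets min_px, the most frequent
    gets max_px, and everything else falls in between.
    """
    counts = Counter(words)
    if not counts:
        return {}
    low = min(counts.values())
    span = max(counts.values()) - low
    if span == 0:
        # Every word is equally frequent, so every word is equally loud.
        return {word: max_px for word in counts}
    return {
        word: round(min_px + (count - low) * (max_px - min_px) / span)
        for word, count in counts.items()
    }


words = ["rental"] * 5 + ["vehicle"] * 3 + ["manager"]
print(font_sizes(words))  # {'rental': 72, 'vehicle': 42, 'manager': 12}
```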

Tunable, not opinionated

Drop a .lexiconignore in your workspace root to suppress genuine background hum — and the subtitle reports how many terms you ignored, so a tidy cloud is never silently curated.

Fully local

No server, no subprocess, no network calls, no telemetry. It runs entirely on your machine and nothing leaves it.

Running it

  1. Open the codebase you want to analyse as a workspace folder.
  2. Command Palette (Cmd/Ctrl+Shift+P) → Lexicon: Analyse Codebase.
  3. Read the cloud.


Tuning what's counted

Lexicon strips only language grammar by default — keywords, primitive types, and literals (return, func, class, string, static). It does not pre-decide that http, fmt, println or generic verbs are noise. Whether those are plumbing or signal is a call only your codebase can make, so the tool surfaces them and lets you decide.

To suppress terms for a specific project, drop a .lexiconignore file in the workspace root — one term per line, # for comments:

# .lexiconignore — terms this codebase treats as plumbing
http
fmt
errorf
slog
println
base64
# generic CRUD verbs, if they're drowning the domain words
get
set
fetch

Terms are matched against the lowercased, split words, so ignoring http also removes the http from httpClient and leaves client. A word of caution: a clean-looking cloud you achieved by ignoring everything isn't a clean codebase. Reach for .lexiconignore to remove genuine background hum, not to hide the finding.
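
The ignore-file format described above is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not Lexicon's implementation: the function names are hypothetical, only whole-line # comments are handled, and the sketch returns the suppressed terms rather than guessing how the subtitle counts them.

```python
def parse_ignore(text: str) -> set[str]:
    """Read one term per line, skipping blank lines and # comment lines."""
    terms = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        terms.add(line.lower())
    return terms


def apply_ignore(words: list[str], ignored: set[str]) -> tuple[list[str], list[str]]:
    """Drop ignored words; also return which terms were actually suppressed."""
    kept = [word for word in words if word not in ignored]
    suppressed = sorted({word for word in words if word in ignored})
    return kept, suppressed


ignore_text = "# plumbing\nhttp\nfmt\n\n# generic verbs\nget\n"
ignored = parse_ignore(ignore_text)

# Words are assumed to be split and lowercased already, e.g. httpClient -> http, client.
kept, suppressed = apply_ignore(["http", "client", "get", "rental", "rental"], ignored)
print(kept)        # ['client', 'rental', 'rental']
print(suppressed)  # ['get', 'http']
```

Returning the suppressed terms alongside the kept words mirrors the point made earlier: whatever was hidden should stay reportable, so a tidy cloud is never silently curated.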

Supported languages

Go, TypeScript / JavaScript, JSX / TSX, Python, Java, C#, Ruby, Rust, Kotlin, Swift, C / C++

Install

Install from the Visual Studio Marketplace, or search for Lexicon in the Extensions view inside VS Code.

Requires Visual Studio Code 1.85 or later. Free and MIT-licensed.

Learn More

For full documentation and source, see the GitHub repository.