Building Knowledge Instead of Documents

I have written the same proposal four times. Not the same client, not even the same product, but the same underlying thing: the sizing logic, the design rationale, the “here is why we recommend this rather than that” section. Each time I started from a blank page or, worse, from someone else’s deck that I half-trusted. Each time the knowledge that should have accreted into something durable instead evaporated the moment the engagement closed and the file went to sleep in a SharePoint folder nobody would ever open again.

That is the quiet tragedy of how most of our industry treats its own thinking. We produce enormous quantities of genuinely valuable analysis and then bury it in formats designed for printing, not for remembering. We confuse the act of writing a document with the act of building knowledge. They are not the same thing, and the difference compounds — for you and against you — over a career.

This is the argument for treating your knowledge as plain text under version control: Markdown, Git, and a static site to read it. It is not Markdown-purism. Word and PowerPoint still win in the places they win. But for the body of work that is supposed to make you faster next year than you were this year, the document was always the wrong container.

The document graveyard

Walk into any consultancy’s SharePoint and you are walking through a graveyard. Thousands of .docx and .pptx files, each a frozen snapshot of a moment that has long since passed. They were written once, read once — by the client, at the review meeting — and then never opened again. The effort that went into them was real. The retained value is close to zero.

The symptoms are familiar to everyone and fixed by no one. There is the version chaos: Proposal_v2.docx, Proposal_v2_JS_edits.docx, Proposal_FINAL.docx, and the immortal Proposal_v3_FINAL_FINAL.docx. Nobody knows which is canonical. The “truth” is whichever copy happened to be attached to the last email. There is no history you can trust, only a sediment of near-identical files differing in ways no one can reconstruct.

Then there is the format itself. A .docx file is a zipped bundle of XML, which to grep, diff, and every other plain-text tool might as well be a binary blob. You cannot grep it without unpacking it first. You cannot diff two versions with standard tools and see, line by line, what actually changed and who changed it. You cannot link from a paragraph in one document to a paragraph in another and have that link mean anything. And you certainly cannot feed a folder of decks to a retrieval system and get sensible answers back, because the content is wrapped in layout (text boxes, slide masters, embedded images of diagrams) rather than expressed as content.

So the same things get written from scratch, engagement after engagement. The Conditional Access best-practice paragraph. The Citrix delivery-group sizing rationale I keep re-deriving instead of looking up — exactly the kind of thing I now keep as canon in my notes on modern Citrix architecture. Each is rewritten because finding the previous version, trusting it, and extracting it from its surrounding layout is slower than starting again. That is the tell. When reuse is harder than rewriting, you do not have a knowledge base. You have a graveyard with good search disabled.

SharePoint is not where knowledge goes to live. It is where knowledge goes to die quietly, with full audit compliance.

A document is not knowledge

The root confusion is treating two different things as one. A document is a deliverable — a frozen artefact produced for a specific moment and a specific audience. A proposal, a design pack, a board deck. It has a date, a recipient, and a shelf life. Once delivered, it is done. Its job was to communicate something at a point in time, and it did.

Knowledge is the opposite of frozen. Knowledge is living. It is linked to other knowledge, it is reusable across contexts, and crucially it accretes value over time rather than losing it. A good note about how Microsoft Graph authentication actually works does not expire when the engagement ends. It gets better as I add the edge case I hit last week, the gotcha about app-registration consent, the link to the n8n workflow that uses it.

The mistake is producing only documents and assuming the knowledge will somehow precipitate out of them. It does not. The document is the snapshot; the knowledge is the negative it was printed from, and most people throw the negative away. The discipline I am describing is keeping the negative. You write the durable knowledge first, in a form built to last and to be reused, and you generate documents from it when a moment demands one — not the other way round.

flowchart TD
  subgraph GRAVE[Document graveyard]
    A[Word file] --> B[SharePoint folder]
    C[PowerPoint deck] --> B
    B --> D[Never opened again]
    D --> E[Rewritten next time]
    E --> A
  end
  subgraph GRAPH[Knowledge graph]
    F[Atomic note] --> G[Linked note]
    G --> H[Linked note]
    F --> H
    G --> I[Generate deliverable]
    H --> I
    F --> J[Feeds Atlas RAG]
    G --> J
  end

The left loop is a circle: write, file, forget, rewrite. The right loop is a spiral: write, link, reuse, and feed the same notes into something that makes them more useful still.

Why plain text and Markdown

The first decision is the format, and it is the one people resist most because it feels like a downgrade. Plain text looks primitive next to a styled Word document. It is precisely that primitiveness that makes it durable.

Markdown is readable by a human with no tooling at all — it is just text with light, obvious conventions. It is readable by every editor, every operating system, every programming language, forever. There is no version of the future in which you cannot open a .md file. I cannot say the same about a .pptx from 2009 with embedded fonts and a long-dead plugin. Plain text is the only format I trust to outlive the tool that created it, which is the whole point of a knowledge base.

It is diff-able. Two versions of a note produce a clean, line-by-line comparison. You see exactly what changed. It is grep-able and machine-readable, which matters more every year, because the same notes that I read are the notes that feed retrieval. The knowledge base that Project Atlas, my local AI assistant, retrieves from is this same Markdown: chunked, embedded, and queried. There is no export step, no conversion, no fidelity lost pulling text out of slide layouts. The thing I write for myself is the thing the model reads. A Word-and-SharePoint estate cannot do that without a brittle extraction pipeline bolted on the side, and even then it inherits all the layout noise.
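To make the diff-able claim concrete, here is a minimal sketch using Python's standard `difflib`. The function name `diff_notes` and the two sample versions of a note are hypothetical, invented purely for illustration; in practice `git diff` does this job for you.

```python
import difflib


def diff_notes(old: str, new: str, name: str = "note.md") -> str:
    """Return a unified diff between two versions of a plain-text note."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


# Hypothetical before/after versions of a single note.
OLD = "Plan for 6 users per vCPU pair.\nValidate under real load.\n"
NEW = "Plan for 8 users per vCPU pair.\nValidate under real load.\n"

if __name__ == "__main__":
    print(diff_notes(OLD, NEW))
```

The changed line appears once with a `-` prefix and once with a `+` prefix, and the untouched line is shown as context. That is the comparison a zipped .docx cannot give you without an extraction step first.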

That dual readability — humans and machines, the same source — is the quiet superpower. You are not maintaining one corpus for people and another for the RAG system. There is one corpus, in the most boring, most durable format there is.
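As an illustration of the machine half of that, the sketch below splits a Markdown note into retrieval-sized chunks on paragraph boundaries. It is not how Atlas actually chunks, which this article does not describe; `chunk_markdown` and the `max_chars` budget are assumptions made for the example.

```python
import re


def chunk_markdown(text: str, max_chars: int = 800) -> list[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Paragraphs are never split, so a single paragraph longer than
    max_chars becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Because the input is already plain text, the only pre-processing is a split on blank lines. There is no layout to strip and no extraction pipeline in front of it.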

Why Git, not a shared drive

Format solves durability and reuse. Git solves trust. A shared drive gives you files; Git gives you history.

Every change is a commit, with an author, a timestamp, and a message saying why. git log is the audit trail that documents never had — not a sediment of near-identical files, but a precise, ordered record of how the thinking evolved. git blame tells me who wrote a particular line and when, which is invaluable when I am staring at a sizing assumption and trying to remember whether it was deliberate or a copy-paste accident. Branching lets me draft a major revision without touching the version of record, then merge it when it is ready. There is exactly one canonical state — main — and the history behind it is real, not reconstructed from email attachments.

This is the same approach I describe in building this site: Markdown content, Git for history, Docker to build and serve it. The repository is the source of truth. Everything else is a view onto it.

# What actually happened to this design decision, in order
git log --oneline -- citrix/delivery-group-sizing.md
# a1f2c3d  Correct session density after real-world load test
# 9e8d7c6  Add gotcha: profile container IOPS ceiling
# 4b3a2f1  Initial sizing rationale from ProjectName engagement

# Who last changed these lines, when, and in which commit (the commit message holds the why)
git blame -L 40,48 citrix/delivery-group-sizing.md

You will never get that from Proposal_v3_FINAL_FINAL.docx. The version chaos is not a discipline failure on the part of the people using Word. It is the inevitable result of a tool that has no concept of history. Git makes the right behaviour the default.

Why Hugo to publish it

Plain text in Git is excellent to write and to machine-read, but humans also need to browse, and a flat repository of .md files is not pleasant to navigate by eye. That is where a static site generator earns its keep. I use Hugo with the Stack theme — the same engine behind this site.

Hugo takes the Markdown and renders a fast, searchable, linked website. Nothing is locked away. The source stays plain text in Git; Hugo is purely a rendering layer on top, and if it vanished tomorrow the knowledge would be entirely intact. I get full-text search, taxonomies, automatic cross-linking, and rendered Mermaid diagrams from the same files I write in any text editor. The build is a Docker container, so publishing is one command and the output is static HTML that will serve from anywhere with no database to corrupt and no runtime to patch.

The architecture is deliberately layered so that each layer is replaceable and the value lives in the bottom one.

flowchart LR
  W[Markdown notes] --> G[Git repo]
  G --> H[Hugo build]
  H --> S[Static site to browse]
  G --> R[RAG ingest]
  R --> A[Atlas answers questions]
  W -.source of truth.-> W

The notes are the asset. Git is the history. Hugo is one consumer; Atlas is another. Tomorrow there might be a third. None of them own the knowledge — the plain-text repository does.

Knowledge as a graph, not a folder tree

Format, history, and rendering still leave the most important question: how the knowledge is structured. The instinct from the document era is to file things — folders, sub-folders, one big document per topic. Folders are a tree, and a tree forces every idea to live in exactly one place. Real knowledge does not work like that. An idea about Conditional Access belongs to identity, to Microsoft 365, to security, and to that specific health-check engagement all at once.

So I structure the base as a graph, not a tree. The unit is the atomic note: one note, one idea, small enough to link to precisely and reuse without dragging half a document along with it. Notes cross-reference each other directly. Taxonomies — tags and categories — let the same note surface under multiple themes without being duplicated. The structure that emerges is a web of connections, which is far closer to how I actually think and, not coincidentally, far better for retrieval. This is the second-brain principle I go deeper on in building a second brain: small, linked, durable units beat large, isolated documents every time.
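The difference between a tree and a taxonomy is easy to show in a few lines. This is a sketch under assumptions: the note identifiers and tags are hypothetical, and `build_tag_index` is an illustrative helper, not something Hugo exposes.

```python
from collections import defaultdict


def build_tag_index(notes: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each tag to the notes that carry it.

    `notes` maps a note id to its list of tags. A note with three tags
    appears under three themes while existing only once on disk.
    """
    index: dict[str, list[str]] = defaultdict(list)
    for note_id, tags in notes.items():
        for tag in tags:
            index[tag].append(note_id)
    return {tag: sorted(ids) for tag, ids in sorted(index.items())}


# Hypothetical notes and tags, for illustration only.
NOTES = {
    "conditional-access-baseline": ["identity", "microsoft-365", "security"],
    "graph-authentication": ["identity"],
}

if __name__ == "__main__":
    for tag, ids in build_tag_index(NOTES).items():
        print(tag, "->", ids)
```

A folder tree would force `conditional-access-baseline` into exactly one of those three themes. The index lets it surface under all of them with no copy made.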

An atomic note carries enough front matter to be found, filtered, and trusted, then says one thing well:

---
title: "Delivery group sizing — session density"
tags: [citrix, sizing, cvad, performance]
status: validated
sources:
  - 2026-04 ProjectName load test
related:
  - "[[profile-container-iops]]"
  - "[[cvad-reference-architecture]]"
updated: 2026-06-20
---

Single-server session density is bounded by the profile
container IOPS ceiling long before CPU saturates. Plan for
8–10 heavy users per vCPU pair as a starting point, then
validate under real load — the synthetic numbers lie.

See the IOPS gotcha in [[profile-container-iops]].

That note is durable, linkable, machine-readable, and honest about its own provenance and status. When I write the next Citrix proposal, I do not re-derive any of this. I link to it, and the deliverable assembles from validated pieces — exactly the reusable-framework approach I argue for in building repeatable customer health checks.
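Machine-readable here means something quite modest: a short script can pull the metadata and the links out of a note like the one above. The sketch below assumes nothing beyond the standard library and handles only the small YAML subset that note uses (scalars, inline lists, and dash lists); a real pipeline would more likely use a proper YAML parser. The function names and the shortened sample note are illustrative.

```python
import re

# Captures the target of a [[wikilink]], stopping at "]", "|" or "#".
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def split_front_matter(text: str) -> tuple[dict, str]:
    """Split a note into (metadata, body). Minimal YAML subset only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # no closing delimiter: treat it all as body
    meta: dict = {}
    key = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(key), list):
            # Dash-list item belonging to the most recent key.
            meta[key].append(stripped[2:].strip().strip('"'))
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value:
            meta[key] = []  # a dash list is expected to follow
        elif value.startswith("[") and value.endswith("]"):
            meta[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key] = value.strip('"')
    return meta, "\n".join(lines[end + 1:]).strip()


def outgoing_links(meta: dict, body: str) -> list[str]:
    """Collect wikilink targets from the body and the `related` field."""
    related = meta.get("related", [])
    if isinstance(related, str):
        related = [related]
    text = body + "\n" + "\n".join(related)
    return sorted({target.strip() for target in WIKILINK.findall(text)})


# A shortened copy of the note above, used as sample input.
NOTE = """---
title: "Delivery group sizing"
tags: [citrix, sizing]
status: validated
related:
  - "[[profile-container-iops]]"
  - "[[cvad-reference-architecture]]"
---

Session density is bounded by the profile container IOPS ceiling.
See the IOPS gotcha in [[profile-container-iops]].
"""

if __name__ == "__main__":
    meta, body = split_front_matter(NOTE)
    print(meta["status"], meta["tags"], outgoing_links(meta, body))
```

Once status, tags, and links are available as plain data, filtering for validated notes or walking the graph is ordinary code rather than document archaeology.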

The compounding effect

This is the part that matters most and is hardest to feel in the first month. A knowledge base built this way compounds. Every note makes the next piece of work faster, because the next piece is partly assembled from notes that already exist and have already been corrected. The graveyard does the opposite: every engagement starts near zero, so your speed in year five is roughly your speed in year one with better war stories.

Compounding is the difference between a cost per proposal that stays flat and one that keeps falling. The fiftieth proposal in a graveyard system costs about what the first did. The fiftieth proposal drawn from a living graph costs a fraction of that, because the sizing logic, the design rationale, the standard caveats, and the diagrams are all already written, already validated, already linked. You are composing, not authoring from scratch.

Over time that becomes a genuine moat. Not the individual notes — anyone can write a note. The accumulated, interlinked, version-controlled, machine-readable body of corrected thinking is the thing that is hard to replicate and hard to take away. It is the asset I am actually building, and the documents are just exhaust from using it.

Where Word and SharePoint still win

None of this is a case for never opening PowerPoint. That would be ideology, and ideology makes you worse at your job. There are jobs the document does better, and refusing to use the right tool is the same mistake as the graveyard, just pointed the other way.

A polished client deliverable is one of them. When I hand a board a proposal, it needs to look considered, branded, and finished, and a styled Word or PowerPoint document does that far better than rendered Markdown. Collaboration with non-technical people is another. Track Changes and a comment thread in Word are how most of the world reviews a document, and asking a procurement lead to raise a pull request is absurd. Governance and formal sign-off, the legal weight of a named, dated, approved artefact, are genuinely what the document format is for. A frozen snapshot is exactly the right thing when you need a frozen snapshot.

The model that works is not Markdown instead of Word. It is Markdown underneath, Word on top. The knowledge lives as plain text in Git and compounds there. When a moment needs a deliverable, I generate one — assembling it from validated notes and polishing it in the tool the audience expects. The deliverable is downstream of the knowledge, disposable by design, and the next deliverable starts from the same improved base rather than from the last frozen file. This is precisely the flow I describe in from proposal to production: the proposal is a render of the knowledge, not the knowledge itself.

What it costs, honestly

I would be lying if I called this free. The cost is discipline, and it is paid up front, every day, before the payoff arrives.

You have to actually write the note, in the moment, when the lazy option is to finish the deliverable and move on. You have to keep notes atomic when the temptation is to dump everything into one sprawling file. You have to maintain links and resist letting the graph rot into orphaned fragments. You have to learn enough Git to be comfortable, which is a real barrier for people who came up entirely in the Office world. And the compounding is invisible early — for the first few months it genuinely feels like extra work for no return, because the base is too small to compose from yet.

I got the granularity wrong at first. My early notes were really documents in disguise — long, multi-topic, impossible to link to precisely — and they retrieved badly in Atlas, pulling back a 2,000-word note when a query needed two sentences. I had to go back and atomise them. The discipline is not just “write it down”. It is “write it down small, link it, and keep it honest.” That is the actual price of admission.

Where this goes next

The direction I am pushing is to close the loop between writing and generating. Right now I assemble deliverables semi-manually from notes. The next step is templated generation: a proposal skeleton that pulls the relevant validated notes by tag and produces a first-draft document automatically, ready for the human polish that Word is good at. The knowledge stays canonical; the document becomes genuinely disposable.
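A first cut of that templated generation could be as small as the sketch below. It is an illustration of the idea, not an existing tool: the `draft_proposal` function, the note dictionaries, and their field names (`title`, `tags`, `status`, `body`) are all assumptions, loosely modelled on the front matter shown earlier.

```python
def draft_proposal(notes: list[dict], tag: str, title: str) -> str:
    """Assemble a first-draft document from validated notes carrying `tag`.

    Notes that are not marked validated are left out, so unfinished
    thinking never reaches a deliverable by accident.
    """
    selected = [
        note for note in notes
        if note["status"] == "validated" and tag in note["tags"]
    ]
    selected.sort(key=lambda note: note["title"])
    parts = [f"# {title}", ""]
    for note in selected:
        parts += [f"## {note['title']}", "", note["body"].strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical notes, for illustration only.
SAMPLE_NOTES = [
    {"title": "Session density", "tags": ["citrix", "sizing"],
     "status": "validated", "body": "Density text."},
    {"title": "Profile IOPS", "tags": ["citrix"],
     "status": "draft", "body": "Unchecked text."},
    {"title": "Consent gotcha", "tags": ["identity"],
     "status": "validated", "body": "Consent text."},
]

if __name__ == "__main__":
    print(draft_proposal(SAMPLE_NOTES, tag="citrix", title="Proposal first draft"))
```

The output is deliberately plain. The polish still happens in the tool the audience expects; the script only removes the blank page.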

I also want richer status and provenance on every note — draft, validated, deprecated — surfaced both in Hugo and in retrieval, so that Atlas can weight a battle-tested note above a half-formed one and tell me when it is leaning on something I never finished checking. And I want better detection of graph rot: orphaned notes, dead links, and ideas that have quietly contradicted each other. A knowledge base is a living thing, and living things need tending or they decay back into a graveyard with extra steps.
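Two of those rot checks, dead links and orphaned notes, reduce to set arithmetic once the links have been extracted. The sketch below assumes the graph is already available as a mapping from note id to the ids it links to; the function name, the sample graph, and the definition of orphan used here (no links in either direction) are choices made for the example. Detecting contradictory ideas is a much harder problem and is not attempted.

```python
def graph_rot(links: dict[str, set[str]]) -> dict[str, list[str]]:
    """Report dead links and orphaned notes.

    `links` maps every known note id to the set of note ids it links to.
    A dead link points at an id that is not a known note. An orphan is a
    note with no outgoing links that nothing else links to.
    """
    known = set(links)
    dead = sorted(
        f"{source} -> {target}"
        for source, targets in links.items()
        for target in targets
        if target not in known
    )
    linked_to = {target for targets in links.values() for target in targets}
    orphans = sorted(
        note for note in known
        if note not in linked_to and not links[note]
    )
    return {"dead_links": dead, "orphans": orphans}


# Hypothetical graph, for illustration only.
SAMPLE_LINKS = {
    "delivery-group-sizing": {"profile-container-iops", "old-renamed-note"},
    "profile-container-iops": set(),
    "forgotten-idea": set(),
}

if __name__ == "__main__":
    print(graph_rot(SAMPLE_LINKS))
```

Run on a schedule or as a pre-commit check, a report like this turns tending the graph into a short list of things to fix rather than a vague intention.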

The real point

The shift is not really about Markdown, Git, or Hugo. Those are implementation details, and you could make similar arguments with different tools. The shift is in what you think you are producing. If you believe you are producing documents, you will optimise for the moment of delivery and your knowledge will keep dying behind you. If you believe you are producing knowledge, you will optimise for durability, reuse, and compounding, and the documents will fall out of that almost for free.

I spent years building a graveyard without noticing, because every individual document felt like progress. It was not. It was the same work, rewritten, on a treadmill. Building knowledge instead means the work accumulates — the version of me doing this in five years inherits everything the version doing it today figures out. That is the whole game. The document was always meant to be the output. The knowledge was always meant to be the asset. Most of us had it exactly backwards.