What I’m Building Now

Most of this site is finished thinking. An article gets written when a project has settled enough to have a shape: a problem, a few decisions I can defend, some lessons I learned the hard way. This page is the opposite. It is the workbench, not the gallery.

Last updated: June 2026. That line matters, so let me explain the convention before anything else. This is a living document. I rewrite it every month or two, in place, rather than publishing a new “state of the lab” post each time. The date above is the contract: if it is stale, treat everything below as a snapshot from then, not gospel. When I update it I bump the date, move anything that shipped down into the changelog at the bottom, and pull something new up from the bench. Future-me only has to keep three lists honest — now, next, and recently done — and the page stays useful without ever becoming an essay.

Think of this as the hub. Almost everything here has, or will have, a proper write-up elsewhere on the site; I link to those as the canonical version and keep the prose here short and current. If you want the finished argument, follow the link. If you want to know what is actually on my desk this week, stay here.

How to read this page

It is organised the way I actually think about work: a now/next/later split, plus a deliberate list of things I am not doing. Active work is what I touch in a normal week. The bench is ideas I have committed to memory but not to time. The changelog is recent enough to prove the pattern. And the non-goals are there because, increasingly, I think saying no clearly is the most valuable planning skill I have.

flowchart LR
    subgraph Now
        A[Atlas v2]
        B[Battery phase 2]
        C[M365 product]
        D[VLAN segmentation]
        E[Homelab as IaC]
    end
    subgraph Next
        F[Self-service portal]
        G[AI observability]
        H[Local fine-tune]
    end
    subgraph Later
        I[EV charge coordination]
        J[Multi-tenant health checks]
    end
    Now --> Next --> Later

What is active right now

Atlas, the next iteration

Project Atlas is the recurring brain behind a lot of this, and it is the thing I am most actively pulling apart. The current version works, but it works like a clever search box with a personality. The next iteration is about three things.

First, better retrieval. The naive “embed everything, return top-k” approach gives confidently wrong answers when the knowledge base gets large, because it retrieves passages that are similar rather than passages that are relevant. I am moving to a hybrid setup — keyword plus vector, with a re-ranking pass — and chunking on document structure instead of fixed token windows. Retrieval is where most of the quality lives. The model is not the product; the context you feed it is.
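One common way to merge a keyword ranking and a vector ranking is reciprocal rank fusion. The text above does not say which fusion method Atlas uses, so this is only an illustrative sketch: the function name `rrf_fuse`, the constant `k=60`, and the document ids are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with reciprocal rank fusion.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc id so the output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["vlan-notes", "caddy-setup", "battery-plan"]
vector_hits = ["battery-plan", "vlan-notes", "ollama-install"]

fused = rrf_fuse([keyword_hits, vector_hits])
print(fused)
```

Documents that both retrievers agree on float to the top, and the fused list is what a re-ranking pass would then reorder.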

Second, agentic loops. Atlas currently does one-shot tool calls into n8n. I want it to plan, call a tool, look at the result, and decide whether it is done — a proper loop with a stopping condition, not a single hop. The risk is obvious: loops that never terminate, or that burn tokens flailing. So the loop is bounded and every step is logged.
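A bounded, logged loop of that shape might look roughly like the sketch below. The callables `plan_step`, `call_tool` and `is_done` are hypothetical stand-ins for the model, the n8n call and the stopping check, and `max_steps=5` is an arbitrary budget chosen for the example.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent_loop")


def run_bounded_loop(plan_step, call_tool, is_done, max_steps=5):
    """Plan, act, observe, repeat, but never more than max_steps times.

    Returns (history, finished). finished is False if the step budget
    ran out before is_done reported success.
    """
    history = []
    for step in range(1, max_steps + 1):
        action = plan_step(history)      # decide what to do next
        result = call_tool(action)       # do it
        history.append((action, result))
        log.info("step=%d action=%r result=%r", step, action, result)
        if is_done(history):             # explicit stopping condition
            return history, True
    log.warning("stopped after %d steps without finishing", max_steps)
    return history, False


# Toy stubs: the "tool" echoes a counter and the loop stops once it reaches 3.
history, finished = run_bounded_loop(
    plan_step=lambda h: len(h) + 1,
    call_tool=lambda action: action,
    is_done=lambda h: h[-1][1] >= 3,
)
```

The hard cap is what prevents a runaway loop, and returning `finished` separately lets the caller tell "done" apart from "gave up".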

Third, evals. This is the unglamorous one and the reason the other two are even possible. I am building a small set of golden questions with known-good answers so that when I change the retrieval strategy I can measure whether it got better or just different. Without evals I am tuning by vibes, and vibes are how most AI projects quietly fail.
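As a rough illustration of a golden-question harness, assuming a deliberately crude pass rule (every expected keyword must appear in the answer): the questions, the `must_contain` field name and the stub answer function are invented for the example and are not the real eval set.

```python
def run_golden_evals(answer_fn, golden_set):
    """Score answer_fn against golden questions; return (pass_rate, failures).

    A case passes if every expected keyword appears in the answer,
    compared case-insensitively.
    """
    if not golden_set:
        raise ValueError("golden_set must contain at least one case")
    failures = []
    for case in golden_set:
        answer = answer_fn(case["question"]).lower()
        missing = [kw for kw in case["must_contain"] if kw.lower() not in answer]
        if missing:
            failures.append((case["question"], missing))
    pass_rate = 1 - len(failures) / len(golden_set)
    return pass_rate, failures


# Hypothetical golden questions with known-good keywords.
golden_set = [
    {"question": "Which reverse proxy does the lab use?", "must_contain": ["Caddy"]},
    {"question": "How many VLANs is the network split into?", "must_contain": ["three"]},
]


def stub_answer(question):
    """Stand-in for the real assistant: knows one answer, shrugs at the rest."""
    canned = {
        "Which reverse proxy does the lab use?": "The lab uses Caddy with automatic HTTPS.",
    }
    return canned.get(question, "I do not know.")


pass_rate, failures = run_golden_evals(stub_answer, golden_set)
print(pass_rate, failures)
```

Running the same set before and after a retrieval change gives a number to compare, which is the whole point; a stricter scorer can replace the keyword check later without changing the harness.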

The battery optimiser, phase two

The AI battery optimiser has been running in Home Assistant for long enough that I trust its day-ahead charge plan. Phase two is about closing the loop intraday. Right now it commits to a plan each evening based on the solar forecast and the Agile price curve, then largely sticks to it. The next version re-plans through the day as actual solar generation and actual household load diverge from the forecast; after a cloudy morning, the optimiser should be allowed to change its mind about the afternoon. I am also starting to log forecast-versus-actual properly so I can quantify how much money the optimisation is genuinely saving versus a dumb “charge when cheap” baseline. If the answer is “not much”, that is worth knowing too.
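The two calculations in that paragraph, when to re-plan and how much was saved against a baseline, can be sketched as below. The function names, the 0.5 kWh tolerance and every number in the example are made up for illustration; none of it is logged data from the real optimiser.

```python
def should_replan(forecast_kwh, actual_kwh, tolerance_kwh=0.5):
    """Re-plan when reality has drifted from the forecast by more than the tolerance."""
    return abs(actual_kwh - forecast_kwh) > tolerance_kwh


def import_cost_pence(prices_p_per_kwh, import_kwh):
    """Total cost in pence: price times energy imported, summed over the slots."""
    if len(prices_p_per_kwh) != len(import_kwh):
        raise ValueError("prices and imports must cover the same slots")
    return sum(p * e for p, e in zip(prices_p_per_kwh, import_kwh))


def saving_vs_baseline(prices, optimised_import, baseline_import):
    """Positive result means the optimised plan was cheaper than the baseline."""
    return import_cost_pence(prices, baseline_import) - import_cost_pence(
        prices, optimised_import
    )


# Invented four-slot example; both plans import the same 5 kWh in total.
prices = [10.0, 30.0, 5.0, 20.0]            # pence per kWh
baseline_import = [1.0, 1.0, 2.0, 1.0]      # kWh per slot
optimised_import = [2.0, 0.0, 3.0, 0.0]     # kWh per slot

saving = saving_vs_baseline(prices, optimised_import, baseline_import)
print(saving)
```

Keeping total imported energy equal in both plans makes the comparison fair: the difference is then purely about when the energy was bought.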

The M365 health check, becoming a product

The Microsoft 365 AI health check started as a script and a good idea. The work now is turning it into something repeatable I can point at any tenant without hand-holding — which is the whole thesis of building repeatable customer health checks. That means parameterising the Graph app registration cleanly, versioning the rule set that decides what counts as a finding, and making the LLM-written report deterministic enough that the same tenant produces a stable report twice in a row. The hard part is not the API calls; it is the judgement layer staying consistent. A health check that grades differently on Tuesday is not a product, it is a mood.
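One way to test "stable twice in a row" is to fingerprint the rule-engine findings together with the rule set version, before any prose is generated. This is a sketch under assumptions: the rule ids, field names and version label are hypothetical, and hashing the findings does not by itself make the LLM-written text deterministic.

```python
import hashlib
import json

RULESET_VERSION = "2026.06-example"  # hypothetical version label


def findings_fingerprint(findings, ruleset_version=RULESET_VERSION):
    """Hash findings in a canonical order so run order cannot change the result."""
    canonical = sorted(findings, key=lambda f: (f["rule_id"], f["resource"]))
    payload = json.dumps(
        {"ruleset": ruleset_version, "findings": canonical},
        sort_keys=True,             # key order inside each finding is normalised
        separators=(",", ":"),      # no whitespace differences
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two runs against the same tenant, returned in a different order.
run_a = [
    {"rule_id": "MFA-001", "resource": "user-a", "severity": "high"},
    {"rule_id": "SHARE-002", "resource": "site-b", "severity": "medium"},
]
run_b = [
    {"severity": "medium", "resource": "site-b", "rule_id": "SHARE-002"},
    {"severity": "high", "resource": "user-a", "rule_id": "MFA-001"},
]

same = findings_fingerprint(run_a) == findings_fingerprint(run_b)
print(same)
```

If the fingerprints match but the written reports differ, the inconsistency is in the judgement layer rather than in the data collection, which narrows down where to look.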

Network segmentation and VLANs

The lab still sits on an embarrassingly flat network, and I have written before, in building an AI infrastructure lab at home, that this was a deliberate “later”. It is now. I am carving the flat network into three VLANs — trust, IoT, and lab — so that a compromised smart plug cannot reach my servers’ management plane. The work is mostly switch and firewall config, plus the unglamorous job of re-homing every device and fixing the things that quietly depended on everything being on one subnet. mDNS across VLANs alone has cost me an evening.
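The intent behind the split can be written down as a default-deny policy table before touching any switch or firewall. The VLAN names below come from the paragraph above; the specific allowed pairs are assumptions for the sake of the sketch, not the real rule set.

```python
# Default deny: only flows listed here are allowed between VLANs.
# The allowed pairs are illustrative assumptions.
ALLOWED_FLOWS = {
    ("trust", "iot"),   # trusted devices may control smart-home kit
    ("trust", "lab"),   # trusted devices may manage the servers
}


def is_allowed(src_vlan, dst_vlan):
    """Traffic inside a VLAN is allowed; across VLANs only if explicitly listed."""
    return src_vlan == dst_vlan or (src_vlan, dst_vlan) in ALLOWED_FLOWS


# A compromised smart plug on the IoT VLAN should not reach the lab.
print(is_allowed("iot", "lab"))
```

Writing the policy as data first makes it easy to check each real firewall rule against the intent, and to spot the flows that only worked because everything shared one subnet.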

Moving the homelab to Git and IaC

Too much of my setup still lives as state in running machines rather than as declared intent in a repository. The direction, which falls straight out of lessons from building a Docker homelab, is that the Git repo is the source of truth and the running box is a rebuildable artefact. Compose files already live in Git. Next is the layer below: host configuration, VLAN definitions and bare-metal setup, all moving toward declarative tooling so I can rebuild a node from scratch without remembering what I clicked eight months ago.

# the direction of travel: the GPU box as declared intent, not clicked
- hosts: gpu-box
  roles:
    - nvidia_driver        # blacklist nouveau, install and pin the driver
    - ollama               # native service, models on /mnt/nvme
    - node_exporter        # so Prometheus can see it

The site itself

This very site keeps growing, and not by accident. Building krishaynes.co.uk on Hugo was about owning plain-text, version-controlled knowledge that outlives any platform — the same idea as building a second brain. This now-page is part of that: a deliberate experiment in keeping one document alive instead of letting the site become an archive of frozen posts.

On the bench: thinking about, not started

These are real ideas with no time committed yet. Writing them down is how I stop them rattling around.

Fine-tuning a small local model. Everything I run locally today is off-the-shelf, picked for the job as I describe in my journey into local LLMs. I am curious whether a small fine-tune — a LoRA on my own corpus of notes and reports — would beat clever prompting for the narrow task of writing in my voice. My suspicion is that it would not be worth it versus better retrieval, but suspicion is not data.

EV charge coordination. The battery optimiser already reasons about cheap windows. An EV is just a very large, very mobile battery with its own constraints. Coordinating the two against one Agile price curve is an obvious extension, and an obvious way to make a single bad assumption cost real money, so it stays on the bench until phase two of the battery work is solid.

A self-service health-check portal. The M365 check is currently me running a thing. The natural next step is a front end where a colleague kicks off a run and gets the report, without me in the loop. Worth it only once the engine underneath is genuinely repeatable — otherwise I have just built a nicer way to deliver an inconsistent answer.

Observability for AI. I have Prometheus and Grafana watching infrastructure. I have almost nothing watching Atlas — token spend, retrieval hit rates, loop lengths, the quality scores from those new evals. As AI becomes infrastructure, it should be monitored like infrastructure. This will probably get promoted to “active” soon, because every other AI item above is generating telemetry I am currently throwing away.

Recently shipped

The changelog. This is what makes the living-document pattern obvious — things move down here when they leave my desk.

Jun 2026 — Migrated the reverse proxy fully to Caddy with automatic HTTPS; retired the last Nginx Proxy Manager rule.
May 2026 — Atlas retrieval moved off fixed-window chunking; first golden-question eval set checked into Git.
May 2026 — Battery optimiser day-ahead plan running unattended for a full month, no manual overrides.
Apr 2026 — Published the home lab as a learning platform and building a second brain.
Apr 2026 — Compose files for all stateful services consolidated into a single Git repo with .env kept out of history.

What I am deliberately not doing

Saying no is a feature, not an omission, so this list is on purpose.

I am not running a 70B model as my daily driver. I can, at a low-bit quantisation, but the latency and VRAM cost are not worth it when a well-prompted Qwen2.5 or Llama 3.1 8B handles the actual jobs. The model is not the product.

I am not building Atlas a slick custom web UI. Open WebUI is good enough, and every hour spent on chrome is an hour not spent on retrieval and evals, which are the parts that actually decide whether it is useful.

I am not moving anything to a managed cloud LLM for the core knowledge work. The whole point of the local setup is that my notes and tenant data do not leave the building. I will use a frontier model deliberately for a one-off hard task, but the default stays local.

I am not chasing Kubernetes for the homelab. Docker Compose plus Portainer plus Git is the right amount of complexity for a single-operator lab. k8s would be résumé-driven architecture, and I would spend my evenings operating a control plane instead of building things on top of it.

And I am not publishing a finished version of this page. That is the whole idea.

The point of keeping this alive

A finished post is a photograph. This is the live feed, and live feeds are honest in a way photographs are not: they show the half-built thing, the idea that has been “next” for three months, the project I quietly killed. Keeping it current costs me twenty minutes every month or two and saves me from the slow lie where a site looks busy but nothing on it has moved since last year.

If you have landed here from one of the deeper articles, this is the index of where that thread is today. If you have landed here cold, pick any link above; each one is a rabbit hole I have already fallen down so you do not have to. And if the date at the top is more than a couple of months old, send me a nudge — the document being stale is itself a bug, and I would rather hear about it.