Engineering • 7min read

Digital sovereignty is no longer hypothetical

US tech dependency is already biting. The Fable shutdown is the latest proof, and the case for sovereign AI.

Written by:

Louise Champ

Published on:

Jun 15, 2026

Last updated on:

Jun 15, 2026

Tags:

On 12th June 2026, Anthropic switched off its two most capable models for the entire world.

Not degraded, not rate-limited. Gone.

Anthropic published a statement explaining why: the US government had issued an export control directive, citing national security authorities, which has suspended all access to Fable 5 and Mythos 5 by any foreign national, inside or outside the United States. To comply, Anthropic disabled both models for every customer worldwide. Its other models are still running, but the two at the top of the stack are not.

These models were new, opened to the public only weeks earlier, so the direct fallout was limited.

The precedent is what matters, because now it turns digital sovereignty from a conference panel abstraction into an actual operational risk.

A lab, with no allegation of wrongdoing made against it, had its flagship products switched off for its entire customer base by a government decision its customers could neither influence nor appeal. This same mechanism reaches the US models businesses already depend on every day: had a UK hospital trust or a German bank wired one of these into a live service, its continuity would have been decided in Washington that afternoon.

The dependency on a concentrated handful of US technology companies is real, and is already producing consequences. The rest of this post is the case for treating that seriously, and a practical view of what to do about it.

The short version, for anyone who will not read to the end:
Reliance on US technology is no longer a theoretical risk. It is already producing consequences, and the Fable shutdown is the most abrupt example so far.
Digital sovereignty is not about where the data sits. It is about who can switch your service off, and whether you have an answer when they do.
For AI the dependency is sharper than anything before it, because the model is the part you cannot quietly swap out over a weekend.
There is a practical answer: open-weight models on Kubernetes you run, in UK / EU infrastructure. It is platform engineering, not a moonshot.

The Fable shutdown was not the first warning

This is not a one-off. It is the most sudden entry in a sequence that, if you were watching, has been visible for a while.

For years the US has used export controls to protect its AI lead, blocking sales of NVIDIA’s most advanced accelerators to China and tightening the rules as workarounds appeared. Applying the same controls to a model as opposed to a chip, and to any foreign national as opposed to only a rival power, is the next step. On the current reading of the Fable directive, you do not have to be an adversary to lose access, you only have to be foreign.

In 2020, Europe’s highest court struck down Privacy Shield, the main legal basis for sending personal data to the US, because US surveillance law left European citizens with no adequate redress. Its replacement is under legal challenge, and the ground under every transatlantic data flow has been provisional ever since.

In 2025, when the US sanctioned officials of the International Criminal Court, the court’s reliance on a US cloud provider for something as basic as email became a live problem. The specifics were disputed; European institutions drew the lesson anyway.

And look at what the hyperscalers sell: AWS’s European Sovereign Cloud run by EU staff through an EU-controlled entity, Microsoft’s EU Data Boundary, sovereign offerings from Google and Oracle.

Companies do not spend years building products for a problem their customers do not have.

In isolation, each of these had an explanation and a workaround, but the pattern is the point. For any European organisation, the real question is whether it would notice the dependency biting in time, and have somewhere to go when it did.

Why AI raises the stakes

What makes AI different from the US dependencies organisations have carried for years is concentration and depth.

The frontier is held by a handful of labs, almost all American, with no European peer, so wiring one into a product takes the dependency at its most acute. And it embeds itself: a database migrates over a planned weekend, but a model chosen, prompted and tuned into your product’s behaviour is not something you swap on short notice, because your outputs are shaped around its quirks. That kind of dependency you do not want have held on someone else’s switch.

Running open LLMs on sovereign Kubernetes

The answer is not exotic. An open-weight model is a stateless HTTP service in front of a GPU, and Kubernetes runs that kind of thing all day.

It starts with the model. The weights ship as files you can run, and for a large share of real work, retrieval, summarisation, classification, internal copilots, a current open-weight model in the 20B to 70B range is good enough, with the gap to the frontier still narrowing.

We would start with the Apache-2.0 families; several Mistral and Qwen releases ship under it. Read the licence before the benchmark: a “community licence” with an acceptable-use policy and a user ceiling is a permission that can change, and it puts the dependency back.

Where it runs rules out the US hyperscalers and their EU regions. What is left is a healthy market: OVHcloud, Scaleway, Hetzner, STACKIT, Exoscale, UpCloud, Civo, or bare metal in a colocation facility you contract directly.

The main constraint is supply: European GPU capacity is thinner and pricier than somewhere like us-east-1, and the newest cards arrive later.

For the serving layer, the right tool for production throughput is vLLM. It exposes an OpenAI-compatible API, which matters more than it first appears: code that points at a US endpoint today can be repointed at a Service in your own namespace with a config change, and not a rewrite.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mistral-small
  namespace: inference
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mistral-small
  template:
    metadata:
      labels:
        app: mistral-small
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "/models/mistral-small", "--served-model-name", "mistral-small"]
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
          volumeMounts:
            - name: weights
              mountPath: /models
      volumes:
        - name: weights
          persistentVolumeClaim:
            claimName: model-weights

The nvidia.com/gpu: 1 request is advertised by the NVIDIA GPU Operator, which manages driver, toolkit and device plugin as one reconciled unit. Hold the weights once, in-region, in an S3-compatible store such as RustFS, and stage them in with an init container rather than pulling from a US hub at pod start. Since an idle GPU is the fastest way to make this expensive, KEDA can also scale a deployment to zero when nothing is calling it. These are all components we can assemble already.

When not to do this

We would not tell you to put everything behind an open model. If a workload needs the closed frontier and can tolerate the dependency, self-hosting to dodge a risk that does not apply to you is wasted effort and produces worse output.

The sensible pattern would be to run the bulk of inference on a sovereign open-weight platform and keep a narrow, explicit path to a closed model for the few tasks that need it. The mistake the Fable disruption punished was not using a US model, it was having no alternative available when access stopped.

It is also operational work: Tasks such as model evaluation, GPU capacity planning, or the day-2 running of a platform that used to be someone else’s problem.

For organisations whose continuity, regulatory position or data cannot sit on a switch held elsewhere, that cost is worth paying for. For others it is not, so knowing which you are is the decision.

Where we come in

We are Kubernetes people. We design, build and run cloud-native platforms for a living, without a hyperscaler contract or a model of our own to sell you, so the recommendation you get fits your risk, not our sales target.

The first step is not a build, but an honest map of where you depend on US-controlled technology, which of those dependencies sit under something that matters, and which you could not replace in a hurry. Most organisations have never drawn that map, so this exercise is clarifying on its own.

If the Fable and Mythos shutdown made you look at your own stack and not love what you saw, follow that instinct now while it is still fresh. Talk to us about where your exposure sits, or book a Technical Review and we will go through it properly and show what a sovereign architecture looks like for your workloads.

Cloud Platform Engineering

Workload Modernisation

VMware Kubernetes Migration

TAMOSS Deployment & Integration

Case Studies

Whitepapers

Blog

Tutorials

Cloud Native News

About

Partnerships

Careers

Contact