Nvidia’s RTX Spark confirms what we’ve known for years: local AI is the next platform shift

Jensen Huang announced the RTX Spark at Computex this week and compared it to the invention of the smartphone. That’s a bold claim. It’s also correct.

The RTX Spark is a chip designed from the ground up to run AI agents locally, on a personal computer, without sending a single byte to a cloud API. Lenovo, Dell, HP, Microsoft Surface, Asus and MSI are all shipping devices with it this autumn. Acer and Gigabyte follow after that.

This isn’t Nvidia adding an AI feature to a gaming GPU. It’s Nvidia deciding that the next major computing platform is the private, on-device AI machine, and moving to own that market the same way it owned the data centre.

If you work in AI infrastructure, this is the moment you’ve been pointing at for the past two years.

Why on-device AI has always been the right answer for most organisations

The public conversation about AI has been dominated by the cloud model: you send your data to OpenAI, Anthropic, Google or Microsoft, something clever happens in a data centre somewhere, and the answer comes back. That model works well for consumer apps and early prototypes. It doesn’t work well for the majority of serious enterprise use cases.

The reason is simple. Most valuable data is sensitive.

Banks have customer financial records. Law firms have privileged communications. Defence contractors have classified project data. Healthcare organisations have patient records. None of these can leave the building. Not legally, not under their contracts with clients, and in many cases not under their own internal policy.

So for the last two years, the AI story for regulated industries has been: watch and wait. Watch the capability curve improve. Wait for on-device inference to get good enough to run genuinely useful models. The tools have been getting better. The hardware has been the bottleneck.

That bottleneck is ending.

What the RTX Spark actually means

The RTX Spark isn’t a research chip. It’s going into mainstream commercial PCs from the biggest OEMs in the world, at scale, from autumn 2026. Lenovo, HP, Dell and Microsoft Surface are among the launch partners, and Lenovo, HP and Dell are three of the four vendors (the fourth being Apple) that between them accounted for almost 75% of the global PC market in Q1 2026, according to Gartner. This is not a niche release. It is a mainstream platform shift. That means within 12 to 18 months, a meaningful share of enterprise PC estates will have the raw compute to run capable AI models locally.

Not toy models. Not single-sentence classification tasks. Models capable of transcription, summarisation, document Q&A, code generation and agentic workflows, running entirely on the device in front of the user, with no API call, no data egress and no monthly subscription cost that scales with usage.

This is the hardware transition that makes private AI practical for the average enterprise. And it’s arriving faster than most analysts predicted.

The question now isn’t whether local AI is viable. It’s whether organisations have the infrastructure to actually run it.

The infrastructure problem nobody is talking about

Here’s what the Computex announcements won’t tell you: a powerful chip is not a deployment strategy.

Running AI on a single device is a solved problem. Running AI across hundreds or thousands of devices in a managed, auditable, updateable enterprise environment is a different challenge entirely. You need to know which model version is running where. You need to push updates without breaking workflows. You need usage visibility without creating a new surveillance problem. You need the runtime itself to be lightweight enough to operate without degrading the host machine.
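To make the first of those requirements concrete, here is a minimal sketch of the "which model version is running where" check. It is an illustration only: the `DeviceReport` schema, the device IDs, the model names and the `devices_needing_update` helper are all hypothetical, and none of this is Locai's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceReport:
    """One device's self-reported state (hypothetical schema, for illustration)."""
    device_id: str
    model_name: str
    model_version: str


def devices_needing_update(reports, target_versions):
    """Return {model_name: [device_id, ...]} for devices not on the target version."""
    stale = defaultdict(list)
    for report in reports:
        target = target_versions.get(report.model_name)
        # Models with no declared target version are left alone rather than flagged.
        if target is not None and report.model_version != target:
            stale[report.model_name].append(report.device_id)
    return {name: sorted(ids) for name, ids in stale.items()}


# Made-up fleet data: three laptops, two models.
reports = [
    DeviceReport("laptop-001", "summariser", "1.2.0"),
    DeviceReport("laptop-002", "summariser", "1.3.0"),
    DeviceReport("laptop-003", "transcriber", "0.9.1"),
]
targets = {"summariser": "1.3.0", "transcriber": "0.9.1"}

stale = devices_needing_update(reports, targets)
print(stale)
```

Even this toy version shows why the problem grows with fleet size: the check is trivial for three laptops, but the reports have to be collected, trusted and acted on across every device in the estate.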

This is the gap in the market. The chip manufacturers are solving the hardware problem. Nobody in the mainstream is solving the fleet management problem.

That’s what Locai was built to solve.

What we built, and why we built it before the hardware arrived

Locai is a sovereign AI infrastructure company. We build the runtime and orchestration layer that lets enterprises deploy and manage AI models on their own hardware, without a cloud dependency.

“The chip manufacturers are solving the compute problem. We’re solving the deployment problem. Those are not the same thing, and you need both.”

— Joe Ward, CEO, Locai

Locai:Link is our on-device inference runtime. It’s designed to be lightweight, hardware-agnostic and production-ready on the class of device that Nvidia just announced, as well as on Apple M-series silicon, Qualcomm Snapdragon X, and existing enterprise hardware already in the field.

Locai:Control is our fleet orchestration dashboard. It gives IT and security teams visibility and control over every AI model running across their organisation: which version, on which device, consuming what resources, producing what outputs.

We built this before the RTX Spark existed because the underlying case was already clear. Apple M-series chips have had the compute for serious on-device inference since 2023. Qualcomm’s Snapdragon X Elite arrived in 2024. The hardware capability has been ahead of the infrastructure layer for a while.

We also built a model stack to run on this hardware: fine-tuned models for UK English transcription, meeting summarisation and document Q&A, designed specifically to run on-device, with the performance characteristics these chips allow. These are not general-purpose cloud models ported to run locally, but models optimised from the ground up for this environment.

What this means for regulated industries

The organisations that benefit most from this shift are the ones that have been locked out of serious AI adoption while their competitors in less regulated sectors moved fast.

Financial services firms regulated by the FCA who cannot allow client data to leave their own systems. Legal practices where privilege is an absolute constraint. Defence and government contractors operating under data sovereignty requirements. Healthcare providers handling patient data under UK GDPR.

These organisations have watched the AI capability explosion from the sidelines, not because AI isn’t relevant to them (it’s enormously relevant), but because the deployment model available to them (cloud APIs) was incompatible with their obligations.

On-device AI with proper fleet management changes that. For the first time, the capability and the compliance model are aligned.

The economics make sense now too

There’s a cost argument here that tends to get overlooked in the capability conversation.

Cloud AI at scale is expensive. Token costs compound quickly when you’re running inference across a large workforce or processing large volumes of documents. The economics that look fine in a pilot look very different when you’re pricing up 500 users at enterprise volume.

On-device inference has a different cost structure. You pay once for the hardware (which organisations are buying anyway as part of normal device refresh cycles) and you pay for model development and maintenance. The marginal cost per inference is effectively zero. For high-volume enterprise use cases, the business case for local AI gets more compelling the more you use it.
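A rough way to test that claim against your own numbers is a break-even calculation. The sketch below is illustrative only: every figure is a made-up assumption rather than a quote or a benchmark, and it treats hardware as already paid for through the normal refresh cycle, so the upfront cost is model development alone.

```python
import math


def breakeven_months(upfront_cost, monthly_local_cost, monthly_cloud_cost):
    """First month in which cumulative cloud spend matches or exceeds local spend.

    Returns None when local running costs are not lower than cloud costs,
    because in that case the upfront investment is never recovered.
    """
    monthly_saving = monthly_cloud_cost - monthly_local_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Illustrative assumptions only (any currency), not real pricing.
users = 500
cloud_cost_per_user_per_month = 40   # assumed usage-based API spend per user
model_development_cost = 150_000     # assumed one-off local investment
local_maintenance_per_month = 5_000  # assumed ongoing model upkeep

months = breakeven_months(
    upfront_cost=model_development_cost,
    monthly_local_cost=local_maintenance_per_month,
    monthly_cloud_cost=users * cloud_cost_per_user_per_month,
)
print(months)
```

Because cloud spend scales with users and volume while the local costs in this model do not, raising either input shortens the break-even period, which is the point the paragraph above is making.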

The platform question

Jensen Huang’s smartphone comparison is apt, but it’s worth being precise about why.

The smartphone didn’t just create a new form factor. It created a new platform, one that required new infrastructure (app stores, mobile networks, mobile device management software) before the hardware could deliver its full value. The companies that built that infrastructure layer early, before the hardware became ubiquitous, were in the best position when adoption accelerated.

On-device AI is the same pattern. The hardware is arriving. The infrastructure layer (the runtimes, the fleet management, the model tooling, the integration layer) is what turns that hardware into enterprise value.

That’s the layer we’re building. And with RTX Spark shipping to millions of business PCs this year, the timing is right.

Frequently asked questions

What is on-device AI? On-device AI means running artificial intelligence models directly on a local device (a laptop, desktop or server) rather than sending data to a cloud service. The model processes information on the hardware in front of you, with no data leaving the device. This is sometimes called local AI inference or edge AI.

What is sovereign AI infrastructure? Sovereign AI infrastructure refers to AI systems that organisations fully own and control, with no dependency on third-party cloud providers. Data stays within the organisation’s own environment, models run on owned or leased hardware, and there is no ongoing API cost or external data exposure. It is the enterprise alternative to cloud-hosted AI services like ChatGPT or Copilot.

Why do regulated industries need local AI instead of cloud AI? Regulated industries including financial services, legal, healthcare and defence operate under strict data governance obligations. UK GDPR, FCA rules, legal privilege and government security classifications often prohibit sensitive data from being processed on external systems. Cloud AI services require data to leave the organisation’s environment, which creates legal and compliance risk. Local AI solves this by keeping all processing on-device or on-premises.

What is the Nvidia RTX Spark chip? The RTX Spark is a new Nvidia chip announced at Computex 2026, designed specifically for running AI agents on personal computers. It is being integrated into a new line of Windows PCs from Lenovo, HP, Dell, Microsoft Surface, Asus and MSI, shipping from autumn 2026. Nvidia’s CEO Jensen Huang described it as enabling a new class of computer that moves from tool to teammate.

What does Locai do? Locai is a sovereign AI infrastructure company based in Cardiff, UK. It builds two core products: Locai:Link, an on-device inference runtime that allows AI models to run on local hardware including Apple M-series, Qualcomm Snapdragon and Nvidia RTX-class devices, and Locai:Control, a fleet orchestration dashboard that gives enterprises visibility and management over every AI model running across their organisation.

What is the difference between Locai:Link and Locai:Control? Locai:Link is the runtime layer. It is the software that allows an AI model to run efficiently on a specific piece of hardware. Locai:Control is the management layer. It sits above the runtime and gives IT and security teams a dashboard to deploy, update, monitor and audit AI models running across all devices in the organisation. Together they form a complete on-device AI infrastructure stack.

Is local AI cheaper than cloud AI? For high-volume enterprise use cases, yes. Cloud AI costs scale with usage, typically charged per token or per API call. Those costs compound quickly at enterprise scale. On-device AI requires upfront investment in hardware and model development, but the marginal cost per inference is effectively zero. Organisations running AI-heavy workflows at scale typically find the economics of local AI significantly more favourable over a 12 to 24 month horizon.

Locai is a sovereign AI infrastructure company based in Cardiff, UK. We build on-device AI infrastructure for regulated industries. If you’re thinking about local AI deployment for your organisation, get in touch: hello@locai.co.uk
