By Tunde Abagun, Sales Lead for West, East and Central Africa, Nutanix
The first wave of enterprise AI is maturing. Organisations across Africa and beyond rushed into AI projects over the last two years, and most of them did what was easiest: they did it in the cloud. An AWS subscription here, an Azure instance there, a model running in someone else’s data centre, billed by the hour. Fast to start, simple to justify, and, for many organisations, the right call at the time.
But something is shifting. The same organisations that sprinted to public cloud for their first AI implementations are now asking harder questions about cost, control, privacy, and latency. And many are arriving at the same conclusion: the next phase of enterprise AI does not live entirely in the cloud. It lives on-premises, at the edge, and in the space between.
The rise of AI localisation
We are seeing it happen in real time. Open-source LLMs and self-hosted agent frameworks are enabling businesses to run sophisticated AI models on their own infrastructure, without routing every query through a hyperscaler’s API or exposing sensitive data to third-party systems. This is AI localisation, and it is accelerating.
The drivers behind AI localisation are practical, and privacy is the most immediate. When an AI model runs on your own infrastructure, your data does not leave your environment. For industries handling sensitive customer information, financial records, or proprietary business logic, that is not a minor distinction.
Cost is increasingly a significant factor too. Public cloud AI inference at scale is expensive, and as AI embeds into daily operations, those per-query costs compound fast. Running inference on-premises, on hardware you own, fundamentally changes the economics. In a world where the commercials of intelligence are measured in tokens per kilowatt, a metric that also has implications for inference speed, deploying AI at the core and the edge might just become the new go-to model.
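To make the cost argument concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is hypothetical and chosen only for illustration (the hardware price, the monthly running cost, the per-query cloud price and the query volume are not taken from any vendor's price list), and the function name breakeven_months is our own. It simply asks how many months of per-query cloud billing it would take to pay for hardware you own.

```python
import math
from typing import Optional


def breakeven_months(
    hardware_cost: float,
    monthly_opex: float,
    cost_per_1k_queries: float,
    queries_per_month: int,
) -> Optional[int]:
    """Months until owned hardware becomes cheaper than per-query cloud billing.

    Returns None when the cloud bill is no higher than the on-premises
    running cost, in which case owned hardware never pays for itself.
    """
    cloud_monthly = cost_per_1k_queries * queries_per_month / 1000
    monthly_saving = cloud_monthly - monthly_opex
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical inputs, for illustration only.
months = breakeven_months(
    hardware_cost=120_000,        # one-off purchase of inference hardware
    monthly_opex=2_000,           # power, cooling, support
    cost_per_1k_queries=4.0,      # assumed cloud inference price
    queries_per_month=3_000_000,  # assumed steady query volume
)
print(months)
```

The point of the sketch is the shape of the curve rather than the numbers: the cloud bill grows with every query, while the on-premises cost is mostly fixed, so the higher the steady query volume, the sooner owned hardware pays for itself. At low volumes the function returns None, which is the case where cloud inference remains the cheaper option.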
The chip story nobody is talking about
There is a structural dimension to this shift worth talking about. A significant portion of the semiconductor chips originally destined for hyperscaler buildouts is increasingly likely to end up on-premises in enterprise environments.
As organisations invest in local AI infrastructure, the market for edge-capable hardware built for inference rather than training is growing. This challenges a narrative convenient for hyperscalers: that on-premises is dying and full cloud migration is the only sensible direction. For AI workloads specifically, the on-premises environment is experiencing a quiet renaissance.
Training and inference are not the same problem
To understand why hybrid is the right answer, you must understand that AI workloads are not monolithic. Training and inference are fundamentally different activities, with different computational profiles and different optimal environments.
Training a large language model, or fine-tuning one for a specific business domain, is computationally intensive. It demands high-specification GPU nodes running in parallel for extended periods. This is exactly what hyperscalers do well: they offer elastic access to massive compute capacity, on demand, billed for the duration of the run. For model training, public cloud remains genuinely compelling.
Inference is a different problem entirely. Once a model is trained and deployed, serving it to users or business processes requires low latency, consistent availability, and cost efficiency at scale. These requirements are better served by infrastructure close to where the workload runs, whether on-premises, at the edge, or elsewhere inside the organisation’s own environment, where response times are faster and sensitive data never leaves the building.
The practical implication is simple: train in the cloud, run at the edge. This is not a workaround. It is the operating model that enterprise AI demands.
Hybrid is a strategy, not a stopgap
There is a tendency to treat a hybrid model as a transitional state, a temporary arrangement for organisations that have not yet committed fully to cloud. That framing has always been questionable. For AI, it is simply wrong.
I believe that hybrid AI is the destination, not the journey. Organisations that extract the most value from AI will be those that dynamically move workloads between environments based on what each stage requires. Train where scale is available. Deploy where proximity matters. Manage cost and performance as variables to optimise, not constraints to accept.
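The placement rule described above can be sketched as a few lines of Python. This is an illustrative sketch only, not how any particular platform decides: the Stage, Environment and Workload types, the needs_low_latency flag and the place function are all hypothetical names introduced here, and the rule is deliberately reduced to "train where scale is available, deploy where proximity matters".

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    TRAINING = "training"
    FINE_TUNING = "fine_tuning"
    INFERENCE = "inference"


class Environment(Enum):
    PUBLIC_CLOUD = "public_cloud"
    ON_PREMISES = "on_premises"
    EDGE = "edge"


@dataclass(frozen=True)
class Workload:
    name: str
    stage: Stage
    needs_low_latency: bool = False  # hypothetical flag for real-time use cases


def place(workload: Workload) -> Environment:
    """Pick an environment from the workload's stage (illustrative rule only)."""
    if workload.stage is Stage.INFERENCE:
        # Deploy where proximity matters: edge for real-time, otherwise on-premises.
        if workload.needs_low_latency:
            return Environment.EDGE
        return Environment.ON_PREMISES
    # Training and fine-tuning go where elastic GPU capacity is available.
    return Environment.PUBLIC_CLOUD


workloads = [
    Workload("domain-fine-tune", Stage.FINE_TUNING),
    Workload("branch-assistant", Stage.INFERENCE, needs_low_latency=True),
    Workload("back-office-summaries", Stage.INFERENCE),
]
for w in workloads:
    print(w.name, "->", place(w).value)
```

A real policy would weigh more than stage and latency, including cost, data residency rules and available capacity. What the sketch shows is that placement becomes a decision taken per workload and per stage, rather than a one-off choice made for the whole organisation.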
This requires infrastructure that spans environments without forcing a choice between them, with consistent management, security, and operational practices, regardless of whether a workload is running in AWS, Azure, or Google Cloud, or on a rack in your own facility.
What this means for African enterprises
For organisations across West, East, and Central Africa, the hybrid model addresses challenges specific to this market. Data sovereignty matters because running AI inference on-premises keeps data local, which is essential for regulatory compliance and protecting business-sensitive information.
Latency matters too, as connecting to hyperscaler regions from many parts of Africa introduces unacceptable delays for real-time AI applications. And cost predictability is a real concern: running inference on owned infrastructure converts a variable operating expense into a more manageable capital investment.
For example, Nutanix developed its Enterprise AI specifically for this type of environment. It allows organisations to deploy and manage AI workloads consistently across on-premises, cloud, and edge infrastructure. When a model needs training at scale, it moves to the public cloud. When it is ready for production inference, it returns on-premises, managed with the same tools and policies. The workload moves, but the organisation stays in control.
The cloud-versus-on-premises debate has always been a false binary, and for AI, it is an unhelpful distraction. The question is not where your AI runs. It is whether your infrastructure gives you the freedom to put each workload where it performs best, at the lowest cost, with the greatest control.
On-premises is not the old way. For inference workloads and privacy-sensitive deployments, it may be the smart way. Those enterprises building hybrid foundations won’t be playing catch-up when AI becomes a baseline expectation rather than a competitive edge.


