Open by Design: How NVIDIA and DigitalOcean Are Building the Stack for the Always-On Agentic Era
By Jess Lulka
Content Marketing Manager
Published: June 2, 2026 7 min read
The growth of generative AI isn’t driven solely by AI companies with proprietary models. Open-source AI is reshaping the developer ecosystem, fueled by a growing community of builders. But what does it take to go from open models to production-ready agentic AI, and what do developers need to know to get there?
This question was the focus of the DigitalOcean Deploy session, “Open by Design: How NVIDIA and DigitalOcean Are Building the Stack for the Always-On Agentic Era.” During this 30-minute chat, Kari Briski, VP Gen AI at NVIDIA, and Salman Paracha, SVP AI at DigitalOcean, discuss why AI-native teams are demanding openness, model flexibility, and infrastructure built for agents that never sleep—and what NVIDIA and DigitalOcean are doing to build support for this next generation of AI development.
The full recorded session from Deploy 2026 is available on YouTube.
Open-Source Models Need Commitment, Not Just a Launch
There are many open models in the ecosystem, but a strong launch doesn’t guarantee that a model will be consistently improved or regularly updated. NVIDIA noticed this gap among its enterprise customers, who wanted open-source models that are maintained over time rather than launched and then left untouched.
This spurred the development of open models such as NVIDIA Nemotron. Released in March 2026, it is a family of multi-modal models designed for agentic AI. Access to these open models lets developers create agentic applications that require advanced reasoning, high compute efficiency, and open-source standards. With Nemotron models and NVIDIA software libraries, developers can evolve their projects over time and receive regular updates and expanded support.
Running open-weight LLMs locally gives you more control over performance, privacy, and customization. This NVIDIA Nemotron 3 tutorial walks through deploying NVIDIA’s Nemotron 3 Nano on a DigitalOcean GPU Droplet, helping you experiment with efficient open models on dedicated GPU infrastructure without relying entirely on hosted AI APIs.
“We’ve been building these models for ourselves because we want to build great systems,” Briski says. “We’re treating [these models] like a library and are committed just like we are with our GPUs and [CUDA] libraries and our stack that we’ll improve upon.”
Beyond the models themselves, there’s also a proliferation of harnesses—the orchestration frameworks that wrap around models to manage agent lifecycle, memory, tool calling, and scaling—which are just as important for building agentic systems.
Agents Are Only as Good as Their Evaluations
Paracha highlighted that most developers building AI-native applications still face a high barrier to entry when determining whether they can build something as durable as OpenClaw or Claude Code.
Setting up meaningful evaluations and observability is a challenge, and these developers are left wondering whether they can truly compete with AI companies that have funding for research and top-of-the-line hardware. So what does lowering that barrier to entry (and creating developer confidence) look like?
Evaluation is where it starts, according to Briski. While there are many test cases and verifications for specific use cases (such as coding), other applications lack readily available benchmarks, and academic benchmarks don’t necessarily reflect how models perform in real-world use or help optimize that performance.
Without these standards, it becomes harder for developers to gauge the viability of their idea. Broader development needs more test cases and more data to draw on, both of which require human knowledge and labeling. For industries like electronic design automation (EDA), NVIDIA is currently working with Synopsys and Cadence to develop these test cases and benchmarks to encourage development and agent creation.
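To make the idea of domain-specific test cases concrete, here is a minimal sketch of an evaluation loop in Python. It is an illustration only, not a method described in the session: `EvalCase`, `evaluate`, `toy_agent`, and the exact-match scoring rule are all hypothetical names and choices, and a real benchmark would use human-labeled cases and a scoring rule suited to the domain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One human-labeled test case (hypothetical structure)."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> bool:
    # Simplest possible scorer: case-insensitive exact match.
    # Real evaluations usually need a domain-specific check.
    return output.strip().lower() == expected.strip().lower()


def evaluate(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases the agent answers correctly."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = sum(exact_match(agent(c.prompt), c.expected) for c in cases)
    return passed / len(cases)


def toy_agent(prompt: str) -> str:
    # Stand-in for a real model call; answers from a fixed lookup table.
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


CASES = [
    EvalCase("2 + 2", "4"),
    EvalCase("capital of France", "paris"),
    EvalCase("3 * 3", "9"),  # the toy agent has no answer for this one
]

pass_rate = evaluate(toy_agent, CASES)
print(f"pass rate: {pass_rate:.2f}")
```

Swapping `toy_agent` for a function that calls a real model keeps the rest of the loop unchanged, which is the point: the test cases, not the model, define what "working" means for the application.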
Sub-Agents Only Scale When You Can Trace Them
Developers running AI-native applications have adopted sub-agent workflows that break a problem into subtasks and delegate each one to a dedicated sub-agent. Paracha has seen developers do this, but is curious about how this subset of AI development might shape up over the next few years, and what engineering principles still apply.
If you’re curious about what a sub-agent (or multi-agent) system can do, read about TradingAgents, an LLM framework designed to simulate financial trading through specialized agents.
“There’s a thread in engineering right now where you still have to understand how the system was built, even though the agent is writing the code. So when sub-agents are going off, you are able to test them, you are able to verify, and break it down to where something might be going wrong, so that you, as the architect, can understand the system,” Briski explains.
This philosophy pairs well with adding traceability throughout the system, so that during troubleshooting you have intermediate references to inspect rather than only the end product, which would leave you with a black box. While there is a newer approach of feeding a system a large amount of information and having it develop an answer, the “divide and conquer” approach still seems to be the standard.
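As a rough illustration of tracing sub-agents, the sketch below records one span per sub-agent call, together with a verification result, so a failure can be pinned to a specific step. Everything here is hypothetical: `Span`, `Tracer`, and the two toy agents are invented for the example and do not correspond to any NVIDIA or DigitalOcean API.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Span:
    """Record of a single sub-agent call (hypothetical structure)."""
    agent: str
    task: str
    output: str
    ok: bool
    seconds: float


@dataclass
class Tracer:
    spans: list[Span] = field(default_factory=list)

    def run(
        self,
        agent_name: str,
        agent: Callable[[str], str],
        task: str,
        check: Callable[[str], bool],
    ) -> str:
        """Call a sub-agent, verify its output, and record the result."""
        start = time.perf_counter()
        output = agent(task)
        elapsed = time.perf_counter() - start
        self.spans.append(Span(agent_name, task, output, check(output), elapsed))
        return output

    def failures(self) -> list[Span]:
        return [s for s in self.spans if not s.ok]


def upper_agent(task: str) -> str:
    # Toy sub-agent that does its job correctly.
    return task.upper()


def buggy_reverse_agent(task: str) -> str:
    # Toy sub-agent with a deliberate bug: it forgets to reverse.
    return task


tracer = Tracer()
tracer.run("upper", upper_agent, "abc", check=lambda out: out == "ABC")
tracer.run("reverse", buggy_reverse_agent, "abc", check=lambda out: out == "cba")

failed = [s.agent for s in tracer.failures()]
print("failing sub-agents:", failed)
```

Because each subtask carries its own check, the trace points at the `reverse` step directly instead of leaving only a wrong final answer to debug.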
Good Tokenomics Starts With Outcomes, Not Token Count
Scaling AI comes with a new problem: token usage. How can developers run AI systems that are constantly generating tokens and still build an effective business around them? What it really comes down to is the product’s value: the items delivered and the workflow efficiencies created.
“We’re in a stage right now where tokens are going to be counted differently as model architectures change. [But] we have to evolve our way of thinking because the way we count tokens generated with diffusion models and the latent spaces of tokens could all change. So I think instead of spinning out on how many tokens are being generated, it’s more about the value,” Briski explains.
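One way to make "value over token count" concrete is to track cost per delivered outcome rather than raw tokens. The sketch below is a back-of-the-envelope illustration: the `WorkflowRun` class, the token volumes, the outcome counts, and the price of 1.00 USD per million tokens are all made-up assumptions, not figures from the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowRun:
    """Token spend and delivered outcomes for one workflow (hypothetical)."""
    name: str
    tokens_generated: int
    usd_per_million_tokens: float  # assumed price, for illustration only
    outcomes_delivered: int        # e.g. tickets resolved, reports shipped

    @property
    def total_cost_usd(self) -> float:
        return self.tokens_generated / 1_000_000 * self.usd_per_million_tokens

    @property
    def cost_per_outcome_usd(self) -> float:
        # A run that delivers nothing has unbounded cost per outcome.
        if self.outcomes_delivered == 0:
            return float("inf")
        return self.total_cost_usd / self.outcomes_delivered


# Made-up numbers: the "chatty" workflow generates 4x the tokens
# but delivers 10x the outcomes.
chatty = WorkflowRun("chatty", 2_000_000, 1.0, 100)
terse = WorkflowRun("terse", 500_000, 1.0, 10)

for run in (chatty, terse):
    print(f"{run.name}: ${run.total_cost_usd:.2f} total, "
          f"${run.cost_per_outcome_usd:.3f} per outcome")
```

With these assumed numbers, the workflow that generates more tokens is the cheaper one per outcome, which is why token count alone is a poor measure of efficiency.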
But organizations do need to consider cost, especially with the larger models. NVIDIA is taking technical measures to improve the efficiency of token use. This includes…