Programming an LLM
Bringing software engineering
discipline to LLMs.
With the rise of GenAI, and large language models (LLMs) specifically, a new programming paradigm has emerged: Prompt Engineering.
As outlined in Meta’s paper Prompt Engineering with Llama 2, “Programming foundational LLMs is done with natural language – it doesn’t require training or tuning like traditional ML models.” Let’s explore what that means for developers.
Think of an LLM as a new kind of computer—one with a vast but opaque instruction set, trained on a large corpus of data. Programming this “machine” isn’t done with conventional code, but with natural language instructions. That’s what we now call Prompt Engineering.
As software engineers, we’re trained to think in deterministic terms: same input, same output. We write unit tests and integration tests to assert this consistency. We study automata theory and build systems designed to behave predictably. Yet, we all know that even in traditional software, complexity leads to failure. More lines of code, more components, more people—more chances for something to go wrong.
That’s why we’ve built mature ecosystems of tooling, processes, and architecture to manage complexity and ensure high availability. In the world of traditional software, we often hit 99.95% reliability or more.
But LLMs change the game. Reliability is no longer a given. Here’s why building with LLMs is fundamentally different:
- The model doesn’t always follow instructions as expected.
- Context windows are limited, either technically or due to cost.
- Outputs can be inconsistent or surprising.
- Models may evolve silently, changing behavior over time.
This unpredictability means that engineers must write extra code to handle potential failures, design fallback mechanisms, and create guardrails to maintain a consistent experience.
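As a minimal sketch of this pattern, the wrapper below retries a model call, validates the output against an expectation, and falls back to a safe default when the model never produces usable output. The function names (`call_with_guardrails`, `expect_label`) and the JSON-with-a-`label`-key guardrail are illustrative assumptions, not a specific library's API; any provider SDK call can be passed in as `call_model`.

```python
import json


def call_with_guardrails(call_model, prompt, validate, retries=2, fallback=None):
    """Call an LLM, validate the output, retry on failure, and fall back.

    call_model: any function taking a prompt string and returning text
                (e.g. a thin wrapper around a provider SDK)
    validate:   returns the parsed result, or raises ValueError when the
                output does not meet expectations
    """
    for _ in range(retries + 1):
        try:
            raw = call_model(prompt)
            return validate(raw)  # guardrail: reject malformed output
        except ValueError:
            continue  # inconsistent output: try again
    return fallback  # fallback keeps the user experience consistent


# Example guardrail: the model must return valid JSON with a "label" key.
def expect_label(raw):
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad JSON
    if "label" not in data:
        raise ValueError("missing 'label'")
    return data["label"]
```

The key design choice is that validation and retry logic live outside any one model call, so the same guardrails apply even if the underlying model changes behavior over time.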
It’s becoming clear that building with LLMs requires a separate software development lifecycle (SDLC) for the LLM portion of your application.
We recommend isolating all LLM interactions from your main application logic—including input preprocessing, API calls, and output validation. If you’re using a microservices architecture, think of your LLM logic as a dedicated service. This service should encapsulate all the unique complexity of working with GenAI and expose a stable, predictable interface to the rest of your system.
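One way to sketch such a dedicated service, under the stated assumptions (the class name `SummarizerService`, the injected `send` function, and the specific limits are hypothetical, not part of any particular framework):

```python
from dataclasses import dataclass


@dataclass
class SummaryResult:
    """The stable, predictable shape the rest of the system consumes."""
    text: str
    ok: bool


class SummarizerService:
    """Encapsulates all LLM interaction behind one stable method:
    input preprocessing, the model call, and output validation.

    The transport (`send`) is injected, so the rest of the application
    never touches prompts or provider SDKs directly.
    """

    MAX_INPUT_CHARS = 4000  # crude guard for a limited context window

    def __init__(self, send):
        self._send = send  # e.g. a function wrapping a provider's API

    def summarize(self, document: str) -> SummaryResult:
        # 1. Input preprocessing: trim to fit the context window.
        clipped = document[: self.MAX_INPUT_CHARS]
        prompt = f"Summarize the following in one sentence:\n{clipped}"
        # 2. The model call, isolated to this service.
        raw = self._send(prompt)
        # 3. Output validation: reject empty or implausibly long replies.
        summary = raw.strip()
        if not summary or len(summary) > 500:
            return SummaryResult(text="", ok=False)
        return SummaryResult(text=summary, ok=True)
```

Because the interface returns a plain `SummaryResult` rather than raw model text, callers can stay oblivious to prompt changes, model swaps, or validation rules evolving inside the service.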
At Arato, we help teams build that abstraction. Our platform offers a fully managed service to design, deploy, and monitor LLM-powered applications. By separating concerns and adding structure, we help bring back the consistency, control, and reliability you expect from modern software development—even in the age of GenAI.