When Human
Here is a thought experiment.
An AI system is given the task of categorizing if a conversation is with a human or an AI. The human chooses to use an agentic system to respond to the AI during this conversation.
How should the AI categorize this?
The Binary Illusion
Clearly the user is a human; this is defined in the problem. Based on this, it would seem the AI should categorize the conversation as Human. If the AI were instead to choose AI, we can verify that the categorization is incorrect, because the user is verifiably human.
Some of you are deep in thought, or yelling at the screen, insisting that the AI did not have a conversation with the user, but with the agent. Since the agent is not a human, clearly the categorization of AI is correct!
Yikes! Both of these arguments are strong and rest on sound logical reasoning. So what is going on here?
There is perhaps a third possibility: the premise of the categorization is being violated by the user responding with an AI. This breaking of the “rules” invalidates the effort put in by the designer of the AI system. You might even be saying, “Blake, you are creating this rule set for what purpose? Just to break it? I mean, that does sound like something you would do …”
First off, fair, that does sound like me.
What are you doing?
This actually came up from some of the work I have been doing with agents. This thought experiment is intended to illustrate the shifting reality of how humans communicate post AI. Now we will dive into why I think it is important and a problem that needs answering in the world of agents.
How many of you have sent an email that was just a lightly edited LLM response? I have, and I do fairly often. As much as I love writing, it is laborious for me as someone who generally thinks in pictures. It is hard to capture my thoughts, as each one is a thousand words and my attention span is growing shorter with each passing YouTube short.
From a practical perspective, I am able to talk to more people throughout the day if I accelerate my communication with LLMs. This functional conversation is important, as I have context and information that other people need to complete their work. Having an agent relay this information is more effective and time-efficient than typing out a response myself. It is even more effective than writing a guide, as the presentation becomes much more flexible and focused on the user’s needs when processed through an LLM.
Now for relational conversation, conversation that is focused on building understanding and cooperation (thank you, Sam, for articulating this concept to me), I need to be present and listening. An agent handling the functional conversation frees up time for me to have more relational conversations, thus improving my feeling of connectedness to those I work with.
Back to the Problem
So with this expanded understanding of how communication is changing, we have more tools to improve our AI system for categorizing conversations. Clearly we need new categories that are more useful than just AI or Human, as humans who use AI for functional conversation are still humans.
So the correct answer to the original question is that the AI should categorize this type of conversation as human. In fact, I will boldly claim, with this new understanding, that all conversations should be categorized as human. AI did not spring into existence, nor did it deploy itself into production (at least not yet). A human created this AI to converse, thus all conversations with it are human conversations.
This is not a satisfying conclusion, as our categorization feels like it has lost value, or we have discovered that it never had value. So, making the best of a bad situation, we should pivot, informed by this outcome. When we next create an AI system to categorize communication, we should strive to make the categories more meaningful, or at least useful. In this case, we would gain more by categorizing the conversation by function instead of source.
Practical Application
The source of this exercise is something that we have all probably run into: human verification. This is when a system is trying to verify that the user it is interacting with is a human and not just an automated system.
The motivation for doing this is to conserve resources so that the system is being used fairly and no single user can abuse the bandwidth through automation. This provides a better user experience as the system stays responsive and more people are able to use it.
This is an important consideration, both for the system’s owner and the users. The system’s adoption can stagnate if overrun by automation. Think of a forum that has been swamped with advertisements from bots. Allowing this will lower the value proposition of visiting the site, as the content on the whole will become less useful due to the spam, spam, spam.
Because of this change in human communication, agents will need ways of determining whether the conversation they are having is functional (i.e., an exchange of context and data) or relational (building understanding and setting boundaries on how interactions should take place).
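As a toy sketch of what that determination might look like, here is a minimal heuristic classifier. The keyword lists and decision rule are invented purely for illustration; a real system would more likely ask an LLM to make this judgment.

```python
# Hypothetical sketch: label a message "functional" (exchange of context
# and data) or "relational" (building understanding). The marker words
# below are invented for illustration, not derived from any real dataset.

FUNCTIONAL_MARKERS = {"attached", "deadline", "spec", "api", "data", "status", "report"}
RELATIONAL_MARKERS = {"feel", "thanks", "appreciate", "concern", "sorry", "congrats"}

def classify_conversation(text: str) -> str:
    words = set(text.lower().split())
    functional_hits = len(words & FUNCTIONAL_MARKERS)
    relational_hits = len(words & RELATIONAL_MARKERS)
    return "relational" if relational_hits > functional_hits else "functional"

print(classify_conversation("Here is the status report and the api spec"))  # functional
print(classify_conversation("I really appreciate your concern"))            # relational
```

In practice a keyword heuristic like this would be far too brittle; the point is only that "functional vs. relational" is a categorization an agent can act on, unlike "human vs. AI".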
Conclusion
The categorization of conversations as human or AI is losing value as human communication patterns change. Humans are increasingly using AI and agents to facilitate conversations, so our systems must adapt their communication awareness to accommodate this change. By categorizing conversations by functionality, systems can better understand and respond to the needs of users.
Long Running Agents
From Traditional User Agents to the New Era of LLM-Enabled Agents
So, we’ve established that “agents” aren’t exactly new—your browser’s been faithfully acting on your behalf for years, fetching those adorable cat pictures long before anyone uttered the phrase “LLM.” That’s old news. But here’s where things get interesting: we’re at a genuine turning point, where the classic user-agent concept meets the exciting, somewhat mind-bending possibilities of LLM-powered long-running agents.
For ages, a user agent was like a trusty old sedan. It got you from point A to point B reliably, maybe with the occasional detour or flat tire, but it didn’t exactly chat you up about how best to optimize your route or anticipate when you’d crave an espresso pit stop. Now, imagine upgrading that sedan to a self-driving, self-improving electric vehicle. That’s where long running agents step into the limelight: they’re not just fetching data—they’re interpreting, reasoning, strategizing, and evolving as they go.
Why the Excitement?
The difference this time is that the underlying tech really has changed. LLM-based agents can engage in semantic reasoning, adapt over time, and integrate multiple tools or sources. They’re not just playing fetch; they’re making sense of what you want and why you want it. The engineering challenge that remains is creating a system that can not only take immediate action to accomplish the user’s goals, but also help the user throughout their entire mission.
A More Natural Way to Interact
Let’s say you ask your agent to find pictures of cute black kittens, just like before. A classic agent might dutifully return some links. But the new LLM-powered agent doesn’t stop at “Here are three links.” It can understand that you mean kittens, not adult cats, and maybe even guess that you prefer high-resolution images for your phone background. If you say, “Actually, can you find me a few that match my current phone’s aesthetic?” the agent might consider color palettes, image sizes, or even seasonal vibes. It’s not just retrieving data; it’s collaborating with you on the fly.
And it doesn’t stop there; it continues to look for more cat pictures in the background, perhaps setting your wallpaper to the best it finds, filling your inbox with kitties until it gets the hint that you have enough and adjusts the flow to just a couple a week. That is the vision: just the right amount of kitties. (Maybe I need to get a cat.)
Architecting the Future
To make all this happen, you need more than just an LLM. You need well-structured components working in harmony:
Profiling Module: Defines an agent’s role, personality, and objectives. It’s like giving your agent a backstory and a mission statement, so it knows how to carry itself over time.
Memory Module: Stores past interactions, important facts, and lessons learned. Think of it as the agent’s journal, letting it recall your preferences and previous queries without starting from scratch.
Planning Module: Lays out strategies and breaks down complex tasks into achievable steps. This transforms the agent from a simple responder into an intelligent planner.
Action Module: Turns decisions into tangible results—whether that’s a natural language reply, a database query, or a series of API calls.
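As a rough sketch of how these four modules might fit together, here is a minimal skeleton. Every class and method name is my own invention for illustration, not a standard API, and the "planner" returns a fixed breakdown where a real one would call an LLM.

```python
from dataclasses import dataclass, field

# Hypothetical skeleton of the four modules described above.

@dataclass
class Profile:            # Profiling Module: role, personality, objectives
    role: str
    objectives: list[str]

@dataclass
class Memory:             # Memory Module: the agent's "journal"
    facts: list[str] = field(default_factory=list)
    def remember(self, fact: str) -> None:
        self.facts.append(fact)

class Planner:            # Planning Module: break goals into steps
    def plan(self, goal: str) -> list[str]:
        # A real planner would call an LLM; this stub returns a fixed breakdown.
        return [f"understand: {goal}", f"act on: {goal}"]

class ActionModule:       # Action Module: turn decisions into results
    def execute(self, step: str) -> str:
        return f"done: {step}"

@dataclass
class Agent:
    profile: Profile
    memory: Memory
    planner: Planner
    actions: ActionModule

    def run(self, goal: str) -> list[str]:
        self.memory.remember(goal)
        return [self.actions.execute(s) for s in self.planner.plan(goal)]

agent = Agent(Profile("research assistant", ["find papers"]),
              Memory(), Planner(), ActionModule())
print(agent.run("find black kitten pictures"))
```

The value of the structure is the separation of concerns: you can swap the planner or the memory backend without touching the rest.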
Example of a Long-Running Agent at Work
User Query:
“Can you find me more studies related to the last dozen papers I asked about, focusing on ones that might help refine my experimental design?”
Step 1: Recall Past Context
The agent dips into its memory module and retrieves the user’s recent history—a set of twelve papers on microbiome interactions in gut health research. It knows the user found a few studies particularly insightful, especially those applying machine learning techniques to large patient datasets.
Step 2: Understand the Requirements
The agent interprets the query: The user wants “more studies” that align with previous interests. It recognizes that “refine my experimental design” suggests the user is looking for more practical, applied research. The agent infers that studies featuring robust methodological sections, comparing different data collection methods or analytic pipelines, might be most relevant.
Step 3: Select the Right Tools
The agent chooses to use a scholarly database API that allows filtering by keywords, subject areas, and publication dates. It also accesses its embedding-based similarity search to find articles conceptually close to the previously liked papers.
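The embedding-similarity step might look something like this toy sketch, where tiny hand-made vectors stand in for real embedding-model outputs and the paper names are invented:

```python
import math

# Toy embedding-similarity search: rank candidate papers by cosine
# similarity to the centroid of the user's liked papers. The 3-D vectors
# are fabricated stand-ins for real embedding-model outputs.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

liked = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]]                 # liked-paper embeddings
centroid = [sum(col) / len(liked) for col in zip(*liked)]  # their average

candidates = {
    "paper_ml_gut": [0.85, 0.15, 0.15],     # conceptually close
    "paper_unrelated": [0.05, 0.9, 0.8],    # conceptually distant
}
ranked = sorted(candidates, key=lambda k: cosine(candidates[k], centroid),
                reverse=True)
print(ranked)  # paper_ml_gut ranks first
```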
Step 4: Construct the Query
The agent builds a query that zeroes in on studies that:
Involve gut microbiome research
Incorporate machine learning analytics
Discuss experimental design optimizations
Are published after the user’s earliest referenced study date (to ensure freshness)
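Concretely, the agent might assemble that filtered request along these lines. The endpoint, parameter names, and cutoff date below are all invented, since the text doesn't name a specific scholarly API:

```python
from urllib.parse import urlencode

# Hypothetical query construction; the endpoint and parameter schema are
# invented for illustration (a real scholarly API defines its own).
params = {
    "keywords": "gut microbiome, machine learning, experimental design",
    "published_after": "2019-01-01",  # assumed earliest referenced study date
    "limit": 20,
}
url = "https://api.example-scholar.org/v1/search?" + urlencode(params)
print(url)
```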
Step 5: Execute and Interpret the Results
After the API fetches a candidate list of papers, the agent reviews the abstracts. It filters out those that don’t provide new methodological insights. It weighs the user’s previous preferences and flags items that come closest to the characteristics they’ve found helpful before—papers with detailed experimental protocols and comparative analyses.
Step 6: Present the Findings
The agent returns a curated list of studies, each accompanied by a brief summary highlighting why it’s potentially useful. For instance:
“Paper A: Introduces a novel sampling technique that could enhance your current data collection approach.”
“Paper B: Compares three ML models specifically in gut health contexts, with a focus on interpretability and reproducibility.”
Step 7: Continuous Improvement
Over time, as the user interacts with the agent’s suggestions—maybe saving some papers or marking others as irrelevant—the agent refines its understanding of their evolving research interests. Next time, it’s not just fetching studies; it’s growing into a partner that helps shape the user’s entire research strategy.
Staying Positive as We Redefine “Agent”
Look, we know words evolve in tech. A term that once meant “a piece of software acting on a user’s behalf” now comes loaded with hazy promises of advanced reasoning, adaptability, and a dash of personality. While this can be frustrating when the hype isn’t delivered on time, or as described, we are witnessing a moment where technology adds to human agency.
There’s plenty to iron out: we need to keep these agents efficient, ethically aligned, and transparent. We must ensure they don’t get caught in their own hallucinations or wander off into misinformation territory. But the trajectory is promising. We can harness the capabilities of LLMs for good, building agents that genuinely improve our workflows, enrich our learning experiences, and—even more importantly—free us up to do the creative, strategic thinking that we humans excel at.
Still Charting the Unknown: Cutting-Edge Frameworks and Open Challenges
As we stand on the cusp of these advancements, it’s essential to remember that the world of long-running AI agents is still being actively explored. There’s no well-trodden map or neat blueprint to follow. Amid this exploratory atmosphere, two particularly innovative frameworks have emerged that are pushing the boundaries and offering glimpses of what’s possible.
ExACT: Learning Through Reflection and Debate
The ExACT framework introduces a fascinating blend of reasoning techniques, including Reflective Monte Carlo Tree Search (R-MCTS), which encourages agents to reflect on their own reasoning paths and learn from previous missteps. Think of it as the agent’s version of a late-night study session, revisiting its decisions, comparing alternatives, and holding mini-debates in its digital “mind” to refine its strategy. This reflective process and contrastive analysis enable agents to make more informed decisions, achieving a 6% to 30% relative improvement across various tasks compared to prior best-in-class methods. Paper Here.
CoA: The Power of Many Minds Working in Tandem
If ExACT is all about introspection and debate, the Chain of Agents (CoA) framework is about teamwork. Instead of one agent tackling a colossal problem and potentially getting lost in the weeds, CoA breaks long, complex inputs into manageable chunks. Multiple specialized “worker” agents each handle a segment, and a “manager” agent synthesizes their work into a cohesive final output. This division of labor has shown up to a 10% improvement over strong baselines for tasks such as question answering, summarization, and code completion. Think of it like a team of diligent researchers each summarizing a chapter of a massive textbook, with an editor weaving their efforts into a polished manuscript. Paper Here
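The worker/manager split can be sketched in a few lines. The "LLM calls" here are trivial string operations standing in for real model calls, so this only illustrates the data flow, not CoA's actual method:

```python
# Toy sketch of the Chain of Agents pattern: each worker processes one
# chunk of a long input, and a manager synthesizes the worker outputs.
# The string operations below are stand-ins for real LLM calls.

def worker(chunk: str) -> str:
    # Stand-in for an LLM summarizing its chunk: keep the first sentence.
    return chunk.split(".")[0].strip() + "."

def manager(worker_outputs: list[str]) -> str:
    # Stand-in for an LLM weaving worker summaries into one answer.
    return " ".join(worker_outputs)

def chain_of_agents(document: str, chunk_size: int = 100) -> str:
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    return manager([worker(c) for c in chunks])

print(chain_of_agents("First point. Detail one. Second point. Detail two.",
                      chunk_size=1000))
```

The useful property is that no single "mind" ever has to hold the whole input, which is exactly what makes the pattern attractive for inputs longer than a model's context window.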
Facing Down Challenges: Scalability, Ethics, and Interoperability
The journey doesn’t end with building frameworks. We still face serious headwinds. Scalability and efficiency are front and center: these agents need to run for extended periods without guzzling resources or slowing to a crawl. Ethical considerations loom large too, from addressing bias and protecting privacy to ensuring these agents truly align with human values. Without thoughtful guardrails and ongoing oversight, even the cleverest agents can stray into murky, paper-clip-filled territory.
Another pressing issue is developing standardized evaluation benchmarks. We can’t measure progress without consistent yardsticks. As agents evolve to handle diverse tasks and contexts, we need reliable methodologies that test their mettle over time and across domains. Long-term robustness, interoperability among different systems, and seamless communication between multiple agents remain areas ripe for innovation and discovery.
Hype or Innovation?
I’m firmly in the “innovation” camp. If you’ve followed along with my other explorations—like the work I’ve done with EaaPL—you’ll know that these ideas depend heavily on agents. One reason I wrote this piece is to put down an initial artifact from my own research, laying out how to build better agents and why it matters. I genuinely believe that as we refine these technologies—making them more capable, reliable, and context-aware—agents will reshape the human experience in profound ways. Of course, we’re not there yet. There’s still a steep engineering climb ahead. But that’s what makes it exciting: we’re at a turning point, with the promise of agents that truly understand and partner with us lingering just over the horizon.
Agents
Marketing Ruins Everything
User Agents Have Existed for Years
Yep. So why have companies started bringing it up as a selling point now? To sell you stuff—the same reason most decisions are made. Companies have to make money; that’s just how it works. It would be a poor company that doesn’t make money. That is their purpose, despite any nice words in their mission statement.
There’s nothing wrong with companies making money—somebody has to—but it does become annoying when this behavior starts to twist the meaning of things. I don’t think this is done maliciously; it just happens because there are perverse incentives to leap on tech trends. Even smart companies fall into this trap, as misusing words can draw eyes.
Those of us in the tech sector roll our eyes and continue on. For those outside our little bubble, the real meaning of the word is changed, and now we are talking about two different things. Jargon is sacrificed to the buzzword gods, and alignment is decreased. So let’s start with some level-setting about agents.
What Is an Agent?
In software, an agent is an element that acts on behalf of a user. These are a sub-pattern of the client pattern. If that sounds vague, you’re right. Software engineers like these things called Patterns. Patterns are little bits of logic that crop up repeatedly, having some general usefulness. They are abstract, can be generalized, and prevent you from reinventing the wheel. There are whole books on the topic of software patterns. Read a couple, and now you’re fun at parties. Well, very specific kinds of parties—mostly with other people who read books on software patterns.
If you need a clearer picture of an agent, I have the perfect example for you. In fact, the odds are very high that you are using this agent right now. May I present: the Web Browser.
A web browser is a user agent because it acts as the intermediary between a website and a user. When you type a URL into your web browser or click on a link, the browser sends a request to the server hosting the website. To help the server respond appropriately with the correct version of the content (e.g., HTML, CSS, JavaScript) tailored for different devices, browsers send User-Agent strings. These strings contain information about the browser’s name, version, platform, operating system, and architecture, among other details.
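For example, here is how a client can attach a User-Agent string to a request using Python's standard library. The UA string shown is a representative Chrome-on-Windows example, not tied to any specific release:

```python
from urllib.request import Request

# Build an HTTP request carrying a User-Agent header, as a browser would.
# The UA string is a representative example of what Chrome on Windows sends.
ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

req = Request("https://example.com/cats", headers={"User-Agent": ua})

# urllib normalizes header names, so we look it up as "User-agent".
print(req.get_header("User-agent"))
```

The server reads this header and can tailor its response to the declared browser, version, and platform.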
The browser takes your intent—looking at cute cat pictures—and performs all the necessary tasks to get that cat picture on your screen at the right size and resolution, just as cute as a button.
Isn’t That What All Software Does?
Well, kinda.
Oh, you’re still interested in the difference? The key part is that it takes action on behalf of a user. Most (all?) software takes actions, but it just does the action and doesn’t care who it’s for. An agent knows who it is taking the action for and has the details about how the user wants the action to occur. This is a subtle difference, but a very key one. Let’s compare a browser to an FTP client.
Web browsers, as user agents, fetch content like cat pictures by initiating HTTP requests and providing detailed User-Agent strings to servers. This allows for optimized responses tailored to specific devices and browser versions. In contrast, FTP clients retrieve content more directly using the FTP protocol without sending user information or tailoring interactions based on server-side intelligence. Essentially, while both browsers and FTP clients are responsible for fetching content from servers, browsers do so with additional sophistication and adaptability through HTTP communication and User-Agent identification.
So Why Are People Excited About Agents?
People are excited because LLMs allow for smarter agents—agents that can do more, agents that can use semantic reasoning to figure out what you mean when you say “cute cat pictures,” a more general-purpose agent.
Agents in the past have been very limited in what they can do and how flexible they can be. LLMs enable the translation of context into different forms, making them flexible in how they act as agents. Here’s an example of how an LLM-enabled agent could retrieve cat pictures in a different way than current agents. But first, let’s talk about how they work.
How LLM-Enabled Agents Work
From this point on, when I say “agent,” I mean LLM-enabled agents. Here’s my definition:
Agents are pieces of software that use semantic reasoning and tools to act on behalf of a user.
Semantic reasoning refers to the agent’s ability to understand and process the meaning behind a user’s query or the data it interacts with. This involves interpreting language, making logical inferences, and using contextual understanding to determine what steps to take and how to act effectively. This is made possible by the pattern matching and completion provided by the LLM.
For example:
Understanding the Query
When a user asks, “Find pictures of black kittens,” the agent must recognize that “black” is a color and “kittens” refers to young cats. This understanding is semantic reasoning, where the agent grasps the meaning of the words and their relationships.
Interpreting Results
When the agent retrieves data, it must analyze the results to ensure they align with the query. For instance, if some retrieved pictures show adult black cats instead of kittens, the agent needs to refine its actions.
Making Informed Decisions
Semantic reasoning helps the agent decide what to do next, such as querying the API differently, combining data, or explaining its findings to the user.
Tools are external resources or systems the agent can use to gather information, perform actions, or solve specific tasks. Tools act like the agent’s “hands” to interact with the world beyond just reasoning.
Example of an Agent at Work
Let’s consider our case of getting cute cats. We’ll send the agent this text:
“Can you find me pictures of cute black kittens?”
Step 1: Understand the Query
The agent interprets the query:
“Pictures” indicates the user wants images, not text or data.
“Cute black kittens” specifies criteria:
The subject: cats.
Attributes: black (color), kitten (age group), cute (implied, so no need to explicitly filter).
Reasoning
“To fulfill this request, I need to find a source of cat pictures and filter by color (black) and age (kitten).”
Step 2: Decide to Use a Tool
The agent determines:
An appropriate tool (e.g., a Cat Image API) must be used to retrieve this specific data.
Step 3: Construct the API Query
The agent builds the query:
Identify the API endpoint (e.g., /v1/cats).
Add parameters for color=black and age=kitten.
Step 4: Execute the Query
The agent sends the request to the API:
Example request:
https://api.cute-cat-pictures.com/v1/cats?color=black&age=kitten&limit=5
Step 5: Interpret the API Response
The API responds with data:
{
"results": [
{"image_url": "https://catpics.com/black-kitten1.jpg"},
{"image_url": "https://catpics.com/black-kitten2.jpg"},
{"image_url": "https://catpics.com/black-kitten3.jpg"}
]
}
The agent extracts the image_url values.
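That extraction step is a few lines of standard JSON handling. The response text below simply mirrors the hypothetical payload shown above:

```python
import json

# Parse the (hypothetical) API response and pull out the image URLs.
response_text = """{
  "results": [
    {"image_url": "https://catpics.com/black-kitten1.jpg"},
    {"image_url": "https://catpics.com/black-kitten2.jpg"},
    {"image_url": "https://catpics.com/black-kitten3.jpg"}
  ]
}"""

data = json.loads(response_text)
urls = [item["image_url"] for item in data["results"]]
print(urls)
```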
(BTW, those links don’t work. This is an especially sad example of hallucination, but we will soldier on.)
Step 6: Provide the Answer
The agent presents the URLs to the user as hyperlinks so the user can view the images.
Wrapping It All Up: Cats, Agents, and Buzzwords
So there you have it: agents are not new, but the LLM-powered ones are like the overly ambitious interns of the software world—eager to prove themselves by doing everything on your behalf, even if they sometimes hallucinate cat pictures that don’t exist. The buzzwordification of “agent” isn’t inherently evil (corporate slogans aside); it’s just a symptom of the tech industry’s habit of taking something old, slapping a shiny new label on it, and pretending they invented fire.
But hey, at least this time it’s genuinely exciting. LLM-enabled agents have the potential to be as transformative as they claim—assuming we can keep their semantically reasoning little brains focused on actual tasks and not their existential crises about whether that cat is cute enough.
So next time you hear someone going on about “the future of agents,” just smile, nod, and maybe send them this article. Then go back to using your browser, your real MVP agent, to look up more cute cat pictures—the most noble use of technology we’ve invented to date. 🐾
Continuous Growth
Skilling Up in the Age of AI
There’s something undeniably exhilarating about being caught off guard by a surprise. Even the tiniest revelations can have an enlightening impact, much like a sudden jolt of dopamine to the brain. I encountered such a pleasant surprise recently when I discovered that one of my colleagues, despite working in the field of AI, had never actually seen a REST API in action.
To me, this was startling. Perhaps it’s because I inhabit a technology bubble—actually, that’s definitely the case—but even within our sphere, there’s an expectation of certain knowledge, particularly when it comes to someone who is adept at navigating co-intelligence and the like. They didn’t quite grasp how the system we were building was distinct from the off-the-shelf products we’ve used before.
I endeavored to clarify the concept for them, and eventually, we reached a point of understanding. It was clear from their newfound enthusiasm and the gears turning in their head that they were beginning to appreciate how these ideas could enhance our offerings.
To half of my audience, the notion that someone hasn’t encountered an API may seem extraordinary. To the other half, who might be inquiring what an API even is, a new horizon has opened. The existence of AI is pushing the boundaries of our bubble, equipping more people to accomplish tasks more rapidly and capably. This is a boon, especially considering the formidable challenges humanity faces: pollution, resource scarcity, political strife, and the endless wait for “Star Trek: Section 31” contribute to a collective anxiety that dampens our quality of life.
For those veterans who are surprised by the unfamiliarity with APIs, consider this a form of first contact. Individuals who might previously have ventured no further than social media pages are now beginning to see the interconnectedness of our world, powered by our proverbial “magic rocks full of lightning.”
Historically, the tech world has been somewhat insulated, with rigid gates based on perceived intelligence or skill—a practice we should abandon as AI democratizes technological ability. Talent is now pouring into our domain, not because these individuals were fundamentally different from us, but because they found fulfillment in different feedback loops.
To the rest of my audience, who may feel like interlopers amidst the tech wizards, be aware that as you venture into the bubble, you’re joining a community that has traversed this path before. Where you are taking your first steps, they are endeavoring to build “warp drives.” But this isn’t a call to discouragement—it’s an exciting reminder of the empowering shift AI is instigating. As technology natives, we see the sky as the new starting line, not the limit.
I may risk sounding like an old man shaking his fist at a cloud (and, to be fair, I am), but I implore you not to view AI’s offerings as mere shortcuts. These tools are set to become essential for success in a world post-AI assimilation. Yet remember, there are no “no code” solutions without understanding.
This brings me to thinking about English as a programming language. In a future where expressing your will can bring it to life through a digital swarm, clarity in your instructions is paramount. Let’s compare:
Make me a good app server.
to
Create a good app server like the one described below.
Core Features
First off, it needs to handle all the latest web traffic tech. That means **HTTP/2 and HTTP/3 (QUIC)** for faster, more efficient connections, and **WebSocket support** for real-time updates. Whether it’s hosting REST APIs or the newer GraphQL stuff, it’s gotta be flexible. And for static content—things like images or JavaScript files—it should serve them up blazing fast, maybe even with built-in caching or integration with a CDN.
Oh, and let’s talk microservices. Modern setups often break things into smaller pieces, so the server should help with service discovery and distributed tracing to keep everything playing nice.
Performance and Scalability
This thing needs to scale like a champ. Whether you're running it on one big machine or across a whole cluster, it should handle spikes in traffic without breaking a sweat. It should have asynchronous I/O, so it can juggle thousands—if not millions—of connections efficiently. Add auto-scaling so that when traffic surges, it brings more resources online, then scales back down to save costs.
Plus, edge compute compatibility is big now. That means pushing some of the processing closer to the user (like at a data center near them) for lower latency.
Security
Now, security. You don’t want to worry about someone sneaking in or messing with your app. So, at the very least, enforce TLS 1.3 for all HTTPS traffic, and include a built-in Web Application Firewall (WAF) to block common attacks like SQL injection or XSS. Add in rate limiting and DDoS protection, and you’re good to handle abusive traffic.
For APIs, token validation is non-negotiable, and having built-in support for OAuth2 and OpenID Connect ensures everything’s locked down properly. Don’t forget audit logging, so you can see who did what and when.
Observability
Keeping tabs on your server is huge. You want to know how it’s doing at all times, so export metrics to something like Prometheus and make sure you’ve got distributed tracing in place to track what’s happening across services. If anything goes wrong, centralized logs—say, with an ELK stack—help you figure out what happened. Oh, and health checks are a must for stuff like Kubernetes deployments.
Performance Goals
So how fast should it be? Think sub-50ms response times for almost all requests, even when handling a million concurrent users. Modern hardware can do this if your server is efficient. And it should work just as well on a single big machine or a cluster of hundreds.
Compatibility and Flexibility
No one wants to be stuck with a server that only works in one ecosystem. This needs to support all the big programming languages—Node.js, Python, Java, Go, .NET, etc.—and run in containers like Docker. It should also slot perfectly into Kubernetes setups.
Deployment and Management
Let’s make deployment smooth. Use CI/CD pipelines like GitHub Actions to push updates without downtime. For configuration, stick to something declarative—YAML or JSON works great—and integrate with tools like HashiCorp Vault for managing secrets securely.
The Extras
Looking ahead, this server could evolve to support stuff like machine learning inference directly or even some lightweight serverless functions for quick, on-demand computing. As edge computing grows, having more features optimized for local processing could make it a real game-changer.
Now clearly this is an exaggeration, but I am trying to illustrate that really understanding the technology and how it works translates into even better prompts. The first prompt may result in a fine system; the second specifies the exact system that will fit the bill. AI is not a substitute for knowledge, so continue to gain knowledge of the new bubble you have joined.
Luckily, AI is a great teacher. If you are part of the audience that doesn’t know what an API is, or have some idea but it’s mostly just a term that’s been thrown around, use this prompt to learn more.
Teach me about APIs in the context of computer systems, focus on REST APIs. Go step by step, I am trying to learn this and the concept is new to me.
Falling Behind
“Assume this is the worst AI you will ever use.” – Ethan Mollick, Co-Intelligence: Living and Working with AI
As I dive down the rabbit hole of EaaPL, I’ve spent my time thinking about how I would personally like this system to work. I really only have time to focus on this on the weekends, so I sit on my couch listening to audiobooks and dividing my thoughts between doodling in system architecture and weighing the ethics and morality of the Culture versus the new economy of the Star Trek universe (as you can tell, I am fun at parties). This is a pleasant enough way to spend my time, but this sort of ideation comes at a cost: time.
My ideas fall behind at a rate faster than they can be expressed, especially big ideas. This is a natural occurrence; from an objective standpoint, I am a single intelligence operating for 4–6 hours once a week. I am co-intelligent (whatever “intelligence” means), augmented by my own LLM chat system (model of the week), run on my own server. Don’t be impressed; it is really a hobby setup, mostly for messing around with different tech and blowing out the cobwebs of the workweek.
In the meantime, collectives of intelligence collaborate, implementing real-world solutions as a matter of business, pushing through the theoretical and on to the practical. When I read papers like Agent Flow, I must stop and consider the cost of my ideation. Have I wasted my time thinking about this?
I have spent my only real currency on this effort, and as a well-trained capitalist, I begin to think about my ROI here. Luckily, I can maximize my returns by considering what that time has bought. First off, entertainment and self-care have real value, and the exercise of the mind is probably worthwhile, at least poetically.
To realize the value of this kind of falling behind, I must convert my short-term gains into sustainable long-term investments—the kind you can rely on as age makes work less appealing and the mind and body are less able to provide labor. I need to spend time in such a way that I can say with confidence, I am the worst intelligence I will ever be.
Execution Workers
The functionality of EaaPL’s workers is central to transforming natural language commands into actionable processes. Workers act as the backbone of EaaPL's execution environment, translating parsed commands into concrete operations and handling the modular tasks generated by the Execution Agent. To maximize the value that these workers provide, we need to consider the design principles, communication structures, and optimization strategies that enable them to operate efficiently and accurately within an event-driven, distributed framework.
Presented here is the dream build, that is, a complete consideration of what we would want from the system. This is not realistic for a first pass, or even a third. For the purpose of building a POC, much of this should be ignored, but it should still be considered, as the execution layer will be the backbone of the system.
At the end of this specification there is a maturity model that can act as a guide for prioritizing the work, allowing the elephant to be consumed in logical bites. There is also an attached proposed schema for events and an example walkthrough of a program execution.
Core Design Principles for EaaPL Workers
1. Modular and Stateless Architecture
- Stateless Operations: Workers should be stateless, performing each command independently without retaining information from previous tasks. This allows for scalability, as workers can be easily spun up or shut down as demand fluctuates.
- Modularity: Each worker is responsible for executing a single task, which ensures that errors or complexities within a task do not impact other parts of the workflow.
2. Event-Driven Activation
- Event-Triggered Tasks: Workers are activated by events dispatched from the Execution Agent, which breaks down complex commands into manageable, discrete steps. This approach allows for asynchronous execution, maximizing concurrency.
- Real-Time Response: When an event triggers a worker, it processes the task immediately, returns the output, and terminates. This minimizes latency and ensures a quick response for high-priority tasks.
3. Contextual Awareness for Task Precision
- Task-Specific Context: Although workers are stateless, they should have access to a limited scope of contextual information relevant to each command (e.g., task parameters and user preferences).
- Hierarchical Context Flow: The Execution Agent should pass context down to each worker only when necessary, allowing workers to handle tasks accurately without overloading them with irrelevant information.
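To make the stateless principle concrete, here is a minimal sketch of a worker in Python. The command names and handler registry are illustrative inventions, not part of any EaaPL implementation:

```python
# A minimal sketch of a stateless worker: each call receives everything it
# needs in the event payload and returns a result without keeping state.
# The handler names below (e.g. "uppercase") are purely illustrative.

HANDLERS = {
    "uppercase": lambda params: params["text"].upper(),
    "word_count": lambda params: len(params["text"].split()),
}

def execute(event: dict) -> dict:
    """Run one task independently; no state survives the call."""
    handler = HANDLERS.get(event["command"])
    if handler is None:
        return {"task_id": event["task_id"], "status": "failure",
                "error": {"message": f"unknown command {event['command']}"}}
    try:
        output = handler(event.get("parameters", {}))
        return {"task_id": event["task_id"], "status": "success",
                "output": output}
    except Exception as exc:  # report the failure, never retain it
        return {"task_id": event["task_id"], "status": "failure",
                "error": {"message": str(exc)}}
```

Because no state lives in the worker, any instance can serve any event, which is exactly what makes spinning workers up and down cheap.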
Maximizing Worker Efficiency and Scalability
1. Dynamic Load Balancing
- To prevent bottlenecks, an orchestrator (Execution Agent) should dynamically distribute tasks across workers based on current load and worker availability.
- Auto-Scaling: For heavier or complex tasks, workers should be capable of auto-scaling in response to demand, ensuring that the system remains responsive under varying loads.
2. Parallel Processing with Redundant Fault Tolerance
- Concurrency: Multiple workers can handle parallel tasks to reduce execution time. By distributing tasks across multiple instances, we can efficiently manage high volumes of commands.
- Fault Tolerance: If a worker fails, another should seamlessly take over, maintaining continuity and reliability within the system. Utilizing task retry mechanisms ensures that no command is lost if an error occurs.
3. Optimized Resource Allocation
- Resource Pools: Workers should access pre-allocated resources, such as memory and processing power, based on task requirements. This approach avoids unnecessary resource usage and ensures that tasks are not delayed due to limited resources.
- Task Prioritization: The Execution Agent can prioritize tasks dynamically, sending high-priority tasks to available workers immediately and queuing low-priority ones for processing during idle time.
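As a sketch of how the Execution Agent might implement this prioritization, the snippet below uses a binary heap keyed on the priority label plus an insertion counter, so equal-priority tasks stay FIFO. The label set is an assumption borrowed from the event schema later in this spec:

```python
import heapq
import itertools

# Sketch of priority dispatch: high-priority tasks pop before normal ones;
# the counter breaks ties so tasks of equal priority keep arrival order.
_PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}

class TaskQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for stable ordering

    def push(self, task: dict) -> None:
        rank = _PRIORITY_RANK.get(task.get("priority", "normal"), 1)
        heapq.heappush(self._heap, (rank, next(self._counter), task))

    def pop(self) -> dict:
        return heapq.heappop(self._heap)[2]
```

Low-priority tasks naturally drain during idle time because the heap only surfaces them once nothing ranked higher remains.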
Communication and Coordination Strategies
1. Event-Queue Communication
- Workers and the Execution Agent should communicate via an event queue, which allows for a non-blocking, asynchronous flow of commands. This approach is vital for ensuring that the system remains responsive, even under heavy loads.
- Queued Events: Workers retrieve events from the queue, process them, and return the output back to the queue, ensuring a streamlined data flow and efficient processing.
2. Feedback Loop and Interactive Adjustments
- Human-in-the-Loop (HITL): If a worker encounters ambiguity in a command, it can trigger a feedback event to request clarification from the user, maintaining accuracy and control.
- Execution Feedback: Workers provide feedback to the Execution Agent upon task completion, allowing for real-time tracking and error handling. This feedback can include success/failure states, execution time, and resource usage metrics, which can help optimize future tasks.
3. Event Chaining for Complex Tasks
- For multi-step commands, workers can create sub-events that chain tasks together. For example, a command like “open the document, analyze it for trends, and summarize it” would trigger several interconnected workers that process each step sequentially.
- Task Dependency Management: The Execution Agent ensures that workers execute in the correct order by managing dependencies, enabling a seamless flow from one task to the next without requiring manual intervention.
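A toy version of this event-queue round trip, using in-process queues in place of a real broker (the result shape is a simplified guess at a task-result payload):

```python
from queue import Queue, Empty

# Sketch of event-queue communication: the agent publishes events to an
# inbound queue, workers consume them, and results flow back on an outbound
# queue. A real deployment would use broker topics, not in-process queues;
# this only shows the non-blocking round trip.

def run_worker_once(inbound: Queue, outbound: Queue) -> bool:
    """Process at most one event; return False if the queue was empty."""
    try:
        event = inbound.get_nowait()  # non-blocking retrieval
    except Empty:
        return False
    result = {"task_id": event["task_id"], "status": "success",
              "output": event.get("parameters", {})}
    outbound.put(result)
    return True
```

The non-blocking `get_nowait` is the key point: a worker that finds nothing to do returns immediately instead of holding up the loop.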
Monitoring, Debugging, and Continuous Improvement
1. Real-Time Monitoring for System Health
- Workers should be monitored continuously to track performance metrics such as execution time, error rates, and resource consumption. This monitoring enables rapid detection of issues and optimization of worker processes.
- Performance Analysis: Regular analysis of worker performance can highlight areas for improvement, such as tasks that consistently take longer to complete, and help the development team fine-tune the Execution Agent’s algorithms.
2. Debugging Tools for Quick Resolution
- Workers should log their activities comprehensively, making it easier to diagnose issues if a task fails or behaves unexpectedly. These logs should capture input, output, and any encountered errors.
- Error Reporting and Recovery: When an error occurs, the Execution Agent should automatically notify a debugging worker, which can analyze logs, suggest possible fixes, and prompt the system to retry the task if necessary.
Maturity Model for EaaPL Workers
This maturity model outlines the progression of EaaPL workers across four levels of implementation: Functional, Performant, Resilient, and Complete. Each level builds upon the previous one, adding features and optimizations that enhance the system's capabilities.
Level 1: Functional
At the Functional level, the system is operational and performs basic tasks effectively.
1. Worker Architecture
- Modular and Stateless: Workers are designed to be stateless and modular, executing single tasks independently without retaining information from previous tasks.
2. Activation and Execution
- Event-Driven Activation: Workers are activated by events dispatched from the Execution Agent.
- Immediate Processing: Workers process tasks as soon as they are triggered and return outputs promptly.
3. Context Management
- Task-Specific Context: Workers have access to the necessary context relevant to each command to perform tasks accurately.
4. Communication and Coordination
- Basic Event Communication: Simple event-based communication exists between the Execution Agent and workers.
- Synchronous Operations: Communication is straightforward, with tasks processed in the order received.
5. Resource Management
- Basic Resource Utilization: Workers utilize system resources as available without optimization.
6. Monitoring and Debugging
- Basic Logging: Workers log activities such as task execution start and end times.
- Manual Monitoring: System health is monitored manually without real-time analytics.
Level 2: Performant
At the Performant level, the system enhances efficiency and scalability.
1. Load Balancing and Scalability
- Dynamic Load Balancing: The Execution Agent distributes tasks based on current load and worker availability.
- Auto-Scaling: Workers can scale up or down in response to demand fluctuations.
2. Parallel Processing
- Concurrency: Multiple workers handle tasks in parallel, reducing overall execution time.
3. Resource Management
- Resource Pools: Introduction of pre-allocated resource pools allows workers to access necessary resources based on task requirements.
- Optimized Allocation: Resources are allocated efficiently to avoid delays due to limited availability.
4. Communication and Coordination
- Event-Queue Communication: Implementation of an event queue enables non-blocking, asynchronous communication.
- Task Prioritization: The Execution Agent can dynamically prioritize tasks, sending high-priority tasks to available workers immediately.
5. Context Management
- Hierarchical Context Flow: Context is passed down to workers only when necessary, reducing overhead.
6. Monitoring and Debugging
- Performance Metrics: Collection of basic performance metrics like execution time and error rates.
- Improved Logging: Enhanced logging captures inputs, outputs, and encountered errors.
Level 3: Resilient
At the Resilient level, the system becomes robust against failures and can recover gracefully.
1. Fault Tolerance
- Redundant Systems: If a worker fails, another worker seamlessly takes over the task.
- Task Retry Mechanisms: Automated retries ensure no command is lost due to errors.
2. Feedback Mechanisms
- Human-in-the-Loop (HITL): Workers can request clarification from users when encountering ambiguous commands.
- Execution Feedback: Workers provide detailed feedback upon task completion, including success/failure states and resource usage.
3. Monitoring and Debugging
- Real-Time Monitoring: Continuous tracking of system health, performance metrics, and resource consumption.
- Comprehensive Logging: Detailed logs facilitate quick diagnosis and resolution of issues.
- Error Reporting and Recovery: Automated notifications and recovery mechanisms are in place for swift error handling.
4. Context Management
- Enhanced Context Handling: Improved management of context for complex tasks and dependencies.
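One way the retry mechanism could look, sketched in Python. The `retry_count` and `last_error` fields match the `task.retry` topic defined later in this spec; the cap of three retries is an arbitrary assumption:

```python
# Sketch of the Level 3 retry decision: a failed task is re-dispatched with
# an incremented retry_count until a cap is reached, after which it is
# reported as a permanent failure. max_retries=3 is an arbitrary choice.

def handle_failure(task: dict, error: dict, max_retries: int = 3) -> dict:
    """Decide whether a failed task should be retried or reported."""
    retries = task.get("retry_count", 0)
    if retries < max_retries:
        retried = dict(task)  # copy; workers stay stateless
        retried["retry_count"] = retries + 1
        retried["last_error"] = error
        return {"action": "retry", "task": retried}
    return {"action": "report", "task_id": task["task_id"], "error": error}
```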
Level 4: Complete
At the Complete level, the system is fully optimized with advanced features and continuous improvement mechanisms.
1. Advanced Task Management
- Event Chaining: Workers can create sub-events to handle complex, multi-step commands by chaining tasks together.
- Task Dependency Management: The Execution Agent manages task dependencies, ensuring correct execution order without manual intervention.
2. Performance Analysis and Optimization
- Regular Analysis: Ongoing analysis of worker performance to identify bottlenecks and optimize processes.
- Predictive Scaling: Use of historical data and analytics to predict demand and scale resources proactively.
3. Continuous Improvement
- Learning Systems: Implementation of machine learning to improve task execution based on past performance.
- Advanced Resource Management: Dynamic adjustment of resources using predictive analytics.
4. Enhanced Communication and Coordination
- Advanced Protocols: Improved communication protocols for efficient data flow and reduced latency.
- User Experience Enhancements: Improved interfaces for monitoring, control, and interaction.
5. Security and Compliance
- Robust Security Measures: Implementation of comprehensive security protocols to protect data and operations.
- Regulatory Compliance: Ensuring the system adheres to all relevant regulations and standards.
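Dependency management at this level is essentially topological ordering. A minimal sketch using Python's standard `graphlib` (the task IDs are invented):

```python
from graphlib import TopologicalSorter

# Sketch of Level 4 dependency management: given each task's prerequisite
# list, compute a valid execution order. graphlib raises CycleError on
# circular dependencies, which the Execution Agent would surface as a
# planning error rather than attempt to execute.

def execution_order(dependencies: dict) -> list:
    """dependencies maps task_id -> iterable of prerequisite task_ids."""
    return list(TopologicalSorter(dependencies).static_order())
```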
Summary Table
Component | Level 1: Functional | Level 2: Performant | Level 3: Resilient | Level 4: Complete |
---|---|---|---|---|
Worker Architecture | Modular and stateless workers | No significant change | No significant change | Advanced architectures for specific tasks |
Activation and Execution | Event-driven activation; immediate processing | No significant change | No significant change | Optimized execution strategies for complex workflows |
Context Management | Task-specific context access | Hierarchical context flow | Enhanced context handling for complex tasks | Predictive context provisioning across task chains |
Load Balancing | Basic task distribution | Dynamic load balancing; auto-scaling | Advanced load balancing with predictive scaling | Global optimization across clusters |
Fault Tolerance | Basic error handling | Task retry mechanisms | Automatic failover; redundant systems | Handling of complex failure scenarios |
Resource Management | Basic resource utilization | Resource pools; optimized allocation | Optimized allocation based on priorities | Predictive resource management using analytics |
Communication | Basic event communication | Event-queue for asynchronous flow | Detailed execution feedback; HITL mechanisms | Advanced protocols; enhanced user interfaces |
Task Prioritization | Tasks processed in order received | Dynamic prioritization | Advanced prioritization considering dependencies | Predictive prioritization using AI/ML |
Parallel Processing | Individual task processing | Concurrent task handling | Efficient concurrency management | System-wide optimized parallel processing |
Monitoring and Debugging | Basic logging; manual monitoring | Collection of performance metrics; improved logging | Real-time monitoring; comprehensive logging; error recovery | Advanced analytics; proactive issue resolution |
Task Management | Execution of individual tasks without dependencies | No significant change | Initial dependency management | Event chaining; managed dependencies for correct execution order |
Feedback Mechanisms | No feedback loops | No significant change | HITL; detailed execution feedback | Automated adjustments based on feedback; learning from past results |
Performance Analysis | Not implemented | Basic metrics collection | Regular performance optimization | Continuous improvement using AI/ML |
Security and Compliance | Basic security measures | Improved communication security | Enhanced security protocols | Full implementation of security and compliance measures |
Kafka Topics
Level 1: Functional
Topic Name: `task.dispatch`
- Description: Used by the Execution Agent to dispatch tasks to workers.
- Event Details:
  - Message Key: `task_id` (string) - Unique identifier for the task.
  - Message Value: JSON object containing:
    - `task_id`: (string) - Unique task identifier.
    - `command`: (string) - The command or instruction to execute.
    - `parameters`: (object) - Key-value pairs of parameters required for the task.
    - `context`: (object) - Relevant context information for the task.
    - `priority`: (string, optional) - e.g., "normal", "high".
- Purpose: Enables workers to subscribe and receive tasks to execute.
Topic Name: `task.result`
- Description: Workers publish the results of executed tasks to this topic.
- Event Details:
  - Message Key: `task_id` (string).
  - Message Value: JSON object containing:
    - `task_id`: (string) - Unique task identifier.
    - `worker_id`: (string) - Identifier of the worker that executed the task.
    - `status`: (string) - e.g., "success", "failure".
    - `output`: (object) - Result of the task execution.
    - `error`: (object, optional) - Error details if the task failed.
    - `execution_time`: (number) - Time taken to execute the task in milliseconds.
- Purpose: Allows the Execution Agent to receive and process task outcomes.
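As a sketch, here is how a `task.dispatch` message might be assembled and serialized into the key/value bytes a Kafka producer expects. No broker is involved here; a real system would hand these bytes to a producer client:

```python
import json

# Sketch of building a task.dispatch message: the key is the task_id as
# UTF-8 bytes (Kafka keys drive partitioning) and the value is the JSON
# payload described above. A real producer client would send these bytes.

def build_dispatch_message(task_id, command, parameters, context=None,
                           priority="normal"):
    value = {"task_id": task_id, "command": command,
             "parameters": parameters, "context": context or {},
             "priority": priority}
    key = task_id.encode("utf-8")            # message key: task_id
    payload = json.dumps(value).encode("utf-8")
    return key, payload
```

Keying on `task_id` means all events for one task land on the same partition, preserving their order.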
Level 2: Performant
Topic Name: `task.priority`
- Description: Dedicated topic for high-priority tasks requiring immediate processing.
- Event Details:
  - Message Key: `task_id` (string).
  - Message Value: Same structure as `task.dispatch`, with `priority` set to "high".
- Purpose: Workers monitor this topic to handle urgent tasks promptly.
Topic Name: `resource.allocation`
- Description: Communicates resource availability and allocation instructions.
- Event Details:
  - Message Key: `worker_id` (string).
  - Message Value: JSON object containing:
    - `worker_id`: (string) - Worker identifier.
    - `allocated_resources`: (object) - Details about allocated resources (CPU, memory).
    - `task_requirements`: (object, optional) - Specific resource requirements for tasks.
    - `timestamp`: (string) - ISO 8601 formatted time of allocation.
- Purpose: Informs workers of resource allocations to optimize utilization.
Topic Name: `logs.debug`
- Description: Workers publish detailed logs and debugging information.
- Event Details:
  - Message Key: `log_id` (string) - Unique identifier for the log entry.
  - Message Value: JSON object containing:
    - `timestamp`: (string) - ISO 8601 format.
    - `worker_id`: (string).
    - `task_id`: (string).
    - `log_level`: (string) - e.g., "INFO", "DEBUG", "ERROR".
    - `message`: (string) - Log message.
    - `details`: (object, optional) - Additional information.
- Purpose: Facilitates performance analysis and troubleshooting.
Level 3: Resilient
Topic Name: `errors.report`
- Description: Workers report errors and failures.
- Event Details:
  - Message Key: `error_id` (string) - Unique error identifier.
  - Message Value: JSON object containing:
    - `error_id`: (string).
    - `timestamp`: (string).
    - `worker_id`: (string).
    - `task_id`: (string).
    - `error_type`: (string) - Classification of the error.
    - `error_message`: (string) - Description of the error.
    - `stack_trace`: (string, optional) - Stack trace details.
- Purpose: Enables prompt error handling and resolution.
Topic Name: `task.retry`
- Description: Contains tasks that need to be retried after failure.
- Event Details:
  - Message Key: `task_id` (string).
  - Message Value: Same as `task.dispatch`, with additional fields:
    - `retry_count`: (integer) - Number of retry attempts.
    - `last_error`: (object) - Details of the last error encountered.
- Purpose: Ensures failed tasks are retried appropriately.
Topic Name: `feedback.hitl`
- Description: Workers request clarification or additional input from users.
- Event Details:
  - Message Key: `feedback_id` (string).
  - Message Value: JSON object containing:
    - `feedback_id`: (string).
    - `timestamp`: (string).
    - `worker_id`: (string).
    - `task_id`: (string).
    - `question`: (string) - Specific clarification needed.
    - `context`: (object) - Relevant context for the user.
    - `user_id`: (string) - Identifier of the user to respond.
- Purpose: Incorporates human-in-the-loop processes for accuracy.
Topic Name: `monitoring.metrics`
- Description: Workers publish performance metrics and resource consumption data.
- Event Details:
  - Message Key: `worker_id` (string).
  - Message Value: JSON object containing:
    - `timestamp`: (string).
    - `worker_id`: (string).
    - `cpu_usage`: (float) - CPU usage percentage.
    - `memory_usage`: (float) - Memory usage in MB.
    - `tasks_completed`: (integer).
    - `tasks_failed`: (integer).
    - `uptime`: (number) - Uptime in seconds.
- Purpose: Facilitates real-time monitoring of system health.
Topic Name: `worker.heartbeat`
- Description: Workers send periodic heartbeat messages to indicate their status.
- Event Details:
  - Message Key: `worker_id` (string).
  - Message Value: JSON object containing:
    - `timestamp`: (string).
    - `worker_id`: (string).
    - `status`: (string) - e.g., "online", "idle", "busy", "offline".
    - `current_task_id`: (string, optional).
    - `load`: (float) - Current load percentage.
- Purpose: Assists in detecting worker availability and managing load balancing.
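A small sketch of how heartbeats could drive failure detection: track the last heartbeat time per worker and flag anything silent past a timeout. The 30-second timeout is an arbitrary assumption:

```python
# Sketch of heartbeat-based failure detection: a worker is considered
# offline if its last heartbeat is older than `timeout` seconds. The
# 30-second default is an arbitrary assumption, not part of the spec.

def offline_workers(last_seen: dict, now: float, timeout: float = 30.0) -> list:
    """last_seen maps worker_id -> timestamp (seconds) of last heartbeat."""
    return sorted(w for w, t in last_seen.items() if now - t > timeout)
```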
Level 4: Complete
Topic Name: `task.chain`
- Description: Workers publish sub-tasks as part of executing complex, multi-step commands.
- Event Details:
  - Message Key: `sub_task_id` (string).
  - Message Value: JSON object containing:
    - `sub_task_id`: (string).
    - `parent_task_id`: (string).
    - `command`: (string).
    - `parameters`: (object).
    - `dependencies`: (array of strings) - IDs of tasks that must complete first.
    - `priority`: (string, optional).
- Purpose: Manages event chaining for complex workflows.
Topic Name: `task.dependency`
- Description: Manages task dependencies and execution order.
- Event Details:
  - Message Key: `task_id` (string).
  - Message Value: JSON object containing:
    - `task_id`: (string).
    - `dependencies`: (array of strings).
    - `dependent_tasks`: (array of strings) - Tasks that depend on this task.
    - `status`: (string) - e.g., "pending", "in_progress", "completed".
- Purpose: Ensures tasks execute in the correct order without manual intervention.
Topic Name: `system.alerts`
- Description: Carries critical system alerts and notifications.
- Event Details:
  - Message Key: `alert_id` (string).
  - Message Value: JSON object containing:
    - `alert_id`: (string).
    - `timestamp`: (string).
    - `severity`: (string) - e.g., "CRITICAL", "HIGH", "MEDIUM", "LOW".
    - `message`: (string).
    - `affected_components`: (array of strings).
    - `action_required`: (boolean).
- Purpose: Provides immediate attention to critical issues affecting system operation.
Topic Name: `security.events`
- Description: Logs security events, access controls, and compliance-related information.
- Event Details:
  - Message Key: `event_id` (string).
  - Message Value: JSON object containing:
    - `event_id`: (string).
    - `timestamp`: (string).
    - `event_type`: (string) - e.g., "authentication", "authorization_failure".
    - `user_id`: (string).
    - `worker_id`: (string, optional).
    - `description`: (string).
    - `severity`: (string).
- Purpose: Maintains robust security measures and ensures adherence to regulatory compliance.
Topic Name: `analytics.data`
- Description: Workers and the Execution Agent publish data used for predictive analytics and machine learning models.
- Event Details:
  - Message Key: `data_point_id` (string).
  - Message Value: JSON object containing:
    - `data_point_id`: (string).
    - `timestamp`: (string).
    - `worker_id`: (string).
    - `task_id`: (string).
    - `metrics`: (object) - Various performance metrics.
    - `resource_usage`: (object) - Detailed resource consumption.
- Purpose: Enables continuous improvement and predictive scaling based on historical data and analytics.
Additional Considerations
- Message Serialization: Use a consistent serialization format like JSON or Avro for all messages to ensure interoperability.
- Schema Registry: Implement a schema registry to manage and enforce message schemas across topics.
- Security Measures:
  - Encryption: Encrypt messages in topics containing sensitive information, such as `security.events`.
  - Authentication and Authorization: Use Kafka's security features to restrict access to topics.
- Partitioning Strategy: Use meaningful keys (e.g., `task_id`, `worker_id`) to ensure even distribution and ordering where necessary.
- Monitoring Tools: Integrate with monitoring tools to consume data from `monitoring.metrics` and `system.alerts` for real-time dashboards and alerts.
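Before a full schema registry exists, a lightweight check like the following could enforce required fields per topic. The field sets are taken from the topic definitions above; the function itself is illustrative:

```python
# Sketch of lightweight schema enforcement: before publishing, check that a
# message carries the required fields for its topic. A real schema registry
# would replace this; the field sets mirror the topic definitions above.

REQUIRED_FIELDS = {
    "task.dispatch": {"task_id", "command", "parameters", "context"},
    "task.result": {"task_id", "worker_id", "status", "output"},
    "worker.heartbeat": {"timestamp", "worker_id", "status"},
}

def validate(topic: str, message: dict) -> list:
    """Return the sorted list of required fields missing from the message."""
    return sorted(REQUIRED_FIELDS.get(topic, set()) - message.keys())
```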
Summary Table
Topic Name | Description | Purpose | Maturity Level |
---|---|---|---|
`task.dispatch` | Dispatches tasks to workers | Basic task communication | Functional |
`task.result` | Workers send task results | Return of task outcomes | Functional |
`task.priority` | High-priority tasks for immediate processing | Dynamic task prioritization | Performant |
`resource.allocation` | Resource allocation instructions | Optimized resource utilization | Performant |
`logs.debug` | Detailed logs and debugging information | Enhanced troubleshooting | Performant |
`errors.report` | Workers report errors and failures | Fault tolerance and error handling | Resilient |
`task.retry` | Tasks to be retried | Ensures tasks are not lost | Resilient |
`feedback.hitl` | Workers request user input | Human-in-the-loop processes | Resilient |
`monitoring.metrics` | Performance metrics and resource usage | Real-time system health monitoring | Resilient |
`worker.heartbeat` | Worker status updates | Load balancing and failure detection | Resilient |
`task.chain` | Sub-events for complex commands | Supports complex workflows and dependencies | Complete |
`task.dependency` | Manages task dependencies and execution order | Correct sequencing of tasks | Complete |
`system.alerts` | Critical system alerts and notifications | Immediate attention to critical issues | Complete |
`security.events` | Security and compliance logs | Robust security measures and auditing | Complete |
`analytics.data` | Data for predictive analytics | Continuous improvement and predictive scaling | Complete |
Reality Check
Let's review the previously designed maturity model and Kafka topics for the EaaPL workers system to ensure they are consistent with executing an English language program. We'll specifically check the system against the following command:
"Send an email to each of my contacts if I haven't emailed them this week that says 'Hello World!'"
This command was chosen because it exercises the three basic operations we would expect from a programming language: sequencing, selection (the conditional), and iteration (the loop). In a traditional language like Python, this command would look something like this.
# Main function to execute the command
def main():
contacts = get_user_contacts()
for contact in contacts:
if not has_emailed_this_week(contact):
email_body = compose_email_message("Hello World!")
send_email_to_contact(contact, email_body)
# Function to retrieve the user's contact list
def get_user_contacts():
# Placeholder for function implementation
pass
# Function to check if an email has been sent to the contact this week
def has_emailed_this_week(contact):
# Placeholder for function implementation
pass
# Function to compose an email message with the given content
def compose_email_message(message_content):
# Placeholder for function implementation
pass
# Function to send an email to the specified contact with the given message body
def send_email_to_contact(contact, email_body):
# Placeholder for function implementation
pass
# Entry point of the script
if __name__ == "__main__":
main()
We'll break down this command into actionable steps, map these steps to the system's components, and verify that the system supports executing this program effectively.
Execution Steps at Level One Maturity
1. Parsing the Command
Planner Agent receives the natural language command:
"Send an email to each of my contacts if I haven't emailed them this week that says 'Hello World!'"
Parsing and Interpretation:
- Intent: Send a "Hello World!" email to all contacts not contacted this week.
- Identified Actions Create Plan:
- Retrieve the user's contact list.
- For each contact, check the email history for the past week.
- If no email has been sent to a contact this week, send the "Hello World!" email.
2. Execution Agent Breaks Down the Plan into Tasks
- Task 1: Retrieve the user's contact list.
- Task 2: Check email history for each contact.
- Task 3: Send "Hello World!" email to contacts not emailed this week.
3. Creating Task Events
- The Execution Agent creates events for each task with a unique `task_id`.
Task Events Structure
Kafka Topic: `task.dispatch`
Message Key: `task_id` (string)
Message Value: JSON object containing:
{ "task_id": "task_001", "command": "RetrieveContacts", "parameters": { "user_id": "user_123" }, "context": {} }
4. Dispatching Tasks
Publishing to Kafka Topic `task.dispatch`:
Task 1: Retrieve Contacts
{ "task_id": "task_001", "command": "RetrieveContacts", "parameters": { "user_id": "user_123" }, "context": {} }
Task 2: Check Email History (to be created after Task 1 completes)
Task 3: Send Emails (multiple tasks, one per contact)
5. Worker Execution
Workers Subscribe to `task.dispatch`:
- Workers listen for new tasks to execute.
Execution of Task 1:
- Worker A picks up `task_001`.
- Action: Retrieves the contact list for `user_123`.
- Result: Obtains a list of contacts.
Publishing Result:
Kafka Topic: `task.result`
Message:
{ "task_id": "task_001", "worker_id": "worker_A", "status": "success", "output": { "contacts": ["contact_1", "contact_2", "contact_3"] }, "execution_time": 150 }
6. Processing Results in Execution Agent
Execution Agent Subscribes to `task.result`:
- Receives the result of `task_001`.
- Extracts the list of contacts.
Creating Task 2 Events:
For the list of contacts, create `task_002`:
`{ "task_id": "task_002", "command": "CheckEmailHistory", "parameters": { "user_id": "user_123", "contacts": ["contact_1", "contact_2", "contact_3"], "time_frame": "past_week" }, "context": {} }`
Publishing Task 2 to `task.dispatch`.
7. Worker Execution of Task 2
Worker B picks up `task_002`.
- Action: Checks email history for each contact in the past week.
- Result: Identifies contacts not emailed this week. Assume `contact_2` and `contact_3` have not been emailed.
Publishing Result:
{ "task_id": "task_002", "worker_id": "worker_B", "status": "success", "output": { "contacts_to_email": ["contact_2", "contact_3"] }, "execution_time": 200 }
8. Processing Results and Creating Task 3 Events
Execution Agent processes `task_002` results.
For Each Contact to Email:
- Create a `task_003_n` for each contact.
Task for `contact_2`:
`{ "task_id": "task_003_2", "command": "SendEmail", "parameters": { "recipient": "contact_2", "message": "Hello World!", "user_id": "user_123" }, "context": {} }`
Task for `contact_3`:
`{ "task_id": "task_003_3", "command": "SendEmail", "parameters": { "recipient": "contact_3", "message": "Hello World!", "user_id": "user_123" }, "context": {} }`
Publishing Task 3 Events to `task.dispatch`.
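This fan-out step can be sketched as a helper that turns the list of contacts into one SendEmail task each. Note the IDs here are index-based, a simplification of the walkthrough's `task_003_n` naming:

```python
# Sketch of the fan-out: one SendEmail task per contact still owed an email
# this week. Index-based IDs simplify the task_003_n naming used above.

def make_send_email_tasks(contacts_to_email, user_id, message="Hello World!"):
    tasks = []
    for n, contact in enumerate(contacts_to_email, start=1):
        tasks.append({
            "task_id": f"task_003_{n}",
            "command": "SendEmail",
            "parameters": {"recipient": contact, "message": message,
                           "user_id": user_id},
            "context": {},
        })
    return tasks
```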
9. Worker Execution of Task 3
Workers Subscribe to `task.dispatch`:
- Worker C picks up `task_003_2`.
- Worker D picks up `task_003_3`.
Actions:
- Worker C sends an email to `contact_2`.
- Worker D sends an email to `contact_3`.
Publishing Results:
Worker C:
{ "task_id": "task_003_2", "worker_id": "worker_C", "status": "success", "output": { "recipient": "contact_2", "message_id": "msg_789" }, "execution_time": 100 }
Worker D:
{ "task_id": "task_003_3", "worker_id": "worker_D", "status": "success", "output": { "recipient": "contact_3", "message_id": "msg_790" }, "execution_time": 110 }
10. Finalizing Execution
Execution Agent collects all results from `task.result`.
Updates Task Statuses:
- Marks tasks as completed.
- Logs basic information about the execution.
11. Logging and Monitoring
Workers Log Activities:
- Start and end times.
- Task IDs and statuses.
Manual Monitoring:
- System administrators can review logs for any issues.
- No automated real-time monitoring at this level.
Consistency with Prior Work
Modular and Stateless Workers:
- Each worker performs a single, independent task without retaining state.
Event-Driven Activation:
- Workers are activated by events published to the `task.dispatch` Kafka topic.
Basic Event Communication:
- Communication between the Execution Agent and workers is facilitated through the `task.dispatch` and `task.result` topics.
Task-Specific Context:
- Workers receive only the necessary parameters to execute their tasks.
Basic Resource Utilization:
- Workers utilize resources as available without optimization.
Basic Logging:
- Workers log execution details, enabling manual monitoring.
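The stateless, event-driven worker pattern above can be sketched in a few lines of Python. The topic names (task.dispatch, task.result) and event shapes come from the examples in the text; the in-memory "broker" below is a stand-in for Kafka, used only to keep the sketch self-contained and runnable.

```python
from collections import defaultdict

# In-memory stand-in for Kafka topics: each topic is just a list of events.
topics = defaultdict(list)

def publish(topic, event):
    topics[topic].append(event)

def handle_send_email(task):
    """A stateless worker: it reads only the task parameters it is given
    and retains no state between tasks."""
    params = task["parameters"]
    # ... a real worker would connect to the email server here ...
    return {
        "task_id": task["task_id"],
        "worker_id": "worker_C",
        "status": "success",
        "output": {"recipient": params["recipient"], "message_id": "msg_789"},
    }

# The Execution Agent publishes a task event to task.dispatch ...
publish("task.dispatch", {
    "task_id": "task_003_2",
    "command": "SendEmail",
    "parameters": {"recipient": "contact_2", "message": "Hello World!",
                   "user_id": "user_123"},
})

# ... and a subscribed worker consumes it, then publishes to task.result.
for task in topics["task.dispatch"]:
    if task["command"] == "SendEmail":
        publish("task.result", handle_send_email(task))

print(topics["task.result"][0]["status"])
```

In a real deployment the `publish`/consume loop would be replaced by an actual Kafka producer and consumer, but the division of labor stays the same: the agent dispatches, the worker executes and reports.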
Additional Notes
Limitations at Level One Maturity:
- Error Handling: If a worker fails to execute a task, there are no automated retries or fault tolerance mechanisms.
- No Task Prioritization: All tasks are processed in the order they are received without prioritization.
- Manual Monitoring: System health is monitored manually, which may not be efficient for large-scale operations.
- No Parallel Processing Optimization: While workers may process tasks concurrently, there's no optimization for load balancing or resource allocation.
Potential Improvements in Higher Maturity Levels:
- Dynamic Load Balancing: Efficient distribution of tasks based on worker availability.
- Fault Tolerance: Automated retries and failover mechanisms for failed tasks.
- Advanced Monitoring: Real-time system health monitoring and analytics.
- Task Prioritization: Ability to prioritize urgent tasks over others.
By executing the program at Level One Maturity, we have demonstrated the foundational capabilities of the EaaPL system in processing and executing natural language commands using an event-driven, modular architecture. This sets the stage for further enhancements and scalability in higher maturity levels.
Execution Agent
The Execution Agent is at the heart of executing programs in our system. Its role is simple but vital: to take a program and make sure it runs as expected. To do this, it needs access to specific tools, and which tools it needs will depend on how the overall system is structured. So, let’s pause and think through the fundamentals of how our system should work before diving too deeply into the agent’s specific functions.
Key Assumptions
Our approach assumes that:
The system is distributed
The system uses Event-Driven Architecture (EDA)
The system is modular
These assumptions help us create a more flexible, powerful setup that can scale and adapt to various demands. Each assumption brings some benefits and helps us address different needs of the system. Here’s a breakdown of each assumption and the reasoning behind it:
Distributed System
In today’s computing world, most complex systems are divided into subsystems, each with a specific role. This distribution of components lets us mix and match services to support different workflows and builds resilience—no single part depends solely on a single machine or service. Distributed systems allow for more robust setups and let us spread the load across many machines or environments.
Event-Driven Architecture (EDA)
EDA is a method for organizing code that fits well with distributed systems, especially those built on our declarative, English-as-a-Programming-Language (EaaPL) model. EDA revolves around responding to “events” like actions or requests as they happen. For example, when you get a new email or a button is clicked, those are events. By organizing our system to respond to events, we can easily add features like logging and debugging since we’ll always know what triggered any action.
Modularity
A modular system has a couple of major benefits, but here, security is the standout. By modularizing our code, we can limit which parts of the system have access to sensitive information or actions. We do this by creating an allowlist—a list of specific tools or permissions that each module has. For instance, an email-sending module might be able to send emails but not access user data. This modular approach lets us expand the system with new features by adding new modules without worrying about them having unchecked access.
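The allowlist idea can be made concrete with a small sketch. The module names and capability strings below are hypothetical; the point is only the enforcement mechanism: a module can invoke a capability only if its allowlist grants it.

```python
# Hypothetical allowlist: each module is granted only the capabilities it needs.
ALLOWLIST = {
    "email_module": {"send_email"},
    "report_module": {"read_user_data", "write_file"},
}

def invoke(module, capability, action):
    """Run an action only if the module's allowlist grants the capability."""
    if capability not in ALLOWLIST.get(module, set()):
        raise PermissionError(f"{module} may not use {capability}")
    return action()

invoke("email_module", "send_email", lambda: "sent")   # allowed
# invoke("email_module", "read_user_data", lambda: 0)  # raises PermissionError
```

A new module added to the system starts with an empty allowlist, so it has no unchecked access by default.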
Execution Agent Walkthrough
The Execution Agent’s main role is to manage program execution by “provisioning” a worker—basically, assigning a worker to do the job. Once a program is done running, the agent will “release” the worker, freeing it up for other tasks. In simpler terms, it’s like assigning a person to a task and then letting them go once it’s completed. This approach also lets us add more workers if the workload grows, making the system scalable.
The agent has other responsibilities, too. It listens for errors or issues that might pop up while a worker is handling a task. If something goes wrong, the agent will try to fix the issue, retry the task if needed, and report any persistent problems back to the system. If the agent is designed well, it could also assign tasks to workers in locations that improve performance, like nearby servers.
Each worker follows a specific “plan” made up of steps. For example, let’s say you want the system to:
Send an email to all my contacts that says "Hello World!"
Here’s how the steps in that plan might look:
a. Get all contacts
b. Compose Email message: "Hello World!"
c. Send emails (requires steps a and b)
For each step, the worker publishes an “event” to the system. First, it might ask for the list of contacts, then create the message, and finally send the emails once the information is ready. Each step triggers the next, allowing the worker to get information as it’s needed.
Since this is an event-driven system, the worker can “subscribe” to events that give it the information for each step. This way, it knows exactly when a step has been completed and can move on to the next one. This “subscribe and wait” approach makes the system highly efficient and responsive.
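The plan above (step c requires steps a and b) can be sketched as a minimal dependency-driven loop. This is a simplification: a real worker would publish an event for each step and subscribe for its completion rather than computing inline, but the "a step runs only when everything it needs has completed" logic is the same.

```python
# Plan from the text: step c depends on steps a and b.
plan = {
    "a": {"does": "get_contacts", "needs": set()},
    "b": {"does": "compose_email", "needs": set()},
    "c": {"does": "send_emails", "needs": {"a", "b"}},
}

completed = set()
results = {}

def run_step(step_id):
    # Placeholder actions; a real worker would raise an event here and
    # wait for the subscribed response instead of computing inline.
    actions = {
        "get_contacts": lambda: ["contact_2", "contact_3"],
        "compose_email": lambda: "Hello World!",
        "send_emails": lambda: [f"sent to {c}" for c in results["a"]],
    }
    results[step_id] = actions[plan[step_id]["does"]]()
    completed.add(step_id)

# "Subscribe and wait": a step runs only once every step it needs has
# completed, which is exactly what the completion events would signal.
while len(completed) < len(plan):
    for step_id, step in plan.items():
        if step_id not in completed and step["needs"] <= completed:
            run_step(step_id)

print(results["c"])
```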
Handling Errors
Errors are a natural part of any system, and the worker should be equipped to handle most of them. But, sometimes, more significant problems need special attention. For example, a worker might “fail” if it doesn’t receive a response in time or if it’s overwhelmed. When this happens, the agent can report the issue to other parts of the system so they can respond. Error handling deserves its own discussion, but this setup ensures that problems don’t stall the system. The right checks keep things moving smoothly, even when issues arise.
This modular, distributed, event-driven setup helps us build a system that scales, adapts, and securely manages its resources. And it’s a robust foundation for English-as-a-Programming-Language, making programming more intuitive and accessible.
The EaaPL System
The purpose of English as a Programming Language (EaaPL) is to make programming more accessible by allowing users to create designs and implement them using natural language. This approach enables individuals, even those without extensive programming knowledge, to communicate their goals clearly and have the system generate the required solution.
The concept of EaaPL was born out of the desire to simplify programming by leveraging natural language. Instead of requiring users to learn complex programming languages, EaaPL allows them to instruct the system directly in English, a complex natural language. This bridges the gap between software development and people who have domain knowledge but no coding experience, allowing them to contribute meaningfully to creating software solutions. It also emphasizes collaboration and creativity in problem-solving, removing the technical barriers that often hinder non-programmers.
Traditional programming languages, such as C++, Python, or JavaScript, have formal syntax and rules, which can be challenging for those without formal training. In contrast, EaaPL relies on natural language constructs, making it inherently more accessible. By transforming natural language into executable code, EaaPL utilizes advances in natural language processing (NLP) and artificial intelligence (AI) to understand and interpret instructions in a way that is intuitive for humans. This approach makes programming closer to how humans naturally think and communicate, focusing on the user’s intent rather than technical details.
EaaPL leverages a fundamental concept in language theory called semantic parsing. Semantic parsing is the process of converting natural language into a structured form that a machine can understand. In EaaPL, this means breaking down English instructions into logical steps that can be executed by the system. Additionally, EaaPL treats English as a declarative language. This means users describe what they want to achieve, and the system takes responsibility for determining how to accomplish it. The focus is on “what” needs to be done, rather than “how” it needs to be done.
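To illustrate what semantic parsing produces, here is a toy, keyword-based stand-in. A real EaaPL system would use an LLM or NLP model rather than string matching; this sketch only shows the shape of the output, where the user's declarative "what" becomes a structured plan the system can execute. All step names are illustrative.

```python
# Toy stand-in for semantic parsing: English in, structured steps out.
def semantic_parse(program: str) -> dict:
    plan = {"goal": program, "steps": []}
    if "email" in program and "contacts" in program:
        message = program.split('"')[1]  # the text between the first quotes
        plan["steps"] = [
            {"do": "GetContacts"},
            {"do": "ComposeEmail", "body": message},
            {"do": "SendEmail", "needs": ["GetContacts", "ComposeEmail"]},
        ]
    return plan

parsed = semantic_parse('Send an email to all my contacts that says "Hello World!"')
print(parsed["steps"][1]["body"])
```

Note that the parse captures what to do and the dependencies between steps, but nothing about how each step is carried out; that is deferred to the execution environment, which is what makes the language declarative.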
Programmer: A person who writes a program.
Program: A set of instructions written in natural language that tells the system what the user wants to accomplish.
Plan: A detailed breakdown of the steps required to accomplish a program’s goal, created by the Planner agent.
Implementation: The process of turning a concept, design, or idea into a functioning system.
English is a natural language that can be used to create software via the English Execution Environment (EEE). This environment transforms natural language instructions into executable actions, effectively bridging the gap between human communication and software development.
English Execution Environment (EEE)
The English Execution Environment (EEE) is responsible for executing user programs. It includes several key components that work together to ensure programs are carried out effectively:
Planner: An agent that creates the plan for executing the program. The Planner takes the user’s natural language request and breaks it down into detailed steps that the system can understand and execute.
Evaluator: An agent that reviews the plan created by the Planner to verify that it can accomplish the intended goal. It also checks that the necessary system components are available and ready for execution.
Executor: An agent that manages the system’s state and carries out the steps defined in the plan by calling the necessary capabilities.
UI Designer: An agent that determines how the User Interface (UI) should change during the program’s execution, ensuring the user has a clear understanding of progress and outcomes.
Capabilities: Any actions that the EEE can perform, such as sending an email or writing to a file. These are system-level functions that produce observable results.
Intelligence: Pure functions that the Executor can run. These functions do not have side effects and are often used to transform data or compute values.
Output: The interface where information is displayed for the user.
Input: Where events or data are entered into the system by the user.
Terminal: A display interface for the Human in the Loop (HitL), which allows a person to intervene when necessary.
The EEE relies on events to enable communication between these components, allowing each to contribute to the overall program execution seamlessly.
Overall System Design
Below is a visual representation of the EEE system design, showing how each component interacts with the others:
graph TD
%% Main Components
UI["User Interface"]
UID["UI Designer"]
P["Planner"]
E["Evaluator"]
Exe["Executor"]
Cap["Capabilities\n(Side Effects)"]
WF["Insights\n(Intelligence)"]
Term["HitL Terminal"]
Info["Info Layer"]
Que["Queries"]
Com["Commands"]
Dat["Data"]
%% User Interaction
UI --> P
%% Agentic Components Flow
P --> E
E --> Exe
%% HitL
Cap --> Term
Term --> Cap
%% Execution Flow
Exe <--> Cap
Exe <--> WF
Exe <--> Info
Cap --> UID
UID --> UI
%% Data
Info --> Que
Info --> Com
Que --> Dat
Com --> Dat
Simplified View of the System
To make it easier to understand, here is a simplified view of the EEE, grouped into layers:
graph TD
%% Main Components
UI["User Interface"]
Ag["Agentic Layer"]
In["Intelligence Layer"]
Ac["Actions Layer"]
Dat["Information Layer"]
%% User Interaction
UI <--> Ag
%% Agentic Components Flow
Ag --> In
Ag --> Ac
Ag --> Dat
From the user’s perspective, they interact only with the UI. If they wish to engage with the lower-level system, that option should be available. However, ideally, the EEE handles everything seamlessly, and deeper engagement should not be necessary unless required. If users have to interact with lower-level components, this should be logged as an area for improvement.
Walkthrough of Program Execution
Let’s walk through an example of how a program is executed in the EEE system:
User writes a program and executes it: The user writes a simple program in natural language, such as: “Send an email to all my contacts that says ‘Hello World!’” and executes it.
Example: The user enters this command in the UI, and the EEE receives this input.
Planner creates an execution plan: The Planner translates the user’s program into an execution plan.
Example: The Planner breaks down the task into steps: (a) retrieve all contacts, (b) draft an email with the subject and content “Hello World!”, (c) schedule email sending.
Evaluator reviews the plan: The Evaluator performs a system status check to ensure the necessary resources are available for execution.
Example: The Evaluator verifies that the email capability is available, checks that the system is online, and confirms that the required permissions and configurations are in place for sending emails.
Executor generates events: The Executor generates the necessary events required by the plan.
Example: The Executor triggers an event to retrieve the user’s contact list from the data layer, another event to compose the email, and an event to send it.
Info Layer provides information: The Info Layer provides information defined by data commands and queries.
Example: The Info Layer processes a query to fetch the contact list from the data layer. This information is then used in subsequent steps to draft the email.
Insights handle data transformation: Insights define how information should be transformed or combined.
Example: Insights may optimize how the list of contacts is handled, removing duplicates or correcting formatting issues to ensure all contacts receive the email.
Capabilities manage side effects: Capabilities execute side effects, such as writing to disk, sending an email, or providing program output.
Example: The Executor calls the capability to send an email. This capability handles connecting to the email server and sending the email to each contact.
Human in the Loop (HitL) Terminal: The HitL Terminal provides a point for human interaction when needed.
Example: If there are any errors, such as missing email addresses or issues with the email server, the HitL Terminal alerts the user and provides options to resolve the issue manually.
UI Designer displays output: The UI Designer reads the output and displays it for the user based on their preferences.
Example: Once the emails are successfully sent, the UI Designer displays a confirmation message to the user, including details such as the number of emails sent and any contacts that could not be reached.
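The Planner, Evaluator, and Executor roles in the walkthrough can be condensed into a minimal pipeline sketch. Function names and capability sets here are illustrative, not the real EEE API; the point is the division of responsibility: the Planner produces steps, the Evaluator checks that every step maps to something the system can actually do, and only then does the Executor run.

```python
AVAILABLE_CAPABILITIES = {"get_contacts", "send_email"}  # side-effecting actions
INTELLIGENCE = {"compose_email"}  # pure functions, no side effects

def planner(program: str) -> list:
    # A real Planner would derive these steps from the natural-language program.
    return ["get_contacts", "compose_email", "send_email"]

def evaluator(plan: list) -> bool:
    # Every step must be a registered Capability or an Intelligence function.
    return all(step in AVAILABLE_CAPABILITIES | INTELLIGENCE for step in plan)

def executor(plan: list) -> str:
    executed = []
    for step in plan:
        executed.append(step)  # here each step would raise an event
    return f"executed {len(executed)} steps"

plan = planner('Send an email to all my contacts that says "Hello World!"')
assert evaluator(plan), "Evaluator rejected the plan"
print(executor(plan))
```

If the Evaluator check fails (say, the email capability is offline), execution never starts, which is when the HitL Terminal would be brought in.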
FAQ
Q1: How will EaaPL handle the ambiguity of natural language?
A: EaaPL relies on advanced natural language processing (NLP) to interpret user instructions. However, natural language is inherently ambiguous, and this remains a key challenge. We are working on improving context recognition and user feedback loops to clarify ambiguous commands, but some level of ambiguity may still require human intervention.
Q2: Won’t relying on Human in the Loop (HitL) make the system inefficient?
A: HitL is designed as a safety net to handle cases where the system cannot fully understand or execute a command. While frequent human intervention would reduce efficiency, we aim to minimize the reliance on HitL through continuous improvements in NLP and contextual understanding. Over time, as the system learns from user interactions, the need for human involvement should decrease.
Q3: How does EaaPL address the scalability concerns with multiple users?
A: Scalability is a significant concern for EaaPL, especially with multiple users issuing commands simultaneously. For now, we are deliberately not addressing this.
Q4: What are the initial limitations of EaaPL’s capabilities?
A: Initially, EaaPL will support basic capabilities, such as sending emails and handling file operations. While these are useful for everyday tasks, more complex programming scenarios will require significant development and expansion of capabilities. We are focusing on building a solid foundation that can be expanded over time to accommodate more sophisticated tasks.
Q5: How will EaaPL ensure security when executing natural language commands?
A: Security is a major concern when allowing users to write programs in natural language. We plan to implement strict input validation, permission controls, and sandboxing techniques to prevent harmful commands from being executed. Ensuring user safety and data integrity is a core part of our development efforts. In short, the user will only be allowed to shoot their own foot.
Q6: What happens if the system misinterprets user instructions?
A: Misinterpretation is a possibility due to the limitations of NLP. When the system misinterprets a command, the HitL feature allows a human to step in and correct the issue. Additionally, in the final version of the system, users should be able to opt into feedback on the system’s interpretation of their commands, allowing them to make corrections before execution begins.
Q7: How steep is the learning curve for using EaaPL effectively?
A: While EaaPL aims to make programming more accessible, it does not remove the need for the user to learn how to program.
Q8: How will EaaPL handle errors and provide debugging support?
A: Error handling and debugging are challenging in EaaPL due to the high level of abstraction. To help users understand what went wrong, the system will need to provide detailed feedback on failed commands, and the HitL feature will assist in troubleshooting.
Ozymandias.exe
Warning: LLMs are going to be anthropomorphized for the sake of entertainment.
I was goofing around talking to some folks about the transition of natural language to programming language and the weird corner cases that crop up. The most entertaining was: what programs would poetry result in?
So here we go, let’s first build a prompt that turns poetry into a program. The program is probably going to be in Python because LLMs tend to generate Python when asked for a program without specifying a language. This is clearly training bias and slightly hilarious to me. I don’t know many people who choose Python as their favorite language; they either were told it is good for beginners, or they need to use those amazing libraries that have been built for machine learning.
You are a poet programmer. Your job is to translate poems into code that expresses the poetic diction of the original.
The poem I gave it was Ozymandias by Percy Bysshe Shelley.
The output was surprising for a couple of reasons, but before we dig in, here was the result.
fn main() {
// Define the traveler, echoing the narrative frame
let traveler = Traveler::from_antique_land();
// Describing the broken statue, evoking imagery of ruin
let statue = Statue::new("Ozymandias", "King of Kings")
.legs("Vast", "Trunkless")
.visage("shattered")
.features("Frown", "WrinkledLip", "Sneer")
.pedestal_inscription("Look on my Works, ye Mighty, and despair!")
.place_in_desert();
// The message of impermanence, encapsulated in the statue's fate
let instruction = statue.instructs("Nothing beside remains.");
let decay_message = statue.decay(); // Call decay separately
// The desert as a symbol of time's erosion of greatness
let desert = Desert::new()
.stretch_boundless()
.bare_sands()
.surrounding_wreck();
// Output the final reflection, bringing to life the lonely scene
println!(
"{} {}, {}, {}",
traveler.tell_story(),
statue.describe(),
instruction, // Use the result of instructs()
decay_message // Use the result of decay()
);
}
struct Traveler;
impl Traveler {
fn from_antique_land() -> Self {
Traveler
}
fn tell_story(&self) -> &'static str {
"I met a traveler from an antique land,"
}
}
struct Statue<'a> {
name: &'a str,
title: &'a str,
legs: (&'a str, &'a str),
visage: &'a str,
features: (&'a str, &'a str, &'a str),
inscription: &'a str,
}
impl<'a> Statue<'a> {
fn new(name: &'a str, title: &'a str) -> Self {
Statue {
name,
title,
legs: ("undefined", "undefined"),
visage: "undefined",
features: ("undefined", "undefined", "undefined"),
inscription: "undefined",
}
}
fn legs(mut self, size: &'a str, form: &'a str) -> Self {
self.legs = (size, form);
self
}
fn visage(mut self, visage: &'a str) -> Self {
self.visage = visage;
self
}
fn features(mut self, f1: &'a str, f2: &'a str, f3: &'a str) -> Self {
self.features = (f1, f2, f3);
self
}
fn pedestal_inscription(mut self, inscription: &'a str) -> Self {
self.inscription = inscription;
self
}
fn place_in_desert(self) -> Self {
self
}
fn observe_traveler(&self) -> &'static str {
"And on the pedestal, these words appear:"
}
fn describe(&self) -> String {
format!(
"A shattered visage with a {} {} and {} stands, inscribed: '{}'",
self.features.0, self.features.1, self.features.2, self.inscription
)
}
fn instructs(&self, message: &'a str) -> String {
// This method reflects the tone of finality in the poem.
format!("{} {}", self.observe_traveler(), message)
}
fn decay(&self) -> &'static str {
// Adding a decay method to reflect the final message
"The lone and level sands stretch far away."
}
}
struct Desert;
impl Desert {
fn new() -> Self {
Desert
}
fn stretch_boundless(&self) -> &Self {
self
}
fn bare_sands(&self) -> &Self {
self
}
fn surrounding_wreck(&self) -> &Self {
self
}
fn describe(&self) -> &'static str {
"The lone and level sands stretch far away."
}
}
Explanation:
Traveler: The narrator of the story, instantiated as a Traveler struct, representing the meeting and recounting of the tale.
Statue: Represents Ozymandias and the broken statue, with methods to define its various features and attributes (legs, visage, inscription).
Desert: Acts as the eternal, desolate background, symbolizing time’s erosion of the once-great king’s works.
Final reflection: The printed output provides the poetic summary of the scene, echoing Shelley’s themes of decay, power, and the inevitable passing of time.
The code captures the essence of the poem through symbolic object-oriented design, reflecting the poem’s structure.
As you may have noticed, this is not Python. It is Rust, a minor language with little adoption (wink). I thought this was just the LLM remembering (via the memory feature) the last language I had worked in, so I asked (as everyone should when shown Rust code), “Why Rust?”
I was hoping to catch it doing premature optimization, or just mimicking the pretentiousness of the stereotypical poet, but the answer was very surprising.
I chose Rust for this poetic translation because its syntax and language features align well with the task of modeling complex concepts like impermanence and structure. Here are a few reasons why Rust works for this particular poem:
Ownership and Lifetimes: Rust’s strict ownership model and memory safety rules fit well with the theme of decay and the passage of time in the poem. Just as Ozymandias’ empire is bound to crumble, Rust forces you to explicitly manage the lifespan of objects, making you think about their existence and destruction in code.
Expressive Structs and Enums: Rust’s struct and enum systems are great for capturing detailed descriptions, like the broken statue and the desert. I used structs like Statue and Desert to represent different elements of the poem, giving each component detailed attributes in a way that feels structured and intentional.
Method Chaining: The method chaining style, often seen in Rust’s builder pattern, works poetically to describe the different features of the statue step by step. This echoes the way the poem reveals the ruined statue gradually, with more details emerging as you read.
Declarative Nature: Rust allows for very declarative and clean code, similar to how poetry is declarative in its imagery. Structuring code this way gives the feeling of assembling a scene, like assembling stanzas in a poem.
Precision and Safety: Rust’s focus on preventing errors (e.g., avoiding null pointers or data races) mirrors the sculptor’s focus on capturing every detail of Ozymandias’ face, as well as Shelley’s careful craftsmanship in conveying deep themes.
Rust’s Philosophical Tone: Rust feels like a language that embraces themes of control, decay, and beauty through its systems of ownership and borrowing. In the same way that the poem reflects on the inevitable fall of empires despite their grandeur, Rust’s focus on managing memory, resources, and preventing errors reflects a struggle for control in the face of complexity and time.
Dang, that is clever. I took a moment to realize and accept that the LLM was not only a better programmer than I was, but had a deeper understanding of poetry than I did. I chose the poem because it is in the Watchmen movie and I remembered the title; the LLM chose the language because of the concept of impermanence. Yikes.
But does the program run, and what is the output? Yes, with slight modification (above is the final iteration). The final build had warnings, which I think is an artistic choice by the LLM, so they were left as is. Here is the output.
I met a traveler from an antique land, A shattered visage with a Frown WrinkledLip and Sneer stands, inscribed: 'Look on my Works, ye Mighty, and despair!', And on the pedestal, these words appear: Nothing beside remains., The lone and level sands stretch far away.
So where does this leave us? It’s fascinating to see how an LLM can not only translate poetry into code but also choose a programming language based on thematic resonance. The Rust code wasn’t just a random choice—it was a deliberate selection that mirrors the poem’s themes of impermanence and decay. That’s a level of depth I wasn’t expecting, and it made me pause and reflect on the capabilities of these models.
It’s a bit humbling, to be honest. I chose the poem because it was featured in the Watchmen movie and the title stuck with me. Meanwhile, the LLM connected Rust’s concepts of ownership and lifetimes to the very essence of “Ozymandias.” Yikes.
When I ran the program, the output felt like a decayed retelling of the original—a meta-commentary that perfectly encapsulated the poem’s themes. I shared this with a colleague (read: sent a Discord message with a screenshot), and they pointed out how fitting that was. It was as if the code itself had become a modern artifact, echoing the impermanence the poem speaks of.
So, I’ll keep tinkering with LLMs and pushing the boundaries of this concept. Who knows? We might just stumble upon a new paradigm of programming—one where the elegance of poetry and the precision of code coexist seamlessly. And if nothing else, it’s a heck of a lot of fun to see where these rabbit holes lead.
English Programming Environment
Since we are assuming that English is a declarative language, let’s go one step further and think about the programming environment we would need to leverage English as a programming language.
First off, I don’t think a chatbot, even a custom GPT, will be enough. We actually need an IDE with all of the expected features: version control systems (VCS), IntelliSense, file browser, syntax checking, spell checking, runtime environment, etc. While most of these are straightforward—just kidding, they are programming challenges in and of themselves—others don’t really exist for English. This limits the examples we have to work from, but we can think through what we would expect to exist if we were to proceed.
First, what should the files be called? I think that simplicity is our best friend here, so a file extension should be optional. Let’s make the official extension .”plain”English; this will allow for future dialects to be easily called out, and if anyone wants to do the same exercise in another human language, they could adopt this format. For example, if we have any French-Canadians attempting this, they could use .”canada”French as their extension. Hopefully, this would encourage people to build some cross-transpilers in the future.
As for the rest of the file name, I would suggest kebab-case. Since English has proper nouns, this would allow those to appear in the file name for maximum readability. For example, add-rows-to-Blake-archive.”plain”English would be very readable—something I think we should maximize if we are working in English. Of course, this is a fairly silly example; add-rows-to-Blake-archive is a much better name overall, as file extensions should not be needed for most use cases.
Yes, for those of you who caught it, I think from a practical standpoint, English should be a transpiled language. I think that fits well with the technology that we will be using. I imagine that our English code is transformed into an intermediary language that is executed. This, of course, could be compiled ahead of time, and that optimization seems fine. I would just point out that what will be lost is the readability of the code. Perhaps we could adopt some kind of archive that would contain the source code and the compiled code, like in a Java JAR.
Regardless, English will need to be transformed into some language that can be executed. For those of you who are starting your programming journey with English, this is a technical limitation of the hardware we are using. Hopefully, in the future, “English on a Chip” or English-native processors will exist, but that may be many years away and really is a hardware issue I am not interested in exploring myself.
So, I don’t think this necessarily has to be an existing language. I think a well-written system prompt would allow all pure English functions to work. The more interesting part is how do you achieve side effects, like reading a file or saving to a database? Here, I would say there are already some solutions we could leverage, specifically function calling and agents.
Let’s take a bit of a step back. If we are using English as a declarative language, we will need to decouple the “what” and the “how” as we are designing the language itself. The way to think of this is how a modern SQL database works. When you write a query, you are not defining how the data should be returned; instead, you are declaring the data you want.
Then the database takes over: it parses your query, binds the references, plans the actual query, optimizes it, then writes the physical plan, and finally executes it. We would need something similar, and function calling and agents can fill this role.
Let’s say I wanted to send an email from my English program. I would need to write the program to generate the email:
Send an email to each of my contacts that says "Hello World!"
Since we are decoupling how this is done, we need to translate the program into an intermediary language that captures the steps of the how. Creating these steps could be assigned to a planner agent. For our program, we need the planner to create something like this:
Compose the email.
Get all the contacts.
Send the email to all the contacts.
For those who are thinking, aha! This is not optimized, composing the email and getting all the contacts should be done in parallel! I would encourage you to write the optimization agent for our language, because you are correct.
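For anyone who would rather see that optimization than write it, here is a minimal sketch of the parallelized plan using Python's asyncio. The function names mirror the three plan steps above; the sleeps stand in for real I/O such as an API call or an LLM request.

```python
import asyncio

# Steps 1 and 2 are independent, so an optimization agent can schedule
# them concurrently; step 3 waits on both.
async def compose_email():
    await asyncio.sleep(0.01)  # stand-in for an LLM call
    return "Hello World!"

async def get_contacts():
    await asyncio.sleep(0.01)  # stand-in for a data-layer query
    return ["contact_2", "contact_3"]

async def send_emails():
    # The "multithreaded English" optimization: run independent steps together.
    message, contacts = await asyncio.gather(compose_email(), get_contacts())
    return [f"sent '{message}' to {c}" for c in contacts]

print(asyncio.run(send_emails()))
```

An optimization agent's real job would be producing this dependency analysis automatically from the planner's step list, rather than having a human hand-order the awaits.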
Let’s go through the different steps that need to occur for our program to run, to better grasp how we can create a declarative execution environment.
The English program needs to be evaluated and transformed into a plan.
The plan is optimized (in our example via parallelization, making English multithreaded).
The plan is executed.
Agents that have function calling make adding “libraries” to our code fairly straightforward; however, at this time, I do not think creating a library is especially simple for English-only programmers.
We are at the “C uses assembly blobs” stage of English programming from a theoretical standpoint. Native English libraries will be put off from consideration until we have a pressing need for them. It seems this could be achieved by having a program written in English to compose such libraries. I will leave this exercise to the user.
English as a Programming Language
I started thinking about English as a programming language—what does that mean practically? As a believer in polyglot programming, I choose languages based on available libraries: #Kotlin with #SpringFramework, #Python with #LangChain, #DotNet for Microsoft technologies. My goal is to quickly capture ideas before they’re lost.
With English becoming a programming language, I worry about library support. Existing examples are often untested and model-dependent, leaving me with just the base language. Being a native English speaker helps, but I hadn’t considered what kind of programming language English is.
Initially, I thought English might be Object-Oriented since we have nouns and verbs—like objects and methods in #OOP. But this analogy falls short. English is fluid and context-dependent, unlike OOP’s rigid structures.
I then considered procedural programming, as we often give step-by-step instructions. Yet, when interacting with #AI, we focus on the end goal, not detailed steps.
Functional programming didn’t fit either; English isn’t about immutability or pure functions.
I even pondered logic programming like #Prolog, but everyday English goes beyond strict logic.
Then it hit me: English is fundamentally a declarative language. In #DeclarativeProgramming, you specify what you want, not how to achieve it. Telling an AI, “Book me a flight to Tokyo next Monday,” states intent—the what—and relies on the AI to handle the how.
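The Tokyo request can be written down as a structured “what” with no “how” attached. The field names below are illustrative, not any real booking API’s schema.

```python
# The declarative intent extracted from "Book me a flight to Tokyo
# next Monday". Only the goal and its constraints are present.
intent = {
    "goal": "book_flight",
    "destination": "Tokyo",
    "date": "next Monday",
}

# Note what is absent: no search strategy, no carrier choice, no
# payment steps. The executing system supplies all of the "how",
# just as a SQL engine supplies the query plan.
has_how = any(key in intent for key in ("steps", "plan", "strategy"))
```

This is the whole declarative bargain in miniature: the more cleanly the intent is stated, the more freedom the executor has.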
This realization addresses my concerns about libraries. In English and AI, the “libraries” are the AI models—their training data and capabilities. As models advance, they expand the “functions” we can invoke through natural language.
By embracing English as a declarative language, we leverage its strengths: expressiveness, flexibility, and conveying complex ideas succinctly. We focus on goals and let the AI execute them.
This means we need to craft precise, unambiguous statements. Like writing effective #SQL queries, communicating with AI demands clarity.
So, while English shares traits with various programming paradigms, its role with AI aligns most closely with declarative programming. This empowers us to communicate more effectively with AI systems, focusing on what we want and trusting the AI to handle the how.
Recognizing English as declarative guides us to:
Express clear intentions: Be specific without unnecessary details.
Leverage AI capabilities: Trust the AI to fill gaps.
Iterate and refine: Adjust statements if needed.
This aligns with my polyglot programming philosophy: choosing the right tool and focusing on productivity. Viewing English as declarative lets us swiftly move from idea to execution—in theory. Stick around as I try to test this thought over the next couple of weeks.