What is an Agent? Let's stop the speculation

Nice discussion here, hi guys.

How about this definition:

Agent — an entity capable of acting based on its own decision regarding whether to act, which action to take, and which object(s) to act upon.

That would rule out a bunch of deterministic flows, software, automations, etc.

Let me know what you think.

PS : blackjack (answer 21) :rofl::rofl::rofl::rofl::winking_face_with_tongue:

1 Like

Which brings us to a subclass of agents:

Assistant — an agent that acts to support other agent(s) by contributing to the achievement of their goals.

That one is maybe 1 point over the head (blackjack is over)

1 Like

OK in my example for ‘Agent GIF’ I have this image

Weather

It is a little unclear… But the process is:

User requests connection to a weather API (in plain English)
My software requests the code and runs it retrieving the weather

Let’s say this process was installed in ‘Agent GIF’… A remote GIF posted here on the forum…

{
  "Name": "WeatherTripHelper",
  "ExecutionStrategy": "depth",
  "P": [
    /* 1 ─ pull live data */
    { "ID": 1, "Type": "Phout", "PID": 0, "Name": "GetWeather",
      "Code": {
        "Lang": "php",
        "Code": "$res = json_decode(file_get_contents(
                   'https://api.open-meteo.com/v1/forecast?latitude=' .
                   $INPUT['lat'] . '&longitude=' . $INPUT['lon'] .
                   '&current_weather=true')); 
                 $Data['Weather']['Data'] = $res->current_weather;"
      },
      "AlwaysExecute": "1",
      "Description": "Fetch current weather" },

    /* 2 ─ make a decision  (THIS is Serge’s autonomy) */
    { "ID": 2, "Type": "Decision", "PID": 0, "Name": "ChoosePlan",
      "Code": {
        "Lang": "php",
        "Code": "return ($Data['Weather']['Data']->precipitation > 0.2) 
                     ? 'Indoor' : 'Outdoor';"
      },
      "Description": "Decide trip type based on rain" },

    /* 3 ─ act on that decision */
    { "ID": 3, "Type": "Phout", "PID": 0, "Name": "SuggestTrip",
      "ExecuteIf": "$Prev == 'Outdoor'",
      "Code": {
        "Lang": "php",
        "Code": "$OUT = 'Nice weather! How about a riverside walk?';"
      }},
    { "ID": 4, "Type": "Phout", "PID": 0, "Name": "SuggestTripWet",
      "ExecuteIf": "$Prev == 'Indoor'",
      "Code": {
        "Lang": "php",
        "Code": "$OUT = 'Looks rainy—try the city museum instead.';"
      }}
  ],
  "CID": "1"
}

PS: ChatGPT o3 helped me craft this example. It is not syntactically correct, but it serves its purpose as a demonstration for this thread.
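For anyone who wants to poke at the same logic outside Phas, here is a rough Python equivalent of the three steps above. A sketch only: the Open-Meteo URL is real, but the precipitation field and the 0.2 threshold are taken from the pseudocode above — Open-Meteo's current_weather block may not actually expose a precipitation value, so treat that key as an assumption.

```python
import json
import urllib.request

def fetch_current_weather(lat, lon):
    """Step 1, 'GetWeather': pull live data from the Open-Meteo forecast endpoint."""
    url = ("https://api.open-meteo.com/v1/forecast?"
           f"latitude={lat}&longitude={lon}&current_weather=true")
    with urllib.request.urlopen(url) as res:
        return json.load(res)["current_weather"]

def choose_plan(weather):
    """Step 2, 'ChoosePlan': decide trip type based on rain.
    Assumes a 'precipitation' key, mirroring the thread's example
    rather than the real current_weather response."""
    return "Indoor" if weather.get("precipitation", 0) > 0.2 else "Outdoor"

def suggest_trip(plan):
    """Steps 3-4, 'SuggestTrip' / 'SuggestTripWet': act on the decision."""
    if plan == "Outdoor":
        return "Nice weather! How about a riverside walk?"
    return "Looks rainy—try the city museum instead."
```

E.g. `suggest_trip(choose_plan({"precipitation": 0.5}))` picks the indoor suggestion without ever touching the network.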

1 Like

That would basically break the definition of agent I suggested and classify it as a workflow…

1 Like

Initially, an ‘Agent’ was basically just a purpose-built subcomponent of a larger infrastructure, but like any other overused word or phrase, the term ‘Agent’ has essentially been turned into garbage that doesn’t mean anything anymore. It’s used as an ambiguous term for describing anything controlled by an LLM.

2 Likes

Definitely, some clear terms in a glossary would be great to have to help people visualize what is being said.

1 Like

This is a ‘Phout’… Output… In this context, ‘AlwaysExecute’ actually means: always run the code in the pane rather than use the cache.

There is another type, Phif (Conditional), in the spec that can absolutely work off information retrieved…

o3 tried to demo this with "ExecuteIf": "$Prev == 'Outdoor'", which conveys the same meaning, I think.

It’s important to highlight:

Phas is a tree of evaluated objects.

Every item in the tree is executed, and items can have several types, i.e.:

Container, Output Pane, Conditional, Loop

These GIFs also contain databases, tables, strings, and various data in a data array…

There is no reason they can't write data to this DB too, or act conditionally based on input received through the Phas client.
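To make the "tree of evaluated objects" idea concrete, here is a toy evaluator. This is a sketch only: the node types (Container, Phout, Phif) and the ExecuteIf idea are borrowed from the posts above; the real Phas spec surely differs, and the Loop type is omitted for brevity.

```python
def eval_tree(node, data, prev=None):
    """Walk a toy Phas-style tree depth-first, collecting outputs.

    Node types (borrowed from the thread, not the real spec):
      Container -> just evaluates its children
      Phout     -> produces output via its 'Run' callable
      Phif      -> conditional: runs only if 'ExecuteIf' returns True
    """
    outputs = []
    ntype = node.get("Type", "Container")
    if ntype == "Phif" and not node["ExecuteIf"](data, prev):
        return outputs  # condition failed: skip this node and its children
    if ntype in ("Phout", "Phif") and "Run" in node:
        prev = node["Run"](data, prev)
        outputs.append(prev)
    for child in node.get("Children", []):
        child_out = eval_tree(child, data, prev)
        outputs.extend(child_out)
        if child_out:
            prev = child_out[-1]  # later siblings see the previous result
    return outputs

# A miniature version of the weather flow:
tree = {
    "Type": "Container",
    "Children": [
        {"Type": "Phout",
         "Run": lambda d, p: "Indoor" if d["rain"] > 0.2 else "Outdoor"},
        {"Type": "Phif",
         "ExecuteIf": lambda d, p: p == "Outdoor",
         "Run": lambda d, p: "Suggest a riverside walk"},
    ],
}
```

Evaluating with rain below the threshold runs both panes; above it, the Phif pane is skipped.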

1 Like

I meant this would be something that decides whether to act, rather than being constrained by hard logic dictating act/not act, what to do, etc. Basically, you could pretty accurately predict what the agent will do, how, and so on.

But the moment you are no longer predicting but 100% sure, the thing stops being an “agent”, as there is no freedom of choice in there anymore.
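One way to phrase that distinction in code (purely illustrative; the `decide` callable here is a stand-in for an LLM call, not a real API):

```python
def workflow_step(rain):
    # Hard logic: the developer fully determines the outcome,
    # so the behaviour is 100% predictable from the input alone.
    return "Indoor" if rain > 0.2 else "Outdoor"

def agent_step(rain, decide):
    # The act/not-act choice is delegated to an external decider
    # (e.g. an LLM). The caller can no longer be *certain* of the
    # outcome from the input alone — only predict it.
    return decide(f"It's raining {rain} mm. Should we go out, and where?")
```

Calling `workflow_step(0.5)` twice always gives the same answer; `agent_step` gives whatever its decider chooses.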

2 Likes

So as long as the Agent GIF has something like:

Still getting AI to do the heavy lifting, but this rough flow was my source :)
Phas Tree: I want to go on holiday
AgentGIF (Phout): Here is a bunch of sites, let me know which you have whitelisted [Weather, Map, ...]
Phas Tree: [Weather, Map, ...]
AgentGIF (Phout): Connect to APIs and return data
Phas Tree : [Data]
AgentGIF (Phout): [Write data to db]
AgentGIF (Phif): (Possible here to see databases, act on results, run functions, follow the next determined path from the result)

ChatGPT's expanded version:

Phas Tree: “I want to go on holiday.”

AgentGIF (Phout): “Great. Which external services may I use?  
[Weather, Maps, HotelReviews]”

Phas Tree: “[Weather, Maps]”   ← user (or another agent) whitelists only these

AgentGIF (Phout):  • fetch live weather  
                    • pull nearby points‑of‑interest from Maps API  
                    (stores both in Data[])

AgentGIF (Decision):           ← **autonomy lives here**
    IF Data.Weather.precipitation > 0.2
          THEN $plan = "Indoor";       // museum, aquarium …
    ELSE $plan = "Outdoor";            // hiking trail, beach …

AgentGIF (Phout): suggests $plan to the user  
                  (and writes trip itinerary to DB if approved)

AgentGIF (Phif): on next pass it sees the DB entry;  
                 *could* book tickets automatically **or** stop,  
                 depending on user confirmation.
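The whitelist handshake in that flow is easy to sketch. The tool names are hypothetical stand-ins; the point is just that the agent can only ever call what the user (or another agent) approved:

```python
# Hypothetical service registry — stand-ins for real API connectors.
AVAILABLE_TOOLS = {
    "Weather": lambda q: f"(weather data for {q})",
    "Maps": lambda q: f"(points of interest near {q})",
    "HotelReviews": lambda q: f"(reviews for {q})",
}

def whitelist_tools(approved):
    """Keep only the services the user whitelisted."""
    return {name: fn for name, fn in AVAILABLE_TOOLS.items() if name in approved}

def call_tool(tools, name, query):
    """Refuse anything outside the whitelist."""
    if name not in tools:
        raise PermissionError(f"{name} is not whitelisted")
    return tools[name](query)
```

Whatever the agent decides later, it is structurally unable to reach HotelReviews if only Weather and Maps were approved.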

Then it is an Agent?

The Intent lives in the GIF…

If you can compose an agent setup from your app, you can also convey one from an external source, provided you take on its evaluation…

There is obviously the concern of ceding control, which could conflict with OpenAI's terms, but with properly sandboxed actions this isn't necessarily an issue.

1 Like

If/then logic in code is determined by the developer. So for me it is still a flow and not an agent.

2 Likes

OK so can you give me an example, maybe change the example above?

What line would you add?

And I am not just copy-pasting ChatGPT… Good examples are a lot of typing; it's just helping me with the heavy lifting :slight_smile:

1 Like

I think the question is maybe better put…

Could you reasonably sanitize that injection from a GIF?


Edit: How about this very simplistic example…

This example assumes the loader supports Phasm prompts

Description: Choose An Appropriate Color (User Is A (Gender))

1 Like

I think it needs something about operating recursively, or until a task is complete/fails

1 Like

I even describe services that don't use an LLM but use MCP as an interface as agents now. :face_with_peeking_eye:

2 Likes

While AgentGIF can't process code itself, it could decide to offload tasks to external services, using your systems as a proxy…

Get this… Agent GIF could create an MCP Server…

Agent GIF is a portable, self-contained agent in an image ^^… Send it in an email, upload it to a forum, whatever… I think you could even encrypt and password-protect it? Agents for sale?

Quadrillions of encrypted intelligent GIFs buzzing around the internet.

ChatGPT's take…

┌─ AgentGIF
│ ├─ visible pixels (fun animation)
│ └─ AGENTGIF000 (gz‑compressed Phas tree)
│ • Pane 1 → spin up tiny MCP server (127.0.0.1:7400)
│ • Pane 2 → expose “/tools”:
│ - weather.get
│ - maps.poi
│ - agent.exec (run another pane)
│ • Pane 3 → policy / capability manifest
└─────────────► runtime unpacks → launches server
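The "agent in an image" trick itself is simple: a GIF ends with a 0x3B trailer byte, and viewers stop reading there, so you can append a marker plus a gz-compressed payload and still have a valid image. A minimal sketch — the AGENTGIF000 marker comes from the diagram above, and the "Phas tree" here is just a JSON stand-in:

```python
import gzip
import json

MARKER = b"AGENTGIF000"  # marker name taken from the diagram above

def embed_agent(gif_bytes, tree):
    """Append a gz-compressed tree after the GIF trailer (0x3B).
    Viewers stop at the trailer, so the image still displays normally."""
    payload = gzip.compress(json.dumps(tree).encode())
    return gif_bytes + MARKER + payload

def extract_agent(data):
    """Find the marker and decompress whatever follows it."""
    idx = data.rfind(MARKER)
    if idx == -1:
        return None  # plain image, no embedded agent
    return json.loads(gzip.decompress(data[idx + len(MARKER):]))

# Smallest possible stand-in for a real GIF: header, padding, trailer byte.
fake_gif = b"GIF89a" + b"\x00" * 10 + b"\x3b"
blob = embed_agent(fake_gif, {"Name": "WeatherTripHelper"})
```

The runtime's job is then exactly the last arrow in the diagram: unpack, then decide whether to evaluate.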

1 Like

If an agent has Agency and that Agency can be copied, this is clearly the state of future employment.

Jochen is right

But how do we ENGAGE with that agency, as we become overseers of vast automated and in some cases non-deterministically jointed processes?

If we wish humans to have Agency then we all have to be able to engage with AI.

I don’t want to trash ‘Agent GIF’… An executable conceptual engine in a GIF…

If the forum is full of bots and coders then in reality this is a forum Augmentation that should be SUPER EASY!

Does that window build trust, or better, a balanced understanding of AI?

“What is an Agent?” - Sounds awfully Philosophical ^^

‘Agent GIF’ is “What is THIS Agent?” - What is its intent?

Sounds like something one might share and discuss on the OpenAI Developer Forum. It might stop the speculation… :slight_smile:

If we had a local chat and used it to fill a graph with our interests and skills, we could put up an MCP RAG server to expose whatever we want from that.

Which technically makes our digital twin an Agent which could take orders from other Agents - paid with smart contracts…

And even better if there is a semantic exploration method which allows us to expose our skills and interests, but anonymously.

I mean, wouldn't it be cool if we chat and the AI says: here is someone with the same interest, working on exactly the same thing atm. Why don't you invite them into a group chat…
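The "find someone working on the same thing" part could start as nothing fancier than cosine similarity over interest embeddings. The vectors and IDs below are hypothetical; a real system would embed chat history with a model first:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def find_matches(me, others, threshold=0.9):
    """Return anonymous IDs whose interest vector overlaps enough with mine."""
    return [uid for uid, vec in others.items() if cosine(me, vec) >= threshold]
```

The IDs can stay pseudonymous, which is exactly the anonymity point above: you learn *that* someone overlaps before learning *who* they are.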

4 Likes

That's an excellent idea. How do we implement it?

What is needed from each member?

2 Likes

I am currently just waiting for OpenAI's announced open-source model to make it the backbone of that thing.

Semantic DNS: I thought maybe a couple of convex hulls around the vector space of the local data.

Expose an alpha-smoothed polytope set to a decentralized mother node, and the semantic DNS could be used, maybe with, hmm, Hausdorff distance search, to find the % of overlap?

That could also be used to automatically find MCP agents…
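For the overlap search, the Hausdorff distance between two finite point sets is easy to sketch in plain Python (a real node would run this over hull vertices, and SciPy already ships `scipy.spatial.distance.directed_hausdorff` for the heavy lifting):

```python
import math

def _dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def directed_hausdorff(A, B):
    """Max over points in A of the distance to their nearest point in B."""
    return max(min(_dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets:
    small value -> the sets (e.g. two agents' interest hulls) nearly coincide."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

A distance of 0 means the sampled hulls coincide exactly; thresholding this gives a crude "% of overlap" signal for matching nodes.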

Sounds complex… but I have been working on that for quite some time now, including it in the PECL extension I am still working on.

But I mean, you can train local agents just by solving stuff in a chat/group chat, and the training data belongs to you. Your agent solves stuff and gets paid.

And nobody has to do work twice…

just a little vision.

Who knows.

1 Like

In that context everything that can solve a task is an agent.

And I also have a SETI@home fantasy where the whole network shares compute, and people can earn just by letting their GPU do stuff when it is idle

which makes it runnable on a phone haha

2 Likes