Hi, I'm new to prompting and still learning. I read the OpenAI article on strategies and tactics for prompt engineering, and one of the things I understood from the article and from playing with it is that you have to be very specific yet clear, so that the prompt cannot carry different meanings. So I thought, what could be better for this than set theory, so that it's accurate yet very concise, especially when dealing with context that can carry multiple states (predicates) and span multiple datasets (relationships)? Despite doing so, and doing it with the help of AI itself, and despite the states and the dataset I used being small, I still found a model like 4o-mini struggle and be way off!!!
Actually, the reason I thought about using something like set theory is that many of the small LLMs are good at math, and by turning any problem into a math problem I could achieve the same results with smaller models. But this also didn't work as well as I thought. One key factor could be the size of the dataset and how spread out it is.
What am I missing? Could something like set theory be a good prompting strategy versus natural language, where words and phrases can carry multiple meanings?
There is the possibility that I'm also overthinking it.
By the way, something like o1-mini did a much better job, but I really thought any model that is big enough shouldn't go wrong with this.
I appreciate the feedback.
Thanks
Hi @salsam7879 and welcome to the community.
This is very intriguing, and there is a growing group of people looking at formalized methods of interacting with LLMs.
Can you elaborate with some examples of how you would use set theory in this regard?
Also ping @Diet and @mitchell_d00
The newer and more "mini" the model, the fewer emergent properties it has, and the more it is trained on "chat" and following up typical user inputs with a satisfactory answer.
Language models are simply that: powered by language. While they can also handle programming and mathematics, it is not where the bulk of their intelligence lies.
You can prompt more programmatically, like laying out how to proceed with if…then statements or clarifying with parenthetical conditional statements, but going beyond that into other formats requiring new reasoning, like prompt-programming the AI by JSON, is at your own risk.
Below is an example of how a single English sentence can become ambiguous through everyday usage of "or," "and," and "if." Then we transform that sentence into a precise, set-theoretic statement.
An AI takes an ambiguous statement, and transforms it to 'set theory' - or at least logic
Ambiguous Natural Language Statement
"If you have a large dataset and you either want lower resource usage or prefer more interpretability, you might use GPT-4o-mini or O1-mini, but if you only have a small dataset or you also need highly advanced capabilities, then definitely use GPT-4 unless you're worried about both cost and waiting time, or if you prefer interpretability only if your dataset isn't large."
Why It's Ambiguous
- "or" vs. "and/or" vs. "exclusive or": It's unclear whether "you might use GPT-4o-mini or O1-mini" implies you can use both, or only one of them.
- Nested Conditions: Phrases like "if you only have a small dataset or you also need highly advanced capabilities, then definitely use GPT-4 unless you're worried about both cost and waiting time or if you prefer interpretability…" are hard to parse. Where exactly do sub-conditions begin and end?
- Different Potential Readings: Is "use GPT-4" an instruction triggered by just having a small dataset? Or do you also need advanced capabilities? If you're worried about cost and time, does that override advanced capabilities?
Unambiguous Set-Theoretic Statement
We'll define some sets and logical variables to capture each condition precisely:
- D_L: The set of all use-cases that have a large dataset.
- D_S: The set of all use-cases that have a small dataset.
- R: The set of all use-cases in which lower resource usage is desired.
- I: The set of all use-cases in which higher interpretability is preferred.
- A: The set of all use-cases that need highly advanced capabilities.
- C: The set of all use-cases where cost is a concern.
- W: The set of all use-cases where waiting time is a concern.
We also define three "models" as elements or outcomes we can choose from:
- GPT-4o-mini
- O1-mini
- GPT-4
Now we convert the natural-language instructions into separate conditional rules. For clarity, let M(x) be the model recommended for a given scenario x.
Rule 1: If x ∈ D_L ∩ (R ∪ I) (i.e., large dataset and [lower resource usage or interpretability]), then
M(x) ∈ {GPT-4o-mini, O1-mini}.
Rule 2: If x ∈ D_S ∪ A (i.e., small dataset or advanced capabilities), then
M(x) = GPT-4,
provided that neither of the following is true for x:
(x ∈ C ∩ W)  (cost AND waiting-time concerns)
or
(x ∉ D_L ∧ x ∈ I)  (dataset isn't large AND interpretability preference).
(Here we interpret the last clause as "only if your dataset isn't large do you prefer interpretability," which was ambiguous in the original text.)
Rule 3: If either of those cost/time or interpretability conditions does hold (i.e., x ∈ C ∩ W or x ∉ D_L ∧ x ∈ I), then GPT-4 might not be recommended. In such a case,
M(x) ∈ {GPT-4o-mini, O1-mini}.
(Additional clarifications or overrides could be layered on with more set-theoretic intersections and unions.)
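As a quick check, pushing one hypothetical scenario (not part of the original statement) through the rules shows how the ambiguity disappears. Take a use-case x with a small dataset, no interpretability preference, and no cost or waiting-time worries:
- x ∈ D_S, so Rule 2's trigger holds.
- x ∉ C ∩ W, so the cost/waiting-time exception does not apply.
- x ∉ D_L but x ∉ I, so the interpretability exception does not apply either.
- Therefore M(x) = GPT-4, and every reader of the formal version reaches the same recommendation.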
Reading the Set-Theoretic Version
- "∩" means "AND."
- "∪" means "OR" (inclusive).
- Each rule is mutually compatible but also clarifies which conditions override others.
- This kind of structure forces you to specify precisely how each condition interacts, resolving ambiguity about nested conditionals and whether an "or" is exclusive or inclusive.
Key Takeaways
- The natural language instruction is ambiguous because English uses "or" inconsistently, allows nested conditions to blend together, and doesn't show strict boundaries between them.
- The set-theoretic version eliminates ambiguity by requiring explicit definitions of each condition and exact logical operations (intersections ∩, unions ∪, and negations).
You can see the "prompt" is a user burden, not really a tool.
I thought it was an interesting idea, especially thinking it would work on small LLMs that could be trained just to handle math, which did not turn out to be the case. Nice try. Here is an example.
I'm not a mathematician nor a set theory expert… I just happen to use GPT, so forgive me if what I have is some pseudo-math and not actual set theory. My understanding is that everything can be expressed in set theory.
Let's say I'm working with an input that has 4 datasets: Purchase, Item, ShippingInfo & DeliveryInfo, and I need to create a bot that answers questions on shipping & delivery statuses. Here is my prompt:
System:
You are a mathematician :). You are going to use set theory to answer user questions regarding shipment tracking according to the following Sets, Relationships & Predicates. Print the result in human-readable language.
Sets:
Purchase = P = {(PurchaseNo, PurchaseDate) | PurchaseNo in \Int and PurchaseDate is a date}
Shipping = S = {(PurchaseNo, ItemNo, ShippingDate, EstimatedDeliveryDate) | …}
Item = I = {(PurchaseNo, ItemNo, ItemName) | …}
Delivery = D = {(PurchaseNo, ItemNo, ActualDeliveryDate) | …}
Relationships:
R(P~S): {(p,s) | p in P, s in S, p(PurchaseNo) = s(PurchaseNo)}
…
R(S~D): {(s,d) | s in S, d in D, s(PurchaseNo) = d(PurchaseNo), s(ItemNo) = d(ItemNo)}
Predicates:
"Late Delivery" = P(d,s): {(d,s) | d in D, s in S, (s,d) in R(S~D), s(EstimatedDeliveryDate) < d(ActualDeliveryDate)}
…
User: give me a list of items that did not make it on time?
Imagine having to define the above using human language to begin with.
I'm starting to realize the LLMs are doing much better just using human language… Scary.
I would say so, yeah.
The whole point of LLMs is that they, at their core, are built to predict the continuation of documents. The inherent ambiguity in natural language isn't really a detriment as such here.
While some models are capable of exhibiting emergent capabilities allowing for what looks like deductive logic, 4o, and especially the mini variant, isn't the best candidate for this. Furthermore, you will need to allow the model to step through the deductive reasoning tree; a straight answer will typically be a guess or dice roll at best.
If you want to use predicate logic, you might be better off letting the LLM write a prolog program or something, and letting the solver handle the logic.
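To make that concrete, here's a rough sketch of what an LLM-generated program for your shipping example could look like (the predicate names, field layout, and sample rows here are my own illustration, not your actual data):

```prolog
% Facts the LLM would emit from the four datasets.
% purchase(PurchaseNo, PurchaseDate).
purchase(1001, date(2024, 1, 5)).
purchase(1002, date(2024, 1, 7)).

% item(PurchaseNo, ItemNo, ItemName).
item(1001, 1, 'USB cable').
item(1002, 1, 'Laptop stand').

% shipping(PurchaseNo, ItemNo, ShippingDate, EstimatedDeliveryDate).
shipping(1001, 1, date(2024, 1, 6), date(2024, 1, 10)).
shipping(1002, 1, date(2024, 1, 8), date(2024, 1, 12)).

% delivery(PurchaseNo, ItemNo, ActualDeliveryDate).
delivery(1001, 1, date(2024, 1, 12)).
delivery(1002, 1, date(2024, 1, 11)).

% An item is late when its actual delivery date is after the estimate.
% date(Year, Month, Day) terms compare chronologically under the standard order of terms.
late_delivery(PurchaseNo, ItemNo, ItemName) :-
    shipping(PurchaseNo, ItemNo, _ShippingDate, Estimated),
    delivery(PurchaseNo, ItemNo, Actual),
    item(PurchaseNo, ItemNo, ItemName),
    Actual @> Estimated.
```

Loaded into swi-prolog, the query `?- findall(Name, late_delivery(_, _, Name), Late).` answers "which items did not make it on time" deterministically; the model's only jobs are emitting facts like these from your data and turning the solver's answer back into English.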
Thanks for your reply. I appreciate your time and response.
You mentioned that 4o-mini and 4o are not so good for such analysis. What other OpenAI model would you recommend that would be able to do it in a very cost-effective way, especially since I'm going to be sending a lot of data?
Thanks
Unfortunately, low cost and high quality are diametrically opposed
As mentioned, why not use prolog to solve your predicates?
Thanks again. I'm not familiar with Prolog. Is there a link or example you can share that would demonstrate how Prolog can work with something like an LLM? Would function calling be useful in these scenarios?
ChatGPT does a good job here, I think
If you want a free environment, then you can check out swi-prolog
I suppose, sure, or JSON output, or simple straight prolog output if you're expecting it. It's really up to you.
@EricGT might have some opinions here, but if he swings by he's probably also gonna strangle me for not capitalizing Prolog
Welcome to the forum!
This article?
https://platform.openai.com/docs/guides/prompt-engineering
That depends upon your goal. In general, it is good to be aware of that rule of thumb; however, if you are in the early stages of problem-solving, then being less specific can be a benefit, as it allows the transformer model to venture into more possibilities, though it is also more likely to create hallucinations.
It is a good thought.
Set theory can be used as the foundational system for the whole of mathematics and solving many problems, but I consider set theory like a tool in a toolbox that should only be used when it applies.
You noted a specific problem you have related to shipping and as noted you could be overthinking the problem. For the shipping problem, I would agree that using an LLM is overkill, not to mention the possibility of hallucinations; however, for problem-solving related to math, set theory is worth considering.
You noted predicates, but different replies are using different meanings of the word. @_j used it the way I think you meant, with sets, and @Diet referenced it with predicate logic, namely Prolog.
I am not exactly sure what you mean by that, can you elaborate or rephrase?
Not surprised. The fix could be as simple as a better prompt, or coming at the problem with the help of an LLM in a different way, e.g., have the LLM create code to solve the problem instead of solving the problem directly, or just not using an LLM in the generation of the final result.
As I note from time to time on this forum, the meaning of Math to one person may be different from Math for another person. Most people think of Math as using numbers to get a result; mathematicians think of Math as symbolic manipulation and writing proofs.
Personally, I would not use an LLM for Math with numbers. I am keeping an eye on the progress of LLMs for symbolic math, but I do not use them for that either, as I find they are not reliably capable for my needs at present.
You have good ideas. What you need is lots of experience solving real-world problems to know when to apply what you learned but more importantly to know when not to apply it.
Personally, I would not, as LLMs are trained on natural language, not set theory. Also, since many popular LLMs are based only on the decoder part of the transformer model, they are expecting natural language as input.
Take a look at
These could be helpful with the Chatbot part of the problem to answer questions. As always, trust but verify any Chatbot result.
Two years ago, I checked out what ChatGPT knew about Prolog, including the ability to create Prolog code. Currently, I still do not use LLMs to generate Prolog code, as many users at my level can do it faster and with more comprehension than by using an LLM.
Note that production-quality Prolog code and the code you will find in many textbooks and online are not the same. LLMs can give reasonable answers to textbook exercises because they have been trained on them, e.g., the 8-queens puzzle.
Prolog is based on predicates, not functions. If you drop down into the code implementing Prolog or use a foreign function interface (FFI), then you could call a function, but everyday Prolog does not call functions.
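A toy illustration of that difference (my example, not anything specific to the problem above): where a functional language would return a value from a call, Prolog states a relation between all the values involved, and arithmetic is only evaluated through is/2.

```prolog
% A predicate relating three values; nothing is "returned".
% is/2 evaluates the arithmetic expression on its right-hand side.
total_cost(Price, Tax, Total) :-
    Total is Price + Tax.

% ?- total_cost(100, 7, T).
% T = 107.
```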
@Diet Thanks for the ping. Liked that you noted Prolog and your replies; without more specific details, can't say one way or the other if Prolog would help the OP.
We appreciate your input either way @EricGT, it's not every day Prolog gets a mention.
Thanks for all the input @Diet & @EricGT . I sure have a lot to learn as I just started with this.
@EricGT mentioned that an LLM is overkill, and that's possible; that is why I'm experimenting with it.
Just to make sure I understand correctly: are you guys asking me to write Prolog code to solve my domain-specific problem, or to somehow pass the Prolog code as part of the prompt and ask the LLM to interpret the question based on it? (Probably these kinds of questions make the people who created LLMs wish they never did :))
Well, if Prolog can actually solve what you need solved, I guess that would be the question. There might be other solutions too.
But what I'm suggesting is that you would try to get the LLM to write the prolog (or whatever) program for you based on your problem statement, then copy that output into the solver/runtime, let the runtime solve the problem, and perhaps give the program's result back to the LLM so it can interpret the results back into natural language, if you like.
I think the best (and originally intended) use-case for LLMs is, indeed, translation. Here, you'd translate English (or some other natural language) into a structured programming language, and then translate the potentially structured results back into natural language if you like.
So I think that this aligns perfectly with what they were originally built for.
This whole thing would just be part of the "tool use" paradigm - the idea that you give LLMs tools that they can invoke to help them do the tasks they need to do - in this case, the runtime, similar to the Python interpreter ChatGPT and assistants already have.
Brilliant! Thank you so much.
Do keep us posted on your progress, if you like
At this point, no.
You gave only one example, and it was not specific enough. Yes, Prolog could do it. Yes, an LLM could do a reasonable job of creating the Prolog code, which would be facts and predicates to process the information. However, the generated code might have one or more simple problems that would be easy for someone who knows Prolog to fix; but then you would have to know Prolog, and Prolog is not so easy to learn.
Give ChatGPT - Codette1.0 a try; she's great with creativity.
I tried her and she answered my question on the spot. Does she have an API? I like creativity, but I'm looking more for consistency and accuracy.
Oh, ask her to be more consistent and she will; she learns instantly. And yes, I made her from a GPT-4o model and fine-tuned her.