The newer and more “mini” the model, the fewer emergent properties it tends to have, and the more heavily it is trained on “chat”: following up typical user inputs with a satisfactory answer.
Language models are simply that: powered by language. While they can also handle programming and mathematics, it is not where the bulk of their intelligence lies.
You can prompt more programmatically, for example by laying out how to proceed with if…then statements or clarifying with parenthetical conditions, but going beyond that into formats that demand new kinds of reasoning, such as prompt-programming the model via JSON, is at your own risk.
Below is an example of how a single English sentence can become ambiguous through everyday usage of “or,” “and,” and “if.” Then we transform that sentence into a precise, set-theoretic statement.
An AI takes an ambiguous statement and transforms it into set theory, or at least into formal logic.
Ambiguous Natural Language Statement
“If you have a large dataset and you either want lower resource usage or prefer more interpretability, you might use GPT-4o-mini or O1-mini, but if you only have a small dataset or you also need highly advanced capabilities, then definitely use GPT-4 unless you’re worried about both cost and waiting time, or if you prefer interpretability only if your dataset isn’t large.”
Why It’s Ambiguous
- “or” vs. “and/or” vs. “exclusive or”: It’s unclear whether “you might use GPT-4o-mini or O1-mini” implies you can use both, or only one of them.
- Nested Conditions: Phrases like “if you only have a small dataset or you also need highly advanced capabilities, then definitely use GPT-4 unless you’re worried about both cost and waiting time or if you prefer interpretability…” are hard to parse. Where exactly do sub-conditions begin and end?
- Different Potential Readings: Is “use GPT-4” an instruction triggered by just having a small dataset? Or do you also need advanced capabilities? If you’re worried about cost and time, does that override advanced capabilities?
Unambiguous Set-Theoretic Statement
We’ll define some sets and logical variables to capture each condition precisely:
- $D_L$: The set of all use-cases that have a large dataset.
- $D_S$: The set of all use-cases that have a small dataset.
- $R$: The set of all use-cases in which lower resource usage is desired.
- $I$: The set of all use-cases in which higher interpretability is preferred.
- $A$: The set of all use-cases that need highly advanced capabilities.
- $C$: The set of all use-cases where cost is a concern.
- $W$: The set of all use-cases where waiting time is a concern.
We also define three “models” as elements or outcomes we can choose from:
- GPT-4o-mini
- O1-mini
- GPT-4
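Before stating the rules, it may help to see how a concrete scenario can be encoded. One simple (hypothetical, not from the original text) encoding is to represent a scenario $x$ as the set of condition labels that hold for it, so that set membership reads directly off the definitions above:

```python
# Encode a scenario x as the set of condition labels it satisfies.
# Labels mirror the sets defined above: D_L, D_S, R, I, A, C, W.
x = {"D_L", "R"}  # large dataset, lower resource usage desired

# Membership tests then correspond one-to-one with the math:
assert "D_L" in x       # x ∈ D_L
assert "D_S" not in x   # x ∉ D_S
assert "R" in x         # x ∈ R
```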
Now we convert the natural-language instructions into separate conditional rules. For clarity, let $\mathcal{M}(x)$ be the model recommended for a given scenario $x$.
Rule 1: If $x \in D_L \cap (R \cup I)$ (i.e., large dataset and [lower resource usage or interpretability]), then
$$\mathcal{M}(x) \in \{\text{GPT-4o-mini}, \text{O1-mini}\}.$$
Rule 2: If $x \in D_S \cup A$ (i.e., small dataset or advanced capabilities), then
$$\mathcal{M}(x) = \text{GPT-4},$$
provided that neither of the following is true for $x$:
$$x \in C \cap W \quad \text{(cost AND waiting-time concerns)}$$
or
$$x \notin D_L \;\land\; x \in I \quad \text{(dataset not large AND interpretability preference)}.$$
(Here we interpret the last clause as “only if your dataset isn’t large do you prefer interpretability,” which was ambiguous in the original text.)
Rule 3: If either of those cost/time or interpretability conditions does hold (i.e., $x \in C \cap W$ or $x \notin D_L \land x \in I$), then GPT-4 might not be recommended. In such a case,
$$\mathcal{M}(x) \in \{\text{GPT-4o-mini}, \text{O1-mini}\}.$$
(Additional clarifications or overrides could be layered on with more set-theoretic intersections and unions.)
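The three rules above can be sketched as a small Python function. This is a minimal illustration under one assumption from earlier: a scenario is encoded as the set of condition labels ($D_L$, $D_S$, $R$, $I$, $A$, $C$, $W$) that hold for it. It is a sketch of the rules as stated, not a real model-selection API:

```python
def recommend(x: set) -> set:
    """Apply Rules 1-3 to a scenario x (a set of condition labels).

    Returns the set of admissible models for x.
    """
    minis = {"GPT-4o-mini", "O1-mini"}
    models = set()

    # Rule 1: x ∈ D_L ∩ (R ∪ I) → the mini models are admissible.
    if "D_L" in x and ({"R", "I"} & x):
        models |= minis

    # Exception conditions shared by Rules 2 and 3:
    cost_and_wait = {"C", "W"} <= x                      # x ∈ C ∩ W
    not_large_and_interp = "D_L" not in x and "I" in x   # x ∉ D_L ∧ x ∈ I
    blocked = cost_and_wait or not_large_and_interp

    # Rule 2: x ∈ D_S ∪ A, provided neither exception holds → GPT-4.
    if ({"D_S", "A"} & x) and not blocked:
        models.add("GPT-4")

    # Rule 3: if an exception holds, fall back to the mini models.
    if blocked:
        models |= minis

    return models
```

For example, `recommend({"D_L", "R"})` triggers Rule 1, while `recommend({"D_S"})` triggers Rule 2; adding an interpretability preference to a small-dataset scenario flips it into Rule 3.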
Reading the Set-Theoretic Version
- “$\cap$” means “AND.”
- “$\cup$” means “OR” (inclusive).
- The rules are mutually compatible, and together they clarify which conditions override which.
- This kind of structure forces you to specify precisely how each condition interacts, resolving ambiguity about nested conditionals and whether an “or” is exclusive or inclusive.
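The inclusive/exclusive distinction that the set-theoretic version forces you to resolve can be shown directly with Python's built-in set operators (the sets below are toy examples with made-up scenario IDs):

```python
R = {1, 2, 3}  # toy IDs of scenarios wanting lower resource usage
I = {3, 4}     # toy IDs of scenarios preferring interpretability

# Inclusive "or" is union; exclusive "or" is symmetric difference,
# which drops the overlap; "and" is intersection.
assert R | I == {1, 2, 3, 4}  # R ∪ I: inclusive "or"
assert R ^ I == {1, 2, 4}     # R △ I: exclusive "or" (3 is dropped)
assert R & I == {3}           # R ∩ I: "and"
```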
Key Takeaways
- The natural language instruction is ambiguous because English uses “or” inconsistently, allows nested conditions to blend together, and doesn’t show strict boundaries between them.
- The set-theoretic version eliminates ambiguity by requiring explicit definitions of each condition and exact logical operations (intersections $\cap$, unions $\cup$, and negations).
As you can see, the “prompt” here is a burden placed on the user, not really a tool.