Finding string indices

I found a simple task that is surprisingly challenging for the OpenAI model. I pass a text in my prompt and instruct the model to output the starting and ending indices of a substring/phrase from that text. The completion usually contains badly wrong indices, and they often vary from run to run. I need the indices for a more complex task whose final result is to highlight parts of the text. I tried giving examples, but nothing seems to work.


As ChatGPT is a language model, it sometimes requires a bit of prompt engineering to get satisfactory results. ChatGPT predicts text; it is not a calculator or a programming language interpreter.

Having said that, you can specifically prompt ChatGPT to write a method, for example, in your favorite programming language, and the models can write some nice code from time to time.

See, for example:

You can see that ChatGPT makes an error and labels one Ruby code block as Python. I assume this is because Python also has a puts method? Anyway, it’s not important, as the code works.

So, basically, you can see how I do these things with OpenAI:

Prompt:

Write a Ruby method which accepts two strings as arguments, string 1 and string 2, and outputs the starting and ending indices of string 1 based on substring string 2.

Completion:

Here’s a simple method that accepts two strings as arguments, and outputs the starting and ending indices of the first occurrence of string 2 in string 1:

def find_substring_indices(string1, string2)
  start_index = string1.index(string2)
  return nil if start_index.nil?
  end_index = start_index + string2.length - 1
  [start_index, end_index]
end

For example:

puts find_substring_indices("hello world", "world") # Output: [6, 11]

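Always verify ChatGPT's output yourself, as a human :slight_smile: Note that the comment in the completion claims [6, 11], but the method actually returns 6 and 10: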

MacStudio$ rails c
Loading development environment (Rails 6.0.6.1)
irb(main):001:1* def find_substring_indices(string1, string2)
irb(main):002:1*   start_index = string1.index(string2)
irb(main):003:1*   return nil if start_index.nil?
irb(main):004:1*   end_index = start_index + string2.length - 1
irb(main):005:1*   [start_index, end_index]
irb(main):006:0> end
=> :find_substring_indices
irb(main):007:0> puts find_substring_indices("hello world", "world")
6
10
=> nil
irb(main):008:0> 
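
A note on reading the session above: `puts` prints each array element on its own line, which is why the result appears as 6 and 10 rather than as an array. A minimal sketch, restating the same method and using `p` (which prints the array's inspect form) plus a slice to confirm that the end index is inclusive:

```ruby
# Returns [start, end] for the first occurrence of phrase in text,
# where end is inclusive, or nil if phrase does not occur.
def find_substring_indices(text, phrase)
  start_index = text.index(phrase)
  return nil if start_index.nil?

  [start_index, start_index + phrase.length - 1]
end

result = find_substring_indices("hello world", "world")
p result # prints [6, 10]

# Slicing with the returned pair recovers the phrase, confirming the indices.
first, last = result
p "hello world"[first..last] # prints "world"
```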

Hope this helps.

@ruby_coder Thank you for your suggestion. The problem with this kind of solution is that it is not reliable enough. Even when given a correct function, the model is inconsistent in its answers. So I did it a different way, although it is not as elegant. Because I needed the text from the prompt to be divided into sections, I ask the model for the first and the last sentence of each section. From those I can compute the indices programmatically and use them later for text highlighting.
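
The approach described here could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the helper name `section_indices` and the sample text are hypothetical, and it assumes the model returns both sentences verbatim, exactly as they appear in the text.

```ruby
# Hypothetical helper: returns [start, end] (end inclusive) of the section
# that begins with first_sentence and ends with last_sentence.
# Returns nil if either sentence is not found verbatim in text.
def section_indices(text, first_sentence, last_sentence, search_from = 0)
  start_index = text.index(first_sentence, search_from)
  return nil if start_index.nil?

  # Search for the closing sentence only at or after the section start.
  last_start = text.index(last_sentence, start_index)
  return nil if last_start.nil?

  [start_index, last_start + last_sentence.length - 1]
end

text = "Cats sleep a lot. They also purr. Dogs bark. They fetch too."

p section_indices(text, "Cats sleep a lot.", "They also purr.") # prints [0, 32]
p section_indices(text, "Dogs bark.", "They fetch too.")        # prints [34, 59]
```

A nil result is a useful signal that the model paraphrased a sentence instead of copying it, so the request can be retried rather than highlighting the wrong span.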


I had the same problem. I tried hard, but could not get a proper result. Very strange. Let me know if you find a solution.

Use an Assistant / Thread. That way you have an ongoing conversation and the context is always clear. You can instruct the assistant to perform very specific tasks.
The only thing I noticed is that if you are not strict about what should be returned, chances are that the whole thread history is dumped back. That costs tokens and is often irrelevant, since you only want the latest part of the conversation.