Summarizing 3000 rows: why is it so expensive?

Hi, I am completely new both to code and GPT3.

My Google Sheet has 3000 rows of companies description (approximately 100 words each).
I have a script in Google Sheet to link it with GPT3.5 Turbo via API.

I want to use the GPT3 to :

  • rewrite description in professional terms
  • provide 7 keywords per description.

All in all, each description should represent approximately an input of 100 words and output of 100 + 7 words, eg. 207 words ~/~ tokens.

I made a quick test on 5 descriptions, and it costs me 2.5$.
This seems very expensive compared to the price list of openai, doesn’t it?
Is there a parameter to modify (ex. not using 3.5 Turbo) ?

Thanks for your help!

1 Like

A query of 200 words should only be around 300 tokens which would cost less than a cent. Also, 3.5 turbo is already the cheapest of the full-featured GPT models you can use, so changing the model should only make it more expensive.
Maybe try looking at OpenAI API at the daily cost breakdown to see exactly what queries were made with how many tokens? Maybe your script is sending multiple requests per cell?

2 Likes

You might want to share your script here (please use the code (</>) formatter) so we can see how your script calls the API.

I have a hunch you are doing something, perhaps a lot of somethings that are unnecessary. For example - are your keywords extracted using the full description? Or, are they extracted from the summary?

Bear in mind - tokens are not words.

3 Likes

Merci @bill.french !

Here is “my” code (in fact the one of our gptFriend, since I have 0 knowledge in coding…yet).

It basically is an addon for Google Sheet to integrate with chatGpt.

My objective is to analyse 3000 rows of tools’ description (let’s say 100 words per description) and have gtp return 7 keywords per row.
But I need to have ĂĄ macro view of the cost (20$ vs 200$ :slight_smile: )

<function onOpen() {
// Create a custom menu in Google Sheets
var ui = SpreadsheetApp.getUi();
ui.createMenu(‘ChatGPT’)
.addItem(‘Select prompt’, ‘showPromptPicker’)
.addItem(‘Enter custom prompt’, ‘showCustomPromptDialog’)
.addToUi();
}

function CHAT(val) {
// main function to generate the custom formula (=CHAT)
var ss = SpreadsheetApp.getActiveSheet();

// Set API key
var apiKey = ‘■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■YZd78Gni’;

// Configure the API request to OpenAI
var data = {
“messages”: [
{“role”: “system”, “content”: val},

  {"role": "user", "content": "Here is your prompt:\n"}],
"model": "gpt-3.5-turbo",

};

var options = {
‘method’: ‘post’,
‘contentType’: ‘application/json’,
‘payload’: JSON.stringify(data),
‘headers’: {
Authorization: Bearer ${apiKey},
},
};

var response = UrlFetchApp.fetch(
‘https://api.openai.com/v1/chat/completions’,
options
);

Logger.log(response.getContentText());

// Send the API request
var result = JSON.parse(response.getContentText())[‘choices’][0][‘message’][‘content’];
Logger.log(result);

return result;
}

function showCustomPromptDialog() {
var ui = SpreadsheetApp.getUi();
var promptResponse = ui.prompt(‘Enter your custom prompt:’);

if (promptResponse.getSelectedButton() == ui.Button.OK) {
var customPrompt = promptResponse.getResponseText();
var rangeResponse = ui.prompt(‘Enter the origin cell, row, column, or range to analyze (e.g., A1, A2; A:B, 1:2)’);
if (rangeResponse.getSelectedButton() == ui.Button.OK) {
var selectedRange = rangeResponse.getResponseText();
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var rangeData = sheet.getRange(selectedRange).getDisplayValues();
var input = rangeData.map(function (row) {
return row.join(’ ‘);
}).join(’\n’);

  var destinationResponse = ui.prompt('Enter the Destination cell, row, column, or range (e.g., A1, A2; A:B; 1:2)');
  if (destinationResponse.getSelectedButton() == ui.Button.OK) {
    var destinationRange = destinationResponse.getResponseText();
    var result = CHAT(customPrompt + '\n' + input);
    sheet.getRange(destinationRange).setValue(result);
  }
}

}
}

function showPromptPicker() {
var htmlOutput = HtmlService.createHtmlOutputFromFile(‘GptApiDialog’)
.setWidth(500)
.setHeight(400);
SpreadsheetApp.getUi().showModalDialog(htmlOutput, ‘Select a Prompt’);
}

function processPrompt(promptText, inputRange, outputRange) {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var rangeData = sheet.getRange(inputRange).getDisplayValues();
var input = rangeData.map(function (row) {
return row.join(’ ‘);
}).join(’\n’);

var standardPrompts = [
{
prompt: ‘Keywords’,
systemMessage: ‘You are experienced in reflecting a content with keywords. Please provide 10 keywords describing the following content.’
},
{
prompt: ‘Summary’,
systemMessage: ‘You are an expert at summarizing content. You are able to reflect in a few words the key points of any type of text. Please provide a summary of the following content’
},
{
prompt: ‘Sentiment’,
systemMessage: ‘You are an experienced analyst in social network moderation. Please analyse whether the following content is positive, neutral or negative’
},
{
prompt: ‘Emoji’,
systemMessage: ‘Please provide an emoji reflecting each of the following content’
}
];

var selectedPrompt = standardPrompts.find(function (promptObj) {
return promptObj.prompt == promptText;
});

if (selectedPrompt) {
var result = CHAT(selectedPrompt.systemMessage + ‘\n’ + input);
sheet.getRange(outputRange).setValue(result);
} else {
throw new Error(‘Invalid prompt selected. Please try again.’);
}
}>

This line right here is able to extract multiple rows and columns. You should fear doing this because it has the ability to allow users to send WAY more content than it should to achieve the results. It’s like an AR-15 with auto-select. One press of the trigger and dozens of rows are collected into the prompt. This might explain the high costs you are seeing. Am I warm?

In my view, a system like this should be very precise; it should not allow users to do things that could result in runaway fees.

I also see that each of the prompts is apparently separate but near-identical API calls into GPT with the same text. For example, the same cell(s) is sent to get a summary as is sent to extract keywords. These extraction dimensions should probably be designed to reduce costs while increasing accuracy and relevance.

Imagine a process that retrieves a summary and then uses the summary as the prompt to extract keywords. This alone will reduce the tokens required to get the keywords.

There are so many ways to optimize this using Google Apps Script. Hope these comments help.

1 Like

Many thanks for your comments.
As I said, 0 knowledge about coding, so I using chatGpt as a junior developer giving me things to copy-paste on the basis of my functional design; this is by the way a lesson-learned about the apparent easiness of chatGpt…

What I understand, is that I shouldn’t use this script (given that I am absolutely unable to optimize it by following your recommendations).

Many thanks!

Not necessarily; it just needs some changes. I build Google Apps Script for many AI-related processes, and it’s a really powerful approach to using LLMs to advance process automation and intelligence.

If this is for business use and there’s value/budget for getting this right, you should pay someone to nudge this into a better place.

One recommendation - remove the human factor. If the data is there and you need it to gather summaries and keywords, this should be automated with a script like yours that spins up when a new record is added, performs the necessary AI extractions and adds it to the row.

This would eliminate the risk of high costs while making it all seamless - each new row added would be embellished with content almost instantly, and it would never run again on that row unless you wanted it to when something about the row was modified.

1 Like

I recently built an app script for google sheets and noticed that by default Google Sheets won’t cache results. So it very frequently hits the OpenAI API endpoint for each cell containing your formula. I found a way to reduce the frequency of the app script refresh but I have not found a way to stop it completely once you’ve gotten one successful response per cell.

I am curious if others have found a solution to not re-run app scripts every time a g sheet is opened or changed.

I only built this a few days ago to simulate chatbot conversations in bulk and it’s extremely useful but this is a minor inconvenience as I regularly just copy the data and paste as plain text to avoid the refresh. I have some ideas on how to work around this but haven’t tested yet.

Well, that depends on what you’re asking sheets to do.

Then, your implementation approach is missing a key process automation requirement.

The objective is to ensure each row is manicured with embellishments from the AI process when – AND ONLY WHEN – a record exists that has not been embellished. This could be triggered by a script that runs every minute which examines the data looking for rows that need to be processed. Once processed, certain columns are now filled that were previously empty. These are filter flags that make it easy to determine what should be processed when the script is run or when the sheet is opened.

I believe the OnOpen() event should not be used for this purpose; it is not a human-triggered process. Instead, a chron process that reads the table and filters for incomplete rows is generally good enough. If you use arrays to process all data reads and writes, the entire update process requires only a single read and a single write to the sheet itself. Even for 20,000 rows, this is typically a few seconds. Ideally, an even handler is best - it calls the script when – AND ONLY WHEN – a new record is added.

To my clients, this would be a major inconvenience and likely a source of incomplete work or subject to frequent human errors. Processes like this should be fully automated and run whether the sheet is open or closed.

1 Like