Without knowing your keywords, I can’t comment on the best solution, but my advice would be to just do it in steps:

  1. First, tell the model to generate a list of sentences on the topic, one per keyword. This is very easy for GPT in general.

  2. Next, ask the model to use all of those sentences in an article on the topic. Here, I get GPT to use the following keywords: (“breakfast cereal”, “basketball”, “flashlight”, “pajamas”, “furniture”) in an article about graphics cards. It’s a pretty bad article. You can probably improve it easily by giving some system-message-style direction on tone, intended audience, etc.

https://chat.openai.com/share/bd4f5948-53f9-428a-a2ac-27b9191f3638

This works because the first task isn’t hard. And once those sentences exist, the model has a path through its weights and probabilities for tying the intended topic to each keyword, because it already has the sentences. It can even bend them to flow more naturally on the second generation.

Doing the generation all in one go is something LLMs are specifically ill-suited to, because it forces the generation to keep jumping between different probability spaces in a way that nothing in its training data gives it a roadmap for. So just give it the roadmap ahead of time (one that it writes for itself!), and it’ll find its own way on the second go-around.
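The two steps can be chained in code. This is a minimal sketch, assuming a hypothetical `complete(messages)` callable that sends chat messages to whatever model you use and returns the reply text. The prompt wording and function names are illustrative and are not taken from the shared chat.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def sentence_prompt(topic: str, keywords: List[str]) -> List[Message]:
    """Step 1: ask for one on-topic sentence per keyword."""
    listing = "\n".join(f"- {kw}" for kw in keywords)
    return [{
        "role": "user",
        "content": (
            f"Write one sentence about {topic} for each keyword below, "
            f"using the keyword verbatim. One sentence per line.\n{listing}"
        ),
    }]


def article_prompt(topic: str, sentences: str) -> List[Message]:
    """Step 2: ask for an article that works every sentence in."""
    return [{
        "role": "user",
        "content": (
            f"Write an article about {topic}. Work every sentence below "
            f"into it, rewording for flow but keeping each keyword.\n{sentences}"
        ),
    }]


def two_step_article(
    topic: str,
    keywords: List[str],
    complete: Callable[[List[Message]], str],  # hypothetical model call
) -> str:
    """Run step 1, then feed its output into step 2."""
    sentences = complete(sentence_prompt(topic, keywords))
    return complete(article_prompt(topic, sentences))
```

Keeping the model call behind a plain callable means the same flow can be tried against any chat API, or against a stub while you tune the prompts.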

Why? Maybe it was just what my ChatGPT defaulted to and you used GPT-3, but if not, well… ChatGPT with GPT-4 is perfectly capable of generating an acceptable, even quite good, article. Having it generate sentences without knowing they’re going to end up in a single article, and then getting it to shoehorn them in, leads to what you got: a weirdly disjointed mishmash of paragraphs with keywords sort of stuffed in. Like, who thinks of “comfort” with a graphics card?

Not trying to be a dick, but if you’ve got GPT-4 this is both unnecessary and detrimental, imo. If you want to use this method of generating sentences beforehand, you need a lot of different keywords; then bucket related keywords together and determine a topic beforehand. Do it iteratively, so it doesn’t just list the keywords together either.

Check out this code interpreter completion of my original example. It could 100% be cleaned up for unguided automation, but I also wanted to give some examples of ways to get the bot to be an effective writer: AI Advancements: GPT-Nvidia Synergy

Compare that with this, which was built off of the link you sent and which took longer and was more convoluted: GPU Insights Unveiled

If you’re trying to get it to generate 13 separate articles, then do them all as separate queries.

Oh actually I meant there are 13 more pairs of instructions per 1 article. It’s essentially 1 set of instructions for each heading and content underneath, with a few others like these keyword instructions, for a total of 13-14 sets of user/assistant instructions per 1 article. EDIT: This has since been changed to just 13 user instructions instead of 13 sets of user/assistant instructions.

Also, @_j has a great point with “You might even SEO just by telling the AI it is a SEO optimizer…”. I bet that would improve the quality of the content a bunch, once you split this monster query into a bunch of smaller ones.

Unfortunately for my use case, the keywords have to be a specific list I give it, because the article is then checked against a different SaaS (not mine) to ensure these keywords are present. This step is client facing, so even if the generated article is SEO optimized, if it doesn’t contain the keywords I specify, their SaaS marks it as a ‘problem’ or ‘not optimized.’ Somewhat annoying, since I know they’re SEO optimized either way, but such is the request I’ve been given :smiling_face_with_tear:

I do have an upcoming project that I’ll be using your suggestions for, however. Much easier when there won’t be clients involved in the process.

I’m curious about your usage of user/assistant instruction pairs over system message. Do you mean that you’re providing the assistant response, or are you describing a chat history where the assistant answer is fed back 13-14 times as the answer is refined?

If you list all of your instructions and a set of actual keywords I can help you with your issue.

They aren’t mutually exclusive: you can tell it both to be an SEO master and also explain and command it to use exactly those keywords. Start the overall prompt with something like “You’re an SEO guru and I need your help.” Then explain the general goal, but don’t give explicit instructions yet. Next, explain something like, “To do this I have a set of keywords, [insert list here], and I need them all to be present in the output. Please include them organically and in an SEO optimized way.” Then explain the details (your other rules), and at the end add your imperatives (i.e., your commands, voiced as orders, not requests), like “The output must include exactly the list of SEO keywords I gave you, which was [insert list again], and they must appear organically as if they just came to your mind as you were writing.”
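Put together, that ordering might look like the following. This is a sketch under assumptions: `build_messages`, its parameters, and the exact wording are made up for illustration, and the keyword list is deliberately repeated in the final imperative.

```python
from typing import Dict, List


def build_messages(topic: str, keywords: List[str], rules: List[str]) -> List[Dict[str, str]]:
    """Order: role, goal, keywords, detailed rules, final imperative."""
    keyword_list = ", ".join(keywords)
    messages = [
        {"role": "system", "content": "You're an SEO guru and I need your help."},
        {"role": "user", "content": f"I need a well-written, SEO-optimized article about {topic}."},
        {"role": "user", "content": (
            f"To do this I have a set of keywords, {keyword_list}, and I need "
            "them all to be present in the output. Please include them "
            "organically and in an SEO-optimized way."
        )},
    ]
    # The detailed formatting rules go in the middle, one message each.
    messages += [{"role": "user", "content": rule} for rule in rules]
    # The imperative comes last and restates the full keyword list.
    messages.append({"role": "user", "content": (
        "The output must include exactly the list of SEO keywords I gave you, "
        f"which was {keyword_list}, and they must appear organically."
    )})
    return messages
```

For example, `build_messages("DIY bookshelves", ["coats", "sandpaper"], ["Use H2 headings."])` yields five messages, with the keywords named in the third and again in the last.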

I’ll revisit this thread this evening and try to provide a concrete example. It would help to have some examples of your other 13 constraints, or things like them. Also, that’s really too many instructions to expect a good one-shot article; you should find ways to combine or eliminate some of them. PM me if you want help brainstorming how to do that but don’t want to post them publicly.

Here is some code I used to make 21 tokens the only tokens that could be returned (a bias value of 100 across the board). I haven’t experimented much with logit_bias, but here is some starter code for getting the token IDs.

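import tiktoken

# Tokenizer that tiktoken maps to the "ada" model name.
encoding = tiktoken.encoding_for_model("ada")

def get_token_id(word):
    # Keep only the first token; a string that splits into several tokens is truncated.
    token_ids = encoding.encode(word)
    return token_ids[0] if token_ids else None

# Each digit bare, with a leading space, and with a trailing space.
words = [
    '0', ' 0', '0 ',
    '1', ' 1', '1 ',
    '2', ' 2', '2 ',
    '3', ' 3', '3 ',
    '4', ' 4', '4 ',
    '5', ' 5', '5 ',
    '6', ' 6', '6 '
]
# logit_bias maps token IDs (as strings) to a bias value.
bias = {str(get_token_id(word)): 100 for word in words}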

You didn’t read more than the headline. The point here is that the contents need to remain. However, it is just fine if they are rewritten by an AI, have awkward words or phrases inserted, don’t sound natural or preserve the original intent - the only purpose is to spam up the internet.

From earlier in the thread:

This is one of the options I’ve tried, but it didn’t seem to work either. After a little more research, I think I have to use logit_bias for this, but I’m not sure how to programmatically tokenize keyword list inputs.

Should have replied to that directly. My use case was different from what this thread is trying to accomplish.

I’m torn between “I’d rather not have my brain assaulted by terribly written everything, at least make it read well” and “if it’s all obviously AI-written I can mentally filter it out and ignore it more easily”.

Sure, let me post a little more for clarity. I can’t post the complete code or specifics of what it’s generating, but here’s what I’m doing.

Here is the web app side where I input the article’s contents in a form.

main.py

import chat_functions
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def generate_article():
    if request.method == 'GET':
        return render_template('index.html')

    title = request.form.get('title').title()
    materials = request.form.get('materials')
    instructions = request.form.get('instructions')
    keywords = request.form.get('keywords')

    content = chat_functions.ArticleGenerator(title, materials, instructions, keywords)
    article = content.generate_article()

    return render_template('index.html', article=article)

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8080, debug=False)

The chat_functions being imported in main.py above are my GPT instructions, which look like the following (please excuse the text formatting). Small side note, I changed the 13 pairs to just 13 user instructions, I’ll edit my comment above.

chat_functions.py

import openai
import os
from dotenv import load_dotenv

class ArticleGenerator():
    def __init__(self, title, materials, instructions, keywords):
        self.title = title
        self.materials = materials
        self.instructions = instructions
        self.keywords = keywords
        load_dotenv()
        openai.api_key =  os.getenv('OPENAI_API_KEY', 'xxxxxxx')
        self.messages = [
        {"role": "system", "content": f'''Generate articles that follow the user's instructions exactly.'''},
        {"role": "user", "content": f'''I want you to write an article for a {title} project according to my instructions and formatting rules.'''},
        {"role": "user", "content": f'''First, write {title} as a first level heading. Under that heading, write an introduction to the project.'''},
        {"role": "user", "content": f'''Next, write 'What is {title}?' as a second level heading. Under that heading, write about what {title} is.'''},
        {"role": "user", "content": f'''Next, write 'What You Need to Make {title} At Home' as a second level heading. Under that heading, take this list of materials
                                            and write about why each material is needed for the project: {materials}. Bold the name of the material, then write
                                            no more than 40 words about why it is needed. Separate each material and its description with a line break.
                                            Do not include the material's quantity.'''},
        {"role": "user", "content": f'''Next, write 'Tips for Making the Best {title}' as a second level heading. Under that heading, include 5-7 points about how to build the best {title}.'''}
        ]

    def generate_article(self):
        completion = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=1.2,
            presence_penalty=0.0,
            frequency_penalty=0.0,
            logit_bias={48126: -100},
            messages=self.messages
            )
        return completion.choices[0].message.content

These aren’t the complete instructions, but this is basically how I’ve got it working now. Each instruction creates an H1, H2, or H3, then writes and formats the content for each heading. This part is working great; it writes relevant content that is formatted how I specify. The issue is, while it’s writing these sections of content, I need it to use specific keywords, which I can’t get it to do.

These keywords are always relevant to the topic of the article; I just need the API to actually use them. For example, say the article title is ‘Painted DIY Bookshelves’ and a keyword is ‘coats’ (as in coats of paint). The API could either write a new sentence using the word (e.g., ‘Apply two coats of paint, letting it dry in between’), or it could replace an existing word with the keyword when the two can be used interchangeably (e.g., using ‘coats’ instead of ‘layers’). It’s doing neither currently.
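Before changing prompts, it can help to measure the problem. Here is a small sketch of a local check, assuming a case-insensitive whole-word match is close enough to what the client’s SaaS does (how that tool actually matches is unknown). `missing_keywords` is a hypothetical helper.

```python
import re
from typing import List


def missing_keywords(article: str, keywords: List[str]) -> List[str]:
    """Return the keywords that never appear in the article as whole words or phrases."""
    missing = []
    for keyword in keywords:
        # \b keeps 'coats' from matching inside 'petticoats'.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if not re.search(pattern, article, flags=re.IGNORECASE):
            missing.append(keyword)
    return missing
```

The returned list could then drive a follow-up request that asks the model to revise the article so it includes only the keywords still missing.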

@chrstfer I’m going to completely redo the chat instructions following your recommendations today, so I’ll update or send a PM to take you up on your brainstorming offer. Thanks!

@_j No offense, but you are not understanding the point of this thread, and it’s muddying the conversation. What you’re talking about is keyword stuffing, which Google has been penalizing for years. What I’m talking about is keyword optimizing, which is still very much a ranking factor. Having this program write ‘fridge’ instead of ‘refrigerator’, or create new sentences using keywords that are relevant to the topic, is neither spam nor too much to ask from ChatGPT. Every website anywhere near the top 5 pages of Google’s SERPs uses some variation of keyword optimization to help get there, which is why I’m not here to debate SEO strategy.

@ethan.peck I appreciate the example! If I can’t get this to work with @chrstfer’s suggestions, I’ll try this out with the logit_bias.

So here’s what I tried:

{"role": "system", "content": f'''You are an SEO specialist and I need your help writing articles. These need to be well written articles about [blank] that are SEO optimized.'''},
{"role": "user", "content": f'''To do this, I have a set of keywords and phrases, {keywords}, that need to be included in the article. Please include them organically and in an SEO optimized way..'''},

# Examples of these instructions can be found in my comment above
{"role": "user", "content": '''Article instructions here using 10 separate user instructions'''}, 

{"role": "user", "content": f'''“The output must include exactly the list of SEO keywords and phrases I gave you, which was {keywords}. They must appear organically as if they just came to your mind as you were writing.”'''},
{"role": "user", "content": f'''Write this [blank] article according to my instructions.'''},

I also tried combining the first user instruction with the system instruction, but neither worked. I generated 3 articles with these instructions, then took the {keywords} instructions out and generated them again, and the results were pretty identical.

To be clear, it’s writing a very nice article with other relevant keywords, just not the ones I specify.

@ethan.peck I tried your code above and couldn’t get it to work. Here’s my implementation, maybe I did something incorrectly.

import tiktoken
import openai

keywords = request.form.get('keywords')

encoding = tiktoken.encoding_for_model("gpt-4")

def get_token_id(word):
    token_ids = encoding.encode(word)
    return token_ids[0] if token_ids else None

    words = keywords
    bias = {str(get_token_id(word)): 100 for word in words}

class ArticleGenerator():
    def __init__(self, bias):
    self.bias = bias
    self.messages= #Chat instructions here

def generate_article(self, bias):
    completion = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=1.2,
        presence_penalty=0.0,
        frequency_penalty=0.5,
        logit_bias={bias: 100},
        messages=self.messages
        )
    return completion.choices[0].message.content

content = chat_functions.ArticleGenerator(bias)
article = content.generate_article()
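A few things in this snippet would stop it from running: `words = keywords` sits after the `return` inside `get_token_id`, so `bias` is never defined; `keywords` is a single form string, so iterating over it yields characters; and `logit_bias={bias: 100}` uses a dict as a dictionary key. A hedged sketch of the token-mapping part follows. It assumes the keywords arrive as a comma-separated string, and it takes the encoder as a parameter (with tiktoken that would be `tiktoken.encoding_for_model("gpt-4").encode`) so the logic can be checked without the library. The default bias of 5 is an arbitrary assumption; per the OpenAI API docs, values like 100 result in near-exclusive selection of those tokens, which would crowd out ordinary text.

```python
from typing import Callable, Dict, List


def parse_keywords(raw: str) -> List[str]:
    """Split a comma-separated form value into clean keywords."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_logit_bias(
    keywords: List[str],
    encode: Callable[[str], List[int]],  # e.g. a tiktoken encoding's .encode
    bias: int = 5,                       # assumed value, tune by experiment
) -> Dict[str, int]:
    """Map every token of every keyword to the same bias value."""
    logit_bias: Dict[str, int] = {}
    for keyword in keywords:
        # Mid-sentence words are usually tokenized with a leading space,
        # so encode both forms.
        for form in (keyword, " " + keyword):
            for token_id in encode(form):
                logit_bias[str(token_id)] = bias
    return logit_bias
```

The result is passed directly as `logit_bias=build_logit_bias(...)`, not wrapped in another dict. Even then, a positive bias only makes those tokens more likely at every position; it does not guarantee that each keyword appears once in a sensible place.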

EDIT: It looks like logit_bias isn’t working at all actually. I ran this a second time to exclude just the token for the word ‘typically’ like this

logit_bias={48126: -100},

but the generated content still included ‘typically’ 3 times.
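One possible explanation, offered as a guess since the thread doesn’t show which string token 48126 corresponds to: a token ID covers one exact spelling. With GPT tokenizers, `typically`, ` typically` (leading space), and `Typically` generally encode to different IDs, so suppressing a single ID leaves the other spellings available. The sketch below collects the single-token variants of a word. Here `encode` is a stand-in for something like `tiktoken.encoding_for_model("gpt-4").encode`, and `ban_word` is a hypothetical helper.

```python
from typing import Callable, Dict, List


def ban_word(word: str, encode: Callable[[str], List[int]]) -> Dict[str, int]:
    """Build a logit_bias dict that suppresses common spellings of one word."""
    variants = {
        form
        for base in (word.lower(), word.capitalize())
        for form in (base, " " + base)
    }
    logit_bias: Dict[str, int] = {}
    for form in sorted(variants):
        token_ids = encode(form)
        # Only ban variants that are a single token. Banning the first piece
        # of a multi-token spelling would also block unrelated words.
        if len(token_ids) == 1:
            logit_bias[str(token_ids[0])] = -100
    return logit_bias
```

Printing the IDs this returns and comparing them with the one hard-coded above would show whether the original request was targeting the spelling that actually appears in the output.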

This is a straightforward task. Not sure why you’re facing this issue. Here’s a demo in Playground

You can combine this with @elmstedt’s suggestion and get successful completions with the desired keywords.

Yes this works if you use a single instruction to generate the entire article as in your example, but it’s not working when I have multiple instructions. Maybe I’m not using the correct method to achieve what I’m trying to do; if I have ~10 headings in the article and the content under each heading needs its own instructions and formatting, and these keywords need to be used throughout, what’s the best way to do that?

I’ve tried every variation of @elmstedt’s suggestions I can think of and more; it’s just not using the words. In fact, it even ignores set word counts and formatting instructions intermittently, which should be even more straightforward.

I had a quick skim through this thread and couldn’t find the answer to one question: how many keywords are in the list you give it?

If it’s more than 5-6, it’s almost never going to work. So are you giving it 20 keywords, or just 3 or 4?

I made it work for extracting skills from commit files, and it works 100% correctly. I checked thousands of commits and extractions manually and did not find a single wrong skill.

But that’s all I’m telling. It is possible. I won’t tell how. I went through many weeks of sleepless nights for that.

If it’s more than 5-6, it’s almost never going to work.

Ah, yeah, that might be the issue then: the lists are 20+ keywords.
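If the practical limit really is around five or six keywords per request, one option (echoing the earlier advice to split the work into smaller queries) is to deal the list out across the article’s sections and generate each section separately. This is a sketch under those assumptions; `assign_keywords`, `section_prompt`, and the cap of 5 are illustrative.

```python
from typing import Dict, List


def assign_keywords(headings: List[str], keywords: List[str], cap: int = 5) -> Dict[str, List[str]]:
    """Deal keywords round-robin across headings, at most `cap` per heading."""
    if len(keywords) > cap * len(headings):
        raise ValueError("Too many keywords for this many sections at the given cap.")
    plan: Dict[str, List[str]] = {heading: [] for heading in headings}
    for index, keyword in enumerate(keywords):
        plan[headings[index % len(headings)]].append(keyword)
    return plan


def section_prompt(heading: str, keywords: List[str]) -> str:
    """One self-contained instruction per section, naming only its own keywords."""
    return (
        f"Write the section '{heading}'. "
        f"It must use each of these keywords verbatim: {', '.join(keywords)}."
    )
```

Round-robin assignment ignores which keyword fits which section. Bucketing related keywords by hand, as suggested earlier in the thread, would likely read better.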

@jochenschultz I’m afraid I don’t know what you’re referring to, skills?

Google doesn’t really go for keyword match anymore. It changed a few months ago. At least that’s the essence of what they said.

What you are looking for is the meaning of the website in a few words.

It should be possible to write content about money without using the word money at all and still rank on the word money.

Try with a website about fruit. Don’t use the word fruit.

Ah yeah, quality content over keywords and all that. Like I said above, that would be fine for future projects, but this one comes with client expectations where the keywords match their SaaS so the article gets a “good grade.”

You and I know there are different/more important ranking factors, but this use case has an additional requirement along with ranking.

And you have no idea how to do that? How did you get the customer?