Build your own AI assistant in 10 lines of code - Python

This tutorial will guide you through building a chat assistant with gpt-3.5-turbo that you can use from the console.


Before starting, you will need to have:

  • Python 3.7.1 or higher installed on your system
  • An OpenAI API key

Step 1: Install the OpenAI Python Library

First, we need to install the latest Python client library (currently v0.27.0) for the OpenAI API. You can install it using pip, the Python package manager, with the following command:

pip install openai

Step 2: Set Up Your OpenAI API Key

You will need to set up an OpenAI API key. If you don’t already have one, you can get one by following the instructions provided in the OpenAI API documentation.

Once you have your API key, replace "YOUR_API_KEY" in the code snippet below with your API key:

import openai
openai.api_key = "YOUR_API_KEY"
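Hard-coding the key works for a quick test, but a safer habit (a suggestion, not part of the original snippet) is to read it from an environment variable so the key never lands in your source file. A minimal sketch, assuming you have exported `OPENAI_API_KEY` in your shell:

```python
import os

# Read the key from an environment variable instead of hard-coding it.
# Falls back to the placeholder if the variable isn't set, so the
# script still runs during a dry test.
api_key = os.environ.get("OPENAI_API_KEY", "YOUR_API_KEY")

# Then, in your script:
# openai.api_key = api_key
```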

Step 3: Write the system message

The system message is a message object with "role" set to "system". It helps set the behavior of the assistant, and you can also use it to give the assistant a name.

When creating your own chat assistant, it’s important to choose a good directive prompt. Here are some tips to keep in mind:

  • Keep it simple and clear: The directive should be clear, intentional, concise and declarative.

  • Set the tone: The directive should establish the chat assistant’s personality and tone. This can be done through the choice of words, phrasing, and even punctuation. For example, a chat assistant that is meant to be friendly and approachable might use exclamation points and emojis, while one that is meant to be more serious and professional might use more formal language and punctuation.

  • Test and iterate: Don’t be afraid to experiment with different directive prompts and see how users respond. You may need to tweak the language or tone of the prompt to get the best results.

Replace the "DIRECTIVE_FOR_gpt-3.5-turbo" placeholder in the next step (Step 4) with the system message you wrote.
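As an illustration, here is what a conversation seeded with a directive might look like. The assistant name "Ava" and the wording are purely hypothetical examples, not part of the tutorial's code:

```python
# A hypothetical directive -- the name and tone are illustrative only.
directive = (
    "You are Ava, a friendly and upbeat assistant! "
    "Keep answers short and approachable."
)

# The conversation starts with the system message.
conversation = [{"role": "system", "content": directive}]
```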

Step 4: Create the Chat Assistant

Now that we have the OpenAI Python library installed and our API key set up, we can create the chat assistant.

import openai

openai.api_key = "YOUR_API_KEY" # supply your API key however you choose

message = {"role": "user", "content": input("This is the beginning of your chat with AI. [To exit, send \"###\".]\n\nYou:")}

conversation = [{"role": "system", "content": "DIRECTIVE_FOR_gpt-3.5-turbo"}]

while message["content"] != "###":
    conversation.append(message)
    completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=conversation)
    conversation.append({"role": "assistant", "content": completion.choices[0].message.content})
    message = {"role": "user", "content": input(f"Assistant: {completion.choices[0].message.content} \nYou:")}

Step 5: Run the Chat Assistant

To run the chat assistant, save the code snippet from Step 4 to a file (for example, assistant.py) and run it from your terminal with python assistant.py.


In this tutorial, I’ve shown you how to create a chat assistant using the OpenAI Python library and the GPT-3.5-turbo model. I’ve also discussed the importance of the system directive in establishing the chat assistant’s personality and tone, and provided some tips for creating a good directive prompt.

Remember that chat assistant development is an ongoing process, and you may need to experiment with different approaches and iterate on your design to get the best results. Good luck, and happy coding!


Code explanation:

The code shared in Step 4 is the complete code you need to run your AI chat assistant/companion. Here’s a breakdown of what’s happening in the code.

message = {"role": "user", "content": input("This is the beginning of your chat with AI. [To exit, send \"###\".]\n\nYou:")}

The message variable is a message object that stores the human’s console input as a message with the "user" role.

conversation = [{"role": "system", "content": "DIRECTIVE_FOR_gpt-3.5-turbo"}]

The conversation variable holds the entire conversation history, which begins with the system message containing the directive for the assistant. It is essentially an array of message objects.

    while message["content"] != "###":
        conversation.append(message)
        completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=conversation)
        conversation.append({"role": "assistant", "content": completion.choices[0].message.content})
        message = {"role": "user", "content": input(f"Assistant: {completion.choices[0].message.content} \nYou:")}

The while loop continues the conversation until the user enters the ### exit sequence.

Once inside the loop, we append the user message to the conversation and call the chat completion endpoint.

We then print the assistant’s reply (received in the completion) to the console, append it to the conversation, and prompt the user for a reply.

The new message from the user is appended to the conversation at the start of the next iteration.

This repeats until the user types the exit sequence.


This doesn’t seem like a very efficient way to use the API. If I use the suggested method and my target usage consists of tiny prompts, then I need to provide the full context every time, which in some cases can span several long prompts. So instead of using, say, 50-100 tokens, I may need to use 10k or more for every single prompt. That’s excessive overhead. I am aware I can make use of the stateful feature of the API, but that would only work when I have a few consecutive prompts. I would still need to resend the whole context if the session gets closed between the different prompts.


It is a tutorial to get a chat assistant running in the console in 10 lines of code.
It does not deal with how to maintain context length, as there are multiple ways to achieve that depending on the use case.

With ChatGPT you have to keep a conversation context to pass to the model for it to generate a relevant, coherent response. That’s how it’s designed to be.

And if your request exceeds 4096 tokens, you may even have to shorten it by summarizing, trimming, or extracting relevant parts with semantic search, or some other way, but that’s an entire topic on its own.
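The simplest of the shortening strategies mentioned above is trimming: keep the system message and drop the oldest exchanges. A rough sketch (counting messages rather than tokens, purely as an illustration; the function name and limit are made up):

```python
def trim_conversation(conversation, max_messages=20):
    """Keep the system directive plus the most recent messages."""
    system_part = conversation[:1]                       # the directive
    recent_part = conversation[1:][-(max_messages - 1):] # newest messages
    return system_part + recent_part
```

You would call this just before each API request so the payload stays bounded, at the cost of the assistant forgetting the oldest turns.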

Yes, you’ll have to send the entire conversation if it is relevant to the user message content. And if you look at the code, you’ll find that it only makes a call to the API when the user sends a message; it doesn’t stay connected all the time. I left it running on my Mac for two days because of my forgetfulness, and when I came back to it, I sent a message and it worked, because the only time it uses the network is to make the call and receive the response.

You are welcome to share your code describing your solution to the community.


I appreciate your proposed 10 lines solution here. I initially came to this thread from the discussion about how to keep a session with gpt-3.5-turbo api and for the sake of relevance I’ll continue the discussion in that topic.

Regarding this

In the web UI obviously we can recall a conversation session and only add new prompts to already existing context. This is already achieved with gpt-3.5-turbo in the context of ChatGPT. If we could get that feature through the API for gpt-3.5-turbo, we could avoid resending the previous messages with every prompt.


The web UI uses a custom storage solution, most likely a DB, to store every conversation and retrieve it when required. Even then, when you send a message to previously saved conversation, it has to make the API call with the relevant conversation in message parameter.
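To mimic that recall-a-conversation behavior yourself, you could persist each conversation to disk and reload it before the next API call. A minimal sketch using a JSON file — the file name and helper names are illustrative, not part of the original tutorial:

```python
import json

def save_conversation(conversation, path="conversation.json"):
    # Write the full message list to disk after each turn.
    with open(path, "w") as f:
        json.dump(conversation, f)

def load_conversation(path="conversation.json"):
    # Reload a saved conversation, or start fresh with just the directive.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return [{"role": "system", "content": "DIRECTIVE_FOR_gpt-3.5-turbo"}]
```

Even with storage like this, each API call still has to include the relevant history in the messages parameter, just as the web UI's backend does.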

Yes, it would be great, but it is not currently supported by the API.

It is clearly mentioned in the docs that gpt-3.5-turbo has no memory of its own:

Including the conversation history helps when user instructions refer to prior messages. In the example above, the user’s final question of “Where was it played?” only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way.


Great critique. I was thinking the same thing the last couple of days; context is costly. Oh, how the tokens will add up when the conversation history is resent. But I definitely appreciate how gpt-3.5-turbo, recently released, is more cost-efficient than text-davinci-003.

At some point I believe context cost will get sorted out by an AI Moore’s law of its own versus power consumption, based on the use case. We’ve already seen the cost drop 10-fold in such a short time; next year it could be another 10-fold. Well, I hope. :wink:


Hey sps! I think it is a pretty interesting solution, even with some limitations. I was wondering if we could also use the OpenAI API itself to help with the token issue (for example: if the previous conversation is getting large, summarize it and feed the summary back into the conversation).

Effort should be REALLY LOW, and it is probably another step towards solving this issue (even though there may be better solutions than the one I suggested).

great job


Thank you @dmirandaalves

Yes, it is very much possible to do that by using tiktoken to count tokens every time before making the API call; if the token count exceeds a specific threshold, get a summary and pass it as the system or user message, and then make the call with the user input.

The approach above is easy to implement, but a more robust yet complex approach would be to use embeddings.
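The token-counting half of that approach can be sketched as follows. This assumes the tiktoken package is installed; the character-based fallback, the threshold value, and the function names are illustrative assumptions, not a definitive implementation:

```python
def count_tokens(conversation, model="gpt-3.5-turbo"):
    """Count the tokens in a list of message dicts."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model(model)
        return sum(len(enc.encode(m["content"])) for m in conversation)
    except ImportError:
        # Crude fallback if tiktoken isn't available:
        # roughly 4 characters per token on average.
        return sum(len(m["content"]) for m in conversation) // 4

TOKEN_THRESHOLD = 3000  # headroom below the 4096-token context limit

def needs_summary(conversation):
    # When this returns True, ask the model for a summary and
    # replace the older history with it before the next call.
    return count_tokens(conversation) > TOKEN_THRESHOLD
```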

UPDATE: Here’s how to Use embeddings to retrieve relevant context for AI assistant


Thank you for this. The simple program is a great starting point for those who are new to the API :slight_smile:


If we could retrieve previous prompts and completions by ID, we could pass them in as needed.

Welcome @gnehcgnaw

I saw the previous reply; what I meant was that this feature will still be quite important in the future. Currently, only the interface has been reserved for it.