2023-03-03

[# Chat completions

Beta](https://platform.openai.com/docs/guides/chat/chat-completions-beta)

ChatGPT is powered by gpt-3.5-turbo, OpenAI’s most advanced language model.

Using the OpenAI API, you can build your own applications with gpt-3.5-turbo to do things like:

  • Draft an email or other piece of writing
  • Write Python code
  • Answer questions about a set of documents
  • Create conversational agents
  • Give your software a natural language interface
  • Tutor in a range of subjects
  • Translate languages
  • Simulate characters for video games and much more

This guide explains how to make an API call for chat-based language models and shares tips for getting good results. You can also experiment with the new chat format in the OpenAI Playground.

## Introduction

Chat models take a series of messages as input, and return a model-generated message as output.

Although the chat format is designed to make multi-turn conversations easy, it’s just as useful for single-turn tasks without any conversations (such as those previously served by instruction following models like text-davinci-003).

An example API call looks as follows:

1
2
3
4
5
6
7
8
9
10
11
12
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

The main input is the messages parameter. Messages must be an array of message objects, where each object has a role (either “system”, “user”, or “assistant”) and content (the content of the message). Conversations can be as short as 1 message or fill many pages.

Typically, a conversation is formatted with a system message first, followed by alternating user and assistant messages.

The system message helps set the behavior of the assistant. In the example above, the assistant was instructed with “You are a helpful assistant.”

The user messages help instruct the assistant. They can be generated by the end users of an application, or set by a developer as an instruction.

The assistant messages help store prior responses. They can also be written by a developer to help give examples of desired behavior.

Including the conversation history helps when user instructions refer to prior messages. In the example above, the user’s final question of “Where was it played?” only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way.

### Response format

An example API response looks as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
{
 'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
 'object': 'chat.completion',
 'created': 1677649420,
 'model': 'gpt-3.5-turbo',
 'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
 'choices': [
   {
    'message': {
      'role': 'assistant',
      'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers.'},
    'finish_reason': 'stop',
    'index': 0
   }
  ]
}

In Python, the assistant’s reply can be extracted with response[‘choices’][0][‘message’][‘content’].

### Managing tokens

Language models read text in chunks called tokens. In English, a token can be as short as one character or as long as one word (e.g., a or apple), and in some languages tokens can be even shorter than one character or even longer than one word.

For example, the string “ChatGPT is great!” is encoded into six tokens: [“Chat”, “G”, “PT”, “ is”, “ great”, “!”].

The total number of tokens in an API call affects: How much your API call costs, as you pay per token How long your API call takes, as writing more tokens takes more time Whether your API call works at all, as total tokens must be below the model’s maximum limit (4096 tokens for gpt-3.5-turbo-0301)

Both input and output tokens count toward these quantities. For example, if your API call used 10 tokens in the message input and you received 20 tokens in the message output, you would be billed for 30 tokens.

To see how many tokens are used by an API call, check the usage field in the API response (e.g., response[‘usage’][‘total_tokens’]).

To see how many tokens are in a text string without making an API call, use OpenAI’s tiktoken Python library. Example code can be found in the OpenAI Cookbook’s guide on how to count tokens with tiktoken.

Each message passed to the API consumes the number of tokens in the content, role, and other fields, plus a few extra for behind-the-scenes formatting. This may change slightly in the future.

If a conversation has too many tokens to fit within a model’s maximum limit (e.g., more than 4096 tokens for gpt-3.5-turbo), you will have to truncate, omit, or otherwise shrink your text until it fits. Beware that if a message is removed from the messages input, the model will lose all knowledge of it.

Note too that very long conversations are more likely to receive incomplete replies. For example, a gpt-3.5-turbo conversation that is 4090 tokens long will have its reply cut off after just 6 tokens.

## Instructing chat models

Best practices for instructing models may change from model version to version. The advice that follows applies to gpt-3.5-turbo-0301 and may not apply to future models.

Many conversations begin with a system message to gently instruct the assistant. For example, here is one of the system messages used for ChatGPT:

You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible. Knowledge cutoff: {knowledge_cutoff} Current date: {current_date}

In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message.

If the model isn’t generating the output you want, feel free to iterate and experiment with potential improvements. You can try approaches like:

  • Make your instruction more explicit
  • Specify the format you want the answer in
  • Ask the model to think step by step or debate pros and cons before settling on an answer

For more prompt engineering ideas, read the OpenAI Cookbook guide on techniques to improve reliability.

Beyond the system message, the temperature and max tokens are two of many options developers have to influence the output of the chat models. For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. In the case of max tokens, if you want to limit a response to a certain length, max tokens can be set to an arbitrary number. This may cause issues for example if you set the max tokens value to 5 since the output will be cut-off and the result will not make sense to users.

## Chat vs Completions

Because gpt-3.5-turbo performs at a similar capability to text-davinci-003 but at 10% the price per token, we recommend gpt-3.5-turbo for most use cases.

For many developers, the transition is as simple as rewriting and retesting a prompt.

For example, if you translated English to French with the following completions prompt:

Translate the following English text to French: “{text}”

An equivalent chat conversation could look like:

1
2
3
4
[
  {“role”: “system”, “content”: “You are a helpful assistant that translates English to French.”},
  {“role”: “user”, “content”: ‘Translate the following English text to French: “{text}”’}
]

Or even just the user message:

1
2
3
[
  {“role”: “user”, “content”: ‘Translate the following English text to French: “{text}”’}
]

## FAQ

### Is fine-tuning available for gpt-3.5-turbo?

No. As of Mar 1, 2023, you can only fine-tune base GPT-3 models. See the fine-tuning guide for more details on how to use fine-tuned models.

### Do you store the data that is passed into the API?

As of March 1st, 2023, we retain your API data for 30 days but no longer use your data sent via the API to improve our models. Learn more in our data usage policy.

### Adding a moderation layer

If you want to add a moderation layer to the outputs of the Chat API, you can follow our moderation guide to prevent content that violates OpenAI’s usage policies from being shown.

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯(lián)系作者
  • 序言:七十年代末肤舞,一起剝皮案震驚了整個濱河市赌厅,隨后出現(xiàn)的幾起案子送矩,更是在濱河造成了極大的恐慌镜粤,老刑警劉巖实檀,帶你破解...
    沈念sama閱讀 218,941評論 6 508
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件翰撑,死亡現(xiàn)場離奇詭異健蕊,居然都是意外死亡,警方通過查閱死者的電腦和手機乳蓄,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 93,397評論 3 395
  • 文/潘曉璐 我一進店門,熙熙樓的掌柜王于貴愁眉苦臉地迎上來夕膀,“玉大人虚倒,你說我怎么就攤上這事〉晔” “怎么了裹刮?”我有些...
    開封第一講書人閱讀 165,345評論 0 356
  • 文/不壞的土叔 我叫張陵,是天一觀的道長庞瘸。 經常有香客問我捧弃,道長,這世上最難降的妖魔是什么擦囊? 我笑而不...
    開封第一講書人閱讀 58,851評論 1 295
  • 正文 為了忘掉前任违霞,我火速辦了婚禮,結果婚禮上瞬场,老公的妹妹穿的比我還像新娘买鸽。我一直安慰自己,他們只是感情好贯被,可當我...
    茶點故事閱讀 67,868評論 6 392
  • 文/花漫 我一把揭開白布眼五。 她就那樣靜靜地躺著,像睡著了一般彤灶。 火紅的嫁衣襯著肌膚如雪看幼。 梳的紋絲不亂的頭發(fā)上,一...
    開封第一講書人閱讀 51,688評論 1 305
  • 那天幌陕,我揣著相機與錄音诵姜,去河邊找鬼。 笑死搏熄,一個胖子當著我的面吹牛棚唆,可吹牛的內容都是我干的。 我是一名探鬼主播心例,決...
    沈念sama閱讀 40,414評論 3 418
  • 文/蒼蘭香墨 我猛地睜開眼宵凌,長吁一口氣:“原來是場噩夢啊……” “哼!你這毒婦竟也來了止后?” 一聲冷哼從身側響起摆寄,我...
    開封第一講書人閱讀 39,319評論 0 276
  • 序言:老撾萬榮一對情侶失蹤,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后微饥,有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體逗扒,經...
    沈念sama閱讀 45,775評論 1 315
  • 正文 獨居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內容為張勛視角 年9月15日...
    茶點故事閱讀 37,945評論 3 336
  • 正文 我和宋清朗相戀三年欠橘,在試婚紗的時候發(fā)現(xiàn)自己被綠了矩肩。 大學時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點故事閱讀 40,096評論 1 350
  • 序言:一個原本活蹦亂跳的男人離奇死亡肃续,死狀恐怖黍檩,靈堂內的尸體忽然破棺而出,到底是詐尸還是另有隱情始锚,我是刑警寧澤刽酱,帶...
    沈念sama閱讀 35,789評論 5 346
  • 正文 年R本政府宣布,位于F島的核電站瞧捌,受9級特大地震影響棵里,放射性物質發(fā)生泄漏。R本人自食惡果不足惜姐呐,卻給世界環(huán)境...
    茶點故事閱讀 41,437評論 3 331
  • 文/蒙蒙 一殿怜、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧曙砂,春花似錦头谜、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,993評論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至笑陈,卻和暖如春末荐,著一層夾襖步出監(jiān)牢的瞬間,已是汗流浹背新锈。 一陣腳步聲響...
    開封第一講書人閱讀 33,107評論 1 271
  • 我被黑心中介騙來泰國打工, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留眶熬,地道東北人妹笆。 一個月前我還...
    沈念sama閱讀 48,308評論 3 372
  • 正文 我出身青樓,卻偏偏與公主長得像娜氏,于是被迫代替她去往敵國和親拳缠。 傳聞我的和親對象是個殘疾皇子,可洞房花燭夜當晚...
    茶點故事閱讀 45,037評論 2 355

推薦閱讀更多精彩內容