Advanced Usage for GPT 3.5 Turbo API

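Introduction

Today (2023.03.02) OpenAI released its newest API, GPT 3.5 Turbo, currently priced at $0.002/1k tokens¹.

warning
The information in this post is time-sensitive, so check it against the current state of things.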

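Usage Guide

The official documentation is here². It is obviously going to be more detailed than my write-up, haha, so unless something is off I still recommend reading it.

Having finished writing, I found that it may not actually be as detailed as mine! Haha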

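Register

warning
You may need unrestricted network access for this step. Also, if you use proxy software, set HTTP_PROXY and HTTPS_PROXY in your command-line environment before running the programs below, otherwise the requests will fail. On Windows use the set command, e.g. set HTTP_PROXY=http://127.0.0.1:xxxx; on Linux use the export command, e.g. export HTTP_PROXY=http://127.0.0.1:xxxx, where xxxx is the port of your proxy software.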
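If setting the variables in the shell is inconvenient, the same thing can be done from inside the script, before the first request is sent. A minimal sketch, assuming a local proxy at 127.0.0.1; the port 7890 is a placeholder chosen for this illustration, not a value from this post:

```python
import os
import urllib.request

# Hypothetical local proxy address; replace 7890 with your proxy's port.
PROXY_URL = "http://127.0.0.1:7890"

# These must be set before the first request is made.
os.environ["HTTP_PROXY"] = PROXY_URL
os.environ["HTTPS_PROXY"] = PROXY_URL

# The standard library reads the same variables, which gives a quick sanity check.
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))
```

Variables set this way only affect the current Python process, so nothing leaks into the rest of the system.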

First, register a developer account on the OpenAI Platform, then generate an API key on the API Keys page.

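OpenAI currently gives new accounts $18 of free trial credit (a one-time grant with an expiry date, not a monthly allowance), which should be more than enough for testing.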

Demo Example

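Let's get hands-on right away! Before starting, install the openai library; the latest version at the moment is v0.27.0.

pip install openai

If you installed it before, you may need to upgrade it:

pip install openai --upgrade

Below is a demo from the official docs. Replace the API key text with your own and it runs as is.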

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # replace with your API key
openai.api_key = os.getenv("OPENAI_API_KEY")

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello!"}]
)

print(completion.choices[0].message)

And with that, we have successfully said hello to GPT 3.5 Turbo!

Advanced Usage # 1

Let's start with a longer example.

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # replace with your API key
openai.api_key = os.getenv("OPENAI_API_KEY")

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {
            "role": "assistant",
            "content": "The Los Angeles Dodgers won the World Series in 2020.",
        },
        {"role": "user", "content": "Where was it played?"},
    ],
)

print(completion.choices[0].message)

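First we call the create method on the openai library's ChatCompletion class. The call carries our request, and the ChatCompletion object it returns holds the reply.

Take a look at the request format:

model: the name of the model. We are testing gpt-3.5-turbo here, and for now only two models are supported, gpt-3.5-turbo and gpt-3.5-turbo-0301. The one with a date suffix is a snapshot that will not be updated, but as of today the two are identical.
messages: a list in which every element is a dict. The role key says who the message comes from, and three roles are currently supported: user, system and assistant. The content key holds the text of the message.
- system: system message, used to set ChatGPT's behavior.
- user: user message, used to talk to ChatGPT.
- assistant: assistant message, used to store ChatGPT's earlier replies.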
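To make the format concrete, here is a small sketch of a helper that builds and checks such a list before it is sent. The names make_message and ALLOWED_ROLES are made up for this illustration and are not part of the openai library:

```python
# Roles accepted by the chat endpoint at the time of writing.
ALLOWED_ROLES = {"system", "user", "assistant"}


def make_message(role, content):
    """Return one message dict, rejecting roles the API would not accept."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    return {"role": role, "content": content}


messages = [
    make_message("system", "You are a helpful assistant."),
    make_message("user", "Who won the world series in 2020?"),
]
print(messages)
```

Catching a mistyped role locally is cheaper than finding out from an API error.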

Here is the complete reply:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas.",
        "role": "assistant"
      }
    }
  ],
  "created": 1677759714,
  "id": "chatcmpl-6pcEUG6zKP0Ld33OP1dZGbg3GwWhC",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 19,
    "prompt_tokens": 56,
    "total_tokens": 75
  }
}

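This is really an example of a multi-turn conversation. If you want to hold a multi-turn conversation dynamically, you have to record all earlier messages and replies and pass them in with every request.

First we set a system message with the content You are a helpful assistant. Then we set a user message, Who won the world series in 2020?, which plays the part of the previous turn's question, followed by an assistant message, The Los Angeles Dodgers won the World Series in 2020., which stands for ChatGPT's reply in that turn. Finally we set another user message, Where was it played?, which is the question for the current turn, the one we actually want GPT to answer.

As you can see, the reply is The 2020 World Series was played at Globe Life Field in Arlington, Texas. We only asked about the location, yet ChatGPT worked out from the earlier messages that the question was about the 2020 World Series and answered accordingly.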
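The bookkeeping for a dynamic multi-turn conversation can be sketched roughly as follows. The Conversation class, send_with_openai and fake_send are names invented for this sketch; fake_send is an offline stand-in so the example runs without a key, while send_with_openai assumes the v0.27 openai interface used in this post:

```python
class Conversation:
    """Keeps the full message history so each request carries the context."""

    def __init__(self, send, system_prompt="You are a helpful assistant."):
        # `send` is any callable that takes the message list and returns the reply text.
        self.send = send
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, question):
        self.messages.append({"role": "user", "content": question})
        reply = self.send(self.messages)
        # Store the reply so the next request includes it.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def send_with_openai(messages):
    """Real sender; needs the openai package (v0.27 interface) and an API key."""
    import openai

    completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return completion.choices[0].message.content


def fake_send(messages):
    """Offline stand-in that reports how many messages it was given."""
    return f"received {len(messages)} messages"


chat = Conversation(fake_send)
print(chat.ask("Who won the world series in 2020?"))
print(chat.ask("Where was it played?"))
```

Swapping fake_send for send_with_openai should give a real conversation; the history handling stays the same.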

We can also see the usage field, which reports how many tokens the request used: prompt_tokens is the number of input tokens, completion_tokens is the number of tokens in ChatGPT's reply, and total_tokens is the sum of the two. This round cost 75 tokens.
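Turning the usage field into money is simple arithmetic. A small sketch using the $0.002/1k tokens price quoted in the introduction; request_cost is a name invented here, and the price will likely change over time:

```python
PRICE_PER_1K_TOKENS = 0.002  # USD, the launch price quoted at the top of this post


def request_cost(usage, price_per_1k=PRICE_PER_1K_TOKENS):
    """Cost in USD of one request, billed on prompt plus completion tokens."""
    return usage["total_tokens"] / 1000 * price_per_1k


usage = {"completion_tokens": 19, "prompt_tokens": 56, "total_tokens": 75}
print(f"${request_cost(usage):.5f}")
```

For the 75 tokens above that works out to $0.00015.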

Advanced Usage # 2

Now let's look at a more complex example.

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" # replace with your API key
openai.api_key = os.getenv("OPENAI_API_KEY")

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": 'I want you to act as an Chinese translator, spelling corrector and improver. I will speak to you in any language and you will detect the language, translate it and answer in the corrected and improved version of my text, in Chinese. I want you to replace my simplified A0-level words and sentences with more beautiful and elegant, upper level Chinese words and sentences. Keep the meaning same, but make them more literary. I want you to only reply the correction, the improvements and nothing else, do not write explanations. My first sentence is "Non terrae plus ultra"',
        },
    ],
    temperature=0.9,  # 0.0 to 2.0 (default 1.0)
    top_p=1,  # 0.0 to 1.0 (default 1.0); change this or temperature, not both
    n=5,  # number (default 1) How many chat completion choices to generate for each input message.
    stream=False,  # boolean (default False)
    stop=None,  # string or array (default None)
    max_tokens=10,  # inf (default 4096-prompt_token)
    presence_penalty=2.0,  # -2.0 to 2.0 (default 0)
    frequency_penalty=0,  # -2.0 to 2.0 (default 0)
    # logit_bias=
    # user=
)

print(completion)

for choice in completion.choices:
    print(choice.message.content)

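The prompt here is adapted from ³; the prompt itself has nothing to do with the parameters I want to explain.

This example involves more parameters:

temperature: 0.0 to 2.0 (default 1.0). Sampling temperature; higher values make the output more random, lower values make it more regular (more deterministic).
top_p: 0.0 to 1.0 (default 1.0). An alternative to temperature, also known as nucleus sampling; it is recommended not to change both temperature and top_p at once. With top_p the model only considers the most likely tokens whose cumulative probability adds up to top_p. For example, top_p=0.1 means only the tokens making up the top 10% of probability mass are considered, which is not the same as the top 10% of the vocabulary.
n: number (default 1). How many replies to generate.
stream: boolean (default False). Whether to use streaming mode. If set to True, partial message deltas are sent, just like in ChatGPT. What does that mean? You get a few words at a time so you can update the text dynamically, the way a reply fills in gradually in ChatGPT.
stop: string or array (default None). Sequences at which generation stops. It can be a single string or a list of up to 4 strings; with a list, generation stops as soon as any one of them appears.
max_tokens: inf (default 4096-prompt_token). The maximum number of tokens to generate.
frequency_penalty and presence_penalty: -2.0 to 2.0 (default 0). Both penalize repeated tokens. More details are in ⁴. The frequency penalty grows with the number of times a token has already appeared, while the presence penalty is applied once if the token has appeared at all. The larger these values are, the less the generated text repeats itself.

The formula looks like this, where mu[j] is the logit of token j and c[j] is the number of times that token has been sampled so far:

mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence

logit_bias: dict (default None). Adjusts the probability of specific tokens and accepts a JSON object that maps token IDs to a bias between -100 and 100. -100 effectively bans the token, and 100 effectively forces it to be selected.
user: string (default None). A unique identifier for your end user; see ⁵ for details. Its main purpose is to help prevent abuse.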
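To get a feel for how the penalties, temperature and top_p act on the model's scores, here is a toy sketch in plain Python. It is an illustration of the ideas only, not OpenAI's implementation; the function names and the 3-token vocabulary are invented, and temperature must be greater than 0 in this sketch:

```python
import math


def apply_penalties(logits, counts, alpha_frequency=0.0, alpha_presence=0.0):
    """mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence"""
    return [
        mu - c * alpha_frequency - float(c > 0) * alpha_presence
        for mu, c in zip(logits, counts)
    ]


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [mu / temperature for mu in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p=1.0):
    """Indices of the most likely tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    kept, mass = [], 0.0
    for j in order:
        kept.append(j)
        mass += probs[j]
        if mass >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.0]  # toy scores for a 3-token vocabulary
counts = [3, 0, 1]        # how often each token has been generated so far

penalized = apply_penalties(logits, counts, alpha_frequency=0.5, alpha_presence=1.0)
print(penalized)
print(softmax(logits, temperature=0.5))
print(nucleus(softmax(logits), top_p=0.5))
```

Token 0 has already appeared three times, so it takes the largest penalty, while token 1 has not appeared and keeps its score.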

The output of this code is shown below (Chinese text gets escaped in JSON, so I have swapped the escapes back to the Chinese characters here).

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "\"天涯海角\"",
        "role": "assistant"
      }
    },
    {
      "finish_reason": "stop",
      "index": 1,
      "message": {
        "content": "\n\n无穷尽之地",
        "role": "assistant"
      }
    },
    {
      "finish_reason": null,
      "index": 2,
      "message": {
        "content": "\n\n\"无出其右\"",
        "role": "assistant"
      }
    },
    {
      "finish_reason": "stop",
      "index": 3,
      "message": {
        "content": "\n\n\"无地不可至\"",
        "role": "assistant"
      }
    },
    {
      "finish_reason": "length",
      "index": 4,
      "message": {
        "content": "\n\n无地可往，更远",
        "role": "assistant"
      }
    }
  ],
  "created": 1677760749,
  "id": "chatcmpl-6pcVBkXoD9xr2CSioR1Gz3ubEOdpg",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 45,
    "prompt_tokens": 124,
    "total_tokens": 169
  }
}
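The escaping of Chinese characters mentioned before this output can be avoided when printing. A minimal sketch with the standard json module, using a hand-written dict in place of a real response. The response object in the v0.27 library behaves like a dict, so passing it to json.dumps should work the same way, though that part is an assumption I have not checked against every version:

```python
import json

choice = {"index": 0, "message": {"role": "assistant", "content": "天涯海角"}}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(choice)
# ensure_ascii=False keeps the characters as they are.
readable = json.dumps(choice, ensure_ascii=False, indent=2)

print(escaped)
print(readable)
```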

Advanced Usage # 3

Here is an example of the stream parameter.

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
openai.api_key = os.getenv("OPENAI_API_KEY")

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": 'I want you to act as an Chinese translator, spelling corrector and improver. I will speak to you in any language and you will detect the language, translate it and answer in the corrected and improved version of my text, in Chinese. I want you to replace my simplified A0-level words and sentences with more beautiful and elegant, upper level Chinese words and sentences. Keep the meaning same, but make them more literary. I want you to only reply the correction, the improvements and nothing else, do not write explanations. My first sentence is "Non terrae plus ultra"',
        },
    ],
    temperature=1,  # 0.0 to 2.0 (default 1.0)
    top_p=1,  # 0.0 to 1.0 (default 1.0); change this or temperature, not both
    n=1,  # number (default 1) How many chat completion choices to generate for each input message.
    stream=True,  # boolean (default False)
    stop=None,  # string or array (default None)
    # max_tokens=100,  # inf (default 4096-prompt_token)
    presence_penalty=2.0,  # -2.0 to 2.0 (default 0)
    frequency_penalty=0,  # -2.0 to 2.0 (default 0)
    # logit_bias=
    # user=
)

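for chunk in completion:
    # print(chunk)
    for choice in chunk.choices:
        # delta may lack "content", so fall back to an empty string.
        # end="" keeps the pieces on one line instead of one piece per line.
        print(choice.delta.get("content", ""), end="", flush=True)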

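With stream mode on, the call returns streamed data instead of a single object containing everything, and the returned object is iterable. Below is a sample item from it. Note that delta does not always contain content, so you need to check for it.

Does this mean ChatGPT already has the complete result before it starts printing, and the word-by-word output is just the front end wasting your time?! (Not quite: the model generates the reply one token at a time either way; without stream the API simply waits until generation is finished before returning everything.)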

{
  "choices": [
    {
      "delta": {
        "content": "\uff1f"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1677763452,
  "id": "chatcmpl-6pdCm5jwsB1e3YyEDZ1MQXbpHzWvn",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
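If you want the full reply as one string rather than printed pieces, the deltas can be joined as they arrive. A sketch that works on plain dicts shaped like the item above; join_stream and the fake chunks are made up for this illustration, and the role-only first chunk and empty last chunk reflect what the API typically sent at the time:

```python
def join_stream(chunks):
    """Concatenate the content deltas of choice 0 into the full reply."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        # Chunks without "content" contribute nothing.
        parts.append(delta.get("content", ""))
    return "".join(parts)


# Hand-written stand-ins for what the API streams back.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}, "finish_reason": None, "index": 0}]},
    {"choices": [{"delta": {"content": "天涯"}, "finish_reason": None, "index": 0}]},
    {"choices": [{"delta": {"content": "海角"}, "finish_reason": None, "index": 0}]},
    {"choices": [{"delta": {}, "finish_reason": "stop", "index": 0}]},
]
print(join_stream(fake_chunks))
```

This only follows choice 0, which is enough when n=1 as in the example above.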

Advanced Usage # 4

This is an example of the stop parameter.

import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
openai.api_key = os.getenv("OPENAI_API_KEY")

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": 'I want you to act as an Chinese translator, spelling corrector and improver. I will speak to you in any language and you will detect the language, translate it and answer in the corrected and improved version of my text, in Chinese. I want you to replace my simplified A0-level words and sentences with more beautiful and elegant, upper level Chinese words and sentences. Keep the meaning same, but make them more literary. I want you to only reply the correction, the improvements and nothing else, do not write explanations. My first sentence is "Non terrae plus ultra"',
        },
    ],
    temperature=1,  # 0.0 to 2.0 (default 1.0)
    top_p=1,  # 0.0 to 1.0 (default 1.0); change this or temperature, not both
    n=1,  # number (default 1) How many chat completion choices to generate for each input message.
    stream=False,  # boolean (default False)
    stop="无",  # string or array (default None)
    # max_tokens=100,  # inf (default 4096-prompt_token)
    presence_penalty=0,  # -2.0 to 2.0 (default 0)
    frequency_penalty=0,  # -2.0 to 2.0 (default 0)
    # logit_bias=
    # user=
)

for choice in completion.choices:
    print(choice.message.content)

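Here we again have it play the translator, translating a line spoken by Octane in Apex Legends, with "无" as the stop condition: as soon as the output is about to contain "无", generation stops and the result is returned. The output is: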

您好，您所给出的第一句话“Non terrae plus ultra”是拉丁语，意为“没有比这更远的土地”，翻译成中文可写作“

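You can see the output is cut off right there. (It says, roughly: Hello, the first sentence you gave, "Non terrae plus ultra", is Latin and means "there is no land farther than this"; in Chinese it can be written as ", and then it stops where 无 would have come next.) The stop string is converted to tokens automatically, so you do not have to do that yourself. I haven't come up with a use case for this yet, though - -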
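One common use, for what it's worth, is keeping the model from running past a natural boundary, such as inventing the next question in a Q&A transcript. The sketch below mimics the effect on the client side to show the semantics; truncate_at_stop is a name invented here, and the real API does the cutting on the server, which also saves the tokens that would otherwise be generated:

```python
def truncate_at_stop(text, stop):
    """Cut `text` before the earliest stop sequence; the sequence itself is dropped."""
    stops = [stop] if isinstance(stop, str) else list(stop)
    cut = len(text)
    for s in stops:
        pos = text.find(s)
        if pos != -1:
            cut = min(cut, pos)
    return text[:cut]


print(truncate_at_stop("Q: 1+1?\nA: 2\nQ: 2+2?\nA: 4", "\nQ:"))
print(truncate_at_stop("alpha, beta; gamma", [";", ","]))
```

With a list of stop strings, whichever one appears first in the text decides where the output ends.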

Error

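I hit this error several times while using the API. It was not a problem with my request format; most likely too many people were sending requests at the same time. The error message looks like this: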

openai.error.APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 7d8d2f6b67d92ff7850ef3e17d742827 in your email.) {
  "error": {
    "message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 7d8d2f6b67d92ff7850ef3e17d742827 in your email.)",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 7d8d2f6b67d92ff7850ef3e17d742827 in your email.)', 'type': 'server_error', 'param': None, 'code': None}} {'Date': 'Thu, 02 Mar 2023 12:37:21 GMT', 'Content-Type': 'application/json', 'Content-Length': '366', 'Connection': 'keep-alive', 'Access-Control-Allow-Origin': '*', 'Openai-Model': 'gpt-3.5-turbo-0301', 'Openai-Organization': 'user-8qnumkqgd3l02hvzq5rqz0y1', 'Openai-Processing-Ms': '750', 'Openai-Version': '2020-10-01', 'Strict-Transport-Security': 'max-age=15724800; includeSubDomains', 'X-Request-Id': '7d8d2f6b67d92ff7850ef3e17d742827'}
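Since the message itself says the request can be retried, a retry with exponential backoff is a reasonable way to handle it. A minimal sketch; call_with_retry and the flaky function are invented for this illustration, and the demonstration runs offline with a fake failure instead of a real request:

```python
import time


def call_with_retry(func, retry_on=(Exception,), max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**attempt and try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)


# Offline demonstration: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("The server had an error processing your request.")
    return "ok"


waits = []  # record the delays instead of actually sleeping
print(call_with_retry(flaky, retry_on=(RuntimeError,), sleep=waits.append))
print(waits)
```

With the real library you would presumably pass retry_on=(openai.error.APIError,), the exception type shown in the traceback above, and wrap the openai.ChatCompletion.create call in a function or lambda.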

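Conclusion

All I can say about this pricing is that it is really cheap. I suspect many companies won't even try to reproduce the model and will just call the API: the performance is good, there is no need to worry about cost, electricity, compute and so on, and the price is low. Even if a small company has its own model, I think its compute and power costs alone could be well above the API price, since it also comes down to utilization.

I also feel this changes the translation market to some extent. Take Tencent Cloud's translation API: by a rough calculation its price is about 3 times that of the GPT 3.5 Turbo API. Once you count the input tokens as well, the gap shrinks to about 1.5 times, but the GPT API comes with extra abilities thrown in, such as rewriting and polishing.

From trying it out you can also see that if every input is a long text, tokens go quickly, since both input and output are billed. And if you want session-level conversations, token consumption also grows fast: every turn resends the whole history, so the tokens per request grow with the number of turns and the running total grows roughly quadratically. Long conversations are therefore quite expensive.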
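The growth is easy to check with a few lines of arithmetic. A rough sketch, assuming every user message and every reply has a fixed token count and ignoring the system message and per-message overhead; session_tokens is a name invented here:

```python
def session_tokens(turns, user_tokens, reply_tokens):
    """Total billed tokens for a conversation where every turn resends the history."""
    total = 0
    history = 0  # tokens of all earlier messages
    for _ in range(turns):
        prompt = history + user_tokens
        total += prompt + reply_tokens
        history = prompt + reply_tokens
    return total


for n in (1, 10, 20):
    print(n, session_tokens(n, user_tokens=50, reply_tokens=50))
```

Doubling the conversation from 10 to 20 turns roughly quadruples the total, from 5500 to 21000 tokens in this toy setup.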

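Unfortunately, for users in mainland China the only practical way to pay at the moment seems to be a virtual credit card, so anyone who wants to pay out of pocket will need to find their own workaround.

I remember writing a blog post last September about my thoughts on Stable Diffusion⁶. Looking at SD models now, it feels like a century has passed... LoRA, ControlNet... If SD has mostly affected art and design, then the potential of large models like ChatGPT is truly huge and will affect many industries, because the range of outputs is so rich. For example, people might use it to output code, drive machines and so on⁷... The applications are limited only by imagination. For now there is still the problem of factual reliability. If a stronger knowledge base were built in so that outputs came with supporting evidence, the model could be applied far more widely, and fields with higher demands on evidence and decision-making, such as medicine and finance, would shine as well.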
