OpenAI Prompt Engineering-5. 模型推理

OpenAI的Prompt Engineering 课程

5. 模型推理

大模型有将一个文本作为输入,然后进行提取标签、提取名字、理解文本情感等能力。比如想要提取一段文本中的态度是正面/负面的操作,在传统的机器学习工作流程中,需要先收集标签数据,然后训练模型、部署模型,这是一个复杂的过程。另外分析情感、提取标签都需要分别训练和部署一个单独的模型。对于大语言模型来说都可以处理,这给开发应用程序带来了巨大的提速。

下面我们来看一个例子,我们根据一段用户的评论,来提取评论的态度是否是积极的,用户是否表达了愤怒,产品的名称、产品的制造商等信息。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
lamp_review = f"""
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point./
Got it fast. The string to our lamp broke during the
transit and
the company happily sent over a new one.\
Came within a fev days as well. It was easy to put\
together.
I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

prompt = f"""
Identify the following items from the review text:
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item
The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Sentiment","Anger","Item" and "Brand" as the keys.
If the information isn't present,use "unknown"\
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.
Review text:'''{lamp_review}'''
"""

我们看看大模型是怎么输出的:

1
2
3
4
5
6
7
8
```json
{
"Sentiment": "positive",
"Anger": false,
"Item": "lamp",
"Brand": "Lumina"
}
```

模型按照要求出色的完成了任务,然后 json 结构化的形式输出,我们可以拿到结果很轻松的在 python 或者其他语言框架中作为输入处理结果。

接下来再看一个例子,我们输入一段虚构的新闻报道,让大模型来分析这段文字的主题,并且罗列这个文章五个相关的主题列表。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
story = """In a
recent survey conducted by the government,
public sector employees were asked to rate their level
of satisfaction with the department they work at.
The results revealed that NASA was the most popular
department with a satisfaction rating of 95%.
One NASA employee,John Smith, commented on the findings,
stating, "I'm not surprised that NASA came out on top.
It's a great place to work with amazing people and
incredible opportunities. I'm proud to be a part of
such an innovative organization."|
The results were also welcomed by NASA's management team,
with Director Tom Johnson stating, "we
are thrilled to
hear that our employees are satisfied with their work at NASA
We have a talented and dedicated team who work tirelessly
to achieve our goals,and it's fantastic to see that their
hard work is paying off."
The survey also revealed that the
Social Security Administration had the lowest satisfaction
rating,with only 458 of employees indicating they were
satisfied with their job. The government has pledged to
address the concerns raised by employees in the survey and
work towards improving job satisfaction across all department
"""

prompt_2 = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple backticks.
Make each item one or two words long.
Format your response as a list of items separated by commas.
Text sample:'''{story}'''
"""

看看大模型是怎么输出的:

1
survey, satisfaction, NASA, employees, management

整体来说提取的准确度和相关性还是非常不错的。

现在如果我有一个话题列表和一 篇报道,我想知道这个报道涉及到话题列表中的哪些话题,也直接交给大模型分析:

1
2
3
4
5
6
7
8
9
10
11
12
topic_list = [
"nasa", "local government", "engineering",
"'employee satisfaction", "federal government"]

prompt_3 = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple backticks.
Give your answer as list with 0 or 1 for each topic.\
List of topics: {",".join(topic_1ist)}
Texi sample: '''{story}'''
"""

我们来看看DeepSeek的输出:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
To determine whether each topic in the list is present in the given text, we need to check for mentions or discussions related to each topic. Here's the analysis:

1. **Government survey**: The text begins with "In a recent survey conducted by the government," which directly mentions a government survey.
- **Answer**: 1

2. **NASA satisfaction rating**: The text states that "NASA was the most popular department with a satisfaction rating of 95%," and includes comments from a NASA employee and its director.
- **Answer**: 1

3. **Social Security Administration satisfaction**: The text mentions that "the Social Security Administration had the lowest satisfaction rating, with only 45% of employees indicating they were satisfied."
- **Answer**: 1

4. **Employee comments**: The text includes direct quotes from a NASA employee (John Smith) and NASA's Director (Tom Johnson).
- **Answer**: 1

5. **Government improvement plans**: The text states that "the government has pledged to address the concerns raised by employees in the survey and work towards improving job satisfaction across all departments."
- **Answer**: 1

### Final Answer:
`[1, 1, 1, 1, 1]`

(All topics are present in the text.)

大模型也按照它的分析对主题列表中涉及到的主题做出了标记。但是有意思的是,我使用了相同的分析文本和提示词,课程中的OpenAI给出的答案和DeepSeek给出的答案相去甚远。下面是OpenAI给出的答案:

1
2
3
4
5
nasa: 1
local government:0
engineering:0
employee satisfaction: 1
federal government: 1

我理解这就是继续迭代提示词的动机,我们可以继续完善提示词,通过提供案例,约束输出来实现不同模型间输出的差异。吴恩达老师在这里提到了一个概念,我们上面的这种工程,没有提供任何有标签的训练数据,这也被称之为零示例工程

在短短几分钟之内,建立了多个能力,对文本进行推断,要想实现这些能力,在之前需要一名熟练的机器学习开发人员数天或者数周的时间才能完成。所以,无论对于熟练的机器学习开发人员还是机器学习新手而言,这都是一个令人兴奋的消息。现在可以使用提示词非常快速的对复杂的自然语言处理任务做出推断处理。