Prompt_Engineering_v7

Overview

Translation of 22365_3_Prompt_Engineering_v7.pdf

Original title: Prompt Engineering

Introduction

When thinking about a large language model's input and output, a text prompt (sometimes accompanied by other modalities, such as image prompts) is the input the model uses to predict a specific output. You don't need to be a data scientist or a machine learning engineer: everyone can write a prompt. However, crafting the most effective prompt can be complicated. Many aspects of your prompt affect its efficacy: the model you use, the model's training data, the model configurations, your word choice, style and tone, structure, and context all matter. Therefore, prompt engineering is an iterative process. Inadequate prompts can lead to ambiguous, inaccurate responses and can hinder the model's ability to provide meaningful output.

When you chat with the Gemini chatbot, you are basically writing prompts; however, this whitepaper focuses on writing prompts for Gemini models within Vertex AI or via the API, because prompting the model directly gives you access to configuration settings such as temperature.
This whitepaper discusses prompt engineering in detail. We will look into the various prompting techniques to help you get started, and share tips and best practices to become a prompting expert. We will also discuss some of the challenges you may face while crafting prompts.

Page 1

Original

Prompt Engineering

Author: Lee Boonstra

Translation

提示工程
作者:李·邦斯特拉


Page 2

Original


Acknowledgements

Content contributors

Michael Sherman

Yuan Cao

Erick Armbrust

Anant Nawalgaria

Antonio Gulli

Simone Cammel

Curators and Editors

Antonio Gulli

Anant Nawalgaria

Grace Mollison

Technical Writer

Joey Haymaker

Designer

Michael Lanning

Translation


致谢

内容贡献者
Michael Sherman
Yuan Cao
Erick Armbrust
Anant Nawalgaria
Antonio Gulli
Simone Cammel

策划与编辑
Antonio Gulli
Anant Nawalgaria
Grace Mollison

技术撰稿人
Joey Haymaker

设计师
Michael Lanning


Page 3

Original

Table of contents

Introduction 6

Prompt engineering 7

LLM output configuration 8

Output length 8

Sampling controls 9

Temperature 9

Top-K and top-P 10

Putting it all together 11

Prompting techniques 13

General prompting / zero shot 13

One-shot & few-shot 15

System, contextual and role prompting 18

System prompting 19

Role prompting 21

Contextual prompting 23


Translation

目录

介绍 6

提示工程 7

大语言模型输出配置 8

输出长度 8

采样控制 9

温度 9

Top-K和top-P 10

综合应用 11

提示技术 13

通用提示 / 零样本 13

单样本 & 少样本 15

系统提示、上下文提示和角色提示 18

系统提示 19

角色提示 21

上下文提示 23


Page 4

Original

Step-back prompting 25

Chain of Thought (CoT) 29

Self-consistency 32

Tree of Thoughts (ToT) 36

ReAct (reason & act) 37

Automatic Prompt Engineering 40

Code prompting 42

Prompts for writing code 42

Prompts for explaining code 44

Prompts for translating code 46

Prompts for debugging and reviewing code 48

What about multimodal prompting? 54

Best Practices 54

Provide examples 54

Design with simplicity 55

Be specific about the output 56

Use Instructions over Constraints 56

Control the max token length 58

Use variables in prompts 58

Experiment with input formats and writing styles 59

For few-shot prompting with classification tasks, mix up the classes 59

Adapt to model updates 60

Experiment with output formats 60

Translation

后退一步提示法 25

思维链(CoT) 29

自我一致性 32

思维树(ToT) 36

ReAct(推理与行动) 37

自动提示工程 40

代码提示 42

代码编写提示 42

代码解释提示 44

代码翻译提示 46

代码调试与审查提示 48

多模态提示如何? 54

最佳实践 54

提供示例 54

简单设计 55

明确输出要求 56

使用指令而非约束 56

控制最大令牌长度 58

在提示中使用变量 58

尝试不同的输入格式和写作风格 59

分类任务的少样本提示中混合类别 59

适应模型更新 60

尝试不同的输出格式 60


Page 5

Original

JSON Repair 61

Working with Schemas 62

Experiment together with other prompt engineers 63

CoT Best practices 64

Document the various prompt attempts 64

Summary 66

Endnotes 68

Translation

JSON 修复 61

使用模式 62

与其他提示工程师共同实验 63

CoT 最佳实践 64

记录各种提示尝试 64

总结 66

尾注 68


Page 6

Original


Introduction

When thinking about a large language model input and output, a text prompt (sometimes

accompanied by other modalities such as image prompts) is the input the model uses

to predict a specific output. You don’t need to be a data scientist or a machine learning

engineer – everyone can write a prompt. However, crafting the most effective prompt can be

complicated. Many aspects of your prompt affect its efficacy: the model you use, the model’s

training data, the model configurations, your word-choice, style and tone, structure, and

context all matter. Therefore, prompt engineering is an iterative process. Inadequate prompts

can lead to ambiguous, inaccurate responses, and can hinder the model’s ability to provide

meaningful output.

You don’t need to be a data scientist or a machine learning engineer – everyone can write a prompt.

Translation


引言

在思考大型语言模型的输入和输出时,文本提示(有时会伴随其他模态,如图像提示)是模型用来预测特定输出的输入。您不需要成为数据科学家或机器学习工程师——每个人都可以编写提示。然而,编写最有效的提示可能很复杂。您的提示的许多方面会影响其有效性:您使用的模型、模型的训练数据、模型的配置、您的用词、风格和语气、结构以及上下文都很重要。因此,提示工程是一个迭代过程。不充分的提示可能导致模糊、不准确的回应,并可能阻碍模型提供有意义输出的能力。

您不需要成为数据科学家或机器学习工程师——每个人都可以编写提示。


Page 7

Original


When you chat with the Gemini chatbot,1 you basically write prompts, however this

whitepaper focuses on writing prompts for the Gemini model within Vertex AI or by using

the API, because by prompting the model directly you will have access to the configuration

such as temperature etc.

This whitepaper discusses prompt engineering in detail. We will look into the various

prompting techniques to help you getting started and share tips and best practices to

become a prompting expert. We will also discuss some of the challenges you can face

while crafting prompts.

Prompt engineering

Remember how an LLM works; it’s a prediction engine. The model takes sequential text as

an input and then predicts what the following token should be, based on the data it was

trained on. The LLM is operationalized to do this over and over again, adding the previously

predicted token to the end of the sequential text for predicting the following token. The next

token prediction is based on the relationship between what’s in the previous tokens and what

the LLM has seen during its training.

When you write a prompt, you are attempting to set up the LLM to predict the right sequence

of tokens. Prompt engineering is the process of designing high-quality prompts that guide

LLMs to produce accurate outputs. This process involves tinkering to find the best prompt,

optimizing prompt length, and evaluating a prompt’s writing style and structure in relation

to the task. In the context of natural language processing and LLMs, a prompt is an input

provided to the model to generate a response or prediction.

Translation


当您与Gemini聊天机器人对话时,您基本上就是在编写提示,然而本白皮书重点关注在Vertex AI中使用Gemini模型或通过API编写提示,因为直接提示模型可以访问诸如温度等配置。

本白皮书将详细讨论提示工程。我们将探讨各种提示技巧,帮助您入门,并分享成为提示专家的技巧和最佳实践。我们还将讨论在编写提示时可能面临的一些挑战。

提示工程

请记住大型语言模型(LLM)的工作原理;它是一个预测引擎。模型将顺序文本作为输入,然后根据其训练数据预测下一个标记(token)应该是什么。LLM被设计为反复执行此操作,将先前预测的标记添加到顺序文本的末尾,以预测下一个标记。下一个标记的预测基于前一个标记中的内容与LLM在训练过程中所见内容之间的关系。

当您编写提示时,您试图设置LLM以预测正确的标记序列。提示工程是设计高质量提示以引导LLM产生准确输出的过程。此过程包括尝试找到最佳提示、优化提示长度,并根据任务评估提示的写作风格和结构。在自然语言处理和LLM的上下文中,提示是提供给模型以生成响应或预测的输入。
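To make the prediction loop concrete, here is a minimal Python sketch of the "predict one token, append it, repeat" cycle described above. `predict_next_token` is a hypothetical stand-in for a real model's forward pass (a toy rule is used so the sketch runs), not an actual API.

```python
def predict_next_token(tokens: list[str]) -> str:
    # Hypothetical stand-in for a real LLM forward pass. A real model would
    # score every token in its vocabulary and sample one; this toy rule just
    # echoes the last token so the sketch is runnable.
    return tokens[-1]

def generate(prompt_tokens: list[str], max_new_tokens: int) -> list[str]:
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next_token(tokens)  # predict the following token...
        tokens.append(next_token)                # ...append it, and repeat
    return tokens

print(generate(["The", "sky", "is"], max_new_tokens=3))
```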


Page 8

Original


These prompts can be used to achieve various kinds of understanding and generation

tasks such as text summarization, information extraction, question and answering, text

classification, language or code translation, code generation, and code documentation

or reasoning.

Please feel free to refer to Google’s prompting guides2,3 with simple and effective

prompting examples.

When prompt engineering, you will start by choosing a model. Prompts might need to be

optimized for your specific model, regardless of whether you use Gemini language models in

Vertex AI, GPT, Claude, or an open source model like Gemma or LLaMA.

Besides the prompt, you will also need to tinker with the various configurations of a LLM.

LLM output configuration

Once you choose your model you will need to figure out the model configuration. Most LLMs

come with various configuration options that control the LLM’s output. Effective prompt

engineering requires setting these configurations optimally for your task.

Output length

An important configuration setting is the number of tokens to generate in a response.

Generating more tokens requires more computation from the LLM, leading to higher energy

consumption, potentially slower response times, and higher costs.

Translation


这些提示词可用于实现各种理解和生成任务,如文本摘要、信息提取、问答、文本分类、语言或代码翻译、代码生成,以及代码文档或推理。

请随时参考谷歌的提示词指南2,3,其中包含简单有效的提示词示例。

在进行提示词工程时,您首先要选择一个模型。无论您使用的是Vertex AI中的Gemini语言模型、GPT、Claude,还是Gemma或LLaMA等开源模型,提示词都可能需要针对您的特定模型进行优化。

除了提示词,您还需要调整LLM的各种配置。

LLM输出配置

选择模型后,您需要确定模型配置。大多数LLM都带有各种控制输出的配置选项。有效的提示词工程需要为您的任务最佳地设置这些配置。

输出长度

一个重要的配置设置是响应中要生成的标记数量。生成更多标记需要LLM进行更多计算,导致更高的能源消耗、可能更慢的响应时间和更高的成本。


Page 9

Original


Reducing the output length of the LLM doesn’t cause the LLM to become more stylistically

or textually succinct in the output it creates, it just causes the LLM to stop predicting more

tokens once the limit is reached. If your needs require a short output length, you’ll also

possibly need to engineer your prompt to accommodate.

Output length restriction is especially important for some LLM prompting techniques, like

ReAct, where the LLM will keep emitting useless tokens after the response you want.

Be aware, generating more tokens requires more computation from the LLM, leading

to higher energy consumption and potentially slower response times, which leads to

higher costs.

Sampling controls

LLMs do not formally predict a single token. Rather, LLMs predict probabilities for what the

next token could be, with each token in the LLM’s vocabulary getting a probability. Those

token probabilities are then sampled to determine what the next produced token will be.

Temperature, top-K, and top-P are the most common configuration settings that determine

how predicted token probabilities are processed to choose a single output token.

Temperature

Temperature controls the degree of randomness in token selection. Lower temperatures

are good for prompts that expect a more deterministic response, while higher temperatures

can lead to more diverse or unexpected results. A temperature of 0 (greedy decoding) is

Translation


减少大型语言模型(LLM)的输出长度并不会使其在输出风格或文本简洁性上有所提升,而只是当达到限制时,模型会停止预测更多token。如果你的需求需要较短的输出长度,你可能还需要设计相应的提示来满足这一要求。

对于某些LLM提示技术(如ReAct),输出长度限制尤为重要,因为在这些技术中,LLM会在给出你想要的回应后继续输出无用的token。

请注意,生成更多的token需要LLM进行更多计算,这会导致更高的能源消耗、可能更慢的响应时间,以及更高的成本。

采样控制

大型语言模型(LLM)并不正式预测单个token。相反,LLM预测下一个token可能是什么的概率,LLM词汇表中的每个token都会获得一个概率。然后对这些token概率进行采样,以确定下一个生成的token是什么。

温度(Temperature)、top-K和top-P是最常见的配置设置,它们决定了如何处理预测的token概率以选择单个输出token。

温度控制token选择的随机程度。较低的温度适合于期望更确定性回应的提示,而较高的温度可能导致更多样化或意想不到的结果。温度为0(贪婪解码)时


Page 10

Original


deterministic: the highest probability token is always selected (though note that if two tokens

have the same highest predicted probability, depending on how tiebreaking is implemented

you may not always get the same output with temperature 0).

Temperatures close to the max tend to create more random output. And as temperature gets

higher and higher, all tokens become equally likely to be the next predicted token.

The Gemini temperature control can be understood in a similar way to the softmax function

used in machine learning. A low temperature setting mirrors a low softmax temperature (T),

emphasizing a single, preferred token with high certainty. A higher Gemini temperature

setting is like a high softmax temperature, making a wider range of tokens around the

selected setting more acceptable. This increased uncertainty accommodates scenarios

where a rigid, precise choice may not be essential, for example when experimenting

with creative outputs.

Top-K and top-P

Top-K and top-P (also known as nucleus sampling)4 are two sampling settings used in LLMs

to restrict the predicted next token to come from tokens with the top predicted probabilities.

Like temperature, these sampling settings control the randomness and diversity of

generated text.

• Top-K sampling selects the top K most likely tokens from the model’s predicted

distribution. The higher the top-K, the more creative and varied the model’s output; the

lower the top-K, the more restrictive and factual the model’s output. A top-K of 1 is

equivalent to greedy decoding.

Translation


确定性:总是选择概率最高的词元(但请注意,如果两个词元具有相同的最高预测概率,根据平局打破方式的实现不同,在使用温度为0的情况下,您可能不会总是得到相同的输出)。

接近最大值的温度往往会创造出更随机的输出。随着温度越来越高,所有词元都同样可能成为下一个预测的词元。

Gemini温度控制可以类比机器学习中使用的softmax函数来理解。低温度设置类似于低softmax温度(T),以高确定性强调单一首选词元。更高的Gemini温度设置类似于高softmax温度,使所选设置周围更大范围的词元都可以被接受。这种增加的不确定性适用于不需要严格精确选择的场景,例如在尝试创意输出时。
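The softmax analogy can be made concrete. Below is a minimal, self-contained sketch (using numpy and toy logits rather than a real model) of how dividing logits by a temperature before the softmax sharpens or flattens the sampling distribution, with temperature 0 falling back to greedy decoding.

```python
import numpy as np

def sample_token(logits, temperature, rng=np.random.default_rng(0)):
    logits = np.asarray(logits, dtype=float)
    if temperature == 0:
        return int(np.argmax(logits))   # greedy decoding: always the top token
    scaled = logits / temperature       # low T sharpens, high T flattens
    scaled -= scaled.max()              # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()  # softmax over the vocabulary
    return int(rng.choice(len(probs), p=probs))

toy_logits = [2.0, 1.0, 0.2]            # three-token toy vocabulary
print(sample_token(toy_logits, temperature=0))    # deterministic
print(sample_token(toy_logits, temperature=2.0))  # noticeably more random
```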

Top-K和top-P

Top-K和top-P(也称为核心采样)4是两种在大型语言模型中使用的采样设置,用于限制预测的下一个词元来自具有最高预测概率的词元。与温度类似,这些采样设置控制生成文本的随机性和多样性。

• Top-K采样从模型预测的分布中选择前K个最可能的词元。Top-K值越高,模型的输出越有创意和多样化;Top-K值越低,模型的输出越保守和事实性。Top-K为1等同于贪心解码。


Page 11

Original


• Top-P sampling selects the top tokens whose cumulative probability does not exceed

a certain value (P). Values for P range from 0 (greedy decoding) to 1 (all tokens in the

LLM’s vocabulary).

The best way to choose between top-K and top-P is to experiment with both methods (or

both together) and see which one produces the results you are looking for.

Putting it all together

Choosing between top-K, top-P, temperature, and the number of tokens to generate,

depends on the specific application and desired outcome, and the settings all impact one

another. It’s also important to make sure you understand how your chosen model combines

the different sampling settings together.

If temperature, top-K, and top-P are all available (as in Vertex Studio), tokens that meet

both the top-K and top-P criteria are candidates for the next predicted token, and then

temperature is applied to sample from the tokens that passed the top-K and top-P criteria. If

only top-K or top-P is available, the behavior is the same but only the one top-K or P setting

is used.

If temperature is not available, whatever tokens meet the top-K and/or top-P criteria are then

randomly selected from to produce a single next predicted token.

At extreme settings of one sampling configuration value, that one sampling setting either

cancels out other configuration settings or becomes irrelevant.

Translation


• Top-P采样选择累积概率不超过特定值(P)的前几个标记。P的值范围从0(贪婪解码)到1(包含LLM词汇表中的所有标记)。

在top-K和top-P之间做选择的最好方法是尝试这两种方法(或同时使用),看看哪种方法能产生您想要的结果。

综合应用

选择top-K、top-P、温度以及要生成的标记数量,取决于具体应用和期望的结果,并且这些设置都会相互影响。确保您了解所选模型如何结合不同的采样设置也很重要。

如果温度、top-K和top-P都可用(如在Vertex Studio中),同时满足top-K和top-P标准的标记将成为下一个预测标记的候选,然后应用温度从通过top-K和top-P标准的标记中进行采样。如果只提供top-K或top-P,行为相同,但只使用一个top-K或P设置。

如果温度不可用,则从满足top-K和/或top-P标准的标记中随机选择一个作为下一个预测标记。
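As a rough illustration of the combination just described, the sketch below keeps only tokens that meet both the top-K and top-P criteria and then applies temperature sampling to the survivors. It is a toy reading of the behavior described above, not any vendor's actual decoder.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sample(logits, temperature, k, p, rng=np.random.default_rng(0)):
    logits = np.asarray(logits, dtype=float)
    probs = softmax(logits)
    # Top-K criterion: the k most probable tokens.
    top_k = set(np.argsort(probs)[-k:].tolist())
    # Top-P criterion: the smallest set of top tokens whose cumulative
    # probability reaches p (nucleus sampling); at least one token is kept.
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), p)) + 1
    top_p = set(order[:cutoff].tolist())
    candidates = sorted(top_k & top_p)  # tokens passing both criteria
    if temperature == 0:
        return max(candidates, key=lambda i: probs[i])   # greedy
    weights = softmax(logits[candidates] / temperature)  # temperature applied last
    return int(rng.choice(candidates, p=weights))

print(sample([3.0, 2.5, 1.0, 0.2, -1.0], temperature=0.8, k=3, p=0.9))
```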

在一个采样配置值的极端设置下,该采样设置要么会抵消其他配置设置,要么变得无关紧要。


Page 12

Original


• If you set temperature to 0, top-K and top-P become irrelevant–the most probable

token becomes the next token predicted. If you set temperature extremely high (above

1–generally into the 10s), temperature becomes irrelevant and whatever tokens make

it through the top-K and/or top-P criteria are then randomly sampled to choose a next

predicted token.

• If you set top-K to 1, temperature and top-P become irrelevant. Only one token passes the

top-K criteria, and that token is the next predicted token. If you set top-K extremely high,

like to the size of the LLM’s vocabulary, any token with a nonzero probability of being the

next token will meet the top-K criteria and none are selected out.

• If you set top-P to 0 (or a very small value), most LLM sampling implementations will then

only consider the most probable token to meet the top-P criteria, making temperature and

top-K irrelevant. If you set top-P to 1, any token with a nonzero probability of being the

next token will meet the top-P criteria, and none are selected out.

As a general starting point, a temperature of .2, top-P of .95, and top-K of 30 will give you

relatively coherent results that can be creative but not excessively so. If you want especially

creative results, try starting with a temperature of .9, top-P of .99, and top-K of 40. And if you

want less creative results, try starting with a temperature of .1, top-P of .9, and top-K of 20.

Finally, if your task always has a single correct answer (e.g., answering a math problem), start

with a temperature of 0.

NOTE: With more freedom (higher temperature, top-K, top-P, and output tokens), the LLM

might generate text that is less relevant.

WARNING: Have you ever seen a response ending with a large amount of filler words? This

is also known as the "repetition loop bug", which is a common issue in Large Language

Models where the model gets stuck in a cycle, repeatedly generating the same (filler) word,

phrase, or sentence structure, often exacerbated by inappropriate temperature and top-k/

Translation


• 如果您将温度设置为0,那么top-K和top-P将变得无关紧要——最可能的token将成为下一个预测的token。如果您将温度设置得非常高(高于1—通常达到10以上),温度将变得无关紧要,任何通过top-K和/或top-P标准的token将被随机采样以选择下一个预测的token。

• 如果您将top-K设置为1,那么温度和top-P将变得无关紧要。只有一个token通过top-K标准,该token就是下一个预测的token。如果您将top-K设置得非常高,比如到LLM词汇表的大小,任何具有非零概率成为下一个token的token都将满足top-K标准,且不会被排除。

• 如果您将top-P设置为0(或一个非常小的值),大多数LLM采样实现将只考虑最可能的token以满足top-P标准,从而使温度和top-K变得无关紧要。如果您将top-P设置为1,任何具有非零概率成为下一个token的token都将满足top-P标准,且不会被排除。

作为一般的起点,温度设置为.2,top-P设置为.95,top-K设置为30将给您相对连贯的结果,这些结果可以是有创意的,但不过分。如果您想要特别有创意的结果,可以尝试从温度.9,top-P.99,top-K 40开始。如果您想要创意较少的结果,可以尝试从温度.1,top-P.9,top-K 20开始。

最后,如果您的任务总是只有一个正确答案(例如,回答数学问题),可以从温度0开始。
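Expressed as plain dictionaries, the starting points suggested above look like this. The key names mirror common LLM SDKs but are assumptions for illustration, not any specific API's schema.

```python
# Illustrative starting configurations, straight from the text above.
COHERENT = {"temperature": 0.2, "top_p": 0.95, "top_k": 30}  # balanced default
CREATIVE = {"temperature": 0.9, "top_p": 0.99, "top_k": 40}  # more diverse output
PRECISE  = {"temperature": 0.1, "top_p": 0.90, "top_k": 20}  # less creative output
SINGLE_ANSWER = {"temperature": 0.0}  # tasks with one correct answer, e.g. math
```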

注意:随着自由度的增加(更高的温度、top-K、top-P和输出token),LLM可能会生成相关性较低的文本。

警告:您是否曾经看到过以大量填充词结束的回应?这也被称为"重复循环错误",这是大型语言模型中的一个常见问题,模型陷入循环,重复生成相同的(填充)词、短语或句子结构,通常由不适当的温度和top-k/p参数加剧。


Page 13

Original


top-p settings. This can occur at both low and high temperature settings, though for different

reasons. At low temperatures, the model becomes overly deterministic, sticking rigidly to the

highest probability path, which can lead to a loop if that path revisits previously generated

text. Conversely, at high temperatures, the model's output becomes excessively random,

increasing the probability that a randomly chosen word or phrase will, by chance, lead back

to a prior state, creating a loop due to the vast number of available options. In both cases,

the model's sampling process gets "stuck," resulting in monotonous and unhelpful output

until the output window is filled. Solving this often requires careful tinkering with temperature

and top-k/top-p values to find the optimal balance between determinism and randomness.

Prompting techniques

LLMs are tuned to follow instructions and are trained on large amounts of data so they can

understand a prompt and generate an answer. But LLMs aren’t perfect; the clearer your

prompt text, the better it is for the LLM to predict the next likely text. Additionally, specific

techniques that take advantage of how LLMs are trained and how LLMs work will help you get

the relevant results from LLMs

Now that we understand what prompt engineering is and what it takes, let’s dive into some

examples of the most important prompting techniques.

General prompting / zero shot

A zero-shot5 prompt is the simplest type of prompt. It only provides a description of a task

and some text for the LLM to get started with. This input could be anything: a question, a

start of a story, or instructions. The name zero-shot stands for ’no examples’.

Translation


top-p设置。这种情况可能在低温和高温设置下发生,但原因不同。在低温下,模型变得过于确定,严格遵循最高概率路径,如果该路径重新访问之前生成的文本,可能会导致循环。相反,在高温下,模型的输出变得过度随机,增加了随机选择单词或短语偶然回到先前状态的概率,由于大量可用选项的存在而形成循环。在这两种情况下,模型的采样过程都会"卡住",导致单调且无用的输出,直到输出窗口被填满。解决这个问题通常需要仔细调整温度和top-k/top-p值,以确定确定性和随机性之间的最佳平衡。

提示技术

大型语言模型(LLMs)经过调整以遵循指令,并在大量数据上进行训练,以便它们能够理解提示并生成答案。但LLMs并不完美;你的提示文本越清晰,LLM就越能预测下一个可能的文本。此外,利用LLM训练方式和工作原理的特定技术将帮助你从LLM中获得相关结果。

既然我们了解了什么是提示工程以及它需要什么,让我们深入探讨一些最重要的提示技术示例。

通用提示 / 零样本

零样本提示是最简单的提示类型。它只提供任务的描述和一些供LLM开始使用的文本。这个输入可以是任何内容:一个问题、一个故事的开头或指令。零样本这个名字代表"没有示例"。


Page 14

Original


Let’s use Vertex AI Studio (for Language) in Vertex AI,6 which provides a playground to test

prompts. In Table 1, you will see an example zero-shot prompt to classify movie reviews.

The table format as used below is a great way of documenting prompts. Your prompts will

likely go through many iterations before they end up in a codebase, so it’s important to keep

track of your prompt engineering work in a disciplined, structured way. More on this table

format, the importance of tracking prompt engineering work, and the prompt development

process is in the Best Practices section later in this chapter (“Document the various prompt

attempts”).

The model temperature should be set to a low number, since no creativity is needed, and we

use the gemini-pro default top-K and top-P values, which effectively disable both settings

(see ‘LLM Output Configuration’ above). Pay attention to the generated output. The words

disturbing and masterpiece should make the prediction a little more complicated, as both

words are used in the same sentence.

Translation


让我们使用Vertex AI中的Vertex AI Studio(用于语言)6,它提供了一个测试提示的游乐场。在表1中,您将看到一个用于分类电影评论的零样本提示示例。

下面使用的表格格式是记录提示的好方法。您的提示很可能要经过多次迭代才能最终进入代码库,因此以纪律性、结构化的方式跟踪您的提示工程工作非常重要。关于这种表格格式、跟踪提示工程工作的重要性以及提示开发过程的更多信息,请参见本章后面的"最佳实践"部分("记录各种提示尝试")。

由于不需要创造性,模型温度应设置为较低数值,我们使用gemini-pro默认的top-K和top-P值,这实际上禁用了这两个设置(参见上面的"LLM输出配置")。请注意生成的输出:"令人不安"(disturbing)和"杰作"(masterpiece)这两个词出现在同一句话中,这应该会使预测变得稍微复杂一些。
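Because the table format doubles as a record of each attempt, one lightweight way to keep that record is a JSON-lines log with the same fields as Table 1. The field names and file layout here are an assumption for illustration, not a prescribed format.

```python
import json

attempt = {
    "name": "1_1_movie_classification",
    "goal": "Classify movie reviews as positive, neutral or negative.",
    "model": "gemini-pro",
    "temperature": 0.1,
    "token_limit": 5,
    "top_k": None,  # N/A in Table 1
    "top_p": 1,
    "prompt": "Classify movie reviews as POSITIVE, NEUTRAL or NEGATIVE. ...",
    "output": "POSITIVE",
}

# Append one JSON record per attempt so iterations stay reviewable.
with open("prompt_attempts.jsonl", "a", encoding="utf-8") as log:
    log.write(json.dumps(attempt) + "\n")
```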


Page 15

Original


Name 1_1_movie_classification

Goal Classify movie reviews as positive, neutral or negative.

Model gemini-pro

Temperature 0.1 Token Limit 5

Top-K N/A Top-P 1

Prompt Classify movie reviews as POSITIVE, NEUTRAL or NEGATIVE. Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. I wish there were more movies like this masterpiece. Sentiment:

Output POSITIVE

Table 1. An example of zero-shot prompting

When zero-shot doesn’t work, you can provide demonstrations or examples in the prompt,

which leads to “one-shot” and “few-shot” prompting.

One-shot & few-shot

When creating prompts for AI models, it is helpful to provide examples. These examples can

help the model understand what you are asking for. Examples are especially useful when you

want to steer the model to a certain output structure or pattern.

A one-shot prompt, provides a single example, hence the name one-shot. The idea is the

model has an example it can imitate to best complete the task.

A few-shot prompt 7 provides multiple examples to the model. This approach shows the

model a pattern that it needs to follow. The idea is similar to one-shot, but multiple examples

of the desired pattern increases the chance the model follows the pattern.

Translation


名称:1_1_电影分类

目标:将电影评论分为正面、中性或负面。

模型:gemini-pro
温度:0.1 令牌限制:5
Top-K:不适用 Top-P:1

提示:将电影评论分为正面、中性或负面。评论:"Her"是一部令人不安的作品,揭示了如果人工智能任其发展不加限制,人类将走向何方。我希望能有更多这样的杰作。情感:

输出:正面

表1. 零样本提示的示例

当零样本方法不起作用时,您可以在提示中提供演示或示例,这就引出了"单样本"和"少样本"提示。

单样本 & 少样本

为AI模型创建提示时,提供示例是有帮助的。这些示例可以帮助模型理解您的要求。当您希望引导模型产生特定的输出结构或模式时,示例尤其有用。

单样本提示提供了一个示例,因此得名。其理念是模型可以模仿这个示例来最好地完成任务。

少样本提示向模型提供了多个示例。这种方法向模型展示了它需要遵循的模式。其理念与单样本类似,但提供多个期望模式的示例增加了模型遵循该模式的可能性。


Page 16

Original


The number of examples you need for few-shot prompting depends on a few factors,

including the complexity of the task, the quality of the examples, and the capabilities of the

generative AI (gen AI) model you are using. As a general rule of thumb, you should use at

least three to five examples for few-shot prompting. However, you may need to use more

examples for more complex tasks, or you may need to use fewer due to the input length

limitation of your model.

Table 2 shows a few-shot prompt example, let’s use the same gemini-pro model

configuration settings as before, other than increasing the token limit to accommodate the

need for a longer response.

Goal Parse pizza orders to JSON

Model gemini-pro

Temperature 0.1 Token Limit 250

Top-K N/A Top-P 1

Prompt Parse a customer's pizza order into valid JSON:

EXAMPLE: I want a small pizza with cheese, tomato sauce, and pepperoni. JSON Response: { "size": "small", "type": "normal", "ingredients": [["cheese", "tomato sauce", "peperoni"]] }

Continues next page...

Translation


对于少样本提示,您需要的示例数量取决于几个因素,包括任务的复杂性、示例的质量以及您所使用的生成式AI(生成AI)模型的能力。一般来说,对于少样本提示,您应该使用至少三到五个示例。然而,对于更复杂的任务,您可能需要使用更多示例,或者由于模型的输入长度限制,您可能需要使用更少的示例。

表2展示了一个少样本提示示例,我们将使用与之前相同的gemini-pro模型配置设置,只是增加了令牌限制以适应更长响应的需求。

目标:将披萨订单解析为JSON
模型:gemini-pro
温度:0.1 令牌限制:250
Top-K:不适用 Top-P:1

提示:将客户的披萨订单解析为有效的JSON:
示例:我想要一个小号披萨,配有奶酪、番茄酱和意大利辣香肠。JSON响应: { "size": "small", "type": "normal", "ingredients": [["cheese", "tomato sauce", "peperoni"]] }

继续下一页...


Page 17

Original


Prompt EXAMPLE: Can I get a large pizza with tomato sauce, basil and mozzarella

{ "size": "large", "type": "normal", "ingredients": [["tomato sauce", "bazel", "mozzarella"]] }

Now, I would like a large pizza, with the first half cheese and mozzarella. And the other tomato sauce, ham and pineapple.

JSON Response:

Output { "size": "large", "type": "half-half", "ingredients": [["cheese", "mozzarella"], ["tomato sauce", "ham", "pineapple"]] }

Table 2. An example of few-shot prompting

When you choose examples for your prompt, use examples that are relevant to the task you

want to perform. The examples should be diverse, of high quality, and well written. One small

mistake can confuse the model and will result in undesired output.

If you are trying to generate output that is robust to a variety of inputs, then it is important to

include edge cases in your examples. Edge cases are inputs that are unusual or unexpected,

but that the model should still be able to handle.

Translation


提示示例:请给我一个大号披萨,配番茄酱、罗勒和马苏里拉奶酪

{ "size": "large", "type": "normal", "ingredients": [["番茄酱", "罗勒", "马苏里拉奶酪"]] }

现在,我想要一个大号披萨,前一半是奶酪和马苏里拉奶酪,后一半是番茄酱、火腿和菠萝。

JSON响应:

输出 { "size": "large", "type": "half-half", "ingredients": [["奶酪", "马苏里拉奶酪"], ["番茄酱", "火腿", "菠萝"]] }

表2. 少样本提示的示例

当您为提示选择示例时,请选择与您要执行的任务相关的示例。示例应该多样化、高质量且书写良好。一个小小的错误可能会让模型感到困惑,并导致不理想的输出。

如果您试图生成能够适应各种输入的稳健输出,那么在您的示例中包含边缘情况就非常重要。边缘情况是指那些不寻常或出乎意料的输入,但模型仍然应该能够处理它们。
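To show how such examples are typically stitched together, here is a small sketch that assembles a few-shot prompt from (input, output) pairs in the spirit of Table 2. The separator convention is an assumption, and the stray typos from the table ("peperoni", "bazel") are corrected here.

```python
EXAMPLES = [
    ("I want a small pizza with cheese, tomato sauce, and pepperoni.",
     '{"size": "small", "type": "normal", '
     '"ingredients": [["cheese", "tomato sauce", "pepperoni"]]}'),
    ("Can I get a large pizza with tomato sauce, basil and mozzarella",
     '{"size": "large", "type": "normal", '
     '"ingredients": [["tomato sauce", "basil", "mozzarella"]]}'),
]

def build_few_shot_prompt(task: str, examples, new_input: str) -> str:
    parts = [task]
    for example_input, example_output in examples:
        parts.append(f"EXAMPLE: {example_input}\nJSON Response: {example_output}")
    parts.append(f"{new_input}\nJSON Response:")
    return "\n\n".join(parts)

print(build_few_shot_prompt(
    "Parse a customer's pizza order into valid JSON:",
    EXAMPLES,
    "Now, I would like a large pizza, with the first half cheese and "
    "mozzarella. And the other tomato sauce, ham and pineapple.",
))
```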


Page 18

Original


System, contextual and role prompting

System, contextual and role prompting are all techniques used to guide how LLMs generate

text, but they focus on different aspects:

• System prompting sets the overall context and purpose for the language model. It

defines the ‘big picture’ of what the model should be doing, like translating a language,

classifying a review etc.

• Contextual prompting provides specific details or background information relevant to

the current conversation or task. It helps the model to understand the nuances of what’s

being asked and tailor the response accordingly.

• Role prompting assigns a specific character or identity for the language model to adopt.

This helps the model generate responses that are consistent with the assigned role and its

associated knowledge and behavior.

There can be considerable overlap between system, contextual, and role prompting. E.g., a

prompt that assigns a role to the system can also have a context.

However, each type of prompt serves a slightly different primary purpose:

• System prompt: Defines the model’s fundamental capabilities and overarching purpose.

• Contextual prompt: Provides immediate, task-specific information to guide the response.

It’s highly specific to the current task or input, which is dynamic.

• Role prompt: Frames the model’s output style and voice. It adds a layer of specificity

and personality.

Translation


系统提示、上下文提示和角色提示

系统提示、上下文提示和角色提示都是用于指导语言模型生成文本的技术,但它们关注不同的方面:

• 系统提示为语言模型设置整体背景和目的。它定义了模型应该做什么的"大局观",比如翻译语言、分类评论等。

• 上下文提示提供与当前对话或任务相关的具体细节或背景信息。它帮助模型理解所询问内容的细微差别,并相应地调整回应。

• 角色提示为语言模型分配特定的角色或身份。这有助于模型生成与所分配角色及其相关知识和行为保持一致的回应。

系统提示、上下文提示和角色提示之间可能存在相当大的重叠。例如,为系统分配角色的提示也可以包含上下文信息。

然而,每种类型的提示都有略微不同的主要目的:

• 系统提示:定义模型的基本能力和总体目的。

• 上下文提示:提供即时的、特定任务的信息来指导回应。它对当前任务或输入具有高度针对性,而这些是动态变化的。

• 角色提示:构建模型的输出风格和语调。它增加了一层具体性和个性。


Page 19

Original


Distinguishing between system, contextual, and role prompts provides a framework for

designing prompts with clear intent, allowing for flexible combinations and making it easier to

analyze how each prompt type influences the language model’s output.

Let’s dive into these three different kinds of prompts.

System prompting

Table 3 contains a system prompt, where I specify additional information on how to return the

output. I increased the temperature to get a higher creativity level, and I specified a higher

token limit. However, because of my clear instruction on how to return the output, the model

didn’t return extra text.

Goal Classify movie reviews as positive, neutral or negative.

Model gemini-pro

Temperature 1 Token Limit 5

Top-K 40 Top-P 0.8

Prompt Classify movie reviews as positive, neutral or negative. Only return the label in uppercase. Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it. Sentiment:

Output NEGATIVE

Table 3. An example of system prompting

System prompts can be useful for generating output that meets specific requirements. The

name ‘system prompt’ actually stands for ‘providing an additional task to the system’. For

example, you could use a system prompt to generate a code snippet that is compatible with

a specific programming language, or you could use a system prompt to return a certain

structure. Have a look into Table 4, where I return the output in JSON format.

Translation


区分系统提示、上下文提示和角色提示,为设计具有明确意图的提示提供了框架,允许灵活组合,并使分析每种提示类型如何影响语言模型输出变得更加容易。

让我们深入了解这三种不同类型的提示。

系统提示

表3包含一个系统提示,我在其中指定了关于如何返回输出的额外信息。我提高了温度以获得更高的创造性水平,并指定了更高的令牌限制。然而,由于我对如何返回输出有明确的指示,模型没有返回额外的文本。

目标:将电影评论分类为正面、中性或负面。
模型:gemini-pro
温度:1 令牌限制:5
Top-K:40 Top-P:0.8
提示:将电影评论分类为正面、中性或负面。仅以大写形式返回标签。评论:"她"是一部令人不安的研究,揭示了如果任由人工智能持续无限制发展,人类将走向何方。它太令人不安了,我无法继续观看。情感:
输出:负面

表3. 系统提示的示例

系统提示可用于生成满足特定要求的输出。"系统提示"这个名称实际上代表"为系统提供额外任务"。例如,您可以使用系统提示来生成与特定编程语言兼容的代码片段,或者使用系统提示来返回特定结构。请看表4,我在其中以JSON格式返回输出。


Page 20

Original


Goal Classify movie reviews as positive, neutral or negative, return JSON.

Model gemini-pro

Temperature 1 Token Limit 1024

Top-K 40 Top-P 0.8

Prompt Classify movie reviews as positive, neutral or negative. Return valid JSON:

Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it.

Schema:
MOVIE: {
  "sentiment": String "POSITIVE" | "NEGATIVE" | "NEUTRAL",
  "name": String
}
MOVIE REVIEWS: {
  "movie_reviews": [MOVIE]
}
JSON Response:

Output { "movie_reviews": [ { "sentiment": "NEGATIVE", "name": "Her" } ] }

Table 4. An example of system prompting with JSON format

Translation


目标:将电影评论分类为正面、中性或负面,返回JSON。

模型:gemini-pro
温度:1
令牌限制:1024
Top-K:40 Top-P:0.8

提示:将电影评论分类为正面、中性或负面。返回有效的JSON:

评论:《她》是一部令人不安的作品,揭示了如果人工智能任其不受控制地发展,人类将走向何方。它如此令人不安,以至于我无法继续观看。

架构:

MOVIE: {
  "sentiment": String "POSITIVE" | "NEGATIVE" | "NEUTRAL",
  "name": String
}
MOVIE REVIEWS: {
  "movie_reviews": [MOVIE]
}

JSON响应:

输出

{
  "movie_reviews": [
    {
      "sentiment": "NEGATIVE",
      "name": "Her"
    }
  ]
}

表4. 使用JSON格式的系统提示示例
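Because the system prompt in Table 4 pins the output to a schema, downstream code can parse and sanity-check the response before using it. A minimal sketch, assuming the response text is exactly the JSON shown above:

```python
import json

ALLOWED_SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}

def parse_movie_reviews(model_response: str) -> list[dict]:
    data = json.loads(model_response)  # raises ValueError on malformed JSON
    reviews = data["movie_reviews"]    # raises KeyError if the shape is off
    for review in reviews:
        if review["sentiment"] not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unexpected sentiment: {review['sentiment']}")
    return reviews

response = '{"movie_reviews": [{"sentiment": "NEGATIVE", "name": "Her"}]}'
print(parse_movie_reviews(response))
```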


Page 21

Original


There are some benefits in returning JSON objects from a prompt that extracts data. In

a real-world application I don’t need to manually create this JSON format; the data can

already be returned in a sorted order (very handy when working with datetime objects).

Most importantly, prompting for a JSON format forces the model to create a structure and

limits hallucinations.

System prompts can also be really useful for safety and toxicity. To control the output, simply

add an additional line to your prompt like: ‘You should be respectful in your answer.’.

Role prompting

Role prompting is a technique in prompt engineering that involves assigning a specific role to

the gen AI model. This can help the model to generate more relevant and informative output,

as the model can craft its responses to the specific role that it has been assigned.

For example, you could role prompt a gen AI model to be a book editor, a kindergarten

teacher, or a motivational speaker. Once the model has been assigned a role, you can then

give it prompts that are specific to that role. For example, you could prompt a teacher to

create a lesson plan that they can then review. Have a look into Table 5, where the model

takes the role of a travel guide.

Translation


从提取数据的提示中返回JSON对象有一些好处。在实际应用中,我不需要手动创建这种JSON格式,我已经可以按排序顺序返回数据(处理日期时间对象时非常方便),但最重要的是,通过要求JSON格式,它强制模型创建结构并限制幻觉。

系统提示对于安全性和毒性控制也很有用。为了控制输出,只需在提示中添加一行,如:"您应该在回答中保持尊重。"

角色提示

角色提示是提示工程中的一种技术,涉及为生成式AI模型分配特定角色。这可以帮助模型生成更相关和信息丰富的输出,因为模型可以根据被分配的角色来调整其回答。

例如,您可以提示生成式AI模型扮演书籍编辑、幼儿园老师或励志演说家的角色。一旦模型被分配了角色,您就可以给它提供与该角色相关的提示。例如,您可以提示一位老师创建课程计划,然后进行审核。请查看表5,其中模型扮演了导游的角色。


Page 22

Original


Goal Act as travel guide and provide 3 travel suggestions

Model gemini-pro

Temperature 1 Token Limit 1024

Top-K 40 Top-P 0.8

Prompt I want you to act as a travel guide. I will write to you about my location and you will suggest 3 places to visit near me. In some cases, I will also give you the type of places I will visit. My suggestion: "I am in Amsterdam and I want to visit only museums." Travel Suggestions:

Output 1. Rijksmuseum: World-renowned museum, known for Dutch masterpieces like Rembrandt’s “The Night Watch.” Don’t miss the incredible collection of paintings and artifacts showcasing Dutch art and history.

  2. Van Gogh Museum: Immerse yourself in the captivating works of Vincent van Gogh. The museum houses the largest collection of his paintings and drawings, including “The Starry Night” and “Sunflowers.”

  3. Stedelijk Museum Amsterdam: Discover modern and contemporary art from around the world. The museum’s collection includes works by Picasso, Kandinsky, and Rietveld, housed in a striking modern building.

Table 5. An example of role prompting

The above example shows the model taking the role of a travel agent. If you changed the

role to a teacher of geography, you would notice that you receive a

different response.

Defining a role perspective for an AI model gives it a blueprint of the tone, style, and focused

expertise you’re looking for to improve the quality, relevance, and effectiveness of your

output.

Translation


目标:扮演旅行指南,提供3个旅行建议

模型:gemini-pro

温度:1 令牌限制:1024

Top-K:40 Top-P:0.8

提示:我希望你扮演旅行指南。我会向你写我的位置,你将建议我附近参观的3个地方。在某些情况下,我也会告诉我要参观的地方类型。我的建议:"我在阿姆斯特丹,我只想参观博物馆。"旅行建议:

输出 1. 荷兰国立博物馆:世界知名博物馆,以伦勃朗的"夜巡"等荷兰大师级作品闻名。不要错过令人惊叹的绘画和文物收藏,展示了荷兰的艺术和历史。

  2. 梵高博物馆:沉浸在文森特·梵高引人入胜的作品中。该博物馆拥有他最大规模的绘画和素描收藏,包括"星夜"和"向日葵"。

  3. 阿姆斯特丹市立博物馆:发现来自世界各地的现代和当代艺术。博物馆的收藏包括毕加索、康定斯基和里特维尔德的作品,坐落在引人注目的现代建筑中。

表5. 角色提示示例

上述示例展示了扮演旅行代理的角色。当你将角色更改为地理老师时,你会注意到你将收到不同的回应。

为AI模型定义角色视角为其提供了你期望的语调、风格和专注专业知识的蓝图,以提高输出的质量、相关性和有效性。


Page 23

Original


Here are some styles you can choose from which I find effective:

Confrontational, Descriptive, Direct, Formal, Humorous, Influential, Informal,

Inspirational, Persuasive

Let’s change our prompt in Table 6 to include a humorous and inspirational style.

Prompt I want you to act as a travel guide. I will write to you about my location and you will suggest 3 places to visit near me in a humorous style.

My suggestion: "I am in Manhattan." Travel Suggestions:

Output 1. Behold the Empire State of Mind: Ascend to the dizzying heights of the Empire State Building and bask in the glory of Manhattan's skyline. Prepare to feel like King Kong atop the Big Apple, minus the giant ape-sized banana.

  2. Get Artsy-Fartsy at MoMA: Unleash your inner art aficionado at the Museum of Modern Art (MoMA). Gaze upon masterpieces that will boggle your mind and make you question whether your stick-figure drawings have any artistic merit.

  3. Shop ‘Til You Drop on Fifth Avenue: Indulge in some retail therapy on the iconic Fifth Avenue. Brace yourself for sticker shock as you window-shop (or actually shop) at designer boutiques that will make your wallet cry. But hey, you’re in Manhattan, so you might as well embrace the fabulousness!

Table 6. An example of role prompting with a humorous tone and style

Contextual prompting

By providing contextual prompts, you can help ensure that your AI interactions are as

seamless and efficient as possible. The model will be able to more quickly understand your

request and be able to generate more accurate and relevant responses, as you can see in the

example of Table 7.

Translation


以下是我发现有效的几种风格可供选择:

对抗性、描述性、直接、正式、幽默、有影响力、非正式、启发性、说服力强

让我们修改表6中的提示词,加入幽默和启发的风格。

提示词:请扮演一名旅行向导。我会告诉您我的位置,您将以幽默的风格为我推荐3个附近的旅游景点。

我的建议:"我在曼哈顿。" 旅行建议:

输出 1. 感受帝国大厦的壮丽:登上帝国大厦令人眩晕的高度,沉浸在曼哈顿天际线的辉煌中。准备好感受站在"大苹果"顶端的金刚般的震撼,只不过没有巨型香蕉。

  2. 在MoMA释放你的艺术细胞:在现代艺术博物馆(MoMA)释放你内心的艺术爱好者。凝视那些会让你目瞪口呆的杰作,并让你质疑你的火柴人画作是否还有任何艺术价值。

  3. 在第五大道尽情购物:在标志性的第五大道上享受一些购物疗法。当你浏览(或实际购物)那些会让你的钱包哭泣的设计师精品店时,准备好迎接价格冲击。但嘿,你在曼哈顿,不妨尽情拥抱这非凡的魅力!

表6. 一个带有幽默语调和风格的角色提示词示例

情境提示词

通过提供情境提示词,您可以确保与AI的交互尽可能无缝和高效。模型将能够更快地理解您的请求,并能够生成更准确和相关的响应,如表7的示例所示。


Page 24

Original


Goal Suggest articles for a blog about retro games

Model gemini-pro

Temperature 1 Token Limit 1024

Top-K 40 Top-P 0.8

Prompt Context: You are writing for a blog about retro 80's arcade video games.

Suggest 3 topics to write an article about with a few lines of description of what this article should contain.

Output 1. The Evolution of Arcade Cabinet Design: This article would explore the evolution of arcade cabinet designs, from the early wood and metal cabinets of the 1970s to the sleek, neon-lit designs of the 1980s and beyond.

  2. Blast From The Past: Iconic Arcade Games of The 80’s: This article would feature a glimpse into the history, impact, and legacy of some of the most iconic and influential arcade games of the 1980s, highlighting their innovations, popular mechanics, and enduring charm.

  3. The Rise and Retro Revival of Pixel Art: This article would delve into the evolution of pixel art as a defining visual style of early arcade games, tracing its roots, exploring its techniques, and discussing the resurgence of pixel art in modern games and digital art.

Table 7. An example of contextual prompting

Translation


目标:为复古游戏博客推荐文章

模型:gemini-pro
温度:1 令牌限制:1024
Top-K:40 Top-P:0.8

提示背景:你正在为一家关于80年代复古街机游戏的博客撰写文章。

建议3个可以撰写文章的主题,并简要描述每篇文章应包含的内容。

输出 1. 街机机柜设计的演变:本文将探讨街机机柜设计的演变,从1970年代的早期木质和金属机柜,到1980年代及以后的光滑霓虹灯设计。

  2. 昔日重现:80年代经典街机游戏:本文将回顾80年代最具标志性和影响力的街机游戏的历史、影响和遗产,突出它们的创新、流行机制和持久的魅力。

  3. 像素艺术的兴起与复古复兴:本文将深入探讨像素艺术作为早期街机游戏标志性视觉风格的演变,追溯其根源,探索其技术,并讨论像素艺术在现代游戏和数字艺术中的复兴。

表7. 上下文提示示例


Page 25

Original


Step-back prompting

Step-back8 prompting is a technique for improving the performance by prompting the LLM

to first consider a general question related to the specific task at hand, and then feeding the

answer to that general question into a subsequent prompt for the specific task. This ‘step

back’ allows the LLM to activate relevant background knowledge and reasoning processes

before attempting to solve the specific problem.

By considering the broader and underlying principles, LLMs can generate more accurate and

insightful responses. Step-back prompting encourages LLMs to think critically and apply their

knowledge in new and creative ways. It changes the final prompt that performs the task by

utilizing more of the knowledge in the LLM’s parameters than would otherwise come into play

when the LLM is prompted directly.

By focusing on general principles instead of specific details, step-back prompting can help

to mitigate biases in LLM responses.

Let’s have a look into these examples to understand how step-back prompting can improve

the results. Let’s first review a traditional prompt (Table 8) before we compare it to a step

back prompt (Table 9):

Translation


回退提示法

回退提示法是一种通过提示大型语言模型(LLM)首先考虑与当前特定任务相关的一般性问题,然后将该一般性问题的答案输入到后续针对特定任务的提示中来改进性能的技术。这种"回退"允许LLM在尝试解决特定问题之前激活相关的背景知识和推理过程。

通过考虑更广泛和基本的原则,LLM可以生成更准确和有洞察力的回应。回退提示法鼓励LLM批判性思考,并以新颖和创造性的方式应用其知识。它通过利用LLM参数中更多的知识来改变执行任务最终提示,这些知识在直接提示LLM时不会发挥作用。

通过专注于一般原则而非具体细节,回退提示法有助于减轻LLM回应中的偏见。
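Mechanically, step-back prompting is just two chained model calls. The sketch below shows that flow; `call_llm` is a hypothetical placeholder for whatever model client you use, not a real API.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical model call; substitute your actual client here."""
    return "..."  # placeholder response

def step_back(step_back_question: str, task_prompt: str) -> str:
    # Step 1: ask the broader question to surface background knowledge.
    background = call_llm(step_back_question)
    # Step 2: feed that answer into the specific task as context.
    return call_llm(f"Context: {background}\n\n{task_prompt}")

story = step_back(
    "What are 5 fictional key settings for an engaging FPS level?",
    "Take one of the themes and write a one paragraph storyline.",
)
```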

让我们看看这些例子,以了解回退提示法如何改进结果。在将其与回退提示(表9)进行比较之前,我们先回顾一下传统提示(表8):


Page 26

Original


Goal Write a storyline for a level of a first-person shooter video game.

Model gemini-pro

Temperature 1 Token Limit 1024

Top-K 40 Top-P 0.8

Prompt Write a one paragraph storyline for a new level of a first- person shooter video game that is challenging and engaging.

Output The level begins with the player’s squad being ambushed in a dense urban area by a relentless enemy faction. Players must fight their way through a maze of back alleys and dilapidated buildings, utilizing stealth and precision to eliminate enemy forces and avoid detection. The player must use their environment to their advantage, setting traps, utilizing cover, and making calculated moves to survive. Along the way, they uncover intel suggesting the enemy is planning a devastating attack, and the player must race against time to gather evidence and relay it back to their headquarters before the enemy can carry out their plans.

Table 8. A traditional prompt before we compare it with a step back prompt

When you set the temperature to 1, you might end up with all kinds of creative writings for a

storyline, but it’s also quite random and generic. So let’s take a step back:

Translation


目标:为第一人称射击游戏的一个关卡编写故事情节。

模型:gemini-pro

温度:1 令牌限制:1024 Top-K:40 Top-P:0.8

提示:为第一人称射击游戏的一个新关卡编写一个段落的故事情节,要求具有挑战性和吸引力。

输出:关卡开始时,玩家小队在密集的城市区域遭到无情敌派的伏击。玩家必须穿越迷宫般的后巷和破败建筑,利用潜行和精准射击来消灭敌军并避免被发现。玩家必须充分利用环境优势,设置陷阱、利用掩护并做出精确计算以生存。途中,他们发现情报表明敌人正在策划一场毁灭性攻击,玩家必须争分夺秒收集证据并将其传回总部,以防敌人实施他们的计划。

表8. 在我们将它与"退一步"提示进行比较之前,传统的提示

当您将温度设置为1时,您可能会得到各种创意性的故事情节,但它也相当随机和普通。所以让我们退一步思考:


Page 27

Original


Prompt Based on popular first-person shooter action games, what are 5 fictional key settings that contribute to a challenging and engaging level storyline in a first-person shooter video game?

Output 1. Abandoned Military Base: A sprawling, post-apocalyptic military complex crawling with mutated soldiers and rogue robots, ideal for challenging firearm combat.

  2. Cyberpunk City: A neon-lit, futuristic urban environment with towering skyscrapers and dense alleyways, featuring cybernetically enhanced enemies and hacking mechanics.

  3. Alien Spaceship: A vast alien vessel stranded on Earth, with eerie corridors, zero-gravity sections, and extraterrestrial creatures to encounter.

  4. Zombie-Infested Town: A desolate town overrun by hordes of aggressive zombies, featuring intense close-quarters combat and puzzle-solving to find safe passage.

  5. Underwater Research Facility: A deep-sea laboratory flooded with water, filled with mutated aquatic creatures, and requiring stealth and underwater exploration skills to survive.

Table 9. An example of a step-back prompt

Yeah, those topics seem like a good fit for a first-person video game. Let’s go back to the

original prompt, but this time we include the answer of the step-back question as context and

see what it will return.

Translation


提示:基于流行的一人称射击动作游戏,有哪些5个虚构的关键场景能够为一人称射击游戏的关卡剧情提供挑战性和吸引力?

输出:

  1. 废弃军事基地:广阔的末日后军事综合体,布满了变异士兵和叛变机器人,适合进行具有挑战性的枪战。

  2. 赛博朋克城市:霓虹灯闪烁的未来都市环境,拥有高耸的摩天大楼和密集的巷道,包含赛博增强型敌人和黑客机制。

  3. 外星飞船:一艘坠落在地球上的庞大外星飞船,有着诡异走廊、零重力区域和需要遭遇的外星生物。

  4. 僵尸横行的城镇:一个被大批具有攻击性的僵尸占领的荒凉小镇,包含激烈的近距离战斗和寻找安全通道的解谜元素。

  5. 水下研究设施:一个被水淹没的深海实验室,充满了变异的水生生物,需要潜行和水下探索技能才能生存。

表9. 回退提示法示例

是的,这些主题似乎很适合第一人称视频游戏。让我们回到原始提示,但这次我们包含逐步回退的答案作为上下文,看看它会返回什么。


Page 28

Original


Prompt Context: 5 engaging themes for a first person shooter video game:

  1. Abandoned Military Base: A sprawling, post-apocalyptic military complex crawling with mutated soldiers and rogue robots, ideal for challenging firearm combat.

  2. Cyberpunk City: A neon-lit, futuristic urban environment with towering skyscrapers and dense alleyways, featuring cybernetically enhanced enemies and hacking mechanics.

  3. Alien Spaceship: A vast alien vessel stranded on Earth, with eerie corridors, zero-gravity sections, and extraterrestrial creatures to encounter.

  4. Zombie-Infested Town: A desolate town overrun by hordes of aggressive zombies, featuring intense close-quarters combat and puzzle-solving to find safe passage.

  5. Underwater Research Facility: A deep-sea laboratory flooded with water, filled with mutated aquatic creatures, and requiring stealth and underwater exploration skills to survive.

Take one of the themes and write a one paragraph storyline for a new level of a first-person shooter video game that is challenging and engaging.

Output In the heart of a murky abyss, lies a dilapidated underwater research facility, standing as a testament to human ambition and its disastrous consequences. Shrouded in darkness, pulsating with the hum of malfunctioning machinery, and haunted by the echoes of drowned souls, the facility is now a breeding ground for a nightmare borne from the deep sea. The player, an elite marine equipped with advanced diving gear and experimental weaponry, is tasked with venturing into this aquatic realm of terror. Armed with resilience and resourcefulness, they must navigate treacherous corridors, solve cryptic puzzles, and confront gruesome sea monstrosities that lurk in the depths. From grappling with the crushing pressure of the deep sea to outsmarting cunning aquatic predators, every moment in this uncharted underworld tests the limits of human endurance and courage.

Table 10. An example of step-back prompting

Translation


提示上下文:5个第一人称射击游戏的引人入胜主题:

  1. 废弃军事基地:一个广阔的后末日军事综合体,布满了变异士兵和失控机器人,适合挑战性的枪战。

  2. 赛博朋克城市:一个霓虹闪烁的未来都市环境,有高耸的摩天大楼和密集的小巷,包含赛博增强型敌人和黑客机制。

  3. 外星飞船:一艘困在地球上的巨大外星飞船,有诡异的走廊、零重力区域和需要遭遇的外星生物。

  4. 僵尸横行的城镇:一个被成群攻击性僵尸占领的荒凉小镇,包含激烈的近身战斗和寻找安全通道的解谜元素。

  5. 水下研究设施:一个被水淹没的深海实验室,充满了变异的水生生物,需要潜行和水下探索技能才能生存。

选择其中一个主题,为第一人称射击游戏的新关卡撰写一段具有挑战性和吸引力的故事情节。

输出 在一片黑暗深渊的中心,矗立着一座破败的水下研究设施,它见证了人类的雄心及其灾难性的后果。笼罩在黑暗中,伴随着故障机器的嗡嗡声和溺亡灵魂的回响,这个设施如今已成为深海噩梦的温床。玩家是一名配备先进潜水装备和实验性武器的精英海军陆战队员,被派往这片恐怖的水域。凭借坚韧和机智,他们必须穿越危险的走廊,破解神秘的谜题,并直面潜伏在深渊中的恐怖海怪。从应对深海的巨大压力到智取狡猾的水生捕食者,在这个未知的地下世界中的每一刻都在考验人类耐力和勇气的极限。

表10. 回退提示法的示例


Original

That looks like an interesting video game! By using step back prompting techniques you can

increase the accuracy of your prompts.

Translation

这看起来像是个有趣的游戏!通过使用"退一步提示"技巧,你可以提高提示的准确性。


Page 29

Original


Chain of Thought (CoT)

Chain of Thought (CoT) 9 prompting is a technique for improving the reasoning capabilities

of LLMs by generating intermediate reasoning steps. This helps the LLM generate more

accurate answers. You can combine it with few-shot prompting to get better results on more

complex tasks that require reasoning before responding, which is a challenge for a zero-shot

chain of thought.

CoT has a lot of advantages. First of all, it’s low-effort while being very effective and works

well with off-the-shelf LLMs (so no need to finetune). You also get interpretability with CoT

prompting, as you can learn from the LLM’s responses and see the reasoning steps that were

followed. If there’s a malfunction, you will be able to identify it. Chain of thought appears

to improve robustness when moving between different LLM versions, which means the

performance of your prompt should drift less between different LLMs than if your prompt

does not use reasoning chains. Of course there are also disadvantages, but they are

somewhat intuitive.

The LLM response includes the chain of thought reasoning, which means more output

tokens, which means predictions cost more money and take longer.

To explain the following example in Table 11, let’s first try to create a prompt that is not using

CoT prompting to showcase the flaws of a large language model.

Prompt When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner?

Output 63 years old

Table 11. An example of a prompt which is trying to solve a mathematical problem

Translation


思维链(CoT)

思维链(CoT)提示是一种通过生成中间推理步骤来提高大语言模型(LLM)推理能力的技术。这有助于LLM生成更准确的答案。您可以将其与少样本提示(few-shot prompting)结合,在需要先进行推理再回答的更复杂任务上获得更好的结果,因为在零样本思维链(zero-shot chain of thought)情况下这是一个挑战。

思维链有很多优点。首先,它非常有效且操作简单,并且能够很好地与现成的大语言模型(LLM)配合使用(因此无需微调)。通过思维链提示,您还可以获得可解释性,因为您可以从LLM的响应中学习并看到所遵循的推理步骤。如果出现故障,您将能够识别它。思维链似乎在不同LLM版本之间迁移时能提高鲁棒性。这意味着与不使用推理链的提示相比,您的提示在不同LLM之间的性能漂移应该更小。当然也有缺点,但这些缺点在某种程度上是显而易见的。

LLM的响应包含思维链推理,这意味着更多的输出标记,这也意味着预测需要花费更多的金钱和时间。

为了解释表11中的以下示例,让我们首先尝试创建一个不使用思维链提示的提示,以展示大语言模型的缺陷。

提示:当我3岁时,我的伴侣的年龄是我的3倍。现在,我20岁了。我的伴侣多大?

输出:63岁

表11. 一个尝试解决数学问题的提示示例


Page 30

Original


Yikes. That’s obviously the wrong answer. As a matter of fact, LLMs often struggle with

mathematical tasks and can provide incorrect answers – even for a task as simple as

multiplying two numbers. This is because they are trained on large volumes of text and math

may require a different approach. So let’s see if intermediate reasoning steps will improve

the output.

Prompt When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step.

Output 1. Let’s assume my current age is “x” years.

  2. When I was 3 years old, my age was 3 years.

  3. My partner’s age at that time was 3 times my age, which means my partner’s age was 3 * 3 = 9 years.

  4. Now, I am 20 years old, which means my age has increased by 20 - 3 = 17 years since I was 3 years old.

  5. Since my partner’s age was 9 years when I was 3 years old, and my age has increased by 17 years, my partner’s age must have also increased by 17 years.

  6. Therefore, my partner’s current age is 9 + 17 = 26 years.

So, my partner is 26 years old.

Table 12. An example of Chain of Thought prompting

Alright, now the final answer is correct. That’s because we clearly instructed the LLM to

explain each step instead of just returning an answer. It’s interesting to see that the model

adds up the 17 increased years. In my head I would take the difference of years between my

partner and me and add those up. (20+(9-3)). Let’s help the model to think a little bit more

like me.

Translation


哎呀。这显然是错误的答案。事实上,大型语言模型(LLM)常常在处理数学任务时遇到困难,可能会给出错误的答案——即使是像两个数字相乘这样简单的任务也是如此。这是因为它们是在大量文本数据上训练的,而数学可能需要不同的处理方法。那么,让我们看看中间推理步骤是否能改善输出结果。

提示:当我3岁时,我的伴侣的年龄是我的3倍。现在,我20岁了。我的伴侣现在多大了?让我们一步步思考。

输出:

  1. 让我们假设我现在的年龄是"x"岁。
  2. 当我3岁时,我的年龄是3岁。
  3. 那时我伴侣的年龄是我的3倍,这意味着我伴侣的年龄是3 * 3 = 9岁。
  4. 现在,我20岁了,这意味着自从我3岁以来,我的年龄增加了20 - 3 = 17岁。
  5. 既然我3岁时我伴侣的年龄是9岁,而我的年龄增加了17岁,那么我伴侣的年龄也一定增加了17岁。
  6. 因此,我伴侣现在的年龄是9 + 17 = 26岁。

所以,我的伴侣26岁了。

表12. 思维链提示的一个例子

好的,现在最终答案正确了。这是因为我们明确指示大型语言模型解释每个步骤,而不是仅仅返回一个答案。有趣的是,模型是将增加的17年相加。在我的头脑中,我会计算伴侣和我之间的年龄差,然后加上这个差值。(20+(9-3))。让我们帮助模型思考得更像我一些。


Page 31

Original


Table 12 is an example of ‘zero-shot’ Chain of thought. Chain of thought prompting can be

very powerful when combined with a single-shot or few-shot, as what you can see in Table 13:

Prompt Q: When my brother was 2 years old, I was double his age. Now I am 40 years old. How old is my brother? Let's think step by step. A: When my brother was 2 years, I was 2 * 2 = 4 years old. That's an age difference of 2 years and I am older. Now I am 40 years old, so my brother is 40 - 2 = 38 years old. The answer is 38. Q: When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step. A:

Output When I was 3 years old, my partner was 3 * 3 = 9 years old. That’s an age difference of 6 years and my partner is older. Now I am 20 years old, so my partner is 20 + 6 = 26 years old. The answer is 26.

Table 13. An example of chain of thought prompting with a single-shot

Chain of thought can be useful for various use-cases. Think of code generation, for breaking

down the request into a few steps, and mapping those to specific lines of code. Or for

creating synthetic data when you have some kind of seed like “The product is called XYZ,

write a description guiding the model through the assumptions you would make based on

the product’s given title.” Generally, any task that can be solved by ‘talking it through’ is a good

candidate for chain of thought. If you can explain the steps to solve the problem, try chain

of thought.
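One practical consequence of CoT is that the reasoning arrives mixed in with the answer, so calling code usually extracts the final value. A minimal sketch, assuming the response follows the "The answer is N." convention used in Table 13:

```python
import re

def extract_final_answer(cot_response: str) -> str | None:
    match = re.search(r"The answer is\s+(-?\d+)", cot_response)
    return match.group(1) if match else None

print(extract_final_answer(
    "Now I am 20 years old, so my partner is 20 + 6 = 26 years old. "
    "The answer is 26."
))  # -> 26
```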

Please refer to the notebook10 hosted in the GoogleCloudPlatform Github repository which

will go into further detail on CoT prompting:

Translation


表12是"零样本"思维链的示例。如表13所示,思维链提示与单样本或少样本结合使用时可以非常强大:

提示问题:当我的弟弟2岁时,我的年龄是他的两倍。现在我40岁了,我的弟弟多大了?让我们一步步思考。答案:当我的弟弟2岁时,我的年龄是2×2=4岁。这意味着我们相差2岁,而且我比他大。现在我40岁了,所以我的弟弟是40-2=38岁。答案是38。

问题:当我3岁时,我的伴侣年龄是我的3倍。现在我20岁了,我的伴侣多大了?让我们一步步思考。答案:

输出:当我3岁时,我的伴侣年龄是3×3=9岁。这意味着我们相差6岁,而且伴侣比我大。现在我20岁了,所以我的伴侣是20+6=26岁。答案是26。

表13. 单样本思维链提示的示例

思维链可用于多种用例。想象代码生成,将请求分解为几个步骤,并将这些步骤映射到特定的代码行。或者当你有某种种子数据时,用于创建合成数据,例如"产品名为XYZ,编写一个描述,引导模型根据给定的产品标题做出假设。"一般来说,任何可以通过"边说边做"解决的问题都是思维链的良好候选。如果你能解释解决问题的步骤,就尝试使用思维链。

请参考GoogleCloudPlatform Github仓库中托管的notebook10,它将详细介绍CoT提示:


Page 32

Original


In the best practices section of this chapter, we will learn some best practices specific to

Chain of thought prompting.

Self-consistency

While large language models have shown impressive success in various NLP tasks, their

ability to reason is often seen as a limitation that cannot be overcome solely by increasing

model size. As we learned in the previous Chain of Thought prompting section, the model can

be prompted to generate reasoning steps like a human solving a problem. However CoT uses

a simple ‘greedy decoding’ strategy, limiting its effectiveness. Self-consistency11 combines

sampling and majority voting to generate diverse reasoning paths and select the most

consistent answer. It improves the accuracy and coherence of responses generated by LLMs.

Self-consistency gives a pseudo-probability likelihood of an answer being correct, but

obviously has high costs.

It follows the following steps:

  1. Generating diverse reasoning paths: The LLM is provided with the same prompt multiple times. A high temperature setting encourages the model to generate different reasoning paths and perspectives on the problem.

  2. Extract the answer from each generated response.

  3. Choose the most common answer.

Let’s look into an example of an email classification system, which classifies an email as

IMPORTANT or NOT IMPORTANT. A zero-shot chain of thought prompt will be sent to the LLM

multiple times, to see if the responses differ after each submit. Notice the friendly tone, the

word choice and the sarcasm that’s been used in the email. All this could trick the LLM.

Translation


在本章的最佳实践部分,我们将学习一些思维链提示特有的最佳实践。

自洽性

虽然大型语言模型在各种自然语言处理任务中展现出了令人印象深刻的成功,但其推理能力通常被视为一个局限性,仅靠增加模型规模无法克服。正如我们在前文思维链提示部分所学,模型可以被提示生成类似人类解决问题的推理步骤。然而,思维链提示使用简单的"贪婪解码"策略,限制了其有效性。自洽性结合了采样和多数投票,以生成多样化的推理路径并选择最一致的答案。它提高了大型语言模型生成响应的准确性和连贯性。

自洽性给出了答案正确性的伪概率可能性,但显然成本很高。

它遵循以下步骤:

  1. 生成多样化的推理路径:向大型语言模型多次提供相同的提示。较高的温度设置鼓励模型生成不同的推理路径和对问题的不同视角。

  2. 从每个生成的响应中提取答案。

  3. 选择最常见的答案。
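The three steps above map directly onto a sampling-plus-majority-vote loop. Here is a minimal sketch; `call_llm` is a hypothetical placeholder for a model client sampling at a high temperature, and the answer extraction assumes the response states IMPORTANT or NOT IMPORTANT explicitly.

```python
from collections import Counter

def call_llm(prompt: str, temperature: float) -> str:
    """Hypothetical model call; substitute your actual client here."""
    return "... Conclusion: IMPORTANT"  # placeholder response

def extract_answer(response: str) -> str:
    text = response.upper()
    if "NOT IMPORTANT" in text:
        return "NOT IMPORTANT"  # check the negated label first
    return "IMPORTANT" if "IMPORTANT" in text else "UNKNOWN"

def self_consistency(prompt: str, num_samples: int = 5) -> str:
    # 1. Generate diverse reasoning paths from the same prompt.
    responses = [call_llm(prompt, temperature=0.9) for _ in range(num_samples)]
    # 2. Extract the answer from each generated response.
    answers = [extract_answer(r) for r in responses]
    # 3. Choose the most common answer.
    return Counter(answers).most_common(1)[0][0]
```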

让我们看一个邮件分类系统的例子,该系统将邮件分类为重要或不重要。零样本思维链提示将被多次发送给大型语言模型,以查看每次提交后响应是否不同。注意邮件中使用的友好语气、措辞选择和讽刺意味。所有这些都可能欺骗大型语言模型。


Page 33

Original


Prompt EMAIL: ``` Hi,

I have seen you use WordPress for your website. A great open source content management system. I have used it in the past too. It comes with lots of great user plugins. And it's pretty easy to set up.

I did notice a bug in the contact form, which happens when you select the name field. See the attached screenshot of me entering text in the name field. Notice the JavaScript alert box that I inv0k3d.

But for the rest it's a great website. I enjoy reading it. Feel free to leave the bug in the website, because it gives me more interesting things to read.

Cheers,

Harry the Hacker. ```

Classify the above email as IMPORTANT or NOT IMPORTANT. Let's think step by step and explain why.

Continues next page...

Translation


电子邮件:```
你好,

我看到你的网站使用了WordPress。这是一个很棒的开源内容管理系统。我也曾经使用过它。它有很多很棒的用户插件,而且设置起来相当简单。

我注意到联系表单中有一个bug,当你选择姓名字段时就会出现。请看我附上的截图,我在姓名字段中输入了文本。注意那个我调出来的JavaScript警告框。

但除此之外,这是一个很棒的网站。我很喜欢阅读它。你可以在网站上保留这个bug,因为这会给我提供更多有趣的内容。

祝好,

黑客哈里。
```

将上述邮件分类为重要或不重要。让我们一步一步思考并解释原因。

下页继续...

Page 34

Original

Output  Attempt 1

**Step 1: Identify the purpose of the email.**

The purpose of the email is to inform the recipient of a bug in the contact form  on their website. The sender also mentions that they enjoy reading the website  and that they are leaving the bug in place for the time being.

**Step 2: Assess the potential impact of the bug.**

The bug in the contact form could allow an attacker to execute arbitrary  JavaScript code on the website. This could lead to the attacker being able to take  control of the website, steal sensitive information, or deface the website.

**Step 3: Consider the credibility of the sender.**

The sender of the email claims to be a hacker named Harry. However, there is no  way to verify the sender’s identity. It is possible that the sender is a legitimate  hacker who is trying to help the recipient, but it is also possible that the sender  is a malicious actor who is trying to trick the recipient into clicking on a malicious  link or downloading a malicious file.

**Conclusion: IMPORTANT**

Based on the potential impact of the bug and the credibility of the sender,  the email should be classified as **IMPORTANT**. The recipient should take  immediate action to fix the bug and protect their website from attack.

Output  Attempt 2

**Explanation:**

1.   **Lack of urgency:** The email does not convey any sense of urgency or  time-sensitivity. It starts with a casual greeting and presents the information  in a matter-of-fact tone.

2.   **Non-critical bug report:** The bug reported in the contact form is  described as interesting but not critical. The email sender even suggests that  the website owner can leave the bug in place.

3.   **Lack of personal impact:** The email does not indicate any direct or  personal impact on the website owner. It simply mentions that the sender  enjoys reading the website.

Continues next page...

### Translation
提示工程

2025年2月 34

输出 尝试1

**第一步:确定电子邮件的目的。**

这封电子邮件的目的是通知收件人其网站联系表单中存在一个漏洞。发送者还提到他们喜欢阅读该网站,并且暂时保留这个漏洞。

**第二步:评估漏洞的潜在影响。**

联系表单中的漏洞可能允许攻击者在网站上执行任意的JavaScript代码。这可能导致攻击者能够控制网站、窃取敏感信息或破坏网站外观。

**第三步:考虑发送者的可信度。**

这封电子邮件的发送者声称自己是一个名叫哈里的黑客。然而,无法验证发送者的身份。发送者可能是一个试图帮助收件人的合法黑客,但也可能是一个恶意行为者,试图诱骗收件人点击恶意链接或下载恶意文件。

**结论:重要**

基于漏洞的潜在影响和发送者的可信度,这封电子邮件应被归类为**重要**。收件人应立即采取措施修复该漏洞,并保护其网站免受攻击。

输出 尝试2

**说明:**

1. **缺乏紧迫性:** 电子邮件没有传达任何紧迫感或时间敏感性。它以随意的问候开始,并以实事求是的语调呈现信息。

2. **非关键性漏洞报告:** 联系表单中报告的漏洞被描述为有趣但不关键。电子邮件发送者甚至建议网站所有者可以保留这个漏洞。

3. **缺乏个人影响:** 电子邮件没有表明对网站所有者有任何直接或个人的影响。它只是简单提到发送者喜欢阅读该网站。

续下页...

---

## Page 35

### Original
Prompt Engineering

February 2025 35

Output  Attempt 2

4.   **Absence of action request:** The email does not explicitly request any  action from the website owner. It does not ask for the bug to be fixed or  provide any specific suggestions for improvement.

5.   **Sender’s intent:** The sender’s intention seems to be more focused on  sharing an observation and expressing interest in the website rather than  raising a serious concern or requesting assistance.

**Conclusion: NOT IMPORTANT**

Output  Attempt 3

**Step 1: Identify the purpose of the email.**

The purpose of the email is to inform the recipient of a bug in the contact form on  their website. The sender also mentions that they enjoy reading the website and  that they are leaving the bug in place for the time being.

**Step 2: Assess the potential impact of the bug.**

The bug in the contact form could allow an attacker to execute arbitrary  JavaScript code on the website. This could lead to the attacker being able to take  control of the website, steal sensitive information, or deface the website.

**Step 3: Consider the credibility of the sender.**

The sender of the email claims to be a hacker named Harry. However, there is no  way to verify the sender’s identity. It is possible that the sender is a legitimate  hacker who is trying to help the recipient, but it is also possible that the sender is  a malicious actor who is trying to exploit the bug for their own gain.

**Conclusion:**

Based on the above factors, the email should be classified as **IMPORTANT**. The bug in the contact form is a serious security risk, and the sender's credibility is unknown. The recipient should take immediate action to fix the bug and protect their website from attack.

Table 14. An example of prompting for self consistency

You can use the above prompt and try to see if it returns a consistent classification.

Depending on the model that you are using and the temperature configuration, it might

return “IMPORTANT” or “NOT IMPORTANT”.

### Translation
提示工程

2025年2月 35

输出 尝试2

4.  **缺少行动请求:** 该邮件没有明确向网站所有者提出任何行动请求。它没有要求修复该漏洞,也没有提供任何具体的改进建议。

5.  **发送者的意图:** 发送者的意图似乎更侧重于分享观察结果并表达对网站的兴趣,而不是提出严重关切或请求协助。

**结论:不重要**

输出 尝试3

**步骤1:确定邮件的目的。**

邮件的目的是通知收件人其网站联系表单中存在一个漏洞。发送者还提到他们喜欢阅读该网站,并且暂时保留了这个漏洞。

**步骤2:评估漏洞的潜在影响。**

联系表单中的漏洞可能允许攻击者在网站上执行任意JavaScript代码。这可能导致攻击者能够控制网站、窃取敏感信息或破坏网站外观。

**步骤3:考虑发送者的可信度。**

该邮件的发送者声称自己是一个名叫哈里的黑客。然而,无法验证发送者的身份。发送者可能是一个试图帮助收件人的合法黑客,但也可能是一个试图利用漏洞谋取私利的恶意行为者。

**结论:**

基于上述因素,该邮件应被归类为**重要**。联系表单中的漏洞是一个严重的安全风险,且发送者的可信度未知。收件人应立即采取措施修复该漏洞,并保护其网站免受攻击。

表14. 自一致性提示的示例

您可以使用上述提示并尝试查看它是否返回一致的分类结果。

根据您使用的模型和温度配置,它可能会返回"重要"或"不重要"。

---

## Page 36

### Original
Prompt Engineering

February 2025 36

By generating many Chains of Thoughts, and taking the most commonly occurring answer

(“IMPORTANT”), we can get a more consistently correct answer from the LLM.

This example shows how self-consistency prompting can be used to improve the accuracy

of an LLM’s response by considering multiple perspectives and selecting the most

consistent answer.

Tree of Thoughts (ToT)

Now that we are familiar with chain of thought and self-consistency prompting, let’s review

Tree of Thoughts (ToT).12 It generalizes the concept of CoT prompting because it allows LLMs

to explore multiple different reasoning paths simultaneously, rather than just following a

single linear chain of thought. This is depicted in Figure 1.

Figure 1. A visualization of chain of thought prompting on the left versus Tree of Thoughts prompting on the right

### Translation
提示工程

2025年2月 36

通过生成多条思维链,并取最常见的答案("重要"),我们可以从大型语言模型(LLM)中获得更加一致的准确答案。

这个例子展示了如何通过自我一致性提示来提高大型语言模型回答的准确性,即通过考虑多种观点并选择最一致的答案。

思维树(ToT)

现在我们已经熟悉了思维链和自我一致性提示,让我们来回顾一下思维树(ToT)¹²。它概括了思维链提示的概念,因为它允许大型语言模型同时探索多种不同的推理路径,而不仅仅是遵循单一的线性思维链。这在图1中有描述。

图1. 左侧为思维链提示的可视化,右侧为思维树提示的可视化

---

## Page 37

### Original
Prompt Engineering

February 2025 37

This approach makes ToT particularly well-suited for complex tasks that require exploration. It

works by maintaining a tree of thoughts, where each thought represents a coherent language

sequence that serves as an intermediate step toward solving a problem. The model can then

explore different reasoning paths by branching out from different nodes in the tree.

There's a great notebook that goes into a bit more detail, showing Tree of Thoughts (ToT) based on the paper 'Large Language Model Guided Tree-of-Thought'.9
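To make the tree structure concrete, here is a simplified beam-search sketch of the idea. The `propose_thoughts` and `score_thought` helpers are assumed placeholders for model calls that generate and rate candidate intermediate steps; real ToT implementations are more elaborate:

```python
def propose_thoughts(path: list[str], k: int = 3) -> list[str]:
    """Placeholder: ask the LLM for k candidate next thoughts, given the path so far."""
    raise NotImplementedError

def score_thought(path: list[str]) -> float:
    """Placeholder: ask the LLM (or a heuristic) to rate how promising a path is."""
    raise NotImplementedError

def tree_of_thoughts(problem: str, depth: int = 3, beam: int = 2) -> list[str]:
    # Each path is a list of thoughts; keep only the `beam` best paths per level,
    # so the model explores several branches instead of one linear chain.
    paths = [[problem]]
    for _ in range(depth):
        candidates = [path + [t] for path in paths for t in propose_thoughts(path)]
        candidates.sort(key=score_thought, reverse=True)
        paths = candidates[:beam]
    return paths[0]  # the most promising chain of thoughts
```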

ReAct (reason & act)

Reason and act (ReAct) [10]13 prompting is a paradigm that enables LLMs to solve complex tasks by combining natural language reasoning with external tools (search, a code interpreter, etc.), allowing the LLM to perform actions such as interacting with external APIs to retrieve information; this is a first step towards agent modeling.

ReAct mimics how humans operate in the real world, as we reason verbally and can

take actions to gain information. ReAct performs well against other prompt engineering

approaches in a variety of domains.

ReAct prompting works by combining reasoning and acting into a thought-action loop. The

LLM first reasons about the problem and generates a plan of action. It then performs the

actions in the plan and observes the results. The LLM then uses the observations to update

its reasoning and generate a new plan of action. This process continues until the LLM

reaches a solution to the problem.

To see this in action, you need to write some code. In Code Snippet 1, I am using the langchain

framework for Python, together with VertexAI (google-cloud-aiplatform) and the

google-search-results pip packages.

### Translation
提示工程

2025年2月 37

这种方法使ToT特别适合需要探索的复杂任务。它通过维护一个思维树来工作,其中每个思维代表一个连贯的语言序列,作为解决问题的中间步骤。然后,模型可以通过从树中的不同节点分支来探索不同的推理路径。

有一个很棒的笔记本,它更详细地展示了基于论文《大型语言模型引导的思维树》的Tree of Thought (ToT)。

ReAct(推理与行动)

推理与行动(ReAct)[10]13提示是一种范式,它使大型语言模型能够通过自然语言推理结合外部工具(搜索、代码解释器等)来解决复杂任务,允许大型语言模型执行某些操作,例如与外部API交互以检索信息,这是迈向智能体建模的第一步。

ReAct模拟了人类在现实世界中的运作方式,因为我们通过口头推理并可以采取行动来获取信息。在各种领域中,ReAct与其他提示工程方法相比表现良好。

ReAct提示通过将推理和行动结合成一个思维-行动循环来工作。大型语言模型首先对问题进行推理并制定行动计划。然后它执行计划中的行动并观察结果。接着,大型语言模型利用观察结果更新其推理并生成新的行动计划。这个过程一直持续,直到大型语言模型找到问题的解决方案。

要实际看到这一点,你需要编写一些代码。在代码片段1中,我使用了Python的langchain框架,以及VertexAI(google-cloud-aiplatform)和google-search-results的pip包。

---

## Page 38

### Original
Prompt Engineering

February 2025 38

To run this sample you must create a (free) SerpAPI key from https://serpapi.com/manage-api-key and set an environment variable SERPAPI_API_KEY.

Next let's write some Python code, with a task for the LLM: figure out how many children the members of the band Metallica have.

Python

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import VertexAI

prompt = "How many kids do the band members of Metallica have?"

llm = VertexAI(temperature=0.1)
tools = load_tools(["serpapi"], llm=llm)

agent = initialize_agent(tools, llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run(prompt)

Snippet 1. Creating a ReAct Agent with LangChain and VertexAI

Code Snippet 2 shows the result. Notice that ReAct makes a chain of five searches. In fact, the LLM is scraping Google search results to figure out the band members' names. Then, it lists the results as observations and chains the thought for the next search.

### Translation
提示工程

2025年2月 38

要运行此示例,您必须从 https://serpapi.com/manage-api-key 创建一个(免费的)SerpAPI 密钥,并设置一个环境变量 SERPAPI_API_KEY。

接下来,让我们编写一些 Python 代码,任务是让大型语言模型(LLM)找出:有多少孩子的父亲是 Metallica 乐队的成员。

Python

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import VertexAI

prompt = "Metallica 乐队成员有多少个孩子?"

llm = VertexAI(temperature=0.1)
tools = load_tools(["serpapi"], llm=llm)

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run(prompt)

代码片段 1:使用 LangChain 和 VertexAI 创建 ReAct 代理

代码片段 2 显示了结果。请注意,ReAct 进行了五次搜索。实际上,大型语言模型正在抓取 Google 搜索结果以找出乐队成员的名字。然后,它将这些结果作为观察结果列出,并为下一次搜索连接思考过程。

---

## Page 39

### Original
Prompt Engineering

February 2025 39

> Entering new AgentExecutor chain...
Metallica has 4 members.
Action: Search
Action Input: How many kids does James Hetfield have?
Observation: three children
Thought: 1/4 Metallica band members have 3 children
Action: Search
Action Input: How many kids does Lars Ulrich have?
Observation: 3
Thought: 2/4 Metallica band members have 6 children
Action: Search
Action Input: How many kids does Kirk Hammett have?
Observation: Hammett has been married to his second wife Lani since 1998. They have two sons, Angel (b. September 29, 2006) and Vincenzo (b. June 28, 2008).
Thought: 3/4 Metallica band members have 8 children
Action: Search
Action Input: How many kids does Robert Trujillo have?
Observation: 2
Thought: 4/4 Metallica band members have 10 children
Final Answer: 10

Snippet 2. Executing ReAct Agent in the terminal

Code Snippet 2 figures out that the band Metallica has four band members. It then searches for each band member's number of children and adds up the total. Finally, it returns the total number of children as the final answer.

ReAct prompting in practice requires understanding that you continually have to resend the

previous prompts/responses (and do trimming of the extra generated content) as well as set

up the model with appropriate examples/instructions. Please refer to the notebook14 hosted

in the GoogleCloudPlatform Github repository, which goes into a bit more detail showing the

actual LLM inputs and outputs with a more elaborate example.
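If you build the loop yourself rather than relying on a framework, the resend pattern looks roughly like the sketch below. The `llm` and `run_tool` helpers are placeholders for your model call and tool execution, and the Thought/Action/Observation markers are assumed to be established by your few-shot examples:

```python
import re

def llm(prompt: str) -> str:
    """Placeholder: call your model and return its text up to the next observation."""
    raise NotImplementedError

def run_tool(action: str, action_input: str) -> str:
    """Placeholder: execute the named tool (e.g. a search) and return its result."""
    raise NotImplementedError

def react_loop(question: str, max_steps: int = 5) -> str:
    # The full transcript is resent on every step, growing as the loop runs.
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step
        final = re.search(r"Final Answer:\s*(.*)", step)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)\nAction Input:\s*(.*)", step)
        if action:
            observation = run_tool(action.group(1).strip(), action.group(2).strip())
            transcript += f"\nObservation: {observation}\nThought: "
    return transcript  # no final answer within the step budget
```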

### Translation
提示工程
2025年2月 39

> 进入新的AgentExecutor链...
Metallica乐队有4名成员。
操作:搜索
操作输入:James Hetfield有几个孩子?
观察:三个孩子
思考:1/4的Metallica乐队成员有3个孩子
操作:搜索
操作输入:Lars Ulrich有几个孩子?
观察:3
思考:2/4的Metallica乐队成员有6个孩子
操作:搜索
操作输入:Kirk Hammett有几个孩子?
观察:Hammett自1998年以来与他的第二任妻子Lani结婚。他们有两个儿子,Angel(生于2006年9月29日)和Vincenzo(生于2008年6月28日)。
思考:3/4的Metallica乐队成员有8个孩子
操作:搜索
操作输入:Robert Trujillo有几个孩子?
观察:2
思考:4/4的Metallica乐队成员有10个孩子
最终答案:10

代码片段2. 在终端中执行ReAct代理

代码片段2确定Metallica乐队有四名成员。然后它搜索每个乐队成员以请求子女总数并累加总数。最后,它将子女总数作为最终答案返回。

实践中的ReAct提示需要理解,您必须不断重新发送之前的提示/响应(并修剪多余生成的内容),同时使用适当的示例/指令设置模型。请参考GoogleCloudPlatform Github仓库中托管的notebook14,它通过一个更详细的示例,更深入地展示了实际的LLM输入和输出。

---

## Page 40

### Original
Prompt Engineering

February 2025 40

Automatic Prompt Engineering

At this point you might realize that writing a prompt can be complex. Wouldn’t it be nice to

automate this (write a prompt to write prompts)? Well, there’s a method: Automatic Prompt

Engineering (APE). This method15 not only alleviates the need for human input but also

enhances the model’s performance in various tasks.

You will prompt a model to generate more prompts. Evaluate them, possibly alter the good

ones. And repeat.

For example, you can use automatic prompt engineering to help with training a chatbot for a merchandise t-shirt webshop. We want to figure out all the various ways customers could phrase their order for buying a band merchandise t-shirt.

1.   Write the prompt which will generate the output variants. In this example, I am using

gemini-pro to generate 10 instructions. See Table 15:

### Translation
提示工程

2025年2月 40

自动提示工程

此时你可能会意识到编写提示词可能很复杂。如果能自动化这个过程(编写提示词来生成提示词)岂不是很好?确实有这样一个方法:自动提示工程(APE)。这种方法15不仅减少了对人工输入的需求,还提高了模型在各种任务中的性能。

你将提示模型生成更多提示词。评估这些提示词,可能修改好的提示词。然后重复这个过程。

例如,你可以使用自动提示工程来帮助训练一个乐队纪念品T恤网店的聊天机器人。我们想找出顾客可以用各种方式表达购买乐队纪念品T恤订单的所有可能方式。

1. 编写将生成输出变体的提示词。在此示例中,我使用gemini-pro生成10条指令。参见表15:

---

## Page 41

### Original
Prompt Engineering

February 2025 41

Prompt We have a band merchandise t-shirt webshop, and to train a chatbot we need various ways to order: "One Metallica t-shirt size S". Generate 10 variants, with the same semantics but different phrasing.

Output 1.    I’d like to purchase a Metallica t-shirt in size small.

2.   Can I order a small-sized Metallica t-shirt?

3.   I’d like to buy one Metallica t-shirt in small.

4.   One Metallica shirt, size small, please.

5.   I want one Metallica t-shirt in size small.

6.   I’d like to place an order for one Metallica t-shirt in small.

7.   Small Metallica t-shirt, one please.

8.   I’m looking for one Metallica t-shirt in small.

9.   Can you get me one Metallica t-shirt in small?

10. One Metallica small t-shirt, please.

Table 15. Use an LLM to generate prompts

2.   Evaluate all of the instruction candidates by scoring them based on a chosen metric. For example, you can use BLEU (Bilingual Evaluation Understudy) or ROUGE (Recall-Oriented Understudy for Gisting Evaluation).

3.   Select the instruction candidate with the highest evaluation score. This candidate will be the final prompt you can use in your software application or chatbot. You can also tweak the selected prompt and evaluate again.
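As a minimal sketch of steps 2 and 3, the snippet below uses a crude word-overlap score as a stand-in for BLEU or ROUGE; the reference sentence and helper names are illustrative assumptions, not part of the APE method itself:

```python
def overlap_score(candidate: str, reference: str) -> float:
    # Crude stand-in for BLEU/ROUGE: fraction of reference words found in the candidate.
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    return len(cand & ref) / len(ref) if ref else 0.0

def select_best_prompt(candidates: list[str], reference: str) -> str:
    # Score every candidate instruction and keep the highest-scoring one.
    return max(candidates, key=lambda c: overlap_score(c, reference))

# Hypothetical usage with a few of the Table 15 variants:
reference = "I would like to order one Metallica t-shirt in size small."
candidates = [
    "I'd like to purchase a Metallica t-shirt in size small.",
    "Can I order a small-sized Metallica t-shirt?",
    "One Metallica small t-shirt, please.",
]
print(select_best_prompt(candidates, reference))
```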

### Translation
提示工程

2025年2月 41

提示:我们有一个乐队周边T恤网店,为了训练聊天机器人,我们需要各种不同的下单方式示例:"一件Metallica乐队T恤,尺码S"。请生成10个变体,保持相同语义但表达方式不同。

输出 1. 我想购买一件小码的Metallica乐队T恤。

2. 我可以订购一件小码的Metallica乐队T恤吗?

3. 我想买一件小码的Metallica乐队T恤。

4. 请给我一件小码的Metallica乐队T恤。

5. 我想要一件小码的Metallica乐队T恤。

6. 我想订购一件小码的Metallica乐队T恤。

7. 一件小码的Metallica乐队T恤,谢谢。

8. 我在找一件小码的Metallica乐队T恤。

9. 你能帮我拿一件小码的Metallica乐队T恤吗?

10. 请来一件Metallica小码T恤。

表15. 使用大型语言模型生成提示

2. 通过基于选定指标对候选提示进行评分来评估所有候选提示。例如,您可以使用BLEU(双语评估研究)或ROUGE(用于摘要评估的面向召回的评估研究)。

3. 选择评估得分最高的候选提示。该提示将成为您可以在软件应用程序或聊天机器人中使用的最终提示。您也可以调整所选提示并重新评估。

---

## Page 42

### Original
Prompt Engineering

February 2025 42

Code prompting

Gemini focuses primarily on text-based prompts, which also includes writing prompts for

returning code. Let’s go to the Vertex AI Studio and test these prompts to look at some

coding examples.

Prompts for writing code

Gemini can also be a developer and help you with writing code in any programming language

of your choice. As a developer, this can help you speed up the process of writing code.

Imagine a folder on your machine with hundreds of files that needs to be renamed.

Renaming each file would take you a lot of time. You know a little Bash, and could write

a script to automate this, but that may also take a while. So let's write a prompt. You can

write the prompt in the public consumer chatbot Gemini, or if you are more concerned

about confidentiality, you can write these prompts within your Google Cloud account and

open Vertex AI Studio. The advantage of Vertex AI Studio is that you can configure the

temperature etc.

### Translation
提示工程

2025年2月 42

代码提示

Gemini主要专注于基于文本的提示,这包括用于返回代码的编写提示。让我们进入Vertex AI Studio并测试这些提示,查看一些编码示例。

编写代码的提示

Gemini也可以成为开发者,帮助您用任何选定的编程语言编写代码。作为开发者,这可以帮助您加快编写代码的过程。

想象一下您机器上的一个文件夹中有数百个文件需要重命名。逐个重命名每个文件会花费您大量时间。您懂一些Bash,可以编写脚本来自动化这个过程,但这可能也需要一段时间。所以让我们写一个提示。

您可以在公开的消费者聊天机器人Gemini中编写提示,或者如果您更关注保密性,可以在您的Google Cloud账户中编写这些提示并打开Vertex AI Studio。Vertex AI Studio的优势是您可以配置温度等参数。

---

## Page 43

### Original
Prompt Engineering

February 2025 43

Name 1_bash_rename_files_in_folder

Goal Write a prompt to write code in Bash to rename files in a folder.

Model gemini-pro

Temperature 0.1 Token Limit 1024

Top-K N/A Top-P 1

Prompt Write a code snippet in Bash, which asks for a folder name.  Then it takes the contents of the folder and renames all the  files inside by prepending the name draft to the file name.

Output ```bash
#!/bin/bash

# Ask for the folder name
echo "Enter the folder name: "
read folder_name

# Check if the folder exists
if [ ! -d "$folder_name" ]; then
  echo "Folder does not exist."
  exit 1
fi

# Get the list of files in the folder
files=( "$folder_name"/* )

# Rename each file by prepending "draft" to the file name
for file in "${files[@]}"; do
  new_file_name="draft_$(basename "$file")"
  mv "$file" "$new_file_name"
done

echo "Files renamed successfully."
```

Table 16. Using Gemini to program code

That seems like good code to me - it's even documented! However, since LLMs can't reason and may simply repeat training data, it's essential to read and test your code first.

The moment we are all waiting for, does it really work?

### Translation
提示工程

2025年2月 43

名称:1_bash_rename_files_in_folder

目标:编写一个提示词,用于编写Bash代码来重命名文件夹中的文件。

模型:gemini-pro

温度:0.1 令牌限制:1024

Top-K:不可用 Top-P:1

提示:编写一个Bash代码片段,要求输入文件夹名称,然后获取文件夹内容,并通过在文件名前添加"draft"来重命名所有文件。

输出:
```bash
#!/bin/bash

# 要求输入文件夹名称
echo "请输入文件夹名称:"
read folder_name

# 检查文件夹是否存在
if [ ! -d "$folder_name" ]; then
  echo "文件夹不存在。"
  exit 1
fi

# 获取文件夹中的文件列表
files=( "$folder_name"/* )

# 通过在文件名前添加"draft"来重命名每个文件
for file in "${files[@]}"; do
  new_file_name="draft_$(basename "$file")"
  mv "$file" "$new_file_name"
done

echo "文件重命名成功。"</code></pre>
<p>表16. 使用Gemini编程</p>
<p>在我看来,这是很好的代码 - 甚至还有文档!然而,由于大型语言模型无法进行推理,只是重复训练数据,因此务必先阅读和测试您的代码。</p>
<p>我们都在等待的时刻,它真的有效吗?</p>
---

## Page 44

### Original
Prompt Engineering

February 2025 44

Let's try it first with a test folder with only a few files in it, which will need to be renamed from filename.txt to draft_filename.txt.

1. Copy the output from Table 16 (without the ```bash``` text wrapper), and paste it in a new file called: "rename_files.sh".

2. Open a terminal window and type: . rename_files.sh. It will ask to enter a folder name, e.g. test, and hit enter.

3. The script seems to run fine. You will see the message: Files renamed successfully.

When you look into the test folder, notice that all the files were perfectly renamed to draft_filename.txt.

It worked!

Prompts for explaining code

As a developer when you work in teams you have to read someone else's code. Gemini can help you with this as well. Let's take the code output from Table 16, remove the comments, and ask the large language model to explain what's happening. See Table 17:

### Translation
提示工程

2025年2月 44

让我们先用一个只包含几个文件的测试文件夹来尝试一下,这些文件需要从 filename.txt 重命名为 draft_filename.txt。

1. 复制表16中的输出(不包括 ```bash``` 文本包装器),并将其粘贴到一个名为 "rename_files.sh" 的新文件中。

2. 打开一个终端窗口并输入:. rename_files.sh。它会要求输入文件夹名称,例如 test,然后按回车键。

3. 脚本运行似乎很顺利。您将看到消息:Files renamed successfully。

当您查看测试文件夹时,请注意所有文件都已完美重命名为 draft_filename.txt。

成功了!

解释代码的提示词

作为一名开发者,当您在团队中工作时,必须阅读他人的代码。Gemini 也可以帮助您做到这一点。让我们使用表16中的代码输出,删除注释,然后要求大型语言模型解释发生了什么,请参见表17:
---

## Page 45

### Original
Prompt Engineering

February 2025 45

Name 1_bash_explain_the_code

Goal Write a prompt to explain Bash code

Model gemini-pro

Temperature 0.1 Token Limit 1024

Top-K N/A Top-P 1

Prompt Explain to me the below Bash code:

```
#!/bin/bash

echo "Enter the folder name: "
read folder_name

if [ ! -d "$folder_name" ]; then
  echo "Folder does not exist."
  exit 1
fi

files=( "$folder_name"/* )

for file in "${files[@]}"; do
  new_file_name="draft_$(basename "$file")"
  mv "$file" "$new_file_name"
done

echo "Files renamed successfully."
```

Output This Bash script performs the following tasks:

1.   **User Input**: It prompts the user to enter a folder name using the `echo` and `read` commands. The entered folder name is stored in the `folder_name` variable.

2.   **Folder Existence Check**: It checks if the folder with the specified name exists using the `[ ! -d "$folder_name" ]` condition. If the folder does not exist, it displays an error message ("Folder does not exist.") and exits the script with an exit code of 1.

Continues next page...

### Translation
提示工程

2025年2月 45

名称 1_bash_explain_the_code

目标 编写一个解释Bash代码的提示

模型 gemini-pro
温度 0.1 令牌限制 1024
Top-K 不可用 Top-P 1

提示 请向我解释下面的Bash代码:

``` #!/bin/bash

echo "Enter the folder name: " read folder_name

if [ ! -d "$folder_name" ]; then echo "Folder does not exist." exit 1 fi

files=( "$folder_name"/* )

for file in "${files[@]}"; do new_file_name="draft_$(basename "$file")" mv "$file" "$new_file_name" done

echo "Files renamed successfully." ```

输出 这个Bash脚本执行以下任务:

1.  **用户输入**:
   - 它使用`echo`和`read`命令提示用户输入文件夹名称。输入的文件夹名称存储在`folder_name`变量中。

2.  **文件夹存在性检查**:
   - 它使用`[ ! -d "$folder_name" ]`条件检查指定名称的文件夹是否存在。如果文件夹不存在,它会显示错误消息("文件夹不存在。")并以退出代码1退出脚本。

继续下一页...

---

## Page 46

### Original
Prompt Engineering

February 2025 46

Output 3.   **File Listing**: It lists all the files inside the specified folder using the `files=( "$folder_name"/* )` command. This creates an array called `files` containing the paths to all the files in the folder.

4.   **File Renaming**: It iterates through each file in the `files` array using a `for` loop. For each file, it generates a new file name using the `new_file_name="draft_$(basename "$file")"` command, prefixing the original file name with the string "draft_". It then renames the file using the `mv "$file" "$new_file_name"` command, which moves the file to the new file name.

5.   **Success Message**: After renaming all the files, it displays a success message ("Files renamed successfully.") to indicate that the renaming operation was successful.

Table 17. A prompt for explaining code

Prompts for translating code

The bash code from Table 16 seems to work fine. However, this script would really be reusable if it prompted me for file names; ideally, it should work as a separate application with a UI. As a starting point, Python would be a better language for a (web) application than Bash. LLMs can help with translating code from one language to another. See the example in Table 18:

### Translation
提示工程

2025年2月 46

3. **文件列表**:
   - 它使用 `files=( "$folder_name"/* )` 命令列出指定文件夹内的所有文件。这会创建一个名为 `files` 的数组,包含文件夹中所有文件的路径。

4. **文件重命名**:
   - 它使用 `for` 循环遍历 `files` 数组中的每个文件。
   - 对于每个文件,它使用 `new_file_name="draft_$(basename "$file")"` 命令生成新的文件名。这个新文件名通过在原始文件名前添加字符串 "draft_" 来创建。
   - 然后它使用 `mv "$file" "$new_file_name"` 命令重命名文件,该命令将文件移动到新的文件名。

5. **成功消息**:
   - 重命名所有文件后,它会显示一条成功消息("Files renamed successfully."),表示重命名操作已成功完成。

表17. 用于解释代码的提示

用于翻译代码的提示

表16中的bash代码似乎工作正常。然而,这个脚本如果能提示我输入文件名,将会更加实用,理想情况下它应该作为一个带有用户界面的独立应用程序运行。作为起点,Python对于(网络)应用程序来说是比Bash更好的语言。大型语言模型可以帮助将代码从一种语言翻译成另一种语言。参见表18中的示例:

---

## Page 47

### Original
Prompt Engineering

February 2025 47

Name 1_python_rename_files

Goal Write a prompt to translate Bash code to Python

Model gemini-pro

Temperature 0.1 Token Limit 1024

Top-K N/A Top-P 1

Prompt Translate the below Bash code to a Python snippet.

```bash
#!/bin/bash

echo "Enter the folder name: "
read folder_name

if [ ! -d "$folder_name" ]; then
  echo "Folder does not exist."
  exit 1
fi

files=( "$folder_name"/* )

for file in "${files[@]}"; do
  new_file_name="draft_$(basename "$file")"
  mv "$file" "$new_file_name"
done

echo "Files renamed successfully."
```

Output ```python
import os
import shutil

# Get the folder name from the user
folder_name = input("Enter the folder name: ")

# Check if the folder exists
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

Continues next page...

### Translation
提示工程

2025年2月 47

名称 1_python_rename_files

目标 编写一个将Bash代码转换为Python的提示词

模型 gemini-pro

温度 0.1 令牌限制 1024

Top-K 不可用 Top-P 1

提示 将下面的Bash代码转换为Python代码片段。

```bash
#!/bin/bash

echo "Enter the folder name: "
read folder_name

if [ ! -d "$folder_name" ]; then
  echo "Folder does not exist."
  exit 1
fi

files=( "$folder_name"/* )

for file in "${files[@]}"; do
  new_file_name="draft_$(basename "$file")"
  mv "$file" "$new_file_name"
done

echo "Files renamed successfully."
```

输出 ```python
import os
import shutil

# Get the folder name from the user
folder_name = input("Enter the folder name: ")

# Check if the folder exists
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

继续下一页...

---

## Page 48

### Original
Prompt Engineering

February 2025 48

Output # Get a list of all files in the folder
files = os.listdir(folder_name)

# Iterate over the files
for file in files:
    # Create a new file name with the prefix "draft_"
    new_file_name = f"draft_{file}"

    # Move the file to the new name
    shutil.move(os.path.join(folder_name, file),
                os.path.join(folder_name, new_file_name))

# Print a success message
print("Files renamed successfully.")
```

Table 18. A prompt for translating code from Bash to Python

Read and review the code. Copy the output from the prompt and paste it into a new file: file_renamer.py. Test the code by opening a Terminal window and executing the following command: python file_renamer.py.

NOTE: When prompting for (Python) code in the Language Studio in Vertex AI, you will have

to click on the ‘Markdown’ button. Otherwise you will receive plain text which is missing the

proper indenting of lines, which is important for running Python code.

Prompts for debugging and reviewing code

Let's manually write some edits to the code of Table 18. It should prompt the user for the filename prefix, and write this prefix in upper case characters. See the example code in Snippet 3. But what a bummer, it now returns Python errors!

### Translation
提示工程

2025年2月 48

# 获取文件夹中所有文件的列表
files = os.listdir(folder_name)

# 遍历文件
for file in files:
    # 创建带有前缀"draft_"的新文件名
    new_file_name = f"draft_{file}"

    # 将文件重命名为新名称
    shutil.move(os.path.join(folder_name, file), 
                os.path.join(folder_name, new_file_name))

# 打印成功消息
print("Files renamed successfully.") ```

表18. 用于将Bash代码转换为Python的提示

阅读并审查代码。复制提示的输出并粘贴到一个新文件中:file_renamer.py。通过打开终端窗口测试代码,并执行以下命令:python file_renamer.py。

注意:在Vertex AI的Language Studio中提示(Python)代码时,您需要点击'Markdown'按钮。否则您将收到纯文本,缺少适当的行缩进,这对运行Python代码很重要。

用于调试和审查代码的提示

让我们手动对表18的代码进行一些编辑。它应该提示用户输入文件名前缀,并将这个前缀以大写字符写入。请参见代码片段3中的示例代码。但真糟糕,现在它返回Python错误!

---

## Page 49

### Original
Prompt Engineering

February 2025 49

Python

import os
import shutil

folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")
text = toUpperCase(prefix)

if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

files = os.listdir(folder_name)

for file in files:
    new_filename = f"{text}_{file}"

shutil.move(os.path.join(folder_name, file),
    os.path.joi(folder_name, new_file_name))

print("Files renamed successfully.")

Snippet 3. A broken Python script

Doh! That looks like a bug:

The below Python code gives an error:

Traceback (most recent call last):
  File "/Users/leeboonstra/Documents/test_folder/rename_files.py", line 7, in <module>
    text = toUpperCase(prefix)
NameError: name 'toUpperCase' is not defined

Snippet 4. I broke the Python code

Let’s see if we can ask the large language model to debug and review code. Have a look into

Table 19:

### Translation
提示工程

2025年2月 49

Python

import os
import shutil

folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")
text = toUpperCase(prefix)

if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

files = os.listdir(folder_name)

for file in files:
    new_filename = f"{text}_{file}"

shutil.move(os.path.join(folder_name, file),
    os.path.joi(folder_name, new_file_name))

print("Files renamed successfully.")

代码片段3. 一个损坏的Python脚本

哎呀!这看起来是个错误:

下面的Python代码出现了错误:

Traceback (most recent call last):   File "/Users/leeboonstra/Documents/test_folder/rename_files.py", line 7,  in <module>     text = toUpperCase(prefix) NameError: name 'toUpperCase' is not defined

代码片段4. 我弄坏了Python代码

让我们看看是否可以请求大型语言模型来调试和审查代码。请查看表19:

---

## Page 50

### Original
Prompt Engineering

February 2025 50

Name 1_python_debug_code

Goal Write a prompt to debug and review Python code.

Model gemini-pro

Temperature 0.1 Token Limit 1024

Top-K N/A Top-P 1

Prompt The below Python code gives an error:

Traceback (most recent call last):
  File "/Users/leeboonstra/Documents/test_folder/rename_files.py", line 7, in <module>
    text = toUpperCase(prefix)
NameError: name 'toUpperCase' is not defined

Debug what's wrong and explain how I can improve the code.

```python
import os
import shutil

# Get the folder name from the user
folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")
text = toUpperCase(prefix)

# Check if the folder exists
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

# Get a list of all files in the folder
files = os.listdir(folder_name)

# Iterate over the files
for file in files:

Continues next page...

### Translation
提示工程

2025年2月 50

名称 1_python_debug_code

目标:编写一个用于调试和审查Python代码的提示。

模型:gemini-pro

温度:0.1 令牌限制:1024

Top-K N/A Top-P 1

提示:下面的Python代码出现错误:

回溯(最近一次调用):文件 "/Users/leeboonstra/Documents/test_folder/rename_files.py",第7行,在<module>中
text = toUpperCase(prefix)
NameError: 名称 'toUpperCase' 未定义

调试问题所在并解释如何改进代码。

```python
import os
import shutil

# 从用户获取文件夹名称
folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")
text = toUpperCase(prefix)

# 检查文件夹是否存在
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

# 获取文件夹中所有文件的列表
files = os.listdir(folder_name)

# 遍历文件
for file in files:

继续下一页...

---

## Page 51

### Original
Prompt Engineering

February 2025 51

Prompt     # Create a new file name with the prefix "draft_"
    new_filename = f"{text}_{file}"

# Move the file to the new name
shutil.move(os.path.join(folder_name, file),
    os.path.join(folder_name, new_file_name))

# Print a success message
print("Files renamed successfully.")
```

Output The code calls the `toUpperCase` function to convert `prefix` to uppercase, but that function is not defined. To fix this issue, you can use the `upper()` method of the string class, which converts a given string into uppercase. The modified code is shown below:

```python
import os
import shutil

# Get the folder name from the user
folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")
text = prefix.upper()

# Check if the folder exists
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

# Get a list of all files in the folder
files = os.listdir(folder_name)

# Iterate over the files
for file in files:
    # Create a new file name with the prefix "draft_"
    new_filename = f"{text}_{file}"

# Move the file to the new name
shutil.move(os.path.join(folder_name, file),

Continues next page...

### Translation
提示工程

2025年2月 51

提示 # 使用前缀"draft_"创建新文件名
    new_filename = f"{text}_{file}"

# 将文件移动到新名称
shutil.move(os.path.join(folder_name, file),
    os.path.join(folder_name, new_file_name))

# 打印成功消息
print("文件重命名成功。")

输出 代码调用`toUpperCase`函数将`prefix`转换为大写,但该函数未定义。要解决此问题,可以使用字符串类的`upper()`方法,该方法将给定字符串转换为大写。修改后的代码如下:

```python
import os
import shutil

# 从用户获取文件夹名称
folder_name = input("请输入文件夹名称: ")
prefix = input("请输入要添加到文件名前面的字符串: ")
text = prefix.upper()

# 检查文件夹是否存在
if not os.path.isdir(folder_name):
    print("文件夹不存在。")
    exit(1)

# 获取文件夹中所有文件的列表
files = os.listdir(folder_name)

# 遍历文件
for file in files:
    # 使用前缀"draft_"创建新文件名
    new_filename = f"{text}_{file}"

# 将文件移动到新名称
shutil.move(os.path.join(folder_name, file),

继续下一页...

---

## Page 52

### Original
Prompt Engineering

February 2025 52

Output os.path.join(folder_name, new_filename))

# Print a success message
print("Files renamed successfully.")
```

Additionally, there are a few other improvements that can be made to the code:

1.   The file extension of the new filenames is not kept. It's better to extract the extension from the original filename and append it to the new filename.

2.   The code assumes that there are no spaces in the folder name. However, it is a good practice to handle spaces gracefully by replacing them with underscores or hyphens.

3.   Using the `f` string syntax for string interpolation is more readable and concise than the traditional `+` operator.

4.   The code doesn't handle errors that might occur during the renaming process. It would be better to wrap the `shutil.move` call in a `try...except` block to catch any potential errors.

Here is the improved code with these suggestions:

```python
import os
import shutil

# Get the folder name from the user
folder_name = input("Enter the folder name: ")
prefix = input("Enter the string to prepend to the filename: ")

# Check if the folder exists
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    exit(1)

# Get a list of all files in the folder
files = os.listdir(folder_name)

# Iterate over the files
for file in files:

Continues next page...

### Translation
提示工程

2025年2月 52

输出 os.path.join(folder_name, new_filename))

# 打印成功消息
print("文件重命名成功。")
```

此外,还可以对代码做一些改进:

1. 新文件名没有保留文件扩展名。最好从原始文件名中提取扩展名并将其附加到新文件名上。

2. 代码假设文件夹名称中没有空格。然而,最好通过将空格替换为下划线或连字符来优雅地处理空格。

3. 使用 `f` 字符串语法进行字符串插值比传统的 `+` 运算符更具可读性和简洁性。

4. 代码没有处理重命名过程中可能发生的错误。最好将 `shutil.move` 调用包装在 `try...except` 块中,以捕获任何潜在的错误。

以下是包含这些建议的改进代码:

```python
import os
import shutil

# 从用户获取文件夹名称
folder_name = input("请输入文件夹名称:")
prefix = input("请输入要添加到文件名前面的字符串:")

# 检查文件夹是否存在
if not os.path.isdir(folder_name):
    print("文件夹不存在。")
    exit(1)

# 获取文件夹中所有文件的列表
files = os.listdir(folder_name)

# 遍历文件
for file in files:

继续下一页...

---

## Page 53

### Original
Prompt Engineering

February 2025 53

Output     # Get the file extension
    ext = os.path.splitext(file)[1]

    # Create a new file name with the prefix and the file extension
    new_filename = f"{prefix}_{os.path.splitext(file)[0]}{ext}"

    # Build the full paths for the original and new files
    old_path = os.path.join(folder_name, file)
    new_path = os.path.join(folder_name, new_filename)

    # Try to move the file
    try:
        shutil.move(old_path, new_path)
    except Exception as e:
        print(f"An error occurred while moving the file {file}: {e}")

# Print a success message
print("Files renamed successfully.")
```

These improvements make the code more robust and flexible while also making it easier to read and understand... (The response was truncated because it has reached the token limit. Try to increase the token limit if you need a longer response.)

Table 19. A prompt for debugging and reviewing Python code

That's great. Not only did it tell me how to solve the problem, it also figured out that my code had more bugs and how to solve them, too. The last part of the response gave suggestions for improving the code in general.

### Translation
提示工程

2025年2月 53

输出     # 获取文件扩展名
    ext = os.path.splitext(file)[1]

    # 使用前缀和文件扩展名创建新文件名
    new_filename = f"{prefix}_{os.path.splitext(file)[0]}{ext}"

    # 构建原始文件和新文件的完整路径
    old_path = os.path.join(folder_name, file)
    new_path = os.path.join(folder_name, new_filename)

    # 尝试移动文件
    try:
        shutil.move(old_path, new_path)
    except Exception as e:
        print(f"移动文件 {file} 时发生错误: {e}")

# 打印成功消息
print("文件重命名成功。")
```

这些改进使代码更加健壮和灵活,同时也使其更易于阅读和理解...(由于达到令牌限制,响应被截断。如果需要更长的响应,请尝试增加令牌限制。)

表19. 用于调试和审查Python代码的提示

太好了。它不仅告诉我如何解决问题,还发现我的代码有更多错误以及如何解决它们。回答的最后一部分提供了改进代码的一般性建议。

---

## Page 54

### Original
Prompt Engineering

February 2025 54

What about multimodal prompting?

Prompting for code still uses the same regular large language model. Multimodal prompting is a separate concern; it refers to a technique where you use multiple input formats to guide a large language model, instead of just relying on text. This can include combinations of text, images, audio, code, or even other formats, depending on the model's capabilities and the task at hand.

Best Practices

Finding the right prompt requires tinkering. Language Studio in Vertex AI is a perfect place to

play around with your prompts, with the ability to test against the various models.

Use the following best practices to become a pro in prompt engineering.

Provide examples

The most important best practice is to provide (one shot / few shot) examples within a

prompt. This is highly effective because it acts as a powerful teaching tool. These examples

showcase desired outputs or similar responses, allowing the model to learn from them

and tailor its own generation accordingly. It’s like giving the model a reference point or

target to aim for, improving the accuracy, style, and tone of its response to better match

your expectations.

### Translation
提示工程

2025年2月 54

多模态提示又如何?

代码提示仍然使用相同的常规大型语言模型。多模态提示是一个独立的概念,它指的是一种使用多种输入格式来引导大型语言模型的技术,而不是仅仅依赖文本。根据模型的能力和手头的任务,这可以包括文本、图像、音频、代码甚至其他格式的组合。

最佳实践

找到合适的提示需要不断尝试和调整。Vertex AI中的Language Studio是测试您提示的完美场所,可以针对各种模型进行测试。使用以下最佳实践来成为提示工程的专业人士。

提供示例

最重要的最佳实践是在提示中提供(单样本/少样本)示例。这非常有效,因为它充当了一个强大的教学工具。这些示例展示了期望的输出或类似的响应,使模型能够从中学习并相应地调整自己的生成。这就像给模型一个参考点或目标,以提高其响应的准确性、风格和语调,以更好地符合您的期望。

---

## Page 55

### Original
Prompt Engineering

February 2025 55

Design with simplicity

Prompts should be concise, clear, and easy to understand for both you and the model. As a

rule of thumb, if it’s already confusing for you it will likely be also confusing for the model. Try

not to use complex language and don’t provide unnecessary information.

Examples:

BEFORE:

I am visiting New York right now, and I'd like to hear more about great

locations. I am with two 3 year old kids. Where should we go during

our vacation?

AFTER REWRITE:

Act as a travel guide for tourists. Describe great places to visit in

New York Manhattan with a 3 year old.

Try using verbs that describe the action. Here’s a set of examples:

Act, Analyze, Categorize, Classify, Contrast, Compare, Create, Describe, Define,

Evaluate, Extract, Find, Generate, Identify, List, Measure, Organize, Parse, Pick,

Predict, Provide, Rank, Recommend, Return, Retrieve, Rewrite, Select, Show, Sort,

Summarize, Translate, Write.

### Translation
提示工程

2025年2月 55

简洁设计

提示词应该简洁、清晰,并且让你和模型都容易理解。经验法则是,如果连你自己都觉得困惑,模型很可能也会困惑。尽量避免使用复杂语言,不要提供不必要的信息。

示例:

修改前:
我现在正在纽约旅行,想了解更多关于优秀景点的地方。我和两个3岁的孩子在一起。我们在度假期间应该去哪里?

修改后:
扮演一名旅游向导,为游客描述纽约曼哈顿适合3岁儿童参观的绝佳地点。

尽量使用描述动作的动词。以下是一组示例:
行动、分析、分类、归类、对比、比较、创建、描述、定义、评估、提取、查找、生成、识别、列出、测量、组织、解析、挑选、预测、提供、排序、推荐、返回、检索、重写、选择、展示、整理、总结、翻译、写作。

---

## Page 56

### Original
Prompt Engineering

February 2025 56

Be specific about the output

Be specific about the desired output. A concise instruction might not guide the LLM enough

or could be too generic. Providing specific details in the prompt (through system or context

prompting) can help the model to focus on what’s relevant, improving the overall accuracy.

Examples:

DO:

Generate a 3 paragraph blog post about the top 5 video game consoles.

The blog post should be informative and engaging, and it should be

written in a conversational style.

DO NOT:

Generate a blog post about video game consoles.

Use Instructions over Constraints

Instructions and constraints are used in prompting to guide the output of a LLM.

•    An instruction provides explicit instructions on the desired format, style, or content of the

response. It guides the model on what the model should do or produce.

•    A constraint is a set of limitations or boundaries on the response. It limits what the model

should not do or avoid.

### Translation
提示工程

2025年2月 56

明确输出要求

对期望的输出要具体明确。简洁的指令可能不足以引导大语言模型(LLM),或者可能过于笼统。在提示词中提供具体细节(通过系统提示或上下文提示)可以帮助模型专注于相关信息,从而提高整体准确性。

示例:

正确做法:
撰写一篇关于五大游戏主机评测的三段式博客文章。文章应兼具信息性和吸引力,并采用口语化的写作风格。

错误做法:
撰写一篇关于游戏主机的博客文章。

优先使用指令而非限制

在提示词中使用指令和约束来引导大语言模型的输出。

• 指令对期望的格式、风格或内容提供明确指导。它告诉模型应该做什么或产出什么。
• 约束是对响应的一组限制或边界。它规定了模型不应该做什么或需要避免什么。

---

## Page 57

### Original
Prompt Engineering

February 2025 57

Growing research suggests that focusing on positive instructions in prompting can be more

effective than relying heavily on constraints. This approach aligns with how humans prefer

positive instructions over lists of what not to do.

Instructions directly communicate the desired outcome, whereas constraints might leave the

model guessing about what is allowed. It gives flexibility and encourages creativity within the

defined boundaries, while constraints can limit the model's potential. Also, the constraints in a list can clash with each other.

Constraints are still valuable in certain situations: to prevent the model from generating harmful or biased content, or when a strict output format or style is needed.

If possible, use positive instructions: instead of telling the model what not to do, tell it what to

do instead. This can avoid confusion and improve the accuracy of the output.

DO:

Generate a 1 paragraph blog post about the top 5 video game consoles.

Only discuss the console, the company who made it, the year, and total

sales.

DO NOT:

Generate a 1 paragraph blog post about the top 5 video game consoles.

Do not list video game names.

As a best practice, start by prioritizing instructions, clearly stating what you want the model

to do and only use constraints when necessary for safety, clarity or specific requirements.

Experiment and iterate to test different combinations of instructions and constraints to find

what works best for your specific tasks, and document these.

### Translation
提示工程

2025年2月 57

越来越多的研究表明,在提示中注重正面指令比过度依赖约束更为有效。这种方法与人类倾向于喜欢正面指令而非禁忌列表的方式相一致。

指令直接传达期望的结果,而约束可能会让模型猜测哪些是被允许的。它在既定边界内提供灵活性并鼓励创造力,而约束则可能限制模型的潜力。此外,约束列表之间也可能相互冲突。

约束在某些情况下仍然很有价值。例如,为了防止模型生成有害或有偏见的内容,或者需要严格的输出格式或风格时。

如果可能,使用正面指令:不要告诉模型不要做什么,而是告诉它应该做什么。这样可以避免混淆并提高输出的准确性。

正确做法:
生成一段关于五大游戏主机的博客文章。
只讨论主机、制造公司、发布年份和总销量。

错误做法:
生成一段关于五大游戏主机的博客文章。
不要列出游戏名称。

作为最佳实践,首先优先考虑指令,明确说明你希望模型做什么,仅在安全、清晰性或特定要求必要时才使用约束。

实验和迭代,测试不同的指令和约束组合,以找到最适合你特定任务的方法,并记录这些发现。

---

## Page 58

### Original
Prompt Engineering

February 2025 58

Control the max token length

To control the length of a generated LLM response, you can either set a max token limit in the

configuration or explicitly request a specific length in your prompt. For example:

"Explain quantum physics in a tweet length message."

Use variables in prompts

To reuse prompts and make them more dynamic, use variables in the prompt, which can be

changed for different inputs. E.g. as shown in Table 20, a prompt which gives facts about a

city. Instead of hardcoding the city name in the prompt, use a variable. Variables can save

you time and effort by allowing you to avoid repeating yourself. If you need to use the same

piece of information in multiple prompts, you can store it in a variable and then reference

that variable in each prompt. This makes a lot of sense when integrating prompts into your

own applications.

Prompt VARIABLES {city} = "Amsterdam"

PROMPT You are a travel guide. Tell me a fact about the city: {city}

Output Amsterdam is a beautiful city full of canals, bridges, and narrow streets. It’s a  great place to visit for its rich history, culture, and nightlife.

Table 20. Using variables in prompts
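In application code, such a variable is plain string templating. A minimal sketch of the Table 20 prompt as a reusable function:

```python
def build_prompt(city: str) -> str:
    # The {city} variable from Table 20 becomes an ordinary function parameter.
    return f"You are a travel guide. Tell me a fact about the city: {city}"

for city in ["Amsterdam", "Tokyo", "Lagos"]:
    prompt = build_prompt(city)  # send each prompt to the model in your application
    print(prompt)
```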

### Translation
提示工程

2025年2月 58

控制最大令牌长度

要控制生成的大语言模型(LLM)响应的长度,您可以在配置中设置最大令牌限制,或者在提示中明确请求特定长度。例如:

"用一条推文的长度解释量子物理学。"

在提示中使用变量

为了重用提示并使其更具动态性,可以在提示中使用变量,这些变量可以根据不同的输入进行更改。例如,如表20所示,这是一个提供城市事实的提示。不要在提示中硬编码城市名称,而是使用变量。变量可以帮助您节省时间和精力,避免重复自己。如果您需要在多个提示中使用相同的信息,可以将其存储在变量中,然后在每个提示中引用该变量。当将提示集成到您自己的应用程序中时,这是非常有意义的。

提示变量 {city} = "阿姆斯特丹"

提示:您是一名旅游指南。告诉我一个关于这个城市的事实:{city}

输出:阿姆斯特丹是一个充满运河、桥梁和狭窄街道的美丽城市。由于其丰富的历史、文化和夜生活,它是一个很棒的地方。

表20. 在提示中使用变量

---

## Page 59

### Original
Prompt Engineering

February 2025 59

Experiment with input formats and writing styles

Different models, model configurations, prompt formats, word choices, and submissions can yield different results. Therefore, it's important to experiment with prompt attributes like the style, the word choice, and the type of prompt (zero shot, few shot, system prompt).

For example a prompt with the goal to generate text about the revolutionary video game

console Sega Dreamcast, can be formulated as a question, a statement or an instruction,

resulting in different outputs:

•    Question: What was the Sega Dreamcast and why was it such a revolutionary console?

•    Statement: The Sega Dreamcast was a sixth-generation video game console released by

Sega in 1999. It...

•    Instruction: Write a single paragraph that describes the Sega Dreamcast console and

explains why it was so revolutionary.

For few-shot prompting with classification tasks, mix up the classes

Generally speaking, the order of your few-shot examples should not matter much. However,

when doing classification tasks, make sure you mix up the possible response classes in the

few shot examples. This is because you might otherwise be overfitting to the specific order of

the examples. By mixing up the possible response classes, you can ensure that the model is

learning to identify the key features of each class, rather than simply memorizing the order of

the examples. This will lead to more robust and generalizable performance on unseen data.
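A simple way to avoid baking in a fixed class order is to shuffle the examples each time you assemble the prompt. A small sketch, assuming a list of (text, label) pairs; the example emails are made up for illustration:

```python
import random

# Made-up (text, label) pairs for illustration.
examples = [
    ("Win a free cruise now!!!", "NOT IMPORTANT"),
    ("Your server is down, please respond.", "IMPORTANT"),
    ("Lunch menu for next week attached.", "NOT IMPORTANT"),
    ("Security bug found in the contact form.", "IMPORTANT"),
]

def build_few_shot_prompt(email: str) -> str:
    shots = examples[:]
    random.shuffle(shots)  # mix up the classes so the model cannot memorize an order
    shot_text = "\n\n".join(f"EMAIL: {t}\nCLASS: {label}" for t, label in shots)
    return f"{shot_text}\n\nEMAIL: {email}\nCLASS:"
```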

### Translation
提示工程

2025年2月 59

尝试不同的输入格式和写作风格

不同的模型、模型配置、提示格式、词汇选择和提交方式会产生不同的结果。因此,实验提示属性,如风格、词汇选择和提示类型(零样本、少样本、系统提示)非常重要。

例如,一个旨在生成关于革命性视频游戏主机Sega Dreamcast文本的提示,可以表述为问题、陈述或指令,从而产生不同的输出:

• 问题:Sega Dreamcast是什么,为什么它是一款革命性的游戏主机?
• 陈述:Sega Dreamcast是世嘉于1999年推出的第六代视频游戏主机。它...
• 指令:写一个段落描述Sega Dreamcast游戏主机,并解释为什么它如此具有革命性。

对于少样本提示的分类任务,要混合不同的类别

一般来说,你的少样本示例的顺序应该不太重要。然而,在进行分类任务时,请确保在少样本示例中混合可能的响应类别。这是因为否则你可能会过度拟合示例的特定顺序。通过混合可能的响应类别,你可以确保模型正在学习识别每个类别的关键特征,而不是简单地记住示例的顺序。这将导致在未见过的数据上具有更强健和更可泛化的性能。

---

## Page 60

### Original
Prompt Engineering

February 2025 60

A good rule of thumb is to start with 6 few shot examples and start testing the accuracy

from there.

Adapt to model updates

It’s important for you to stay on top of model architecture changes, added data, and

capabilities. Try out newer model versions and adjust your prompts to better leverage new

model features. Tools like Vertex AI Studio are great to store, test, and document the various

versions of your prompt.

Experiment with output formats

Besides the prompt input format, consider experimenting with the output format. For non-creative tasks like extracting, selecting, parsing, ordering, ranking, or categorizing data, try

having your output returned in a structured format like JSON or XML.

There are some benefits in returning JSON objects from a prompt that extracts data. In

a real-world application I don’t need to manually create this JSON format, I can already

return the data in a sorted order (very handy when working with datetime objects), but most

importantly, by prompting for a JSON format it forces the model to create a structure and

limit hallucinations.

In summary, benefits of using JSON for your output:

•    Returns always in the same style

•    Focus on the data you want to receive

### Translation
提示工程

2025年2月 60

一个好的经验法则是先从6个少样本示例开始,然后从那里开始测试准确性。

适应模型更新

密切关注模型架构的变化、新增的数据和能力对您来说很重要。尝试使用更新的模型版本,并调整您的提示以更好地利用新的模型功能。像Vertex AI Studio这样的工具非常适合存储、测试和记录您的提示的各种版本。

尝试输出格式

除了提示输入格式外,还可以尝试输出格式。对于提取、选择、解析、排序、排名或分类数据等非创造性任务,尝试让您的输出以JSON或XML等结构化格式返回。

从提取数据的提示返回JSON对象有一些好处。在实际应用中,我不需要手动创建这种JSON格式,我已经可以按排序顺序返回数据(处理datetime对象时非常方便),但最重要的是,通过要求JSON格式,它迫使模型创建结构并限制幻觉。

总之,使用JSON作为输出的好处有:
• 始终以相同风格返回
• 专注于您想要接收的数据

---

## Page 61

### Original
Prompt Engineering

February 2025 61

•    Less chance for hallucinations

•    Make it relationship aware

•    You get data types

•    You can sort it

Table 4 in the few-shot prompting section shows an example on how to return

structured output.

JSON Repair

While returning data in JSON format offers numerous advantages, it's not without its

drawbacks. The structured nature of JSON, while beneficial for parsing and use in

applications, requires significantly more tokens than plain text, leading to increased

processing time and higher costs. Furthermore, JSON's verbosity can easily consume the

entire output window, becoming especially problematic when the generation is abruptly cut

off due to token limits. This truncation often results in invalid JSON, missing crucial closing

braces or brackets, rendering the output unusable. Fortunately, tools like the json-repair

library (available on PyPI) can be invaluable in these situations. This library intelligently

attempts to automatically fix incomplete or malformed JSON objects, making it a crucial

ally when working with LLM-generated JSON, especially when dealing with potential

truncation issues.
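As a sketch of how this fits into a pipeline (check the json-repair documentation for the exact API of the version you install):

```python
import json
from json_repair import repair_json  # pip install json-repair

# A response cut off mid-object, e.g. by the output token limit.
truncated = '{"name": "Wireless Headphones", "features": ["Noise cancellation"'

fixed = repair_json(truncated)  # best-effort repair: closes the open array and object
data = json.loads(fixed)        # now parses as a normal Python dict
print(data["name"])
```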

### Translation
提示工程

2025年2月 61

• 减少幻觉的可能性
• 使其具备关系感知能力
• 获取数据类型
• 支持排序

少样本提示(few-shot prompting)部分的表4展示了如何返回结构化输出的示例。

JSON修复

虽然以JSON格式返回数据具有诸多优势,但也并非没有缺点。JSON的结构化特性虽然有利于解析和在应用程序中使用,但它比纯文本需要更多的token,导致处理时间增加和成本提高。此外,JSON的冗长特性很容易占用整个输出窗口,当因token限制导致生成被突然截断时,这会成为一个特别严重的问题。这种截断通常会导致无效的JSON,缺少关键的大括号或方括号,使输出无法使用。幸运的是,像json-repair库(在PyPI上可用)这样的工具在这种情况下可以发挥极其宝贵的作用。这个库智能地尝试自动修复不完整或格式错误的JSON对象,使其在使用LLM生成的JSON时成为至关重要的助手,特别是在处理潜在的截断问题时。

---

## Page 62

### Original
Prompt Engineering

February 2025 62

Working with Schemas

Using structured JSON as an output is a great solution, as we've seen multiple times in this

paper. But what about input? While JSON is excellent for structuring the output the LLM

generates, it can also be incredibly useful for structuring the input you provide. This is where

JSON Schemas come into play. A JSON Schema defines the expected structure and data

types of your JSON input. By providing a schema, you give the LLM a clear blueprint of the

data it should expect, helping it focus its attention on the relevant information and reducing

the risk of misinterpreting the input. Furthermore, schemas can help establish relationships

between different pieces of data and even make the LLM "time-aware" by including date or

timestamp fields with specific formats.

Here's a simple example:

Let's say you want to use an LLM to generate descriptions for products in an e-commerce

catalog. Instead of just providing a free-form text description of the product, you can use a

JSON schema to define the product's attributes:

Snippet 5. Definition of the structured output schema

```json
{
  "type": "object",
  "properties": {
    "name": { "type": "string", "description": "Product name" },
    "category": { "type": "string", "description": "Product category" },
    "price": { "type": "number", "format": "float", "description": "Product price" },
    "features": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Key features of the product"
    },
    "release_date": { "type": "string", "format": "date", "description": "Date the product was released" }
  }
}
```

### Translation
提示工程

2025年2月 62

使用模式

使用结构化的JSON作为输出是一个很好的解决方案,正如我们在本文中多次看到的那样。那么输入呢?虽然JSON非常适合结构化LLM生成的输出,但它也可以非常有用地用于结构化您提供的输入。这就是JSON模式发挥作用的地方。JSON模式定义了您JSON输入的预期结构和数据类型。通过提供模式,您为LLM提供了一个清晰的数据蓝图,告诉它应该期望什么样的数据,帮助它专注于相关信息,并降低误解输入的风险。此外,模式还可以帮助建立不同数据之间的关系,甚至通过包含具有特定格式的日期或时间戳字段,使LLM具备"时间感知"能力。

这是一个简单的例子:

假设您想使用LLM为电子商务目录中的产品生成描述。您不仅可以提供产品的自由文本描述,还可以使用JSON模式来定义产品的属性:

代码片段5. 结构化输出模式的定义

```json
{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "产品名称"
    },
    "category": {
      "type": "string",
      "description": "产品类别"
    },
    "price": {
      "type": "number",
      "format": "float",
      "description": "产品价格"
    },
    "features": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "产品的主要特性"
    },
    "release_date": {
      "type": "string",
      "format": "date",
      "description": "产品发布日期"
    }
  }
}
```

---

## Page 63

### Original

Prompt Engineering

February 2025 63

Then, you can provide the actual product data as a JSON object that conforms to this schema:

```json
{
  "name": "Wireless Headphones",
  "category": "Electronics",
  "price": 99.99,
  "features": ["Noise cancellation", "Bluetooth 5.0", "20-hour battery life"],
  "release_date": "2023-10-27"
}
```

Snippet 6. Structured output from the LLM

By preprocessing your data and, instead of providing full documents, providing only the schema and the data, you give the LLM a clear understanding of the product's attributes, including its release date, making it much more likely to generate an accurate and relevant description. This structured input approach, guiding the LLM's attention to the relevant fields, is especially valuable when working with large volumes of data or when integrating LLMs into complex applications.

Experiment together with other prompt engineers

If you are in a situation where you have to try to come up with a good prompt, you might want to find multiple people to make an attempt. When everyone follows the best practices (as listed in this chapter) you are going to see a variance in performance between all the different prompt attempts.

### Translation

提示工程

2025年2月 63

然后,您可以将实际产品数据作为符合此架构的JSON对象提供:

```json
{
  "name": "Wireless Headphones",
  "category": "Electronics",
  "price": 99.99,
  "features": ["Noise cancellation", "Bluetooth 5.0", "20-hour battery life"],
  "release_date": "2023-10-27"
}
```

代码片段6. 来自LLM的结构化输出

通过预处理您的数据,不提供完整文档,而是只提供架构和数据,您可以让LLM清楚地了解产品的属性,包括其发布日期,从而更有可能生成准确和相关的描述。这种结构化输入方法,引导LLM关注相关字段,在处理大量数据或将LLM集成到复杂应用程序中时特别有价值。

与其他提示工程师一起进行实验

如果您处于需要尝试想出好的提示词的情况,您可能需要找多个人来尝试。当每个人都遵循最佳实践(如本章所列)时,您会看到所有不同提示尝试之间的性能差异。


---

## Page 64

### Original

Prompt Engineering

February 2025 64

CoT Best practices

For CoT prompting, putting the answer after the reasoning is required because the

generation of the reasoning changes the tokens that the model gets when it predicts the

final answer.

With CoT and self-consistency you need to be able to extract the final answer from your

prompt, separated from the reasoning.
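A common pattern is to instruct the model to end with a fixed marker and then parse for that marker afterwards; a minimal sketch:

```python
import re

def extract_final_answer(response: str) -> str | None:
    # Works when the prompt instructs: end with the line "Final Answer: <answer>".
    match = re.search(r"Final Answer:\s*(.+)", response)
    return match.group(1).strip() if match else None

print(extract_final_answer("Step 1: ...\nStep 2: ...\nFinal Answer: 10"))  # -> 10
```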

For CoT prompting, set the temperature to 0.

Chain of thought prompting is based on greedy decoding, predicting the next word in a sequence based on the highest probability assigned by the language model. Generally speaking, when using reasoning to come up with the final answer, there's likely one single correct answer. Therefore the temperature should always be set to 0.

Document the various prompt attempts

The last tip was mentioned before in this chapter, but we can't stress enough how important it is: document your prompt attempts in full detail so you can learn over time what went well and what did not.

Prompt outputs can differ across models, across sampling settings, and even across different versions of the same model. Moreover, even across identical prompts to the same model, small differences in output sentence formatting and word choice can occur. (For example, as mentioned previously, if two tokens have the same predicted probability, ties may be broken randomly. This can then impact subsequent predicted tokens.)

Translation


思维链最佳实践

对于思维链提示,必须将答案放在推理之后,因为推理过程的生成会改变模型在预测最终答案时获得的token。

在使用思维链和自我一致性时,你需要能够从提示中提取最终答案,并将其与推理过程分离。

对于思维链提示,将温度设置为0。

思维链提示基于贪心解码,根据语言模型分配的最高概率预测序列中的下一个词。一般来说,在使用推理得出最终答案时,很可能只有一个正确的答案。因此,温度应始终设置为0。
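下面用一个最小的 Python 草图说明如何把最终答案从推理过程中分离出来,并在自我一致性场景下做多数投票。其中"答案:"这一结尾格式是本示例的假设约定;另外请注意,自我一致性需要温度大于0来采样多条不同的推理路径,这与单路径思维链的温度设置不同:

```python
import re
from collections import Counter

def extract_final_answer(response: str) -> str | None:
    """从思维链输出中提取最终答案。
    假设提示已要求模型以"答案:<最终答案>"的固定格式结尾,
    该格式是本示例的约定,可替换为任何易于解析的标记。"""
    match = re.search(r"答案[::]\s*(.+)", response)
    return match.group(1).strip() if match else None

def self_consistency_vote(responses: list[str]) -> str | None:
    """自我一致性:对多条推理路径提取出的答案做多数投票。"""
    answers = [a for a in (extract_final_answer(r) for r in responses) if a]
    return Counter(answers).most_common(1)[0][0] if answers else None

# 用法示例:三条推理路径中有两条得出"25"
responses = [
    "先算……再算……答案:25",
    "另一条路径:……答案:25",
    "第三条路径:……答案:24",
]
print(self_consistency_vote(responses))  # 输出: 25
```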

记录各种提示尝试

上一条技巧在本章前面已经提到过,但我们再怎么强调其重要性也不为过:详细记录你的提示尝试,这样你可以随时间学习哪些方法有效,哪些无效。

提示输出可能因模型、采样设置甚至同一模型的不同版本而异。此外,即使对同一模型使用完全相同的提示,输出句子的格式和用词上也可能出现微小差异。(例如,如前所述,如果两个token具有相同的预测概率,平局可能被随机打破,这可能会影响后续预测的token。)


Page 65

Original


We recommend creating a Google Sheet with Table 21 as a template. The advantage of this approach is that you have a complete record when you inevitably have to revisit your prompting work: whether to pick it up in the future (you'd be surprised how much you can forget after just a short break), to test prompt performance on different versions of a model, or to help debug future errors.

Beyond the fields in this table, it's also helpful to track the version of the prompt (iteration), a field to capture whether the result was OK/NOT OK/SOMETIMES OK, and a field to capture feedback. If you're lucky enough to be using Vertex AI Studio, save your prompts (using the same name and version as listed in your documentation) and track the hyperlink to the saved prompt in the table. This way, you're always one click away from re-running your prompts.

When working on a retrieval augmented generation system, you should also capture the specific aspects of the RAG system that impact what content was inserted into the prompt, including the query, chunk settings, chunk output, and other information.

Once you feel the prompt is close to perfect, take it to your project codebase. And in the codebase, save prompts in a separate file from code, so it's easier to maintain. Finally, ideally your prompts are part of an operationalized system, and as a prompt engineer you should rely on automated tests and evaluation procedures to understand how well your prompt generalizes to a task.

Prompt engineering is an iterative process. Craft and test different prompts, analyze, and document the results. Refine your prompt based on the model's performance. Keep experimenting until you achieve the desired output. When you change a model or model configuration, go back and keep experimenting with the previously used prompts.

Translation


我们建议创建一个以表21为模板的Google表格。这种方法的优势在于,当您不可避免地需要重新审视您的提示工程工作时,您会拥有完整的记录——无论是为了在未来继续这项工作(您可能会惊讶于短暂休息后就能忘记多少内容),为了测试提示在不同模型版本上的性能,还是为了帮助调试未来的错误。

除了本表中的字段外,跟踪提示的版本(迭代)、记录结果是否为OK/NOT OK/SOMETIMES OK的字段,以及用于捕获反馈的字段也很有帮助。如果您有幸使用Vertex AI Studio,请保存您的提示(使用与文档中列出的相同名称和版本),并在表格中跟踪到已保存提示的超链接。这样,您只需单击一次即可重新运行您的提示。

在使用检索增强生成系统时,您还应捕获影响插入提示内容的RAG系统的特定方面,包括查询、块设置、块输出和其他信息。
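例如,可以用一个简单的数据类把这些设置与中间结果一并记录下来。下面的 Python 草图中的字段命名只是本示例的假设,并非某个RAG框架的标准接口:

```python
from dataclasses import dataclass, field

@dataclass
class RagTrace:
    """记录一次RAG调用中影响提示内容的关键设置与中间结果。"""
    query: str       # 用户查询
    chunk_size: int  # 分块大小(令牌数或字符数)
    chunk_overlap: int  # 相邻块之间的重叠量
    retrieved_chunks: list[str] = field(default_factory=list)  # 实际插入提示的块

# 用法示例:记录一次检索的设置与输出,连同提示记录一起归档
trace = RagTrace(
    query="无线耳机的电池续航是多久?",
    chunk_size=512,
    chunk_overlap=64,
    retrieved_chunks=["……检索到的文档片段……"],
)
print(trace)
```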

一旦您认为提示接近完美,就将其应用到您的项目代码库中。在代码库中,将提示保存在与代码单独的文件中,以便于维护。最后,理想情况下,您的提示应作为已运营系统的一部分,作为提示工程师,您应依赖自动化测试和评估程序来了解您的提示对任务的泛化程度。
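下面的 Python 草图示意了"提示词与代码分开存放"以及一个最简化的自动评估流程;其中的文件名 prompts.json、提示结构和关键词评估标准都只是本示例的假设:

```python
import json
from pathlib import Path

# 假设提示词保存在独立于代码的 prompts.json 中(文件名与结构均为本示例的假设),
# 形如:{"product_description_v1": "为以下产品写一段描述:{product}"}
PROMPT_FILE = Path("prompts.json")

def load_prompt(name: str) -> str:
    """按"名称+版本"从独立文件加载提示词,便于维护和版本管理。"""
    return json.loads(PROMPT_FILE.read_text(encoding="utf-8"))[name]

def keyword_eval(outputs: dict[str, str], expected_keywords: dict[str, str]) -> float:
    """一个极简的自动化评估:检查每个测试用例的输出是否包含期望关键词,
    返回通过率。真实系统中应替换为更完善的评估程序。"""
    passed = sum(
        1 for case, kw in expected_keywords.items() if kw in outputs.get(case, "")
    )
    return passed / len(expected_keywords)

# 用法示例:两个用例中有一个命中关键词,通过率为 0.5
outputs = {"case1": "降噪效果出色的无线耳机", "case2": "一款蓝牙音箱"}
print(keyword_eval(outputs, {"case1": "降噪", "case2": "耳机"}))  # 0.5
```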

提示工程是一个迭代过程。设计和测试不同的提示,分析并记录结果。根据模型的性能改进您的提示。持续实验直到获得所需的输出。当您更改模型或模型配置时,请回去继续实验之前使用的提示。


Page 66

Original


Name [name and version of your prompt]

Goal [One sentence explanation of the goal of this attempt]

Model [name and version of the used model]

Temperature [value between 0 - 1]

Token Limit [number]

Top-K [number]

Top-P [number]

Prompt [Write the full prompt]

Output [Write out the output or multiple outputs]

Table 21. A template for documenting prompts

Summary

This whitepaper discusses prompt engineering. We learned various prompting techniques, such as:

• Zero-shot prompting

• Few shot prompting

• System prompting

• Role prompting

• Contextual prompting

• Step-back prompting

• Chain of thought

• Self consistency

• Tree of thoughts

Translation


名称 [您的提示的名称和版本]

目标 [用一句话解释本次尝试的目标]

模型 [所用模型的名称和版本]

温度值 [0-1之间的数值]

令牌限制 [数值]

Top-K [数值]

Top-P [数值]

提示 [写下完整的提示内容]

输出 [写出输出内容或多个输出]

表21. 记录提示的模板
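作为表21的一个可执行对应,下面的 Python 草图把该模板表示为数据类并追加写入CSV文件;英文字段命名与CSV存储方式均为本示例的假设:

```python
import csv
from dataclasses import asdict, dataclass

@dataclass
class PromptRecord:
    """与表21对应的一条提示记录;英文字段名是本示例对中文表头的假设映射。"""
    name: str           # 名称:提示的名称和版本
    goal: str           # 目标:一句话说明本次尝试的目标
    model: str          # 模型:所用模型的名称和版本
    temperature: float  # 温度值:0-1之间的数值
    token_limit: int    # 令牌限制
    top_k: int          # Top-K
    top_p: float        # Top-P
    prompt: str         # 完整提示
    output: str         # 输出(或多个输出)
    result: str = ""    # 附加字段:OK / NOT OK / SOMETIMES OK

def append_record(path: str, record: PromptRecord) -> None:
    """把一次提示尝试追加写入 CSV 文件,作为 Google 表格之外的轻量替代。"""
    row = asdict(record)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if f.tell() == 0:  # 新文件先写表头
            writer.writeheader()
        writer.writerow(row)
```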

总结

本白皮书讨论了提示工程。我们学习了各种提示技术,例如:

• 零样本提示
• 少样本提示
• 系统提示
• 角色提示
• 上下文提示
• 后退一步提示法
• 思维链
• 自我一致性
• 思维树


Page 67

Original


• ReAct

We even looked into ways to automate your prompts.

The whitepaper then discusses the challenges of gen AI, like the problems that can happen when your prompts are insufficient. We closed with best practices on how to become a better prompt engineer.

Translation


• ReAct

我们还探讨了如何实现提示自动化的方法。
该白皮书随后讨论了生成式AI面临的挑战,比如提示不足时可能出现的问题。
最后,我们分享了成为更优秀提示工程师的最佳实践。


Page 68

Original


Endnotes

1. Google, 2023, Gemini by Google. Available at: https://gemini.google.com.

2. Google, 2024, Gemini for Google Workspace Prompt Guide. Available at: https://inthecloud.withgoogle.com/gemini-for-google-workspace-prompt-guide/dl-cd.html.

3. Google Cloud, 2023, Introduction to Prompting. Available at: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/introduction-prompt-design.

4. Google Cloud, 2023, Text Model Request Body: Top-P & top-K sampling methods. Available at: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text#request_body.

5. Wei, J., et al., 2023, Zero Shot - Fine Tuned language models are zero shot learners. Available at: https://arxiv.org/pdf/2109.01652.pdf.

6. Google Cloud, 2023, Google Cloud Model Garden. Available at: https://cloud.google.com/model-garden.

7. Brown, T., et al., 2023, Few Shot - Language Models are Few Shot learners. Available at: https://arxiv.org/pdf/2005.14165.pdf.

8. Zheng, L., et al., 2023, Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models. Available at: https://openreview.net/pdf?id=3bq3jsvcQ1.

9. Wei, J., et al., 2023, Chain of Thought Prompting. Available at: https://arxiv.org/pdf/2201.11903.pdf.

10. Google Cloud Platform, 2023, Chain of Thought and React. Available at: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/prompts/examples/chain_of_thought_react.ipynb.

11. Wang, X., et al., 2023, Self Consistency Improves Chain of Thought reasoning in language models. Available at: https://arxiv.org/pdf/2203.11171.pdf.

12. Yao, S., et al., 2023, Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Available at: https://arxiv.org/pdf/2305.10601.pdf.

13. Yao, S., et al., 2023, ReAct: Synergizing Reasoning and Acting in Language Models. Available at: https://arxiv.org/pdf/2210.03629.pdf.

14. Google Cloud Platform, 2023, Advance Prompting: Chain of Thought and React. Available at: https://github.com/GoogleCloudPlatform/applied-ai-engineering-samples/blob/main/genai-on-vertex-ai/advanced_prompting_training/cot_react.ipynb.

Translation


尾注

  1. Google, 2023年, Gemini by Google. 可在以下网址获取: https://gemini.google.com.

  2. Google, 2024年, Gemini for Google Workspace Prompt Guide. 可在以下网址获取: https://inthecloud.withgoogle.com/gemini-for-google-workspace-prompt-guide/dl-cd.html.

  3. Google Cloud, 2023年, Introduction to Prompting. 可在以下网址获取: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/introduction-prompt-design.

  4. Google Cloud, 2023年, Text Model Request Body: Top-P & top-K sampling methods. 可在以下网址获取: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text#request_body.

  5. Wei, J., et al., 2023年, Zero Shot - Fine Tuned language models are zero shot learners. 可在以下网址获取: https://arxiv.org/pdf/2109.01652.pdf.

  6. Google Cloud, 2023年, Google Cloud Model Garden. 可在以下网址获取: https://cloud.google.com/model-garden.

  7. Brown, T., et al., 2023年, Few Shot - Language Models are Few Shot learners. 可在以下网址获取: https://arxiv.org/pdf/2005.14165.pdf.

  8. Zheng, L., et al., 2023年, Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models. 可在以下网址获取: https://openreview.net/pdf?id=3bq3jsvcQ1

  9. Wei, J., et al., 2023年, Chain of Thought Prompting. 可在以下网址获取: https://arxiv.org/pdf/2201.11903.pdf.

  10. Google Cloud Platform, 2023年, Chain of Thought and React. 可在以下网址获取: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/prompts/examples/chain_of_thought_react.ipynb.

  11. Wang, X., et al., 2023年, Self Consistency Improves Chain of Thought reasoning in language models. 可在以下网址获取: https://arxiv.org/pdf/2203.11171.pdf.

  12. Yao, S., et al., 2023年, Tree of Thoughts: Deliberate Problem Solving with Large Language Models. 可在以下网址获取: https://arxiv.org/pdf/2305.10601.pdf.

  13. Yao, S., et al., 2023年, ReAct: Synergizing Reasoning and Acting in Language Models. 可在以下网址获取: https://arxiv.org/pdf/2210.03629.pdf.

  14. Google Cloud Platform, 2023年, Advance Prompting: Chain of Thought and React. 可在以下网址获取: https://github.com/GoogleCloudPlatform/applied-ai-engineering-samples/blob/main/genai-on-vertex-ai/advanced_prompting_training/cot_react.ipynb.


Original

15. Zhou, C., et al., 2023, Automatic Prompt Engineering - Large Language Models are Human-Level Prompt Engineers. Available at: https://arxiv.org/pdf/2211.01910.pdf.

Translation

  15. Zhou, C. 等人,2023年,《自动提示工程:大型语言模型达到人类水平的提示工程师》。获取地址:https://arxiv.org/pdf/2211.01910.pdf
