AI has recently surged in popularity, with ChatGPT and OpenAI leading the charge and giving everyone the ability to interact with Large Language Models (LLMs) in a conversational format. However, depending on the complexity of what you’re asking for, you may not always get the result you want. You might get something close, something completely unrelated, or something that’s simply inaccurate. This is where prompt engineering comes into play: it helps you craft more effective prompts so the model returns more focused and accurate results.
Large Language Models (LLMs)
It’s important to have a high-level understanding of the two types of LLMs that make up the AI you’re interacting with: the Base LLM and the Instruction Tuned LLM. A Base LLM is trained to predict the next token in a sequence; think of autocomplete as an example. An Instruction Tuned LLM attempts to follow a set of instructions and return a result. ChatGPT is a great example of this, and it shows that in some cases you can provide instructions in a natural, conversational format.
If you’re working directly with an LLM through an API, there are some additional parameters you can adjust that will allow you to get different results:
- The first parameter is Temperature. Typically this is set to 0, which makes the model return the most likely output for a given prompt. Increasing the temperature yields less common results: a higher temperature is useful when you want more creative output, while a lower temperature is better for accurate, consistent results.
- The second parameter is Top_p. This works similarly to temperature in that a lower value produces more consistent and accurate results, while a higher value produces more diverse results.
A general rule is to alter only one of these parameters, not both; the sketch below shows where they fit into an API call.
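As a concrete illustration, here is a minimal sketch of adjusting temperature through the OpenAI Python library. The model name and helper function are assumptions made for the example, and the exact client interface depends on the library version you have installed:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_completion(prompt, temperature=0.0):
    """Send a single-turn prompt and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",    # assumption: any chat model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # 0 = most deterministic; higher = more varied
        # top_p could be passed instead of temperature -- but not both
    )
    return response.choices[0].message.content

# Low temperature for consistent, factual answers:
print(get_completion("Summarize the water cycle in one sentence.", temperature=0.0))

# Higher temperature for more creative output:
print(get_completion("Write a tagline for a coffee shop.", temperature=1.2))
```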
Model Limitations
Before diving into the details of prompt engineering for LLMs, it’s important to be aware of model limitations. Models are often trained on very large datasets, but there has to be a cutoff at some point. If a model was only trained on data up to 2020, it won’t be able to give accurate results about anything that happened after that cutoff.
Another common issue when using LLMs is hallucinations. A hallucination is when the model makes statements that sound plausible but are not true, or references inaccurate information; think of it as the model guessing to fill in the blanks when returning a result. Hallucinations often occur when the prompt is not specific enough. To help reduce them, you can ask the model to first find relevant quotes in the source material and then answer the question using only those quotes, which forces it to ground its answer in the source.
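Here is a rough sketch of that quote-grounding technique, reusing the hypothetical get_completion helper from the parameters section; the document text and question are placeholders:

```python
document = "..."  # paste the source material here

prompt = f"""
First, extract any quotes from the document between the <document> tags
that are relevant to the question. Then answer the question using ONLY
those quotes. If no relevant quotes exist, reply "I don't know."

Question: When was the product first released?

<document>
{document}
</document>
"""

print(get_completion(prompt))
```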
Principles of Prompting
To write more effective prompts, ensure that your prompt gives clear and specific instructions. A few techniques help with this:
- Use delimiters (triple quotes/backticks/dashes, XML tags, etc.) to separate your instructions from any additional information or context you provide.
- Ask the model to produce structured output such as HTML or JSON, which helps it stay focused on how it formats its response.
- Ask the model to check whether the conditions required to perform the task are satisfied rather than assuming them. Sometimes a solution looks right at first glance, but working the problem out step by step reveals it was wrong; models make those same mistakes, so forcing them to verify conditions and avoid assumptions yields more accurate results.
- Use few-shot prompting: give the model successful examples of the task being completed so it can follow a similar format and provide a more accurate result.

The sketch after this list combines several of these techniques.
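As an assumed example (the labels and JSON keys are made up for illustration), this prompt combines delimiters, structured JSON output, and a one-shot example, again using the hypothetical get_completion helper:

```python
prompt = """
Classify the sentiment of the review between the <review> tags as
"positive", "negative", or "mixed". Respond in JSON with the keys
"sentiment" and "evidence" (a short quote from the review).

Example:
<review>The battery dies fast, but the screen is gorgeous.</review>
{"sentiment": "mixed", "evidence": "battery dies fast, but the screen is gorgeous"}

Now classify this one:
<review>Shipping was slow and the box arrived crushed.</review>
"""

print(get_completion(prompt))  # expected shape: {"sentiment": "negative", ...}
```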
Another important principle is to give the model time to think. This isn’t meant literally; rather, it instructs the model to work the problem out step by step instead of rushing to a conclusion. To achieve this, you can explicitly specify the steps required to complete a task (Step 1, 2, 3…) and instruct the model to figure out its own solution before deciding on an answer. It’s similar to the ‘Show Your Work’ portion of a math test, where you demonstrate how you reached your conclusion.
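A minimal sketch of both ideas, with made-up steps and placeholder text:

```python
text = "..."  # the passage to analyze

prompt = f"""
Perform the following steps on the text between the <text> tags:
Step 1: Summarize the text in one sentence.
Step 2: Translate the summary into French.
Step 3: List each name mentioned in the text.
Step 4: Work out your own answer to the question below, showing your
reasoning, before stating a final verdict.

Question: Does the text support the claim that sales grew last year?

<text>
{text}
</text>
"""

print(get_completion(prompt))
```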
Iterative Prompt Development (IPD)
Developing a prompt that consistently yields the exact results you need often takes multiple attempts: you tweak the original prompt, test the results, and repeat. This process is called Iterative Prompt Development (IPD). When developing a prompt, follow the principles above, analyze why the result doesn’t match the desired output, refine the idea and the prompt, then repeat until you consistently get the result you need. A small test harness like the sketch below can make this loop faster.
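For example, a simple (assumed) harness for comparing prompt versions against a fixed set of inputs might look like this:

```python
# Candidate prompt templates; "{review}" is a placeholder filled per test case.
candidate_prompts = [
    "Summarize this review in one sentence: {review}",
    "Summarize the review between <review> tags in one sentence, "
    "focusing on the main complaint.\n<review>{review}</review>",
]

test_reviews = [
    "The blender is loud but it crushes ice perfectly.",
    "Arrived two weeks late and the lid doesn't seal.",
]

# Run every prompt version against every test case, then inspect the
# outputs by hand, refine the weakest prompt, and repeat.
for template in candidate_prompts:
    print(f"--- {template[:40]}...")
    for review in test_reviews:
        print(get_completion(template.format(review=review)))
```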
Resources
When looking into prompt engineering, a nice interactive resource for beginners is the online course from DeepLearning.AI called ChatGPT Prompt Engineering for Developers, which is free (as of writing this blog). The other standout resource that I found myself spending lots of time on was the Prompt Engineering Guide by DAIR.AI. I also found additional resources through their GitHub page, including some PowerPoint slides with great examples. Microsoft also has a great guide on prompt engineering.