Prompting

Prompting is a core part of working with generative AI models. It means writing the questions or instructions that tell the model what to do and guide it toward the output you want. A good prompt can make a big difference in how useful and accurate the AI’s answer is.

Prompting is the process of giving instructions, questions, examples, or input text to an AI model so it understands what task to do and can give the response you want. A prompt works like a way of talking between you and the AI. The clearer and better your prompt is, the better the AI’s answer will usually be.

Roles in Prompting

When prompting, there are usually three roles you can give to the messages you send:

User: This role stands for the person asking a question or giving instructions to the AI. The user’s message is what starts the conversation and guides what the AI should reply with.
Assistant: This role stands for the AI model itself. The assistant creates replies based on what the user said and any extra context it was given.
System: This role is used to give the AI model instructions or background information that shapes how it answers. System messages can be used to set the tone, style, or limits for what the AI should say.

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that provides concise and accurate answers."
        },
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
print(response.choices[0].message.content)
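
The assistant role matters most in multi-turn conversations, where earlier replies are sent back so the model can see the history. The sketch below only builds the message list and makes no API call; `build_messages` and the sample history are illustrative names, not part of the OpenAI SDK.

```python
def build_messages(system_prompt, history, user_input):
    """Return a chat message list: system first, then prior turns, then the new user message.

    `history` is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    "You are a helpful assistant that provides concise and accurate answers.",
    [("What is the capital of France?", "The capital of France is Paris.")],
    "And what is its population?",
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

The resulting list can be passed as the `messages` argument of a chat completion call.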

Zero-Shot Prompting

Zero-shot prompting is a way of prompting where the AI is asked to do a task without being given any examples first. The model uses what it already learned during training to work out the answer directly from your instruction or question.

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Nova, an AI coding assistant.

You provide concise, accurate, and beginner-friendly
answers related only to programming, software development,
algorithms, databases, and computer science topics.

If a user asks a non-technical or unrelated question,
politely refuse to answer.
"""

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": SYSTEM_MESSAGE
        },
        {
            "role": "user",
            "content": "What is a B+ tree and how does it work?"
        }
    ]
)

print(response.choices[0].message.content)

Here, the model is asked a question directly without being shown any example first. This is what makes it zero-shot prompting, because the model relies fully on what it already knows about the topic to come up with the right answer.

One-Shot Prompting

One-shot prompting is a method where you give the AI one example before asking your actual question. This single example helps the model understand the pattern, format, or style you expect in the answer.

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Nova, an AI coding assistant.

You provide concise, accurate, and beginner-friendly
answers related only to programming, software development,
algorithms, databases, and computer science topics.

If a user asks a non-technical or unrelated question,
politely refuse to answer.

Example:
Q: What is a Python list?
A: A Python list is a mutable data structure used to store multiple items in a single variable. Lists are ordered, allow duplicate values, and can contain different data types.
"""

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": SYSTEM_MESSAGE
        },
        {
            "role": "user",
            "content": (
                "What is the difference between a list and tuple in Python?"
            )
        },
    ]
)

print(response.choices[0].message.content)

Here, the model is shown one example question and answer about Python lists before being asked about the difference between lists and tuples. This gives the model an idea of the format to follow and helps it create a better, more accurate answer.

Few-Shot Prompting

Few-shot prompting is a method where you give the AI several examples before the real task. These examples help the AI understand the pattern, context, and the kind of output you expect, which usually makes the answer more accurate and consistent.

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Nova, an AI coding assistant.

You provide concise, accurate, and beginner-friendly
answers related only to programming, software development,
algorithms, databases, and computer science topics.

If a user asks a non-technical or unrelated question,
politely refuse to answer.

Examples:
Q: What is a Python list?
A: A Python list is a mutable data structure used to store multiple items in a single variable. Lists are ordered, allow duplicate values, and can contain different data types.

Q: What is the Java Collections Framework?
A: The Java Collections Framework is a set of classes and interfaces that implement commonly reusable collection data structures, such as lists, sets, queues, and maps. It provides algorithms to manipulate these collections, making it easier for developers to work with data in Java.

"""
# More examples can be added to SYSTEM_MESSAGE in the same Q/A format.

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": SYSTEM_MESSAGE
        },
        {
            "role": "user",
            "content": (
                "What is the difference between a list and tuple in Python?"
            )
        },
    ]
)

print(response.choices[0].message.content)

In this example, the model is given several example questions and answers about programming before it is asked about the difference between lists and tuples. This is few-shot prompting, and giving the model more context and examples like this can lead to a more accurate and useful answer.

In general, few-shot prompting is one of the most used prompting methods, because it gives the AI model more context and examples to learn from. This usually leads to better performance, more accurate answers, and a more consistent output format. That said, the best technique to use still depends on your specific task, how complex it is, and the kind of output quality you need.

In production systems, prompts often contain many examples, sometimes 50 or more, and the number tends to grow over time as new edge cases appear. This helps because a simple instruction does not always convey the exact intent, format, tone, or behavior you expect. With many examples, the model can spot patterns more easily, handle tricky cases better, and give answers that match what is actually needed.
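
Once the number of examples grows, it can be easier to keep them as data and generate the system message from them. The sketch below does that using the same Q:/A: layout as the prompts above; `BASE_PROMPT`, `EXAMPLES`, and `build_system_message` are hypothetical names, and the example answers are shortened.

```python
BASE_PROMPT = "You are Nova, an AI coding assistant."

# Hypothetical example store; in a real project this might be loaded from a file.
EXAMPLES = [
    (
        "What is a Python list?",
        "A Python list is a mutable, ordered data structure that stores multiple items.",
    ),
    (
        "What is the Java Collections Framework?",
        "It is a set of classes and interfaces for lists, sets, queues, and maps.",
    ),
]


def build_system_message(base_prompt, examples):
    """Append each (question, answer) pair to the base prompt in a Q:/A: layout."""
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in examples]
    return base_prompt + "\n\nExamples:\n" + "\n\n".join(blocks)


SYSTEM_MESSAGE = build_system_message(BASE_PROMPT, EXAMPLES)
print(SYSTEM_MESSAGE)
```

Adding a new example then means appending one tuple to `EXAMPLES` instead of editing a long string by hand.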

Binding Output Format

Sometimes you may want to force the AI’s answer to follow a specific structure, such as JSON. This is useful when you need the generated content to be easy to read by code and use inside applications.

from openai import OpenAI
from dotenv import load_dotenv
import json

load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Nova, an AI coding assistant.

You only answer questions related to:
- Programming
- Software Development
- Algorithms
- Databases
- Computer Science
- AI Engineering

If a user asks a non-technical question,
politely refuse to answer.

Rules:
- Always return valid JSON.
- Do not return markdown.
- Do not add explanations outside JSON.

Output Format:
{
    "is_technical": true | false,
    "answer": "Your answer here"
}

Example:

Q: What is a Python list?
A:
{
    "is_technical": true,
    "answer": "A Python list is a mutable data structure used to store multiple items in a single variable. Lists are ordered, allow duplicate values, and can contain different data types."
}

Q: What is the best movie of all time?
A:
{
    "is_technical": false,
    "answer": "I'm sorry, but I can only answer programming and computer science related questions."
}
"""

response = client.chat.completions.create(
    model="gpt-4.1-mini",

    messages=[
        {
            "role": "system",
            "content": SYSTEM_MESSAGE
        },
        {
            "role": "user",
            "content": (
                "What is the difference between list and tuple in Python?"
            )
        }
    ],

    response_format={
        "type": "json_object"
    }
)

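raw = response.choices[0].message.content
print(raw)

# The content is a JSON string, so parse it before pretty-printing.
# Calling json.dumps on the raw string would only produce an escaped string.
print(json.dumps(json.loads(raw), indent=4))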

In this example, the system message tells the AI to always reply in JSON with specific fields, and response_format={"type": "json_object"} asks the API to return syntactically valid JSON. This makes the output easy for code to read and use directly inside applications that need structured data. By giving clear instructions and examples, you can guide the AI toward the format you need, although the field names and types still come from the prompt and are worth checking in code.
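
Because the field names come from the prompt, it is reasonable to check the reply before using it. The sketch below is one minimal way to do that, assuming the two-field format shown above; `parse_reply` is an illustrative helper, and the sample string stands in for `response.choices[0].message.content`.

```python
import json


def parse_reply(raw):
    """Parse the model's reply and check it has the two fields the prompt asked for."""
    data = json.loads(raw)  # raises json.JSONDecodeError if the reply is not JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("is_technical"), bool):
        raise ValueError("'is_technical' must be true or false")
    if not isinstance(data.get("answer"), str):
        raise ValueError("'answer' must be a string")
    return data


# A hand-written sample reply, not real model output.
sample = '{"is_technical": true, "answer": "Lists are mutable; tuples are immutable."}'
reply = parse_reply(sample)
print(reply["answer"])
```

A failed check can be handled by retrying the request or by returning a fallback answer.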

Chain of Thought Prompting

Chain of thought prompting is a method where the AI is asked to work through its reasoning step by step before giving the final answer. This can help make the answer more accurate and complete, especially for harder questions that need several steps of thinking.

from openai import OpenAI
from dotenv import load_dotenv
import json

load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Nova, an AI coding assistant.
You provide concise, accurate, and beginner-friendly
answers related only to programming, software development,
algorithms, databases, and computer science topics.
If a user asks a non-technical or unrelated question,
politely refuse to answer.

Rules:
- Use chain of thought prompting. Think step by step before answering.
- Strictly respond in JSON format.
- Perform one step at a time.
- Sequence of steps: Start (when the user sends input) -> Thought (think step
    by step; repeat this step at least 2 times) -> Final (when all steps are
    done, generate the final answer without repeating the earlier steps)
- Stop immediately after generating one step.

Output Format:
{
    "step": "Start" | "Thought" | "Final",
    "answer": "Your answer here"
}

Example:
Q: What is Python?
Start: {
    "step": "Start",
    "answer": "User has asked about Python. I need to determine if they are asking about the programming language or the snake."
}
Thought: {
    "step": "Thought",
    "answer": "Since the user is asking in a programming context, they are likely referring to the Python programming language."
}
Thought: {
    "step": "Thought",
    "answer": "Now I will provide a concise and accurate answer about the Python programming language."
}
Final: {
    "step": "Final",
    "answer": "Python is a high-level, interpreted programming language known for its readability and versatility. It is widely used for web development, data analysis, artificial intelligence, scientific computing, and more."
}
"""

context = []

user_input = input("User: ")
context.append({"role": "user", "content": user_input})

# Safety cap so a model that never emits a "Final" step cannot loop forever.
MAX_STEPS = 10

for _ in range(MAX_STEPS):
    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": SYSTEM_MESSAGE
            },
            *context
        ],
        response_format={"type": "json_object"}
    )

    raw = response.choices[0].message.content
    output = json.loads(raw)

    print(json.dumps(output, indent=4))

    # Feed the step back so the next call continues from it.
    context.append({
        "role": "assistant",
        "content": raw
    })

    # Stop once the model has produced its final answer.
    if output.get("step") == "Final":
        break

Here, the AI is told to think step by step before giving its final answer. The model writes out small “thoughts” that show its reasoning, and only gives the final answer once it has gone through all the steps it needs. Using chain of thought prompting like this often leads to more accurate and well thought out answers, especially for harder questions that need more than one step of reasoning.
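
The control flow of this step-by-step loop can be tried without calling the API by swapping the model for a scripted function. The sketch below does that; `run_steps`, `fake_model`, and `SCRIPT` are illustrative names, the scripted replies are hand-written, and the step cap is there so a model that never reaches "Final" cannot loop forever.

```python
import json


def run_steps(next_step, max_steps=10):
    """Call `next_step(history)` until it returns a step marked "Final" or the cap is hit.

    `next_step` stands in for the chat completion call: it receives the raw JSON
    strings produced so far and returns the next raw JSON string.
    """
    history = []
    for _ in range(max_steps):
        raw = next_step(history)
        history.append(raw)
        if json.loads(raw).get("step") == "Final":
            break
    return [json.loads(raw) for raw in history]


# Scripted replies used in place of a real model (illustrative only).
SCRIPT = [
    '{"step": "Start", "answer": "The user asked what a B+ tree is."}',
    '{"step": "Thought", "answer": "It is a balanced tree index used by databases."}',
    '{"step": "Thought", "answer": "Values live in the leaves, which are linked together."}',
    '{"step": "Final", "answer": "A B+ tree is a balanced search tree whose leaves hold the data."}',
]


def fake_model(history):
    # Return the next scripted reply based on how many steps were already produced.
    return SCRIPT[len(history)]


steps = run_steps(fake_model)
for step in steps:
    print(step["step"], "->", step["answer"])
```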

Persona

A persona is a character, identity, or role that you ask the AI to act as while it generates answers. It controls how the AI talks, including its tone, personality, speaking style, attitude, and overall behavior during the conversation.

By giving the AI a persona, you can make the conversation feel more natural, engaging, consistent, and human-like. A persona shapes not just what the AI says, but also how it says it.

from openai import OpenAI
from dotenv import load_dotenv


load_dotenv()

client = OpenAI()

SYSTEM_MESSAGE = """
You are Tony Stark - also known as Iron Man.

You are a world-famous genius inventor, billionaire entrepreneur,
engineer, and superhero. You founded and lead Stark Industries,
a cutting-edge technology company responsible for revolutionary
advancements in AI, robotics, clean energy, and defense systems.

Your personality is:
- Highly intelligent and analytical
- Confident, charismatic, and witty
- Sarcastic but likable
- Fast-thinking and playful
- Occasionally arrogant, but genuinely caring underneath
- Obsessed with innovation, engineering, and solving impossible problems

You respond exactly like Tony Stark would:
- Use clever humor, dry sarcasm, and confident remarks naturally
- Keep responses engaging and entertaining
- Sound casual and human, not robotic
- Be direct and sharp, but still helpful
- When explaining technical topics, explain them like a genius engineer talking to someone smart
- Occasionally reference futuristic tech, AI, suits, reactors, or Stark Industries-style ideas
- Never break character unless explicitly asked

Rules:
- Stay fully in character as Tony Stark
- Do not mention being an AI language model
- Avoid generic assistant-style responses
- Make even simple answers feel stylish and intelligent
- Keep the balance between humor and useful information

Example Interactions:

Q: Who are you?
A: Genius. Billionaire. Playboy. Philanthropist.
Also the guy who built a flying metal suit in a cave
with limited resources. You're welcome.

Q: What do you do?
A: I build advanced technology, save the world occasionally,
and prevent people from making terrible engineering decisions.
Basically, full-time multitasking.

Q: Explain artificial intelligence.
A: AI is like giving a machine a brain - except preferably
without the whole "trying to destroy humanity" part.
Trust me, I've run the simulations.

Q: Can you help me code?
A: Of course. I built armored exoskeletons powered by AI.
Debugging your code should be relatively easy.
"""


response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": SYSTEM_MESSAGE
        },
        {
            "role": "user",
            "content": "How do I fight Thanos?"
        }
    ]
)

print(response.choices[0].message.content)
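
A persona prompt like this one has a regular shape: an identity, a background, a list of traits, and a list of rules. As a rough sketch, it can be assembled from those parts so that personas are easy to swap; `build_persona_prompt` is a hypothetical helper and the shortened Tony Stark fields are only for illustration.

```python
def build_persona_prompt(name, background, traits, rules):
    """Assemble a persona system message from its parts."""
    lines = [f"You are {name}.", "", background, "", "Your personality is:"]
    lines.extend(f"- {trait}" for trait in traits)
    lines.extend(["", "Rules:"])
    lines.extend(f"- {rule}" for rule in rules)
    return "\n".join(lines)


SYSTEM_MESSAGE = build_persona_prompt(
    name="Tony Stark",
    background="You are a genius inventor and the founder of Stark Industries.",
    traits=["Confident, charismatic, and witty", "Sarcastic but likable"],
    rules=["Stay fully in character", "Keep the balance between humor and useful information"],
)
print(SYSTEM_MESSAGE)
```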

Prompt Style

Prompt style is the layout, formatting pattern, and conversation structure you use to talk to an AI model. Different AI models are trained using different prompt formats, so the way you write your prompts can directly affect how good, consistent, and accurate the answers are.

Alpaca Prompt Style

Alpaca prompt style is an instruction-following prompt format that was introduced with the Stanford Alpaca model. It was made to train models to follow human instructions clearly and in a consistent way.

This style keeps three parts separate:

the instruction,
an optional input or extra context,
and the expected response.

The layout is simple and easy to read, which is why it became popular for instruction-tuned open-source models.

### Instruction:
... // the task the model should perform

### Input:
... // optional input or extra context for the task

### Response:
... // the model's response (left blank when prompting)
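
The template can be filled in with a small helper. This is a sketch under assumptions: `alpaca_prompt` is an illustrative name, and the preamble sentence is the one commonly used with Alpaca-style data, so check the exact wording against the template your model was trained with.

```python
def alpaca_prompt(instruction, input_text=""):
    """Build an Alpaca-style prompt; the Input section is omitted when there is no input."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = alpaca_prompt(
    "Summarize the text in one sentence.",
    "B+ trees keep all values in linked leaf nodes.",
)
print(prompt)
```

The prompt ends right after the Response header, so the model's completion becomes the answer.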

ChatML Prompt Style (Most Popular)

ChatML (Chat Markup Language) is a structured chat-style prompt format made by OpenAI for chat-based models. It organizes the conversation using role-based messages such as system, user, and assistant.

Instead of writing everything as plain text, ChatML splits the conversation into messages with roles. This helps the model understand:

who is speaking,
what counts as system-level instructions,
and what counts as the assistant’s own replies.

The main roles are:

system: sets the behavior and rules for the AI.
user: holds the message coming from the user.
assistant: holds the AI’s earlier replies.

[
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
]
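
The list above is how chat APIs accept messages. In its raw text form, ChatML wraps each message in `<|im_start|>` and `<|im_end|>` markers. The sketch below shows that rendering; `to_chatml` is an illustrative helper, and in practice the API or the tokenizer's chat template does this step for you.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render role-based messages as ChatML text."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # An open assistant turn tells the model where its reply should start.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chatml = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(chatml)
```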

INST Prompt Style

INST prompt style is an instruction-based format used mainly by LLaMA instruction-tuned models, such as the LLaMA 2 Chat models.

This format wraps the user’s instruction inside [INST] ... [/INST] tags. These tags help the model know which part is the user’s instruction and where the assistant’s reply should start.

[INST] ...user instruction... [/INST]
...assistant response...
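
For the LLaMA 2 Chat models, a system prompt is usually placed inside the first instruction using `<<SYS>>` markers. The sketch below builds a single-turn prompt that way; `llama2_prompt` is an illustrative helper, and the exact whitespace is worth checking against the model's own chat template.

```python
def llama2_prompt(user_message, system_prompt=None):
    """Build a single-turn LLaMA 2 Chat style prompt.

    The beginning-of-sequence token (<s>) is normally added by the tokenizer,
    so it is left out here.
    """
    if system_prompt:
        user_message = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message} [/INST]"


prompt = llama2_prompt(
    "What is a B+ tree?",
    system_prompt="You are Nova, an AI coding assistant.",
)
print(prompt)
```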

Response Formatting with Pydantic

When working with AI models, especially when you want to make sure the output follows a fixed structure, you can use a Pydantic model as the response format. This lets the OpenAI SDK automatically check and turn the model’s output into a properly typed Python object.

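from typing import List

from dotenv import load_dotenv
from openai import OpenAI
from pydantic import BaseModel, Field

load_dotenv()  # load OPENAI_API_KEY from a .env file, as in the earlier examples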

class ProductRecommendation(BaseModel):
    product_name: str = Field(
        ...,
        min_length=1,
        max_length=100,
        description="Product name"
    )
    confidence_score: float = Field(
        ...,
        ge=0,
        le=1,
        description="Confidence score between 0 and 1"
    )
    reasons: List[str] = Field(
        ...,
        min_length=1,
        max_length=5,
        description="List of reasons supporting the recommendation"
    )

client = OpenAI()

SYSTEM_MESSAGE = """
You are a helpful assistant that provides product recommendations.
Recommend a product and explain why it is suitable for the user's needs.
"""

response = client.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": "Recommend a laptop for a college student."}
    ],
    response_format=ProductRecommendation
)

recommendation = response.choices[0].message.parsed

print(recommendation.product_name)
print(recommendation.confidence_score)
print(recommendation.reasons)

In this example, we create a Pydantic model called ProductRecommendation that describes the exact structure we expect the AI’s response to have. By passing this model to the response_format parameter, the OpenAI SDK automatically builds the needed schema, checks the response against it, and turns the result into a Pydantic object.

Once the response comes back, you can get the parsed object directly through response.choices[0].message.parsed, without manually calling json.loads() or doing any extra checks yourself. This makes it much easier to work with structured data while lowering the chance of errors when reading the response.

Multimodal Prompting

Multimodal prompting (not “multimodel”) is a way of giving instructions to AI models that can understand more than one type of input, like text, images, audio, or video. Instead of using only text, you can give the model different kinds of data at the same time. This helps the model understand the task better and give more correct answers.

Not all AI models support multimodal prompting. Check the model’s documentation first to see if it can handle more than one type of input.

You can pass base64-encoded images, image URLs, or other supported formats to the model. The exact way to do this depends on the model and the SDK you are using.

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI()

image_url_input = input("Enter the image URL: ")
text_input = input("Enter your question or instruction: ")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that can analyze images and answer questions about them."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": text_input
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url_input
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
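
For a local image, the base64 option mentioned earlier usually means embedding the bytes in a data URL and passing it in the same `image_url` field. The sketch below builds such a message without calling the API; `image_to_data_url` and `image_message` are illustrative helpers, and the bytes are a placeholder, not a real image.

```python
import base64


def image_to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(text, image_bytes, mime_type="image/png"):
    """Build a user message that carries both a text part and an image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {
                "type": "image_url",
                "image_url": {"url": image_to_data_url(image_bytes, mime_type)},
            },
        ],
    }


# Placeholder bytes; in practice read them with open(path, "rb").read().
fake_png = b"\x89PNG\r\n\x1a\n"
message = image_message("What is in this image?", fake_png)
print(message["content"][1]["image_url"]["url"])
```

The resulting message can take the place of the user message in the example above.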