
Setup

In this section, we will cover the essential steps to set up your environment for working with generative AI models. This includes obtaining API credentials, installing necessary libraries, and configuring your development environment to interact with various AI service providers.

To get started with prompting, you will need to set up access to a generative AI model. This typically involves obtaining API credentials from a service provider such as OpenAI, Hugging Face, or others. Once you have your API key, you can use it to make requests to the model and receive generated responses.

API providers:

To set up the OpenAI SDK, follow these steps:

  1. Install the OpenAI SDK and python-dotenv using pip:

    pip install openai python-dotenv
  2. Create a .env file in your project directory and add your OpenAI API key:

    OPENAI_API_KEY=your_api_key_here
  3. Load the API key in your Python script and initialize the OpenAI client:

    from dotenv import load_dotenv
    from openai import OpenAI

    # Read OPENAI_API_KEY from .env into the process environment
    load_dotenv()

    # OpenAI() picks up OPENAI_API_KEY from the environment
    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "user",
                "content": "Explain Python in simple words",
            }
        ],
    )
    print(response.choices[0].message.content)
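A missing or misspelled entry in the .env file usually shows up only as an authentication error when the first request is made. The following is a minimal sketch of a check that fails earlier with a clearer message. The helper name `require_env` is hypothetical and not part of the OpenAI SDK or python-dotenv; it uses only the standard library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

One way to use it is to call `load_dotenv()` first and then `require_env("OPENAI_API_KEY")` before creating the client, so a configuration problem is reported before any network call.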

To set up the Gemini SDK, follow these steps:

  1. Install the Gemini SDK (the google-genai package) and python-dotenv using pip:

    pip install google-genai python-dotenv
  2. Create a .env file in your project directory and add your Gemini API key:

    GEMINI_API_KEY=your_api_key_here
  3. Load the API key in your Python script and initialize the Gemini client:

    import os

    from dotenv import load_dotenv
    from google import genai

    # Read GEMINI_API_KEY from .env into the process environment
    load_dotenv()

    client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Explain FastAPI in simple words",
    )
    print(response.text)

The OpenAI SDK can be used to interact with various API providers, not just OpenAI’s own models. To use the OpenAI SDK with a different API provider, you will need to configure the SDK to point to the appropriate endpoint and include any necessary authentication headers.

  1. Install the OpenAI SDK and python-dotenv using pip:

    pip install openai python-dotenv
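  2. Create a .env file in your project directory and add your API key and endpoint. The endpoint should be the provider's base URL (often ending in /v1), not a full route, because the SDK appends paths such as /chat/completions itself:

    API_KEY=your_api_key_here
    API_ENDPOINT=https://api.yourprovider.com/v1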
  3. Load the API key and endpoint in your Python script and initialize the OpenAI client:

    import os

    from dotenv import load_dotenv
    from openai import OpenAI

    # Read API_KEY and API_ENDPOINT from .env into the process environment
    load_dotenv()

    # base_url redirects the SDK from OpenAI's servers to your provider
    client = OpenAI(
        api_key=os.getenv("API_KEY"),
        base_url=os.getenv("API_ENDPOINT"),
    )

    response = client.chat.completions.create(
        model="your-model-name",
        messages=[
            {
                "role": "user",
                "content": "Explain the concept of machine learning in simple terms.",
            }
        ],
    )
    print(response.choices[0].message.content)
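A common mistake with OpenAI-compatible providers is pasting a full route into the endpoint setting. The OpenAI SDK appends the route (for example /chat/completions) to `base_url`, so the value should stop at the API root. The sketch below shows one way to tidy a configured URL before passing it to the client. The function name `normalize_base_url` is hypothetical, and the only suffix it strips is /chat/completions; other providers may need different handling.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_base_url(url: str) -> str:
    """Strip a trailing slash and a trailing /chat/completions route from a URL."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not an absolute http(s) URL: {url!r}")

    path = parts.path.rstrip("/")
    suffix = "/chat/completions"
    if path.endswith(suffix):
        # The SDK adds this route itself, so keep only the API root
        path = path[: -len(suffix)]

    # Drop any query string or fragment; a base URL should not carry them
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

With this in place, the client could be created with `base_url=normalize_base_url(os.getenv("API_ENDPOINT"))`.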

We can also set up and run generative AI models locally on our machines. This can be done using Docker, Hugging Face’s Transformers library, or other frameworks that support local deployment of AI models. Running models locally can provide more control over the environment and may be necessary for certain use cases, such as when working with sensitive data.

To set up a generative AI model using Docker, follow these steps:

  1. Install Docker on your machine if you haven’t already. You can download it from the official Docker website: https://www.docker.com/get-started

  2. Pull and run the Ollama Docker image, which provides a local server for downloading and running AI models:

    Docker Image: https://hub.docker.com/r/ollama/ollama

    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
  3. Once the container is running, the Ollama server listens on http://localhost:11434. You can send requests to its API at that address to generate content.

  4. Pull the model you want to use, then run it inside the container. The commands below use a <model-name> placeholder, with llama3.2 shown as a commented example:

    Model Library: https://ollama.com/library

    # pull model
    docker exec -it ollama ollama pull <model-name>
    # docker exec -it ollama ollama pull llama3.2
    # run model
    docker exec -it ollama ollama run <model-name>
    # docker exec -it ollama ollama run llama3.2
  5. You can also set up a web interface for your local model with Open WebUI, which provides a user-friendly way to interact with local generative AI models. You can find more information about Open WebUI here:

    Open WebUI: https://docs.openwebui.com/getting-started/quick-start/

  6. Follow the instructions provided by Open WebUI to connect it to your local model and start generating content through the web interface.

  7. FastAPI can also be used to create a custom API endpoint for your local model, allowing you to integrate it into your applications more easily. You can find more information about FastAPI here:

    FastAPI: https://fastapi.tiangolo.com/

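    # Requires: pip install fastapi uvicorn ollama
    # Run with: uvicorn main:app --reload (assuming this file is saved as main.py)
    from fastapi import FastAPI
    from ollama import Client
    from pydantic import BaseModel

    app = FastAPI()

    MODEL_NAME = "llama3.2"
    client = Client(host="http://localhost:11434")


    class GenerateRequest(BaseModel):
        prompt: str


    # A plain def is used because client.chat() is a blocking call;
    # FastAPI runs plain def endpoints in a thread pool.
    @app.post("/generate")
    def generate_content(data: GenerateRequest):
        response = client.chat(
            model=MODEL_NAME,
            messages=[
                {
                    "role": "user",
                    "content": data.prompt,
                }
            ],
        )
        return {"response": response["message"]["content"]}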
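The Ollama server started in step 2 can also be called directly over HTTP, without the FastAPI layer or the ollama Python package. The sketch below uses only the standard library and assumes the container is reachable on the port published above and that the llama3.2 model has already been pulled. It posts to Ollama's /api/generate route with streaming disabled; the function names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # port published by the docker run command


def build_generate_request(prompt, model="llama3.2", base_url=OLLAMA_URL):
    """Build a POST request for Ollama's /api/generate route."""
    # stream=False asks for one JSON object instead of a stream of chunks
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.2"):
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the container running, `print(generate("Explain Docker in simple words"))` should print the model's answer. Building the request separately from sending it keeps the payload easy to inspect without a running server.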

To set up a generative AI model using Hugging Face’s Transformers library, follow these steps:

  1. Install the huggingface_hub package, which provides the hf command-line tool, using pip:

    pip install -U "huggingface_hub" #global install
  2. Log in to your Hugging Face account using the CLI:

    hf auth login
    hf whoami

    Logging in requires a Hugging Face account. You can create one for free at https://huggingface.co/join and obtain your access token from https://huggingface.co/settings/tokens.

  3. Install the Transformers and PyTorch libraries. The command below uses uv; with pip, the equivalent is pip install transformers torch:

    uv add transformers torch
  4. Load a pre-trained model and tokenizer in your Python script:

    from transformers import pipeline
    import json
    pipe = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
    messages = [{"role": "user", "content": "What is FastAPI?"}]
    response = pipe(messages)
    print(json.dumps(response, indent=2))

    If the model is not in the local cache, it is downloaded from the Hugging Face Hub and stored in the cache directory for future use. By default, the cache is located at ~/.cache/huggingface/hub on Linux and %USERPROFILE%\.cache\huggingface\hub on Windows.
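When a text-generation pipeline is given a list of chat messages, as in step 4, the result typically contains the whole conversation rather than just the reply. The sketch below pulls out the last assistant message. It assumes the output has the shape of a list with one dict whose "generated_text" entry is a list of role/content messages; the helper name `last_assistant_reply` is hypothetical and not part of Transformers.

```python
def last_assistant_reply(pipeline_output):
    """Return the text of the final assistant message in a chat-style pipeline result."""
    conversation = pipeline_output[0]["generated_text"]
    # Walk backwards so the most recent assistant turn is found first
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("No assistant message found in pipeline output")
```

In the script from step 4, `print(last_assistant_reply(response))` would then print only the model's answer instead of the full JSON dump.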