Asyncio and Concurrency

Python provides multiple ways to run tasks that appear to happen at the same time. The correct approach depends on what your program is doing:

  • waiting (I/O work)
  • calculating (CPU work)

Term         Meaning
Concurrency  Managing multiple tasks so they can make progress
Parallelism  Running multiple tasks at the exact same time

Concurrency is about handling many tasks efficiently.
Parallelism is about using multiple CPU cores to do work faster.


Multithreading, Multiprocessing, and the GIL

A thread is a small unit of execution inside a process.

  • All threads share the same memory
  • Threads are lightweight
  • Useful when tasks spend time waiting

Example:

import threading
import time

def task():
    print("Task started")
    time.sleep(2)
    print("Task finished")

t = threading.Thread(target=task)
t.start()
t.join()

The same pattern scales to several threads running at once:

import threading
import time

def task(name):
    print(f"Starting {name}")
    time.sleep(2)
    print(f"Ending {name}")

threads = []
for i in range(3):
    t = threading.Thread(target=task, args=(f"Thread-{i}",))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

A race condition happens when multiple threads modify the same data at the same time.

Example (unsafe):

import threading

counter = 0

def increment():
    global counter
    for _ in range(100000):
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # Unexpected result

Locks prevent multiple threads from accessing shared data at the same time.

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # Correct result

A ThreadPoolExecutor creates and manages the threads for you:

from concurrent.futures import ThreadPoolExecutor
import time

def task(name):
    print(f"Start {name}")
    time.sleep(2)
    print(f"End {name}")

with ThreadPoolExecutor(max_workers=3) as executor:
    for i in range(3):
        executor.submit(task, f"Task-{i}")

A process is a completely separate program with its own memory.

  • No shared memory by default
  • True parallel execution
  • Best for CPU-heavy tasks

from multiprocessing import Process
import os

def task():
    print("Process ID:", os.getpid())

processes = []
for _ in range(3):
    p = Process(target=task)
    processes.append(p)
    p.start()

for p in processes:
    p.join()

Feature        Thread            Process
Memory         Shared            Separate
Speed          Fast to create    Slower to create
Communication  Easy              Harder
CPU usage      Limited by GIL    True parallelism

Use multiprocessing for:

  • heavy calculations
  • data processing
  • image/video processing
  • machine learning tasks

from multiprocessing import Process, Queue

def worker(q):
    q.put("Hello from process")

if __name__ == "__main__":
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())
    p.join()

The Global Interpreter Lock (GIL) is a rule in CPython:

  • Only one thread can execute Python code at a time

graph TD
    A["Thread A"] --> GIL["GIL"]
    B["Thread B"] --> GIL
    GIL --> C["Only one thread runs Python code"]

Why CPython has a GIL:

  • Keeps memory management simple
  • Prevents corruption of Python objects
  • Makes the interpreter easier to implement

Type       Meaning
CPU-bound  Heavy computation
IO-bound   Waiting for input/output

Even if you create many threads:

  • Only one runs at a time (because of GIL)
  • No real speed improvement
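A rough sketch of this effect, timing the same pure-Python loop run sequentially and then in two threads (count_up is a throwaway busy-loop written just for this demo):

```python
import threading
import time

def count_up(n):
    # Pure-Python CPU work: the GIL serializes this across threads
    total = 0
    for _ in range(n):
        total += 1
    return total

N = 2_000_000

# Run the work twice, one after the other
start = time.perf_counter()
count_up(N)
count_up(N)
sequential = time.perf_counter() - start

# Run the same work in two threads
start = time.perf_counter()
threads = [threading.Thread(target=count_up, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s  threaded: {threaded:.2f}s")
```

On standard CPython the two timings are typically about the same (the threaded run can even be slightly slower because of switching overhead). C extensions that release the GIL, or free-threaded builds, behave differently.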

Multiprocessing avoids this limit:

  • Each process has its own GIL
  • Processes can run on different CPU cores
  • True parallel execution

Threads are still useful when the program spends its time:

  • waiting for network responses
  • reading or writing files
  • calling APIs
  • running database queries

This helps because while one thread waits, another can run.

Asyncio is another way to handle concurrency using a single thread.

It uses:

  • event loop
  • coroutines
  • non-blocking code

import asyncio

async def task(name):
    print(f"Start {name}")
    await asyncio.sleep(2)
    print(f"End {name}")

async def main():
    tasks = [task(f"Task-{i}") for i in range(3)]
    await asyncio.gather(*tasks)

asyncio.run(main())

graph TD
    Loop["Event Loop"] --> A["Task A waiting"]
    Loop --> B["Task B waiting"]
    Loop --> C["Task C ready"]
    C --> Run["Execute Task"]
    Run --> Wait["Task hits await"]
    Wait --> Loop

  • async def → defines coroutine
  • await → pauses task without blocking
  • event loop → manages execution
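A small sketch of what this buys you (nap is a made-up coroutine for this demo): awaiting coroutines one at a time still runs them back to back, while asyncio.gather lets the event loop overlap the waits.

```python
import asyncio
import time

async def nap(seconds):
    await asyncio.sleep(seconds)  # non-blocking: yields to the event loop
    return seconds

async def one_at_a_time():
    # Each await finishes before the next starts: total ~0.2s
    start = time.perf_counter()
    await nap(0.1)
    await nap(0.1)
    return time.perf_counter() - start

async def overlapped():
    # Both sleeps run concurrently on the event loop: total ~0.1s
    start = time.perf_counter()
    await asyncio.gather(nap(0.1), nap(0.1))
    return time.perf_counter() - start

print(f"{asyncio.run(one_at_a_time()):.2f}s")
print(f"{asyncio.run(overlapped()):.2f}s")
```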

Feature     Threads  Asyncio
Memory      More     Less
Speed       Good     Very efficient
Complexity  Medium   Requires async code

Using blocking requests:

import requests
response = requests.get("https://example.com")
print(response.status_code)

Problem:

  • blocks entire program

import asyncio
import aiohttp

async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

async def main():
    urls = ["https://example.com"] * 3
    tasks = [fetch(url) for url in urls]
    results = await asyncio.gather(*tasks)
    print(len(results))

asyncio.run(main())

concurrent.futures is a high-level module for running tasks asynchronously.

It gives you two executors:

Executor             Uses       Best for
ThreadPoolExecutor   Threads    I/O-bound tasks (wait)
ProcessPoolExecutor  Processes  CPU-bound tasks (compute)

Both executors:

  • manage worker pools
  • schedule tasks for you
  • return results using Future objects

executor.submit():

  • Runs one task
  • Returns a Future

from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

with ThreadPoolExecutor() as executor:
    future = executor.submit(square, 5)
    print(future.result())

executor.map():

  • Runs many inputs
  • Returns results in order

from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

with ThreadPoolExecutor() as executor:
    results = executor.map(square, [1, 2, 3, 4])
    for r in results:
        print(r)

Use when work spends most time waiting:

  • API calls
  • file reading/writing
  • database queries
from concurrent.futures import ThreadPoolExecutor
import time

def task(name):
    print(f"Starting {name}")
    time.sleep(2)  # simulates waiting
    print(f"Ending {name}")

with ThreadPoolExecutor(max_workers=3) as executor:
    for i in range(3):
        executor.submit(task, f"Task-{i}")

Use when work is heavy computation:

  • large loops
  • data processing
  • image/video processing
from concurrent.futures import ProcessPoolExecutor

def compute(n):
    total = 0
    for i in range(n):
        total += i
    return total

if __name__ == "__main__":  # important on Windows
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(compute, [5_000_000, 10_000_000, 20_000_000]))
        print(results)

Why it helps:

  • true parallel execution on multiple cores
  • bypasses GIL for CPU-heavy Python code
  • more overhead than threads, so avoid very tiny tasks

  • If your task waits, use ThreadPoolExecutor.
  • If your task calculates, use ProcessPoolExecutor.
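That rule of thumb can be wrapped in a tiny helper (run_tasks is hypothetical, written only to illustrate the decision, not part of concurrent.futures):

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def run_tasks(fn, inputs, cpu_bound=False):
    # Pick the executor based on the kind of work being done
    pool_cls = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    with pool_cls() as executor:
        return list(executor.map(fn, inputs))

if __name__ == "__main__":
    # I/O-style or lightweight work -> threads
    print(run_tasks(len, ["a", "bb", "ccc"]))  # [1, 2, 3]
```

Note that for the cpu_bound=True branch, fn must be defined at module level so it can be pickled, and the call must sit under the if __name__ == "__main__" guard on Windows.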

Situation              Best Choice
Network calls          asyncio
File I/O               threads
CPU-heavy work         multiprocessing
Simple parallel tasks  concurrent.futures