Writing Computationally Efficient and Scalable Code

Writing computationally efficient and scalable code is a practical skill you'll reach for on almost every data-science project in Python. This guide focuses on the idiomatic patterns professional engineers actually use, not textbook toy examples.

Why Writing Computationally Efficient Code Matters

Data scientists who write clean, testable, well-structured Python ship faster, reuse more of their own work, and collaborate better. Craftsmanship here pays dividends on every subsequent project.

  • Write small, composable functions with explicit inputs and outputs.
  • Prefer built-in data structures and the standard library where they fit.
  • Handle failure with narrow, named exceptions instead of bare except.
  • Measure before you optimise; always profile first (see the sketch just below this list).
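
To make that last point concrete, here is a minimal sketch using only the standard library's timeit module. The list-versus-set membership test is our own illustrative micro-benchmark, not one of the numbered examples below; the point is simply that you time both candidates before declaring a winner.

# Profiling sketch: measure before you optimise (illustrative micro-benchmark)
import timeit

data_list = list(range(100_000))
data_set = set(data_list)  # same elements, hash-based container

# Membership testing: the list scans up to all 100k elements (O(n)),
# while the set does a single hash lookup (O(1) on average).
t_list = timeit.timeit(lambda: 99_999 in data_list, number=1_000)
t_set = timeit.timeit(lambda: 99_999 in data_set, number=1_000)

print(f"list membership: {t_list:.4f} s for 1,000 lookups")
print(f"set membership : {t_set:.4f} s for 1,000 lookups")

For anything beyond a micro-benchmark, the standard library's cProfile module gives a per-function breakdown of where the time actually goes.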

How Writing Computationally Efficient Code Shows Up in Practice

In a typical project, writing computationally efficient and scalable code is combined with the rest of the Python Programming toolkit. You rarely use any one technique in isolation; the real skill is knowing which combination fits the problem you are trying to solve, and being able to explain that choice to a non-technical stakeholder.

This shows up every day: building pipelines, writing analysis notebooks, packaging reusable utilities and reviewing a teammate's pull request.


Code Examples: Writing Computationally Efficient and Scalable Code (5 runnable snippets)

Copy any block into a file or notebook and run it end-to-end — each example stands alone.

Example 1: Concurrent I/O with asyncio + aiohttp

# Example 1: Concurrent I/O with asyncio + aiohttp -- Writing Computationally Efficient and Scalable Code
import asyncio
import aiohttp

URLS = [
    "https://httpbin.org/uuid",
    "https://httpbin.org/user-agent",
    "https://httpbin.org/ip",
    "https://httpbin.org/headers",
]

async def fetch(session, url):
    # aiohttp expects a ClientTimeout object for per-request timeouts
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
        return url, resp.status, len(await resp.text())

async def main():
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(*(fetch(session, u) for u in URLS))
    for url, status, size in results:
        print(f"{status}  {size:>5} bytes  {url}")

asyncio.run(main())
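
A quick note on running this one: asyncio.run() starts a fresh event loop, which is right for a script but fails inside Jupyter, where a loop is already running; there, call await main() at the top level instead. The speed-up comes from overlapping the network waits (concurrency), not from extra CPU cores.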

Example 2: Typed dataclass with custom methods

# Example 2: Typed dataclass with custom methods -- Writing Computationally Efficient and Scalable Code
from dataclasses import dataclass, field
from typing import Iterable

@dataclass(slots=True)  # slots=True needs Python 3.10+
class Sample:
    id: int
    features: list[float] = field(default_factory=list)
    label: str | None = None

    def norm(self) -> float:
        return sum(x * x for x in self.features) ** 0.5

    def scaled(self, factor: float) -> "Sample":
        return Sample(self.id, [x * factor for x in self.features], self.label)

def build(rows: Iterable[tuple[int, list[float], str]]) -> list[Sample]:
    return [Sample(i, f, y) for i, f, y in rows]

batch = build([(1, [1.0, 2.0], "A"), (2, [-3.0, 4.0], "B")])
for s in batch:
    print(s.id, round(s.norm(), 3), s.label)
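
The slots=True flag (supported by dataclasses since Python 3.10) replaces the per-instance __dict__ with fixed attribute slots, which cuts per-object memory noticeably when you hold millions of Sample instances; the trade-off is that you cannot attach arbitrary new attributes at runtime.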

Example 3: Generators, itertools and lazy pipelines

# Example 3: Generators, itertools and lazy pipelines -- Writing Computationally Efficient and Scalable Code
from itertools import islice, accumulate

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def running_stats(seq):
    # Lazily yield (value, running mean, running total) one item at a time.
    total, n = 0, 0
    for x in seq:
        total += x
        n += 1
        yield x, total / n, total

first10 = list(islice(fibonacci(), 10))
print("fib(0..9)   :", first10)
for x, avg, cum in running_stats(first10):
    print(f"  x={x:>3}  mean={avg:>6.2f}  cumulative={cum:>4}")

partial_sums = list(accumulate(first10))
print("partial sums:", partial_sums)

Example 4: Context manager with timing and error handling

# Example 4: Context manager with timing and error handling -- Writing Computationally Efficient and Scalable Code
from contextlib import contextmanager
import time
import traceback

@contextmanager
def timed(name: str):
    t0 = time.perf_counter()
    try:
        yield
    except Exception as exc:
        print(f"[{name}] failed: {exc!r}")
        traceback.print_exc()
        raise
    finally:
        dt_ms = (time.perf_counter() - t0) * 1_000
        print(f"[{name}] took {dt_ms:.2f} ms")

with timed("hash 1M ints"):
    total = sum(hash(i) for i in range(1_000_000))
print("result:", total % 9_973)

Example 5: Decorator for memoised pure functions

# Example 5: Decorator for memoised pure functions -- Writing Computationally Efficient and Scalable Code
from functools import wraps

def memoise(fn):
    # Minimal memoiser: caches by positional args only (no kwargs support).
    cache: dict = {}
    @wraps(fn)
    def inner(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]
    inner.cache = cache
    return inner

@memoise
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print([fib(i) for i in range(15)])
print("cache entries:", len(fib.cache))