Advanced Data Structures and Algorithmic Complexity
Advanced data structures and algorithmic complexity are practical programming skills that you will reach for on almost every data-science project in Python. This guide focuses on the idiomatic patterns professional engineers actually use, not textbook toy examples.
Why Advanced Data Structures Matter
Data scientists who write clean, testable, well-structured Python ship faster, re-use more and collaborate better. Craftsmanship here pays dividends on every subsequent project.
- Write small, composable functions with explicit inputs and outputs.
- Prefer built-in data structures and the standard library where they fit.
- Handle failure with narrow, named exceptions instead of bare except.
- Measure before you optimise — always profile first.
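As a minimal sketch of the second and fourth points above, the snippet below times membership tests on a list and on a set built from the same data. The names `N`, `REPEATS`, `haystack_list`, `haystack_set` and `time_membership` are illustrative assumptions, not part of the original guide. Timings vary by machine, so no figures are quoted.

```python
import timeit

N = 100_000      # illustrative data size, chosen only for this demo
REPEATS = 200    # number of lookups to time per container

haystack_list = list(range(N))
haystack_set = set(haystack_list)
needle = N - 1   # last element: the slowest case for a linear list scan


def time_membership(container, value, repeats=REPEATS):
    """Return the total seconds taken by `repeats` membership tests."""
    return timeit.timeit(lambda: value in container, number=repeats)


list_seconds = time_membership(haystack_list, needle)
set_seconds = time_membership(haystack_set, needle)

print(f"list: {list_seconds:.6f} s for {REPEATS} lookups (linear scan)")
print(f"set : {set_seconds:.6f} s for {REPEATS} lookups (hash lookup)")
```

A list checks elements one by one, so lookup time grows with the data; a set hashes the value and is typically constant-time on average. Measuring both on your own data is what justifies the switch.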
How Advanced Data Structures Show Up in Practice
In a typical project, advanced data structures and algorithmic complexity are combined with the rest of the Python programming toolkit. You rarely use any one technique in isolation; the real skill is knowing which combination fits the problem you are trying to solve, and being able to explain that choice to a non-technical stakeholder.
This shows up every day: building pipelines, writing analysis notebooks, packaging reusable utilities and reviewing a teammate's pull request.
- Management of Professional Development Environments and
- Pythonic Code Idiomatic Expressions Adherence Pep
- Control Flow Iterators and Generators in
- Modular and Functional Programming Paradigms
Code Examples: Advanced Data Structures and Algorithmic Complexity (5 runnable snippets)
Copy any block into a file or notebook and run it end-to-end. Each example stands alone; Example 4 also needs the third-party aiohttp package and network access.
Example 1: Generators, itertools and lazy pipelines
# Example 1: Generators, itertools and lazy pipelines
from itertools import accumulate, islice


def fibonacci():
    """Yield the Fibonacci numbers 0, 1, 1, 2, ... without end."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def running_stats(seq):
    """Yield (value, running mean, running total) for each item in seq."""
    total, n = 0, 0
    for x in seq:
        total += x
        n += 1
        yield x, total / n, total


# islice takes a finite prefix of the infinite generator.
first10 = list(islice(fibonacci(), 10))
print("fib(0..9) :", first10)

for x, avg, cum in islice(running_stats(first10), 10):
    print(f" x={x:>3} mean={avg:>6.2f} cumulative={cum:>4}")

# accumulate produces the same running totals as running_stats.
partial_sums = list(accumulate(first10))
print("partial sums:", partial_sums)
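The lazy pipeline above pairs naturally with a bounded buffer. The following is a minimal sketch, assuming a fixed-size moving average is the goal; `moving_average` and `naturals` are hypothetical helpers introduced only for illustration. It relies on `collections.deque` with `maxlen`, whose appends at either end are O(1).

```python
from collections import deque
from itertools import islice


def moving_average(seq, window):
    """Yield the mean of the last `window` items once the window is full."""
    buf = deque(maxlen=window)  # the oldest item is dropped automatically
    total = 0
    for x in seq:
        if len(buf) == window:
            total -= buf[0]  # subtract the value about to be evicted
        buf.append(x)
        total += x
        if len(buf) == window:
            yield total / window


def naturals():
    """Infinite lazy stream 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1


# Only five averages are ever computed, even though the source is infinite.
first5 = list(islice(moving_average(naturals(), 3), 5))
print(first5)  # [2.0, 3.0, 4.0, 5.0, 6.0]
```

Keeping a running total means each step costs constant time, rather than re-summing the whole window on every item.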
Example 2: Context manager with timing and error handling
# Example 2: Context manager with timing and error handling
import time
import traceback
from contextlib import contextmanager


@contextmanager
def timed(name: str):
    """Time the body of a with-block; report and re-raise any exception."""
    t0 = time.perf_counter()
    try:
        yield
    except Exception as exc:
        print(f"[{name}] failed: {exc!r}")
        traceback.print_exc()
        raise  # report the error, never swallow it
    finally:
        # Runs on success and on failure, so the duration is always printed.
        dt_ms = (time.perf_counter() - t0) * 1_000
        print(f"[{name}] took {dt_ms:.2f} ms")


with timed("hash 1M ints"):
    total = sum(hash(i) for i in range(1_000_000))
    print("result:", total % 9_973)
Example 3: Decorator for memoised pure functions
# Example 3: Decorator for memoised pure functions
from functools import wraps


def memoise(fn):
    """Cache results keyed by positional arguments, which must be hashable."""
    cache: dict = {}

    @wraps(fn)
    def inner(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    inner.cache = cache  # expose the cache for inspection
    return inner


@memoise
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(15)])
print("cache entries:", len(fib.cache))
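The hand-written `memoise` decorator above shows the mechanism. For comparison, here is a sketch of the same idea using the standard library's `functools.lru_cache`; the function name `fib_cached` is an illustrative choice, not something from the original example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache, comparable to the plain dict above
def fib_cached(n: int) -> int:
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


values = [fib_cached(i) for i in range(15)]
print(values)
print(fib_cached.cache_info())  # hit/miss counters and current cache size
```

Without caching, this recursive definition recomputes the same subproblems exponentially often; with it, each `n` is computed once. Setting `maxsize` to a number instead of `None` bounds memory by evicting the least recently used entries.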
Example 4: Concurrent I/O with asyncio + aiohttp
# Example 4: Concurrent I/O with asyncio + aiohttp
# Requires the third-party aiohttp package and network access.
import asyncio

import aiohttp

URLS = [
    "https://httpbin.org/uuid",
    "https://httpbin.org/user-agent",
    "https://httpbin.org/ip",
    "https://httpbin.org/headers",
]

# aiohttp expects a ClientTimeout object; total is in seconds per request.
TIMEOUT = aiohttp.ClientTimeout(total=10)


async def fetch(session, url):
    async with session.get(url, timeout=TIMEOUT) as resp:
        # len() of the decoded text counts characters, not raw bytes.
        return url, resp.status, len(await resp.text())


async def main():
    async with aiohttp.ClientSession() as session:
        # gather runs the requests concurrently and keeps the input order.
        results = await asyncio.gather(*(fetch(session, u) for u in URLS))
    for url, status, size in results:
        print(f"{status} {size:>5} chars {url}")


asyncio.run(main())
Example 5: Typed dataclass with custom methods
# Example 5: Typed dataclass with custom methods
# Needs Python 3.10+ (dataclass slots=True and the `str | None` syntax).
from dataclasses import dataclass, field
from typing import Iterable


@dataclass(slots=True)
class Sample:
    id: int
    features: list[float] = field(default_factory=list)
    label: str | None = None

    def norm(self) -> float:
        """Euclidean (L2) norm of the feature vector."""
        return sum(x * x for x in self.features) ** 0.5

    def scaled(self, factor: float) -> "Sample":
        """Return a new Sample with every feature multiplied by factor."""
        return Sample(self.id, [x * factor for x in self.features], self.label)


def build(rows: Iterable[tuple[int, list[float], str]]) -> list[Sample]:
    return [Sample(i, f, y) for i, f, y in rows]


batch = build([(1, [1.0, 2.0], "A"), (2, [-3.0, 4.0], "B")])
for s in batch:
    print(s.id, round(s.norm(), 3), s.label)