Big O Notation, Explained Without a Math Degree
Big O isn't math for its own sake. It answers one practical question: when your input gets 10x or 1000x bigger, does your code get 10x slower, 1000x slower, or barely slower at all? Here's how to read the shapes without the limit definition.
Forget the textbook for a second. Big O notation exists to answer one question you actually care about: when the input to your code gets 10x bigger, or 1000x bigger, what happens to the runtime? Does it get 10x slower? 1000x slower? A million times slower? Or does it barely move?
That's it. That's the whole idea. Everything else, the funny notation, the dropped constants, the Greek-looking limit definition in the back of the algorithms book, is just bookkeeping in service of that one question. You don't need the math to use this well. You need to recognize the shapes.
Summary
I'm going to walk through the handful of shapes that cover almost everything you'll write, show you what each one feels like at scale, and end with the single most useful refactor in the whole subject: turning an O(n^2) into an O(n) by reaching for a hash map.
Why we throw away the details
Here's the part that confuses people first, so let's kill it early. Say you write a loop that does three things on every element of a list of size n. The exact number of operations is something like 3n, plus a bit of setup, call it 3n + 5. Big O calls this O(n).
Where did the 3 go? Where did the 5 go? They got dropped on purpose. Big O is about the trend, not the exact count. When n is a billion, the difference between 3n and 3n + 5 is rounding error. And the 3 in front? It's a constant. Doubling the input still doubles the work whether the constant is 3 or 300. The shapeof the growth is what we're naming, and the shape is linear.
This is also why the formal definition with limits exists, and why you can safely ignore it. The math is just a rigorous way to say “past some input size, this term dominates and the rest is noise.” You already understand the idea intuitively. A car that goes twice as far in twice the time is going at a steady speed, whether it's a Honda or a Ferrari. The constant is the engine. The shape is the relationship.
Heads up
The shapes, from flat to catastrophic
There are maybe six growth classes that cover the vast majority of code you'll read or write. Here they are, roughly in order from “doesn't care how big your data is” to “please do not run this on more than twelve things.”
O(1), constant time.The runtime doesn't depend on the input size at all. Looking up a value in a hash map by its key. Grabbing the element at index 5 of an array. Pushing onto a stack. Whether the array has ten elements or ten million, indexing into it takes the same time. This is the gold standard. When you can make something O(1), you usually should.
O(log n), logarithmic.Every step throws away half of what's left. Binary search is the poster child: to find a name in a sorted phone book, you open to the middle, decide which half the name is in, and repeat on that half. A phone book with a billion names takes about 30 steps, because you can only halve a billion about 30 times before you're down to one. Logarithmic growth is almost as good as constant. Doubling the input adds just one more step.
O(n), linear.You touch each element a fixed number of times. Scanning a list to find the max, summing an array, printing every row. Double the input, double the time. This is the honest, unavoidable cost of “look at everything once,” and for a lot of problems it's the best you can possibly do, because you can't find the max without looking at every number.
O(n log n), linearithmic.A linear pass, but each element pays a logarithmic tax. This is the cost of a good sort. Mergesort splits the list in half repeatedly (that's the log n part) and merges the pieces back together (that's the n part). Memorize this one. It's the practical floor for comparison-based sorting. You cannot sort arbitrary items by comparing them and do better than O(n log n) in the general case. That's a proven lower bound, not just “nobody's figured it out yet.”[1]
O(n^2), quadratic.For every element, you do something with every other element. A loop inside a loop, both running over the same data. The naive “compare everything to everything” pattern. With 1,000 items that's a million operations. With 1,000,000 items it's a trillion. Quadratic is where code goes to die quietly: it's fine in testing with 50 rows and falls over in production with 50,000.
O(2^n) and O(n!), combinatorial explosion.The runtime roughly doubles (or worse) for each element you add. Brute-force trying every subset of a set is O(2^n). Trying every possible ordering, like the naive traveling-salesman approach, is O(n!). These grow so fast they're useless beyond tiny inputs. O(2^n) at n=50 is about a quadrillion. O(n!) at n=20 already exceeds the number of operations a modern machine does in years. If your solution is here, the real job is finding a smarter approach, not a faster computer.
What these numbers actually feel like
Abstract growth classes don't land until you see the operation counts side by side. Here's roughly how many basic steps each shape costs as the input grows from small to large.
n → 10 1,000 1,000,000
────────────────────────────────────────────────────────
O(1) 1 1 1
O(log n) ~3 ~10 ~20
O(n) 10 1,000 1,000,000
O(n log n) ~33 ~10,000 ~20,000,000
O(n^2) 100 1,000,000 1,000,000,000,000
O(2^n) 1,024 (1.07 x 10^301) don't ask
O(n!) 3,628,800 (4 x 10^2567) don't askLook at the O(n^2) row. At a million inputs it's a trillion operations. A modern CPU does maybe a billion simple operations a second, so that's on the order of fifteen minutes for something the O(n) version does in a millisecond. That gap is the entire reason this subject exists. It's not academic. It's the difference between a page that loads instantly and one that times out.
Worst case, average case, and the amortized trick
When someone says an algorithm is O(n), they almost always mean the worst case. That's the default, and it's the safe one to reason about, because the worst case is what wrecks your production latency at 3am, not the average.
But there are three flavors worth knowing. Worst case is the most operations the algorithm could ever do for an input of size n. Average caseis what you'd expect across typical inputs. Best caseis the luckiest scenario, and it's mostly useless for planning, because you don't get to assume luck.
Then there's the one that genuinely confuses people: amortized. You'll hear that appending to a dynamic array, the resizable list type in most languages, is “O(1) amortized.” Here's what that means. Most appends are O(1), just drop the value in the next slot. But every so often the array runs out of room and has to allocate a bigger block and copy everything over, which is O(n). That sounds like an occasional O(n) hit, so why call it O(1)?
Because the array doubles in size each time it grows. The expensive copies get rarer and rarer as the array gets bigger, exactly fast enough that if you spread the cost of all those copies across all the cheap appends, each append averages out to constant time. Amortized analysis is the accounting trick that proves this: a few expensive operations paid for by many cheap ones, averaged over the whole sequence.[2] It's why you can call push a million times and not feel the resizes.
Plain English
Space complexity is the same idea for memory
Everything above was about time. The exact same notation describes how much memory an algorithm needs as the input grows, and people forget to think about it.
An algorithm that scans a list and tracks a single running total uses O(1) extra space, one variable, no matter how big the list. One that builds a new copy of the list uses O(n) space. One that builds a table of every pair uses O(n^2) space, and you'll run out of RAM long before you run out of patience.
There's often a trade between the two, and it's one of the most common real engineering decisions you'll make. You can frequently spend memory to buy speed, which is exactly the move in the refactor coming up next. The two-loop solution uses almost no extra memory but burns time. The hash-map solution spends O(n) memory to drag the time down to O(n). Usually that trade is worth it. Memory is cheap and your users' patience is not.
The one refactor that pays for this whole article
Here's the punchline, and it's the most useful pattern in the subject. When something is mysteriously slow, the usual suspect is a nested loop over the same data. Two loops, both walking the list, quietly making it O(n^2). The fix is almost always the same: use a hash map (or hash set) to remember what you've already seen, so the inner loop becomes a single O(1) lookup instead of another full scan.
The classic example is the two-sum problem. Given a list of numbers and a target, find two numbers that add up to the target. The obvious solution checks every pair:
function twoSumSlow(nums, target) {
// For each number, scan the rest of the list looking for its partner.
for (let i = 0; i < nums.length; i++) {
for (let j = i + 1; j < nums.length; j++) {
if (nums[i] + nums[j] === target) {
return [i, j]
}
}
}
return null
}A loop inside a loop, both over nums. That's O(n^2). With 100 numbers it does about 5,000 comparisons, no problem. With 100,000 numbers it's about five billion, and now your request hangs. Nothing looks wrong. It just doesn't scale.
Now the rewrite. As you walk the list once, you remember every number you've seen in a hash set. For each new number, the partner you need is target - current. Instead of scanning the rest of the list to find it, you ask the set “have I seen this already?” in one O(1) step:
function twoSumFast(nums, target) {
const seen = new Map() // value -> index, the numbers we've passed
for (let i = 0; i < nums.length; i++) {
const need = target - nums[i]
if (seen.has(need)) {
return [seen.get(need), i] // found the partner in O(1)
}
seen.set(nums[i], i) // remember this one for later
}
return null
}One loop. Each step does a constant-time lookup and a constant-time insert. That's O(n) time, paid for with O(n) extra memory for the map. We traded the inner loop for a hash lookup, and the difference at 100,000 inputs is five billion operations versus a hundred thousand. Same answer, a runtime that scales.
Takeaway
When code is slow, look for a loop inside a loop over the same data. That's O(n^2) hiding in plain sight. Nine times out of ten you can kill the inner loop by remembering what you've seen in a hash map and turning a full re-scan into a single O(1) lookup. That one move, O(n^2) to O(n), is the highest-leverage refactor most engineers know.
What to actually take away
You don't need the formal limit definition to get real value out of Big O. You need to recognize the shapes and know what they cost. A hash lookup is flat. A binary search halves. A single scan is linear. A good sort is n log n. Two nested loops over the same data is quadratic, and a brute-force over every combination is a wall.
That's the working knowledge. When you read code, ask the one question this whole thing is built around: if the input gets 1000x bigger, what happens? If the answer is “basically nothing” or “1000x more, fine,” you're good. If the answer is “a million times more,” you've found the line that's going to page you in six months, and you probably already know it needs a hash map.
Sources and further reading
- 1.PrimaryCormen, Leiserson, Rivest, Stein, Introduction to Algorithms (CLRS), 4th ed.. Growth of functions, sorting lower bounds
- 2.PrimaryCLRS, Chapter 16: Amortized Analysis. Aggregate, accounting, and potential methods
- 3.ReportingBig-O Cheat Sheet: time and space complexity of common algorithms. bigocheatsheet.com
- 4.ReportingMDN: Array.prototype.sort and engine sort implementations. developer.mozilla.org
- 5.PrimaryMIT 6.006 Introduction to Algorithms, asymptotic complexity lectures. ocw.mit.edu
Written by
Tech Talk News Editorial
Tech Talk News covers engineering, AI, and tech investing for people who build and invest in technology.