Igor Maric / imTheOdd0ne

NaN: the number that poisons every calculation it touches

NaN looks like a harmless floating-point oddity until it crosses a boundary. One invalid calculation can become a missing chart, a corrupt aggregate, a broken equality check, or a model training run that quietly collapses.

TL;DR

NaN is a formal IEEE 754 value, not a random language wart. It lets invalid numerical operations keep moving through a calculation so the original failure can be diagnosed later, but that same propagation makes it dangerous in ordinary application code. JavaScript comparisons, JSON serialisation, Python encoders, PostgreSQL indexes, pandas missing-data handling, NumPy aggregations, TensorFlow checks, and compiler fast-math flags all treat NaN differently. The lesson is not to fear floating point. It is to stop pretending every numeric field is automatically a valid number.

16 May 2026 · 12 min read · Quality, Standards, Patterns

Most bugs announce themselves by crashing. NaN prefers a slower method. It keeps the program running, slides through arithmetic, survives long enough to reach a chart, a model, an API response, or a database column, and only then does someone notice that a number has turned into an accusation.

The accusation is not subtle. It usually appears exactly as NaN: not a number. The name sounds like a type error, but it is not. In most languages that expose IEEE-style floating point, NaN is a numeric value. It sits inside the number system as a formal result for computations that cannot produce a meaningful real number. Divide zero by zero. Take the square root of a negative number in a real-number context. Subtract infinity from infinity. The machine does not necessarily stop. It hands you a number-shaped failure and lets the computation continue.

That design was not foolish. It solved a real problem. Numerical programs often run long calculations where aborting on the first invalid intermediate result is less useful than finishing the calculation and examining which outputs became invalid. IEEE 754 standardises floating-point formats, operations, exception conditions, and default handling, including nonnumbers.1 William Kahan's notes on IEEE 754 describe the invalid-operation flag that is raised when a NaN is created from non-NaN operands, and the flag remains raised until the program clears it.2 The point was not to hide the failure. The point was to preserve it.

Application software inherited the preservation and forgot the diagnosis.

The error that refuses to stop

NaN is dangerous precisely because it is useful. If an invalid operation returned an ordinary zero, the failure would be silently converted into false certainty. If it threw an exception every time, many numerical algorithms would become harder to write, optimise, and analyse. A quiet NaN takes a third path: it represents invalidity as data.

That invalidity then propagates. In ECMAScript, the ordinary Number type is based on IEEE 754 binary64 arithmetic, and the specification repeatedly states the rule in plain algorithmic form: if an operand is NaN, operations such as addition, multiplication, division, remainder, and exponentiation return NaN in the relevant cases.3 This is the poison and the diagnostic tool in one value. A single invalid input contaminates the downstream result so the final output can reveal that something went wrong upstream.
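A small sketch of that contamination, assuming a hypothetical list of sensor readings (the names and values are illustrative, not taken from any real system). One NaN in the input is enough to invalidate every aggregate built on it, and the same value is what lets a later scan find the culprit.

```javascript
// Hypothetical sensor readings; the third value is the residue of 0 / 0.
const readings = [12.5, 13.1, 0 / 0, 12.9];

// Every accumulation step after the NaN yields NaN, so the total is lost.
const total = readings.reduce((sum, x) => sum + x, 0);
const mean = total / readings.length;

// Math.max also returns NaN if any argument is NaN.
const peak = Math.max(...readings);

// The diagnostic half of the bargain: the poisoned output says something
// went wrong, and a scan of the inputs says where.
const firstBadIndex = readings.findIndex((x) => Number.isNaN(x));

console.log({ total, mean, peak, firstBadIndex });
```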

The trouble starts when the downstream system is not expecting a diagnostic value. A charting layer wants a coordinate. A billing calculation wants an amount. A ranking algorithm wants a score. A machine-learning optimiser wants a loss value. A JSON API wants a serialisable number. A database index wants an ordering. NaN arrives wearing a number's uniform while refusing a number's duties.

That refusal is the first thing developers learn and the first thing many codebases forget.

const value = Number.NaN;

console.log(value === value);
console.log(value < 10);
console.log(value > 10);
console.log(value + 1);

The output is a compact little horror show:

false
false
false
NaN

The first line is the famous one. NaN is not equal to itself under ordinary numeric equality. The relational comparisons are just as important. A value that is neither less than ten nor greater than ten is not therefore ten. It is unordered. Any validation logic that assumes a failed comparison implies its opposite has already lost.
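The validation case is easy to reproduce. The sketch below assumes a hypothetical percentage validator; both function names are made up for illustration. The naive version rejects what is out of range and therefore lets the unordered value straight through.

```javascript
// A hypothetical percentage validator written the intuitive way.
// It rejects values outside 0..100 and silently accepts NaN, because both
// comparisons against NaN are false.
function isValidPercentNaive(value) {
  if (value < 0 || value > 100) {
    return false;
  }
  return true;
}

// The same rule with the unordered case handled explicitly.
function isValidPercent(value) {
  return Number.isFinite(value) && value >= 0 && value <= 100;
}

console.log(isValidPercentNaive(NaN)); // true
console.log(isValidPercent(NaN)); // false
console.log(isValidPercent(42)); // true
```

Writing the rule in its positive form (value >= 0 && value <= 100) happens to reject NaN on its own. The explicit Number.isFinite guard makes that outcome a stated intention rather than an accident of phrasing.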

The equality trap

The phrase NaN !== NaN has become trivia, but the consequences are not trivial. Equality is how software deduplicates, groups, caches, indexes, validates, and decides whether something changed. NaN breaks the everyday intuition that a value must at least be equal to itself.

ECMAScript makes the split explicit. Its ordinary numeric equality operation returns false if either side is NaN.4 But its SameValue operation, exposed through Object.is, treats two NaN values as the same value; SameValueZero, used by array includes, Map, and Set, does the same while also treating positive and negative zero as equal.4

That means this is all true at once:

console.log(NaN === NaN);
console.log(Object.is(NaN, NaN));
console.log([NaN].includes(NaN));
console.log(new Set([NaN, NaN]).size);

And the result is:

false
true
true
1

None of these behaviours is a bug. They are different equality contracts serving different parts of the language. The bug is assuming there is only one contract. A codebase that uses === in one layer, includes in another, JSON serialisation at the boundary, and database uniqueness underneath is already asking the same question in four dialects.
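Change detection is one place where the dialects collide in practice. The helpers below are hypothetical, a minimal sketch rather than any framework's actual implementation: the strict version reports a NaN field as changed on every pass, while the SameValue version does not.

```javascript
// Change detection written with strict inequality: a NaN field looks
// "changed" on every pass, even when nothing happened.
function hasChangedStrict(previous, next) {
  return previous !== next;
}

// The same check under SameValue, where NaN matches NaN.
// Note that Object.is also distinguishes +0 from -0.
function hasChangedSameValue(previous, next) {
  return !Object.is(previous, next);
}

console.log(hasChangedStrict(NaN, NaN)); // true
console.log(hasChangedSameValue(NaN, NaN)); // false

// Array lookups disagree in the same way: indexOf uses strict equality,
// includes uses SameValueZero.
const scores = [1, NaN, 3];
console.log(scores.indexOf(NaN)); // -1
console.log(scores.includes(NaN)); // true
```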

This is why NaN is worse than a visibly invalid string. A string called 'not-a-number' is annoying, but it does not pretend to participate in numeric ordering. NaN does. It can move through arithmetic, live in arrays, sit in object properties, enter dataframes, and cross API layers until a contract finally disagrees with the one before it.

The value is not lying. The surrounding code is overconfident.

The serialisation trap

JSON is where many systems discover that NaN is not a portable fact. RFC 8259's grammar for JSON numbers explicitly excludes values such as Infinity and NaN.5 A conforming JSON generator must produce text that fits the grammar. NaN does not fit.

JavaScript responds by erasing the distinction. ECMAScript's JSON.stringify rules state that finite numbers are stringified as numbers, while NaN and Infinity are represented as null.6 That is a pragmatic choice, and sometimes a defensible one, but it is also a lossy transformation. A failed numerical calculation and an intentionally absent value become indistinguishable unless the surrounding schema carries a second signal.

JSON.stringify({ score: NaN });

The result is:

{"score":null}

The receiving service no longer knows whether score was missing, deliberately redacted, not applicable, or the residue of an invalid calculation. The most important piece of information was the reason the number stopped being a number. The serialised payload keeps the shape and loses the cause.

Python complicates the boundary in the opposite direction. Its standard json module accepts and emits NaN, Infinity, and -Infinity by default, while documenting that this is outside the JSON specification; passing allow_nan=False makes the encoder raise a ValueError for those values instead.7 One runtime silently turns NaN into null. Another may write a token that strict JSON consumers reject. Both behaviours are documented. Neither is safe to discover by accident in production.

The moral is not that JSON is broken. JSON is doing what its grammar says. The moral is that a numeric API contract needs an explicit policy for non-finite values. Reject them. Encode them as tagged domain values. Keep them out of the payload. But do not let a general-purpose serialiser decide whether mathematical failure becomes null, a non-standard token, or an exception.
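Two of those policies can be sketched with the replacer argument of JSON.stringify, which sees each value before the serialiser would turn a non-finite number into null. The function names and the "$nonFinite" tag are assumptions invented for this example, not an established convention; a real API would need both sides to agree on the encoding.

```javascript
// Policy 1: reject. Refuse to serialise a non-finite number at all.
function stringifyStrict(payload) {
  return JSON.stringify(payload, (key, value) => {
    if (typeof value === "number" && !Number.isFinite(value)) {
      throw new TypeError(`Non-finite number at key "${key}"`);
    }
    return value;
  });
}

// Policy 2: tag. Replace the value with an object that preserves which
// non-finite value it was. "$nonFinite" is a made-up key for this sketch.
function stringifyTagged(payload) {
  return JSON.stringify(payload, (key, value) => {
    if (typeof value === "number" && !Number.isFinite(value)) {
      return { $nonFinite: String(value) };
    }
    return value;
  });
}

console.log(stringifyTagged({ score: NaN }));
// {"score":{"$nonFinite":"NaN"}}

try {
  stringifyStrict({ score: NaN });
} catch (error) {
  console.log(error.message); // Non-finite number at key "score"
}
```

Either way, the decision is made by code the team wrote and can test, not by the serialiser's default.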

By the time NaN reaches a wire format, the interesting debugging question is usually several layers old.

The database and dataframe compromise

Databases have a different problem. They need equality and ordering because indexes need equality and ordering. PostgreSQL documents the compromise directly: although most implementations of NaN do not treat it as equal to any value, including itself, PostgreSQL treats NaN values as equal and greater than all non-NaN values so numeric and floating-point values can be sorted and used in tree-based indexes.8

That is a sane database decision. It is also a semantic trap. The same NaN that fails equality in application code can become equal inside the database. The same unordered value can acquire an ordering position because indexes require one. If an application layer assumes NaN is never equal and a persistence layer groups NaNs together, the system is not merely storing data. It is translating mathematical invalidity into an operational convenience.
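Application code can make the same kind of decision deliberately. The comparator below is a sketch that echoes the PostgreSQL rule described above (NaNs equal to each other and greater than everything else); the function name is hypothetical. A plain numeric comparator such as (a, b) => a - b returns NaN whenever either side is NaN, which gives the sort no consistent ordering to work with, so the resulting order is not something to rely on.

```javascript
// A comparator with an explicit NaN policy: NaNs compare equal to each other
// and sort after every other number. Comparisons are used instead of
// subtraction because Infinity - Infinity is itself NaN.
function compareNaNLast(a, b) {
  const aNaN = Number.isNaN(a);
  const bNaN = Number.isNaN(b);
  if (aNaN || bNaN) {
    // 0 if both are NaN, 1 if only a is, -1 if only b is.
    return Number(aNaN) - Number(bNaN);
  }
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const values = [3, NaN, 1, Infinity, NaN, -2];
const sorted = [...values].sort(compareNaNLast);
console.log(sorted); // [ -2, 1, 3, Infinity, NaN, NaN ]
```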

Data tooling makes a different compromise. pandas uses dtype-dependent missing-value sentinels, including NaN, while its newer NA value exists to represent missingness more consistently across data types.9 NumPy offers functions such as nanmean, which computes a mean while ignoring NaNs, but returns NaN and raises a warning for slices where every value is NaN.10

These tools are not confused. They are choosing policies. Sometimes NaN means an invalid numerical result. Sometimes it means missing data. Sometimes an aggregate should ignore it. Sometimes ignoring it destroys the signal. A temperature sensor that fails for five minutes may be safely omitted from one dashboard and absolutely central to another. A credit-risk feature that is NaN because the applicant has no mortgage history is not the same as a feature that is NaN because the ingestion job divided by zero.
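The difference between those policies fits in a few lines. This is a plain JavaScript sketch with hypothetical helper names, loosely modelled on the nanmean behaviour described above rather than a port of it. The third variant reports how much was ignored, which is often the part a dashboard needs.

```javascript
// Policy A: propagate. Any NaN in the input makes the mean NaN.
function meanPropagating(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

// Policy B: ignore. Skip NaNs; an input with no usable values still yields NaN.
function meanIgnoringNaN(values) {
  const usable = values.filter((x) => !Number.isNaN(x));
  if (usable.length === 0) return NaN;
  return usable.reduce((sum, x) => sum + x, 0) / usable.length;
}

// Policy C: ignore, but report how much was ignored so the signal survives.
function meanWithCoverage(values) {
  const usable = values.filter((x) => !Number.isNaN(x));
  return {
    mean: meanIgnoringNaN(values),
    used: usable.length,
    skipped: values.length - usable.length,
  };
}

const temperatures = [20, NaN, 22, NaN];
console.log(meanPropagating(temperatures)); // NaN
console.log(meanIgnoringNaN(temperatures)); // 21
console.log(meanWithCoverage(temperatures)); // { mean: 21, used: 2, skipped: 2 }
```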

This is the quiet place where NaN does the most damage: not in arithmetic, but in meaning. A missing value, an impossible value, an uncollected value, a redacted value, and a failed calculation are different facts. Collapsing them all into the same floating-point sentinel makes the data easier to store and harder to reason about.

The convenience is real. So is the debt.

The model that learns nothing

Machine learning makes NaN feel less like a curiosity and more like a system failure. Training loops are long chains of arithmetic. A bad input, unstable loss function, overflow, underflow, invalid logarithm, division by zero, or exploding gradient can introduce NaN into a tensor. Once that happens, propagation does exactly what IEEE arithmetic trained it to do: the invalidity moves forward.

Frameworks provide tools because this failure mode is common enough to deserve first-class debugging support. TensorFlow's enable_check_numerics causes execution to error as soon as an operation produces infinity or NaN.11 PyTorch's anomaly detection can raise an error when a backward computation generates NaN, although the documentation warns that the mode is intended for debugging because the checks slow execution.12

That is the same lesson in a more expensive costume: detection has a cost, but late detection has a larger one. If a training run produces NaN loss after four hours, the failure did not begin at hour four. Hour four is when the accumulated invalidity became visible enough for the metric to confess.
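A toy version of that trade-off, in plain JavaScript rather than TensorFlow or PyTorch. Everything here is an assumption made for illustration: the loss function, the deliberately unstable learning rate, and the train helper. The point is only that a per-step finiteness check records when the run left the rails, while a check at the end records nothing but the wreckage.

```javascript
// Toy gradient descent on loss(w) = w * w with a deliberately unstable
// learning rate. Each step multiplies w by -2, so the values eventually
// overflow to Infinity and then become NaN (Infinity - Infinity).
function train({ steps, learningRate, checkEveryStep }) {
  let w = 1;
  let loss = w * w;
  for (let step = 0; step < steps; step += 1) {
    const gradient = 2 * w;
    w = w - learningRate * gradient;
    loss = w * w;
    if (checkEveryStep && !Number.isFinite(loss)) {
      // Fail at the first non-finite loss, while the step number still
      // points near the cause.
      return { ok: false, failedAtStep: step, loss };
    }
  }
  return { ok: Number.isFinite(loss), failedAtStep: null, loss };
}

const late = train({ steps: 5000, learningRate: 1.5, checkEveryStep: false });
const early = train({ steps: 5000, learningRate: 1.5, checkEveryStep: true });

console.log(late); // ok is false, with no record of when it broke
console.log(early); // ok is false, with the step where loss first stopped being finite
```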

The same pattern appears outside machine learning. A dashboard can show a blank point because a ratio divided by zero three transformations upstream. An alert can fail to fire because a comparison against NaN returned false. A ranking can quietly drop an item because sorting code had no explicit non-finite policy. A CSV export can carry the string NaN into a spreadsheet, where the next person treats it as text, missingness, or a corrupted value depending on which tool opens it.

NaN is a bad incident witness. It can tell you that invalid arithmetic happened. It cannot, by itself, tell you where the original invalid operation was, what domain fact it represented, or whether continuing was acceptable.

That context has to be captured before the value spreads.

Keeping the poison contained

The right response to NaN is not panic. It is boundary discipline.

First, decide where non-finite values are allowed. Scientific computation, simulation, graphics, machine learning research, statistical exploration, and numerical libraries may deliberately use NaN as a diagnostic or missing-data marker. Product APIs, payment paths, ranking scores, permissions logic, feature flags, and user-visible metrics usually need a stricter contract. A field named score should not be a shrug.

Second, validate at boundaries rather than sprinkling checks randomly through the code. Incoming API payloads, database writes, dataframe ingestion, model-feature construction, chart data, and JSON serialisation are natural places to reject or classify NaN. A single Number.isFinite check at the edge of a JavaScript API can be worth more than twenty defensive comparisons buried downstream. In Python, choosing allow_nan=False when emitting JSON turns a silent interoperability hazard into an explicit error.7
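A minimal sketch of such an edge check, assuming a hypothetical payload shaped like { score: 0.87 } and a made-up parseScore helper. Number.isFinite does not coerce its argument, so strings, missing fields, NaN, and the infinities are all rejected in one test.

```javascript
// Hypothetical boundary check for an incoming payload such as { score: 0.87 }.
// Number.isFinite returns false for anything that is not a finite number,
// without coercion. (The global isFinite would coerce "12" to 12 first.)
function parseScore(payload) {
  const score = payload?.score;
  if (!Number.isFinite(score)) {
    throw new RangeError(`score must be a finite number, received ${String(score)}`);
  }
  return score;
}

console.log(parseScore({ score: 0.87 })); // 0.87

for (const bad of [{ score: NaN }, { score: Infinity }, { score: "0.87" }, {}]) {
  try {
    parseScore(bad);
  } catch (error) {
    console.log(error.message);
  }
}
```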

Third, represent domain meaning explicitly. If a value is missing, model missingness. If it is not applicable, say that. If a calculation failed, return an error object, tagged union, nullable field with a reason, or validation result that preserves cause. NaN is a poor substitute for a domain model because it says only that a number did not survive.
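One way this can look, sketched as a tagged result for a hypothetical conversion-rate calculation. The function names and reason strings are illustrative assumptions, not a standard vocabulary. The calculation that would have produced 0 / 0 returns a cause instead, and the caller is forced to decide what each cause means.

```javascript
// A ratio that returns a tagged result instead of letting 0 / 0 escape as NaN.
function conversionRate(conversions, visits) {
  if (!Number.isFinite(conversions) || !Number.isFinite(visits)) {
    return { ok: false, reason: "invalid-input" };
  }
  if (visits === 0) {
    return { ok: false, reason: "no-visits" };
  }
  return { ok: true, value: conversions / visits };
}

// The caller has to decide what each failure means for the user.
function formatRate(result) {
  if (!result.ok) {
    return result.reason === "no-visits" ? "No traffic yet" : "Data error";
  }
  return `${(result.value * 100).toFixed(1)}%`;
}

console.log(formatRate(conversionRate(5, 200))); // 2.5%
console.log(formatRate(conversionRate(0, 0))); // No traffic yet
console.log(formatRate(conversionRate(NaN, 10))); // Data error
```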

Fourth, be careful with performance flags that rewrite the contract. GCC's -ffast-math enables -ffinite-math-only, and that option allows optimisations which assume arguments and results are not NaNs or infinities.13 The compiler is not betraying you when it optimises under the assumptions you requested. If the program depends on IEEE behaviour for NaN checks, do not ask the compiler to assume NaNs do not exist.

Finally, test the bad path deliberately. Tests that cover ordinary numbers are not enough. Feed the system NaN, Infinity, -Infinity, negative zero, empty strings that coerce to zero, missing values, and values just beyond expected ranges. Verify the exact policy: reject, coerce, tag, ignore, aggregate, or fail. The important thing is not that every system chooses the same policy. The important thing is that each boundary chooses one on purpose.
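A table-driven check is one compact way to pin such a policy down. The normaliseAmount function below is a hypothetical policy invented for this sketch (finite numbers only, negative zero normalised, no string coercion); the useful part is that every hostile input has a named row and a promised outcome.

```javascript
// Hypothetical policy under test: accept finite numbers only, normalise -0
// to 0, and never coerce strings.
function normaliseAmount(input) {
  if (typeof input !== "number" || !Number.isFinite(input)) {
    return { ok: false };
  }
  return { ok: true, value: input === 0 ? 0 : input };
}

// Each row names a hostile input and the outcome the policy promises.
const cases = [
  { name: "NaN", input: NaN, ok: false },
  { name: "Infinity", input: Infinity, ok: false },
  { name: "-Infinity", input: -Infinity, ok: false },
  { name: "empty string", input: "", ok: false }, // Number("") would be 0
  { name: "missing", input: undefined, ok: false },
  { name: "negative zero", input: -0, ok: true, value: 0 },
  { name: "ordinary", input: 19.99, ok: true, value: 19.99 },
];

const failures = cases.filter(({ input, ok, value }) => {
  const result = normaliseAmount(input);
  // Object.is keeps the test honest about -0 versus 0.
  return result.ok !== ok || (ok && !Object.is(result.value, value));
});

console.log(failures.length === 0 ? "policy holds" : failures.map((c) => c.name));
```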

NaN was designed to keep numerical failure observable. That is a good idea. The failure happens when application systems turn observability into ambiguity. A number stops being a number, then every layer politely pretends that someone else will decide what that means.

Someone else rarely does.

The practical rule is simple: treat NaN as contaminated until proven intentional. If it belongs in the computation, preserve it and monitor it. If it crosses a product boundary, explain it or reject it. If it reaches a customer, dashboard, model, or database without a policy, the bug is not that NaN exists. The bug is that the system had no plan for mathematical failure wearing a number's uniform.


Footnotes

  1. IEEE. (2019). 'IEEE Standard for Floating-Point Arithmetic.' IEEE Standards Association. https://standards.ieee.org/ieee/754/6210/

  2. Kahan, William. (1997). 'Lecture Notes on the Status of IEEE 754.' University of California, Berkeley. https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF

  3. Ecma International. (2024). 'ECMAScript 2024 Language Specification: Number Type and Numeric Operations.' TC39. https://tc39.es/ecma262/2024/multipage/ecmascript-data-types-and-values.html#sec-ecmascript-language-types-number-type

  4. Ecma International. (2024). 'ECMAScript 2024 Language Specification: Number::equal, Number::sameValue, and Number::sameValueZero.' TC39. https://tc39.es/ecma262/2024/multipage/ecmascript-data-types-and-values.html#sec-numeric-types-number-equal

  5. Bray, Tim, Ed. (2017). 'RFC 8259: The JavaScript Object Notation (JSON) Data Interchange Format.' Internet Engineering Task Force. https://www.rfc-editor.org/rfc/rfc8259

  6. Ecma International. (2022). 'ECMAScript 2022 Language Specification: JSON.stringify.' TC39. https://tc39.es/ecma262/2022/multipage/structured-data.html#sec-json.stringify

  7. Python Software Foundation. (2026). 'json: JSON encoder and decoder.' Python 3.14.5 documentation. https://docs.python.org/3.14/library/json.html

  8. PostgreSQL Global Development Group. (2026). 'PostgreSQL Documentation: Numeric Types.' PostgreSQL Documentation. https://www.postgresql.org/docs/14/datatype-numeric.html

  9. pandas development team. (2026). 'Working with missing data.' pandas 3.0 documentation. https://pandas.pydata.org/pandas-docs/version/3.0/user_guide/missing_data.html

  10. NumPy Developers. (2026). 'numpy.nanmean.' NumPy v2.4 Manual. https://numpy.org/doc/stable/reference/generated/numpy.nanmean.html

  11. Google. (2024). 'tf.debugging.enable_check_numerics.' TensorFlow API documentation. https://www.tensorflow.org/api_docs/python/tf/debugging/enable_check_numerics

  12. PyTorch Contributors. (2025). 'Automatic differentiation package: debugging and anomaly detection.' PyTorch 2.9 documentation. https://docs.pytorch.org/docs/2.9/autograd.html#debugging-and-anomaly-detection

  13. Free Software Foundation. (2026). 'Optimize Options.' Using the GNU Compiler Collection. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
