Property-Based Testing: Finding Bugs in numpy.random.wald

This transcript shows how Claude identifies a bug in numpy.random.wald by inferring properties from documentation and tests, then using Hypothesis to find counterexamples.

✻ Thinking
Now let me use introspection to understand the numpy.random.wald function better - get its signature, docstring, and source code if available.
⏺ Tool Call: Bash
python3 -c " import numpy…
⎿ Tool Output
=== SIGNATURE === (mean, scale, size=None) … +79 lines (ctrl+o to expand)
✻ Thinking
Good, I have the function signature and docstring. It takes mean and scale parameters (both must be > 0) and an optional size parameter. Let me explore the numpy random module structure to understand the implementation better.
⏺ Tool Call: Bash
ls -la /Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random/
⎿ Tool Output
total 5488 drwxr-xr-x 34 redacted admin 1088 Jun 23 16:56 . drwxr-xr-x 52 redacted admin 1664 Jun 23 16:56 .. … +32 lines (ctrl+o to expand)
✻ Thinking
Let me check the __init__.py file to see how wald is exposed. I should also look into the mtrand module, which is likely where the legacy random functions are implemented.
⏺ Tool Call: Read
/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random/__init__.py
⎿ Tool Output
Read 214 lines (ctrl+o to expand)
✻ Thinking
So the wald function is imported from mtrand (line 187). Let me check if I can find more information about the actual implementation. Let me also check for tests of the wald function.
⏺ Tool Call: Bash
python3 -c " import numpy.random…
⎿ Tool Output
Function name: wald Module: numpy.random Type: <class 'method'> … +2 lines (ctrl+o to expand)
✻ Thinking
Let me look for tests to understand how the wald function is typically used and what properties are expected. This will help me propose properties based on evidence.
⏺ Tool Call: Search
pattern: "**/test*wald*.py", path: "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random"
⎿ Tool Output
Found 0 files
✻ Thinking
Let me look for general test files in the numpy random tests directory.
⏺ Tool Call: Bash
ls /Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random/tests/
⎿ Tool Output
__init__.py __pycache__ data … +10 lines (ctrl+o to expand)
✻ Thinking
Let me search for "wald" in the test files to see how it's tested.
⏺ Tool Call: Search
pattern: "wald", path: "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random/tests", output_mode: "content"
⎿ Tool Output
Found 40 lines (ctrl+o to expand)
✻ Thinking
Good, I found existing tests for wald. Let me look at one of these test files to understand better what properties are being tested and what constraints exist.
⏺ Tool Call: Read
/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/numpy/random/tests/test_random.py
⎿ Tool Output
Read 30 lines (ctrl+o to expand)
✻ Thinking
Now I understand the constraints better:
1. mean must be > 0 (lines 1446, 1452, 1454 show tests for ValueError when mean = 0)
2. scale must be > 0 (lines 1447, 1453, 1455 show tests for ValueError when scale <= 0)
It then proposes some properties based on this analysis:
✻ Thinking
Based on my understanding of the wald function, let me propose evidence-based properties:
1. All outputs should be positive: Since the Wald distribution produces positive values only, all samples should be > 0
2. Parameter validation: mean and scale must be > 0, otherwise ValueError should be raised
3. Shape property: Output shape should match the specified size parameter
4. Broadcasting property: When mean and scale are arrays, they should broadcast correctly
5. Deterministic with seed: Same seed should produce same results
6. No NaN/Inf in output: Valid inputs should never produce NaN or infinity values
Let me now write tests for these properties.
⏺ Assistant
Based on my analysis, here are the evidence-based properties I've identified for numpy.random.wald:
1. All outputs must be positive - The Wald distribution only produces positive values
2. Parameter constraints - mean and scale must be > 0 (documented and tested in existing tests)
3. No NaN/Inf values - Valid inputs should produce finite values
4. Shape consistency - Output shape should match the size parameter
5. Broadcasting - mean and scale arrays should broadcast correctly
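Several of these properties can be checked directly, even before involving Hypothesis. The following is a minimal sketch, not the test file from the session; the parameter values and the seed are arbitrary choices made for illustration. It exercises only the shape, finiteness, broadcasting, determinism, and validation properties.

```python
import numpy as np

# Illustrative values only; they are not taken from the session.
MEAN, SCALE, SEED = 3.0, 2.0, 12345

# Shape consistency: the output shape matches the size argument.
samples = np.random.wald(MEAN, SCALE, size=(4, 5))
assert samples.shape == (4, 5)

# No NaN/Inf for moderate, valid parameters.
assert np.all(np.isfinite(samples))

# Broadcasting: array-valued mean and scale broadcast against each other.
means = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1)
scales = np.array([0.5, 1.0])            # shape (2,)
assert np.random.wald(means, scales).shape == (3, 2)

# Determinism: reseeding the legacy global generator repeats the stream.
np.random.seed(SEED)
first = np.random.wald(MEAN, SCALE, size=10)
np.random.seed(SEED)
second = np.random.wald(MEAN, SCALE, size=10)
assert np.array_equal(first, second)

# Parameter validation: a non-positive mean or scale raises ValueError.
for bad_mean, bad_scale in [(0.0, 1.0), (-1.0, 1.0), (1.0, 0.0), (1.0, -1.0)]:
    try:
        np.random.wald(bad_mean, bad_scale)
    except ValueError:
        pass
    else:
        raise AssertionError(f"expected ValueError for {bad_mean}, {bad_scale}")
```

Fixed-value checks like these only probe a handful of points in the parameter space, which is why the session hands the same properties to Hypothesis to search for counterexamples.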
Next, it translates these properties to Hypothesis tests in a new Python file. For example, when writing the tests, it constrains the mean and scale parameters to be positive, as the documentation requires. When it executes the tests, they fail, and it then reflects on the failure:
✻ Thinking
Excellent! I found bugs. The tests reveal that numpy.random.wald can produce:
1. Zero values (should be positive only)
2. Negative values with extreme parameters (-0 in the large values test)
Let me now move to bug triage to verify these are legitimate bugs and create minimal reproductions.
⏺ Assistant
Great! I found potential bugs. Let me now triage these failures to verify they're legitimate bugs.
The property under test is that the Wald distribution returns only positive values. When it ran the test, Hypothesis found parameter values for which the function returns zero or negative values. To confirm that the bug is real, it writes a minimal reproduction script using the failing parameters. The bug does reproduce, so it then writes a bug report.
✻ Thinking
Now let me create the bug report.
⏺ Tool Call: Write
bug_report_numpy_random_wald_2025-09-15_19-33_bjyl.md
The bug report it produced is:

📋 Bug Report: numpy.random.wald Produces Zero Values

Target: numpy.random.wald
Severity: High
Bug Type: Logic
Date: 2025-09-15

Summary

The numpy.random.wald function produces zero values when given certain valid parameter combinations, violating the mathematical property that the Wald distribution only produces positive values.

Property-Based Test
from hypothesis import given, strategies as st, settings
import numpy as np

positive_floats = st.floats(min_value=1e-10, max_value=1e6, allow_nan=False, allow_infinity=False)

@given(
    mean=positive_floats,
    scale=positive_floats,
    size=st.integers(min_value=1, max_value=1000)
)
@settings(max_examples=1000)
def test_wald_all_outputs_positive(mean, scale, size):
    """Test that all Wald distribution samples are positive."""
    samples = np.random.wald(mean, scale, size)
    assert np.all(samples > 0), f"Found non-positive values: {samples[samples <= 0]}"

Failing input: mean=265150.0, scale=0.00390625, size=1

Reproducing the Bug
import numpy as np

mean = 265150.0
scale = 0.00390625
np.random.seed(0)
samples = np.random.wald(mean, scale, 10)
print(f"Samples: {samples}")
print(f"Contains zero: {np.any(samples == 0)}")

Output:

Samples: [0. 0.02444708 0. 0.00395104 0.36664443 0.02321238 0.00691433 0.26385558 0.01975521 0.03506551]
Contains zero: True

Why This Is A Bug

The Wald (inverse Gaussian) distribution is mathematically defined to produce only positive values. Its probability density function is:

P(x;mean,scale) = sqrt(scale/(2πx³)) * exp(-scale(x-mean)²/(2*mean²*x))

This PDF is only defined for x > 0. Producing zero values violates this fundamental property of the distribution and can cause downstream issues in applications that rely on this mathematical guarantee.

Fix

The issue likely occurs due to numerical underflow or improper handling of extreme parameter ratios. The implementation should ensure that:

  1. Very small computed values are clamped to a small positive epsilon rather than zero
  2. The algorithm handles extreme ratios of mean/scale more robustly

A potential fix would involve checking the output and replacing any zero or negative values with the smallest positive normal float (np.finfo(float).tiny), or adjusting the internal algorithm to prevent underflow.

# In the wald implementation:
- return computed_value
+ return max(computed_value, np.finfo(float).tiny)
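
The report does not show the sampler's source, so the mechanism behind the zeros is not confirmed. One plausible candidate, assuming the sampler uses the standard transformation method for inverse Gaussian variates (Michael, Schucany and Haas), is catastrophic cancellation in the smaller root of that transformation rather than underflow. The sketch below illustrates the idea with the failing parameters from the report. The function names are hypothetical, `nu` stands in for the standard normal draw and is fixed at 2.0 for illustration, and neither function is NumPy's actual code.

```python
import math


def smaller_root_naive(mean, scale, nu):
    """Smaller root of the transformation, written with a subtraction.

    Assumed textbook form; not taken from NumPy's source.
    """
    y = mean * nu * nu
    return mean + mean / (2 * scale) * (y - math.sqrt(4 * scale * y + y * y))


def smaller_root_stable(mean, scale, nu):
    """The same root rewritten with the conjugate, so no subtraction occurs.

    Assumes nu != 0; with nu == 0 the denominator would be zero.
    """
    y = mean * nu * nu
    s = math.sqrt(4 * scale * y + y * y)
    return 4 * mean * scale * y / (y + s) ** 2


# Failing parameters from the report; nu = 2.0 is an illustrative draw.
mean, scale, nu = 265150.0, 0.00390625, 2.0

naive = smaller_root_naive(mean, scale, nu)
stable = smaller_root_stable(mean, scale, nu)

print(naive)   # 0.0: the subtraction cancels every significant digit
print(stable)  # small positive value close to scale / nu**2
```

If this is the mechanism, rewriting the expression addresses the second point in the list above without clamping, which would otherwise replace a lost value of roughly scale / nu**2 with an unrelated tiny constant.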