Billing logic is some of the hardest code to test correctly. Unlike a CRUD endpoint where a wrong answer is obvious, a billing calculation that's off by a few cents silently compounds across thousands of invoices before anyone notices. I've seen a proration bug cost a self-storage operator nearly $4,000 in incorrect charges over three months — all from a single off-by-one error in day counting. After that incident I rebuilt the test suite for the billing engine from scratch with a specific focus on the failure modes that matter in financial software. Here's the testing strategy I use now.
What Makes Billing Logic Hard to Test
Four categories of complexity make billing tests harder than most application tests:
Proration math: When a tenant moves in on the 17th of a month, you need to charge them for the remaining days. That seems simple until you have to handle months with different day counts, fiscal periods that don't match calendar months, and operators who use 30-day billing cycles instead of calendar months. The proration formula itself is usually simple; the date arithmetic around it is treacherous.
Date and time edge cases: Billing that runs at midnight on the last day of the month behaves differently on February 28th, February 29th (leap year), and December 31st. Daylight saving time transitions can cause a billing job that's supposed to run once at midnight to run twice or not at all, depending on how your server handles the clock change.
Tax calculations: Sales tax rates change, tax exemption status varies by tenant type, and compound tax scenarios (state + county + city) have rounding rules that differ by jurisdiction. A test that hardcodes assertEquals(0.0825, $taxRate) will break the moment a municipality changes its rate.
Payment gateway behavior: Your tests can't call a live payment gateway. But tests that mock the gateway too simply miss real-world failure modes — declined cards that return different error codes, network timeouts, partial captures, and the specific behavior when a gateway returns a success response but the payment actually failed (this happens more than you'd think with certain processors).
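To make the daylight saving point concrete, here is a minimal sketch that measures how many real hours pass between one local midnight and the next. The function name hoursInLocalDay and the choice of America/New_York are illustrative assumptions, not part of the billing engine described here.

```php
<?php
declare(strict_types=1);

/**
 * Number of real hours between local midnight on $date and the next
 * local midnight in the given time zone.
 */
function hoursInLocalDay(string $date, string $timeZone): int
{
    $tz = new DateTimeZone($timeZone);
    $start = new DateTimeImmutable($date . ' 00:00:00', $tz);
    $end = $start->modify('+1 day'); // next local midnight (wall-clock arithmetic)

    // Unix timestamps ignore clock changes, so their difference is elapsed time.
    return intdiv($end->getTimestamp() - $start->getTimestamp(), 3600);
}

echo hoursInLocalDay('2024-03-10', 'America/New_York'), "\n"; // 23 (clocks spring forward)
echo hoursInLocalDay('2024-11-03', 'America/New_York'), "\n"; // 25 (clocks fall back)
echo hoursInLocalDay('2024-02-29', 'America/New_York'), "\n"; // 24 (ordinary day)
```

Any job or calculation that treats a day as a fixed 86,400 seconds will drift on those two dates, which is why they deserve their own test cases.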
PHPUnit Setup for Billing
I use PHPUnit 11 with a dedicated configuration file for the billing tests, kept apart from the application's general test suite. The billing suite runs against its own test database that gets rebuilt before each run:
<?xml version="1.0" encoding="UTF-8"?>
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="phpunit.xsd"
         bootstrap="tests/billing/bootstrap.php"
         colors="true">
    <testsuites>
        <testsuite name="billing-unit">
            <directory>tests/billing/unit</directory>
        </testsuite>
        <testsuite name="billing-integration">
            <directory>tests/billing/integration</directory>
        </testsuite>
    </testsuites>
    <php>
        <env name="DB_DSN" value="mysql:host=localhost;dbname=billing_test"/>
        <env name="PAYMENT_GATEWAY_ENV" value="sandbox"/>
    </php>
</phpunit>
The bootstrap file sets up the test database schema and seeds it with a minimal fixture set. I use database transactions in integration tests — each test wraps its database operations in a transaction that gets rolled back in tearDown(), so tests are isolated and the database stays clean without needing a full rebuild between tests.
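The rollback pattern can be sketched without PHPUnit at all. The example below is an illustration under assumptions: it uses an in-memory SQLite database as a stand-in for billing_test, a made-up invoices table, and a hypothetical runIsolated() helper that plays the role of setUp() and tearDown().

```php
<?php
declare(strict_types=1);

// Stand-in for the billing_test database; in-memory SQLite keeps the sketch runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id INTEGER, amount_cents INTEGER)');

/** Runs one "test body" inside a transaction that is always rolled back. */
function runIsolated(PDO $pdo, callable $testBody): void
{
    $pdo->beginTransaction();   // what setUp() would do
    try {
        $testBody($pdo);
    } finally {
        $pdo->rollBack();       // what tearDown() would do
    }
}

function countInvoices(PDO $pdo): int
{
    return (int) $pdo->query('SELECT COUNT(*) FROM invoices')->fetchColumn();
}

$seenInsideTest = 0;
runIsolated($pdo, function (PDO $pdo) use (&$seenInsideTest): void {
    $pdo->exec('INSERT INTO invoices (tenant_id, amount_cents) VALUES (1, 10000)');
    $seenInsideTest = countInvoices($pdo);
});

echo $seenInsideTest, "\n";     // 1: the row is visible while the test runs
echo countInvoices($pdo), "\n"; // 0: nothing is left behind afterwards
```

One caveat on MySQL: DDL statements such as CREATE TABLE or ALTER TABLE trigger an implicit commit, so schema changes belong in the bootstrap, not inside a test that relies on rollback.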
Test Doubles for Payment Gateways
Payment gateway tests require a double that models failure modes, not just happy paths. I create a FakePaymentGateway class rather than relying on PHPUnit's mock builder for this — the fake is more readable and can model stateful behavior like "decline on the third charge attempt" which is difficult to express with mock expectations:
use PHPUnit\Framework\Assert;

class FakePaymentGateway implements PaymentGatewayInterface {
    /** @var PaymentResult[] Responses to return, in order. */
    private array $responses = [];
    /** @var PaymentRequest[] Every request passed to charge(). */
    private array $capturedCharges = [];
    private int $callCount = 0;

    public function queueResponse(PaymentResult $result): void {
        $this->responses[] = $result;
    }

    public function charge(PaymentRequest $request): PaymentResult {
        $this->callCount++;
        $this->capturedCharges[] = $request;
        // With nothing queued, default to a successful charge.
        if (empty($this->responses)) {
            return PaymentResult::success('fake-txn-' . uniqid());
        }
        return array_shift($this->responses);
    }

    public function getCapturedCharges(): array {
        return $this->capturedCharges;
    }

    // Lets tests assert how many charge attempts were made (e.g. retries).
    public function getCallCount(): int {
        return $this->callCount;
    }

    public function assertChargedOnce(): void {
        Assert::assertCount(1, $this->capturedCharges);
    }

    public function assertChargedAmount(Money $amount): void {
        Assert::assertEquals(
            $amount,
            $this->capturedCharges[0]->amount ?? null
        );
    }
}
This fake is injected via constructor in the billing service, and tests can pre-program specific response sequences. Testing the retry behavior on a declined card means queueing a decline result, then a success result, and asserting that the gateway was called twice and that the final outcome is a paid invoice.
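Here is a self-contained sketch of that decline-then-success scenario. Everything in it is a hypothetical stand-in so the example runs on its own: the trimmed-down PaymentResult, the ScriptedGateway (a reduced version of the fake, taking an integer amount in cents), the PaymentResult::declined() factory, and the chargeWithRetry() policy are assumptions, not the real billing service.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal result type for this sketch.
final class PaymentResult
{
    private function __construct(
        public readonly bool $successful,
        public readonly string $reference,
    ) {}

    public static function success(string $transactionId): self
    {
        return new self(true, $transactionId);
    }

    public static function declined(string $errorCode): self
    {
        return new self(false, $errorCode);
    }
}

// Reduced fake: returns queued responses in order and counts calls.
final class ScriptedGateway
{
    /** @var PaymentResult[] */
    private array $responses = [];
    public int $callCount = 0;

    public function queueResponse(PaymentResult $result): void
    {
        $this->responses[] = $result;
    }

    public function charge(int $amountCents): PaymentResult
    {
        $this->callCount++;
        // Default to success once the scripted responses run out.
        return array_shift($this->responses) ?? PaymentResult::success('fake-txn');
    }
}

/** Hypothetical retry policy: try up to $maxAttempts times, stop at the first success. */
function chargeWithRetry(ScriptedGateway $gateway, int $amountCents, int $maxAttempts): PaymentResult
{
    do {
        $result = $gateway->charge($amountCents);
        $maxAttempts--;
    } while (!$result->successful && $maxAttempts > 0);

    return $result;
}

$gateway = new ScriptedGateway();
$gateway->queueResponse(PaymentResult::declined('card_declined'));
$gateway->queueResponse(PaymentResult::success('txn-2'));

$result = chargeWithRetry($gateway, 10000, maxAttempts: 3);

var_dump($result->successful); // bool(true)
var_dump($gateway->callCount); // int(2)
```

The call count matters as much as the final result: a retry loop that keeps charging after a success would still end in a "paid" state while double-billing the tenant.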
Testing Proration Math
Proration tests need to cover calendar-month and 30-day cycle variants, leap years, and move-ins on the first and last day of a period. I use PHPUnit's data providers to cover the matrix without writing 20 separate test methods:
use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\TestCase;

class ProrationCalculatorTest extends TestCase {
    // PHPUnit 11 deprecates doc-comment annotations such as @dataProvider; use the attribute.
    #[DataProvider('prorationProvider')]
    public function testProration(
        string $moveInDate,
        string $periodStart,
        string $periodEnd,
        string $monthlyRate,
        string $expectedAmount
    ): void {
        $calculator = new ProrationCalculator();
        $result = $calculator->calculate(
            new \DateTimeImmutable($moveInDate),
            new \DateTimeImmutable($periodStart),
            new \DateTimeImmutable($periodEnd),
            Money::fromString($monthlyRate)
        );
        $this->assertEquals(Money::fromString($expectedAmount), $result);
    }

    // Columns: move-in date, period start, period end, monthly rate, expected charge.
    // Amounts are strings so that no float ever touches a currency value.
    public static function prorationProvider(): array {
        return [
            'full month' => ['2024-01-01', '2024-01-01', '2024-01-31', '100.00', '100.00'],
            'move in mid January' => ['2024-01-17', '2024-01-01', '2024-01-31', '100.00', '48.39'],
            'move in last day' => ['2024-01-31', '2024-01-01', '2024-01-31', '100.00', '3.23'],
            'February leap year' => ['2024-02-15', '2024-02-01', '2024-02-29', '100.00', '51.72'],
            'February non-leap' => ['2023-02-15', '2023-02-01', '2023-02-28', '100.00', '50.00'],
            '30-day cycle mid' => ['2024-01-17', '2024-01-01', '2024-01-30', '100.00', '46.67'],
        ];
    }
}
The Money value object is important here: never use floats for currency in tests or production code. Binary floats cannot represent most decimal fractions exactly (in PHP, 0.1 + 0.2 === 0.3 is false), and those rounding errors compound across invoices. Use a Money class backed by integer cents, or the brick/money package, which handles multi-currency and rounding modes correctly.
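To show what integer-cent proration can look like, here is a minimal sketch that reproduces the expected values in the data provider. It assumes inclusive day counting (move-in day and period end both billed) and half-up rounding; the function names inclusiveDays and prorateCents are hypothetical and are not the ProrationCalculator's actual internals.

```php
<?php
declare(strict_types=1);

/** Days from $from through $to, counting both endpoints. */
function inclusiveDays(DateTimeImmutable $from, DateTimeImmutable $to): int
{
    // diff()->days counts the gaps between dates; add 1 to include the first day.
    return $from->diff($to)->days + 1;
}

/** Prorated charge in cents, rounded half up using integer arithmetic only. */
function prorateCents(
    int $monthlyRateCents,
    DateTimeImmutable $moveIn,
    DateTimeImmutable $periodStart,
    DateTimeImmutable $periodEnd,
): int {
    $billableDays = inclusiveDays($moveIn, $periodEnd);
    $periodDays = inclusiveDays($periodStart, $periodEnd);

    // For non-negative integers, round(a / b) half up equals floor((2a + b) / 2b).
    $numerator = $monthlyRateCents * $billableDays;
    return intdiv(2 * $numerator + $periodDays, 2 * $periodDays);
}

$utc = new DateTimeZone('UTC');
$d = fn (string $date): DateTimeImmutable => new DateTimeImmutable($date, $utc);

echo prorateCents(10000, $d('2024-01-17'), $d('2024-01-01'), $d('2024-01-31')), "\n"; // 4839
echo prorateCents(10000, $d('2024-02-15'), $d('2024-02-01'), $d('2024-02-29')), "\n"; // 5172
echo prorateCents(10000, $d('2024-01-17'), $d('2024-01-01'), $d('2024-01-30')), "\n"; // 4667
```

Dropping the "+ 1" in inclusiveDays() is exactly the kind of off-by-one that goes unnoticed: the mid-January case would bill 14 days instead of 15.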
Testing Late Fee Calculation with Time Mocking
Late fee logic depends on the current date, which makes it the classic "how do I test time-dependent code" problem. I solve this with a ClockInterface that gets injected, with a real clock for production and a controllable fake for tests:
use PHPUnit\Framework\TestCase;

interface ClockInterface {
    public function now(): \DateTimeImmutable;
}

class FakeClock implements ClockInterface {
    public function __construct(private \DateTimeImmutable $now) {}

    // Lets a test move time forward (or backward) between assertions.
    public function setNow(\DateTimeImmutable $now): void {
        $this->now = $now;
    }

    public function now(): \DateTimeImmutable {
        return $this->now;
    }
}

class LateFeeCalculatorTest extends TestCase {
    public function testLateFeeAppliedAfterGracePeriod(): void {
        $clock = new FakeClock(new \DateTimeImmutable('2024-01-01'));
        $calculator = new LateFeeCalculator($clock, graceDays: 5);
        $invoice = Invoice::create(
            dueDate: new \DateTimeImmutable('2024-01-01'),
            amount: Money::fromString('100.00')
        );

        // Day 5 after the due date: still within the grace period, no late fee
        $clock->setNow(new \DateTimeImmutable('2024-01-06'));
        $this->assertEquals(Money::zero(), $calculator->calculateLateFee($invoice));

        // Day 6: grace period expired, 10% late fee applies
        $clock->setNow(new \DateTimeImmutable('2024-01-07'));
        $this->assertEquals(Money::fromString('10.00'), $calculator->calculateLateFee($invoice));
    }
}
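The test above does not show the calculator itself, so here is one way the clock-driven logic could look. This is a sketch under assumptions, not the real LateFeeCalculator: the names Clock, FixedClock, and SketchLateFeeCalculator are hypothetical, amounts are plain integer cents, and the 10% fee is a constructor default chosen to match the test.

```php
<?php
declare(strict_types=1);

interface Clock
{
    public function now(): DateTimeImmutable;
}

final class FixedClock implements Clock
{
    public function __construct(private DateTimeImmutable $now) {}

    public function now(): DateTimeImmutable
    {
        return $this->now;
    }
}

/** Hypothetical calculator: flat percentage fee once the grace period has passed. */
final class SketchLateFeeCalculator
{
    public function __construct(
        private Clock $clock,
        private int $graceDays,
        private int $feePercent = 10,
    ) {}

    public function lateFeeCents(DateTimeImmutable $dueDate, int $amountCents): int
    {
        $interval = $dueDate->diff($this->clock->now());
        // invert is 1 when "now" is before the due date, so nothing is overdue.
        $daysLate = $interval->invert === 1 ? 0 : $interval->days;

        if ($daysLate <= $this->graceDays) {
            return 0;
        }
        // intdiv truncates; a real implementation would pick an explicit rounding rule.
        return intdiv($amountCents * $this->feePercent, 100);
    }
}

$due = new DateTimeImmutable('2024-01-01');
$onDayFive = new SketchLateFeeCalculator(new FixedClock(new DateTimeImmutable('2024-01-06')), graceDays: 5);
$onDaySix = new SketchLateFeeCalculator(new FixedClock(new DateTimeImmutable('2024-01-07')), graceDays: 5);

echo $onDayFive->lateFeeCents($due, 10000), "\n"; // 0
echo $onDaySix->lateFeeCents($due, 10000), "\n";  // 1000
```

The boundary comparison (<= versus <) is the part worth testing on both sides, which is why the test checks day 5 and day 6 and not just "early" and "very late".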
Integration Tests vs Unit Tests for Financial Logic
My rule of thumb: unit tests for the math, integration tests for the workflow. The proration calculator, tax calculator, and late fee calculator have no external dependencies beyond an injected clock, so their unit tests are fast and exhaustive, and I can run 200 scenarios in under a second. The billing cycle workflow (pulling tenant records, calculating charges, calling the payment gateway, writing invoice records, updating balances) needs an integration test that exercises the full path through real database calls (against the test database) and the fake gateway.
The integration test for a billing cycle run is longer but catches a different class of bugs: database constraint violations when invoice records conflict, incorrect JOIN logic that produces duplicate charges, transaction isolation issues when two concurrent billing jobs process the same tenant. These bugs don't show up in unit tests of the individual components — they only emerge when the components interact.
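As an illustration of the duplicate-charge case, the sketch below lets a database constraint reject a second invoice for the same tenant and period. The schema, the UNIQUE (tenant_id, period) constraint, and the writeInvoice() helper are assumptions made for this example, and in-memory SQLite stands in for the MySQL test database.

```php
<?php
declare(strict_types=1);

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: at most one invoice per tenant per billing period.
$pdo->exec(
    'CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        tenant_id INTEGER NOT NULL,
        period TEXT NOT NULL,
        amount_cents INTEGER NOT NULL,
        UNIQUE (tenant_id, period)
    )'
);

/** Returns true if the invoice was written, false if one already existed. */
function writeInvoice(PDO $pdo, int $tenantId, string $period, int $amountCents): bool
{
    $stmt = $pdo->prepare(
        'INSERT INTO invoices (tenant_id, period, amount_cents) VALUES (?, ?, ?)'
    );
    try {
        $stmt->execute([$tenantId, $period, $amountCents]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint violation.
        if (str_starts_with((string) $e->getCode(), '23')) {
            return false;
        }
        throw $e;
    }
}

$first = writeInvoice($pdo, 1, '2024-01', 10000);
$second = writeInvoice($pdo, 1, '2024-01', 10000); // simulated second billing run

var_dump($first);  // bool(true)
var_dump($second); // bool(false)
```

An integration test built this way runs the billing cycle twice for the same period and asserts that the invoice count and the gateway's captured charges both stay at one.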
The investment in a thorough billing test suite pays back the first time you refactor the proration logic or upgrade the payment gateway SDK. Running the full suite takes about 45 seconds on my development machine. That 45 seconds catches the classes of errors that take days to find manually and potentially cost your client thousands of dollars in incorrect charges. It's not optional — it's part of the cost of building financial software responsibly.