There's a specific kind of challenge that comes up repeatedly in the storage management software space: a facility operator is running a billing system that nobody fully understands anymore. The original developer is gone. The source code is incomplete, out of date, or missing entirely. The database schema was never documented. But the system is live, it processes payments, and tenants are depending on it. Your job is to reverse engineer it, document it, and build something better, all without breaking the facility in the process.

I've done this more than once. Here's the methodology that works.

Phase 1: Black-Box Behavioral Mapping

Before touching the database or intercepting any traffic, spend time as a user of the system. Create test accounts, move tenants through complete lifecycle flows, and document every observable behavior. What happens when a payment fails? What does the system do on the first of the month? What edge cases produce error messages? What inputs change what outputs?

Build a behavioral inventory document that maps inputs to outputs for every major operation:

  • Move-in: what fields are required, what gets created, what emails fire
  • Renewal: when it runs, what it produces, what changes in the UI
  • Payment posting: what changes in balance, what confirmation appears
  • Move-out: proration calculation method, deposit handling, what records remain
  • Rate changes: how they propagate to existing tenants vs new move-ins

This behavioral map becomes your test suite specification. Every behavior you document here becomes a test case that your replacement system must pass before you migrate the first real tenant.
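
One way to make the inventory executable is to record each observed behavior as a data row and replay it against any implementation. The sketch below is only an illustration under stated assumptions: the case names, the input fields, and the `prorateMoveOut` function with its daily proration rule are hypothetical stand-ins for whatever you actually observe in the legacy system.

```php
<?php
declare(strict_types=1);

/**
 * Replays recorded behaviors against an implementation and returns the
 * names of the cases whose output differs from what was observed.
 *
 * @param callable $system implementation under test, takes the input array
 * @param array<string, array{input: array, expected: mixed}> $cases
 * @return list<string>
 */
function failingCases(callable $system, array $cases): array
{
    $failures = [];
    foreach ($cases as $name => $case) {
        if ($system($case['input']) !== $case['expected']) {
            $failures[] = $name;
        }
    }
    return $failures;
}

// Hypothetical candidate implementation: daily proration in integer cents.
function prorateMoveOut(array $in): int
{
    return intdiv($in['monthly_rent_cents'] * $in['days_occupied'], $in['days_in_month']);
}

// Each row is one observation from the legacy UI (the values here are made up).
$cases = [
    'move-out mid-month' => [
        'input'    => ['monthly_rent_cents' => 9000, 'days_occupied' => 15, 'days_in_month' => 30],
        'expected' => 4500,
    ],
    'move-out on last day' => [
        'input'    => ['monthly_rent_cents' => 9000, 'days_occupied' => 30, 'days_in_month' => 30],
        'expected' => 9000,
    ],
];

$failures = failingCases('prorateMoveOut', $cases);
```

An empty `$failures` list means the candidate reproduces every recorded behavior. Keeping the cases as plain data makes it cheap to add a row each time the legacy system surprises you.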

Phase 2: Database Schema Reverse Engineering

If you have direct database access, start with the information schema. The queries below assume MySQL or MariaDB. Dump the full structure and look for patterns:

-- Get all tables with engine and row counts (TABLE_ROWS is only an estimate on InnoDB)
SELECT
    t.TABLE_NAME,
    t.ENGINE,
    t.TABLE_ROWS,
    t.CREATE_TIME,
    t.UPDATE_TIME,
    t.TABLE_COMMENT
FROM information_schema.TABLES t
WHERE t.TABLE_SCHEMA = 'legacy_billing_db'
ORDER BY t.TABLE_ROWS DESC;

-- Get all columns with types and constraints
SELECT
    c.TABLE_NAME,
    c.COLUMN_NAME,
    c.DATA_TYPE,
    c.COLUMN_TYPE,
    c.IS_NULLABLE,
    c.COLUMN_DEFAULT,
    c.COLUMN_COMMENT
FROM information_schema.COLUMNS c
WHERE c.TABLE_SCHEMA = 'legacy_billing_db'
ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION;

-- Get all foreign keys
SELECT
    kcu.TABLE_NAME,
    kcu.COLUMN_NAME,
    kcu.REFERENCED_TABLE_NAME,
    kcu.REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE kcu
WHERE kcu.TABLE_SCHEMA = 'legacy_billing_db'
  AND kcu.REFERENCED_TABLE_NAME IS NOT NULL;

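Sort tables by row count descending. High-row-count tables with names like trans, hist, log, or ledg are usually your billing ledgers. Tables with a few hundred rows that show up in many joins are configuration or lookup tables. Tables with exactly one row per facility or one row per user account are settings tables; verify those with an exact COUNT(*) before relying on the number. If the foreign key query returns nothing, which is common on older MyISAM schemas because that engine does not support foreign key constraints, infer relationships from column names ending in _id instead.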
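
The naming and row-count heuristic can be turned into a rough first-pass classifier over the table list. This is a sketch, not a rule: the function name, the role labels, the `$facilityCount` parameter, and the 1,000-row cutoff are all assumptions to tune per database, and a fragment like "log" will also match unrelated names such as "catalog".

```php
<?php
declare(strict_types=1);

/**
 * First-pass guess at a table's role from its name and approximate row count.
 * The name fragments come from the heuristic in the text; the cutoff is an
 * arbitrary assumption.
 */
function guessTableRole(string $tableName, int $approxRows, int $facilityCount = 1, int $cutoff = 1000): string
{
    $name = strtolower($tableName);
    if ($approxRows >= $cutoff) {
        foreach (['trans', 'hist', 'log', 'ledg'] as $fragment) {
            if (str_contains($name, $fragment)) {
                return 'ledger-candidate';
            }
        }
        return 'unclassified';
    }
    // At most one row per facility suggests a settings table.
    if ($approxRows <= $facilityCount) {
        return 'settings-candidate';
    }
    return 'lookup-candidate';
}

// Hypothetical table names and row counts, as returned by the TABLES query.
$tables = ['tenant_trans' => 184203, 'unit_types' => 240, 'facility_settings' => 1, 'tenants' => 5200];
$roles = [];
foreach ($tables as $table => $rows) {
    $roles[$table] = guessTableRole($table, $rows);
}
```

Treat the output as a reading order for the schema, not as a conclusion. Every candidate still needs a look at its columns and sample rows.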

Look for the pattern of "balance stored in a column" vs "balance derived from transaction rows." Legacy systems often store the balance as a column and update it with triggers or application-level code. Check information_schema.TRIGGERS for triggers on that table first; whatever a trigger does not explain lives in application code. Finding that column and tracing every code path that writes to it is how you discover the business logic.
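
A quick way to find write paths that bypass the ledger is to compare each stored balance with the sum of that account's transaction rows. The sketch below works on plain arrays so it stays independent of the schema. The field names `account_id` and `amount_cents` and the sign convention (charges positive, payments negative) are assumptions, not something the legacy database is known to use.

```php
<?php
declare(strict_types=1);

/**
 * Compares each stored balance with the balance implied by the ledger rows.
 * Amounts are integer cents; charges positive, payments negative (assumed).
 *
 * @param array<int, int> $storedBalances account id => stored balance
 * @param list<array{account_id: int, amount_cents: int}> $ledgerRows
 * @return array<int, int> account id => stored minus derived, drifting accounts only
 */
function balanceDrift(array $storedBalances, array $ledgerRows): array
{
    $derived = array_fill_keys(array_keys($storedBalances), 0);
    foreach ($ledgerRows as $row) {
        $id = $row['account_id'];
        $derived[$id] = ($derived[$id] ?? 0) + $row['amount_cents'];
    }

    $drift = [];
    foreach ($derived as $accountId => $sum) {
        $delta = ($storedBalances[$accountId] ?? 0) - $sum;
        if ($delta !== 0) {
            $drift[$accountId] = $delta;
        }
    }
    return $drift;
}
```

Accounts that show drift are the interesting ones: each one points at a fee, adjustment, or manual correction that changed the balance without writing a ledger row.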

Phase 3: API Traffic Capture

If the legacy system has a web interface or exposes an API to other systems (like a facility management platform calling a payment processor), capture that traffic. A transparent proxy sitting between the legacy system and its dependencies gives you a complete picture of the actual API contracts in play.

For PHP applications, I add a logging wrapper to the HTTP client layer rather than deploying a proxy:

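class LoggingHttpClient implements HttpClientInterface
{
    public function __construct(
        private HttpClientInterface $inner,
        private RequestLogRepository $log,
    ) {}

    public function request(string $method, string $url, array $options = []): Response
    {
        $startTime = microtime(true);
        $entry = [
            'method'       => $method,
            'url'          => $url,
            // Payment payloads can contain card data; redact before storing.
            'request_body' => $options['body'] ?? null,
            'captured_at'  => date('Y-m-d H:i:s'),
        ];

        try {
            $response = $this->inner->request($method, $url, $options);
        } catch (\Throwable $e) {
            // Failed calls are part of the contract too, so log them before rethrowing.
            $entry['error']      = $e->getMessage();
            $entry['elapsed_ms'] = (int) round((microtime(true) - $startTime) * 1000);
            $this->log->record($entry);
            throw $e;
        }

        $this->log->record($entry + [
            'response_status' => $response->getStatusCode(),
            // Cast so a stream body is stored as text rather than as an object.
            'response_body'   => (string) $response->getBody(),
            'elapsed_ms'      => (int) round((microtime(true) - $startTime) * 1000),
        ]);

        return $response;
    }
}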

Run this for 2–4 weeks against real production traffic, and make sure the window includes at least one month-end so the billing run is captured. You'll discover undocumented API endpoints, implicit field requirements, response formats that differ from documentation, and edge cases that only appear during specific billing events like month-end or rate increase runs.
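
Once the log has filled up, a small summary pass makes the implicit contracts visible. The sketch below groups captured records by method and path, and collects the top-level field names seen in JSON request bodies. It assumes the records carry the `method`, `url`, and `request_body` keys logged by the wrapper, that bodies are JSON strings, and that numeric path segments are record IDs; the function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Groups captured requests by method and path template, and collects the
 * top-level JSON field names seen in request bodies for each group.
 *
 * @param list<array{method: string, url: string, request_body: ?string}> $records
 * @return array<string, array{count: int, fields: list<string>}>
 */
function summarizeEndpoints(array $records): array
{
    $summary = [];
    foreach ($records as $r) {
        $path = parse_url($r['url'], PHP_URL_PATH) ?: '/';
        // Collapse numeric segments so /tenants/17 and /tenants/42 group together.
        $template = preg_replace('#/\d+(?=/|$)#', '/{id}', $path);
        $key = strtoupper($r['method']) . ' ' . $template;

        $summary[$key] ??= ['count' => 0, 'fields' => []];
        $summary[$key]['count']++;

        // Non-JSON or empty bodies decode to null and are skipped.
        $body = json_decode($r['request_body'] ?? '', true);
        if (is_array($body)) {
            $names  = array_filter(array_keys($body), 'is_string');
            $fields = array_unique(array_merge($summary[$key]['fields'], $names));
            sort($fields);
            $summary[$key]['fields'] = $fields;
        }
    }
    ksort($summary);
    return $summary;
}
```

Endpoints that appear here but not in any documentation, and fields that appear in every request for an endpoint, are the first things to write down as contract requirements.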

Phase 4: Building a Compatibility Layer

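Rather than doing a hard cutover from legacy to replacement, build a compatibility layer that lets both systems run simultaneously during migration. The compatibility layer intercepts operations destined for the legacy system and mirrors them to the new system, comparing results. While the replacement runs in shadow mode it must not trigger real side effects: point it at sandbox payment gateways and suppress outbound emails, or every payment will be charged and confirmed twice.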

class DualWriteBillingService
{
    public function __construct(
        private LegacyBillingService $legacy,
        private NewBillingService    $replacement,
        private MismatchLogger       $mismatchLog,
        private bool                 $replacementIsActive = false,
    ) {}

    public function postPayment(PaymentRequest $request): PaymentResult
    {
        // Always execute on the legacy system; it is still the source of truth
        $legacyResult = $this->legacy->postPayment($request);

        // Execute on the new system and compare; $newResult stays null if it throws
        $newResult = null;
        try {
            $newResult = $this->replacement->postPayment($request);
            $this->compareResults('postPayment', $legacyResult, $newResult, $request);
        } catch (\Throwable $e) {
            $this->mismatchLog->logError('postPayment', $request, $e);
        }

        // Return the legacy result until the flag is flipped, and fall back to it
        // whenever the replacement failed
        return ($this->replacementIsActive && $newResult !== null) ? $newResult : $legacyResult;
    }

    private function compareResults(string $operation, $legacy, $new, $context): void
    {
        $legacyNorm = $this->normalizeResult($legacy);
        $newNorm    = $this->normalizeResult($new);

        if ($legacyNorm !== $newNorm) {
            $this->mismatchLog->log([
                'operation' => $operation,
                'context'   => $context,
                'legacy'    => $legacyNorm,
                'new'       => $newNorm,
                'timestamp' => date('Y-m-d H:i:s'),
            ]);
        }
    }

    // Minimal normalizer, assuming results are JSON-serializable. Extend it to
    // drop fields that legitimately differ between systems (IDs, timestamps).
    private function normalizeResult($result): array
    {
        $data = (array) json_decode(json_encode($result), true);
        ksort($data);
        return $data;
    }
}

This pattern lets you run the replacement system in shadow mode for weeks, accumulating a mismatch log. Every mismatch is a discovered behavioral difference. Work through that log until it's empty, then flip $replacementIsActive to true for a soft cutover while keeping the legacy system as an instant rollback option.
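
A mismatch log with thousands of entries usually contains only a handful of distinct behavioral differences. The sketch below groups entries by operation and by which fields differ, so the most frequent difference can be fixed first. It assumes each entry has the `operation`, `legacy`, and `new` keys written by the mismatch logger and that the normalized results are flat arrays; both function names are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Returns the keys whose values differ between two normalized results,
 * including keys present on only one side.
 */
function differingFields(array $legacy, array $new): array
{
    $keys = array_unique(array_merge(array_keys($legacy), array_keys($new)));
    $diff = [];
    foreach ($keys as $key) {
        $inBoth = array_key_exists($key, $legacy) && array_key_exists($key, $new);
        if (!$inBoth || $legacy[$key] !== $new[$key]) {
            $diff[] = $key;
        }
    }
    sort($diff);
    return $diff;
}

/**
 * Counts mismatch-log entries per "operation: field,field" signature,
 * most frequent first.
 *
 * @return array<string, int>
 */
function triageMismatches(array $entries): array
{
    $counts = [];
    foreach ($entries as $e) {
        $signature = $e['operation'] . ': ' . implode(',', differingFields($e['legacy'], $e['new']));
        $counts[$signature] = ($counts[$signature] ?? 0) + 1;
    }
    arsort($counts);
    return $counts;
}
```

Each signature typically maps to one undocumented rule in the legacy system, such as a rounding direction or a fee that is applied implicitly.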

Phase 5: Risk Management During Migration

The highest-risk period in any billing migration is the window where data exists in both systems. Maintain a clear rule: legacy system is the source of truth for financial records until the migration is fully complete and validated. Any reconciliation discrepancy during the shadow phase is investigated against the legacy system's records, not the new system's.

Create a reconciliation report that runs daily during migration, comparing key financial totals between systems:

  • Total payments posted today: legacy total vs new system total
  • Count of active leases: both systems should match exactly
  • Sum of all outstanding balances: should match within rounding tolerance
  • Count of failed renewal queue entries: both should show same tenants in arrears

Set alert thresholds. Any discrepancy above $1.00 in total daily payment volume triggers an immediate investigation before the next business day. Facilities cannot have their books differ between systems by more than rounding error.
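
The daily comparison itself can be very small. The sketch below takes one array of totals per system and returns only the metrics that exceed their tolerance. The metric names, the sample values, and the 5-cent balance tolerance are assumptions; the 100-cent payment tolerance mirrors the $1.00 threshold above. Money is kept in integer cents so the comparison is exact.

```php
<?php
declare(strict_types=1);

/**
 * Compares daily totals from both systems and returns the metrics whose
 * absolute difference exceeds the allowed tolerance (default 0).
 * A metric missing from the replacement is treated as 0.
 *
 * @param array<string, int> $legacy    metric name => value from legacy
 * @param array<string, int> $new       metric name => value from replacement
 * @param array<string, int> $tolerance metric name => allowed absolute difference
 * @return array<string, array{legacy: int, new: int, diff: int}>
 */
function reconcile(array $legacy, array $new, array $tolerance = []): array
{
    $alerts = [];
    foreach ($legacy as $metric => $legacyValue) {
        $newValue = $new[$metric] ?? 0;
        $diff = $newValue - $legacyValue;
        if (abs($diff) > ($tolerance[$metric] ?? 0)) {
            $alerts[$metric] = ['legacy' => $legacyValue, 'new' => $newValue, 'diff' => $diff];
        }
    }
    return $alerts;
}

// Hypothetical totals for one day.
$alerts = reconcile(
    ['payments_posted_cents' => 1254300, 'active_leases' => 412, 'outstanding_balance_cents' => 3380150],
    ['payments_posted_cents' => 1254300, 'active_leases' => 411, 'outstanding_balance_cents' => 3380152],
    ['payments_posted_cents' => 100, 'outstanding_balance_cents' => 5],
);
```

In this example only `active_leases` is reported: the payment totals match, the balance difference of 2 cents is inside the tolerance, and lease counts have no tolerance at all.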

The hardest part of reverse engineering legacy billing isn't the technical work; it's managing the facility operator's anxiety during the process. The system they're running has been processing their revenue for years, even if it's technically flawed. Build trust incrementally: shadow mode first, then single-facility pilot, then portfolio rollout. Every successful reconciliation report is evidence that the replacement is working correctly, and that evidence is what gives the operator confidence to complete the migration.