
Debugging `CHIP_ERROR_INTERNAL` (0xAC) in ESP32 Matter Commissioning

Few things are more demoralizing in firmware development than a generic error code during a critical phase. In the ESP32 Matter ecosystem, CHIP_ERROR_INTERNAL (0xAC) during the PASE (Passcode Authenticated Session Establishment) or commissioning phase is a notorious showstopper.

You see the BLE connection succeed, the credentials exchange begin, and then the process hangs until the commissioner (Google Home, Apple Home, or chip-tool) throws 0xAC and disconnects.

This is almost never an internal memory error. It is usually a networking identity crisis caused by stale Thread Operational Datasets.

The Root Cause: Thread Dataset Mismatches

To understand why 0xAC occurs, you must look at how Matter commissioning hands over credentials.

  1. BLE Handshake: The phone connects to the ESP32 via BLE.
  2. Thread Provisioning: The phone sends the Thread Network credentials (PAN ID, Channel, Network Key) to the ESP32.
  3. Network Switch: The ESP32 spins up its OpenThread stack, joins the mesh, and shuts down BLE.
  4. Discovery (The Failure Point): The commissioner looks for the device's operational service via DNS-SD over IPv6 to complete commissioning over CASE. On Thread, the Border Router advertises that service over mDNS on the device's behalf.

The Problem: The ESP32 stores Thread credentials in Non-Volatile Storage (NVS). A normal idf.py flash rewrites the application image but does not erase the NVS partition, so the stored credentials survive a re-flash.

If your Border Router has rotated its Thread credentials (common with Apple Border Routers after updates) or if you previously commissioned the device to a different test network, the ESP32 might ignore the new credentials sent during PASE and cling to the old Active Operational Dataset stored in NVS.

The ESP32 thinks it is connected to a Thread network (the old one), but the Border Router is on a different one. The device's service announcement never reaches the Border Router, so nothing shows up in DNS-SD. The commissioner times out waiting for the device to announce itself, resulting in CHIP_ERROR_INTERNAL.
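To make the mismatch concrete, here is a minimal, host-side sketch that compares the dataset a device has stored with the one a commissioner offers. DatasetSummary and DifferingFields are hypothetical names invented for this illustration; a real otOperationalDataset carries more fields than the four modeled here.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a Thread Operational Dataset.
struct DatasetSummary {
    std::string networkName;
    std::uint16_t panId;
    std::uint8_t channel;
    std::array<std::uint8_t, 16> networkKey;
};

// Returns the names of the fields that differ between the stored dataset
// and the one offered by the commissioner. An empty result means both
// describe the same network (as far as this simplified model can tell).
inline std::vector<std::string> DifferingFields(const DatasetSummary &stored,
                                                const DatasetSummary &offered)
{
    std::vector<std::string> diff;
    if (stored.networkName != offered.networkName) diff.push_back("network name");
    if (stored.panId != offered.panId) diff.push_back("PAN ID");
    if (stored.channel != offered.channel) diff.push_back("channel");
    if (stored.networkKey != offered.networkKey) diff.push_back("network key");
    return diff;
}
```

Any non-empty result describes the situation above: the device and the Border Router are talking on two different networks, and no amount of waiting will let them find each other.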

The Fix: Implementing a Hard NVS Cleanse

While running idf.py erase-flash is a quick fix for the bench, it is not a production solution. You need a programmatic way to ensure your device enters a pristine state when a factory reset is requested or when commissioning repeatedly fails.

Below is a robust C++ implementation for handling Factory Resets within the ESP-Matter SDK (esp-matter). This code ensures that both the Matter KVS (Key-Value Store) and the underlying OpenThread persistent data are obliterated.

1. The Factory Reset Implementation

In your app_driver.cpp (or wherever you handle button events), implement a strictly defined reset handler. Do not rely solely on the default SDK handlers; we need to guarantee the Thread stack is wiped.

#include <app/server/Server.h>
#include <platform/CHIPDeviceLayer.h>
#include <platform/ESP32/ESP32Utils.h>
#include <esp_matter.h>
#include <esp_log.h>
#include <nvs_flash.h>

// specific OpenThread headers usually required for deep cleaning
#include <openthread/instance.h>
#include <openthread/platform/settings.h>

using namespace chip;
using namespace chip::DeviceLayer;

static const char *TAG = "FactoryReset";

void ScheduleFactoryReset()
{
    ESP_LOGW(TAG, "Initiating Factory Reset - Clearing NVS and Thread Datasets");

    // Schedule the reset on the Matter event loop to ensure thread safety
    PlatformMgr().ScheduleWork([](intptr_t arg) {
        
        // 1. Log whether the device was provisioned before the wipe
        //    (optional, useful when reading the logs after the restart)
        if (ConfigurationMgr().IsFullyProvisioned()) {
            ESP_LOGI(TAG, "Device was provisioned. Clearing credentials.");
        }

        // 2. Initiate the Standard Matter Factory Reset
        // This clears the Matter KVS (Fabric tables, ACLs, etc.)
        ConfigurationMgr().InitiateFactoryReset();
        
        // Note: The call above schedules a restart, but depending on the 
        // ESP-Matter version, it might not wipe the separate 'nvs' partition 
        // where custom data or specific OpenThread settings might live if 
        // configured outside the Matter KVS.
        
    }, 0);
}

// Helper to force-wipe NVS if the standard reset fails to clear ghost datasets.
// Call this only if you detect corruption or if 0xAC errors persist after a
// standard factory reset.
extern "C" void ForceNukeNVS()
{
    ESP_LOGE(TAG, "Performing Nuclear NVS Erase");
    
    // Erase the default NVS partition
    esp_err_t err = nvs_flash_erase();
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Failed to erase NVS: %s", esp_err_to_name(err));
    }
    
    // Crucial: depending on the SDK version and partition table, OpenThread
    // may keep its settings in a separate partition or namespace. Erasing
    // the default NVS partition covers it only when the partition table is
    // standard. If you have access to the OpenThread instance, wiping the
    // dataset through the OpenThread API is the safest route.
    
    esp_restart();
}
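The "commissioning repeatedly fails" trigger mentioned earlier needs some policy for deciding when enough is enough. The following is a minimal sketch of one such policy, not part of esp-matter: CommissioningFailureTracker and its threshold are hypothetical, and how failures are detected and where the count is persisted are left open.

```cpp
#include <cstdint>

// Hypothetical policy object: counts consecutive commissioning failures
// and reports when a wipe is due. The threshold is an assumption chosen
// by the caller, not a value taken from the SDK.
class CommissioningFailureTracker {
public:
    explicit CommissioningFailureTracker(std::uint8_t threshold) : mThreshold(threshold) {}

    // Call once per failed commissioning attempt.
    // Returns true when the failure count has reached the threshold.
    bool RecordFailure()
    {
        if (mFailures < UINT8_MAX) {
            ++mFailures; // saturate instead of wrapping around
        }
        return mFailures >= mThreshold;
    }

    // Call after a successful commissioning to start counting afresh.
    void RecordSuccess() { mFailures = 0; }

    std::uint8_t Failures() const { return mFailures; }

private:
    std::uint8_t mThreshold;
    std::uint8_t mFailures = 0;
};
```

In firmware, a true return value from RecordFailure() would be the point at which to call ScheduleFactoryReset(). Because this sketch keeps the counter in RAM, it only counts failures within a single boot; persisting it is a separate design decision.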

2. Validating the Thread State (Debugging)

If you are stuck in a reboot loop or constant 0xAC, you need to inspect what the device thinks its dataset is versus what the commissioner is sending.

Add this diagnostic function to your app_main.cpp to print the active Thread dataset (network name, PAN ID, and channel) on boot. If the same values persist across "Resets," your reset logic is flawed.

#include <platform/ThreadStackManager.h>
#include <openthread/dataset.h>
#include <esp_log.h>

void LogThreadDataset()
{
#if CONFIG_OPENTHREAD_ENABLED
    // ThreadStackMgr() and ThreadStackMgrImpl() live in chip::DeviceLayer.
    // Hold the stack lock while touching the OpenThread instance.
    chip::DeviceLayer::ThreadStackMgr().LockThreadStack();

    otInstance *instance = chip::DeviceLayer::ThreadStackMgrImpl().OTInstance();
    otOperationalDataset dataset;

    if (otDatasetGetActive(instance, &dataset) == OT_ERROR_NONE) {
        ESP_LOGI("ThreadDebug", "Active Dataset Present:");
        ESP_LOGI("ThreadDebug", "  Network Name: %s", dataset.mNetworkName.m8);
        ESP_LOGI("ThreadDebug", "  PAN ID: 0x%04x", dataset.mPanId);
        ESP_LOGI("ThreadDebug", "  Channel: %d", dataset.mChannel);
    } else {
        ESP_LOGI("ThreadDebug", "No Active Thread Dataset found (Clean State).");
    }

    chip::DeviceLayer::ThreadStackMgr().UnlockThreadStack();
#endif
}
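If a single value is easier to compare across boots than three separate fields, the logged fields can be folded into a short fingerprint. The sketch below uses 32-bit FNV-1a for that; DatasetFingerprint is a hypothetical helper, not an OpenThread or Matter API, and it deliberately leaves the network key out so that nothing derived from it ends up in the logs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a over a byte range, optionally continuing from a previous hash.
inline std::uint32_t Fnv1a(const void *data, std::size_t len,
                           std::uint32_t hash = 2166136261u)
{
    const auto *bytes = static_cast<const unsigned char *>(data);
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= bytes[i];
        hash *= 16777619u;
    }
    return hash;
}

// Hypothetical helper: folds network name, PAN ID, and channel into one
// value. Integers are serialized big-endian so the result does not depend
// on the host's byte order.
inline std::uint32_t DatasetFingerprint(const std::string &networkName,
                                        std::uint16_t panId, std::uint16_t channel)
{
    const unsigned char fixed[4] = {
        static_cast<unsigned char>(panId >> 8),
        static_cast<unsigned char>(panId & 0xFF),
        static_cast<unsigned char>(channel >> 8),
        static_cast<unsigned char>(channel & 0xFF),
    };
    const std::uint32_t hash = Fnv1a(networkName.data(), networkName.size());
    return Fnv1a(fixed, sizeof(fixed), hash);
}
```

On the device, the three arguments would come from dataset.mNetworkName.m8, dataset.mPanId, and dataset.mChannel. A fingerprint that is identical before and after a "reset" tells you the old dataset survived.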

Why This Fix Works

The 0xAC error in this context is a synchronization failure.

When ConfigurationMgr().InitiateFactoryReset() is called, the Matter stack (connectedhomeip) runs the ESP32 platform's factory-reset routine, which clears the Matter configuration stored in NVS. However, depending on your partitions.csv layout and how OpenThread was initialized (specifically if CONFIG_OPENTHREAD_SETTINGS_RAM is disabled and it uses NVS), remnants of the Thread network info can remain.

By validating the dataset on boot and ensuring a full NVS erase (or using idf.py erase-flash during development), you force the OpenThread stack to start with an OT_ERROR_NOT_FOUND on its dataset.

When the commissioning process reaches the Thread Provisioning stage, the device will immediately accept the new Operational Dataset provided by the phone/commissioner, attach to the correct Border Router, and successfully broadcast its mDNS service.
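One way to turn "validate the dataset on boot" into a rule is sketched below. It assumes, and this is an assumption rather than anything the SDK enforces, that an active Thread dataset on a device with no commissioned fabric is left over from an earlier life and should be wiped. BootAction and DecideBootAction are hypothetical names.

```cpp
// Hypothetical boot-time decision, kept free of SDK types so it can be
// unit-tested on a host machine.
enum class BootAction { Continue, WipeStaleDataset };

// hasActiveDataset: the Thread stack reports a stored Active Operational Dataset.
// hasFabric:        the device currently belongs to at least one Matter fabric.
constexpr BootAction DecideBootAction(bool hasActiveDataset, bool hasFabric)
{
    // A dataset without a fabric cannot belong to a live commissioning,
    // so under the stated assumption it is treated as stale.
    return (hasActiveDataset && !hasFabric) ? BootAction::WipeStaleDataset
                                            : BootAction::Continue;
}
```

On the device, the first input could come from the same otDatasetGetActive() check used in LogThreadDataset, and the second from the Matter fabric table. Whether this rule is too aggressive for your product, for example if the device can reboot in the middle of commissioning, is something to verify before shipping it.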

Summary

  1. Stop guessing. 0xAC usually means your device is on the wrong Thread network.
  2. Verify NVS. During development, always use idf.py -p PORT erase-flash before flashing if you encounter commissioning issues.
  3. Code for Resilience. Implement the LogThreadDataset function to verify your device's state on boot. If you see a Network Name you didn't expect, your reset logic is failing.