Troubleshooting Kubernetes Deployment on Huawei Cloud EulerOS (CCE)

 Scaling microservices on Huawei Cloud Container Engine (CCE) should be a seamless operation. However, when utilizing EulerOS node pools under heavy load, teams frequently encounter a critical Huawei Cloud deployment error: node autoscaling triggers successfully, but new Pods remain stuck in a ContainerCreating or Pending state. Examining the cluster events inevitably reveals Container Network Interface (CNI) IP pool exhaustion.

This failure cascade severely impacts enterprise cloud DevOps pipelines. This article breaks down the architectural constraints causing this issue on Huawei Cloud CCE and provides a definitive, code-backed solution to stabilize your cluster scaling.

The Root Cause: VPC-CNI Architecture and Subnet Exhaustion

To understand why IP exhaustion occurs when running Kubernetes on EulerOS within CCE, we must look at the underlying network models. CCE offers several networking modes, but high-performance enterprise environments typically utilize Cloud Native Network 2.0 (VPC-CNI), powered by Huawei's Yangtse CNI plugin.

In VPC-CNI mode, every Pod is allocated an IP address directly from the Virtual Private Cloud (VPC) subnet. The Yangtse CNI achieves this by attaching Elastic Network Interfaces (ENIs) to the EulerOS worker nodes.

The exhaustion happens due to IP pre-allocation (IP warming). To ensure fast Pod startup times, the CNI pre-allocates a pool of IP addresses to each ENI attached to a node. If your node pool is configured with a default maxPods value of 128, the CCE control plane attempts to reserve a significant chunk of IPs from the subnet the moment the node spins up.

If your underlying subnet CIDR is a /24 (256 addresses on paper, a handful of which are reserved by the VPC service), a single autoscaled EulerOS node will instantly consume over half of the available IPs. When the cluster autoscaler adds a second node, the subnet is exhausted, the CNI fails to bind IPs, and deployments halt. Furthermore, the default EulerOS kernel parameters for connection tracking (conntrack) and ARP caching are not tuned for the density required by large-scale microservice architectures.
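The arithmetic is easy to verify yourself. A minimal sketch (the reserved-address count of 5 is illustrative; the exact number depends on the VPC service):

```shell
# Back-of-envelope capacity math for the scenario above. A /24 yields 256
# addresses on paper, but network/broadcast/gateway reservations shrink the
# usable pool, so two warmed-up nodes at maxPods=128 can never fit.
subnet_size() { echo $((1 << (32 - $1))); }   # raw addresses in a /N block

MAX_PODS=128
NEEDED=$((2 * MAX_PODS))                      # IPs two pre-warmed nodes reserve
USABLE=$(( $(subnet_size 24) - 5 ))           # illustrative reserved count

[ "$NEEDED" -gt "$USABLE" ] && echo "/24 exhausted by the second node"
echo "/18 raw capacity: $(subnet_size 18) addresses"
```

The same helper shows why the /18 recommended below is comfortable: at a capped maxPods of 64, it leaves room for hundreds of nodes.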

Step-by-Step Resolution

To permanently resolve CNI exhaustion and node scaling failures, you must decouple Node IPs from Pod IPs using a dedicated container subnet and tune the EulerOS network stack during the node bootstrap phase.

Step 1: Provision a Dedicated Container Subnet

Do not use the same subnet for your EulerOS instances and your Pods. We will use Terraform to provision a secondary subnet with a massive CIDR block (e.g., /18) specifically designated for the CNI.

# main.tf
terraform {
  required_providers {
    huaweicloud = {
      source  = "huaweicloud/huaweicloud"
      version = ">= 1.50.0"
    }
  }
}

# Primary Subnet for EulerOS Nodes
resource "huaweicloud_vpc_subnet" "node_subnet" {
  name       = "cce-node-subnet"
  cidr       = "10.0.1.0/24"
  gateway_ip = "10.0.1.1"
  vpc_id     = var.vpc_id
}

# Secondary Subnet strictly for VPC-CNI Pod allocation
resource "huaweicloud_vpc_subnet" "pod_subnet" {
  name       = "cce-pod-subnet"
  cidr       = "10.1.0.0/18" # Provides 16,384 IPs
  gateway_ip = "10.1.0.1"
  vpc_id     = var.vpc_id
}

Step 2: Configure the CCE Node Pool Autoscaler

Next, configure the CCE node pool to strictly utilize this secondary subnet for Pod ENIs. We must also set a realistic max_pods limit based on the actual microservice density expected per node, preventing aggressive CNI over-allocation.

resource "huaweicloud_cce_node_pool" "euler_microservices_pool" {
  cluster_id               = huaweicloud_cce_cluster.main.id
  name                     = "euler-production-pool"
  os                       = "EulerOS 2.9"
  initial_node_count       = 3
  flavor_id                = "c7.2xlarge.2" # 8 vCPUs, 16GB RAM
  
  # Adjust max pods to prevent aggressive IP reservation per node
  max_pods                 = 64 

  network {
    subnet_id = huaweicloud_vpc_subnet.node_subnet.id
  }

  data_volumes {
    size       = 100
    volumetype = "SSD"
  }

  # Link the dedicated Pod Subnet to the Container Network
  # This requires the cluster to be configured in Cloud Native Network 2.0 mode
  extension_template = jsonencode({
    "network" : {
      "vpcCni" : {
        "eniSubnetId" : huaweicloud_vpc_subnet.pod_subnet.id
      }
    }
  })

  # Inject Kernel Tuning via Post-Install Script
  postinstall = base64encode(file("${path.module}/scripts/euler-tune.sh"))
}

Step 3: Tune EulerOS Kernel Parameters

EulerOS is a highly secure, enterprise-grade Linux distribution, but its default sysctl configurations act as a bottleneck when handling the high network churn of container creation and destruction.

Create the euler-tune.sh script referenced in the Terraform configuration above. This script modifies the kernel ARP cache thresholds and connection tracking limits during the node bootstrap phase.

#!/bin/bash
# scripts/euler-tune.sh
set -e

# Define required sysctl parameters for high-density Kubernetes on EulerOS
cat <<EOF > /etc/sysctl.d/99-kubernetes-cni.conf
# Increase max connection tracking for microservice RPC traffic
net.netfilter.nf_conntrack_max=1048576

# Tune ARP cache thresholds to prevent 'neighbor table overflow' errors
net.ipv4.neigh.default.gc_thresh1=8192
net.ipv4.neigh.default.gc_thresh2=32768
net.ipv4.neigh.default.gc_thresh3=65536

# Prevent TCP TIME_WAIT exhaustion
net.ipv4.tcp_max_tw_buckets=262144
net.ipv4.tcp_tw_reuse=1
EOF

# Ensure the conntrack module is loaded so the nf_conntrack_max key resolves
modprobe nf_conntrack || true

# Apply the parameters immediately; no service restart is required,
# and bouncing NetworkManager mid-bootstrap can disrupt ENI attachment
sysctl --system
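After a node boots, it is worth sanity-checking that the conf file actually carries the expected minimums before trusting the node with production traffic. A small sketch of such a check (run against /etc/sysctl.d/99-kubernetes-cni.conf on a real node; here it checks a local copy so the snippet is self-contained):

```shell
# Sanity check (sketch): confirm the sysctl conf carries the expected minimums.
CONF=$(mktemp)
printf '%s\n' \
  'net.netfilter.nf_conntrack_max=1048576' \
  'net.ipv4.neigh.default.gc_thresh3=65536' > "$CONF"

conf_min() {  # conf_min <file> <key> <minimum> -> exit 0 if key >= minimum
  val=$(sed -n "s/^$2=//p" "$1")
  [ -n "$val" ] && [ "$val" -ge "$3" ]
}

conf_min "$CONF" net.netfilter.nf_conntrack_max 1048576 && echo "conntrack OK"
conf_min "$CONF" net.ipv4.neigh.default.gc_thresh3 65536 && echo "gc_thresh3 OK"
```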

Deep Dive: Why This Fix Works

When managing enterprise cloud DevOps architectures, isolating fault domains is critical. By creating a dedicated pod_subnet (/18), we expand the IP capacity from 256 addresses to 16,384.

The Yangtse CNI plugin allocates an ENI per worker node and assigns secondary IPs to that ENI. When a Pod is scheduled, it is rapidly bound to one of these pre-warmed secondary IPs. By lowering max_pods to 64, we force the CNI to allocate a smaller batch of IPs per node, reducing subnet fragmentation.

Furthermore, EulerOS handles network bridging efficiently, but the Linux kernel maintains an ARP cache mapping IP addresses to MAC addresses. In a highly dynamic cluster where Pods are constantly scaling up and down, the default ARP table (gc_thresh3 set to 1024) overflows rapidly. Bumping this to 65536 ensures the EulerOS kernel can track thousands of ephemeral Pod networking interfaces without dropping packets.
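A rough sizing exercise shows why the default threshold is inadequate. The node and Pod counts below are illustrative; the worst case assumes a node ends up resolving every Pod IP in the shared subnet:

```shell
# Rough neighbor-table sizing for the tuned thresholds above.
NODES=100
MAX_PODS=64
PEAK_ENTRIES=$((NODES * MAX_PODS))   # worst case: one ARP entry per Pod IP

[ "$PEAK_ENTRIES" -gt 1024 ]  && echo "default gc_thresh3 (1024) overflows"
[ "$PEAK_ENTRIES" -le 65536 ] && echo "tuned gc_thresh3 (65536) has headroom"
```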

Common Pitfalls and Edge Cases

Security Group Constraints

When utilizing separate subnets for Nodes and Pods, a common error is failing to update Security Group rules. The Pod subnet (10.1.0.0/18) must be explicitly permitted to communicate with the Node subnet (10.0.1.0/24) on all ports within the Huawei Cloud VPC security groups. Failure to do so will result in CoreDNS timeouts, as Pods will be unable to reach the Kubernetes service IPs managed by kube-proxy on the nodes.
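A sketch of the missing rule in Terraform, assuming the huaweicloud provider's huaweicloud_networking_secgroup_rule resource (the variable name is hypothetical; omitting protocol and ports allows all traffic, so verify the attribute set against your provider version before applying):

```hcl
# Sketch: permit the dedicated Pod subnet to reach the node security group
resource "huaweicloud_networking_secgroup_rule" "allow_pod_subnet" {
  security_group_id = var.node_secgroup_id   # assumed variable for the node SG
  direction         = "ingress"
  ethertype         = "IPv4"
  remote_ip_prefix  = "10.1.0.0/18"          # the dedicated Pod subnet
}
```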

VPC Peering Subnet Overlaps

If your architecture relies on VPC Peering to communicate with legacy databases or on-premises networks via Huawei Cloud Direct Connect, ensure your massive /18 Pod subnet does not overlap with any routed corporate networks. Because VPC-CNI Pods use natively routable VPC IPs, an overlap will create asymmetric routing failures that are incredibly difficult to diagnose at the application layer.
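Because the overlap surfaces only as mysterious routing failures, it is worth catching it mechanically before provisioning, e.g., in a CI gate over your planned CIDRs. A minimal pure-shell check: two IPv4 blocks overlap exactly when, masked with the shorter prefix of the pair, their network addresses are equal. The corporate range below is illustrative.

```shell
# Minimal IPv4 CIDR overlap check in plain shell.
ip_to_int() {
  old_ifs=$IFS; IFS=.; set -- $1; IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

overlaps() {  # overlaps <cidr-a> <cidr-b> -> prints yes/no
  na=$(ip_to_int "${1%/*}"); pa=${1#*/}
  nb=$(ip_to_int "${2%/*}"); pb=${2#*/}
  p=$(( pa < pb ? pa : pb ))                 # shorter (wider) prefix wins
  mask=$(( p == 0 ? 0 : (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
  [ $(( na & mask )) -eq $(( nb & mask )) ] && echo yes || echo no
}

overlaps 10.1.0.0/18 10.0.1.0/24    # prints "no"  (Pod vs Node subnet: safe)
overlaps 10.1.0.0/18 10.1.32.0/24   # prints "yes" (a peered range colliding)
```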

Conclusion

Encountering a Huawei Cloud deployment error related to IP exhaustion is a rite of passage when scaling enterprise clusters. By architecting your networking with dedicated Pod subnets, explicitly limiting your maxPods parameter, and utilizing bootstrapping scripts to tune EulerOS kernel parameters, you ensure your Huawei Cloud CCE infrastructure is resilient, highly scalable, and ready for production workloads.