Bringing Intel TDX Support to Alioth


I’m excited to share a significant milestone in my hobby project, Alioth, a from-scratch Virtual Machine Monitor (VMM) written in Rust. In my quest to explore the frontiers of confidential computing, I’ve just landed initial support for Intel Trust Domain Extensions (TDX). This post will walk you through the journey, the challenges I encountered, and the interesting discoveries I made along the way.

For those unfamiliar with Alioth, it’s my personal endeavor to build a lightweight and modern VMM, similar in spirit to QEMU but with a much smaller scope. The goal is to learn and experiment with virtualization technologies in a hands-on way.

The Path to TDX

My primary reference for this implementation was the excellent work done in QEMU. The patches from Xiaoyao Li at Intel were an invaluable guide. The process involved several stages, mainly done in #406. You can see the complete history starting from 6f7f070254a9.

Here’s a brief overview of the key steps:

While implementing TDX support, I also refactored the existing codebase substantially (#393, #395, #403), mainly to remove code paths hardcoded for AMD-SEV and to create a more modular architecture.
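To illustrate what moving away from hardcoded AMD-SEV paths can look like, here is a minimal sketch. It is not Alioth’s actual design: the trait `CocoBackend`, the types `Sev` and `Tdx`, and the helpers `backend_for` and `launch_plan` are all hypothetical names made up for this example. The strings are KVM command names listed in a simplified, rough order, only to show that each backend owns its own launch sequence.

```rust
/// Hypothetical abstraction over a confidential-computing backend.
/// None of these names come from Alioth; they exist only for illustration.
trait CocoBackend {
    /// Short identifier used to select the backend.
    fn name(&self) -> &'static str;
    /// Simplified, illustrative subset of the KVM commands issued at launch.
    fn launch_sequence(&self) -> &'static [&'static str];
}

struct Sev;
struct Tdx;

impl CocoBackend for Sev {
    fn name(&self) -> &'static str {
        "sev"
    }
    fn launch_sequence(&self) -> &'static [&'static str] {
        &[
            "KVM_SEV_LAUNCH_START",
            "KVM_SEV_LAUNCH_UPDATE_DATA",
            "KVM_SEV_LAUNCH_FINISH",
        ]
    }
}

impl CocoBackend for Tdx {
    fn name(&self) -> &'static str {
        "tdx"
    }
    fn launch_sequence(&self) -> &'static [&'static str] {
        &[
            "KVM_TDX_INIT_VM",
            "KVM_TDX_INIT_VCPU",
            "KVM_TDX_INIT_MEM_REGION",
            "KVM_TDX_FINALIZE_VM",
        ]
    }
}

/// Pick a backend by name instead of branching on SEV throughout the VMM.
fn backend_for(name: &str) -> Option<Box<dyn CocoBackend>> {
    match name {
        "sev" => Some(Box::new(Sev)),
        "tdx" => Some(Box::new(Tdx)),
        _ => None,
    }
}

/// Generic code only talks to the trait, never to a concrete vendor type.
fn launch_plan(backend: &dyn CocoBackend) -> String {
    format!("{}: {}", backend.name(), backend.launch_sequence().join(" -> "))
}
```

With a shape like this, adding a new technology means adding one more `impl` rather than touching every place that used to assume SEV.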

Firmware: Stage0 for Now

One important point to note is that the current implementation relies on oak/stage0 as the firmware. This is a minimal firmware designed for confidential computing scenarios. As a result, traditional OVMF-based boot is not yet supported.

An Unexpected Bug in OVMF

While using QEMU and OVMF as a reference, I stumbled upon a bug in OVMF’s Intel TDX support. I was happy to contribute a fix for this, which you can find in pull request #12216. It’s always rewarding to contribute back to the projects I learn from.

Wrestling with KVM/TDX

The journey wasn’t without its challenges. I ran into a couple of interesting issues with the KVM/TDX interface:

  1. Missing SIGNIFICANT_INDEX Flag: I discovered that the struct kvm_cpuid_entry2 entries returned by KVM_TDX_CAPABILITIES were missing the SIGNIFICANT_INDEX flag. As a result, the CPUID index was parsed as None, which broke CPUID capability matching against the entries from KVM_GET_SUPPORTED_CPUID. I’ve reported this to the KVM community (link to mailing list post) and implemented a workaround in Alioth (commit f5bf11ab2634).

  2. TDX Guest Kernel Probing ROMs: I also found that the TDX guest Linux kernel was still probing for ROMs, which crashed the guest because the probed memory is not backed. With the help of Gemini, I was able to patch the guest kernel to work around this issue:

    --- a/arch/x86/coco/tdx/tdx.c
    +++ b/arch/x86/coco/tdx/tdx.c
    @@ -1127,6 +1127,9 @@ void __init tdx_early_init(void)
    
            cc_vendor = CC_VENDOR_INTEL;
    
    +       /* Prevent probe_roms() from crashing on unbacked private MMIO */
    +       x86_init.resources.probe_roms = x86_init_noop;
    +
            /* Configure the TD */
            tdx_setup(&cc_mask);
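
To make the first issue above more concrete, here is a minimal sketch of how a missing SIGNIFICANT_INDEX flag can break a lookup keyed on (function, index). It is not the code from commit f5bf11ab2634: `CpuidEntry`, `key_strict`, `key_with_workaround`, `is_indexed_leaf` and `build_table` are hypothetical names. The flag value assumes bit 0, as in the KVM UAPI, and the list of indexed leaves is deliberately incomplete.

```rust
use std::collections::HashMap;

/// Assumed to mirror the KVM flag (bit 0) that marks `index` as meaningful.
const SIGNIFICANT_INDEX: u32 = 1 << 0;

/// Hypothetical, trimmed-down view of `struct kvm_cpuid_entry2`.
#[derive(Debug, Clone, Copy)]
struct CpuidEntry {
    function: u32,
    index: u32,
    flags: u32,
    eax: u32,
}

/// Illustrative, non-exhaustive set of leaves whose sub-leaf index matters.
fn is_indexed_leaf(function: u32) -> bool {
    matches!(function, 0x4 | 0x7 | 0xb | 0xd)
}

/// Strict parsing: trust the flag. An entry without it gets index `None`.
fn key_strict(e: &CpuidEntry) -> (u32, Option<u32>) {
    let index = if e.flags & SIGNIFICANT_INDEX != 0 {
        Some(e.index)
    } else {
        None
    };
    (e.function, index)
}

/// One possible workaround: also treat known indexed leaves as indexed,
/// even when the flag is missing from the entry.
fn key_with_workaround(e: &CpuidEntry) -> (u32, Option<u32>) {
    let indexed = e.flags & SIGNIFICANT_INDEX != 0 || is_indexed_leaf(e.function);
    (e.function, indexed.then_some(e.index))
}

/// Build a lookup table from entries that carry the flag correctly,
/// standing in for the result of KVM_GET_SUPPORTED_CPUID.
fn build_table(entries: &[CpuidEntry]) -> HashMap<(u32, Option<u32>), u32> {
    entries.iter().map(|e| (key_strict(e), e.eax)).collect()
}
```

For leaf 0x7, sub-leaf 0, an entry with the flag set produces the key `(0x7, Some(0))`, while the same entry without the flag produces `(0x7, None)` under `key_strict` and therefore never matches. `key_with_workaround` restores the `Some(0)` key for the leaves it knows about and leaves non-indexed leaves such as 0x1 untouched.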
    

What’s Next?

Getting TDX support to this stage has been a rewarding challenge. There’s still more to do, like exploring OVMF support, upstreaming local kernel patches, and continuing to improve the robustness of the implementation. I’m excited to see where this journey takes me next.

I hope this gives you a good overview of what’s been happening with Alioth. As always, feel free to check out the project on GitHub and share your thoughts!