Explore the current status and future developments of the HWPOISON subsystem in this 40-minute conference talk by Naoya Horiguchi from NEC Solution Innovators. Gain insights into the basics of this memory error handling feature, introduced in the Linux kernel in 2009, including its functionality and user applications. Delve into recent development topics such as 1GB hugepage support and improved cooperation with memory hotplug. Learn about hardware-level handling of memory errors, terminology clarifications, and the basic concept of HWPOISON. Examine examples of hard and soft page offlining, the internal workings of hard-offline, and notification to userspace. Discover ongoing developments such as the soft-offline rework, pagecache handling improvements, and issues related to folios, huge zero pages, and persistent memory. Understand the challenges of the hugetlb pinning race, subpage hwpoison tracking, 1GB hugetlb page support, and memory hotplug integration. Finally, explore machine-check-safe memory copy and management interface enhancements to round out your understanding of this critical Linux kernel subsystem.