[Pkg-xen-devel] Bug#1051862: (Debian) Bug#1051862: server flooded with xen_mc_flush warnings with xen 4.17 + linux 6.1

Radoslav Bodó radoslav.bodo at igalileo.cz
Thu Sep 14 08:46:00 BST 2023


Hi all,

hopefully it's ok to reply-all at this point


On 9/13/23 23:38, Hans van Kranenburg wrote:
> I have a few quick additional questions already:
> 
> 1. For clarification.. From your text, I understand that only this one
> single server is showing the problem after the Debian version upgrade.
> Does this mean that this is the only server you have running with
> exactly this combination of hardware (and BIOS version, CPU microcode
> etc etc)? Or, is there another one with same hardware which does not
> show the problem?

This is the only server we have with this HW combination: server type
Dell R750xs with CPU type 'Intel Xeon Silver 4310'.


> 2. Can you reply with the output of 'xl dmesg' when the problem happens?
> Or, if the system gets unusable too quick, do you have a serial console
> connection to capture the output?

see the attached 'xl dmesg' output below


> 3. To confirm... I understand that there are many of these messages.
> Since you pasted only one, does that mean that all of them look exactly
> the same, with "1 of 1 multicall(s) failed: cpu 10" "call  1: op=1
> arg=[ffff8888a1a9eb10] result=-22"? Or are there variations? If so, can
> you reply with a few different ones?

all look exactly the same: only 1 of 1 multicalls failed, with the same result
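
To double-check that there really are no variations, the per-call lines can be tallied with a small script. This is only a sketch: the regular expression is based on the single message quoted above, and the function names are made up for illustration. As far as I know op=1 is __HYPERVISOR_mmu_update in Xen's public headers, and result=-22 is -EINVAL.

```python
import errno
import re
from collections import Counter

# Matches the per-call line as quoted in this thread, e.g.
#   call  1: op=1 arg=[ffff8888a1a9eb10] result=-22
CALL_RE = re.compile(r"call\s+\d+: op=(\d+) arg=\[[0-9a-f]+\] result=(-?\d+)")

# Only the hypercall number seen in this report; extend as needed.
HYPERCALL_NAMES = {1: "mmu_update"}


def summarize_multicall_failures(lines):
    """Count distinct (op, result) pairs; the arg address is ignored."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


def describe(op, result):
    """Render one (op, result) pair with symbolic names where known."""
    op_name = HYPERCALL_NAMES.get(op, "unknown")
    err_name = errno.errorcode.get(-result, "unknown")
    return f"op={op} ({op_name}) result={result} ({err_name})"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): dmesg | python3 tally_multicalls.py
    for (op, result), n in summarize_multicall_failures(sys.stdin).items():
        print(n, describe(op, result))
```

If more than one line is printed, the failures are not all identical and the different variants would be worth posting.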



On 9/14/23 07:43, Juergen Gross wrote:
 >>> kernel: [   99.768181] Call Trace:
 >>> kernel: [   99.768436]  <TASK>
 >>> kernel: [   99.768691]  ? __warn+0x7d/0xc0
 >>> kernel: [   99.768947]  ? xen_mc_flush+0x196/0x220
 >>> kernel: [   99.769204]  ? report_bug+0xe6/0x170
 >>> kernel: [   99.769460]  ? handle_bug+0x41/0x70
 >>> kernel: [   99.769713]  ? exc_invalid_op+0x13/0x60
 >>> kernel: [   99.769967]  ? asm_exc_invalid_op+0x16/0x20
 >>> kernel: [   99.770223]  ? xen_mc_flush+0x196/0x220
 >>> kernel: [   99.770478]  xen_mc_issue+0x6d/0x70
 >>> kernel: [   99.770726]  xen_set_pmd_hyper+0x54/0x90
 >>> kernel: [   99.770965]  do_set_pmd+0x188/0x2a0
 >
 > This looks like an attempt to map a hugepage, which isn't supported
 > when running as a Xen PV guest (this includes dom0).
 >
 > Are transparent hugepages enabled somehow? In a Xen PV guest there
 > should be no /sys/kernel/mm/transparent_hugepage directory. Depending
 > on the presence of that directory either hugepage_init() has a bug,
 > or a test for hugepages being supported is missing in
 > filemap_map_pages() or do_set_pmd().
 >
 >>> kernel: [   99.771200]  filemap_map_pages+0x1a9/0x6e0
 >>> kernel: [   99.771434]  xfs_filemap_map_pages+0x41/0x60 [xfs]
 >>> kernel: [   99.771714]  do_fault+0x1a4/0x410
 >>> kernel: [   99.771947]  __handle_mm_fault+0x660/0xfa0

in both the faulty state (linux 6.1) and the good state (linux 5.10), the
directory /sys/kernel/mm/transparent_hugepage is not present

we have also tried booting with 'transparent_hugepage=never', but it
makes no difference
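
For anyone who wants to repeat the two checks above on another dom0, here is a minimal sketch. It only looks at whether the sysfs directory exists and whether a transparent_hugepage= option is on the running kernel's command line. The helper names are hypothetical and it says nothing about why do_set_pmd() is reached.

```python
from pathlib import Path

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")
CMDLINE = Path("/proc/cmdline")


def thp_cmdline_setting(cmdline):
    """Return the value of transparent_hugepage= from a command line, or None."""
    for token in cmdline.split():
        if token.startswith("transparent_hugepage="):
            return token.split("=", 1)[1]
    return None


def thp_report(sysfs_dir=THP_DIR, cmdline_path=CMDLINE):
    """Collect the two facts discussed above into a small dict."""
    cmdline = cmdline_path.read_text() if cmdline_path.exists() else ""
    return {
        "sysfs_dir_present": sysfs_dir.is_dir(),
        "cmdline_setting": thp_cmdline_setting(cmdline),
    }


if __name__ == "__main__":
    print(thp_report())
```

Running it once under each kernel gives a directly comparable result for the good and the faulty state.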


best regards
bodik
-------------- next part --------------
(XEN) Xen version 4.17.2-pre (Debian 4.17.1+2-gb773c48e36-1) (pkg-xen-devel at lists.alioth.debian.org) (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0) debug=n Thu May 18 19:26:30 UTC 2023
(XEN) Bootloader: GRUB 2.06-13
(XEN) Command line: placeholder dom0_mem=32G,max:32G
(XEN) Xen image load base address: 0x5e800000
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN)  VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN)  EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN)  Found 2 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  [0000000000000000, 0000000000098fff] (usable)
(XEN)  [0000000000099000, 000000000009ffff] (reserved)
(XEN)  [00000000000e0000, 00000000000fffff] (reserved)
(XEN)  [0000000000100000, 000000004a413fff] (usable)
(XEN)  [000000004a414000, 000000004b413fff] (ACPI NVS)
(XEN)  [000000004b414000, 000000004bfc2fff] (usable)
(XEN)  [000000004bfc3000, 000000004c0c8fff] (reserved)
(XEN)  [000000004c0c9000, 000000004cffffff] (usable)
(XEN)  [000000004d000000, 000000004d1fffff] (reserved)
(XEN)  [000000004d200000, 000000005eefdfff] (usable)
(XEN)  [000000005eefe000, 000000006e3fefff] (reserved)
(XEN)  [000000006e3ff000, 000000006f3fefff] (ACPI NVS)
(XEN)  [000000006f3ff000, 000000006f7fefff] (ACPI data)
(XEN)  [000000006f7ff000, 000000006f7fffff] (usable)
(XEN)  [000000006f800000, 000000008fffffff] (reserved)
(XEN)  [00000000fd000000, 00000000fe7fffff] (reserved)
(XEN)  [00000000fec00000, 00000000fec00fff] (reserved)
(XEN)  [00000000fec80000, 00000000fed00fff] (reserved)
(XEN)  [00000000fed40000, 00000000fed44fff] (reserved)
(XEN)  [00000000ff000000, 00000000ffffffff] (reserved)
(XEN)  [0000000100000000, 000000407fffffff] (usable)
(XEN) ACPI: RSDP 000FE320, 0024 (r2 DELL  )
(XEN) ACPI: XSDT 6F40A188, 00F4 (r1 DELL   PE_SC3          0 DELL  1000013)
(XEN) ACPI: FACP 6F7F6000, 0114 (r6 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: DSDT 6F770000, 7FAD3 (r2 DELL   PE_SC3          3 DELL        1)
(XEN) ACPI: FACS 6F373000, 0040
(XEN) ACPI: SSDT 6F7FB000, 1571 (r2  INTEL RAS_ACPI        1 INTL 20210331)
(XEN) ACPI: SSDT 6F7FA000, 0745 (r2  INTEL ADDRXLAT        1 INTL 20210331)
(XEN) ACPI: EINJ 6F7F9000, 0150 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: BERT 6F7F8000, 0030 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: ERST 6F7F7000, 0230 (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: HMAT 6F7F5000, 0180 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: HPET 6F7F4000, 0038 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: MCFG 6F7F3000, 003C (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: MIGT 6F7F2000, 0040 (r1 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: MSCT 6F7F1000, 0090 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) ACPI: WSMT 6F7F0000, 0028 (r1 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: APIC 6F76F000, 035E (r4 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: SLIT 6F76E000, 0030 (r1 DELL   PE_SC3          1 DELL  1000013)
(XEN) ACPI: SRAT 6F767000, 6430 (r3 DELL   PE_SC3          2 DELL  1000013)
(XEN) ACPI: OEM4 6F5DF000, 187A61 (r2  INTEL CPU  CST     3000 INTL 20210331)
(XEN) ACPI: OEM1 6F4CB000, 113489 (r2  INTEL CPU EIST     3000 INTL 20210331)
(XEN) ACPI: OEM2 6F484000, 46031 (r2  INTEL CPU  HWP     3000 INTL 20210331)
(XEN) ACPI: SSDT 6F40D000, 764A5 (r2  INTEL SSDT  PM     4000 INTL 20210331)
(XEN) ACPI: SSDT 6F40C000, 0AA3 (r2 DELL   PE_SC3          0 DELL        1)
(XEN) ACPI: HEST 6F40B000, 017C (r1 DELL   PE_SC3          1 INTL        1)
(XEN) ACPI: SSDT 6F7FD000, 0623 (r2 DELL   Tpm2Tabl     1000 INTL 20210331)
(XEN) ACPI: TPM2 6F409000, 004C (r4 DELL   PE_SC3          2 DELL  1000013)
(XEN) ACPI: SSDT 6F401000, 7299 (r2  INTEL SpsNm           2 INTL 20210331)
(XEN) ACPI: SSDT 6F400000, 06EA (r2 DELL   PE_SC3          2 DELL        1)
(XEN) ACPI: DMAR 6F3FF000, 0188 (r1 DELL   PE_SC3          1 DELL        1)
(XEN) System RAM: 261595MB (267873864kB)
(XEN) Domain heap initialised DMA width 32 bits
(XEN) x2APIC mode is already enabled by BIOS.
(XEN) ACPI: 32/64X FACS address mismatch in FADT - 6f373000/0000000000000000, using 32
(XEN) IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-119
(XEN) CPU0: TSC: ratio: 168 / 2
(XEN) CPU0: bus: 100 MHz base: 2100 MHz max: 3300 MHz
(XEN) CPU0: 800 ... 2100 MHz
(XEN) xstate: size: 0xa88 and states: 0x2e7
(XEN) Unrecognised CPU model 0x6a - assuming vulnerable to LazyFPU
(XEN) Speculative mitigation facilities:
(XEN)   Hardware hints: RDCL_NO IBRS_ALL SKIP_L1DFL MDS_NO TAA_NO SBDR_SSDP_NO PSDP_NO
(XEN)   Hardware features: IBPB IBRS STIBP SSBD PSFD L1D_FLUSH MD_CLEAR TSX_CTRL FB_CLEAR FB_CLEAR_CTRL
(XEN)   Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
(XEN)   Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ STIBP+ SSBD- PSFD- TSX+, Other: IBPB-ctxt BRANCH_HARDEN
(XEN)   Support for HVM VMs: MSR_SPEC_CTRL MSR_VIRT_SPEC_CTRL RSB EAGER_FPU
(XEN)   Support for PV VMs: MSR_SPEC_CTRL EAGER_FPU
(XEN)   XPTI (64-bit PV only): Dom0 disabled, DomU disabled (with PCID)
(XEN)   PV L1TF shadowing: Dom0 disabled, DomU disabled
(XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Initializing Credit2 scheduler
(XEN) Platform timer is 24.000MHz HPET
(XEN) Detected 2095.078 MHz processor.
(XEN) Intel VT-d iommu 8 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 7 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 6 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 5 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 4 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 3 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 2 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d iommu 9 supported page sizes: 4kB, 2MB, 1GB
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping enabled.
(XEN) Intel VT-d Posted Interrupt not enabled.
(XEN) Intel VT-d Shared EPT tables enabled.
(XEN) I/O virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Enabling APIC mode:  Clustered.  Using 1 I/O APICs
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) Allocated console ring of 128 KiB.
(XEN) VMX: Supported advanced features:
(XEN)  - APIC MMIO access virtualisation
(XEN)  - APIC TPR shadow
(XEN)  - Extended Page Tables (EPT)
(XEN)  - Virtual-Processor Identifiers (VPID)
(XEN)  - Virtual NMI
(XEN)  - MSR direct-access bitmap
(XEN)  - Unrestricted Guest
(XEN)  - APIC Register Virtualization
(XEN)  - Virtual Interrupt Delivery
(XEN)  - Posted Interrupt Processing
(XEN)  - VMCS shadowing
(XEN)  - VM Functions
(XEN)  - Virtualisation Exceptions
(XEN)  - Page Modification Logging
(XEN)  - TSC Scaling
(XEN)  - Bus Lock Detection
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging (HAP) detected
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) Brought up 48 CPUs
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Initializing Credit2 scheduler
(XEN) Dom0 has maximum 1368 PIRQs
(XEN)  Xen  kernel: 64-bit, lsb
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x4a00000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000004020000000->0000004028000000 (8345580 pages to be allocated)
(XEN)  Init. ramdisk: 000000407d7ec000->000000407ffff69e
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff84a00000
(XEN)  Phys-Mach map: 0000008000000000->0000008004000000
(XEN)  Start info:    ffffffff84a00000->ffffffff84a004b8
(XEN)  Page tables:   ffffffff84a01000->ffffffff84a2a000
(XEN)  Boot stack:    ffffffff84a2a000->ffffffff84a2b000
(XEN)  TOTAL:         ffffffff80000000->ffffffff84c00000
(XEN)  ENTRY ADDRESS: ffffffff830721c0
(XEN) Dom0 has maximum 48 VCPUs
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM in background
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) Freed 624kB init memory

