Bug#1069844: grub-efi-arm64: grub fails to load Xen hypervisor on arm64 EFI systems (AVA & QEMU)

Alex Bennée alex.bennee at linaro.org
Thu Apr 25 17:36:09 BST 2024


Package: grub-efi-arm64
Version: 2.12-2~deb13u1
Severity: serious

Attempting to boot Xen on an arm64 EFI system fails. Further attempts to
select another boot entry also fail, and the system needs to be rebooted
before you can load the kernel without Xen. This is likely due to boot
services having been unloaded.

Running on the Ampere AVA platform with additional debug diagnostics
enabled (xen,xen_loader,efiemu,fdt,linux) gives the following:

  loader/arm64/xen_boot.c:116:xen_loader: Xen Hypervisor cmdline : placeholder
  no-real-mode edd=off @ 0x807ff98a6e0 size:33
  loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xf283b000 size:0x1f2c75b
  loader/arm64/xen_boot.c:139:xen_loader: Module node name module@f283b000
  loader/arm64/xen_boot.c:161:xen_loader: Module
  loader/arm64/xen_boot.c:183:xen_loader: Module has no bootargs!
  loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xef17d000 size:0x3514a00
  loader/arm64/xen_boot.c:139:xen_loader: Module node name module@ef17d000
  loader/arm64/xen_boot.c:161:xen_loader: Module
  loader/arm64/xen_boot.c:172:xen_loader: Module cmdline : placeholder
  root=UUID=06e49159-b374-445c-bf5f-2bf93e3f4d6b ro @ 0x807ffa1a3a0 size:62
  loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xf4769000 size:0x3514a00
  loader/arm64/xen_boot.c:139:xen_loader: Module node name module@f4769000
  loader/arm64/xen_boot.c:161:xen_loader: Module
  loader/arm64/xen_boot.c:172:xen_loader: Module cmdline : placeholder
  root=UUID=06e49159-b374-445c-bf5f-2bf93e3f4d6b ro @ 0x807fd65a280 size:62
  loader/efi/fdt.c:136:fdt: Installed/updated FDT configuration table @
  0xef160000
  error: cannot load image.
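
As a quick sanity check on the load addresses in the log above, here is a
minimal sketch that parses the "Module @ ... size:..." lines and looks for
overlapping ranges. It is not part of grub: the regular expression and the
helper names parse_modules and find_overlaps are assumptions made for this
example, and it assumes each size is a byte count starting at the logged
address.

```python
import re

# Lines copied from the xen_loader debug output above (addresses and sizes only).
LOG = """\
loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xf283b000 size:0x1f2c75b
loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xef17d000 size:0x3514a00
loader/arm64/xen_boot.c:224:xen_loader: Module @ 0xf4769000 size:0x3514a00
"""

# Hypothetical pattern for this example; it only matches "Module @ <addr> size:<size>".
MODULE_RE = re.compile(r"Module @ (0x[0-9a-fA-F]+) size:(0x[0-9a-fA-F]+)")


def parse_modules(log):
    """Return a sorted list of (start, end) byte ranges, with end exclusive."""
    ranges = []
    for match in MODULE_RE.finditer(log):
        start = int(match.group(1), 16)
        size = int(match.group(2), 16)  # assumed to be a size in bytes
        ranges.append((start, start + size))
    return sorted(ranges)


def find_overlaps(ranges):
    """Return pairs of neighbouring ranges (in sorted order) that overlap."""
    return [(a, b) for a, b in zip(ranges, ranges[1:]) if a[1] > b[0]]


modules = parse_modules(LOG)
for start, end in modules:
    print(f"{start:#x}-{end:#x}")
print("overlaps:", find_overlaps(modules))
```

For the three modules logged here the overlap list comes back empty, which
fits the modules having been placed in memory without clobbering each other.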

So it appears EFI boot services can successfully stat the various bits
and pieces, but once it gets to the stage of
grub_arch_efi_linux_boot_image it fails. As debugging grub on the AVA
platform is hard, I replicated the setup in QEMU, where I obtained the
following backtrace:

  Thread 1 hit Breakpoint 6.2, grub_error (n=GRUB_ERR_BAD_OS, fmt=0x23bedbbbd "cannot load image") at ../../../grub-core/kern/err.c:41
  41        grub_errno = n;
  (grub gdb) bt
  #0  grub_error (n=GRUB_ERR_BAD_OS, fmt=0x23bedbbbd "cannot load image") at ../../../grub-core/kern/err.c:41
  #1  0x000000023bedabf0 in grub_arch_efi_linux_boot_image (addr=9561964544, size=1081352, 
      args=0x23bbb8b00 "placeholder dom0_mem=4G,max:4G loglvl=all guest_loglvl=all no-real-mode edd=off") at ../../../grub-core/loader/efi/linux.c:214
  #2  0x000000023bff41bc in grub_loader_boot () at ../../../grub-core/commands/boot.c:211
  #3  grub_loader_boot () at ../../../grub-core/commands/boot.c:190
  #4  0x000000023bf42158 in grub_command_execute (name=0x23bf4e72c "boot", argc=0, argv=0x0 <_start>) at ../../../include/grub/command.h:126
  #5  grub_menu_execute_entry (entry=entry@entry=0x23bd17660, auto_boot=auto_boot@entry=0) at ../../../grub-core/normal/menu.c:306
  #6  0x000000023bf41e2c in show_menu (autobooted=<optimized out>, nested=<optimized out>, menu=<optimized out>) at ../../../grub-core/normal/menu.c:925
  #7  grub_show_menu (menu=menu@entry=0x23bd1a940, nested=nested@entry=1, autoboot=autoboot@entry=0) at ../../../grub-core/normal/menu.c:940
  #8  0x000000023bf408a8 in grub_normal_execute (config=<optimized out>, nested=nested@entry=1, batch=batch@entry=0) at ../../../grub-core/normal/main.c:291
  #9  0x000000023bf32260 in grub_cmd_source (cmd=<optimized out>, argc=1, args=0x23bd1fcc8) at ../../../grub-core/commands/configfile.c:48
  #10 grub_cmd_source (cmd=<optimized out>, argc=<optimized out>, args=0x23bd1fcc8) at ../../../grub-core/commands/configfile.c:30
  #11 0x000000023bf48d0c in grub_script_execute_cmdline (cmd=<optimized out>) at ../../../grub-core/script/execute.c:1034
  #12 0x000000023bf478c0 in grub_script_execute_cmd (cmd=cmd@entry=0x23bd190c8) at ../../../grub-core/script/execute.c:819
  #13 0x000000023bf4874c in grub_script_execute_cmdlist (list=<optimized out>) at ../../../grub-core/script/execute.c:1079
  #14 0x000000023bf478c0 in grub_script_execute_cmd (cmd=<optimized out>) at ../../../grub-core/script/execute.c:819
  #15 0x000000023bf489b4 in grub_script_execute (script=<optimized out>) at ../../../grub-core/script/execute.c:1191
  #16 0x000000023bf497fc in grub_normal_parse_line (line=line@entry=0x23bd20060 "configfile $prefix/grub.cfg", getline=getline@entry=0x23bf40430 <read_config_file_getline>, 
      getline_data=getline_data@entry=0x23bd20380) at ../../../grub-core/script/main.c:36
  #17 0x000000023bf409a0 in read_config_file (config=0x23bd20780 "(hd0,gpt1)/EFI/debian/grub.cfg") at ../../../grub-core/normal/main.c:179
  #18 grub_normal_execute (config=config@entry=0x23bd20780 "(hd0,gpt1)/EFI/debian/grub.cfg", nested=nested@entry=0, batch=batch@entry=0)
      at ../../../grub-core/normal/main.c:277
  #19 0x000000023bf40ca4 in grub_enter_normal_mode (config=config@entry=0x23bd20780 "(hd0,gpt1)/EFI/debian/grub.cfg") at ../../../grub-core/normal/main.c:304
  #20 0x000000023bf40da0 in grub_try_normal_prefix (prefix=0x23bd209a0 "(hd0,gpt1)/EFI/debian") at ../../../grub-core/normal/main.c:356
  #21 0x000000023bf40ea0 in grub_try_normal (variable=0x23bf4e492 "fw_path") at ../../../grub-core/normal/main.c:407
  #22 grub_cmd_normal (cmd=<optimized out>, argc=0, argv=<optimized out>) at ../../../grub-core/normal/main.c:421
  #23 grub_cmd_normal (cmd=<optimized out>, argc=<optimized out>, argv=<optimized out>) at ../../../grub-core/normal/main.c:412
  #24 0x000000023c018fb8 in grub_command_execute (name=0x23c01ec6e ")", argc=0, argv=0x0 <_start>) at ../../../include/grub/command.h:126
  #25 grub_load_normal_mode () at ../../../grub-core/kern/main.c:247
  #26 grub_main () at ../../../grub-core/kern/main.c:339
  #27 0x000000023c5c02c8 in ?? ()
  #28 0x000000023c62a000 in ?? ()
  #29 0xafafafaf6c617470 in ?? ()
  Backtrace stopped: previous frame identical to this frame (corrupt stack?)

While the firmware differs (QEMU runs EDKII targeting -M virt), it
looks like the same error. However, before blaming the firmware I
built the upstream grub:

  ➜  git describe
  grub-2.12-17-g8719cc204
  🕙17:25:21 alex at gwenyn:grub.git  on  master [?] 
  ➜  git show HEAD
  commit 8719cc2040368d43ab2de0b6e1b850b2c9cfc5b7 (HEAD -> master, origin/master, origin/HEAD)
  Author: Daniel Kiper <daniel.kiper at oracle.com>
  Date:   Tue Apr 9 19:56:02 2024 +0200

      windows: Add _stack_chk_guard/_stack_chk_fail symbols for Windows 64-bit target

      Otherwise the GRUB cannot start due to missing symbols when stack
      protector is enabled on EFI platforms.

      Signed-off-by: Daniel Kiper <daniel.kiper at oracle.com>
      Reviewed-by: Vladimir Serbinenko <phcoder at gmail.com>

I installed it in parallel with the distro grub. It was able to start
Xen using the same grub.cfg and got most of the way through the Dom0
boot before failing for unrelated reasons. So it seems the bug is
either introduced by the Debian customisation of the package, or the
package is missing a fix that is present in current upstream.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


