Type: Bug
Resolution: Unresolved
Priority: Medium
Fix Version/s: 1807
Affects Version/s: None
Labels:
- ARM64
Environment:

HOST:
HW: hierofalcon2, ARM64
OS: Fedora 27
QEMU: 2.10.1-3.fc27
LIBVIRT: 3.7.0-4.fc27
Linux kernel: 4.14.14-300.fc27.aarch64 #1 SMP Fri Jan 19 12:52:12 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux

VM:
OS host: Fedora 27
Linux kernel: 4.15.15-300.fc27.aarch64 #1 SMP Mon Apr 2 23:00:39 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux


Epic Link:
AARCH64

The guest OS (Fedora, Ubuntu) becomes unresponsive during execution of the CSIT crypto suite.
When the issue occurs, the subsequent CSIT tests fail due to SSH timeouts, and the target no longer responds over SSH or the virtual console.

The VM can be recovered in two ways:

reboot the VM (virsh destroy & start);
connect with gdb to the QEMU built-in gdbserver, then stop and resume guest execution:
host# gdb
(gdb) target remote :1235
(gdb) CTRL+C
(gdb) info threads
  Id   Target Id                    Frame
  1    Thread 1 (CPU#0 [running])   0xffff285d47283cec in ?? ()
  2    Thread 2 (CPU#1 [halted ])   0xffff285d47283cec in ?? ()
(gdb) continue
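
The two recovery paths above can be sketched as command builders. This is a minimal, untested sketch: the domain name `csit-vm` is hypothetical, and the non-interactive gdb variant assumes that attaching to the QEMU gdbserver stops the guest and that `detach` resumes it, as a stand-in for the interactive CTRL+C / continue sequence shown above.

```python
import subprocess


def reboot_commands(domain):
    """Recovery 1: hard-stop the libvirt domain, then start it again."""
    return [["virsh", "destroy", domain], ["virsh", "start", domain]]


def gdb_kick_command(port=1235):
    """Recovery 2: attach to the QEMU gdbserver, then detach.

    Assumption: attach stops the guest and detach resumes it.
    """
    return ["gdb", "-batch",
            "-ex", f"target remote :{port}",
            "-ex", "detach"]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "csit-vm" is a hypothetical domain name; substitute the real one.
    run_all(reboot_commands("csit-vm"))
    run_all([gdb_kick_command()])
```

The dry-run default only prints the commands, so the sketch can be checked before it touches a running guest.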

A simplified test scenario based on CSIT can be used to reproduce an issue with similar behavior.

The following steps shall be executed in parallel on the host:

Execute a PCI rescan in a loop on the VM over the SSH connection (see the attached script: ./qemu_kvm_guest_hang.sh 192.168.122.18);
Flood one VM network interface with ICMP packets (e.g. ping -s 8 -i 0.02 192.168.121.22).
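
The two parallel steps can be sketched as follows. The contents of the attached script are not reproduced in this report, so the rescan loop here is an assumption: it writes to the standard Linux sysfs node `/sys/bus/pci/rescan`. The `root` user, the `-tt` option (used so the remote loop ends when the SSH client is terminated) and the 60-second duration are also illustrative choices.

```python
import subprocess
import time

# Assumption: the attached script loops over a sysfs PCI rescan like this.
RESCAN_LOOP = "while true; do echo 1 > /sys/bus/pci/rescan; done"


def rescan_command(ssh_ip, user="root"):
    """Build the ssh command that runs the PCI rescan loop on the guest."""
    return ["ssh", "-tt", f"{user}@{ssh_ip}", RESCAN_LOOP]


def flood_command(target_ip, size=8, interval=0.02):
    """Build the ping command from the report (small payload, 20 ms interval)."""
    return ["ping", "-s", str(size), "-i", str(interval), target_ip]


def reproduce(ssh_ip, target_ip, duration_s=60):
    """Run both steps in parallel on the host for duration_s seconds."""
    procs = [
        subprocess.Popen(rescan_command(ssh_ip)),
        subprocess.Popen(flood_command(target_ip)),
    ]
    try:
        time.sleep(duration_s)
    finally:
        for proc in procs:
            proc.terminate()
        for proc in procs:
            proc.wait()


if __name__ == "__main__":
    # Only print the commands here; call reproduce() to actually run them.
    print(" ".join(rescan_command("192.168.122.18")))
    print(" ".join(flood_command("192.168.121.22")))
```

Depending on the ping implementation, intervals this short may require root privileges on the host.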

What to observe when the issue appears:

There is no ping reply, and SSH on that IP address no longer works;
The other interfaces remain accessible, but they can also become unusable if traffic is generated on them during the rescan;
The non-responsive interface becomes usable again if the "QEMU" recovery method described above is used.

The IP addresses mentioned above belong to 2 different virtual network interfaces of the VM.

Notes:

The PCI device that is used intensively during the PCI rescan is the one affected;
Removing and rescanning the PCI device does not recover it;
The KVM trace shows that the interface IRQ is pending:
kworker/5:1-7655 [005] .... 6224674.474316: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 0
kworker/4:0-31773 [004] .... 6224674.504352: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 1
kworker/4:0-31773 [004] .... 6224674.504353: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 0
kworker/5:1-7655 [005] .... 6224674.534326: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 1
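
To inspect longer captures of this trace, a small parser can help. This is a minimal sketch that assumes every relevant line has exactly the layout of the four lines above; the function and field names are illustrative and not part of any KVM tooling.

```python
import re
from collections import namedtuple

IrqEvent = namedtuple("IrqEvent", "task cpu timestamp vcpu irq level")

# Matches lines such as:
# kworker/5:1-7655 [005] .... 6224674.474316: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 0
LINE_RE = re.compile(
    r"^\s*(?P<task>\S+)\s+\[(?P<cpu>\d+)\]\s+\S+\s+(?P<ts>\d+\.\d+):\s+"
    r"vgic_update_irq_pending:\s+VCPU:\s+(?P<vcpu>\d+),\s+"
    r"IRQ\s+(?P<irq>\d+),\s+level:\s+(?P<level>\d+)"
)


def parse_trace(text):
    """Return one IrqEvent per vgic_update_irq_pending line; skip other lines."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            events.append(IrqEvent(
                m.group("task"), int(m.group("cpu")), float(m.group("ts")),
                int(m.group("vcpu")), int(m.group("irq")), int(m.group("level")),
            ))
    return events


def last_level(events, irq):
    """Last recorded level for the given IRQ, or None if it never appears."""
    levels = [e.level for e in events if e.irq == irq]
    return levels[-1] if levels else None


def rising_edges(events, irq):
    """Count 0 -> 1 level transitions for the given IRQ, in file order."""
    levels = [e.level for e in events if e.irq == irq]
    return sum(1 for a, b in zip(levels, levels[1:]) if a == 0 and b == 1)


SAMPLE = """\
kworker/5:1-7655 [005] .... 6224674.474316: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 0
kworker/4:0-31773 [004] .... 6224674.504352: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 1
kworker/4:0-31773 [004] .... 6224674.504353: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 0
kworker/5:1-7655 [005] .... 6224674.534326: vgic_update_irq_pending: VCPU: 0, IRQ 101, level: 1
"""

if __name__ == "__main__":
    events = parse_trace(SAMPLE)
    print(last_level(events, 101), rising_edges(events, 101))
```

For the four lines quoted in this report, the last recorded level for IRQ 101 is 1.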


Attachments:
- qemu_kvm_guest_hang.sh, 0.3 kB, 12/Apr/18 2:29 PM
- vm_domain.xml, 8 kB, 12/Apr/18 2:29 PM

Assignee: Juraj Linkeš
Reporter: Lucian Banu
Votes: 0
Watchers: 1
Created: 12/Apr/18 2:32 PM
Updated: 10/Feb/21 7:29 AM
