A continuation of https://forums.virtualbox.org/viewtopic.php?t=108833
Running a Windows 10 guest.
The guest is used for testing API code served by a Visual Studio development environment.
The API uses several external network services (SQL, geolocation, ERP, regional tax registrars et al.).
The VM frequently loses its network connection, which is pretty disruptive when running long tests against a network SQL service.
Host
$ inxi -SCMNzyx
System:
Kernel: 6.6.23-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
Desktop: KDE Plasma v: 6.0.2 Distro: Manjaro base: Arch Linux
Machine:
Type: Desktop System: LENOVO product: 30E000GMMT v: ThinkStation P620
serial: <superuser required>
Mobo: LENOVO model: 1046 v: SBB1C50523 WIN 3556073303264
serial: <superuser required> UEFI: LENOVO v: S07KT5AA date: 11/21/2023
CPU:
Info: 12-core model: AMD Ryzen Threadripper PRO 5945WX s bits: 64
type: MT MCP arch: Zen 3 rev: 2 cache: L1: 768 KiB L2: 6 MiB L3: 64 MiB
Speed (MHz): avg: 1367 high: 2448
min/max: 400/4978:4565:4705:4841:5254:5118:5943:5666:5807:5530:5394 cores:
1: 2280 2: 2274 3: 400 4: 2204 5: 2447 6: 400 7: 2391 8: 400 9: 400 10: 400
11: 400 12: 400 13: 2231 14: 2273 15: 400 16: 400 17: 2448 18: 2304 19: 400
20: 400 21: 2387 22: 400 23: 2389 24: 2395 bogomips: 196535
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Network:
Device-1: Aquantia AQtion AQC107 NBase-T/IEEE 802.3an Ethernet [Atlantic 10G]
vendor: Lenovo driver: atlantic v: kernel port: N/A bus-ID: 01:00.0
temp: 69.2 C
VirtualBox version
- VirtualBox 7.0.14r161095
- virtualbox-host-dkms
- no extension package
Guest
- Windows 10 Enterprise LTSC Version 21H2 (OS Build 19044.4096)
- MBR boot
- usb 1.1
- bridged network (to make host api reachable from lan)
- using static IP assignment in Windows IP configuration (no longer a DHCP reservation)
- IP has hostname registered in the local DNS server
- reverse lookup of the IP is registered for the hostname
- a single folder shared from host to guest
- 16GiB RAM
- 4 vCPU
- NIC default Intel PRO/1000 MT Desktop (82540EM)
How can I troubleshoot this?
So far I have ensured everything is up to date (guest and host).
The API under test logs from inside the guest while it communicates with a network SQL service, so I have a fairly good measure of when it happens.
The attached logfile corresponds with the ping excerpt and the log excerpt, which show my connection to the SQL service being severed.
I have tried to run ping against the guest while at the same time running my tests.
The ping round-trip time increases until the connection is severed.
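To line the ping output up with the API log, each ping line can be prefixed with a wall-clock timestamp in the same layout the log uses. This is only a minimal Python sketch: the helper names `stamp` and `main` are my own, and it assumes a Linux `ping` on the PATH that writes each reply line as it arrives.

```python
import subprocess
from datetime import datetime


def stamp(line: str, now: datetime) -> str:
    """Prefix one line of ping output with a timestamp like '2024-03-29 09:02:24.0376'."""
    # The API log uses four fractional digits, so microseconds are cut to 1/10000 s.
    return f"{now:%Y-%m-%d %H:%M:%S}.{now.microsecond // 100:04d} {line.rstrip()}"


def main(host: str) -> None:
    """Run ping against host and print every output line with a timestamp (stop with Ctrl+C)."""
    with subprocess.Popen(["ping", host], stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            print(stamp(line, datetime.now()), flush=True)


# Hypothetical usage, not executed here:
# main("vs.net.nix.dk")
```

Redirecting the output to a file while the stress test runs gives a second log that can be read side by side with the API log.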
Terminal ping snippet
09:00:18 ○ [fh@tiger] ~
$ ping vs.net.nix.dk
PING vs.net.nix.dk (172.30.30.80) 56(84) bytes of data.
64 bytes from 172.30.30.80: icmp_seq=1 ttl=128 time=0.226 ms
64 bytes from 172.30.30.80: icmp_seq=2 ttl=128 time=0.158 ms
64 bytes from 172.30.30.80: icmp_seq=3 ttl=128 time=0.141 ms
64 bytes from 172.30.30.80: icmp_seq=4 ttl=128 time=0.167 ms
64 bytes from 172.30.30.80: icmp_seq=5 ttl=128 time=0.153 ms
64 bytes from 172.30.30.80: icmp_seq=6 ttl=128 time=0.154 ms
64 bytes from 172.30.30.80: icmp_seq=7 ttl=128 time=0.193 ms
64 bytes from 172.30.30.80: icmp_seq=8 ttl=128 time=0.143 ms
64 bytes from 172.30.30.80: icmp_seq=9 ttl=128 time=0.110 ms
[...]
64 bytes from 172.30.30.80: icmp_seq=249 ttl=128 time=215 ms
64 bytes from 172.30.30.80: icmp_seq=198 ttl=128 time=51896 ms
64 bytes from 172.30.30.80: icmp_seq=199 ttl=128 time=50882 ms
64 bytes from 172.30.30.80: icmp_seq=200 ttl=128 time=49869 ms
64 bytes from 172.30.30.80: icmp_seq=201 ttl=128 time=48856 ms
64 bytes from 172.30.30.80: icmp_seq=202 ttl=128 time=47843 ms
[...]
64 bytes from 172.30.30.80: icmp_seq=245 ttl=128 time=4278 ms
64 bytes from 172.30.30.80: icmp_seq=246 ttl=128 time=3265 ms
64 bytes from 172.30.30.80: icmp_seq=247 ttl=128 time=2252 ms
64 bytes from 172.30.30.80: icmp_seq=248 ttl=128 time=1239 ms
From tiger.net.nix.dk (172.30.30.20) icmp_seq=388 Destination Host Unreachable
From tiger.net.nix.dk (172.30.30.20) icmp_seq=389 Destination Host Unreachable
From tiger.net.nix.dk (172.30.30.20) icmp_seq=390 Destination Host Unreachable
^C
--- vs.net.nix.dk ping statistics ---
396 packets transmitted, 249 received, +3 errors, 37.1212% packet loss, time 406906ms
rtt min/avg/max/mdev = 0.110/5442.399/51895.513/12668.482 ms, pipe 52
09:07:16 ○ [fh@tiger] ~
$
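One thing the excerpt suggests is that the late replies were not delayed individually: the round-trip time falls by roughly one second per sequence number. The Python sketch below estimates when each reply arrived, assuming ping's default interval of one second between requests (the interval is my assumption, and `arrival_times` is a name I made up). Only the high-latency lines shown in the excerpt are included.

```python
import re

EXCERPT = """\
64 bytes from 172.30.30.80: icmp_seq=249 ttl=128 time=215 ms
64 bytes from 172.30.30.80: icmp_seq=198 ttl=128 time=51896 ms
64 bytes from 172.30.30.80: icmp_seq=199 ttl=128 time=50882 ms
64 bytes from 172.30.30.80: icmp_seq=200 ttl=128 time=49869 ms
64 bytes from 172.30.30.80: icmp_seq=201 ttl=128 time=48856 ms
64 bytes from 172.30.30.80: icmp_seq=202 ttl=128 time=47843 ms
64 bytes from 172.30.30.80: icmp_seq=245 ttl=128 time=4278 ms
64 bytes from 172.30.30.80: icmp_seq=246 ttl=128 time=3265 ms
64 bytes from 172.30.30.80: icmp_seq=247 ttl=128 time=2252 ms
64 bytes from 172.30.30.80: icmp_seq=248 ttl=128 time=1239 ms
"""

REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def arrival_times(text: str, interval_s: float = 1.0):
    """Return (seq, estimated arrival in seconds after the first request) for each reply."""
    result = []
    for match in REPLY.finditer(text):
        seq = int(match.group(1))
        rtt_s = float(match.group(2)) / 1000.0
        sent_at = (seq - 1) * interval_s  # request 1 is sent at t = 0
        result.append((seq, sent_at + rtt_s))
    return result


arrivals = arrival_times(EXCERPT)
times = [t for _, t in arrivals]
# Spread between the earliest and latest estimated arrival, in seconds.
print(f"spread: {max(times) - min(times):.3f} s")
```

Under that assumption all replies from icmp_seq 198 to 249 arrive within less than a second of each other, which looks more like packets being held somewhere for up to about 52 seconds and then released in one burst than like a slow link.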
Log snippets
2024-03-29 09:02:24.0376 INFO PROCESSING => '12345678' 'Test', bcId='73a80afe-2963-ec11-9f08-000d3ab4fc9a', companyId='5e82146f-1c98-4d3a-b6dc-c11a6ee8e38d'
2024-03-29 09:02:29.2728 DEBUG Running TaxId Check for: 12345678 Test
[...]
2024-03-29 09:05:10.4772 ERROR The underlying provider failed on Open.
2024-03-29 09:05:22.5534 ERROR An error occurred while sending the request.
2024-03-29 09:05:22.5534 INFO PROCESSING => '' '', bcId='', companyId=''
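To put a number on how long a run survives before the first failure, the log lines can be parsed and the time from the first entry to the first ERROR computed. A small Python sketch, assuming every entry starts with date, time and level as in the excerpt; `parse_entries` and `seconds_to_first_error` are names of my own choosing.

```python
from datetime import datetime

LOG_EXCERPT = """\
2024-03-29 09:02:24.0376 INFO PROCESSING => '12345678' 'Test', bcId='73a80afe-2963-ec11-9f08-000d3ab4fc9a', companyId='5e82146f-1c98-4d3a-b6dc-c11a6ee8e38d'
2024-03-29 09:02:29.2728 DEBUG Running TaxId Check for: 12345678 Test
[...]
2024-03-29 09:05:10.4772 ERROR The underlying provider failed on Open.
2024-03-29 09:05:22.5534 ERROR An error occurred while sending the request.
2024-03-29 09:05:22.5534 INFO PROCESSING => '' '', bcId='', companyId=''
"""


def parse_entries(text: str):
    """Return (timestamp, level, message) tuples; lines that do not parse are skipped."""
    entries = []
    for line in text.splitlines():
        parts = line.split(maxsplit=3)
        if len(parts) < 4:
            continue  # for example the "[...]" elision marker
        try:
            when = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue
        entries.append((when, parts[2], parts[3]))
    return entries


def seconds_to_first_error(entries):
    """Seconds between the first entry and the first ERROR entry, or None if there is none."""
    start = entries[0][0]
    for when, level, _ in entries:
        if level == "ERROR":
            return (when - start).total_seconds()
    return None


entries = parse_entries(LOG_EXCERPT)
print(seconds_to_first_error(entries))
```

For the excerpt this gives about 166 seconds between the first PROCESSING entry and the first ERROR.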
Further testing under different conditions appears to indicate that the guest drops its network connection mainly when the connection is under load, that is, while running the API stress test.
If the guest is not under stress, the disconnect is much less likely to happen.
I simply have no idea why it drops ...