
Network performance difference between bridged and NAT

Posted: 30. Dec 2020, 13:38
by Tony Lewis
The attached screenshot shows that I'm getting a massive performance difference between using bridged and NAT networking, and I'm trying to figure out why.

The screenshot is from Wireshark, captured at my router (not the host or guest). The first big bump is running a speedtest[dot]net test when using bridged networking. The second one is using NAT networking a few seconds later after switching it in the VirtualBox interface. There are lots more TCP errors in the packet capture.

The real problem manifests itself in a Windows app on the guest. Under bridged networking, a particular network action takes 30-60 seconds. In NAT, it takes about 5 seconds.

Searching the VirtualBox bug tracker didn't yield much, nor did Google. Any ideas?

Re: Network performance difference between bridged and NAT

Posted: 30. Dec 2020, 19:32
by mpack
I suspect your problem is a badly configured guest. When you use NAT you are in fact using the host's Internet, so it's much more forgiving.

How was the VM created? Did you, for example, create a MAC address that conflicts with another PC on the network, including the host? Or are you trying to statically assign an IP address?

That said, I'm not sure I fully understand your screenshot. Packets/s is not a direct indication of throughput - not without knowing what the packet sizes are. If it IS an indication of throughput then your second paragraph seems to indicate that bridged is better, and then the third paragraph says the opposite. Unless I'm tired and my brain has stopped working?
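
To make the packets/s point concrete, here is a minimal sketch of the arithmetic. The packet sizes are illustrative assumptions (a full-size frame versus a bare ACK), not values read from the capture.

```python
def throughput_mbit_per_s(packets_per_s: float, avg_packet_bytes: float) -> float:
    """Convert a packet rate and an average packet size into Mbit/s."""
    return packets_per_s * avg_packet_bytes * 8 / 1_000_000


# Same packet rate, very different throughput.
# Sizes are assumed for illustration: ~1500-byte data frames vs ~60-byte ACKs.
print(throughput_mbit_per_s(1000, 1500))  # 12.0
print(throughput_mbit_per_s(1000, 60))    # 0.48
```

So 1000 packets/s can mean anything from well under 1 Mbit/s to 12 Mbit/s, depending on what the packets are.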

Re: Network performance difference between bridged and NAT

Posted: 30. Dec 2020, 20:39
by fth0
Using NAT, there's an average of more than 1000 packets per second. Using bridged mode, there's an average of more than 1000 TCP errors per second. I'd suggest analyzing the Wireshark capture. If you need help with that, I could take a look (but note the 128 kB attachment limit in the VirtualBox forums). BTW, delays of >= 30 seconds often indicate DNS problems, since 30 seconds is a common DNS timeout value.
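
One quick way to check the DNS theory from inside the guest is to time a single name lookup in each networking mode. The following is only a sketch: the function name `time_lookup` and the hostname are placeholders, and it uses the system resolver through Python's `socket.getaddrinfo`.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (seconds_taken, error_or_None) for one name lookup."""
    start = time.monotonic()
    try:
        resolver(hostname, None)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return time.monotonic() - start, error


if __name__ == "__main__":
    # Placeholder hostname; substitute whatever the slow application talks to.
    elapsed, error = time_lookup("speedtest.net")
    print(f"lookup took {elapsed:.2f}s, error={error}")
```

If the lookup takes many seconds in bridged mode but returns almost immediately in NAT mode, DNS is a likely suspect. If both are fast, the delay is probably elsewhere.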

Re: Network performance difference between bridged and NAT

Posted: 1. Jan 2021, 14:25
by Tony Lewis
mpack wrote:How was the VM created? Did you, for example, create a MAC address that conflicts with another PC on the network, including the host? Or are you trying to statically assign an IP address?
Good thoughts. The VM was created a few years ago, and as far as I recall it's pretty standard. It's been migrated once or twice, each time to what becomes my new / current laptop. However, to eliminate that, I created a new VM and did the same test. This new capture (screenshot attached) is from a Kubuntu Live instance, and uses Firefox as the browser. It pretty much follows the same pattern.
mpack wrote:That said, I'm not sure I fully understand your screenshot. Packets/s is not a direct indication of throughput - not without knowing what the packet sizes are. If it IS an indication of throughput then your second paragraph seems to indicate that bridged is better, and then the third paragraph says the opposite. Unless I'm tired and my brain has stopped working?
I was not very clear in my first post, sorry. That screenshot, and the first attachment, show the full throughput (line graph) and also the 'TCP Errors', which turns out to mean where Wireshark detects things like duplicate TCP acknowledgements. A symptom of the problem is that doing the same browser-based speed test has a lot more TCP duplicate acks when in bridged mode than in NAT.

To point back to the real world, the second attachment shows the same I/O graph for the application in question. When in bridged mode (first part of the figure), the one action takes a long time and has lots of errors. In NAT (second part), three equivalent actions all happen quickly, with much higher throughput and with fewer TCP errors.

Re: Network performance difference between bridged and NAT

Posted: 1. Jan 2021, 14:33
by Tony Lewis
fth0 wrote:Using NAT, there's an average of more than 1000 packets per second. Using bridged mode, there's an average of more than 1000 TCP errors per second. I'd suggest analyzing the Wireshark capture. If you need help with that, I could take a look (but note the 128 kB attachment limit in the VirtualBox forums). BTW, delays of >= 30 seconds often indicate DNS problems, since 30 seconds is a common DNS timeout value.
I have a filtered PCAP that just shows those speedtests, and I'll DM you a download link. Thanks in advance for your help.

Re: Network performance difference between bridged and NAT

Posted: 1. Jan 2021, 14:44
by Tony Lewis
Tony Lewis wrote:I have a filtered PCAP that just shows those speedtests, and I'll DM you a download link. Thanks in advance for your help.
Turns out I don't have enough rep to post links in PMs. So here it is, using a URL shortening service, which points to an S3 bucket. It's an encrypted zip, and I'll PM you the password.

https://cutt.ly/TjujUrH

Re: Network performance difference between bridged and NAT

Posted: 2. Jan 2021, 04:13
by fth0
I haven't received a private message from you (yet), and I've sent you one myself that you haven't checked yet ...

Re: Network performance difference between bridged and NAT

Posted: 2. Jan 2021, 07:49
by Tony Lewis
fth0 wrote:I haven't received a private message from you (yet), and I've sent you one myself that you haven't checked yet ...
Sorry, user error. You should have it now.

Re: Network performance difference between bridged and NAT

Posted: 2. Jan 2021, 20:51
by fth0
In the Wireshark trace, each received TCP data packet is acknowledged twice (when using bridged mode according to your description). This usually happens when the TCP packets are received by the same TCP endpoint over two different paths. A classical scenario (without VirtualBox) consists of a network professional ;) who connects their computer with a second network interface to a mirror port of a switch, to capture the network traffic outside themselves, thereby creating a second path. With VirtualBox involved, I could also imagine problems when using a wireless network adapter on the host. Please tell us more details about your network setup.

Start a VM, reproduce the problem, shut the VM down, and post the corresponding (zipped) VBox.log file. You could maybe use a simpler interactive TCP connection (e.g. SSH) instead of the speed test, as long as we're only looking for the source of the duplicate ACKs. If we knew the exact number of TCP packets involved, we could compare them to several statistic counters in the VBox.log file.
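
To illustrate what "acknowledged twice" means when counting, here is a simplified sketch. It assumes the segments have already been exported from the capture into plain records; the `Segment` type and its field names are hypothetical. It is also much cruder than Wireshark's own duplicate-ACK analysis, which looks at sequence numbers, window sizes and flags as well.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    stream: int       # TCP stream index, e.g. as numbered by Wireshark
    sender: str       # hypothetical label for the sending endpoint
    ack: int          # acknowledgement number
    payload_len: int  # TCP payload bytes carried by this segment


def count_duplicate_acks(segments: Iterable[Segment]) -> Counter:
    """Count, per stream, pure ACKs that repeat the sender's previous pure ACK."""
    last_pure_ack = {}
    dups = Counter()
    for seg in segments:
        key = (seg.stream, seg.sender)
        if seg.payload_len > 0:
            # Segments carrying data are not candidates; reset the tracking.
            last_pure_ack.pop(key, None)
            continue
        if last_pure_ack.get(key) == seg.ack:
            dups[seg.stream] += 1
        last_pure_ack[key] = seg.ack
    return dups


# Made-up example: every ACK in stream 0 appears twice, stream 3 only sends data.
example = [
    Segment(0, "guest", 1000, 0),
    Segment(0, "guest", 1000, 0),
    Segment(0, "guest", 2448, 0),
    Segment(0, "guest", 2448, 0),
    Segment(3, "guest", 1, 1448),
    Segment(3, "guest", 1, 1448),
]
print(count_duplicate_acks(example))  # Counter({0: 2})
```

A duplicate count close to the number of received data packets in a stream would match the "each packet acknowledged twice" pattern. A much smaller count would look more like ordinary loss recovery.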

Re: Network performance difference between bridged and NAT

Posted: 4. Jan 2021, 02:31
by Tony Lewis
fth0 wrote:In the Wireshark trace, each received TCP data packet is acknowledged twice (when using bridged mode according to your description). This usually happens when the TCP packets are received by the same TCP endpoint over two different paths. A classical scenario (without VirtualBox) consists of a network professional ;) who connects their computer with a second network interface to a mirror port of a switch, to capture the network traffic outside themselves, thereby creating a second path.
The dups don't appear to be exactly twice. Scrolling through I could see there were many cases of only a single ACK. There is no port mirroring.
fth0 wrote:With VirtualBox involved, I could also imagine problems when using a wireless network adapter on the host.
This is looking more like the culprit (see next post)
fth0 wrote:Please tell us more details about your network setup.
The host is a laptop running Kubuntu. The guest is Windows 10. The laptop connects to the home network over wireless. In this case, using NetGear Orbi, with one router and two satellite nodes. This connects to a wired interface which connects to a NetGear gigabit switch, to which the home server is connected. The home server runs Linux and connects to the internet via PPPoE, including firewalling.
fth0 wrote:Start a VM, reproduce the problem, shut the VM down, and post the corresponding (zipped) VBox.log file. You could maybe use a simpler interactive TCP connection (e.g. SSH) instead of the speed test, as long as we're only looking for the source of the duplicate ACKs. If we knew the exact number of TCP packets involved, we could compare them to several statistic counters in the VBox.log file.
Not done yet, but see next post.

Re: Network performance difference between bridged and NAT

Posted: 4. Jan 2021, 02:42
by Tony Lewis
It appears the determining factor is using bridged networking attached to the host's wireless interface.

The two attachments show Wireshark's I/O graph from the following:
  • the guest has two network interfaces, one bridged to the host's wireless interface and the other to the host's wired interface
  • disconnect the wired interface (VBox Settings -> Network -> Adapter (b) -> uncheck Cable Connected) so it is only using wireless
  • do a speedtest in the browser on the guest
  • disconnect the wireless interface and connect the wired interface in VBox Settings
  • do a speedtest in the browser on the guest
The first figure shows packet throughput and TCP errors. It shows that there were lots of packets using the wireless interface, and lots of TCP errors. But for the wired interface there were fewer packets and virtually no errors.
Byte throughput and TCP errors
Screenshot_20210104_112317.png (69.09 KiB)
The second figure shows byte throughput, and shows that the speedtest was more performant on the wired interface despite fewer packets.
Packet throughput
Screenshot_20210104_112351.png (74.59 KiB)
Maybe it's a poor wireless driver on the laptop host. Or maybe the Orbis are performing poorly. Or it could be VirtualBox's use of wireless.
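
For repeating the test steps above without clicking through the settings dialog, the "Cable Connected" toggle can also be driven from the host with `VBoxManage controlvm ... setlinkstate<N> on|off`. This is a rough sketch only: the VM name "Win10" and the adapter numbering (which adapter is bridged to wireless and which to wired) are assumptions that need adjusting to the actual VM.

```python
import subprocess
from typing import List


def link_state_command(vm_name: str, adapter: int, connected: bool) -> List[str]:
    """Build the VBoxManage call that plugs or unplugs one virtual cable."""
    state = "on" if connected else "off"
    return ["VBoxManage", "controlvm", vm_name, f"setlinkstate{adapter}", state]


def use_only_adapter(vm_name, keep, adapters=(1, 2), run=subprocess.run):
    """Connect the cable of adapter `keep` and disconnect all the others."""
    for adapter in adapters:
        run(link_state_command(vm_name, adapter, adapter == keep), check=True)


if __name__ == "__main__":
    # Hypothetical VM name; prints the commands instead of running them.
    for adapter in (1, 2):
        print(" ".join(link_state_command("Win10", adapter, adapter == 1)))
```

Calling `use_only_adapter("Win10", keep=1)` before one speed test and `use_only_adapter("Win10", keep=2)` before the next would reproduce the wireless-only and wired-only runs on a running VM.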

Re: Network performance difference between bridged and NAT

Posted: 4. Jan 2021, 13:57
by fth0
Tony Lewis wrote:The dups don't appear to be exactly twice. Scrolling through I could see there were many cases of only a single ACK.
Inside TCP streams #0, #1 or #2? Note that the duplicate ACKs only appear when no data is sent, which excludes the upstream test TCP streams #3-#7. Additionally, during the recovery from lost TCP downstream packets, you have to look closely to identify which of the duplicate ACKs are from the wrong source.

You could create additional (simultaneous) Wireshark captures in the host OS and the guest OS for comparison. The short TCP stream #0 from the previous Wireshark capture would be enough for further analysis, so you could simply stop the speed test as soon as the downstream test has been running for a few seconds.