#9440 closed defect (fixed)
NAT networking rewrites the DF (don't fragment) flag
Reported by: | Oleg | Owned by: | |
---|---|---|---|
Component: | network/NAT | Version: | VirtualBox 4.1.0 |
Keywords: | DF, fragmentation, dont fragment | Cc: | |
Guest type: | Linux | Host type: | Windows |
Description
I found that the NAT network improperly removes the IP packet's DF (don't fragment) flag when the packet crosses the boundary of NAT networking. Bridged networking works fine.
Environment:
VirtualBox 4.1.0, host OS Windows 7, guest OS Linux 2.6.38
Problem:
When we send a UDP packet (I tested only UDP) with the DF flag set (0x02) in the IP header to the "outside world", it appears on the wire outside the NAT network with IP flags of 0 - the DF flag disappears.
The same is true on the receiving side: when a packet on the wire has the 0x02 flag set in the IP header, it is delivered to the NAT network with flags 0.
When I change the networking setting to "bridged", everything works normally - the DF flag is preserved both ways. So this is a NAT-only problem, but a rather annoying one for advanced applications that have to perform PMTU discovery, for example.
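The DF value described above can be checked directly in a captured IPv4 header. The following is a minimal Python sketch for illustration only; the function names and the sample header builder are assumptions made for this example. It reads the flags as the top three bits of the 16-bit flags/fragment-offset word, so a flags value of 0x02 means DF is set.

```python
import struct

DF_MASK = 0x4000  # Don't Fragment bit inside the 16-bit flags/fragment-offset word


def ip_flags(ipv4_header: bytes) -> int:
    """Return the 3-bit IPv4 flags field (0x02 means DF is set)."""
    if len(ipv4_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # The flags/fragment-offset word sits at byte offset 6, in network byte order.
    (flags_frag,) = struct.unpack_from("!H", ipv4_header, 6)
    return flags_frag >> 13


def has_df(ipv4_header: bytes) -> bool:
    """True if the Don't Fragment bit is set in the header."""
    return bool(ip_flags(ipv4_header) & 0x02)


def sample_header(df: bool) -> bytes:
    """Build a 20-byte IPv4/UDP header for illustration (checksum left at 0)."""
    flags_frag = DF_MASK if df else 0
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                       # version 4, header length 5 words
        0,                          # type of service
        28,                         # total length: 20-byte IP + 8-byte UDP header
        1,                          # identification
        flags_frag,                 # flags + fragment offset
        64,                         # TTL
        17,                         # protocol: UDP
        0,                          # header checksum (not computed here)
        bytes([10, 0, 2, 15]),      # source: the guest address from this ticket
        bytes([10, 207, 21, 238]),  # destination: the server from this ticket
    )
```

Applied to the first 20 bytes of the IP layer of each packet in a guest-side and a host-side capture, `has_df` should return the same value for the same packet on both sides if the NAT preserves the flag.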
Change History (22)
comment:2 Changed 13 years ago
Replying to holger67:
> I tested this issue with version 4.1.2 - the same problem.
Could you please add the log? Could you please point out which application is affected?
comment:3 Changed 13 years ago
I attached the log file, and I attached two network captures to illustrate the problem:
1) Guest.pcap is the capture taken from the perspective of the guest. The guest, as a client, establishes DTLS communication with a server in the outer network (you can decode the packets as "DTLS" protocol for a nicer view). The server system IP address is 10.207.21.238, the client (guest OS) is 10.0.2.15. Both server and client set the DF (don't fragment) flag in the IP header. As you can see in the IP header, all outgoing packets, from the perspective of the guest, have the DF flag, but all incoming packets (which have already passed thru NAT) have the DF flag cleared.
2) Host.pcap shows exactly the same capture, from the perspective of the Host system. You can see exactly the same packets, but the DF flag values are mirrored: the outgoing packets have them zero (because they passed thru the NAT) but incoming packets have them properly set (because the incoming packets have not passed thru NAT, yet).
The correct behavior would be to carry the DF flag value in the IP header through unchanged, in both directions (outgoing and incoming). Currently, it is cleared in both directions.
Any network application that uses PMTU discovery is affected. For example, an SSL-based VPN over UDP. To test this functionality, just use the IP_MTU_DISCOVER socket option in the Linux guest OS and send a packet outside - and see what happens.
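The test suggested above could look roughly like the following Python sketch for a Linux guest. It is an illustration, not the code used in the original tests: the helper names are made up for this example, and the numeric fallbacks (10 and 2) are the Linux values from `<linux/in.h>`, used only if the Python build does not export the constants.

```python
import socket

# Linux-only socket option; fall back to the <linux/in.h> values if this
# Python build does not expose the names in the socket module.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)


def make_df_udp_socket() -> socket.socket:
    """Create a UDP socket whose outgoing datagrams carry the DF flag (Linux)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_PMTUDISC_DO makes the kernel set DF on every datagram and refuse
    # to fragment locally.
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    return sock


def send_probe(sock: socket.socket, host: str, port: int, size: int = 1200) -> int:
    """Send one zero-filled datagram of `size` payload bytes; return bytes sent."""
    return sock.sendto(b"\x00" * size, (host, port))
```

Calling `send_probe(make_df_udp_socket(), "10.207.21.238", port)` with a port of your choice while capturing on both the guest and the host should show whether DF survives the NAT.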
follow-up: 5 comment:4 Changed 13 years ago
I also tested the bridged network in VirtualBox - the bridge works fine. Only NAT has this bug.
comment:5 Changed 13 years ago
Replying to holger67:
> I also tested the bridged network in VirtualBox - the bridge works fine. Only NAT has this bug.
Thanks for reporting. I'll investigate the issue. IIRC, PMTU discovery relies on ICMP, which is probably OK for Unix but not for Windows hosts (where we have to use the ICMP API).
follow-up: 7 comment:6 Changed 13 years ago
PMTU discovery in general does not rely solely on ICMP, because many routers in the middle of the cloud do not return the correct ICMP messages, so reliable full PMTU discovery must be implemented at the application level. For that, we have to rely on the DF flag being carried correctly through all network hops. The bug that I filed is not about ICMP - it is only about DF in the IP header. If there are problems with ICMP, then that would be a different bug. Thanks!
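To show why application-level discovery depends on DF, here is a minimal Python sketch of such a search. It is an assumption-laden illustration, not the reporter's implementation: `probe` is a hypothetical callback that sends a DF-marked packet of the given size and returns True only if the peer acknowledged it, and the default bounds (576 and 1500) are arbitrary choices for the example.

```python
from typing import Callable


def discover_pmtu(probe: Callable[[int], bool], low: int = 576, high: int = 1500) -> int:
    """Binary-search the largest packet size for which probe(size) succeeds.

    `probe` is hypothetical: it should send a DF-marked packet of `size`
    bytes and return True only if the peer acknowledged receiving it.
    """
    if not probe(low):
        raise RuntimeError("even the smallest probe was not acknowledged")
    while low < high:
        # Round up so the loop always makes progress when probe(mid) succeeds.
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid       # a packet of this size got through; try larger
        else:
            high = mid - 1  # dropped somewhere on the path; try smaller
    return low
```

With DF preserved, oversized probes are dropped and the search converges on the real limit: `discover_pmtu(lambda size: size <= 1400)` returns 1400. If a hop clears DF, oversized probes are fragmented and still arrive, every probe succeeds, and the search wrongly reports `high`.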
follow-up: 8 9 comment:7 Changed 13 years ago
Replying to holger67:
> PMTU discovery in general does not rely solely on ICMP, because many routers in the middle of the cloud do not return the correct ICMP messages, so reliable full PMTU discovery must be implemented at the application level. For that, we have to rely on the DF flag being carried correctly through all network hops. The bug that I filed is not about ICMP - it is only about DF in the IP header. If there are problems with ICMP, then that would be a different bug. Thanks!
Interesting, I thought that if one of the nodes can't send the packet without fragmentation, it has to send ICMP type 3 code 4 (RFC 792). In addition, NAT shouldn't map an incoming datagram directly to an outgoing one, because it uses the socket API, and we don't use raw sockets except on Linux and Solaris for ICMP. And this requires investigation: do we have to support PMTU or not? NAT implements RFC 3022 and, for protocol handling, RFC 791, 792, 793 and 768, and none of them makes RFC 1191 mandatory. Moreover, RFC 791 says in "Fragmentation and Reassembly":
> If the Don't Fragment flag (DF) bit is set, then internet fragmentation of this datagram is NOT permitted, although it may be discarded. This can be used to prohibit fragmentation in cases where the receiving host does not have sufficient resources to reassemble internet fragments.
Again, if the affected application is critical, we might investigate this and probably add it.
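For readers unfamiliar with the ICMP message mentioned above, the following Python sketch parses a "fragmentation needed and DF set" message (type 3, code 4) using the layout from RFC 1191, where the next-hop MTU occupies the low 16 bits of the second header word. The function name is an assumption for this example; it is not VirtualBox code.

```python
import struct
from typing import Optional

ICMP_DEST_UNREACH = 3  # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4   # ICMP code: fragmentation needed and DF set


def next_hop_mtu(icmp_message: bytes) -> Optional[int]:
    """Return the next-hop MTU from an ICMP type 3 code 4 message.

    Returns None for any other message. A result of 0 means the router
    left the field unused (behavior that predates RFC 1191).
    """
    if len(icmp_message) < 8:
        return None
    # Header layout: type, code, checksum, unused (16 bits), next-hop MTU (16 bits).
    icmp_type, code, _checksum, _unused, mtu = struct.unpack_from("!BBHHH", icmp_message)
    if icmp_type != ICMP_DEST_UNREACH or code != ICMP_FRAG_NEEDED:
        return None
    return mtu
```

A sender only ever receives this message if DF was still set when the packet reached the router that could not forward it, which is why clearing DF at the NAT also hides these replies.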
comment:8 Changed 13 years ago
I am not suggesting to support PMTU. I am just asking to route the packets through NAT in a consistent way - please preserve whatever flags had been set in the original IP header. Currently, NAT clears the DF flag in both directions. I am just asking not to clear it. Please do not complicate the matter by expanding the bug, I believe this is something very simple.
follow-up: 10 comment:9 Changed 13 years ago
> Interesting, I thought that if one of the nodes can't send the packet without fragmentation, it has to send ICMP type 3 code 4 (RFC 792).
And this is actually not true for most existing routers on the Internet. Admins often disable this feature intentionally.
follow-up: 11 12 comment:10 Changed 13 years ago
Replying to holger67:
>> Interesting, I thought that if one of the nodes can't send the packet without fragmentation, it has to send ICMP type 3 code 4 (RFC 792).
> And this is actually not true for most existing routers on the Internet. Admins often disable this feature intentionally.
Yes, that is actually what RFC 791 says. VBox's NAT is a "router"; that's why I'm asking which applications are affected and how critical it is. Because if we don't discard this flag, we should be prepared to handle different situations, some of which are restricted by the host API.
comment:11 Changed 13 years ago
> Yes, that is actually what RFC 791 says. VBox's NAT is a "router"; that's why I'm asking which applications are affected and how critical it is. Because if we don't discard this flag, we should be prepared to handle different situations, some of which are restricted by the host API.
I cannot name particular applications. We encountered this problem while working on our custom SSL VPN project. So I guess that VPN clients or servers may be affected. Other network development projects may be affected. Linux virtual clients are often used for development testing, so I guess some other developers might be affected.
comment:12 Changed 13 years ago
> Because if we don't discard this flag, we should be prepared to handle different situations, some of which are restricted by the host API.
I'd suggest the simplest fix - just preserve the flag and don't do anything else from RFC 791. At least then VB NAT networking would behave similarly to the de facto behavior of existing Internet routers. That would be more-or-less acceptable modern behavior. The current VB NAT behavior is just very unusual and inconvenient. Thanks!
comment:13 Changed 10 years ago
Description: | modified (diff) |
---|
This should be mostly fixed in the forthcoming 4.3.16.
"Mostly" since not all hosts provide an API to manage the Don't Fragment flag. MacOS and older Solaris versions don't have it. Raw sockets with IP_HDRINCL are always an option, but that requires non-trivial changes that have to be evaluated separately.
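As an illustration of what "an API to manage the Don't Fragment flag" can mean on a Linux host, the Python sketch below copies the guest's DF bit onto a host-side UDP socket before the datagram is relayed. This is a guess at the general approach under stated assumptions, not VirtualBox's actual implementation: the function names are hypothetical, the numeric fallbacks are the Linux values from `<linux/in.h>`, and other hosts use different options or none at all.

```python
import socket
import struct

# Linux-only option and values; fall back to the <linux/in.h> numbers if
# this Python build does not expose the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)


def guest_df(ipv4_header: bytes) -> bool:
    """True if the DF bit is set in the guest's IPv4 header."""
    (flags_frag,) = struct.unpack_from("!H", ipv4_header, 6)
    return bool(flags_frag & 0x4000)


def apply_guest_df(host_sock: socket.socket, guest_ipv4_header: bytes) -> None:
    """Make the host socket send with the same DF setting the guest used."""
    mode = IP_PMTUDISC_DO if guest_df(guest_ipv4_header) else IP_PMTUDISC_DONT
    host_sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, mode)
```

Because the NAT relays payloads through ordinary host sockets rather than forwarding the guest's IP header, the flag has to be re-created this way on each host platform, which is why hosts without such an option are not covered.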
comment:15 Changed 10 years ago
I am using version 4.3.12 (Windows host) and it says that "you are running the most recent version of the VirtualBox". The information on virtualbox.org tells me that the Windows version 4.3.16 is unstable for now, so I cannot upgrade to 4.3.16. When an upgrade from 4.3.12 to 4.3.16 becomes available, I'll try to test it.
Thanks Oleg
comment:16 Changed 10 years ago
VirtualBox 4.3.16 is not unstable. Many users had problems starting VMs with VBox 4.3.14 on Windows hosts due to additional security measures we had to introduce with that release. We fixed a number of problems but most likely some corner cases remain. That's not a reason not to upgrade to VBox 4.3.16. The worst thing that can happen is a conflict between Windows and the virus scanning application on your host. In that case you would not be able to start a VM, and you could still downgrade to the old version.
comment:18 Changed 10 years ago
I updated to 4.3.16 and configured my tests. I do not see any abnormalities in the DF flag transmission. I suppose that the bug is fixed. Thanks a lot!
Oleg