#15223 closed defect (fixed)
VBoxNetNAT high CPU utilisation => Fixed in SVN
回報者: | planck_length | 負責人: | |
---|---|---|---|
元件: | network/NAT | 版本: | VirtualBox 5.0.16 |
關鍵字: | 副本: | ||
Guest type: | other | Host type: | Mac OS X |
描述
I'm using NAT network on my Linux guests. Host is OS X 10.11.3
When I run Activity Monitor on OS X, VBoxNetNAT is always at 100% CPU utilisation (or close to it).
This only happens while there is at least one guest running.
附加檔案 (6)
更動歷史 (28)
9 年 前 由 編輯
跟進: 4 comment:3 9 年 前 由 編輯
I don't see how that sample can correspond to 100% cpu load. Almost all out of 8908 samples are in some blocking/waiting primitive, so most of the time threads wait for stuff to happen. Only a few dozens (i.e. less than 1%) of samples are elsewhere. There's one call to tcp timer on lwIP2 thread (they are called every 250ms, but only when there are active tcp pcbs). There's some activity - a dozen of samples on lwIP3 thread that polls the host-side sockets. IOW, this sampling looks pretty idle to me.
跟進: 5 comment:4 9 年 前 由 編輯
Replying to vushakov:
I don't see how that sample can correspond to 100% cpu load...
When took the sample, Activity Monitor said it didn't work, and I had to do it from the command line using sudo. I don't know if that has anything to do with it.
Nevertheless, Activity Monitor shows it consistently at 100%. This only starts happening a few hours after I spin up the first guest, but once it starts, it stays at 100% until I shutdown down all guests.
Judging by the temperature of my laptop, something is obviously chewing up CPU cycles, even if it isn't this process.
Is there anything else I can send you? The output of top or ps perhaps? Or a screenshot of Activity Monitor?
comment:5 9 年 前 由 編輯
Here's the output of ps. It shows the CPU at around 100%
Also note at the end how I can't run sample on this process unless I use sudo. This is despite the fact the process is owned by me.
$ ps aux | grep VBoxNetNAT na 6196 100.0 0.1 2541024 6752 ?? R 10:23am 28:41.29 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 100.0 0.1 2541024 6752 ?? R 10:23am 28:51.97 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 99.9 0.1 2541024 6752 ?? R 10:23am 28:53.48 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 99.3 0.1 2541024 6752 ?? R 10:23am 29:00.67 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 100.0 0.1 2541024 6752 ?? R 10:23am 29:01.42 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 100.0 0.1 2541024 6752 ?? R 10:23am 29:01.99 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ ps aux | grep VBoxNetNAT na 6196 100.0 0.1 2541024 6752 ?? R 10:23am 29:02.64 /Applications/VirtualBox.app/Contents/MacOS/VBoxNetNAT --ip-address 10.0.2.1 --netmask 255.255.255.0 --network nat1 --trunk-type whatever $ sample VBoxNetNAT sample[6363]: sample cannot find any existing process you have access to which has a name like 'VBoxNetNAT'; try running with `sudo`.
I've taken another sample, this time for 100 seconds. Interestingly the CPU utilisation of VBoxNetNAT drops to 90% while sample is running, presumably because it is being suspended every 1ms.
9 年 前 由 編輯
跟進: 7 comment:6 9 年 前 由 編輯
planck_length, any chance to find out which of the VBoxNetNAT threads is responsible for the CPU load? The 'htop' tool might help.
comment:7 9 年 前 由 編輯
Replying to frank:
planck_length, any chance to find out which of the VBoxNetNAT threads is responsible for the CPU load? The 'htop' tool might help.
I installed htop using homebrew, but I can't find any way of showing individual thread usage. Pressing the H key doesn't do anything. Any ideas?
I'm going to attach a screen shot of htop. It looks like most of the CPU usage is occurring in the kernel (red CPU bars). So perhaps there's a system call that's responsible?
9 年 前 由 編輯
跟進: 9 comment:8 9 年 前 由 編輯
Hmm, system ps
(1) has -M
option to show information for individual threads, but there seems to be no way to identify individual threads...
Aside - for htop
(1) output, please, use text display instead of bars in the future. I have mild deuteranomaly, so telling apart red and green lines 1-2 pixels wide is not impossible, but is very hard.
comment:9 9 年 前 由 編輯
Replying to vushakov:
Hmm, system
ps
(1) has-M
option to show information for individual threads, but there seems to be no way to identify individual threads...
How frustrating! Would it take long for one of you guys to write some code that I could compile and run here, that will list each thread ID for a given process ID, along with its CPU usage time? Then I could run it here and send you the results.
Aside - for
htop
(1) output, please, use text display instead of bars in the future. I have mild deuteranomaly, so telling apart red and green lines 1-2 pixels wide is not impossible, but is very hard.
Do you know how I can get htop to do this? EDIT: I figured it out - attaching screenshot now. '*' is kernel and '#' is user,
I'm keen to help you nail this bug, so if there's anything I can do to help, please let me know.
9 年 前 由 編輯
comment:10 9 年 前 由 編輯
You switched htop
to monochrome color scheme. What I meant was switching it to <F2>
(Setup) -> Meters
-> CPUs (1/1) [Text]
- from [Bar]
. You can toggle the display with space or enter, different versions seems to use different keys. This is just for future reference, don't attach another screenshot.
跟進: 12 comment:11 9 年 前 由 編輯
To get kernel stack traces, please, can you run
$ sudo spindump VBoxNetNAT
comment:12 9 年 前 由 編輯
Replying to vushakov:
To get kernel stack traces, please, can you run
$ sudo spindump VBoxNetNAT
The sample file is too large to upload to here, so you'll need to download it from http://buildoneforme.com/VBoxNetNAT_6196.spindump.txt.gz
跟進: 14 comment:13 9 年 前 由 編輯
Please, can you download and install the latest 5.0 test build (revision 106028
+).
Start VirtualBox manually with something like
$ (export VBOXNET_nat1_RELEASE_LOG="+nat_service.e.l.l2.l3.f -nat_service.restrict"; \ export VBOXNET_nat1_RELEASE_LOG_FLAGS="thread timeprog"; \ VirtualBox &)
where nat1
in the variable names is the name of your NAT Network as per the VBox.log
file you've attached.
This will generate quite a bit of extra info in the ~/Library/VirtualBox/nat1.log
log file.
To stop generating those extra logs, save or shutdown all VMs and quit VirtualBox manager. Wait for VBoxSVC
process to terminate automatically after a small timeout. Restart VirtualBox without those environment variables.
Thanks.
9 年 前 由 編輯
附檔: | 新增 nat1.log.gz |
---|
comment:14 9 年 前 由 編輯
Replying to vushakov:
This will generate quite a bit of extra info in the
~/Library/VirtualBox/nat1.log
log file.
You don't say!
It took 2 hours before the problem started again, and another hour before I realised it was happening. By then the log file was already 17G!! I truncated it down to 2MB and gzipped that. It's easy to see where the problem starts: it's "sock 54: HUP" all the way down.
跟進: 16 comment:15 9 年 前 由 編輯
Thanks! I have reproduced the problem now. My guess is that some external client connects to VM's http server via port-forwarding, requests a lot of data, half-closes its tx side and/then aborts the transfer. Or something along this lines. You can probably match this with whatever is really happening in your setup so that you can reproduce this easier.
I'm working on a fix.
comment:16 9 年 前 由 編輯
Replying to vushakov:
Thanks! I have reproduced the problem now. My guess is that some external client connects to VM's http server via port-forwarding, requests a lot of data, half-closes its tx side and/then aborts the transfer. Or something along this lines. You can probably match this with whatever is really happening in your setup so that you can reproduce this easier.
I'm working on a fix.
Fantastic! Yes, I'm using a browser on the host to request data from a server running on the VM.
comment:18 9 年 前 由 編輯
Replying to vushakov:
Please, can you give test build
106083
+ a try?
Apologies for the late reply. I just installed it, and so far no problem. However, I'm not sure whether I know exactly how to reproduce the problem, so I'll keep it running for a few days and see if it reoccurs.
comment:20 9 年 前 由 編輯
comment:21 9 年 前 由 編輯
摘要: | VBoxNetNAT high CPU utilisation → VBoxNetNAT high CPU utilisation => Fixed in SVN |
---|
Please, can you sample the
VBoxNetNAT
process in that state?