March 2009
The earlier NTP problem seemed to be related to VMWare.
(Small recap: the host that VMWare runs on loses interrupt ticks, and at the same time, perhaps by coincidence, NTP keeps picking the local hardware clock as reference source, ignoring the network servers even though they have a better, numerically lower, stratum.)
This time, Debian Sarge has become Lenny, VMWare Server 1.x has become 2.x, and we again allow the hardware clock as a fallback source at a poor stratum (10) to see how the NTP daemon behaves.
sudo vmware-uninstall.pl
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*RN2-R6509-RP.ne 192.36.143.150   2 u   47  256  377    0.773   -0.919   0.100
driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
server ntp1.rug.nl burst iburst prefer
server ntp2.rug.nl burst iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
restrict 127.0.0.1
restrict ::1
enable ntp
enable kernel
multicastclient
sudo /etc/init.d/ntp restart
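The fallback role of the local clock depends on the fudge line that pushes the 127.127.1.0 driver down to stratum 10. As a rough sanity check, the following Python sketch reads an ntp.conf text and reports that fudged stratum. The helper name `local_clock_stratum` and the shortened sample text are made up for illustration; they are not part of ntpd or of the original setup.

```python
def local_clock_stratum(conf_text):
    """Return the stratum fudged onto the 127.127.1.x local clock driver,
    or None when the text contains no such fudge line."""
    for raw in conf_text.splitlines():
        # Drop trailing comments, then split the directive into words.
        words = raw.split("#", 1)[0].split()
        if len(words) < 4 or words[0] != "fudge":
            continue
        if not words[1].startswith("127.127.1."):
            continue
        if "stratum" in words:
            idx = words.index("stratum")
            if idx + 1 < len(words):
                return int(words[idx + 1])
    return None


# Shortened sample, taken from the configuration shown in this post.
sample_conf = """\
server ntp1.rug.nl burst iburst prefer
server ntp2.rug.nl burst iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
"""

print(local_clock_stratum(sample_conf))
```

If this returns None while a `server 127.127.1.0` line is present, the local clock is running at the driver's default stratum, which is probably not what you want for a mere fallback.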
At first, ntpq -p gives:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l   50   64    3    0.000    0.000   0.001
After a while, this becomes:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*RN2-R6509-RP.ne 192.36.143.150   2 u   28   64    1    0.764   -0.695   0.664
+129.125.3.251   192.36.143.150   2 u   27   64    1    0.559   -0.784  10.486
 LOCAL(0)        .LOCL.          10 l   42   64    1    0.000    0.000   0.001
Note: The local hardware clock is not the reference source, as it shouldn't be. But mind you, this is with no VMWare running, so if the above hypothesis is correct, we don't expect NTP to pick the local clock. Of course there could be other changes causing the improvement...
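Checking by eye which peer carries the `*` tally mark gets tedious when you want to watch this over time. Here is a minimal Python sketch that picks the selected peer out of `ntpq -p` output and tells whether it is the local clock. The function names `selected_peer` and `is_local_clock` are my own invention, and the sample is the peer list shown earlier in this post.

```python
def selected_peer(ntpq_output):
    """Return (remote, stratum) for the peer marked with '*', the one
    ntpd currently synchronises to, or None if no peer is selected."""
    for line in ntpq_output.splitlines():
        if not line.startswith("*"):
            continue
        # Columns: remote refid st t when poll reach delay offset jitter
        fields = line[1:].split()
        return fields[0], int(fields[2])
    return None


def is_local_clock(remote):
    """True when the peer name is the undisciplined local clock driver."""
    return remote.startswith("LOCAL(")


sample = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*RN2-R6509-RP.ne 192.36.143.150   2 u   28   64    1    0.764   -0.695   0.664
+129.125.3.251   192.36.143.150   2 u   27   64    1    0.559   -0.784  10.486
 LOCAL(0)        .LOCL.          10 l   42   64    1    0.000    0.000   0.001
"""

peer = selected_peer(sample)
if peer is None:
    print("no peer selected yet")
elif is_local_clock(peer[0]):
    print("WARNING: synchronised to the local clock")
else:
    print("synchronised to", peer[0], "at stratum", peer[1])
```

On a live host the text could come from something like `subprocess.run(["ntpq", "-p"], capture_output=True, text=True).stdout`, assuming ntpq is on the PATH.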
Fetch the new VMWare from the vmware site (register first and be sent some codes), then untar the thing, cd into it and run sudo ./vmware-install.pl.
I prefer to install to /usr/local, heeding the FHS section of the Debian Policy.
sudo vmware-config.pl
Note: You need kernel headers/sources and a C compiler for that. Since I'm using Debian, I also need to patch the installer with the following, obtained from Ubuntu:

--- /usr/bin/vmware-config.pl.orig 2008-11-28 12:06:35.641054086 +0100
+++ /usr/bin/vmware-config.pl 2008-11-28 12:30:38.593304082 +0100
@@ -4121,6 +4121,11 @@
     return 'no';
   }
 
+  if ($name eq 'vsock') {
+    print wrap("VMWare config patch VSOCK!\n");
+    system(shell_string($gHelper{'mv'}) . ' -vi ' . shell_string($build_dir . '/../Module.symvers') . ' ' . shell_string($build_dir . '/vsock-only/' ));
+  }
+
   print wrap('Building the ' . $name . ' module.' . "\n\n", 0);
   if (system(shell_string($gHelper{'make'}) . ' -C '
              . shell_string($build_dir . '/' . $name . '-only')
@@ -4143,6 +4148,10 @@
   if (try_module($name, $build_dir . '/' . $name . '.o', 0, 1)) {
     print wrap('The ' . $name . ' module loads perfectly into the running kernel.' . "\n\n", 0);
+    if ($name eq 'vmci') {
+      print wrap("VMWare config patch VMCI!\n");
+      system(shell_string($gHelper{'cp'}) . ' -vi ' . shell_string($build_dir.'/vmci-only/Module.symvers') . ' ' . shell_string($build_dir . '/../'));
+    }
     remove_tmp_dir($build_dir);
     return 'yes';
   }
Note: At this point, the VMWare services are running, I see no sign of lost interrupts in the logs yet, and NTP is doing well. With virtual machines running, the logs no longer show a timer problem, but the hardware clock loses 5 minutes in 10. After a couple of minutes of running, ntpq shows that the local clock has precedence once more.
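To put that drift in perspective: ntpd can only correct a clock's frequency by up to 500 parts per million. A quick back-of-the-envelope calculation in Python, assuming "5 minutes in 10" means 5 minutes lost per 10 minutes of real time (the function name `drift_ppm` is just for illustration):

```python
# ntpd's frequency-correction (slew) limit, in parts per million.
NTPD_MAX_SLEW_PPM = 500


def drift_ppm(lost_seconds, elapsed_seconds):
    """Clock drift expressed in parts per million of elapsed time."""
    return lost_seconds / elapsed_seconds * 1_000_000


# Assumption: 5 minutes lost for every 10 minutes of real time.
observed = drift_ppm(5 * 60, 10 * 60)

print(observed)                      # 500000.0 ppm
print(observed / NTPD_MAX_SLEW_PPM)  # 1000.0 times the limit
```

Under that reading the clock is off by a factor of a thousand more than ntpd can slew away, so no amount of tuning in ntp.conf is going to discipline it.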
Warning: And the VMWare Server 2.0 web interface is so bloody slow that we can't interrupt a VM during boot any more :(