I ran into this issue on CentOS running the trixbox telephony software, where my system clock gained more than 3 hours over a 12-hour period.
This is bad news for VoIP, which relies heavily on accurate timing to handle RTP packets.
To compound the problem, my system locked up whenever I tried to perform an NTP update from one of my domain controllers, with an error similar to the following:
BUG: soft lockup - CPU#0 stuck for 10s! [bash:2513]
EIP: 0060:[<c06100b8>]dahdi_dummy_timer
It turns out that the dahdi service is particularly time sensitive, and the very large time step caused by an NTP update makes it lock up until the time is back in phase. In my case that would never happen because of the rate at which the system was gaining time.
My solution was to change the kernel's timer options at boot time, prevent the dahdi service from starting at boot, and then perform an NTP update and set the hardware clock from the system time. I performed the following steps:
- Modified the kernel boot options in the boot loader config file (trixbox uses grub, so I had to edit “/boot/grub/grub.conf”) to add “divider=10 clocksource=acpi_pm” at the end of the appropriate kernel line
- Ran “chkconfig dahdi off” to prevent the dahdi service from automatically starting
- Restarted the system
- Ran “ntpdate -u <NTP server IP>” to update the time
- Ran “hwclock --systohc” to update the hardware clock from the system clock
- Ran “chkconfig dahdi on” to allow the dahdi service to start automatically again
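The steps above can be sketched as a small shell script. This is only a rough sketch under a few assumptions: the helper names add_kernel_opts and sync_time are my own, the helper appends the options to every kernel line rather than just the "appropriate" one, and it keeps a one-time backup as grub.conf.bak. Check the result by hand before rebooting.

```shell
#!/bin/sh
# Hypothetical helper: append the timer options to each "kernel" line of a
# grub.conf-style file. Lines that already set clocksource= are left alone,
# so running it twice does not duplicate the options.
add_kernel_opts() {
    conf="$1"
    # Keep a one-time backup of the original file.
    [ -e "$conf.bak" ] || cp -p "$conf" "$conf.bak" || return 1
    sed '/^[[:space:]]*kernel[[:space:]]/{/clocksource=/!s/$/ divider=10 clocksource=acpi_pm/;}' \
        "$conf" > "$conf.new" || return 1
    # Write back in place so ownership and permissions are preserved.
    cat "$conf.new" > "$conf" && rm -f "$conf.new"
}

# Hypothetical helper: step the system clock from an NTP server, then copy
# the system time to the hardware clock.
sync_time() {
    ntp_server="$1"
    ntpdate -u "$ntp_server" && hwclock --systohc
}

# Intended order, run as root (commented out so nothing happens on sourcing):
#   add_kernel_opts /boot/grub/grub.conf
#   chkconfig dahdi off
#   reboot
#   sync_time <NTP server IP>
#   chkconfig dahdi on
```

The functions are defined but not called, so the file can be sourced and each step run by hand in the order shown in the comments.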
Now the time is accurate and my VoIP calls are working properly.
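For anyone wanting to confirm the fix, a quick check might look like the sketch below. The function names are my own, and the sysfs path is an assumption that may not exist on every kernel. "ntpdate -q" only queries the server and prints the offset without changing the clock.

```shell
#!/bin/sh
# Print the clocksource the kernel is currently using.
# The default sysfs path is an assumption; pass another path to override it.
current_clocksource() {
    cat "${1:-/sys/devices/system/clocksource/clocksource0/current_clocksource}"
}

# Query an NTP server for the current offset without setting the clock.
clock_offset() {
    ntpdate -q "$1"
}

# Drift in whole seconds per hour, given seconds gained and hours elapsed.
drift_per_hour() {
    echo $(( $1 / $2 ))
}
```

As a point of comparison, the original problem of 3 hours (10800 seconds) gained in 12 hours works out to "drift_per_hour 10800 12", which is 900 seconds gained every hour. Running clock_offset a few hours apart should now show an offset that stays close to zero.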