Linux (specifically CentOS running trixbox) gains excessive time on system clock

I found this issue specifically on CentOS running the trixbox telephony software, where over a 12-hour period my system clock had gained more than 3 hours.

This is not a good thing for VoIP, as it relies heavily on time for RTP packet switching.

I also had a compounding issue of my system locking up whenever I tried to perform an NTP update from one of my domain controllers, with an error similar to the following: –

BUG: soft lockup - CPU#0 stuck for 10s! [bash:2513]
EIP: 0060:[<c06100b8>]

dahdi_dummy_timer

It turns out that this service is particularly time sensitive, and the very large time step incurred by an NTP update causes it to lock up until the time is back in phase. In my case that would never happen, because of the rate at which the system was gaining time.
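To put the drift in perspective, here is a rough calculation. It assumes exactly 3 hours gained over exactly 12 hours (my real figure was a little higher, so treat the result as a lower bound). The 500 ppm value is the maximum rate at which ntpd will slew a clock, and is included only for comparison.

```python
# Rough drift arithmetic. The 3 h / 12 h figures are rounded from the
# observation above; the 500 ppm limit is ntpd's maximum slew rate.

def drift_ppm(gained_seconds, elapsed_seconds):
    """Return clock drift in parts per million."""
    return gained_seconds / elapsed_seconds * 1_000_000

gained = 3 * 3600      # seconds gained
elapsed = 12 * 3600    # real seconds elapsed

ppm = drift_ppm(gained, elapsed)
seconds_gained_per_minute = gained / elapsed * 60

print(f"drift: {ppm:.0f} ppm")
print(f"gain per real minute: {seconds_gained_per_minute:.0f} s")
print(f"times the 500 ppm slew limit: {ppm / 500:.0f}")
```

A clock running this fast cannot be corrected gradually, which is why every NTP update turned into a large step.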

My solution was to switch the kernel clock source and reduce the timer tick rate at boot time, prevent the dahdi service from starting at boot, and then perform an NTP update and write the system time to the hardware clock, by performing the following steps: –

  1. Modified the kernel boot options by editing the boot loader config file (trixbox uses grub, so I had to edit “/boot/grub/grub.conf”) to add “divider=10 clocksource=acpi_pm” to the end of the appropriate kernel line
  2. Ran “chkconfig dahdi off” to prevent the dahdi service from automatically starting
  3. Restarted the system
  4. Ran “ntpdate -u <NTP server IP>” to update the time
  5. Ran “hwclock --systohc” to update the hardware clock from the system clock
  6. Ran “chkconfig dahdi on” to allow the dahdi service to start automatically again
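If you would rather script step 1 than edit the file by hand, something like the following sketch could work. It only operates on text passed to it (it does not touch /boot/grub/grub.conf itself), the sample entry is made up for illustration, and it appends the options to every kernel line rather than just the one you boot from, so adjust it to suit.

```python
# Sketch only: append kernel options to the "kernel" lines of grub.conf text.
# The sample config below is hypothetical, not copied from a real trixbox box.

def add_kernel_options(grub_conf, options):
    """Return grub.conf text with any missing options appended to each kernel line."""
    out = []
    for line in grub_conf.splitlines():
        fields = line.split()
        if fields[:1] == ["kernel"]:
            # Skip options that are already present so the edit is repeatable.
            missing = [opt for opt in options if opt not in fields]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    return "\n".join(out) + "\n"

sample = """default=0
timeout=5
title trixbox (sample entry)
\troot (hd0,0)
\tkernel /vmlinuz-sample ro root=LABEL=/
\tinitrd /initrd-sample.img
"""

print(add_kernel_options(sample, ["divider=10", "clocksource=acpi_pm"]))
```

Keep a backup of the original file before writing any result back, since a broken kernel line can leave the system unbootable.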

Now the time is accurate and my VoIP calls are working properly.

“Failed to register service principal name” on Hyper-V host

I recently replaced one of my Hyper-V hosts with Windows Server 2008 R2, and noticed that I was getting the following event logged every two minutes: –

Log Name:      Microsoft-Windows-Hyper-V-VMMS-Admin
Source:        Microsoft-Windows-Hyper-V-VMMS
Date:          20/09/2009 5:52:42 PM
Event ID:      14050
Task Category: None
Level:         Error
Keywords:
User:          SYSTEM
Computer:      HyperV01.mydomain.internal
Description:
Failed to register service principal name.

I was nearly certain that this was because I hadn’t removed the computer from the domain before rebuilding it, so it had picked up the old computer account when it was re-joined. The event indicates that there was an error updating the “servicePrincipalName” attribute of the computer account for my Hyper-V server.

I jumped in to my Active Directory to check out the permissions of the computer account first, and the first thing I noticed was that there was an unresolvable SID in my ACL. This wasn’t causing the issue, but it was a good indication that the permissions were probably in need of attention.

To understand how to resolve this issue, it’s important to understand what’s failing. In this case, we can see from event 14050 that the SYSTEM account on my Hyper-V host tried to update the servicePrincipalName attribute of its own computer account within Active Directory, but failed. We believe it’s a permissions issue, so we should check the “SELF” entry in the ACL to see if it has the correct permissions: –

…And bingo! The “SELF” entry is missing the “Validated write to service principal name” permission, so it can’t write the attribute. “SELF” in this case corresponds to the SYSTEM account of the host that owns the computer account.

So I went ahead and granted this permission to the “SELF” entry on the computer account, and confirmed that the servicePrincipalName attribute updated on the next attempt and that the events were no longer being logged.

Windows Server 2008 domain controller blue screens on startup with STOP: c00002e2

Earlier in the year, I had a hardware issue which brought down one of my Hyper-V servers (and my virtual web server hosting this website along with it). When I finally resolved the issue (I had a faulty hard disk), I had to re-install Windows Server 2008 on the Hyper-V server and then bring all of my virtual machines back online. I wrote down my resolution steps, and now have finally had some time to share this.

I used Virtual Machine Manager 2008 to add the rebuilt server back in as a library server, and then copied my virtual machine files in to the Virtual Library share so that it would pick up the machines.

Each of my virtual machines has two disks: one for the operating system, and one for the data. This meant that after I had finished creating the “new” VM using the .vhd containing the OS for each machine, I had to go back and attach the data .vhd as well.

It seems that when you do this (for whatever reason) the disk is not brought back online automatically. As I have my Active Directory database stored on my data drive, when the domain controller attempts to access the database, it can’t, so it blue screens and then restarts. I don’t believe that this is the behaviour on Windows Server 2003 machines, so I am assuming it’s either the behaviour for Windows Server 2008 in general or just for the Server Core install of Windows Server 2008.

In a nutshell, playing around with the virtual disks for a Windows Server 2008 domain controller can cause a lot of grief. If you ever get a BSoD with this message – “STOP: c00002e2 Directory Services could not start because of the following error: A device attached to the system is not functioning. Error Status: 0xc0000001. Please shutdown this system and reboot into Directory Services Restore Mode, check the event log for more detailed information.” – this just might be the reason.

EDIT (04/04/10): I have revisited this article as I ran in to another situation when moving a virtual machine recently. As F.H.R. stated in the comments below, Windows puts the secondary disk offline. If you bring the disk back online using diskpart or Disk Management, you might still have the same issues.

If you do still have a blue screen after bringing the disk back online, ensure that the disk is initialised. If you are using diskpart, select the disk and then use the command “attrib disk clear readonly” which should bring your domain controller back to life.

If you still have issues, follow the rest of this article.

In order to resolve this issue, I presented a new VHD to the domain controller and booted in to Directory Services Restore Mode. Once I was in there, I was able to move my Active Directory database across to this new disk, swap the drive letters around, and then restart.

I’m not sure why this is necessary, but I can tell you that 7 months later I have used this process a few times when playing around with my domain controllers, and with 100% success.

Upgrading, repairing or servicing your Dell Studio One desktop

Recently I organised a new computer for my little brother. As usual, I recommended Dell because of their in-home warranty, which helps considerably given that I’m in a different state and can’t help with hardware issues.

He has a MacBook at the moment, which is probably a decent unit, but when I’m trying to give technical support or deploy updates or extra software it can get a bit troublesome, as my experience is primarily with Windows systems.

We went with the Studio One desktop computer from Dell, because it has that Mac look-and-feel to it, but comes with Vista OEM which should keep us both happy. Here it is below: –

Anyway, the main concern with this system was its upgradability and serviceability, so when it arrived at my office last week one of the first things I did was determine what components were used, and how to replace them.

The Dell Studio One is a self-contained system, where the PC components are stored in the same housing as the LCD panel (directly behind it), so it’s not as simple as just opening up a standard ATX case and replacing dead RAM or a hard drive. Here’s how you can do some basic servicing: –

  1. Disconnect the system from power, and unplug any peripherals
  2. Place the system LCD panel down on a soft cloth or the original foam bag out of the box
  3. On the bottom edge of the system, there is a groove on either side of the stand where you can use soft leveraging to release the back panel from the first two clips (the clip locations are marked in the image below)
  4. With the first two clips released, gently work around the edges of the panel until all clips are released, as per the locations in the image of the removed panel below
  5. With the back panel removed, you will have exposed the interior metal panel which covers the system components – You may also notice that there are clear markings on this panel to indicate the location of the internal components, as well as the screws to remove to get access to the components (I have marked the component locations in blue, the screw markings in yellow and the screws themselves in red in the picture below)
  6. Remove the screws as marked in the image above to remove this panel and expose the main board of the system, and the majority of the system components
  7. The hard disk drive is a standard 3.5″ sized SATA II disk (320GB pictured here)
  8. The system memory is SO DIMM, notebook size (3GB pictured here)
  9. The optical drive is a slot loading, SATA DVD-ROM drive
  10. The CPU is a standard 2.94GHz Core 2 Duo processor

You’ll probably also notice that the main board and cooling system are much closer to those of a notebook than a desktop machine, so this limits some upgrades, but your major components are readily obtainable from just about any retailer.

When you’ve finished servicing your system, be sure to use the provided cleaning cloth.

Sending CTRL + ALT + DEL to an RDP session inside an RDP session

This is just a bit of a quick and random (but useful) tip.

You probably already know that to send CTRL + ALT + DEL to a machine that you’re RDP’d to, you use the CTRL + ALT + END combination instead; but what if you’re trying to send the CTRL + ALT + DEL to an RDP session inside another RDP session? The first machine you’re connected to gets the key combination, not the machine inside the machine you’re connected to.

The answer? The On-Screen Keyboard. It’s under Programs -> Accessories -> Accessibility, and allows you to send key combinations from the machine that is connected to the machine you want to send your key combination to.

I found this particularly useful when trying to bring up the task manager of a machine that I couldn’t risk re-connecting to, as it was unstable and would probably have kicked me off altogether.

Repairing the DHCP Client service after a Conficker worm infection

If you’ve recently removed a Conficker infection from one of your machines, you might find that you can no longer start the DHCP Client service on the machine in question.

This is still a problem even if the machine doesn’t rely on DHCP for its IP addressing, because the DHCP Client service still plays an important role on machines configured with static IPs: it is responsible for dynamic registration and updating of the machine’s DNS records on its configured DNS servers. Without the service starting, the records will eventually get scavenged (if the DNS servers are configured for scavenging) because the records haven’t been “touched” by the DHCP Client service on the machine in question.

In fact, that might be how you determine that a problem exists, because you can no longer resolve machines in DNS that previously had a Conficker infection.
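To illustrate why the records disappear some time after the infection rather than straight away, here is a small sketch of the staleness check as I understand it: a record becomes a scavenging candidate once it has gone unrefreshed for longer than the no-refresh interval plus the refresh interval. The 7 day values are the Windows DNS defaults; your zones may be configured differently, and the dates are made up.

```python
# Sketch of the DNS aging rule, assuming the default 7 day no-refresh and
# 7 day refresh intervals. Check your own zone settings before relying on it.
from datetime import datetime, timedelta

def is_scavengeable(last_refresh, now,
                    no_refresh=timedelta(days=7),
                    refresh=timedelta(days=7)):
    """Return True if a record's timestamp is old enough to be scavenged."""
    return now - last_refresh > no_refresh + refresh

# Hypothetical machine whose DHCP Client service stopped on 1 January.
last_touched = datetime(2009, 1, 1)

print(is_scavengeable(last_touched, datetime(2009, 1, 10)))  # still within the window
print(is_scavengeable(last_touched, datetime(2009, 1, 20)))  # past both intervals
```

The actual removal only happens when a scavenging pass runs on the DNS server, so the record can survive a little longer than this check suggests.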

The service fails to start, because Conficker makes changes to services which call the svchost.exe process so that it can attach itself and attempt to spread throughout a network. During this process, permissions are reset on the registry keys which contain the service information for the DHCP Client service. Without looking too deep in to why these permissions are changed, I suspect that Conficker entirely removes and recreates the DHCP Client service registry keys, which will of course inherit the parent permissions by default. The DHCP Client service requires the following non-inherited permissions in order to be controlled:

HKLM\SYSTEM\CurrentControlSet\Services\Dhcp – NETWORK SERVICE, Read
HKLM\SYSTEM\CurrentControlSet\Services\Dhcp – NETWORK SERVICE, Full control

Setting these permissions will allow the service to be controlled again. I recommend checking the Windows Event Logs to ensure that DNS registration occurs successfully once the service has been started.

Transitioning from an Exchange 2003 to an Exchange 2007 environment

There is no shortage of information out there on transitioning from Exchange 2000 or Exchange 2003 to an Exchange 2007 environment, although when performing the transition myself I found myself relying upon multiple resources to get the job done. Specifically, one of the things that was a problem for me was the fact that there is no real process to “upgrade” to Exchange 2007 while retaining your old server name (a problem for third-party issued SSL certificates).

I’ve run through the transition in a lab environment three times to capture all of the required steps (and hopefully the most common transitioning problems) in order to create a guide on performing this in your own environment.

There are a few assumptions with this guide (you can factor in any differences pretty easily):

  • This is an Exchange 2003 on Windows Server 2003 to Exchange 2007 on Windows Server 2008 transition process
  • This is a single server Exchange environment transitioning to another single server environment
  • The environment uses RPC/HTTPS (now called Outlook Anywhere) for all client access

This guide also includes a transition from your legacy Exchange 2003 environment to a temporary Exchange 2007 server, and then a migration from that temporary server to a new Exchange 2007 server of the same name as your legacy Exchange 2003 server. A quick summary:

  • Existing Exchange 2003 server (I will refer to this as the legacy server) migrating to…
  • Temporary Exchange 2007 server (I will refer to this as the temporary server) migrating to…
  • New Exchange 2007 server (I will refer to this as the new server) and that’s the end of the line

So let’s get started…

  • Export your SSL certificate (if you have a trusted third party issued certificate) from the legacy server to a .pfx file and copy this .pfx file to a file share to be accessible at the end of the transition
  • Ensure that the account to perform the transition is a Domain Administrator and a Schema Admin
  • Ensure that the legacy server has at least Exchange 2003 SP2 applied
  • Ensure that the schema master domain controller, and any global catalog servers in the same site as the Exchange server have at least Windows 2003 SP1 applied
  • Ensure that the domain functional level is Windows 2000 native or higher
  • Add the following registry key to the legacy server

HKLM\SYSTEM\CurrentControlSet\Services\RESvc\Parameters\SuppressStateChanges = 1 (DWORD)

  • Provision the temporary server using Windows Server 2008 and join to the domain
  • Install Windows PowerShell and the Remote Server Administration Tools (RSAT), and issue the following commands to install the IIS dependencies:

ServerManagerCmd -i Web-Server
ServerManagerCmd -i Web-Dyn-Compression
ServerManagerCmd -i Web-Stat-Compression
ServerManagerCmd -i Web-Basic-Auth
ServerManagerCmd -i Web-Windows-Auth
ServerManagerCmd -i Web-Digest-Auth

  • Install the Exchange Best Practices Analyzer (ExBPA) on any server that has an internet connection and access to the domain controllers, and run an Exchange 2007 readiness check to determine if there are any warnings or recommendations
  • Run setup.com /PrepareAD from the Exchange 2007 install media (make sure to “Run as Administrator” if UAC is turned on)
  • Install Exchange 2007 on the temporary server as a “Typical Exchange Server Installation” and select the legacy Exchange server when prompted for mail flow settings (make sure to “Run as Administrator” if UAC is turned on)
  • Using the Exchange Management Console (EMC) on the temporary server, migrate all mailboxes using the “Recipient Configuration | Mailbox” menu
  • Remove all mailbox databases from the legacy server
  • Ensure that the new mailbox databases are configured to use the new public folder database as their default
  • Migrate all public folder replicas by running moveallreplicas.ps1 -server <legacyserver> -newserver <temporaryserver> from the temporary server
  • Wait for all of the records under the “Public Folder Instances” node in the public folder database in Exchange System Manager (ESM) to disappear on the legacy server (this can take a while, sometimes days, depending on the amount of data)
  • Using ESM on the legacy server, create a new “Public Folder Container” directly under the new Exchange 2007 administrative group
  • Drag the existing “Public Folders” tree under the “Folders” tree in the new administrative group
  • Remove the public folder store from the legacy server, selecting the temporary server’s public folder database when prompted for a new store
  • Remove all storage groups from the legacy server
  • Remove the routing group connectors that were created during the Exchange 2007 install, using ESM on the legacy server, under both the legacy and the new administrative groups
  • Delete the domain and enterprise Recipient Update Service (RUS) objects using adsiedit.msc
  • Uninstall Exchange 2003 from the legacy server
  • Using adsiedit.msc on the temporary server, delete the legacy administrative group
  • Delete the legacy Exchange Domain Servers and Exchange Enterprise Servers groups (if they aren’t being used for other custom purposes in your environment)
  • If the legacy server is being entirely decommissioned, remove it from the domain and shut it down
  • Provision the new server using Windows Server 2008 and join to the domain
  • Install Windows PowerShell and the IIS dependencies as per the temporary server build
  • Install Exchange 2007 on the new server as a “Typical Exchange Server Installation”
  • Install the latest Exchange 2007 updates, including service packs and update rollups (this is important) and then restart the server
  • Rename the storage group and mailbox databases as desired
  • Configure the location for the storage group and mailbox database as desired
  • Create a new public folder database with the desired name and location
  • Migrate all mailboxes from the temporary server to the new server using the “Recipient Configuration | Mailbox” menu
  • Remove the mailbox database(s) from the temporary server
  • Change the default public folder database on the mailbox database(s) on the new server to be the new public folder database (it’s under the “Client Settings” tab of the mailbox database properties)
  • Move the offline address book by running Move-OfflineAddressBook -identity “\<oabname>” -server <newserver> -confirm:$false
  • From the temporary server, move all public folder replicas to the new server by running moveallreplicas.ps1 -server <temporaryserver> -newserver <newserver>
  • Monitor the status of the replica move by running Get-PublicFolderStatistics -server <temporaryserver> until no items are returned, or you can append | Measure-Object -Line to count the number of lines returned to monitor the public folders moving (this can take hours, days or weeks depending on the amount of data)
  • Remove the public folder database from the temporary server
  • Remove Exchange 2007 from the temporary server, remove the server from the domain and shut down
  • Create a new wildcard (*) send connector using the “Organization Configuration | Hub Transport” menu
  • Configure the “Default <servername>” receive connector to allow “Anonymous users” to connect using the “Server Configuration | Hub Transport” menu
  • Copy the exported .pfx file from earlier and use the “Server Certificates” option on the parent node in IIS7 to import the certificate
  • Change the certificate used by OWA by selecting the Default Web Site, clicking the “Bindings” menu on the right hand side, and editing “https”
  • Select the imported certificate from the drop-down box and save settings
  • Install the “RPC over HTTP Proxy” feature
  • Enable Outlook Anywhere using the “Server Configuration | Client Access” menu in EMC on the new server, right clicking on the server and selecting “Enable Outlook Anywhere”
  • Allow 15 minutes before testing (check the event logs for event ID 3006 which indicates that Outlook Anywhere is configured)
  • Edit the hosts file to comment out the IPv6 localhost line (::1) and add the following lines

127.0.0.1 <hostname>
127.0.0.1 <hostname.domain>

  • Restart the new server and test the connection
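The hosts file change near the end of the list can be sketched as follows. This works on text rather than on the real hosts file, and the host names mail01 and mail01.example.internal are placeholders I have made up for <hostname> and <hostname.domain>.

```python
# Sketch only: comment out the IPv6 localhost line and add loopback entries.
# "mail01" and "mail01.example.internal" are hypothetical placeholder names.

def patch_hosts(hosts_text, hostname, fqdn):
    """Return hosts file text with ::1 commented out and 127.0.0.1 entries added."""
    out = []
    for line in hosts_text.splitlines():
        fields = line.split()
        # Only touch an active ::1 entry; already commented lines are left alone.
        if fields and fields[0] == "::1":
            line = "# " + line
        out.append(line)
    for name in (hostname, fqdn):
        entry = f"127.0.0.1 {name}"
        if entry not in out:
            out.append(entry)
    return "\n".join(out) + "\n"

sample = "127.0.0.1 localhost\n::1 localhost\n"
print(patch_hosts(sample, "mail01", "mail01.example.internal"))
```

Running it a second time on its own output makes no further changes, which is handy if the edit ends up in a build script.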

I have a fairly complicated environment regarding permissions, address lists and the like, so I found that I had to go through and make sure that my showInAddressBook attributes were set correctly on all of my mailboxes (the address lists were fine, but the global address lists were not). This may not be an issue in your environment, but feel free to drop me a line if you’re having issues.

COM Surrogate errors in Windows 7 x64 after installing Adobe CS4

I’m running the latest official beta of Windows 7 (build 7000) x64 and have been getting a whole bunch of these errors…

Runtime Error!
Program C:\Windows\SysWOW64\DLLHost.exe

This application has requested the Runtime to terminate it in an unusual way.

Generally, it happens when opening the Control Panel, but it can happen when doing other things that are obviously enumerating items from the Control Panel, such as Mobile Device Center or opening the Network and Sharing Center.

Turns out that it’s due to Adobe CS4 (not sure about CS3) being installed, as the VersionCue .cpl file doesn’t play nicely with Windows 7 x64.

I was able to work around this issue by renaming the VersionCueCS4.cpl file to VersionCueCS4.old in the C:\Program Files (x86)\Common Files\Adobe\Adobe Version Cue CS4\Server\bin\ directory.
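If you need to apply the same workaround on more than one machine, the rename can be scripted. The sketch below demonstrates it against a throwaway file in a temporary directory rather than the real Adobe path, so point it at the path above only once you are happy with it (and run it elevated, since the file lives under Program Files).

```python
# Sketch only: rename a .cpl file to .old so the Control Panel stops loading it.
# The demo uses a temporary directory instead of the real Version Cue path.
import tempfile
from pathlib import Path

def disable_cpl(cpl_path):
    """Rename e.g. VersionCueCS4.cpl to VersionCueCS4.old and return the new path."""
    src = Path(cpl_path)
    dst = src.with_suffix(".old")
    src.rename(dst)
    return dst

demo_dir = Path(tempfile.mkdtemp())
demo_file = demo_dir / "VersionCueCS4.cpl"
demo_file.write_text("placeholder")

renamed = disable_cpl(demo_file)
print(renamed.name)
```

Renaming rather than deleting keeps the workaround easy to reverse if a later update fixes the problem.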

No more COM Surrogate errors!

Error When Trying to Demote a Windows Server 2008 Domain Controller via the Command Line

If you’ve deployed a Windows Server 2008 Server Core install running AD DS, you’ll be familiar with the promotion process. If you’ve ever demoted a Domain Controller from the command line, you may have come across an issue which makes your heart skip a beat (the last thing you want is for a Domain Controller promotion or demotion operation to fail, because you never know what you’re going to be left with). It’s actually a pretty easy “fix” which is handy to know before you try going crazy in adsiedit or anything like that.

The error can occur while the demotion process is attempting to stop the NETLOGON service. The stop request times out, you get a message indicating that the operation has completed, and then an error appears on the very next line and your demotion stops.

Stopping service NETLOGON

……………………………………..
The attempted domain controller operation has completed
Failed to configure the service NETLOGON as requested

When I first did this, I figured that the dcpromo process would be aware of the last failure and retry. That is sort of the case, except you get the following error:

The wizard cannot access the list of domains in the forest. The error is: the interface is unknown.

This is another error which sounds particularly ominous, but it’s not. The first demotion attempt failed because the NETLOGON service either didn’t stop in time or didn’t return a success code to the stop request. By the time you run the second demotion, however, the service has stopped, and the demotion isn’t going to work with the NETLOGON service stopped.

The solution is as simple as starting the NETLOGON service again by typing “net start netlogon” from the command line, and then retrying your demotion. The dcpromo will pick up from where it was before, and nearly always complete successfully this time around.

Unrealistically Fast (or Negative) Ping Responses in Server 2003 Hyper-V Guests

I came across an interesting problem the other day while I was doing some unrelated troubleshooting on one of my Hyper-V guests.

The symptoms were that my Windows Server 2003 machine would return very strange results when pinging hosts, both internally and externally, such as returning all four responses within about half a second, yet measuring them at over 3000ms (which means they should have timed out, rather than given me a reading in milliseconds) as well as occasionally providing negative values for response times.

Obviously the results were completely inaccurate, but I couldn’t work out why the issue was only happening on a handful of (not all) Hyper-V guests running Windows Server 2003 and on none running Server 2008.

Turns out that this is an issue if all of the following are true:

  • You are running an operating system prior to Windows Vista or Windows Server 2008
  • You are running the current implementation of Microsoft Hyper-V (i.e. at the time of writing)
  • You have presented multiple processors to the Hyper-V guest

The issue occurs because the multiprocessor HAL in Hyper-V causes the guest’s operating system Time Stamp Counter (TSC) to skew. According to this blog the problem wouldn’t ordinarily occur if you were running Windows Server 2003 with SP2 unless the BIOS check fails to determine if the TSC should be used. More specifically, if I understand correctly the issue occurs because the processors (or cores, if we’re talking about a single multicore processor) are not in sync with each other, which produces sporadic out-of-time results where time sensitive operations (such as ping responses) are in use.

The resolution is to force the guest to use the PM timer instead of the TSC, by adding /USEPMTIMER in the boot.ini file and then restarting. You can easily test this by running a ping -t to a host and checking for drastically abnormal results.
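If you want to leave ping -t running for a while and check the results afterwards, a small script can pick out the abnormal replies for you. This is only a sketch: it assumes the English Windows ping reply format, the 3000ms threshold is simply the figure I saw above rather than any official limit, and the sample lines and addresses are made up.

```python
# Sketch only: flag ping replies with negative or implausibly large times.
# Assumes the English Windows reply format; sample output is hypothetical.
import re

TIME_RE = re.compile(r"time[=<](-?\d+)ms")

def suspicious_replies(ping_output, max_ms=3000):
    """Return (milliseconds, line) pairs for replies that look wrong."""
    flagged = []
    for line in ping_output.splitlines():
        match = TIME_RE.search(line)
        if not match:
            continue  # not a reply line (header, timeout, statistics)
        ms = int(match.group(1))
        if ms < 0 or ms > max_ms:
            flagged.append((ms, line.strip()))
    return flagged

sample = """Pinging 192.0.2.10 with 32 bytes of data:
Reply from 192.0.2.10: bytes=32 time=12ms TTL=128
Reply from 192.0.2.10: bytes=32 time=-15ms TTL=128
Reply from 192.0.2.10: bytes=32 time=3120ms TTL=128
Reply from 192.0.2.10: bytes=32 time<1ms TTL=128
"""

for ms, line in suspicious_replies(sample):
    print(ms, "->", line)
```

On a guest with the timer fix applied, a long capture should produce no flagged lines at all.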