Friday, December 08, 2006

Migrating an NT4 domain to a Samba BDC

For a long time now, since we broke away from the VMS collective, we've had a mixture of server operating systems in our department. Early on, we ran a Novell NetWare server for our computing labs. Windows servers later replaced it and served both the computer labs and administrative desktops. Later, we deployed an IIS-based web server. These Windows machines lived alongside AIX compute servers and, later, Linux servers for compute, mail, print, file, web and so on. This heterogeneity peaked sometime in the 90s, and since then we've been trying to consolidate. Our server infrastructure is now split between Linux and Windows, and we're at the point of eliminating the last of the Windows servers. The only Windows servers we still have are an IIS web server and several Windows NT4 domain controllers.

We've been planning to retire the domain controllers for a while, but have been debating how to go about it. We could just do away with domain authentication altogether, or we could piggyback on an Active Directory infrastructure maintained by our parent organization. We've also begun planning for a migration to Linux-based domain controllers, but since the existing Windows domain controllers are lightly loaded and functional, the migration was never a priority. Ultimately, we think we'll just migrate to our parent organization's AD.

We recently ran into a short-term need for a replacement, though. We were down to two Windows domain controllers (a PDC and a BDC), and our PDC failed. We promoted the BDC to a PDC, but that left us with no failover. Since the remaining domain controller is getting old, we were worried that it might fail too, so we did two things in parallel to make sure we had at least one additional domain controller up and running within a day or two: we set up a new NT4 domain controller from scratch, and at the same time we set up a Linux-based backup domain controller.

For the Linux BDC, I started by installing the latest Samba version (3.0.23d). Then I looked around the web for instructions for creating a BDC. For simplicity (to save time in getting the new server up) I decided to use the tdbsam backend rather than configuring an LDAP server. Following the clear instructions in the NT Migration section of the Samba Guide got me a long way.

After I'd configured the smb.conf file (which I'll talk about in detail a little later), I followed these steps (beginning with the smb service turned off):
net rpc getsid -S oldpdc -W OURDOMAIN
net setlocalsid <SID>   (where <SID> is whatever the previous command returned)

service smb start
net rpc join -S oldpdc -W OURDOMAIN -U Administrator

net rpc vampire -S oldpdc -U Administrator
where "oldpdc" is our current Windows PDC and "OURDOMAIN" is our domain name. The final "net rpc vampire" command should suck the user and group information out of the old PDC and populate the samba server's tdbsam databases.
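
The hand-off of the SID between the first two commands can be scripted instead of copied by hand. What follows is a minimal sketch, assuming that "net rpc getsid" prints the domain SID somewhere in its output in the usual S-1-5-21-x-y-z form. The helper name extract_sid is made up for illustration and is not part of Samba.

```shell
#!/bin/sh
# extract_sid: hypothetical helper that prints the first domain SID
# (S-1-5-21-<n>-<n>-<n>) found on standard input, or nothing at all.
extract_sid() {
    grep -Eo 'S-1-5-21(-[0-9]+){3}' | head -n 1
}

# Intended use (needs a reachable PDC and root, so left commented out):
#   sid=$(net rpc getsid -S oldpdc -W OURDOMAIN | extract_sid)
#   [ -n "$sid" ] && net setlocalsid "$sid"
```

Checking that the variable is non-empty before calling "net setlocalsid" avoids setting a bogus SID if the first command failed or printed something unexpected.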

(Note that if you repeat this process, you should first clean out the tdb files in /etc/samba and /var/lib/samba and remove the newly created entries in /etc/passwd, /etc/shadow and /etc/group.)

The first time I ran this, I found that the process began running quickly, then slowed to a near-stop. It turned out that there was a problem with the currently installed version of the groupadd command (from the shadow-utils package on the Fedora-based computer I was using). After adding a few groups, it began eating up memory until the OOM-killer killed it. I fixed this by upgrading shadow-utils to version 4.0.7-9.

This was only the first of the problems I ran into with groups. The "net rpc vampire" command uses the following entries from smb.conf to create local unix users and groups:
  • add user script
  • add group script
  • add user to group script
  • add machine script
Adding groups and adding users to groups turned out to be very tricky.

The first problem with adding groups was the fact that many of our Windows groups have spaces in their names. (E.g., "Domain Admins".) The Linux groupadd command doesn't allow spaces in group names, although (to my surprise) the operating system will honor group names containing spaces.
The Samba documentation and the advice I found online suggest two ways of dealing with this:
  1. Begin by using groupadd to create a dummy group, with a name acceptable to groupadd, and then rename that group once it's in the /etc/group file, or
  2. Create unix groups with mangled names, and then map them (using "net rpc groupmap") in the tdbsam database to Windows group names.
These both sound like reasonable workarounds, but it turns out that only solution number 1 will actually work. I found this out by trial and error, because I initially tried solution number 2.

I tried solution number 2 as follows: The smb.conf documentation says that the "add group script" can be any executable script that returns the gid of the newly-created group. So, I created this script:
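#!/usr/bin/perl
# DON'T USE THIS SCRIPT, IT WON'T WORK!
use strict;
use warnings;
my $newgroup = shift;
# Mangle the Windows name: replace each space with an underscore.
$newgroup =~ s/ /_/g;
# Run groupadd without a shell; send anything it prints to stderr so that
# stdout carries only the gid.
open(my $out, '-|', '/usr/sbin/groupadd', '-f', $newgroup) or die "groupadd: $!";
print STDERR <$out>;
close($out);
# In scalar context, getgrnam returns the gid.
my $gid = getgrnam($newgroup);
print $gid;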
This looks pretty straightforward, and "net rpc vampire" used it without complaint. All of the unix groups were created, and they were all properly mapped automatically to their Windows equivalents (some with spaces in their names). The problem was that users were only added to groups that had no spaces in their Windows names. It turns out that "net rpc vampire" does a brute-force lookup in /etc/group to see if a group exists before it even tries to add members to the group. If the group's literal, Windows, space-containing name isn't there, the group will be silently ignored.
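
A quick way to spot this situation is to check which Windows group names have no entry with exactly that name in the group file. The following is a sketch only: missing_groups is a hypothetical helper, and it works on any file in /etc/group format rather than asking Samba anything.

```shell
#!/bin/sh
# missing_groups FILE
# Reads group names (one per line) on standard input and prints those that
# have no entry with exactly that name in FILE (/etc/group format).
missing_groups() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # awk exits 0 only if some entry's first field equals the name.
        awk -F: -v n="$name" '$1 == n { found = 1; exit } END { exit !found }' "$1" \
            || echo "$name"
    done
}

# Example (the names are fed by hand here):
#   printf 'Domain Admins\nDomain Users\n' | missing_groups /etc/group
```

Any name this prints is a group whose literal Windows name is absent, which is the case in which members were silently skipped.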

So, I fell back on solution number 1, for which a script is actually provided in one of the Samba HOWTOs. This worked like a charm... except that users were added to, at most, a single group.
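
For reference, the idea behind solution number 1 can be sketched as below. This is not the HOWTO script itself, just an illustration of the same technique. The function name rename_group and the temporary group name smbtmpgrp00 are assumptions, and the rewrite is neither atomic nor locked, so treat it as a starting point only.

```shell
#!/bin/sh
# rename_group FILE TMPNAME NEWNAME
# Rewrites FILE (/etc/group format) so that the entry named TMPNAME is renamed
# to NEWNAME, then prints that entry's gid. Returns 1 and prints nothing if
# TMPNAME is not present. A copy of the old file is kept as FILE.bak.
rename_group() {
    file=$1 tmpname=$2 newname=$3
    gid=$(awk -F: -v n="$tmpname" '$1 == n { print $3; exit }' "$file")
    [ -n "$gid" ] || return 1
    cp "$file" "$file.bak" || return 1
    awk -F: -v OFS=: -v n="$tmpname" -v new="$newname" \
        '$1 == n { $1 = new } { print }' "$file.bak" > "$file" || return 1
    echo "$gid"
}

# Intended use as an "add group script" (needs root, so left commented out):
#   groupadd smbtmpgrp00 && rename_group /etc/group smbtmpgrp00 "$1"
```

Because the new name is only ever written by awk, groupadd never sees the spaces it would reject. Names containing backslashes would need extra care, since awk interprets escape sequences in -v assignments.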

The "net rpc vampire" command uses the "add user to group script" from smb.conf. In my case, this looked like
add user to group script = /usr/sbin/usermod -G '%g' '%u'
At first glance, this seems like it would work. (Note that usermod, unlike groupadd, is perfectly happy with group names containing spaces.) Usermod accepts a list of additional groups to which the user should be added. Looking at the man page, however, I saw that it wants a complete list of additional groups. If the user is currently a member of any groups not in the list, usermod removes the user from those groups. So, using usermod as I was, each user would only end up belonging to the last group "net rpc vampire" processed for that user.

The solution was to find a command that would just add a single user to a single group. This turns out to be "gpasswd -a $user $group", from the shadow-utils package. I didn't find anyone else on the web who had used this. In fact, many sources recommend using usermod, as I had done originally. This would work if all users belonged to only a single group, so maybe that's why the problem hadn't been noticed.
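
The difference between the two commands is easy to see in a toy model. The functions below only imitate the membership semantics on a comma-separated string; they are illustrative names, not real tools, and they never touch /etc/group.

```shell
#!/bin/sh
# Toy model only: imitates the two membership semantics on a string.

# Like "usermod -G GROUP USER": the supplementary list becomes exactly GROUP.
replace_groups() {   # replace_groups CURRENT GROUP
    echo "$2"
}

# Like "gpasswd -a USER GROUP": GROUP is appended, existing entries are kept.
append_group() {     # append_group CURRENT GROUP
    case ",$1," in
        *",$2,"*) echo "$1" ;;       # already a member
        ",,")     echo "$2" ;;       # first group for this user
        *)        echo "$1,$2" ;;
    esac
}

# Replay one call per group, the way the script is invoked for a single user.
replay() {           # replay FUNCTION GROUP...
    fn=$1; shift
    members=""
    for g in "$@"; do
        members=$("$fn" "$members" "$g")
    done
    echo "$members"
}
```

Replaying the groups "Domain Users", "Domain Admins" and "staff" through replace_groups leaves only "staff", while append_group keeps all three. Newer shadow-utils releases also document "usermod -a -G" for appending; whether that option is available depends on the installed version.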

With these changes, the final smb.conf file looks like this:
[global]
workgroup = OURDOMAIN
netbios name = PDC

interfaces = eth0, lo
bind interfaces only = Yes
hosts allow = 192.168.0.0/16, 127.0.0.1
hosts deny = 0.0.0.0/0

passdb backend = tdbsam

add user script = /usr/sbin/useradd -g users '%u'
add group script = /etc/samba/smbgrpadd.sh '%g'
add user to group script = /usr/bin/gpasswd -a '%u' '%g'
add machine script = /usr/sbin/useradd -s /bin/false -d /dev/null '%u'

preferred master = Yes
wins support = Yes
shutdown script = /sbin/shutdown
abort shutdown script = /sbin/shutdown -c
logon script = Login.bat

domain logons = Yes

domain master = no

username map = /etc/samba/smbusers

log level = 1
syslog = 0
log file = /var/log/samba.log
max log size = 5000

smb ports = 139 445
name resolve order = wins bcast hosts
time server = Yes
map acl inherit = Yes
printing = cups

[IPC$]
path = /tmp

[netlogon]
comment = Network Logon Service
path = /home/netlogon
guest ok = Yes
locking = No
Using this, the "net rpc vampire" process completed successfully, and we have a Linux/Samba-based BDC up and running.
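
A config file that has been edited over several attempts can easily end up with the same parameter set twice in one section. Samba's own testparm utility checks the file's syntax; the sketch below is a separate, hypothetical helper (find_duplicate_params is an invented name) that just reports repeated parameter names. It is deliberately simple: it does not know about parameter synonyms or backslash line continuations.

```shell
#!/bin/sh
# find_duplicate_params FILE
# Prints "section: parameter" once for every parameter that is set more than
# once within the same section of an smb.conf-style file.
find_duplicate_params() {
    awk '
        /^[ \t]*$/    { next }                      # blank line
        /^[ \t]*[#;]/ { next }                      # comment
        /^[ \t]*\[/ {                               # section header
            section = $0
            gsub(/^[ \t]*\[|\].*$/, "", section)
            next
        }
        {
            eq = index($0, "=")
            if (eq == 0) next
            key = tolower(substr($0, 1, eq - 1))
            gsub(/^[ \t]+|[ \t]+$/, "", key)        # trim the name
            gsub(/[ \t]+/, " ", key)                # collapse inner whitespace
            if (seen[section, key]++ == 1) print section ": " key
        }
    ' "$1"
}

# Example:
#   find_duplicate_params /etc/samba/smb.conf
```

No output means no parameter name was repeated within a section.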
