Kilala.nl - Personal website of Tess Sluijter


Running BoKS on SELinux protected servers

2013-10-01 09:00:00

I have moved the project files to GitHub.

FoxT Server Control (aka BoKS) is a product that has grown organically over the past two decades. Since its inception in the late nineties it has come to support many different platforms, including a few Linux versions. These days, most Linuxen support something called SELinux: Security-Enhanced Linux. To quote Wikipedia:

"Security-Enhanced Linux (SELinux) is a Linux kernel security module that provides the mechanism for supporting access control security policies, including United States Department of Defense-style mandatory access controls (MAC). It is a set of kernel modifications and user-space tools that can be added to various Linux distributions. Its architecture strives to separate enforcement of security decisions from the security policy itself and streamlines the volume of software charged with security policy enforcement."

Basically, SELinux allows you to very strictly define which files and resources can be accessed under which conditions. It also has a reputation of growing very complicated, very fast. Luckily there are resources like Dan Walsh's excellent blog and the presentation "SELinux for mere mortals".

Because BoKS is a rather complex piece of software, with dozens of binaries and daemons all working together across many different resources, integrating BoKS into SELinux is very difficult. Thus it hasn't been undertaken yet, and BoKS not only requires that it run outside of SELinux' control, it actually wants SELinux fully disabled. So basically you're disabling one security product so you can run another product that protects other parts of your network. Not so nice, no?

So I've decided to give it a shot! I'm making an SELinux ruleset that will allow the BoKS client software to operate fully, in order to protect a system alongside SELinux. BoKS replicas and master servers are even more complex, so hopefully those will follow later on. 

I've already made good progress, but there's a lot of work remaining to be done. For now I'm working on a trial-and-error basis, adding rules as they are needed. I'm forgoing the use of sealert for now, as I didn't like the rules it was suggesting. Sure, my method is slower, but at least we'll keep things tidy :)

Over the past few weeks I've been steadily expanding the boks.te file (TE = Type Enforcement, the actual rules):

v0.32 = 466 lines
v0.34 = 423 lines
v0.47 = 631 lines
v0.52 = 661 lines 
v0.60 = 722 lines 
v0.65 = 900+ lines 

Once I have a working version of the boks.te file for the BoKS client, I will post it here. Updates will also be posted on this page.

 

Update 01/10/2013:

Looks like I've got a nominally working version of the BoKS policy ready. The basic tests that I've been performing are working now, however, there's still plenty to do. For starters I'll try to get my hands on automated testing scripts, to run my test domain through its paces. BoKS needs to be triggered to just about every action it can, to ensure that the policy is complete.

 

Update 19/10/2013:

Now that I have an SELinux module that will allow BoKS to boot up and to run in a vanilla environment, I'm ready to show it to the world. Right now I've reached a point where I can no longer work on it by myself and I will need help. My dev and test environment is very limited, both in scale and capabilities, and thus I cannot test every single feature of BoKS with this module.

I have already submitted the current version of the module to FoxT, to see what they think. They are also working on a suite of test scripts and tools that will automatically run BoKS through its paces, which will speed up testing tremendously.

I would like to remind you that this SELinux module is an experiment and that it is made available as-is. It is absolutely not production-ready and should not be used to run BoKS systems in a live environment. While most of BoKS' basic functions have been tested and verified to work, there are still many features that I cannot test in my current dev environment. I am only running a vanilla BoKS domain. No LDAP servers, no Kerberos, no other fancy features. 

Most of the rules in this file were built by using the various SELinux troubleshooting tools, determining what access needs to be opened up. I've done it all manually, to ensure that we're not opening up too much. So yeah: trial and error. Lots of it. 

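This code is made available under the Creative Commons - Attribution-ShareAlike license. See here for full details. You are free to Share (to copy, distribute and transmit the work), to Remix (to adapt the work) and to make commercial use of the work under the following conditions: you must attribute the work to its author (Attribution) and you must distribute any derived work under the same or a similar license (ShareAlike).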

So. How to proceed? 

  1. Build a dev/test environment of your own. I'm running CentOS VMs using Parallels Desktop on my MacBook. Ensure that they're all up to date and that you include SELinux with the install. Better yet, check the requirements on this page
  2. I've got a BoKS master, replica and client, all version 6.7. However, installing BoKS on CentOS is not officially straightforward and requires some trickery.
  3. Download the BoKS SELinux module files
  4. Put them in a working directory, together with a copy of the Makefile from /usr/share/selinux/devel/
  5. Run: make. If you use the files from my download, it should compile without errors. 
  6. Run: semodule -i boks.pp. The first time that you're building the policy you'll need to install the compiled module (-i). After that, with each recompile you will need -u, for upgrade.
  7. Run: touch /.autorelabel. Then reboot. Your system will change all the BoKS files to their newly defined SELinux types. 
  8. Run: setenforce 1. Then get testing!  Start poking around BoKS and check /var/log/audit/audit.log for any AVC messages that say something's getting blocked. 
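
For step 8, here is a minimal sketch of how the audit log could be filtered for denials. The helper name avc_denials and the default filter "boks" are my own assumptions for illustration; only the log path and the AVC message type come from the steps above.

```shell
# Hypothetical helper: print AVC denials from an audit log.
# $1 = path to the audit log
# $2 = optional case-insensitive filter (assumed default: boks)
avc_denials() {
    avc_log=$1
    avc_filter=${2:-boks}
    grep 'type=AVC' "$avc_log" | grep 'denied' | grep -i -e "$avc_filter"
}

# Usage on a test system (needs read access to the audit log):
#   avc_denials /var/log/audit/audit.log
```

On systems with the audit tools installed, ausearch -m avc should give a similar view without the text filter.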

I'd love to discuss the workings of the module with you and would also very much appreciate working together with some other people to improve on all of this. 

 

Update 05/11/2014:

Henrik Skoog from Sweden contacted me to submit a bugfix. I'd forgotten to require one important thing in the boks.te file. That's been fixed. Thanks Henrik!

 

Update 11/11/2014:

I have moved the project files to GitHub.



BoKS Users Group: an ending

2012-10-08 19:46:00

BoKS Users Group website

Almost two years ago I let go of a volunteer project that I'd started, Open Coffee Almere. The project had outgrown me and in order to prosper needed someone else in charge. So I passed the project on and stepped back completely.

Another project that was started at roughly the same time, but which never really took off is the BoKS Users Group. Meant to unite FoxT BoKS administrators across the globe in order to share knowledge, it was mostly me trying to push, pull and shove a cart of rocks. A lot of people said it was a great idea and they'd love to join, or to provide input or to benefit from it. But none of that ever really happened. 

And then even I stopped pushing updates to the website. That is why I've decided to pull all the content back into my own website and to shutter the site. I'll probably also give admin rights of the LinkedIn group to FoxT and that's that.



BOKS: Mind your log files, part 2

2011-12-19 00:00:00

A few months back we discussed how incorrect log settings can mess with your auditing and logging in "Mind your log files!". Today we'll take a look at another way your logging can go horribly wrong.

Case in point: keystroke logs.

BoKS' suexec facility comes with optional keystroke logging, which allows you to capture a user's input and output. This is particularly handy when providing suexec su - user access to an application or super user. These keystroke logs are stored locally on the client system, where they are hashed and filed. The master server will then pull these log files from each client for centralized storage, after which the files will be cleaned from the clients. Optionally, these log files will then be pushed to replica servers for backup purposes.

Things go awfully wrong when the master server's kslog storage is undersized. Once the storage location for keystroke logs is filled, the master server will stop pulling and cleaning files from client systems. This means that $BOKS_var/kslog, which is meant for temporary storage, now becomes rather permanent storage. And since many BoKS administrators leave $BOKS_var as part of the /var file system, you are now filling up /var. If the BoKS client system is not protected against a 100% filled /var you are now looking at a very, very nasty situation. You might end up crashing client systems, or causing other erratic behaviour.
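
A minimal sketch of the kind of free-space check that would catch this early, on the master's kslog storage as well as on /var of the clients. The function names and the 90% limit are assumptions of mine, not part of BoKS; the only tool used is the standard df -P.

```shell
# Extract the "Capacity" percentage from `df -P` output read on stdin.
df_capacity_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Compare a usage percentage with a limit.
# $1 = label, $2 = usage percentage, $3 = limit percentage
# Prints a verdict and returns 1 when usage is at or over the limit.
usage_verdict() {
    if [ "$2" -ge "$3" ]; then
        echo "WARNING: $1 is ${2}% full (limit ${3}%)"
        return 1
    fi
    echo "OK: $1 is ${2}% full"
}

# Check the file system that holds a directory.
# $1 = directory, $2 = limit percentage
check_fs() {
    fs_pct=$(df -P "$1" | df_capacity_pct) || return 2
    usage_verdict "$1" "$fs_pct" "$2"
}

# Usage (the 90% limit is an arbitrary example):
#   check_fs "$BOKS_var/kslog" 90
#   check_fs /var 90
```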

TLDR:



BoKS debugging example

2011-12-16 00:00:00

 

Yesterday served as a reminder that we can all fall prey to stupid little things :)

Symptom: A customer of mine could use suexec su - oracle on a few of his systems, but not on some of his others.

Troubleshooting: Everything seemed to check out just fine. The customer's account was in working order and neither root, nor the target account were locked or otherwise problematic. And of course the customer had the required access routes.

$ suexec lsbks -aTl *:customer | grep SXSHELL
suexec:*->root@HOSTGROUP%CUSTOMER-PG-SXSHELL (kslog=3)

$ suexec pgrpadmin -l -g CUSTOMER-PG-SXSHELL | grep oracle
/bin/su - oracle
/usr/bin/su - oracle

So, why does BoKS keep saying that this user isn't allowed to use suexec su - oracle on one box, but it's okay on the other?

12/13/11 10:00:57 HOST1 pts/1 customer suexec Successful suexec (pid 16867) from customer to root, program /bin/su
12/13/11 10:00:57 HOST1 pts/1 customer suexec suexec args (pid 16867): - oracle
12/13/11 10:01:12 HOST2 pts/5 customer suexec Unsuccessful suexec from customer to root, program /bin/su. No terminal authorization granted.

I thought it was odd that the logging for the failed suexec seemed "incomplete", but wrote it off as a software glitch. However, this is where alarm bells should've gone off!

So I continued and everything seemed to check out: on both hosts /bin/su was used, on both hosts oracle was the target user and the BoKS logging supported it all. So let's try something exciting! Boksauth simulations!

Obviously the simulation for HOST1 went perfectly. But then I tried it for HOST2:

$ suexec boksauth -L -Oresults -r 'SUEXEC:customer@pts/1->root@HOST2%/bin/su#20-#20oracle' -c FUNC=auth TOUSER=root FROMUSER=customer TOHOST=HOST2 FROMHOST=HOST2 PSW="iascfavvcfHc"

ROUTE=SUEXEC:customer@pts/1->root@HOST2%/bin/su#20-#20oracle
FUNC=auth
TOUSER=root
FROMUSER=customer
TOHOST=HOST2
FROMHOST=HOST2
PSW=iascfavvcfHc
$HOSTSYM=MASTER
$ADDR=192.168.10.20
$SERVCADDR=192.168.10.20
WC=#$*-./?_
FKEY=CUSTOMER-HG:customer
UKEY=HOST2:root
RMATCH=suexec:*->root@CUSTOMER-HG%CUSTOMER-PG-SXSHELL,kslog=3
MOD_CONV=1
AMETHOD=psw
$PSW=ok
VTYPE=psw
RETRY=0
MODLIST=kslog=3,prompt=+1,su=+1,passroot=+1,use_frompsw=+1,su_fromtoken=+1,chpsw=-1,concur_limit=-1
$STATE=9
$SERVCVER=6.5.3

What I was expecting to see was STATE=6 and ERROR=203. But since the ERROR= field is absent and STATE=9, this indicates that the simulation was successful. Now things get interesting! So I asked my customer to try the suexec su - oracle with me online, while I ran a trace on the BoKS internals. This resulted in a file 10k lines long, but it finally got me what I needed.

In the course of the debug trace, BoKS went through table 37 (suexec program group entries) to verify whether my customer's command was among the list. It of course was, but BoKS said it didn't match!

wildprogargscmp_recurse: wild = /usr/bin/su#20-#20oracle, match = /bin/su^M
wildprogargscmp_recurse: is_winprog = 0^M
wildprogargscmp_docmp: Called, wild /usr/bin/su#20-#20oracle match /bin/su^M
wildprogargscmp_docmp: Progs do not match^M
wildprogargscmp_docmp: return 1 (0 means match)^M
wildprogargscmp_recurse: wild = /bin/su#20-#20oracle, match = /bin/su^M
wildprogargscmp_recurse: is_winprog = 0^M
wildprogargscmp_docmp: Called, wild /bin/su#20-#20oracle match /bin/su^M
wildprogargscmp_docmp: fnamtch wild - sumdev, match did not match^M
wildprogargscmp_docmp: return 1 (0 means match)^M

This threw me for a loop. So I went back to the original BoKS servc call that was received from client HOST2.

servc_func_1: From client (HOST2) {FUNC=auth01TOHOST=?HOST01FROMHOST=?HOST01TOUSER=root01FROMUSER=customer01FROMUID=181801FROMTTY=pts/5201ROUTE=SUEXEC:customer@pts/52->root@?HOST%/bin/su}^M

And then it clicked! One final check confirmed that I'd been overthinking the issue!

$ suexec cadm -l -f ENV -h HOST2 | grep ^VERSION
VERSION=6.0

It turns out that HOST2 was still running BoKS version 6.0. While the suexec facility was introduced into BoKS aeons ago, only as of version 6.5 did suexec become capable of screening command parameters! So a v6.5 system would submit the request as suexec su - oracle, while a v6.0 host sends it as suexec su. And of course that fails.
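
To make the mismatch concrete, here is a small model of the behaviour described above. It is a sketch only: version_ge and suexec_request are hypothetical helpers of my own, and the real client obviously does a lot more than this. The version string would come from the cadm output shown earlier.

```shell
# Return 0 when version $1 >= version $2, comparing major.minor only.
version_ge() {
    ver_a="$1.0"; ver_b="$2.0"
    a_major=${ver_a%%.*}; a_rest=${ver_a#*.}; a_minor=${a_rest%%.*}
    b_major=${ver_b%%.*}; b_rest=${ver_b#*.}; b_minor=${b_rest%%.*}
    [ "$a_major" -gt "$b_major" ] && return 0
    [ "$a_major" -lt "$b_major" ] && return 1
    [ "$a_minor" -ge "$b_minor" ]
}

# Simplified model of what a client submits for authorization.
# $1 = client BoKS version, $2 = program, $3 = arguments
suexec_request() {
    if version_ge "$1" 6.5; then
        printf '%s %s\n' "$2" "$3"   # program plus arguments
    else
        printf '%s\n' "$2"           # program only, arguments are dropped
    fi
}

suexec_request 6.5.3 /bin/su '- oracle'   # prints: /bin/su - oracle
suexec_request 6.0   /bin/su '- oracle'   # prints: /bin/su
```

The second request can never match a program group entry such as "/bin/su - oracle", which is exactly what the trace was trying to tell me.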

It's awesomely fun to dig around BoKS' internals, but in this particular case it'd have been better if I'd spent the hour on something else :)

 



BOKS: Demystifying the user FLAGS field

2011-11-28 00:00:00

 

The BoKS database can be an interesting place to poke around, "mysterious" at times. For example, there's the enigmatic "FLAGS" field which resides in table 1, the user data table. Among the usual user information (name, host group, user class, password, GID, UID, etc) there's the "FLAGS" field which contains a numerical value. What this numerical value represents isn't clear to the untrained eye.

The "FLAGS" number is a decimal representation of a hexadecimal number, where each digit represents a number of flags. The value of each digit is determined by adding the values of the flags enabled for the user. You could compare it to Unix file permission values, like 750 or 644, where each digit is an addition of values 1, 2 and 4 (x, w and r).

Below you'll find a table of the flags that can be set for any given user account.

Max. value: F3E3

Flag                            MSD   -    -   LSD
User deleted                     -    -    -    1
User blocked                     -    -    -    2
Timeout not depend on CPU        -    -    2    -
Timeout not depend on tty        -    -    4    -
Timeout not depend on screen     -    -    8    -
Windows local host account       -    1    -    -
Windows domain account           -    2    -    -
Lock at timeout, no logout       1    -    -    -
User must change password        2    -    -    -
Manage secondary groups          4    -    -    -
Check local udata                8    -    -    -

So for example, a value of 16386 equals a value of 0x4002, which means that the user is blocked and that BoKS is used to push his secondary group settings to the /etc/group file on each server.
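
The decoding can be sketched in a few lines of shell. The function name decode_flags is my own; the bit masks are simply the table above written out as hexadecimal values.

```shell
# Print the name of every flag that is set in a decimal FLAGS value.
decode_flags() {
    flags_value=$1
    while IFS='|' read -r flag_mask flag_name; do
        if [ $(( $flags_value & $flag_mask )) -ne 0 ]; then
            echo "$flag_name"
        fi
    done <<'EOF'
0x0001|User deleted
0x0002|User blocked
0x0020|Timeout not depend on CPU
0x0040|Timeout not depend on tty
0x0080|Timeout not depend on screen
0x0100|Windows local host account
0x0200|Windows domain account
0x1000|Lock at timeout, no logout
0x2000|User must change password
0x4000|Manage secondary groups
0x8000|Check local udata
EOF
}

printf '0x%04X\n' 16386   # prints: 0x4002
decode_flags 16386        # prints: User blocked, then Manage secondary groups
```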



BoKS: Successful login, but no logging

2011-11-04 00:00:00

 

Another fun one!

Case: Customer attempts to login, succeeds, then gets kicked from the system immediately with a session disconnect from the server. The BoKS transaction log however does not show any record of the login attempt.

Symptoms:

Troubleshooting:

Debugging:

  1. Key exchange
  2. User identification
  3. User authentication
  4. Session startup

Trace shows failure when forking shell for customer.

debug2: User child is on pid 495766
debug3: mm_request_receive entering
Failed to set process credentials
boks_sshd@server[9] :369851 in debug_log_printit: called. Failed to set process credentials151212
boks_sshd@server[9] :370000 in debug_log_printit: not in cache, add
boks_sshd@server[9] :370092 in addlog: add Failed to set process credentials151212 (head = 0x0)
boks_sshd@server[9] :370233 in addlog: head = 0x20332b28

Cause:

After doing a quick Google search, we concluded that the customer's shell could not be forked due to a missing primary group on the server. Lo and behold! His primary group had not been pushed to the server by BoKS. This in turn was caused by corruption in AIX's local security files, which can be cleared up easily enough using usrck, pwdck and grpck.
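
A quick way to spot this condition before digging through traces is sketched below. It is a minimal check, assuming the account and group live in the local passwd and group files (no LDAP or NIS); the function name has_primary_group is hypothetical.

```shell
# Return 0 when the user's primary GID has an entry in the group file.
# $1 = user name
# $2 = passwd file (default /etc/passwd), $3 = group file (default /etc/group)
has_primary_group() {
    pg_user=$1
    pg_passwd=${2:-/etc/passwd}
    pg_group=${3:-/etc/group}
    # Field 4 of the passwd entry is the primary GID.
    pg_gid=$(awk -F: -v u="$pg_user" '$1 == u { print $4; exit }' "$pg_passwd")
    if [ -z "$pg_gid" ]; then
        echo "$pg_user: no entry in $pg_passwd" >&2
        return 2
    fi
    # Field 3 of a group entry is the GID.
    if awk -F: -v g="$pg_gid" '$3 == g { found = 1 } END { exit !found }' "$pg_group"; then
        return 0
    fi
    echo "$pg_user: primary GID $pg_gid has no entry in $pg_group"
    return 1
}

# Usage:
#   has_primary_group customer || echo "fix the group before blaming sshd"
```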

This however does not explain why there was no transaction log entry for these logins. Because by all means this was a successful BoKS login: authentication and authorization had both gone through completely.

Hypothesis and additional test:

We reckon that the BoKS log system call for the "successful login" message is only sent once a process has been forked, so on authentication+authorization+first fork. As opposed to on authentication+authorization as we would expect.

To test another case we switched a user's shell to a nonexistent one. When the user now logs in this -does- generate the "successful login" message. This further muddles when the BoKS logging calls get done. FoxT is on the case and has confirmed the bug.

 



BoKS: setting new users' default shell

2011-10-26 00:00:00

Recently we upgraded our BoKS master and replica servers. Out went the aged Sun V210 with Solaris 8 and BoKS 6.0.3 and in came shiny new hardware+OS+BoKS. Lovely! Everything was purring along! We did start getting complaints that newly created users couldn't log in to all of their servers, which seemed odd. One of our Unix admins spotted that all these users had their shells set to bash, while ksh is the default shell we should be using.

How come the user default shell had changed all of a sudden? We traced the cause back to the BoKS web interface, but couldn't find out where the new shell setting had come from.

So! Back to grepping through the TCL source code of the web interface! A last ditch attempt, searching for every instance of the word "shell" (excluding the help files of course). In between oodles of lines of code I stumble upon this nugget:

# Get first shell from /etc/shells if it exists,
proc boks_uadm_get_default_shell {} {
    if { [catch {set fp [open /etc/shells r]}] == 0 } {

So there you have it! The BoKS v6.5 web interface simply grabs the first line of /etc/shells (if the file exists) and uses that for default value in the "shell" field when creating new user accounts. After changing the first line back to /bin/ksh things were back to normal.
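
To keep an eye on this, something like the sketch below could run on the master. Note the assumption: I skip blank lines and comments when looking for the "first" shell, whereas the web interface may well take the first line literally. Both function names are made up for this example.

```shell
# Print the first entry of a shells file, skipping blank lines and comments.
# $1 = shells file (default /etc/shells)
first_shell() {
    awk '!/^[ \t]*(#|$)/ { print; exit }' "${1:-/etc/shells}"
}

# Warn when the first entry is not the shell we expect new users to get.
# $1 = expected shell, $2 = shells file (default /etc/shells)
check_default_shell() {
    cds_first=$(first_shell "${2:-/etc/shells}")
    if [ "$cds_first" = "$1" ]; then
        echo "OK: first shell is $1"
    else
        echo "WARNING: first shell is $cds_first, expected $1"
        return 1
    fi
}

# Usage:
#   check_default_shell /bin/ksh
```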

An RFC has been submitted to make the user's default shell a configurable option.



Locking the BoKS database for fun and profit

2011-06-07 00:01:00

If your BoKS master server ever inexplicably grinds to a halt, blocking all suexec and remote logins, just do a ps -ef to check if there's anybody running a dumpbase. Then pray that you can contact this person, or that there's still someone with a root shell on the server...

A running dumpbase process keeps a read/write lock on the BoKS database until it has dumped all the requested content. If you have a sizeable database a full dump can take half a minute or more. That's not awful and it won't affect your daily operations too much, but it should still be kept to a minimum.

But what if? What if someone decides to run dumpbase and then pipe it through something like more?

The standard buffer size for a pipe is roughly 64kB (some Unices might differ). Once that buffer is full, dumpbase blocks on its next write and sits there, still holding the database lock, until the reader consumes more data. This means that dumpbase will not finish running until you've either ^C-ed the command, or until you've more-ed through all of the pages. Thus the easiest way to completely lock your master server is to more a dumpbase and then go get yourself a cup of coffee. Because not even root will be able to log in on the console while the dumpbase is active.
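
The safer habit is to let the dump finish into a file first and only then start paging. A minimal sketch, where dump_to_file is a hypothetical wrapper of my own and mktemp is assumed to be available:

```shell
# Run a command to completion with its output in a temporary file,
# then print the name of that file. The command never waits on a pager.
dump_to_file() {
    dump_out=$(mktemp) || return 1
    "$@" > "$dump_out" || { rm -f "$dump_out"; return 1; }
    echo "$dump_out"
}

# Usage: the lock is released as soon as dumpbase is done writing,
# no matter how long you spend reading afterwards.
#   snapshot=$(dump_to_file dumpbase) && more "$snapshot"
#   rm -f "$snapshot"
```

Keep in mind that the snapshot holds a copy of your BoKS database, so clean it up when you're done.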



BoKS: registering SSH hostkeys in one blow

2011-06-07 00:00:00

Last weekend we upgraded our last BoKS v6.0.3 server to 6.5, which presented us with a few interesting challenges. More about those later. But first! SSH host keys!

Per BoKS v6.5 the SSH daemon/client software will automatically verify that the SSH hostkey of the server you're connecting to matches the one listed in the BoKS database. If you're unprepared for this new feature, then you could be caught unawares with a situation where SSH warns you about a man-in-the-middle attack, despite your personal ~/.ssh/known_hosts file being empty.

To prevent this from happening we ran a simple two-liner right after performing the upgrade. The script below (if you can even call it that) will tell all the BoKS client systems in your domain to set their SSH hostkey in the database to its current key.

# Register each BoKS client's current RSA host key in the BoKS database.
for HOST in $(sx hostadm -Sl | awk '/UNIXBOKS/ {print $1}')
do
    cadm -s "ssh_keyreg -w -f /etc/opt/boksm/ssh/ssh_host_rsa_key.pub" -h "$HOST"
    sleep 3   # short pause between hosts
done

Of course you shouldn't run this script willy-nilly, but only at times where you know the current hostkeys to be correct :)

Once the FOR-loop has finished you will notice that the fields SSHHOSTKEY and SSHHOSTKEYTYPE in table 6 of the BoKS database will now contain values for each registered client.



Putty crash upon password change

2011-05-11 00:00:00

 

Recently we have been running into an interesting problem between BoKS 6.5.3 (FoxT Server Control) and Putty.

Situation: End user's password has expired and must be changed upon login.

Symptom: On password change, Putty crashes with the error "Incoming Packet was garbled on decryption. Protocol error packet to long".

Cause: Unknown yet.

Temp solution: Set customer's last password change date to very recently (eg: modbks -l $USER -L 1), then have customer login and change the password manually (eg: passwd).

UPDATE:

Earlier we reported a bug that would make Putty crash when trying to change your password upon login. The rather cryptic message provided by Putty was: "Incoming Packet was garbled on decryption. Protocol error packet to long". Here's an update on that matter.

A number of FoxT customers logged calls about this problem, among others 110216-012399. After investigating, FoxT's reply in this matter is:

BoKS Master: If you already have TFS090625-101616-1 installed on the Master but not TFS081202-134416-3 (i.e. rev 3) you may want to uninstall TFS090625-101616-1 temporarily and then install TFS081202-134416-3 and TFS090625-101616-1 (in that order).

BoKS Replica: Hotfix 090625-101616 does indeed contain the corrections from 081202-134416 (rev 3). Thus hotfix 090625-101616 is sufficient on the Replicas in this case.

 



BoKS 6.6 ready for release

2011-04-21 00:00:00

Awesome! Just before the Easter weekend a joyous email was sent around the FoxT offices: BoKS version 6.6.1 is now officially ready for release. Oh happy day!

New features in v.6.6.1 are:

Aside from new features, BoKS 6.6.1 also includes no fewer than 46 bug fixes and modifications which were requested by various customers. Oh happy day indeed!



SCP troubles: BoKS OpenSSH versus F-Secure

2011-04-06 00:01:00

It is not uncommon for network environments to mix different versions of SSH software, especially when you are still transitioning towards a BoKS-ified network. In such situations you'll often run into little snags that make the seemingly trivial rather impossible. Case in point: SCP (Secure Copy).

Whereas SSH and SFTP are standardized protocols that have been properly documented, SCP isn't so lucky. Sadly there is no such thing as a standard SCP and what "SCP" is depends completely on the SSH software you're using. The Wikipedia page linked above makes a very important point: "The SCP program is a software tool implementing the SCP protocol as a service daemon or client. It is a program to perform secure copying. The SCP server program is typically the same program as the SCP client."

Meaning that if you're using F-Secure on one side, it is going to expect F-Secure on the other side. If you try and have an OpenSSH client talk SCP to an F-Secure server, then you'll undoubtedly run into errors like these: "scp: FATAL: Executing ssh1 in compatibility mode failed (Check that scp1 is in your PATH)."

What if you're migrating an F-Secure-based environment to BoKS? There are a few possible solutions:

Option #2 is a bit redundant if you're going to be installing BoKS on the hosts later on. You might as well get it over with as soon as possible, you don't have to actively use BoKS from the get-go. Option #3 is a useful enough kludge, especially if there are servers that will never switch to BoKS.

See also:



BoKS: mind your log files!

2011-04-06 00:00:00

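BoKS' main log file for transactions is $BOKS_data/LOG. The way BoKS handles this file is configured using the logadm command. Specifically, this is done using two distinct variables: the -T threshold (the log file size limit before backup, i.e. the size at which the log is rotated) and the -M threshold (the absolute maximum log file size).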

For example:

$ suexec logadm -V
Log file size limit before backup:       3000 kbytes
Absolute maximum log file size:          100000 kbytes

$ suexec logadm -lv
Primary log directory:                   /var/opt/boksm/data
Backup log directory:                    /var/opt/boksm/archives

What this means is that:

First off, this means that it's not just $BOKS_data that you need to monitor for free space! $BACKUP_dir is equally important because once the -M threshold is reached BoKS will simply stop logging. But then there's something else!

Did you know that BoKS is hard coded for a maximum of 64 log rotations per day? This is because the naming scheme of the rotated logs is: L$DATE[",#,%,',+,,,-,.,:,=,@A-Z,a-z]$DATE. Once BoKS reaches L$DATEz$DATE it will keep on re-using and overwriting that file because it cannot go any further! This means that you could potentially lose a lot of transaction logging.

The current work around for this problem is to set your logadm -T value large enough to prevent BoKS from ever reaching the "z" file (the 64th in line). Of course the real fix would be to switch to a different naming scheme that is more flexible and which allows a theoretically unlimited amount of log rotations.
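
As a back-of-the-envelope aid for picking that value: divide your busiest day's log volume by the 64 available rotations and round up. The sketch below assumes the threshold is expressed in kbytes, as in the logadm -V output above; the function name and the example volumes are made up.

```shell
# Smallest rotation threshold (in kB) that keeps one day's logging
# within the rotation limit.
# $1 = expected log volume per day in kB
# $2 = maximum rotations per day (default 64)
min_rotation_threshold_kb() {
    mrt_daily=$1
    mrt_max=${2:-64}
    # Integer division, rounded up.
    echo $(( ($mrt_daily + $mrt_max - 1) / $mrt_max ))
}

min_rotation_threshold_kb 640000   # prints: 10000
min_rotation_threshold_kb 200001   # prints: 3126
```

In practice you'd want generous headroom on top of this minimum, since one noisy day is enough to hit the "z" file.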

The real fix has been requested from FoxT and is registered as RFC 081229-160335. This fix has been confirmed as being part of BoKS v6.6.1 (per build 13 I am told).



From the rumor mill: features for BoKS 6.x and 7.x

2010-09-16 00:00:00

A few weeks ago we met with two of FoxT's VPs who'd come over from the US to Amsterdam. During our two hour meeting we were told of many awesome features to be expected in future versions of BoKS (or "FoxT Access Control" :) ).

The future looks bright! I for one can't wait to get my hands on 6.6.x to start testing and learning! :)



What's in a name? FoxT product renaming

2010-08-31 00:00:00

Over the past fifteen years the product we've come to know and love has changed names on numerous occasions. BoKS has changed hands a few times and with each move came a new name. All of this has led to a rather muddled position in the market, with many people confused about what to call the software.

Is it "BoKS"? Is it "FoxT Access Control", or "Keon", or even "UnixControl"? And is the company called FoxT or is it Fox Technologies?! And this confusion isn't alleviated by the fact that both resumés and job postings refer to the software by any of these names.

Now we are told that FoxT are seriously considering a rigorous change to their naming convention, one that they will stick with for the coming years. All we can say is that it'd better be good! Because most of the names tossed about so far have both up and downsides.

Things like Access Control, Unix Control, or Server Control all have the problem that they are names consisting of two very generic words. Run them through Google and you'll get oodles of results. Words like FoxT and BoKS are certainly far from generic, but even those give pretty bad results in Google ("Did you mean books?"). BoKS is certainly a memorable term and most people still refer to the software in that way, despite the fact that neither the FoxT documentation nor their website even still mentions the name.

So far the only past name that ticks all the boxes (unique, memorable, great with SEO) is "Keon". But unfortunately that can't be used, because the name is still owned by RSA. :(

So, what do you think?! Any suggestions with regards to a new product name? Any emotional attachment to the name "BoKS" (I'll admit to having that flaw)? Pipe in and let us know!



BoKS and the epoch rollover

2010-08-20 00:00:00

 

The year 2038 is still a long time away, but we may already be feeling its effects!

As any Unix administrator will know, Unix systems count their time and date as the number of seconds passed since "Epoch" (01/01/1970). On systems that store this count in a signed 32-bit integer this means that we're bound to "run out of time" on the 19th of January of 2038, because after that the Unix clock will roll over from 01111111.11111111.11111111.11111111 to 10000000.00000000.00000000.00000000.

While you might not expect it, BoKS administrators may already be feeling the effects of the Year 2038 problem way ahead of time.

One commonly used trick for applicative user accounts is to set their "pswvalidtime" to a very large number. This means that the user account in question will never be bugged to change its password, which tends to keep application support people happy. The account will never be locked automatically because they forgot to change the password and thus their applications will not crash unexpectedly.

It's common to use the figure "9999" as this huge number for "pswvalidtime". These 9999 days roughly correspond to 27.4 years. Do the rough math: 2010.6 + 27.4 = 2038.0. Combine that with the "pswgracetime" setting and BINGO! The password validity for the user in question has now rolled over to some day in January of 1970! The odd thing is that the BoKS "lsbks" command will not show this fact, but instead translates the date to the corresponding date in 2038, which puts you off the track of the real problem.
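
The arithmetic is easy to check in the shell. This sketch assumes that both settings are counted in days and that the expiry ends up in a signed 32-bit timestamp; the start date is the date of this post and the 30-day grace period is just an example value.

```shell
INT32_MAX=2147483647   # 03:14:07 UTC on 19 January 2038

# Expiry timestamp: start epoch plus validity and grace periods (in days).
# $1 = start epoch, $2 = validity in days, $3 = grace period in days
expiry_epoch() {
    echo $(( $1 + ($2 + $3) * 86400 ))
}

# Return 0 when a timestamp no longer fits in a signed 32-bit integer.
overflows_int32() {
    [ $(( $1 > $INT32_MAX )) -eq 1 ]
}

start=1282262400                 # 2010-08-20 00:00:00 UTC
expiry_epoch "$start" 9999 0     # prints 2146176000, still below INT32_MAX
expiry_epoch "$start" 9999 30    # prints 2148768000, beyond INT32_MAX
```

So a password set today with 9999 days of validity squeaks in about two weeks under the limit on its own, and any grace period longer than that pushes it over.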

So... If you happen to rely on huge "pswvalidtime" settings, you'd better tone it down a little bit. Thanks to the guys at FoxT for quickly pinpointing our "problem". It seems that there's a 9999-epidemic going round :)

EDIT: Thank you to Wilfrid for pointing out two small mistakes :)

 



In BoKS, "locked" does not always mean an account is locked

2010-07-28 21:37:00

I ran into a rather interesting case the other day, pointing me to another caveat that you need to keep in mind with BoKS. Let me say up front that I understand FoxT's design decision in this case and that, while I don't necessarily agree with them, it isn't a very big problem as long as you know the situation exists. So, what's up?

In BoKS, a "locked" account is not always locked the way you might think it is.


How I found out that "locked" isn't always "locked"

I received a trouble ticket from a friend/colleague of mine, saying that he suspects his application user got locked. He couldn't SU to the user account anymore, getting a message saying it was locked. Either way, his password wasn't getting accepted and he needed to get in!

So, I checked the application user and it was fine! Not locked, no expired password, no problems at all. However, the BoKS logs did show that my friend's account was in fact blocked! Browsing back through the transaction logs I found that his personal account had been locked after he'd entered a wrong password while SU-ing. In the world of BoKS this makes sense: you try to guess your way into another account with SU and your own account gets locked as a punishment. This way you can block the perpetrator, while preventing a DoS (Denial of Service) on the target account.

07/07/10 17:05:50 SERVER-A pts/2 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/07/10 17:05:58 SERVER-A pts/2 bobby su Successful SU from user bobby to oracle
07/08/10 03:48:30 SERVER-A pts/2 bobby sshd Logout
07/08/10 11:02:35 SERVER-B - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:13 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Authentication failed.
07/08/10 15:05:16 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:19 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:26 SERVER-C - bobby servc Too many failed login retries on SERVER-C
07/08/10 15:05:26 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:30 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Too many erroneous login attempts.
07/13/10 08:22:47 SERVER-B pts/1 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/13/10 11:14:15 SERVER-B pts/1 bobby su Access denied by server 10.72.2.3, route SU:bobby@pts/1->oracle@SERVER-B
07/13/10 11:14:15 SERVER-B pts/1 bobby su Bad SU from user bobby to oracle (Too many erroneous login attempts.)
07/14/10 15:52:34 SERVER-B pts/1 bobby sshd Logout
07/15/10 08:12:49 SERVER-B pts/1 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/15/10 10:24:50 SERVER-B pts/2 bobby sshd Successful login (ssh shell from 10.72.2.3)

In the case above, "bobby" locked his account by repeatedly botching his own password on a system where he hadn't installed his SSH keys yet.
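
Digging through transaction logs by hand gets old quickly, so here is a minimal sketch for spotting the lockout events. It assumes the log lines look exactly like the excerpt above (whitespace-separated date, time, host, tty, user, program, then the message); the function names are hypothetical.

```python
from collections import Counter

# Two lines copied from the transaction log above.
SAMPLE_LOG = """\
07/08/10 15:05:19 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:26 SERVER-C - bobby servc Too many failed login retries on SERVER-C
"""


def parse_line(line: str) -> dict:
    """Split one log line into its fields; the message keeps its spaces."""
    date, time, host, tty, user, program, message = line.split(None, 6)
    return {"date": date, "time": time, "host": host, "tty": tty,
            "user": user, "program": program, "message": message}


def locked_users(log_text: str) -> Counter:
    """Count "Too many failed login retries" events per user."""
    hits = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        entry = parse_line(line)
        if entry["message"].startswith("Too many failed login retries"):
            hits[entry["user"]] += 1
    return hits


print(locked_users(SAMPLE_LOG))
```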

So how come my colleague could still log in using SSH? Didn't BoKS say his user account was blocked?!


Design decision: SSH key authentication ignores "locked" status

I was flabbergasted! Bobby's account had gotten locked, so certainly he should not be allowed to log in anymore, right? Besides, he was getting blocked on his SU and SUEXEC usage! So why could he still log in?

After discussing the matter with FoxT tech support I was reminded of the aforementioned design decision regarding DoS attacks: FoxT doesn't want you to easily block another person's account by just slamming his password. Which is why they decided that anybody who is allowed to use SSH key pairs should also be allowed to keep logging in despite his "locked" status.

Two very important distinctions:

  1. This does not apply to manual account locks! So if an administrator locks an account, the account will not be allowed to log in at all, SSH keys or not.
  2. To quote my contact at FoxT: "Also, a successful key-based login will not reset the failed password count of the account and will not open up additional opportunity for an attacker to keep trying the password." Both very important.
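
To keep the different cases apart, the behaviour described in this post can be summed up in a tiny decision function. This only models what I observed and what FoxT support told me; it is not how BoKS implements it, and all names are made up for the illustration.

```python
from enum import Enum


class Lock(Enum):
    NONE = "none"
    FAILED_PASSWORDS = "failed_passwords"  # lock caused by too many bad passwords
    MANUAL = "manual"                      # lock set by an administrator


def login_allowed(lock: Lock, uses_ssh_key: bool) -> bool:
    """Would a login be let through, according to the rules in this post?"""
    if lock is Lock.MANUAL:
        return False           # manual locks block everything, keys or not
    if lock is Lock.FAILED_PASSWORDS:
        return uses_ssh_key    # key pairs still work, passwords do not
    return True


print(login_allowed(Lock.FAILED_PASSWORDS, uses_ssh_key=True))
print(login_allowed(Lock.MANUAL, uses_ssh_key=True))
```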


FoxT BoKS training and documentation

2010-07-06 19:25:00

I've been asked multiple times who can provide training or education about FoxT's BoKS Access Control. The most obvious answer is: "it depends on where you live".

FoxT has many local partners across the globe, offering many different services. Project management, consulting, administration and training, the works! Who these local partners are depends on the continent and/or country you're in.

In the case of the Benelux (Belgium, Netherlands and Luxembourg) there are two answers.

  1. You can hire FoxT to come down from Sweden.
  2. You can hire Unixerius, the only local partner for BoKS training.

For information about local training partners in your locale, please contact FoxT.



End of life for BoKS 5.x, 6.0 and 6.1

2010-06-19 00:00:00

 

A few months ago FoxT made their official announcement regarding the EOL-ing of various BoKS versions within the next 1.5 years.

Per the 31st of December 2010, the following products will no longer receive support.

Also, per the 31st of December 2011, the following products will no longer receive support.

Per the aforementioned dates "no more maintenance updates or patches will be made available and no further development will take place for these particular components. In addition, the affected components will no longer be supported by FoxT Customer Support".

Please keep these dates in mind and plan your upgrade paths accordingly! You don't want to get stuck with an unsupported version of the software because you'll miss out on critical software updates and tech support costs will go through the roof. Then again, in this day and age, why are you still running a version < 6.0?!

Gentlemen, start your upgrades!

 



Check_boks_dormant.ksh: Finding unused and inactive user accounts

2010-03-16 22:02:00

Users come and users go, and likewise user accounts get created and destroyed. However, sometimes your HR processes fail and accounts get forgotten and left behind. It may not be obvious, but these forgotten accounts can actually form a threat to your security and should be cleaned up. Many companies even go out and lock or remove accounts of people who are actively employed, if they go unused for an extended period of time.

This script will help you find these forgotten user accounts, so you can then decide what to do with them.


Usage of check_boks_dormant

./check_boks_dormant [[-u UC] [-H HG] [-h HOST] | -A] [-M MON] [-x UC] [-X HG]  [-d -o FILE] [-f FILE]

-u UCLASS	Check only accounts with profile UCLASS. Multiple -u entries allowed.
-H HGROUP	Check only accounts from HOSTGROUP. Multiple -H entries allowed.
-h HOST		Check ALL accounts involved with HOST. Multiple -h entries allowed.
-A 		Check ALL user accounts.
-M MON		Minimum amount of months that accounts must be dormant. Default is 6.
-x EXCLUDEUC	Exclude all accounts with profile UCLASS. Multiple -x entries allowed.
-X EXCLUDEHG	Exclude all accounts from HOSTGROUP. Multiple -X entries allowed.
-S 		Exclude all accounts who can authenticate with SSH_PK. See "other notes" below.
-f FILE		Log file that contains all dormant accounts. Default logs into $BOKS_var.
-d 		Debug mode. Provides error logging. 
-o FILE		Output file for debugging logs. Required when -d is passed.

When using the -h option, a list will be made of all user accounts involved with this server
regardless of user class or host group. One can exclude certain classes or groups by using
the -x and -X parameters.

Example: 
./check_boks_dormant.ksh -h solaris1 -x RootUsers -x DataTransfer
./check_boks_dormant.ksh -u OracleDBA
./check_boks_dormant.ksh -A -d -o /tmp/foobar
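
To give an idea of what "dormant for MON months" boils down to, here is a small sketch. It assumes the check is based on the LASTLOGIN field of a user record (epoch seconds) and that a month is counted as 30 days; the script itself may well do this differently, and the function name is hypothetical.

```python
import time
from typing import Optional

SECONDS_PER_MONTH = 30 * 86400  # assumption: a "month" is counted as 30 days


def is_dormant(last_login: int, months: int = 6, now: Optional[int] = None) -> bool:
    """True if the last login is at least `months` months before `now`."""
    if now is None:
        now = int(time.time())
    return now - last_login >= months * SECONDS_PER_MONTH


# Example LASTLOGIN value, checked 200 days after that login took place.
last_login = 1258524725
print(is_dormant(last_login, months=6, now=last_login + 200 * 86400))
```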

Output

The script does not output to stdout. Instead, all dormant (inactive) accounts are logged in $BOKS_var/check_boks_dormant.ksh.DATE, or in another file specified with -f.


Limitations


Other notes

Download

Download check_boks_dormant.ksh
$ wc check_boks_dormant.ksh
     482    2559   17139 check_boks_dormant.ksh

$ cksum check_boks_dormant.ksh
2919189107 17139 check_boks_dormant.ksh


Check_boks_replication - Script for monitoring BoKS database replication

2010-03-16 21:43:00

In a BoKS infrastructure the master server automatically distributes database updates to its replicas. BoKS provides the admin with a number of ways to verify the proper functioning of these replicas, but none of these is easily hooked into monitoring software.

This script makes use of the following methods to verify infra sanity:

* boksdiag list, to verify if replicas are reachable.
* boksdiag sequence, to verify if a replica's database is up to date.
* dumpbase -tN | wc -l, to verify the actual files on the replicas.


Usage of check_boks_replication

./check_boks_replication [-l LAG] [-h HOST] [-n] [-d -o FILE]
-l LAG		Maximum amount of updates for a replica table to be behind on.
		Typically this should not be over 50. Default is 30.
-h HOST		Hostname of individual replica to verify.
-x EXCLUDE	Hostname of replica to exclude.
-p		Disable the use of ping in connection testing, in case of firewalls.
-n		Dry-run mode. Will only return an OK status.
-d		Debug mode. Use with dry-run mode to test Tivoli.
-o FILE		Output file for debugging logs. Required when -d is used.
 
Example: ./check_boks_replication -l 20 -d -o /tmp/foobar

Multiple -h and -x parameters are allowed.

Output

This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:

0 = OK
1 = WARNING
2 = SEVERE
3 = CRITICAL
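
As an illustration of how a lag threshold turns into one of these digits, consider the sketch below. It is not taken from the script: the sequence numbers are invented, and mapping "too far behind" to SEVERE is my assumption for the example.

```python
OK, WARNING, SEVERE, CRITICAL = 0, 1, 2, 3


def lag_status(master_seq: int, replica_seq: int, max_lag: int = 30) -> int:
    """OK while the replica is at most `max_lag` updates behind the master."""
    return OK if master_seq - replica_seq <= max_lag else SEVERE


def overall_status(master_seq: int, replica_seqs: dict, max_lag: int = 30) -> int:
    """Worst status over all replicas, since only one digit gets reported."""
    return max((lag_status(master_seq, seq, max_lag)
                for seq in replica_seqs.values()), default=OK)


# Invented sequence numbers for two replicas.
print(overall_status(1000, {"replica1": 995, "replica2": 940}))
```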


Limitations


Download

Download check_boks_replication.ksh
$ wc check_boks_replication.ksh
     570    2668   17878 check_boks_replication.ksh

$ cksum check_boks_replication.ksh
4063571181 17878 check_boks_replication.ksh


Integrating your applications with FoxT BoKS

2010-02-11 09:16:00

BoKS provides you with an open architecture, allowing you to integrate BoKS access control with your own applications. The easiest way to do this is by using Pluggable Authentication Modules (PAM), provided that PAM is available for your operating system of choice. Aside from PAM one could also make use of the APIs provided by FoxT, though I personally don't have experience with that option.


PAM example: using ProFTPd with BoKS

Recently we needed to get FTP up and running on a system that previously only used SCP/SFTP. However, the Solaris-default FTP daemon was never installed, nor does the BoKS package for Solaris include the BoKS FTP daemon. This left us with a few options, including the installation of ProFTPd.

Simply installing and running ProFTPd would leave us with an unsecured system: anybody would be able to login, because BoKS does not yet have any grip on the daemon. Luckily, the integration with BoKS was very easy, thanks to PAM.

  1. Add the following to proftpd.conf:
    <IfModule mod_auth_pam.c>
    AuthPAM on
    AuthOrder mod_auth_unix.c mod_auth_pam.c*
    </IfModule>
  2. In the same proftpd.conf set "UseIPv6" to "off". (Why?)
  3. Restart ProFTPd.

It's that simple. Now, let's take a look at what's needed if you don't use an existing access method.


Integrating an application with a new access method

Each application that makes use of PAM will send an identifier to PAM. For example, most FTP daemons will either identify themselves as "ftp" or "ftpd". You will need to edit /etc/pam.conf..ssm (the pam.conf file used when you run sysreplace replace) and add a set of rules for this new PAM identifier. Usually it's enough to take the ruleset defined for FTP and then to adjust the identifier to your own.

Once your pam.conf has been modified, you need to add a new entry to $BOKS_etc/bokspam.conf that ties the new PAM identifier to a BoKS access method. You are free to choose your own method string, as long as it doesn't already exist in $BOKS_etc/method.conf. For applications that simply take an incoming network request it's easiest to copy the line for FTP and set it to your new application.

On the master+replicas and the BoKS clients in question you will finally need to edit $BOKS_etc/method.conf. There you will define the format of access routes for this new method, as well as any modifiers that you desire.

And to my knowledge that's it!

  1. App points to PAM
  2. PAM points to BoKS
  3. bokspam.conf points to access method
  4. method.conf defines access method


Check_boks_queues: Tracking the status of your BoKS clntd queues

2010-01-13 06:33:00

Every time a BoKS client becomes unreachable the master server will retain updates for this client in a queue. Over time this queue will continue to grow, containing all manner of updates to /etc/passwd, /etc/shadow and so forth. Without these updates the client will become out of date and known-good passwords will stop working. You could lose access to the root account if you don't keep a history of the previous passwords!

This simple Tivoli plugin will warn you of any client queues that exceed a certain size or age, with both thresholds adjustable from the command line.


Usage of check_boks_queues

./check_boks_queues [-m MESS] [-a AGE] [-d -o FILE] [-f FILE]

-m MESS		Threshold for amount of messages. Default is 40 messages.
-a AGE		Threshold for age of client queue. Default is 24 hours.
-f FILE		Log file that lists queues that are over threshold. Default logs into $BOKS_var.
-d 		Debug mode. Provides error logging. 
-o FILE		Output file for debugging logs. Required when -d is passed.

The -a parameter requires BoKS 6.5.x. It DOES NOT work in 6.0.x and older versions.

Example: 
./check_boks_queues -m 50 -f /tmp/over50.txt
./check_boks_queues -a 168 -f /tmp/oneweek.txt
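
The threshold logic itself is simple enough to sketch in a few lines. The defaults below match the ones listed above (40 messages, 24 hours); the queue data and the function name are invented for the example, and the real script gets its numbers from BoKS rather than from a dictionary.

```python
def stuck_queues(queues: dict, max_messages: int = 40, max_age_hours: float = 24) -> list:
    """Return the hosts whose queue exceeds either threshold."""
    return sorted(host for host, (messages, age_hours) in queues.items()
                  if messages > max_messages or age_hours > max_age_hours)


# Invented data: host -> (number of queued messages, age of the queue in hours).
queues = {"client1": (3, 0.5), "client2": (55, 2), "client3": (10, 200)}

print(stuck_queues(queues))
print(stuck_queues(queues, max_messages=60, max_age_hours=168))
```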

Output

This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:

The log file in $BOKS_var (or specified with -f) will contain a list of queues that are stuck.


Limitations


Download

Download check_boks_queues.ksh
BoKS > wc check_boks_queues.ksh
     299    1413    9307 check_boks_queues.ksh
BoKS > cksum check_boks_queues.ksh
1047961426      9307    check_boks_queues.ksh


BoKS troubleshooting: corrupt message queues

2010-01-12 20:36:00

Today I ran into a problem I hadn't encountered before: seemingly out of the blue, one of our BoKS client systems would not allow anyone to log in. The console showed the familiar "No contact with BoKS. Only "root" may login." message. The good thing was that the master could still communicate with the client through the clntd channel, so at least I could do a sysreplace restore through cadm -s.

We were originally alerted to this problem after the client in question had started reporting that its /var partition had reached 100%. After logging in I quickly saw why: for over 24 hours the bridge_servc_s process had been dumping core, leaving hundreds of core dumps in /var/core. This also explained why logging in did not work while master-to-client comms were still OK. /var/adm/messages confirmed these crashes, showing that the boks_bridge process kept restarting and dying on a SIGBUS signal.

The $BOKS_var/boks_errlog file showed these messages between a restart and a rekill of BoKS:

boks_init@CLIENT Tue Jan 12 09:52:09 2010
  INFO: Max file descriptors 1024
boks_sshd@CLIENT Tue Jan 12 09:52:09 2010
  WARNING: Could not load host key: /etc/opt/boksm/keys/host.kpg
boks_udsqd@CLIENT Jan 12 09:52:09 [servc_queue]
  WARNING: Failed to connect to any server (0/1). Last attempt to ".servc", errno 146
boks_init@CLIENT Tue Jan 12 09:52:09 2010
  WARNING: Respawn process bridge_servc_s exited, reason: signal SIGBUS. Process restarted.
boks_udsqd@CLIENT Jan 12 09:52:10 [servc_queue]
  WARNING: Dropping packet. Server failed to accept it
boks_init@CLIENT Tue Jan 12 09:52:13 2010
  WARNING: Respawn process bridge_servc_s exited to often, NOT respawned
boks_init@CLIENT Tue Jan 12 09:53:26 2010
  WARNING: Dying on signal SIGTERM

This indicates that none of the replicas was accepting servc requests from the client, which again explains why one could not log in, nor use suexec etc. Checking the $BOKS_var/boks_errlog file on the replicas explained why the servc requests were being rejected:

boks_bridge@REPLICA Mon Jan 11 22:41:16 2010
  ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:04:06 2010
  ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:07:46 2010
  ERROR: Got malformed message from 192.168.10.113

And so on... After deliberating with FoxT tech support they concluded that the client must have had a message in its outgoing servc queue that had gotten damaged. They suggested that I make a backup of $BOKS_var/data/crypt_spool/servc and then remove the files in that directory. Normally it's not a good idea to remove these files, as they may contain password-change requests from users, but in this case there wasn't much else we could do. Remember though, leave the crypt_spool directory alone on the master and replicas, because that stuff's even more important!

What do you know? After clearing out the message queue the client worked perfectly. I'm now working with FoxT to find out which one of the few dozen messages was the corrupt one. In the process I'm trying to learn a little about the insides of BoKS. For example, looking at the message files it seems that either they were ALL deformed, or BoKS doesn't actually have a uniform format for them, because some contained a smattering of newline characters, while other files were one long line. I'm still waiting for a reply on that question.
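
For the record, the "back up first, then clear" step and my newline counting can be sketched as follows. This is an illustration with hypothetical helper names, not a FoxT-approved procedure: point it at the servc spool directory of the affected client only, never at the master or the replicas.

```python
import shutil
from pathlib import Path


def newline_counts(directory: Path) -> dict:
    """Map each file name to (size in bytes, number of newline characters)."""
    return {p.name: (p.stat().st_size, p.read_bytes().count(b"\n"))
            for p in sorted(directory.iterdir()) if p.is_file()}


def backup_and_clear(spool: Path, backup: Path) -> int:
    """Copy the spool directory to `backup`, then delete the spooled files.

    `backup` must not exist yet. Returns the number of files removed.
    """
    shutil.copytree(spool, backup)
    removed = 0
    for item in spool.iterdir():
        if item.is_file():
            item.unlink()
            removed += 1
    return removed
```

Running newline_counts() on the backup afterwards gives a quick overview of which message files are one long line and which ones contain line breaks.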



boks_set_passwd.ksh - Quick and dirty script to set a password

2010-01-11 17:32:00

Sometimes you're in a hurry and need to set a new, random password on an account. Don't feel that banging on the keyboard is random enough? Then use this script instead.


Usage of boks_set_passwd

./boks_set_passwd.ksh [HGROUP|HOST]:USER

Example: 
./boks_set_passwd.ksh SUN:thomas
./boks_set_passwd.ksh solaris2:root

Output

Three fields get echoed to stdout: the username, the password and the encrypted password string (should you ever need it).
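
If you only need the "random enough" part, something along these lines will do. Mind you, this is a Python sketch and not what the ksh script does internally; the alphabet and the default length are arbitrary choices of mine, and the encryption step is left out.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, to avoid shell quoting headaches.
ALPHABET = string.ascii_letters + string.digits


def random_password(length: int = 12) -> str:
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```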


Limitations


Download

Download boks_set_passwd.ksh
$ wc boks_set_passwd.ksh 
      92     389    2369 boks_set_passwd.ksh

$ cksum boks_set_passwd.ksh 
2167470539 2369 boks_set_passwd.ksh


Check_boks_rootpw - Script for monitoring of root password consistency

2010-01-10 21:51:00

In a BoKS domain root passwords are stored in a number of locations. In order to guarantee proper functioning of the root password one will need to verify that the password stored in all three locations is identical. The three locations are:

Brpf in this case stands for "BoKS Root Password File". It is used to allow the root user to login through a system's console if the BoKS client cannot communicate with the master server.

This script uses functionality from the boks_new_rootpw.ksh script to test all three locations of the BoKS root password.


Usage of check_boks_rootpw

./check_boks_rootpw.ksh [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG]  [-d -o FILE] [-f FILE]

-h HOST		Verify the root password for HOST. Multiple -h entries allowed.
-H HGROUP	Verify the root passwords for HOST GROUP. Multiple -H entries allowed.
-i FILE		Verify the root passwords for all hosts in FILE.
-A 		Verify the root passwords for ALL hosts.
-x EXCLUDE	Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
-X EXCLUDEHG	Host groups to exclude (when using -A). Multiple -X entries allowed.
-f FILE		Log file that lists errors in root password files. Default logs into $BOKS_var.
-d 		Debug mode. Provides error logging. Does a dry-run, not doing any updates.
-o FILE		Output file for debugging logs. Required when -d is passed.

Example: 
./check_boks_rootpw.ksh -h HOST1 -h HOST2 -f $BOKS_var/root.txt
./check_boks_rootpw.ksh -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x and -X parameters are allowed.

Output

This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:

0 = OK, everything OK.
1 = WARNING, a wrong parameter was entered.
2 = SEVERE, a root password is inconsistent. Check log file.
3 = CRITICAL, not used.
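
The consistency check comes down to comparing the stored password strings per host. The sketch below assumes that the same encrypted string is expected in every location; the location names other than "brpf" are placeholders, the data is invented, and the function names are hypothetical.

```python
OK, SEVERE = 0, 2  # status codes from the list above


def inconsistent_hosts(stored: dict) -> list:
    """Hosts whose stored root password strings are not all identical."""
    return sorted(host for host, copies in stored.items()
                  if len(set(copies.values())) > 1)


def status(stored: dict) -> int:
    """SEVERE as soon as one host is inconsistent, OK otherwise."""
    return SEVERE if inconsistent_hosts(stored) else OK


# Invented data: per host, the encrypted string found in each location.
stored = {
    "host1": {"location_1": "abc", "location_2": "abc", "brpf": "abc"},
    "host2": {"location_1": "abc", "location_2": "xyz", "brpf": "abc"},
}

print(inconsistent_hosts(stored), status(stored))
```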


Limitations


Download

Download check_boks_rootpw.ksh
$ wc check_boks_rootpw.ksh 
     467    2162   14401 check_boks_rootpw.ksh

$ cksum check_boks_rootpw.ksh 
3050878034 14401 check_boks_rootpw.ksh


Check_boks_ssmactive: Script to verify client BoKS security

2010-01-10 21:44:00

The check_boks_client script checks many different things on a per-client basis. That particular script needs to run locally on the client itself. This script, check_boks_ssmactive, is meant to do one quick check on a client, from the master server. The only thing it checks is whether BoKS security is actually active on the client, which is rather important!

By running this script from the master server you can blanket your whole domain in one blow.


Usage of check_boks_ssmactive

./check_boks_ssmactive [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG]  [-d -o FILE] [-f FILE]

-h HOST		Check whether BoKS security is active on HOST. Multiple -h entries allowed.
-H HGROUP	Check all hosts in HOST GROUP. Multiple -H entries allowed.
-i FILE		Check all hosts listed in FILE.
-A 		Check ALL hosts.
-x EXCLUDE	Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
-X EXCLUDEHG	Host groups to exclude (when using -A). Multiple -X entries allowed.
-f FILE		Log file that lists hosts that have BoKS disabled. Default logs into $BOKS_var.
-d 		Debug mode. Provides error logging. Does a dry-run, not doing any updates.
-o FILE		Output file for debugging logs. Required when -d is passed.

Example: 
./check_boks_ssmactive.ksh -h HOST1 -h HOST2 -f $BOKS_var/BOKSdisabled.txt
./check_boks_ssmactive.ksh -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x and -X parameters are allowed.

Output

This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:

0 = OK, everything OK or clients unreachable.
1 = WARNING, a wrong parameter was entered.
2 = SEVERE, one or more hosts are NOT secure. Check log file.
3 = CRITICAL, not used.

The log file in $BOKS_var (or specified with -f) will contain a list of hosts that have BoKS disabled.


Limitations


Download

Download check_boks_ssmactive.ksh
$ wc check_boks_ssmactive.ksh 
     440    2041   13544 check_boks_ssmactive.ksh

$ cksum check_boks_ssmactive.ksh 
3734761991 13544 check_boks_ssmactive.ksh


Boks_new_rootpw.ksh - Script for automatic changing of root passwords

2010-01-10 20:49:00

This script can be used to generate, set and verify a new password for any root account within your BoKS domain. It could be used as part of your monthly root password reset cycle, or for daily maintenance purposes. Functionality of the script includes:


Usage of boks_new_rootpw

./boks_new_rootpw [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG] [-f FILE] [-d -o FILE]

-h HOST		Change the root password for HOST. Multiple -h entries allowed.
-H HGROUP	Change the root passwords for HOSTGROUP. Multiple -H entries allowed.
-i FILE		Change the root passwords for all hosts in FILE.
-A 		Change the root passwords for ALL hosts.
-x EXCLUDE	Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
-X EXCLUDEHG	Hostgroups to exclude (when using -A). Multiple -X entries allowed.
-f FILE		Output file to store the new root passwords in. Default is stdout.
-d 		Debug mode. Provides error logging. Does a dry-run, not doing any updates.
-o FILE	Output file for debugging logs. Required when -d is passed.

Example: 
./boks_new_rootpw -h HOST1 -h HOST2 -f $BOKS_var/root.txt
./boks_new_rootpw -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x, and -X entries are allowed.

Output

If you do not use the -f flag to indicate an output file, the script will output everything to stdout. The output consists of a listing of hostname, plus root password, plus encrypted password string. Either way you may want to keep this output somewhere safe, for reference.

When running in debug/dry-run mode, the script outputs log messages to the output file specified with the -o flag. This file will show detailed error reports for failing root updates. BEWARE: THE DEBUG LOG WILL CONTAIN (UNUSED) ROOT PASSWORDS.

All (temporary) files created by this script are 0600, root:root. Duh! ^_^


Limitations


Download

Download boks_new_rootpw.ksh
$ wc boks_new_rootpw.ksh
     525    2549   16959 boks_new_rootpw.ksh

$ cksum boks_new_rootpw.ksh
4078240301 16959 boks_new_rootpw.ksh


Hacking BoKS 6.5 to run on Fedora

2010-01-10 15:10:00

The past few weeks I've spent a few hours here and there, trying to get BoKS 6.5 to run on Fedora 12. Why? Because FoxT's list of supported platforms only has commercial Linuxes on it. The last free version on the list is RedHat 7. I've asked my contacts at FoxT whether they're looking at converting BoKS for free Linuxes, like Fedora.

Unfortunately my efforts were only partially successful. I've used the base BoKS 6.5.2 package for RHEL, which requires a few tweaks to make it work. In the end I got SSH and SU to work properly, but "su -l" and telnet don't work. You can telnet into the Fedora box, but it's never checked for authorization, though servc on the master does receive the request. Also, "su -l" fails immediately with the message "su: password incorrect" without even asking for my password.

I've compiled a list of about a dozen tweaks and extra packages that are needed to get to this point, but I'm far from having a proper BoKS client on Fedora.



FoxT BoKS: changing a (root) user's password

2009-11-18 07:45:00

Speaking of overthinking things...

Recently I've been working on my script for the mass changing of root passwords, right? After working on it for a few days I've found three, no four, no five ways of changing a (root) user's password.

1. passwd $HOST:root

2. modbks -l $HOST:root -p "$ENCPASSWD"

3. boksauth -c FUNC=change_psw ... NEWPSW="$PASSWD"

4. boksauth -c FUNC=write TAB=1 ... +PSW="$ENCPASSWD"

5. restbase -s 1 ... $UPDATEFILE

Options 1 and 3 both use the plain text password string, where option 1 is obviously not useful for mass password changes because it's an interactive command. On the other hand options 2 and 4 both use the encrypted password string, thus creating the need for an encryption routine like Perl's "print crypt" method.

Options 3 and 4 are kludges because you're using the "boksauth" command to send calls directly to the servc process as if you were a piece of BoKS client software.

Option 5 is just too nasty to consider. Using the "restbase" command you can restore or overwrite parts of the BoKS database from plain text files in the BoKS dump ("dumpbase") format. This means that you could technically speaking make an update file containing an edited entry for the user in question, containing the new encrypted password string in the PSW field.

In my script I originally used option 2, but was dissatisfied with it because it did not update the PSWLASTCHANGE field in table 1. This in turn was screwing up our SOx audits, because all of our root passwords were listed as being over a year old which obviously wasn't true. This is why I switched to using "boksauth" and option 3.

And that's where the overthinking comes into the story. I don't know why both I and the guys from FoxT didn't think of this, but let's check the "modbks" man-page:

-L days = Set password last change date back days days.

Hooray for reading comprehension! /o/

This means that by simply adding "-L 0" to my modbks command I could've reset the PSWLASTCHANGE field to today. And it works for both BoKS 6.0 and BoKS 6.5. How did I miss this? I think I just need to sit down and read all BoKS man-pages because who knows what else I can come up with? :)
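
Put together, option 2 plus the "-L 0" trick looks like the sketch below. It only builds the command line from the flags mentioned in this post (-l, -p and -L) and does not run anything; the host name and the encrypted string are placeholders and the function name is hypothetical.

```python
import shlex


def modbks_command(host: str, enc_password: str, user: str = "root") -> list:
    """Build (but do not run) the modbks call for one account."""
    return ["modbks", "-l", f"{host}:{user}", "-p", enc_password, "-L", "0"]


# Placeholder values, just to show the resulting command line.
cmd = modbks_command("solaris2", "ENCRYPTED_STRING")
print(shlex.join(cmd))
```

Building the call as a list keeps the encrypted string intact when it is later handed to something like subprocess.run(), because no shell ever gets to interpret special characters in it.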



FoxT BoKS: forcing a user to change his password

2009-11-18 07:19:00

Sometimes I think too far out of the box :)

I have always been up front about what I think about FoxT's BoKS security software: it's good stuff, but sometimes it's a bit kludgy. Today I learned that I shouldn't let this cloud my judgment too much because sometimes BoKS -does- do things elegantly ^_^;

A colleague of mine asked me the following question: "Is it possible to force a user to change his password on the next login, -without- using the web interface?"

Seems straightforward enough, right? However, in my clouded mindset I completely overthought the whole matter and started digging in the database. Table 1 of the BoKS database should contain the relevant information, but which field could it be? Two fields seem to stand out, but neither is related.

BoKS > dumpbase -t1 | grep ru13rs

RLOGNAME="SECURITY:thomas" UID="1000" GID="1000" PROFILE="SecuritySupport" REALNAME="Thomas Sluyter" HOMEDIR="thomas" USERLASTCHANGE="1224244960" FLAGS="16384" PSW="39ajnasdlfkj4" PSWLASTCHANGE="1256545622" NO_PWDF="0" SERIAL="" PSWKEY="6436" LASTTTY="servera:pts/17" LASTLOGIN="1258524725" LASTLOGOUT="1258465492" RETRY="0" RESERVED1="125196" RESERVED2="" LOGINVALIDTIME="0" PSWVALIDTIME="0" CHPSWTIME="0" PSWMINLEN="0" PSWFORCE="0" PSWHISTLEN="0" CHPSWFREQ="0" TIMEOUT="0" TTIMEOUT="0" TDAYS="0" TSTART="0" TEND="0" RETRYMAX="0" CONCUR_LOGINS="0" SHELL="/bin/ksh" PARAMETERMASK="16384" PSDPSW="" PSDPSWLASTCHANGE="0" PSDPSWRETRIES="0" PSDBLOCKED="0" PSDBLOCKEDTIME="0" FEK="" GEKVER="" MD5DN="" LASTDTLOGIN="0" SETTINGVER=""

I've no clue what the NO_PWDF field does, but at least it does NOT stand for "no password force" :) Also, the field PSWFORCE does indeed have something to do with the enforcing of passwords, but not with the forced changing thereof. Instead it defines which guidelines and rules a new password must adhere to (see page 262 of the BoKS 6.5 admin guide). In the end our friendly FoxT support engineer informed me that the value I was looking for is a hex code that's part of the FLAGS field.
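
When staring at a record like the one above, it helps to split it into fields first. The sketch below assumes every field has the KEY="VALUE" form shown in the dump and that FLAGS is a decimal number; it uses a shortened copy of the record and says nothing about what the individual flag bits mean.

```python
import re
from datetime import datetime, timezone

# One KEY="VALUE" pair; values may contain spaces but no double quotes.
FIELD = re.compile(r'(\w+)="([^"]*)"')


def parse_record(line: str) -> dict:
    """Turn one dumpbase line into a dictionary of field name -> value."""
    return dict(FIELD.findall(line))


# Shortened copy of the record shown above.
record = parse_record(
    'RLOGNAME="SECURITY:thomas" REALNAME="Thomas Sluyter" FLAGS="16384" '
    'PSWLASTCHANGE="1256545622" PSWFORCE="0"'
)

print(hex(int(record["FLAGS"])))
print(datetime.fromtimestamp(int(record["PSWLASTCHANGE"]), tz=timezone.utc).date())
```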

However, that's not why I overthought things.

In his email the engineer also showed how he derived the appropriate hex value from the FLAGS field, which led to:

BoKS > man passwd

boksadm -S passwd [-f|-F] [-x debug level] [user]

-f This option forces the user to enter a new password on the next login. Valid for superuser only.

Duh!

EDIT:

Obviously you can also use modbks -l $USER -L $DAYS to set the PSWLASTCHANGE field for the user back X amount of days past the PSWVALIDTIME. However, this isn't very practical since the PSWVALIDTIME field differs per user :)

You'd also be messing with information that could be important to a SOx audit, so you'd better not do it this way ;)



Unixerius is now official partner of FoxT

2009-11-05 07:08:00


I am proud to announce that my employer, Unixerius, is FoxT's official partner for the Benelux as of November 2009. We will be FoxT's preferred partner for the delivery of:

* BoKS Access Control licenses

* Pre-sales consulting

* After-sales consulting

* Implementation projects

* Daily management of BoKS infrastructures

* Training

It took us a year of lobbying, from planting the initial thought in my boss's head to getting the final signature on paper. I'm very glad that we finally managed to get the title and am looking very much forward to working with FoxT on improving both their market in the Netherlands as well as the product itself.



What is FoxT BoKS? A short introduction

2009-10-25 15:59:00

Boiling it down to one sentence, one can say that BoKS enables you to centrally manage user accounts and access permissions, based on Role Based Access Control (RBAC).

The following article is also available as a PDF.




What is FoxT BoKS?

BoKS Access Control is a product of the Swedish firm FoxT (Fox Technologies), intended for the centralized management of user authentication and authorization (Role Based Identity Management and Access Control). The name is an abbreviation of the Swedish "Behörighet- och KontrollSystem", which roughly translates as "Legitimacy and Control System".

Some key features of BoKS are:

Using BoKS you decide WHEN WHO gets to access WHICH servers, WHAT they can do there and HOW.

BoKS is a standalone application and requires no modifications of the server or desktop operating systems.


An example: Role Based Access Control

BoKS groups user accounts and computer systems based on their function within the network and the company. Each user will fit one or more role descriptions and each server will be part of different logical host groups. One could say that BoKS is a technical representation of your company's organisation, where everyone has a clearly defined role and purpose.

Let us discuss a very simple example, based on a BoKS server, an application server and a database server.

* Your database admins will obviously need access to their own workstations. Aside from that they will be allowed to use SSH to access those servers in the network that run their Oracle database. Because BoKS is capable of filtering SSH subsystems, the DBAs will get access to the command line (normal SSH login) and to SCP file transfer. All other SSH functions (like port forwarding, X11 tunneling and such) will be turned off for their accounts. Using the BoKS Oracle plugins your DBAs' accounts will also be allowed to administer the actual databases running on the server.

* The sysadmins will be allowed full SSH access from their workstations to all of the servers in the network. Aside from their own user accounts they will also be allowed to log in using the superuser account, but that will be limited to each server's console to limit the actual risk of abuse. Because the system administrators are expected to provide 24x7 support they will also be allowed to create a VPN connection to the network, through which they can also use SSH. However, this particular SSH will only work if they have authenticated themselves using an RSA token. To ensure a separation of duties the system administrators will not be allowed access to any of the applications or databases running on the servers.

* The actual users of BoKS, security operations, will gain SSH access to the BoKS security server. Aside from that they will also be allowed access to the BoKS web interface, provided that they've identified themselves using their PKI smart card.

Key features of BoKS

Centralized management of user accounts
No longer will you have to locally create, modify or remove user accounts on your servers. BoKS will manage everything from its central security server(s), including SSH certificates, secondary Unix groups and personal home directories.

Centrally defined access rules
Users will only be allowed access to your computer systems based on the rules defined in the BoKS database. These rules define permissible source and destination systems as well as the (time of) day and the communications protocols to be used.

Role based access control
Access rules can be assigned to individual users as well as to roles. By defining these user classes you can create and apply a set of access rules for a whole team or department in one go. This will save you time and will also lower the risk of human error.

Extensive audit logging
Every authentication request that's handled by BoKS is stored in the audit logs, so you will be able to see at all times what has happened in your network. BoKS can also log every keystroke performed by a superuser (root) account, giving you greater auditing capabilities.

Real-time monitoring
The BoKS auditing logs are updated and replicated in real-time. This allows you to use your existing monitoring infrastructure to monitor for undesired activities.

Support for most common network protocols
BoKS provides authentication and authorization for the following protocols: login, su, telnet, secure telnet, rlogin, XDM, PC-NFS, rsh and rexec, FTP and SSH. The SSH protocol can be further divided into ssh_sh (shell), ssh_exec (remote command execution), ssh_scp (SCP), ssh_sftp (SFTP), ssh_x11 (X11 forwarding), ssh_rfwd (remote port forwarding) and ssh_fwd (local port forwarding).

Delegated superuser access
Using "suexec" BoKS allows your users to run a specified set of commands using the superuser (root) account. Suexec access rules can be specified on both the command and the parameter level, allowing you great flexibility.

Integration with LDAP and NIS+
If so desired BoKS can be integrated into your existing directory services like LDAP and NIS+. This enables you to connect to automated Human Resources processes involving your users.

Redundant infrastructure
By using multiple BoKS servers per physical location you will be able to provide properly load balanced services. Your BoKS infrastructure will also remain operable despite any large disasters that may occur. Disaster recovery can be a matter of minutes.


Product comparison

Feature                                   OpenLDAP   eTrust AC   BoKS AC
Centralized authentication management     Y          Y           Y
Centralized authorization management      Y (1)      Y           Y
Role based access control                 N          Y           Y
SSH subsystem management                  N          N           Y
Monitoring of files and directories       N          Y           Y
Access control on files and directories   N          Y           N
Delegated superuser access                Y (2)      Y           Y
Real-time security monitoring             Y (3)      Y           Y
Extensive audit logging                   N          Y           Y
OS remains unchanged                      Y          N           Y
User-friendly configuration               N          Y           Y
Reporting tools                           N          Y           Y
Password vault functionality              N          Y           Y (4)

1: Only for SSH.
2: Using additional software.
3: Locally, using syslog.
4: Using the optional BoKS Password Manager module.



What is FoxT BoKS? A short introduction

2009-10-20 07:36:00

Summed up in one sentence: BoKS lets you manage user accounts and access permissions from a central server, based on Role Based Access Control (RBAC).

The following article is also available as a PDF.




What is FoxT BoKS?

BoKS Access Control is a product of the Swedish firm FoxT, intended for the centralized management of user authentication and authorization (Role Based Identity Management and Access Control). The name is an abbreviation of the Swedish "Behörighet- och KontrollSystem", which translates as "Legitimacy and control system".

Important features of the package include:

Using BoKS you decide WHO gets access WHEN to WHICH servers, WHAT they may do there and HOW.

BoKS is a standalone application and requires no modifications to the operating system of your servers and desktop systems.


A practical example: Role Based Access Control

Users and computer systems are grouped in BoKS based on their function within the network. Each user can hold one or more roles and each server is part of several host groups. The BoKS database is in effect a representation of the organization chart, in which everyone fulfils their own role within the company.

As an example we take a network with a BoKS security server, an application server and a database server.

* The database administrators get access to their own workstations. In addition they are allowed to log in to their Oracle servers using SSH. Because BoKS can also filter on SSH subsystems, the DBAs get access to the command line and can copy files using SCP. They will, however, not be able to use X11 forwarding or SSH port forwarding. Using the BoKS Oracle plugin their user accounts are also created within Oracle itself, giving them full control over their databases.

* The system administrators get SSH access from their workstations to all servers in the network. To carry out their work they get access to all SSH functions and may additionally log in on the console with the superuser account. Because the system administrators provide 24x7 support they may also log in with SSH over a VPN connection. However, they will only be allowed to do so once they have authenticated with an RSA token. Because of the strict separation of duties, the system administrators will not get access to the applications and databases running on the servers.

* Security operations, the actual users of BoKS, get SSH access to the BoKS security server. In addition they get access to the BoKS web interface, provided they identify themselves using a smart card with a PKI certificate.

Key features of BoKS

Centralized management of user accounts
Creating, modifying and removing user accounts and related items no longer has to be done locally. BoKS manages not only user accounts, but also SSH certificates, secondary Unix groups and home directories.

Centrally defined access rules
Users get access to systems based on access rules in the BoKS database. These rules set requirements for both the source and the destination system, the time of day and the protocol used.

Role based access control
Access rules can be assigned to individual users, but can also be tied to roles. This makes it possible to define a set of access rules per department, which saves a lot of time and reduces risk.

In-depth audit logging
Every authorization request handled by BoKS is stored in the audit logs, so one can see at any time what has happened in the network. In addition, keystroke logging can be enabled for the superuser, so that a record is kept of which commands a user has run.

Real-time monitoring capabilities
The BoKS audit logs are appended in real time, which makes it possible to tie alarms in monitoring tools to particular situations.

Support for all common protocols
BoKS supports authentication and authorization checks for the following protocols: login, su, telnet, secure telnet, rlogin, XDM, PC-NFS, rsh and rexec, FTP and SSH. The SSH protocol can be further divided into ssh_sh (shell), ssh_exec (remote command execution), ssh_scp (SCP), ssh_sftp (SFTP), ssh_x11 (X11 forwarding), ssh_rfwd (remote port forwarding) and ssh_fwd (local port forwarding).

Delegated superuser access
Using the suexec functionality of BoKS, users can be given very limited access to superuser accounts. The suexec access rules can specify, down to the parameter level, which commands may be run as root.

Integration with LDAP and NIS+
If desired, BoKS can work together with directory services such as LDAP and NIS+. Among other things this makes it possible to hook into automated HR processes for staff joining and leaving the company.

Redundant infrastructure
Using multiple BoKS servers per physical location makes load balancing possible. During a catastrophe the BoKS infrastructure will remain available, and disaster recovery can be achieved within a reasonable time.


Product comparison

Feature                                   OpenLDAP   eTrust AC   BoKS AC
Centralized user management               Y          Y           Y
Centralized authorization management      Y (1)      Y           Y
Role based access control                 N          Y           Y
SSH subsystem management                  N          N           Y
Monitoring of files                       N          Y           Y
Access control on files                   N          Y           N
Delegated superuser access                Y (2)      Y           Y
Real-time security monitoring             Y (3)      Y           Y
In-depth audit logging                    N          Y           Y
OS remains unchanged                      Y          N           Y
User-friendly configuration               N          Y           Y
Reporting tools                           N          Y           Y
Password vault functionality              N          Y           Y (4)

1: Only for SSH.
2: Using additional software.
3: Decentralized, using for example syslog.
4: Using the BoKS Password Manager module.



BoKS database tables - an overview

2009-10-08 08:54:00

Documentation on the actual contents and makeup of the BoKS database is sparse and hard to find. The BoKS system administrator's manual doesn't mention any details, nor does FoxT's website. This isn't very odd, because in general FoxT would not recommend that people muck about in the database. However in some cases it's very important to know what's what and how you can extract information. Case in point, my earlier database dump script for migrations.

In the past I've pieced together an overview of the various database tables, which is far from complete. I still need to update this list using some unofficial BoKS documentation, but below you'll find the summary as it stands now.

In the meantime you can find the unofficial documentation of the BoKS database tables by reading the following file on your BoKS master: $BOKS_lib/gui/tcl/base/boksdb.tcl

 

BoKS database tables

#   Contents                            #   Contents
0   System parameters                   27  -
1   User accounts                       28  -
2   User access routes                  29  -
3   -                                   30  -
4   SSH authentication methods          31  User SSH authenticators
5   Currently logged-in users           32  -
6   Hosts                               33  ? don't know yet ?
7   Host group -> host                  34  Certificates for HTTPS et al
8   ? don't know yet ?                  35  -
9   Host -> host group                  36  -
10  -                                   37  Suexec program groups AND LDAP server names
11  ? don't know yet ?                  38  ? don't know yet ?
12  -                                   39  -
13  -                                   40  -
14  Certificates for HTTPS et al        41  Server virtual cards ?
15  IP address -> host                  42  -
16  User class access routes            43  -
17  User classes                        44  BoKS users -> LDAP entries
18  -                                   45  -
19  -                                   46  -
20  Log rotation settings, see logadm   47  Unix group -> GID
21  -                                   48  User -> GID
22  Seccheck and filmon settings        49  User -> user class
23  LDAP bind settings                  50  -
24  -                                   51  -
25  Password complexity settings        52  -
26  -                                   53  -
                                        54  -

 

BoKS database interconnections diagram

BoKS database tables diagram

 

BoKS database relational schema

BoKS database relational schema

My colleagues Erik Bleeker and Patryck Winkelmolen have created a lovely Visio diagram of the BoKS database, its tables and fields and the relations between all of these. It took them quite a while to complete the puzzle, so they should be proud of their work! Lucky for us they were friendly enough to share the drawing with the rest of the world. I've included the Visio schematic over here with their permission.



Making managing BoKS sub-administrators easier

2009-09-28 10:23:00

BoKS' administrative GUI is far from a work of art, at least in those versions I've worked with (up to and including 6.5.3). The web interface feels kludgy and it's apparent that it was designed almost ten years ago. I'm aware that FoxT are working on a completely new Java-driven GUI, so I'm very curious to see how that turns out!

In the meantime I've asked them to look at a GUI improvement that they might not have thought of before: the management of sub-administrators.


How do sub-administrators work?

In BoKS one can opt to delegate certain administrative tasks to other departments. For example, one could delegate the creation of simple Unix user accounts to the help desk in order to free up time for the 2nd and 3rd lines of support to do "important" things. In BoKS people with delegated access are called sub-administrators. It's important to remember that -everybody- with the "BOKSADM" access route gets full access to the BoKS web interface, unless they're defined as sub-admins.

According to the BoKS manual the following tasks can and cannot be delegated.


CAN be delegated                          CANNOT be delegated
User Administration                       Host Administration (partial)
Access Control (partial)                  LDAP Synchronization
Host Administration (partial)             Password Administration
Virtual Card Administration               UNIX Groups Administration
Encryption Key Administration (partial)   Sub-Administrator Configuration
Log Administration                        BoKS Agent Configuration
Integrity Check                           Authenticator Administration
File Monitoring                           CA Administration
Database Backup
User Inactivity Monitoring

Within each section it's possible to further limit the administrative rights. For example, if you allow your help desk to create simple Unix accounts you may want to limit them to a certain number of user classes, host groups or UID ranges. This can be done, but is quite a hassle: you will need to configure each sub-administrator separately. Frankly, doing this through the web interface sucks, especially if you have a huge list of user classes and want to include/exclude large numbers of classes.

Luckily there is a way to make things a -little- easier for yourself.


What happens under the hood? Configuration files galore!

I found out that all sub-administrator configuration is held on the file system and NOT in the BoKS database. I found this a bit odd, as it seems logical to keep stuff like this in the DB. This is also why I issued my original feature request: to bind sub-admin rights to BoKS user classes. But no, for now (BoKS 6.5.3 and lower) this config is held in $BOKS_var/subadm.

After enabling sub-administrator access for a particular user BoKS will create a new file in this directory, called $HOSTGROUP:$USERNAME.cfg thus binding it to a specific account. Browsing through this file I discovered how the access limitations work and to be honest: IMNSHO it's a kludge. For each particular section of the BoKS interface you will find a function (TCL subroutine?) that looks something like this:

boks_subadmin_check_$SECTION {
if "getlist" { return "ENTRY1 ENTRY2 ENTRY3 ... ENTRYn" }
if "changeitem matches ENTRY1 || ENTRY2 || ENTRY3 || ... || ENTRYn" { return 1 }
}

That's right, the configuration file actually contains subroutines that return a 0 or a 1 depending on which access rights you've given the user. If you've given him access to a hundred user classes there will be a subroutine with an IF-statement that has a hundred || OR-statements. Ouch. I've said it before and I'll say it again: it's time for a proper (relational) database.


Work around: making sub-administrator templates for BoKS

The way to make managing sub-administrators easier is not very user-friendly, but it's surprisingly easy.

  1. Convert one team member to sub-admin.
  2. Configure this one person through the web interface.
  3. Login to the BoKS master server and become root.
  4. Go into $BOKS_var/subadm.
  5. Copy $HOSTGROUP:$USERNAME.cfg to $USERCLASS.template, then chmod 644.
  6. for USER in $(classadm -L -u $USERCLASS); do cp -p $USERCLASS.template $USER.cfg; done

Done!
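
Steps 5 and 6 can be wrapped in a small helper. This is only a sketch and not something that ships with BoKS: the function name apply_subadm_template is made up, and it reads the account names (in the HOSTGROUP:USERNAME form used for the .cfg files) from standard input, so it makes no assumptions about the exact output of classadm.

```shell
#!/bin/sh
# Hypothetical helper: copy one sub-admin template to every account read from stdin.
# Usage: apply_subadm_template DIRECTORY TEMPLATE < list_of_accounts
apply_subadm_template() {
    dir=$1
    template=$2
    # Refuse to continue if the template is missing.
    if [ ! -f "$dir/$template" ]; then
        echo "No such template: $dir/$template" >&2
        return 1
    fi
    while IFS= read -r account; do
        # Skip empty lines in the input.
        [ -n "$account" ] || continue
        # cp -p preserves the mode and timestamps of the template.
        cp -p "$dir/$template" "$dir/$account.cfg" || return 1
    done
}
```

Assuming classadm prints one account per line, as the loop in step 6 expects, it could be fed like this: classadm -L -u $USERCLASS | apply_subadm_template $BOKS_var/subadm $USERCLASS.template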

Obviously you'll want to copy $BOKS_var/subadm to all your replica servers as well. If you don't, you'll give -everyone- with a "BOKSADM" access route full access to the GUI. I suggest setting up an rsync for this.


One final, very big "gotcha!"

My colleague Wim realized that the current way of sub-admin delegation has one very big flaw. Every time you add a new host group or user class you will need to update all .CFG files to match this. Of course, using the aforementioned templates will make this easier because you can update one file and then copy it to the whole team. But still...



Submitted a bugfix to fix a BoKS bugfix

2009-09-24 10:17:00

This morning I discovered a bug in one of FoxT's "hotfixes" (aka patch, bugfix) for BoKS 6.0.x. Maybe the problem exists for other BoKS versions as well. The hotfix in question is TFS 061016-115513 which enables BoKS 6.0 to work with the ssh_pk_optional authentication method. Before this hotfix you were forced to use either password or SSH key authentication, but never both. With the hotfix applied you can now use SSH key authentication, but fall back to password if the keys are missing.

Anywho... I found out that on Solaris 10 the hotfix does not actually replace all necessary files if you run BoKS 6.0. Here's the list of files that get replaced:

Sol10 = boks_sshd, mess.eng

Sol8 = boks_sshd, mess.eng, boks_servc_d, method.conf, plus a few GUI forms.

After conferring with BoKS-guru Wilfrid at FoxT it seems that the patch treats Solaris 10 hosts as client-only systems, which sucks when you're applying it to a replica or master server. In order to fix a Sol10 replica/master you'll need to manually copy the files from the Sol8 part of the fix to their intended destinations. This should work without any problems as Sol10 is fully backwards compatible with Sol8.



Published three new BoKS admin scripts

2009-09-12 23:01:00

The past few months I've been working on some BoKS scripts. Let's say that my daily job has inspired me to write a number of scripts that I just -know- are going to be useful in any BoKS environment. I've got plenty of ideas for both admin and monitoring scripts and finally I'm starting to see the fruits of my labour!

All of these scripts were written in my "own" time, so luckily I can do with them as I please. I've chosen to share all these scripts under the Creative Commons license which means that you can use them, change them and even re-use them as long as you attribute the original code to me. I guess it sounds a bit like the GPL.

Anywho, for now I've published three scripts, with more to come! All scripts can be found in the Sysadmin section of my site, in the menubar. So far there are:

1. boks_safe_dump, which creates database dumps for specific hosts and host groups.

2. boks_new_rootpw, which sets and verifies new passwords on root accounts.

3. check_boks_replication, a monitor script to make sure BoKS database replication works alright.

As they say in HHGTTG: Share and enjoy!



BoKS_safe_dump - Script for making BoKS database dumps

2009-09-11 15:30:00

From time to time one will need a BoKS database dump that includes all the tables, but is limited to one or two specific applications. For example, one could be migrating an application or hostgroup to another BoKS domain. Or one might be performing a security audit on a specific group of servers.

This script will make a dump of all BoKS information relevant to a set of specified servers or host groups. It will strip the password information for all accounts (for obvious security reasons).


Usage of boks_safe_dump

./SafeDump.ksh [-g HOSTGROUP] [-h HOST | -f FILE] [-p] -d DIRECTORY
-g HOSTGROUP	Hostgroup to dump the BoKS information for. Multiple allowed.
-h HOST		Host to dump the BoKS information for. Multiple allowed.
-f FILE		List of hostnames to dump the BoKS information for. 
-p		Disable hiding of account passwords for non-root accounts.
-d DIRECTORY  	Location to store the output files.

Examples:
$PROGNAME -f /tmp/hostlist -d /tmp/BOKSdump
$PROGNAME -g HG_APP1 -g HG_APP3 -d /tmp/BOKSdump
$PROGNAME -g HG_APP1 -h HOST1 -h HOST5 -d /tmp/BOKSdump

Output

The script creates a new directory (indicated with the -d flag) which will contain a number of files called tableN. "N" in this case refers to the relevant table from the BoKS database. The following tables are dumped.

01. Contains all user accounts.
02. Binds access routes to individual users.
06. Contains all host information.
07. Binds host groups to hosts.
09. Binds hosts to host groups (reverse of table 7).
15. Binds IP address to hostname (reverse of table 6).
16. Binds access routes to user classes.
17. Contains all user classes.
31. Contains SSH settings for individual users.
47. Contains all Unix groups.
48. Binds secondary Unix groups to individual users.
49. Binds user accounts to user classes.


Limitations


Download

Download boks_safe_dump.ksh
thomas$ wc boks_safe_dump.ksh
380    1462   10781 boks_safe_dump.ksh

thomas$ cksum boks_safe_dump.ksh
3833439207 10781 boks_safe_dump.ksh


Known software issues when working with FoxT BoKS

2009-09-11 08:12:00

Unfortunately not all software plays nicely with BoKS. Some packages have special needs, or need to be configured in a particular manner. This page discusses the known issues. Luckily in most cases all you need to do is tweak one or two settings.


ProFTPd

We have found that recent versions of ProFTPd report FROMHOST IP addresses in the IPv6-IPv4 hybrid mode. This currently (Feb 2010) breaks the BoKS login call because the servc daemon cannot process a FROMHOST formatted as :::ffff:192.168.0.1. You will not see any logging in the BoKS transaction log, but if you bdebug the ftpd process on the agent you'll see that servc returns an ERR-9.

For some reason, using the -ipv4 or -4 flags on the command line to force ProFTPd into IPv4 mode does not work. Instead you will need to edit proftpd.conf and set the flag "UseIPv6" to "off" (Source).


F-Secure

Connecting to a BoKS server with F-Secure

SSH keys generated by F-Secure are usually in the SSH2 format. Before you can import them on your BoKS server they will need to be converted to OpenSSH format. You cannot simply add them to ~/.ssh/authorized_keys. This conversion is done using the "ssh-keygen" command on your Unix box.

  1. Copy the source user's SSH2 public key (RSA or preferably DSA) to your server.
  2. Login as the destination user.
  3. Run: cd ~/.ssh
  4. Run: /opt/boksm/bin/ssh-keygen -i -f $PATH/TO/PUBKEY >> authorized_keys

You have now converted and added the public key to the authorized_keys file.

Now, if you forego the use of SSH keys and would like to use passwords instead, you will need to force F-Secure SSH to use the "keyboard-interactive" authentication method. By default it will use "password", which will not work properly. The two methods are very similar, in that "keyboard-interactive" actually includes "password" authentication, but it adds a few handshakes that BoKS' OpenSSH needs.

If you're coming from a Unix server you'll need to enable "keyboard-interactive" in either your personal ssh_config file, or in the systemwide file under /etc/ssh/ssh_config.
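
As an illustration, a per-host entry could look like the fragment below. This is a sketch that assumes a stock OpenSSH client, where PreferredAuthentications sets the order in which authentication methods are tried; the host name is a placeholder, and other SSH clients use different option names.

```text
# ~/.ssh/config (or /etc/ssh/ssh_config for all users)
# "boksserver.example.com" is a placeholder host name.
Host boksserver.example.com
    PreferredAuthentications keyboard-interactive
```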


Connecting to an F-Secure server with BoKS' OpenSSH

Again there's a difference, in that F-Secure uses SSH2 keys as opposed to the OpenSSH format. Your key will need to be transformed before transferring it to the remote server. The authorized_keys file on the other side will also work differently from what you're used to. The F-Secure authorized_keys file is not a list of keys, but a list of pubkey file names.

  1. Login as the source user.
  2. Run: cd ~/.ssh
  3. Run: /opt/boksm/bin/ssh-keygen -o -f ./id_dsa.pub
  4. Copy the resulting output.
  5. Login to the remote server as the destination user.
  6. Run: cd ~/.ssh
  7. Paste the copied text into a new public key file, like id_dsa.remote.pub.
  8. Edit the authorized_keys file and add a line: key id_dsa.remote.pub

ComForte SFTP

ComForte is an SFTP client used on Tandem servers. It's not a piece of client software like the ones we're used to! It was originally meant for file transfer between Tandem servers. From our experiences it seems to be a daemon running on Tandem that acts as a pass-through for regular FTP traffic, which it then sends through SSH or SSL. It's really rather wonderfully weird :)

We've seen in the past that ComForte SFTP cannot work with keyboard-interactive authentication, since the client software simply does not recognize the method returned by BoKS. Unfortunately to my knowledge BoKS' SSH daemon in turn does not allow the old "password" method to be enabled. Hence with ComForte we must use SSH public key authentication. That's the only way it's going to work.

I have actually never witnessed the configuration process of ComForte, but it seems to work something like this.


Putty and WinSCP

Putty and WinSCP are based on the same piece of simple, elegant software and both should work straight out of the "box". Seeing how they're standalone binaries you won't even have to actually install them in Windows.

If you do discover that your password-based login fails, make sure to check your SSH authentication settings. Just like with F-Secure the "keyboard-interactive" method should be enabled and at the top of your list.

Update 10 Sept 2009:

My colleague Frank vd Bilt has informed me of a semi-bug in a very recent version of Putty. Apparently this version of Putty bombs when used together with the boks_sshd daemon. Even a few "ls -lrt" commands are enough to crash the connection. The error message you'll get is: Disconnected: Received SSH_MSG_CHANNEL_SUCCESS for "winadj@putty.projects.tartarus.org".

You can read the Putty bug report over here.


Caveats and gotchas for the FoxT BoKS administrator

2009-09-04 15:19:00

Despite its long life (it's been with us for over ten years now!), BoKS has a number of caveats, or gotchas, that one needs to keep in mind at all times. Some of the points below clearly fall in the "not a bug, but a feature" category, but that doesn't mean you shouldn't be aware of them.

So, here's a list of things that can easily lead to problems.


No protection against duplicate UIDs and GIDs

BoKS will not prevent you from re-assigning the same UID to many different users, nor will it prevent the re-use of the same GID for different groups. You may do this intentionally or accidentally. Either way it's a very good idea to regularly check for duplicate UIDs and GIDs. The thing is, if such a duplication occurs on a server it will have a very hard time figuring out to whom a file or a process belongs. Usually this is left up to the order in which the entries occur in /etc/passwd or /etc/group.

Obviously it's best NOT to use duplicate UIDs and GIDs. However, preventing this will require a centralised database of some sort that all your security personnel refer to and which is used to lay claim to unused IDs.
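
A regular check for duplicates can be as simple as the sketch below. It is not a BoKS tool: the function name is made up, and it only reads a passwd- or group-style file (colon-separated, with the numeric ID in the third field) and prints every ID that is claimed by more than one entry.

```shell
#!/bin/sh
# Print every numeric ID shared by more than one entry in a passwd- or
# group-style file (colon-separated, ID in the third field).
# Usage: find_duplicate_ids FILE
find_duplicate_ids() {
    awk -F: '
        # Ignore comments and lines that are too short to hold an ID.
        /^#/ || NF < 3 { next }
        {
            count[$3]++
            names[$3] = (names[$3] == "" ? $1 : names[$3] "," $1)
        }
        END {
            for (id in count)
                if (count[id] > 1)
                    print id ": " names[id]
        }
    ' "$1" | sort -n
}
```

Run it against /etc/passwd for UIDs and against /etc/group for GIDs; no output means no duplicates were found in that file.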


No protection against mismatches in UIDs and GIDs

The exact opposite to the previous is also true: BoKS thinks it's perfectly alright for you to use different UIDs for multiple accounts with the same user name. For example, SUN:peter and AIX:peter may have two completely different UIDs. In the case of normal user accounts this may be problematic, but in the case of applicative accounts (like the "oracle" or "sybase" users) this may lead to disaster.

The same goes for Unix groups: it's possible to have multiple groups with the same names, yet different GIDs. See above for the repercussions.
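
A periodic check for such mismatches could look like the sketch below. It assumes you have exported your accounts to a plain text file with one "HOSTGROUP:USERNAME UID" pair per line; both that file format and the function name are assumptions made for this example, not native BoKS output.

```shell
#!/bin/sh
# Report user names that carry more than one UID across host groups.
# Input (an assumed export format): lines of "HOSTGROUP:USERNAME UID".
# Usage: find_uid_mismatches FILE
find_uid_mismatches() {
    awk '
        NF >= 2 && index($1, ":") > 0 {
            split($1, parts, ":")      # parts[1] = host group, parts[2] = user name
            user = parts[2]
            # Remember each distinct UID only once per user name.
            if (!((user, $2) in seen)) {
                seen[user, $2] = 1
                uids[user] = (uids[user] == "" ? $2 : uids[user] " " $2)
                distinct[user]++
            }
        }
        END {
            for (user in distinct)
                if (distinct[user] > 1)
                    print user ": " uids[user]
        }
    ' "$1" | sort
}
```

The same approach works for Unix groups if the input lists group names and GIDs instead.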


No protection against manual editing of local files

The way BoKS propagates user accounts and groups to a server is by updating the local security files, such as /etc/passwd and /etc/group. Each time a change is made to a user account BoKS will automatically change the contents of these files. However, there are two issues we have run into with regards to the local security files.

  1. Any information not in BoKS is left untouched.
  2. Manual changes to information from BoKS are not corrected.

Re item 1: Usually a number of accounts present after a default OS install are not added to BoKS; think of users like uucp, lp, nobody and sys. These accounts may be needed at one point in time, so BoKS will leave any accounts or groups it does not have knowledge of alone. It will work around this information in the local security files. This leads to ...

Re item 2: Unfortunately this means that it's possible for someone with root access to add accounts to the server that cannot be traced. Of course, assuming that BoKS is up and running the account will not be able to be used because there are no access routes. These manual edits however may completely mess up other accounts that -are- in BoKS.

Say for example that BoKS contains a user "oracle" with UID 1234. If the local passwd file happens to contain another "oracle" user with UID 1200 (which was possibly added by a post-install script) things will go horribly wrong.

Manual changes to accounts or groups that -do- exist in BoKS are rectified by BoKS. However, this only occurs when you make a change to an account, after which BoKS overwrites the "faulty" information.
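
The "two oracle users" situation can be spotted locally with a sketch like the one below. The function name is made up for this example; it only reads a passwd-format file and reports login names that occur more than once, together with the UIDs they carry.

```shell
#!/bin/sh
# Report login names that appear more than once in a passwd-format file.
# Usage: find_duplicate_names FILE
find_duplicate_names() {
    awk -F: '
        # Ignore comments and lines that are too short to hold a UID.
        /^#/ || NF < 3 { next }
        {
            count[$1]++
            uids[$1] = (uids[$1] == "" ? $3 : uids[$1] " " $3)
        }
        END {
            for (name in count)
                if (count[name] > 1)
                    print name ": " uids[name]
        }
    ' "$1" | sort
}
```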


No warning against account overlap with user names from different host groups

Simply put, BoKS will not issue any warning if there is an overlap of two user accounts made in different host groups. This becomes especially problematic when combined with the second item on this page: no protection against mismatches in UIDs and GIDs.

Let's say we have user accounts SUN:peter (UID 20001) and ORACLE:peter (UID 21003). Now let's say we add SERVERA to both hostgroups SUN and ORACLE. Both "peter" accounts will be added to /etc/passwd with the confusion that is to be expected.

Again, one can prevent a lot of problems by not using different UIDs for the same account. Also, it is a -very- good idea to minimise the number of copies that exist of one user. I've seen cases where one person had no fewer than five different accounts, all with the same name but in different host groups. That's easy to mess up!
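
Since BoKS will not warn about this, a small check of your own can help. This is only a sketch under assumptions: it expects that you have already exported the accounts as ("HOSTGROUP:user", UID) pairs by whatever means you prefer, and it flags user names that would reach one host from several of its host groups with differing UIDs. The "anna" account is hypothetical; the two "peter" accounts come from the example above.

```python
from collections import defaultdict


def overlapping_accounts(accounts, host_groups):
    """Report user names that reach a host with differing UIDs.

    accounts:    iterable of ("HOSTGROUP:user", uid) pairs
    host_groups: the host groups the server is a member of
    Returns {user: {hostgroup: uid}} for every clashing user name.
    """
    seen = defaultdict(dict)
    for key, uid in accounts:
        group, _, user = key.partition(":")
        if group in host_groups:
            seen[user][group] = uid
    return {
        user: by_group
        for user, by_group in seen.items()
        if len(set(by_group.values())) > 1
    }


# SUN:peter and ORACLE:peter are from the text; SUN:anna is made up.
ACCOUNTS = [("SUN:peter", 20001), ("ORACLE:peter", 21003), ("SUN:anna", 20002)]

if __name__ == "__main__":
    # SERVERA is a member of both host groups.
    print(overlapping_accounts(ACCOUNTS, {"SUN", "ORACLE"}))
```

Accounts that overlap with identical UIDs are deliberately not reported here, since those are the harmless case.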



BoKS troubleshooting: servc error messages in the log file

2009-08-27 08:49:00

BoKS logs all transactions into $BOKS_var/data/LOG, which then gets rotated to another location of your choosing. Every single request that's handled by BoKS gets logged, detailing who did what, where, when and why. If a transaction fails, the servc process will indicate the error message in the log file. This may not always make clear what is wrong (like the infamous and useless ERR223), but it sure helps you in your troubleshooting.

All of the error messages are listed in the BoKS administration manual. However, since a lot of people also choose not to RTFM I thought I might as well copy the list over here ^_^.

You will also find a more up-to-date list of these messages in $BOKS_var/mess.eng, which acts as a translation file between BoKS errors and plain English.





ERR_SERVC_NEED_MORE 2

Sent by servc when it decides it needs more info from a client (NEED=something is set in the string sent back).


ERR_SERVC_GAVE_UP 1

Servc cannot get in contact with database.


ERR_SERVC_COMM_ERROR -1

Communication error. Probably wrong nodekey. Set a new nodekey on the machine. Check also that xservc is running by using lsmqueid.


ERR_SERVC_READ_ERROR -2

Read error from database


ERR_SERVC_WRITE_ERROR -3

Write error to database


ERR_SERVC_CORRUPT_BASE -4

Erroneous database


ERR_SERVC_NO_AUTH -5

No authorization


ERR_SERVC_UNKNOWN_HOST -6

Host unknown


ERR_SERVC_NO_SERVC -7

Call to servc failed


ERR_SERVC_UNKNOWN_CLIENT -8

Unknown client type


ERR_SERVC_BAD_ARGS -9

Internal BoKS Manager error. Argument format error.


ERR_SERVC_OLDPSW_CHANGE -100

The password is too old. Must be changed.


ERR_SERVC_PSW_SHORT -101

The password is too short.


ERR_SERVC_PSW_USE11 -102

At least one digit and one letter in the password.


ERR_SERVC_PSW_USE22 -103

At least two digits and two letters in the password.


ERR_SERVC_PSW_ISSAME -104

The password is similar to the username.


ERR_SERVC_PSW_ISUSED -105

The password has already been used.


ERR_SERVC_PSW_INVALID -106

Invalid password


ERR_SERVC_PSW_CHANGED -107

Password changed


ERR_SERVC_NEW_MISMATCH -109

The new passwords don't match


ERR_SERVC_PSW_LOOKALIKE -110

Password does not differ enough from the previous one


ERR_SERVC_NO_USER -200

The user doesn't exist, will not be displayed even if verbose mode is on


ERR_SERVC_WRONG_PSW -201

Wrong password.


ERR_SERVC_OLDPSW -202

The password is too old.


ERR_SERVC_NO_TTY -203

No terminal authorization granted.


ERR_SERVC_NO_TIME -204

Access denied at this hour.


ERR_SERVC_USER_BLOCKED -205

The user is blocked.


ERR_SERVC_TTY_LOCKED -206

The terminal is blocked.


ERR_SERVC_TOO_MANY_TRIES -207

Too many erroneous login attempts.


ERR_SERVC_OLD_USER -208

The username is not valid.


ERR_SERVC_WRONG_SYSPSW -209

Wrong system password


ERR_SERVC_NO_AUTH_INFO -210



ERR_SERVC_STDLOGIN -211

Tells client that standard unix login should be used


ERR_SERVC_MISSING_SYSPSW -212

Missing system password


ERR_SERVC_NO_REMHOST -213

Remote host missing


ERR_SERVC_BAD_REMHOST -214

Calling host not authorized


ERR_SERVC_NO_PIN -215

Missing PIN code or serial number


ERR_SERVC_WRONG_SPIN -216

Wrong password (SPIN)


ERR_SERVC_NO_LOGIN -217

Login not allowed


ERR_SERVC_NO_SUTO -218

SU to user not allowed


ERR_SERVC_GETKEY_EXHAUSTED -217

# SLAN Login not allowed


ERR_SERVC_GETKEY_CANTDEL -218

# SLAN SU to user not allowed


ERR_SERVC_PASSWD_TOO_NEW -219

Not long enough since last password change


ERR_SERVC_TOO_MANY_CONCUR_LOGINS -220

Too many concurrent logins with your name


ERR_SERVC_CERT_REVOKED -221

Certificate revoked


ERR_SERVC_USERPROTO -222

User-level protocol error (currently from dgsadasp)


ERR_SERVC_AUTH_FAIL -223

Authentication failed (currently from bosas)
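
If you look these codes up often, a tiny lookup helper saves some scrolling. The sketch below holds only a handful of the codes from the list above, and it assumes the text you feed it carries the code as an "ERROR=<number>" field, which is how servc trace output shows it; your LOG file may present it differently. The table and function names are my own.

```python
import re

# A subset of the codes listed above; extend it as needed.
SERVC_ERRORS = {
    -1: ("ERR_SERVC_COMM_ERROR", "Communication error. Probably wrong nodekey."),
    -5: ("ERR_SERVC_NO_AUTH", "No authorization"),
    -6: ("ERR_SERVC_UNKNOWN_HOST", "Host unknown"),
    -200: ("ERR_SERVC_NO_USER", "The user doesn't exist"),
    -201: ("ERR_SERVC_WRONG_PSW", "Wrong password."),
    -203: ("ERR_SERVC_NO_TTY", "No terminal authorization granted."),
    -205: ("ERR_SERVC_USER_BLOCKED", "The user is blocked."),
    -223: ("ERR_SERVC_AUTH_FAIL", "Authentication failed (currently from bosas)"),
}

# Assumed field shape: ERROR=-203
ERROR_FIELD = re.compile(r"\bERROR=(-?\d+)")


def explain_error(text):
    """Describe the first ERROR=<n> field found in text, or return None."""
    match = ERROR_FIELD.search(text)
    if match is None:
        return None
    code = int(match.group(1))
    name, description = SERVC_ERRORS.get(code, ("unknown", "not in this table"))
    return f"{code} {name}: {description}"


if __name__ == "__main__":
    print(explain_error("$STATE=6 ERROR=-203 $SERVCVER=6.0.3"))
```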


BoKS troubleshooting: tracing the BoKS internals

2009-08-27 08:22:00

FoxT provides us with a number of very useful tools to aid us in troubleshooting BoKS issues. Among others we will frequently use the boksauth and bdebug commands. Bdebug in this case refers to the tracing tool that this article will focus on.

Usually we will want to run a trace when BoKS is doing something that we don't expect. For example:

In each case you will need to determine which BoKS processes are part of the problem. For example:

Before we begin, let me warn you: debug trace log files can grow pretty vast pretty fast! Make sure that you turn on the trace only right before you're ready to use the faulty part of BoKS and also be sure to stop the trace immediately once you're done.

Debugging login issues

In the case of users getting denied access, troubleshooting got a lot easier once we learnt to use the boksauth command. Boksauth allows you to simulate a login request by a user, without actually having access to the account, the password or the source host. For example:

BoKS > boksauth -Oresults -r'ssh:192.168.0.128->SERVERA' -c FUNC=auth PSW="vljwvHlx3zS35" \
FROMHOST=192.168.0.128 TOHOST=SERVERA TOUSER=patrick ERRMSG=

The command above will test a login from 192.168.0.128, using SSH to user patrick@SERVERA. Assuming that you're testing a failing login, the output will include something like "ERRMSG=No terminal authorization granted."

In order to see what's actually going wrong you will need to start a debug trace on the servc process on the same master/replica where you run the boksauth command. This is done by entering:

BoKS > bdebug -x9 -f /tmp/servc.trace servc

Repeat the boksauth command and then immediately afterwards run the following command to turn off the trace again:

BoKS > bdebug -x0 servc

The file /tmp/servc.trace will now contain the debug output for all transactions parsed in the past few seconds, including the failed simulated login you did with boksauth. Debug output is rather lengthy and difficult to read so either you'll need half an hour to dig through it, or you can send it to FoxT's tech support department so they can explain it for you.
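
Because forgetting to switch a trace off is so easy, it can help to wrap the two bdebug calls so that the "off" call always runs. This is a sketch, not an official tool: it assumes bdebug is on the PATH of the process running Python (for instance because it was started from a BoKS shell) and uses only the two invocations shown above. The name boks_trace and the injectable run parameter are my own additions.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def boks_trace(process, trace_file, run=subprocess.run):
    """Turn on a level 9 bdebug trace for `process` and always turn it off again."""
    run(["bdebug", "-x9", "-f", trace_file, process], check=True)
    try:
        yield trace_file
    finally:
        # Runs even if the code inside the "with" block raises an error.
        run(["bdebug", "-x0", process], check=True)


# Usage sketch (needs a host with BoKS installed):
#
# with boks_trace("servc", "/tmp/servc.trace"):
#     ...  # repeat the failing boksauth command here
```

The run parameter exists so the helper can be tried out with a stand-in function on a machine without BoKS.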

Debugging other issues

As I mentioned you can use bdebug to run traces on any BoKS process you can think of. In each case you'll use "bdebug -x9" to turn debugging on and "bdebug -x0" to turn it off again. In order to properly troubleshoot your issues you'll need to decide which processes to trace and then, with the trace running, try to replicate the problem.

In the case of replication issues you'll:

If a client is not receiving updates, you'll:



FoxT BoKS international users group

2009-08-27 08:20:00

The Fox Tech logo

Users and administrators of the BoKS Access Control software seem to be spread out quite thinly across the globe. Most companies that employ BoKS are quite large, but there are only a few in each country that actually do so. So far, to my knowledge, the Netherlands only has one multinational using BoKS with two others considering an implementation of their own.

Since there isn't very much BoKS information available on the web I thought I'd create a users group on LinkedIn. LI.com is a great site for maintaining your professional network and for keeping in touch with colleagues both old and new. Hence it's also a nice and easy way to set up a discussion board for professionals.

I'm very curious to see if we can entice BoKS admins from countries other than the Netherlands to join. It'd be great if we could set up discussions between users across the globe. Maybe we could even coordinate feature requests and bug reports to lighten the load on FoxT and to make sure the really important requests get handled first.

Spreading information about BoKS on the web

Slowly but surely we are working on making more information about BoKS available through the Internet. Friends and colleagues have started writing tutorials and case studies, which (by providence of Google) should turn up when people search for information.

Below you'll find a list of the efforts I've tracked down so far.



Integrating FoxT BoKS in Solaris Service Management Facility (SMF)

2009-08-18 12:37:00

As we all know BoKS is available for a multitude of flavors of Unix. Aside from a number of Linux distributions, it also runs on AIX, HP-UX, Solaris and even on Windows. Because of this diverse choice of platforms FoxT is of course forced to make design choices that point towards the lowest common denominator.

In some cases these design choices lead to undesirable situations, which one will need to work around. One such case is Solaris 10, which chooses to forgo the ancient Unix staple of /etc/inetd.conf, /etc/init.d/, /etc/rc?.d/ and /etc/inittab. Instead, Sun Microsystems has chosen to create their own service management facility, aptly called Solaris SMF.

In Solaris 10 the SMF software is used to manage the startup and shutdown sequences of the server, as well as the current state of many running applications. For example, where one would originally type "/etc/init.d/openssh start" one now enters "svcadm enable svc:/network/ssh:default".

BoKS, however, still relies on the old-fashioned scripts for its startup and shutdown, as it can expect to find these on all Unixen. During the execution of one of our projects, Unixerius decided to make a patch for BoKS that will allow the software to work reliably from SMF.


Patching BoKS for use with Solaris SMF

In order to get BoKS to work with SMF we'll need to make a number of changes to both BoKS and the Solaris operating system. We are currently not aiming for a full switch from /etc/rc3.d and boksinit to SMF, but instead opt to only include the minimum into SMF.

The way we see it, we'll need to make the following changes:

*: And boksinit.replica and boksinit.master.

The above should allow us to stop and start BoKS independently of the BoKS SSH daemon. If you didn't do this, SMF would kill boks_sshd along with the rest of BoKS. It will also allow us to use "Boot -k" and "Boot", which will then interact with SMF instead of just killing PIDs from a list.

Please give us a few weeks to work out this patch. Of course we'll post news both over here and on the Unixerius website once the work is done.


Further resources about SMF



Disaster recovery (fail over) of the BoKS infrastructure

2009-08-04 22:01:00

The BoKS infrastructure is pretty much rock solid and will not let you down under normal circumstances. However, "normal" doesn't always happen so it's good to prepare for a disaster. What happens if you lose a replica or two? What happens if the BoKS master server itself is dead? It pays to come prepared!


Adding new BoKS replica servers

Luckily BoKS replica servers are pretty expendable. One needs at least one replica server per physical location, though it pays to have more than one. Moreover you may want to have a replica per section of your network.

By having a good number of replica servers you won't be caught off guard by a network failure. Having a set of replicas per data center ensures that all your hosts will remain functional, even if your WAN connections die. And having a replica per network section will allow you to keep operating, despite failure of backbone routers and such.

Should you ever feel the need to add more replica servers, then you can take the following steps to create new ones. However, keep in mind that you'll need to be able to communicate with the master server, so this won't do you any good if the network's already dead.

First, modify the host record of your targeted client system through the BoKS GUI. Change the host type from UNIXBOKSHOST to BOKSREPLICA. Then, on the client system perform the following commands.

# /opt/boksm/sbin/boksadm -S

BoKS> vi $BOKS_etc/ENV      #set SHM_SIZE to 16000

BoKS> convert -v server
Stopping daemons...
Setting BOKSINIT=server in ENV file...
Restarting daemons...
Conversion from client to replica done.

BoKS> Boot -k

BoKS> Boot

Finally, also restart the BoKS master software. Running "boksdiag list" should now show the new replica server, which is probably still loading its copy of the database.


Performing a BoKS master fail-over

Without a working master server the BoKS infrastructure will keep on functioning. However, it is impossible to make any changes to the database, so you'll want to restore your master as soon as possible. It's a good idea to promote a replica to master status if you think it'll take you more than a few hours (a day?) to fix the server.

Log in to your chosen replica and perform the following actions. Start off by checking the boks_errlog file to see if the replica itself isn't broken.

$ /opt/boksm/sbin/boksadm -S

BoKS> tail -30 /var/opt/boksm/boks_errlog
...
...

BoKS> convert -v master

Stopping daemons...
Setting BOKSINIT=master in ENV file...
Restarting daemons...
Conversion from replica to master done.

BoKS> boksdiag list
SERVER SINCE LAST SINCE LAST SINCE LAST COUNT LAST
REPHOST5 00:49 523D 5:19:20 04:49 1853521 OK
REPHOST4 00:49 136D 22:21:35 04:49 526392 OK
REPHOST3 00:49 04:50 726768 OK
REPHOST2 00:49 107D 5:05:33 04:49 425231 OK
REPHOST 02:59 02:13 11:44 148342 DOWN

BoKS> boksdiag sequence
...
T7 13678d 8:33:46 5053 (5053)
...
T9 13178d 11:05:23 7919 (7919)
...
T15 13178d 11:03:16 1865 (1865)
...

Now log in to the remaining replica servers and compare the output of the "boksdiag sequence" commands. Alternatively you can run the check_boks_replication script to automate the process. Either way, none of the replicas should be ahead of the new master, nor should any of them lag too far behind. If you do find that the replication is broken we'll need to proceed with troubleshooting.
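
The comparison itself is easy to automate once you have captured the "boksdiag sequence" output from the master and from a replica. The sketch below is not the check_boks_replication script; it is a rough illustration that assumes the fourth column of each "T<n>" line is the sequence number, as in the output shown above. The max_lag threshold of 50 is an arbitrary value picked for the example.

```python
import re

# Matches lines such as: T7 13678d 8:33:46 5053 (5053)
SEQ_LINE = re.compile(r"^T(\d+)\s+\S+\s+\S+\s+(\d+)\s+\(\d+\)\s*$")


def parse_sequence(output):
    """Map table number to sequence number from `boksdiag sequence` output."""
    table = {}
    for line in output.splitlines():
        match = SEQ_LINE.match(line.strip())
        if match:
            table[int(match.group(1))] = int(match.group(2))
    return table


def compare_sequences(master_out, replica_out, max_lag=50):
    """Return a list of (table, problem) tuples; empty means all is well."""
    master = parse_sequence(master_out)
    replica = parse_sequence(replica_out)
    problems = []
    for table in sorted(master):
        if table not in replica:
            problems.append((table, "missing on replica"))
        elif replica[table] > master[table]:
            problems.append((table, "replica ahead of master"))
        elif master[table] - replica[table] > max_lag:
            problems.append((table, "replica lagging"))
    return problems
```

Lines that do not look like a table entry, such as the "..." placeholders, are simply skipped.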


Rolling back after the BoKS master fail-over

Assuming that you will not be using your new master server permanently you will want to go back to your original BoKS master at some point in time. Let's assume that you've repaired whatever damage there was and that the system is now ready to resume its duty.

It's crucial that the original master be converted to a client system before booting it up fully. Perform the following in single user mode.

$ /opt/boksm/sbin/boksadm -S

BoKS> convert -v client
Stopping daemons...
Setting BOKSINIT=client in ENV file...
Restarting daemons...
Conversion from master to client done.

BoKS> cd /var/opt/boksm/data

BoKS> rm *.dat

BoKS> rm sequence

You may now boot the original master server into multi-user mode and let it rejoin the BoKS infrastructure as a client. Afterwards, convert it into a replica server per the instructions in the first paragraph of this page.

Once the original master server has become a fully functioning replica server you may start thinking about dismantling the temporary master. This process will actually be quite similar to what we've done before. Basically you:

  1. Reboot the temporary master into single user mode.
  2. Convert the temporary master into a client (see above).
  3. Convert the original master server back into a master (see second paragraph).
  4. Boot the temporary box into multi-user mode.
  5. Convert the temporary box back into a replica.


Finally: BoKS has a logo!

2008-11-22 20:37:00

The new BoKS logo

Since I've joined $CLIENT in October my life has been nothing but BoKS, BoKS, BoKS. It's great to be working with FoxT's security software again :) A lot of things have changed over the years, though the software is still very, very familiar.

One of the things that's made me happy is that Fox Tech have -finally- made an official logo for their BoKS products! I find it odd that they've been marketing this software for over ten years and that their last logo dates back to the nineties. Said decrepit logo hasn't been used in ages, so BoKS was only ever known by a plain text rendition of the name. By request of $CLIENT, Fox Tech have gotten off their hineys and created a new logo that matches their corporate identity.

As a side note: over the past few weeks I've seen a lot of in-depth troubleshooting and I've decided to share some of the stuff I've learnt. Hence you'll find that the BoKS part of the sysadmin section has been revamped :)



BoKS troubleshooting: another example of a debugging session

2008-11-22 20:29:00

Original issues

As I mentioned at the end of example 1 the problem with the seemingly random login denials was caused by a misbehaving replica server. We tracked the problem down to REPHOST, where we discovered that three of the database tables were not in sync with the rest. A whole number of hosts were being reported as non-existent, which was causing login problems for our users.

Now that we've figured out which server was giving us problems and what the symptoms were, we needed to figure out what was causing the issues.

Symptoms

One of our replica servers had three database tables that were not getting any updates. Their sequence numbers as reported by "boksdiag sequence" were very different from the sequence numbers on the master, indicating nastiness.

Diagnosis

1. Verify the sequence numbers again

Just to be sure that the replica is still malfunctioning, let's check the sequence numbers again.

BoKS > boksdiag sequence

...

T7 13678d 8:33:46 5053 (5053)

...

T9 13178d 11:05:23 7919 (7919)

...

T15 13178d 11:03:16 1865 (1865)

...



BoKS > boksdiag sequence

...

T7 13678d 8:33:46 6982 (6982)

...

T9 13178d 11:05:23 10258 (10258)

...

T15 13178d 11:03:16 2043 (2043)

Yup, it's still broken :) You may notice that the sequence numbers on the replica are actually AHEAD of the numbers on the master server.

2. Demoting the replica to client status

Because I was not sure what had been done to REPHOST in the past I wanted to reset it completely, without reinstalling the software. I knew that the host had been involved in a disaster recovery test a few months before, so I had a hunch that something'd gone awry in the conversion between the various host states.

Hence I chose to convert the replica back to client status.

BoKS> sysreplace restore



BoKS> convert -v client

Stopping daemons...

Setting BOKSINIT=client in ENV file...

Restarting daemons...

Conversion from replica to client done.



BoKS > cd /var/opt/boksm



BoKS > tail -20 boks_errlog

...

WARNING: Dying on signal SIGTERM

boks_authd Nov 17 14:59:30

INFO: Shutdown by signal SIGTERMboks_csspd@REPHOST Nov 17 14:59:30

INFO: Shutdown by signal SIGTERM



boks_authd Nov 17 14:59:30

INFO: Min idle workers 32boks_csspd@REPHOST Nov 17 14:59:30

INFO: Min idle workers 32



BoKS > sysreplace replace

I verified that all the running BoKS processes were newly created and that there were no stragglers from before the restart. Also, I tried to SSH to the replica to make sure that I could still log in.

3. Change the replica's type in the database

The BoKS master server will also need to know that the replica is now a client. In order to do this I needed to change the host's TYPE in the database. Initially I tried doing this with the following command.

BoKS> hostadm -a -h REPHOST -t UNIXBOKSHOST

Unfortunately this command refused to work, so I chose to modify the host type through the BoKS web interface. Just a matter of a few clicks here and there. Afterwards the BoKS master was aware that the replica was no more.

BoKS > boksdiag list

Server Since last Since last Since last Count Last

REPHOST5 00:49 523d 5:19:20 04:49 1853521 ok

REPHOST4 00:49 136d 22:21:35 04:49 526392 ok

REPHOST3 00:49 04:50 726768 ok

REPHOST2 00:49 107d 5:05:33 04:49 425231 ok

REPHOST 02:59 02:13 11:44 148342 down

It'll take a little while for REPHOST's entry to completely disappear from the "boksdiag list" output. I sped things up a little bit by restarting the BoKS master using the "Boot -k" and "Boot" commands.
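
When you have more than a handful of replicas, picking the odd one out of "boksdiag list" by eye gets tedious. Here is a minimal sketch that assumes only what the output above shows: the first column is the server name and the last column is its state. Treating every state other than "ok" as worth a look is my own assumption.

```python
def replica_states(boksdiag_list_output):
    """Map server name to its state (last column) from `boksdiag list` output."""
    states = {}
    for line in boksdiag_list_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0].lower() == "server":
            continue
        states[fields[0]] = fields[-1].lower()
    return states


def needs_attention(boksdiag_list_output):
    """Return the sorted names of all servers whose state is not 'ok'."""
    states = replica_states(boksdiag_list_output)
    return sorted(name for name, state in states.items() if state != "ok")
```

Only the first and last columns are used on purpose: some rows lack one of the "Since last" values, so counting columns from the left would be unreliable.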

4. Reconvert the host back to a replica

Of course I wanted REPHOST to be a replica again, so I changed the host type in the database using the web interface.

I then ran the "convert" command on REPHOST to promote the host again.

BoKS > convert -v replica

Checking to see if a master can be found...

Stopping daemons...

Setting BOKSINIT=replica in ENV file...

Restarting daemons...

Conversion from client to replica done.



BoKS > ps -ef | grep -i boks

root 16543 16529 0 15:14:33 ? 0:00 boks_bridge -xn -s -l servc.s -Q !/etc/opt/boksm!.servc!servc_queue -q /etc/opt

root 16536 16529 0 15:14:33 ? 0:00 boks_servc -p1 -xn -Q !/etc/opt/boksm!.xservc1!xservc_queue

root 16535 16529 0 15:14:33 ? 0:00 boks_servm -xn

root 16529 1 0 15:14:33 ? 0:00 boks_init -f /etc/opt/boksm/boksinit.replica

root 16540 16529 0 15:14:33 ? 0:00 boks_bridge -xn -r -l servc.r -Q /etc/opt/boksm/xservc_queue -P servc -k -K /et

root 16552 16529 0 15:14:33 ? 0:00 boks_csspd -e/var/opt/boksm/boks_errlog -x -f -c -r 600 -l -k -t 32 -i 20 -a 15

root 16533 16529 0 15:14:33 ? 0:00 boks_bridge -xn -s -l master.s -Q /etc/opt/boksm/master_queue -P master -k -K /

...

...



BoKS > cd ..



BoKS > tail boks_errlog

boks_authd Nov 17 14:59:30

INFO: Min idle workers 32boks_csspd@REPHOST Nov 17 14:59:30

INFO: Min idle workers 32



boks_init@REPHOST Mon Nov 17 15:02:21 2008

WARNING: Respawn process sshd exited, reason: exit(1). Process restarted.

boks_init@REPHOST Mon Nov 17 15:14:31 2008

WARNING: Dying on signal SIGTERM

boks_aced Nov 17 15:14:33

ERROR: Unable to access configuration file /var/ace/sdconf.rec



On the master server I saw that the replica was communicating with the master again.

BoKS > boksdiag list

Server Since last Since last Since last Count Last

REPHOST5 04:35 523d 5:33:41 06:39 1853555 ok

REPHOST4 04:35 136d 22:35:56 06:42 526426 ok

REPHOST3 04:35 06:43 726802 ok

REPHOST2 04:35 107d 5:19:54 06:41 425265 ok

REPHOST 01:45 16:34 26:05 0 new

Oddly enough REPHOST was not receiving any real database updates. I also noticed that the sequence numbers for the local database copy hadn't changed. This was a hint that stuck in the back of my head, but I didn't pursue it at the time. Instead I expected there to be some problem with the communications bridges between the master and REPHOST.

BoKS > ls -lrt

...

...

-rw-r----- 1 root root 0 Nov 17 15:14 copsable.dat

-rw-r----- 1 root root 0 Nov 17 15:14 cert2user.dat

-rw-r----- 1 root root 0 Nov 17 15:14 cert.dat

-rw-r----- 1 root root 0 Nov 17 15:14 ca.dat

-rw-r----- 1 root root 0 Nov 17 15:14 authenticator.dat

-rw-r----- 1 root root 0 Nov 17 15:14 addr.dat



BoKS >

5. Verify that everything's okay on the replica

I was rather confused by now. Because REPHOST wasn't getting database updates I thought to check the following items

Everything seemed completely fine! It was time to break out the big guns.

6. Clear out the database

I decided to clear out the whole local copy of the database, to make sure that REPHOST had a clean start.

BoKS > Boot -k



BoKS > cd /var/opt/boksm



BoKS > tar -cvf data.20081117.tar data/*

a data/ 0K

a data/crypt_spool/ 0K

a data/crypt_spool/clntd/ 0K

a data/crypt_spool/clntd/ba_fbuf_LCK 0K

a data/crypt_spool/clntd/ba_fbuf_0000000004 6K

a data/crypt_spool/clntd/ba_fbuf_0000000003 98K

a data/crypt_spool/servc/ 0K

a data/crypt_spool/servm/ 0K

...



BoKS > cd data



BoKS > rm *.dat



BoKS > Boot

Checking the contents of /var/opt/boksm/data immediately afterwards showed that BoKS had re-created the database table files. Some of them were getting updates, but over 90% of the tables remained completely empty.
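
Rather than eyeballing "ls -lrt" output, the share of empty table files can be counted. This is a small sketch that assumes nothing more than what is shown above: the tables live as *.dat files in one directory and an untouched table is a zero-byte file. Keep in mind that some tables may legitimately be empty in your environment.

```python
from pathlib import Path


def table_fill_report(data_dir):
    """Return (number of .dat files, sorted names of the zero-byte ones)."""
    files = sorted(Path(data_dir).glob("*.dat"))
    empty = [f.name for f in files if f.stat().st_size == 0]
    return len(files), empty


if __name__ == "__main__":
    # The path is the data directory used in this article.
    total, empty = table_fill_report("/var/opt/boksm/data")
    print(f"{len(empty)} of {total} table files are empty")
```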

7. Debugging the communications bridges

As explained in this article it's possible to trace the internal workings of just about every BoKS process. This includes the various communications bridges that connect the BoKS hosts.

I'd decided to use "bdebug" on the "servm_r" and "servm" processes on REPHOST, while also debugging "drainmast" and "drainmast_s" on the master server. The flow of data starts at drainmast, then goes through drainmast_s and servm_r to finally end up in servm on the replica. Drainmast is what sends data to replicas and servm is what commits the received changes to the local database copy.

Unfortunately the trace output didn't show anything remarkable, so I won't go over the details.

8. Calling in tech support

By now I'd drained all my inspiration. I had no clue what was going on and I was one and a half hours into an incident that should've taken half an hour to fix. Since I always say that one should know one's limitations I decided to call in Fox Tech tech support. Because it was already 1600 and I wanted to have the issue resolved before I went home I called their international support number.

I submitted all the requested files to my engineer at FoxT, who was still investigating the case around 1800. Unfortunately things had gone a bit wrong in the handover between the day and the night shift, so my case had gotten lost. I finally got a call back from an engineer in the US at 2000. I talked things over with him and something in our call triggered that little voice stuck in the back of my head: sequence numbers!

The engineer advised me to go ahead and clear the sequence numbers file on REPHOST. At the same time I also deleted the database files again for a -really- clean start.

BoKS > Boot -k



BoKS > cd /var/opt/boksm



BoKS > tar -cvf data.20081117-2.tar data/*

...



BoKS > cd data



BoKS > rm *.dat



BoKS > rm sequence



BoKS > Boot

Lo and behold! The database copy on REPHOST was being updated! All of the tables were getting filled again, including the three tables that had been stuck from the beginning.

The engineer informed me that in BoKS 6.5 the "convert" command is supposed to clear out the database and sequence file when demoting a master/replica to client status. Apparently this is NOT done automatically in BoKS versions 6.0 and lower.

In conclusion

We discovered that the host had at one point in time played the role of master server and that there was still some leftover crap from that time. During REPHOST's time as the master the sequence numbers for tables 7, 9 and 15 had gotten ahead of the sequence numbers of the real master which was turned off at the time. This had happened because these three tables were edited extensively during the original master's downtime. This in turn led to these tables never getting updated.

After the whole mess was fixed we concluded that the following four steps are all you need to restart your replica in a clean state.

  1. Stop the BoKS software on REPHOST.
  2. Delete all the .dat files in $BOKS_var/data.
  3. Delete the sequence file from $BOKS_var/data.
  4. Restart the BoKS software on REPHOST.

I've also asked the folks at Fox Tech to issue a bugfix request to their developers. As I mentioned in step 1, the sequence numbers on the replica were ahead of those on the master. Realistically speaking this should never happen, but BoKS does not currently recognize said situation as a failure.

In the meantime I will write a monitoring script for Nagios and Tivoli that will monitor the proper replication of the BoKS database.



BoKS troubleshooting: an example of a debugging session

2008-11-21 21:52:00

Recently we ran into a rather perplexing problem: a few of our customers had intermittent login problems. There seemed to be no pattern to this issue, with users from different departments being denied access to their servers at random points in time. Sometimes the problem would go away after a few hours, sometimes it took a few days. It took a few days before the penny dropped and we found out that one of our replica servers was misbehaving.

The paragraphs below outline my diagnosis and troubleshooting procedure.

Original issues

The issues seemed to focus on servers in one specific, physical location.

One of our DBAs created several incidents over the course of a month regarding login issues with user sybase@SYBHOST. Initially this problem was fixed by adding the "ssh_pk" authenticator, but the problem returned with intermittent login denial without an apparent reason.

A number of users from another department indicated intermittent login problems where they were allowed to login one day and denied access the next. My troubleshooting of the problem hadn't given me any real results so far. I'd run debugging on SSH sessions, which didn't clear much up.

For the remainder of this document I will focus on my troubleshooting process for the case involving user sybase.

Symptoms

These denials occur at seemingly random intervals and result in varying BoKS error messages. Most frequent is the rather useless "ERR 223, no authentication" which, as Fox Tech confirms, tells us absolutely nothing. At other times users receive an "ERR 203, no access route" even though said user does in fact have the requisite access routes.

Diagnosis

1. Map out the flow of data in this case.

In this case the DBAs attempt to use SSH (with keypair authentication) from sybase@UNIXHOST, to sybase@SYBHOST.

2. Verify the access routes involved in the exchange.

The BoKS database shows that both hosts are part of the hostgroup SYBASE.

BoKS > hgrpadm -l | grep UNIXHOST

...

SYBASE UNIXHOST

TRUSTED UNIXHOST

...



BoKS > hgrpadm -l | grep SYBHOST

...

SYBASE SYBHOST

TRUSTED SYBHOST

...

The BoKS database shows that user sybase is allowed SSH inside hostgroup SYBASE.

BoKS > sx /opt/boksm/sbin/boksadm -S dumpbase -t 2 | grep SYBASE:sybase

RUSER="SYBASE:sybase" ROUTE="ssh*:TRUSTED->SYBASE"

...

RUSER="SYBASE:sybase" ROUTE="ssh*:ANY/SYBASE->SYBASE"

...

3. Verify the authentication methods involved in the exchange.

The BoKS database confirms that sybase is allowed to use SSH keypairs.

BoKS > sx /opt/boksm/sbin/boksadm -S dumpbase -t 31 | grep SYBASE:sybase

RLOGNAME="SYBASE:sybase" TYPE="ssh_pk" VERSION="1.0" FLAGS="1"
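
If you need to check routes or authenticators for many users, the dumpbase output is regular enough to parse. The sketch below assumes each line holds one record made of KEY="value" pairs, as in the output above, and that values contain no escaped quotes; both function names are my own.

```python
import shlex


def parse_dumpbase_line(line):
    """Turn one line of KEY="value" pairs into a dict; other lines give {}."""
    record = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


def routes_for(dump_output, ruser):
    """List the ROUTE values of all records whose RUSER equals `ruser`."""
    records = (parse_dumpbase_line(line) for line in dump_output.splitlines())
    return [r["ROUTE"] for r in records if r.get("RUSER") == ruser and "ROUTE" in r]
```

Calling routes_for() on the captured output of "dumpbase -t 2" with "SYBASE:sybase" gives you the access routes as a plain list, which is easier to compare between a master and a replica than two screens full of text.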

4. Check the SSH keypair.

The public key of sybase@UNIXHOST is correctly installed in the authorized_keys file of user sybase@SYBHOST.

sybase@UNIXHOST > cat ~/.ssh/id_dsa.pub

ssh-dss AAAAB3NzaC1kc3MAAACBANSl ... WjUgDlUEIA5g== sybase@UNIXHOST



sybase@SYBHOST > cat ~/.ssh/authorized_keys

ssh-dss AAAAB3NzaC1kc3MAAACBAPd/ ... 8Cbt3Gl9hvTa== sybase@OTHERHOST

ssh-dss AAAAB3NzaC1kc3MAAACBANSl ... WjUgDlUEIA5g== sybase@UNIXHOST

The permissions on the .ssh directory for sybase@SYBHOST are also correct.

sybase@SYBHOST > ls -al ~/.ssh

drwx------ 2 sybase sybase 96 Aug 15 2007 .

drwxr-xr-x 3 sybase sybase 8192 Sep 12 15:58 ..

-rw------- 1 sybase sybase 1210 Oct 27 10:53 authorized_keys
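
Comparing long key strings by eye is error-prone, so the first half of this check can be scripted. A minimal sketch, assuming you have the contents of the public key file and of authorized_keys as text: it looks for the same key type and key blob next to each other on any line, which also copes with entries that carry options in front of the key. It does not check file permissions.

```python
def key_is_authorized(public_key_line, authorized_keys_text):
    """True if the key type and blob of public_key_line occur in authorized_keys_text."""
    parts = public_key_line.split()
    if len(parts) < 2:
        return False
    wanted = (parts[0], parts[1])
    for line in authorized_keys_text.splitlines():
        fields = line.split()
        # Scan adjacent pairs so that leading options do not matter.
        for i in range(len(fields) - 1):
            if (fields[i], fields[i + 1]) == wanted:
                return True
    return False
```

The trailing comment (such as sybase@UNIXHOST) is ignored on purpose, since it plays no role in authentication.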

5. Run SSH debug traces.

Because things seem alright so far it's time to check out what's going wrong on the inside of BoKS. The first step to take is to run an additional debugging SSH daemon. This can be done using the following command. Key in this are the multiple -d flags and "-p 2222".

BoKS > /opt/boksm/lib/boks_sshd -d -d -d -D -g120 -p 2222 >/tmp/Trace.txt 2>&1

The customer is now instructed to attempt a login to port 2222 by adding "-p 2222" to his usual SSH command. This should of course still fail, but this time we can get a trace.

The trace output file gets pretty long because it not only shows the SSH debug information, but also debugging for the BoKS internals. After going through the hostkey exchange, BoKS will start authentication by requesting valid authentication methods.

debug2: userauth-request for user sybase service ssh-connection method none

debug2: input_userauth_request: setting up authctxt for sybase

...

debug2: get_opt_authmethod_from_servc: INSIDE - user = sybase, need_privsep = 0

debug2: boks_servc_call_vec: INSIDE boks_sshd@SYBHOST[6] 14 Nov 11:21:24:026533 in servc_call_str: To server: {FUNC=route-stat-user FROMUSER = sybase ROUTE = SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST = 192.168.0.181}

...

boks_sshd@SYBHOST[6] 14 Nov 11:21:24:264031 in servc_call_str: Return: {FUNC=route-stat-user FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181 $HOSTSYM=SYBHOST $ADDR=192.168.40.165 $SERVCADDR=192.168.23.9 METHODS=ssh_pk $SERVCVER=6.0.3}

debug2: get_opt_authmethod_from_servc: Must use BokS authentication methods: "ssh_pk"

debug2: get_opt_authmethod_from_servc: BokS optional authentication methods: ""

debug2: boks_ssh_restrict_authmethods: INSIDE - orginal authmethods = publickey,keyboard-interactive

debug2: boks_ssh_restrict_authmethods: DONE - returning methods = publickey

debug2: userauth-request for user

This confirms that authentication using SSH keypairs is allowed and is actually enforced. The key is now checked and (after some fidgeting) accepted.

debug2: input_userauth_request: try method publickey

debug1: trying public key file /home/sybase/.ssh/authorized_keys

...

debug2: userauth_pubkey: authenticated 1 pkalg ssh-dss

Accepted publickey for sybase from 192.168.0.181 port 63569 ssh2

Now that the user has been authenticated, BoKS will check his access routes. Sadly this returns ERR 203 (no access route).

boks_sshd@SYBHOST[6] 14 Nov 11:21:24:304336 in servc_call_str: To server: {FUNC=auth FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181 $ssh_pk=ok}

...

boks_sshd@SYBHOST[6] 14 Nov 11:21:24:314704 in servc_call_str: Return: {FUNC=auth FROMUSER=sybase ROUTE=SSH:UNIXHOST->SYBHOST TOHOST=SYBHOST TOUSER=sybase FROMHOST=192.168.0.181 $ssh_pk=ok01$HOSTSYM=SYBHOST $ADDR=192.168.40.165 $SERVCADDR=192.168.23.9 WC=#$*-./?_ UKEY=SYBASE:sybase MOD_CONV=1 SEC_USER=sybase VTYPE=ssh_pk MODLIST=optional_ssh_pk=+1,psw=+1,prompt=-1,timeout=+1,login=+1,verbose=+1 $STATE=6 ERROR=-203 $SERVCVER=6.0.3}

debug3: boks_ssh_do_authorization: Servc auth failed ERROR = -203
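
Since the trace is long, it can help to pull out just the fields of interest. This is a minimal sketch, assuming the trace keeps the KEY=value layout shown above; trace_field is a made-up helper name, and it only handles plain field names such as ERROR or METHODS (not the ones that start with a dollar sign).

```shell
# Hypothetical helper: list the distinct values of one KEY=value field in a
# BoKS sshd trace. It tolerates the "KEY = value" spacing seen in some lines.
trace_field() {
    field=$1
    file=$2
    sed -n "s/.*$field *= *\([^ }]*\).*/\1/p" "$file" | sort -u
}
```

Run against /tmp/Trace.txt, asking for ERROR shows which error codes came back from servc, and asking for METHODS shows which authentication methods BoKS handed to the SSH daemon.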

6. Force client to use one replica.

Please note that the SSH debug trace above shows that address 192.168.23.9 is used for the servc calls. This indicates that the client is communicating with replica REPHOST. In order to further aid the troubleshooting process it's best to force the client to communicate with just this one replica.

BoKS > cd /etc/opt/boksm

BoKS > vi bcastaddr



DONT_BROADCAST

ADDRESS_LIST

192.168.23.9 REPHOST.domain

~

~

:wq



BoKS > Boot -k

BoKS > Boot
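
To decide which replica to pin in bcastaddr, the $SERVCADDR values in the SSH trace from step 5 can be tallied. A rough sketch, assuming the trace format shown earlier; servc_addrs is a hypothetical helper name.

```shell
# Hypothetical helper: count how often each $SERVCADDR occurs in a trace,
# most frequent first. Output lines look like "COUNT ADDRESS".
servc_addrs() {
    sed -n 's/.*\$SERVCADDR=\([0-9.]*\).*/\1/p' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

If more than one address shows up, the client was talking to several replicas during the trace, which is all the more reason to force it onto one of them.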

7. Run a trace on the BoKS communications

Just to play it safe we'll need to check that the client's request is sent and received properly. This can be done by running a BoKS debug on the "bridge_servc_[s|r]" processes, "s" being the sending side and "r" the receiving end.

Once again we'll be asking the customer to SSH to the system. However, right before he executes his command we'll run the following two commands.

Client: bdebug bridge_servc_s -x 9 -f /tmp/servcs.out

Replica: bdebug bridge_servc_r -x 9 -f /tmp/servcr.out

Right after the customer's SSH session is killed again we'll run the following commands.

Client: bdebug bridge_servc_s -x 0

Replica: bdebug bridge_servc_r -x 0

The two resulting files will be rather large and hard to read. Both logs should only be given a cursory glance, as they only pertain to the BoKS communications themselves. In this case the logs indicate no problems at all, though they might have shown problems with hostkeys or network connectivity.
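
The on/off pattern used here (and again in step 8) is easy to get wrong under time pressure, so it could be wrapped in a small helper. This is only a sketch: trace_window and the BDEBUG override are my own inventions, and the only bdebug options used are the ones shown in this article (-x 9 -f FILE to start, -x 0 to stop).

```shell
# Hypothetical helper: switch tracing on for one BoKS process, wait until the
# operator presses Enter (after the failed login), then switch it off again.
# Setting BDEBUG=echo gives a dry run that only prints the commands.
trace_window() {
    proc=$1
    tracefile=$2
    "${BDEBUG:-bdebug}" "$proc" -x 9 -f "$tracefile" || return 1
    printf 'Tracing %s, press Enter once the login has failed... ' "$proc" >&2
    read -r reply || true
    "${BDEBUG:-bdebug}" "$proc" -x 0
}
```

On the client that would be called with bridge_servc_s and /tmp/servcs.out, on the replica with bridge_servc_r and /tmp/servcr.out.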

8. Run a trace on the BoKS database processing

Again we will ask the customer to attempt another (failed) login through SSH. This time we will trace another subset of BoKS, the "servc" process which handles the actual database lookup and verification.

Right before the client executes his SSH we'll run the following command.

Replica: bdebug servc -x 9 -f /tmp/servc-trace.out

Right after the customer's SSH session is killed again we'll run the following commands.

Replica: bdebug servc -x 0

The resulting log file will most likely be huge as it will contain all authentication requests handled by the replica during the trace. In order to get to the part of the log that is of interest to us it's best to do a search for the username (sybase). The first entry that we'll find is part of the setup of the authentication request.

servc@REPHOST[3] 14 Nov 11:43:35:660033 in servc_func_1: From client (SYBHOST) {FUNC=route-stat-user FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181}

BoKS will now go through a rather lengthy process of identifying the parties involved, which includes some BoKS-database and DNS voodoo to identify the hosts and their hostgroups. It's important to read all the log entries, searching for errors.

Having ascertained the identity of the parties involved, BoKS will start checking the appropriate access routes for the user. In this case you will see that BoKS will go over the access routes found at step 2 one by one. As part of this list it will also go over the access route that should have given sybase SSH access. However, instead we see the following.

14 Nov 11:43:35:930834 in fetchrec: Reading record from tab 2 at offset 1878504 (688 bytes)

14 Nov 11:43:35:931016 in get_route_key: got "ssh*:ANY/SYBASE->SYBASE"

14 Nov 11:43:35:931150 in am_methodcmp: ssh* == SSH ?

14 Nov 11:43:35:931254 in am_methodcmp: yes

14 Nov 11:43:35:931354 in hosttype_cmp: wild = ANY/SYBASE, host = UNIXHOST

14 Nov 11:43:35:931453 in domexpand: Enter. host="ANY/SYBASE"

...



14 Nov 11:43:35:931863 in domexpand: Return. "ANY/SYBASE.domain"

14 Nov 11:43:35:931963 in domexpand: Enter. host="UNIXHOST"

...

14 Nov 11:43:35:932367 in domexpand: Return. "UNIXHOST.domain"

...

14 Nov 11:43:35:932721 in host_wild_cmp: wild (SYBASE.domain) is a hostgroup

14 Nov 11:43:35:932824 in hostgroup_match_sub: enter

14 Nov 11:43:35:933336 in hostgroup_match_sub: no match

14 Nov 11:43:35:933641 in get_route_key: mismatch

This indicates that BoKS thinks that host UNIXHOST is not part of hostgroup SYBASE, even though we already confirmed that this is in fact the case (see step 2). This would seem to indicate that there are problems with the local copy of the BoKS database on replica REPHOST.

We won't have to continue reading the log file any further.
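
Rather than scrolling through the whole servc trace, the rejected access routes can be listed directly. A minimal sketch, assuming the "get_route_key: got" and "get_route_key: mismatch" lines appear as in the excerpt above; rejected_routes is a hypothetical name, and on a busy replica lines from different requests may interleave, so treat the result as a hint only.

```shell
# Hypothetical helper: print every access route that the servc trace
# evaluated and then rejected with "mismatch".
rejected_routes() {
    awk '
        /in get_route_key: got "/ {
            route = $0
            sub(/.*got "/, "", route)     # strip everything up to the route
            sub(/".*/, "", route)         # strip the closing quote and beyond
        }
        /in get_route_key: mismatch/ && route != "" {
            print route
            route = ""
        }
    ' "$1"
}
```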

9. Verify data in database on faulty replica.

Suspecting database problems on the replica we check the following.

BoKS > hgrpadm -l | grep UNIXHOST

...

SYBASE UNIXHOST

TRUSTED UNIXHOST

...

Oddly enough the "hgrpadm" command, which interacts with the database, returns the proper results. However, dumping the local tables shows that we have problems.

BoKS > dumpbase -t 7 | grep UNIXHOST

BoKS > dumpbase -t 9 | grep UNIXHOST

BoKS > dumpbase -t 15 | grep UNIXHOST
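
The three dumps above return nothing, which is easy to miss. A small loop makes the empty result explicit. This is a sketch only: count_host_records and the DUMPBASE override are made up for illustration, and the only dumpbase option used is the -t flag shown above.

```shell
# Hypothetical helper: for each table number given, report how many dumped
# records mention HOST.
count_host_records() {
    host=$1
    shift
    for table in "$@"; do
        # grep -c prints 0 but exits non-zero when nothing matches
        n=$("${DUMPBASE:-dumpbase}" -t "$table" | grep -c -- "$host" || true)
        printf 'table %s: %s record(s) mention %s\n' "$table" "$n" "$host"
    done
}
```

Called with UNIXHOST 7 9 15 on the faulty replica, every table reporting zero records confirms what the empty greps already suggested.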

10. Verifying the synchronisation status of the database

Run the following command on both the master server and the replica, then compare the figures for each table, looking for any discrepancies. A difference of less than ten is alright, but anything in the dozens or higher is a problem. In this case I found the following.

BoKS > boksdiag sequence



Master:

...

T7 13678d 8:33:46 5053 (5053)

...

T9 13178d 11:05:23 7919 (7919)

...

T15 13178d 11:03:16 1865 (1865)

Replica:

...

T7 13678d 8:33:46 6982 (6982)

...

T9 13178d 11:05:23 10258 (10258)

...

T15 13178d 11:03:16 2043 (2043)

This indicates that there are indeed synchronisation problems between this replica server and the master server.
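
Comparing the figures by eye gets tedious with dozens of tables. The sketch below assumes the output of "boksdiag sequence" was saved to one file per host and that, as in the rows above, the first column is the table and the fourth column is the sequence number. The name compare_sequences is hypothetical; the default threshold of ten comes from the rule of thumb in this step.

```shell
# Hypothetical helper.
# Usage: compare_sequences MASTER_FILE REPLICA_FILE [THRESHOLD]
# Prints one line per table whose sequence numbers differ by THRESHOLD or more.
compare_sequences() {
    awk -v limit="${3:-10}" '
        $1 ~ /^T[0-9]+$/ && NF >= 4 {
            if (FNR == NR) { master[$1] = $4; next }   # first file: master
            if ($1 in master) {
                diff = $4 - master[$1]
                if (diff < 0) diff = -diff
                if (diff >= limit)
                    printf "%s master=%s replica=%s diff=%d\n", $1, master[$1], $4, diff
            }
        }
    ' "$1" "$2"
}
```

For the figures shown above this would flag T7, T9 and T15, with differences of 1929, 2339 and 178 respectively.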

11. Verifying the synch status of other replicas

Now that we've ascertained that there's one replica that's running badly, it's a good idea to check the other replicas as well. Run the "boksdiag sequence" command on the other replicas and verify the figures again.

In this case the figures for the other replicas all look fine, with one exception: REPHOST2 complains about database locking issues. Said error messages also pop up when running "dumpbase" commands on that replica, indicating software errors on that host as well.

boksdiag@REPLICA: INTERNAL DYNDB ERROR in blockbase(): Can't lock database

errno = 28, No space left on device

boksdiag@REPLICA: INTERNAL DYNDB ERROR in bunlockbase(): Can't unlock database

errno = 28, No space left on device

T0 12549d 6:39:06 94193 (94193)

T1 13907d 7:13:45 637314 (637314)

...

 

In conclusion

In the end the problem was in fact down to REPHOST being out of synch with the rest of the BoKS domain. The troubleshooting continues with example 2.



BoKS troubleshooting: SSH daemon and client

2008-11-21 21:13:00

At $CLIENT we found that almost 60% of our time was being spent on troubleshooting SSH or SFTP in one of its many forms. Because each problem -seemed- unique we kept on reinventing the wheel, costing us precious time. To cut down on this I've set up a short procedure that should help in diagnosing the problem. I've also made a list of various symptoms that are linked to rather rare scenarios.

Troubleshooting example 1 also covers most of these steps with some sample output for additional detail.

Standard procedure is to follow these steps:

  1. Check the BoKS transaction log.
  2. Check the user's access rights.
  3. Check for authentication methods.
  4. Check the user's SSH keypair.

This should actually be enough to handle 70% of the cases. For the rest there's more:

  5. Rare cases: further debugging.
  6. Scenarios and symptoms.



1. Check the BoKS transaction log

While this may sound painfully obvious, the best place to see why a user cannot login is the BoKS transaction log. For each login request handled by BoKS these files will contain a log entry. It's easiest to search for the combination of hostname and username and to use the BoKS log parser to make the output legible.

For example:

$ for FILE in `ls -lrt | grep "Dec 13" | awk '{print $9}'`

> do

> grep $HOSTNAME $FILE | grep $USER | /opt/boksm/sbin/bkslog -f -

> done
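
Parsing the output of ls works, but it breaks on odd file names and ties the search to one hard-coded date. As an alternative sketch, find can select the recently modified logs itself. Everything named here is an assumption of mine: the helper search_boks_logs, the BKSLOG override and the choice of "modified in the last 24 hours" as the window; only the bkslog -f - call comes from the example above.

```shell
# Hypothetical helper.
# Usage: search_boks_logs LOGDIR HOSTNAME USERNAME
# Greps all log files modified in the last 24 hours for lines that mention
# both the host and the user, then feeds them to the BoKS log parser.
search_boks_logs() {
    dir=$1
    host=$2
    user=$3
    find "$dir" -type f -mtime -1 -exec grep -h -- "$host" {} + |
        grep -- "$user" |
        "${BKSLOG:-/opt/boksm/sbin/bkslog}" -f -
}
```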

Using either the output of the parsed BoKS log, or the list of error codes it should be trivial to find out what's going wrong. The most common errors in our environment are the following:



2. Check the user's access rights

As was mentioned, in the case of a 200, 201 or 203 error you'll have to verify whether the user actually has access to the requested resource. Crosscheck the following:

One of the most useful commands will be:

BoKS > lsbks -aTl *:$USER

The "lsbks" command lists information about a user. By using -a (all) and -T (access routes) you'll see everything you'll need to know. Hostgroup, userclass, uid/gid, is the account locked, when was the last login, and so on. You'll also see two lists of access routes: one for the individual user and one for his userclass.



3. Check for authentication methods

SSH is tricky insofar as it allows for (a combination of) multiple authentication methods. The most common are password, keyboard interactive and ssh_pk, aka key pair. The keyboard interactive method is actually forced by BoKS, thus disabling the "password" method, which isn't a problem at all since keyboard interactive -includes- password auth.

If the user is denied access, it could be that the authentication method used isn't allowed. By default, users have to use password authentication. In order to allow keypair authentication one has to set a particular flag on the account. This flag can be checked with either of these commands.

BoKS > authadm list -u *:$USER

BoKS > dumpbase -t31 | grep $USER

You'll notice the "must use" flag which indicates whether ssh_pk is optional or required. This value can be changed using the -m and -M flags on the "authadm mod" command.
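
When scripting this check, the individual fields of a dumpbase record can be picked out with sed. A minimal sketch, assuming records use the KEY="value" layout seen in troubleshooting example 1 (for instance RLOGNAME="SYBASE:sybase" TYPE="ssh_pk" VERSION="1.0" FLAGS="1"); record_field is a hypothetical helper, not a BoKS command.

```shell
# Hypothetical helper: read dumpbase records on stdin and print the value of
# one quoted field for every record that has it.
record_field() {
    sed -n "s/.*$1=\"\([^\"]*\)\".*/\1/p"
}
```

Piping the output of the dumpbase command above through record_field TYPE would then print the authentication type set for the user.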



4. Check the user's SSH keypair

If the user is in fact making use of ssh_pk we should ensure that all relevant settings are correct.



5. Rare cases: further debugging

For those few cases that aren't solved by the aforementioned steps, there are a few other things we can try.



6. Scenarios and symptoms



BoKS troubleshooting: replication of the BoKS database

2008-11-21 21:08:00

If one or more of the replicas are out of sync, login attempts by users may fail, assuming that the BoKS client on the server in question was looking at the out-of-sync BoKS replica. Other nasty stuff may also occur.

Standard procedure is to follow these steps:

  1. Check the status of all BoKS replicas.
  2. Check BoKS error logs on the master and the replica(s).
  3. Try a forced database download.
  4. Check BoKS replication processes to see if they are all running.
  5. Check the master queue, using the boksdiag fque -master command.
  6. Check BoKS communications, using the cadm command.
  7. Check node keys.
  8. Check the replica server's definition in the BoKS database.
  9. Check the BoKS configuration on the replica.
  10. Debug replication processes.

All commands are run in a BoKS shell, on the master server unless specified otherwise.



1. Check the status of all BoKS replicas.

# /opt/boksm/sbin/boksadm -S boksdiag list

Since last pckt

The number of minutes/seconds since the BoKS master last sent a communication packet to the respective replica server. This should never exceed a couple of minutes.

Since last fail

The number of days/hours/minutes since the BoKS master was last unable to update the database on the respective replica server. If only a couple of hours are listed, you'll know that the replica server had a recent failure.

Since last sync

The number of days/hours/minutes since BoKS last sent a database update to the respective replica server.

Last status

Yes indeed! The last known status of the replica server in question. OK means that the server is running perfectly and that updates are received. Loading means that the server was just restarted and is still loading the database or any updates. Down indicates that the replica server is down or even dead.



2. Check BoKS error logs on the master and the replica(s).

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the replicas to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.



3. Try a forced database download.

Keon> boksdiag download -force $hostname

This will push a database update to the replica. Perform another boksdiag list to see if it worked. Re-read the BoKS error log file to see if things have cleared up.



4. Check BoKS replication processes to see if they are all running.

Keon> ps -ef | grep -i drainmast

This should show two drainmast processes running. If there aren't, you should see errors about this in the error logs and in Tivoli.
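
Counting the processes can also be left to awk, which avoids the classic mistake of counting the grep itself. A small sketch; count_drainmast is a made-up name and it expects the output of ps -ef on standard input.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the number of
# drainmast processes, ignoring the grep/awk commands doing the searching.
count_drainmast() {
    awk 'tolower($0) ~ /drainmast/ && !/grep|awk/ { n++ } END { print n + 0 }'
}
```

Piping ps -ef into count_drainmast should print 2 on a healthy master.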

Keon> Boot -k

Keon> ps -ef | grep -i boks (kill any remaining BoKS processes)

Keon> Boot

Check to see if the two drainmast processes stay up. Keep checking for at least two minutes. If one of them crashes again, try the following:

Check to see that /opt/boksm/lib/boks_drainmast is still linked to boks_drainmast_d, which should be in the same directory. Also check to see that boks_drainmast_d is still the same file as boks_drainmast_d.nonstripped.

If it isn't, copy boks_drainmast_d to boks_drainmast_d.orig and then copy the non-stripped version over boks_drainmast_d. This will allow you to create a core file which is useful to TFS Technology.
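
The compare-and-copy described above could look like this. It is a sketch under assumptions: swap_in_nonstripped is a hypothetical name, the directory defaults to /opt/boksm/lib as mentioned in this step, and I'm assuming BoKS is stopped while the binary is being replaced.

```shell
# Hypothetical helper: make sure boks_drainmast_d is the non-stripped binary,
# keeping a copy of the current one as boks_drainmast_d.orig.
swap_in_nonstripped() {
    dir=${1:-/opt/boksm/lib}
    if cmp -s "$dir/boks_drainmast_d" "$dir/boks_drainmast_d.nonstripped"; then
        echo "already running the non-stripped binary"
    else
        cp -p "$dir/boks_drainmast_d" "$dir/boks_drainmast_d.orig" &&
        cp -p "$dir/boks_drainmast_d.nonstripped" "$dir/boks_drainmast_d" &&
        echo "non-stripped binary installed, original saved as boks_drainmast_d.orig"
    fi
}
```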

Keon> Boot -k

Keon> Boot

Keon> ls -al /core

Check that the core file was just created by boks_drainmast_d.

Keon> Boot -k

Keon> cd /var/opt/boksm/data

Keon> tar -cvf masterspool.tar master_spool

Keon> rm master_spool/*

Keon> Boot

Things should now be back to normal. Send both the tar file and the core file to TFS Technology (support@tfstech.com).



5. Check the master queue.

Keon> boksdiag fque -master

If any messages are stuck there is most likely still something wrong with the drainmast processes. You may want to try and reboot the BoKS master software. Do NOT reboot the master server! Reboot the software using the Boot command. If that doesn't help, perform the troubleshooting tips from step 4.



6. Check BoKS communications, using the cadm command.

Verify that the BoKS communication between the master and the replica itself is up and running.

Keon> cadm -l -f bcastaddr -h $replica

If this doesn't work, re-check the error logs on the client and proceed with step 7.



7. Check node keys.

On the replica system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn't return the configuration for the replica server, the keys have become unsynchronized. If you make any changes you will need to restart the BoKS processes, using the Boot command.



8. Check the replica server's definition in the BoKS database.

Keon> dumpbase | grep RNAME | grep $replica

The TYPE field in the definition of the replica should be set to 261. Anything else is wrong, so you need to update the configuration in the BoKS database. Either that or have SecOPS do it for you.
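
Checking the TYPE value can be scripted as well. A minimal sketch, assuming the host record prints its fields as KEY="value" pairs like the other dumpbase output in these articles; check_replica_type is a hypothetical name and the expected value 261 is the one given above.

```shell
# Hypothetical helper: read host records on stdin (for example the output of
# the dumpbase | grep pipeline above) and judge each record's TYPE field.
check_replica_type() {
    awk '
        match($0, /(^| )TYPE="[^"]*"/) {
            val = substr($0, RSTART, RLENGTH)
            sub(/^ ?TYPE="/, "", val)
            sub(/"$/, "", val)
            print (val == "261" ? "OK" : "WRONG") ": TYPE=" val
        }
    '
}
```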



9. Check the BoKS configuration on the replica.

On the replica system, review the settings in /etc/opt/boksm/ENV.



10. Debug replication processes.

If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details.





BoKS troubleshooting: login and communications issues on the client

2008-11-21 21:04:00

The basics: verifying the proper functioning of a BoKS client

These easy steps will show you whether your new client is working like it should.

  1. Check the boks_errlog in $BOKS_var.
  2. Run cadm -l -f bcastaddr -h $client from the BoKS master (in a BoKS shell).
  3. Try to login to the new client.

If all three steps go through without error your system is as healthy as a very healthy good thing... or something.



You can't log in to a BoKS client

Most obviously we can't do our work on that particular server and neither can our customers. Naturally this is something that needs to be fixed quite urgently!

  1. Check BoKS transaction log.
  2. Check if you can log in.
  3. Check BoKS communications
  4. Check bcastaddr and bremotever files.
  5. Check BoKS port number.
  6. Check node keys
  7. Check BoKS error logs.
  8. Debug servc process on replica server or relevant process on client.

All commands are run in a BoKS shell, on the master server unless specified otherwise.



1. Check BoKS transaction log.

Keon> cd /var/opt/boksm/data

Keon> grep $user LOG | bkslog -f - -wn

This should give you enough output to ascertain why a certain user cannot login. If there is no output at all, do the following:

Keon> cd /var/junkyard/bokslogs

Keon> for file in `ls -lrt | tail -5 | awk '{print $9}'`

> do

> grep $user $file | bkslog -f - -wn

> done

If this doesn't provide any output, perform step 2 as well to see if we sysadmins can log in.



2. Check if you can log in.

Pretty self-explanatory, isn't it? Check whether you can log in yourself.



3. Check BoKS communications

Keon> cadm -l -f bcastaddr -h $client



4. Check bcastaddr and bremotever files.

Login to the client through its console port.

Keon> cat /etc/opt/boksm/bcastaddr

Keon> cat /etc/opt/boksm/bremotever

These two files should match the same files on another working client. Do not use a replica or master to compare the files. These are different over there. If you make any changes you will need to restart the BoKS processes using the Boot command.
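
If copies of both files from a known-good client are at hand, the comparison can be scripted. A sketch only: compare_client_config is a made-up helper, and the reference directory is wherever you stored the copies from the working client.

```shell
# Hypothetical helper.
# Usage: compare_client_config REFERENCE_DIR [CONFIG_DIR]
# Compares bcastaddr and bremotever against copies from a working client.
# Returns non-zero if either file differs or is missing.
compare_client_config() {
    ref=$1
    conf=${2:-/etc/opt/boksm}
    status=0
    for f in bcastaddr bremotever; do
        if diff "$ref/$f" "$conf/$f" >/dev/null 2>&1; then
            echo "$f: identical"
        else
            echo "$f: differs"
            status=1
        fi
    done
    return $status
}
```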



5. Check BoKS port number.

On the client and master run:

Keon> getent services boks

This should return the same value for the BoKS base port. If it doesn't, check either /etc/services or NIS+. If you make any changes you will need to restart the BoKS processes using the Boot command.
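
To compare just the port numbers, the getent output can be trimmed down first. A small sketch, assuming the usual "name port/protocol" layout of a services entry; boks_port is a hypothetical name.

```shell
# Hypothetical helper: read a services entry on stdin and print only the
# port number of the first entry.
boks_port() {
    awk 'NF >= 2 { split($2, parts, "/"); print parts[1]; exit }'
}
```

Piping getent services boks into boks_port on both the client and the master should print the same number on each.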



6. Check node keys

On the client system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn't return the definition for the client server, the keys have become unsynchronized. Reset them and restart the BoKS processes on the client using the Boot command.



7. Check BoKS error logs.

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.



8. Debug servc process on replica server or relevant process on client.

If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details (see chapter: SCENARIO: Setting a trace within BoKS)

NOTE: If you need to restart the BoKS software on the client without logging in, try doing so using a remote management tool, like Tivoli.



The client queues are filling up or you can't communicate with the client

The whole of BoKS is still up and running and everything's working perfectly. The only client(s) that won't work are the one(s) that have stuck queues. The only way you'll find out about this is by running boksdiag fque -bridge which reports all of the queues which are stuck.

  1. Check if client is up and running.
  2. Check BoKS communications.
  3. Check node keys.
  4. Check BoKS error logs.

All commands are run in a BoKS shell, on the master server unless specified otherwise.



1. Check if client is up and running.

Keon> ping $client

Also ask your colleagues to see if they're working on the system. Maybe they're performing maintenance.



2. Check BoKS communications.

Keon> cadm -l -f bcastaddr -h $client



3. Check node keys.

On the client system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn't return the definition for the client server, the keys have become unsynchronised. Reset them and restart the BoKS client software using the Boot command.



4. Check BoKS error logs.

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.

NOTE: What can we do about it?

If you're really desperate to get rid of the queue, do the following

Keon> boksdiag fque -bridge -delete $client-ip

At one point in time we thought it would be wise to manually delete messages from the spool directories. Do not under any circumstance touch the crypt_spool and master_spool directories in /var/opt/boksm. Really: DON'T DO THIS! This is unnecessary and will lead to troubles with BoKS.



The BoKS/Keon users group is taking off!

2008-10-29 09:44:00

About a week ago I opened up the BoKS Access Control users group (LinkedIn) on LinkedIn.com. My goal was to unite BoKS/Keon admins from across the globe in order to build a tightly knit network in which we can all share our knowledge of BoKS.

The thing is, the way things are right now, there's hardly any information on the web about BoKS/Keon. First off "BoKS" is a four letter word, which makes it hard for Google to look for anything useful (especially since it keeps correcting it to "books"). Second, there's not that much on the web anyway! There's my website which has some real info and then there's the Fox Tech site which has general sales info. For some reason Fox Tech decided to hide all the manuals and in-depth stuff so only paying customers can get to the docs.

By building a professional network of BoKS users we finally know who to turn to for questions! LinkedIn allows us to post discussions inside our group and since folks from Fox Tech are also joining, we're bound to get some good answers!

Right now we're at 31 members but, since Fox has started advertising the group to their customers, I'm assuming we'll see a steady rise in members RSN(tm)!



Modifying the BoKS GUI

2008-10-20 08:10:00

In most cases the BoKS administration GUI serves its purpose. It's pretty spartan, though it can look a bit crowded at times. This isn't altogether that strange, as FoxT have used the same GUI layout for years on end. It's getting a bit long in the tooth.

Sometimes though you'll run into things that you'd like to do from the GUI, but which aren't implemented (yet). And that's where the hacking starts ^_^ In this article I'll go over the basic structure of the GUI's files and resources, explaining the function of each part. I'll also discuss a few of the changes we've made (or are contemplating) at $CLIENT.

Structure of the GUI files and resources

As is mentioned elsewhere, BoKS runs a custom webserver on ports 6505 and 6506 (default ports). This webserver gets started using the $BOKS_etc/boksinit.master script and, as the name implies, only runs on the master server.

All resources for the management GUI are stored in $BOKS_lib/gui. There you will find four subdirectories.

Keon> ls $BOKS_lib/gui

etc

forms

public

tcl

To start with, the public directory contains those few files that are accessible without having logged on. Naturally these files are limited to the various login screens, ie password/certificate/securid. Nothing more, nothing less.

The etc directory contains all the template files (.tmpl) that are used to create the GUI, as well as all of the image files. Most images are limited to the black banner at the top.

The forms directory consists of files and directories that form the menu structure of the GUI. There's a .menu file for each option in the main menu and a directory containing more .menu's for options that have sub-menus. This directory also contains all of the .form files that are used to enter or edit information.

Finally, the tcl directory contains the TCL code that does the actual work. Whenever you've edited a form to update information in the database, this code gets used to perform the actual modifications.

Including the domain name in the banner

One of the first mods that I wanted to make to our GUI was to include the names of the BoKS domain and the master/replica server in the black banner of each page. That way it would be impossible to mix up in which domain you're working, thus lowering the chance of FUBARs. Later on I also decided it would be a good idea to include the domain name in each page's title. Of course this mod isn't as useful if you're only running one domain.

To make the desired changes we'll need to edit a number of .tmpl files in $BOKS_lib/gui/etc/eng. The changes we will be making are along these lines.

Original:

<html>

<title>

Welcome to FoxT BoKS

</title>

<body><body TEXT="000000" LINK="#0000FF" ALINK="#0000FF" VLINK="#0000FF">



<table bgcolor="black" width="100%">

<tr><td align="center"><IMG SRC="@PUBLIC@/eng/figs/welcome.gif" alt="Welcome to FoxT BoKS"></td></tr>

</table>

Modified:

<html>

<title>

CAT DOMAIN: Welcome to FoxT BoKS

</title>

<body><body TEXT="000000" LINK="#0000FF" ALINK="#0000FF" VLINK="#0000FF">



<table style="color: #FFFFFF;" bgcolor="black" width="100%">

<tr><td align="center"><IMG SRC="@PUBLIC@/eng/figs/welcome.gif" alt="Welcome to FoxT BoKS"></td></tr>

<tr><td align="center">CAT DOMAIN, running on master server <i>Andijvie</i></td></tr>

</table>

As you can see, all I did was slightly modify the TITLE tag and I've added an additional row to the banner table. I've also tweaked the text colour in the banner, so it's not black on black.

The abovementioned changes need to be made in all of the .tmpl files on the master server. If you like, you could also make the mods on the replica servers, assuming that you may at one point in time need to failover to one of them. You never know when the master server might croak.
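
Editing every template by hand gets old quickly, so the title change could be applied in a loop. This is a rough sketch, not something FoxT ships: tag_templates is a made-up name, it assumes the title text sits on its own line right after a lowercase <title> tag (as in the example above), and the domain name must not contain characters that are special to sed, such as "/" or "&". The banner row still has to be added separately.

```shell
# Hypothetical helper.
# Usage: tag_templates TEMPLATE_DIR DOMAIN_NAME
# Prefixes the title line of every .tmpl file with "DOMAIN_NAME: ", keeping
# the untouched file as NAME.tmpl.orig. Already tagged files are skipped.
tag_templates() {
    dir=$1
    domain=$2
    for f in "$dir"/*.tmpl; do
        [ -f "$f" ] || continue
        grep -q "^$domain: " "$f" && continue      # already tagged
        cp -p "$f" "$f.orig" || return 1
        sed -e '/<title>/{' -e n -e "s/^/$domain: /" -e '}' "$f.orig" > "$f" || return 1
    done
}
```

Trying this on a copy of $BOKS_lib/gui/etc/eng first seems wise, as a broken template on the master would affect the whole GUI.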



Troubleshooting BoKS fault situations

2008-01-01 00:00:00

A PDF version of this document is available. Get it over here.

1.1. Verifying the proper functioning of a BoKS client

People have often asked me how one can check if a newly installed BoKS client is functioning properly. With these three easy steps you too can become a milliona..!!.... Oops... Wrong show!

These easy steps will show you whether your new client is working like it should.

  1. Check the boks_errlog in $BOKS_var.
  2. Run cadm -l -f bcastaddr -h $client from the BoKS master (in a BoKS shell).
  3. Try to login to the new client.

If all three steps go through without error your system is as healthy as a very healthy good thing... or something.

1.2. SCENARIO: The BoKS master is not replicating to a replica (or all replicas)

Since one or more of the replicas are out of sync, login attempts by users may fail, assuming that the BoKS client on the server in question was looking at the out-of-sync BoKS replica. Other nasty stuff may also occur.

Standard procedure is to follow these steps:

  1. Check the status of all BoKS replicas.
  2. Check BoKS error logs on the master and the replica(s).
  3. Try a forced database download.
  4. Check BoKS replication processes to see if they are all running.
  5. Check the master queue, using the boksdiag fque -master command.
  6. Check BoKS communications, using the cadm command.
  7. Check node keys.
  8. Check the replica server’s definition in the BoKS database.
  9. Check the BoKS configuration on the replica.
  10. Debug replication processes.

All commands are run in a BoKS shell, on the master server unless specified otherwise.

1. Check the status of all BoKS replicas.

# /opt/boksm/sbin/boksadm -S boksdiag list

Since last pckt

The number of minutes/seconds since the BoKS master last sent a communication packet to the respective replica server. This should never exceed a couple of minutes.

Since last fail

The number of days/hours/minutes since the BoKS master was last unable to update the database on the respective replica server. If only a couple of hours are listed, you’ll know that the replica server had a recent failure.

Since last sync

The number of days/hours/minutes since BoKS last sent a database update to the respective replica server.

Last status

Yes indeed! The last known status of the replica server in question. OK means that the server is running perfectly and that updates are received. Loading means that the server was just restarted and is still loading the database or any updates. Down indicates that the replica server is down or even dead.

2. Check BoKS error logs on the master and the replica(s).

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the replicas to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.

3. Try a forced database download.

Keon> boksdiag download -force $hostname

This will push a database update to the replica. Perform another boksdiag list to see if it worked. Re-read the BoKS error log file to see if things have cleared up.

4. Check BoKS replication processes to see if they are all running.

Keon> ps –ef | grep –i drainmast

This should show two drainmast processes running. If there aren’t you should see errors about

this in the error logs and in Tivoli.

Keon> Boot -k

Keon> ps -ef | grep -i boks (kill any remaining BoKS processes)

Keon> Boot

Check to see if the two drainmast processes stay up. Keep checking for at least two minutes. If one of them crashes again, try the following:

Check to see that /opt/boksm/lib/boks_drainmast is still linked to boks_drainmast_d, which should be in the same directory. Also check to see that boks_drainmast_d is still the same file as boks_drainmast_d.nonstripped.

If it isn’t, copy boks_drainmast_d to boks_drainmast_d.orig and then copy the non-stripped version over boks_drainmast_d. This will allow you to create a core file, which is useful to TFS Technology.
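The link and file comparison described above could be checked along these lines. It is a sketch under assumptions: check_drainmast is a hypothetical name, and "the same file" is taken to mean byte-identical, which is what cmp tests.

```shell
# Directory from the text above; override LIBDIR to try this elsewhere.
LIBDIR=${LIBDIR:-/opt/boksm/lib}

check_drainmast() {
    # Show the directory entry, so you can see what boks_drainmast is linked to.
    ls -l "$LIBDIR/boks_drainmast"
    # cmp -s prints nothing and returns 0 only when both files are byte-identical.
    if cmp -s "$LIBDIR/boks_drainmast_d" "$LIBDIR/boks_drainmast_d.nonstripped"; then
        echo "boks_drainmast_d matches the non-stripped version"
    else
        echo "boks_drainmast_d differs from the non-stripped version"
    fi
}
```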

Keon> Boot -k

Keon> Boot

Keon> ls -al /core

Check that the core file was just created by boks_drainmast_d.

Keon> Boot -k

Keon> cd /var/opt/boksm/data

Keon> tar -cvf masterspool.tar master_spool

Keon> rm master_spool/*

Keon> Boot

Things should now be back to normal. Send both the tar file and the core file to TFS Technology (support@tfstech.com).

5. Check the master queue.

Keon> boksdiag fque -master

If any messages are stuck, there is most likely still something wrong with the drainmast processes. You may want to try restarting the BoKS master software. Do NOT reboot the master server! Restart the software using the Boot command. If that doesn’t help, perform the troubleshooting tips from step 4.

6. Check BoKS communications, using the cadm command.

Verify that the BoKS communication between the master and the replica itself is up and running.

Keon> cadm -l -f bcastaddr -h $replica

If this doesn’t work, re-check the error logs on the replica and proceed with step 7.

7. Check node keys.

On the replica system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn’t return the configuration for the replica server, the keys have become unsynchronized. If you make any changes you will need to restart the BoKS processes, using the Boot command.

8. Check the replica server’s definition on BoKS database.

Keon> dumpbase | grep RNAME | grep $replica

The TYPE field in the definition of the replica should be set to 261. Anything else is wrong, so you need to update the configuration in the BoKS database. Either that or have SecOPS do it for you.

9. Check the BoKS configuration on the replica.

On the replica system, review the settings in /etc/opt/boksm/ENV.

10. Debug replication processes.

If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details.

1.3 SCENARIO: You can’t log in to a BoKS client

Most obviously we can’t do our work on that particular server and neither can our customers. Naturally this is something that needs to be fixed quite urgently!

  1. Check BoKS transaction log.
  2. Check if you can log in.
  3. Check BoKS communications
  4. Check bcastaddr and bremotever files.
  5. Check BoKS port number.
  6. Check node keys
  7. Check BoKS error logs.
  8. Debug servc process on replica server or relevant process on client.

All commands are run in a BoKS shell, on the master server unless specified otherwise.

1. Check BoKS transaction log.

Keon> cd /var/opt/boksm/data

Keon> grep $user LOG | bkslog -f - -wn

This should give you enough output to ascertain why a certain user cannot log in. If there is no output at all, do the following:

Keon> cd /var/junkyard/bokslogs

Keon> for file in `ls -lrt | tail -5 | awk '{print $9}'`

> do

> grep $user $file | bkslog -f - -wn

> done

If this doesn’t provide any output, perform step 2 as well to see if we sysadmins can log in.
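The loop over the archived logs can also be wrapped in a small function. The following is a sketch rather than a drop-in replacement: grep_recent is a hypothetical name, and like the original loop it assumes file names without spaces.

```shell
# Grep the N most recently modified files in a directory for a pattern,
# oldest of those files first so the matches come out in time order.
grep_recent() {
    # $1 = directory, $2 = pattern, $3 = how many files to search (default 5)
    (
        cd "$1" || exit 1
        for file in $(ls -tr | tail -n "${3:-5}"); do
            grep -- "$2" "$file"
        done
    )
}

# On a BoKS system the matches would be piped on to bkslog, as in the loop above:
#   grep_recent /var/junkyard/bokslogs "$user" | bkslog -f - -wn
```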

2. Check if you can log in.

Pretty self-explanatory, isn’t it? See whether you can log in yourself.

3. Check BoKS communications

Keon> cadm -l -f bcastaddr -h $client

4. Check bcastaddr and bremotever files.

Log in to the client through its console port.

Keon> cat /etc/opt/boksm/bcastaddr

Keon> cat /etc/opt/boksm/bremotever

These two files should match the same files on another working client. Do not use a replica or master to compare the files; they are different over there. If you make any changes you will need to restart the BoKS processes using the Boot command.

5. Check BoKS port number.

On the client and master run:

Keon> getent services boks

This should return the same value for the BoKS base port on both systems. If it doesn’t, check either /etc/services or NIS+. If you make any changes you will need to restart the BoKS processes using the Boot command.
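Comparing the two getent results by hand is easy to get wrong when the lines differ only in spacing or aliases. A minimal sketch of a comparison follows. The helper names are hypothetical, the port 6500 in the comments is an example value only, and how you fetch the line from the other host is left open.

```shell
# Extract the port number from a "getent services" style line,
# e.g. "boks  6500/tcp" (6500 is only an example, not the real BoKS port).
port_of() {
    awk '{ split($2, a, "/"); print a[1] }'
}

# Succeeds when two getent lines carry the same port number.
same_port() {
    # $1 = line from the client, $2 = line from the master
    [ "$(printf '%s\n' "$1" | port_of)" = "$(printf '%s\n' "$2" | port_of)" ]
}

# Usage, with $master_line pasted or fetched from the master:
#   same_port "$(getent services boks)" "$master_line" && echo "ports match"
```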

6. Check node keys

On the client system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn’t return the definition for the client server, the keys have become unsynchronized. Reset them and restart the BoKS client software using the Boot command.

7. Check BoKS error logs.

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.

8. Debug servc process on replica server or relevant process on client.

If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details (see chapter: SCENARIO: Setting a trace within BoKS).

NOTE: If you need to restart the BoKS software on the client without logging in, try doing so using a remote management tool, like Tivoli.

1.4 SCENARIO: The BoKS client queues are filling up

The whole of BoKS is still up and running and everything’s working perfectly. The only client(s) that won’t work are the one(s) that have stuck queues. The only way you’ll find out about this is by running boksdiag fque -bridge, which reports all of the queues that are stuck.

  1. Check if client is up and running.
  2. Check BoKS communications.
  3. Check node keys.
  4. Check BoKS error logs.

All commands are run in a BoKS shell, on the master server unless specified otherwise.

1. Check if client is up and running.

Keon> ping $client

Also ask your colleagues whether they’re working on the system. Maybe they’re performing maintenance.

2. Check BoKS communications.

Keon> cadm -l -f bcastaddr -h $client

3. Check node keys.

On the client system run:

Keon> hostkey

Take the output from that command and run the following on the master:

Keon> dumpbase | grep $hostkey

If this doesn’t return the definition for the client server, the keys have become unsynchronized. Reset them and restart the BoKS client software using the Boot command.

4. Check BoKS error logs.

This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file mentions the hosts involved, you should be able to find the cause of the problem pretty quickly.

NOTE: What can we do about it?

If you’re really desperate to get rid of the queue, do the following:

Keon> boksdiag fque -bridge -delete $client-ip

At one point in time we thought it would be wise to manually delete messages from the spool directories. Do not under any circumstance touch the crypt_spool and master_spool directories in /var/opt/boksm. Really: DON’T DO THIS! It is unnecessary and will lead to trouble with BoKS.

1.5 SCENARIO: Setting a trace within BoKS

We are required to run a BoKS debug trace when either:

  1. People are unable to log in without any apparent reason. A debug will show why logins are getting rejected.
  2. We have run into a bug or a problem with BoKS which cannot easily be dealt with through e-mail. TFS Tech support will usually ask us to perform a number of traces and send them the output files.

First off, let me warn you: debug trace log files can grow pretty vast pretty fast! Make sure that you turn on the trace only right before you’re ready to use the faulty part of BoKS and also be sure to stop the trace immediately once you’re done.

Now, before you can start a trace you will need to make sure that the BoKS client system only performs transactions with one BoKS server. If you don’t, you will have no way of knowing on which server you should run the trace.

Log in to the client system experiencing problems.

$ su -

# cd /etc/opt/boksm

# cp bcastaddr bcastaddr.orig

# vi bcastaddr

Edit the file in such a way that it only points to one of the available BoKS servers, preferably a BoKS replica. Please refrain from using the BoKS master server.

# /opt/boksm/sbin/boksadm -S Boot -k

# sleep 10; ps -ef | grep -i boks | awk '{print $2}' | xargs kill

# /opt/boksm/sbin/boksadm -S Boot

Now, how you proceed depends on what problems you are experiencing.

If people are having problems logging in:

Log in to the replica server and start BoKS with sx.

# sx /opt/boksm/sbin/boksadm -S

# cd /var/tmp

Now, type the following command, but DO NOT press enter yet.

# bdebug -x 9 bridge_servc_r -f /var/tmp/BR-SERVC.trace

Open a new terminal window, because we will try to log in to the failing client. BEFORE YOU START THE TOOL USED TO LOG IN (SSH, Telnet, FTP, whatever) press enter at the command waiting on the replica server. Attempt to log in as usual. If it fails you have successfully set a trace.

Switch back to the window on the replica server and run the following command to stop the trace.

# bdebug -x 0 bridge_servc_r

Repeat the same process once more, but this time around debug the servc process instead of bridge_servc_r. Send the output to /var/tmp/SERVC.trace.

You can now read through the files /var/tmp/BR-SERVC.trace and /var/tmp/SERVC.trace to troubleshoot the problem yourself, or you could send them to TFS Tech for analysis. If the attempted login did NOT fail there’s something else going on: one of the other replica servers is not working properly! Find out which one it is by editing the client’s bcastaddr file, each time using a different BoKS server as the target.

If you are attempting to troubleshoot another kind of problem:

Tracing any other part of BoKS isn’t really altogether that different from tracing the login process. You prepare in the same way (make bcastaddr point at one BoKS server) and you will probably have to prepare the trace on bridge_servc_r as well (see the text block above; if you do not have to trace bridge_servc_r TFS Tech will probably tell you so).

Yet again, BEFORE you start the trace on the master side by running

# bdebug -x 9 bridge_servc_r -f /var/tmp/SERVC.trace

you will have to go to the client system with the problematic situation and perform the following.

# cd /var/tmp

# bdebug -x 9 $PROG -f /var/tmp/$PROG.trace

$PROG in this case is the name of the BoKS process (bridge_servc_r, drainmast_download) or the access method (login, su, sshd) that you want to debug.

Now, start both traces and attempt to perform the task that is failing. Once it has failed, stop both traces again using bdebug -x 0 $PROG.
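Because trace files grow so quickly, it can help to make "switch it off again" hard to forget. The wrapper below is a sketch under assumptions: trace_on, trace_off and with_trace are hypothetical names, and only the two bdebug invocations are taken from the text above. BDEBUG can be pointed at echo to dry-run it on a machine without BoKS.

```shell
# Command and output directory; both can be overridden for a dry run.
BDEBUG=${BDEBUG:-bdebug}
TRACE_DIR=${TRACE_DIR:-/var/tmp}

# Start and stop a level 9 trace for one BoKS program, as shown above.
trace_on()  { "$BDEBUG" -x 9 "$1" -f "$TRACE_DIR/$1.trace"; }
trace_off() { "$BDEBUG" -x 0 "$1"; }

# Run a command with tracing enabled, then always switch the trace off,
# even when the command itself fails. Returns the command's exit status.
with_trace() {
    prog=$1
    shift
    trace_on "$prog" || return 1
    status=0
    "$@" || status=$?
    trace_off "$prog"
    return "$status"
}
```

For the login case this could be run on the replica as with_trace bridge_servc_r sleep 60, which leaves you one minute to reproduce the failing login from another terminal before the trace is stopped.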

1.6 SCENARIO: Debugging the BoKS SSH daemon

From time to time you may have problems with the BoKS SSH daemon which cannot be explained in any logical way. At such a time a debug trace of the SSH daemon can be very helpful! This can be done by temporarily starting a second daemon on an unused port.

On the troubled system, log in and start a BoKS shell:

# /opt/boksm/sbin/boksadm -S

Keon> boks_sshd -d -d -d -p 24 > /tmp/sshd.out 2>&1

From another system:

$ ssh -l $username -p 24 $target-host

Try logging in; it shouldn’t work :) Now close the SSH session with Ctrl-C, which should also close the temporary SSH daemon on port 24. /tmp/sshd.out should now contain all of the debugging information you or TFS Technology could need.



Nagios and BoKS/Keon

2005-09-11 00:47:00

Major updates in the Sysadmin section! w00t!

In this case a lot of information on one of my favourite security tools and Nagios, my new-found love on the monitoring front.



Hacking NIS+ and BoKS

2004-11-17 18:25:00

Holy moly, what a weekend! I can tell you guys right now that the procedure I wrote for switching NIS+ master servers is NOT foolproof! We had planned to take about four hours at most for switching both NIS+ and BoKS over to a new master server. Unfortunately it turned out that we would only get to spend one hour on switching NIS+ before things went horribly sour.

In the end I spent a total of eighteen hours in the office on Saturday and Sunday. I'll spare you the gory details for now (I'll incorporate them in version 2.0 of the master switch procedure).

But God, what a weekend! And the way it looks now we'll be repeating it in a week or so...

Aniwho... I'm still trying to put as much time as possible into my work for the convention, but it's going slowly. I plan on spending every free minute of this coming Thursday on my Foundation work though. That should get me along the way nicely.



Travelling to Brussels, teaching a course

2004-02-10 08:01:00

Ah! This feels so incredibly good! ^_^

Today I'm travelling to Brussels, instead of heading off to the office like any other day, to give a short course to our IT colleagues over there. We're busy on a very exciting (and tiring) project which involves migrating hundreds of servers from London, over to the EU mainland. These servers will be placed within domains which involve a certain piece of security software that we use at $CLIENT, and the course I'm about to give covers just that!

Anyway. Not to delve too much into our company politics :) The reason I'm feeling so well this morning (it's about 8:30 now) is because I get to take the Thalys train into Brussels! This involves getting up at five in the morning, riding a luxury cab to Schiphol airport and then getting on the train around 7:15. $CLIENT even sprang for a first class ticket for me! So that means that I get to sit in a _very_ comfy seat, while working on the company's laptop and getting pampered by two lovely ladies. Don't you just _love_ a good, free breakfast?!

Speaking of pampering: I just booked a cab ride in Brussels _from_ the train! ^_^ This is so weird! I just can't help feeling giddy with excitement. (Gee Cailin! I guess you don't get around much, do you?!)

And speaking of laptops: right now I'm working on this HP Omnibook I borrowed from the company. It's running NT4, so it's both slow and unstable : ( But my experiences during the last two weeks have led me to decide that I seriously want a laptop of my own. Preferably an iBook of course! It's unbelievable how bloody useful these contraptions are and the amount of work I can get done with them while on the road!

Aniway, I'd better get back to work now! I'll be arriving at Brussels around 9:30, so I'd better review my course material one more time *shudder*

Cheers!

