As I already said in my last post about “Can’t install ohasd service”, setting up Oracle Clusterware 11.2.0.4 on SuSE Linux Enterprise Server (SLES) 11 SP2 should work flawlessly, but sometimes it does not. This time, the USM drivers were the problem.
USM driver install actions failed
/u01/app/grid/11.2.0/perl/bin/perl -I/u01/app/grid/11.2.0/perl/lib
-I/u01/app/grid/11.2.0/crs/install
/u01/app/grid/11.2.0/crs/install/rootcrs.pl execution failed
USM drivers are components (Kernel object files, extension .ko) that enable ACFS. I don’t use ACFS on this system, but root.sh (in fact, rootcrs.pl) still needs a decent directory structure matching the running Linux Kernel version. Again, the log file “$GRID_HOME/cfgtoollogs/crsconfig/rootcrs_<hostname>.log” was my friend: it revealed that the problem was related to loading oracleoks.ko. This file lives in the directory “$GRID_HOME/install/usm/Novell/SLES11/x86_64/<your-kernel-version>/default/bin”. The trouble is that good old SLES 11 SP2 has a Kernel that was not foreseen by the Oracle folks who implemented this piece of software.
oracle@lx01:~> cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
oracle@lx01:~> uname -a
Linux iwacslx01 3.0.101-0.5-default #1 SMP <...> x86_64 GNU/Linux
oracle@iwacslx01:/u01/app/grid/11.2.0/install/usm/Novell/SLES11/x86_64> (asm)> ll
total 16
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 2.6.27.19-5
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 2.6.32.12-0.7
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 3.0.13-0.27
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 3.0.61-0.9
<no such number, no such zone … *sing*>
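The check above can be scripted. The following is only a minimal sketch: the function names kernel_base and has_usm_dir are made up for illustration, and it assumes that the Kernel release always ends in a flavor suffix such as “-default”, as it does on this SLES system.

```shell
#!/bin/sh
# Sketch only: kernel_base and has_usm_dir are hypothetical helper names.

# Strip the trailing flavor suffix from a `uname -r` string, so that
# "3.0.101-0.5-default" becomes "3.0.101-0.5".
# Assumption: the release string always ends in "-<flavor>".
kernel_base() {
    printf '%s\n' "${1%-*}"
}

# Succeed if a driver directory for the given Kernel release exists.
#   $1: USM root, e.g. $GRID_HOME/install/usm/Novell/SLES11/x86_64
#   $2: Kernel release as printed by `uname -r`
has_usm_dir() {
    usm_root=$1
    kver=$(kernel_base "$2")
    [ -d "$usm_root/$kver" ]
}

# Example call (not executed here):
#   has_usm_dir "$GRID_HOME/install/usm/Novell/SLES11/x86_64" "$(uname -r)" \
#       || echo "no USM driver directory for this Kernel"
```

On the system shown above, the call would fail for 3.0.101-0.5-default, because only four other versions were shipped.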
What would every Linux admin in the world think? Let’s create a matching symlink. But My Oracle Support (MOS) is your friend: Doc ID 1590701.1 explains this issue for another Kernel version (of SLES 11 SP1, I assume). The fact you have to know, and which the note only tells you when you read it carefully, is that the link target has to be 3.0.13-0.27, since something in directory 3.0.61-0.9 seems to be broken.
And it’s more complicated than that. Oracle Clusterware registers a Kernel version for its modules somewhere. If that information is wrong, the modules are taken from an entirely different directory. So check this as well:
oracle@lx01:~> (asm)> $ORACLE_HOME/bin/acfsdriverstate version
ACFS-9325: Driver OS kernel version = 2.6.32.12-0.7-default(x86_64).
ACFS-9326: Driver Oracle version = 130707.
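To compare the registered version with the running Kernel, the version string can be pulled out of the ACFS-9325 line. This is a rough sketch that relies on the message format shown above staying the same. The function name driver_kernel_version is made up for illustration.

```shell
#!/bin/sh
# Sketch only: driver_kernel_version is a hypothetical helper name.
# Reads acfsdriverstate output on stdin and prints the Kernel version from
# the ACFS-9325 line, without the trailing "(x86_64)." part.
driver_kernel_version() {
    sed -n 's/^ACFS-9325: Driver OS kernel version = \([^(]*\)(.*$/\1/p'
}

# Example call (not executed here):
#   registered=$("$ORACLE_HOME/bin/acfsdriverstate" version | driver_kernel_version)
#   [ "$registered" = "$(uname -r)" ] || echo "driver version mismatch: $registered"
```

In the case above, the function would print 2.6.32.12-0.7-default, which clearly is not the Kernel reported by uname.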
So what to do about it? MOS says: drop that directory or move it away. Wow. What a strategy, but it works…
oracle@lx01:/u01/app/grid/11.2.0/install/usm/Novell/SLES11/x86_64> (asm)>
mv 2.6.32.12-0.7 2.6.32.12-0.7.shipped
Now let’s create the symlink we have all been longing for, and we are done. Remember, the link name is the Kernel version from uname (without the “-default” suffix), and it points to 3.0.13-0.27 (because of fog at the other airport).
oracle@lx01:/u01/app/grid/11.2.0/install/usm/Novell/SLES11/x86_64> (asm)>
ln -s 3.0.13-0.27 3.0.101-0.5
Now the directory looks like this:
oracle@lx01:/u01/app/grid/11.2.0/install/usm/Novell/SLES11/x86_64> (asm)> ll
total 16
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 2.6.27.19-5
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 2.6.32.12-0.7.shipped
lrwxrwxrwx 1 oracle oragrid 11 Jan 11 18:19 3.0.101-0.5 -> 3.0.13-0.27
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 3.0.13-0.27
drwxr-xr-x 4 oracle oragrid 4096 Jan 11 17:10 3.0.61-0.9
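If you have to repeat the symlink step on several nodes, it can be wrapped in a small function. This is a sketch under my own assumptions, not something taken from the MOS note. The function name link_usm_kernel is made up, and it refuses to touch anything that already exists.

```shell
#!/bin/sh
# Sketch only: link_usm_kernel is a hypothetical helper name.
# Creates <usm_root>/<new_kernel> as a relative symlink to <target_dir>.
#   $1: USM root, e.g. $GRID_HOME/install/usm/Novell/SLES11/x86_64
#   $2: Kernel version without flavor suffix, e.g. 3.0.101-0.5
#   $3: existing directory to point to, e.g. 3.0.13-0.27
link_usm_kernel() {
    usm_root=$1
    new_kernel=$2
    target_dir=$3
    if [ ! -d "$usm_root/$target_dir" ]; then
        echo "target $usm_root/$target_dir not found" >&2
        return 1
    fi
    if [ -e "$usm_root/$new_kernel" ] || [ -L "$usm_root/$new_kernel" ]; then
        echo "$usm_root/$new_kernel already exists, leaving it alone" >&2
        return 0
    fi
    # Relative link, equivalent to: ln -s 3.0.13-0.27 3.0.101-0.5
    ( cd "$usm_root" && ln -s "$target_dir" "$new_kernel" )
}

# Example call (not executed here):
#   link_usm_kernel "$GRID_HOME/install/usm/Novell/SLES11/x86_64" 3.0.101-0.5 3.0.13-0.27
```

The link is created relative to the USM directory, so it stays valid if the Grid home is ever copied or moved as a whole.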
After this, re-running root.sh was successful. Wait a minute, re-running? What about cleaning up first, like we had to do for ages? Oracle Grid Infrastructure has an improvement here: root.sh is now checkpointed and only repeats the steps that did not work before. Really nice!
Again, good luck,
Martin
PS: Another solution would be the patch for BUG:17475946 – ROOT.SH OR ACFSROOT INSTALL, FAILS: ACFS-9109: SLES11 SP3
I’m not sure if it works for SP2 as well. But it’s fixed in 12c.