>
Oracle, Technical

Grid/CRS AddNode or runInstaller fails with NullPointerException

Posting this here mostly to archive it, so I can find it later if I ever see this problem again.

Today I was repeatedly getting this error while trying to add a node to a cluster:

(grid)$ $ORACLE_HOME/oui/bin/addNode.sh -silent -noCopy CRS_ADDNODE=true CRS_DHCP_ENABLED=false INVENTORY_LOCATION=/u01/oraInventory ORACLE_HOME=$ORACLE_HOME "CLUSTER_NEW_NODES={new-node}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={new-node-vip}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 24575 MB    Passed
Oracle Universal Installer, Version 11.2.0.3.0 Production
Copyright (C) 1999, 2011, Oracle. All rights reserved.

Exception java.lang.NullPointerException occurred..
java.lang.NullPointerException
        at oracle.sysman.oii.oiic.OiicAddNodeSession.initialize(OiicAddNodeSession.java:524)
        at oracle.sysman.oii.oiic.OiicAddNodeSession.<init>(OiicAddNodeSession.java:133)
        at oracle.sysman.oii.oiic.OiicSessionWrapper.createNewSession(OiicSessionWrapper.java:884)
        at oracle.sysman.oii.oiic.OiicSessionWrapper.<init>(OiicSessionWrapper.java:191)
        at oracle.sysman.oii.oiic.OiicInstaller.init(OiicInstaller.java:512)
        at oracle.sysman.oii.oiic.OiicInstaller.runInstaller(OiicInstaller.java:968)
        at oracle.sysman.oii.oiic.OiicInstaller.main(OiicInstaller.java:906)
SEVERE:Abnormal program termination. An internal error has occured. Please provide the following files to Oracle Support :

"Unknown"
"Unknown"
"Unknown"

There were two notes on MOS related to NullPointerExceptions from runInstaller (which is used behind the scenes for addNode in 11.2.0.3 on which I had this problem). Note 1073878.1 describes addNode failing in 10gR2, and the root cause was that the home containing CRS binaries was not registered in the central inventory. Note 1511859.1 describes attachHome failing, presumably on 11.2.0.1 – and the root cause was file permissions that blocked reading of oraInst.loc.

Based on these two notes, I had a suspicion that my problem had something to do with the inventory. Note that you can get runInstaller options by running “runInstaller -help” and on 11.2.0.3 you can debug with “-debug -logLevel finest” at the end of your addNode command line. The log file is produced in a logs directory under your inventory. However in this case, it produces absolutely nothing helpful at all…

After quite a bit of work (even running strace and ltrace on the runInstaller, which didn’t help one bit)… I finally figured it out:

(grid)$ grep oraInst $ORACLE_HOME/oui/bin/addNode.sh
INVPTRLOC=$OHOME/oraInst.loc

The addNode script was hardcoded to look only in the ORACLE_HOME for the oraInst.loc file. It would not read the file from /etc or /var/opt/oracle because of this parameter.

On this particular server, there was not an oraInst.loc file in the grid ORACLE_HOME. Usually the file is there when you do a normal cluster installation. In our case, it’s absence was an artifact of the specific cloning process we use to rapidly provision clusters. As soon as I copied the file from /etc into the grid ORACLE_HOME, the addNode process continued as normal.

Sometimes it would be nice if runInstaller could give more informative error messages or tracing info!

About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider

Discussion

Comments are closed.

Disclaimer

This is my personal website. The views expressed here are mine alone and may not reflect the views of my employer.

contact: 312-725-9249 or schneider @ ardentperf.com


https://about.me/jeremy_schneider

oaktableocmaceracattack

(a)

Enter your email address to receive notifications of new posts by email.

Join 68 other subscribers