Set Up Exadata for Cloud Control 12.1.0.2

I recently helped set up an Exadata X2-8 Database Machine with the latest version of OEM Cloud Control (12.1.0.2). A few documents do exist for this process, the most useful of which are the Exadata Discovery Cookbook and the Setup Automation Kit. However, I found a few inconsistencies and problems; I think the existing documents were written for older versions of OEM and older versions of the tools. Also, there are some additional steps for older Exadatas which didn’t apply to my case.

I’m publishing my final procedure here with hopes that it helps you, but as always please cross-reference this with the appropriate documentation before doing anything in your own environment.

At this customer’s request we also configured SNMP to integrate alerting with another system – in their case, Exadata-related alerts will be raised in BMC Event Manager. I’ve also included the steps I followed to enable this; it should be easy enough to tweak my procedure here for any SNMP-compatible monitoring system.

Steps to set up Exadata with Cloud Control 12c

IMPORTANT NOTE: the steps provided here are NOT a substitute for reviewing relevant documentation. Future environments might be different at any level (exadata, patches, agents, etc.) and require a fresh review of documentation for changes.

  1. Download Exadata Discovery Cookbook: http://www.oracle.com/technetwork/oem/exa-mgmt/em12c-exadata-discovery-cookbook-1662643.pdf
  2. Download OEM Setup Automation Kit (Note 1440951.1 -> Patch 14628061)
  3. Verify the Exadata and OEM Pre-Requisites. I found three good pre-requisite lists: (1) the Cookbook itself, (2) the Setup Automation Kit README and (3) Oracle Support Note 1437434.1. There’s a lot of overlap in these three lists but it’s worth checking all three documents because some new updates have only gotten into one or two of them.

    For this client, there were a handful of issues that we had to get resolved:
    • Major Issue – OEM Self-Update: If your OEM server is not 64-bit Linux then it doesn’t have the 64-bit Linux agent in its library by default. This procedure relies on OEM’s agent deployment capabilities, so you need to add the 64-bit Linux agent to the OEM software library using OEM’s “self-update” feature.

      This client’s OEM server was hosted on AIX. Furthermore, they have somewhat restrictive network security policies. OEM’s self-update feature relies on the OEM server directly accessing some of Oracle Corp’s internet services; we submitted organizational security requests for this access.

      Offline Update: While waiting for network access, I gave offline update [OEM manual] a try. OEM gave me a URL and I was able to download the file. After that, there seem to be two choices in 12.1.0.2 for getting the file into the OEM repo. I think the manuals are wrong for this section; I never got either choice to work.

      1) The web interface asked me to upload the file directly, which isn’t mentioned in the manual. The file uploaded all right, then it claimed that a job was submitted to process the file. However, I could not find any evidence that a job ever ran. I did notice that the local agent seemed to be unreachable.

      2) I also tried following the command-line process outlined in the manual. I received the error: “Specified file is not a valid Self Update catalog file. Please check and try again with a valid file.” I searched for this error in both Metalink and Google and found no meaningful results.

      Online Update: This work was spread over a period of time. Before I had a chance to finish troubleshooting the offline update, our network access request came through. Following guidance from Support Note 1457376.1 (which Alex G found) we requested access for these hostnames:

      aru-akam.oracle.com
      ccr.oracle.com
      login.oracle.com
      support.oracle.com
      updates.oracle.com

      After network access was granted, online update worked flawlessly – although you do need to remember that it’s a two-step process: (1) download the file and (2) “apply” the file to the repository.

    • Minor Issues – acquire all needed passwords, create missing accounts/passwords, ssh ciphers: First, it’s useful to remember that the default password for nearly every account and device in an exadata database machine is the same password. This is terribly useful; I’m so glad Oracle took the time to make this consistent. If nobody knows the password for some obscure device, try the default one. (Which I’m sure you already know!)

      I only had to create two new passwords: (1) a new account on the ILOM, which is easy and well-documented in pre-req notes and (2) a new password for the DBSNMP account in one existing database on the system.

      The pre-req notes also say you should make sure sshd explicitly lists certain ssh ciphers; I suspect this is mainly for some older exadata database machines. On my machine, the cells all matched the docs exactly but the compute servers didn’t have a cipher line in the sshd config. All the required ciphers are included by default but I went ahead and added an explicit line anyway.
  4. Update file /opt/oracle.SupportTools/onecommand/em.param – the automation kit will use values here. I had to update the values for OMS_HOST, OMS_PORT and EM_USER.
  5. Extract kit to exadata node 1 and run the kit as root. Ignore the cookbook instructions because they’re out-of-date; use the README from the kit. I just typed “perl setupem.pl” to run the kit.

    The kit is fantastic and I highly recommend it. I can also vouch for its rollback capabilities, which are excellent. I had to roll back the entire process after I’d finished it the first time in order to change the user it was installed under; it worked flawlessly. Just make sure to pay attention and follow all the manual instructions carefully whenever they are given!

    A few notes from this particular client:
    • I skipped the exachk run. We had a very old version installed, I had problems running it even manually and I didn’t have time/scope allocated to update it. In this case I manually verified everything related to the OEM setup and spent a lot of time making sure I had been thorough. In general, though, it’s definitely best practice to have an up-to-date version of exachk installed and to use it.
    • I received a DNS error “script can’t get domain” which was safe to ignore because I had already verified the correct domain manually.
    • I received an infiniband error “script can’t SSH into switches” which was safe to ignore because I had already verified correct IB firmware versions manually.

    The kit/script automates the entire process of getting agents installed properly onto exadata, including the exadata-specific extensions. After you finish this script, you’re 75% finished with the whole thing!

  6. Follow directions and screenshots from the cookbook to perform exadata, cluster and database discovery through the guided processes. I found the cookbook to be reliable for this part. It’s pretty simple and the screenshots are quite nice.

    One note: during exadata discovery, you are presented with a list of which hosts act as primary monitors for various devices. Take note of which host you accept as default for the IP (Cisco) switch.

    After guided discovery, the cookbook will also walk you through configuring the switch to forward SNMP traps. You’ll need to remember which host you configured as the primary monitor!

    For this particular client we had one additional bump: I wasn’t able to acquire the password for the unix user who owned the agent, and it couldn’t be changed either. (!!) The agent password is not needed for agent deployment since the kit runs as the root user. However it is needed for guided discovery. To get around this, I backed up /etc/shadow and changed the password just for the short time I was running discovery. I restored the shadow file and thus the original password after I finished.
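
The network access requested for online self-update in step 3 can be confirmed from the OEM server before trying the update again. The following is only a rough sketch: the hostnames are the ones listed above, but the port (443) and the function names are assumptions made for illustration, and a site that routes traffic through a proxy would need a different kind of check.

```python
import socket

# Hostnames listed in this post for OEM online self-update.
SELF_UPDATE_HOSTS = [
    "aru-akam.oracle.com",
    "ccr.oracle.com",
    "login.oracle.com",
    "support.oracle.com",
    "updates.oracle.com",
]


def can_connect(host, port=443, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds.

    Port 443 is an assumption (HTTPS); adjust it to whatever your
    security team actually opened.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def check_hosts(hosts, port=443, timeout=5.0):
    """Map each hostname to True (reachable) or False (blocked or unresolved)."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling `check_hosts(SELF_UPDATE_HOSTS)` on the OEM server returns a dictionary with one entry per hostname; any `False` value suggests the firewall request has not fully come through yet. A successful TCP connection only shows the path is open, it does not prove that self-update itself will work.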

At this point the Exadata is configured with OEM Cloud Control! Really, it’s not that complicated; the tough part is just knowing where to start and then getting all the right pieces in the right places.
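
One of the pre-requisites from step 3, the explicit cipher line in the sshd config, is tedious to eyeball across many compute servers and cells. Here is a minimal sketch of how the check could be scripted. It only reads the text of an sshd_config file; the function names are made up for this example, and the list of required ciphers must come from the pre-req notes (it is deliberately not reproduced here). It also ignores the `+`/`-` list modifiers that newer OpenSSH versions accept.

```python
def explicit_ciphers(config_text):
    """Return the cipher list from the first active Ciphers line, or None.

    sshd keywords are case-insensitive and may be separated from their
    argument by whitespace or by a single '='.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "ciphers":
            return [c.strip() for c in parts[1].split(",") if c.strip()]
    return None


def not_explicitly_listed(config_text, required):
    """Return the required ciphers that no explicit Ciphers line names.

    If there is no Ciphers line at all, every required cipher is returned,
    even though sshd may still offer it by default.
    """
    found = explicit_ciphers(config_text)
    if found is None:
        return list(required)
    return [c for c in required if c not in found]
```

For example, passing the contents of /etc/ssh/sshd_config together with the required list from the pre-req notes returns an empty list when every required cipher is named explicitly. On a server with no cipher line, like the compute servers described above, it returns the whole required list.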

Steps to set up SNMP Integration for 3rd Party Alerting

Version 12c of OEM Cloud Control introduces some key new features around “Incident Management”. Before configuring SNMP notifications, it is very important to understand the underlying concepts around this feature.

Good starting points are chapters 3 and 4 of the Cloud Control Administrator’s Guide:
http://docs.oracle.com/cd/E24628_01/doc.121/e24473/toc.htm

After you understand OEM Incident Management, continue with these instructions to configure SNMP traps and create a test event:

  1. Find your MIB by logging into the OEM server and getting the file $OMS_HOME/network/doc/omstrap.v1

    The MIB and Events are documented in Appendices A, B and C of the Administrator’s Guide. The team that manages your 3rd party alerting system will probably want to import this MIB.

    Note: our client did have one issue here. It seems that their current BEM tool allows a maximum of 30 slots for storing variables on each event class. Unfortunately the MIB from OEM has one event class with 70 variables. (!!) The Event Class in question is “SNMP_oraEMNGEvent” and can be found starting at line 1314 of our 12.1.0.2 MIB file. We asked Oracle and of course the response was: “there’s important info after field 30 and there’s no workaround.” Nice. However, we are still moving forward with this client (ignoring fields after 30 for that event class) and we’re going to see whether it causes any major problems in practice.
  2. Log in to the OEM web console
  3. Navigate to Setup -> Notifications -> Notification Methods
  4. Choose to Add SNMP trap targets using the guided wizard. You will need to know the IP addresses or hostnames of your SNMP host. Also at this point input the community if it’s not the default value of “public”.
  5. Navigate to Setup -> Incidents -> Incident Rules
  6. Create a new Rule Set for testing. My test ruleset:
    • named “test tablespace low”
    • applies only to exadata cluster database BIP
    • one rule: “exadata tablespace free”
      • applies only to 1 metric: “Tablespace Space Used (%)”
      • one action: no conditions, call both SNMP trap targets
  7. Trigger the testing ruleset which you just created:
    SQL> create tablespace testalert1 datafile '+data' size 1m;
    
    Tablespace created.
    
    SQL> create table test_obj tablespace testalert1 as select * from dba_objects where 1=0;
    
    Table created.
    
    SQL> insert into test_obj select * from dba_objects where rownum<10000;
    insert into test_obj select * from dba_objects where rownum<10000
    [...]
    
    SQL> select bytes from dba_segments where segment_name='TEST_OBJ';
    
         BYTES
    ----------
        983040
    
    SQL> select user_bytes, bytes from dba_data_files where tablespace_name='TESTALERT1';
    
    USER_BYTES      BYTES
    ---------- ----------
        983040    1048576
    
    SQL> select bytes, user_bytes, blocks, user_blocks from dba_data_files where tablespace_name='TESTALERT1';
    
         BYTES USER_BYTES     BLOCKS USER_BLOCKS
    ---------- ---------- ---------- -----------
       1048576     983040        128         120
  8. Wait several minutes, then navigate in OEM to Enterprise -> Monitoring -> Incident Manager
  9. The standard view is selected by default. Look for the incident “Tablespace [TESTALERT1] is [100 percent] full” in the right pane and click on this incident to select it.
  10. In the bottom pane, click on the EVENTS tab. Click on the Latest Event to open its details.
  11. The General tab is open by default. In the Last Comment field, verify that the SNMP trap was sent.
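
If OEM reports that the trap was sent but nothing shows up in the 3rd party tool, it can help to check whether trap packets reach the SNMP host at all. The sketch below is a bare UDP listener written for that purpose, with hypothetical function names. It does not decode SNMP; it only reports who sent a datagram and how large it was. Port 162 is the standard trap port, binding to it normally requires root, and it will fail if the real trap receiver is already listening there.

```python
import socket


def open_trap_socket(bind_addr="0.0.0.0", port=162):
    """Bind a UDP socket on the given address and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def wait_for_datagram(sock, timeout=60.0):
    """Return (sender_ip, payload_bytes) for the next datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return sender_ip, payload
```

A typical use would be to call `open_trap_socket()` on the SNMP host, trigger the test ruleset, and then call `wait_for_datagram(sock, timeout=600)`. A result whose sender address matches the OEM server suggests the network path is fine and the problem lies in the receiving tool's configuration.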

And that’s it. This worked for us – let us know if it worked for you or if you did anything differently!

About Jeremy Schneider

Doing stuff with Oracle Database, Performance, Clusters, Linux. about.me/jeremy_schneider
