Linux, Oracle, Technical

Oracle IOPS and HBA Queue Depth

About a month ago I wrote an overview of Linux Caching and I/O Queues as they pertain to Oracle. I was working on a project to architect, install and configure the beginnings of an 8-node cluster consisting of either one or two RAC databases. During the project, while I was waiting for the OS guys to resolve some networking issues, I ran a bunch of benchmarks on the storage subsystem. Specifically, I experimented with the size of the HBA Queue Depth to see if it would make a difference in performance.

But before getting into the results, a quick overview of our configuration: it was 11g RAC on Red Hat Enterprise Linux 5; Dell servers with four dual-core Opteron chips each. The RAC cluster initially had four nodes but will grow to at least eight as the data is migrated. The system has 4Gb QLogic HBAs, a McData switch and a 3Par SAN (which is blazing fast). ASM (no CFS) and dedicated Oracle Homes. The first spec had an InfiniBand interconnect but after a teleconference with Alex from Pythian discussing the project’s specific requirements, the spec was updated to use redundant Gigabit Ethernet.

Picking up where I left off: the default limit set by the Linux qla2xxx driver for concurrent I/O requests on QLogic cards (32 per LUN) is conservative. So can I increase performance by increasing this limit? The best way to answer a question like this is simply to try it.

Tweaking the HBA Queue Depth

With QLogic HBAs on Linux, the queue depth is configured through the ql2xmaxqdepth module option. I want to run an experiment where I vary this parameter and measure the I/O performance. To start out, I’d like to compare the default queue depth of 32 with an increased setting of 64. But how can I effectively measure I/O performance? I’m specifically interested in how Oracle’s RDBMS will perform, so I think the best tool is Oracle’s Orion benchmarking tool, which is designed to simulate database I/O patterns and measure the result.
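For reference, here’s a minimal sketch of how the setting can be changed on RHEL 5 (steps vary by distribution and driver version; since the qla2xxx driver loads from the initrd, the initrd needs to be rebuilt for the option to take effect at boot):

# echo "options qla2xxx ql2xmaxqdepth=64" >> /etc/modprobe.conf
# mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
# reboot

After the reboot, on most kernels the active value can be verified through sysfs:

# cat /sys/module/qla2xxx/parameters/ql2xmaxqdepth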

Now there are two ways to measure I/O performance: IOPS and MBPS.

  1. When you measure IOPS you’re usually investigating small I/O operations, stressing the per-request overhead of each read or write. Throughput is not the main concern when you measure IOPS. This is most relevant on transactional systems.

  2. When you measure MBPS you’re usually investigating large I/O operations, stressing the overall throughput. Latency is not the main concern when you measure MBPS. This is most relevant on warehouse and analytical systems. (The two metrics are related by the I/O size – see the quick arithmetic below.)
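The relationship is simple: MBPS = IOPS × I/O size. A couple of illustrative (made-up) numbers show why small-block workloads are IOPS-bound while large-block workloads are throughput-bound:

    10,000 IOPS of 8 KB single-block reads  =  ~80 MB/s
       100 IOPS of 1 MB multi-block reads   =  100 MB/s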

For more reading, James Koopmann just wrote a few good articles over at Database Journal about getting IOPS/MBPS measurements from existing databases and the relationship between IOPS/MBPS and vendor-supplied disk specs.

Now this particular project’s database will back several high-volume websites; the traffic is probably only 10-20% writes and 80-90% reads. However, much of the data changes fairly frequently and it is definitely an OLTP workload – almost entirely index-based reads. On the AWR report from a current production database, scattered reads barely registered while sequential reads were by far the most significant wait event. This means that we need to optimize this storage system – and our benchmark – for IOPS rather than raw throughput.
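If you want to do the same sanity check on your own system, here’s a rough sketch (assuming sqlplus is on the path and you have SYSDBA access) that compares the cumulative single-block and multi-block read waits since instance startup:

$ sqlplus -s / as sysdba <<'EOF'
SELECT event, total_waits,
       ROUND(time_waited_micro / 1000000) AS seconds_waited
  FROM v$system_event
 WHERE event IN ('db file sequential read', 'db file scattered read');
EOF

If “db file sequential read” dominates, the workload is single-block and IOPS-bound; if “db file scattered read” dominates, it leans toward multi-block throughput.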

Orion Results

The basic idea behind Orion is pretty simple. Orion is designed to simulate a mixed workload between single-block reads and multi-block reads. You give it a bunch of parameters that describe your environment – and then it runs its test over and over again, varying the balance between single-block and multi-block reads while measuring the IOPS, latency and MBPS. One important point: Orion varies the balance by changing the number of threads that are concurrently doing reads/writes. Unlike swingbench or hammerora, there is no think time – each thread constantly tries to do I/O.

Since the database for this project is nearly 100% single-block reads, I decided to just run a “basic” matrix – which doesn’t test mixed workloads. It runs about 45 tests with varying levels of concurrency, from 1 thread to 500 threads, for single-block reads and the same for multi-block reads. (I just ignored the multi-block results.) I instructed Orion to do 15% write operations and 85% read operations.

# ./orion_linux_em64t -run advanced -testname simple -num_disks 100 -write=15 -matrix=basic -verbose
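One detail that’s easy to miss: Orion expects a file named <testname>.lun in its working directory (here, simple.lun) listing the devices to test, one path per line. A hypothetical example (these device names are made up):

/dev/mapper/lun001
/dev/mapper/lun002
/dev/mapper/lun003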

I ran this benchmark four times in a row. Before each run I changed the queue depth and rebooted the server. I alternated between 32 and 64, using each queue depth twice. The result was a consistent 7% improvement in IOPS and latency by doubling the queue depth.

[Charts: IOPS for different queue depths; latencies for different queue depths]

What’s the Best Setting for Queue Depth?

That depends on how many clients are accessing the storage device. Do not max out the queue depth on all your servers based only on this article. Remember from the first article that the storage device’s FC port can only process a limited number of requests concurrently. If you have a large number of hosts accessing the same storage array and you increase the queue depth on all of them, then you will start seeing the dreaded SCSI QUEUE FULL errors! However, if there are only a handful of clients and you know that you won’t be adding more, then you can certainly increase this parameter and get the associated performance boost at peak workload.

To get the optimal value, consult the manuals or support channels for your storage system to find out the queue depth of its target ports. Factor in the number of clients accessing the array, leave some headroom for safety, and you can then work out a sensible value for each server.
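As a rough illustration of the arithmetic (all numbers hypothetical – check your array’s documentation for the real port limit):

    target port queue limit:  2048 outstanding commands
    hosts sharing the port:      8
    LUNs presented per host:     4

    safe per-LUN queue depth  =  2048 / (8 × 4)  =  64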

About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider

Discussion

7 thoughts on “Oracle IOPS and HBA Queue Depth”

  1. Hi Jeremy,

    Sorry for coming back late to this but I’ve been quite busy recently.
    We are just about to get our hands dirty on this new cluster so I don’t know how it’s doing yet but I presume everything is OK – nobody complained so far so you must have done a good job.
    Interesting results on queue depth. Thanks for sharing that.

    … while I was waiting for the OS guys to resolve some networking issues… The first spec had an InfiniBand interconnect but….

    Now imagine how long the wait could have been if it were not simple Gigabit Ethernet but, indeed, InfiniBand. ;-)

    Cheers,
    Alex

    Like

    Posted by Alex Gorbachev | March 19, 2008, 9:27 pm
  2. nice work, man

    Like

    Posted by Andralg | March 25, 2008, 1:44 pm
  3. Hi Jeremy,

    While higher queue depths do give good results, they can also result in overruns. It may be better to set a per-target limit along with the LUN limits so that a target never gets pushed beyond its capabilities. On the other hand, how do you actually measure the active queue depth (peak and average)? Running iostat -x will give this information; you could also use SWAT to collate it and profile your database storage needs pretty well.

    Thanks
    Krishna Manoharan
    http://dsstos.blogspot.com

    Like

    Posted by Krishna Manoharan | April 8, 2008, 10:09 pm
  4. Hey Jeremy. My name is also Jeremy Schneider. I’m out of Oshkosh, Wisconsin. I know it sounds like a hoax, but it’s not. I’m very into technology and computers as well. Great site.

    Like

    Posted by Jeremy Schneider | April 11, 2008, 12:49 pm
  5. Hi Jeremy

    Have you tried testing with a different number of LUNs?

    Thanks


    LSC

    Like

    Posted by lishan cheng | August 13, 2008, 2:40 pm
  6. Ever try setting it to 16? Sometimes not queuing, and just sending straight to fast SAN disks, speeds things up. When you have a good number of small LUNs assigned to ASM, a lower queue depth can work better.

    Craig

    Like

    Posted by Craig | September 11, 2008, 4:29 pm
  7. How did you like the 3Par? We purchased one last year and it is quite an amazing unit, particularly if you don’t want to be managing individual disks. Blew away our EMC CLARiiON.

    Like

    Posted by Gerry Bragg | November 12, 2008, 7:19 pm

Disclaimer

This is my personal website. The views expressed here are mine alone and may not reflect the views of my employer.


