Ardent Performance Computing

Jeremy Schneider

OSP: Overview

Posted by Jeremy ⋅ October 31, 2013 ⋅ 1 Comment

This is the second of twelve articles in a series called Operationally Scalable Practices. You can read the introduction in the first article. In short, this series offers helpful suggestions for younger organizations and newer DBAs to best position them for very large-scale growth.

Before getting into specifics, we will lay out a general overview of the content. I expect this overview to be revised the most as the series is refined over time – so check periodically to see if there have been updates!

First, one quick note: my single overarching principle is simplicity. Your specific implementation of a guideline offered here will depend on your company size and will change as your company grows. A small team can get a lot of mileage out of online collaborative spreadsheets for inventory or deployment checklists. I recommend a bias toward simpler platforms like this instead of big enterprise software solutions – the key is that you’re *doing* the inventory or checklist itself. If you keep growing, sophisticated tools will eventually become necessary, but they’re complicated and costly, and they add overhead to your team’s activities. Always try to find the simplest possible techniques and tools for your team as you implement the guidelines herein!

And now, here’s the outline of the next ten articles. The outline in itself is actually a great checklist as you think about your operations. It’s ambitious – hopefully I’ll be able to follow through with getting this written!

The Foundation

Keep documentation & processes somewhere with change history
Use checklists for general tasks, maintenance and deployments
- Start simple, grow slowly into more sophisticated systems (e.g. ticketing or release management)
Make sure your basics are solid: monitoring (both email & paging), backups, inventory
- Monitor what matters to your business from end-to-end (networks, applications, databases, storage)
- Actively manage paging events and thresholds, no “expected & non-critical” pages

Build a Standard Platform from the Bottom-Up (Part 1) (Part 2) (Part 3)

Define storage
- Start small & simple (local RAID)
- Minimum two independent volumes per server (recovery area)
- Set expectations for capacity, IOPS and throughput
- Define a growth path for the next year or so
- Morle’s SANE SAN is still worth reading for storage appliances
- Tread carefully with newer, less-understood, non-traditional storage (dedupe, serialized, etc)
Calculate or choose a rough core-to-memory ratio
- Need extra memory for consolidation
- This standard core count becomes minimum unit of licensing; balance cost implications with technical considerations
Discuss network communication for inter-DB traffic and backup traffic (DG, GG, DP, environment refreshes, etc)
Suggestions for defining slots per server
Suggestions for defining workload per slot
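
To make the idea of slots per server concrete, here is a minimal sketch of the arithmetic, not the method from the later articles. It assumes a slot is a fixed bundle of cores and memory with no overcommit, and that some memory is held back for the OS; the names `ServerSpec`, `SlotSpec`, `slots_per_server` and `reserved_memory_gb` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    cores: int
    memory_gb: int


@dataclass(frozen=True)
class SlotSpec:
    # Hypothetical unit of capacity: the core-to-memory ratio of a slot
    # is simply cores : memory_gb.
    cores: int
    memory_gb: int


def slots_per_server(server: ServerSpec, slot: SlotSpec,
                     reserved_memory_gb: int = 0) -> int:
    """Return how many whole slots fit, limited by the scarcer resource.

    Assumes no CPU or memory overcommit; reserved_memory_gb is memory
    held back for the OS and agents (an assumption, not from the article).
    """
    if slot.cores <= 0 or slot.memory_gb <= 0:
        raise ValueError("slot sizes must be positive")
    usable_memory = max(server.memory_gb - reserved_memory_gb, 0)
    by_cores = server.cores // slot.cores
    by_memory = usable_memory // slot.memory_gb
    return min(by_cores, by_memory)


if __name__ == "__main__":
    server = ServerSpec(cores=16, memory_gb=128)
    slot = SlotSpec(cores=2, memory_gb=16)
    print(slots_per_server(server, slot, reserved_memory_gb=8))
```

In this example the server has enough cores for eight slots, but holding back memory for the OS leaves room for only seven, which is one way to see why consolidation needs extra memory.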

Build a Standard Cluster Platform (Part 1)

When to consider clusters and when not to consider clusters
- Reasons to consider: parallel or distributed processing, fault tolerance, incremental growth, pooled resources for better utilization
- Reasons not to: expensive software licenses, required application modifications, complexity that drives very expensive training and hiring
There are many ways to cluster (not just RAC)
Clustering with Oracle will require shared storage of some kind (DAS, NAS, NFS, SAN)
Individual services or applications can be active-active or active-passive independently
Keep it simple – for example, with slots and pools

Suggestions on Naming

Servers
Virtual Machines
DNS, NIS, Domains
Database, SID, etc
Users (NIS, OS, DB)
Cluster, DG config
Services
SQLNET (tnsnames)
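
Whatever naming scheme you settle on, it helps if a script can check it. The sketch below assumes a purely hypothetical server naming convention, `<env>-<app>-db<NN>` (for example `prd-sales-db01`); the pattern, the environment codes and the function name `parse_server_name` are illustrative assumptions, not recommendations from this series.

```python
import re
from typing import Optional

# Hypothetical convention for illustration only: <env>-<app>-db<NN>,
# e.g. "prd-sales-db01". The article does not prescribe this format.
SERVER_NAME = re.compile(
    r"(?P<env>prd|stg|dev)-(?P<app>[a-z][a-z0-9]*)-db(?P<seq>\d{2})"
)


def parse_server_name(name: str) -> Optional[dict]:
    """Split a server name into its parts, or return None if it does
    not follow the (hypothetical) convention."""
    match = SERVER_NAME.fullmatch(name)
    if match is None:
        return None
    return {
        "env": match["env"],
        "app": match["app"],
        "seq": int(match["seq"]),
    }


if __name__ == "__main__":
    for candidate in ("prd-sales-db01", "PRD-Sales-DB1"):
        print(candidate, "->", parse_server_name(candidate))
```

A check like this can run against the inventory spreadsheet export, so that names which drift from the convention are caught early.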

Design Small yet Ready for MAA

Consolidate high in the stack
Scale up before out
Security topics: minimize privileges and roles
Maximize what you already bought, minimize expensive extras
- Express Edition, Personal Edition, Standard Edition
- Standby for fault tolerance (Data Guard, Dbvisit)
- Other tools – especially for standard edition (MOS, sash, etc)
ASM vs FS, OMF
- Suggestions on DB file paths, names, extensions for all file types
Standardize on RMAN with a backup agent, hot backups, BCT
- Use storage snapshots and standby DBs for backups when justified
Create databases ready for adding features later with minimal change
- Physical replication (standby)
- Logical replication (GoldenGate, etc)
- Encryption
- RAC
Justify non-default config
Strongly justify hidden config

Managing Software (install/patch/upgrade)

Plan for patching
Suggestions for users and groups
- OS users & groups (ASM and CRS, individual accounts)
- DB users (including sysdba, sysasm, sysoper)
- Account privileges
- Records and auditing
Suggestions for directory layouts
- OFA
- Using (or not using) symbolic links
- Installing cluster software on local storage
Server time zones and time synchronization
Repackaging Oracle binaries
- Versioning your packages

Managing Configuration & Scripts

Centralize
Use version control (a.k.a. revision control, source control or change control)
Configuration management and automation
Configuration that needs managing
- Database configuration (e.g. tnsnames, wallets, init files, pwfiles)
- OS configuration (e.g. hugepages, udev/asmlib)
Files and directories that need cleanup/housekeeping
Consistent management of jobs and schedules (cron, DB, OEM, backup software, etc)
- Built-in DB maintenance windows and jobs
Management & administration GUIs
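
As one small illustration of the housekeeping item above, here is a minimal sketch that lists files older than a given age so a cleanup job can review them first. It only reports and never deletes. The function name `stale_files`, the `*.trc` pattern and the example directory are assumptions for illustration; adapt them to your own directory layout.

```python
import time
from pathlib import Path
from typing import Iterator, Optional


def stale_files(directory: Path, max_age_days: float,
                pattern: str = "*",
                now: Optional[float] = None) -> Iterator[Path]:
    """Yield regular files in `directory` (non-recursive) that match
    `pattern` and were last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    for path in sorted(directory.glob(pattern)):
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path


if __name__ == "__main__":
    # Hypothetical trace directory; replace with a real path before use.
    trace_dir = Path("/tmp/example_trace_dir")
    if trace_dir.is_dir():
        for old_file in stale_files(trace_dir, max_age_days=30, pattern="*.trc"):
            print("candidate for cleanup:", old_file)
```

Keeping a script like this in version control alongside the rest of your configuration means the retention period is documented and has change history.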

Operations Team Processes

Using automation and scripting (deployment, db create, backups, schema/user creation, config)
Utilize backups to move data
Trust Oracle not Google (when it comes to Oracle software)
- Cite references in process documentation
- Official documentation should be the source for 99% of processes
- MOS is a secondary source; notes can have problems but may also be more detailed and up-to-date than the official documentation
- Heavy internal testing and documentation is the only acceptable tertiary source
- Blogs, forums, wikis and other public websites are never an acceptable source for processes

Operations Team Calendar

Assess quarterly patches
Schedule recovery exercises
Schedule failover exercises
The importance of vacations

Attitude and Culture

Vendor support services are your ally not your enemy – even if you’re a small company
Value soft skills (personality insights, negotiation, other interpersonal communication skills)
Value learning & education
- Encourage a culture of experiments, tests, evidence and proof
- Always cite references
Understand business priorities: customer relationships, vendor relationships, cost, time, budgeting, negotiation, complexity, manpower…
Don’t get too cynical or too comfortable; expect business first
Importance of breaks and hobbies

This covers a lot of ground – but I still see a few potential gaps. In particular, I’m still getting up to speed on some of the latest updates to Oracle’s database. For those of you who are working with 12c – is there anything we should do differently to be prepared for upcoming changes? For example, I’m still contemplating whether this blueprint should be tweaked for multitenant or far sync readiness. If you have any thoughts then please share them!

About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider

Discussion

One thought on “OSP: Overview”

Jeremy, Kudos for getting this important topic started. On the “Design Small yet Ready for MAA”, consider adding “Design for Scale out” [i.e. design as if you would deploy on RAC] but “Deploy on Single node before you consider Multi-node”. I agree you should scale up before you scale out, but scaling up usually has a higher cost [think expensive HP Superdomes or IBM P-series servers] and more importantly, a hard upper limit [which is increasing but is still costly]. You might also consider my article “Oracle Scalability – Some perspectives” that was recently published in IOUG’s SELECT Journal.

Posted by jkanagaraj | December 27, 2013, 2:37 pm
