Little trick I’d seen once that might come in handy for someone.
Lets suppose that somehow you accidentally delete (“rm”) all copies of your current controlfile – and you don’t have a backup!! If the database is still running then don’t shut it down! There might actually be a way to recover the control file. Especially if you are using shared servers.
In unix when you “rm” a file it simply disconnects the inode from the directory. (The inode is the block on the disk with all the meta-information about the file.) If any process is still holding this file open then the inode – and in effect the file – will not be released as free space until all processes close the file.
So how does this help us recover a lost controlfile? Well if an Oracle process has the file open then that process can still access if even after you’ve deleted the file from unix.
I’ll give a quick demo:
Setup
First, some quick background… I’ve only got one shared server running on this database to keep the demo simple. And I’ve setup two TNSNAMES entries – one that requests a dedicated server and another that requests a shared server.
First, just to make sure, we’ll connect to the shared server and force it to open the control file.
nap01:~/product/10.2.0/db_1/network/admin$ sqlplus system/jeremy@jt10g-shared
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 3 16:50:51 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> select name from v$database;
NAME
---------
JT10G
To confirm that this the process opened a connection to the control files you can use fuser, lsof, or something like this script:
nap01:~$ ./jduser /u04/oracle/oradata/jt10g
/u04/oracle/oradata/jt10g/control01.ctl:
[7128]ora_dbw0_jt10g [7130]ora_lgwr_jt10g [7132]ora_ckpt_jt10g
[7146]ora_s000_jt10g
/u04/oracle/oradata/jt10g/control02.ctl:
[7128]ora_dbw0_jt10g [7130]ora_lgwr_jt10g [7132]ora_ckpt_jt10g
[7146]ora_s000_jt10g
/u04/oracle/oradata/jt10g/control03.ctl:
[7128]ora_dbw0_jt10g [7130]ora_lgwr_jt10g [7132]ora_ckpt_jt10g
[7146]ora_s000_jt10g
The Big Mistake
Now let’s simulate a moment of complete stupidity:
nap01:~$ cd /u04/oracle/oradata/jt10g
nap01:~/oradata/jt10g$ rm control*.ctl
Can We Get Back the Control File?
Ok. Time to update your resume. But maybe on the way out the door we could try a few things. First let’s try getting at the controlfile with a normal connection:
nap01:~/product/10.2.0/db_1/network/admin$ sqlplus system/jeremy@jt10g
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 3 16:58:12 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> alter database backup controlfile to
2 '/u04/oracle/oradata/jt10g/backup.ctl';
alter database backup controlfile to
*
ERROR at line 1:
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u04/oracle/oradata/jt10g/control01.ctl'
ORA-27041: unable to open file
SVR4 Error: 2: No such file or directory
Additional information: 3
No such luck. Now lets try it on the shared server.
nap01:~/product/10.2.0/db_1/network/admin$ sqlplus system/jeremy@jt10g-shared
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 3 17:00:06 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> alter database backup controlfile to
2 '/u04/oracle/oradata/jt10g/backup.ctl';
Database altered.
Recovering the Database
Now you don’t have a CURRENT copy of your control file but we do now have a BACKUP. It’s not as simple as just starting up the DB… so let’s quickly walk through the recovery. First we’ll grab some info about the log files, the we’ll go ahead and shutdown the database. We won’t be able to do a normal shutdown because that requires access to the controlfile; we’ll have to abort the instance.
SQL> col member format a40
SQL> select l.status, member
2 from v$logfile inner join v$log l using (group#);
STATUS MEMBER
---------------- ----------------------------------------
INACTIVE /u04/oracle/oradata/jt10g/redo03.log
UNUSED /u04/oracle/oradata/jt10g/redo02.log
CURRENT /u04/oracle/oradata/jt10g/redo01.log
SQL> shutdown abort;
ORACLE instance shut down.
Now at this point if we try to start the database back up we’ll just get those ORA-00210 errors (cannot open the specified control file). So it’s time to do some recovery. We just made a backup; first we’ll copy the backup to where the originals were:
nap01:~$ cd /u04/oracle/oradata/jt10g/
nap01:~/oradata/jt10g$ cp backup.ctl control01.ctl
nap01:~/oradata/jt10g$ cp backup.ctl control02.ctl
nap01:~/oradata/jt10g$ cp backup.ctl control03.ctl
Next we’ll mount the backup controlfiles. Because this was a backup we will need recover it before we can open the database. We checked the current log earlier – we will just need to apply that online log. Finally we will need to open RESETLOGS and start a new incarnation since we used a backup controlfile.
nap01:~/oradata/jt10g$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 3 18:09:31 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area 629145600 bytes
Fixed Size 1980712 bytes
Variable Size 171968216 bytes
Database Buffers 448790528 bytes
Redo Buffers 6406144 bytes
Database mounted.
SQL> recover database using backup controlfile until cancel;
ORA-00279: change 556824 generated at 05/03/2007 12:57:33 needed for thread 1
ORA-00289: suggestion :
/u04/oracle/product/10.2.0/db_1/dbs/arch1_2_621607779.dbf
ORA-00280: change 556824 for thread 1 is in sequence #2
Specify log: {=suggested | filename | AUTO | CANCEL}
/u04/oracle/oradata/jt10g/redo01.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.
That’s it! We just successfully recovered a controlfile without a backup. Of course even though you now know this cool trick you’d better still have a good backup strategy. If you ever actually have to do something like this on a real system… well let’s just hope you don’t. :)
I was under the impression that as soon as a single copy of a controlfile was missing or corrupted, the instance would shutdown (presumably “abort”). I’ll have to test this on my sandbox.
Anyway, handy tip. Hope I never have to see it!
LikeLike
You didn’t actually “recover” the controlfile.
You really “restored” it.
You had done a shutdown abort of the database.
The datafiles were not in sync. The Recover command looks at all the datafile headers and realises that some datafiles are “ahead” of the others. The Recover command, then, DOES need an ArchiveLog (or the active online redo that hadn’t been archived yet — faking it is as an ArchiveLog).
You were testing a specific scenario where the controlfiles were deleted when the instance were running.
Test the scenario where the the database is shutdown normal/immediate (or even a RECOVER has been completed without an OPEN) — then restore or recreate controlfile and try a RECOVER. See if you can just do a CANCEL there.
Hemant
LikeLike
Hi
But is it possible to use the RM control*.ctl, when the database is up and running.In windows this is not possible as it will not allow to delete the control file when it is in use.Anyways the scenario and the solution is awesome.
it will be really useful for inexperienced DBA who might have forgotten to include the contro file in their backup startegies..what say???
LikeLike
…”Test the scenario where the the database is …even a RECOVER has been completed without an OPEN…”
Did this after failed recover w/o OPEN. I couldn’t query the database for the current redo log so I used the redo log with the newest date – it worked! This was not production but just me fooling around with renaming a test database. Still was cool learning experience.
LikeLike
Good information. Does it working in Oracle 9i too? Using shared server can we take backup of control file in shared server (while all control files deleted). What is the logic behind this? I mean to ask we cann’t able to take backup controlfile in dedicated server and able to take backup controlfile in shared server. It is somthing strange.
LikeLike