RMAN-20035, RMAN-06004, and Re-registering a Database in the Recovery Catalog

An RMAN hot backup of a database failed with RMAN-20035 and RMAN-06004, while the logical backup taken with exp completed normally.

The details are as follows:

RMAN-03022: compiling command: backup

RMAN-03026: error recovery releasing channel resources

RMAN-08031: released channel: ch01

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure during compilation of command

RMAN-03013: command type: backup

RMAN-03014: implicit resync of recovery catalog failed

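These messages, together with the successful exp backup, indicate that the database itself is healthy and that the problem lies on the RMAN side.

First, a brief description of this database's environment and backup configuration.

The database is Oracle 8.1.7.4, a very old release. The RMAN recovery catalog is stored in an Oracle 10g database.

This error is what caused the RMAN hot backup to fail.

The key to the problem is the error returned by the recovery catalog database: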

RMAN-06004: ORACLE error from recovery catalog database: RMAN-20035: invalid high recid
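When a log is long, it can help to pull out only the codes in the error stack so that the catalog-side error stands out. The following is a minimal sketch in POSIX shell and awk. The function name rman_error_codes is hypothetical, and the sketch assumes the log uses the banner format shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct RMAN-nnnnn code found after the
# "ERROR MESSAGE STACK FOLLOWS" banner, in order of first appearance.
# Reads the RMAN log from stdin.
rman_error_codes() {
    awk '
        # Skip the "=====" separator lines around the banner.
        /^RMAN-00571:/ { next }
        # Everything after the banner belongs to the error stack.
        /ERROR MESSAGE STACK FOLLOWS/ { in_stack = 1; next }
        in_stack {
            line = $0
            # A single line may carry several codes (e.g. RMAN-06004 wrapping RMAN-20035).
            while (match(line, /RMAN-[0-9][0-9][0-9][0-9][0-9]/)) {
                code = substr(line, RSTART, RLENGTH)
                if (!(code in seen)) { seen[code] = 1; print code }
                line = substr(line, RSTART + RLENGTH)
            }
        }
    '
}
```

Saved RMAN output can be piped in, for example `rman_error_codes < backup.log`, where backup.log is a placeholder file name.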

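1. Analysis
Step 1: check the backup script

The backup script has not been changed since it was put in place, so the script itself is not the problem.

Step 2: check the catalog database

On the database server, start sqlplus and connect to the catalog database.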

bash-2.05$ sqlplus m18_rman_cata_34/wexxxxxxxx@xxxdb

SQL*Plus: Release 8.1.7.0.0 - Production on Fri May 18 14:42:41 2012

Connected to:

Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production

With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL>

The connection works.

Step 3: check the backup sets

Use rman to list the existing backups.

RMAN> list backupset;

RMAN-03022: compiling command: list

RMAN-03026: error recovery releasing channel resources

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure during compilation of command

RMAN-03013: command type: list

RMAN-03014: implicit resync of recovery catalog failed

RMAN-06004: ORACLE error from recovery catalog database: RMAN-20035: invalid high recid

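The result is not encouraging.

Backup set information can no longer be queried from the recovery catalog.

This confirms that the recovery catalog used for the backups is where the problem lies.

One fix is to rebuild the recovery catalog.

Another is to re-register this database in the recovery catalog.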

2. Resolution
First, query the database's registration information in the recovery catalog.

SQL> select db_key,dbid,name from rc_database;

DB_KEY DBID NAME

---------- ---------- --------

1 3753655651 M18

In the recovery catalog, the record whose NAME is M18 is the database whose backup failed. This catalog holds backup information for only this one database.
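If the catalog held more than one database, the DB_KEY and DBID for a given name would have to be picked out of the query output. A minimal sketch, assuming the three-column sqlplus output shown above; catalog_ids_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "DB_KEY DBID" for the row whose NAME equals $1.
# Reads the sqlplus output of
#   select db_key,dbid,name from rc_database;
# from stdin.
catalog_ids_for() {
    awk -v name="$1" '
        # A data row has exactly three fields and the first two are numeric;
        # this skips the column header and the dashed separator line.
        NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 == name {
            print $1, $2
        }
    '
}
```

For the output in this post, `catalog_ids_for M18` prints the two values that are passed to the unregister call in the next step.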

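Next, use the unregisterdatabase procedure of the dbms_rcvcat package to remove the registration from the recovery catalog. Its two arguments are the DB_KEY and DBID returned by the query above. Note that this deletes the catalog's records for that database.

Log in to the recovery catalog with sqlplus and run the following SQL: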

exec dbms_rcvcat.unregisterdatabase(1,3753655651);
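Because unregistering is destructive, one possible safeguard is to generate the statement only when the DBID found in the catalog matches the DBID that RMAN reports for the target. This is a sketch under that assumption, not part of the original procedure; target_dbid and unregister_sql are hypothetical helper names, and the statement is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: extract the DBID from RMAN's
# "connected to target database: NAME (DBID=n)" line on stdin.
target_dbid() {
    sed -n 's/.*connected to target database: .*(DBID=\([0-9][0-9]*\)).*/\1/p'
}

# Hypothetical helper: print the unregister statement for sqlplus.
#   $1 = DB_KEY from rc_database
#   $2 = DBID from rc_database
#   $3 = DBID reported by RMAN for the target
# Prints nothing and returns non-zero if an id is not numeric or the
# catalog DBID differs from the target DBID.
unregister_sql() {
    db_key=$1
    catalog_dbid=$2
    target=$3
    for v in "$db_key" "$catalog_dbid"; do
        case $v in
            ''|*[!0-9]*) echo "ids must be numeric" >&2; return 1 ;;
        esac
    done
    if [ "$catalog_dbid" != "$target" ]; then
        echo "catalog DBID $catalog_dbid does not match target DBID $target" >&2
        return 1
    fi
    echo "exec dbms_rcvcat.unregisterdatabase($db_key,$catalog_dbid);"
}
```

The printed line can then be reviewed and pasted into the sqlplus session connected to the catalog schema.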

Finally, register the database in the recovery catalog again.

Run the register database; command in rman. The session looks like this:

bash-2.05$ rman target / rcvcat m18_rman_cata_34/xxxxx@xxxdb

Recovery Manager: Release 8.1.7.4.0 - Production

RMAN-06005: connected to target database: M18 (DBID=3753655651)

RMAN-06008: connected to recovery catalog database

RMAN> register database;

RMAN-03022: compiling command: register

RMAN-03023: executing command: register

RMAN-08006: database registered in recovery catalog

RMAN-03023: executing command: full resync

RMAN-08002: starting full resync of recovery catalog

RMAN-08004: full resync complete

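The output shows that the full resync with the recovery catalog completed successfully.

RMAN hot backups of the database are working normally again.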
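If the re-registration is scripted, the same check can be made on the saved RMAN output instead of by eye. A minimal sketch, assuming the message codes shown in the session above; register_succeeded is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the RMAN log on stdin shows both
# "database registered" (RMAN-08006) and "full resync complete" (RMAN-08004)
# and contains no error stack banner (RMAN-00569).
register_succeeded() {
    awk '
        /RMAN-08006:/ { registered = 1 }
        /RMAN-08004:/ { resynced = 1 }
        /RMAN-00569:/ { failed = 1 }
        END { exit !(registered && resynced && !failed) }
    '
}
```

The patterns are not anchored to the start of the line, so a stray control character in front of a message code does not affect the result.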

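3. Summary
This system's environment is fairly complex: it was migrated from a physical machine to a virtual machine and then migrated back. This kind of error usually appears after the database has been opened with open resetlogs, in which case the information in the recovery catalog has to be rebuilt. The other possibility is a bug.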
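One way to look for the resetlogs case before unregistering is to compare the RESETLOGS_CHANGE# recorded in the target's v$database with the value the catalog holds in rc_database. The sketch below only prints the two queries and compares values copied from their results; the helper names are hypothetical, and a mismatch is a hint under the assumption above rather than a full diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: print the query to run in sqlplus on the TARGET database.
target_resetlogs_query() {
    cat <<'EOF'
select resetlogs_change#, resetlogs_time from v$database;
EOF
}

# Hypothetical helper: print the query to run in sqlplus on the CATALOG schema.
#   $1 = DBID of the database to look up
catalog_resetlogs_query() {
    case $1 in
        ''|*[!0-9]*) echo "dbid must be numeric" >&2; return 1 ;;
    esac
    printf 'select resetlogs_change#, resetlogs_time from rc_database where dbid = %s;\n' "$1"
}

# Hypothetical helper: succeed if the two RESETLOGS_CHANGE# values are equal.
#   $1 = value from the target, $2 = value from the catalog
same_incarnation() {
    [ "$1" = "$2" ]
}
```

After the database has been unregistered and registered again, the two values are expected to agree, so this comparison is only informative while the error is still present.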