Date: 2014-05-16

Deleting archived logs from ASM fails with ORA-15028

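At 8 this morning I went over to Zhongjin (中金) to verify some data and found that the archive log area of the recent-data database (a 4-node 11gR2 RAC on AIX 6.1) was full and the database was hung. I asked the middleware guy who had arrived before me, and all he said was that he hadn't noticed anything unusual...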

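My heart sank. Damn, if I had shown up any later something would definitely have gone wrong, and the taxpayers would have been frantic... Without another word I went to free up some space first: I switched to the grid user and, inside asmcmd, used its OS-style commands to delete two directories in a row.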

While deleting the second directory, it suddenly reported an error:

ORA-15032: not all alterations performed
ORA-15028: ASM file '+FRA/bjschxcx/……' not dropped; currently being accessed (DBD ERROR: OCIStmtExecute)

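A check with ls showed that only one file had not been deleted, and the database had already recovered from the hang. I then tried to delete the file with RMAN, but it still failed with the following error: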

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on default channel at 06/08/2012 13:20:35
ORA-15028: ASM file '+FRA/bjschxcx/……'  not dropped; currently being accessed
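
When many files are removed in one go, it can help to pull the exact file names out of the error output before working out who is holding them. Below is a minimal Python sketch, assuming the asmcmd or RMAN output has been saved as text. The function name `blocked_asm_files` and the sample file name are hypothetical, not taken from this system.

```python
import re

# Matches the ORA-15028 message format shown above and captures the quoted file name.
# \s+ tolerates the one-or-two spaces that appear before "not dropped".
ORA_15028 = re.compile(r"ORA-15028: ASM file '([^']+)'\s+not dropped")


def blocked_asm_files(log_text):
    """Return the distinct ASM file names reported by ORA-15028, in first-seen order."""
    seen = []
    for name in ORA_15028.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Hypothetical sample output; the file name is a placeholder.
    sample = (
        "RMAN-03009: failure of delete command on default channel\n"
        "ORA-15028: ASM file '+FRA/mydb/archivelog/example.1.1'  not dropped; "
        "currently being accessed\n"
    )
    for path in blocked_asm_files(sample):
        print(path)
```

The resulting list can then be compared against whatever each instance or replication process reports as open.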

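The archived logs I was deleting were several days old, so in principle nothing should still have been using them, even though several vendors have GoldenGate instances configured on this database. The database did come back up once a little archive space had been freed, but leaving this problem unresolved was not an option. I checked each vendor's GoldenGate instance: none of them was using the archived log I wanted to delete, and none of the processes was lagging.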

I searched Metalink and found two or three notes that describe this behavior.

One of them reads as follows. It clearly does not match the situation I ran into, so I ruled it out first:

Cause

The issue can be caused by any replication process running or hanging, holding this file.

For example a Golden Gate replication or shareplex replication process.

Solution

Stop the replication process and try deleting the file using RMAN or ASMCMD.


The other two read as follows:

Cause:	An attempt was made to drop an ASM file, but the file was being 
	accessed by one or more database instances and therefore could not 
	be dropped. 
Action:	Shut down all database instances that might be accessing this 
	file and then retry the drop command.

Solution

Use the following to quickly find out which database instance holds the lock, and therefore which instance to restart:

ASMCMD [+] > lsof -G DG_ARCH
DB_Name  Instance_Name  Path
myprod     myprod1          +dg_arch/myprod/archivelog/2012_06_04/thread_1_seq_72711.5178.785032231
myprod     myprod1          +dg_arch/myprod/archivelog/2012_06_04/thread_1