A Case of Recovering from a Crash of Both RAC Nodes

IT那活兒

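The customer site raised a crash alert for both nodes of a two-node database. Operations staff urgently restarted the database and the application recovered. After startup, however, the alert log reported ORA-16191 and ORA-01031, which were caused by inconsistent password files between the Data Guard primary and standby databases. Rebuilding the password file resolved the fault.

Analysis of the alert log showed that at 16:32 node 1 found a corrupt block while reading the control file, at 16:33 the instance could no longer read the control file and crashed, and instance 2 then shut down at 16:35. A check found no corrupt blocks in the control files, so the preliminary conclusion was a bug triggered by a transient failure to read the control file.

An SR was opened. SSC staff and the SR back-end experts jointly confirmed bug 11698676, which is a duplicate of bug 9549042 and is fixed in patch 9549042.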

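2. Fault analysis and handling

2.1 Fault handling

At 16:34 on April 5, both nodes of the ssyy database crashed in succession. After connecting urgently, we confirmed that both instances were completely shut down while the listeners were still running. We ran startup to bring both instances up, and the application reconnected to the production database.

After the instances were restarted, the node 1 alert log showed: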

Check that the primary and standby are using a password file

and remote_login_passwordfile is set to SHARED or EXCLUSIVE, 

and that the SYS password is same in the password files.

returning error ORA-16191

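The message indicates that the SYS password files on the primary and standby are inconsistent. We therefore decided to rebuild the password file on the primary and copy the newly generated file to the standby nodes (backing up the original password files first, and changing the SYS password on the primary).

Run the password file creation command on each of the two primary-rac nodes: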

orapwd file=/oracle/db/oracle/product/11.1.0/db/dbs/ssyydb1 entries=5 force=y  password=*********

orapwd file=/oracle/db/oracle/product/11.1.0/db/dbs/ssyydb2 entries=5 force=y  password=*********

Copy ssyydb1 and ssyydb2 to standby-rac node 1 and node 2 respectively.

The alert log on the primary-rac1 node still kept reporting:

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_arc2_4134.trc:

ORA-01031: insufficient privileges

PING[ARC2]: Heartbeat failed to connect to standby drdb. Error is 1031.

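At this point, primary node 1 could not ship archive logs to standby node 1. According to MOS, ORA-01031 is also caused by inconsistent password files between primary and standby. We suspected that the archiver processes on the primary were still using a password file cached on the host. Archiver processes are non-critical and restart automatically after kill -9, so killing them does not affect the running database.

We killed all archiver processes on primary node 1 and node 2 in turn, but node 1 kept reporting ORA-01031.

Connecting with sqlplus confirmed that the SYS password had been changed on both the primary and the standby.

Check whether the newly generated password files are in use:

-- Primary node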

SQL> select * from  v$pwfile_users;

USERNAME                       SYSDB SYSOP SYSAS

------------------------------ ----- ----- -----

SYS                            TRUE  TRUE  FALSE

-- Standby node

SQL> select * from  v$pwfile_users;

no rows selected

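Clearly, the password file on the primary is in use, but the one on the standby is not.

A closer look at the standby password files showed that their names did not follow the orapw<$ORACLE_SID> naming rule. The files kept the names used on the primary, but the standby instance names differ from the primary instance names.

Rename the standby password files: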

mv $ORACLE_HOME/dbs/ssyydb1 $ORACLE_HOME/dbs/orapwdrdb1 

mv $ORACLE_HOME/dbs/ssyydb2 $ORACLE_HOME/dbs/orapwdrdb2
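
The naming rule behind this rename can be checked mechanically. The following is a minimal sketch, not part of the original procedure: it assumes the standby SIDs are drdb1 and drdb2 (inferred from the mv targets above) and that the standby uses the same ORACLE_HOME as the path in the orapwd commands. The function names are hypothetical helpers introduced for illustration.

```python
from pathlib import PurePosixPath


def expected_pwfile(oracle_home: str, oracle_sid: str) -> PurePosixPath:
    """Default Unix password file location: $ORACLE_HOME/dbs/orapw<ORACLE_SID>."""
    return PurePosixPath(oracle_home) / "dbs" / f"orapw{oracle_sid}"


def is_correctly_named(path: str, oracle_home: str, oracle_sid: str) -> bool:
    """True only on an exact, case-sensitive match with the expected path."""
    return PurePosixPath(path) == expected_pwfile(oracle_home, oracle_sid)


# Assumption: the standby shares the ORACLE_HOME seen in the orapwd commands.
ORACLE_HOME = "/oracle/db/oracle/product/11.1.0/db"

# File names as copied from the primary, checked against the assumed standby SIDs.
for copied, sid in [("ssyydb1", "drdb1"), ("ssyydb2", "drdb2")]:
    current = f"{ORACLE_HOME}/dbs/{copied}"
    if not is_correctly_named(current, ORACLE_HOME, sid):
        print(f"rename {current} -> {expected_pwfile(ORACLE_HOME, sid)}")
```

Because the comparison is an exact string match on the path, a file named orapwDRDB1 would also be flagged for an instance whose SID is drdb1.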

After several minutes of observation, the ORA-01031 error was still not resolved.

We then searched MOS and followed the solution in ORA-1031 for Remote Archive Destination on Primary (Doc ID 733793.1):

1. Make sure parameter REMOTE_LOGIN_PASSWORDFILE is set to EXCLUSIVE or SHARED in both databases.  


2. Copy the password file again from primary : 


a. Defer the log_archive_dest_2 on primary: 

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = DEFER; 


b. Copy/ftp the password file from primary to standby and rename it accordingly on the standby database. Creating the password file on standby with orapwd-utility is not supported for 11g anymore.

Make sure that name of password file on both primary and standby is : orapw<SID>. Name of the password file is case sensitive. If SID of database on standby is prod then name of the password file should be orapwprod, orapwPROD will not work. 


c. Enable the log_archive_dest_2 on primary: 

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2 = ENABLE; 


d. Switch 2-3 log files on primary : 

SQL> ALTER SYSTEM SWITCH LOGFILE; 


e. Check the status of log_archive_dest_2 on primary. 

SQL> SELECT STATUS,ERROR FROM V$ARCHIVE_DEST WHERE DEST_ID =2; 

STATUS    ERROR 

--------- ----------------------------------------------------------------- 

VALID 

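We continued to monitor the alert logs on the primary nodes. After another 3 to 5 minutes of ORA-01031 errors, both primary nodes shipped archive logs to the standby nodes normally and the standby instances applied them normally. ORA-01031 and ORA-16191 did not reappear in the alert logs of primary node 1 or node 2.

At this point, the fault was fully resolved.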

2.2 Crash analysis

First, we checked the syslog on both nodes and found nothing abnormal, which rules out host-level causes.

Instance 1 alert log:

Fri Apr 05 15:58:52 2013

Archived Log entry 34220 added for thread 1 sequence 12072 ID 0x9441c6d1 dest 1:

Fri Apr 05 16:32:39 2013

Read from controlfile member /dev/oravg/rlv_cntl1 has found a corrupted block (blk# 4, cf seq# 0)

Hex dump of (file 0, block 4) in trace file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc

Corrupt block relative dba: 0x00000004 (file 0, block 4)

Bad check value found during control file block read

Data in bad block:

type: 21 format: 2 rdba: 0x00000004

last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04

spare1: 0x0 spare2: 0x0 spare3: 0x0

consistency value in tail: 0x00001501

check value in block header: 0x8f5d

computed block checksum: 0x2

Re-read from controlfile member /dev/oravg/rlv_cntl1 returned valid block 4

Hex dump of (file 0, block 4) in trace file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc:

ORA-00202: control file: /dev/oravg/rlv_cntl1

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc  (incident=888259):

ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)

ORA-00202: control file: /dev/oravg/rlv_cntl1

Incident details in: /oracle/db/diag/rdbms/ssyy/ssyy1/incident/incdir_888259/ssyy1_lmon_22418_i888259.trc

Fri Apr 05 16:33:24 2013

Errors in file /oracle/db/diag/rdbms/ssyy/ssyy1/trace/ssyy1_lmon_22418.trc:

ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)

ORA-00202: control file: /dev/oravg/rlv_cntl1

LMON (ospid: 22418): terminating the instance due to error 227

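At 16:32:39, instance 1 hit an error while reading control file /dev/oravg/rlv_cntl1 and found a corrupt block.

At 16:33:24, LMON terminated instance 1 with error 227 because the control file could not be read normally, and the instance crashed.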
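
A timeline like this can be pulled out of an alert log excerpt with a short script. The sketch below is an illustration only: it assumes the 11g text alert log layout shown above, where a timestamp line precedes the messages it applies to. The sample lines are abbreviated from the excerpt, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Timestamp lines look like "Fri Apr 05 16:32:39 2013" (English day and month names).
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")
ORA_RE = re.compile(r"\bORA-(\d{5})\b")


def ora_timeline(lines):
    """Return (timestamp, event) pairs for ORA- errors and instance termination."""
    current = None
    events = []
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        for code in ORA_RE.findall(line):
            events.append((current, f"ORA-{code}"))
        if "terminating the instance" in line:
            events.append((current, "INSTANCE TERMINATED"))
    return events


sample = """Fri Apr 05 16:32:39 2013
Read from controlfile member /dev/oravg/rlv_cntl1 has found a corrupted block (blk# 4, cf seq# 0)
ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)
ORA-00202: control file: /dev/oravg/rlv_cntl1
Fri Apr 05 16:33:24 2013
ORA-00227: corrupt block detected in control file: (block 4, # blocks 1)
LMON (ospid: 22418): terminating the instance due to error 227
""".splitlines()

events = ora_timeline(sample)
for ts, event in events:
    print(ts.time(), event)
```

On the sample, the gap between the first ORA-00227 and the termination message is 45 seconds, matching the 16:32:39 and 16:33:24 timestamps in the log.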

We checked all three control files and found no corrupt blocks.

ssyy1: dbv file=/dev/datavg02/rlv_cntl1 blocksize=16384

ssyy1: dbv file=/dev/datavg02/rlv_cntl2 blocksize=16384

ssyy1: dbv file=/dev/datavg02/rlv_cntl3 blocksize=16384

     

The crsd.log on node 2 shows that CRS stopped instance 2 at 16:35:23 because the database had gone offline abnormally.

2013-04-05 16:32:42.179: [  CRSRES][6345673] Resource recovery not purged:ora.ssyy.ssyy2.inst

2013-04-05 16:32:42.205: [  CRSRES][6345673] ora.ssyy.ssyy2.inst target set to OFFLINE before stop action

2013-04-05 16:32:42.206: [  CRSRES][6345673] StopResource: setting CLI values

2013-04-05 16:32:42.252: [  CRSRES][6345673] Attempting to stop `ora.ssyy.ssyy2.inst` on member `ssyy2`

2013-04-05 16:33:40.826: [    CRSD][54] SM: rE2Ec: 4

2013-04-05 16:33:40.896: [  CRSRES][6345681] ora.ssyy.db target set to OFFLINE before stop action

2013-04-05 16:33:40.896: [  CRSRES][6345681] StopResource: setting CLI values

2013-04-05 16:33:42.288: [    CRSD][6345681] SM:dE2Ec: all E2E cmds done. 0

2013-04-05 16:35:23.123: [  CRSRES][6345695] Resource recovery not purged:ora.ssyy.db

2013-04-05 16:35:23.124: [  CRSRES][6345695] `ora.ssyy.db` is already OFFLINE.

2013-04-05 16:35:23.173: [  CRSRES][6345673] Stop of `ora.ssyy.ssyy2.inst` on member `ssyy2` succeeded.
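
To see how long CRS took to stop instance 2, the stop request and the success message can be matched by timestamp. This is a rough sketch that assumes the crsd.log line layout shown above (timestamp, component, thread id, message); the regular expression and function name are illustrative, not an Oracle-provided format definition.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the excerpt: "<timestamp>: [ COMPONENT][thread] message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): \[\s*(\w+)\]\[(\d+)\]\s*(.*)$"
)


def stop_duration(lines, resource):
    """Time between 'Attempting to stop' and 'Stop of ... succeeded' for a resource."""
    started = finished = None
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        msg = match.group(4)
        if resource not in msg:
            continue
        if msg.startswith("Attempting to stop") and started is None:
            started = ts
        elif msg.startswith("Stop of") and "succeeded" in msg:
            finished = ts
    if started is None or finished is None:
        return None
    return finished - started


crsd_lines = [
    "2013-04-05 16:32:42.252: [  CRSRES][6345673] Attempting to stop `ora.ssyy.ssyy2.inst` on member `ssyy2`",
    "2013-04-05 16:33:40.826: [    CRSD][54] SM: rE2Ec: 4",
    "2013-04-05 16:35:23.173: [  CRSRES][6345673] Stop of `ora.ssyy.ssyy2.inst` on member `ssyy2` succeeded.",
]

duration = stop_duration(crsd_lines, "ora.ssyy.ssyy2.inst")
print(duration)
```

For the three lines quoted here, the stop of ora.ssyy.ssyy2.inst took about 2 minutes 41 seconds from request to completion.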

     

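We initially suspected a bug and opened an SR. SSC staff and the SR back-end experts jointly confirmed that the issue matches bug 11698676.

This bug is a duplicate of bug 9549042, and patch 9549042 is already available for the current HP-UX Itanium 64-bit platform.

2.3 Solution

Oracle recommends applying patch 9549042 as soon as possible to prevent this crash from recurring.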

