Analysis and Handling of the CRS-2409 Alert Reported by CRS


Oracle 11.2.0.3 

1. Error messages

Checking the CRS alert log on node 1 shows the following abnormal entries:

2013-0X-XX 19:27:17.609

[ctssd(18809056)]CRS-2409:The clock on host XXXdb1 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.

2013-0X-XX 19:59:42.312

[ctssd(18809056)]CRS-2409:The clock on host XXXdb1 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.

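The message means that Oracle's CTSSD service has detected that this host's clock deviates from the mean cluster time, but has taken no corrective action because it is running in observer mode.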
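To see how often the alert recurs, the timestamps of all CRS-2409 entries can be pulled out of the alert log. The following is a minimal sketch, assuming the 11.2 log layout seen above, where each message line is preceded by its timestamp line; the function name `crs2409_times` is hypothetical and not part of any Oracle tool.

```shell
# Print the timestamp of every CRS-2409 entry in a clusterware alert log.
# Assumption: each message line is preceded by a line that starts with
# "YYYY-MM-DD HH:MM:" (blank lines in between are tolerated).
crs2409_times() {
    awk '
        /^[0-9][0-9][0-9][0-9]-..-.. [0-9][0-9]:[0-9][0-9]:/ { ts = $0; next }
        /CRS-2409/ { print ts }
    ' "$1"
}
```

It takes the path of the alert log as its only argument, for example `crs2409_times alertXXXdb1.log` when run from the log directory used in this article.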

 

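2. Analysis of the error

2.1. When the operating system's NTPD service is running, CTSSD does not perform time synchronization itself; as long as the CTSSD service is started, however, it should normally run in observer mode.

2.2. The OS NTPD service is currently running and CTSSD is in observer mode, yet CTSSD keeps reporting CRS-2409.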

 

3. Troubleshooting

3.1. Check whether the clocks of the two nodes differ

XXXdb1:/u01/app/11.2.0.3/grid/log/XXXdb1$ssh XXXdb2 date

Mon Jul 15 20:30:17 GMT+08:00 2013

 

XXXdb1:/u01/app/11.2.0.3/grid/log/XXXdb1$date

Mon Jul 15 20:30:18 GMT+08:00 2013

The check shows no significant time difference between the two nodes.
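Comparing two human-readable `date` outputs by eye is error-prone, so the comparison can be reduced to a number. This is a minimal sketch; the helper name `abs_offset` is hypothetical, and it assumes both timestamps are already available as seconds since the epoch.

```shell
# Absolute difference, in seconds, between two epoch timestamps.
abs_offset() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}
```

On a platform whose `date` supports `+%s` (an assumption to verify on the AIX level in use), it could be called as `abs_offset "$(date +%s)" "$(ssh XXXdb2 date +%s)"`. The result includes the ssh round-trip delay, so a difference of about one second, as seen above, does not by itself indicate clock drift.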

 

3.2. Check the OS NTPD service

XXXdb1:/# lssrc -ls xntpd

Program name: /usr/sbin/xntpd

Version: 3

Leap indicator: 00 (No leap second today.)

Sys peer: 10.XXX.XXX.71

Sys stratum: 2

Sys precision: -18

Debug/Tracing: DISABLED

Root distance: 0.000397

Root dispersion: 0.013458

Reference ID: 10.XXX.XXX.71

Reference time: d58e6e41.d0fca000 Mon, Jul 15 2013 20:49:05.816

Broadcast delay: 0.003906 (sec)

Auth delay: 0.000122 (sec)

System flags: bclient auth pll monitor filegen

System uptime: 30149695 (sec)

Clock stability: 0.047607 (sec)

Clock frequency: 0.000000 (sec)

Peer: 10.XXX.XXX.71

flags: (configured)(sys peer)

stratum: 1, version: 3

our mode: client, his mode: server

Subsystem Group PID Status

xntpd tcpip 4128900 active

The check on both nodes shows that NTPD is running at the OS level and is able to synchronize time.

 

3.3. Check the status of ctssd

XXXdb1:/#su - grid

XXXdb1:/home/grid$ crsctl stat res ora.ctssd -init

NAME=ora.ctssd

TYPE=ora.ctss.type

TARGET=ONLINE

STATE=ONLINE on XXXdb1

The check on both nodes shows that the CTSSD service is started.
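The resource status only shows that ctssd is online, not which mode it is in. `crsctl check ctss` reports the mode, and the small filter below classifies its output. This is a sketch under the assumption that the message contains the phrase "Observer mode" or "Active mode" (as in CRS-4700 and CRS-4701); the function name `ctss_mode` is hypothetical.

```shell
# Classify the output of `crsctl check ctss`, read from standard input.
# Assumption: the message text contains "Observer mode" or "Active mode".
ctss_mode() {
    out=$(cat)
    case "$out" in
        *"Observer mode"*) echo observer ;;
        *"Active mode"*)   echo active ;;
        *)                 echo unknown ;;
    esac
}
```

Typical use as the grid user would be `crsctl check ctss | ctss_mode`, run on each node.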

 

3.4. Use the CRS cluvfy tool to diagnose the cause of the CTSS error

XXXdb1:/home/grid$cluvfy comp clocksync -n all -verbose

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...

Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...

Check: CTSS Resource running on all nodes

Node Name Status

------------------------------------ ------------------------

XXXdb2 passed

XXXdb1 passed

Result: CTSS resource check passed

Querying CTSS for time offset on all nodes...

Result: Query of CTSS for time offset passed

Check CTSS state started...

Check: CTSS state

Node Name State

------------------------------------ ------------------------

XXXdb2 Observer

XXXdb1 Observer

CTSS is in Observer state. Switching over to clock synchronization checks using NTP

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...

The NTP configuration file "/etc/ntp.conf" is available on all nodes

NTP Configuration file check passed

……

Checking NTP daemon command line for slewing option "-x"

Check: NTP daemon command line

Node Name Slewing Option Set?

------------------------------------ ------------------------

XXXdb2 no

XXXdb1 no

Result:

NTP daemon slewing option check failed on some nodes

PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option "-x"

Result: Clock synchronization check using Network Time Protocol(NTP) failed

As the end of the output above shows, the NTP slewing option check fails on both nodes because the NTP daemon is not running with the "-x" option.
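The verbose cluvfy output is long, and the actual failure is easy to miss. A minimal sketch for listing only the PRVF error codes follows; the function name `cluvfy_error_codes` is hypothetical, and it assumes error lines start with `PRVF-<number> :` as in the output above.

```shell
# Read cluvfy output on standard input and print each distinct PRVF code once.
# Assumption: error lines begin with "PRVF-<digits>" followed by a colon.
cluvfy_error_codes() {
    sed -n 's/^\(PRVF-[0-9][0-9]*\) *:.*/\1/p' | sort -u
}
```

For example: `cluvfy comp clocksync -n all -verbose | cluvfy_error_codes`.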

 

3.5. Check the OS-level NTPD configuration

(1) Check /etc/ntp.conf

server 10.XXX.XXX.71

broadcastclient

driftfile /etc/ntp.drift

tracefile /etc/ntp.trace

(2) Check the configuration in /etc/rc.tcpip

It contains the following:

# Start up Network Time Protocol (NTP) daemon

start /usr/sbin/xntpd "$src_running" -a "-x"

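The configuration itself looks correct, yet the running daemon is not in "-x" mode. Most likely NTPD was restarted at some point without the "-x" argument.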
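Whether the running daemon actually received the flag can be confirmed from its command line rather than from the configuration file. The sketch below is an illustration only: the function name `xntpd_has_slew` is hypothetical, and it assumes a process listing with one command line per line, such as the output of `ps -e -o args`.

```shell
# Read process command lines on standard input (one per line) and succeed
# only if an xntpd process is present and was started with "-x".
xntpd_has_slew() {
    awk '
        $1 ~ /xntpd$/ {
            found = 1
            for (i = 2; i <= NF; i++) if ($i == "-x") slew = 1
        }
        END { exit !(found && slew) }
    '
}
```

Typical use: `ps -e -o args | xntpd_has_slew && echo "slewing on" || echo "slewing off"`. If slewing turns out to be off, restarting the daemon through the System Resource Controller with `stopsrc -s xntpd` followed by `startsrc -s xntpd -a "-x"` is the usual way to pass the flag on AIX; verify this against the platform documentation before applying it to a production cluster.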
