Bug 22891868 - Oracle Grid Clusterware OHASD does not restart CRSD when crsd.bin is hanging (Doc ID 22891868.8)
Bug 22891868 OHASD does not restart CRSD when crsd.bin is hanging
This note gives a brief overview of bug 22891868.
The content was last updated on: 28-JUN-2018
Affects:
Product (Component): Oracle Server (PCW)
Range of versions believed to be affected: Versions BELOW 12.2
Versions confirmed as being affected: 12.1.0.2 (Server Patch Set)
Platforms affected: Generic (all / most platforms affected)
Fixed:
The fix for 22891868 is first included in
12.2.0.1 (Base Release)
12.1.0.2.170418 (Apr 2017) Grid Infrastructure Patch Set Update (GI PSU)
12.1.0.2.170418 (Apr 2017) Bundle Patch for Windows Platforms
Interim patches may be available for earlier versions.
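As a rough illustration of the version rules above, the following sketch compares a dotted version string against the first releases listed as containing the fix. The function name `has_fix_22891868` is hypothetical and not an Oracle tool. It assumes the version is given in dotted form with the PSU date as the fifth field (for example 12.1.0.2.170418), and it cannot detect an interim patch applied on top of an earlier release.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): succeed when a version string is at
# or above the first releases this note lists as containing the fix.
has_fix_22891868() {
    # Split a dotted version such as 12.1.0.2.170418 into positional fields.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    major=${1:-0}; minor=${2:-0}; third=${3:-0}; fourth=${4:-0}; psu=${5:-0}

    # 12.2.0.1 and anything later carries the fix in the base release.
    if [ "$major" -gt 12 ]; then return 0; fi
    if [ "$major" -eq 12 ] && [ "$minor" -ge 2 ]; then return 0; fi

    # On 12.1.0.2 the fix arrives with the 170418 (Apr 2017) GI PSU.
    if [ "$major" -eq 12 ] && [ "$minor" -eq 1 ] && [ "$third" -eq 0 ] &&
       [ "$fourth" -eq 2 ] && [ "$psu" -ge 170418 ]; then
        return 0
    fi
    return 1
}
```

For example, `has_fix_22891868 12.1.0.2.161018` returns a non-zero status, while `has_fix_22891868 12.2.0.1` returns zero.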
Symptoms: (None Specified)
Related To: Cluster Ready Services / Parallel Server Management
Description
This bug is only relevant when using Real Application Clusters (RAC).
OHASD may fail to restart CRSD when the crsd.bin process is hanging.
Rediscovery Notes
1. The following sequence of events may occur:
Time 1. CRSD hangs
Time 2. OHASD is terminated
Time 3. CRSD is still hanging
Time 4. OHASD restarts
2. A call stack taken from OHASD shows that a check on ASM is running:
....
clsn_agent::CrsCmd::ClscrsCmdData::stat(clsagfw_aectx const*,
std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char>
>, std::basic_string<char, std::char_traits<char>, std::allocator<char> >,
std::less<std::basic_string<char, std::char_traits<char>,
std::allocator<char> > >, std::allocator<std::pair<std::basic_string<char,
std::char_traits<char>, std::allocator<char> > const, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > > > >&, CLSCRS_STATFLAG, bool)
()
#16 0x00000000006fe968 in
clsn_agent::CrsCmd::ClscrsCmdData::stat(clsagfw_aectx const*,
std::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&, std::basic_string<char, std::char_traits<char>, std::allocator<char>
>&, CLSCRS_STATFLAG, bool) ()
#17 0x00000000006f805f in clsn_agent::CrsCmd::stat(clsagfw_aectx const*,
std::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&, std::basic_string<char, std::char_traits<char>, std::allocator<char>
> const&, CLSCRS_FLAG, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&, CLSCRS_STATFLAG, bool) ()
#18 0x000000000048104f in clsn_agent::AsmAgent::checkCbk(clsagfw_aectx
const*, clsn_agent::Gimh*, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&) ()
#19 0x0000000000554d16 in clsn_agent::InstAgent::checkState(clsagfw_aectx
const*) ()
#20 0x0000000000551fbb in clsn_agent::InstAgent::check(clsagfw_aectx const*)
()
#21 0x000000000045f7a1 in clsn_agent::Agent::commonCheck(clsagfw_aectx
const*) ()
#22 0x0000000000508510 in clsn_agent::check(clsagfw_aectx const*) ()
#23 0x000000000098cf80 in cls_agfw::Cmd::execute() ()
#24 0x0000000000990cfd in cls_agfw::CmdEx::executeCmd(cls::Message*) ()
#25 0x0000000000990b4f in cls_agfw::CmdEx::clsRequestHdlr(cls::Message*) ()
#26 0x00000000009fd333 in cls::ThreadModel::processQueue(sltstid*) ()
#27 0x00000000009fbe54 in cls::ThreadModel::runTM(void*) ()
#28 0x0000000000a098fb in CLS_Threading::CLSthreadMain::cppStart(void*) ()
3. Once the check action on ASM completes (or is aborted), OHASD tries to start CRSD. This may not happen until about 20 minutes later.
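To compare a captured stack against the signature above, a small helper along these lines could be used. It is only a sketch: the function name `stack_shows_asm_check` is hypothetical, and it assumes the call stack has already been saved to a text file by whatever tool was used to collect it. It looks for two frame names that appear in the trace shown in this note.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a saved call stack contains both the ASM
# agent check callback and the CRS stat call seen in the trace in this note.
stack_shows_asm_check() {
    # $1: path to a text file holding the captured call stack
    grep -qF 'clsn_agent::AsmAgent::checkCbk' "$1" &&
    grep -qF 'clsn_agent::CrsCmd::stat' "$1"
}
```

A zero exit status suggests the stack resembles the one described here; it does not by itself confirm the bug.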
Workaround
Start the CRSD resource manually:
crsctl start res ora.crsd -init
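The workaround can be wrapped so that the resource state is recorded before and after the start request. This is a minimal sketch, not an Oracle-supplied script: the function names are hypothetical, `crsctl` is assumed to be on the PATH of a suitably privileged user, and the parsing assumes that `crsctl stat res ora.crsd -init` prints a line beginning with `STATE=`.

```shell
#!/bin/sh
# Print the state reported for ora.crsd among the OHASD-managed (-init)
# resources. Assumes the stat output contains a line starting with STATE=.
crsd_state() {
    crsctl stat res ora.crsd -init | sed -n 's/^STATE=//p'
}

# Hypothetical wrapper around the workaround command from this note.
start_crsd_workaround() {
    state=$(crsd_state)
    echo "ora.crsd state before: ${state:-unknown}"
    # The workaround itself, exactly as given in the note.
    crsctl start res ora.crsd -init
    echo "ora.crsd state after: $(crsd_state)"
}
```

Calling `start_crsd_workaround` on the affected node issues the same start request as the manual command, with the two state lines giving a simple record of what changed.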
Note: This fix depends on the fix for bug 8934841.