In a previous post I described my plan to create an alert based on the table DBA_ALERT_HISTORY, because it shows when a database goes down and when it comes back up. What I found out the other day was that if a node in our RAC cluster becomes unreachable, or can't communicate with the Oracle database for some other reason such as a server crash, DBA_ALERT_HISTORY may not be written to until the node becomes available again.
This was interesting because the remaining node went through reconfiguration believing it was the only node available, yet it logged nothing to this table about losing track of the other node; writing that entry is the responsibility of the node that went down. After the server was communicating again 10-20 minutes later, I saw a most interesting series of entries logged to the table: one indicating that the server was coming up, followed just milliseconds later by a notification that the server was down, even though the instance was still coming up! This second notification was actually the original notification that should have gone out when the instance went down, but it couldn't be logged to the table until the server was operational and communicating with the database again.
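To make the ordering problem concrete, here is a minimal Python sketch of how an alert could flag a "down" entry that lands right after an "up" entry for the same instance, which is the pattern described above. Everything in it is an assumption for illustration: the `AlertRow` fields, the simplified `UP`/`DOWN` states, the node names, and the one-second window are hypothetical, and the rows are built by hand rather than read from DBA_ALERT_HISTORY.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AlertRow:
    # Hypothetical, simplified stand-in for one logged alert entry.
    logged_at: datetime  # when the entry appeared in the table
    instance: str        # which instance the entry is about
    state: str           # "UP" or "DOWN" (assumed simplification)


def flag_delayed_down(rows, window=timedelta(seconds=1)):
    """Return DOWN rows logged within `window` after an UP row
    for the same instance. Such rows are likely late-written
    notifications rather than new outages."""
    last_up = {}  # instance name -> time of its most recent UP entry
    delayed = []
    for row in sorted(rows, key=lambda r: r.logged_at):
        if row.state == "UP":
            last_up[row.instance] = row.logged_at
        elif row.state == "DOWN":
            up_time = last_up.get(row.instance)
            if up_time is not None and row.logged_at - up_time <= window:
                delayed.append(row)
    return delayed


# Hand-built sample: node2 logs UP, then DOWN 5 ms later (the late entry);
# node1 logs an unrelated DOWN hours afterwards.
t0 = datetime(2024, 1, 1, 12, 0, 0)
rows = [
    AlertRow(t0, "node2", "UP"),
    AlertRow(t0 + timedelta(milliseconds=5), "node2", "DOWN"),
    AlertRow(t0 + timedelta(hours=3), "node1", "DOWN"),
]
suspects = flag_delayed_down(rows)
for row in suspects:
    print(f"{row.instance}: DOWN at {row.logged_at} looks like a delayed entry")
```

The window is a judgment call: too small and a late entry slips through as a real outage, too large and a genuine crash shortly after startup gets dismissed. An alert built this way would probably want to report the flagged rows rather than silently drop them.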
I still think I need to create an alert based on this table, but it was very interesting to watch the table in real time while we had a server basically offline, and what happened didn't match my expectations at all!