Thursday, June 28, 2012

Users do what users do

If I had to write a book, the first chapter would be titled after this post.  Maybe more accurately, users do what users can do.  Last week I was going to write something up about this type of thing, but it makes SO much more sense after this week and introduces what I'll call Main Tenants of Support.

These Main Tenants are things we support individuals hold to be self evident, even if developers and project personnel don't understand a damn thing about them.

Our case:  We were given scripts from Oracle to drop and recreate a queue table that we were told by our developers wasn't being used since it had built up over 10 years of non-use.

rem stop queue: exec dbms_aqadm.stop_queue(‘AR.AR_REV_REC_Q’);

rem drop queue: exec dbms_aqadm.drop_queue(‘AR.AR_REV_REC_Q’);

rem drop queue table: exec dbms_aqadm.drop_queue_table(‘AR.AR_REV_REC_QT’);

rem create new queue table: begin DBMS_AQADM.CREATE_QUEUE_TABLE( Queue_table => ‘AR.AR_REV_REC_QT’, Queue_payload_type => ‘SYSTEM.AR_REV_REC_TYP’, storage_clause => ‘TABLESPACE AR’, Sort_list => ‘ENQ_TIME’, Multiple_consumers => TRUE, Compatible => ‘10.2’, comment => ‘Revenue Recognition Table’); end; /

rem create queue: exec dbms_aqadm.create_queue(‘AR.AR_REV_REC_Q’,’AR.AR_REV_REC_QT’);

rem start queue: exec dbms_aqadm.start_queue(‘AR.AR_REV_REC_Q’);

So instead of just purging it once and then letting another 10 years go by before we purge it again, we decided to wrap it into a concurrent program and let it run under 2 parameters:
  • drop and recreate queue
  • delete records from the queue
In DEV and TEST the report was run in 60-70 hours which seemed like an acceptable time frame for deleting 45-50 million records and rebuilding the queue, since this is something we've never had to do at this type of scale.

What we didn't know:  The queue table is used in base Oracle functionality for invoice processing.  So what you say?  The API for the drop queue was designed to delete all the rows AFTER it locked the object.

Error: Unable to create invoice due to an error: ORA-04021: timeout occurred while waiting to lock object AR.AR_REV_REC_Q
ORA-06512: at "APPS.RA_CUSTOMER_TRX


Why was this a problem:  That meant when we kicked off the concurrent program 10 PM Friday night, it prevented receipts from getting created in Financials from our web payment front end until we saw an alert on Monday shortly before noon that was telling us we had a backlog in receipt creation.  Killing the concurrent request didn't do anything because the process was still alive, so we had to mark the session for kill but the session was still alive.  It did stop what it was doing after getting past the 9 millionth record, but then it popped out to our shell of the concurrent program which was a problem because it found itself in a loop that it had no way out of.  I researched a bit and found a good article from Tanel Poder breaking down a situation a lot like ours, and I had some tricks in my bag from a previous issue involving the Revenue Recognition, to realize that the OS process needed to be killed manually.

What this shows us about our Main Tenant:  We were able to make the mistake of locking the queue.  Oracle told us to do this, so why would we believe it would have an issue?  We did exactly what the application allowed us to do, so we did it.  The same can be said of forms, fields, and just about anything else in the application so we need to be aware of how developers and project analysts (and Oracle) allow users to interact with the application.  Did we build a form that allows blind queries?  Does a report spend 3 days figuring out a Hash Join before running the actual meat and potatoes of the report in 15 minutes time?  It is the aftermath of many of these unasked, or unanswered, questions prior to implementation that I have to deal with on a daily basis.

Edit: 8/16/12 - Needed to change the link to Tanel's article since it was somehow linking the search engine URL instead.

Thursday, June 21, 2012

Every blog begins with a first post.

This is mine.  This came about after reading a great post by Tom Kyte where he posts a bit from one of his books comparing what we do in our daily Oracle lives to different kinds of doctors in the field.  Really, go read it.  Okay now that you're back, I'll tell you why I decided on being The Oracle EMT.

While he identified many types of physicians, I didn't really feel like any of them fit with what I do in my Production support role because I'm trying to do all of them at different times.  I more empathize with a fire-fighter, or a police detective at times, but tying back to the example the EMT is somebody going out into the world never knowing what they will encounter from day to day.  And they are dealing with critical life threatening issues.  Their symbol is the Star of Life for Pete's sake!  That's how I feel constantly.  While I might sit at my desk and be a target for work to come in, and be like the emergency room doctor, it really is wide open for what will come my way and feels like I'm out in that world with anything around the next corner.  Deciding what to work on today, might not be a huge deal, but picking large projects to invest my time into could have a small issue languish for weeks, which in its own right become a hair on fire problem that I'll have to address immediately when I'm in the middle of doing something completely else.

Lining up my thoughts a bit more, I am looking for this blog to be directed at the everyday Oracle applications specialist that had to be out there in the field.  I'm not looking for work, I'm perfectly happy learning from the job I have right now with the technology we're dealing with, so this isn't a launching pad for some consulting company outing and eventually I'd like to see more people contributing to the endeavor once it has some steam going.  I want to be able to vent or show things here, with individuals helping each other to make this a safe place to talk and share ideas in the comments.  Nothing is taboo.  Not even me.

-The Oracle EMT