Wednesday, October 3, 2012

Parallel Executions

I mentioned earlier that we implemented Exadata, and I learned something about parallel executions as a result!  Ignoring the fact that too much parallel execution on our system after the first few hours almost brought our system down, one of our developers had found that a simple query of a table with 54 rows in 8 blocks was taking too long to process versus our old environment.

Global Stats
=======================================
| Elapsed |  Other   | Fetch | Buffer |
| Time(s) | Waits(s) | Calls |  Gets  |
=======================================
|    0.00 |     0.00 |     1 |      3 |
=======================================
Global Stats
===============================================================
| Elapsed |   Cpu   | Concurrency |  Other   | Fetch | Buffer |
| Time(s) | Time(s) |  Waits(s)   | Waits(s) | Calls |  Gets  |
===============================================================
|    1.92 |    0.02 |        0.86 |     1.04 |     1 |      5 |
===============================================================

The 0 values in the first example were actually recorded to be 0.000079s and when we were seeing it then taking 2-3 seconds at time for each call, he started drilling down to see why there was such bad for our online performance.  He found that the default INSTANCES setting on the table was causing our Optimizer to compute a parallel execution plan which never caused us a problem because we didn't have an exponential number of potential slaves available for the parallel execution to take up the overhead of our RAC nodes.  The sins of our past, marking tables and indexes for parallel execution without needing to or really utilizing the feature, had caught up with us and Exadata didn't CAUSE the flaws it merely REVEALED them to us.  So we obviously fixed this, as well as limited our parallel executions on our DB nodes, but what else did we do?

We learned from our failure by creating an alert to track the current (and future) items that were running marked as parallel in our application just in case we had other objects running marked similarly (or implemented new objects in the future):

SELECT sql_id,
       substr(sql_text, 0, 200) AS sql_text,
       SUM(px_servers_executions) AS sum_px_exec,
       to_char(MAX(last_active_time), 'DD-MON-YYYY HH24:MI:SS') AS last_active_time
INTO
&SQLID,
&TEXT,
&EXEC,
&TIME
  FROM gv$sqlstats
WHERE px_servers_executions > 0
   AND upper(sql_text) NOT LIKE '%GV$%'
   AND upper(sql_text) NOT LIKE '%ORACLEOEM%'
   AND upper(sql_text) NOT LIKE '%GV%DB%'
   AND upper(sql_text) NOT LIKE '%SYS%DB%'
   AND last_active_time > SYSDATE - 1
GROUP BY
       sql_id,
       sql_text
order by
    sum_px_exec desc

Obviously you can see I "commented" out a lot of the returns from gv$sqlstats since many objects owned by the system are parallel which we really don't care too much about.  Here are a few more scripts you can use to identify the objects in your system which have parallelism activated:


select owner, table_name, degree
from  dba_tables
where TRIM(DEGREE) > TO_CHAR(1);

select owner, table_name, index_name, degree
from  dba_indexes
where TRIM(DEGREE) > TO_CHAR(1);

Edit 10/4/2012:  Re-read this today and realized that I mispoke about what our alert was all about, so I corrected it by changing a few words.

No comments:

Post a Comment