Oracle Logbook

October 17, 2020

Finding what is really consuming Oracle resources, combining top queries.

Filed under: Uncategorized — Helio Dias @ 1:41 pm
Tags: , ,

Top queries can be a bit misleading, since one query is not exactly on top but is consuming all kind of resources together.

When you combining all aspects that matter to you, giving a total rate, about time, CPU, IO, concurrency, memory.

The query below is based on force_matching_signature, and if you prefer, you can replace it to sql_id.

https://github.com/HelioDias/top_queries_oracle/blob/main/sqlarea

with
w0 as (select /*+materialize*/
force_matching_signature,count() number_of_sql_ids, sum(executions) executions,sum(rows_processed) rows_processed,trunc(sum(elapsed_time/1000000)) elapsed_seconds, trunc(max(elapsed_time/decode(executions,0,1,executions))/1000000,4) max_seconds_per_execution, trunc(sum(elapsed_time)/decode(sum(executions),0,1,sum(executions))/1000000,4) total_seconds_per_exec, stddev(elapsed_time/decode(executions,0,1,executions)) std_dev_secs_per_exec, sum(fetches) fetches,sum(parse_calls) parse_calls,sum(disk_reads) disk_reads,sum(direct_writes) direct_writes,sum(direct_reads) direct_reads, sum(buffer_gets) buffer_gets, max(buffer_gets/decode(executions,0,1,executions)) max_buffer_gets_per_exec, trunc(sum(application_wait_time/1000000),4) application_wait_seconds, trunc(sum(concurrency_wait_time/1000000),4) concurrency_wait_seconds, trunc(sum(cluster_wait_time/1000000),4) cluster_wait_seconds, trunc(sum(user_io_wait_time/1000000),4) user_io_wait_seconds, trunc(sum(cpu_time/1000000),4) cpu_seconds from v$sqlarea where force_matching_signature>0 group by force_matching_signature ), w1 as (select w0.,
rank() over(order by cpu_seconds desc) cpu_seq,
rank() over(order by user_io_wait_seconds desc) io_seq,
rank() over(order by cluster_wait_seconds desc) gc_seq,
rank() over(order by concurrency_wait_seconds desc) concurrency_seq,
rank() over(order by max_buffer_gets_per_exec desc) max_buffer_gets_exec_seq,
rank() over(order by buffer_gets desc) buffer_gets_seq,
rank() over(order by total_seconds_per_exec desc) total_seconds_per_exec_seq,
rank() over(order by elapsed_seconds desc) elapsed_seconds_seq,
rank() over(order by number_of_sql_ids desc) number_of_sql_ids_seq,
rank() over(order by executions desc) executions_seq
from w0
),
w2 as (select /*+materialize*/ * from w1 where cpu_seq<11 or io_seq<11 or gc_seq<11 or concurrency_seq<11 or max_buffer_gets_exec_seq<11 or
buffer_gets_seq<11 or total_seconds_per_exec_seq<11 or elapsed_seconds_seq<11 or
number_of_sql_ids_seq<11 or executions_seq<11)
select * from w2 order by cpu_seq+io_seq+gc_seq+concurrency_seq+max_buffer_gets_exec_seq+buffer_gets_seq+total_seconds_per_exec_seq+
elapsed_seconds_seq+number_of_sql_ids_seq
— executions is out of the sort on purpose, since it will be present somehow in other metrics
— if you want give more power to one metric just divide it by something e.g. cpu_seq/10+….
;

Further implementation will include the following dimensions, top last hour, top last day, top last month, an extra weight over queries currently running, amount of Exadata usage.

October 11, 2020

Oracle Exadata vs PostgreSQL “Storage Index”

Filed under: Uncategorized — Helio Dias @ 2:25 pm
Tags: , , , , ,

Is it possible that a tiny PostgreSQL beats Oracle Exadata on realm of Storage Index?
Yes. Exadata took 6 seconds executing a query over a billion rows, where PostgreSQL took less than 2 seconds doing the same query.

This it means that PostgreSQL is better than Exadata?
NO absolutely not, it just means that every tool has it place to shine.
And actually the reasons why this happen was because two factors:
I compared the first execution on both DB, and Exadata Storage Index is not persistent.
And I was aiming to filtering 1440 rows, So PostgreSQL wouldn’t have network issue with it. For sure above 2 millions of filtering Exadata would win, even on first executions.

Let’s clarify that Storage Index just really exists on Oracle, because Exadata have a intermediate tier (cell servers) that avoid sending trash rows to the DBMS layer. What PostgreSQL implement is a Block Range INdex BRIN, that it’s persistent and operate in Min/Max Block like Exadata, and is so small.

PostgreSQL targets the big tables since it BRIN is very very lightweight and so small that you don’t notice that it even exists (As long it is used properly).
On the other hand Exadata implementation is all about reduce the overload on the DB Nodes, having the performance improvement as side effect.

PostgreSQL is about planned things, where you must create in advance the BRINs, where Exadata is all about the unexpectable.

Below are the script for Exadata:
CREATE TABLE BILLION_BY_MINUTE PARALLEL NOLOGGING AS
WITH W0 AS (SELECT /+MATERIALIZE/1 FROM DUAL CASCADE CONNECT BY LEVEL<31623) /*This limiting was due to lack of memory on my autonomous database, so later I autojoin it to produce the billion rows*/
select sysdate-rownum/24/60 EG_DATE,
trunc(dbms_random.value(1,100000)) EG_VALUE
from (SELECT 1 FROM W0,W0 T2 WHERE ROWNUM<=1000000000);

select sum(eg_value) from BILLION_BY_MINUTE where eg_date>sysdate – 1

Execution time: 6.975 seconds — First execution
Execution time: 0.125 seconds — Following executions

PostgreSQL script:

CREATE TABLE BILLION_BY_MINUTE AS
with w0 as (
select * from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))
as q (c1))
,w1 as (select 1 from w0,w0 as t2,w0 as t3)
,w2 as (select 1 from w1,w1 as t2,W1 AS T3)
,w3 as (select tempo – linha interval ‘1 MINUTE’ as EG_DATE,trunc(random()*1000000) EG_value
from (select localtimestamp as tempo,row_number() over() as linha
from w2) tb1)
SELECT * FROM W3

create index BILLION_BY_MINUTE_brix on BILLION_BY_MINUTE using brin(eg_date);

select sum(eg_value) from BILLION_BY_MINUTE where eg_date>now() – interval ‘1 day’;


October 7, 2020

Improving the performance with RowId (3 to 6 times)

Filed under: Uncategorized — Helio Dias @ 5:41 am
Tags: , ,

Every access that your application does using primary keys needs to navigate on the index structure among the multiple levels (often between 2 and 4) and after that, very often one extra lookup on the table.
So generally speaking, we do 3 readings to access the data on the table.

Accessing the data with PK is like going to a place using their address, and with RowId is like having the precise GPS location.

With RowId we can access it with one single read.
On the application side with have some scenarios that often don’t care about this benefit e.g. a screen where just one row will be updated, and others more dramatic cases, where a batch of information will be changed.
Even for the single row update, using RowId can give an overall improvement on the throughput, since we can have thousands of users doing “just one row update”

To implement it, the application must bring the RowId together with general selects that will fill their local objects.

But have a catch on this, imagine after you load in memory your objects with it’s rowids, another application delete a row and then insert another row in the exactly same place, the rowid would still be same, but the data is now something completely unrelated.
And that’s why some developers don’t use rowids on their applications.

To overcome the consistence issue, it’s actually pretty straight forward, on the updates, the application must use RowID and the PK Together, then Oracle will use the RowId to find the data, and the PK to validate if the data still the same.

Blog at WordPress.com.