Oracle Logbook

September 26, 2020

No more “snapshot too old” on Oracle (ora-01555)

Filed under: Uncategorized — Helio Dias @ 11:02 am

I use to joke around that the way to avoid this error is migrate to other RDBMS. but it’s really just joke.

Snapshot too old is not a bad implementation thing that just happen on Oracle databases, instead it’s a side effect of one the most elegant solution to deal with high concurrency environments.

Complaining that “Snapshot too old” it’s an Oracle issue, it’s like saying that spaceships are bad because they have lack of oxygen issues. The only reason why you know that you don’t have air in out space is because the spaceship allowed you to be there in the first place.

There are some ways to the users/DBAs deal with it, and I have an idea how Oracle could redesign an improvement for it, that in some cases would allow queries to run for months, without any Snapshot too old.

The key problem is that Oracle creates and keep Undo for everything, not just for the tables that you need for your longer query or the duration of their transaction, and deep down it’s a very good solution, because avoid many latches control on objects of queries that after some seconds or minutes it will finished anyway.

How Oracle could minimize or almost get rid of the issue, without having to control each query that starts on the database?
Oracle could monitor the Undo available, and at some point let’s say 50% of Undo, Oracle could evaluate the longest query against the second longest in execution and release the Undo for all blocks that belongs to the oldest SCN but are fresher the second longest, and of course belong to objects that the longest query are not using.

This way Oracle would get ride of large amount of Undo usage, since often one database have just one gigantic query in place.
But what if they have a secondary very long query? Oracle simple would repeat the process comparing the 2nd and 3rd query Objects/SCNs and so on, limiting to transactions longer that 1 hour (or some parameter to control this limit as well).

The advantage of my solution is that if the customer DB never come close to the threshold (let’s say 50% Undo), your database would never have any impact.

Currently the way to deal with it are:

1 – Improve your query speed (that majority of errors comes because a long running query, is making Oracle keep Undo since the beginning of the query.

2 – Increase the size of Undo tablespace and the retention period (recently versions allows you to even force the retention, in the past was just a goal)

3 – Avoid huge transaction, e.g. deleting at once a table that uses more space than your Undo.

4 – Using truncate instead of delete, even for partial deletion. e.g. you want to keep 10% of the table, create a stage table, insert what you want to keep there. truncate the main table, insert it back, and truncate the stage table. (Must consider the amount of keepers and also if the application will survive without the keepers for a while)

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at WordPress.com.

<span>%d</span> bloggers like this: