Skip to main content

The effective use of partition pruning for the optimisation of retrieval speed (Part 3)

The devil is in the detail - what details decide on the effectiveness of partition pruning?

It's actually all about how partitioned interval tables could work in general, but, as the saying goes, "The devil is in the details", therefore I would add several important comments:

  1. History depth. As history gets longer, partitioning becomes more effective.
  2. Old reports. If you change an interval table to a partitioned interval one, then all queries against it remain valid (of course, to see the performance advantages you need to modify the filter condition).
  3. Outdated keys. Number of keys in historical table is usually changed over the time, in our example new accounts are opened and some are closed. Balances for the closed accounts could not be changed, but because of described logic, corresponding rows should be copied to each new month causing unnecessary growth of the table. In order to minimize this effect it is possible to define a default value for historized attribute. It means absence of a rows should be interpreted as the attribute having that value. In our example it is reasonable to assume, that closed account has a balance of zero, therefore we can implement a logic to not keep rows with zero balance; instead, in selects, we should count absence of a row as zero. Another example: if we keep track of statuses of account, then "Inactive" could be such a default value.
  4. ETL performance. ETL is going to be a bit more complicated for partitioned tables. But it is not necessarily slower, because you don't have to read the whole target table to detect differences.
  5. Given date in the future. If you select balances for the date which lie in a month not yet loaded in the table, then you receive no rows. If you use a classical interval table then the result is balances for the last loaded date. "No rows" is in my view more accurate, because we don't know the status in the future, but it is indeed the difference which should be taken to account.
  6. ETL for partitioned interval tables should correctly process the situation when no changes were made in the first day of a new month. Current values from previous month should be copied anyway.
  7. Local index for PK. You can use local index for the primary key if you partition the table. It speeds up its maintenance and allows you to drop/rebuild single partitions instead of the whole index for the time of ETL run.
  8. Partition level statistics. Oracle optimizer can use partition level statistics in case of partitioned interval tables, because just one partition being accessed each time. Therefore, you can save resources on collecting table level statistics for huge tables.