Compression for tables and table partitions
Licensing
As usual, everything has its price: Using the Standard Edition, one is not licensed to use compression. This comes with the Enterprise Edition only:
- Standard Compression
- Already included with the EE license fee, Standard Compression can be
used. It applies only to "Direct Load" (including
INSERT /*+APPEND*/), and restricts you to the keywordsCOMPRESSandUNCOMPRESSwithoutFORoptions. - Advanced Compression
- Requires a separate license. Having this aquired, you are permitted to
use the
COMPRESS FOR …clause, making use of OLTP compression plus different compression levels (high/low).
Table Compression
To (un)compress a table, you have two choices:
ALTER TABLE mytable [NO]COMPRESS;- leaves existing data untouched
- process takes only split-seconds, as it only changes the tables property
- new data will be affected and handled accordingly
ALTER TABLE mytable MOVE [NO]COMPRESS;- immediately re-organizes existing data while moving the table
- table is completely locked during the MOVE (i.e. no updates/inserts possible)
- new data will be affected and handled accordingly
If the table was moved to (un)compress existing data, indexes must be rebuilt
afterwards using ALTER INDEX … REBUILD; To determine affected indexes, the
data dictionary can be queried:
SELECT index_name
FROM user_indexes
WHERE table_name='MYTABLE'
AND status NOT IN ('VALID','N/A');
As mentioned above, while moving a table it cannot be accessed for
inserts/updates. If this is a No-Go, there's also the dbms_compression
package (as it has no procedure for direct-load, I'm not sure if this is bound
to advanced compression only) with 11gR2 (alternatively, there is a similar
package available for download from
OTN).
Next to procedures for calculating compression ratio and stats, it provides an
undocumented procedure named incremental_compress().
Compress table partitions
Basically this works as described above for tables, just with a few specialities: If compression was not already turned on for the table (or at least one of its partitions was already compressed), prior to compressing all bitmap indexes for this table must be either dropped or at least rendered unusable. This is due to changes required to the block headers, and can be "reversed" (via rebuild or recreate) after compression took place. The easiest way is to use
ALTER INDEX mybitmapindex UNUSABLE;
-- compression part goes here
ALTER INDEX mybitmapindex REBUILD;
To compress a single partition:
ALTER TABLE mytable
MOVE PARTITION mypart01
[ TABLESPACE mytablespace ]
[NO]COMPRESS
[ UPDATE INDEXES ];
As indicated, a different tablespace can be specified optionally; by default the index will be moved within the tablespace it is located. The parameter to update indexes is optional as well, but in practice one will have to use it. Still, although used, some unusable indexes may be left behind:
SELECT index_name
FROM user_ind_partitions
WHERE partition_name='MYPART01'
AND status NOT IN ('VALID','N/A');
For all of them:
ALTER INDEX index_name
REBUILD PARTITION mypart01;
Compression of Subpartitions
Principially, this goes like with partitions – except it doesn't. Why?
Because there is no MOVE [NO]COMPRESS for subpartitions (at least not up to
11g). Why? They simply have no compression attribute. Why? (Oh, 3-w = www).
Oracle decided this way. Subpartitions simply inherit from their respecive
partitions.
So what does this mean in practice? The DBA has to turn on compression for the respective partition, then simply move the subpartition, and optionally turn off compression for the subpartition agein. For indexes, see above.
If the compression property of the subpartition is switched back after the move, this also means: Only existing data are affected. That may be a draw-back: One cannot have a mixed state of subpartitions, as they only use their "master's" compression attribute.
After switching compression
Compression definitely changes something: more data fit into one block, and compression ratio ranges from about 10..70% – so the DBA should make sure to re-generate statistics to indicate these changes to the optimizer!
dbms_stats.gather_table_stats (
user(),
mytablename,
[ mypartitionname, ]
[ estimate_percent => 20, ]
[ method_opt => 'FOR ALL COLUMNS SIZE 1', ]
cascade => TRUE
);
As the first two parameters are mandatory, the PL/SQL built-in function
user() can be used to put the current user in automatically. Partition name
is an optional parameter as well – useful if only a single partition was
altered. cascade should be set to TRUE to also include index stats.
Restrictions
As nice as all this sounds – we shall not forget the "smallprint". As Randolf points out, there may be some trouble afterwards. How much trouble that is depends on the Oracle version used:
- with 9i, a lot of DDL is no longer possible as soon as a table (or at least one partition) is compressed. This includes adding/dropping columns or even altering them unused as well as changing column default values
- with 10g, some of those restrictions are gone: it is possible to add columns or set them unused – but still cannot drop columns or alter their default values
- with 11g almost everything is possible again. Dropping columns or adding columns with default values still does not work, but one can at least set them unused or altering the default value lateron (which only affects new data, whereas setting a default value while adding a column would require all existing rows to be updated)
- using advanced compression, everything works fine again. This is, however, not true with Hybrid Columnar Compression (HCC) introduced with 11gR2, which only affects ExaData.
The only work-around for all versions is not really applyable to reality: One needs to uncompress all tables/partitions/subpartitions and make sure that none of them has the compress property set to "true" …
Index compression
That's something often misunderstood. To get the complete idea, there's a good
read with the Blog of Andy
Black. For the basics,
very short: It is misnamed. Indexes are "deduped", not compressed. Stealing one
of Andys presentations, a car table with a PK index over the columns MAKE,
TRIM, MODEL and YEAR things become quite clear:
MAKE TRIM MODEL YEAR
---- ---- ----- ----
FORD LIMITED EXPLORER 1990
FORD LIMITED EXPLORER 2005
FORD LIMITED TEMPO 1991
FORD SPORT FUSION 2010
Though all four columns together form an unique key, there are a lot of repetitions in the first and second, and even some in the third column, which can be de-duplicated. In the end, this also saves space – which is probably why Oracle called it "compression" …
