«

»

Jan 17

Compression in Teradata

What is Compression in Teradata


Compression is the feature of Teradata which helps to reduce the disk storage. It is useful in reducing the disk space required by FIXED LENGTH columns or NULLS. It is a loss less compression so no data will be lost in it.

We can do compression on the following data types –

  • Nulls, Zeros, Blanks. (Note all are different values and never consider them as same)
  • Any numeric data type. (INTEGER, DECIMAL, FLOAT, DOUBLE, SMALL INT, BYTE INT)
  • DATE
  • CHARACTER(up to 255 characters)

Syntax of Compression –

CREATE TABLE EMP_ADDRESS(

    Address VARCHAR(50),

    City CHAR(20) COMPRESS (‘Bangalore’),

    StateCode CHAR(2)

  );

Guidelines of Compression –

  • Columns must be FIXED LENGTH (not variable length) and it must be less than or equal to 255 characters.
  • The compressed columns cannot be a part of PRIMARY INDEX .
  • We can compress upto 256 constant values, which means for a single FIXED LENGTH column we can define max 256 values to be compressed.
  • NULL will be compressed by default if you choose that column for compression.

NOTE Maximum 256 values allowed in Teradata is including NULL, which means that you can define 255 unique values for that column (255 unique values + 1 NULL value = 256).

  • Compression is case sensitive which mean if you compressed ‘TERADATATECH’ it will not compress ‘teradatatech’.

 

Difference between compression and VARCHAR

VARCHAR will be more efficient when the difference of maximum and average field length is high and compressibility is low.


Compression and fixed CHAR will be more efficient when the difference of maximum and average field length is low and compressibility is high.

Feel free to post your doubts as comment and if you like the post don’t forget to share 🙂

 

19 comments

Skip to comment form

  1. madhura

    why the COLUMNS defined as ALC (algorithmic compression) can not be compressed to MVC.

    the database size got increased after compression

  2. nites

    how many column compression can done in one teradata table?

  3. pankaj

    I admin,

    Can we specify compression on column after creating the table, like in insert/update
    or alter table statements.

  4. kesavan

    What is advantage of using char datatype with compression for columns..instead we can use varchar datatype directly ryte…can u plz clarify??

  5. Manoj

    🙂

  6. arockia

    How much space the bit pattern occupies in compression?

  7. anji

    ** how can we identify,whether the compress is applied on col or not other than check the size of the table?

  8. anji

    Hi Admin,

    Am beginner in Teradata & trying to identify how compress is working. Below is my query. Please let me know if i did any mistake.

    create table samples.tbl_vth_out_cmpress
    (
    col1 int
    ,col1n int
    ,col2 char(5)
    ,col3 varchar(5)
    );
    insert into samples.tbl_vth_out_cmpress values (,5,,);
    select * from samples.tbl_vth_out_cmpress;
    select count(*) from samples.tbl_vth_out_cmpress where col1n IS NULL;

    col1 col1n col2 col3
    ———– ———– —– —–
    ? 5 ? ?

    Count(*)
    ———–
    0

    ==================================
    create table samples.tbl_vth_cmpress
    (
    col1 int
    ,col1n int compress (5)
    ,col2 char(5) compress
    ,col3 varchar(5)
    );
    insert into samples.tbl_vth_cmpress values (,5,,);
    select * from samples.tbl_vth_cmpress;
    select count(*) from samples.tbl_vth_cmpress where col1n IS NULL;

    col1 col1n col2 col3
    ———– ———– —– —–
    ? 5 ? ?

    Count(*)
    ———–
    0
    ================================
    select * from dbc.tablesize where TableName like’%tbl_vth%’;
    Vth_Out : 1536 Vth: 1024

  9. Priyabrat

    Hi Admin,
    Can you please explain how the data gets stored while using compression.
    for e.g when i write Compress(Bangalore) , how many bytes are really stored for this 9 characters ? Usually 9 characters should take 9 bytes ,so how does compression is helpful here?

  10. Swaroop

    Can we see the difference in size after applying the compression on tables whose size is in kb?

    1. admin

      yes,
      one way of doing is to make use of dbc.tablesize to calculate size of tables before and after compression.

  11. rafeek

    Thank you providing the valueable information related to TeraData as it is easy to understand

  12. Rohit

    Just to clarify my understanding, I would like to know in more detail as to how compression saves space?

    1) Does it take the actual length of the variable into consideration instead of length defined for particular column while creating the table?
    For example, if we have defined a column as “CITY VARCHAR(30)”, but the actual value stored in this field for a particular row is 8 character long, so does it mean that with the help of compression we have achieved a saving of 22 character?

    2) As mentioned in your example above, how does compression work if we provide a constant value in the table/column definition itself? Does it replaces the constant value with a flag value in order to reduce space?

    Thanks in advance!

    1. admin

      Hi Rohit

      1) Compression is allowed on FIXED length data types, VARCHAR itself saves the space by assigning only that much space which is required. So without using compression you can save the space of 22 characters as per your example.

      2) yes it assign the constant value as flag value (bit pattern), and use only that flag value for all internal comparisons. But encryption and decryption of this flag value is one of the overhead of using compression.

      1. Prasanna

        Hi Admin,

        Here I have q question…

        If we declared a char variable, hw it will compress.

        For example:

        city char[20]=”Banglore”.

        Here will is compress “banglore” or the the lengthof the column[20]

        please clarify ?

  13. mike

    Hi Admin,
    Please add some articles on multiload and BTEq also.

    Thanks

  14. Kumar

    Could you please give more details with examples regarding the Compression and Varchar differences?

    1. admin

      Actually for VARCHAR and COMPRESSION difference you need to do some experiments from your end –

      for e.g.
      for any table try to find out the unique value for a particular column on which you want to apply compression

      Sel columnname, count(1) from tablename group by 1 order by 1;

      if you can find count which is more than 40% of total row count then we can easily put compression on that column.
      While on the other hand if you are unable to find any significant volume of unique values then better to start some RnD between VARCHAR and COMPRESSION options.

      However in my opinion if none of them is a clear winner then better to make that column as VARCHAR.

      Hope you get the point 😉

  15. Rams

    I àm new for teradata. These basics are very easy to understand.

    Thank you

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>