What are No Primary Index tables ?
Starting with Teradata Release 13, tables can be defined without having a primary index.
As we all know, the primary index is the main idea behind an evenly data distribution on a Teradata system. By design, the primary index ensures that a Teradata system is unconditionally scalable.
Hence, the question is: what is the meaning of tables without a primary index and how are they implemented and fit into the hashing design of Teradata.
Initially, some words regarding how data is distributed in case of a no primary index table.
Basically, rows are distributed randomly across the AMPs.
As no hashing takes place, but rows have to be identified uniquely, the ROWID is generated differently from the ROWID of a regular table having a primary index:
As we do not have a hash value, Teradata uses the HASHBUCKET of the responsible AMP and adds a uniqueness value. As you can conclude, the bytes normally occupied by the hash value can now be used to increase the range for generating uniqueness values.
This is how No Primary Index Tables are created:
CREATE TABLE <TABLE>
PK INTEGER NOT NULL
) NO PRIMARY INDEX
Usage for No Primary Index Tables
As no primary index tables are distributed randomly across the AMPs, loading will become faster. Let’s take as an example, the phases of a FastLoad:
- Incoming rows are distributed in a round-robin fashion randomly across all AMPs
- The rows are hashed by the primary index value and forwarded to the responsible AMPs
- The responsible AMPs sort the received rows by ROWID
Now let’s consider a no primary index table. Basically, after distributing the rows randomly across the AMPs we are ready. No hashing and redistribution is needed. No sorting is needed. Further, as rows are assigned randomly to the AMPs, your data will always be distributed evenly across all AMPs and no skewing will occur.
As you can imagine, this makes loading much faster. Only the aquisition phase of the loading utilities is executed.
However much useful no primary index tables are in order to decrease the load times, don’t forget that without a primary index Teradata is limited to full tables scans if rows have to be retrieved.
You probably will recognize some similarities between no primary index tables and the Teradata columnar feature introduced with Teradata 14. Basically, tables which are using the new column partition feature of Teradata are equally no primary index tables.
Although offering great performance improvements for certain workload types, column stores on Teradata lack as well primary index access.
To some extend, this disadvantage of no primary index tables can be compensated with join indexes or secondary indexes.
Basically, no primary index tables are not designed for being production tables. Consider using them during the ETL-Process in case Teradata anyway has to do full table scans like SQL transformations carried out on each row etc.
There are some further restrictions if you decide to use no primary index tables. Here are the most important:
- Only MULTISET tables can be created
- No identity columns can be used
- NoPi tables cannot be partitioned with a PPI
- No statements with an update character allowed (UPDATE,MERGE INTO,UPSERT)
- No Permanent Journal possible
- Cannot be defined as Queue Tables
- Author – Roland Wenzlofsky
- Website Link – www.teradatapro.com
- Contact – https://plus.google.com/u/0/+RolandWenzlofsky/posts