«

»

Oct 06

Teradata Utilities – FASTLOAD part 2



Phases of Fastload

Fastload divides its job into two phases


1)       Phase 1 – Acquisition

2)      Phase 2 – Application

Phase 1 – Acquisition

  • The main objective of this phase is to send rows of data file from HOST computer to Teradata AMPs as fast as possible.
  • Rows of the data file are packed into 64k blocks and send it to PE.
  • PE will parse the SQL of Fastload and send the Explain plan to each AMP. It will by default create 1 session per AMP, so if your system has 200 AMP then it will make 200 sessions for 1 Fastload job.

TipIt is advisable to restrict the session with the help of .SESSION command so that Fastload job don’t end up taking all the available resources of the system.

  • After creation of sessions the 64k blocks data is passed to AMP with the help of PE and BYNET where it is quickly hashed according to its PI value.
  • Based on this row hash value the rows are then redistributed to its proper AMP. Internal redistribution takes place within AMP, so that each AMP gets the correct row. To know more about this redistribution based on row hash please refer –Primary Index in Teradata
  • Now each row is placed in its proper AMP, but they are not sorted till now.
  • Any error in this phase will be recorded in the Error table 1.

Phase 2 – Application

  • The main objective of this phase is to store each row into the actual target table.
  • The rows are sorted by AMP which is temporarily stored in its DISK during phase 1
  • These sorted rows then send to actual target table where they will reside permanently.
  • All these operation are BLOCK level operation thus giving more speed when we compare it ROW level operation.
  • Any error in this phase will be stored in Error table 2.

Limitations of FAST LOAD 

There are few limitations which we have on Fastload.

These limitations are necessary for Fastload to load data with the lightning fast speed into tables.

  • NO SECONDARY INDEXES ARE ALLOWED ON TARGET TABLE – Fastload can load tables only with primary indexes defined on it. If we have a secondary index on the table then Fastload will not load that table. We get an error message if we load such type of table –

If secondary index exist already, we need to drop them.

After loading the table through Fastload we can easily recreate them on the table.

CREATE INDEX (Column-name/s) ON dbname.tablename —— > creates Non Unique Secondary Index

or

CREATE UNIQUE INDEX (Column-name/s) ON dbname.tablename —— > creates Unique Secondary Index

TIP Fastload Allows loading of PPI table.

  • NO REFERENTIAL INTEGRITY IS ALLOWED – Fastload cannot load data into tables that are defined with Referential Integrity (RI). This would require too much system checking to prevent referential constraints to a different table
  • NO TRIGGERS ARE ALLOWED AT LOAD TIME – Fastload is much too focused on speed to pay attention to the needs of other tables, which is what Triggers are all about. Additionally, these require more than one AMP and more than one table. Fastload does one table only. Simply ALTER the Triggers to the DISABLED status prior to using Fastload.
  • DUPLICATE ROWS ARE NOT SUPPORTED – Multiset tables are a table that allow duplicate rows — that is when the values in every column are identical. When Fastload finds duplicate rows, they are discarded. While Fastload can load data into a multi-set table, Fastload will not load duplicate rows into a multi-set table because Fastload discards duplicate rows

To know more about Fastload check below mentioned links –


Fastload Part 1

Fastload part 3

Fastload Part 4

 

22 comments

Skip to comment form

  1. Vaibhav kumar

    i tried making two sessions in fasload 1st got passed with inserion and 2nd didnt showed any such behaviour.
    so why can’t we have multiple sessions in fastload?

  2. Rahul Agrawal

    If I use fastload to load a set table from a multiset table which have duplicates, will fastload utility allow this or it will fail? Please provide some insight on this.

    1. admin

      fastload will not fail because of duplicates, instead of that it will discard the duplicates and stored the count of duplicated rows.

  3. Sudha

    Hi Admin,

    Can we use fast load for loading the data from multiple files.if yes, where has to mention the multiple file.

    Thanks In Advance,
    Sudha

  4. Divya

    As am new to teradata….Is it possible to use MODIFY USER command inside the stored procedure…

  5. Chaithanya Reddy

    Hi,
    I am new to teradata,will FLoad loads data to staging

    1. admin

      yes

  6. Tejas More

    Hi,
    I have a query here.
    In FLOAD phase2 (application) you said that the data is sorted and then send to actual target table.

    May I know this sorting is on what basis? Is it on the basis of hash value or something else ?

    Thanks,
    Tejas.

    1. admin

      Hi,

      Sorting is on row hash only.

  7. Kumar

    Thanks for the information. its really easy to understand with the way you are explaining. Can you also explain about other utilities in Teradata and also to how to execute fastload scripts in unix with an example.

    Thanks
    Kumar

  8. sujit

    Nice explaination admin:)

  9. karthik

    Hi admin,
    can we use Multiset table as target table in Fastload?

    1. admin

      You can use multiset table in fastload loading, however it wont support the multiset table feature (duplicate row)

  10. karthik

    Fastload does not allow duplicate due to “RESTART LOGIC”. Could you please elaborate this?

  11. Amol

    Hi Admin,
    Thanks for your explanation. it is easy to grab.

    Can you please put some information about TASM and how it works. We can not see any related topics here.

  12. karan

    Hi Admin,
    One doubt pls clarify.
    In first phase data is read from flat file ,hashing done on it and stored on AMP without sorting.
    In second phase data is sorted as per AMP and loaded to final actual table.
    As per my understanding storing data in table is same as storing data in AMP.
    then how these steps differ?
    Pls clarify. Thanks in advance.

    Regards
    Karan

    1. admin

      to speed up the process and avoid unnecessary lock of the target table, in first phase only redistribution happens, then in second phase target tables are locked and after sorting, those rows are inserted into target tables.
      The difference in both the steps is the task which they are doing.

  13. Priya

    Hi Admin,
    Can you post on other utilities like multiload and fastexport. Because all the posts here are very easy to understand and the informations are very good.

    1. admin

      will post soon 🙂

  14. priya

    Hi Admin,

    Really very nice and clear explanation..

    1. admin

      Thanks Priya ..

      1. munish

        Hi admin,
        i am new to teradata , i find this information is really useful. thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>