Thursday, July 22, 2010

Different ways of Merging/Joining two datasets

There are different ways of merging/joining two datasets like:
1. Using data merge Statements
a. It is necessary to sort the datasets before merging and sorting used to take longer time incase datasets are big
b. It is necessary to have the joining variable/s in the same name in two datasets.
2. Using proc sql
a. There is no need to sort datasets and even the joining variable can have different names in two datasets.
b. Use huge amount of work space and creates problem when the dataset/s are big and there is work space limitation.
3. Merge Using proc format
a. This is a really quick way of merging. Sort/Merge is used here when key values from one file are needed to extract records from another file containing the same key, or BY variable.
4. For Syntax, more details and more ways of merging you can refer the following links:
http://www2.sas.com/proceedings/sugi30/054-30.pdf
http://www.nesug.org/Proceedings/nesug09/po/po13.pdf

3 comments:

  1. Also when we are merging a small data set with a large data set, you can use SAS Hash tables. Slightly complicated to use, but very effective.

    ReplyDelete
  2. There are couple of more ways to join two datasets apart from the methods mentioned above:

     MERGE WITH SET-KEY
     MERGE WITH MODIFY
     MERGE WITH ARRAY
     MERGE WITH UPDATE
     CALL EXECUTE
    For syntax and more details, please check the pdf file “po13.pdf” in the link mentioned above.

    ReplyDelete
  3. very helpful information .Merging using proc format is a quick alternative . Thanks..

    ReplyDelete