Creating and Unpacking Delimited Strings in SQL

Kevin Meade's picture
articles: 

Recently a friend asked me for this. I see it a lot on OraFaq as a question in the forums so here are the basics of working with delimited strings. I will show the various common methods for creating them and for unpacking them. Its not like I invented this stuff so I will also post some links for additional reading.

Here are some link for more reading:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:15637744429336
http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php

Creating Delimited Strings

There are five basic methods for creating a delimited string in Oracle:

1) (my favorite) use a hierarchical query, also known as the sys_connect_by_path method
2) create a simple plsql function for your own needs
3) use an oracle supplied function DBMS_UTIL.TABLE_TO_COMMA and WM_CONCAT
4) bulid your own user defined aggregate (I first saw this on asktomhome)
5) even use XML (super ugly and not sure why anyone would do it this way so I won't show it)

sys_connect_by_path

In this method we take advantage of oracle's hierarchical processing in sql to do string concatenation all inside the sql statement.

SQL> select substr(sys_connect_by_path(table_name,','),2) table_list
  2  from (
  3         select rownum rowno,table_name
  4         from user_tables
  5         where rownum < 4
  6       )
  7  where connect_by_isleaf = 1
  8  connect by prior rowno = rowno - 1
  9  start with rowno = 1
 10  /

TABLE_LIST
------------------------------------------------------------------------------------
MV_UW_PROD_ASSIGNMENTS,MV_CALL_MEMO_REPORT,MV_SALES_TEAMS

1 row selected.

I like this because of its flexibility. You can supply any query you want without using any procedural code.

Simple PLSQL Function

Maybe the most common method is creating your own special purpose function. Not my favorite method as it usually is not general use and is not terribly efficient. There are many variations on this theme depending upon what the format of your incomming data is and how you get it ready for sending to the function. I have seen varations that use PLSQL table types and database object types and of course query parameters like this one.

create or replace function kev_get_string_list (query_p in varchar2) return varchar2 is
   c1 sys_refcursor;
   comma_seperated_list_v varchar2(4000);
   string_v varchar2(4000);
begin
   open c1 for query_p;
   loop
      fetch c1 into string_v;
      if c1%notfound then exit; end if;
      comma_seperated_list_v := comma_seperated_list_v||','||string_v;
   end loop;
   close c1;
   comma_seperated_list_v := substr(comma_seperated_list_v,2);
   return (comma_seperated_list_v);
end;
/
show errors

SQL> select kev_get_string_list('select table_name from user_tables where rownum < 4') thestring from dual;

THESTRING
---------------------------------------------------------------------------------------------------------------
MV_UW_PROD_ASSIGNMENTS,MV_CALL_MEMO_REPORT,MV_SALES_TEAMS

1 row selected.

Use an Oracle Supplied Function

DBMS_UTIL.TABLE_TO_COMMA does it but his is a procedure not a function and is not callable directly from sql. It also requires a plsql table input so has limited use outside of plsql. I won't bother with an example as it is in fact like the example above just shown.

But this is interesting. WM_CONCAT is an oracle supplied aggregate function that creates comma delimited strings.

SQL> select wm_concat(table_Name) comma_delimited_list
  2  from user_tables
  3  where rownum < 4
  4  /

COMMA_DELIMITED_LIST
-----------------------------------------------------------
CONTACT_POINT,ELECTRONIC_ADDRESS,POSTAL_ADDRESS

1 row selected.

But there are some issues:

1) it may or may not be documented depending upon your release.
2) you must have installed oracle workspace manager for it to be available.

Roll Your Own User Defined Aggregate

I first saw this on asktomhome. It may be the most natural method philosophically and most versatile. It is also the basic example of using user defined aggregate for something useful. We basically build our own WM_CONCAT so to speak.

create or replace type our_string_agg as object
(
   string_v varchar2(32767)
     ,static function odciaggregateinitialize(sctx in out our_string_agg) return number
     ,member function odciaggregateiterate(self in out our_string_agg, value in varchar2 ) return number
     ,member function odciaggregateterminate(self in our_string_agg, returnvalue out varchar2, flags in number) return number
     ,member function odciaggregatemerge(self in out our_string_agg, ctx2 in our_string_agg) return number
);
/
show errors


create or replace type body our_string_agg is
   static function odciaggregateinitialize(sctx in out our_string_agg)  return number is
   begin
      sctx := our_string_agg(null);
      return odciconst.success;
   end;

   member function odciaggregateiterate(self in out our_string_agg, value in varchar2)  return number is
   begin
      self.string_v := self.string_v || ',' || value;
      return odciconst.success;
   end;

   member function odciaggregateterminate(self in our_string_agg, returnvalue out varchar2, flags in number) return number is
   begin
      returnvalue := rtrim(ltrim(self.string_v, ','), ',');
      return odciconst.success;
   end;

   member function odciaggregatemerge(self in out our_string_agg, ctx2 in our_string_agg)  return number is
   begin
      self.string_v := self.string_v || ',' || ctx2.string_v;
      return odciconst.success;
   end;
end;
/
show errors


create or replace function agg_get_comma_delimited_string (data_in_p in varchar2) return varchar2
   parallel_enable
   aggregate
   using our_string_agg;
/
show errors

SQL> select agg_get_comma_delimited_string(table_name) comma_delimited_string
  2  from user_tables
  3  where rownum < 4
  4  /

COMMA_DELIMITED_STRING
--------------------------------------------------------------------------------------
CONTACT_POINT,ELECTRONIC_ADDRESS,POSTAL_ADDRESS

1 row selected.

It is my understanding that WC_CONCAT is in fact a "user defined aggregate" just defined by oracle for us. For those not familiar with user defined aggregates consider this:

SUM is an aggregate function. It operates across a set of rows. Could you write your own SUM function? Sure, you use the ODCIAggregate API to do it. Thus you can extend oracle with your own aggregate functions. Of course you need a reason to do it. This is one reason. Not many people have a reason so not many people use this. User Defined Aggregates were I believe created most to support Oracle's geocoding stuff.

As you can see UDA is somewhat involved. However, good things come from the effort. The result is another function that operates like all other functions. A very handy way to extend oracle if you have a need for it.

XML method

OK this method is ugly. It requires use of XMLAGG and XML_ELEMENT functions and multiple passes of the data and even when done is very unclear what it is doing. I won't show it, but you can easily find it with a google search.

Then there is the unpacking side of things. How does one unpack a delimited string. Well for this there are two basic methods:

1) sql method
2) function method

Unpacking is less exciting than packing.

The SQL Method

This is pretty much the reverse of the sys_connect_by_path. Once again we use the hierarchical capabilities of Oracle but this time we unpack instead of pack. The calculation of how many items are in the string is part of the trick here.

variable v1 varchar2(30)
exec begin :v1 := 'xyz,pdq,abc'; end;

SQL> select substr(','||:v1||','
  2               ,instr(','||:v1||',',',',1,rownum)+1
  3               ,instr(','||:v1||',',',',1,rownum+1)-instr(','||:v1||',',',',1,rownum)-1) avalue
  4  from dual
  5  connect by level <= length(:v1)-length(replace(:v1,','))+1
  6  /

AVALUE
----------------------------------
xyz
pdq
abc

3 rows selected.

The Function Method

create or replace type c_varchar2_30 is table of varchar2(30)
/

create or replace function unpack_delimited_string (string_p in varchar2) return c_varchar2_30 is
   c_varchar2_30_v c_varchar2_30 := c_varchar2_30();
begin
   for i in 1..nvl(length(string_p)-length(replace(string_p,','))+1,0) loop
      c_varchar2_30_v.extend;
      c_varchar2_30_v(c_varchar2_30_v.last) :=
            substr(','||string_p||','
                  ,instr(','||string_p||',',',',1,i)+1
                  ,instr(','||string_p||',',',',1,i+1)-instr(','||string_p||',',',',1,i)-1)
      ;
   end loop;
   return (c_varchar2_30_v);
end;
/
show errors

SQL> select *
  2  from table(cast(unpack_delimited_string('xyz,abc,pdq') as c_varchar2_30))
  3  /

COLUMN_VALUE
------------------------------
xyz
abc
pdq

3 rows selected.

Once again there are many variations of this theme mostly depending upon how you want the unpacked result returned. I like keeping things SQL ready so I choose the table type as my return type so I could get access to it via sql easily.

OK, I am sure you guys have your own methods for this so reply here and add them to the post.

Thanks, Kevin