An Introduction to ANYDATA

Kevin Meade's picture
AttachmentSize
Plain text icon anydata.txt5.42 KB
articles: 

My newest project needed to create a record keeping component that would keep track of balancing results over time. Sounded good, meant my customers could look back over time to see how things went, and it provided persistent evidence of good results. I figured on adding two routines (save/get) to a logging package already in existence to save balancing data via an autonomous transaction and retrieve it when needed. But in writing these routines it dawned on me that they would destroy the reusable nature of the logging package. Finally, a real life use for ANYDATA.

ANYDATA is an Oracle data type (an object actually), that is as they say “self describing”. Your guess is as good as mine as to what that really means. The party line goes something like: an ANYDATA data item contains data along with a data type descriptor (including data type name) of what the data looks like. Well… I sort of understand. But as usual, looking at some working code seems like the best way to really understand, so here is a quick and dirty introduction to ANYDATA along with how I used it in a real life scenario.

Here is a simple table that has a column defined as ANYDATA.

create table temp1
(
 a number not null
,b sys.anydata
)
/

OK, not too intimidating. The idea here is that we can store (with a few exceptions), any data we want in this ANYDATA column. It is actually pretty simple to put data into an ANYDATA data item. You just use one of the handy methods of the ANYDATA object. Here we can see the storage of number, date, and varchar2 values. Notice the use of the “convert-XXX” functions. Turns out there is a convert function for almost every type of data Oracle supports.

insert into temp1 values (1,sys.anydata.convertnumber(1))
/
insert into temp1 values (2,sys.anydata.convertdate(sysdate))
/
insert into temp1 values (3,sys.anydata.convertvarchar2('a'))
/

commit
/

SQL> select count(*) from temp1
  2  /

  COUNT(*)
----------
         3

1 row selected.

Yep, three rows went in. Wonder what it looks like?

SQL> col b format a20 trunc
SQL> select * from temp1
  2  /

  A B()
--- --------------------
  1 ANYDATA()
  2 ANYDATA()
  3 ANYDATA()

3 rows selected.

Hmm.. not much help there. What else can this thing tell me.

SQL> col typename format a20 trunc
SQL> select temp1.*,sys.anydata.gettypename(temp1.b) typename from temp1
  2  /

  A B()                  TYPENAME
--- -------------------- --------------------
  1 ANYDATA()            SYS.NUMBER
  2 ANYDATA()            SYS.DATE
  3 ANYDATA()            SYS.VARCHAR2

3 rows selected.

We used one of the many methods of the ANYDATA object type to find out the name of the data type of each specific column value. Yes, kind of neat, but how do I see the data? Well, that gets a little more involved. Because the data stored is actually of many different types, you can’t just select it. You will have to write some PL/SQL functions to help out.

Each data type convert method like the three we used in the inserts above, of the ANYDATA object type, has a corresponding get method that will give what you stored, back to you. Problem is, these get methods are not simple functions they are procedures so we need to write our own wrapper functions to effectively use these methods. Something like this package will do fine.

Each specific function in our package accepts an ANYDATA input and returns a specific output for its type. In the package body you will see the corresponding get methods that are invoked.

create or replace
package pkg_temp1
as

   function getnumber (anydata_p in sys.anydata) return number;
   function getdate (anydata_p in sys.anydata) return date;
   function getvarchar2 (anydata_p in sys.anydata) return varchar2;

end;
/
show errors

create or replace
package body pkg_temp1
as

   function getnumber (anydata_p in sys.anydata) return number is
      x number;
      thenumber_v number;
   begin
      x := anydata_p.getnumber(thenumber_v);
      return (thenumber_v);
   end;

   function getdate (anydata_p in sys.anydata) return date is
      x number;
      thedate_v date;
   begin
      x := anydata_p.getdate(thedate_v);
      return (thedate_v);
   end;

   function getvarchar2 (anydata_p in sys.anydata) return varchar2 is
      x number;
      thevarchar2_v varchar2(4000);
   begin
      x := anydata_p.getvarchar2(thevarchar2_v);
      return (thevarchar2_v);
   end;

end;
/
show errors

With this package in place we can now see our data.

col thevalue format a20 trunc
select temp1.*,sys.anydata.gettypename(temp1.b) typename
,case
      when sys.anydata.gettypename(temp1.b) = 'SYS.NUMBER' then
         to_char(pkg_temp1.getnumber(temp1.b))
      when sys.anydata.gettypename(temp1.b) = 'SYS.DATE' then
         to_char(pkg_temp1.getdate(temp1.b),'dd-mon-rrrr hh24:mi:ss')
      when sys.anydata.gettypename(temp1.b) = 'SYS.VARCHAR2' then
         pkg_temp1.getvarchar2(temp1.b)
 end thevalue
 from temp1
/


  A B()                  TYPENAME             THEVALUE
--- -------------------- -------------------- --------------------
  1 ANYDATA()            SYS.NUMBER           1
  2 ANYDATA()            SYS.DATE             25-may-2007 16:35:57
  3 ANYDATA()            SYS.VARCHAR2         a

3 rows selected.

Turns out we can even store user defined datatypes (objects and collections) in an ANYDATA column. Pay particular attention to the fact that we used the OID option of the create type commands below. This is one gotcha of ANYDATA.

If we store an object or collection in an ANYDATA column and later we drop the object or collection, then any ANYDATA column value dependent upon the dropped object will become inaccessible. So much for "self describing" eh. Even if we recreate the object and/or collection later, exactly as it was before, we still won't be able to access the data.

The fact is, that object types and collections have at least two identifications, a name, and an OID. Depending upon which one you use, you may or may not be able to find your collection data later.

Oracle always creates object types with an OID value. You don't normally supply one, so Oracle defaults a value for you. It uses a GUID construct for this (read up on sys_guid() if you want to know), which in theory is a string that is unique around the world (yep). When you recreate an object or collection (same name, same attributes, same everything it had before), Oracle still creates it with a different OID. Some internal Oracle functions reference the OID not the object name, and ANYDATA is one of them it seems. So even if after dropping an object or collection that an ANYDATA column value was dependent upon, you recreate the object or collection with the same definition, the previously inaccessible ANYDATA column value still won't be accessible because the stored data is remembering the old OID value.

But, if you create your object using an OID value, then you can always recreate it with the same OID value and the ANYDATA column value will recognize it and you are safe. There are presumably two reasons for needing OID values as regards types: a) to ensure an ANYDATA value will always work, and b) to be able to share types across database instances.

create or replace type o_temp1 oid '3150D5BF61DE33EDE0440003BA62E91A' is object (a number,b number,c number)
/

insert into temp1 values (4,sys.anydata.convertobject(o_temp1(1,2,3)))
/

create or replace type c_temp1 oid '3150D5BF61DF33EDE0440003BA62E91A' is table of o_temp1
/

set serveroutput on
declare
   c_temp1_v c_temp1;
begin
   select cast(multiset(select * from (
      select 1 c1,2 c2,3 c3 from dual union all
      select 4 c1,5 c2,6 c3 from dual union all
      select 7 c1,8 c2,9 c3 from dual
                                      )
                       ) as c_temp1
              )
   into c_temp1_v
   from dual;
dbms_output.put_line(c_temp1_v.count);
   insert into temp1 values (5,sys.anydata.convertcollection(c_temp1_v));
end;
/

Of course we will need two more functions for these two additional data type definitions we just created.

   function get_o_temp1 (anydata_p in sys.anydata) return o_temp1;
   function get_c_temp1 (anydata_p in sys.anydata) return c_temp1;


   function get_o_temp1 (anydata_p in sys.anydata) return o_temp1 is
      x number;
      o_temp1_v o_temp1;
   begin
      x := anydata_p.getobject(o_temp1_v);
      return (o_temp1_v);
   end;

   function get_c_temp1 (anydata_p in sys.anydata) return c_temp1 is
      x number;
      c_temp1_v c_temp1;
   begin
      x := anydata_p.getcollection(c_temp1_v);
      return (c_temp1_v);
   end;

After adding these functions to our helper package we can see the data.

col typename format a20 trunc
select temp1.*,sys.anydata.gettypename(temp1.b) typename from temp1
/

  A B()                  TYPENAME
--- -------------------- -------------------
  1 ANYDATA()            SYS.NUMBER
  2 ANYDATA()            SYS.DATE
  3 ANYDATA()            SYS.VARCHAR2
  4 ANYDATA()            KMEADE.O_TEMP1
  5 ANYDATA()            KMEADE.C_TEMP1

5 rows selected.

COL AC_TEMP1 FORMAT A62
select temp1.*
      ,case when sys.anydata.gettypename(temp1.b) = 'SYS.NUMBER' then pkg_temp1.getnumber(temp1.b) end anumber
      ,case when sys.anydata.gettypename(temp1.b) = 'SYS.DATE' then  pkg_temp1.getdate(temp1.b) end adate
      ,case when sys.anydata.gettypename(temp1.b) = 'SYS.VARCHAR2' then  pkg_temp1.getvarchar2(temp1.b) end avarchar2
      ,case when substr(sys.anydata.gettypename(temp1.b),instr(sys.anydata.gettypename(temp1.b),'.')+1) = 'O_TEMP1' then
         pkg_temp1.get_o_temp1(temp1.b) end ao_temp1
      ,case when substr(sys.anydata.gettypename(temp1.b),instr(sys.anydata.gettypename(temp1.b),'.')+1) = 'C_TEMP1' then
         pkg_temp1.get_c_temp1(temp1.b) end ac_temp1
from temp1
/

  A B()                     ANUMBER ADATE     AVARCHAR2            AO_TEMP1(A, B, C)    AC_TEMP1(A, B, C)
--- -------------------- ---------- --------- -------------------- -------------------- ------------------------------
  1 ANYDATA()                     1
  2 ANYDATA()                       25-MAY-07
  3 ANYDATA()                                 a
  4 ANYDATA()                                                      O_TEMP1(1, 2, 3)
  5 ANYDATA()                                                                           C_TEMP1(O_TEMP1(1, 2, 3), O_TE

5 rows selected.

(sorry, cut off the collection rows because of space on the page, they are there which you will see if you run the test cases)

So, this is nice and all, but why would anyone what to go to all this trouble just to save some data in a table. Well, most of the time you won’t. But there are times when you won’t know what data you are getting in advance, or more likely, you don’t want to know. The balancing data retention component I mentioned in opening is one such case.

Consider this situation: an application system wants to save the data it used to balance for posterity. It has this data as a set of rows. What do you do? Most people would create a table that maps to the data and the tell the application system “INSERT HERE”. Sounds OK, till then next application system comes along. They want to do the same thing except their data don’t look like the table you created for balancing for those other guys, so now what do you do? Well you got two choices:

1) create another table that maps to this new application’s data and tell them to “INSERT HERE”. This works but you can see it suffers from the fact that it means each application system that comes on board will need to create new objects to support balancing data retention, and this should after all be a common function done in a consistent way, but you have no common function any more.

2) Or, you decide to implement “A STANDARD” for balancing data. Well standards aren’t bad but you’ll have two problems with this idea: a) you have to build a standard which in light of the fact that the first application already did something and would have to change if you make the standard something different from what they did, means there will be pressure from several corners to adopt application A balancing as the standard, b) somebody will always be able to come up with a hard requirement that won’t fit your standard for balancing.

But, with a table something like this one:

create table hhi_acmr_job_run_bal
(
  hhi_acmr_job_run_bal_id number not null
, hhi_acmr_job_run_id number not null
, descriptor varchar2(100) not null
, balancing_dataset sys.anydata
)
tablespace ACMRDIMS_TABLE
/

You can build a reusable store/retrieve mechanism for balancing. It provides a simple framework and makes clear what the responsibilities of the reusable code are, and what the responsibilities of each application system are. You expose one interface and create a guide that tells application developers what steps they need to go through to correctly save their balancing work. In general they will need to do the following:

1) Define the layout of their balancing objects. They might in fact have more than one set of balancing data.
2) Create an object type and collection that maps to each layout.
3) Write functions in their applications that deal with the specifics of their layout. These would include:
. a) a select function (or view) to gather balancing data into rows
. b) a collect function to convert these rows to a collection
. c) a save function that invokes the ANYDATA convertcollection call and passes the result to the balancing save code.
. d) a get function to do to opposite of (C) using the ANYDATA getcollection call
. e) maybe a view that hides everything

The advantage to using ANYDATA here is that the balancing retention component remains simple and easy to use because it doesn’t do a whole lot and doesn’t care about the specifics of data types of each application. Additionally, you still have a standard in place because you have created a clear division of work and a consistent process to get the work done. Best of all, you have not tied anyone’s hands as to what they build.

Well, that is about it. One final note. Not all data types are supported by ANYDATA, even though the documentation may suggest they are. Of particular note are XML data, CLOB/BLOB data. Oracle is working on it. DESC SYS.ANYDATA to see what is available and remember, CLOB/BLOB don't work event hough it looks like it will.

Kevin

Comments

it very good article.

keep garden of oracle green by giving water of your knowledge and fertilizer of your experience. so new roots like us can grow. as you ever does.

regards.

Hi Kevin,

It always nice to read your articles , all are very good explainative .
I understood each and every point of this article but I am still in confusion about how to use anytype and anydataset in real life problems.
I will be very grateful to you if you will give focus on anytype and anydataset.

Regards,

Prashant Kesarwani

Kevin Meade's picture

Well, I am with you. anydata is a pretty useless thing for most people. Usually, when we work with data in business, we understand the data we are working with. this means we know the real datatypes of the data. Hence there is little use for something like anydata. Indeed, we would not want to use it in most situations.

However, there are times (as I stated in the open paragraphs), when you do not control the datatypes being used, or, when you don't care about the contents of the data itself. As a simple rule (all rules having exceptions), use anydata datatype, when you are creating some form of "shipping" utility. By this I mean, you are creating some kind of service which manages "black boxes", so it does not really care what is in the box.

Consider for a moment, advanced queueing. I moves messages around. Does it care what the messages are when it moves them? No, it does not give two hoots what the content of each message. It cares only that its job is to move the message from A to B, without distortion and with certainty of delivery. Anydata datatype is a natural for this kind of facility.

So, one thing we could say is: consider use of anydata datatype, when the application being built does not actually use the data it carriers. Think of anydata as a box. Into the box you put stuff, and you give it to the UPS man. He moves the box for you, but he does not really care what is in the box.

Kevin

Thanks kevin for your useful suggesstions I will defenatly follow your suggesstions while facing real life problems.I will keep in touch with you for further issues I am having.

Regards,
Prashant Kesarwani

Kevin ,

How are you doing ? I need your help on mutating error on triggers.I understand the cause of this error but I am not very much satisfied with the solutions given on internet. So Could you give some focus on mutating error on triggers?

Regards,
Prashant Kesarwani

Kevin Meade's picture

mutating errors happen because you are trying to insert/update/delete/select off a table that your are already changing. Generally this is the table upon which the trigger is defined.

constraining errors happen because you are trying to insert/update/delete/select off a table related via a foriegn key, to a table you are already changing. This obviously would be a parent or child table of the table on which the trigger is defined.

The reason Oracle throws these errors, is because it is protecting your data. It is saying that the state of a table is "in flux" or "not determinable" and thus this particular operation on this particular table at this particular moment, if allowed to proceed would be ambiguous in some way and is not guaranteed to end with a "consistent and correct" result. Therefore, Oracle won't let you do it. Don't ask me for the gritty details, because I don't understand them all my self, and they change with every release of Oracle anyway.

There are a few known methods for avoiding these errors:

1) use package variables and/or work tables to store transaction details from row level triggers, and then use this information to finish your work in a table level trigger (ugly)

2) use autonomous transactions to effectively change transaction semantics thus removing the ambiguity of your operation, but as a side affect, essentially bypass normal transaction management and saftey nets employed during trigger operations (dangerous)

3) disable foreign keys so constraining errors can't happen (stupid)

4) use views and instead-of-triggers because they work much like before row triggers but are not subject to mutating and constraining limitations (best used as a design strategy and not a quick fix).

5) redesign your transactions (best solution but not always practical, and sometimes not even possible).

Pick one and get moving.

Good luck, Kevin

very useful one

Good clarification and rectification about and from mutating problem.

Hi Kevin,

Here is my url for displaying the contents of a column of ANYDATA data type in pure SQL.

http://oraqa.com/2007/09/09/how-to-display-the-contents-of-a-column-of-anydata-data-type-in-sql/

Thanks,

Frank

HI KEVIN,
THIS IS A WONDERFUL CLARIFICATION. SOME TIMES PEOPLE LIKE ME MISS SOME BASIC POINTS AND GIVE A BAD DESIGN AND LATER WE COME ACCROSS ERRORS LIKE THESE. YOUR EXPLANATION TO AVOID MUTATION ERROR IS EXCELLENT. I WOULD LIKE TO ADD ONE MORE POINT TO YOURS

Solve the mutating table problem by moving the violating code from a row-level trigger to a statement-level trigger
Mutating table problem does not affect statement-level triggers.

THANKS AND REGARDS,
VAMSI K GUMMADI.

Good day, Kevin,

I recently had to dig into an issue where a date was being altered to something out of range (ie, 4990 BC). After much research and head-banging, I found out that an RLS policy which referenced an ANYDATA predicate was part of the issue. And I further believe (although unproven at this point) that the same ANYDATA column was overflowing into the next column, which just happened to be a date (and since Century is the high-order bit, it was mucked).

I wrote up a testcase that demonstrates the issue quite clearly and I can link the blog entry if you are interested (didn't want to do a link on my first post - that just seemed tacky). Does the above scenario ring any bells for you? Is it possible that an RLS policy (which, as you well know, appends to the predicate of a query) might cause one of the selected columns to go whacky?

I am not overly impressed with ANYDATA, personally. =) Speaking strictly from a DBA point of view. From a developers perspective, I can understand how it looks very attractive and noteworthy. The documentation could use some examples, let alone better ones. Like the examples you have so kindly provided. *grin*

Tom Kyte gave me a boiled-down testcase:

connect / as sysdba

drop user test cascade;
create user test identified by test;
grant create session, resource to test;

grant execute on dbms_rls to test;

connect test/test;

CREATE TABLE t ( when VARCHAR2(13), val SYS.ANYDATA, dt DATE );
insert into t values ('before policy',sys.anydata.convertvarchar2('123456789'),sysdate);

create or replace function testapp_security_function (p_schema varchar2, p_object varchar2)
return varchar2
is
begin
return '((cast(val as varchar2(1)) = ''Y'') or 1=1 )';
end;

/

begin
DBMS_RLS.ADD_POLICY (
object_schema => 'TEST',
object_name => 'T',

policy_name => 'TESTAPP_POLICY',
function_schema => 'TEST',
policy_function => 'TESTAPP_SECURITY_FUNCTION',
statement_types => 'SELECT,UPDATE,INSERT,DELETE',
update_check => TRUE,
enable => TRUE,
static_policy => FALSE);
end;
/
insert into t values ('after policy',sys.anydata.convertvarchar2('123456789'),sysdate);
connect / as sysdba
column val format a10
column dump format a40
select when, val, dt, dump(dt) dump from test.t t;

Thank you for this article, Kevin - it saved me a lot of research.
My experience with ANYDATA previously has been with Streams where it is mostly wrapped up in the provided APIs. Right now I'm checking out the feasibility of an application where I need to issue arbitrary SQLs without knowing what I'm querying or updating. ANYDATA may be the way to go: one can produce a kind-of polymorphic API that can work with any table.
--
John Watson
http://skillbuilders.com