How to efficiently transfrom narrow table into wide in Oracle? -

- July 15, 2013

i have narrow table following columns: <customer id> <field id> <value>, of them numbers.

i want reshape table wide format:

<customer id> <field1> <field2> <field3> ...

i have separate dictionary table dic_fields translates field id field name.

i work on exadata server. narrow table has 2.5 billion records, , have 200 fields.

the obvious simple solution below badly fills temporary space on our exadata server.

create table wide_ads (    customerid   ,max(case when fieldid = 1 value end) gender   ,max(case when fieldid = 2 value end) age   ,max(case when fieldid = 3 value end) education narrow_ads group customerid );

we tried cleverer , manual method:

create index index1       on szeroki_ads(customerid);   declare     rowidwide rowid;     type tcolnames table of string(32000) index pls_integer ;     arrcolnames tcolnames;     x_customerid number;     strcolname varchar2(32);     strcolvalue varchar2(32000);     strsql varchar2(200);     lngcounter pls_integer;     lngfieldid pls_integer; begin     lngcounter := 0;     -- pre-load dictionary arrcolnames speedup lookup.     dic_el in (select * dic_fields order fieldid) loop         lngfieldid := to_number(dic_el.fieldid);         arrcolnames(lngfieldid) := dic_el.fieldname;     end loop;      narrow_rec in (select * narrow_ads value not null ) loop          strcolname := arrcolnames(narrow_rec.fieldid);         strcolvalue := narrow_rec.value;         x_idklienta := narrow_rec.customerid;         begin             select rowid rowidwide wide_ads                 customerid = narrow_rec.customerid;             strsql := 'update :1 set :2 = :3 rowid = :4';             execute immediate strsql using wide_ads, strcolname, strcolvalue, rowidwide;         exception         when no_data_found                 strsql :=                  'insert '|| wide_ads ||' (customerid, '|| strcolname ||')                  values                      (:1, :2)';                 execute immediate strsql using  x_customerid, to_number(strcolvalue) ;         end;         if lngcounter=10000             commit;             lngcounter:=0;             dbms_output.put_line('clik...');         else             lngcounter:=lngcounter+1;         end if;         end loop; end;

although doesn't take temp, fails miserably performance-wise; processes 10 000 records in 50 sec - 1000 times slower, expected.

what can speed process?

as lalit comments, try in chunks based on customerid.

first, create index on customerid (if not exist):

create index indnarrowads on narrow_ads(customerid);

second, going create auxiliary table compute buckets based on customerid (in example create 1000 buckets, 1 bucket represent 1 block insert statement):

create table buckets(mincustomer, maxcustomer, bucketnum)   select min(customerid), max(customerid), bucket     (select customerid,                   width_bucket(customerid,                                (select min(customerid) narrow_ads),                               (select max(customerid) narrow_ads),                               1000) bucket             narrow_ads)     group bucket;

you can use more/less buckets modifying fourth argument of width_bucket function.

third, create wide_ads table (the structure no data). should manually (with special attention on storage parameters) can use own query where false condition:

create table wide_ads select    customerid   ,max(case when fieldid = 1 value end) gender   ,max(case when fieldid = 2 value end) age   ,max(case when fieldid = 3 value end) education  narrow_ads  1=0;

fourth, execute query on each bucket (1 bucket means 1 insert statement):

begin   b in (select * buckets order bucketnum) loop      insert wide_ads          select           customerid         ,max(case when fieldid = 1 value end) gender         ,max(case when fieldid = 2 value end) age         ,max(case when fieldid = 3 value end) education        narrow_ads        customerid between b.mincustomer , b.maxcustomer        group customerid;      commit;   end loop; end;

and finally, drop auxiliary table (and index if not necessary).

oracle optimizer should use index on customerid perform "index range scan" on narrow_ads. so, each insert should find efficiently corresponding interval.

note width_buckets creates buckets based on uniform divisions on specified interval on customerid (from min max values). not create buckets based on uniform number of rows. , note narrow_ads must not modified while process being executed.

as pl/sql block executes commit on each iteration , loop iterates on buckets using bucketnum order, can see how wide_ads grows , bucket being processed (retrieving max customerid wide_ads , find corresponding bucket on buckets table).

if temporary space usage high, increase number of buckets (each insert smaller).

Search This Blog

Bay WIKI

How to efficiently transfrom narrow table into wide in Oracle? -

Comments

Post a Comment

Popular posts from this blog

Android : Making Listview full screen -

javascript - Parse JSON from the body of the POST -

go - syntax error: unexpected name, expecting semicolon or newline -