How to efficiently transform a narrow table into a wide one in Oracle?


I have a narrow table with the following columns: <customer id> <field id> <value>, all of them numbers.

I want to reshape the table into wide format:

<customer id> <field1> <field2> <field3> ...

I have a separate dictionary table, dic_fields, that translates a field id to a field name.
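The exact shape of dic_fields is not shown in the question; a minimal sketch of what it presumably looks like (column names and types are assumptions):

```sql
-- Assumed structure of the dictionary table (illustrative only)
create table dic_fields (
    fieldid    number,        -- e.g. 1, 2, 3, ...
    fieldname  varchar2(32)   -- e.g. 'gender', 'age', 'education'
);
```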

I work on an Exadata server. The narrow table has 2.5 billion records, and there are 200 fields.

The obvious simple solution below quickly fills up the temporary space on our Exadata server.

create table wide_ads as
select
   customerid
  ,max(case when fieldid = 1 then value end) gender
  ,max(case when fieldid = 2 then value end) age
  ,max(case when fieldid = 3 then value end) education
  from narrow_ads
 group by customerid;

We tried a cleverer, manual method:

create index index1 on wide_ads(customerid);

declare
    rowidwide rowid;
    type tcolnames is table of string(32000) index by pls_integer;
    arrcolnames tcolnames;
    x_customerid number;
    strcolname varchar2(32);
    strcolvalue varchar2(32000);
    strsql varchar2(200);
    lngcounter pls_integer;
    lngfieldid pls_integer;
begin
    lngcounter := 0;
    -- pre-load the dictionary into arrcolnames to speed up lookups
    for dic_el in (select * from dic_fields order by fieldid) loop
        lngfieldid := to_number(dic_el.fieldid);
        arrcolnames(lngfieldid) := dic_el.fieldname;
    end loop;

    for narrow_rec in (select * from narrow_ads where value is not null) loop
        strcolname := arrcolnames(narrow_rec.fieldid);
        strcolvalue := narrow_rec.value;
        x_customerid := narrow_rec.customerid;
        begin
            select rowid into rowidwide from wide_ads
             where customerid = narrow_rec.customerid;
            -- identifiers cannot be bind variables, so the column name is
            -- concatenated into the statement; only values are bound
            strsql := 'update wide_ads set ' || strcolname || ' = :1 where rowid = :2';
            execute immediate strsql using strcolvalue, rowidwide;
        exception
            when no_data_found then
                strsql := 'insert into wide_ads (customerid, ' || strcolname || ')
                           values (:1, :2)';
                execute immediate strsql using x_customerid, to_number(strcolvalue);
        end;
        if lngcounter = 10000 then
            commit;
            lngcounter := 0;
            dbms_output.put_line('click...');
        else
            lngcounter := lngcounter + 1;
        end if;
    end loop;
end;
/

Although it doesn't consume temp space, it fails miserably performance-wise; it processes 10,000 records in 50 seconds, which is about 1000 times slower than expected.

What can speed up the process?

As Lalit comments, you can try doing it in chunks based on customerid.

First, create an index on customerid (if it doesn't already exist):

create index indnarrowads on narrow_ads(customerid);

Second, we are going to create an auxiliary table to compute buckets based on customerid (in this example we create 1000 buckets; 1 bucket represents 1 block of the insert statement):

create table buckets(mincustomer, maxcustomer, bucketnum) as
select min(customerid), max(customerid), bucket
  from (select customerid,
               width_bucket(customerid,
                            (select min(customerid) from narrow_ads),
                            (select max(customerid) from narrow_ads),
                            1000) bucket
          from narrow_ads)
 group by bucket;

You can use more or fewer buckets by modifying the fourth argument of the width_bucket function.
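As a rough illustration of what width_bucket does (the bounds and bucket count here are made up, not taken from the question): it splits the [low, high) interval into equal-width slices and returns the slice number for a value.

```sql
-- width_bucket(value, low, high, num_buckets) with low=1, high=101 and
-- 4 buckets gives slices of width 25: 1-25 -> 1, 26-50 -> 2, 51-75 -> 3, ...
select width_bucket(30, 1, 101, 4) as bucket from dual;
-- returns 2, since 30 falls into the second 25-wide slice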

Third, create the wide_ads table (the structure only, no data). You should do it manually (with special attention to the storage parameters), or you can use your own query with an always-false where condition:

create table wide_ads as
select
   customerid
  ,max(case when fieldid = 1 then value end) gender
  ,max(case when fieldid = 2 then value end) age
  ,max(case when fieldid = 3 then value end) education
  from narrow_ads
 where 1=0;

Fourth, execute the query on each bucket (1 bucket means 1 insert statement):

begin
  for b in (select * from buckets order by bucketnum) loop
     insert into wide_ads
         select
          customerid
         ,max(case when fieldid = 1 then value end) gender
         ,max(case when fieldid = 2 then value end) age
         ,max(case when fieldid = 3 then value end) education
           from narrow_ads
          where customerid between b.mincustomer and b.maxcustomer
          group by customerid;
     commit;
  end loop;
end;
/

And finally, drop the auxiliary table (and the index, if it is no longer necessary).

The Oracle optimizer should use the index on customerid to perform an "index range scan" on narrow_ads, so each insert should efficiently find its corresponding interval.
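If the optimizer does not pick the index on its own, an index hint can be added to the bucketed insert. This is only a sketch; the hint references the index created above:

```sql
insert into wide_ads
    select /*+ index(narrow_ads indnarrowads) */
     customerid
    ,max(case when fieldid = 1 then value end) gender
      from narrow_ads
     where customerid between b.mincustomer and b.maxcustomer
     group by customerid;
```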

Note that width_bucket creates buckets based on uniform divisions of the specified customerid interval (from the min to the max value). It does not create buckets containing a uniform number of rows. Also note that narrow_ads must not be modified while the process is being executed.

As the PL/SQL block executes a commit on each iteration, and the loop iterates over the buckets in bucketnum order, you can watch wide_ads grow and see which bucket is being processed (by retrieving the max customerid from wide_ads and finding the corresponding bucket in the buckets table).
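The monitoring idea above can be sketched as a single query (run from a second session while the loop executes):

```sql
-- Which bucket is currently being processed?
select b.bucketnum
  from buckets b
 where (select max(customerid) from wide_ads)
       between b.mincustomer and b.maxcustomer;
```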

If temporary space usage is still too high, increase the number of buckets (each insert will then be smaller).

