How to efficiently transfrom narrow table into wide in Oracle? -
i have narrow table following columns: <customer id> <field id> <value>
, of them numbers.
i want reshape table wide format:
<customer id> <field1> <field2> <field3> ...
i have separate dictionary table dic_fields
translates field id field name.
i work on exadata server. narrow table has 2.5 billion records, , have 200 fields.
the obvious simple solution below badly fills temporary space on our exadata server.
create table wide_ads ( customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads group customerid );
we tried cleverer , manual method:
create index index1 on szeroki_ads(customerid); declare rowidwide rowid; type tcolnames table of string(32000) index pls_integer ; arrcolnames tcolnames; x_customerid number; strcolname varchar2(32); strcolvalue varchar2(32000); strsql varchar2(200); lngcounter pls_integer; lngfieldid pls_integer; begin lngcounter := 0; -- pre-load dictionary arrcolnames speedup lookup. dic_el in (select * dic_fields order fieldid) loop lngfieldid := to_number(dic_el.fieldid); arrcolnames(lngfieldid) := dic_el.fieldname; end loop; narrow_rec in (select * narrow_ads value not null ) loop strcolname := arrcolnames(narrow_rec.fieldid); strcolvalue := narrow_rec.value; x_idklienta := narrow_rec.customerid; begin select rowid rowidwide wide_ads customerid = narrow_rec.customerid; strsql := 'update :1 set :2 = :3 rowid = :4'; execute immediate strsql using wide_ads, strcolname, strcolvalue, rowidwide; exception when no_data_found strsql := 'insert '|| wide_ads ||' (customerid, '|| strcolname ||') values (:1, :2)'; execute immediate strsql using x_customerid, to_number(strcolvalue) ; end; if lngcounter=10000 commit; lngcounter:=0; dbms_output.put_line('clik...'); else lngcounter:=lngcounter+1; end if; end loop; end;
although doesn't take temp, fails miserably performance-wise; processes 10 000 records in 50 sec - 1000 times slower, expected.
what can speed process?
as lalit comments, try in chunks based on customerid
.
first, create index on customerid
(if not exist):
create index indnarrowads on narrow_ads(customerid);
second, going create auxiliary table compute buckets based on customerid
(in example create 1000 buckets, 1 bucket represent 1 block insert statement):
create table buckets(mincustomer, maxcustomer, bucketnum) select min(customerid), max(customerid), bucket (select customerid, width_bucket(customerid, (select min(customerid) narrow_ads), (select max(customerid) narrow_ads), 1000) bucket narrow_ads) group bucket;
you can use more/less buckets modifying fourth argument of width_bucket function.
third, create wide_ads
table (the structure no data). should manually (with special attention on storage parameters) can use own query where
false condition:
create table wide_ads select customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads 1=0;
fourth, execute query on each bucket (1 bucket means 1 insert statement):
begin b in (select * buckets order bucketnum) loop insert wide_ads select customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads customerid between b.mincustomer , b.maxcustomer group customerid; commit; end loop; end;
and finally, drop auxiliary table (and index if not necessary).
oracle optimizer should use index on customerid
perform "index range scan" on narrow_ads
. so, each insert
should find efficiently corresponding interval.
note width_buckets
creates buckets based on uniform divisions on specified interval on customerid
(from min max values). not create buckets based on uniform number of rows. , note narrow_ads
must not modified while process being executed.
as pl/sql block executes commit
on each iteration , loop iterates on buckets using bucketnum
order, can see how wide_ads grows , bucket being processed (retrieving max customerid
wide_ads
, find corresponding bucket on buckets
table).
if temporary space usage high, increase number of buckets (each insert smaller).
Comments
Post a Comment