How to efficiently transfrom narrow table into wide in Oracle? -
i have narrow table following columns: <customer id> <field id> <value>, of them numbers.
i want reshape table wide format:
<customer id> <field1> <field2> <field3> ...
i have separate dictionary table dic_fields translates field id field name.
i work on exadata server. narrow table has 2.5 billion records, , have 200 fields.
the obvious simple solution below badly fills temporary space on our exadata server.
create table wide_ads ( customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads group customerid ); we tried cleverer , manual method:
create index index1 on szeroki_ads(customerid); declare rowidwide rowid; type tcolnames table of string(32000) index pls_integer ; arrcolnames tcolnames; x_customerid number; strcolname varchar2(32); strcolvalue varchar2(32000); strsql varchar2(200); lngcounter pls_integer; lngfieldid pls_integer; begin lngcounter := 0; -- pre-load dictionary arrcolnames speedup lookup. dic_el in (select * dic_fields order fieldid) loop lngfieldid := to_number(dic_el.fieldid); arrcolnames(lngfieldid) := dic_el.fieldname; end loop; narrow_rec in (select * narrow_ads value not null ) loop strcolname := arrcolnames(narrow_rec.fieldid); strcolvalue := narrow_rec.value; x_idklienta := narrow_rec.customerid; begin select rowid rowidwide wide_ads customerid = narrow_rec.customerid; strsql := 'update :1 set :2 = :3 rowid = :4'; execute immediate strsql using wide_ads, strcolname, strcolvalue, rowidwide; exception when no_data_found strsql := 'insert '|| wide_ads ||' (customerid, '|| strcolname ||') values (:1, :2)'; execute immediate strsql using x_customerid, to_number(strcolvalue) ; end; if lngcounter=10000 commit; lngcounter:=0; dbms_output.put_line('clik...'); else lngcounter:=lngcounter+1; end if; end loop; end; although doesn't take temp, fails miserably performance-wise; processes 10 000 records in 50 sec - 1000 times slower, expected.
what can speed process?
as lalit comments, try in chunks based on customerid.
first, create index on customerid (if not exist):
create index indnarrowads on narrow_ads(customerid); second, going create auxiliary table compute buckets based on customerid (in example create 1000 buckets, 1 bucket represent 1 block insert statement):
create table buckets(mincustomer, maxcustomer, bucketnum) select min(customerid), max(customerid), bucket (select customerid, width_bucket(customerid, (select min(customerid) narrow_ads), (select max(customerid) narrow_ads), 1000) bucket narrow_ads) group bucket; you can use more/less buckets modifying fourth argument of width_bucket function.
third, create wide_ads table (the structure no data). should manually (with special attention on storage parameters) can use own query where false condition:
create table wide_ads select customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads 1=0; fourth, execute query on each bucket (1 bucket means 1 insert statement):
begin b in (select * buckets order bucketnum) loop insert wide_ads select customerid ,max(case when fieldid = 1 value end) gender ,max(case when fieldid = 2 value end) age ,max(case when fieldid = 3 value end) education narrow_ads customerid between b.mincustomer , b.maxcustomer group customerid; commit; end loop; end; and finally, drop auxiliary table (and index if not necessary).
oracle optimizer should use index on customerid perform "index range scan" on narrow_ads. so, each insert should find efficiently corresponding interval.
note width_buckets creates buckets based on uniform divisions on specified interval on customerid (from min max values). not create buckets based on uniform number of rows. , note narrow_ads must not modified while process being executed.
as pl/sql block executes commit on each iteration , loop iterates on buckets using bucketnum order, can see how wide_ads grows , bucket being processed (retrieving max customerid wide_ads , find corresponding bucket on buckets table).
if temporary space usage high, increase number of buckets (each insert smaller).
Comments
Post a Comment