oracle function call is being cached during bulk insert
when inserting bulk rows and using a function call as one of the column values i'm getting exactly the same values for every 10-11 rows from the function. the function is actually generating UUID values and returns unique results. If i replace the function call in the insert statement with the actual code of the function it never repeats.
So what i conclude from this is that oracle actually caches the result of the function and calls it only once for every 10-11 rows it's inserting. how can i change this behavior?
the function i'm calling i've taken from http://www.oracle-base.com/articles/9i/UUID9i.php :
create or replace
FUNCTION new_uuid RETURN VARCHAR2 AS
l_seed BINARY_INTEGER;
l_random_num NUMBER(5);
l_date VARCHAR2(25);
l_random VARCHAR2(4);
l_ip_address VARCHAR2(12);
BEGIN
l_seed := TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS'));
DBMS_RANDOM.initialize (val => l_seed);
l_random_num := TRUNC(DBMS_RANDOM.value(low => 1, high => 65535));
DBMS_RANDOM.terminate;
l_date := conversion_api.to_hex(TO_NUMBER(TO_CHAR(SYSTIMESTAMP,'FFSSMIHH24DDMMYYYY')));
l_random := RPAD(conversion_api.to_hex(l_random_num), 4, '0');
l_ip_address := conversion_api.to_hex(TO_NUMBER(REPLACE(NVL(SYS_CONTEXT('USERENV','IP_ADDRESS'), '123.123.123.123'), '.', '')));
RETURN SUBSTR(l_date, 1, 8) || '-' ||
SUBSTR(l_date, 9, 4) || '-' ||
SUBSTR(l_date, 13, 4) || '-' ||
RPAD(SUBSTR(l_date, 17), 4, '0') || '-' ||
RPAD(L_RANDOM || L_IP_ADDRESS, 12, '0');
END;
and here's the insert statement i'm using:
INSERT INTO My_TABLE(ID, NAME,)
SELECT NEW_UUID(), NAME
FROM MY_TABLE2;
COMMIT;
the select inside this statement produces lots of repeating UUIDs. while this statement produces unique ones:
SELECT RPAD(RPAD(my_schema.conversion_api.to_hex(TRUNC(DBMS_RANDOM.VALUE( 1, 65535))), 4, '0') || my_schema.conversion_api.to_hex(TO_NUMBER(REPLACE(NVL(SYS_CONTEXT('USERENV','IP_ADDRESS'), '123.123.123.123'), '.', ''))), 12, '0') sss
FROM my_schema.MY_TABLE
The thing is, "random" isn't actually random. Given the same seed for DBMS_RANDOM.INITIALISE()
, subsequent calls to DBMS_RANDOM.VALUE()
will return the same result. Check it out:
SQL> exec DBMS_RANDOM.initialize (val => 1)
PL/SQL procedure successfully completed.
SQL> select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
2 /
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
49214
SQL> r
1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
56385
SQL> r
1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
23941
SQL> exec DBMS_RANDOM.initialize (val => 1)
PL/SQL procedure successfully completed.
SQL> select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual;
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
49214
SQL> r
1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
56385
SQL> r
1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
23941
SQL>
If we look at the code you got from Tim's site we see this line:
l_seed := TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS'));
From which we can surmise that your process is inserting 10-11 rows per second :)
If you replace SYSDATE with SYSTIMESTAMP and change the mask to go to millisecs (or smaller) then you should get a different seed each time, and hence a different value each time. Note that you still need to force the re-evaluation of the function to guarantee getting a different result for each row (see the demo below).
Um, did I say "guarantee". Uh-oh. It is in the nature of the random that it can produce the same result two goes running. So perhaps that should be "to minimize teh chances of getting the same result for each row".
Alternatively, remove the initialisation from the function and call it before you start your bulk inserts. Whether this is feasible depends entirely on your business logic.
Demonstration
Here is a function which generates a "random" number:
create or replace function get_random_number
(p_seed in number := 0)
return pls_integer
is
begin
if p_seed = 0
then
DBMS_RANDOM.initialize (val => TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS')));
else
DBMS_RANDOM.initialize (val => p_seed);
end if;
return TRUNC(DBMS_RANDOM.value(low => 1, high => 65535));
end;
/
If we call it twenty times with the default parameter it returns the same number each time:
SQL> select rownum
, get_random_number
from dual
connect by level <= 20
/
2 3 4 5
ROWNUM GET_RANDOM_NUMBER
---------- -----------------
1 10239
2 10239
3 10239
4 10239
5 10239
6 10239
7 10239
8 10239
9 10239
10 10239
11 10239
12 10239
13 10239
14 10239
15 10239
16 10239
17 10239
18 10239
19 10239
20 10239
20 rows selected.
SQL>
Whereas if we pass a value it uses a different seed each time and lo! we get a different result:
SQL> select rownum
, get_random_number(rownum)
from dual
connect by level <= 20
/
2 3 4 5
ROWNUM GET_RANDOM_NUMBER(ROWNUM)
---------- -------------------------
1 49214
2 6476
3 42426
4 2370
5 48546
6 52483
7 6964
8 46764
9 27569
10 7673
11 52446
12 50229
13 27861
14 31413
15 11518
16 13471
17 38766
18 9949
19 61656
20 25797
20 rows selected.
SQL>
This works because passing in ROWNUM forces the evaluation of the function for each row. You should not use ROWNUM as the seed in a production system: timestamps are better. Or concatenate the datetime with the rownum to provide a unique seed for each row.
APC's diagnostic is correct. You need to have entropy in your random generator seed.
Though, Oracle already has a unique id generator which is SYS_GUID()
.
SELECT sys_guid(), name FROM my_table2;
You can try this which produces 9 GUIDs:
SELECT sys_guid() from dual connect by level < 10;
Don't try to reinvent the wheel when it already exists.
Haven't tried it, but I believe Oracle is calculating the value for your new_uuid() function once, and outputting for each returned row (same as if you did select systimestamp, whatever from whatever... it would output the same timestamp for all rows.
So, you can modify your function to take some input from each row (for the seed perhaps?), or just use sequences.
链接地址: http://www.djcxy.com/p/91438.html上一篇: Java的GUID(Android)