oracle function call is being cached during bulk insert

2018-07-02 19:23:11

when inserting bulk rows and using a function call as one of the column values i'm getting exactly the same values for every 10-11 rows from the function. the function is actually generating UUID values and returns unique results. If i replace the function call in the insert statement with the actual code of the function it never repeats.

So what i conclude from this is that oracle actually caches the result of the function and calls it only once for every 10-11 rows it's inserting. how can i change this behavior?

the function i'm calling i've taken from http://www.oracle-base.com/articles/9i/UUID9i.php :

create or replace
FUNCTION        new_uuid RETURN VARCHAR2 AS
  l_seed        BINARY_INTEGER;
  l_random_num  NUMBER(5);
  l_date        VARCHAR2(25);
  l_random      VARCHAR2(4);
  l_ip_address  VARCHAR2(12);
BEGIN
  l_seed := TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS'));
  DBMS_RANDOM.initialize (val => l_seed);
  l_random_num := TRUNC(DBMS_RANDOM.value(low => 1, high => 65535));
  DBMS_RANDOM.terminate;

  l_date       := conversion_api.to_hex(TO_NUMBER(TO_CHAR(SYSTIMESTAMP,'FFSSMIHH24DDMMYYYY')));
  l_random     := RPAD(conversion_api.to_hex(l_random_num), 4, '0');
  l_ip_address := conversion_api.to_hex(TO_NUMBER(REPLACE(NVL(SYS_CONTEXT('USERENV','IP_ADDRESS'), '123.123.123.123'), '.', '')));

  RETURN SUBSTR(l_date, 1, 8)                     || '-' ||
         SUBSTR(l_date, 9, 4)                     || '-' ||
         SUBSTR(l_date, 13, 4)                    || '-' ||
         RPAD(SUBSTR(l_date, 17), 4, '0')         || '-' ||
         RPAD(L_RANDOM || L_IP_ADDRESS, 12, '0');
END;

and here's the insert statement i'm using:

INSERT INTO My_TABLE(ID, NAME,)
SELECT NEW_UUID(), NAME
FROM MY_TABLE2;
COMMIT;

the select inside this statement produces lots of repeating UUIDs. while this statement produces unique ones:

SELECT RPAD(RPAD(my_schema.conversion_api.to_hex(TRUNC(DBMS_RANDOM.VALUE( 1, 65535))), 4, '0') || my_schema.conversion_api.to_hex(TO_NUMBER(REPLACE(NVL(SYS_CONTEXT('USERENV','IP_ADDRESS'), '123.123.123.123'), '.', ''))), 12, '0') sss
FROM my_schema.MY_TABLE

The thing is, "random" isn't actually random. Given the same seed for DBMS_RANDOM.INITIALISE() , subsequent calls to DBMS_RANDOM.VALUE() will return the same result. Check it out:

SQL> exec DBMS_RANDOM.initialize (val => 1)

PL/SQL procedure successfully completed.

SQL> select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual
  2  /

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       49214

SQL> r
  1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       56385

SQL> r
  1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       23941

SQL> exec DBMS_RANDOM.initialize (val => 1)

PL/SQL procedure successfully completed.

SQL> select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual;

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       49214

SQL> r
  1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       56385

SQL> r
  1* select TRUNC(DBMS_RANDOM.value(low => 1, high => 65535)) from dual

TRUNC(DBMS_RANDOM.VALUE(LOW=>1,HIGH=>65535))
--------------------------------------------
                                       23941

SQL>

If we look at the code you got from Tim's site we see this line:

l_seed := TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS'));

From which we can surmise that your process is inserting 10-11 rows per second :)

If you replace SYSDATE with SYSTIMESTAMP and change the mask to go to millisecs (or smaller) then you should get a different seed each time, and hence a different value each time. Note that you still need to force the re-evaluation of the function to guarantee getting a different result for each row (see the demo below).

Um, did I say "guarantee". Uh-oh. It is in the nature of the random that it can produce the same result two goes running. So perhaps that should be "to minimize teh chances of getting the same result for each row".

Alternatively, remove the initialisation from the function and call it before you start your bulk inserts. Whether this is feasible depends entirely on your business logic.

Demonstration

Here is a function which generates a "random" number:

create or replace function get_random_number 
    (p_seed in number := 0)
    return pls_integer
is
begin
    if p_seed = 0
    then
        DBMS_RANDOM.initialize (val => TO_NUMBER(TO_CHAR(SYSDATE,'YYYYDDMMSS')));
    else    
        DBMS_RANDOM.initialize (val => p_seed);
    end if;
    return TRUNC(DBMS_RANDOM.value(low => 1, high => 65535));
end;
/

If we call it twenty times with the default parameter it returns the same number each time:

SQL> select rownum
       , get_random_number
from   dual
connect by level <= 20
/
  2    3    4    5  
    ROWNUM GET_RANDOM_NUMBER
---------- -----------------
         1             10239
         2             10239
         3             10239
         4             10239
         5             10239
         6             10239
         7             10239
         8             10239
         9             10239
        10             10239
        11             10239
        12             10239
        13             10239
        14             10239
        15             10239
        16             10239
        17             10239
        18             10239
        19             10239
        20             10239

20 rows selected.

SQL>

Whereas if we pass a value it uses a different seed each time and lo! we get a different result:

SQL> select rownum
       , get_random_number(rownum)
from   dual
connect by level <= 20
/
  2    3    4    5  
    ROWNUM GET_RANDOM_NUMBER(ROWNUM)
---------- -------------------------
         1                     49214
         2                      6476
         3                     42426
         4                      2370
         5                     48546
         6                     52483
         7                      6964
         8                     46764
         9                     27569
        10                      7673
        11                     52446
        12                     50229
        13                     27861
        14                     31413
        15                     11518
        16                     13471
        17                     38766
        18                      9949
        19                     61656
        20                     25797

20 rows selected.

SQL>

This works because passing in ROWNUM forces the evaluation of the function for each row. You should not use ROWNUM as the seed in a production system: timestamps are better. Or concatenate the datetime with the rownum to provide a unique seed for each row.

APC's diagnostic is correct. You need to have entropy in your random generator seed.

Though, Oracle already has a unique id generator which is SYS_GUID() .

SELECT sys_guid(), name FROM my_table2;

You can try this which produces 9 GUIDs:

SELECT sys_guid() from dual connect by level < 10;

Tora截图

Don't try to reinvent the wheel when it already exists.

Haven't tried it, but I believe Oracle is calculating the value for your new_uuid() function once, and outputting for each returned row (same as if you did select systimestamp, whatever from whatever... it would output the same timestamp for all rows.

So, you can modify your function to take some input from each row (for the seed perhaps?), or just use sequences.

链接地址: http://www.djcxy.com/p/91438.html

上一篇: Java的GUID（Android）

下一篇: 在批量插入过程中正在缓存oracle函数调用