Can the N function cause problems with existing queries?

We use Oracle 10g and Oracle 11g.

We also have a layer that automatically composes queries from pseudo-SQL code written in .NET (something like SQLAlchemy for Python).

Our layer currently wraps any string in single quotes (') and, if the string contains non-ANSI characters, it automatically composes a UNISTR call with the special characters written as Unicode escapes (like \00E0).

Now we created a method for doing multiple inserts with the following construct:
INSERT INTO ... (...) SELECT ... FROM DUAL UNION ALL SELECT ... FROM DUAL ...

This algorithm could compose queries where the same string field is sometimes passed as 'my simple string' and sometimes wrapped as UNISTR('my string with special chars like \00E0').

The described condition causes an ORA-12704: character set mismatch.
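For example (the table and column names here are made up for illustration; assume txt is an NVARCHAR2 column), a composed statement of this shape fails:

    CREATE TABLE multi_ins (txt NVARCHAR2(100));

    -- one branch is a plain (CHAR) literal, the other an NCHAR expression,
    -- so the UNION ALL raises ORA-12704: character set mismatch
    INSERT INTO multi_ins (txt)
    SELECT 'my simple string' FROM DUAL
    UNION ALL
    SELECT UNISTR('my string with special chars like \00E0') FROM DUAL;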

One solution is to use the INSERT ALL construct, but it is very slow compared to the one used now.
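With the same hypothetical table, the INSERT ALL form would look roughly like this; each VALUES clause is converted to the NVARCHAR2 column independently, so the mismatch does not arise:

    INSERT ALL
      INTO multi_ins (txt) VALUES ('my simple string')
      INTO multi_ins (txt) VALUES (UNISTR('my string with special chars like \00E0'))
    SELECT * FROM DUAL;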

Another solution is to instruct our layer to put N in front of any string (except for the ones already wrapped with UNISTR). This is simple.
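With that change, the composed multi-insert (again using the made-up table above) would become something like:

    -- both branches are now national-character expressions,
    -- so the UNION ALL type-resolves without ORA-12704
    INSERT INTO multi_ins (txt)
    SELECT N'my simple string' FROM DUAL
    UNION ALL
    SELECT UNISTR('my string with special chars like \00E0') FROM DUAL;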

I just want to know if this could cause any side-effect on existing queries.

Note: all our fields in the DB are either NCHAR or NVARCHAR2.


Oracle ref: http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch7progrunicode.htm


Basically, what you are asking is: is there a difference in how a string is stored with or without the N function?

You can just check for yourself; consider:

SQL> create table test (val nvarchar2(20));

Table TEST created.

SQL> insert into test select n'test' from dual;

1 row inserted.

SQL> insert into test select 'test' from dual;

1 row inserted.

SQL> select dump(val) from test;
DUMP(VAL)                                                                      
--------------------------------------------------------------------------------
Typ=1 Len=8: 0,116,0,101,0,115,0,116                                            
Typ=1 Len=8: 0,116,0,101,0,115,0,116  

As you can see, the stored values are identical, so there is no side effect.

The reason this works so beautifully is the elegance of Unicode.

If you are interested, here is a nice video explaining it:

https://www.youtube.com/watch?v=MijmeoH9LT4


I assume that you get the error "ORA-12704: character set mismatch" because your data inside quotes is treated as CHAR while your fields are NCHAR, so the two sides use different character sets: one uses NLS_CHARACTERSET, the other NLS_NCHAR_CHARACTERSET.

When you use the UNISTR function, it converts data from CHAR to NCHAR (and it also decodes escaped values into characters), as the Oracle docs say:

"UNISTR takes as its argument a text literal or an expression that resolves to character data and returns it in the national character set."

When you convert values explicitly using N or TO_NCHAR, you only get values in NLS_NCHAR_CHARACTERSET without any decoding. If you have values encoded like "\00E0", they will not be decoded and will be passed through unchanged.
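A quick way to see the difference (the \00E0 escape stands for à in UNISTR notation):

    SELECT UNISTR('\00E0') AS unistr_value,  -- decoded to the single character à
           N'\00E0'        AS n_value        -- kept literally as the five characters \00E0
    FROM DUAL;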

So if you have an insert such as:

   insert into ... (...)
   select N'my string with special chars like \00E0',
          UNISTR('my string with special chars like \00E0')
   from dual ...

your data in the first inserted field will be 'my string with special chars like \00E0', not 'my string with special chars like à'. This is the only side effect I'm aware of. Other queries should already use the NLS_NCHAR_CHARACTERSET encoding, so there shouldn't be any problem with using an explicit conversion.

And by the way, why not just insert all values as N'my string with special chars like à'? Just encode them into UTF-16 first (I assume that you use UTF-16 for NCHAR columns) if the 'upper level' software uses a different encoding.
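For instance, provided the statement text itself reaches the database in an encoding that can represent the character, the test table from the answer above accepts the literal directly:

    INSERT INTO test (val) VALUES (N'à');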


  • Use of the N function - you already have answers above.
  • If you have any chance to change the character set of the database, that would really make your life easier. I have worked on huge production systems, and the trend is that, since storage space is cheap, practically everyone moves to AL32UTF8, and the hassle of internationalization slowly becomes a painful memory of the past.

    I found the easiest thing is to use AL32UTF8 as the character set of the database instance and simply use VARCHAR2 everywhere. We read and write standard Java Unicode strings via JDBC as bind variables without any harm or fiddling.

    Your idea of constructing one huge SQL text of inserts may not scale well, for multiple reasons:

  • there is a fixed maximum length for a SQL statement - so it won't work with 10,000 inserts
  • it is advised to use bind variables (and then you don't have the N'xxx' vs UNISTR mess either)
  • creating a new SQL statement dynamically is very resource-unfriendly. It does not allow Oracle to cache any execution plan, and it forces Oracle to hard-parse your long statement on every call.
  • What you're trying to achieve is a mass insert. Use the JDBC batch mode of the Oracle driver to perform that at light speed, see e.g.: http://viralpatel.net/blogs/batch-insert-in-java-jdbc/ (a server-side PL/SQL sketch of the same bulk idea follows after this list).

    Note that insert speed is also affected by triggers (which have to be executed) and foreign key constraints (which have to be validated). So if you're about to insert more than a few thousand rows, consider disabling the triggers and foreign key constraints, and re-enable them after the insert. (You'll lose the trigger calls, but the constraint validation after the insert can make an impact.)

    Also consider the rollback segment size. If you're inserting a million records, that will need a huge rollback segment, which will likely cause serious swapping on the storage media. A good rule of thumb is to commit after every 1,000 records.

    (Oracle uses versioning instead of shared locks, so a table with uncommitted changes is still consistently available for reading. A commit after every 1,000 records means roughly 1 commit per second - slow enough to benefit from write buffers, but quick enough not to interfere with other people who want to update the same table.)
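    As a rough server-side illustration of the bulk idea mentioned above (this is not the JDBC batch approach, just a PL/SQL sketch; the table my_table and its column txt are made up):

        DECLARE
          -- hypothetical target table: my_table (txt NVARCHAR2(100))
          TYPE t_txt_tab IS TABLE OF NVARCHAR2(100) INDEX BY PLS_INTEGER;
          l_rows t_txt_tab;
        BEGIN
          -- fill the collection from the application; dummy data here
          FOR i IN 1 .. 10000 LOOP
            l_rows(i) := N'my string with special chars like ' || UNISTR('\00E0');
          END LOOP;

          -- FORALL bulk-binds the whole collection into a single INSERT:
          -- the statement is parsed once and the values travel as bind variables
          FORALL i IN 1 .. l_rows.COUNT
            INSERT INTO my_table (txt) VALUES (l_rows(i));

          COMMIT;  -- for very large volumes, chunk the collection and commit per chunk
        END;
        /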
