Oracle regular expression having a hyphen doesn't give same result on Windows as on Unix

Submitted by 我怕爱的太早我们不能终老 on 2020-01-03 17:32:47

Question


I have the following query with a regular expression:

select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG 
from dual;

When executed via SQL*Plus on a Windows machine it returns the following:

SQL>  select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST 3304 V2

On a SunOS machine I get a different result:

SQL>  select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST      V

These queries were run against the same Oracle server. Is there any reason for that difference in the output?

SQL*Plus version on Windows:

SQL*Plus: Release 11.2.0.1.0 Production on Mar. Oct. 14 15:36:35 2014

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL*Plus version on Unix:

SQL*Plus: Release 11.2.0.2.0 Production on Tue Oct 14 16:01:26 2014

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Answer 1:


As Avinash Raj said in the comments, the hyphen in your regular expression pattern is being interpreted as a range. The behaviour appears to depend on the sort order each of the two clients uses, which comes from the NLS_LANG environment variable because that influences the NLS_SORT value.

With NLS_LANG=ENGLISH_UNITED KINGDOM.WE8ISO8859P1:

SQL> select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST      V

SQL> select value from nls_session_parameters where parameter = 'NLS_SORT';

VALUE
----------
BINARY

Going out on a limb as your profile says you're in Morocco, with NLS_LANG="ARABIC_MOROCCO.AR8MSWIN1256":

SQL> select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST 3304 V2

SQL> select value from nls_session_parameters where parameter = 'NLS_SORT';

VALUE
----------
ARABIC

The reason is that the pattern segment +-= is treated as a range covering all characters from + to =. In the ISO8859-1 and Windows 1252 character sets that is characters 43 to 61, and all the numeric digits fall within that range (zero is 48, for example), so the regex replaces them. That is also true in the Windows 1256 character set, and in anything else based on ASCII.
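The code-point arithmetic can be checked outside the database. This is only a sketch in Python, not Oracle: Python's `re` module always orders character-class ranges by code point, which corresponds to the BINARY case here, and the variable names are made up for illustration.

```python
import re

# The character class from the question; "+-=" inside it is a range.
PATTERN = r"[`~!@#$%^&*()_+-={}|;.:<>?,./]"

# Code points of the range endpoints: '+' is 43 and '=' is 61.
low, high = ord("+"), ord("=")

# Every ASCII digit (48..57) lies between the two endpoints.
digits_in_range = all(low <= ord(d) <= high for d in "0123456789")

# With code-point ordering the digits are therefore replaced by spaces.
result = re.sub(PATTERN, " ", "TEST 3304 V2")

print(low, high, digits_in_range)
print(repr(result))
```

The printed result has the digits blanked out, matching the BINARY output above. It says nothing about how a linguistic sort such as ARABIC orders the same characters.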

But your NLS_LANG is also implicitly changing the sort order, and it's the switch from BINARY to ARABIC sorting that changes the behaviour. You can see that within a single session; with NLS_LANG=ENGLISH_UNITED KINGDOM.WE8ISO8859P1:

SQL> select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST      V

SQL> alter session set NLS_SORT=ARABIC;

Session altered.

SQL> select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-={}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST 3304 V2

You can also tell that it is a range issue by slightly modifying the range; changing +-= to +-3 so higher digits are not included, but leaving everything else the same:

SQL> alter session set NLS_SORT=BINARY;

Session altered.

SQL> select REGEXP_REPLACE ('TEST 3304 V2', '[`~!@#$%^&*()_+-3{}|;.:<>?,./]', ' ') as REG from dual;

REG
------------
TEST    4 V
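The narrowed range can be reproduced the same way. Again this is a Python sketch under the assumption that code-point ordering stands in for NLS_SORT=BINARY; it is not Oracle's regex engine, and the names are illustrative.

```python
import re

# Same class as before, but the range now ends at '3' instead of '='.
NARROW_PATTERN = r"[`~!@#$%^&*()_+-3{}|;.:<>?,./]"

# '+' is 43 and '3' is 51, so the digits 0..3 are inside the range
# while '4' (code point 52) falls just outside it.
inside = [d for d in "0123456789" if ord("+") <= ord(d) <= ord("3")]

narrow_result = re.sub(NARROW_PATTERN, " ", "TEST 3304 V2")

print(inside)
print(repr(narrow_result))
```

Only the 4 survives among the digits, which is what you would expect if the hyphen is acting as a range operator rather than as a literal character.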

Read more about linguistic sorting.

Relying on NLS settings is always risky though, so it's better to avoid the range issue entirely by moving the hyphen to the beginning or end of the bracket expression, for example '[`~!@#$%^&*()_+={}|;.:<>?,./-]', which stops it being seen as a range at all; again as Avinash Raj suggested.
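As a quick sanity check of that advice, here is a Python sketch with the hyphen moved to the end of the class. Python's `re` is used only as a stand-in that compares by code point; the sample string "A-B+C" is invented for the test and does not come from the question.

```python
import re

# Hyphen placed last, so it is a literal '-' and no range is formed.
SAFE_PATTERN = r"[`~!@#$%^&*()_+={}|;.:<>?,./-]"

# Digits are no longer caught by an accidental "+-=" range.
kept = re.sub(SAFE_PATTERN, " ", "TEST 3304 V2")

# The hyphen itself, and '+', are still replaced as intended.
cleaned = re.sub(SAFE_PATTERN, " ", "A-B+C")

print(repr(kept))
print(repr(cleaned))
```

With no range in the class there is nothing for the sort order to reinterpret, so the result should no longer depend on which client runs the query.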



Source: https://stackoverflow.com/questions/26363671/oracle-regular-expression-having-a-hyphen-doesnt-give-same-result-on-windows-as
