Oracle JDBC charset and 4000 char limit

℡╲_俬逩灬. Submitted on 2019-12-03 06:59:12

Prior to Oracle 12.1, a VARCHAR2 column is limited to storing 4000 bytes of data in the database character set even if it is declared VARCHAR2(4000 CHAR). Since every character in your string requires 2 bytes of storage in the UTF-8 character set, you won't be able to store more than 2000 characters in the column. Of course, that number will change if some of your characters actually require just 1 byte of storage or if some of them require more than 2 bytes of storage. When the database character set is Windows-1252, every character in your string requires only a single byte of storage so you'll be able to store 4000 characters in the column.
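To see how many bytes a given string would need, you can measure its encoded length on the Java side. This is only a rough sketch: the class name `ByteLengthDemo` and the helper `byteLength` are made up for illustration, and it assumes that Java's UTF-8 encoder produces the same byte counts as the database character set (true for characters such as "äöü").

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ByteLengthDemo {
    // Hypothetical helper: number of bytes needed to encode s in the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "\u00e4\u00f6\u00fc"; // "äöü": three characters
        // Each of these characters takes 2 bytes in UTF-8.
        System.out.println(byteLength(s, StandardCharsets.UTF_8));          // 6
        // Each of them takes 1 byte in Windows-1252.
        System.out.println(byteLength(s, Charset.forName("windows-1252"))); // 3
    }
}
```

So a 4000-character string made only of such characters needs 8000 bytes in UTF-8 but only 4000 bytes in Windows-1252.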

Since you have longer strings, would it be possible to declare the column as a CLOB rather than as a VARCHAR2? That would (effectively) remove the length limitation (there is a limit on the size of a CLOB that depends on the Oracle version and the block size but it's at least in the multiple GB range).

If you happen to be using Oracle 12.1 or later, setting the max_string_size parameter to EXTENDED increases the maximum size of a VARCHAR2 column from 4000 bytes to 32767 bytes.

I solved this problem by cutting the String to the required byte length. Note that this can't be done by simply using

stat.substring(0, length)

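since substring counts Java chars, not bytes, so the UTF-8 encoding of the resulting String might be up to three times longer than allowed. Instead, shorten the String until its encoded form fits: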

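// Requires: import java.nio.charset.StandardCharsets;
// Drop trailing chars until the UTF-8 encoding fits into 'length' bytes.
while (stat.getBytes(StandardCharsets.UTF_8).length > length) {
  stat = stat.substring(0, stat.length() - 1);
}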

Note: do not use the no-argument stat.getBytes(), since it depends on the platform default charset (the 'file.encoding' setting) and produces either Windows-1252 or UTF-8 bytes!
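The loop above re-encodes the whole string on every pass, and cutting one char at a time can leave half of a surrogate pair behind. A possible alternative, offered as a sketch rather than a definitive implementation, is to let a CharsetEncoder fill a buffer of exactly the allowed size and keep only the chars it managed to consume. The class name `Utf8Truncate` and the helper `truncateUtf8` are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Truncate {
    // Hypothetical helper: longest prefix of s whose UTF-8 encoding fits in maxBytes.
    static String truncateUtf8(String s, int maxBytes) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                // Mirror String.getBytes: replace unpaired surrogates instead of stopping.
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer out = ByteBuffer.allocate(maxBytes); // hard byte limit
        CharBuffer in = CharBuffer.wrap(s);
        // Encoding stops with an overflow result once the next character no longer
        // fits; the input position then marks the last fully encoded character.
        encoder.encode(in, out, true);
        return s.substring(0, in.position());
    }

    public static void main(String[] args) {
        // Four 2-byte characters cut to at most 5 bytes.
        System.out.println(truncateUtf8("\u00e4\u00e4\u00e4\u00e4", 5));
    }
}
```

Because the encoder only advances past characters it has written completely, a 4-byte character (a surrogate pair in Java) is either kept whole or dropped whole.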

If you use Hibernate, you can apply this truncation in one place by implementing org.hibernate.Interceptor.
