Behavior of using \Z vs \z as Scanner delimiter

Submitted by 孤者浪人 on 2020-01-02 04:30:09

Question


[Edit] I found the answer, but I can't answer the question due to restrictions on new users. Either way, this is a known bug in Java.

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8028387

I'm trying to read a file into a string in Java 6 on 64-bit Ubuntu. Java is giving me a very strange result: with "\\Z" it reads the entire file, but with "\\z" it reads only the first 1024 characters. I've read the Java 6 API documentation for all the classes involved and I am at a loss.

Descriptions of \Z and \z can be found at:

http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#lt

What could be causing this strange behavior?

String fileString = new Scanner(new File(fileName)).useDelimiter("\\z").next();
String fileString2 = new Scanner(new File(fileName)).useDelimiter("\\Z").next();
System.out.println("using Z : " + fileString2.length());
System.out.println("Using z : " + fileString.length());

Output:

using Z : 9720
Using z : 1024

Thanks!

Details about the file/java-version:

Running Ubuntu with java-6-openjdk-amd64 (also tested with Oracle Java 6). The file is a simple UTF-8 encoded text file.


Answer 1:


As the Pattern documentation states:

  • \z The end of the input
  • \Z The end of the input but for the final terminator, if any
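The difference between the two anchors can be checked on a plain string, independently of Scanner. The following is a minimal sketch; the class name `EndAnchorDemo` and the helper `firstMatchIndex` are hypothetical names chosen for illustration, and only `java.util.regex.Pattern` and `Matcher` from the standard library are used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question or answer.
class EndAnchorDemo {
    // Returns the index of the first match of regex in input, or -1 if there is none.
    static int firstMatchIndex(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "abc\n"; // four characters, ending in a line terminator
        // \Z can match before the final line terminator; \z only at the very end.
        System.out.println("\\Z first matches at index " + firstMatchIndex("\\Z", input));
        System.out.println("\\z first matches at index " + firstMatchIndex("\\z", input));
    }
}
```

When the input has no trailing line terminator, both anchors match at the same position, which is why the two delimiters would normally be expected to read a file of the same length.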

I suspect this is because Scanner's buffer size is set to 1024:

private static final int BUFFER_SIZE = 1024; // change to 1024;

The Scanner reads that many characters and treats them as its current input, so \z can match at the end of that buffer. \Z does not match there, because more of the input remains to be read and the end of the buffer is not yet the final end of the input.
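If the goal is simply to get the whole file into a string, one way to sidestep the delimiter question is to read through a Reader in a loop instead of relying on Scanner. This is a minimal sketch under assumptions, not the fix from the bug report: the class `ReadWholeFile` and its methods are hypothetical names, the file is assumed to be UTF-8 as described in the question, and the code avoids try-with-resources so that it also compiles on Java 6.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper class; names are illustrative only.
class ReadWholeFile {
    // Drains the reader into a String, however many read() calls that takes.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Opens the file as UTF-8 (an assumption taken from the question) and reads all of it.
    static String readFile(File file) throws IOException {
        Reader reader = new InputStreamReader(new FileInputStream(file), "UTF-8");
        try {
            return readAll(reader);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadWholeFile <file>");
            return;
        }
        System.out.println("length: " + readFile(new File(args[0])).length());
    }
}
```

Because the loop only stops when `read` returns -1, the result does not depend on any internal buffer size.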



Source: https://stackoverflow.com/questions/22350037/behavior-of-using-z-vs-z-as-scanner-delimiter
