Trying to read binary file as text but scanner stops at first line

Submitted by 老子叫甜甜 on 2019-12-02 13:53:43

Question


I'm trying to read a binary file, but my program stops at the first line. I think it's because of the strange characters in the file. I just want to extract some file paths from it. Is there a way to do this?

public static void main(String[] args) throws IOException
{

    Scanner readF = new Scanner(new File("D:\\CurrentDatabase_372.txt"));
    String line = null;
    String newLine = System.getProperty("line.separator");
    FileWriter writeF = new FileWriter("D:\\Songs.txt");

    while (readF.hasNext())
    {
        line = readF.nextLine();

        if (line.contains("D:\\") && line.contains(".mp3"))
        {
            writeF.write(line.substring(line.indexOf("D:\\"), line.indexOf(".mp3") + 4) + newLine);
        }
    }

    readF.close();
    writeF.close();
}

The file starts like this:

pppppamepD:\Music\Korn\Untouchables\03     Blame.mp3pmp3pmp3pKornpMetalpKornpUntouchablespKornpUntouchables*;*KornpKornpKornUntouchables003pMetalKornUntouchables003pBlameKornUntouchables003pKornKornUntouchables003pMP3pppppCpppÀppp@ppøp·pppŸú#pdppppppòrSpUpppppp€ppªp8›qpppppppppppp,’ppÒppp’ÍpET?ppppppôpp¼}`Ñ#ãâK†¡H¤*(DppppppppppppppppuÞѤéú:M®$@]jkÝW0ÛœFµú½XVNp`w—wâÊp:ºŽwâÊpppp8Npdpp¡pp{)pppppppppppppppppyY:¸[ªA¥Bi   `Û¯pppppppppppp2pppppppppppppppppppppppppppppppppppp¿ÞpAppppppp€ppp€;€?€CpCpC€H€N€S€`€e€y€~p~p~€’€«€Ê€â€Hollow LifepD:\Musica\Korn\Untouchables\04 Hollow Life.mp3pmp3pmp3pKornpMetalpKornpUntouchablespKornpUntouchables*;*KornpKornpKornUntouchables004pMetalKornUntouchables004pHollow LifeKornUntouchables004pKornKornUntouchables004pMP3pppppCpppÀHppppppøp¸pppǺxp‰ppppppòrSpUpppppp€ppªp8›qpppppppppppp,’ppÒpppŠºppppppppppôpp¼}`Ñ#ãâK†¡H¤*(DpppppppppppppppppãG#™R‚CA—®þ^bN °mbŽ‚^¨pG¦sp;5p5ÓÐùšwâÊp
)ŽwâÊpppp8Npdpp!cpp{pppppppppppppppppyY:¸[ªA¥Bi `ۯǺxp‰pppppp2pppppppppppppppppppppppppppppppppppp¿

I want to extract file paths like "D:\Music\Korn\Untouchables\03 Blame.mp3".


Answer 1:


You cannot use a line-oriented scanner to read binary files. You have no guarantee that the binary file even has "lines" delimited by newline characters. For example, what would your scanner do if there were TWO files matching the pattern "D:\.*.mp3" with no intervening newline? You would extract everything between the first "D:\" and the last ".mp3", with all the garbage in between. Extracting file names from a non-delimited stream such as this requires a different strategy.

If I were writing this, I'd use a relatively simple finite-state recognizer that processes characters one at a time. When it encounters a "D" it starts saving characters, checking each one to ensure that it still fits the required pattern, and it finishes when it sees the "3" in ".mp3". If at any point it detects a character that doesn't fit, it resets and continues looking.
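The recognizer described above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name `Mp3PathRecognizer`, the `isPathChar` rule (control characters and `< > : " | ? *` end a candidate), and the 260-character cap are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class Mp3PathRecognizer {
    private static final String PREFIX = "D:\\";
    private static final String SUFFIX = ".mp3";
    // Assumption: give up on a candidate once it grows past a typical Windows path limit.
    private static final int MAX_LEN = 260;

    // Assumption: control characters and characters Windows forbids in names cannot be part of a path.
    private static boolean isPathChar(char c) {
        return c >= ' ' && "<>:\"|?*".indexOf(c) < 0;
    }

    public static List<String> extract(CharSequence data) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int matched = 0; // number of PREFIX characters recognized so far

        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (matched < PREFIX.length()) {
                // State 1: still matching the "D:\" prefix.
                if (c != PREFIX.charAt(matched)) {
                    current.setLength(0);
                    matched = 0;
                }
                // After a reset, the same character may begin a new prefix.
                if (c == PREFIX.charAt(matched)) {
                    current.append(c);
                    matched++;
                }
            } else if (isPathChar(c) && current.length() < MAX_LEN) {
                // State 2: inside a candidate path; keep saving characters.
                current.append(c);
                int tail = current.length() - SUFFIX.length();
                if (current.indexOf(SUFFIX, tail) == tail) {
                    // Accept: the candidate now ends in ".mp3".
                    paths.add(current.toString());
                    current.setLength(0);
                    matched = 0;
                }
            } else {
                // Character does not fit (or candidate too long): reset and keep looking.
                current.setLength(0);
                matched = 0;
            }
        }
        return paths;
    }
}
```

Because the sketch accepts at the first ".mp3" it sees, a directory name that itself contains ".mp3" would cut the path short; a stricter version might also require a terminator byte after the extension.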

EDIT: If the files to be processed are small (less than 50 MB or so), you could load the entire file into memory, which would make scanning simpler.
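For the in-memory variant, one possible sketch is to read all bytes at once and run a regular expression over the result. The class name `InMemoryScan`, the choice of ISO-8859-1 decoding (so each byte becomes exactly one character), and the character class used in the pattern are assumptions for illustration and are not part of the original answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InMemoryScan {
    // "D:\", then as few path-legal characters as possible, then ".mp3".
    // Excluding ':' and control characters keeps a match from spanning two paths.
    private static final Pattern MP3_PATH =
            Pattern.compile("D:\\\\[^\\x00-\\x1F<>:\"|?*]*?\\.mp3");

    // Collects every non-overlapping match in text that is already in memory.
    public static List<String> findPaths(String text) {
        List<String> paths = new ArrayList<>();
        Matcher m = MP3_PATH.matcher(text);
        while (m.find()) {
            paths.add(m.group());
        }
        return paths;
    }

    // Loads the whole file; only sensible for files that comfortably fit in memory.
    public static List<String> findPaths(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return findPaths(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

The reluctant quantifier `*?` is what avoids the "first D:\ to last .mp3" problem mentioned earlier, since each match stops at the nearest ".mp3".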




Answer 2:


As was said, since it is a binary file, you can't use a Scanner or other character-based readers. You could use a regular FileInputStream to read the raw bytes of the file. Java's String class has a constructor that takes an array of bytes and turns them into a string. You can then search that string for the file name(s). This may work if you just use the default character set.

String(byte[]): http://download.oracle.com/javase/1.4.2/docs/api/java/lang/String.html
FileInputStream for reading bytes: http://download.oracle.com/javase/tutorial/essential/io/bytestreams.html
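A minimal sketch of this byte-oriented approach might look like the following. The class name `RawByteSearch` and both helper methods are hypothetical. It also departs from the answer in one respect: instead of the default character set it passes ISO-8859-1 explicitly, on the assumption that a one-byte-per-character mapping is safer for arbitrary binary data than a platform default such as UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RawByteSearch {
    // Reads the stream in chunks and decodes every byte as one ISO-8859-1 character.
    public static String readAsLatin1(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Plain indexOf search: pairs each "D:\" with the nearest following ".mp3".
    public static List<String> findPaths(String text) {
        List<String> paths = new ArrayList<>();
        int start = text.indexOf("D:\\");
        while (start >= 0) {
            int end = text.indexOf(".mp3", start);
            if (end < 0) {
                break; // no further ".mp3" anywhere in the text
            }
            // If another "D:\" begins before this ".mp3", restart from the later one.
            int next = text.indexOf("D:\\", start + 1);
            if (next >= 0 && next < end) {
                start = next;
                continue;
            }
            paths.add(text.substring(start, end + ".mp3".length()));
            start = text.indexOf("D:\\", end);
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the binary file to scan.
        try (InputStream in = new FileInputStream(args[0])) {
            for (String path : findPaths(readAsLatin1(in))) {
                System.out.println(path);
            }
        }
    }
}
```

Unlike the original loop, this search does not depend on newline characters at all, so several paths in one unbroken run of bytes are still found individually.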




Answer 3:


Use hasNextLine() instead of hasNext() in the while loop check.

while (readF.hasNextLine()) {
    String line = readF.nextLine();
    // Your code
}


Source: https://stackoverflow.com/questions/5347327/trying-to-read-binary-file-as-text-but-scanner-stops-at-first-line
