Java read file got a leading BOM [ ï»¿ ]

孤独总比滥情好 2020-12-20 23:15

I am reading a file containing keywords line by line and found a strange problem. I expect that when consecutive lines have the same contents, they should be handl

6 answers
  •  -上瘾入骨i
    2020-12-20 23:48

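    The Byte Order Mark (BOM) is the Unicode character U+FEFF. Its use is optional; when present, it appears at the very start of the text stream. In UTF-8 it is encoded as the bytes EF BB BF, so a reader that decodes the file with a single-byte charset such as ISO-8859-1 shows the characters ï»¿ at the start of the first line.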
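
To make the effect concrete, here is a minimal sketch that decodes the three BOM bytes with two different charsets. The class and method names are made up for illustration and are not part of the original answer.

```java
import java.nio.charset.StandardCharsets;

class BomDecodingDemo {
    // The three bytes that encode U+FEFF in UTF-8.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // A single-byte charset maps every byte to its own character.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // UTF-8 combines the three bytes into one character; Java keeps it in the result.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String latin1 = decodeAsLatin1(UTF8_BOM);
        String utf8 = decodeAsUtf8(UTF8_BOM);
        System.out.println(latin1.length());            // 3
        System.out.println(utf8.length());              // 1
        System.out.println(utf8.charAt(0) == '\uFEFF'); // true
    }
}
```

The three-character result is the ï»¿ prefix from the question; the one-character result is the invisible U+FEFF that remains even when the charset is correct.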

    • Microsoft compilers and interpreters, and many pieces of software on Microsoft Windows such as Notepad treat the BOM as a required magic number rather than use heuristics. These tools add a BOM when saving text as UTF-8, and cannot interpret UTF-8 unless the BOM is present or the file contains only ASCII. Google Docs also adds a BOM when converting a document to a plain text file for download.

    File file = new File( csvFilename );
    FileInputStream inputStream = new FileInputStream(file);
    // Decode the bytes explicitly as UTF-8 instead of relying on the platform default charset
    InputStreamReader inputStreamReader = new InputStreamReader( inputStream, "UTF-8" );
    

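    We can resolve this by explicitly passing UTF-8 as the charset to InputStreamReader. In UTF-8, the byte sequence EF BB BF (displayed as ï»¿ under a single-byte charset) decodes to a single character, U+FEFF. Java's UTF-8 decoder does not remove that character, so it still has to be stripped from the first value that is read.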
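
One way to drop the leftover U+FEFF at the reader level is to peek at the first character and only rewind when it is not a BOM. This is a sketch using JDK classes only; `BomSkippingReader`, `open` and `firstLine` are hypothetical names, not something the original answer defines.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomSkippingReader {
    // Hypothetical helper: opens a UTF-8 reader and skips one leading U+FEFF if present.
    static BufferedReader open(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1);             // remember the start of the stream
        int first = reader.read();  // -1 on an empty stream
        if (first != 0xFEFF) {
            reader.reset();         // not a BOM: rewind so nothing is lost
        }
        return reader;
    }

    // Convenience wrapper for the demo: returns the first line of the given bytes.
    static String firstLine(byte[] data) {
        try (BufferedReader reader = open(new ByteArrayInputStream(data))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'K', 'e', 'y', '1'};
        byte[] withoutBom = {'K', 'e', 'y', '1'};
        System.out.println(firstLine(withBom));    // Key1
        System.out.println(firstLine(withoutBom)); // Key1
    }
}
```

In the CSV example, `open(new FileInputStream(file))` could stand in for the plain `InputStreamReader`, so the first header is already clean when it is read.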

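    Using Guava's CharMatcher, you can remove invisible characters and then retain only the ASCII characters (which also drops accented letters). Older Guava releases exposed these matchers as the constants CharMatcher.INVISIBLE and CharMatcher.ASCII; newer releases provide them as the factory methods used here:

    String printable = CharMatcher.invisible().removeFrom( input );
    String clean = CharMatcher.ascii().retainFrom( printable );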
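
If Guava is not available, or accented characters must be kept, a JDK-only variant might look like the sketch below. It is a swapped-in alternative rather than the answer's method: `HeaderCleaner` and both method names are invented here, and `retainAscii` is only a rough analogue of the Guava call above.

```java
class HeaderCleaner {
    // Removes only a leading U+FEFF and leaves every other character untouched.
    static String stripLeadingBom(String s) {
        return s.startsWith("\uFEFF") ? s.substring(1) : s;
    }

    // Rough JDK-only analogue of retaining ASCII: drops every char outside U+0000..U+007F.
    static String retainAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        String header = "\uFEFFCaf\u00E9";                   // BOM followed by "Café"
        System.out.println(stripLeadingBom(header).length()); // 4, the accent survives
        System.out.println(retainAscii(header));              // Caf, BOM and accent both dropped
    }
}
```

Stripping only the BOM is the less lossy choice when headers or values can legitimately contain non-ASCII text.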
    

    Full example that reads data from a CSV file into JSON objects:

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;

    import com.google.common.base.CharMatcher;
    // Assumption: JSONObject comes from json-simple; the original post does not show its imports.
    import org.json.simple.JSONObject;

    public class CSV_FileOperations {
        static List<HashMap<String, String>> listObjects = new ArrayList<>();
        protected static List<JSONObject> jsonArray = new ArrayList<>();

        public static void main(String[] args) {
            String csvFilename = "D:/Yashwanth/json2Bson.csv";

            csvToJSONString(csvFilename);
            String jsonData = jsonArray.toString();
            System.out.println("File JSON Data : \n" + jsonData);
        }

        @SuppressWarnings("deprecation") // kept from the original; CharMatcher.invisible() is deprecated in recent Guava
        public static String csvToJSONString( String csvFilename ) {
            String fileExtensionName = csvFilename.substring(csvFilename.lastIndexOf("."));
            System.out.println("File Extension : " + fileExtensionName);

            // Result shape: [{"Key2":"21","Key1":"11","Key3":"31"} ]
            // try-with-resources closes the reader and the underlying streams, even on failure.
            try (BufferedReader buffer = new BufferedReader(
                    new InputStreamReader(new FileInputStream(csvFilename), StandardCharsets.UTF_8))) {
                boolean headerStream = true;
                List<String> headers = new ArrayList<>();

                String line;
                while ((line = buffer.readLine()) != null) {
                    String[] columns = line.split(",");
                    if (headerStream) {
                        System.out.println(" ===== Headers =====");

                        for (String keys : columns) {
                            // ï»¿ (UTF-8 BOM) decodes to U+FEFF « https://stackoverflow.com/a/11021401/5081877
                            String printable = CharMatcher.invisible().removeFrom( keys );
                            String clean = CharMatcher.ascii().retainFrom( printable );
                            // replaceAll takes a regex; String.replace would search for the literal text
                            String key = clean.replaceAll("\\P{Print}", "");
                            headers.add( key );
                        }
                        headerStream = false;
                        System.out.println(" ===== ----- Data ----- =====");
                    } else {
                        addCSVData(headers, columns);
                    }
                }
                return jsonArray.toString();
            } catch (IOException e) { // also covers FileNotFoundException
                e.printStackTrace();
                return null;
            }
        }

        @SuppressWarnings("unchecked") // under the json-simple assumption, JSONObject extends a raw HashMap
        public static void addCSVData( List<String> headers, String[] row ) {
            if( headers.size() == row.length ) {
                HashMap<String, String> mapObj = new HashMap<>();
                JSONObject jsonObj = new JSONObject();
                for (int i = 0; i < row.length; i++) {
                    mapObj.put(headers.get(i), row[i]);
                    jsonObj.put(headers.get(i), row[i]);
                }
                jsonArray.add(jsonObj);
                listObjects.add(mapObj);
            } else {
                // Skip rows whose column count does not match the header row
                System.out.println("Avoiding the Row Data...");
            }
        }
    }
    

    Contents of the json2Bson.csv file:

    Key1    Key2    Key3
    11  21  31
    12  22  32
    13  23  33
    
