Split a quoted string with a delimiter

清酒与你 2020-12-06 19:14

I want to split a string using whitespace as the delimiter, but it should handle quoted strings intelligently. E.g., for a string like

"John Smith" Ted Barry
5 Answers
  •  暖寄归人
    2020-12-06 19:21

    This is my own version, cleaned up from http://pastebin.com/aZngu65y (posted in the comments). It handles Unicode. It collapses all excess whitespace, even inside quotes, which can be good or bad depending on your needs. Escaped quotes are not supported.

    // Requires: import java.util.Arrays;
    // Returns the tokens, or null if the double quotes do not match up.
    private static String[] parse(String param) {
    
      param = param.replaceAll("\"", " \" ").trim();
      String[] fragments = param.split("\\s+");
    
      int curr = 0;
      boolean matched = fragments[curr].matches("[^\"]*");
      if (matched) curr++;
    
      for (int i = 1; i < fragments.length; i++) {
        if (!matched)
          fragments[curr] = fragments[curr] + " " + fragments[i];
    
        if (!fragments[curr].matches("(\"[^\"]*\"|[^\"]*)"))
          matched = false;
        else {
          matched = true;
    
          if (fragments[curr].matches("\"[^\"]*\""))
            fragments[curr] = fragments[curr].substring(1, fragments[curr].length() - 1).trim();
    
          if (fragments[curr].length() != 0)
            curr++;
    
          if (i + 1 < fragments.length)
            fragments[curr] = fragments[i + 1];
        }
      }
    
      if (matched) { 
        return Arrays.copyOf(fragments, curr);
      }
    
      return null; // Parameter failure (double-quotes do not match up properly).
    }
    

    Sample input for comparison:

    "sdfskjf" sdfjkhsd "hfrif ehref" "fksdfj sdkfj fkdsjf" sdf sfssd
    
    
    asjdhj    sdf ffhj "fdsf   fsdjh"
    日本語 中文 "Tiếng Việt" "English"
        dsfsd    
       sdf     " s dfs    fsd f   "  sd f   fs df  fdssf  "日本語 中文"
    ""   ""     ""
    "   sdfsfds "   "f fsdf
    

    (The 2nd line is empty, the 3rd line contains only spaces, and the last line is malformed.) Please judge against your own expected output, since it may vary, but the baseline is that the 1st case should return [sdfskjf, sdfjkhsd, hfrif ehref, fksdfj sdkfj fkdsjf, sdf, sfssd].
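
For comparison, here is a minimal sketch of a different approach, not the method from the answer above: a single regular expression scanned with `Matcher.find()`. The class name `QuotedSplitter` and the method `split` are hypothetical names chosen for illustration. Unlike `parse`, this sketch keeps whitespace inside quotes as-is, and it throws an exception instead of returning null when the quotes are unbalanced. Like `parse`, it drops empty quoted tokens and does not support escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedSplitter {
    // Either a double-quoted run (group 1 captures its contents), or a run of
    // characters that are neither whitespace nor a double quote.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|[^\\s\"]+");

    public static List<String> split(String input) {
        // Without escape support, an odd number of quotes means one is unmatched.
        long quotes = input.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
            throw new IllegalArgumentException("Unbalanced double quotes: " + input);
        }

        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String quoted = m.group(1);              // null when the bare-word branch matched
            String token = (quoted != null) ? quoted : m.group();
            if (!token.isEmpty()) {                  // skip empty quoted strings such as ""
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String first = "\"sdfskjf\" sdfjkhsd \"hfrif ehref\" \"fksdfj sdkfj fkdsjf\" sdf sfssd";
        // prints [sdfskjf, sdfjkhsd, hfrif ehref, fksdfj sdkfj fkdsjf, sdf, sfssd]
        System.out.println(split(first));
    }
}
```

Whether inner whitespace should be preserved or collapsed depends on the use case. If the collapsing behavior of `parse` is wanted, each quoted token could additionally be normalized with `token.trim().replaceAll("\\s+", " ")`.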
