Matching brackets in a string

小蘑菇 2020-12-03 03:47

What is the most efficient or elegant method for matching brackets in a string such as:

"f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] // z"
9 answers
  •  北海茫月
    2020-12-03 04:24

    I can offer a heavyweight approach (not especially elegant). Below is my implementation of a bare-bones Mathematica parser. It only works for strings containing the FullForm of the code, with the possible exception of double brackets, which I will use here. It is built on fairly general breadth-first parsing functionality that I developed mostly to implement an HTML parser:

    ClearAll[listSplit, reconstructIntervals, groupElements, 
    groupPositions, processPosList, groupElementsNested];
    
    listSplit[x_List, lengthlist_List, headlist_List] := 
      MapThread[#1 @@ Take[x, #2] &, {headlist, 
        Transpose[{Most[#] + 1, Rest[#]} &[
          FoldList[Plus, 0, lengthlist]]]}];
    
    reconstructIntervals[listlen_Integer, ints_List] := 
      Module[{missed, startint, lastint},
        startint  = If[ints[[1, 1]] == 1, {}, {1, ints[[1, 1]] - 1}];
        lastint = 
           If[ints[[-1, -1]] == listlen, {}, {ints[[-1, -1]] + 1, listlen}];
        missed = 
          Map[If[#[[2, 1]] - #[[1, 2]] > 1, {#[[1, 2]] + 1, #[[2, 1]] - 1}, {}] &, 
          Partition[ints, 2, 1]];
        missed = Join[missed, {lastint}];
        Prepend[Flatten[Transpose[{ints, missed}], 1], startint]];
    
    groupElements[lst_List, poslist_List, headlist_List] /; 
     And[OrderedQ[Flatten[Sort[poslist]]], Length[headlist] == Length[poslist]] := 
      Module[{totalheadlist, allints, llist},
        totalheadlist = 
         Append[Flatten[Transpose[{Array[Sequence &, {Length[headlist]}], headlist}], 1], Sequence];
      allints = reconstructIntervals[Length[lst], poslist];
      llist = Map[If[# === {}, 0, 1 - Subtract @@ #] &, allints];
      listSplit[lst, llist, totalheadlist]];
    
      (* To work on general heads, we need this *)
    
    groupElements[h_[x__], poslist_List, headlist_List] := 
       h[Sequence @@ groupElements[{x}, poslist, headlist]];
    
    (* If we have a single head *)
    groupElements[expr_, poslist_List, head_] := 
        groupElements[expr, poslist, Table[head, {Length[poslist]}]];
    
    
    groupPositions[plist_List] :=
         Reap[Sow[Last[#], {Most[#]}] & /@ plist, _, List][[2]];
    
    
    processPosList[{openlist_List, closelist_List}] :=
       Module[{opengroup, closegroup, poslist},
        {opengroup, closegroup} = groupPositions /@ {openlist, closelist} ;
        poslist =  Transpose[Transpose[Sort[#]] & /@ {opengroup, closegroup}];
        If[UnsameQ @@ poslist[[1]],
           Return[(Print["Unmatched lists!", {openlist, closelist}]; {})],
           poslist = Transpose[{poslist[[1, 1]], Transpose /@ Transpose[poslist[[2]]]}]
        ]
    ];
    
    groupElementsNested[nested_, {openposlist_List, closeposlist_List}, head_] /; Head[head] =!= List := 
     Fold[
      Function[{x, y}, 
        MapAt[groupElements[#, y[[2]], head] &, x, {y[[1]]}]], 
      nested, 
      Sort[processPosList[{openposlist, closeposlist}], 
       Length[#2[[1]]] < Length[#1[[1]]] &]];
    
    ClearAll[parse, parsedToCode, tokenize, Bracket ];
    
    (* "tokenize" our string *)
    tokenize[code_String] := 
     Module[{n = 0, tokenrules},
       tokenrules = {"[" :> {"Open", ++n}, "]" :> {"Close", n--}, 
           Whitespace | "" ~~ "," ~~ Whitespace | ""};
       DeleteCases[StringSplit[code, tokenrules], "", Infinity]];
    
    (* parses the "tokenized" string in the breadth-first manner starting 
       with the outermost brackets, using Fold and  groupElementsNested*)
    
    parse[preparsed_] := 
      Module[{maxdepth = Max[Cases[preparsed, _Integer, Infinity]], 
       popenlist, parsed, bracketPositions},
       bracketPositions[expr_, brdepth_Integer] := {Position[expr, {"Open", brdepth}], 
       Position[expr, {"Close", brdepth}]};  
       parsed = Fold[groupElementsNested[#1, bracketPositions[#1, #2], Bracket] &,
                   preparsed, Range[maxdepth]];
       parsed =  DeleteCases[parsed, {"Open" | "Close", _}, Infinity];
       parsed = parsed //. h_[x___, y_, Bracket[z___], t___] :> h[x, y[z], t]];
    
     (* convert our parsed expression into code that Mathematica can execute *)
     parsedToCode[parsed_] :=
     Module[{myHold},
       SetAttributes[myHold, HoldAll];   
       Hold[Evaluate[
         MapAll[# //. x_String :> ToExpression[x, InputForm, myHold] &, parsed] /.
          HoldPattern[Sequence[x__][y__]] :> x[y]]] //. myHold[x___] :> x
    
     ];
    

    (Note the use of MapAll in the last function, parsedToCode.) Here is how you can use it :)

    In[27]:= parsed = parse[tokenize["f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]"]]
    
    Out[27]= {"f"["g"["h"[Bracket[
     "i"[Bracket["j"["2"], 
       "k"[Bracket["1", "m"[Bracket["1", "n"["2"]]]]]]]]]]]}
    
    In[28]:= parsed //. a_[Bracket[b__]] :> "Part"[a, b]
    
    
    Out[28]= {"f"["g"["Part"["h", 
    "Part"["i", "j"["2"], 
     "Part"["k", "1", "Part"["m", "1", "n"["2"]]]]]]]}
    
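    The rewrite rule in In[28] can also be looked at in isolation on a smaller expression. This is only a minimal sketch: the toy expression is made up for illustration and is not taken from the session above, and it assumes Bracket has no definitions, as after the ClearAll earlier.

    ```mathematica
    (* Toy expression in the same string-headed form that parse produces *)
    toy = "h"[Bracket["1", "m"[Bracket["2"]]]];

    (* A head applied to a single Bracket[...] is rewritten as a "Part" call.
       //. repeats the rule until nothing changes, so nested cases are covered. *)
    toy //. a_[Bracket[b__]] :> "Part"[a, b]
    (* "Part"["h", "1", "Part"["m", "2"]] *)
    ```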

    Now you can use parsedToCode:

    In[35]:= parsedToCode[parsed//.a_[Bracket[b__]]:>"Part"[a,b]]//FullForm
    
    Out[35]//FullForm= Hold[List[f[g[Part[h,Part[i,j[2],Part[k,1,Part[m,1,n[2]]]]]]]]]
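
    The depth labels that tokenize attaches to each bracket are what let parse work from the outermost level inward. As a standalone illustration, the same counter logic can be run directly over the characters of a string. This is a minimal sketch and not part of the parser above; the name bracketDepths is hypothetical, and only square brackets are considered.

    ```mathematica
    ClearAll[bracketDepths];

    (* Walk the characters with a depth counter n: an opening bracket is
       labeled with the depth it enters, a closing bracket with the depth
       it leaves. All other characters are ignored. *)
    bracketDepths[s_String] :=
      Module[{n = 0},
        Flatten[
          Reap[
            Scan[
              Switch[#,
                "[", Sow[{"Open", ++n}],
                "]", Sow[{"Close", n--}],
                _, Null] &,
              Characters[s]]][[2]],
          1]];

    bracketDepths["f[g[1], h[2]]"]
    (* {{"Open", 1}, {"Open", 2}, {"Close", 2}, {"Open", 2}, {"Close", 2}, {"Close", 1}} *)
    ```

    An opening bracket and its matching closing bracket carry the same depth, which is what makes it possible to pair their positions level by level.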
    

    EDIT

    Here is the addition needed to perform only the character replacement, as requested:

    Clear[stringify, part, parsedToString];
    stringify[x_String] := x;
    stringify[part[open_, x___, close_]] := 
       part[open, Sequence @@ Riffle[Map[stringify, {x}], ","], close];
    stringify[f_String[x___]] := {f, "[",Sequence @@ Riffle[Map[stringify, {x}], ","], "]"};
    
    parsedToString[parsed_] := 
     StringJoin @@ Flatten[Apply[stringify, 
      parsed //. Bracket[x__] :> part["yourOpenChar", x, "yourCloseChar"]] //. 
        part[x__] :> x];
    

    Here is how we can use it:

    In[70]:= parsedToString[parsed]
    
    Out[70]= "f[g[h[yourOpenChari[yourOpenCharj[2],k[yourOpenChar1,m[\
      yourOpenChar1,n[2]yourCloseChar]yourCloseChar]yourCloseChar]\
       yourCloseChar]]]"
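
    If only the character replacement is needed, a single pass over the string with a stack may be a lighter-weight alternative to building the parse tree. The following is a sketch under stated assumptions, not the parser-based method above: replaceDoubleBrackets is a hypothetical name, two adjacent opening brackets are assumed to always start a Part, and brackets inside string literals or comments are not handled.

    ```mathematica
    ClearAll[replaceDoubleBrackets];

    (* Scan left to right. "[[" pushes "part" and emits the open marker;
       a single "[" pushes "call". On "]", the top of the stack decides
       whether one bracket (call) or two brackets (part) are consumed. *)
    replaceDoubleBrackets[s_String, open_String, close_String] :=
      Catch @ Module[{chars = Characters[s], n, i = 1, stack = {}, out = {}},
        n = Length[chars];
        While[i <= n,
          Which[
            chars[[i]] === "[" && i < n && chars[[i + 1]] === "[",
              AppendTo[stack, "part"]; AppendTo[out, open]; i += 2,
            chars[[i]] === "[",
              AppendTo[stack, "call"]; AppendTo[out, "["]; i++,
            chars[[i]] === "]",
              If[stack === {}, Throw[$Failed]];  (* unmatched closing bracket *)
              If[Last[stack] === "part",
                If[i == n || chars[[i + 1]] =!= "]", Throw[$Failed]];
                AppendTo[out, close]; i += 2,
                AppendTo[out, "]"]; i++];
              stack = Most[stack],
            True,
              AppendTo[out, chars[[i]]]; i++]];
        (* Anything left on the stack means an unclosed bracket *)
        If[stack === {}, StringJoin[out], $Failed]];

    replaceDoubleBrackets["a[[b[1], c[[2]]]]", "<", ">"]
    (* "a<b[1], c<2>>" *)
    ```

    Unlike the parser, this keeps the original whitespace and returns $Failed on unbalanced input instead of printing a message.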
    
