How is pattern matching in Scala implemented at the bytecode level?

爱一瞬间的悲伤 2020-12-04 06:24

Is it like a series of if (x instanceof Foo) constructs, or something else? What are

4 Answers
  • 2020-12-04 06:32

    The low-level details can be explored with a disassembler, but the short answer is that it's a bunch of if/elses where the predicate depends on the pattern:

    case Sum(l,r) // instanceof check, then fetch the two fields and bind them to l and r (custom extractors differ, see below)
    case "hello" // equality check
    case _ : Foo // instanceof check
    case x => // assignment to a fresh variable
    case _ => // do nothing; this is the trailing else of the if/else chain
    

    There's much more that you can do with patterns, such as or-patterns and combinations like "case Foo(45, x)", but generally those are just logical extensions of what I just described. Patterns can also have guards, which are additional constraints on the predicates. There are also cases where the compiler can optimize pattern matching, e.g. when there's some overlap between cases it might coalesce things a bit. Advanced patterns and optimization are an active area of work in the compiler, so don't be surprised if the bytecode improves substantially over these basic rules in current and future versions of Scala.
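
    To make the "logical extension" idea concrete, here is a minimal sketch that puts a match next to a hand-written if/else chain. The names PatternSketch, Foo, viaMatch and viaIfElse are made up for this illustration, and the if/else version only approximates the logic; it is not the compiler's actual output.

    ```scala
    object PatternSketch {
      // Hypothetical case class, used only for this illustration.
      final case class Foo(a: Int, b: String)

      // The match as you would normally write it.
      def viaMatch(v: Any): String = v match {
        case Foo(45, x)        => "foo45:" + x   // nested literal plus variable binding
        case "yes" | "y"       => "affirmative"  // alternative ("or") pattern
        case n: Int if n > 100 => "big int"      // type test plus guard
        case _                 => "other"        // trailing else
      }

      // Roughly the same logic spelled out by hand as an if/else chain.
      def viaIfElse(v: Any): String =
        if (v.isInstanceOf[Foo] && v.asInstanceOf[Foo].a == 45)
          "foo45:" + v.asInstanceOf[Foo].b
        else if (v == "yes" || v == "y")
          "affirmative"
        else if (v.isInstanceOf[Int] && v.asInstanceOf[Int] > 100)
          "big int"
        else
          "other"
    }
    ```

    Both functions should agree on every input, for example PatternSketch.Foo(45, "a"), "y" or 101. To see what the compiler really emits for viaMatch, disassemble the class file.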

    On top of all that, you can write your own custom extractors, either alongside or instead of the default ones Scala generates for case classes. If you do, then the cost of the pattern match is the cost of whatever the extractor does. A good overview is found in http://lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf
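
    A small sketch of a custom extractor shows why its cost matters. Everything here (ExtractorSketch, Even, describe, and the calls counter) is hypothetical and exists only to make the extractor call visible.

    ```scala
    object ExtractorSketch {
      // Counts how often the extractor runs, to make its cost visible.
      var calls = 0

      // Hypothetical custom extractor: matches even integers and yields their half.
      object Even {
        def unapply(n: Int): Option[Int] = {
          calls += 1 // arbitrary user code runs on every match attempt
          if (n % 2 == 0) Some(n / 2) else None
        }
      }

      def describe(n: Int): String = n match {
        case Even(half) => "twice " + half // calls Even.unapply(n) and tests the result
        case _          => "odd"
      }
    }
    ```

    Each call to describe runs Even.unapply once, so anything expensive inside unapply is paid on every match attempt.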

  • 2020-12-04 06:35

    To expand on @Zifre's comment: if you are reading this in the future and the Scala compiler has adopted new compilation strategies, here's how to find out what it does.

    Copy-paste your match code into a self-contained example file. Run scalac on that file. Then run javap -v -c theClassName$.class.

    For example, I put the following into /tmp/question.scala:

    object question {
      abstract class Expr
      case class Number(n: Int) extends Expr
      case class Sum(e1: Expr, e2: Expr) extends Expr
    
      def eval(e: Expr): Int = e match {
        case Number(x) => x
        case Sum(l, r) => eval(l) + eval(r)
      }
    }
    

    Then I ran scalac question.scala, which produced a bunch of *.class files. Poking around a bit, I found the match statement inside question$.class. The javap -c -v question$.class output is available below.

    Since we're looking for a conditional control-flow construct, knowing the Java bytecode instruction set suggests that searching for "if" is a good place to start.

    In two locations we find a pair of consecutive lines of the form instanceof <something>; ifeq <somewhere>, which means: if the most recently computed value is not an instance of something, then go to somewhere. (ifeq jumps if the value is zero, and instanceof yields zero to represent false.)

    If you follow the control flow around, you'll see that it agrees with the answer given by @Jorge Ortiz: we do if (blah instanceof something) { ... } else if (blah instanceof somethingelse) { ... }, and fall through to throwing a scala.MatchError when neither test succeeds.

    Here is the javap -c -v question$.class output:

    Classfile /tmp/question$.class
      Last modified Nov 20, 2020; size 956 bytes
      MD5 checksum cfc788d4c847dad0863a797d980ad2f3
      Compiled from "question.scala"
    public final class question$
      minor version: 0
      major version: 50
      flags: (0x0031) ACC_PUBLIC, ACC_FINAL, ACC_SUPER
      this_class: #2                          // question$
      super_class: #4                         // java/lang/Object
      interfaces: 0, fields: 1, methods: 3, attributes: 4
    Constant pool:
       #1 = Utf8               question$
       #2 = Class              #1             // question$
       #3 = Utf8               java/lang/Object
       #4 = Class              #3             // java/lang/Object
       #5 = Utf8               question.scala
       #6 = Utf8               MODULE$
       #7 = Utf8               Lquestion$;
       #8 = Utf8               <clinit>
       #9 = Utf8               ()V
      #10 = Utf8               <init>
      #11 = NameAndType        #10:#9         // "<init>":()V
      #12 = Methodref          #2.#11         // question$."<init>":()V
      #13 = Utf8               eval
      #14 = Utf8               (Lquestion$Expr;)I
      #15 = Utf8               question$Number
      #16 = Class              #15            // question$Number
      #17 = Utf8               n
      #18 = Utf8               ()I
      #19 = NameAndType        #17:#18        // n:()I
      #20 = Methodref          #16.#19        // question$Number.n:()I
      #21 = Utf8               question$Sum
      #22 = Class              #21            // question$Sum
      #23 = Utf8               e1
      #24 = Utf8               ()Lquestion$Expr;
      #25 = NameAndType        #23:#24        // e1:()Lquestion$Expr;
      #26 = Methodref          #22.#25        // question$Sum.e1:()Lquestion$Expr;
      #27 = Utf8               e2
      #28 = NameAndType        #27:#24        // e2:()Lquestion$Expr;
      #29 = Methodref          #22.#28        // question$Sum.e2:()Lquestion$Expr;
      #30 = NameAndType        #13:#14        // eval:(Lquestion$Expr;)I
      #31 = Methodref          #2.#30         // question$.eval:(Lquestion$Expr;)I
      #32 = Utf8               scala/MatchError
      #33 = Class              #32            // scala/MatchError
      #34 = Utf8               (Ljava/lang/Object;)V
      #35 = NameAndType        #10:#34        // "<init>":(Ljava/lang/Object;)V
      #36 = Methodref          #33.#35        // scala/MatchError."<init>":(Ljava/lang/Object;)V
      #37 = Utf8               this
      #38 = Utf8               e
      #39 = Utf8               Lquestion$Expr;
      #40 = Utf8               x
      #41 = Utf8               I
      #42 = Utf8               l
      #43 = Utf8               r
      #44 = Utf8               question$Expr
      #45 = Class              #44            // question$Expr
      #46 = Methodref          #4.#11         // java/lang/Object."<init>":()V
      #47 = NameAndType        #6:#7          // MODULE$:Lquestion$;
      #48 = Fieldref           #2.#47         // question$.MODULE$:Lquestion$;
      #49 = Utf8               question
      #50 = Class              #49            // question
      #51 = Utf8               Sum
      #52 = Utf8               Expr
      #53 = Utf8               Number
      #54 = Utf8               Code
      #55 = Utf8               LocalVariableTable
      #56 = Utf8               LineNumberTable
      #57 = Utf8               StackMapTable
      #58 = Utf8               SourceFile
      #59 = Utf8               InnerClasses
      #60 = Utf8               ScalaInlineInfo
      #61 = Utf8               Scala
    {
      public static final question$ MODULE$;
        descriptor: Lquestion$;
        flags: (0x0019) ACC_PUBLIC, ACC_STATIC, ACC_FINAL
    
      public static {};
        descriptor: ()V
        flags: (0x0009) ACC_PUBLIC, ACC_STATIC
        Code:
          stack=1, locals=0, args_size=0
             0: new           #2                  // class question$
             3: invokespecial #12                 // Method "<init>":()V
             6: return
    
      public int eval(question$Expr);
        descriptor: (Lquestion$Expr;)I
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=3, locals=9, args_size=2
             0: aload_1
             1: astore_2
             2: aload_2
             3: instanceof    #16                 // class question$Number
             6: ifeq          27
             9: aload_2
            10: checkcast     #16                 // class question$Number
            13: astore_3
            14: aload_3
            15: invokevirtual #20                 // Method question$Number.n:()I
            18: istore        4
            20: iload         4
            22: istore        5
            24: goto          69
            27: aload_2
            28: instanceof    #22                 // class question$Sum
            31: ifeq          72
            34: aload_2
            35: checkcast     #22                 // class question$Sum
            38: astore        6
            40: aload         6
            42: invokevirtual #26                 // Method question$Sum.e1:()Lquestion$Expr;
            45: astore        7
            47: aload         6
            49: invokevirtual #29                 // Method question$Sum.e2:()Lquestion$Expr;
            52: astore        8
            54: aload_0
            55: aload         7
            57: invokevirtual #31                 // Method eval:(Lquestion$Expr;)I
            60: aload_0
            61: aload         8
            63: invokevirtual #31                 // Method eval:(Lquestion$Expr;)I
            66: iadd
            67: istore        5
            69: iload         5
            71: ireturn
            72: new           #33                 // class scala/MatchError
            75: dup
            76: aload_2
            77: invokespecial #36                 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
            80: athrow
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0      81     0  this   Lquestion$;
                0      81     1     e   Lquestion$Expr;
               20      61     4     x   I
               47      34     7     l   Lquestion$Expr;
               54      27     8     r   Lquestion$Expr;
          LineNumberTable:
            line 6: 0
            line 7: 2
            line 8: 27
            line 6: 69
          StackMapTable: number_of_entries = 3
            frame_type = 252 /* append */
              offset_delta = 27
              locals = [ class question$Expr ]
            frame_type = 254 /* append */
              offset_delta = 41
              locals = [ top, top, int ]
            frame_type = 248 /* chop */
              offset_delta = 2
    }
    SourceFile: "question.scala"
    InnerClasses:
      public static #51= #22 of #50;          // Sum=class question$Sum of class question
      public static abstract #52= #45 of #50; // Expr=class question$Expr of class question
      public static #53= #16 of #50;          // Number=class question$Number of class question
      ScalaInlineInfo: length = 0xE (unknown attribute)
       01 01 00 02 00 0A 00 09 01 00 0D 00 0E 01
      Scala: length = 0x0 (unknown attribute)
    
  • 2020-12-04 06:48

    Since version 2.8, Scala has had the @switch annotation. Applied to the value being matched, it makes the compiler verify that the match is compiled into a tableswitch or lookupswitch instruction rather than a series of conditional tests, and report a problem if it is not.
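
    A minimal sketch of how the annotation is used follows. scala.annotation.switch is the real annotation; SwitchSketch and name are made-up names for this example. The match uses only integer literals, which is the kind of match that can become a JVM switch.

    ```scala
    import scala.annotation.switch

    object SwitchSketch {
      // Matching an Int against literal constants lets the compiler emit a JVM switch.
      // The @switch annotation on the scrutinee asks the compiler to check that it did.
      def name(day: Int): String = (day: @switch) match {
        case 1 => "Mon"
        case 2 => "Tue"
        case 3 => "Wed"
        case 4 => "Thu"
        case 5 => "Fri"
        case _ => "?"
      }
    }
    ```

    Running javap -c on the compiled class should then show a tableswitch or lookupswitch in name instead of a chain of comparisons.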

  • 2020-12-04 06:55

    James (above) said it best. However, if you're curious, it's always a good exercise to look at the disassembled bytecode. You can also invoke scalac with the -print option, which prints your program with all Scala-specific features removed. It's basically Java in Scala's clothing. Here's the relevant scalac -print output for the code snippet you gave:

    def eval(e: Expr): Int = {
      <synthetic> val temp10: Expr = e;
      if (temp10.$isInstanceOf[Number]())
        temp10.$asInstanceOf[Number]().n()
      else
        if (temp10.$isInstanceOf[Sum]())
          {
            <synthetic> val temp13: Sum = temp10.$asInstanceOf[Sum]();
            Main.this.eval(temp13.e1()).+(Main.this.eval(temp13.e2()))
          }
        else
          throw new MatchError(temp10)
    };
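
    The -print output uses compiler-internal names such as $isInstanceOf, so it cannot be compiled as is. As a rough, hand-written approximation in ordinary Scala, the same structure might look like the sketch below. DesugarSketch and evalDesugared are names invented here; the class definitions mirror the example discussed in this thread.

    ```scala
    object DesugarSketch {
      abstract class Expr
      case class Number(n: Int) extends Expr
      case class Sum(e1: Expr, e2: Expr) extends Expr

      // The original pattern match.
      def eval(e: Expr): Int = e match {
        case Number(x) => x
        case Sum(l, r) => eval(l) + eval(r)
      }

      // Hand-written approximation of the desugared form: type test, cast, field access.
      def evalDesugared(e: Expr): Int =
        if (e.isInstanceOf[Number])
          e.asInstanceOf[Number].n
        else if (e.isInstanceOf[Sum]) {
          val s = e.asInstanceOf[Sum]
          evalDesugared(s.e1) + evalDesugared(s.e2)
        } else
          throw new MatchError(e) // no case matched, as in the generated code
    }
    ```

    Both eval and evalDesugared should return the same result for any tree built from Number and Sum.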
    