CLOS make-instance is really slow and causes heap exhaustion in SBCL

こ雲淡風輕ζ 提交于 2019-12-10 12:57:10

问题


I'm writing an multiarchitecture assembler/disassembler in Common Lisp (SBCL 1.1.5 in 64-bit Debian GNU/Linux), currently the assembler produces correct code for a subset of x86-64. For assembling x86-64 assembly code I use a hash table in which assembly instruction mnemonics (strings) such as "jc-rel8" and "stosb" are keys that return a list of 1 or more encoding functions, like the ones below:

(defparameter *emit-function-hash-table-x64* (make-hash-table :test 'equalp))
(setf (gethash "jc-rel8" *emit-function-hash-table-x64*) (list #'jc-rel8-x86))
(setf (gethash "stosb"   *emit-function-hash-table-x64*) (list #'stosb-x86))

The encoding functions are like these (some are more complicated, though):

(defun jc-rel8-x86 (arg1 &rest args)
  (jcc-x64 #x72 arg1))

(defun stosb-x86 (&rest args)
  (list #xaa))

Now I am trying to incorporate the complete x86-64 instruction set by using NASM's (NASM 2.11.06) instruction encoding data (file insns.dat) converted to Common Lisp CLOS syntax. This would mean replacing regular functions used for emitting binary code (like the functions above) with instances of a custom x86-asm-instruction class (a very basic class so far, some 20 slots with :initarg, :reader, :initform etc.), in which an emit method with arguments would be used for emitting the binary code for given instruction (mnemonic) and arguments. The converted instruction data looks like this (but it's more than 40'000 lines and exactly 7193 make-instance's and 7193 setf's).


;; first mnemonic + operand combination instances (:is-variant t).
;; there are 4928 such instances for x86-64 generated from NASM's insns.dat.

(eval-when (:compile-toplevel :load-toplevel :execute)

(setf Jcc-imm-near (make-instance 'x86-asm-instruction
:name "Jcc"
:operands "imm|near"
:code-string "[i: odf 0f 80+c rel]"
:arch-flags (list "386" "BND")
:is-variant t))

(setf STOSB-void (make-instance 'x86-asm-instruction
:name "STOSB"
:operands "void"
:code-string "[ aa]"
:arch-flags (list "8086")
:is-variant t))

;; then, container instances which contain (or could be refer to instead)
;; the possible variants of each instruction.
;; there are 2265 such instances for x86-64 generated from NASM's insns.dat.

(setf Jcc (make-instance 'x86-asm-instruction
                         :name "Jcc"
                         :is-container t
                         :variants (list Jcc-imm-near
                                         Jcc-imm64-near
                                         Jcc-imm-short
                                         Jcc-imm
                                         Jcc-imm
                                         Jcc-imm
                                         Jcc-imm)))

(setf STOSB (make-instance 'x86-asm-instruction
                           :name "STOSB"
                           :is-container t
                           :variants (list STOSB-void)))

;; thousands of objects more here...

) ; this bracket closes (eval-when (:compile-toplevel :load-toplevel :execute)

I have converted NASM's insns.dat to Common Lisp syntax (like above) using a trivial Perl script (further below, but there's nothing of interest in the script itself) and in principle it works. So it works, but compiling those 7193 objects is really really slow and commonly causes heap exhaustion. On my Linux Core i7-2760QM laptop with 16G of memory the compiling of an (eval-when (:compile-toplevel :load-toplevel :execute) code block with 7193 objects like the ones above takes more than 7 minutes and sometimes causes heap exhaustion, like this one:

;; Swank started at port: 4005.
* Heap exhausted during garbage collection: 0 bytes available, 32 requested.
 Gen StaPg UbSta LaSta LUbSt Boxed Unboxed LB   LUB  !move  Alloc  Waste   Trig    WP  GCs Mem-age
   0:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   1:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   2:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   3: 38805 38652     0     0 49474 15433   389   416     0 2144219760 9031056 1442579856    0   1  1.5255
   4: 127998 127996     0     0 45870 14828   106   143   199 1971682720 25428576  2000000    0   0  0.0000
   5:     0     0     0     0     0     0     0     0     0        0     0  2000000    0   0  0.0000
   6:     0     0     0     0  1178   163     0     0     0 43941888     0  2000000  985   0  0.0000
   Total bytes allocated    = 4159844368
   Dynamic-space-size bytes = 4194304000
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = in progress
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 9994(tid 46912556431104):
Heap exhausted, game over.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>

I had to add --dynamic-space-size 4000 parameter for SBCL to get it compiled at all, but still after allocating 4 gigabytes of dynamic space heap sometimes gets exhausted. Even if the heap exhaustion would be solved, more than 7 minutes for compiling 7193 instances after only adding a slot in the class ('x86-asm-instruction class used for these instances) is way too much for interactive development in REPL (I use slimv, if that matters).

Here's (time (compile-file output:

;   caught 18636 WARNING conditions


; insns.fasl written
; compilation finished in 0:07:11.329
Evaluation took:
  431.329 seconds of real time
  238.317000 seconds of total run time (234.972000 user, 3.345000 system)
  [ Run times consist of 6.073 seconds GC time, and 232.244 seconds non-GC time. ]
  55.25% CPU
  50,367 forms interpreted
  784,044 lambdas converted
  1,031,842,900,608 processor cycles
  19,402,921,376 bytes consed

Using OOP (CLOS) would enable incorporating the instruction mnemonic (such as jc or stosb above, :name), allowed operands of the instruction (:operands), instruction's binary encoding (such as #xaa for stosb, :code-string) and possible architecture limitations (:arch-flags) of the instruction in one object. But it seems that at least my 3-year-old computer is not efficient enough to compile around 7000 CLOS object instances quickly.

My question is: Is there some way to make SBCL's make-instance faster, or should I keep assembly code generation in regular functions like the examples further above? I'd be also very happy to know about any other possible solutions.

Here's the Perl script, just in case:

#!/usr/bin/env perl
use strict;
use warnings;

# this program converts NASM's `insns.dat` to Common Lisp Object System (CLOS) syntax.

my $firstchar;
my $line_length;
my $are_there_square_brackets;
my $mnemonic_and_operands;
my $mnemonic;
my $operands;
my $code_string;
my $flags;
my $mnemonic_of_current_mnemonic_array;

my $clos_object_name;
my $clos_mnemonic;
my $clos_operands;
my $clos_code_string;
my $clos_flags;

my @object_name_array = ();
my @mnemonic_array = ();
my @operands_array = ();
my @code_string_array = ();
my @flags_array = ();

my @each_mnemonic_only_once_array = ();

my @instruction_variants_array = ();
my @instruction_variants_for_current_instruction_array = ();

open(FILE, 'insns.dat');

$mnemonic_of_current_mnemonic_array = "";

# read one line at once.
while (<FILE>)
{
    $firstchar = substr($_, 0, 1);
    $line_length = length($_);
    $are_there_square_brackets = ($_ =~ /\[.*\]/);
    chomp;
    if (($line_length > 1) && ($firstchar =~ /[^\t ;]/))
    {
        if ($are_there_square_brackets)
        {
            ($mnemonic_and_operands, $code_string, $flags) = split /[\[\]]+/, $_;
            $code_string = "[" . $code_string . "]";
            ($mnemonic, $operands) = split /[\t ]+/, $mnemonic_and_operands;
        }
        else
        {
            ($mnemonic, $operands, $code_string, $flags) = split /[\t ]+/, $_;
        }
        $mnemonic =~ s/[\t ]+/ /g;
        $operands =~ s/[\t ]+/ /g;
        $code_string =~ s/[\t ]+/ /g;
        $flags =~ s/[\t ]+//g;

        # we don't want non-x86-64 instructions here.
        unless ($flags =~ "NOLONG")
        {
            # ok, the content of each field is now filtered,
            # let's convert them to a suitable Common Lisp format.
            $clos_object_name = $mnemonic . "-" . $operands;

            # in Common Lisp object names `|`, `,`, and `:` must be escaped with a backslash `\`,
            # but that would get too complicated.
            # so we'll simply replace them:
            # `|` -> `-`.
            # `,` -> `.`.
            # `:` -> `.`.
            $clos_object_name =~ s/\|/-/g;              
            $clos_object_name =~ s/,/./g;              
            $clos_object_name =~ s/:/./g;              

            $clos_mnemonic    = "\"" . $mnemonic . "\"";
            $clos_operands    = "\"" . $operands . "\"";
            $clos_code_string = "\"" . $code_string . "\"";

            $clos_flags = "\"" . $flags . "\"";        # add first and last double quotes.
            $clos_flags =~ s/,/" "/g;                  # make each flag its own Common Lisp string.
            $clos_flags = "(list " . $clos_flags. ")"; # convert to `list` syntax.

            push @object_name_array, $clos_object_name;
            push @mnemonic_array, $clos_mnemonic;
            push @operands_array, $clos_operands;
            push @code_string_array, $clos_code_string;
            push @flags_array, $clos_flags;

            if ($mnemonic eq $mnemonic_of_current_mnemonic_array)
            {
                # ok, same mnemonic as the previous one,
                # so the current object name goes to the list.
                push @instruction_variants_for_current_instruction_array, $clos_object_name;
            }
            else
            {
                # ok, this is a new mnemonic.
                # so we'll mark this as current mnemonic.
                $mnemonic_of_current_mnemonic_array = $mnemonic;
                push @each_mnemonic_only_once_array, $mnemonic;

                # we first push the old array (unless it's empty), then clear it,
                # and then push the current object name to the cleared array.

                if (@instruction_variants_for_current_instruction_array)
                {
                    # push the variants array, unless it's empty.
                    push @instruction_variants_array, [ @instruction_variants_for_current_instruction_array ];
                }
                @instruction_variants_for_current_instruction_array = ();
                push @instruction_variants_for_current_instruction_array, $clos_object_name;
            }
        }
    }
}

# the last instruction's instruction variants must be pushed too.
if (@instruction_variants_for_current_instruction_array)
{
    # push the variants array, unless it's empty.
    push @instruction_variants_array, [ @instruction_variants_for_current_instruction_array ];
}

close(FILE);

# these objects need be created already during compilation.
printf("(eval-when (:compile-toplevel :load-toplevel :execute)\n");

# print the code to create each instruction + operands combination object.

for (my $i=0; $i <= $#mnemonic_array; $i++)
{
    $clos_object_name = $object_name_array[$i];
    $mnemonic         = $mnemonic_array[$i];
    $operands         = $operands_array[$i];
    $code_string      = $code_string_array[$i];
    $flags            = $flags_array[$i];

    # print the code to create a variant object.
    # each object here is a variant of a single instruction (or a single mnemonic).
    # actually printed as 6 lines to make it easier to read (for us humans, I mean), with an empty line in the end.
    printf("(setf %s (make-instance 'x86-asm-instruction\n:name %s\n:operands %s\n:code-string %s\n:arch-flags %s\n:is-variant t))",
        $clos_object_name,
        $mnemonic,
        $operands,
        $code_string,
        $flags);
    printf("\n\n");
}

# print the code to create each instruction + operands combination object.

# for (my $i=0; $i <= $#each_mnemonic_only_once_array; $i++)
for my $i (0 .. $#instruction_variants_array)
{
    $mnemonic = $each_mnemonic_only_once_array[$i];

    # print the code to create a container object.
    printf("(setf %s (make-instance 'x86-asm-instruction :name \"%s\" :is-container t :variants (list \n", $mnemonic, $mnemonic);
    @instruction_variants_for_current_instruction_array = $instruction_variants_array[$i];

    # for (my $j=0; $j <= $#instruction_variants_for_current_instruction_array; $j++)
    for my $j (0 .. $#{$instruction_variants_array[$i]} )
    {
        printf("%s", $instruction_variants_array[$i][$j]);

        # print 3 closing brackets if this is the last variant.
        if ($j == $#{$instruction_variants_array[$i]})
        {
            printf(")))");
        }
        else
        {
            printf(" ");
        }
    }

    # if this is not the last instruction, print two newlines.
    if ($i < $#instruction_variants_array)
    {
        printf("\n\n");
    }
}

# print the closing bracket to close `eval-when`.
print(")");

exit;

回答1:


18636 warnings looks really bad, Start by getting rid of all the warnings.

I would start by getting rid of the EVAL-WHEN around all that. Does not make much sense to me. Either load the file directly, or compile and load the file.

Also note that SBCL does not like (setf STOSB-void ...) when the variable is undefined. New top-level variables are introduced with DEFVAR or DEFPARAMETER. SETF just sets them, but does not define them. That should help to get rid of the warnings.

Also :is-container t and :is-variant t smell like these properties should be converted into classes to inherit from (for example as a mixin). A container has variants. A variant does not have variants.



来源:https://stackoverflow.com/questions/27096007/clos-make-instance-is-really-slow-and-causes-heap-exhaustion-in-sbcl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!