Why do functions “have memory” in REBOL?

Posted by 喜欢而已 on 2020-01-12 23:19:10

Question


In rebol I have written this very simple function:

make-password: func[Length] [
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: ""
    loop Length [append password (pick chars random Length)]
    password
    ]

When I run this multiple times in a row things get really confusing:

loop 5 [print make-password 5]

Gives (for example) this output:

  • TWTQW
  • TWTQWWEWRT
  • TWTQWWEWRTQWWTW
  • TWTQWWEWRTQWWTWQTTQQ
  • TWTQWWEWRTQWWTWQTTQQTRRTT

It looks like the function memorised its past executions, stored the result, and then used it again!

I did not ask for this!

I would like to have output similar to the following:

  • IPS30
  • DQ6BE
  • E70IH
  • XGHBR
  • 7LMN5

How can I achieve this result?


Answer 1:


A good question.

Rebol code is actually best thought of as a very stylized data structure. That data structure "happens to be executable". But you need to understand how it works.

For instance, from @WiseGenius's suggestion:

make-password: func[Length] [
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: copy ""
    loop Length [append password (pick chars random Length)]
    password
]

Take a look at the block containing append password.... That block is "imaged" there; what it really looks like under the hood is:

chars: **pointer to string! 0xSSSSSSS1**
password: copy **pointer to string! 0xSSSSSSS2**
loop Length **pointer to block! 0xBBBBBBBB**
password

All series work this way when they are loaded by the interpreter: strings, blocks, binaries, paths, parens, etc. Given that it's "turtles all the way down", if you follow through to that pointer, the block 0xBBBBBBBB is internally:

append password **pointer to paren! 0xPPPPPPPP**

One result of this is that a series can be referenced (and hence "imaged") in multiple places:

>> inner: [a]

>> outer: reduce [inner inner]
[[a] [a]]

>> append inner 'b

>> probe outer
[[a b] [a b]]

This can be a source of confusion for newcomers, but once you understand the data structure you begin to know when to use COPY.
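As a minimal sketch of when COPY matters here (the words shared and independent are illustrative names, not from the original example), compare reducing the same block twice with reducing two copies of it:

```rebol
inner: [a]

; Both slots hold a reference to the very same block
shared: reduce [inner inner]

; Each slot holds its own freshly copied block
independent: reduce [copy inner copy inner]

append inner 'b

probe shared        ; [[a b] [a b]]
probe independent   ; [[a] [a]]

; SAME? tests whether two values refer to the identical series
probe same? first shared second shared            ; true
probe same? first independent second independent  ; false
```

Only the references that still point at the original block see the appended b.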

So you've noticed an interesting implication of this with functions. Consider this program:

foo: func [] [
    data: []
    append data 'something
]

source foo

foo
foo

source foo

That produces a possibly-surprising result:

foo: func [][
    data: [] 
    append data 'something
]

foo: func [][
    data: [something something] 
    append data 'something
]

When we call foo a couple of times, the function's source code appears to change as we do so. It is, in a sense, self-modifying code.

If this bothers you, there are tools in R3-Alpha for attacking it. You can use PROTECT to protect function bodies from modification, and even create your own alternatives to routines like FUNC and FUNCTION that will do it for you. (PFUNC? PFUNCTION?) In Rebol version 3 you can write:

pfunc: func [spec [block!] body [block!]] [
    make function! protect/deep copy/deep reduce [spec body]
]

foo: pfunc [] [
    data: []
    append data 'something
]

foo

When you run that you get:

*** ERROR
** Script error: protected value or series - cannot modify
** Where: append foo try do either either either -apply-
** Near: append data 'something

So that forces you to copy series. It also points out that FUNC is just a function! itself, and so is FUNCTION. You can make your own generators.

This may break your brain and you may run screaming saying "this is not any sane way to write software". Or maybe you will say "my God, it's full of stars." Reactions may vary. But it is fairly fundamental to the "trick" that powers the system and gives it wild flexibility.

(Note: The Ren-C branch of Rebol3 has fundamentally made it so that function bodies--and source series in general--are locked by default. If one wants a static variable in a function, you can say foo: func [x <static> accum (copy "")] [append accum x | return accum] and the function will accumulate state in accum across calls.)

I'll also suggest paying close attention to what is actually happening on each run. Before you've run the foo function, data has no value. Each time the function executes, the evaluator sees a SET-WORD! followed by a series value and assigns that value to the variable.

data: **pointer to block! 0xBBBBBBBB**

After that assignment, you'll have two references to the block in existence. One is its existence in the code structure that was established at LOAD time, before the function had ever been run. The second reference is the one that was stored into the data variable. It's through this second reference that you are modifying this series.

And notice that data will be reassigned each time the function is run. But reassigned to the same value over and over again...that original block pointer! This is why you have to COPY if you want a fresh block on every run.
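A short sketch of the foo example with COPY added shows the difference. It assumes nothing beyond the earlier definition:

```rebol
foo: func [] [
    ; COPY makes a new empty block on every call, so the literal []
    ; stored in the function body is never the one being appended to
    data: copy []
    append data 'something
]

probe foo   ; [something]
probe foo   ; [something]
```

Running source foo afterwards should still show data: copy [] with an empty block, since the literal in the body was never modified.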

Grasping the underlying simplicity in the evaluator rules is part of the giddy interesting-ness. This is how the simplicity was dressed up to make a language (in a way you could twist to your own means). For instance, there is no "multiple-assignment":

a: b: c: 10

That's just the evaluator hitting a: as a SET-WORD! symbol and saying "okay, let's associate the variable a in its binding context with whatever the next complete expression produces." b: does the same. c: does the same but hits a terminal because of the integer value 10...and then also evaluates to 10. So it looks like multiple-assignment.
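One consequence may be worth sketching. If the value at the end of the chain is a series literal rather than an integer, each word receives the same reference, so all three words end up referring to one block:

```rebol
a: b: c: []

append a 1

probe c          ; [1]
probe same? a c  ; true
```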

So just remember that the original instance of a series literal is the one hanging in the loaded source. If the evaluator ever gets around to doing this kind of SET-WORD! or SET assignment, it will borrow the pointer to that literal in the source to poke into the variable. It's a mutable reference. You (or the abstractions you design) can make it immutable with PROTECT or PROTECT/DEEP, and you can make it not-a-reference with COPY or COPY/DEEP.
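To illustrate the difference between COPY and COPY/DEEP on nested series, here is a small sketch. The words original, shallow and deep are illustrative names:

```rebol
original: [[1 2] [3 4]]

shallow: copy original        ; new outer block, inner blocks still shared
deep:    copy/deep original   ; new outer block and new inner blocks

append first original 99

probe shallow   ; [[1 2 99] [3 4]]
probe deep      ; [[1 2] [3 4]]
```

A plain COPY is enough for a flat string or block. Nested series need COPY/DEEP if the inner series must also be independent.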


Related Note

Some argue that you should never write copy []...because (a) you might get in the habit of forgetting to write the COPY, and (b) you are making an unused series every time you do it. That "blank series template" gets allocated, has to be scanned by the garbage collector, and no one ever actually touches it.

If you write make block! 10 (or whatever size you want to preallocate the block) you avoid the issue, save a series, and offer a sizing hint.
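Applied to the earlier foo example, that suggestion might look like the sketch below. The size 10 is only an arbitrary preallocation hint:

```rebol
foo: func [] [
    ; MAKE allocates a new empty block on every call; no literal []
    ; lives in the function body to be shared or modified
    data: make block! 10
    append data 'something
]

probe foo   ; [something]
probe foo   ; [something]
```

The same idea would apply to strings, for example make string! 10 in place of copy "".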




Answer 2:


By default, this notation doesn't copy the value of the string "" to password. Instead, it sets password to point to that string, which sits in the body block of the function. So when you perform append on password, you're actually appending to the string it points to, and so changing part of the function's body block. To see what's going on, you can use ?? to examine your function and watch what happens to it each time you use it:

make-password: func[Length] [
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: ""
    loop Length [append password (pick chars random Length)]
    password
]

loop 5 [
    print make-password 5
    ?? make-password
]

This should give you something like:

TWTQW
make-password: func [Length][
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: "TWTQW"
    loop Length [append password (pick chars random Length)]
    password
]
TWTQWWEWRT
make-password: func [Length][
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: "TWTQWWEWRT"
    loop Length [append password (pick chars random Length)]
    password
]
TWTQWWEWRTQWWTW
make-password: func [Length][
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: "TWTQWWEWRTQWWTW"
    loop Length [append password (pick chars random Length)]
    password
]
TWTQWWEWRTQWWTWQTTQQ
make-password: func [Length][
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: "TWTQWWEWRTQWWTWQTTQQ"
    loop Length [append password (pick chars random Length)]
    password
]
TWTQWWEWRTQWWTWQTTQQTRRTT
make-password: func [Length][
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: "TWTQWWEWRTQWWTWQTTQQTRRTT"
    loop Length [append password (pick chars random Length)]
    password
]

To copy the string to password rather than point to it, try this instead:

make-password: func[Length] [
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: copy ""
    loop Length [append password (pick chars random Length)]
    password
]
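One further detail, offered as an assumption about the asker's intent rather than as part of the answers: random Length yields an index between 1 and Length, so with a length of 5 only the first five characters (Q, W, E, R, T) can ever be picked, which matches the sample output. To draw from the whole character set, as the desired output suggests, the index can be taken from the length of chars instead:

```rebol
make-password: func [Length] [
    chars: "QWERTYUIOPASDFGHJKLZXCVBNM1234567890"
    password: copy ""
    ; random length? chars yields an index from 1 to 36,
    ; so every character in chars can be chosen
    loop Length [append password (pick chars random length? chars)]
    password
]

loop 5 [print make-password 5]
```

Each printed password is now five characters long and independent of the previous ones.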



Answer 3:


Not having enough reputation to comment on HostileFork's answer, I'm responding here instead. It's about your "Related Note", which points me to something I had never been aware of.

"Some argue" suggests you are not amongst them, but nevertheless you’ve made me think I better write str: make string! 0 and blk: make block! 0 from now on, not only within functions. The sizing hint has always puzzled me. Are there any recommendations for what to choose here in case you have no idea of the final magnitude? (Not less than your minimal expectation of course, and also not more than the maximum.)



Source: https://stackoverflow.com/questions/25935648/why-do-function-have-memory-in-rebol
