Question
And who has the authority to decide?
Edit: Apparently I haven't succeeded in formulating my question well.
I am not asking how Java's argument passing works. I know that what looks like a variable holding an object is actually a variable holding a reference to the object, and that reference is passed by value.
There are lots of fine explanations of that mechanism here (in the linked threads and others) and elsewhere.
The question is about the technical meaning of the term pass-by-reference. (End edit)
I am not sure if this is the right kind of question for SO, apologies if not, but I don't know a better place. Much has already been said in other questions here, for example Is Java "pass-by-reference" or "pass-by-value"? and pass by reference or pass by value?, but I haven't found an authoritative answer to the question what the term means.
I have thought that "pass by reference" means "pass a reference (usually a pointer) to the object", so the callee can modify the object the caller sees, while "pass by value" means copying the object, and letting the callee have fun with the copy (obvious problem: what if the object contains references, deep copy or shallow).
Searching the web turns up lots of places saying "pass by reference" means just that. Here there's some argument that it means more, but the definition still reads
A ParameterPassing mode where a reference (or if you want to be politically incorrect, a pointer) to the actual parameter is passed into the formal parameter; when the callee needs the formal parameter it dereferences the pointer to get it.
I haven't found many places giving a stronger definition for the term. On this page, I found "The lvalue of the formal parameter is set to the lvalue of the actual parameter." and, if I understand correctly, the same definition is used here ("The formal parameter merely acts as an alias for the actual parameter.").
In fact, the only places I found where the stronger definition is used are places arguing against the notion that in Java, objects are passed by reference (that may be due to my lacking google-fu).
So, if I got things straight, pass-by-reference
class Thing { ... }
void byReference(Thing object){ ... }
Thing something;
byReference(something);
according to the first definition would roughly correspond to (in C)
struct RawThing { ... };
typedef struct RawThing *Thing;
void byReference(Thing object){
// do something
}
// ...
struct RawThing whatever = blah();
Thing something = &whatever;
byReference(something); // pass whatever by reference
// we can change the value of what something (the reference to whatever) points to, but not
// where something points to
and in that sense, saying that Java passes objects by reference would be adequate. But according to the second definition, pass-by-reference means more or less
struct RawThing { ... };
typedef struct RawThing *RawThingPtr;
typedef RawThingPtr *Thing;
void byReference(Thing object){
// do something
}
// ...
struct RawThing whatever = blah();
RawThingPtr thing_pointer = &whatever;
byReference(&thing_pointer); // pass whatever by reference
// now we can not only change the pointed-to (referred) value,
// but also where thing_pointer points to
And since Java only lets you have pointers-to-objects (limiting what you can do with them) but doesn't have pointers-to-pointers, in that sense, saying that Java passes objects by reference is totally wrong.
So,
- Have I adequately understood the above definitions of pass-by-reference?
- Are there other definitions around?
- Is there consensus which definition is "the correct one", if so, which?
Answer 1:
Sure, different people currently have different definitions of what "pass-by-reference" means. And that is why they disagree on whether something is pass-by-reference or not.
However, whatever definition you use, you must use it consistently across languages. You can't say that one language has pass-by-value, and have the exact same semantics in another language and say that it is pass-by-reference. Pointing out the analogies between languages is the best way to address this dispute, because although people might have strong opinions about the passing modes in particular languages, when you contrast the identical semantics with other languages, it sometimes brings counter-intuitive results that force them to re-think their definition.
- One predominant view is that Java is pass-by-value only. (Search everywhere on the Internet and you will find this point of view.) This view is that objects are not values, but are always manipulated through references, and thus it is references that are assigned or passed, by value. This view holds that the test of pass-by-reference is whether it is possible to assign to a variable in the calling scope.
If one agrees with this viewpoint, then one must also consider most languages, including ones as diverse as Python, Ruby, OCaml, Scheme, Smalltalk, SML, Go, JavaScript, Objective-C, etc., as pass-by-value only. If any of this strikes you as strange or counterintuitive, I challenge you to point out how the semantics of objects in any of those languages differ from those of objects in Java. (I know that some of these languages may explicitly claim that they are pass-by-reference, but what they say is irrelevant; a consistent definition must be applied to all languages based on their actual behavior.)
- If you take the opposing view that objects in Java are pass-by-reference, then you must also consider C as pass-by-reference.
Take your Java example:
class Thing { int x; }
void func(Thing object){ object.x = 42; object = null; }
Thing something = null;
something = new Thing();
func(something);
in C, it would be equivalent to this:
typedef struct { int x; } Thing;
void func(Thing *object){ object->x = 42; object = NULL; }
Thing *something = NULL;
something = malloc(sizeof(Thing));
memset(something, 0, sizeof(*something)); // zero the whole struct, not just pointer-size bytes
func(something);
// later:
free(something);
I claim that the above are semantically equivalent; only the syntax is different. The only syntax differences are:
- C requires an explicit * to denote a pointer type; Java's reference (pointers to objects) types don't need an explicit *.
- C uses -> to access a field through a pointer; Java just uses .
- Java uses new to dynamically allocate memory for a new object on the heap; C uses malloc to allocate it, and then we need to initialize the memory.
- Java has garbage collection
Note that, importantly,
- The syntax for calling the function with the object is the same in both cases: func(something), without needing to do anything like taking an address.
- In both cases, the object is dynamically allocated (it may live beyond the scope of the function).
- In both cases, the object = null; inside the function does not affect the calling scope.
So the semantics are the same in both cases, so if you call Java pass-by-reference you must call C pass-by-reference too.
Answer 2:
Who has the authority to decide? Nobody, and everybody. You decide for yourself; a writer decides for his or her book; and a reader decides whether to agree with the writer.
To understand the term, one needs to go under the hood of the language (and explaining them in terms of C code rather misses the point). Parameter passing styles refer to mechanisms that compilers typically use to create certain behaviour. The following are usually defined:
- pass by value: the argument is copied into the parameter when the subroutine is entered
- pass by result: the parameter is undefined when the subroutine is entered, and it is copied to the argument when the subroutine returns
- pass by value-result: the argument is copied into the parameter at entry, and the parameter is copied into the argument at return
- pass by reference: a reference to the argument variable is copied to the parameter; any access of the parameter variable is transparently translated into an access of the argument variable
(A note of terminology: a parameter is the variable defined in the subroutine, an argument is the expression that is used in a call.)
Textbooks usually also define pass by name, but it's rare and not easy to explain here. Pass by need also exists.
The importance of the parameter passing style is its effect: in pass by value, any changes made to the parameter are not communicated to the argument; in pass by result, any changes made to the parameter are communicated to the argument at the end; in pass by reference, any changes made to the parameter are communicated to the argument as they are made.
Some languages define more than one passing style, allowing the programmer to select their preferred style for each parameter separately. For example, in Pascal, the default style is pass by value, but a programmer can use the var keyword to specify pass by reference. Some other languages specify one passing style. There are also languages that specify different styles for different types (for example, in C, pass by value is the default, but arrays behave as if passed by reference, because an array argument decays to a pointer to its first element).
Now, in Java, technically we have a language with pass-by-value, with the value of an object variable being a reference to the object. Whether that makes Java pass-by-reference where object variables are concerned is a matter of taste.
Answer 3:
Both of your C examples actually demonstrate pass-by-value, because C doesn't have pass-by-reference. It's just that the value that you're passing is a pointer. Pass-by-reference occurs in languages such as Perl:
sub set_to_one($)
{
$_[0] = 1; # set argument = 1
}
my $a = 0;
set_to_one($a); # equivalent to: $a = 1
Here, the variable $a is actually passed by reference, so the subroutine can modify it. It's not modifying some object that $a is pointing to via indirection; rather, it modifies $a itself.
Java is like C in this respect, except that in Java objects are "reference types", so all you ever have (and all you can ever pass) are pointers to them. Something like this:
void setToOne(Integer i)
{
i = 1; // set argument = 1
}
void foo()
{
Integer a = 0;
setToOne(a); // has no effect
}
won't actually change a; it only reassigns i.
Answer 4:
Passing by reference is, in effect, passing a reference to a value -- rather than a copy of it -- as an argument.
I guess before we go on, certain things should be defined. I may be using them differently than you're used to seeing them used.
An object is a molecule of data. It occupies storage, and may contain other objects, but has its own identity and may be referred to and used as a single unit.
A reference is an alias, or handle, to an object. At the language level, a reference mostly acts like the thing it's referring to; depending on the language, the compiler/interpreter/runtime/gnomes will automagically dereference it when the actual object is needed.
A value is the result of evaluating an expression. It is a concrete object that can be stored, passed to functions, etc. (OOP wonks, note I use "object" here in the generic "molecule of data" sense, rather than the OOP "instance of a class" sense.)
A variable is a named reference to a pre-allocated value.
Especially note: variables are not values. The name notwithstanding, variables typically do not change. Their value is what changes. That they're so easily mixed up is partly a testament to how good the reference<-->referent illusion usually is.
A reference-typed variable (a la Java, C#, ...) is a variable whose value is a reference.
Most languages, when you pass a variable as an argument, will by default create a copy of the variable's value and pass the copy. The callee binds its name for the parameter to that copy. This is called "passing by value" (or, more clearly, "passing by copy"). The two variables on either side of the call end up with different storage locations, and are thus completely different variables (only related in that they typically start out with equal values).
Passing by reference, on the other hand, doesn't do the copy. Instead, it passes the variable itself (minus the name). That is, it passes a reference to the very same value the variable aliases. (This is typically done by implicitly passing a pointer to the variable's storage, but that's just an implementation detail; the caller and callee don't have to know or care how it happens.) The callee binds its parameter's name to that location. The end result is that both sides use the same storage location (just by possibly different names). Any changes the callee makes to its variable are thus also made to the caller's variable. For example, in the case of object-oriented languages, the variable can be assigned a whole different value.
Most languages (including Java) do not support this natively. Oh, they like to say they do...but that's because people who have never been able to truly pass by reference, often don't grok the subtle difference between doing so and passing a reference by value. Where the confusion comes in with those languages, is with reference-type variables. Java itself never works directly with reference-type objects, but with references to those objects. The difference is in the variables "containing" said objects. The value of a reference-type variable is such a reference (or, sometimes, a special reference value that means "nothing"). When Java passes such a reference, while it doesn't copy the object, it still copies the value (ie: the reference the function gets is a copy of the value the variable refers to). That is, it is passing a reference, but is passing it by value. This allows most of the things that passing by reference allows, but not all.
The most obvious test I can think of for real pass-by-reference support would be the "swap test". A language that natively supports passing by reference must provide enough support to write a function swap that swaps the values of its arguments. Code equivalent to this:
swap (x, y): <-- these should be declared as "reference to T"
temp = x
x = y
y = temp
--
value1 = (any valid T)
value2 = (any other valid T)
a = value1
b = value2
swap(a, b)
assert (a == value2 and b == value1)
- must be possible and run successfully -- for any type T that permits copying and reassignment -- using the language's assignment and strict equality operators (including any overloads specified by T); and
- must not require the caller to convert or "wrap" the args (eg: by explicitly passing a pointer). Requiring that the args be marked as being passed by reference is OK.
(Obviously, languages that don't have mutable variables can't be tested this way -- but that's fine, because they don't matter. The big semantic difference between the two is how modifiable the caller's variable is by the callee. When the variable's value is not modifiable in any case, the difference becomes merely an implementation detail or optimization.)
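For Java specifically, the swap test can be tried with a minimal sketch like the following. The class name SwapTest and the helper swapped are made up for this illustration and are not part of the original answer. Under the assumption that T is String, it shows the test failing: swap only exchanges its own copies of the two references.

```java
// Hypothetical demo class; the names here are illustrative only.
class SwapTest {
    // Tries to swap its arguments. x and y are copies of the caller's
    // references, so the assignments below only affect the local copies.
    static void swap(String x, String y) {
        String temp = x;
        x = y;
        y = temp;
    }

    // Runs the swap test and reports whether the caller's variables
    // ended up exchanged.
    static boolean swapped() {
        String a = "value1";
        String b = "value2";
        swap(a, b);
        return a.equals("value2") && b.equals("value1");
    }

    public static void main(String[] args) {
        System.out.println("swapped: " + swapped()); // prints "swapped: false"
    }
}
```

Because a and b still hold their original values after the call, Java does not pass this test without the caller wrapping the arguments in some container.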
Note, most of the talk in this answer is about "variables". A number of languages, like C++, also allow passing anonymous values by reference. The mechanism is the same; the value takes up storage, and the reference is an alias to it. It just doesn't necessarily have a name in the caller.
Answer 5:
Wikipedia gives a very clear definition of call-by-reference I can not improve upon:
In call-by-reference evaluation (also referred to as pass-by-reference), a function receives an implicit reference to a variable used as argument, rather than a copy of its value. This typically means that the function can modify (i.e. assign to) the variable used as argument- something that will be seen by its caller.
Note that neither of your examples is call-by-reference, because assigning a formal parameter in C never modifies the argument as seen by the caller.
But that's enough copy-pasting, read the thorough discussion (with examples) at
http://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_reference
Answer 6:
Java doesn't pass by reference. You are always passing a copy, i.e. by value. However, if you pass an object, then you will get a copy of the reference. So you can directly edit the object, but if you overwrite your local reference, the caller's original object reference won't be overwritten.
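A minimal sketch of those two cases, assuming a made-up Box class with a single int field (Box, PassCopyDemo, mutate, and reassign are all hypothetical names chosen for this example):

```java
// Hypothetical types for illustration; not from the original answer.
class Box {
    int value;
    Box(int value) { this.value = value; }
}

class PassCopyDemo {
    // Edits the object through the copied reference: the caller sees this.
    static void mutate(Box b) {
        b.value = 42;
    }

    // Overwrites only the local copy of the reference: the caller's
    // variable still refers to the original object.
    static void reassign(Box b) {
        b = new Box(99);
    }

    public static void main(String[] args) {
        Box box = new Box(1);
        mutate(box);
        System.out.println(box.value); // prints 42
        reassign(box);
        System.out.println(box.value); // still prints 42
    }
}
```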
Answer 7:
Passing parameters by reference means that the pointer nesting of parameters is deeper than the pointer nesting of local variables. If you have a variable with the type of a class, the variable is a pointer to the actual value. A variable of a primitive type contains the value itself.
Now, if you pass these variables by value, you keep the pointer nesting: The object reference stays a pointer to the object, and the primitive variable stays the value itself.
Passing the variables as references means that the pointer nesting gets deeper: You pass a pointer to the object reference, so that you can change the object reference; or you pass a pointer to the primitive, so that you can change its value.
These definitions are used in C# and Object Pascal which both have keywords to pass a variable by reference.
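Java has no such keyword, but the extra level of pointer nesting can be emulated by hand. The sketch below is only an illustration under that assumption: it uses a one-element array as the additional indirection, and the names Thing, IndirectionDemo, and replace are hypothetical.

```java
// Hypothetical types for illustration; not from the original answer.
class Thing {
    int x;
    Thing(int x) { this.x = x; }
}

class IndirectionDemo {
    // slot[0] plays the role of the caller's variable. Because the callee
    // receives a reference to the slot, it can rebind what the caller sees.
    static void replace(Thing[] slot) {
        slot[0] = new Thing(2);
    }

    public static void main(String[] args) {
        Thing[] slot = { new Thing(1) };
        Thing before = slot[0];
        replace(slot);
        System.out.println(slot[0] != before); // prints true: a different object
        System.out.println(slot[0].x);         // prints 2
    }
}
```

Note that the array reference itself is still passed by value, and the caller has to wrap the value explicitly, so this emulates the deeper nesting rather than providing a language-level by-reference parameter.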
To answer your question: Because the last variables - whatever in the first example and thing_pointer in the second one - are passed to the function each through a pointer (&), both are passed by reference.
Answer 8:
If you are familiar with C, perhaps the following analogy explains how Java works. This will be true only for objects of class-type (and not fundamental type).
In Java, we can have a variable and pass it to a function:
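void f(Foo x)
{
    x.bar = 5;     // #1j: modifies the object the caller also refers to
    x = new Foo(); // #2j: rebinds only the local copy of the reference
}
void main()
{
    Foo a = new Foo();
    a.bar = 4;
    f(a);
    // now a.bar == 5
}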
In C, this would look as follows:
void f(struct Foo * p)
{
p->bar = 5; // #1c
p = malloc(sizeof(struct Foo)); // #2c
}
int main()
{
struct Foo * w = malloc(sizeof(struct Foo));
w->bar = 4;
f(w);
/* now w->bar = 5; */
}
In Java, variables of class-type are always references, which in C would be most faithfully mapped to pointers. But in function calls, the pointer itself is passed by copy. Accessing the pointer as in #1j and #1c modifies the original variable, so in that sense you are passing around a reference to the variable. However, the variable itself is only a pointer, and it itself is passed by copy. So when you assign something else to it, as in #2j and #2c, you are only rebinding the copy of the reference/pointer in the local scope of f. The original variable, a or w in the respective examples, remains untouched.
In short: Everything is a reference, and references are passed by value.
In C, on the other hand, I could implement a true "passing by reference" by declaring void f(struct Foo ** r); and calling f(&w); this would allow me to change w itself from within f.
Note 1: this is not true for fundamental types like int, which are wholly passed by value.
Note 2: A C++ example would be a bit tidier since I could pass the pointer by reference (and I wouldn't have to say struct): void f(Foo * & r) { r = new Foo; } and f(w);.
Source: https://stackoverflow.com/questions/8113781/what-exactly-does-pass-by-reference-mean