问题
I'm coming from Java and learning Python. So far what I found very cool, yet very hard to adapt, is that there's no need to declare types. I understand that each variable is a pointer to an object, but so far I'm not able to understand how to design my code then.
For example, I'm writing a function that accepts a 2D NumPy array. Then in the body of the function I'm calling different methods of this array (which is an object of array
in Numpy). But then in the future suppose I want to use this function, by that time I might have forgotten totally what I should pass to the function as a type. What do people normally do? Do they just write documentation for this? Because if that is the case, then this involves more typing and would raise the question about the idea of not declaring the type.
Also suppose I want to pass an object similar to an array in the future. Normally in Java one would implement an interface and then let both classes to implement the methods. Then in the function parameters I define the variable to be of the type of the interface. How can this issue be solved in Python or what approaches can be used to make the same idea?
回答1:
This is a very healthy question.
Duck typing
The first thing to understand about python is the concept of duck typing:
If it walks like a duck, and quacks like a duck, then I call it a duck
Unlike Java, Python's types are never declared explicitly. There is no restriction, neither at compile time nor at runtime, in the type an object can assume.
What you do is simply treat objects as if they were of the perfect type for your needs. You don't ask or wonder about its type. If it implements the methods and attributes you want it to have, then that's that. It will do.
def foo(duck):
duck.walk()
duck.quack()
The only contract of this function is that duck
exposes walk()
and quack()
. A more refined example:
def foo(sequence):
for item in sequence:
print item
What is sequence
? A list
? A numpy array
? A dict
? A generator
? It doesn't matter. If it's iterable (that is, it can be used in a for ... in
), it serves its purpose.
Type hinting
Of course, no one can live in constant fear of objects being of the wrong type. This is addressed with coding style, conventions and good documentation. For example:
- A variable named
count
should hold an integer - A variable
Foo
starting with an upper-case letter should hold atype
(class
) - An argument
bar
whose default value isFalse
, should hold abool
too when overridden
Note that the duck typing concept can be applied to to these 3 examples:
count
can be any object that implements+
,-
, and<
Foo
can be any callable that returns an object instancebar
can be any object that implements__nonzero__
In other words, the type is never defined explicitly, but always strongly hinted at. Or rather, the capabilities of the object are always hinted at, and its exact type is not relevant.
It's very common to use objects of unknown types. Most frameworks expose types that look like lists and dictionaries but aren't.
Finally, if you really need to know, there's the documentation. You'll find python documentation vastly superior to Java's. It's always worth the read.
回答2:
I've reviewed a lot of Python code written by Java and .Net developers, and I've repeatedly seen a few issues I might warn/inform you about:
Python is not Java
Don't wrap everything in a class:
Seems like even the simplest function winds up being wrapped in a class when Java developers start writing Python. Python is not Java. Don't write getters and setters, that's what the property decorator is for.
I have two predicates before I consider writing classes:
- I am marrying state with functionality
- I expect to have multiple instances (otherwise a module level dict and functions is fine!)
Don't type-check everything
Python uses duck-typing. Refer to the data model. Its builtin type coercion is your friend.
Don't put everything in a try-except block
Only catch exceptions you know you'll get, using exceptions everywhere for control flow is computationally expensive and can hide bugs. Try to use the most specific exception you expect you might get. This leads to more robust code over the long run.
Learn the built-in types and methods, in particular:
From the data-model
str
join
- just do
dir(str)
and learn them all.
list
append
(add an item on the end of the list)extend
(extend the list by adding each item in an iterable)
dict
get
(provide a default that prevents you from having to catch keyerrors!)setdefault
(set from the default or the value already there!)fromkeys
(build a dict with default values from an iterable of keys!)
set
Sets contain unique (no repitition) hashable objects (like strings and numbers). Thinking Venn diagrams? Want to know if a set of strings is in a set of other strings, or what the overlaps are (or aren't?)
union
intersection
difference
symmetric_difference
issubset
isdisjoint
And just do dir()
on every type you come across to see the methods and attributes in its namespace, and then do help() on the attribute to see what it does!
Learn the built-in functions and standard library:
I've caught developers writing their own max functions and set objects. It's a little embarrassing. Don't let that happen to you!
Important modules to be aware of in the Standard Library are:
os
sys
collections
itertools
pprint
(I use it all the time)logging
unittest
re
(regular expressions are incredibly efficient at parsing strings for a lot of use-cases)
And peruse the docs for a brief tour of the standard library, here's Part 1 and here's Part II. And in general, make skimming all of the docs an early goal.
Read the Style Guides:
You will learn a lot about best practices just by reading your style guides! I recommend:
- PEP 8 (anything included in the standard library is written to this standard)
- Google's Python Style Guide
- Your firm's, if you have one.
Additionally, you can learn great style by Googling for the issue you're looking into with the phrase "best practice" and then selecting the relevant Stackoverflow answers with the greatest number of upvotes!
I wish you luck on your journey to learning Python!
回答3:
For example I'm writing a function that accepts a 2D Numpy array. Then in the body of the function I'm calling different methods of this array (which is an object of array in Numpy). But then in the future suppose I want to use this function, by that time I might forgot totally what should I pass to the function as a type. What do people normally do? Do they just write a documentation for this?
You write documentation and name the function and variables appropriately.
def func(two_d_array):
do stuff
Also suppose I want in the future to pass an object similar to an array, normally in Java one would implement an interface and then let both classes to implement the methods.
You could do this. Create a base class and inherit from it, so that multiple types have the same interface. However, quite often, this is overkill and you'd simply use duck typing instead. With duck typing, all that matters is that the object being evaluated defines the right properties and methods required to use it within your code.
Note that you can check for types in Python, but this is generally considered bad practice because it prevents you from using duck typing and other coding patterns enabled by Python's dynamic type system.
回答4:
Yes, you should document what type(s) of arguments your methods expect, and it's up to the caller to pass the correct type of object. Within a method, you can write code to check the types of each argument, or you can just assume it's the correct type, and rely on Python to automatically throw an exception if the passed-in object doesn't support the methods that your code needs to call on it.
The disadvantage of dynamic typing is that the computer can't do as much up-front correctness checking, as you've noted; there's a greater burden on the programmer to make sure that all arguments are of the right type. But the advantage is that you have much more flexibility in what types can be passed to your methods:
- You can write a method that supports several different types of objects for a particular argument, without needing overloads and duplicated code.
- Sometimes a method doesn't really care about the exact type of an object as long as it supports a particular method or operation — say, indexing with square brackets, which works on strings, arrays, and a variety of other things. In Java you'd have to create an interface, and write wrapper classes to adapt various pre-existing types to that interface. In Python you don't need to do any of that.
回答5:
You can use assert
to check if conditions match:
In [218]: def foo(arg):
...: assert type(arg) is np.ndarray and np.rank(arg)==2, \
...: 'the argument must be a 2D numpy array'
...: print 'good arg'
In [219]: foo(np.arange(4).reshape((2,2)))
good arg
In [220]: foo(np.arange(4))
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-220-c0ee6e33c83d> in <module>()
----> 1 foo(np.arange(4))
<ipython-input-218-63565789690d> in foo(arg)
1 def foo(arg):
2 assert type(arg) is np.ndarray and np.rank(arg)==2, \
----> 3 'the argument must be a 2D numpy array'
4 print 'good arg'
AssertionError: the argument must be a 2D numpy array
It's always better to document what you've written completely as @ChinmayKanchi mentioned.
回答6:
Here are a few pointers that might help you make your approach more 'Pythonic'.
The PEPs
In general, I recommend at least browsing through the PEPs. It helped me a lot to grok Python.
Pointers
Since you mentioned the word pointers, Python doesn't use pointers to objects in the sense that C uses pointers. I am not sure about the relationship to Java. Python uses names attached to objects. It's a subtle but important difference that can cause you problems if you expect similar-to-C pointer behavior.
Duck Typing
As you said, yes, if you are expecting a certain type of input you put it in the docstring.
As zhangxaochen wrote, you can use assert to do realtime typing of your arguments, but that's not really the python way if you are doing it all the time with no particular reason. As others mentioned, it's better to test and raise a TypeError if you have to do this. Python favors duck typing instead - if you send me something that quacks like a numpy 2D array, then that's fine.
来源:https://stackoverflow.com/questions/22128123/how-to-design-code-in-python