python-internals

Why is checking isinstance(something, Mapping) so slow?

穿精又带淫゛_ submitted on 2019-11-30 05:01:40
Question: I recently compared the performance of collections.Counter to sorted for comparison checks (whether some iterable contains the same elements in the same quantities). While Counter's performance on big iterables is generally better than sorted's, it is much slower for short iterables. Using line_profiler, the bottleneck seems to be the isinstance(iterable, collections.Mapping) check in Counter.update:

%load_ext line_profiler  # IPython
lst = list(range(1000))
%lprun -f Counter.update Counter(lst)
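
To put a number on the cost being discussed, here is a minimal sketch (not from the original question) that compares an isinstance() check against the Mapping ABC with a check against a concrete type. It assumes a modern interpreter where the ABC lives in collections.abc; absolute timings will vary by machine and Python version.

import timeit
from collections.abc import Mapping

lst = list(range(10))

# ABC checks go through ABCMeta.__instancecheck__; concrete-type checks do not.
abc_check = timeit.timeit(lambda: isinstance(lst, Mapping), number=1_000_000)
concrete_check = timeit.timeit(lambda: isinstance(lst, dict), number=1_000_000)

print(f"isinstance(lst, Mapping): {abc_check:.3f} s per 1M calls")
print(f"isinstance(lst, dict):    {concrete_check:.3f} s per 1M calls")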

Why is deque implemented as a linked list instead of a circular array?

帅比萌擦擦* submitted on 2019-11-30 04:42:26
The CPython deque is implemented as a doubly-linked list of 64-item "blocks" (arrays). The blocks are all full, except for the ones at either end of the linked list. IIUC, a block is freed when a pop / popleft removes the last item in it, and a new block is allocated when an append / appendleft attempts to add an item and the relevant end block is full. I understand the listed advantages of using a linked list of blocks rather than a linked list of items: it reduces the memory cost of prev and next pointers in every item, reduces the runtime cost of a malloc / free for every item added or removed, and improves …
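
As a rough illustration of where that block layout pays off, here is a minimal sketch (not from the original question) timing operations at the left end of a deque against the same operations on a list; the exact numbers are machine-dependent.

import timeit

setup = "from collections import deque; d = deque(range(100000)); l = list(range(100000))"

# deque only touches the leftmost block; list.insert(0, ...) shifts every element.
print(timeit.timeit("d.appendleft(0); d.popleft()", setup=setup, number=100000))
print(timeit.timeit("l.insert(0, 0); l.pop(0)", setup=setup, number=100000))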

Understanding memory allocation for large integers in Python

烈酒焚心 submitted on 2019-11-30 04:18:47
How does Python allocate memory for large integers? An int has a size of 28 bytes, and as I keep increasing its value, the size grows in increments of 4 bytes. Why 28 bytes initially, even for a value as small as 1? Why increments of 4 bytes? PS: I am running Python 3.5.2 on x86_64 (a 64-bit machine). Any pointers/resources/PEPs on how the 3.0+ interpreters handle such huge numbers is what I am looking for. Code illustrating the sizes:

>>> a = 1
>>> print(a.__sizeof__())
28
>>> a = 1024
>>> print(a.__sizeof__())
28
>>> a = 1024*1024*1024
>>> print(a.__sizeof__())
32
>>> a = 1024 …
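
A minimal sketch (not from the original question) that makes the same pattern easier to see: CPython stores an int as a PyLongObject whose payload is an array of 30-bit "digits" on 64-bit builds, so the reported size should step up by 4 bytes each time another digit is needed.

import sys

for exp in (0, 10, 30, 60, 90, 120):
    n = 2 ** exp
    # Each extra 30-bit digit adds 4 bytes to the object on a 64-bit build.
    print(f"2**{exp:<3} -> {sys.getsizeof(n)} bytes")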

Why is string's startswith slower than in?

廉价感情. submitted on 2019-11-30 04:10:13
Surprisingly, I find that startswith is slower than in:

In [10]: s = "ABCD"*10
In [11]: %timeit s.startswith("XYZ")
1000000 loops, best of 3: 307 ns per loop
In [12]: %timeit "XYZ" in s
10000000 loops, best of 3: 81.7 ns per loop

As we all know, the in operation needs to search the whole string while startswith only needs to check the first few characters, so startswith should be more efficient. When s is big enough, startswith is faster:

In [13]: s = "ABCD"*200
In [14]: %timeit s.startswith("XYZ")
1000000 loops, best of 3: 306 ns per loop
In [15]: %timeit "XYZ" in s
1000000 loops, best of 3: 666 ns per …
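
One commonly cited factor is the attribute lookup and method-call overhead that s.startswith(...) pays on every call, while the in operator is dispatched directly. Here is a minimal sketch (not from the original question) that pre-binds the method to make that overhead visible; the numbers are only indicative.

import timeit

setup = 's = "ABCD" * 10; starts = s.startswith'

print(timeit.timeit('s.startswith("XYZ")', setup=setup))  # attribute lookup + call
print(timeit.timeit('starts("XYZ")', setup=setup))        # call only
print(timeit.timeit('"XYZ" in s', setup=setup))           # operator dispatch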

In what structure is a Python object stored in memory?

爱⌒轻易说出口 submitted on 2019-11-30 03:58:29
Question: Say I have a class A:

class A(object):
    def __init__(self, x):
        self.x = x
    def __str__(self):
        return self.x

And I use sys.getsizeof to see how many bytes an instance of A takes:

>>> sys.getsizeof(A(1))
64
>>> sys.getsizeof(A('a'))
64
>>> sys.getsizeof(A('aaa'))
64

As the experiment above illustrates, the size of an A object is the same no matter what self.x is. So I wonder: how does Python store an object internally?

Answer 1: It depends on what kind of object, and also on which Python implementation :-) In …
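
Part of the explanation hinted at in the answer is that sys.getsizeof reports only the object itself, not what it references. A minimal sketch (not from the original post), assuming CPython, where instance attributes live in a separate __dict__:

import sys

class A(object):
    def __init__(self, x):
        self.x = x

a = A('a' * 1000)
print(sys.getsizeof(a))           # the instance struct only, constant size
print(sys.getsizeof(a.__dict__))  # the attribute dictionary is a separate object
print(sys.getsizeof(a.x))         # the referenced value grows with the data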

Which standard library modules are required to run the Python 3.5 interpreter?

こ雲淡風輕ζ submitted on 2019-11-30 03:28:09
Question: Here's a C program that embeds CPython and tries to initialize the interpreter with an empty sys.path:

#include <Python.h>

int main(int argc, char** argv) {
    wchar_t* program = NULL;
    wchar_t* sys_path = NULL;
    Py_NoSiteFlag = 1;
    program = Py_DecodeLocale(argv[0], NULL);
    Py_SetProgramName(program);
    sys_path = Py_DecodeLocale("", NULL);
    Py_SetPath(sys_path);
    Py_Initialize();
    PyMem_RawFree(program);
    PyMem_RawFree(sys_path);
    Py_Finalize();
}

Executing the program above raises the following error: Fatal Python …
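
One way to see which standard library modules the startup sequence actually pulls in is to ask a freshly started interpreter what it has already imported. A minimal sketch (not from the original question), run from Python rather than C for convenience:

import subprocess
import sys

# -S skips the site module, roughly matching Py_NoSiteFlag = 1 above.
out = subprocess.run(
    [sys.executable, "-S", "-c", "import sys; print(sorted(sys.modules))"],
    capture_output=True,
    text=True,
)
print(out.stdout)  # expect encodings, codecs, io, abc, ... among others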

RuntimeError: lost sys.stdout

烂漫一生 submitted on 2019-11-29 23:29:04
Question: I was trying to debug an issue with abc.ABCMeta, in particular a subclass check that didn't work as expected, and I wanted to start by simply adding a print to the __subclasscheck__ method (I know there are better ways to debug code, but pretend for the sake of this question that there's no alternative). However, when starting Python afterwards, it crashes (like a segmentation fault) and I get this exception:

Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback …
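
The underlying trap is that abc is imported during interpreter startup, before sys.stdout exists. A minimal sketch (not from the original question) of one workaround: patch ABCMeta.__subclasscheck__ at runtime in your own session instead of editing abc.py, so startup stays untouched. This assumes a CPython where __subclasscheck__ is a plain Python-level function on ABCMeta.

import abc
import functools

original = abc.ABCMeta.__subclasscheck__

@functools.wraps(original)
def traced(cls, subclass):
    # Runs only in this session, long after sys.stdout has been set up.
    print(f"__subclasscheck__({cls!r}, {subclass!r})")
    return original(cls, subclass)

abc.ABCMeta.__subclasscheck__ = traced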

Python: the __getattribute__ method and descriptors

耗尽温柔 submitted on 2019-11-29 22:49:59
Question: According to this guide on Python descriptors, https://docs.python.org/2/howto/descriptor.html, method objects in new-style classes are implemented using descriptors in order to avoid special-casing them in attribute lookup. The way I understand this is that there is a method object type that implements __get__ and returns a bound method object when called with an instance, and an unbound method object when called with no instance and only a class. The article also states that this logic is …
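
To make the descriptor machinery concrete, here is a minimal sketch (not from the original question, and using Python 3 semantics, where "unbound methods" are plain functions): invoking the function object's __get__ by hand produces the same bound method that normal attribute lookup returns.

class A:
    def f(self):
        return "hello"

a = A()

# Functions are (non-data) descriptors: attribute lookup calls __get__ for us.
bound = A.__dict__['f'].__get__(a, A)
print(bound)               # <bound method A.f of <__main__.A object at ...>>
print(bound() == a.f())    # True: same result as ordinary attribute access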

Why is code using intermediate variables faster than code without?

三世轮回 submitted on 2019-11-29 20:06:33
I have encountered this weird behavior and failed to explain it. These are the benchmarks:

py -3 -m timeit "tuple(range(2000)) == tuple(range(2000))"
10000 loops, best of 3: 97.7 usec per loop
py -3 -m timeit "a = tuple(range(2000)); b = tuple(range(2000)); a==b"
10000 loops, best of 3: 70.7 usec per loop

How come the comparison with variable assignment is faster than the one-liner with temporary variables, by more than 27%? According to the Python docs, garbage collection is disabled during timeit, so it can't be that. Is it some sort of optimization? The results can also be reproduced in Python 2.x.
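
A natural first step when two seemingly equivalent statements time differently is to compare their bytecode. A minimal sketch (not from the original question), using CPython's dis module on the two statements as compiled strings:

import dis

print("one-liner:")
dis.dis(compile("tuple(range(2000)) == tuple(range(2000))", "<timeit>", "exec"))

print("with intermediate variables:")
dis.dis(compile("a = tuple(range(2000)); b = tuple(range(2000)); a == b", "<timeit>", "exec"))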

Why is it slower to iterate over a small string than a small list?

霸气de小男生 submitted on 2019-11-29 19:13:20
I was playing around with timeit and noticed that a simple list comprehension over a small string took longer than the same operation on a list of small single-character strings. Any explanation? It takes almost 1.35 times as long.

>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
2.0691067844831528
>>> timeit("[x for x in ['a', 'b', 'c']]")
1.5286479570345861

What's happening on a lower level that's causing this?

Veedrac: TL;DR, the actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2. Object creation is not at fault.
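
One way to strip away the list-building overhead the answer alludes to is to consume each iterable with deque(..., maxlen=0), leaving mostly the cost of iteration itself. A minimal sketch (not from the original post); absolute timings will differ between Python 2 and 3 and between machines.

from timeit import timeit

setup = "from collections import deque; s = 'abc'; l = ['a', 'b', 'c']"

# A deque with maxlen=0 exhausts the iterator without storing anything.
print(timeit("deque(s, maxlen=0)", setup=setup))
print(timeit("deque(l, maxlen=0)", setup=setup))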