python-internals

Python source code for built-in “in” operator

我与影子孤独终老i submitted on 2019-11-28 20:52:23
I am trying to find the implementation of the built-in in operator in the CPython source code. I have searched the source for the built-in functions, bltinmodule.c, but cannot find the implementation of this operator there. Where can I find it? My goal is to improve substring search in Python by extending different C implementations of this search, although I am not sure whether Python already uses the idea I have.

To find the implementation of any Python operator, first find out what bytecode Python generates for it, using the dis.dis function:

    >>> dis.dis("'0' in ()")
    1 0
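As a rough illustration of the dis-based approach described above, the sketch below lists the opcodes CPython emits for a membership test and shows that in dispatches to a container's __contains__ method. The helper opnames and the class Evens are hypothetical names introduced only for this example, and the exact opcode names depend on the CPython version, so the code prints them rather than assuming one.

```python
import dis


def opnames(expression):
    """Return the opcode names CPython emits for an expression."""
    code = compile(expression, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


class Evens:
    """Hypothetical container that defines membership via __contains__."""

    def __contains__(self, item):
        return item % 2 == 0


# Opcode names vary by CPython version, so inspect them instead of
# hard-coding a particular name.
membership_ops = opnames("x in y")
print(membership_ops)

# The `in` operator ends up calling the container's __contains__ method.
print(4 in Evens(), 3 in Evens())
```

Once the opcode name is known, searching the interpreter loop in the CPython source for that name is one way to reach the C code that handles it.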

Why is str.translate much faster in Python 3.5 compared to Python 3.4?

二次信任 submitted on 2019-11-28 18:30:15
I was trying to remove unwanted characters from a given string using text.translate() in Python 3.4. The minimal code is:

    import sys
    s = 'abcde12345@#@$#%$'
    mapper = dict.fromkeys(i for i in range(sys.maxunicode) if chr(i) in '@#$')
    print(s.translate(mapper))

It works as expected. However, the same program shows a large difference in run time between Python 3.4 and Python 3.5. The code used to measure the timings is:

    python3 -m timeit -s "import sys;s = 'abcde12345@#@$#%$'*1000 ; mapper = dict.fromkeys(i for i in range(sys.maxunicode) if chr(i) in '@#$'); " "s.translate(mapper)"

The Python 3.4 program
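For reference, the same deletion table can be built without scanning every code point. This is only a sketch of an alternative way to construct the mapping, not an explanation of the timing difference; the variable names are chosen for this example. It relies on the three-argument form of str.maketrans, whose third argument lists characters to delete.

```python
text = "abcde12345@#@$#%$"
unwanted = "@#$"

# Three-argument form: every character in the third argument maps to None,
# which tells str.translate to delete it.
table = str.maketrans("", "", unwanted)

# Equivalent table built by hand from the code points of the unwanted characters.
manual_table = dict.fromkeys(map(ord, unwanted))

cleaned = text.translate(table)
print(cleaned)  # '%' is not in the unwanted set, so it is kept
```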

Converting a series of ints to strings - Why is apply much faster than astype?

狂风中的少年 submitted on 2019-11-28 18:17:23
I have a pandas.Series containing integers, but I need to convert these to strings for some downstream tools. So suppose I had a Series object:

    import numpy as np
    import pandas as pd
    x = pd.Series(np.random.randint(0, 100, 1000000))

On StackOverflow and other websites, I've seen most people argue that the best way to do this is:

    %%timeit
    x = x.astype(str)

This takes about 2 seconds. When I use x = x.apply(str), it only takes 0.2 seconds. Why is x.astype(str) so slow? Should the recommended way be x.apply(str)? I'm mainly interested in Python 3's behavior for this.

jpp:

Performance

It's worth

Does Python optimize away a variable that's only used as a return value?

痴心易碎 submitted on 2019-11-28 17:20:47
Is there any real difference between the following two code snippets? The first assigns a value to a variable in a function and then returns that variable. The second function just returns the value directly. Does Python turn them into equivalent bytecode? Is one of them faster?

Case 1:

    def func():
        a = 42
        return a

Case 2:

    def func():
        return 42

No, it doesn't. The compilation to CPython bytecode only passes through a small peephole optimizer that is designed to do only basic optimizations (see test_peepholer.py in the test suite for more on these optimizations). To take a look at
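One way to check the claim that the two cases do not compile to the same bytecode is to list the opcodes of each function. The sketch below assumes CPython; the function names with_variable and direct and the helper opnames are made up for this example, and exact opcode names differ between versions, so the code only looks for a store to a local variable.

```python
import dis


def with_variable():
    a = 42
    return a


def direct():
    return 42


def opnames(func):
    """Return the opcode names of a function's compiled bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


ops_with_variable = opnames(with_variable)
ops_direct = opnames(direct)

# The first listing is expected to contain a store to the local `a`;
# the second should have no such instruction.
print(ops_with_variable)
print(ops_direct)
```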

Why are Python's arrays slow?

大憨熊 submitted on 2019-11-28 16:26:26
I expected array.array to be faster than lists, as arrays seem to be unboxed. However, I get the following result:

    In [1]: import array
    In [2]: L = list(range(100000000))
    In [3]: A = array.array('l', range(100000000))
    In [4]: %timeit sum(L)
    1 loop, best of 3: 667 ms per loop
    In [5]: %timeit sum(A)
    1 loop, best of 3: 1.41 s per loop
    In [6]: %timeit sum(L)
    1 loop, best of 3: 627 ms per loop
    In [7]: %timeit sum(A)
    1 loop, best of 3: 1.39 s per loop

What could be the cause of such a difference?

Tim Peters: The storage is "unboxed", but every time you access an element Python has to "box" it (embed
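The boxing cost mentioned above can be made visible with a small sketch: the array stores raw C values of itemsize bytes each, while every element read back is a full Python int object. The variable names are chosen for this example, and the exact sizes printed depend on the platform and CPython build.

```python
import array
import sys

values = list(range(1000, 1010))
packed = array.array("l", values)

# Bytes used per element inside the array's raw C buffer.
print(packed.itemsize)

# Size of the Python int object produced when an element is read back.
print(sys.getsizeof(packed[0]))
print(type(packed[0]))

# Both containers hold the same numbers; only the storage differs.
total_list = sum(values)
total_array = sum(packed)
print(total_list, total_array)
```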

Python: why are * and ** faster than / and sqrt()?

半腔热情 submitted on 2019-11-28 16:22:32
Question: While optimising my code I realised the following:

    >>> from timeit import Timer as T
    >>> T(lambda : 1234567890 / 4.0).repeat()
    [0.22256922721862793, 0.20560789108276367, 0.20530295372009277]
    >>> from __future__ import division
    >>> T(lambda : 1234567890 / 4).repeat()
    [0.14969301223754883, 0.14155197143554688, 0.14141488075256348]
    >>> T(lambda : 1234567890 * 0.25).repeat()
    [0.13619112968444824, 0.1281130313873291, 0.12830305099487305]

and also:

    >>> from math import sqrt
    >>> T(lambda : sqrt
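A comparable measurement can be sketched for Python 3 with the timeit module. The function time_expressions and its parameters are hypothetical, introduced only for this example. It reads the operand from a variable rather than a literal, because CPython can fold constant expressions at compile time, which would mean the arithmetic is not timed at all. The resulting numbers depend on the machine and interpreter version, so none are shown here.

```python
import math
import timeit


def time_expressions(x=1234567890, number=100_000):
    """Time several arithmetic expressions on a variable operand."""
    namespace = {"x": x, "sqrt": math.sqrt}
    expressions = ["x / 4.0", "x * 0.25", "sqrt(x)", "x ** 0.5"]
    return {
        expr: timeit.timeit(expr, globals=namespace, number=number)
        for expr in expressions
    }


timings = time_expressions()
for expr, seconds in timings.items():
    print(f"{expr:10s} {seconds:.4f} s")
```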

Why do tuples take less space in memory than lists?

杀马特。学长 韩版系。学妹 submitted on 2019-11-28 16:08:46
A tuple takes less memory space in Python:

    >>> a = (1,2,3)
    >>> a.__sizeof__()
    48

whereas a list takes more memory space:

    >>> b = [1,2,3]
    >>> b.__sizeof__()
    64

What happens internally in Python's memory management?

I assume you're using CPython on 64-bit (I got the same results on my CPython 2.7 64-bit). There could be differences in other Python implementations or if you have a 32-bit Python. Regardless of the implementation, lists are variable-sized while tuples are fixed-size. So tuples can store the elements directly inside the struct, lists on the other hand need a layer of
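The size difference can be checked for several lengths with a short sketch. The helper sizes is a made-up name for this example, and the byte counts it prints depend on the CPython version and on whether the build is 32-bit or 64-bit, so only the relative ordering is meaningful.

```python
def sizes(n):
    """Return (tuple size, list size) in bytes for n integer elements."""
    items = list(range(n))
    return tuple(items).__sizeof__(), items.__sizeof__()


for n in range(5):
    tuple_size, list_size = sizes(n)
    print(n, tuple_size, list_size)
```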

Why is code using intermediate variables faster than code without?

扶醉桌前 submitted on 2019-11-28 15:54:57
Question: I have encountered this weird behavior and failed to explain it. These are the benchmarks:

    py -3 -m timeit "tuple(range(2000)) == tuple(range(2000))"
    10000 loops, best of 3: 97.7 usec per loop

    py -3 -m timeit "a = tuple(range(2000)); b = tuple(range(2000)); a==b"
    10000 loops, best of 3: 70.7 usec per loop

How come the comparison with variable assignment is more than 27% faster than the one-liner with temporaries? According to the Python docs, garbage collection is disabled during timeit so

Why does tuple(set([1,"a","b","c","z","f"])) == tuple(set(["a","b","c","z","f",1])) 85% of the time with hash randomization enabled?

蹲街弑〆低调 submitted on 2019-11-28 15:44:52
Question: Given Zero Piraeus' answer to another question, we have that

    x = tuple(set([1, "a", "b", "c", "z", "f"]))
    y = tuple(set(["a", "b", "c", "z", "f", 1]))
    print(x == y)

prints True about 85% of the time with hash randomization enabled. Why 85%?

Answer 1: I'm going to assume that any readers of this question have read both Zero Piraeus' answer and my explanation of CPython's dictionaries. The first thing to note is that hash randomization is decided on interpreter start-up. The hash of each letter will
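The point that hash randomization is fixed at interpreter start-up can be sketched by launching fresh interpreters with different values of the PYTHONHASHSEED environment variable. The helper hash_in_fresh_interpreter is a hypothetical name for this example, and the seeds 1 and 2 are arbitrary. The sketch assumes CPython 3, where string hashes are seeded and small integers hash to themselves.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(value_repr, seed):
    """Return hash(<value_repr>) as computed by a new interpreter process."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({value_repr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# String hashes depend on the seed chosen when the interpreter starts.
print(hash_in_fresh_interpreter("'a'", 1))
print(hash_in_fresh_interpreter("'a'", 2))

# The hash of a small integer does not depend on the seed.
print(hash_in_fresh_interpreter("1", 1), hash_in_fresh_interpreter("1", 2))
```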

How exactly is Python Bytecode Run in CPython?

守給你的承諾、 submitted on 2019-11-28 15:19:31
I am trying to understand how Python works (because I use it all the time!). To my understanding, when you run something like python script.py, the script is converted to bytecode, and then the interpreter/VM/CPython (really just a C program) reads in the Python bytecode and executes the program accordingly. How is this bytecode read in? Is it similar to how a text file is read in C? I am unsure how the Python code is converted to machine code. Is it the case that the Python interpreter (the python command in the CLI) is really just a precompiled C program that is already converted to machine
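The compile-then-execute pipeline asked about above can be observed from Python itself. The sketch below, with variable names chosen for this example, compiles a one-line program into a code object, looks at its raw bytecode bytes, and round-trips the code object through the marshal module, which is the serialization CPython uses for the code objects stored in .pyc files. It illustrates the data involved and is not a description of the interpreter's C implementation.

```python
import marshal

source = "result = 2 + 3"

# Compiling produces a code object; no machine code is generated here.
code = compile(source, "<example>", "exec")

# The bytecode itself is a plain bytes object attached to the code object.
raw = code.co_code
print(type(raw), len(raw))

# Code objects can be serialized to bytes and loaded back.
serialized = marshal.dumps(code)
restored = marshal.loads(serialized)

# Executing the restored code object runs the program in the given namespace.
namespace = {}
exec(restored, namespace)
print(namespace["result"])
```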