cpython | 易学教程

String matching performance: gcc versus CPython


Question: Whilst researching performance trade-offs between Python and C++, I've devised a small example, which mostly focuses on a dumb substring matching. Here is the relevant C++:

    using std::string;
    std::vector<string> matches;
    std::copy_if(patterns.cbegin(), patterns.cend(), back_inserter(matches),
        [&fileContents] (const string &pattern) {
            return fileContents.find(pattern) != string::npos;
        } );

The above is built with -O3. And here is Python:

    def getMatchingPatterns(patterns, text): return filter

RuntimeError: lost sys.stdout


Question: I was trying to debug an issue with abc.ABCMeta, in particular a subclass check that didn't work as expected, and I wanted to start by simply adding a print to the __subclasscheck__ method (I know there are better ways to debug code, but pretend for the sake of this question that there's no alternative). However, when starting Python afterwards, Python crashes (like a segmentation fault) and I get this exception:

    Fatal Python error: Py_Initialize: can't initialize sys standard streams
    Traceback

Boolean identity == True vs is True


It is standard convention to use if foo is None rather than if foo == None to test if a value is specifically None. If you want to determine whether a value is exactly True (not just a true-like value), is there any reason to use if foo == True rather than if foo is True? Does this vary between implementations such as CPython (2.x and 3.x), Jython, PyPy, etc.? Example: say True is used as a singleton value that you want to differentiate from the value 'bar', or any other true-like value:

    if foo is True:  # vs foo == True
        ...
    elif foo == 'bar':
        ...

Is there a case where using if foo is True
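To make the distinction in the question concrete, here is a minimal sketch. The helper name classify is hypothetical and only illustrates the if/elif chain from the excerpt. It relies on the fact that bool is a subclass of int, so the integer 1 compares equal to True without being the same object.

```python
def classify(foo):
    """Hypothetical helper mirroring the if/elif chain in the question."""
    if foo is True:          # identity: only the True singleton passes
        return "exactly True"
    elif foo == 'bar':
        return "bar"
    return "other"


one = 1  # an int that compares equal to True

# Equality follows numeric comparison rules; identity does not.
print(one == True)        # equality check
print(one is True)        # identity check
print(classify(True), classify(one), classify('bar'))
```

With an equality test in place of the identity test, values such as 1 and 1.0 would take the first branch as well, which is usually not what "exactly True" is meant to express.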

Why is code using intermediate variables faster than code without?


I have encountered this weird behavior and failed to explain it. These are the benchmarks:

    py -3 -m timeit "tuple(range(2000)) == tuple(range(2000))"
    10000 loops, best of 3: 97.7 usec per loop

    py -3 -m timeit "a = tuple(range(2000)); b = tuple(range(2000)); a==b"
    10000 loops, best of 3: 70.7 usec per loop

How come comparison with variable assignment is faster than using a one-liner with temporary variables by more than 27%? According to the Python docs, garbage collection is disabled during timeit, so it can't be that. Is it some sort of optimization? The results may also be reproduced in Python 2.x
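The two command-line benchmarks above can also be reproduced from a script. The following is a rough sketch using the standard timeit module. The helper best_time and the loop counts are assumptions chosen to keep the run short. It only measures the two statements and does not explain the difference, and the numbers will vary by machine and Python version.

```python
import timeit

# The two statements from the question, unchanged.
ONE_LINER = "tuple(range(2000)) == tuple(range(2000))"
WITH_VARS = "a = tuple(range(2000)); b = tuple(range(2000)); a == b"


def best_time(stmt, number=200, repeat=3):
    """Return the best per-loop time in seconds (hypothetical helper)."""
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number


for label, stmt in (("one-liner", ONE_LINER), ("with variables", WITH_VARS)):
    print(f"{label}: {best_time(stmt) * 1e6:.1f} usec per loop")
```

Taking the minimum over several repeats follows the usual timeit advice, since higher values mostly reflect interference from other processes.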

Why is it slower to iterate over a small string than a small list?


I was playing around with timeit and noticed that doing a simple list comprehension over a small string took longer than doing the same operation on a list of small single-character strings. Any explanation? It's almost 1.35 times as much time.

    >>> from timeit import timeit
    >>> timeit("[x for x in 'abc']")
    2.0691067844831528
    >>> timeit("[x for x in ['a', 'b', 'c']]")
    1.5286479570345861

What's happening on a lower level that's causing this?

Answer (Veedrac): TL;DR: The actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2. Object creation is not at fault.

IronPython vs. Python .NET


I want to access some .NET assemblies written in C# from Python code. A little research showed I have two choices: IronPython, with .NET interface capability/support built in, or Python with the Python .NET package. What are the trade-offs between both solutions?

Answer (Reed Copsey): If you want to mainly base your code on the .NET framework, I'd highly recommend IronPython over Python.NET. IronPython is pretty much native .NET, so it just works great when integrating with other .NET languages. Python.NET is good if you want to just integrate one or two components from .NET into a standard Python application.

Python hasattr vs getattr


Question: I have lately been reading some tweets and the Python documentation about hasattr, and it says:

    hasattr(object, name)
    The arguments are an object and a string. The result is True if the string is the name of one of the object's attributes, False if not. (This is implemented by calling getattr(object, name) and seeing whether it raises an AttributeError or not.)

There is a motto in Python that says it is "Easier to ask for forgiveness than permission", with which I usually agree. I tried to do a
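The documentation passage quoted above can be illustrated with a small sketch. The function has_attr_sketch and the class Lazy are hypothetical names used only for illustration. The real hasattr is a built-in implemented in C, and this sketch merely mirrors the behavior the quoted text describes for Python 3.

```python
def has_attr_sketch(obj, name):
    """Hypothetical pure-Python analogue of the documented hasattr behavior."""
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True


class Lazy:
    """Illustrative class whose property raises AttributeError itself."""

    @property
    def value(self):
        raise AttributeError("value is not ready yet")


print(has_attr_sketch("abc", "upper"), hasattr("abc", "upper"))
print(has_attr_sketch("abc", "nope"), hasattr("abc", "nope"))
# The property exists on the class, but the getter raises AttributeError,
# so both checks report the attribute as missing.
print(has_attr_sketch(Lazy(), "value"), hasattr(Lazy(), "value"))

# The "ask forgiveness" style reads the attribute once and supplies a default.
print(getattr("abc", "nope", None))
```

The last line shows the three-argument form of getattr, which avoids looking the attribute up twice when the value is needed right after the check.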

Should importlib.reload restore a deleted attribute in Python 3.6?


I'm looking into these two related questions: here and here. I am seeing a behavior I do not expect in Python 3.6, which differs from the behavior of plain reload in Python 2.7 (and 3.4). Namely, it seems that a module attribute that would be populated during module initialization, or when re-exec-ing the module during a reload, is not restored after its local name is removed with del ... see below. For Python 3.6:

    In [1]: import importlib
    In [2]: import math
    In [3]: del math.cos
    In [4]: math.cos
    ---------------------------------------------------------------------------
    AttributeError

10 Common Security Vulnerabilities in Python and How to Avoid Them (Part 2)


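Summary: Writing secure code is hard. When you learn a programming language, module, or framework, you learn how it is meant to be used. When thinking about security, you need to think about how it can be misused, and Python is no exception: even the standard library contains bad practices for writing applications. Yet many Python developers are simply unaware of them. Continued from Part 1.

6. Parsing XML

If your application loads and parses XML files, you are probably using one of the XML standard library modules. Most attacks through XML are DoS-style (designed to crash the system rather than leak data), and they are common, especially when parsing external (that is, untrusted) XML files. One of them is known as "billion laughs", because its payload usually contains a great many (a billion) "lols". Basically, the idea is that XML allows referential entities, so when the parser loads this XML file into memory, it consumes gigabytes of RAM. Try it out if you don't believe me :-)

    <?xml version="1.0"?>
    <!DOCTYPE lolz [
      <!ENTITY lol "lol">
      <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
      <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;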

What is Python's sequence protocol?


Question: Python does a lot with magic methods, and most of these are part of some protocol. I am familiar with the "iterator protocol" and the "number protocol" but recently stumbled over the term "sequence protocol". Even after some research, I'm not exactly sure what the "sequence protocol" is. For example, the C API function PySequence_Check checks (according to the documentation) whether some object implements the "sequence protocol". The source code indicates that this is a class that's not a dict
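As a Python-level illustration of what the term usually refers to, here is a minimal sketch. The class Squares is hypothetical. It defines only __len__ and an integer-indexed __getitem__, and several built-in operations already work on it. This shows observable behavior only and makes no claim about how PySequence_Check decides its result.

```python
from collections.abc import Sequence


class Squares:
    """Hypothetical class with only __len__ and integer __getitem__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        # Raising IndexError past the end is what stops index-based iteration.
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


sq = Squares(4)
print(list(sq))                   # iteration falls back to __getitem__(0), (1), ...
print(list(reversed(sq)))         # reversed() uses __len__ and __getitem__
print(9 in sq)                    # membership falls back to iteration
print(isinstance(sq, Sequence))   # the ABC is not satisfied implicitly
```

The last check is worth noting: collections.abc.Sequence requires inheritance or explicit registration, so an object can behave like a sequence without being recognized as one by the abstract base class.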