A couple of weeks ago, after an upgrade of libffi, we experienced odd python build errors, but only on systems where python had previously been installed with an older libffi version:
error: [Errno 2] No such file or directory: '/lib/libffi-3.0.13/include/ffi.h'
There was no reference to libffi-3.0.13 anywhere in the python source; it turned out that the path was contained in old python .pyc/.pyo bytecode files that had survived a rebuild due to a packaging bug, and that apparently were treated as authoritative during the python build:
/lib/python2.7/_sysconfigdata.pyc:/lib/libffi-3.0.13/include
/lib/python2.7/_sysconfigdata.pyo:/lib/libffi-3.0.13/include
The packaging bug was that we didn't pre-generate the .pyc/.pyo files right after the python build, so that they would become part of the package directory in /opt/python; instead they were created on first access directly in /lib/python2.7, resulting in the following layout:
~ $ la /lib/python2.7/ | grep sysconfigdata
lrwxrwxrwx 1 root root 48 Mar 4 03:11 _sysconfigdata.py -> ../../opt/python/lib/python2.7/_sysconfigdata.py
-rw-r--r-- 1 root root 19250 Mar 4 03:20 _sysconfigdata.pyc
-rw-r--r-- 1 root root 19214 Jun 30 2018 _sysconfigdata.pyo
So on a rebuild of python, only the symlinks pointing to /opt/python were removed, while the generated-on-first-use .pyc/.pyo files survived.
Annoyed by this occurrence, I started researching how the generation of these bytecode files could be suppressed, and it turned out that it can be controlled via the sys.dont_write_bytecode variable, which in turn is set from the python C code. Here's a patch doing that.
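For completeness: even without patching anything, the same behaviour can be triggered from the shell, since both the -B switch and the PYTHONDONTWRITEBYTECODE environment variable end up setting sys.dont_write_bytecode (myscript.py is just a placeholder):
# run a script without leaving .pyc/.pyo files behind
python -B myscript.py
# or for everything started from this shell, e.g. during a build
PYTHONDONTWRITEBYTECODE=1 python myscript.py
# the setting is visible from within python itself
python -B -c 'import sys; print sys.dont_write_bytecode'   # prints True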
However, before turning off a feature that can potentially be a huge performance boost, a responsible distro maintainer needs to do a proper benchmarking study so he can make an educated decision.
So I developed a benchmark that runs a couple of tasks using the bazaar VCS, which is written in python and uses a large number of small files, so the startup overhead should be significant. The task is executed 50 times, so that small differences in the host's CPU load due to other tasks are evened out. The task is to create a new bazaar repo, check 2 files and a directory into bazaar in 3 commits, and print a log at the end.
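In shell terms it boils down to a loop like this (a sketch: file names and commit messages are made up, and bzr whoami is assumed to be configured); the whole script was run under time(1):
# bzr-bench.sh: create a repo, make 3 commits, print the log; repeated 50 times
for i in $(seq 50); do
  rm -rf repo && mkdir repo && cd repo
  bzr init
  echo foo > file1
  bzr add file1 && bzr commit -m "add file1"
  echo bar > file2
  bzr add file2 && bzr commit -m "add file2"
  mkdir dir
  bzr add dir && bzr commit -m "add dir"
  bzr log
  cd ..
done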
With bytecode generation disabled, the benchmark produced the following results:
real 3m 15.75s
user 2m 15.40s
sys 0m 4.12s
With pregenerated bytecode, the following results were measured:
real 1m 24.25s
user 0m 20.26s
sys 0m 2.55s
We can see that in the case of a fairly big application like bazaar, with hundreds of python files, the precompilation does indeed make quite a noticeable difference: it is more than twice as fast.
What's also becoming apparent is that bazaar is slow as hell.
For the lulz, I replaced the bzr command in the above benchmark with git and exported PAGER=cat so git log wouldn't interrupt the benchmark.
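The git version is the same loop with the commands swapped (again a sketch; git doesn't track empty directories, hence the extra file, and user.name/user.email are assumed to be configured):
export PAGER=cat   # keep "git log" from spawning a pager and stalling the loop
for i in $(seq 50); do
  rm -rf repo && mkdir repo && cd repo
  git init
  echo foo > file1 && git add file1 && git commit -m "add file1"
  echo bar > file2 && git add file2 && git commit -m "add file2"
  mkdir dir && touch dir/file3 && git add dir && git commit -m "add dir"
  git log
  cd ..
done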
As expected, git is orders of magnitude faster:
real 0m 0.48s
user 0m 0.02s
sys 0m 0.05s
Out of curiosity, I fiddled some more with python and added a patch that builds python so that its optimization switch -O is always active, and rebuilt both python and bazaar to produce only .pyo files instead of .pyc.
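For those who don't want to patch the interpreter, roughly the same setup can be approximated from the environment: PYTHONOPTIMIZE=1 is equivalent to passing -O on every invocation, and under -O the interpreter writes and uses .pyo files instead of .pyc (a quick check, assuming the module directory is writable):
# PYTHONOPTIMIZE=1 has the same effect as passing -O to every python invocation
export PYTHONOPTIMIZE=1
# under -O, importing a module writes a .pyo file instead of a .pyc
python -c 'import json'
ls /lib/python2.7/json/__init__.py*   # __init__.pyo now exists alongside the .py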
Here are the results:
real 1m 23.88s
user 0m 20.18s
sys 0m 2.54s
We can see that the optimization flag is next to useless: the difference is so small it's barely measurable.
Now, this benchmark was tailored to measure startup compilation cost for a big project; what about a mostly CPU-bound task that uses only a few python modules?
For this purpose I modified a password bruteforcer to exit after a couple thousand rounds, and ran it 30 times each: without bytecode, with .pyc, and with .pyo bytecode.
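The harness around it looked roughly like this (a sketch: bruteforce.py stands in for the modified bruteforcer, the -B run emulates the no-bytecode build, and the .pyc/.pyo files were pregenerated with compileall before the respective runs):
# no bytecode: -B prevents writing, and no precompiled files are present
time sh -c 'for i in $(seq 30); do python -B bruteforce.py; done'
# .pyc bytecode: files pregenerated with compileall
time sh -c 'for i in $(seq 30); do python bruteforce.py; done'
# .pyo bytecode: pregenerated under -O, and run with -O so they are actually used
time sh -c 'for i in $(seq 30); do python -O bruteforce.py; done'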
Here are the results:
No bytecode:
real 3m 50.42s
user 3m 50.25s
sys 0m 0.03s
.pyc bytecode:
real 3m 48.68s
user 3m 48.60s
sys 0m 0.01s
.pyo bytecode:
real 3m 49.14s
user 3m 49.06s
sys 0m 0.01s
As expected, there's almost no difference between the three. Funnily enough, the optimized bytecode is even slower than the non-optimized bytecode in this case.
From my reading of this stackoverflow question, it appears that .pyo bytecode differs from regular bytecode only in that it lacks the instructions for the omitted assert statements, and possibly some debug facilities.
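That part is easy to verify: under -O the assert statements simply aren't compiled in, so they can never fire (and __debug__ becomes False):
~ $ python -c 'assert False, "boom"'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: boom
~ $ python -O -c 'assert False, "boom"'
~ $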
Which brings us back to the original problem: in order to have the .pyc files contained in the package directory, they need to be generated manually during the build, because apparently they're not installed as part of make install.
This can be achieved by calling
./python -E Lib/compileall.py "$dest"/lib/python2.7
after make install has finished. With that in place, I compared the size of the previous /opt/python directory without .pyc files to that of the new one.
It's 22.2 MB vs 31.1 MB, so the .pyc files add roughly 9 MB and make the package about 40% bigger.
Now it happens that some python packages, build scripts and the like call python with the optimization flag -O. This causes our previous problem to re-appear: now we will have stray .pyo files in /lib/python2.7.
So we need to pregenerate not only the .pyc, but also the .pyo files for all python modules. This will add another 9 MB to the python package directory.
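Doing both would mean running the compileall step twice after make install, once normally and once in optimized mode, since compileall emits .pyo instead of .pyc when the interpreter runs under -O:
# generate .pyc files
./python -E Lib/compileall.py "$dest"/lib/python2.7
# generate .pyo files as well, by running the same step in optimized mode
./python -O -E Lib/compileall.py "$dest"/lib/python2.7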
OR... we could simply turn off the ability to activate the optimized mode, which, as we saw, is 99.99% useless. This seems to be the most reasonable thing to do, and therefore this is precisely what I have now implemented in sabotage linux.