sqlite’s anal gamation

sqlite’s slogan: “Small. Fast. Reliable. Choose any three.”

i always wondered though, how such a small or “lite” package can take such a considerable amount of time to build.

as the main author of the sabotage linux distribution, building software is my daily bread, so i own a pretty fast build box.
it’s an 8 core machine with 3.1 GHz, which builds a complete 3.11 linux kernel in less than 5 minutes, making use of all 8 cores via the nice parallel build feature of GNU make.

make -j8

when invoking make like this, it first determines the dependencies between the translation units,
and then runs up to 8 build processes, one per cpu core, each one building a different .c file.

GCC 3.4.6, a C compiler with full C99 support builds in 43 sec:

$ time butch rebuild gcc3
2013.09.25 12:13:50 building gcc3 (/src/build/build_gcc3.sh) -> /src/logs/build_gcc3.log
2013.09.25 12:14:33 done.
real 0m 43.97s
user 1m 36.66s
sys 0m 13.74s

however, for sqlite, a supposedly small package, build times are comparatively huge:

$ time butch rebuild sqlite
2013.09.25 12:18:27 building sqlite (/src/build/build_sqlite.sh) -> /src/logs/build_sqlite.log
2013.09.25 12:19:21 done.
real 0m 54.03s
user 0m 52.02s
sys 0m 1.51s

nearly one minute, a fifth of the time used to build the linux kernel and 10 seconds more than the gcc compiler.

the full-blown postgresql database server package, takes less time to build as well:

$ time butch rebuild postgresql
2013.09.25 12:19:21 building postgresql (/src/build/build_postgresql.sh) -> /src/logs/build_postgresql.log
2013.09.25 12:19:57 done.
real 0m 36.63s
user 1m 53.34s
sys 0m 12.03s

how is it possible that postgresql, shipping 16 MB of compressed sources, as opposed to 1.8MB of sqlite, builds 33% faster ?

if you look at the user times above, you start getting an idea.
the user time (i.e. the entire cpu time burnt in userspace) for postgresql is 1m53, while the total time that actually passed was only 36s.

that means that the total work of 113 seconds was distributed among multiple cpu cores.
dividing the user time through the real time gives us a concurrency factor of 3.13.
not perfect, given that make was invoked with -j8, but much better than sqlite, which apparently only used a single core.

let’s take a look at sqlite’s builddir

$ find . -name '*.c'
./sqlite3.c
./shell.c
./tea/generic/tclsqlite3.c
./tea/win/nmakehlp.c

ah, funny. there are only 4 C files total. that partially explains why 8 cores didn’t help.
the 2 files in tea/ are not even used, which leaves us with

$ ls -la *.c
-rw-r--r-- 1 root root 91925 Jan 16 2012 shell.c
-rw-r--r-- 1 root root 4711082 Jan 16 2012 sqlite3.c

so in the top level builddir, there are just 2 C files, one being 90 KB, and the other roughly 5MB.
the 90KB version is built in less than 1 second, so after that the entire time spent is waiting for the single cpu core building the huge sqlite3.c.

so why on earth would somebody stuff all source code into a single translation unit and thereby defeat
makefile parallellism ?

after all, the IT industry’s mantra of the last 10 years was “parallellism, parallellism, and even more parallellism”.

here’s the explanation:
https://www.sqlite.org/amalgamation.html

it’s a “feature”, which they call amalgamation.

i call it anal gamation.

In addition to making SQLite easier to incorporate into other projects, the amalgamation also makes it run faster.
Many compilers are able to do additional optimizations on code when it is contained with in a single translation unit such as it is in the amalgamation.

so they have 2 reasons for wasting our time:

  • reason 1: easier to incorporate
  • reason 2: generated code is better as the compiler sees all code at once.

let’s look at reason 1:
what they mean with incorporation is embedding the sqlite source code into another projects source tree.

it is usually considered bad practice to embed third-party source code into your own source tree, for multiple reasons:

  • every program that uses its own embedded copy of library X does not benefit from security updates when the default library install is updated.
  • we have multiple different versions on the harddrive and loaded in RAM, wasting system resources
  • having multiple incompatible versions can lead to a lot of breakage when it’s used from another lib:
    for example application X uses lib Y and lib Z, and lib Z uses a “incorporated” version of lib Y.
    so we have a nice clash of 2 different lib Y versions. if lib Y has global state, it will get even worse.
  • if the library in question is using some unportable constructs, wrong ifdefs etc., it needs to be patched to build.
    having to apply and maintain different sets of patches against multiple different versions “incorporated” into other packages, represents a big burden for the packager.

instead, the installed version of libraries should be used.

pkg-config can be used to query existence, as well as CFLAGS and LDFLAGS needed to build against the installed version of the library. if the required library is not installed or too old, just throw an error at configure time and tell the user to install it via apt-get or whatever.

conclusion: “incorporation” of source code is a bad idea to begin with.

now let’s look at reason 2 (better optimized code):
it possibly sometimes made sense to help the compiler do its job in the 70ies, when everything started.
however, it’s 2013 now.
compilers do a great job optimizing, and they get better at it every day.

since GCC 4.5 was released in 2010, it ships with a feature called LTO
it builds object files together with metadata that allows it to strip off unneeded functions and variables, inline functions that are only called once or twice, etc at link time – pretty much anything the sqlite devs want to achieve, and probably even more than that.

conclusion: pseudo-optimizing C code by stuffing everything into a big file is obsolete since LTO is widely available.
LTO does a better job anyway – not that it matters much, as sqlite spends most time waiting for I/O.
every user who wants to make sqlite run faster, can simply add -flto to his CFLAGS.
there’s no need to dictate him which optimization he wants to apply.
following this logic, they could as well just ship generated assembly code…

but hey – we have the choice !
here ‘s actually a tarball containing the ORIGINAL, UN-ANAL-GAMATED SOURCE CODE…

… just that it’s not a tarball.

it’s a fscking ZIP file.
yes, you heard right.
they distribute their source as ZIP files, treating UNIX users as second-class citizens.

additionally they say that you should not use it:

sqlite-src-3080002.zip (5.12 MiB)
A ZIP archive of the complete source tree for SQLite version 3.8.0.2 as extracted from the version control system.
The Makefile and configure script in this tarball are not supported.
Their use is not recommended.
The SQLite developers do not use them.
You should not use them either.
If you want a configure script and an automated build, use either the amalgamation tarball or TEA tarball instead of this one.
To build from this tarball, hand-edit one of the template Makefiles in the root directory of the tarball and build using your own customized Makefile.

Note how the text talks about “this tarball” despite it being a ZIP file.

Fun. there’s only a single TARball on the entire site, so that’s what you naturally pick for your build system.
and that one contains the ANAL version.
Note that my distro’s build system does not even support zip files, as i don’t have a single package in my repo that’s not building from a tarball.
should i change it and write special case code for one single package which doesn’t play by the rules ?
i really don’t think so.

funny fact: they even distribute LINUX BINARY downloads as .zip.
i wonder in which world they live in.

why do i care so much about build time ? it’s just a minute after all.
because the distribution gets built over and over again. and it’s not just me building it, but a lot of other people as well – so the cumulated time spent waiting for sqlite to finish building its 5 MB file gets bigger and bigger each day.
in the past i built sqlite more than 200 times, so my personal cumulated wasted time on it already exceeds the amount of time i needed to write this blog post.

so what i hope to see is sqlite

  1. using tarballs for all their sources (and eventually distribute an additional .zip for windows lusers)
  2. using tarballs for their precompiled linux downloads
  3. either getting rid of the anal version entirely, now that they learned about LTO, or offer the anal version as an additional download and do not discourage users from using the sane version.

Update: I just upgraded sqlite from 3071000 to 3080002

2013.09.27 02:23:24 building sqlite (/src/build/build_sqlite.sh) -> /src/logs/build_sqlite.log
2013.09.27 02:25:31 done.

it now takes more than 2 minutes.

Advertisements

7 thoughts on “sqlite’s anal gamation

  1. Interesting. It builds on my 2.1Ghz laptop using 1 core in 7.4 seconds.

    As a long time user of SQLite in my product which runs on Windows 7 x64 I say that I prefer to put the SQLite source into my build tree and into my source control. I could go either way on the amalgamation itself. But building this code into my own dll and blending it with my own wrapper classes into a single dll in my system is a very good option.

    As far a .zip file… Why do you want to make a windows user a second class citizen? My understanding is that all current version of Linux have unzip utility.

    • Jeff Archer, i’m running the autoconf package which does probably way more than how you are using it.
      first thing is configure, which takes between 1 and 5 seconds ( i didnt measure that part), however it should be insignificant, as i provide a config.cache file which gives predefined answers to more than 50% of questions asked by configure.

      then, it builds a .o out of the huge C file, with high optimizations settings + debug info (these are default settings of sqlite)
      then, it builds a .o out of the huge C file, with high optimizations settings + debug info (these are default settings of sqlite), but this time with -fPIC (position independant code).

      then, it builds a .a (static library) out of the non-pic version object file.
      then, it builds a .so (dynamic library) out of the pic version object file.

      after that it builds the sqlite3 binary 2 times as well, once as ./sqlite3 statically linked, and once as .libs/sqlite3, dynamically linked against shell.o (which i omitted above for brevity, as its only 90KB and compiled in less than a second)

      the complete build log is available here
      http://sprunge.us/ZdhY

      the key point here is that the library is being built twice, and both times with high optimization and debug info.
      my CPU is AMD so it’s possible that its single thread performance is not as good as that of an equally high clocking intel chip. however usually it’s blazing fast at compilation since multiple build jobs can use all cores, so the maximum speed of a single core is not that important.

      maybe the M$ compiler is faster than gcc as well, or can deal better with huge files – but i supect you just didnt set a high optimization level – or M$ compiler can’t even optimize as well.
      i still prefer gcc, since it can at least deal with a 14 year old C99 standard, unlike M$…

    • oh i forgot to add, i don’t want to make windows user second class citizen by removing .zip downloads – in fact i hope that the developers will supply both .tar.xz/.gz and .zip downloads for all files that are relevant for both platforms.
      for linux only files, it makes of course no sense to package them as .zip and enrage the community…

      • I certainly buy that argument.
        My question is why do you build all that other stuff in some way that you do not prefer? The beauty of the amalgamation is that you just drop 2 files into any project and build it exactly the way you want it. I build both Win32 and x64 targets and it only takes 14 seconds on my laptop. Upgrading to latest version is just grabbing latest version of the 2 files and checking to version control in exactly 1 place for both targets.

      • Jeff, as a distro developer i don’t necessarily use those packages like a windows developer who uses them for his own projects – i install them merely because something else depends on it. there are tons of such dependencies in a linux system, so i’m happy if i can just run configure && make -j8 && make install, and that stuff builds as fast as possible without pissing me off with oddities such as .zip files or cmake.

        fortunately most linux packages use a common methodology:
        they’re distributed as tarballs, and use autoconf.
        all those packages support the same syntax and features, like –prefix or DESTDIR.
        so making them use my prefered CFLAGS is just a matter of passing them to configure.

        as linux comes with package management software, you just install or build the package, and then you can link against the library using -lsqlite. that’s it. there is no need to drop a C file into some source tree and to compile it by hand.
        that’s all been taken care of by the package build/installation manager.
        and as i outlined above, drop-in sourcecode is considered a bad practice. debian does not even accept packages that include their own copies of libraries due to those reasons.

        so having the sqlite code in a single file to be “simpler to deal with” is something that only affects windows people – linux users just scratch their head if you tell them that you compile sqlite by manually invoking the C compiler.

        btw i just updated sqlite to latest version and customized it so that now it doesnt build twice, and uses -O0 i.e. no optimization at all. that way it only takes 10 seconds.
        https://github.com/rofl0r/sabotage/commit/5c8c8cfdd10a9609e927c956bc2dd812454dea00

        as you can see from this commit, there’s actually much more going on than just compiling the file.
        it has to create and install static and dynamic libraries, headers, pkgconfig files and manpages.

        for my (and every other linux users) workflow, having everything in a single C file does not help a single bit.

  2. That’s a very PC centric viewpoint.

    The amalgamation exists because SQLite is used in a lot of non-PC places; its’ been found on countless mobile phones (many of the dumb variety), and millions of embedded systems you probably had no idea used SQLite. Lots of these have crappier compilers, for which the amalgamation helps significantly.

    They distribute the SQLite source in a single format because, well, it’s the most well tested form. Linux users tend not to care… because they’re pulling SQLite from their package manager. Hell, the only people who really care (in opposition) to the amalgamation are people like you; but even most distributions do all their package building on “build slaves”.

    • i’m quite sure amalgamation can make sense for some scenarios – however it’s not adequate for the #1 user, linux distros.
      i would not underestimate the number of linux users building software from source – in fact gentoo is one of the most prominent distros (judging from the number of user in the IRC channel, it’s even bigger than SuSE).
      unfortunately the sqlite devs do not really offer a working alternative to the amalgamation/autoconf tarball.
      i tried building the “full source” zip distribution, but that one requires an installed tcl interpreter – unfortunately tcl itself depends on sqlite – so you have to use the amalgamation anyway.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s