Understanding configure.py

Botan’s build is handled with a custom Python script, configure.py. This document tries to explain how configure works.

Note

You only need to read this if you are modifying the library, or debugging some problem with your build. For how to use it, see Building The Library.

Build Structure

A module is a group of related source and header files that can be individually enabled or disabled at build time. Modules can depend on other modules; if a dependency is not available, the module itself is also removed from the list. Examples of modules in the existing codebase are asn1 and x509. Since x509 depends on (among other things) asn1, disabling asn1 will also disable x509.
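
The dependency rule above can be pictured as a small fixed-point computation. The following is only a sketch of that idea, not the logic used by ModulesChooser; the function name resolve_enabled and the toy dependency table are hypothetical.

```python
def resolve_enabled(deps, disabled):
    """Return the set of modules that remain enabled.

    deps maps each module name to the set of modules it requires.
    A module is dropped if it was disabled, or if any module it
    requires has been dropped.
    """
    enabled = set(deps) - set(disabled)
    changed = True
    while changed:
        changed = False
        # sorted() makes a copy, so removing from the set is safe here.
        for mod in sorted(enabled):
            if not deps[mod] <= enabled:
                enabled.remove(mod)
                changed = True
    return enabled


# Toy dependency table (hypothetical, far smaller than the real one).
deps = {
    "asn1": set(),
    "x509": {"asn1"},
    "sha2_32": set(),
}

print(sorted(resolve_enabled(deps, disabled={"asn1"})))
```

Disabling asn1 in this toy table removes x509 as well, because its requirement can no longer be met.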

Most modules define one or more macros, which application code can use to detect the module’s presence or absence. The value of each macro is a datestamp in the form YYYYMMDD, which indicates the last time this module changed in a way that would be visible to an application. For example, if a class gains a new function, the datestamp should be incremented. That allows applications to detect if the new feature is available.

What configure.py does

First, all command line options are parsed.

Then all of the files giving information about target CPUs, compilers, etc. are parsed and sanity checked.

In calculate_cc_min_version the compiler version is detected using the preprocessor.

Then in check_compiler_arch the target architecture is detected, again using the preprocessor.

Now that the target is identified and options have been parsed, the modules to include into the artifact are picked, in ModulesChooser.

In create_template_vars, a dictionary of variables is created which describe different aspects of the build. These are serialized to build/build_config.json.

Up until this point no changes have been made on disk. Writing to disk occurs in do_io_for_build: build output directories are created, and header files are linked into build/include/botan. Templates are processed to create the Makefile, build.h and other artifacts.

When Modifying configure.py

Run ./src/scripts/ci_build.py lint to run Pylint checks after any change.

Template Language

Various output files are generated by processing input files using a simple template language. All input files are stored in src/build-data and use the suffix .in. Anything not recognized as a template command is passed through to the output unmodified. The template elements are:

  • Variable substitution, %{variable_name}. The configure script creates many variables for various purposes; this inserts their value into the output. If a variable is not defined, an error occurs.

    If a variable reference ends with |upper, the value is uppercased before being inserted into the template output.

    Using |concat:<some string> as a suffix, it is possible to conditionally concatenate the variable value with a static string defined in the template. This is useful for compiler switches that require a template-defined parameter value. If the substitution value is not set (i.e. “empty”), the static concatenation value is also omitted.

  • Iteration, %{for variable} block %{endfor}. This iterates over a list and repeats the block once for each element of the list. Variables within the block are expanded. The two template elements %{for ...} and %{endfor} must appear on lines with no text before or after.

  • Conditional inclusion, %{if variable} block %{endif}. If the variable named is defined and true (in the Python sense of the word; if the variable is empty or zero it is considered false), then the block will be included and any variables expanded. As with the for loop syntax, both the start and end of the conditional must be on their own lines with no additional text.
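
To make the three template elements concrete, here is a minimal sketch of a processor for them. It is not the code in configure.py. The function names, the sample variables, and the choice that each list element in a for loop is a dict of per-iteration variables are all assumptions made for illustration.

```python
import re

TOKEN = re.compile(r"%\{([^}]+)\}")
BLOCK_START = re.compile(r"%\{(for|if) (\w+)\}")


def substitute(line, variables):
    """Expand %{name}, %{name|upper} and %{name|concat:text} in one line."""
    def expand(match):
        name, _, modifier = match.group(1).partition("|")
        if name not in variables:
            raise KeyError("undefined template variable: " + name)
        value = str(variables[name])
        if modifier == "upper":
            return value.upper()
        if modifier.startswith("concat:"):
            # The static suffix is dropped when the value is empty.
            return value + modifier[len("concat:"):] if value else ""
        if modifier:
            raise ValueError("unknown modifier: " + modifier)
        return value
    return TOKEN.sub(expand, line)


def find_end(lines, start, kind):
    """Index of the %{endfor}/%{endif} matching the opener at lines[start]."""
    depth = 0
    for index in range(start, len(lines)):
        text = lines[index].strip()
        if text.startswith("%{" + kind + " "):
            depth += 1
        elif text == "%{end" + kind + "}":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("missing %{end" + kind + "}")


def render(lines, variables):
    """Render template lines; text without template commands is copied as-is."""
    out = []
    index = 0
    while index < len(lines):
        start = BLOCK_START.fullmatch(lines[index].strip())
        if start is None:
            out.append(substitute(lines[index], variables))
            index += 1
            continue
        kind, name = start.groups()
        end = find_end(lines, index, kind)
        body = lines[index + 1:end]
        if kind == "for":
            # Assumption: each list element is a dict of per-iteration variables.
            for item in variables[name]:
                out.extend(render(body, {**variables, **item}))
        elif variables.get(name):
            out.extend(render(body, variables))
        index = end + 1
    return out


# Hypothetical template and variables, for illustration only.
template = [
    "CXX = %{cxx}",
    "LIBDIR = %{prefix|concat:/lib}",
    "EXTRA = %{unset_flag|concat: --strict}",
    "%{if with_debug}",
    "MODE = %{mode|upper}",
    "%{endif}",
    "%{for objs}",
    "%{obj}: %{src}",
    "%{endfor}",
]
variables = {
    "cxx": "g++", "prefix": "/opt", "unset_flag": "",
    "with_debug": True, "mode": "debug",
    "objs": [{"obj": "a.o", "src": "a.cpp"}, {"obj": "b.o", "src": "b.cpp"}],
}
print("\n".join(render(template, variables)))
```

In this sketch the EXTRA line ends up with no value at all, because the variable is empty and the concat suffix is therefore omitted too.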

Build.h

The build.h header file is generated and overwritten each time the configure.py script is executed. This header can be included in any header or source file and provides plenty of compile-time information in the form of preprocessor #defines.

This is useful for checking which modules are included in the current build of the library, via macro defines of the form “BOTAN_HAS” followed by the module name. It also contains version information macros and compile-time library configuration.
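
As a rough illustration of how module macros could end up in build.h, the sketch below turns a mapping of macro names to datestamps into #define lines. The function name is hypothetical, the DEFINE1 and DEFINE2 names are placeholders, and joining the prefix with an underscore is an assumption based on the “BOTAN_HAS” form described above.

```python
from datetime import datetime


def render_module_defines(defines):
    """Return '#define' lines for a {macro_name: datestamp} mapping."""
    lines = []
    for macro, stamp in sorted(defines.items()):
        stamp = str(stamp)
        # Require exactly eight characters that form a real YYYYMMDD date.
        if len(stamp) != 8:
            raise ValueError("datestamp must be YYYYMMDD: " + stamp)
        datetime.strptime(stamp, "%Y%m%d")
        # Assumption: the macro is BOTAN_HAS_ followed by the given name.
        lines.append("#define BOTAN_HAS_%s %s" % (macro, stamp))
    return lines


for line in render_module_defines({"DEFINE1": "20180104", "DEFINE2": "20190301"}):
    print(line)
```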

Adding a new module

Create a directory in the appropriate place and create an info.txt file.

Syntax of info.txt

Warning

The syntax described here is documented to make it easier to use and understand, but it is not considered part of the public API contract. That is, the developers are allowed to change the syntax at any time on the assumption that all users are contained within the library itself. If that happens this document will be updated.

Modules and files describing information about the system use the same parser and have common syntactical elements.

Comments begin with ‘#’ and continue to end of line.

There are three main types: maps, lists, and variables.

A map has a syntax like:

<MAP_NAME>
NAME1 -> VALUE1
NAME2 -> VALUE2
...
</MAP_NAME>

The interpretation of the names and values will depend on the map’s name and what type of file is being parsed.

A list has similar syntax; it just doesn’t have values:

<LIST_NAME>
ELEM1
ELEM2
...
</LIST_NAME>

Lastly there are single value variables like:

VAR1 SomeValue
VAR2 "Quotes Can Be Used (And will be stripped out)"
VAR3 42

Variables can have string, integer or boolean values. Boolean values are specified with ‘yes’ or ‘no’.
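
The three element types can be read with a fairly small parser. The sketch below is an illustration of the syntax just described and is not the parser in configure.py; the function names are hypothetical. It also makes simplifying assumptions: a ‘#’ inside a quoted value is treated as a comment start, map values are kept as strings, and a block is treated as a map only if every entry contains ->.

```python
def convert(value):
    """Turn a variable's text into a str, bool, or int."""
    if len(value) >= 2 and value[0] == value[-1] == '"':
        return value[1:-1]          # quotes are stripped, content kept as-is
    if value in ("yes", "no"):
        return value == "yes"
    if value.isdigit():
        return int(value)
    return value


def parse_info(text):
    """Parse maps, lists and variables into a plain dict."""
    result = {}
    block, entries = None, []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # comments run to end of line
        if not line:
            continue
        if block is None:
            if line.startswith("<") and line.endswith(">"):
                block, entries = line[1:-1], []
            else:
                name, _, value = line.partition(" ")
                result[name] = convert(value.strip())
        elif line == "</" + block + ">":
            if entries and all("->" in entry for entry in entries):
                pairs = (entry.split("->", 1) for entry in entries)
                result[block] = {k.strip(): v.strip().strip('"') for k, v in pairs}
            else:
                result[block] = entries
            block = None
        else:
            entries.append(line)
    if block is not None:
        raise ValueError("unterminated block <" + block + ">")
    return result


sample = """
# A comment line
VAR1 SomeValue
VAR2 "Quotes Can Be Used"
VAR3 42

<MAP_NAME>
NAME1 -> VALUE1
NAME2 -> "VALUE2"
</MAP_NAME>

<LIST_NAME>
ELEM1   # trailing comment
ELEM2
</LIST_NAME>
"""
print(parse_info(sample))
```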

Module Syntax

The info.txt files have the following elements. Not all are required; a minimal file for a module with no dependencies might just contain a macro define and module_info.

Lists:
  • comment and warning provide block comments which are displayed to the user at build time.

  • requires is a list of module dependencies. An OS feature (as used in os_features) can be specified as a condition for needing the dependency by writing it before the module name, separated by a ?, e.g. rtlgenrandom?dyn_load.

  • header:internal is the list of headers (from the current module) which are internal-only.

  • header:public is the list of headers (from the current module) which should be exported for public use. If neither header:internal nor header:public is used then all headers in the current directory are assumed internal.

    Note

    If you omit a header from both internal and public lists, it will be ignored.

  • header:external is used when naming headers which are included in the source tree but might be replaced by an external version. This is used for the PKCS11 headers.

  • arch is a list of architectures this module may be used on.

  • isa lists ISA features which must be enabled to use this module. An entry can be preceded by an arch name followed by a : if it is only needed on a specific architecture, e.g. x86_64:ssse3.

  • cc is a list of compilers which can be used with this module. If the compiler name is suffixed with a version (like “gcc:5.0”) then only compilers with that minimum version can use the module. If you need to exclude just one specific compiler (for example because that compiler miscompiles the code in the module), you can prefix a compiler name with ! - like !msvc.

  • os_features is a list of OS features which are required in order to use this module. Each line can specify one or more features combined with ‘,’. Alternatives can be specified on additional lines.
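
The cc and os_features rules above can be expressed as two small predicates. This is a sketch of one plausible reading, not the checks configure.py performs. The function names are hypothetical, and the treatment of an empty list as “no restriction” and of a list containing only !-entries as “everything else is allowed” are assumptions.

```python
def parse_version(text):
    """'4.9' -> (4, 9), so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def compiler_allowed(cc_list, name, version):
    """Check a compiler name and version against entries from a cc list."""
    if not cc_list:
        return True  # assumption: no list means no restriction
    excluded = [entry[1:] for entry in cc_list if entry.startswith("!")]
    if name in excluded:
        return False
    allowed = [entry for entry in cc_list if not entry.startswith("!")]
    if not allowed:
        return True  # assumption: only exclusions were listed
    for entry in allowed:
        cc, _, minimum = entry.partition(":")
        if cc == name:
            return not minimum or parse_version(version) >= parse_version(minimum)
    return False


def os_features_satisfied(lines, available):
    """Features on one line are ANDed; separate lines are alternatives."""
    if not lines:
        return True
    return any(set(line.split(",")) <= set(available) for line in lines)


print(compiler_allowed(["gcc:4.9", "clang"], "gcc", "4.8"))
print(os_features_satisfied(["posix1,getentropy", "win32"], {"win32"}))
```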

Maps:
  • defines is a map from macros to datestamps. These macros will be defined in the generated build.h.

  • module_info contains documentation-friendly information about the module. Available mappings:

    • name must contain a human-understandable name for the module

    • brief may provide a short description about the module’s contents

    • type specifies the type of the module (defaults to Public)

      • Public Library users can directly interact with this module. E.g. they may enable or disable the module at will during build.

      • Internal Library users cannot directly interact with this module. Typically, it does not expose any public API and is enabled as a dependency of other modules. Explicitly disabling an internal module explicitly disables all dependent modules.

      • Virtual This module does not contain any implementation but acts as a container for other sub-modules. It cannot be interacted with by the library user and cannot be depended upon directly.

    • lifecycle specifies the module’s lifecycle (defaults to Stable)

      • Stable The module is stable and will not change in a way that would break backwards compatibility.

      • Experimental The module is experimental and may change in a way that would break backwards compatibility. It is not enabled in a default build. To enable it, use either --enable-modules or --enable-experimental-features.

      • Deprecated The module is deprecated and will be removed in a future release. It remains enabled in a default build. To disable it, use either --disable-modules or --disable-deprecated-features.

  • libs specifies additional libraries which should be linked if this module is included. It maps from the OS name to a list of libraries (comma separated).

  • frameworks is a macOS/iOS specific feature which maps from an OS name to a framework.

Variables:
  • load_on Can take on values never, always, auto, dep or vendor. TODO describe the behavior of these

  • endian Required endian for the module (any (default), little, big)

An example:

# Disable this by default
load_on never

<isa>
sse2
</isa>

<defines>
DEFINE1 -> 20180104
DEFINE2 -> 20190301
</defines>

<module_info>
name -> "This Is Just To Say"
brief -> "Contains a poem by William Carlos Williams"
</module_info>

<comment>
I have eaten
the plums
that were in
the icebox
</comment>

<warning>
There are no more plums
</warning>

<header:public>
header1.h
</header:public>

<header:internal>
header_helper.h
whatever.h
</header:internal>

<arch>
x86_64
</arch>

<cc>
gcc:4.9 # gcc 4.8 doesn't work for <reasons>
clang
</cc>

# Can work with POSIX+getentropy or Win32
<os_features>
posix1,getentropy
win32
</os_features>

<frameworks>
macos -> FramyMcFramerson
</frameworks>

<libs>
qnx -> foo,bar,baz
solaris -> socket
</libs>

Supporting a new CPU type

CPU information is stored in src/build-data/arch.

There is also a file src/build-data/detect_arch.cpp which is used for build-time architecture detection using the compiler preprocessor. Supporting this is optional but recommended.

Lists:
  • aliases is a list of alternative names for the CPU architecture.

  • isa_extensions is a list of possible ISA extensions that can be used on this architecture. For example x86-64 has extensions “sse2”, “ssse3”, “avx2”, “aesni”, …

Variables:
  • endian if defined should be “little” or “big”. This can also be controlled or overridden at build time.

  • family can specify a family group for several related architectures. For example both x86_32 and x86_64 use family of “x86”.

  • wordsize is the default wordsize, which controls the size of limbs in the multi precision integers. If not set, defaults to 32.

Supporting a new compiler

Compiler information is stored in src/build-data/cc. Looking over those files will probably help understanding, especially the ones for GCC and Clang which are most complete.

In addition to the info file, for compilers there is a file src/build-data/detect_version.cpp. The configure.py script runs the preprocessor over this file to attempt to detect the compiler version. Supporting this is not strictly necessary.

Maps:
  • binary_link_commands gives the command used to run the linker. It maps from operating system name to the command to use, and uses the entry “default” for any OS not otherwise listed.

  • cpu_flags_no_debug unused, will be removed

  • cpu_flags used to emit CPU specific flags, for example LLVM bitcode target uses -emit-llvm flag. Rarely needed.

  • isa_flags maps from CPU extensions (like NEON or AES-NI) to compiler flags which enable that extension. These have the same name as the ISA flags listed in the architecture files.

  • lib_flags has a single possible entry “debug” which if set maps to additional flags to pass when building a debug library. Rarely needed.

  • mach_abi_linking specifies flags to enable when building and linking on a particular CPU. These are usually flags that modify the ABI. There is a special syntax supported here, “all!os1,arch1,os2,arch2”, which allows setting ABI flags which are used for all but the named operating systems and/or architectures.

  • sanitizers is a map of sanitizers the compiler supports. It must include “default” which is a list of sanitizers to include by default when sanitizers are requested. The other keys should map to compiler flags.

  • so_link_commands maps from operating system to the command to use to build a shared object.
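
The “all!os1,arch1,os2,arch2” syntax of mach_abi_linking can be read as “apply unless the current OS or architecture is named”. The sketch below shows that reading only. The function name and the sample flags are hypothetical, and treating every other key as an exact OS or architecture name is an assumption.

```python
def abi_flags(mach_abi_linking, osname, arch):
    """Collect ABI flags that apply to the given OS and architecture."""
    flags = []
    for key, value in mach_abi_linking.items():
        if key.startswith("all!"):
            # Applies everywhere except on the listed OSes/architectures.
            exceptions = key[len("all!"):].split(",")
            if osname not in exceptions and arch not in exceptions:
                flags.append(value)
        elif key in (osname, arch):
            # Assumption: a plain key names one OS or one architecture.
            flags.append(value)
    return flags


# Hypothetical map contents, for illustration only.
sample_map = {"all!haiku,qnx": "-pthread", "x86_64": "-m64"}
print(abi_flags(sample_map, "linux", "x86_64"))
print(abi_flags(sample_map, "qnx", "x86_64"))
```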

Variables:
  • binary_name the default name of the compiler binary.

  • linker_name the name of the linker to use with this compiler.

  • macro_name a macro of the form BOTAN_BUILD_COMPILER_IS_XXX will be defined.

  • output_to_object (default “-o”) gives the compiler option used to name the output object.

  • output_to_exe (default “-o”) gives the compiler option used to name the output executable.

  • add_include_dir_option (default “-I”) gives the compiler option used to specify an additional include dir.

  • add_lib_dir_option (default “-L”) gives the compiler option used to specify an additional library dir.

  • add_sysroot_option gives the compiler option used to specify the sysroot.

  • add_lib_option (default “-l%s”) gives the compiler option to link in a library. %s will be replaced with the library name.

  • add_framework_option (default “-framework”) gives the compiler option to add a macOS framework.

  • preproc_flags (default “-E”) gives the compiler option used to run the preprocessor.

  • compile_flags (default “-c”) gives the compiler option used to compile a file.

  • debug_info_flags (default “-g”) gives the compiler option used to enable debug info.

  • optimization_flags gives the compiler optimization flags to use.

  • size_optimization_flags gives compiler optimization flags to use when compiling for size. If not set then --optimize-for-size will use the default optimization flags.

  • sanitizer_optimization_flags gives compiler optimization flags to use when building with sanitizers.

  • coverage_flags gives the compiler flags to use when generating coverage information.

  • stack_protector_flags gives compiler flags to enable stack overflow checking.

  • shared_flags gives compiler flags to use when generating shared libraries.

  • lang_flags gives compiler flags used to enable the required version of C++.

  • lang_binary_linker_flags gives flags to be passed to the linker when creating a binary

  • warning_flags gives warning flags to enable.

  • maintainer_warning_flags gives extra warning flags to enable during maintainer mode builds.

  • visibility_build_flags gives compiler flags to control symbol visibility when generating shared libraries.

  • visibility_attribute gives the attribute to use in the BOTAN_DLL macro to specify visibility when generating shared libraries.

  • ninja_header_deps_style style of include dependency tracking for Ninja, see also https://ninja-build.org/manual.html#ref_headers.

  • header_deps_flag flag to write out dependency information in the style required by ninja_header_deps_style.

  • header_deps_out flag to specify name of the dependency output file.

  • ar_command gives the command to build static libraries

  • ar_options gives the options to pass to ar_command, if not set here takes this from the OS specific information.

  • ar_output_to gives the flag to pass to ar_command to specify where to output the static library.

  • werror_flags gives the compiler flags to treat warnings as errors.
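
Several of the variables above are option patterns rather than complete flags. As a sketch of how add_lib_option and add_lib_dir_option might be expanded into a link line (the function name and the example library names are hypothetical; only the documented defaults “-l%s” and “-L” come from the text above):

```python
def link_flags(libs, lib_dirs, add_lib_option="-l%s", add_lib_dir_option="-L"):
    """Build linker arguments from library names and library directories."""
    flags = [add_lib_dir_option + directory for directory in lib_dirs]
    # %s in add_lib_option is replaced with each library name.
    flags += [add_lib_option % lib for lib in libs]
    return flags


print(link_flags(["socket", "nsl"], ["/opt/lib"]))

# A compiler info file could override the pattern; "%s.lib" here is a
# made-up value for illustration.
print(link_flags(["foo"], [], add_lib_option="%s.lib"))
```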

Supporting a new OS

Operating system information is stored in src/build-data/os.

Lists:
  • aliases is a list of alternative names which will be accepted

  • target_features is a list of target specific OS features. Some of these are supported by many OSes (for example “posix1”), while others are specific to just one or two OSes (such as “getauxval”). Adding a value here causes a new macro BOTAN_TARGET_OS_HAS_XXX to be defined at build time. Use configure.py --list-os-features to list the currently defined OS features.

  • feature_macros is a list of macros to define.
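
To illustrate the naming scheme for target_features, the sketch below derives macro names from a feature list. The function name is hypothetical, and uppercasing the feature name to fill in XXX is an assumption.

```python
def os_feature_macros(target_features):
    """Map OS feature names to BOTAN_TARGET_OS_HAS_XXX macro names."""
    # Assumption: XXX is the feature name in upper case.
    return ["BOTAN_TARGET_OS_HAS_" + feature.upper()
            for feature in sorted(target_features)]


for macro in os_feature_macros(["posix1", "getauxval"]):
    print("#define " + macro)
```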

Variables:
  • ar_command gives the command to build static libraries

  • ar_options gives the options to pass to ar_command

  • ar_output_to gives the flag to pass to ar_command to specify where to output the static library.

  • bin_dir (default “bin”) specifies where binaries should be installed, relative to install_root.

  • cli_exe_name (default “botan”) specifies the name of the command line utility.

  • default_compiler specifies the default compiler to use for this OS.

  • doc_dir (default “doc”) specifies where documentation should be installed, relative to install_root

  • header_dir (default “include”) specifies where include files should be installed, relative to install_root

  • install_root (default “/usr/local”) specifies where to install by default.

  • lib_dir (default “lib”) specifies where the library should be installed, relative to install_root.

  • lib_prefix (default “lib”) prefix to add to the library name

  • library_name

  • man_dir specifies where man files should be installed, relative to install_root

  • obj_suffix (default “o”) specifies the suffix used for object files

  • program_suffix (default “”) specifies the suffix used for executables

  • shared_lib_symlinks (default “yes”) specifies if symbolic links should be created from the base and patch soname to the library name.

  • soname_pattern_abi

  • soname_pattern_base

  • soname_pattern_patch

  • soname_suffix file extension to use for shared library if soname_pattern_base is not specified.

  • static_suffix (default “a”) file extension to use for static library.

  • use_stack_protector (default “true”) specify if by default stack smashing protections should be enabled.

  • uses_pkg_config (default “yes”) specify if by default a pkg-config file should be created.
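
Several of the directory variables above are interpreted relative to install_root. A minimal sketch of that resolution, using the documented defaults, might look like the following. The function name is hypothetical, and man_dir is left out because no default is documented for it here.

```python
import posixpath

# Defaults as documented in the variable list above.
DEFAULT_DIRS = {
    "bin_dir": "bin",
    "lib_dir": "lib",
    "header_dir": "include",
    "doc_dir": "doc",
}


def install_paths(os_info):
    """Resolve install directories relative to install_root."""
    root = os_info.get("install_root", "/usr/local")
    return {
        name: posixpath.join(root, os_info.get(name, default))
        for name, default in DEFAULT_DIRS.items()
    }


print(install_paths({}))
print(install_paths({"install_root": "/opt/botan", "lib_dir": "lib64"}))
```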