Saturday, October 22, 2011

Get, modify, and build the sources of an RPM package -- two of the three steps are simple

Intro

After years of using source-based distributions, I finally realized that binary distributions aren't that binary after all. However, the rpmbuild system seems to be optimized for, well, creating RPMs. In the following, I try to make inspecting and building SRPM packages easier for people who don't work with RPMs on a daily basis but still want to see and tweak the bits that operate their system.

One thing that struck me as odd is that all SRPM operations happen in one directory, ~/rpmbuild by default on Fedora. I'd really prefer to have all operations in a package-specific directory. I'll demonstrate how to accomplish that and how to get to a modified build in three steps.

A little disclaimer: I have not worked very much with RPM, but I really wanted to get to the sources easily. Some things in here might be wrong, or more convenient steps may exist. I'm looking forward to being corrected in the comments.

Oh, and uninspired as I am, I named the tools by simply prepending the character d to rpm. Maybe you can find a word for which that character stands.


The Procedure -- An Example

Let's say you haven't been working with RPMs for several weeks, but you suspect a bug in the editor Nano and want to take a look at it. You only need to remember and execute a single command:

drpm-get-source nano

This will download the Source-RPM for the currently installed version of Nano and create an SRPM build environment in a version-specific subdirectory of ~/drpm/, for example ~/drpm/nano-2.2.4-1.fc14.

Next, you browse the sources and modify some bits. This is the one step that is not trivial, but as it is the fun part, I leave all the things you do at this point up to you.

Finally, from within the build environment created in the first step, you simply execute the following command to build the result:

drpm-compile

These three steps are all I want to do most of the time. If the result is useful enough to enter the production system, it can be installed to /usr/local, or a binary RPM can be created. The important point is that it is possible to easily get, modify, and compile the sources in a separate build directory.
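For orientation, here is a minimal sketch of the directory layout that the first step produces. It assumes the SRPM file is named nano-2.2.4-1.fc14.src.rpm (inferred from the example directory above) and that the usual rpmbuild subdirectories are created. The function name build_env_layout is made up for this illustration and is not part of the scripts in this post.

```python
import os

# Standard subdirectories of an rpmbuild tree.
RPM_DIRS = ["BUILD", "BUILDROOT", "RPMS", "SOURCES", "SPECS", "SRPMS"]

def build_env_layout(srpm_filename, destdir="~/drpm"):
    """Return (topdir, [subdirectory paths]) for an SRPM file name.

    Hypothetical helper: it only computes paths and creates nothing.
    """
    suffix = ".src.rpm"
    if not srpm_filename.endswith(suffix):
        raise ValueError("not an SRPM file name: %r" % srpm_filename)
    # The build environment is named after the SRPM without its suffix.
    name = srpm_filename[:-len(suffix)]
    topdir = os.path.join(os.path.expanduser(destdir), name)
    return topdir, [os.path.join(topdir, d) for d in RPM_DIRS]

if __name__ == "__main__":
    topdir, subdirs = build_env_layout("nano-2.2.4-1.fc14.src.rpm")
    print(topdir)
    for path in subdirs:
        print("  " + path)
```

The unpacked sources typically end up below BUILD, and the spec file below SPECS.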

Now, let's walk through the parts...


Unpacking The Sources

The script drpm-get-source takes a package name as its argument, obtains the associated SRPM, and prepares it in a package-specific subdirectory of ~/drpm/.

The script uses yumdownloader to obtain the URL of the SRPM. If you obtained the URL by other means, pass it in place of the package name and add the --url flag. Also, dependencies required to build the SRPM might be missing. In that case the script exits with an error and displays the appropriate yum-builddep command, ready to be pasted into a root console. You might want to adapt the displayed message to your distribution.

Here is the Python script drpm-get-source:

#!/usr/bin/env python

from __future__ import print_function

import argparse
import collections
import os
import subprocess
import sys
import urllib

#--- CONFIG

# The default location for the --destdir option
DEFAULT_DESTDIR = "~/drpm"
# The name of the file that marks the root directory of an unpacked srpm.
PKGROOT_MARKER_FILE = ".drpm-root"

#--- INTERNAL

RPM_DIRS = "BUILD  BUILDROOT  RPMS  SOURCES  SPECS  SRPMS".split()

# Capture location info of a package
PackageInfo = collections.namedtuple("PackageInfo",
        ['srpm_url', 'srpm_path', 'pkg_topdir'])

def error(msg, exit_status=1):
    """
    Generate a colorful error message and exit with the given exit_status.
    """
    sys.stderr.write("\x1b[1;37;41m%s\x1b[0m\n" % msg)
    sys.exit(exit_status)

def execute_shell_command(shell_command):
    """
    Execute shell_command and return the exit code and the combined output of
    stderr and stdout.
    """
    p = subprocess.Popen(shell_command, shell=True, stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT)
    output = p.stdout.read().decode().strip()
    retval = p.wait()
    return (retval, output)

def parse_cmdline():
    """
    Return a Namespace object of the parsed command line.

    The destdir attribute will be absolute and a home directory reference in it
    will be expanded.
    """
    desc="Download a source rpm and unpack it."
    aparser = argparse.ArgumentParser(description=desc)
    aparser.add_argument('pkg',
            help='The package to obtain the sources for (default) or an '
                    'URL to a SRPM to use if the "--url" option is specified.')
    aparser.add_argument('-d', '--destdir', default=DEFAULT_DESTDIR,
            help='Base directory where the sources will be unpacked. '
                    '(Default:%(default)s) A subdirectory named like '
                    'the package will be created there.')
    aparser.add_argument('-u', '--url', action='store_true', default=False,
            help='The argument is a SRPM URL obtained by other means.')
    cmdline = aparser.parse_args()
    cmdline.destdir = os.path.abspath(os.path.expanduser(cmdline.destdir))
    return cmdline

def get_source_url(pkg):
    """
    Use yumdownloader to obtain and return the SRPM URL of the specified
    package.
    """
    url_cmd = "yumdownloader --source --urls '%s'" % pkg
    retval, output= execute_shell_command(url_cmd)
    if retval:
        error("%s\n\n%s\n\nGetting the url with yumdownloader failed."
                % (output, url_cmd))
    url = output.split('\n')[-1].strip()
    if not url:
        error("%s\n\n%s\n\nGetting the url failed. Last line of "
                "output is empty." % (url_cmd, output))
    print("Got URL: " + url)
    return url

def get_package_info(url, destdir):
    """
    Return a PackageInfo instance for the given arguments.
    """
    STRIP_SUFFIX = '.src.rpm'
    srpm_filename = os.path.basename(url)
    if srpm_filename.endswith(STRIP_SUFFIX):
        pkg_name = srpm_filename[:-len(STRIP_SUFFIX)]
    else:
        error("url '%s' does not end with '%s'" % (url, STRIP_SUFFIX))
    pkg_topdir = os.path.join(destdir, pkg_name)
    srpm_path = os.path.join(pkg_topdir, 'SRPMS', srpm_filename)
    return PackageInfo(srpm_url=url, srpm_path=srpm_path, pkg_topdir=pkg_topdir)

def create_dir_structure(pkg_info):
    """
    Create the directory structure of an SRPM build environment.
    """
    pkg_topdir = pkg_info.pkg_topdir
    if os.path.exists(pkg_topdir):
        error("Destination exists: " + pkg_topdir)
    print("Creating directory structure at: " + pkg_topdir)
    for subdir in [''] + RPM_DIRS:
        os.makedirs(os.path.join(pkg_topdir, subdir))
    with open(os.path.join(pkg_topdir, PKGROOT_MARKER_FILE), "w") as fw:
        fw.write("this file marks an unpacked srpm root")

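def download_srpm(pkg_info):
    """
    Download the SRPM to its destined location.
    """
    # urlretrieve moved to urllib.request in Python 3; support both.
    try:
        from urllib.request import urlretrieve
    except ImportError:
        from urllib import urlretrieve
    urlretrieve(pkg_info.srpm_url, pkg_info.srpm_path)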

def find_specfile(pkg_info):
    """
    Return the one and only spec file or exit with an error.
    """
    specdir = os.path.join(pkg_info.pkg_topdir, 'SPECS')
    entries = [e for e in os.listdir(specdir) if e.endswith('.spec')]
    if len(entries) != 1:
        error("Not exactly one spec file found\nrpmbuild --define "
                "'_topdir %s' -bp --target=$(uname -m) " %
                        pkg_info.pkg_topdir)
    specfile = os.path.join(specdir, entries[0])
    return specfile

def unpack_srpm(pkg_info):
    """
    Execute an rpm command that unpacks the SRPM into the current build
    environment.
    """
    unpack_cmd = ("rpm --define '_topdir %s' -Uvh %s" %
            (pkg_info.pkg_topdir, pkg_info.srpm_path))
    print(unpack_cmd)
    if os.system(unpack_cmd):
        error("%s\nUnpacking failed: %s" %
                (unpack_cmd, pkg_info.pkg_topdir))

def prep_srpm(pkg_info, specfile):
    """
    Execute an rpmbuild command that prepares the sources for a build.
    """
    # This will probably require root rights.  Just form the command and make
    # it part of an error message so that it can be pasted into a root console.
    builddep_cmd = 'yum-builddep %s' % pkg_info.srpm_path
    prep_cmd = ("rpmbuild --define '_topdir %s' -bp %s" %
            (pkg_info.pkg_topdir, specfile))
    if os.system(prep_cmd):
        error("%s\n%s\nPrepping failed: %s" % (builddep_cmd, prep_cmd,
                pkg_info.pkg_topdir))

def main():

    cmdline = parse_cmdline()

    if cmdline.url:
        url = cmdline.pkg
    else:
        url = get_source_url(cmdline.pkg)

    pkg_info = get_package_info(url, cmdline.destdir)

    create_dir_structure(pkg_info)

    download_srpm(pkg_info)

    unpack_srpm(pkg_info)

    specfile = find_specfile(pkg_info)

    prep_srpm(pkg_info, specfile)

    print("\nSources are at:\n" + pkg_info.pkg_topdir)

if __name__ == '__main__':
    main()

Executing rpm/rpmbuild Commands For The Current Package

When rpm and rpmbuild operate on Source-RPMs, they always assume the operations should take place in the SRPM build environment location specified by the macro _topdir. To use a separate location for each source package, the wrapper script drpm-shared determines the root directory of the unpacked SRPM and adds an appropriate macro definition to the command line. Therefore, when you execute an rpm or rpmbuild command with that wrapper, it will operate on the SRPM build environment that belongs to the current working directory.
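The lookup itself is simple: walk upwards from the current directory until the marker file is found. Here is a rough Python sketch of that idea, assuming the marker is called .drpm-root as in the scripts of this post. The names find_srpm_root and wrap_command are invented for this illustration; the real wrapper is the shell script shown further down.

```python
import os

MARKER = ".drpm-root"  # marker file name used by the scripts in this post

def find_srpm_root(start_dir, marker=MARKER):
    """Walk upwards from start_dir and return the first directory that
    contains the marker file, or None if the filesystem root is reached."""
    path = os.path.realpath(start_dir)
    while True:
        if os.path.exists(os.path.join(path, marker)):
            return path
        parent = os.path.dirname(path)
        if parent == path:  # the root directory is its own parent
            return None
        path = parent

def wrap_command(command, args, start_dir="."):
    """Build the argument list for rpm/rpmbuild with _topdir set."""
    root = find_srpm_root(start_dir)
    if root is None:
        raise RuntimeError("not inside an unpacked SRPM: no %s found" % MARKER)
    # Passing a list avoids shell quoting problems with spaces in paths.
    return [command, "--define", "_topdir %s" % root] + list(args)
```

The list returned by wrap_command could be handed to subprocess.call unchanged.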

drpm-shared should be called through any of the following symbolic links, which determine the behavior of the script:

drpm
drpmbuild
drpmbuildspec

The first two wrap rpm and rpmbuild respectively, and the last one additionally appends the path to the spec file to an rpmbuild command line. To install drpm-shared, save it into a directory in your $PATH and, inside that directory, create the necessary symbolic links with the following commands:

ln -s drpm-shared drpm
ln -s drpm-shared drpmbuild
ln -s drpm-shared drpmbuildspec
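The trick behind the symbolic links is that a script can look at the name it was invoked with and change its behavior accordingly. A small Python sketch of that dispatch follows. The table mirrors the three link names above; the function name resolve and the path /usr/local/bin/drpmbuildspec are only assumptions for the example.

```python
import os

# Maps the invoked name to (wrapped command, append the spec file path?).
DISPATCH = {
    "drpm": ("rpm", False),
    "drpmbuild": ("rpmbuild", False),
    "drpmbuildspec": ("rpmbuild", True),
}

def resolve(argv0):
    """Return (wrapped_command, want_spec) for the name the script was
    called as; exit with a message for any other name."""
    prog = os.path.basename(argv0)
    try:
        return DISPATCH[prog]
    except KeyError:
        raise SystemExit("Unknown command name %s." % prog)

# Hypothetical install location, for illustration only.
print(resolve("/usr/local/bin/drpmbuildspec"))  # ('rpmbuild', True)
```

In a real script, argv0 would be sys.argv[0], which holds the name of the symbolic link that was used to start it.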

This is the shell script drpm-shared:

#!/bin/sh

# Config
# ======

# The name of the file that marks the root directory of an unpacked srpm.
RPMROOT_MARKER_FILE=".drpm-root"

# Internal
# ========

PROG="$(basename "${0}")"

usage() {
    echo "USAGE: ${PROG} [-h|--help] <${WRAPPED_COMMAND}-sec-argv>... " >&2
    echo "Find the srpm root (${RPMROOT_MARKER_FILE}) of cwd and "\
            "execute the given ${WRAPPED_COMMAND} command with the "\
            "appropriate _topdir macro definition." >&2
}

error() {
    echo "ERROR: $@" >&2
    exit 1
}

# Print the canonical, absolute path to the current SRPM root directory.  Print
# an empty string if not inside of an RPM package.
__get_rpm_root() {
    local cwd_canon=`pwd -P`
    local lookout_path="${cwd_canon}"
    while [ ! -e "${lookout_path}/${RPMROOT_MARKER_FILE}" ] ; do
        local parent_dir=`dirname "${lookout_path}"`
        if [ "${parent_dir}" = "${lookout_path}" ] ; then
            lookout_path=""
            break
        fi
        lookout_path="${parent_dir}"
    done
    # printf is portable and the quotes keep paths with spaces intact
    printf '%s' "${lookout_path}"
}

# Print the path to the one and only spec file or abort with an error.
__find_spec_path() {
    srpm_root="$1"
    [ -n "${srpm_root}" ] || error "__find_spec_path: missing srpm_root argument"
    spec_file=""
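    for candidate in "${srpm_root%/}"/SPECS/*.spec ; do
        # Skip the literal pattern that is left over when nothing matches
        [ -e "${candidate}" ] || continue
        [ -n "${spec_file}" ] && error "__find_spec_path: more than "\
                "one spec file available -- do not know which one to add"
        spec_file="${candidate}"
    done
    [ -n "${spec_file}" ] || error "__find_spec_path: no spec file found"
    printf '%s' "${spec_file}"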
}


# Sequence
# ========

WRAPPED_COMMAND=""
WANT_SPEC_FILE_PATH=""
case "$PROG" in
    # Call rpmbuild and append the path to the spec-file to the command
    drpmbuildspec)
        WRAPPED_COMMAND="rpmbuild"
        WANT_SPEC_FILE_PATH="yes"
        ;;
    # Call rpmbuild
    drpmbuild)
        WRAPPED_COMMAND="rpmbuild"
        ;;
    # Call rpm
    drpm)
        WRAPPED_COMMAND="rpm"
        ;;
    *)
        error "Unknown command name $PROG."
        ;;
esac

case "$1" in 
    -h|--help)
        usage
        exit 0
        ;;
esac

# Find the root directory of the unpacked srpm
srpm_root=`__get_rpm_root`
if [ -z "${srpm_root}" ] ; then
    error "not inside an unpacked rpm package -- cannot "\
            "find \"${RPMROOT_MARKER_FILE}\""
fi

# Set the spec-file argument if requested
SPEC_PATH=""
if [ -n "${WANT_SPEC_FILE_PATH}" ] ; then
    SPEC_PATH=`__find_spec_path "${srpm_root}"`
    [ $? -eq 0 ] || error "$SPEC_PATH"
fi

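# Print and execute the command.  ${SPEC_PATH:+...} expands to nothing when
# SPEC_PATH is empty, so no empty argument is passed to the wrapped command.
echo
echo "${WRAPPED_COMMAND}" --define "_topdir ${srpm_root}" "${@}" ${SPEC_PATH:+"${SPEC_PATH}"}
echo
"${WRAPPED_COMMAND}" --define "_topdir ${srpm_root}" "${@}" ${SPEC_PATH:+"${SPEC_PATH}"}
exit $?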

Fine, but I just want to build it

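After unpacking and preparing the sources, an obvious task is to execute the build and install steps without undoing your own modifications. The --short-circuit option makes rpmbuild skip straight to the named stage, so the sources are not unpacked again over your changes. That can be done with the following shell function: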

drpm-compile () {
    drpmbuildspec -bc --short-circuit && drpmbuildspec -bi --short-circuit
}
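If you prefer to drive the build from Python, the same "stop at the first failure" behavior of && can be sketched as below. This is only an illustration under the assumption that drpmbuildspec is on your $PATH; the names run_in_order and drpm_compile are made up here.

```python
import subprocess

def run_in_order(commands):
    """Run each command (an argument list) in turn and stop at the first
    failure.  Return the exit status of the last command that ran, which
    mirrors the shell's `a && b`."""
    status = 0
    for argv in commands:
        status = subprocess.call(argv)
        if status != 0:
            break
    return status

def drpm_compile():
    # Compile first (-bc), then install into the build root (-bi).  With
    # --short-circuit rpmbuild skips the earlier stages, so the modified
    # sources are left alone.
    return run_in_order([
        ["drpmbuildspec", "-bc", "--short-circuit"],
        ["drpmbuildspec", "-bi", "--short-circuit"],
    ])
```

Calling drpm_compile() from within the build environment returns 0 only if both stages succeeded.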

Now, how to integrate it into my system?

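The easiest way is to simply use /usr/local as the build root. Be careful, though: the build root is a staging directory, so files usually land below it under the package's own prefix (for example /usr/local/usr/bin), and rpmbuild or the spec file's %install section may clear the build root before installing. Try it with a scratch directory first. With that in mind, the command is: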

drpmbuildspec -bi --short-circuit --buildroot=/usr/local

This, of course, lacks all the advantages of the RPM system. To really integrate your changes into another RPM, you have to get a bit more familiar with RPM. Here are some helpful links:

How to create an RPM package
RPM Guide