Where did that software come from?
Story, October 09, 2017
Where did the software on your embedded system come from? Can you prove it? Can you safely update systems in the field? Cryptography provides the tools for verifying the integrity and provenance of software and data. This article describes how users can verify the source of a piece of software, whether it was tampered with in transit, and whether it was modified after installation.
Military systems are subject to many attacks, including attacks on the software supply chain that provides software to the system. To ensure protection against these attacks, managers should ask three questions: What is the source of the software components? Has the software been tampered with or modified? Can they prove it?
Cryptography solves this problem through software signing and hashing, which work together to verify sources and files. Software signing uses a public-key/private-key pair to verify the source of a piece of software. Software is signed with a private key, and the public key is then used to verify that the software was signed with that specific private key. Whether the signer and the software can be trusted is a separate discussion; software signing verifies the source of the software and that it hasn’t been tampered with after being signed.
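As a minimal sketch of how signing and verification look in practice, the commands below use GnuPG with a throwaway key. The file name and identity are made up for illustration; a real release key would be generated and guarded on a separate secured system, as described later.

```shell
# Generate a throwaway signing key in a temporary keyring (illustrative
# only; real release keys live on a secured signing system).
export GNUPGHOME="$(mktemp -d)"
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key "Example Vendor <release@example.com>" default default never

# Sign a file with the private key, producing a detached signature.
echo "firmware image contents" > image.bin
gpg --batch --local-user "release@example.com" --detach-sign image.bin

# Anyone holding the matching public key can verify the signature.
gpg --verify image.bin.sig image.bin    # succeeds: file is intact and signed

# Any modification to the file breaks verification.
echo "tampered" >> image.bin
gpg --verify image.bin.sig image.bin || echo "verification failed: file was modified"
```

Note that verification proves only that the holder of this private key signed this exact file; whether that key belongs to a trustworthy party is the separate question discussed above.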
Hashing is a technique for processing a file or set of data of any length and producing a single fixed-length checksum that is effectively unique to that data. Any change to the file produces a completely different checksum – changing one bit in a 10 GB file, for example, yields an entirely new value. The popular (although now considered weak) SHA-1 hash produces a 40-character hexadecimal checksum, while the more secure SHA-256 hash produces a 64-character checksum. Given a file and its checksum, you can verify that the file has not been corrupted or tampered with in any way. Hashes run quickly, even on large files, making them an effective tool for file verification.
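A quick sketch using the standard coreutils `sha256sum` tool (the file name is made up for illustration):

```shell
# Compute a SHA-256 checksum for a file and record it.
echo "important data" > data.txt
sha256sum data.txt > data.txt.sha256
cat data.txt.sha256                 # 64 hex characters, then the file name

# Later (or on another machine), verify the file against the record.
sha256sum --check data.txt.sha256   # prints "data.txt: OK"

# Changing even one byte produces a completely different checksum.
printf '\x00' >> data.txt
sha256sum --check data.txt.sha256 || echo "checksum mismatch: file changed"
```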
These techniques can be applied to any file. It doesn’t matter if the file contains source code, executable images, data, or other files; any file or set of data can be used.
Can source be controlled?
The starting point for all software is source code, which is typically written and modified over a period of time by multiple people and released as multiple versions and updates of a product. The code is spread across hundreds or thousands of files and is constantly changing. Effective code management uses a version-controlled code repository such as git (https://git-scm.com/).
A git repository is a database of commits, each with a unique identifier. In git, this unique identifier is the hash of the contents of the commit – the result is that each commit is identified by its contents. Any change to those contents is immediately visible, since the commit no longer matches its identifier.
Each git commit records the identifier of its parent commit and the identity of the person who made it. Similar to a blockchain, git commits form a chain of cryptographic-hash backpointers, making it computationally infeasible for someone to change the history without detection. Commits and tags may also be signed using the techniques previously described, thereby verifying who made the change. This technique is a useful tool in any environment that requires verification of contributions.
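The content-addressed structure can be seen directly with a few git commands; the repository name and identity below are made up for illustration:

```shell
# Create a small repository and make one commit.
git init -q demo-repo
git -C demo-repo config user.email "dev@example.com"
git -C demo-repo config user.name "Example Dev"
echo "int main(void) { return 0; }" > demo-repo/main.c
git -C demo-repo add main.c
git -C demo-repo commit -q -m "Initial commit"

# Each commit is identified by the SHA-1 hash of its contents, which in
# turn covers the tree of files and the parent commit.
git -C demo-repo rev-parse HEAD      # 40-character commit identifier
git -C demo-repo cat-file -p HEAD    # shows tree, parent, author, message

# git fsck re-hashes every object and reports any whose contents no
# longer match their identifier.
git -C demo-repo fsck --full
```

With a GPG key configured, `git commit -S` and `git tag -s` attach a signature to the commit or tag, tying the history to a verified identity.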
A version-controlled software repository is the foundation of any secure software-supply chain, as it provides a history of all changes to the software and who made the changes. It also provides reliable ways to build specific versions of a software package.
Building verifiable provenance
Using software from known and trusted sources is important to maintaining the integrity of your embedded system. But how do you know that a piece of installable software actually comes from known source code?
Source code repositories, combined with automated build systems like Jenkins (https://jenkins.io/), enable the user to build an executable image from a known set of source files. After an image is built, it can be signed and hashed by the build system. This allows the user to know both the source of the software and the exact build that produced the software. Routine builds are signed with test keys, whereas production builds are signed with a release key, require special authorization and approval, and are often signed on a separate secure system. This enables the user to determine both the source of a piece of software and whether or not it is an official release.
All files making up a piece of software are combined into a single package for distribution, installation, and updates. A packaging system used in Linux is rpm (http://rpm.org/). An rpm is a single file that contains a compressed archive of multiple files plus the commands for installing, configuring, updating, and removing its associated application. An rpm file also includes a manifest of all the files in the archive, including their names, version numbers, and checksums. This manifest information is included in an rpm database, which maintains information on all rpm-based software installed on a system.
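On an rpm-based system, this manifest and database can be inspected with standard rpm queries. A minimal sketch follows; the bash package is used only as an example, and the block skips itself on systems without rpm:

```shell
# Skip on systems without rpm (e.g., Debian-family distributions).
if command -v rpm >/dev/null 2>&1 && rpm -q bash >/dev/null 2>&1; then
    rpm -qi bash                   # package metadata: name, version, vendor, signature
    rpm -ql bash                   # every file the package installed
    rpm -q --dump bash | head -5   # per-file size, modification time, and checksum
fi
```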
Software often includes third-party components. When these third-party components are included in an rpm, the rpm metadata and checksum ensure that this is the software that the vendor included. Third-party components should be signed to ensure their integrity; if they are simply passed through from the other vendor, they should be signed by the other vendor.
Typically, rpm packages themselves are signed. The tooling to create and sign rpm packages is included in Linux and should be used by everyone developing software, including in-house developers. The rpm installer by default checks to see that a package is signed with a known key before allowing installation. Attempts to install unsigned software or software signed with an unknown key will fail unless they are overridden. The rpm installer also checks the integrity of the package: If the contents of the package have been modified, either through data corruption or malicious tampering, the installation will fail.
The operating system vendor will include its public key in the operating system (OS). This addition enables the user to be sure that software packages, updates, and security errata are in fact from the OS vendor and have not been tampered with by any outside party.
The user must add the software keys for each approved vendor to the system. Depending on the particular security requirements, the user may need to take steps to ensure the validity of vendor keys, especially when downloading software from mirrors or other intermediate sources like system integrators. Keys should be obtained directly from the vendor website. Some go so far as to hand-carry hard copy listings of the key from a known source.
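A sketch of key management with the rpm tooling follows. The key file and package names are hypothetical, and the commands that would modify the system keyring are shown only as comments:

```shell
# Requires an rpm-based system; skips itself otherwise.
if command -v rpm >/dev/null 2>&1; then
    # List the signing keys this system already trusts.
    rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}: %{SUMMARY}\n' 2>/dev/null || true

    # After obtaining RPM-GPG-KEY-vendor directly from the vendor and
    # confirming its fingerprint, import it into the rpm keyring:
    #   rpm --import RPM-GPG-KEY-vendor
    # Then a downloaded package can be checked against the trusted keys:
    #   rpm --checksig myapp-1.0-1.x86_64.rpm
fi
```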
Moreover, signed rpm packages allow the use of unsecured transports such as the Internet or a CD-ROM sent through the mail, as the rpm tooling enables verification of both the source of the rpm and whether it has been corrupted or tampered with.
Life after installation
Checking software before and during installation is a good start, but it’s important to continue maintenance after installation is complete. What can be done to verify a running system?
A powerful feature of rpm is that it allows the user to verify the integrity of files on a running system. The rpm database includes the checksum for all files contained in each rpm. System utilities enable the user to calculate the checksum for each file on the system, compare this to the rpm database, and identify any files that have changed. The rpm database is a fast and efficient way to do this. Another way to accomplish this is to go back to the signed rpm packages and use the checksums directly from the rpm. While this way is slower and requires access to the original installation files, it is quite secure.
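As a sketch, on an rpm-based system this verification looks like the following; the bash package is only an example, and the full-system check is left as a comment because it can take minutes:

```shell
# Requires an rpm-based system; skips itself otherwise.
if command -v rpm >/dev/null 2>&1 && rpm -q bash >/dev/null 2>&1; then
    # Verify one package; no output means every file matches the database.
    # (Locally modified config files are reported, so don't treat output
    # as an error here.)
    rpm -V bash || true

    # Verify everything installed on the system:
    #   rpm -Va
    # Each output line flags what changed: '5' = checksum differs,
    # 'S' = size, 'M' = mode/permissions, 'T' = modification time.
fi
```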
Major Linux distributions use these techniques to ensure that they are installing and running unmodified software from a known source. Knowing the origin of all software installed on systems and whether or not it has been changed is vital. This knowledge is a powerful starting point for establishing and maintaining system integrity as systems in the field are deployed and updated.
Russell Doty is a technology strategist and product manager at Red Hat, focused on systems manageability and security, addressing both technical and usability issues, as well as delving into the special characteristics of the Internet of Things. Russell has extensive background in high-performance computing, visualization, and computer hardware from previous positions at Digital Equipment Corporation and Compaq. Recent open source projects include the OpenLMI system-management framework and the OpenSCAP security automation system. Readers may connect with Russell at [email protected].
Red Hat www.redhat.com