TUF: The Update Framework

1. Introduction

1.1. Scope

   This document describes a framework for securing software update systems.

1.2. Motivation

   Software is commonly updated through software update systems.  These systems
   can be package managers that are responsible for all of the software that is
   installed on a system, application updaters that are only responsible for
   individual installed applications, or software library managers that install
   software that adds functionality such as plugins or programming language
   libraries.

   Software update systems all have the common behavior of downloading files
   that identify whether updates exist and, when updates do exist, downloading
   the files that are required for the update.  For the implementations
   concerned with security, various integrity and authenticity checks are
   performed on downloaded files.

   Software update systems are vulnerable to a variety of known attacks.  This
   is generally true even for implementations that have tried to be secure.

1.3. History and credit

   Work on TUF began in late 2009.  The core ideas are based on previous
   work done by Justin Cappos and Justin Samuel that identified security flaws
   in all popular Linux package managers.  More information and current
   versions of this document can be found at https://www.updateframework.com/

   The development of TUF is supported by GENI (http://www.geni.net/).

   TUF's Python implementation is based heavily on Thandy, the application
   updater for Tor (http://www.torproject.org/). Its design and this spec are
   also largely based on Thandy's, with many parts being directly borrowed
   from Thandy. The Thandy spec can be found here:
   https://gitweb.torproject.org/thandy.git?a=blob_plain;f=specs/thandy-spec.txt;hb=HEAD

   Whereas Thandy is an application updater for an individual software project,
   TUF aims to provide a way to secure any software update system. We're very
   grateful to the Tor Project and the Thandy developers as it is doubtful our
   design and implementation would have been anywhere near as good without
   being able to use their great work as a starting point. Thandy is the hard
   work of Nick Mathewson, Sebastian Hahn, Roger Dingledine, Martin Peck, and
   others.

1.4. Non-goals

   We aren't creating a universal update system, but rather a simple and
   flexible way that applications can have high levels of security with their
   software update systems.  Creating a universal software update system would
   not be a reasonable goal due to the diversity of application-specific
   functionality in software update systems and the limited usefulness that
   such a system would have for securing legacy software update systems.

   We won't be defining package formats or even performing the actual update
   of application files.  We will provide the simplest mechanism possible that
   remains easy to use and provides a secure way for applications to obtain and
   verify files being distributed by trusted parties.

   We are not providing a means to bootstrap security so that arbitrary
   installation of new software is secure.  In practice this means that people
   still need to use other means to verify the integrity and authenticity of
   files they download manually.

   The framework will not have the responsibility of deciding on the correct
   course of action in all error situations, such as those that can occur when
   certain attacks are being performed.  Instead, the framework will provide 
   the software update system the relevant information about any errors that
   require security decisions which are situation-specific.  How those errors
   are handled is up to the software update system.

1.5. Goals

   We need to provide a framework (a set of libraries, file formats, and
   utilities) that can be used to secure new and existing software update
   systems.

   The framework should enable applications to be secure from all known attacks
   on the software update process.  It is not concerned with exposing
   information about what software is being updated (and thus what software
   the client may be running) or the contents of updates.

   The framework should provide means to minimize the impact of key compromise.
   To do so, it must support roles with multiple keys and threshold/quorum
   trust (with the exception of minimally trusted roles designed to use a
   single key).  The compromise of roles using highly vulnerable keys should
   have minimal impact.  Therefore, online keys (keys which are used in an
   automated fashion) must not be used for any role that clients ultimately
   trust for files they may install.

   The framework must be flexible enough to meet the needs of a wide variety of
   software update systems.

   The framework must be easy to integrate with software update systems.

1.5.1 Goals for implementation

   The client side of the framework must be straightforward to implement in any
   programming language and for any platform with the requisite networking and
   crypto support.

   The framework should be easily customizable for use with any crypto
   libraries.

   The process by which developers push updates to the repository must be
   simple.

   The repository must serve only static files and be easy to mirror.

   The framework must be secure to use in environments that lack support for
   SSL (TLS).  This does not exclude the optional use of SSL when available,
   but the framework will be designed without it.

1.5.2. Goals for specific attacks to protect against

   Note: When saying the framework protects against an attack, this means that
   the attack will not be successful.  It does not mean that a client will
   always be able to successfully update during an attack.  Fundamentally, an
   attacker positioned to intercept and modify a client's communication will
   always be able to perform a denial of service.  What the framework can
   ensure is that an inability to update does not go unnoticed.

   Rollback attacks.  Attackers should not be able to trick clients into
   installing software that is older than that which the client previously knew
   to be available.

   Indefinite freeze attacks.  Attackers should not be able to respond to client
   requests with the same, outdated metadata without the client being aware of
   the problem.

   Endless data attacks.  Attackers should not be able to respond to client
   requests with huge amounts of data (extremely large files) that interfere
   with the client's system.

   Slow retrieval attacks.  Attackers should not be able to prevent clients
   from being aware of interference with receiving updates by responding to
   client requests so slowly that automated updates never complete.

   Extraneous dependencies attacks.  Attackers should not be able to cause
   clients to download or install software dependencies that are not the
   intended dependencies.

   Mix-and-match attacks.  Attackers should not be able to trick clients into
   using a combination of metadata that never existed together on the
   repository at the same time.

   Malicious repository mirrors should not be able to prevent updates from good
   mirrors.

1.5.3. Goals for PKIs

   Software update systems using the framework's client code interface should
   never have to directly manage keys.

   All keys must be easily and safely revocable.  Trusting new keys for a role
   must be easy.

   For roles where trust delegation is meaningful, a role should be able to
   delegate full or limited trust to another role.

   The root of trust will not rely on external PKI.  That is, no authority will
   be derived from keys outside of the framework.

2. System overview

   The framework ultimately provides a secure method of obtaining trusted
   files.  To avoid ambiguity, we will refer to the files the framework is used
   to distribute as "target files".  Target files are opaque to the framework.
   Whether target files are packages containing multiple files, single text
   files, or executable binaries is irrelevant to the framework.

   The metadata describing target files is the information necessary to
   securely identify the file and indicate which roles are trusted to provide
   the file.  As providing additional information about target files may be
   important to some software update systems using the framework, additional
   arbitrary information can be provided with any target file.  This
   information will be included in signed metadata that describes the target
   files.

   The following are the high-level steps of using the framework from the
   viewpoint of a software update system using the framework.  This is an
   error-free case.

      Polling:
        - Periodically, the software update system using the framework
          instructs the framework to check each repository for updates.
          If the framework reports to the application code that there are
          updates, the application code determines whether it wants to
          download the updated target files.  Only target files that are
          trusted (referenced by properly signed and timely metadata) are made
          available by the framework.

      Fetching:
        - For each file that the application wants, it asks the framework to
          download the file.  The framework downloads the file and performs
          security checks to ensure that the downloaded file is exactly what is
          expected according to the signed metadata.  The application code is
          not given access to the file until the security checks have been
          completed.  The application asks the framework to copy the downloaded
          file to a location specified by the application.  At this point, the
          application has securely obtained the target file and can do with it
          whatever it wishes.

2.1. Roles and PKI

   In the discussion of roles that follows, it is important to remember that
   the framework has been designed to allow a large amount of flexibility for
   many different use cases.  For example, it is possible to use the framework
   with a single key that is the only key used in the entire system.  This is
   considered to be insecure but the flexibility is provided in order to meet
   the needs of diverse use cases.

   There are four fundamental top-level roles in the framework:
     - Root role
     - Targets role
     - Release role
     - Timestamp role

   There is also one optional top-level role:
     - Mirrors role

   All roles can use one or more keys and require a threshold of signatures of
   the role's keys in order to trust a given metadata file.

2.1.1 Root role

   The root role delegates trust to specific keys trusted for all other
   top-level roles used in the system.

   The client-side of the framework must ship with trusted root keys for each
   configured repository.

   The root role's private keys must be kept very secure and thus should be
   kept offline.

2.1.2 Targets role

   The targets role's signature indicates which target files are trusted by
   clients.  The targets role signs metadata that describes these files, not
   the actual target files themselves.

   In addition, the targets role can delegate full or partial trust to other
   roles.  Delegating trust means that the targets role indicates another role
   (that is, another set of keys and the threshold required for trust) is
   trusted to sign target file metadata.  Partial trust delegation is when the
   delegated role is only trusted for some of the target files that the
   delegating role is trusted for. 

   Delegated developer roles can further delegate trust to other delegated
   roles.  This provides for multiple levels of trust delegation where each
   role can delegate full or partial trust for the target files they are
   trusted for.  The delegating role in these cases is still trusted.  That is,
   a role does not become untrusted when it has delegated trust.

   Delegated trust can be revoked at any time by the delegating role signing
   new metadata that indicates the delegated role is no longer trusted.

2.1.3 Release role

   The release role signs a metadata file that provides information about the
   latest version of all of the other metadata on the repository (excluding the
   timestamp file, discussed below).  This information allows clients to know
   which metadata files have been updated and also prevents mix-and-match
   attacks.

2.1.4 Timestamp role

   To prevent an adversary from replaying an out-of-date signed metadata file
   whose signature has not yet expired, an automated process periodically signs
   a timestamped statement containing the hash of the release file.  Even
   though this timestamp key must be kept online, the risk posed to clients by
   compromise of this key is minimal.

2.1.5 Mirrors role

   Every repository has one or more mirrors from which files can be downloaded
   by clients.  A software update system using the framework may choose to
   hard-code the mirror information in their software or they may choose to use
   mirror metadata files that can optionally be signed by a mirrors role.

   The importance of using signed mirror lists depends on the application and
   the users of that application.  There is minimal risk to the application's
   security from being tricked into contacting the wrong mirrors.  This is
   because the framework has very little trust in repositories.

2.2. Threat Model And Analysis

   We assume an adversary who can respond to client requests, whether by acting
   as a man-in-the-middle or through compromising repository mirrors.  At
   worst, such an adversary can deny updates to users if no good mirrors are
   accessible.  An inability to obtain updates is noticed by the framework.

   If an adversary compromises enough keys to sign metadata, the best that can
   be done is to limit the number of users who are affected.  The level to
   which this threat is mitigated is dependent on how the application is using
   the framework.  This includes whether different keys have been used for
   different signing roles.

   A detailed threat analysis is outside the scope of this document.  This is
   partly because the specific threat posed to clients in many situations is
   largely determined by how the framework is being used.

3. The repository

   An application uses the framework to interact with one or more repositories.
   A repository is a conceptual source of target files of interest to the
   application.  Each repository has one or more mirrors which are the actual
   providers of files to be downloaded.  For example, each mirror may specify a
   different host where files can be downloaded from over HTTP.

   The mirrors can be full or partial mirrors as long as the application-side
   of the framework can ultimately obtain all of the files it needs.  A mirror
   is a partial mirror if it is missing files that a full mirror should have.
   If a mirror is intended to only act as a partial mirror, the metadata and
   target paths available from that mirror can be specified.

   Roles, trusted keys, and target files are completely separate between
   repositories.  A multi-repository setup is a multi-root system.  When an
   application uses the framework with multiple repositories, the framework
   does not perform any "mixing" of the trusted content from each repository.
   It is up to the application to determine the significance of the same or
   different target files provided from separate repositories.

3.1 Repository layout

   The filesystem layout in the repository is used for two purposes:
     - To give mirrors an easy way to mirror only some of the repository.
     - To specify which parts of the repository a given role has authority
       to sign/provide.

3.1.1 Target files

   The filenames and the directory structure of target files available from
   a repository are not specified by the framework.  The names of these files
   and directories are completely at the discretion of the application using
   the framework.

3.1.2 Metadata files

   The filenames and directory structure of repository metadata are strictly
   defined.  The following are the metadata files of top-level roles relative
   to the base URL of metadata available from a given repository mirror.

    /root.txt

         Signed by the root keys; specifies trusted keys for the other
         top-level roles.

    /release.txt

         Signed by the release role's keys.  Lists hashes and sizes of all
         metadata files other than timestamp.txt.

    /targets.txt

         Signed by the targets role's keys.  Lists hashes and sizes of target
         files.

    /timestamp.txt

         Signed by the timestamp role's keys.  Lists hashes and size of the
         release file.  This is the first and potentially only file that needs
         to be downloaded when clients poll for the existence of updates.

    /mirrors.txt (optional)

         Signed by the mirrors role's keys.  Lists information about available
         mirrors and the content available from each mirror.

   An implementation of the framework may optionally choose to make available
   any metadata files in compressed (e.g. gzip'd) format.  In doing so, the
   filename of the compressed file should be the same as the original with the
   addition of the file name extension for the compression type (e.g.
   release.txt.gz).  The original (uncompressed) file should always be made
   available, as well.

3.1.2.1 Metadata files for targets delegation

   When the targets role delegates trust to other roles, each delegated role
   provides one signed metadata file.  This file is located at:

    /targets/DELEGATED_ROLE.txt

   where DELEGATED_ROLE is the name of the delegated role that has been
   specified in targets.txt.  If this role further delegates trust to a role
   named ANOTHER_ROLE, that role's signed metadata file is made available at:

    /targets/DELEGATED_ROLE/ANOTHER_ROLE.txt
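
   The naming rule above can be sketched as a small helper.  This is only an
   illustration: the function name delegated_metadata_path is hypothetical, and
   the restriction that a role name is a single path component is our
   assumption, not a requirement stated by this document.

```python
def delegated_metadata_path(role_chain):
    """Return the metadata path for a delegated targets role.

    role_chain lists delegated role names from the role named in targets.txt
    down to the role of interest, e.g. ["DELEGATED_ROLE", "ANOTHER_ROLE"].
    """
    if not role_chain:
        raise ValueError("at least one delegated role name is required")
    for name in role_chain:
        # Assumption: a role name is a single, non-empty path component.
        if not name or "/" in name or name in (".", ".."):
            raise ValueError("invalid role name: %r" % (name,))
    return "/targets/" + "/".join(role_chain) + ".txt"
```

   For example, delegated_metadata_path(["DELEGATED_ROLE", "ANOTHER_ROLE"])
   yields the second path shown above.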

4. Document formats

   All of the formats described below include the ability to add more
   attribute-value fields for backwards-compatible format changes.  If
   a backwards incompatible format change is needed, a new filename can
   be used.

4.1. Metaformat

   All documents use a subset of the JSON object format, with
   floating-point numbers omitted.  When calculating the digest of an
   object, we use the "canonical JSON" subdialect as described at
        http://wiki.laptop.org/go/Canonical_JSON
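
   A minimal sketch of such an encoder follows.  It assumes the Canonical JSON
   rules are: no insignificant whitespace, object keys sorted lexically, only
   the quote and backslash characters escaped inside strings, and no
   floating-point numbers.  The function name encode_canonical is hypothetical;
   consult the referenced page before relying on these details.

```python
import hashlib


def encode_canonical(obj):
    """Encode obj as a canonical JSON string (sketch, see assumptions above)."""
    if obj is None:
        return "null"
    if obj is True:
        return "true"
    if obj is False:
        return "false"
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, str):
        # Only backslash and double quote are escaped.
        return '"' + obj.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(encode_canonical(v) for v in obj) + "]"
    if isinstance(obj, dict):
        items = []
        for key in sorted(obj):
            if not isinstance(key, str):
                raise TypeError("object keys must be strings")
            items.append(encode_canonical(key) + ":" + encode_canonical(obj[key]))
        return "{" + ",".join(items) + "}"
    # Floats and any other types are rejected.
    raise TypeError("cannot canonically encode %r" % type(obj).__name__)


def canonical_digest(obj):
    """Return the hex SHA-256 digest of the canonical encoding of obj."""
    return hashlib.sha256(encode_canonical(obj).encode("utf-8")).hexdigest()
```

   Because keys are sorted and whitespace is removed, two objects with the
   same content always produce the same digest regardless of how they were
   originally written.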

4.2. File formats: general principles

   All signed files are of the format:
       { "signed" : X,
         "signatures" : [
            { "keyid" : K,
              "method" : M,
              "sig" : S }
            , ... ]
       }

   where: X is the signed object, whose "_type" field describes its type.
          K is the identifier of a key signing the document
          M is the method to be used to make the signature
          S is a signature of the canonical encoding of X using the
          identified key.

   We define one signing method at present:
       sha256-pkcs1 : A base64 encoded signature of the SHA256 hash of the
         canonical encoding of X, using PKCS-1 padding.

   All times are given as strings of the format "YYYY-MM-DD HH:MM:SS",
   in UTC.

   All keys are of the format:
      { "keytype" : KEYTYPE,
        "keyval" : KEYVAL }

   where KEYTYPE is a string describing the type of the key and how it's
   used to sign documents.  The type determines the interpretation of
   KEYVAL.

   The KEYID of a key is the hex representation of the SHA-256 hash of the
   canonical encoding of the key.

   We define one keytype at present: 'rsa'.  Its format is:
      { "keytype" : "rsa",
        "keyval" : { "e" : E,
                     "n" : N }
      }

   where E and N are the binary representations of the exponent and
   modulus, encoded as big-endian numbers in base64.  All RSA keys must
   be at least 2048 bits long.
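
   As an illustration, the following sketch builds an 'rsa' key object from
   integer values and computes its KEYID.  The helper names are hypothetical.
   The canonical encoding is approximated here with json.dumps using sorted
   keys and compact separators, which we assume matches canonical JSON for
   key objects because base64 strings contain no characters that need
   escaping.

```python
import base64
import hashlib
import json


def int_to_base64(value):
    """Encode a positive integer as big-endian bytes, then base64."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return base64.b64encode(raw).decode("ascii")


def make_rsa_key(e, n):
    """Build a key object of keytype 'rsa' from exponent e and modulus n."""
    if n.bit_length() < 2048:
        raise ValueError("RSA keys must be at least 2048 bits long")
    return {"keytype": "rsa",
            "keyval": {"e": int_to_base64(e), "n": int_to_base64(n)}}


def key_id(key):
    """Return the hex SHA-256 hash of the canonical encoding of key."""
    # Assumption: this json.dumps call equals the canonical encoding for
    # key objects whose strings need no escaping.
    canonical = json.dumps(key, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

   A client can recompute key_id(key) for every KEY it encounters and compare
   the result with the KEYID under which the key is listed.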

4.3. File formats: root.txt

   The root.txt file is signed by the root role's keys.  It indicates
   which keys are authorized for all top-level roles, including the root
   role itself.  Revocation and replacement of top-level role keys, including
   for the root role, is done by changing the keys listed for the roles in
   this file.

   The format of root.txt is as follows:

     { "_type" : "Root",
       "ts" : TIME,
       "expires" : EXPIRES,
       "keys" : {
           KEYID : KEY
           , ... },
       "roles" : {
           ROLE : {
             "keyids" : [ KEYID, ... ] ,
             "threshold" : THRESHOLD }
           , ... }
     }

   The "ts" line describes when this file was updated.  Clients
   MUST NOT replace a file with an older one, and SHOULD NOT accept a
   file too far in the future.

   The "expires" line states when the metadata should be considered expired
   and no longer trusted by clients.  Clients MUST NOT trust an expired file.
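
   The "ts" and "expires" rules can be sketched as follows.  The function
   names and the max_future tolerance of one day are hypothetical choices made
   for illustration.  This sketch also rejects files that are too far in the
   future outright, which is stricter than the SHOULD NOT above requires.

```python
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time(text):
    """Parse a "YYYY-MM-DD HH:MM:SS" string as a UTC datetime."""
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def check_times(new_meta, previous_meta=None, now=None,
                max_future=timedelta(days=1)):
    """Raise ValueError if new_meta violates the "ts" or "expires" rules."""
    if now is None:
        now = datetime.now(timezone.utc)
    ts = parse_time(new_meta["ts"])
    expires = parse_time(new_meta["expires"])
    # MUST NOT replace a file with an older one.
    if previous_meta is not None and ts < parse_time(previous_meta["ts"]):
        raise ValueError("metadata is older than the trusted copy")
    # SHOULD NOT accept a file too far in the future (tolerance is assumed).
    if ts > now + max_future:
        raise ValueError("metadata timestamp is too far in the future")
    # MUST NOT trust an expired file.
    if expires <= now:
        raise ValueError("metadata has expired")
```

   Passing now explicitly keeps the check testable and lets an application
   substitute its own notion of the current time.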

   A ROLE is one of "root", "release", "targets", "timestamp", or "mirrors".
   A role for each of "root", "release", "timestamp", and "targets" MUST be
   specified in the role list.  The "mirrors" role is optional.  If it is not
   specified, mirror lists, if used, do not need to be signed.

   The KEYID must be correct for the specified KEY.  Clients MUST calculate
   each KEYID to verify this is correct for the associated key.  Clients MUST
   ensure that for any KEYID represented in this key list and in other files,
   only one unique key has that KEYID.

   The THRESHOLD for a role is an integer of the number of keys of that role
   whose signatures are required in order to consider a file as being properly
   signed by that role.
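
   A threshold check might look like the sketch below.  The verify callback is
   hypothetical: it stands in for whatever crypto library the application
   uses and is expected to return True when a signature is valid.  Counting
   each key at most once is our assumption about the intended meaning of "the
   number of keys".

```python
def is_properly_signed(document, role, verify):
    """Check a signed document against a role's keys and threshold.

    document: {"signed": X, "signatures": [{"keyid", "method", "sig"}, ...]}
    role:     {"keyids": [KEYID, ...], "threshold": THRESHOLD}
    verify:   hypothetical callback verify(keyid, method, sig, signed) -> bool
    """
    threshold = role["threshold"]
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    trusted = set(role["keyids"])
    valid = set()
    for signature in document["signatures"]:
        keyid = signature["keyid"]
        if keyid not in trusted:
            continue  # signatures by keys outside the role are ignored
        if verify(keyid, signature["method"], signature["sig"],
                  document["signed"]):
            valid.add(keyid)  # a set counts each key only once
    return len(valid) >= threshold
```

   Using a set means that repeating the same valid signature does not help an
   attacker who holds fewer keys than the threshold.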

4.4. File formats: release.txt

   The release.txt file is signed by the release role.  It lists hashes and
   sizes of all metadata on the repository, excluding timestamp.txt and
   mirrors.txt.

   The format of release.txt is as follows:

     { "_type" : "Release",
       "ts" : TIME,
       "expires" : EXPIRES,
       "meta" : METAFILES
     }

   METAFILES is an object whose format is the following:

     { METAPATH : {
           "length" : LENGTH,
           "hashes" : HASHES,
           ("custom" : { ... }) }
       , ...
     }

   METAPATH is the metadata file's path on the repository relative to the
   metadata base URL.

4.5. File formats: targets.txt and delegated target roles

   The format of targets.txt is as follows:

     { "_type" : "Targets",
       "ts" : TIME,
       "expires" : EXPIRES,
       "targets" : TARGETS,
       ("delegations" : DELEGATIONS)
     }

   TARGETS is an object whose format is the following:

     { TARGETPATH : {
           "length" : LENGTH,
           "hashes" : HASHES,
           ("custom" : { ... }) }
       , ...
     }

   Each key of the TARGETS object is a TARGETPATH.  A TARGETPATH is a path to
   a file that is relative to a mirror's base URL of targets.

   It is allowed to have a TARGETS object with no TARGETPATH elements.  This
   can be used to indicate that no target files are available.

   HASHES and LENGTH are the hashes and length of the target file. If
   defined, the elements and values of "custom" will be made available to the
   client application.  The information in "custom" is opaque to the framework
   and can include version numbers, dependencies, requirements, and any other
   data that the application wants to include to describe the file at
   TARGETPATH.  The application may use this information to guide download
   decisions.
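
   Checking a downloaded file against its description can be sketched as
   follows.  This document does not define the layout of HASHES, so the
   sketch assumes it is an object mapping a hash algorithm name to a hex
   digest, for example { "sha256" : "..." }.  The function name and the list
   of supported algorithms are also hypothetical.

```python
import hashlib

# Assumption: the algorithms a client is willing to use.
SUPPORTED_ALGORITHMS = ("sha256", "sha512")


def matches_description(data, description):
    """Return True if data matches the "length" and "hashes" given."""
    if len(data) != description["length"]:
        return False
    hashes = description["hashes"]
    if not hashes:
        return False  # nothing to check against, so do not trust the file
    for algorithm, expected in hashes.items():
        if algorithm not in SUPPORTED_ALGORITHMS:
            return False  # conservative: reject what cannot be verified
        digest = hashlib.new(algorithm, data).hexdigest()
        if digest != expected.lower():
            return False
    return True
```

   The length is compared first because it is cheap and rejects obviously
   wrong files before any hashing is done.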

   DELEGATIONS is an object whose format is the following:

     { "keys" : {
           KEYID : KEY,
           ... },
       "roles" : {
           ROLE : {
             "keyids" : [ KEYID, ... ] ,
             "threshold" : THRESHOLD,
             "paths" : [ PATHPATTERN, ... ] }
           , ... }
     }

   The "paths" list describes paths that the role is trusted to provide.
   Clients MUST check that a target is in one of the trusted paths of all roles
   in a delegation chain, not just in a trusted path of the role that describes
   the target file.  A PATHPATTERN is either a path to a single file, or a
   path to a directory followed by "/**" to indicate all files under that
   directory.  The value of "/**" by itself therefore means all files.
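
   The pattern rule and the delegation chain check can be sketched as shown
   below.  The function names are hypothetical, and the sketch assumes that
   target paths are already normalized and written with a leading "/", which
   is what makes "/**" match every file.

```python
def path_matches(pattern, target_path):
    """Return True if target_path is covered by a single PATHPATTERN."""
    if pattern.endswith("/**"):
        directory = pattern[:-2]  # drop "**" but keep the trailing "/"
        return (target_path.startswith(directory)
                and len(target_path) > len(directory))
    return target_path == pattern


def allowed_by_chain(chain_paths, target_path):
    """Check target_path against every role in a delegation chain.

    chain_paths holds one "paths" list per delegation, ordered from the
    top-level targets role's delegation down to the role describing the file.
    """
    return all(
        any(path_matches(pattern, target_path) for pattern in paths)
        for paths in chain_paths
    )
```

   For example, with chain_paths of [["/pkgs/**"], ["/pkgs/foo/**"]], the
   target "/pkgs/foo/1.tgz" is allowed while "/pkgs/bar/1.tgz" is not, even
   though the first delegation alone would have permitted it.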

   The metadata files for delegated target roles have the same format as the
   top-level targets.txt metadata file.

4.6. File formats: timestamp.txt

   The timestamp file is signed by a timestamp key.  It indicates the
   latest versions of other files and is frequently re-signed to limit the
   amount of time a client can be kept unaware of interference with obtaining
   updates.

   Timestamp files will potentially be downloaded very frequently.  Unnecessary
   information in them will be avoided.

   The format of the timestamp file is as follows:

     { "_type" : "Timestamp",
       "ts" : TIME,
       "expires" : EXPIRES,
       "meta" : METAFILES
     }

   METAFILES is the same as described for the release.txt file.  In the case of
   the timestamp.txt file, this will commonly only include a description of the
   release.txt file.

4.7. File formats: mirrors.txt

   The mirrors.txt file is signed by the mirrors role.  It indicates which
   mirrors are active and believed to be mirroring specific parts of the
   repository.

   The format of mirrors.txt is as follows:

     { "_type" : "Mirrorlist",
       "ts" : TIME,
       "expires" : EXPIRES,
       "mirrors" : [
          { "urlbase" : URLBASE,
            "metapath" : METAPATH,
            "targetspath" : TARGETSPATH,
            "metacontent" : [ PATHPATTERN, ... ] ,
            "targetscontent" : [ PATHPATTERN, ... ] ,
            ("custom" : { ... }) }
          , ... ]
     }

   URLBASE is the URL of the mirror which METAPATH and TARGETSPATH are relative
   to.  All metadata files will be retrieved from METAPATH and all target files
   will be retrieved from TARGETSPATH.

   The lists of PATHPATTERN for "metacontent" and "targetscontent" describe the
   metadata files and target files available from the mirror.

   The order of the list of mirrors is important.  For any file to be
   downloaded, whether it is a metadata file or a target file, the framework on
   the client will give priority to the mirrors that are listed first.  That is,
   the first mirror in the list whose "metacontent" or "targetscontent" includes
   a path indicating that the desired file can be found there will be the first
   mirror used to download that file.  Successive mirrors with matching paths
   will only be tried if downloading from earlier mirrors fails.
   This behavior can be modified by the client code that uses the framework to,
   for example, randomly select from the listed mirrors.
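
   The default ordering can be sketched as below.  The function names are
   hypothetical, and the way URLBASE, METAPATH or TARGETSPATH, and the file
   path are joined into a URL is our assumption, since this document does not
   spell it out.  Paths are assumed to be written with a leading "/".

```python
def path_matches(pattern, path):
    """Return True if path is covered by a single PATHPATTERN."""
    if pattern.endswith("/**"):
        directory = pattern[:-2]  # drop "**" but keep the trailing "/"
        return path.startswith(directory) and len(path) > len(directory)
    return path == pattern


def mirror_urls(mirrors, path, kind):
    """Return candidate URLs for path, in the order they should be tried.

    kind is "meta" for metadata files or "target" for target files.
    """
    content_key = {"meta": "metacontent", "target": "targetscontent"}[kind]
    base_key = {"meta": "metapath", "target": "targetspath"}[kind]
    urls = []
    for mirror in mirrors:  # list order is the priority order
        if any(path_matches(p, path) for p in mirror[content_key]):
            # Assumed joining rule: URLBASE + "/" + base path + "/" + path.
            base = mirror["urlbase"].rstrip("/") + "/" + mirror[base_key].strip("/")
            urls.append(base.rstrip("/") + "/" + path.lstrip("/"))
    return urls
```

   A client would try the returned URLs in order and move to the next one
   only when a download fails.  Client code that prefers random selection
   could shuffle the returned list instead.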

5. Detailed Workflows

5.1. The client application

   Note: If at any point in the following process there is a problem (e.g. only
   expired metadata can be retrieved), the software update system using the
   framework must decide how to proceed.

   The client code instructs the framework to check for updates.  The framework
   downloads the timestamp.txt file from a mirror and checks that the file is
   properly signed by the timestamp role, is not expired, and is not older than
   the last timestamp.txt file retrieved.  If the timestamp file lists the same
   release.txt file as was previously seen, the client code is informed that no
   updates are available and the update checking process stops.

   If the release.txt file has changed, the framework downloads the file and
   verifies that it is properly signed by the release role, is not expired, has
   a newer timestamp than the last release.txt file seen, and matches the
   description (hashes and size) in the timestamp.txt file.  The framework then
   checks which metadata files listed in release.txt differ from those
   described in the last release.txt file the framework had seen.  If the
   root.txt file has changed, the framework updates this (following the same
   security measures as with the other files) and starts the process over.  If
   any other metadata files have changed, the framework downloads and checks
   those.
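
   Comparing two release.txt descriptions can be sketched as follows.  The
   function name is hypothetical.  The inputs are the METAFILES objects from
   the previously trusted and the newly verified release.txt files, and a
   file counts as changed when it is new or when its length or hashes differ.

```python
def changed_metadata(old_meta, new_meta):
    """Return the sorted METAPATHs whose description is new or different."""
    changed = []
    for path in sorted(new_meta):
        new = new_meta[path]
        old = old_meta.get(path)
        if (old is None
                or old["length"] != new["length"]
                or old["hashes"] != new["hashes"]):
            changed.append(path)
    return changed
```

   A caller would first look for the root metadata path in the result.  If it
   is present, the root file is updated and the whole process starts over
   before any other file in the list is fetched.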

   By comparing the trusted targets from the old trusted metadata with the new
   metadata, the framework is able to determine which target files have
   changed. The framework ensures that any targets described in delegated
   targets files are allowed to be provided by the delegated role.

   When the client code asks the framework to download a target file, the
   framework downloads the file from a mirror (potentially trying multiple
   mirrors), checks the downloaded file to ensure that it matches the
   information described in the targets files, and then makes the file
   available to the client code.
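
   Because the expected length of every file is known from signed metadata
   before the download starts, a client can refuse to read more than that.
   The sketch below shows one way to do this for any file-like object.  The
   function name and chunk size are hypothetical, and the sketch does not
   address slow retrieval, which would additionally require timeouts.

```python
import io


def read_bounded(stream, expected_length, chunk_size=8192):
    """Read exactly expected_length bytes from stream, or raise ValueError."""
    chunks = []
    remaining = expected_length
    while remaining > 0:
        # Never request more than what the signed metadata allows.
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise ValueError("file is shorter than described")
        chunks.append(chunk)
        remaining -= len(chunk)
    # One extra byte is enough to detect an oversized response.
    if stream.read(1):
        raise ValueError("file is longer than described")
    return b"".join(chunks)


# Example with an in-memory stream standing in for a network response.
sample = read_bounded(io.BytesIO(b"abc"), 3)
```

   The bytes returned would then be checked against the hashes in the targets
   metadata before being handed to the client code.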

6. Usage

   See https://www.updateframework.com/ for discussion of recommended usage in
   various situations.

6.1. Key management and migration

   All keys except the timestamp file signing key and the mirror list signing
   key should be stored securely offline (e.g. encrypted and on a separate
   machine, in special-purpose hardware, etc.).

   To replace a compromised root key or any other top-level role key, the root
   role signs a new root.txt file that lists the updated trusted keys for the
   role.  When replacing root keys, an application will sign the new root.txt
   file with both the new and old root keys until all clients are known to have
   obtained the new root.txt file (a safe assumption is that this will be a
   very long time or never).  There is no risk posed by continuing to sign the
   root.txt file with revoked keys as once clients have updated they no longer
   trust the revoked key.  This is only to ensure outdated clients remain able
   to update.

   To replace a delegated developer key, the role that delegated to that key
   just replaces that key with another in the signed metadata where the
   delegation is done.

F. Future directions and open questions

F.1. Support for bogus clocks.

   The framework may need to offer an application-enablable "no, my clock is
   _supposed_ to be wrong" mode, since others have noticed that many users seem
   to have incorrect clocks.