<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">

<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc toc="yes"?>

<rfc category="exp" ipr="noModificationTrust200811" docName="draft-eddy-dtnrg-checksum-00">
  <front>
    <title abbrev="Bundle Protocol Checksum">The DTN Bundle Protocol Payload Checksum Block</title>
        <author initials="W. M." surname="Eddy" fullname="Wesley M. Eddy">
          <organization abbrev="MTI Systems">MTI Systems</organization>
          <address>
            <postal>
	      <street>MS 500-ASRC</street>
              <street>NASA Glenn Research Center</street>
              <street>21000 Brookpark Road</street>
              <city>Cleveland</city><region>OH</region>
              <code>44135</code>
              <country>USA</country>
            </postal>
            <phone>+1-216-433-6682</phone>
            <email>wes@mti-systems.com</email>
          </address>
        </author>
	<author initials="L." surname="Wood" fullname="Lloyd Wood">
          <organization abbrev="Surrey alumni">University of Surrey alumni</organization>
          <address>
            <postal>
             <street></street>
             <city>Sydney</city><region>New South Wales</region>
             <country>Australia</country>
            </postal>
            <email>L.Wood@society.surrey.ac.uk</email>
          </address>
        </author>

    <date />
    <area>IRTF</area>
    <keyword>DTN</keyword>
    <keyword>DTNRG</keyword>
    <keyword>checksum</keyword>
    <abstract>
<t>

The Delay-Tolerant Networking Bundle Protocol includes a custody transfer mechanism to provide
acknowledgements of receipt for particular bundles.  No checksum is included in
the basic DTN Bundle Protocol, however, so it is not possible to verify that
bundles have been either forwarded or passed through various convergence layers
without errors having been introduced.  This document partially rectifies the
situation by defining a Bundle Protocol extension for carrying payload
checksums and describes how its use can be integrated with both custody
transfer and fragmentation.

</t>
    </abstract>
  </front>

  <middle>
<section anchor="intro" title="Motivations">

<t>

Reliable transmission of information is a well-known problem for all protocol
layers.  Error detection and correction capabilities are found in lower layers,
but are also present in many
higher-layer protocols in order to detect residual bit errors and bugs.  For
example, IPv4 verifies a simple header checksum before processing inbound
packets, even when running over a data link like Ethernet that already performs
a stronger CRC, and TCP and UDP segments further include a checksum covering their
contents as well as some IP header fields.  What may seem like paranoia is
in fact well founded, as errors are
known to creep into packets from causes
other than channel noise <xref target="SP00"/>.  Although coding of data on the
channel can reduce the impact of channel noise, end-to-end checksums are understood
to be necessary for applications requiring relative certainty that the data
received is error-free <xref target="SG98"/>.

</t>
<t>

The Delay/Disruption-Tolerant Networking (DTN) architecture <xref
target="RFC4838"/> is founded on an overlay of Bundle Agents (BAs) that forward
data units called bundles via a Bundle Protocol <xref target="SB07"/>.  Bundles
may be lost or errored either during transmission between BAs or within a BA
itself.  Bundles belonging to applications that are not tolerant of lost data
have a &quot;custody transfer&quot; flag that requests reliable transmission
between bundle agents.  Unfortunately, the notion of reliability used in the
basic custody transfer mechanism does not take bit errors into account.  It is
assumed that the &quot;convergence layer adapters&quot; that connect BAs to
each other will detect and correct errors before presenting bundle data to the
BAs themselves.  This may be adequate in many cases, but is not always sufficient,
as explained by the well-known end-to-end principle <xref
target="SRC84"/>.  It is possible (and even statistically likely) that
either transmission errors will go unnoticed, or unchecked errors will be
introduced within a BA's memory, storage, or forwarding systems.

</t>
<t>

For example, the UDP convergence-layer adapter that has been popularly
implemented in DTN stacks uses the usual IP 16-bit one's-complement checksum to
validate incoming packets.  This checksum is computed by summing 16-bit values
within a packet.  Because addition is commutative, if two 16-bit-aligned words
(or runs of such words) within the packet
are swapped in position, the checksum will still pass even though the datagram
now differs from the original and is clearly errored.  The proposed TCP-based
convergence layer relies on the same checksum algorithm. This checksum
algorithm is considered weak, in that it does not detect a class of subtle errors,
but at least an attempt to verify that the packet was as sent has been made.

<!-- I think running DTN over UDP and TCP and then announcing it's a "convergence
adapter" is a con. Did they do anything apart from just use sockets? -->

</t>
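<t>

The reordering weakness described above can be illustrated with a short Python
sketch.  The function name internet_checksum and the sample octet strings are
illustrative assumptions and are not taken from any DTN implementation.  The
routine follows the usual 16-bit one's-complement sum, and MD5 is included
only for comparison.

</t>
<figure>
  <artwork><![CDATA[
```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical payloads: the second swaps two aligned 4-octet runs.
original = b"ABCDEFGH"
swapped = b"EFGHABCD"

same_sum = internet_checksum(original) == internet_checksum(swapped)
same_md5 = (hashlib.md5(original).digest()
            == hashlib.md5(swapped).digest())
print("Internet checksum unchanged by swap:", same_sum)
print("MD5 digest unchanged by swap:", same_md5)
```
]]></artwork>
</figure>
<t>

Because the sum does not depend on the order of the 16-bit words, the two
payloads share one Internet checksum, whereas a digest such as MD5
distinguishes them.

</t>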
<t>

Even stronger convergence-layer adapter error detection would not be sufficient.
Errors within a BA's memory (e.g., radiation-induced soft errors on the BA's host) and
errors introduced by file-system corruption cannot be detected by
convergence-layer adapters, as these errors occur in between successive phases of
forwarding and convergence-layer processing.  End-to-end computation and
verification of checksums is required to ensure integrity of DTN bundles
forwarded across a system composed of BAs and convergence layer adapters <xref
target="SRC84"/>.

</t>
<t>

The proposed Bundle Security mechanisms <xref target="SFWP07"/> are capable of
providing an end-to-end checksum, but are intended for the very different
problem of security. The current design of Bundle Security is not practical for simple
integrity checking outside of a more paranoid security context.  For example,
the Bundle Security mechanisms include &quot;mandatory ciphersuites&quot; that
implementations must support.  No simple checksum algorithm is among the
mandated algorithms.  The mandated ciphersuites do include some more complex
keyed-hash constructions, but these rely on key management, which is not
appropriate for general integrity checking between multiple parties simply
relaying bundles.  While it would be
technically possible to graft a non-security integrity-checking mechanism onto
the available security mechanisms by specifying some assumed key, this would be
inappropriate for the problem at hand, and overkill.  Instead, it would be much simpler
and less error-prone
to implement a separate checksum block for optional inclusion in bundles.
<xref target="format"/> of this document defines such a block and <xref
target="rules"/> gives some simple rules for its use.

</t>
</section>
<section title="Payload Checksum Block Format" anchor="format">
<t>

A new bundle block, the Payload Checksum Block, is defined in this section for
inclusion in bundles where integrity
checking is desirable.  An un-keyed checksum algorithm is used.  The integrity
verification that it provides
covers only the bundle payload, as several portions of other bundle blocks are
allowed (and expected) to change in transit as a bundle is forwarded through a
DTN.  The implications of this decision are discussed in the next section.

</t>
<t>

No Endpoint ID references are needed for this, so the layout follows that of
Figure 6 in the Bundle Protocol specification <xref target="SB07"/>, with its
use of Self-Delimiting Numeric Values (SDNVs):

</t>

<figure title="Payload Checksum Block Format" anchor="formatfig">
  <artwork align="center">
+-----------+-----------+-----------+-----------+
|Block Type | Block Processing Ctrl Flags (SDNV)|
+-----------+-----------+-----------+-----------+
|              Block Length (SDNV)              |
+-----------+-----------+-----------+-----------+
|           Checksum Algorithm (SDNV)           |
+-----------+-----------+-----------+-----------+
|                                               |
|             Payload Checksum Value            |
|               (variable length)               |
|                                               |
+-----------------------------------------------+
  </artwork>
</figure>

<t>

The fields shown in <xref target="formatfig"/> above are defined as:

<list style="symbols">

 <t>Block Type - Implementations are currently using a value defined as
experimental in the Bundle Protocol Specification, but can be expected to
transition to an assigned value.</t>
 <t>Block Processing Ctrl Flags - The bits in this SDNV are defined in the
Bundle Protocol Specification. For Payload Checksum Blocks, none of these
bits need be set, except perhaps bit 3, to indicate the &quot;last
block&quot;, when the block is sent as the final block in a bundle.</t>
<!-- CAs can reorder bundle blocks, right? Not sure of value of this flag. Either
it's implicitly the last block, or it's not. How does this affect checksum? -->
 <t>Block Length - This SDNV encodes the length in octets of the remainder of the Payload
Checksum Block.  Once the Checksum Algorithm SDNV is parsed, its encoded length can be
subtracted from the Block Length value to obtain the length of the Payload
Checksum Value.  Comparing that length with the algorithm's un-truncated length
gives the level of truncation, as explained below.</t>
 <t>Checksum Algorithm - This identifies the algorithm used to compute the
Payload Checksum Value field following it.  Defined algorithms are listed below.</t>
 <t>Payload Checksum Value - This is a raw string of octets whose length is
implied by the Block Length.  This string contains the checksum result computed
over the Payload Block of the bundle, and may contain only the high-order octets of
this result if truncation is used to shorten the length of the checksum, as
described below.</t>
<!-- truncation is a risk. Need more details of tradeoffs. -->
</list>

</t>
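<t>

The field layout above can be sketched in Python as follows.  This is a
minimal illustration under stated assumptions, not a reference
implementation: the block type value 0xC0 is a hypothetical placeholder (this
document does not assign one), and the function names are invented for the
example.  SDNVs are encoded seven bits per octet, with the high bit set on
every octet except the last.

</t>
<figure>
  <artwork><![CDATA[
```python
from typing import Tuple

# Hypothetical placeholder; this document assigns no block type.
HYPOTHETICAL_BLOCK_TYPE = 0xC0


def sdnv_encode(value: int) -> bytes:
    """Encode a non-negative integer as an SDNV."""
    if value < 0:
        raise ValueError("SDNVs encode non-negative integers only")
    octets = [value & 0x7F]  # final octet has its high bit clear
    value >>= 7
    while value:
        octets.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(octets))


def sdnv_decode(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode an SDNV at pos; return (value, next position)."""
    value = 0
    while True:
        octet = buf[pos]
        pos += 1
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            return value, pos


def build_block(algorithm: int, checksum: bytes,
                flags: int = 0) -> bytes:
    """Serialize type, flags, length, algorithm and value."""
    body = sdnv_encode(algorithm) + checksum
    return (bytes([HYPOTHETICAL_BLOCK_TYPE]) + sdnv_encode(flags)
            + sdnv_encode(len(body)) + body)


def parse_block(buf: bytes) -> Tuple[int, int, int, bytes]:
    """Return (block type, flags, algorithm, checksum value)."""
    block_type = buf[0]
    flags, pos = sdnv_decode(buf, 1)
    length, pos = sdnv_decode(buf, pos)
    end = pos + length
    if end > len(buf):
        raise ValueError("Block Length exceeds available octets")
    algorithm, pos = sdnv_decode(buf, pos)
    if pos > end:
        raise ValueError("Checksum Algorithm overruns the block")
    # Whatever remains of the block is the (possibly truncated) value.
    return block_type, flags, algorithm, buf[pos:end]
```
]]></artwork>
</figure>
<t>

In this sketch the length of the returned checksum value is simply the Block
Length minus the encoded size of the Checksum Algorithm SDNV, which is how a
receiver would learn whether the sender truncated the value.

</t>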
<section title="Checksum Algorithms">

<t>

Any implementation of this specification MUST support the MD5 checksum
algorithm <xref target="RFC1321"/>.  MD5 has been chosen as the baseline
checksum algorithm in this mechanism because it represents a reasonable
tradeoff between robust error detection and efficiency of implementation.  For
widespread use in DTNs, both resource-efficient implementation and decent
error-detection capabilities are needed.  MD5 implementations are known to
process several Mbps, even on
rather limited hardware <xref target="RFC1810"/>, yet MD5 provides a much more
robust checksum than the Internet's 16-bit one's complement checksum.  Although
MD5 has cryptographic weaknesses and is discouraged for use in security
protocols, concerns with resistance to pre-image generation are irrelevant here as
we are not using MD5 values in a security context.

</t>
<t>

A value is also defined to indicate use of SHA-1 checksums <xref target="RFC3174"/>.
However, support for SHA-1 checksums is not required. SHA-1 is significantly less
efficient than MD5 to compute in our experience, for seemingly little added
error-detection capability when truncated to the same length.  Implementations
SHOULD at least support receiving and verifying SHA-1 checksums.

<!-- see note on SHOULD below. -->

</t>
<t>

An Adler-32 checksum option is also specified, but should be used only in cases
where bundle payloads are relatively small and efficiency of computation is
highly important.  Implementations SHOULD support at least receiving and
verifying Adler-32 checksums.

<!-- not sure about this. If you can receive and verify Adler/SHA, how much more code
is it to generate them? Not much, since you have to compute a checksum to compare
with a received checksum. This SHOULD as specified is meaningless... am wondering if
we should stick to just MD5. -->
</t>
<t>

The complete list of currently defined Checksum Algorithm values is:

</t>
<texttable>
  <ttcol>Value</ttcol><ttcol>Algorithm</ttcol><ttcol>Un-Truncated Length</ttcol>
  <c>0</c><c>MD5</c><c>128 bits</c>
  <c>1</c><c>SHA-1</c><c>160 bits</c>
  <c>2</c><c>Adler-32</c><c>32 bits</c>
</texttable>

<!-- Adler is shortest, so should be 0, followed by MD5 then SHA-1 imo.
Why does this table render with extra blank lines? If it didn't it would
fit on one page... -->
<t>

Other checksum algorithms may be assigned values in future documents.

</t>
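<t>

As an illustration, the algorithm values in the table above can be mapped onto
routines from the Python standard library.  The names ALGORITHMS and
compute_checksum are assumptions made for this sketch.  Representing the
Adler-32 result as four octets in big-endian order is also an assumption, as
this document does not state an octet order for it.

</t>
<figure>
  <artwork><![CDATA[
```python
import hashlib
import zlib
from typing import Callable, Dict, Tuple


def _md5(payload: bytes) -> bytes:
    return hashlib.md5(payload).digest()


def _sha1(payload: bytes) -> bytes:
    return hashlib.sha1(payload).digest()


def _adler32(payload: bytes) -> bytes:
    # Assumption: 32-bit result carried as four big-endian octets.
    return zlib.adler32(payload).to_bytes(4, "big")


# Checksum Algorithm value -> (name, un-truncated octets, function)
ALGORITHMS: Dict[int, Tuple[str, int, Callable[[bytes], bytes]]] = {
    0: ("MD5", 16, _md5),
    1: ("SHA-1", 20, _sha1),
    2: ("Adler-32", 4, _adler32),
}


def compute_checksum(algorithm: int, payload: bytes) -> bytes:
    """Return the un-truncated checksum for a defined algorithm."""
    if algorithm not in ALGORITHMS:
        raise ValueError("undefined Checksum Algorithm value")
    _name, _length, function = ALGORITHMS[algorithm]
    return function(payload)
```
]]></artwork>
</figure>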
</section>
</section>

<section title="Rules for Use" anchor="rules">
<t>

On small payloads, the relatively long output of the MD5 or SHA-1 algorithms
might be viewed as a detriment to end-to-end application performance by
increasing the header overhead of the Bundle Protocol, and reducing the
capacity available to higher layers.  For this reason, senders of the Payload
Checksum Block are permitted to truncate the transmitted Payload Checksum
Values if the full checksum algorithm's output is deemed to be overly long in
comparison to the size of the payload.  This should be done carefully at the
sending application's discretion, and never by default.  When generating or
verifying a truncated checksum, it is understood that only the high-order octets
are included.

<!-- um, truncation affects reliability/collision space of the checksum, and thus its strength.
I'm not at all happy about this. (Also, in the True Spirit of Bundles, shouldn't the
checksum value be represented by an SDNV value and be of indeterminate length?) -->

</t>
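<t>

A minimal Python sketch of truncation and verification follows, using MD5
only.  The function names are assumptions for illustration.  The sketch also
assumes that "high-order octets" means the leading octets of the digest in its
usual octet order.

</t>
<figure>
  <artwork><![CDATA[
```python
import hashlib

MD5_OCTETS = 16  # un-truncated MD5 length (128 bits)


def truncated_md5(payload: bytes, keep_octets: int) -> bytes:
    """Sender side: keep only the leading octets of the digest."""
    if not 1 <= keep_octets <= MD5_OCTETS:
        raise ValueError("keep_octets must be between 1 and 16")
    return hashlib.md5(payload).digest()[:keep_octets]


def verify_md5(payload: bytes, received: bytes) -> bool:
    """Receiver side: compare as many octets as were received."""
    if not 1 <= len(received) <= MD5_OCTETS:
        return False  # empty or over-long values cannot be valid
    expected = hashlib.md5(payload).digest()
    return expected[:len(received)] == received
```
]]></artwork>
</figure>
<t>

Under these assumptions, a sender might transmit truncated_md5(payload, 8) for
a small payload, and a receiver would call verify_md5 with whatever octets the
Block Length indicates.  Fewer octets means a greater chance that an errored
payload passes verification.

</t>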
<t>

The checksum conveyed in the Payload Checksum Block only covers the Payload
Block of a bundle, and does not include any pseudoheaders with EIDs,
timestamps, etc., or any other portion of a bundle or its other extension
blocks. Ensuring robustness of bundle header information and metadata is
a separate problem not addressed here; ideally each header would be self-checking
to guarantee a degree of robustness on receipt.

<!-- so the bundle headers are STILL not robust. I wouldn't trust a bundle as far
as I could throw it. -->

</t>
<t>

The checksum within the Payload Checksum Block has differing semantics if it
occurs before or after the Payload Block.  If placed before the Payload Block,
then the Checksum Value should be understood to cover the entire (unfragmented
/ reassembled) payload, whereas if it follows the Payload Block within a Bundle
Fragment, then the Checksum Value only applies to the included fragment of the
payload.

<!-- nasty. Indicate via a flag? -->

</t>
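<t>

The positional rule above can be sketched in Python as follows.  The block
labels "payload" and "checksum", the Scope names, and the function name are
all hypothetical.  The treatment of a checksum that follows the Payload Block
in an unfragmented bundle is an assumption of this sketch, since the text
above only describes that position for fragments.

</t>
<figure>
  <artwork><![CDATA[
```python
from enum import Enum
from typing import List


class Scope(Enum):
    ENTIRE_PAYLOAD = "entire (unfragmented/reassembled) payload"
    THIS_FRAGMENT = "payload fragment carried in this bundle"


def checksum_scopes(blocks: List[str],
                    is_fragment: bool) -> List[Scope]:
    """Return the coverage of each checksum block, in order."""
    payload_index = blocks.index("payload")
    scopes = []
    for index, label in enumerate(blocks):
        if label != "checksum":
            continue
        if index > payload_index and is_fragment:
            scopes.append(Scope.THIS_FRAGMENT)
        else:
            # Assumption: in an unfragmented bundle the carried
            # payload is the entire payload, whatever the position.
            scopes.append(Scope.ENTIRE_PAYLOAD)
    return scopes
```
]]></artwork>
</figure>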
<t>

Intermediate Bundle Agents, which may not be affiliated with either the source or
the destination of a bundle, are permitted to verify the Payload Checksum Block and,
if verification fails, to attempt local recovery.  If local recovery is not
possible, then the bundle
MAY be deleted.
<!-- what if the IBA is a Custody Agent? Is an errored bundle better than none at
all? -->

</t>
<t>

This document suggests amending the Bundle Protocol specification with regard
to Custody Transfer.  Without any checksum verification, claiming custody of a
bundle is a potentially troublesome operation.  We suggest that the Bundle
Protocol specification require the use of a Payload Checksum Block when Custody
Transfer is requested by an application in order to close this gap.

<!-- and do you delete errored bundles? -->
</t>
</section>

<section title="Security Considerations">
<t>

The mechanism described in this document does not introduce any new security
concerns beyond those present in the basic Bundle Protocol. It only attempts
to ensure the reliability of the Bundle Protocol payload. Ensuring Bundle
Protocol header reliability remains an open problem.

</t>
</section>
<section title="IANA Considerations">
<t>
This document has no considerations for IANA.
</t>
</section>
<section title="Acknowledgements">
<t>

Some of the work on this document was performed at NASA's Glenn Research Center
under funding from the Earth Science Technology Office (ESTO) and the Space
Communications Architecture Working Group (SCAWG).

</t>
</section>
  </middle>

  <back>
  <references title="Normative References">
<reference anchor="SB07">
  <front>
    <title>Bundle Protocol Specification</title>
    <author initials="K." surname="Scott"/>
    <author initials="S." surname="Burleigh"/>
    <date month="April" year="2007"/>
  </front>
  <seriesInfo name="draft-irtf-dtnrg-bundle-spec-09" value="(work in progress)"/>
</reference>

    <?rfc include="reference.RFC.1321" ?>
    <?rfc include="reference.RFC.1810" ?>
    <?rfc include="reference.RFC.3174" ?>

  </references>

    <references title="Informative References">

<reference anchor="SP00">
  <front>
    <title>When the CRC and TCP Checksum Disagree</title>
    <author initials="J." surname="Stone"/>
    <author initials="C." surname="Partridge"/>
    <date month="September" year="2000"/>
  </front>
  <seriesInfo name="Proceedings of ACM SIGCOMM" value="2000"/>
</reference>

<reference anchor="SG98">
  <front>
    <title>Performance of checksums and CRCs over real data</title>
    <author initials="J." surname="Stone"/>
    <author initials="M." surname="Greenwald"/>
    <author initials="J." surname="Hughes"/>
    <author initials="C." surname="Partridge"/>
    <date month="October" year="1998"/>
  </front>
  <seriesInfo name="IEEE/ACM Transactions on Networking" value="vol. 6 issue 5, pp. 529-543"/>
</reference>

<reference anchor="SRC84">
  <front>
    <title>End-to-end Arguments in System Design</title>
    <author initials="J." surname="Saltzer"/>
    <author initials="D." surname="Reed"/>
    <author initials="D." surname="Clark"/>
    <date month="November" year="1984"/>
  </front>
  <seriesInfo name="ACM Transactions on Computer Systems" value="2 (4)"/>
</reference>

<reference anchor="SFWP07">
  <front>
    <title>Bundle Security Protocol Specification</title>
    <author initials="S." surname="Symington"/>
    <author initials="S." surname="Farrell"/>
    <author initials="H." surname="Weiss"/>
    <author initials="P." surname="Lovell"/>
    <date month="April" year="2007"/>
  </front>
  <seriesInfo name="draft-irtf-dtnrg-bundle-security-03" value="(work in progress)"/>
</reference>

    <?rfc include="reference.RFC.4838" ?>

</references>
  </back>
</rfc>
