rfc9768v1.txt   rfc9768.txt 
skipping to change at line 20 skipping to change at line 20
More Accurate Explicit Congestion Notification (AccECN) Feedback in TCP More Accurate Explicit Congestion Notification (AccECN) Feedback in TCP
Abstract Abstract
Explicit Congestion Notification (ECN) is a mechanism by which Explicit Congestion Notification (ECN) is a mechanism by which
network nodes can mark IP packets instead of dropping them to network nodes can mark IP packets instead of dropping them to
indicate incipient congestion to the endpoints. Receivers with an indicate incipient congestion to the endpoints. Receivers with an
ECN-capable transport protocol feed back this information to the ECN-capable transport protocol feed back this information to the
sender. ECN was originally specified for TCP in such a way that only sender. ECN was originally specified for TCP in such a way that only
one feedback signal can be transmitted per Round-Trip Time (RTT). one feedback signal can be transmitted per Round-Trip Time (RTT).
Newer TCP mechanisms like Congestion Exposure (ConEx), Data Center More recently defined TCP mechanisms like Congestion Exposure
TCP (DCTCP), or Low Latency, Low Loss, and Scalable Throughput (L4S) (ConEx), Data Center TCP (DCTCP), or Low Latency, Low Loss, and
need more Accurate ECN (AccECN) feedback information whenever more Scalable Throughput (L4S) need more Accurate ECN (AccECN) feedback
than one marking is received in one RTT. This document updates the information whenever more than one marking is received in one RTT.
original ECN specification defined in RFC 3168 by specifying a scheme This document updates the original ECN specification defined in RFC
that provides more than one feedback signal per RTT in the TCP 3168 by specifying a scheme that provides more than one feedback
header. Given TCP header space is scarce, it allocates a reserved signal per RTT in the TCP header. Given TCP header space is scarce,
header bit previously assigned to the ECN-nonce. It also overloads it allocates a reserved header bit previously assigned to the ECN-
the two existing ECN flags in the TCP header. The resulting extra nonce. It also overloads the two existing ECN flags in the TCP
space is additionally exploited to feed back the IP-ECN field header. The resulting extra space is additionally exploited to feed
received during the TCP connection establishment. Supplementary back the IP ECN field received during the TCP connection
feedback information can optionally be provided in two new TCP option establishment. Supplementary feedback information can optionally be
alternatives, which are never used on the TCP SYN. The document also provided in two new TCP Option alternatives, which are never used on
specifies the treatment of this updated TCP wire protocol by the TCP SYN. The document also specifies the treatment of this
middleboxes. updated TCP wire protocol by middleboxes.
Status of This Memo Status of This Memo
This is an Internet Standards Track document. This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has (IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841. Internet Standards is available in Section 2 of RFC 7841.
skipping to change at line 158 skipping to change at line 158
Explicit Congestion Notification (ECN) [RFC3168] is a mechanism by Explicit Congestion Notification (ECN) [RFC3168] is a mechanism by
which network nodes can mark IP packets instead of dropping them to which network nodes can mark IP packets instead of dropping them to
indicate incipient congestion to the endpoints. Receivers with an indicate incipient congestion to the endpoints. Receivers with an
ECN-capable transport protocol feed back this information to the ECN-capable transport protocol feed back this information to the
sender. In RFC 3168, ECN was specified for TCP in such a way that sender. In RFC 3168, ECN was specified for TCP in such a way that
only one feedback signal could be transmitted per Round-Trip Time only one feedback signal could be transmitted per Round-Trip Time
(RTT). This is sufficient for congestion control schemes like Reno (RTT). This is sufficient for congestion control schemes like Reno
[RFC6582] and CUBIC [RFC9438], as those schemes reduce their [RFC6582] and CUBIC [RFC9438], as those schemes reduce their
congestion window by a fixed factor if congestion occurs within an congestion window by a fixed factor if congestion occurs within an
RTT independent of the number of received congestion markings. RTT independent of the number of received congestion markings. More
Recently, proposed mechanisms like Congestion Exposure (ConEx recently defined mechanisms like Congestion Exposure (ConEx
[RFC7713]), DCTCP [RFC8257], and L4S [RFC9330] need to know when more [RFC7713]), DCTCP [RFC8257], and L4S [RFC9330] need to know when more
than one marking is received in one RTT, which is information that than one marking is received in one RTT, which is information that
cannot be provided by the feedback scheme as specified in [RFC3168]. cannot be provided by the feedback scheme as specified in [RFC3168].
This document specifies an update to the ECN feedback scheme of RFC This document specifies an update to the ECN feedback scheme of RFC
3168 that provides more accurate information and could be used by 3168 that provides more accurate information and could be used by
these and potentially other future TCP extensions, while still also these and potentially other future TCP extensions, while still also
supporting the pre-existing TCP congestion controllers that use just supporting the pre-existing TCP congestion controllers that use just
one feedback signal per round. Congestion control is the term the one feedback signal per round. Congestion control is the term the
IETF uses to describe data rate management. It is the algorithm that IETF uses to describe data rate management. It is the algorithm that
a sender uses to optimize its sending rate so that it transmits data a sender uses to optimize its sending rate so that it transmits data
skipping to change at line 224 skipping to change at line 224
CUBIC, AccECN can be used to respond to the extent of congestion CUBIC, AccECN can be used to respond to the extent of congestion
notification over a round trip, as for example DCTCP does in notification over a round trip, as for example DCTCP does in
controlled environments [RFC8257]. For congestion response, this controlled environments [RFC8257]. For congestion response, this
specification refers to the original ECN specification adopted in specification refers to the original ECN specification adopted in
2001 [RFC3168], as updated by the more relaxed rules introduced in 2001 [RFC3168], as updated by the more relaxed rules introduced in
2018 to allow ECN experiments [RFC8311], namely: a TCP-based Low 2018 to allow ECN experiments [RFC8311], namely: a TCP-based Low
Latency Low Loss Scalable (L4S) congestion control [RFC9330]; or Latency Low Loss Scalable (L4S) congestion control [RFC9330]; or
Alternative Backoff with ECN (ABE) [RFC8511]. Alternative Backoff with ECN (ABE) [RFC8511].
Section 5.2 explains how AccECN is compatible with current commonly Section 5.2 explains how AccECN is compatible with current commonly
used TCP options, and a number of current experimental modifications used TCP Options, and a number of current experimental modifications
to TCP, as well as SYN cookies. to TCP, as well as SYN cookies.
1.1. Document Roadmap 1.1. Document Roadmap
The following introductory section outlines the goals of AccECN The following introductory section outlines the goals of AccECN
(Section 1.2). Then, terminology is defined (Section 1.3) and a (Section 1.2). Then, terminology is defined (Section 1.3) and a
recap of existing prerequisite technology is given (Section 1.4). recap of existing prerequisite technology is given (Section 1.4).
Section 2 gives an informative overview of the AccECN protocol. Then Section 2 gives an informative overview of the AccECN protocol. Then
Section 3 gives the normative protocol specification, and Section 3.3 Section 3 gives the normative protocol specification, and Section 3.3
skipping to change at line 317 skipping to change at line 317
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
1.4. Recap of Existing ECN Feedback in IP/TCP 1.4. Recap of Existing ECN Feedback in IP/TCP
Explicit Congestion Notification (ECN) [RFC3168] can be split into Explicit Congestion Notification (ECN) [RFC3168] can be split into
two parts conceptually. In the forward direction, alongside the two parts conceptually. In the forward direction, alongside the
data stream, it uses a 2-bit field in the IP header. This is data stream, it uses a 2-bit field in the IP header. This is
referred to as IP-ECN later on. This signal carried in the IP (Layer referred to as IP ECN later on. This signal carried in the IP (Layer
3) header is exposed to network devices and may be modified when such 3) header is exposed to network devices and may be modified when such
a device starts to experience congestion (see Table 1). The second a device starts to experience congestion (see Table 1). The second
part is the feedback mechanism, by which the original data sender is part is the feedback mechanism, by which the original data sender is
notified of the current congestion state of the intermediate path. notified of the current congestion state of the intermediate path.
That returned signal is carried in a protocol-specific manner, and is That returned signal is carried in a protocol-specific manner, and is
not to be modified by intermediate network devices. While ECN is in not to be modified by intermediate network devices. While ECN is in
active use for protocols such as QUIC [RFC9000], SCTP [RFC9260], RTP active use for protocols such as QUIC [RFC9000], SCTP [RFC9260], RTP
[RFC6679], and Remote Direct Memory Access over Converged Ethernet [RFC6679], and Remote Direct Memory Access over Converged Ethernet
[RoCEv2], this document only concerns itself with the specific [RoCEv2], this document only concerns itself with the specific
implementation for the TCP protocol. implementation for the TCP protocol.
skipping to change at line 343 skipping to change at line 343
0b00, the packet is considered to have been sent by a Not ECN-capable 0b00, the packet is considered to have been sent by a Not ECN-capable
Transport (Not-ECT). When a network node experiences congestion, it Transport (Not-ECT). When a network node experiences congestion, it
will occasionally either drop or mark a packet, with the choice will occasionally either drop or mark a packet, with the choice
depending on the packet's ECN codepoint. If the codepoint is Not- depending on the packet's ECN codepoint. If the codepoint is Not-
ECT, only drop is appropriate. If the codepoint is ECT(0) or ECT(1), ECT, only drop is appropriate. If the codepoint is ECT(0) or ECT(1),
the node can mark the packet by setting the ECN codepoint to 0b11, the node can mark the packet by setting the ECN codepoint to 0b11,
which is termed 'Congestion Experienced' (CE), or loosely a which is termed 'Congestion Experienced' (CE), or loosely a
'congestion mark'. Table 1 summarises these codepoints. 'congestion mark'. Table 1 summarises these codepoints.
+==================+================+===========================+ +==================+================+===========================+
| IP-ECN codepoint | Codepoint name | Description | | IP ECN Codepoint | Codepoint Name | Description |
+==================+================+===========================+ +==================+================+===========================+
| 0b00 | Not-ECT | Not ECN-Capable Transport | | 0b00 | Not-ECT | Not ECN-Capable Transport |
+------------------+----------------+---------------------------+ +------------------+----------------+---------------------------+
| 0b01 | ECT(1) | ECN-Capable Transport (1) | | 0b01 | ECT(1) | ECN-Capable Transport (1) |
+------------------+----------------+---------------------------+ +------------------+----------------+---------------------------+
| 0b10 | ECT(0) | ECN-Capable Transport (0) | | 0b10 | ECT(0) | ECN-Capable Transport (0) |
+------------------+----------------+---------------------------+ +------------------+----------------+---------------------------+
| 0b11 | CE | Congestion Experienced | | 0b11 | CE | Congestion Experienced |
+------------------+----------------+---------------------------+ +------------------+----------------+---------------------------+
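The codepoints in Table 1 and the drop-or-mark rule described above can be sketched as follows. This is a minimal illustration, not text from either version of the document: the names IpEcn, ecn_from_traffic_class, and on_congestion are hypothetical, and leaving an already CE-marked packet unchanged is an assumption of this sketch. The ECN field is taken from the two least significant bits of the IPv4 ToS / IPv6 Traffic Class octet.

```python
from enum import IntEnum
from typing import Optional


class IpEcn(IntEnum):
    """The four IP ECN codepoints listed in Table 1."""
    NOT_ECT = 0b00  # Not ECN-Capable Transport
    ECT_1 = 0b01    # ECN-Capable Transport (1)
    ECT_0 = 0b10    # ECN-Capable Transport (0)
    CE = 0b11       # Congestion Experienced


def ecn_from_traffic_class(octet: int) -> IpEcn:
    """Extract the 2-bit ECN field from the low bits of the octet."""
    return IpEcn(octet & 0b11)


def on_congestion(codepoint: IpEcn) -> Optional[IpEcn]:
    """Hypothetical helper: what a congested node may do with a packet.

    Returns None when drop is the only appropriate signal (Not-ECT),
    otherwise the codepoint the packet carries after marking.
    """
    if codepoint is IpEcn.NOT_ECT:
        return None        # only drop is appropriate
    if codepoint in (IpEcn.ECT_0, IpEcn.ECT_1):
        return IpEcn.CE    # mark instead of dropping
    return codepoint       # assumption: an existing CE mark is left as-is
```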
skipping to change at line 404 skipping to change at line 404
Like the general TCP approach, the Data Receiver of each TCP half- Like the general TCP approach, the Data Receiver of each TCP half-
connection sends AccECN feedback to the Data Sender on TCP connection sends AccECN feedback to the Data Sender on TCP
acknowledgements, reusing data packets of the other half-connection acknowledgements, reusing data packets of the other half-connection
whenever possible. whenever possible.
The AccECN protocol has had to be designed in two parts: The AccECN protocol has had to be designed in two parts:
* an essential feedback part that reuses the TCP-ECN header bits for * an essential feedback part that reuses the TCP-ECN header bits for
the Data Receiver to feed back the number of packets arriving with the Data Receiver to feed back the number of packets arriving with
CE in the IP-ECN field. This provides more accuracy than Classic CE in the IP ECN field. This provides more accuracy than Classic
ECN feedback, but limited resilience against ACK loss; ECN feedback, but limited resilience against ACK loss.
* a supplementary feedback part using one of two new alternative * a supplementary feedback part using one of two new alternative
AccECN TCP options that provide additional feedback on the number AccECN TCP Options that provide additional feedback on the number
of payload bytes that arrive marked with each of the three ECN of payload bytes that arrive marked with each of the three ECN
codepoints in the IP-ECN field (not just CE marks). See the BCP codepoints in the IP ECN field (not just CE marks). See the BCP
on Byte and Packet Congestion Notification [RFC7141] for the on Byte and Packet Congestion Notification [RFC7141] for the
rationale determining that conveying congested payload bytes rationale determining that conveying congested payload bytes
should be preferred over just providing feedback about congested should be preferred over just providing feedback about congested
packets. This also provides greater resilience against ACK loss packets. This also provides greater resilience against ACK loss
than the essential feedback, but it is currently more likely to than the essential feedback, but it is currently more likely to
suffer from middlebox interference. suffer from middlebox interference.
The two-part design was necessary, given limitations on the space The two-part design was necessary, given limitations on the space
available for TCP options and given the possibility that certain available for TCP Options and given the possibility that certain
incorrectly designed middleboxes might prevent TCP from using any new incorrectly designed middleboxes might prevent TCP from using any new
options. options.
The essential feedback part overloads the previous definition of the The essential feedback part overloads the previous definition of the
three flags in the TCP header that had been assigned for use by three flags in the TCP header that had been assigned for use by
Classic ECN. This design choice deliberately allows AccECN peers to Classic ECN. This design choice deliberately allows AccECN peers to
replace the Classic ECN feedback protocol, rather than leaving replace the Classic ECN feedback protocol, rather than leaving
Classic ECN feedback intact and adding more accurate feedback Classic ECN feedback intact and adding more accurate feedback
separately because: separately because:
* this efficiently reuses scarce TCP header space, given TCP option * this efficiently reuses scarce TCP header space, given TCP Option
space is approaching saturation; space is approaching saturation.
* a single upgrade path for the TCP protocol is preferable to a fork * a single upgrade path for the TCP protocol is preferable to a fork
in the design that modifies the TCP header to convey all ECN in the design that modifies the TCP header to convey all ECN
feedback; feedback.
* otherwise, Classic and Accurate ECN feedback could give * otherwise, Classic and Accurate ECN feedback could give
conflicting feedback about the same segment, which could open up conflicting feedback about the same segment, which could open up
new security concerns and make implementations unnecessarily new security concerns and make implementations unnecessarily
complex; complex.
* middleboxes are more likely to faithfully forward the TCP ECN * middleboxes are more likely to faithfully forward the TCP ECN
flags than newly defined areas of the TCP header. flags than newly defined areas of the TCP header.
AccECN is designed to work even if the supplementary feedback part is AccECN is designed to work even if the supplementary feedback part is
removed or zeroed out, as long as the essential feedback part gets removed or zeroed out, as long as the essential feedback part gets
through. through.
2.1. Capability Negotiation 2.1. Capability Negotiation
skipping to change at line 470 skipping to change at line 470
An AccECN TCP Client does not send an AccECN Option on the SYN as SYN An AccECN TCP Client does not send an AccECN Option on the SYN as SYN
option space is limited. The TCP Server sends an AccECN Option on option space is limited. The TCP Server sends an AccECN Option on
the SYN/ACK, and the TCP Client sends one on the first ACK to test the SYN/ACK, and the TCP Client sends one on the first ACK to test
whether the network path forwards these options correctly. whether the network path forwards these options correctly.
2.2. Feedback Mechanism 2.2. Feedback Mechanism
A Data Receiver maintains four counters initialized at the start of A Data Receiver maintains four counters initialized at the start of
the half-connection. Three count the number of arriving payload the half-connection. Three count the number of arriving payload
bytes marked CE, ECT(1), and ECT(0) in the IP-ECN field. These byte bytes marked CE, ECT(1), and ECT(0) in the IP ECN field. These byte
counters reflect only the TCP payload length, excluding the TCP counters reflect only the TCP payload length, excluding the TCP
header and TCP options. The fourth counter counts the number of header and TCP Options. The fourth counter counts the number of
packets arriving marked with a CE codepoint (including control packets arriving marked with a CE codepoint (including control
packets without payload if they are CE-marked). packets without payload if they are CE-marked).
The Data Sender maintains four equivalent counters for the half The Data Sender maintains four equivalent counters for the half-
connection, and the AccECN protocol is designed to ensure they will connection, and the AccECN protocol is designed to ensure they will
match the values in the Data Receiver's counters, albeit after a match the values in the Data Receiver's counters, albeit after a
little delay. little delay.
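The four Data Receiver counters described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the field names (cep, ceb, e0b, e1b) and the on_packet method are hypothetical, and the counters start at zero here purely for simplicity, because this excerpt does not give the initial values the specification requires.

```python
from dataclasses import dataclass

# IP ECN codepoints from Table 1
ECT_1, ECT_0, CE = 0b01, 0b10, 0b11


@dataclass
class ReceiverCounters:
    """Hypothetical per-half-connection counters kept by a Data Receiver."""
    cep: int = 0  # packets that arrived CE-marked (control packets included)
    ceb: int = 0  # payload bytes that arrived marked CE
    e0b: int = 0  # payload bytes that arrived marked ECT(0)
    e1b: int = 0  # payload bytes that arrived marked ECT(1)

    def on_packet(self, ip_ecn: int, payload_len: int) -> None:
        """Update the counters for one arriving packet.

        payload_len is the TCP payload length only, excluding the TCP
        header and TCP Options; it is 0 for a pure control packet.
        """
        if ip_ecn == CE:
            self.cep += 1            # counted even when there is no payload
            self.ceb += payload_len
        elif ip_ecn == ECT_0:
            self.e0b += payload_len
        elif ip_ecn == ECT_1:
            self.e1b += payload_len
        # Not-ECT packets do not change any of the four counters
```

For example, a CE-marked pure ACK increments only the packet counter, whereas a CE-marked 1000-byte segment increments both the packet counter and the CE byte counter.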
Each ACK carries the three least significant bits (LSBs) of the Each ACK carries the three least significant bits (LSBs) of the
packet-based CE counter using the ECN bits in the TCP header, now packet-based CE counter using the ECN bits in the TCP header, now
renamed the Accurate ECN (ACE) field (see Figure 3). The 24 LSBs of renamed the Accurate ECN (ACE) field (see Figure 3). The 24 LSBs of
some or all of the byte counters can be optionally carried in an some or all of the byte counters can be optionally carried in an
AccECN Option. For efficient use of limited option space, two AccECN Option. For efficient use of limited option space, two
alternative forms of the AccECN Option are specified with the fields alternative forms of the AccECN Option are specified with the fields
skipping to change at line 557 skipping to change at line 557
other than the L4S experiment [RFC9330], such as a lower severity or other than the L4S experiment [RFC9330], such as a lower severity or
a more instant congestion signal than CE. a more instant congestion signal than CE.
Feedback in bytes is provided to protect against the receiver or a Feedback in bytes is provided to protect against the receiver or a
middlebox using attacks similar to 'ACK-Division' to artificially middlebox using attacks similar to 'ACK-Division' to artificially
inflate the congestion window, which is why [RFC5681] now recommends inflate the congestion window, which is why [RFC5681] now recommends
that TCP counts acknowledged bytes, not packets. that TCP counts acknowledged bytes, not packets.
2.5. Generic (Mechanistic) Reflector 2.5. Generic (Mechanistic) Reflector
The ACE field provides feedback about CE markings in the IP-ECN field The ACE field provides feedback about CE markings in the IP ECN field
of both data and control packets. According to [RFC3168], the Data of both data and control packets. According to [RFC3168], the Data
Sender is meant to set the IP-ECN field of control packets to Not- Sender is meant to set the IP ECN field of control packets to Not-
ECT. However, mechanisms in certain private networks (e.g., data ECT. However, mechanisms in certain private networks (e.g., data
centres) set control packets to be ECN-capable because they are centres) set control packets to be ECN-capable because they are
precisely the packets that performance depends on most. precisely the packets that performance depends on most.
For this reason, AccECN is designed to be a generic reflector of For this reason, AccECN is designed to be a generic reflector of
whatever ECN markings it sees, whether or not they are compliant with whatever ECN markings it sees, whether or not they are compliant with
a current standard. Then as standards evolve, Data Senders can a current standard. Then as standards evolve, Data Senders can
upgrade unilaterally without any need for receivers to upgrade too. upgrade unilaterally without any need for receivers to upgrade too.
It is also useful to be able to rely on generic reflection behaviour It is also useful to be able to rely on generic reflection behaviour
when senders need to test for unexpected interference with markings when senders need to test for unexpected interference with markings
(for instance Sections 3.2.2.3, 3.2.2.4, and 3.2.3.2 of the present (for instance Sections 3.2.2.3, 3.2.2.4, and 3.2.3.2 of the present
document and paragraph 2 of Section 20.2 of [RFC3168]). document and paragraph 2 of Section 20.2 of [RFC3168]).
The initial SYN and SYN/ACK are the most critical control packets, so The initial SYN and SYN/ACK are the most critical control packets, so
AccECN feeds back their IP-ECN fields. Although RFC 3168 prohibits AccECN feeds back their IP ECN fields. Although RFC 3168 prohibits
ECN-capable SYNs and SYN/ACKs, providing feedback of ECN marking on ECN-capable SYNs and SYN/ACKs, providing feedback of ECN marking on
the SYN and SYN/ACK supports future scenarios in which SYNs might be the SYN and SYN/ACK supports future scenarios in which SYNs might be
ECN-enabled (without prejudging whether they ought to be). For ECN-enabled (without prejudging whether they ought to be). For
instance, [RFC8311] updates this aspect of RFC 3168 to allow instance, [RFC8311] updates this aspect of RFC 3168 to allow
experimentation with ECN-capable TCP control packets. experimentation with ECN-capable TCP control packets.
Even if the TCP Client (or Server) has set the SYN (or SYN/ACK) to Even if the TCP Client (or Server) has set the SYN (or SYN/ACK) to
Not-ECT in compliance with RFC 3168, feedback on the state of the IP- Not-ECT in compliance with RFC 3168, feedback on the state of the IP
ECN field when it arrives at the receiver could still be useful, ECN field when it arrives at the receiver could still be useful,
because middleboxes have been known to overwrite the IP-ECN field as because middleboxes have been known to overwrite the IP ECN field as
if it is still part of the old Type of Service (ToS) field if it is still part of the old Type of Service (ToS) field
[Mandalari18]. For example, if a TCP Client has set the SYN to Not- [Mandalari18]. For example, if a TCP Client has set the SYN to Not-
ECT, but receives feedback that the IP-ECN field on the SYN arrived ECT, but receives feedback that the IP ECN field on the SYN arrived
with a different codepoint, it can detect such middlebox with a different codepoint, it can detect such middlebox
interference. Previously, neither end knew what IP-ECN field the interference. Previously, neither end knew what IP ECN field the
other sent. So, if a TCP Server received ECT or CE on a SYN, it other sent. So, if a TCP Server received ECT or CE on a SYN, it
could not know whether it was invalid because only the TCP Client could not know whether it was invalid because only the TCP Client
knew whether it originally marked the SYN as Not-ECT (or ECT). knew whether it originally marked the SYN as Not-ECT (or ECT).
Therefore, prior to AccECN, the Server's only safe course of action Therefore, prior to AccECN, the Server's only safe course of action
in this example was to disable ECN for the connection. Instead, the in this example was to disable ECN for the connection. Instead, the
AccECN protocol allows the Server and Client to feed back the ECN AccECN protocol allows the Server and Client to feed back the ECN
field received on the SYN and SYN/ACK to their peer, which now has field received on the SYN and SYN/ACK to their peer, which now has
all the information to decide whether the connection has to fall back all the information to decide whether the connection has to fall back
from supporting ECN (or not). from supporting ECN (or not).
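The check that this feedback makes possible can be sketched as a comparison between the codepoint a host sent on its SYN (or SYN/ACK) and the codepoint its peer reports having received. The function name and the three labels below are hypothetical, and treating a change from ECT to CE as ordinary congestion marking rather than interference is an assumption of this sketch. What a host does after detecting interference is specified elsewhere in the document and is not modelled here.

```python
# IP ECN codepoints from Table 1
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def classify_syn_feedback(sent: int, fed_back: int) -> str:
    """Compare the IP ECN codepoint sent with the one the peer fed back.

    Returns one of three illustrative labels:
      "unchanged"         - the field arrived as it was sent
      "congestion-marked" - an ECT packet arrived as CE
      "altered"           - any other change, e.g. a Not-ECT SYN that
                            arrived with a different codepoint
    """
    if fed_back == sent:
        return "unchanged"
    if sent in (ECT_0, ECT_1) and fed_back == CE:
        return "congestion-marked"
    return "altered"
```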
skipping to change at line 627 skipping to change at line 627
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 2: The New Definition of the TCP Header Flags During the Figure 2: The New Definition of the TCP Header Flags During the
TCP Three-Way Handshake TCP Three-Way Handshake
During the TCP three-way handshake at the start of a connection, to During the TCP three-way handshake at the start of a connection, to
request more Accurate ECN feedback the TCP Client (host A) MUST set request more Accurate ECN feedback the TCP Client (host A) MUST set
the TCP flags (AE,CWR,ECE) = (1,1,1) in the initial SYN segment. the TCP flags (AE,CWR,ECE) = (1,1,1) in the initial SYN segment.
If a TCP Server (host B) that is AccECN-enabled receives a SYN with If a TCP Server (host B) that is AccECN-enabled receives a SYN with
the above three flags set, it MUST set both its half connections into the above three flags set, it MUST set both its half-connections into
AccECN mode. Then it MUST set the AE, CWR, and ECE TCP flags on the AccECN mode. Then it MUST set the AE, CWR, and ECE TCP flags on the
SYN/ACK to the combination in the top block of Table 2 that feeds SYN/ACK to the combination in the top block of Table 2 that feeds
back the IP-ECN field that arrived on the SYN. This applies whether back the IP ECN field that arrived on the SYN. This applies whether
or not the Server itself supports setting the IP-ECN field on a SYN or not the Server itself supports setting the IP ECN field on a SYN
or SYN/ACK (see Section 2.5 for rationale). or SYN/ACK (see Section 2.5 for rationale).
When the TCP Server returns any of the four combinations in the top When the TCP Server returns any of the four combinations in the top
block of Table 2, it confirms that it supports AccECN. The TCP block of Table 2, it confirms that it supports AccECN. The TCP
Server MUST NOT set one of these four combinations of flags on the Server MUST NOT set one of these four combinations of flags on the
SYN/ACK unless the preceding SYN requested support for AccECN as SYN/ACK unless the preceding SYN requested support for AccECN as
above. above.
Once a TCP Client (A) has sent the above SYN to declare that it Once a TCP Client (A) has sent the above SYN to declare that it
supports AccECN, and once it has received the above SYN/ACK segment supports AccECN, and once it has received the above SYN/ACK segment
that confirms that the TCP Server supports AccECN, the TCP Client that confirms that the TCP Server supports AccECN, the TCP Client
MUST set both its half connections into AccECN mode. The TCP Client MUST set both its half-connections into AccECN mode. The TCP Client
MUST NOT enter AccECN mode (or any feedback mode) before it has MUST NOT enter AccECN mode (or any feedback mode) before it has
received the first SYN/ACK. received the first SYN/ACK.
Once in AccECN mode, a TCP Client or Server has the rights and Once in AccECN mode, a TCP Client or Server has the rights and
obligations to participate in the ECN protocol defined in obligations to participate in the ECN protocol defined in
Section 3.1.5. Section 3.1.5.
The procedures for retransmission of SYNs or SYN/ACKs are given in The procedures for retransmission of SYNs or SYN/ACKs are given in
Section 3.1.4. Section 3.1.4.
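The flag exchange described above can be sketched as follows. This is a simplified illustration, not a full implementation of the negotiation: all names are hypothetical, and only the one SYN/ACK combination visible in this excerpt of Table 2 (a Not-ECT SYN fed back as AE,CWR,ECE = 0,1,0) is included. The remaining rows of the top block are deliberately left out rather than guessed.

```python
from typing import Optional, Tuple

Flags = Tuple[int, int, int]  # (AE, CWR, ECE)

ACCECN_SYN: Flags = (1, 1, 1)

# Top block of Table 2, keyed by the IP ECN codepoint that arrived on the
# SYN. Only the row shown in this excerpt is filled in.
SYNACK_FEEDBACK = {
    0b00: (0, 1, 0),  # Not-ECT SYN
}


def server_synack_flags(syn_flags: Flags, syn_ip_ecn: int) -> Optional[Flags]:
    """Flags an AccECN-enabled Server would set on the SYN/ACK.

    Returns None when the SYN did not request AccECN, because the Server
    must not use the AccECN combinations in that case.
    """
    if syn_flags != ACCECN_SYN:
        return None
    if syn_ip_ecn not in SYNACK_FEEDBACK:
        raise LookupError("Table 2 row not reproduced in this sketch")
    return SYNACK_FEEDBACK[syn_ip_ecn]


def client_enters_accecn(sent_syn_flags: Flags,
                         synack_flags: Optional[Flags]) -> bool:
    """True once the Client has sent the AccECN SYN and has received a
    SYN/ACK carrying one of the top-block combinations."""
    return (sent_syn_flags == ACCECN_SYN
            and synack_flags in SYNACK_FEEDBACK.values())
```

Keeping the feedback combinations in a lookup table keyed by the received codepoint mirrors how Table 2 is laid out, so the remaining rows can be added from the table without changing the logic.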
skipping to change at line 669 skipping to change at line 669
3.1.2. Backward Compatibility 3.1.2. Backward Compatibility
The three flags set to 1 to indicate AccECN support on the SYN The three flags set to 1 to indicate AccECN support on the SYN
have been carefully chosen to enable natural fall-back to prior have been carefully chosen to enable natural fall-back to prior
stages in the evolution of ECN. Table 2 tabulates all the stages in the evolution of ECN. Table 2 tabulates all the
negotiation possibilities for ECN-related capabilities that involve negotiation possibilities for ECN-related capabilities that involve
at least one AccECN-capable host. The entries in the first two at least one AccECN-capable host. The entries in the first two
columns have been abbreviated, as follows: columns have been abbreviated, as follows:
AccECN: Supports more Accurate ECN feedback (the present AccECN: Supports more Accurate ECN feedback (the present
specification) specification).
Nonce: Supports ECN-nonce feedback [RFC3540] Nonce: Supports ECN-nonce feedback [RFC3540].
ECN: Supports 'Classic' ECN feedback [RFC3168] ECN: Supports 'Classic' ECN feedback [RFC3168].
No ECN: Not ECN-capable. Implicit congestion notification using No ECN: Not ECN-capable. Implicit congestion notification using
packet drop. packet drop.
+========+========+============+============+======================+ +========+========+============+============+======================+
| Host A | Host B | SYN | SYN/ACK | Feedback Mode | | Host A | Host B | SYN | SYN/ACK | Feedback Mode |
| | | A->B | B->A | of Host A | | | | A->B | B->A | of Host A |
| | | AE CWR ECE | AE CWR ECE | | | | | AE CWR ECE | AE CWR ECE | |
+========+========+============+============+======================+ +========+========+============+============+======================+
| AccECN | AccECN | 1 1 1 | 0 1 0 | AccECN (Not-ECT SYN) | | AccECN | AccECN | 1 1 1 | 0 1 0 | AccECN (Not-ECT SYN) |
skipping to change at line 716 skipping to change at line 716
row. row.
1. The top block shows the case already described in Section 3.1 1. The top block shows the case already described in Section 3.1
where both endpoints support AccECN and how the TCP Server (B) where both endpoints support AccECN and how the TCP Server (B)
indicates congestion feedback. indicates congestion feedback.
2. The second block shows the cases where the TCP Client (A) 2. The second block shows the cases where the TCP Client (A)
supports AccECN but the TCP Server (B) supports some earlier supports AccECN but the TCP Server (B) supports some earlier
variant of TCP feedback, as indicated in its SYN/ACK. Therefore, variant of TCP feedback, as indicated in its SYN/ACK. Therefore,
as soon as an AccECN-capable TCP Client (A) receives the SYN/ACK as soon as an AccECN-capable TCP Client (A) receives the SYN/ACK
shown, it MUST set both its half connections into the feedback shown, it MUST set both its half-connections into the feedback
mode shown in the rightmost column. If the TCP Client has set mode shown in the rightmost column. If the TCP Client has set
itself into Classic ECN feedback mode, it MUST comply with itself into Classic ECN feedback mode, it MUST comply with
[RFC3168]. [RFC3168].
An AccECN implementation has no need to recognize or support the An AccECN implementation has no need to recognize or support the
Server response labelled 'Nonce' or ECN-nonce feedback more Server response labelled 'Nonce' or ECN-nonce feedback more
generally [RFC3540], as RFC 3540 has been reclassified as generally [RFC3540], as RFC 3540 has been reclassified as
Historic [RFC8311]. AccECN is compatible with alternative ECN Historic [RFC8311]. AccECN is compatible with alternative ECN
feedback integrity approaches to the nonce (see Section 5.3). feedback integrity approaches to the nonce (see Section 5.3).
The SYN/ACK labelled 'Nonce' with (AE,CWR,ECE) = (1,0,1) is The SYN/ACK labelled 'Nonce' with (AE,CWR,ECE) = (1,0,1) is
skipping to change at line 738 skipping to change at line 738
SYN/ACK follows the procedure for forward compatibility given in SYN/ACK follows the procedure for forward compatibility given in
Section 3.1.3. Section 3.1.3.
3. The third block shows the cases where the TCP Server (B) supports 3. The third block shows the cases where the TCP Server (B) supports
AccECN but the TCP Client (A) supports some earlier variant of AccECN but the TCP Client (A) supports some earlier variant of
TCP feedback, as indicated in its SYN. TCP feedback, as indicated in its SYN.
When an AccECN-enabled TCP Server (B) receives a SYN with When an AccECN-enabled TCP Server (B) receives a SYN with
(AE,CWR,ECE) = (0,1,1), it MUST do one of the following: (AE,CWR,ECE) = (0,1,1), it MUST do one of the following:
* set both its half connections into the Classic ECN feedback * set both its half-connections into the Classic ECN feedback
mode and return a SYN/ACK with (AE,CWR,ECE) = (0,0,1) as mode and return a SYN/ACK with (AE,CWR,ECE) = (0,0,1) as
shown. Then it MUST comply with [RFC3168]. shown. Then it MUST comply with [RFC3168].
* set both its half-connections into Not ECN mode and return a * set both its half-connections into Not ECN mode and return a
SYN/ACK with (AE,CWR,ECE) = (0,0,0), then continue with ECN SYN/ACK with (AE,CWR,ECE) = (0,0,0), then continue with ECN
disabled. This latter case is unlikely to be desirable, but disabled. This latter case is unlikely to be desirable, but
it is allowed as a possibility, e.g., for minimal TCP it is allowed as a possibility, e.g., for minimal TCP
implementations. implementations.
When an AccECN-enabled TCP Server (B) receives a SYN with When an AccECN-enabled TCP Server (B) receives a SYN with
(AE,CWR,ECE) = (0,0,0), it MUST set both its half connections (AE,CWR,ECE) = (0,0,0), it MUST set both its half-connections
into the Not ECN feedback mode, return a SYN/ACK with into the Not ECN feedback mode, return a SYN/ACK with
(AE,CWR,ECE) = (0,0,0) as shown, and continue with ECN disabled. (AE,CWR,ECE) = (0,0,0) as shown, and continue with ECN disabled.
4. The fourth block displays a combination labelled 'Broken'. Some 4. The fourth block displays a combination labelled 'Broken'. Some
older TCP Server implementations incorrectly set the TCP-ECN older TCP Server implementations incorrectly set the TCP-ECN
flags in the SYN/ACK by reflecting those in the SYN. Such broken flags in the SYN/ACK by reflecting those in the SYN. Such broken
TCP Servers (B) cannot support ECN; so as soon as an AccECN- TCP Servers (B) cannot support ECN; so as soon as an AccECN-
capable TCP Client (A) receives such a broken SYN/ACK, it MUST capable TCP Client (A) receives such a broken SYN/ACK, it MUST
fall back to Not ECN mode for both its half connections and fall back to Not ECN mode for both its half-connections and
continue with ECN disabled. continue with ECN disabled.
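As a non-normative illustration, the Client-side reading of Table 2 described in blocks 1, 2, and 4 could be sketched as follows. The names Mode and client_mode_from_synack are hypothetical. The sketch assumes the Client sent a SYN with (AE,CWR,ECE) = (1,1,1), and it assumes that every SYN/ACK combination not singled out below confirms AccECN; this excerpt shows only (0,1,0) explicitly, and the reserved (1,0,1) case follows Section 3.1.3.

```python
from enum import Enum


class Mode(Enum):
    ACCECN = "AccECN"
    CLASSIC_ECN = "Classic ECN"
    NOT_ECN = "Not ECN"


def client_mode_from_synack(ae, cwr, ece):
    """Feedback mode for an AccECN Client that sent SYN (1,1,1).

    Hypothetical helper: the arguments are the TCP-ECN flags seen on
    the first SYN/ACK.
    """
    flags = (ae, cwr, ece)
    if flags == (0, 0, 0):
        return Mode.NOT_ECN          # Server is not ECN-capable
    if flags == (0, 0, 1):
        return Mode.CLASSIC_ECN      # Server offers RFC 3168 feedback
    if flags == (1, 1, 1):
        return Mode.NOT_ECN          # 'Broken': Server reflected the SYN flags
    # Assumption: all remaining combinations, including the reserved
    # (1,0,1), are treated as confirming AccECN.
    return Mode.ACCECN
```

Whatever mode is returned applies to both half-connections of the Client.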
The following additional rules do not fit the structure of the table, The following additional rules do not fit the structure of the table,
but they complement it: but they complement it:
Simultaneous Open: An originating AccECN Host (A), having sent a SYN Simultaneous Open: An originating AccECN Host (A), having sent a SYN
with (AE,CWR,ECE) = (1,1,1), might receive another SYN from host with (AE,CWR,ECE) = (1,1,1), might receive another SYN from host
B. Host A MUST then enter the same feedback mode as it would have B. Host A MUST then enter the same feedback mode as it would have
entered had it been a responding host and received the same SYN. entered had it been a responding host and received the same SYN.
Then host A MUST send the same SYN/ACK as it would have sent had Then host A MUST send the same SYN/ACK as it would have sent had
skipping to change at line 793 skipping to change at line 793
such a combination, the Server MUST negotiate the use of AccECN as if such a combination, the Server MUST negotiate the use of AccECN as if
the three flags had been set to (1,1,1). However, an AccECN Client the three flags had been set to (1,1,1). However, an AccECN Client
implementation MUST NOT send a SYN with any combination other than implementation MUST NOT send a SYN with any combination other than
the three listed. the three listed.
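A minimal, non-normative sketch of how an AccECN Server might pick its mode from the first SYN, including the forward-compatibility rule above, is given below. The function name and the allow_classic_ecn parameter are hypothetical. The returned SYN/ACK flags for the AccECN case assume the SYN arrived as Not-ECT, which is the only AccECN row of Table 2 visible in this excerpt; other IP ECN values would need other encodings.

```python
def server_response_to_syn(syn_flags, allow_classic_ecn=True):
    """Return (mode, synack_flags) for an AccECN-capable Server.

    syn_flags is the (AE, CWR, ECE) tuple on the first valid SYN.
    allow_classic_ecn is a hypothetical local policy switch for the
    choice offered when a Classic ECN SYN arrives.
    """
    if syn_flags == (0, 0, 0):
        return ("Not ECN", (0, 0, 0))
    if syn_flags == (0, 1, 1):
        if allow_classic_ecn:
            return ("Classic ECN", (0, 0, 1))
        return ("Not ECN", (0, 0, 0))   # permitted, though unlikely to be desirable
    # (1,1,1), or any other combination for which the Server has no
    # specific logic: negotiate AccECN as if the SYN had carried (1,1,1).
    # Assumption: the SYN arrived Not-ECT, so (0,1,0) is fed back.
    return ("AccECN", (0, 1, 0))
```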
If a TCP Client sent a SYN requesting AccECN feedback with If a TCP Client sent a SYN requesting AccECN feedback with
(AE,CWR,ECE) = (1,1,1) and then receives a SYN/ACK with the currently (AE,CWR,ECE) = (1,1,1) and then receives a SYN/ACK with the currently
reserved combination (AE,CWR,ECE) = (1,0,1) but it does not have reserved combination (AE,CWR,ECE) = (1,0,1) but it does not have
logic specific to such a combination, the Client MUST enable AccECN logic specific to such a combination, the Client MUST enable AccECN
mode as if the SYN/ACK confirmed that the Server supported AccECN and mode as if the SYN/ACK confirmed that the Server supported AccECN and
as if it fed back that the IP-ECN field on the SYN had arrived as if it fed back that the IP ECN field on the SYN had arrived
unchanged. However, an AccECN Server implementation MUST NOT send a unchanged. However, an AccECN Server implementation MUST NOT send a
SYN/ACK with this combination (AE,CWR,ECE) = (1,0,1). SYN/ACK with this combination (AE,CWR,ECE) = (1,0,1).
| For the avoidance of doubt, the behaviour described in the | For the avoidance of doubt, the behaviour described in the
| present specification applies whether or not the three | present specification applies whether or not the three
| remaining reserved TCP header flags are zero. | remaining reserved TCP header flags are zero.
All of these requirements ensure that future uses of all the Reserved All of these requirements ensure that future uses of all the Reserved
combinations on a SYN or SYN/ACK can rely on consistent behaviour combinations on a SYN or SYN/ACK (see Table 2) can rely on consistent
from the installed base of AccECN implementations. See Appendix B.3 behaviour from the installed base of AccECN implementations. See
for related discussion. Appendix B.3 for related discussion.
3.1.4. Multiple SYNs or SYN/ACKs 3.1.4. Multiple SYNs or SYN/ACKs
3.1.4.1. Retransmitted SYNs 3.1.4.1. Retransmitted SYNs
If the sender of an AccECN SYN (the TCP Client) times out before If the sender of an AccECN SYN (the TCP Client) times out before
receiving the SYN/ACK, it SHOULD attempt to negotiate the use of receiving the SYN/ACK, it SHOULD attempt to negotiate the use of
AccECN at least one more time by continuing to set all three TCP ECN AccECN at least one more time by continuing to set all three TCP ECN
flags (AE,CWR,ECE) = (1,1,1) on the first retransmitted SYN (using flags (AE,CWR,ECE) = (1,1,1) on the first retransmitted SYN (using
the usual retransmission timeouts). If this first retransmission the usual retransmission timeouts). If this first retransmission
skipping to change at line 830 skipping to change at line 830
Retrying once before fall-back adds delay in the case where a Retrying once before fall-back adds delay in the case where a
middlebox drops an AccECN (or ECN) SYN deliberately. However, recent middlebox drops an AccECN (or ECN) SYN deliberately. However, recent
measurements [Mandalari18] imply that a drop is less likely to be due measurements [Mandalari18] imply that a drop is less likely to be due
to middlebox interference than other intermittent causes of loss, to middlebox interference than other intermittent causes of loss,
e.g., congestion, wireless transmission loss, etc. e.g., congestion, wireless transmission loss, etc.
Implementers MAY use other fall-back strategies if they are found to Implementers MAY use other fall-back strategies if they are found to
be more effective (e.g., attempting to negotiate AccECN on the SYN be more effective (e.g., attempting to negotiate AccECN on the SYN
only once or more than twice (most appropriate during high levels of only once or more than twice (most appropriate during high levels of
congestion). congestion)).
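The retry strategies discussed above can be pictured as a schedule of TCP-ECN flags per SYN attempt. The sketch below is non-normative; syn_flags_for_attempt and accecn_attempts are hypothetical names. It assumes that fall-back means clearing all three flags to (0,0,0), which is consistent with the bullets later in this section but is not spelled out in the lines shown here.

```python
def syn_flags_for_attempt(attempt, accecn_attempts=2):
    """Return (AE, CWR, ECE) for the SYN sent on a given attempt.

    attempt 0 is the original SYN, attempt 1 the first retransmission.
    accecn_attempts=2 models 'try AccECN at least one more time';
    1 or values above 2 model the alternative strategies.
    """
    if attempt < 0 or accecn_attempts < 1:
        raise ValueError("attempt must be >= 0 and accecn_attempts >= 1")
    if attempt < accecn_attempts:
        return (1, 1, 1)    # still requesting AccECN feedback
    return (0, 0, 0)        # assumed fall-back: ECN negotiation disabled
```

For example, with the default setting the original SYN and the first retransmission request AccECN, and later retransmissions do not.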
Further, it might make sense to also remove any other new or Further, it might make sense to also remove any other new or
experimental fields or options on the SYN in case a middlebox might experimental fields or options on the SYN in case a middlebox might
be blocking them, although the required behaviour will depend on the be blocking them, although the required behaviour will depend on the
specification of the other option(s) and any attempt to coordinate specification of the other option(s) and any attempt to coordinate
fall-back between different modules of the stack. For instance, even fall-back between different modules of the stack. For instance, if
if taking part in an [RFC8311] experiment that allows ECT on a SYN, taking part in an [RFC8311] experiment that allows ECT on a SYN, it
it would be advisable to try it without. would be advisable to have a fall-back strategy that tries use of
AccECN without setting ECT on SYN.
Whichever fall-back strategy is used, the TCP initiator SHOULD cache Whichever fall-back strategy is used, the TCP initiator SHOULD cache
failed connection attempts. If it does, it SHOULD NOT give up failed connection attempts. If it does, it SHOULD NOT give up
attempting to negotiate AccECN on the SYN of subsequent connection attempting to negotiate AccECN on the SYN of subsequent connection
attempts until it is clear that the blockage is persistently and attempts until it is clear that the blockage is persistently and
specifically due to AccECN. The cache needs to be arranged to expire specifically due to AccECN. The cache needs to be arranged to expire
so that the initiator will infrequently attempt to check whether the so that the initiator will infrequently attempt to check whether the
problem has been resolved. problem has been resolved.
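One possible shape for such a cache is sketched below, purely as an illustration. The class name, the per-peer key, the failure threshold, and the time-to-live are all assumptions; the text above only requires that the initiator not give up until the blockage is persistently and specifically due to AccECN, and that entries expire. Deciding whether a failure is specific to AccECN is left to the caller.

```python
import time


class AccEcnFailureCache:
    """Remembers peers for which AccECN SYNs appear to be blocked."""

    def __init__(self, ttl_seconds=3600.0, threshold=3, clock=time.monotonic):
        self._ttl = ttl_seconds        # how long an entry stays valid
        self._threshold = threshold    # failures needed before giving up
        self._clock = clock            # injectable for testing
        self._entries = {}             # peer -> (failure_count, expiry_time)

    def _live_entry(self, peer):
        entry = self._entries.get(peer)
        if entry is None or entry[1] <= self._clock():
            self._entries.pop(peer, None)   # drop expired entries
            return (0, 0.0)
        return entry

    def record_failure(self, peer):
        """Call only when the failure seems specific to AccECN."""
        count, _ = self._live_entry(peer)
        self._entries[peer] = (count + 1, self._clock() + self._ttl)

    def record_success(self, peer):
        self._entries.pop(peer, None)

    def should_try_accecn(self, peer):
        count, _ = self._live_entry(peer)
        return count < self._threshold
```

Because entries expire, the initiator retries AccECN towards a cached peer once the time-to-live has elapsed.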
All fall-back strategies will need to follow all the normative rules All fall-back strategies will need to follow all the normative rules
in Section 3.1.5, which concern behaviour when SYNs or SYN/ACKs in Section 3.1.5, which concern behaviour when SYNs or SYN/ACKs
negotiating different types of feedback have been sent within the negotiating different types of feedback have been sent within the
same connection, including the possibility that they arrive out of same connection, including the possibility that they arrive out of
order. As examples, the following non-normative bullets call out order. As examples, the following non-normative bullets call out
those rules from Section 3.1.5 that apply to the above fall-back those rules from Section 3.1.5 that apply to the above fall-back
strategies: strategies:
* Once the TCP Client has sent SYNs with (AE,CWR,ECE) = (1,1,1) and * Once the TCP Client has sent SYNs with (AE,CWR,ECE) = (1,1,1) and
with (AE,CWR,ECE) = (0,0,0), it might eventually receive a SYN/ACK with (AE,CWR,ECE) = (0,0,0), it might eventually receive a SYN/ACK
from the Server in response to one, the other, or both, and from the Server in response to one, the other, or both, and
possibly reordered; possibly reordered.
* Such a TCP Client enters the feedback mode appropriate to the * Such a TCP Client enters the feedback mode appropriate to the
first SYN/ACK it receives according to Table 2, and it does not first SYN/ACK it receives according to Table 2, and it does not
switch to a different mode, whatever other SYN/ACKs it might switch to a different mode, whatever other SYN/ACKs it might
receive or send; receive or send.
* If a TCP Client has entered AccECN mode but then subsequently * If a TCP Client has entered AccECN mode but then subsequently
sends a SYN or receives a SYN/ACK with (AE,CWR,ECE) = (0,0,0), it sends a SYN or receives a SYN/ACK with (AE,CWR,ECE) = (0,0,0), it
is still allowed to set ECT on packets for the rest of the is still allowed to set ECT on packets for the rest of the
connection. Note that this rule is different than that of a connection. Note that this rule is different than that of a
Server in an equivalent position (Section 3.1.5 explains). Server in an equivalent position (Section 3.1.5 explains).
* Having entered AccECN mode, in general a TCP Client commits to * Having entered AccECN mode, in general a TCP Client commits to
respond to any incoming congestion feedback, whether or not it respond to any incoming congestion feedback, whether or not it
sets ECT on outgoing packets (for rationale and some exceptions sets ECT on outgoing packets (for rationale and some exceptions
see Section 3.2.2.3, Section 3.2.2.4); see Section 3.2.2.3, Section 3.2.2.4).
* Having entered AccECN mode, a TCP Client commits to using AccECN * Having entered AccECN mode, a TCP Client commits to using AccECN
to feed back the IP-ECN field in incoming packets for the rest of to feed back the IP ECN field in incoming packets for the rest of
the connection, as specified in Section 3.2, even if it is not the connection, as specified in Section 3.2, even if it is not
itself setting ECT on outgoing packets. itself setting ECT on outgoing packets.
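The "first SYN/ACK wins" behaviour in the bullets above could be modelled as a simple latch, as in the following non-normative sketch. FeedbackModeLatch and toy_classify are hypothetical names, and toy_classify is only a stand-in for a full Table 2 lookup.

```python
class FeedbackModeLatch:
    """Holds the feedback mode chosen by the first SYN/ACK received."""

    def __init__(self, classify):
        self._classify = classify   # callable: flags tuple -> mode name
        self.mode = None            # None until the first SYN/ACK arrives

    def on_synack(self, flags):
        # Only the first SYN/ACK selects the mode; later ones are ignored
        # for mode selection, whatever flags they carry.
        if self.mode is None:
            self.mode = self._classify(flags)
        return self.mode

    @property
    def feeds_back_accecn(self):
        # Once in AccECN mode, the Client keeps feeding back the IP ECN
        # field with AccECN for the rest of the connection.
        return self.mode == "AccECN"


def toy_classify(flags):
    """Simplified stand-in for Table 2 (an assumption of this sketch)."""
    not_accecn = {(0, 0, 0): "Not ECN", (0, 0, 1): "Classic ECN",
                  (1, 1, 1): "Not ECN"}
    return not_accecn.get(flags, "AccECN")
```

A Client whose first SYN/ACK carried (0,1,0) therefore stays in AccECN mode even if a reordered SYN/ACK with (0,0,0) arrives afterwards.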
3.1.4.2. Retransmitted SYN/ACKs 3.1.4.2. Retransmitted SYN/ACKs
A TCP Server might send multiple SYN/ACKs indicating different A TCP Server might send multiple SYN/ACKs indicating different
feedback modes. For instance, when falling back to sending a SYN/ACK feedback modes. For instance, when falling back to sending a SYN/ACK
with (AE,CWR,ECE) = (0,0,0) after previous AccECN SYN/ACKs have timed with (AE,CWR,ECE) = (0,0,0) after previous AccECN SYN/ACKs have timed
out (Section 3.2.3.2.2); or to acknowledge different retransmissions out (Section 3.2.3.2.2); or to acknowledge different retransmissions
of the SYN (Section 3.1.4.1). of the SYN (Section 3.1.4.1).
skipping to change at line 900 skipping to change at line 901
All fall-back strategies will need to follow all the normative rules All fall-back strategies will need to follow all the normative rules
in Section 3.1.5, which concern behaviour when SYNs or SYN/ACKs in Section 3.1.5, which concern behaviour when SYNs or SYN/ACKs
negotiating different types of feedback are sent within the same negotiating different types of feedback are sent within the same
connection, including the possibility that they arrive out of order. connection, including the possibility that they arrive out of order.
As examples, the following non-normative bullets call out those rules As examples, the following non-normative bullets call out those rules
from Section 3.1.5 that apply to the above fall-back strategies: from Section 3.1.5 that apply to the above fall-back strategies:
* An AccECN-capable TCP Server enters the feedback mode appropriate * An AccECN-capable TCP Server enters the feedback mode appropriate
to the first SYN it receives using Table 2, and it does not switch to the first SYN it receives using Table 2, and it does not switch
to a different mode, whatever other SYNs it might receive and to a different mode, whatever other SYNs it might receive and
whatever SYN/ACKs it might send; whatever SYN/ACKs it might send.
* If a TCP Server in AccECN mode receives a SYN with (AE,CWR,ECE) = * If a TCP Server in AccECN mode receives a SYN with (AE,CWR,ECE) =
(0,0,0), it preferably acknowledges it first using an AccECN SYN/ (0,0,0), it preferably acknowledges it first using an AccECN SYN/
ACK, but it can retry using a SYN/ACK with (AE,CWR,ECE) = (0,0,0); ACK, but it can retry using a SYN/ACK with (AE,CWR,ECE) = (0,0,0).
* If a TCP Server in AccECN mode sends multiple AccECN SYN/ACKs, it * If a TCP Server in AccECN mode sends multiple AccECN SYN/ACKs, it
uses the TCP-ECN flags in each SYN/ACK to feed back the IP-ECN uses the TCP-ECN flags in each SYN/ACK to feed back the IP ECN
field on the latest SYN to have arrived; field on the latest SYN to have arrived.
* If a TCP Server enters AccECN mode and then subsequently sends a * If a TCP Server enters AccECN mode and then subsequently sends a
SYN/ACK or receives a SYN with (AE,CWR,ECE) = (0,0,0), it is SYN/ACK or receives a SYN with (AE,CWR,ECE) = (0,0,0), it is
prohibited from setting ECT on any packet for the rest of the prohibited from setting ECT on any packet for the rest of the
connection; connection.
* Having entered AccECN mode, in general a TCP Server commits to * Having entered AccECN mode, in general a TCP Server commits to
respond to any incoming congestion feedback, whether or not it respond to any incoming congestion feedback, whether or not it
sets ECT on outgoing packets (for rationale and some exceptions sets ECT on outgoing packets (for rationale and some exceptions
see Sections 3.2.2.3, 3.2.2.4); see Sections 3.2.2.3, 3.2.2.4).
* Having entered AccECN mode, a TCP Server commits to using AccECN * Having entered AccECN mode, a TCP Server commits to using AccECN
to feed back the IP-ECN field in incoming packets for the rest of to feed back the IP ECN field in incoming packets for the rest of
the connection, as specified in Section 3.2, even if it is not the connection, as specified in Section 3.2, even if it is not
itself setting ECT on outgoing packets. itself setting ECT on outgoing packets.
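As an illustration of the second bullet above, a Server already in AccECN mode might choose the flags of each SYN/ACK sent in response to a SYN with (0,0,0) along the following lines. This is a non-normative sketch: the class name and the accecn_synack_attempts budget are hypothetical, and the default accecn_flags of (0,1,0) assumes the latest SYN arrived as Not-ECT.

```python
class AccEcnServerHandshake:
    """SYN/ACK flag selection for a Server that is already in AccECN mode."""

    def __init__(self, accecn_synack_attempts=1):
        self.mode = "AccECN"                      # never changes once entered
        self._accecn_left = accecn_synack_attempts

    def synack_flags_for_zero_syn(self, accecn_flags=(0, 1, 0)):
        """Flags for the SYN/ACK answering a valid SYN with (0,0,0)."""
        if self._accecn_left > 0:
            # Preferred: acknowledge with an AccECN SYN/ACK first.
            self._accecn_left -= 1
            return accecn_flags
        # Permitted: eventually fall back to a non-ECN SYN/ACK.
        return (0, 0, 0)
```

In either case the mode attribute stays "AccECN", reflecting the rule that the Server does not switch feedback mode.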
3.1.5. Implications of AccECN Mode 3.1.5. Implications of AccECN Mode
Section 3.1.1 describes the only ways that a host can enter AccECN Section 3.1.1 describes the only ways that a host can enter AccECN
mode, whether as a Client or as a Server. mode, whether as a Client or as a Server.
An implementation that supports AccECN has the rights and obligations An implementation that supports AccECN has the rights and obligations
concerning the use of ECN defined below, which update those in concerning the use of ECN defined below, which update those in
Section 6.1.1 of [RFC3168]. This section uses the following Section 6.1.1 of [RFC3168]. This section uses the following
definitions: definitions:
'During the handshake': The connection states prior to 'During the handshake': The connection states prior to
synchronization; synchronization.
'Valid SYN': A SYN that has the same port numbers and the same ISN 'Valid SYN': A SYN that has the same port numbers and the same ISN
as the SYN that first caused the Server to open the connection. as the SYN that first caused the Server to open the connection.
An 'Acceptable' packet is defined in Section 1.3. An 'Acceptable' packet is defined in Section 1.3.
Handling SYNs or SYN/ACKs of multiple types (e.g., fall-back): Handling SYNs or SYN/ACKs of multiple types (e.g., fall-back):
* Any implementation that supports AccECN: * Any implementation that supports AccECN:
- MUST NOT switch into a different feedback mode than the one it - MUST NOT switch into a different feedback mode than the one it
first entered according to Table 2, no matter whether it first entered according to Table 2, no matter whether it
subsequently receives valid SYNs or Acceptable SYN/ACKs of subsequently receives valid SYNs or Acceptable SYN/ACKs of
different types. different types.
- SHOULD ignore the TCP-ECN flags in SYNs or SYN/ACKs that are - SHOULD ignore the TCP-ECN flags in SYNs or SYN/ACKs that are
received after the implementation reaches the Established received after the implementation reaches the ESTABLISHED
state, in line with the general TCP approach [RFC9293]; state, in line with the general TCP approach [RFC9293].
Reason: Reaching established state implies that at least one Reason: Reaching ESTABLISHED state implies that at least one
SYN and one SYN/ACK have successfully been delivered. And all SYN and one SYN/ACK have successfully been delivered. And all
the rules for handshake fall-back are designed to work based on the rules for handshake fall-back are designed to work based on
those packets that successfully traverse the path, whatever those packets that successfully traverse the path, whatever
other handshake packets are lost or delayed. other handshake packets are lost or delayed.
- MUST NOT send a 'Classic' ECN-setup SYN [RFC3168] with - MUST NOT send a 'Classic' ECN-setup SYN [RFC3168] with
(AE,CWR,ECE) = (0,1,1) and a SYN with (AE,CWR,ECE) = (1,1,1) (AE,CWR,ECE) = (0,1,1) and a SYN with (AE,CWR,ECE) = (1,1,1)
requesting AccECN feedback within the same connection; requesting AccECN feedback within the same connection;
- MUST NOT send a 'Classic' ECN-setup SYN/ACK [RFC3168] with - MUST NOT send a 'Classic' ECN-setup SYN/ACK [RFC3168] with
skipping to change at line 986 skipping to change at line 987
handshake; handshake;
The last four rules are necessary because, if one peer were to The last four rules are necessary because, if one peer were to
negotiate the feedback mode in two different types of handshake, negotiate the feedback mode in two different types of handshake,
it would not be possible for the other peer to know for certain it would not be possible for the other peer to know for certain
which handshake packet(s) the other end had eventually received or which handshake packet(s) the other end had eventually received or
in which order it received them. So, in the absence of these in which order it received them. So, in the absence of these
rules, the two peers could end up using different ECN feedback rules, the two peers could end up using different ECN feedback
modes without knowing it. modes without knowing it.
* A host in AccECN mode that is feeding back the IP-ECN field on a * A host in AccECN mode that is feeding back the IP ECN field on a
SYN or SYN/ACK: SYN or SYN/ACK:
- MUST feed back the IP-ECN field on the latest valid SYN or - MUST feed back the IP ECN field on the latest valid SYN or
acceptable SYN/ACK to arrive. acceptable SYN/ACK to arrive.
* A TCP Server already in AccECN mode: * A TCP Server already in AccECN mode:
- SHOULD acknowledge a valid SYN arriving with (AE,CWR,ECE) = - SHOULD acknowledge a valid SYN arriving with (AE,CWR,ECE) =
(0,0,0) by emitting an AccECN SYN/ACK (with the appropriate (0,0,0) by emitting an AccECN SYN/ACK (with the appropriate
combination of TCP-ECN flags to feed back the IP-ECN field of combination of TCP-ECN flags to feed back the IP ECN field of
this latest SYN); this latest SYN).
- MAY acknowledge a valid SYN arriving with (AE,CWR,ECE) = - MAY acknowledge a valid SYN arriving with (AE,CWR,ECE) =
(0,0,0) by sending a SYN/ACK with (AE,CWR,ECE) = (0,0,0); (0,0,0) by sending a SYN/ACK with (AE,CWR,ECE) = (0,0,0).
Rationale: When a SYN arrives with (AE,CWR,ECE) = (0,0,0) at a TCP Rationale: When a SYN arrives with (AE,CWR,ECE) = (0,0,0) at a TCP
Server that is already in AccECN mode, it implies that the TCP Server that is already in AccECN mode, it implies that the TCP
Client had probably not received the previous AccECN SYN/ACK Client had probably not received the previous AccECN SYN/ACK
emitted by the TCP Server. Therefore, the first bullet recommends emitted by the TCP Server. Therefore, the first bullet recommends
attempting at least one more AccECN SYN/ACK. Nonetheless, the attempting at least one more AccECN SYN/ACK. Nonetheless, the
second bullet recognizes that the Server might eventually need to second bullet recognizes that the Server might eventually need to
fall back to a non-ECN SYN/ACK. In either case, the TCP Server fall back to a non-ECN SYN/ACK. In either case, the TCP Server
remains in AccECN feedback mode (according to the earlier remains in AccECN feedback mode (according to the earlier
requirement not to switch modes). requirement not to switch modes).
* An AccECN-capable TCP Server already in Not ECN mode: * An AccECN-capable TCP Server already in Not ECN mode:
- SHOULD respond to any subsequent valid SYN using a SYN/ACK with - SHOULD respond to any subsequent valid SYN using a SYN/ACK with
(AE,CWR,ECE) = (0,0,0), even if the SYN is offering to (AE,CWR,ECE) = (0,0,0), even if the SYN is offering to
negotiate Classic ECN or AccECN feedback mode; negotiate Classic ECN or AccECN feedback mode.
Rationale: There would be no point in the Server offering any Rationale: There would be no point in the Server offering any
type of ECN feedback, because the Client will not be using ECN. type of ECN feedback, because the Client will not be using ECN.
However, there is no interoperability reason to make this rule However, there is no interoperability reason to make this rule
mandatory. mandatory.
If for any reason a host is not willing to provide ECN feedback on a If for any reason a host is not willing to provide ECN feedback on a
particular TCP connection, it SHOULD clear the AE, CWR, and ECE flags particular TCP connection, it SHOULD clear the AE, CWR, and ECE flags
in all SYN and/or SYN/ACK packets that it sends. in all SYN and/or SYN/ACK packets that it sends.
Sending ECT: Sending ECT:
* Any implementation that supports AccECN: * Any implementation that supports AccECN:
- MUST NOT set ECT if it is in Not ECN feedback mode. - MUST NOT set ECT if it is in Not ECN feedback mode.
A Data Sender in AccECN mode: A Data Sender in AccECN mode:
- SHOULD set an ECT codepoint in the IP header of packets to - SHOULD set an ECT codepoint in the IP header of packets to
indicate to the network that the transport is capable and indicate to the network that the transport is capable and
willing to participate in ECN for this packet; willing to participate in ECN for this packet.
- MAY not set ECT on any packet (for instance if it has reason to - MAY not set ECT on any packet (for instance if it has reason to
believe such a packet would be blocked); believe such a packet would be blocked).
A TCP Server in AccECN mode: A TCP Server in AccECN mode:
- MUST NOT set ECT on any packet for the rest of the connection, - MUST NOT set ECT on any packet for the rest of the connection,
if it has received or sent at least one valid SYN or Acceptable if it has received or sent at least one valid SYN or Acceptable
SYN/ACK with (AE,CWR,ECE) = (0,0,0) during the handshake. SYN/ACK with (AE,CWR,ECE) = (0,0,0) during the handshake.
This rule solely applies to a Server because, when a Server This rule solely applies to a Server because, when a Server
enters AccECN mode, it doesn't know for sure whether the Client enters AccECN mode, it doesn't know for sure whether the Client
will end up in AccECN mode. But when a Client enters AccECN will end up in AccECN mode. But when a Client enters AccECN
mode, it can be certain that the Server is already in AccECN mode, it can be certain that the Server is already in AccECN
feedback mode. feedback mode.
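The "Sending ECT" rules above can be condensed into a single predicate, sketched below in a non-normative way. The function and parameter names are hypothetical. The sketch covers only the AccECN and Not ECN feedback modes; Classic ECN mode is governed by [RFC3168] and is deliberately left out.

```python
def may_set_ect(mode, is_server, zero_flags_seen_in_handshake):
    """Return True if the host is allowed to set ECT on a packet.

    mode: "AccECN" or "Not ECN".
    is_server: True for the TCP Server, False for the TCP Client.
    zero_flags_seen_in_handshake: True if the host sent or received a
        valid SYN or Acceptable SYN/ACK with (AE,CWR,ECE) = (0,0,0)
        during the handshake.
    """
    if mode == "Not ECN":
        return False                 # MUST NOT set ECT
    if mode != "AccECN":
        raise ValueError("only AccECN and Not ECN modes are modelled")
    if is_server and zero_flags_seen_in_handshake:
        return False                 # Server cannot be sure of the Client's mode
    return True                      # allowed (SHOULD), though never required
```

A True result only means that ECT is permitted; a Data Sender is still free to leave ECT unset, for instance if it suspects such packets would be blocked.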
Congestion response: Congestion response:
* A host in AccECN mode: * A host in AccECN mode:
- is obliged to respond appropriately to AccECN feedback that - is obliged to respond appropriately to AccECN feedback that
indicates there were ECN marks on packets it had previously indicates there were ECN marks on packets it had previously
sent, where 'appropriately' is defined in Section 6.1 of sent, where 'appropriately' is defined in Section 6.1 of
[RFC3168] and updated by Sections 2.1 and 4.1 of [RFC8311]; [RFC3168] and updated by Sections 2.1 and 4.1 of [RFC8311].
- is still obliged to respond appropriately to congestion - is still obliged to respond appropriately to congestion
feedback, even when it is solely sending non-ECN-capable feedback, even when it is solely sending non-ECN-capable
packets (for rationale, some examples and some exceptions see packets (for rationale, some examples and some exceptions see
Sections 3.2.2.3 and 3.2.2.4). Sections 3.2.2.3 and 3.2.2.4).
- is still obliged to respond appropriately to congestion - is still obliged to respond appropriately to congestion
feedback, even if it has sent or received a SYN or SYN/ACK feedback, even if it has sent or received a SYN or SYN/ACK
packet with (AE,CWR,ECE) = (0,0,0) during the handshake; packet with (AE,CWR,ECE) = (0,0,0) during the handshake.
- MUST NOT set CWR to indicate that it has received and responded - MUST NOT set CWR to indicate that it has received and responded
to indications of congestion. to indications of congestion.
For the avoidance of doubt, this is unlike an RFC 3168 data For the avoidance of doubt, this is unlike an RFC 3168 data
sender and this does not preclude the Data Sender from setting sender and this does not preclude the Data Sender from setting
the bits of the ACE counter field, which includes an overloaded the bits of the ACE counter field, which includes an overloaded
use of the same bit. use of the same bit.
Receiving ECT: Receiving ECT:
* A host in AccECN mode: * A host in AccECN mode:
- MUST feed back the information in the IP-ECN field of incoming - MUST feed back the information in the IP ECN field of incoming
packets using Accurate ECN feedback, as specified in packets using Accurate ECN feedback, as specified in
Section 3.2. Section 3.2.
For the avoidance of doubt, this requirement stands even if the For the avoidance of doubt, this requirement stands even if the
AccECN host has also sent or received a SYN or SYN/ACK with AccECN host has also sent or received a SYN or SYN/ACK with
(AE,CWR,ECE) = (0,0,0). Reason: Such a SYN or SYN/ACK implies (AE,CWR,ECE) = (0,0,0). Reason: Such a SYN or SYN/ACK implies
some form of packet mangling might be present. Even if the some form of packet mangling might be present. Even if the
remote peer is not setting ECT, it could still be set remote peer is not setting ECT, it could still be set
erroneously by packet mangling at the IP layer (see erroneously by packet mangling at the IP layer (see
Section 3.2.2.3). In such cases, the Data Sender is best Section 3.2.2.3). In such cases, the Data Sender is best
placed to decide whether ECN markings are valid, but it can placed to decide whether ECN markings are valid, but it can
only do that if the Data Receiver mechanistically feeds back only do that if the Data Receiver mechanistically feeds back
any ECN markings. This approach will not lead to TCP Options any ECN markings. This approach will not lead to TCP Options
being generated unnecessarily if the recommended simple scheme being generated unnecessarily if the recommended simple scheme
in Section 3.2.3.3 is used, because no byte counters will in Section 3.2.3.3 is used, because no byte counters will
change if no packets are set to ECT. change if no packets are set to ECT.
- MUST NOT use reception of packets with ECT set in the IP-ECN - MUST NOT use reception of packets with ECT set in the IP ECN
field as an implicit signal that the peer is ECN-capable. field as an implicit signal that the peer is ECN-capable.
Reason: ECT at the IP layer does not explicitly confirm the Reason: ECT at the IP layer does not explicitly confirm the
peer has the correct ECN feedback logic, because the packets peer has the correct ECN feedback logic, because the packets
could have been mangled at the IP layer. could have been mangled at the IP layer.
3.2. AccECN Feedback 3.2. AccECN Feedback
Each Data Receiver of each half connection maintains four counters, Each Data Receiver of each half-connection maintains four counters,
r.cep, r.ceb, r.e0b, and r.e1b: r.cep, r.ceb, r.e0b, and r.e1b:
* The Data Receiver MUST increment the CE packet counter (r.cep), * The Data Receiver MUST increment the CE packet counter (r.cep),
for every Acceptable packet that it receives with the CE code for every Acceptable packet that it receives with the CE code
point in the IP-ECN field, including CE-marked control packets and point in the IP ECN field, including CE-marked control packets and
retransmissions but excluding CE on SYN packets (SYN=1; ACK=0). retransmissions but excluding CE on SYN packets (SYN=1; ACK=0).
* A Data Receiver that supports sending of AccECN TCP Options MUST * A Data Receiver that supports sending of AccECN TCP Options MUST
increment the r.ceb, r.e0b, or r.e1b byte counters by the number increment the r.ceb, r.e0b, or r.e1b byte counters by the number
of TCP payload octets in Acceptable packets marked with the CE, of TCP payload octets in Acceptable packets marked with the CE,
ECT(0), and ECT(1) codepoint in their IP-ECN field, including any ECT(0), and ECT(1) codepoint in their IP ECN field, including any
payload octets on control packets and retransmissions, but not payload octets on control packets and retransmissions, but not
including any payload octets on SYN packets (SYN=1; ACK=0). including any payload octets on SYN packets (SYN=1; ACK=0).
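The two counting rules above can be sketched as follows. This is a minimal illustration, not part of either revision being compared. The names ReceiverCounters, on_packet and ace_field are hypothetical. The "Acceptable" test is assumed to be decided elsewhere and passed in as a flag. The initial byte counter values are not stated in this excerpt, so the caller supplies them. The sketch does not model the separate rule that limits the increment for CE-marked SYN/ACKs to one.

```python
# IP ECN codepoints as two-bit values (RFC 3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


class ReceiverCounters:
    """Hypothetical holder for r.cep, r.ceb, r.e0b and r.e1b."""

    def __init__(self, ceb, e0b, e1b, cep=5):
        # r.cep starts at 5 according to the text; the byte counter
        # starting values are an input here because this excerpt omits them.
        self.cep, self.ceb, self.e0b, self.e1b = cep, ceb, e0b, e1b


def on_packet(c, ip_ecn, payload_len, syn=False, ack=True, acceptable=True):
    """Update the counters for one arriving packet."""
    if not acceptable:
        return                    # only Acceptable packets are counted
    if syn and not ack:
        return                    # SYN packets (SYN=1; ACK=0) are excluded
    if ip_ecn == CE:
        c.cep += 1                # CE packet counter
        c.ceb += payload_len      # CE byte counter
    elif ip_ecn == ECT0:
        c.e0b += payload_len
    elif ip_ecn == ECT1:
        c.e1b += payload_len


def ace_field(c):
    """ACE carries the 3 least significant bits of r.cep when SYN=0."""
    return c.cep & 0b111
```

A pure ACK carries no payload, so a CE-marked pure ACK increments r.cep while leaving r.ceb unchanged.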
Each Data Sender of each half connection maintains four counters, Each Data Sender of each half-connection maintains four counters,
s.cep, s.ceb, s.e0b, and s.e1b, intended to track the equivalent s.cep, s.ceb, s.e0b, and s.e1b, intended to track the equivalent
counters at the Data Receiver. counters at the Data Receiver.
A Data Receiver feeds back the CE packet counter using the Accurate A Data Receiver feeds back the CE packet counter using the Accurate
ECN (ACE) field, as explained in Section 3.2.2. And it optionally ECN (ACE) field, as explained in Section 3.2.2. And it optionally
feeds back all the byte counters using the AccECN TCP Option, as feeds back all the byte counters using the AccECN TCP Option, as
specified in Section 3.2.3. specified in Section 3.2.3.
Whenever a Data Receiver feeds back the value of any counter, it MUST Whenever a Data Receiver feeds back the value of any counter, it MUST
report the most recent value, no matter whether it is in a pure ACK, report the most recent value, no matter whether it is in a pure ACK,
skipping to change at line 1200 skipping to change at line 1201
Both parts of each of these conditions are equally important. For Both parts of each of these conditions are equally important. For
instance, even if AccECN negotiation has been successful, the ACE instance, even if AccECN negotiation has been successful, the ACE
field is not defined on any segments with SYN=1 (e.g., a field is not defined on any segments with SYN=1 (e.g., a
retransmission of an unacknowledged SYN/ACK, or when both ends send retransmission of an unacknowledged SYN/ACK, or when both ends send
SYN/ACKs after AccECN support has been successfully negotiated during SYN/ACKs after AccECN support has been successfully negotiated during
a simultaneous open). a simultaneous open).
3.2.2.1. ACE Field on the ACK of the SYN/ACK 3.2.2.1. ACE Field on the ACK of the SYN/ACK
A TCP Client (A) in AccECN mode MUST feed back which of the 4 A TCP Client (A) in AccECN mode MUST feed back which of the 4
possible values of the IP-ECN field was on the SYN/ACK by writing it possible values of the IP ECN field was on the SYN/ACK by writing it
into the ACE field of a pure ACK with no SACK blocks using the binary into the ACE field of a pure ACK with no SACK blocks using the binary
encoding in Table 3 (which is the same as that used on the SYN/ACK in encoding in Table 3 (which is the same as that used on the SYN/ACK in
Table 2). This shall be called the handshake encoding of the ACE Table 2). This shall be called the "handshake encoding" of the ACE
field, and it is the only exception to the rule that the ACE field field, and it is the only exception to the rule that the ACE field
carries the 3 least significant bits of the r.cep counter on packets carries the 3 least significant bits of the r.cep counter on packets
with SYN=0. with SYN=0.
Normally, a TCP Client acknowledges a SYN/ACK with an ACK that Normally, a TCP Client acknowledges a SYN/ACK with an ACK that
satisfies the above conditions anyway (SYN=0, no data, no SACK satisfies the above conditions anyway (SYN=0, no data, no SACK
blocks). If an AccECN TCP Client intends to acknowledge the SYN/ACK blocks). If an AccECN TCP Client intends to acknowledge the SYN/ACK
with a packet that does not satisfy these conditions (e.g., it has with a packet that does not satisfy these conditions (e.g., it has
data to include on the ACK), it SHOULD first send a pure ACK that data to include on the ACK), it SHOULD first send a pure ACK that
does satisfy these conditions (see Section 5.2), so that it can feed does satisfy these conditions (see Section 5.2), so that it can feed
back which of the four values of the IP-ECN field arrived on the SYN/ back which of the four values of the IP ECN field arrived on the SYN/
ACK. A valid exception to this "SHOULD" would be where the ACK. A valid exception to this "SHOULD" would be where the
implementation will only be used in an environment where mangling of implementation will only be used in an environment where mangling of
the ECN field is unlikely. the ECN field is unlikely.
The TCP Client MUST also use the handshake encoding for the pure ACK The TCP Client MUST also use the handshake encoding for the pure ACK
of any retransmitted SYN/ACK that confirms that the TCP Server of any retransmitted SYN/ACK that confirms that the TCP Server
supports AccECN. If the final ACK of the handshake does not arrive supports AccECN. If the final ACK of the handshake does not arrive
before its retransmission timer expires, the TCP Server is to follow the before its retransmission timer expires, the TCP Server is to follow the
procedure given in Section 3.1.4.2. procedure given in Section 3.1.4.2.
+==================+================+=====================+ +==================+================+=====================+
| IP-ECN codepoint | ACE on pure | r.cep of TCP Client | | IP ECN Codepoint | ACE on Pure | r.cep of TCP Client |
| on SYN/ACK | ACK of SYN/ACK | in AccECN mode | | on SYN/ACK | ACK of SYN/ACK | in AccECN Mode |
+==================+================+=====================+ +==================+================+=====================+
| Not-ECT | 0b010 | 5 | | Not-ECT | 0b010 | 5 |
+------------------+----------------+---------------------+ +------------------+----------------+---------------------+
| ECT(1) | 0b011 | 5 | | ECT(1) | 0b011 | 5 |
+------------------+----------------+---------------------+ +------------------+----------------+---------------------+
| ECT(0) | 0b100 | 5 | | ECT(0) | 0b100 | 5 |
+------------------+----------------+---------------------+ +------------------+----------------+---------------------+
| CE | 0b110 | 6 | | CE | 0b110 | 6 |
+------------------+----------------+---------------------+ +------------------+----------------+---------------------+
Table 3: The Encoding of the ACE Field in the ACK of Table 3: The Encoding of the ACE Field in the ACK of
the SYN-ACK to Reflect the SYN-ACK's IP-ECN Field the SYN-ACK to Reflect the SYN-ACK's IP ECN Field
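Table 3 can be read as a simple lookup. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the codepoints are represented as strings purely for readability.

```python
# Handshake encoding of the ACE field, copied from Table 3.
HANDSHAKE_ACE = {
    "Not-ECT": 0b010,
    "ECT(1)": 0b011,
    "ECT(0)": 0b100,
    "CE": 0b110,
}


def client_handshake_feedback(ip_ecn_on_synack):
    """Return (ACE for the pure ACK of the SYN/ACK, resulting r.cep)."""
    ace = HANDSHAKE_ACE[ip_ecn_on_synack]
    # Per Table 3, r.cep is 6 only when the SYN/ACK arrived CE-marked.
    r_cep = 6 if ip_ecn_on_synack == "CE" else 5
    return ace, r_cep
```

Note that for a CE-marked SYN/ACK the handshake encoding (0b110) happens to equal the 3 least significant bits of r.cep (6), whereas for the other three codepoints it does not.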
When an AccECN Server in SYN-RCVD state receives a pure ACK with When an AccECN Server in SYN-RCVD state receives a pure ACK with
SYN=0 and no SACK blocks, instead of treating the ACE field as a SYN=0 and no SACK blocks, it MUST infer the meaning of each possible
counter, it MUST infer the meaning of each possible value of the ACE value of the ACE field from Table 4 instead of treating the ACE field
field from Table 4, which also shows the value that an AccECN Server as a counter. As a result, an AccECN Server MUST set s.cep to the
MUST set s.cep to as a result. respective value, also shown in Table 4.
Given this encoding of the ACE field on the ACK of a SYN/ACK is Given this encoding of the ACE field on the ACK of a SYN/ACK is
exceptional, an AccECN Server using large receive offload (LRO) might exceptional, an AccECN Server using large receive offload (LRO) might
prefer to disable LRO until such an ACK has transitioned it out of prefer to disable LRO until the ACK of the SYN/ACK was sent and it
SYN-RCVD state. has transitioned out of SYN-RCVD state.
+============+==========================+=====================+ +============+==========================+=====================+
| ACE on ACK | IP-ECN codepoint on SYN/ | s.cep of TCP Server | | ACE on ACK | IP ECN Codepoint on SYN/ | s.cep of TCP Server |
| of SYN/ACK | ACK inferred by Server | in AccECN mode | | of SYN/ACK | ACK Inferred by Server | in AccECN Mode |
+============+==========================+=====================+ +============+==========================+=====================+
| 0b000 | {Notes 1, 3} | Disable s.cep | | 0b000 | {Notes 1, 3} | Disable s.cep |
+------------+--------------------------+---------------------+ +------------+--------------------------+---------------------+
| 0b001 | {Notes 2, 3} | 5 | | 0b001 | {Notes 2, 3} | 5 |
+------------+--------------------------+---------------------+ +------------+--------------------------+---------------------+
| 0b010 | Not-ECT | 5 | | 0b010 | Not-ECT | 5 |
+------------+--------------------------+---------------------+ +------------+--------------------------+---------------------+
| 0b011 | ECT(1) | 5 | | 0b011 | ECT(1) | 5 |
+------------+--------------------------+---------------------+ +------------+--------------------------+---------------------+
| 0b100 | ECT(0) | 5 | | 0b100 | ECT(0) | 5 |
skipping to change at line 1291 skipping to change at line 1292
AccECN feedback. Nonetheless, as a Data Receiver, it MUST AccECN feedback. Nonetheless, as a Data Receiver, it MUST
NOT disable AccECN feedback. NOT disable AccECN feedback.
Any of the circumstances below could cause a value of zero Any of the circumstances below could cause a value of zero
but, whatever the cause, the actions above would be the but, whatever the cause, the actions above would be the
appropriate response: appropriate response:
* The TCP Client has somehow entered No ECN feedback mode * The TCP Client has somehow entered No ECN feedback mode
(most likely if the Server received a SYN or sent a SYN/ (most likely if the Server received a SYN or sent a SYN/
ACK with (AE,CWR,ECE) = (0,0,0) after entering AccECN ACK with (AE,CWR,ECE) = (0,0,0) after entering AccECN
mode, but possible even if it didn't); mode, but possible even if it didn't).
* The TCP Client genuinely might be in AccECN mode, but its * The TCP Client genuinely might be in AccECN mode, but its
count of received CE marks might have caused the ACE count of received CE marks might have caused the ACE
field to wrap to zero. This is highly unlikely, but not field to wrap to zero. This is highly unlikely, but not
impossible because the Server might have already sent impossible because the Server might have already sent
multiple packets while still in SYN-RCVD state, e.g., multiple packets while still in SYN-RCVD state, e.g.,
using TFO (see Section 5.2), and some might have been CE- using TFO (see Section 5.2), and some might have been CE-
marked. Then ACE on the first ACK seen by the Server marked. Then ACE on the first ACK seen by the Server
might be zero, due to previous ACKs experiencing an might be zero, due to previous ACKs experiencing an
unfortunate pattern of loss or delay. unfortunate pattern of loss or delay.
skipping to change at line 1354 skipping to change at line 1355
* It then follows the safety procedures in Section 3.2.2.5.2 to * It then follows the safety procedures in Section 3.2.2.5.2 to
calculate or estimate how many packets the ACK could have calculate or estimate how many packets the ACK could have
acknowledged under the prevailing conditions to determine whether acknowledged under the prevailing conditions to determine whether
the ACE field might have wrapped more than once. the ACE field might have wrapped more than once.
The encode/decode procedures during the three-way handshake are The encode/decode procedures during the three-way handshake are
exceptions to the general rules given so far, so they are spelled out exceptions to the general rules given so far, so they are spelled out
step by step below for clarity: step by step below for clarity:
* If a TCP Server in AccECN mode receives a CE mark in the IP-ECN * If a TCP Server in AccECN mode receives a CE mark in the IP ECN
field of a SYN (SYN=1, ACK=0), it MUST NOT increment r.cep (it field of a SYN (SYN=1, ACK=0), it MUST NOT increment r.cep (it
remains at its initial value of 5). remains at its initial value of 5).
Reason: It would be redundant for the Server to include CE-marked Reason: It would be redundant for the Server to include CE-marked
SYNs in its r.cep counter, because it already reliably delivers SYNs in its r.cep counter, because it already reliably delivers
feedback of any CE marking using the encoding in the top block of feedback of any CE marking using the encoding in the top block of
Table 2 in the SYN/ACK. This also ensures that, when the Server Table 2 in the SYN/ACK. This also ensures that, when the Server
starts using the ACE field, it has not unnecessarily consumed more starts using the ACE field, it has not unnecessarily consumed more
than one initial value, given they can be used to negotiate than one initial value, given they can be used to negotiate
variants of the AccECN protocol (see Appendix B.3). variants of the AccECN protocol (see Appendix B.3).
* If a TCP Client in AccECN mode receives CE feedback in the TCP * If a TCP Client in AccECN mode receives CE feedback in the TCP
flags of a SYN/ACK, it MUST NOT increment s.cep (it remains at its flags of a SYN/ACK, it MUST NOT increment s.cep (it remains at its
initial value of 5) so that it stays in step with r.cep on the initial value of 5) so that it stays in step with r.cep on the
Server. Nonetheless, the TCP Client still triggers the congestion Server. Nonetheless, the TCP Client still triggers the congestion
control actions necessary to respond to the CE feedback. control actions necessary to respond to the CE feedback.
* If a TCP Client in AccECN mode receives a CE mark in the IP-ECN * If a TCP Client in AccECN mode receives a CE mark in the IP ECN
field of a SYN/ACK, it MUST increment r.cep, but no more than once field of a SYN/ACK, it MUST increment r.cep, but no more than once
no matter how many CE-marked SYN/ACKs it receives (i.e., no matter how many CE-marked SYN/ACKs it receives (i.e.,
incremented from 5 to 6, but no further). incremented from 5 to 6, but no further).
Reason: Incrementing r.cep ensures the Client will eventually Reason: Incrementing r.cep ensures the Client will eventually
deliver any CE marking to the Server reliably when it starts using deliver any CE marking to the Server reliably when it starts using
the ACE field. Even though the Client also feeds back any CE the ACE field. Even though the Client also feeds back any CE
marking on the ACK of the SYN/ACK using the encoding in Table 3, marking on the ACK of the SYN/ACK using the encoding in Table 3,
this ACK is not delivered reliably, so it can be considered as a this ACK is not delivered reliably, so it can be considered as a
timely notification that is redundant but unreliable. The Client timely notification that is redundant but unreliable. The Client
skipping to change at line 1417 skipping to change at line 1418
ACK of the SYN/ACK) that is delayed for longer than the Server's ACK of the SYN/ACK) that is delayed for longer than the Server's
retransmission timeout; or packet duplication by the network. And retransmission timeout; or packet duplication by the network. And
the impact of any error in the feedback on such ACKs will only be the impact of any error in the feedback on such ACKs will only be
temporary. temporary.
3.2.2.3. Testing for Mangling of the IP/ECN Field 3.2.2.3. Testing for Mangling of the IP/ECN Field
* TCP Client side: * TCP Client side:
The value of the TCP-ECN flags on the SYN/ACK indicates the value The value of the TCP-ECN flags on the SYN/ACK indicates the value
of the IP-ECN field when the SYN arrived at the Server. The TCP of the IP ECN field when the SYN arrived at the Server. The TCP
Client can compare this with how it originally set the IP-ECN Client can compare this with how it originally set the IP ECN
field on the SYN. If this comparison implies an invalid field on the SYN. If this comparison implies an invalid
transition (defined below) of the IP-ECN field, for the remainder transition (defined below) of the IP ECN field, for the remainder
of the half-connection the Client is advised to send non-ECN- of the half-connection the Client is advised to send non-ECN-
capable packets, but it still ought to respond to any feedback of capable packets, but it still ought to respond to any feedback of
CE markings (explained below). However, the TCP Client MUST CE markings (explained below). However, the TCP Client MUST
remain in the AccECN feedback mode and it MUST continue to feed remain in the AccECN feedback mode and it MUST continue to feed
back any ECN markings on arriving packets (in its role as Data back any ECN markings on arriving packets (in its role as Data
Receiver). Receiver).
* TCP Server side: * TCP Server side:
The value of the ACE field on the last ACK of the three-way The value of the ACE field on the last ACK of the three-way
handshake indicates the value of the IP-ECN field when the SYN/ACK handshake indicates the value of the IP ECN field when the SYN/ACK
arrived at the TCP Client. The Server can compare this with how arrived at the TCP Client. The Server can compare this with how
it originally set the IP-ECN field on the SYN/ACK. If this it originally set the IP ECN field on the SYN/ACK. If this
comparison implies an invalid transition of the IP-ECN field, for comparison implies an invalid transition of the IP ECN field, for
the remainder of the half-connection the Server is advised to send the remainder of the half-connection the Server is advised to send
non-ECN-capable packets, but it still ought to respond to any non-ECN-capable packets, but it still ought to respond to any
feedback of CE markings (explained below). However, the Server feedback of CE markings (explained below). However, the Server
MUST remain in the AccECN feedback mode and it MUST continue to MUST remain in the AccECN feedback mode and it MUST continue to
feed back any ECN markings on arriving packets (in its role as feed back any ECN markings on arriving packets (in its role as
Data Receiver). Data Receiver).
If a Data Sender in AccECN mode starts sending non-ECN-capable If a Data Sender in AccECN mode starts sending non-ECN-capable
packets because it has detected mangling, it is still advised to packets because it has detected mangling, it is still advised to
respond to CE feedback. Reason: Any CE marking arriving at the Data respond to CE feedback. Reason: Any CE marking arriving at the Data
Receiver could be due to something early in the path mangling the Receiver could be due to something early in the path mangling the
non-ECN-capable IP-ECN field into an ECN-capable codepoint and then, non-ECN-capable IP ECN field into an ECN-capable codepoint and then,
later in the path, a network bottleneck might be applying CE markings later in the path, a network bottleneck might be applying CE markings
to indicate genuine congestion. This argument applies whether the to indicate genuine congestion. This argument applies whether the
handshake packet originally sent by the TCP Client or Server was non- handshake packet originally sent by the TCP Client or Server was non-
ECN-capable or ECN-capable because, in either case, an unsafe ECN-capable or ECN-capable because, in either case, an unsafe
transition could imply that non-ECN-capable packets later in the transition could imply that non-ECN-capable packets later in the
connection might get mangled. connection might get mangled.
Once a Data Sender has entered AccECN mode it is advised to check Once a Data Sender has entered AccECN mode it is advised to check
whether it is receiving continuous feedback of CE. Specifying whether it is receiving continuous feedback of CE. Specifying
exactly how to do this is beyond the scope of the present exactly how to do this is beyond the scope of the present
skipping to change at line 1483 skipping to change at line 1484
As always, once a host has entered AccECN mode, it follows the As always, once a host has entered AccECN mode, it follows the
general mandatory requirements (Section 3.1.5) to remain in the same general mandatory requirements (Section 3.1.5) to remain in the same
feedback mode and to continue feeding back any ECN markings on feedback mode and to continue feeding back any ECN markings on
arriving packets using AccECN feedback. This follows the general arriving packets using AccECN feedback. This follows the general
approach where an AccECN Data Receiver mechanistically reflects approach where an AccECN Data Receiver mechanistically reflects
whatever it receives (Section 2.5). whatever it receives (Section 2.5).
The ACK of the SYN/ACK is not reliably delivered (nonetheless, the The ACK of the SYN/ACK is not reliably delivered (nonetheless, the
count of CE marks is still eventually delivered reliably). If this count of CE marks is still eventually delivered reliably). If this
ACK does not arrive, the Server is advised to continue to send ECN- ACK does not arrive, the Server is advised to continue to send ECN-
capable packets without having tested for mangling of the IP-ECN capable packets without having tested for mangling of the IP ECN
field on the SYN/ACK. field on the SYN/ACK.
All the fall-back behaviours in this section are necessary in case All the fall-back behaviours in this section are necessary in case
mangling of the IP-ECN field is asymmetric, which is currently common mangling of the IP ECN field is asymmetric, which is currently common
over some mobile networks [Mandalari18]. In this case, one end might over some mobile networks [Mandalari18]. In this case, one end might
see no unsafe transition and continue sending ECN-capable packets, see no unsafe transition and continue sending ECN-capable packets,
while the other end sees an unsafe transition and stops sending ECN- while the other end sees an unsafe transition and stops sending ECN-
capable packets. capable packets.
Invalid transitions of the IP-ECN field are defined in Section 18 of Invalid transitions of the IP ECN field are defined in Section 18 of
the Classic ECN specification [RFC3168] and repeated here for the Classic ECN specification [RFC3168] and repeated here for
convenience: convenience:
* the Not-ECT codepoint changes; * the Not-ECT codepoint changes.
* either ECT codepoint transitions to Not-ECT; * either ECT codepoint transitions to Not-ECT.
* the CE codepoint changes. * the CE codepoint changes.
RFC 3168 says that a router that changes ECT to Not-ECT is invalid RFC 3168 says that a router that changes ECT to Not-ECT is invalid
but safe. However, from a host's viewpoint, this transition is but safe. However, from a host's viewpoint, this transition is
unsafe because it could be the result of two transitions at different unsafe because it could be the result of two transitions at different
routers on the path: ECT to CE (safe) then CE to Not-ECT (unsafe). routers on the path: ECT to CE (safe) then CE to Not-ECT (unsafe).
This scenario could well happen where an ECN-enabled home router This scenario could well happen where an ECN-enabled home router
congests its upstream mobile broadband bottleneck link, then the congests its upstream mobile broadband bottleneck link, then the
ingress to the mobile network clears the ECN field [Mandalari18]. ingress to the mobile network clears the ECN field [Mandalari18].
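The three invalid transitions listed above can be expressed as a small predicate. This is a sketch under the assumption that a host knows the codepoint it sent and the codepoint its peer reported as received; the function name and the string representation of codepoints are hypothetical.

```python
def is_invalid_transition(sent, received):
    """True if the change from 'sent' to 'received' is an invalid transition.

    Both arguments are one of "Not-ECT", "ECT(0)", "ECT(1)" or "CE".
    """
    if sent == "Not-ECT":
        return received != "Not-ECT"   # the Not-ECT codepoint changes
    if sent == "CE":
        return received != "CE"        # the CE codepoint changes
    # sent is ECT(0) or ECT(1): only a transition to Not-ECT is listed
    return received == "Not-ECT"
```

Under this predicate ECT to CE is accepted as ordinary congestion marking, while ECT to Not-ECT is flagged, matching the host's-viewpoint argument in the preceding paragraph.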
skipping to change at line 1531 skipping to change at line 1532
If AccECN has been successfully negotiated, the Data Sender MAY check If AccECN has been successfully negotiated, the Data Sender MAY check
the value of the ACE counter in the first feedback packet (with or the value of the ACE counter in the first feedback packet (with or
without data) that arrives after the three-way handshake. If the without data) that arrives after the three-way handshake. If the
value of this ACE field is found to be zero (0b000), for the value of this ACE field is found to be zero (0b000), for the
remainder of the half-connection the Data Sender ought to send non- remainder of the half-connection the Data Sender ought to send non-
ECN-capable packets and it is advised not to respond to any feedback ECN-capable packets and it is advised not to respond to any feedback
of CE markings. of CE markings.
Reason: the symptoms imply any or all of the following: Reason: the symptoms imply any or all of the following:
* the remote peer has somehow entered Not ECN feedback mode; * the remote peer has somehow entered Not ECN feedback mode.
* a broken remote TCP implementation; * a broken remote TCP implementation.
* potential mangling of the ECN fields in the TCP headers (although * potential mangling of the ECN fields in the TCP headers (although
unlikely given they clearly survived during the handshake). unlikely given they clearly survived during the handshake).
This advice is not stated normatively (in capitals), because the best This advice is not stated normatively (in capitals), because the best
strategy might depend on experience of the most likely scenarios, strategy might depend on the likelihood to experience these
which can only be known at the time of deployment. scenarios, which can only be known at the time of deployment.
Note that a host in AccECN mode MUST continue to provide Accurate ECN Note that a host in AccECN mode MUST continue to provide Accurate ECN
feedback to its peer, even if it is no longer sending ECT itself over feedback to its peer, even if it is no longer sending ECT itself over
the other half connection. the other half-connection.
If reordering occurs, the first feedback packet that arrives will not If reordering occurs, the first feedback packet that arrives will not
necessarily be the same as the first packet in sequence order. The necessarily be the same as the first packet in sequence order. The
test has been specified loosely like this to simplify implementation, test has been specified loosely like this to simplify implementation,
and because it would not have been any more precise to have specified and because it would not have been any more precise to have specified
the first packet in sequence order, which would not necessarily be the first packet in sequence order, which would not necessarily be
the first ACE counter that the Data Receiver fed back anyway, given the first ACE counter that the Data Receiver fed back anyway, given
it might have been a retransmission. it might have been a retransmission.
The possibility of reordering means that there is a small chance that The possibility of reordering means that there is a small chance that
the ACE field on the first packet to arrive is genuinely zero the ACE field on the first packet to arrive is genuinely zero
(without middlebox interference). This would cause a host to (without middlebox interference). This would cause a host to
unnecessarily disable ECN for a half connection. Therefore, in unnecessarily disable ECN for a half-connection. Therefore, in
environments where there is no evidence of the ACE field being environments where there is no evidence of the ACE field being
zeroed, implementations MAY skip this test. zeroed, implementations MAY skip this test.
Note that the Data Sender MUST NOT test whether the arriving counter Note that the Data Sender MUST NOT test whether the arriving counter
in the initial ACE field has been initialized to a specific valid in the initial ACE field has been initialized to a specific valid
value -- the above check solely tests whether the ACE fields have value -- the above check solely tests whether the ACE fields have
been incorrectly zeroed. This allows hosts to use different initial been incorrectly zeroed. This allows hosts to use different initial
values as an additional signalling channel in the future. values as an additional signalling channel in the future.
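The optional zeroing test described above might be sketched as follows. The function name, the skip_test flag and the returned pair are hypothetical; the sketch only reflects the advice that a zero ACE field on the first feedback packet leads to sending non-ECN-capable packets and not responding to CE feedback, and that no other specific value is tested.

```python
def check_first_feedback(ace, skip_test=False):
    """Return (send_ecn_capable, respond_to_ce_feedback) for the Data Sender.

    'ace' is the 3-bit ACE field of the first feedback packet to arrive
    after the three-way handshake.
    """
    if skip_test:
        # Implementations MAY skip the test where zeroing is not observed.
        return True, True
    if (ace & 0b111) == 0b000:
        # ACE appears to have been zeroed.
        return False, False
    # Any non-zero value passes; no specific initial value is checked.
    return True, True
```

Whatever this returns, the host's own role as a Data Receiver is unaffected: it continues to provide AccECN feedback to its peer.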
3.2.2.5. Safety Against Ambiguity of the ACE Field 3.2.2.5. Safety Against Ambiguity of the ACE Field
skipping to change at line 1585 skipping to change at line 1586
The following rules define when the receiver of a packet in AccECN The following rules define when the receiver of a packet in AccECN
mode emits an ACK: mode emits an ACK:
Change-Triggered ACKs: An AccECN Data Receiver SHOULD emit an ACK Change-Triggered ACKs: An AccECN Data Receiver SHOULD emit an ACK
whenever a data packet marked CE arrives after the previous packet whenever a data packet marked CE arrives after the previous packet
was not CE. was not CE.
Even though this rule is stated as a "SHOULD", it is important for Even though this rule is stated as a "SHOULD", it is important for
a transition to trigger an ACK if at all possible. The only valid a transition to trigger an ACK if at all possible. The only valid
exception to this rule is given below these bullets. exception to this rule is due to large receive offload (LRO) or
generic receive offload (GRO) as further described below.
For the avoidance of doubt, this rule is deliberately worded to For the avoidance of doubt, this rule is deliberately worded to
apply solely when _data_ packets arrive, but the comparison with apply solely when _data_ packets arrive, but the comparison with
the previous packet includes any packet, not just data packets. the previous packet includes any packet, not just data packets.
Increment-Triggered ACKs: An AccECN receiver of a packet MUST emit Increment-Triggered ACKs: An AccECN receiver of a packet MUST emit
an ACK if 'n' CE marks have arrived since the previous ACK. If an ACK if 'n' CE marks have arrived since the previous ACK. If
there is unacknowledged data at the receiver, 'n' SHOULD be 2. If there is unacknowledged data at the receiver, 'n' SHOULD be 2. If
there is no unacknowledged data at the receiver, 'n' SHOULD be 3 there is no unacknowledged data at the receiver, 'n' SHOULD be 3
and MUST be no less than 3. In either case, 'n' MUST be no and MUST be no less than 3. In either case, 'n' MUST be no
skipping to change at line 1723 skipping to change at line 1725
Figure 4 shows two option field orders; order 0 and order 1. They Figure 4 shows two option field orders; order 0 and order 1. They
both consist of three 24-bit fields. Order 0 provides the 24 least both consist of three 24-bit fields. Order 0 provides the 24 least
significant bits of the r.e0b, r.ceb, and r.e1b counters, significant bits of the r.e0b, r.ceb, and r.e1b counters,
respectively. Order 1 provides the same fields, but in the opposite respectively. Order 1 provides the same fields, but in the opposite
order. On each packet, the Data Receiver can use whichever order is order. On each packet, the Data Receiver can use whichever order is
more efficient. In either case, the bytes within the fields are in more efficient. In either case, the bytes within the fields are in
network byte order (big-endian). network byte order (big-endian).
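
The two field orders described above can be illustrated with a minimal sketch. It builds only the three 24-bit fields, not the Kind and Length octets, and the function name pack_accecn_fields is hypothetical and not taken from the specification.

```python
def pack_accecn_fields(order, e0b, ceb, e1b):
    """Pack the r.e0b, r.ceb and r.e1b counters as three 24-bit fields.

    order 0 emits EE0B, ECEB, EE1B; order 1 emits EE1B, ECEB, EE0B.
    The Kind and Length octets of the option are deliberately left out.
    """
    if order not in (0, 1):
        raise ValueError("order must be 0 or 1")
    counters = [e0b, ceb, e1b] if order == 0 else [e1b, ceb, e0b]
    # Keep the 24 least significant bits of each counter and emit them
    # in network byte order (big-endian).
    return b"".join((c & 0xFFFFFF).to_bytes(3, "big") for c in counters)
```

For instance, pack_accecn_fields(1, 1, 2, 3) places the r.e1b value first and the r.e0b value last.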
The choice to use three-byte (24-bit) fields in the options was The choice to use three-byte (24-bit) fields in the options was
made to strike a balance between TCP option space usage, and the made to strike a balance between TCP Option space usage, and the
required fidelity of the counters to accommodate typical scenarios required fidelity of the counters to accommodate typical scenarios
such as hardware TCP Segmentation Offloading (TSO), and periods such as hardware TCP Segmentation Offloading (TSO), and periods
during which no option may be transmitted (e.g., SACK loss recovery). during which no option may be transmitted (e.g., SACK loss recovery).
Providing only 2 bytes (16 bits) for these counters could easily roll Providing only 2 bytes (16 bits) for these counters could easily roll
over within a single TSO transmission or large/generic receive over within a single TSO transmission or large/generic receive
offload (LRO/GRO) event. Having two distinct orderings further offload (LRO/GRO) event. Having two distinct orderings further
allows the transmission of the most pertinent changes in an allows the transmission of the most pertinent changes in an
abbreviated option (see below). abbreviated option (see below).
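
The roll-over concern above can be made concrete with a small sketch of how a reader of the option might recover a byte-counter increase from two successive truncated fields. It assumes the counter wrapped at most once between the two samples; the name counter_delta and the bits parameter are hypothetical.

```python
FIELD_BITS = 24  # width of each counter field in the option


def counter_delta(prev_field, new_field, bits=FIELD_BITS):
    """Return the increase between two truncated counter samples.

    Modular subtraction gives the true increase only if the real
    increase was smaller than 2**bits (at most one wrap).
    """
    return (new_field - prev_field) % (1 << bits)


# A 70000-byte burst between two options is ambiguous with a 16-bit
# field (it is seen as 70000 - 65536 bytes) but not with a 24-bit field.
burst = 70000
seen_with_16_bits = counter_delta(0, burst % (1 << 16), bits=16)
seen_with_24_bits = counter_delta(0, burst % (1 << 24))
```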
When a Data Receiver sends an AccECN Option, it MUST set the Kind When a Data Receiver sends an AccECN Option, it MUST set the Kind
skipping to change at line 1862 skipping to change at line 1864
AccECN Options. To expedite connection setup in deployment scenarios AccECN Options. To expedite connection setup in deployment scenarios
where AccECN path traversal might be problematic, the TCP Server where AccECN path traversal might be problematic, the TCP Server
SHOULD retransmit the SYN/ACK, but with no AccECN Option. If this SHOULD retransmit the SYN/ACK, but with no AccECN Option. If this
retransmission times out, to expedite connection setup, the TCP retransmission times out, to expedite connection setup, the TCP
Server SHOULD retransmit the SYN/ACK with (AE,CWR,ECE) = (0,0,0) and Server SHOULD retransmit the SYN/ACK with (AE,CWR,ECE) = (0,0,0) and
no AccECN Option, but it remains in AccECN feedback mode (per no AccECN Option, but it remains in AccECN feedback mode (per
Section 3.1.5). Section 3.1.5).
| Note that a retransmitted AccECN SYN/ACK will not necessarily | Note that a retransmitted AccECN SYN/ACK will not necessarily
| have the same TCP-ECN flags as the original SYN/ACK, because it | have the same TCP-ECN flags as the original SYN/ACK, because it
| feeds back the IP-ECN field of the latest SYN to have arrived | feeds back the IP ECN field of the latest SYN to have arrived
| (by the rule in Section 3.1.5). | (by the rule in Section 3.1.5).
The above fall-back approach limits any interference by middleboxes The above fall-back approach limits any interference by middleboxes
that might drop packets with unknown options, even though it is more that might drop packets with unknown options, even though it is more
likely that SYN/ACK loss is due to congestion. The TCP Server MAY likely that SYN/ACK loss is due to congestion. The TCP Server MAY
try to send another packet with an AccECN Option at a later point try to send another packet with an AccECN Option at a later point
during the connection but it ought to monitor if that packet got lost during the connection but it ought to monitor if that packet got lost
as well, in which case it SHOULD disable the sending of AccECN as well, in which case it SHOULD disable the sending of AccECN
Options for this half-connection. Options for this half-connection.
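
The SYN/ACK fall-back sequence above can be summarized as a sketch. It assumes the original SYN/ACK carried an AccECN Option; SynAckPlan and plan_synack are hypothetical names, and the sketch models only the SHOULD-level recommendations, not timers or the later probing that the Server MAY attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynAckPlan:
    include_accecn_option: bool
    clear_ecn_flags: bool        # True means (AE,CWR,ECE) = (0,0,0)
    accecn_feedback_mode: bool   # the Server stays in AccECN mode throughout


def plan_synack(timeouts):
    """Return what to send after the given number of SYN/ACK timeouts."""
    if timeouts < 0:
        raise ValueError("timeouts must be non-negative")
    if timeouts == 0:
        # Assumption: the first SYN/ACK includes an AccECN Option.
        return SynAckPlan(True, False, True)
    if timeouts == 1:
        # First retransmission: drop only the AccECN Option. The TCP-ECN
        # flags still feed back the IP-ECN field of the latest SYN.
        return SynAckPlan(False, False, True)
    # Later retransmissions: also clear the three flags, but remain in
    # AccECN feedback mode.
    return SynAckPlan(False, True, True)
```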
skipping to change at line 1922 skipping to change at line 1924
packets carried an AccECN Option and disable the sending of AccECN packets carried an AccECN Option and disable the sending of AccECN
Options if the loss probability of those packets is significantly Options if the loss probability of those packets is significantly
higher than that of all other data packets in the same connection. higher than that of all other data packets in the same connection.
3.2.3.2.3. Testing for Absence of the AccECN Option 3.2.3.2.3. Testing for Absence of the AccECN Option
If the TCP Client has successfully negotiated AccECN but does not If the TCP Client has successfully negotiated AccECN but does not
receive an AccECN Option on the SYN/ACK (e.g., because it has been receive an AccECN Option on the SYN/ACK (e.g., because it has been
stripped by a middlebox or not sent by the Server), the Client stripped by a middlebox or not sent by the Server), the Client
switches into a mode that assumes that the AccECN Option is not switches into a mode that assumes that the AccECN Option is not
available for this half connection. available for this half-connection.
Similarly, if the TCP Server has successfully negotiated AccECN but Similarly, if the TCP Server has successfully negotiated AccECN but
does not receive an AccECN Option on the first segment that does not receive an AccECN Option on the first segment that
acknowledges sequence space at least covering the ISN, it switches acknowledges sequence space at least covering the ISN, it switches
into a mode that assumes that the AccECN Option is not available for into a mode that assumes that the AccECN Option is not available for
this half connection. this half-connection.
While a host is in this mode that assumes incoming AccECN Options are While a host is in this mode that assumes incoming AccECN Options are
not available, it MUST adopt the conservative interpretation of the not available, it MUST adopt the conservative interpretation of the
ACE field discussed in Section 3.2.2.5. However, it cannot make any ACE field discussed in Section 3.2.2.5. However, it cannot make any
assumption about support of outgoing AccECN Options on the other half assumption about support of outgoing AccECN Options on the other
connection, so it SHOULD continue to send AccECN Options itself half-connection, so it SHOULD continue to send AccECN Options itself
(unless it has established that sending AccECN Options is causing (unless it has established that sending AccECN Options is causing
packets to be blocked as in Section 3.2.3.2.2). packets to be blocked as in Section 3.2.3.2.2).
If a host is in the mode that assumes incoming AccECN Options are not If a host is in the mode that assumes incoming AccECN Options are not
available, but it receives an AccECN Option at any later point during available, but it receives an AccECN Option at any later point during
the connection, this clearly indicates that AccECN Options are no the connection, this clearly indicates that AccECN Options are no
longer blocked on the respective path, and the AccECN endpoint MAY longer blocked on the respective path, and the AccECN endpoint MAY
switch out of the mode that assumes AccECN Options are not available switch out of the mode that assumes AccECN Options are not available
for this half connection. for this half-connection.
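
The mode transitions described in this section can be sketched as a small per-half-connection tracker. The class and method names are hypothetical, and the sketch chooses to exercise the MAY that allows leaving the mode as soon as any AccECN Option arrives.

```python
class IncomingOptionTracker:
    """Tracks whether incoming AccECN Options are assumed unavailable."""

    def __init__(self):
        self.assume_unavailable = False
        self.keep_sending = True  # outgoing options are decided separately

    def on_first_feedback(self, has_accecn_option):
        # Client: the SYN/ACK. Server: the first segment acknowledging
        # sequence space that at least covers the ISN.
        if not has_accecn_option:
            self.assume_unavailable = True

    def on_later_segment(self, has_accecn_option):
        # Receiving an option later shows the path no longer blocks it.
        if self.assume_unavailable and has_accecn_option:
            self.assume_unavailable = False

    def on_outgoing_blocked(self):
        # Only evidence that sending options causes loss stops sending.
        self.keep_sending = False

    def ace_interpretation(self):
        # The conservative interpretation is mandatory while in the mode.
        return "conservative" if self.assume_unavailable else "normal"
```

Note that entering the mode does not change keep_sending, which reflects the rule that a host makes no assumption about the other half-connection.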
3.2.3.2.4. Test for Zeroing of the AccECN Option 3.2.3.2.4. Test for Zeroing of the AccECN Option
For a related test for invalid initialization of the ACE field, see For a related test for invalid initialization of the ACE field, see
Section 3.2.2.4. Section 3.2.2.4.
Section 3.2.1 required the Data Receiver to initialize the r.e0b and Section 3.2.1 required the Data Receiver to initialize the r.e0b and
r.e1b counters to a non-zero value. Therefore, in either direction r.e1b counters to a non-zero value. Therefore, in either direction
the initial value of the EE0B field or EE1B field in an AccECN Option the initial value of the EE0B field or EE1B field in an AccECN Option
(if one exists) ought to be non-zero. If AccECN has been negotiated: (if one exists) ought to be non-zero. If AccECN has been negotiated:
* the TCP Server MAY check that the initial value of the EE0B field * the TCP Server MAY check that the initial value of the EE0B field
or the EE1B field is non-zero in the first segment that or the EE1B field is non-zero in the first segment that
acknowledges sequence space that at least covers the ISN plus 1. acknowledges sequence space that at least covers the ISN plus 1.
If it runs a test and either initial value is zero, the Server If it runs a test and either initial value is zero, the Server
will switch into a mode that ignores AccECN Options for this half will switch into a mode that ignores AccECN Options for this half-
connection. connection.
* the TCP Client MAY check that the initial value of the EE0B field * the TCP Client MAY check that the initial value of the EE0B field
or the EE1B field is non-zero on the SYN/ACK. If it runs a test or the EE1B field is non-zero on the SYN/ACK. If it runs a test
and either initial value is zero, the Client will switch into a and either initial value is zero, the Client will switch into a
mode that ignores AccECN Options for this half connection. mode that ignores AccECN Options for this half-connection.
While a host is in the mode that ignores AccECN Options, it MUST While a host is in the mode that ignores AccECN Options, it MUST
adopt the conservative interpretation of the ACE field discussed in adopt the conservative interpretation of the ACE field discussed in
Section 3.2.2.5. Section 3.2.2.5.
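
The optional zeroing test above can be sketched as follows. The function name is hypothetical, and a field that is absent from the option is represented by None.

```python
from typing import Optional


def option_looks_zeroed(ee0b: Optional[int], ee1b: Optional[int]) -> bool:
    """Return True if any initial EE0B or EE1B field that is present is zero.

    Only zero is tested. The fields are never compared against specific
    initial values, and an option without these fields passes trivially.
    """
    present = [value for value in (ee0b, ee1b) if value is not None]
    return any(value == 0 for value in present)
```

A host that runs this check and gets True would switch into the mode that ignores AccECN Options for the half-connection and fall back to the conservative interpretation of the ACE field.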
Note that the Data Sender MUST NOT test whether the arriving byte Note that the Data Sender MUST NOT test whether the arriving byte
counters in an initial AccECN Option have been initialized to counters in an initial AccECN Option have been initialized to
specific valid values -- the above checks solely test whether these specific valid values -- the above checks solely test whether these
fields have been incorrectly zeroed. This allows hosts to use fields have been incorrectly zeroed. This allows hosts to use
different initial values as an additional signalling channel in the different initial values as an additional signalling channel in the
skipping to change at line 2006 skipping to change at line 2008
could also occur if a middlebox mangled an AccECN Option but not the could also occur if a middlebox mangled an AccECN Option but not the
ACE field. However, the Data Sender has to assume that the integrity ACE field. However, the Data Sender has to assume that the integrity
of AccECN Options is sound, based on the above test of the well-known of AccECN Options is sound, based on the above test of the well-known
initial values and optionally other integrity tests (Section 5.3). initial values and optionally other integrity tests (Section 5.3).
If either endpoint detects that the s.ceb counter has increased but If either endpoint detects that the s.ceb counter has increased but
the s.cep has not (and by testing ACK coverage it is certain how much the s.cep has not (and by testing ACK coverage it is certain how much
the ACE field has wrapped), and if there is no explanation other than the ACE field has wrapped), and if there is no explanation other than
an invalid protocol transition due to some form of feedback mangling, an invalid protocol transition due to some form of feedback mangling,
the Data Sender MUST disable sending ECN-capable packets for the the Data Sender MUST disable sending ECN-capable packets for the
remainder of the half-connection by setting the IP-ECN field in all remainder of the half-connection by setting the IP ECN field in all
subsequent packets to Not-ECT. subsequent packets to Not-ECT.
3.2.3.3. Usage of the AccECN TCP Option 3.2.3.3. Usage of the AccECN TCP Option
If a Data Receiver in AccECN mode intends to use AccECN TCP Options If a Data Receiver in AccECN mode intends to use AccECN TCP Options
to provide feedback, the rules below determine when to include an to provide feedback, the rules below determine when to include an
AccECN TCP Option, and which fields to include, given other options AccECN TCP Option, and which fields to include, given other options
might be competing for limited option space: might be competing for limited option space:
Importance of Congestion Control: AccECN is for congestion control, Importance of Congestion Control: AccECN is for congestion control,
which implementations SHOULD generally prioritize over other TCP which implementations SHOULD generally prioritize over other TCP
options when there is insufficient space for all the options in Options when there is insufficient space for all the options in
use. use.
If SACK has been negotiated [RFC2018], and the smallest If SACK has been negotiated [RFC2018], and the smallest
recommended AccECN Option would leave insufficient space for two recommended AccECN Option would leave insufficient space for two
SACK blocks on a particular ACK, the Data Receiver MUST give SACK blocks on a particular ACK, the Data Receiver MUST give
precedence to the SACK option (total 18 octets), because loss precedence to the SACK option (total 18 octets), because loss
feedback is more critical. feedback is more critical.
Recommended Simple Scheme: The Data Receiver SHOULD include an Recommended Simple Scheme: The Data Receiver SHOULD include an
AccECN TCP Option on every scheduled ACK if any byte counter has AccECN TCP Option on every scheduled ACK if any byte counter has
skipping to change at line 2040 skipping to change at line 2042
include a field for every byte counter that has changed at some include a field for every byte counter that has changed at some
time during the connection (see examples later). time during the connection (see examples later).
A scheduled ACK means an ACK that the Data Receiver would send by A scheduled ACK means an ACK that the Data Receiver would send by
its regular delayed ACK rules. Recall that Section 1.3 defines an its regular delayed ACK rules. Recall that Section 1.3 defines an
'ACK' as either with data payload or without. But the above rule 'ACK' as either with data payload or without. But the above rule
is worded so that, in the common case when most of the data is is worded so that, in the common case when most of the data is
from a Server to a Client, the Server only includes an AccECN TCP from a Server to a Client, the Server only includes an AccECN TCP
Option while it is acknowledging data from the Client. Option while it is acknowledging data from the Client.
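
The SACK precedence rule given under "Importance of Congestion Control" above can be sketched as a space check. The function name and parameters are hypothetical, and the sketch applies the rule literally whenever SACK has been negotiated, whether or not SACK blocks are pending on the ACK in question.

```python
SACK_TWO_BLOCKS_LEN = 18  # octets for a SACK option with two blocks


def may_include_accecn(free_space, accecn_len, sack_negotiated):
    """Decide whether an AccECN Option of accecn_len octets may be sent.

    free_space is the option space left after all other options except
    SACK have been placed.
    """
    if accecn_len > free_space:
        return False
    if sack_negotiated and free_space - accecn_len < SACK_TWO_BLOCKS_LEN:
        # Loss feedback is more critical, so SACK takes precedence.
        return False
    return True
```

With 28 octets free, for example, an 11-octet AccECN Option would leave only 17 octets and is therefore not sent, whereas an 8-octet option would leave 20 and is allowed.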
When available TCP option space is limited on particular packets, the When available TCP Option space is limited on particular packets, the
recommended scheme will need to include compromises. To guide the recommended scheme will need to include compromises. To guide the
implementer, the rules below are ranked in order of importance, but implementer, the rules below are ranked in order of importance, but
the final decision has to be implementation-dependent, because the final decision has to be implementation-dependent, because
tradeoffs will alter as new TCP options are defined and new use-cases tradeoffs will alter as new TCP Options are defined and new use-cases
arise. arise.
Necessary Option Length: When TCP option space is limited, an AccECN Necessary Option Length: When TCP Option space is limited, an AccECN
TCP option MAY be truncated to omit one or two fields from the end TCP Option MAY be truncated to omit one or two fields from the end
of the option, as indicated by the permitted variants listed in of the option, as indicated by the permitted variants listed in
Table 5, provided that the counter(s) that have changed since the Table 5, provided that the counter(s) that have changed since the
previous AccECN TCP option are not omitted. previous AccECN TCP Option are not omitted.
If there is insufficient space to include an AccECN TCP option If there is insufficient space to include an AccECN TCP Option
containing the counter(s) that have changed since the previous containing the counter(s) that have changed since the previous
AccECN TCP option, then the entire AccECN TCP option MUST be AccECN TCP Option, then the entire AccECN TCP Option MUST be
omitted (see Section 3.2.3). omitted (see Section 3.2.3).
Change-Triggered AccECN TCP Options: If an arriving packet Change-Triggered AccECN TCP Options: If an arriving packet
increments a different byte counter to that incremented by the increments a different byte counter to that incremented by the
previous packet, the Data Receiver SHOULD feed it back in an previous packet, the Data Receiver SHOULD feed it back in an
AccECN Option on the next scheduled ACK. AccECN Option on the next scheduled ACK.
For the avoidance of doubt, this rule does not concern the arrival For the avoidance of doubt, this rule does not concern the arrival
of control packets with no payload, because they cannot alter any of control packets with no payload, because they cannot alter any
byte counters. byte counters.
Continual Repetition: Otherwise, if arriving packets continue to Continual Repetition: Otherwise, if arriving packets continue to
increment the same byte counter: increment the same byte counter:
* the Data Receiver SHOULD include a counter that has continued * the Data Receiver SHOULD include a counter that has continued
to increment on the next scheduled ACK following a change- to increment on the next scheduled ACK following a change-
triggered AccECN TCP Option; triggered AccECN TCP Option.
* while the same counter continues to increment, it SHOULD * while the same counter continues to increment, it SHOULD
include the counter every n ACKs as consistently as possible, include the counter every n ACKs as consistently as possible,
where n can be chosen by the implementer; where n can be chosen by the implementer.
* It SHOULD always include an AccECN Option if the r.ceb counter * It SHOULD always include an AccECN Option if the r.ceb counter
is incrementing and it MAY include an AccECN Option if r.e0b is incrementing and it MAY include an AccECN Option if r.e0b
or r.e1b is incrementing or r.e1b is incrementing.
* It SHOULD include each counter at least once for every 2^22 * It SHOULD include each counter at least once for every 2^22
bytes incremented to prevent overflow during continual bytes incremented to prevent overflow during continual
repetition. repetition.
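
The "Necessary Option Length" rule above can be sketched as a search for the shortest permitted variant that still carries every counter that has changed. The sketch assumes each option occupies 2 octets of Kind and Length plus 3 octets per field and that fields can only be dropped from the end; choose_variant and the counter labels are hypothetical.

```python
# Field order for each option order; truncation removes fields from the end.
ORDERS = {0: ("e0b", "ceb", "e1b"), 1: ("e1b", "ceb", "e0b")}


def choose_variant(changed, free_space):
    """Return (order, length) of the shortest option that fits, or None.

    changed is a non-empty set of counter names that have changed since
    the previous AccECN Option. None means the whole option is omitted.
    """
    if not changed:
        raise ValueError("changed must name at least one counter")
    best = None
    for order, fields in ORDERS.items():
        # Number of leading fields needed to cover every changed counter.
        needed = max(fields.index(name) + 1 for name in changed)
        length = 2 + 3 * needed
        if length <= free_space and (best is None or length < best[1]):
            best = (order, length)
    return best
```

A change to r.e1b alone is cheapest in order 1 (5 octets), whereas a change to both r.e0b and r.e1b needs the full 11 octets in either order.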
The above rules complement those in Section 3.2.2.5, which determine The above rules complement those in Section 3.2.2.5, which determine
when to generate an ACK irrespective of whether an AccECN TCP Option when to generate an ACK irrespective of whether an AccECN TCP Option
is to be included. is to be included.
The recommended scheme is intended as a simple way to ensure that all The recommended scheme is intended as a simple way to ensure that all
the relevant byte counters will be carried on any ACK that reaches the relevant byte counters will be carried on any ACK that reaches
the Data Sender, no matter how many pure ACKs are filtered or the Data Sender, no matter how many pure ACKs are filtered or
coalesced along the network path, and without consuming the space coalesced along the network path, and without consuming the space
available for payload data with counter field(s) that have never available for payload data with counter field(s) that have never
changed. changed.
As an example of the recommended scheme, if ECT(0) is the only As an example of the recommended scheme, if ECT(0) is the only
codepoint that has ever arrived in the IP-ECN field, the Data codepoint that has ever arrived in the IP ECN field, the Data
Receiver will feed back an AccECN0 TCP Option with only the EE0B Receiver will feed back an AccECN0 TCP Option with only the EE0B
field on every packet that acknowledges new data. However, as soon field on every packet that acknowledges new data. However, as soon
as even one CE-marked packet arrives, on every packet that as even one CE-marked packet arrives, on every packet that
acknowledges new data it will start to include an option with two acknowledges new data it will start to include an option with two
fields, EE0B and ECEB. As a second example, if the first packet to fields, EE0B and ECEB. As a second example, if the first packet to
arrive happens to be CE marked, the Data Receiver will have to arrive happens to be CE marked, the Data Receiver will have to
arbitrarily choose whether to precede the ECEB field with an EE0B arbitrarily choose whether to precede the ECEB field with an EE0B
field or an EE1B field. If it chooses, say, EE0B but it turns out field or an EE1B field. If it chooses, say, EE0B but it turns out
never to receive ECT(0), it can start sending EE1B and ECEB instead never to receive ECT(0), it can start sending EE1B and ECEB instead
-- it does not have to include the EE0B field if the r.e0b counter -- it does not have to include the EE0B field if the r.e0b counter
skipping to change at line 2170 skipping to change at line 2172
A TCP normalizer is likely to block or alter an AccECN TCP Option if A TCP normalizer is likely to block or alter an AccECN TCP Option if
the length value or the initial values of its byte-counter fields do the length value or the initial values of its byte-counter fields do
not match one of those specified in Sections 3.2.3 or 3.2.1. not match one of those specified in Sections 3.2.3 or 3.2.1.
However, to comply with the present AccECN specification, a middlebox However, to comply with the present AccECN specification, a middlebox
MUST NOT change the ACE field; or those fields of an AccECN Option MUST NOT change the ACE field; or those fields of an AccECN Option
that are currently specified in Section 3.2.3; or any AccECN field that are currently specified in Section 3.2.3; or any AccECN field
covered by integrity protection (e.g., [RFC5925]). covered by integrity protection (e.g., [RFC5925]).
3.3.3. Requirements for TCP ACK Filtering 3.3.3. Requirements for TCP ACK Filtering
Section 5.2.1 of [RFC3449] gives best current practice on filtering Section 5.2.1 of RFC 3449 [BCP69] gives best current practice on
(aka thinning or coalescing) of pure TCP ACKs. It advises that filtering (aka thinning or coalescing) of pure TCP ACKs. It advises
filtering ACKs carrying ECN feedback ought to preserve the correct that filtering ACKs carrying ECN feedback ought to preserve the
operation of ECN feedback. As the present specification updates the correct operation of ECN feedback. As the present specification
operation of ECN feedback, this section discusses how an ACK filter updates the operation of ECN feedback, this section discusses how an
might preserve correct operation of AccECN feedback as well. ACK filter might preserve correct operation of AccECN feedback as
well.
The problem divides into two parts: determining if an ACK is part of The problem divides into two parts: determining if an ACK is part of
a connection that is using AccECN and then preserving the correct a connection that is using AccECN and then preserving the correct
operation of AccECN feedback: operation of AccECN feedback:
* To determine whether a pure TCP ACK is part of an AccECN * To determine whether a pure TCP ACK is part of an AccECN
connection without resorting to connection tracking and per-flow connection without resorting to connection tracking and per-flow
state, a useful heuristic would be to check for a non-zero ECN state, a useful heuristic would be to check for a non-zero ECN
field at the IP layer (because the ECN++ experiment only allows field at the IP layer (because the ECN++ experiment only allows
TCP pure ACKs to be ECN-capable if AccECN has been negotiated TCP pure ACKs to be ECN-capable if AccECN has been negotiated
[ECN++]). This heuristic is simple and stateless. However, it [ECN++]). This heuristic is simple and stateless. However, it
might omit some AccECN ACKs, because AccECN can be used without might omit some AccECN ACKs because AccECN can be used without
ECN++ and even if it is, ECN++ does not have to make pure ACKs ECN++. Even if ECN++ is used, pure ACKs do not necessarily have
ECN-capable -- only deployment experience will tell. Also, TCP to be marked as ECN-capable -- only deployment experience will
ACKs might be ECN-capable owing to some scheme other than AccECN, tell. Also, TCP ACKs might be ECN-capable owing to some scheme
e.g., [RFC5690] or some future standards action. Again, only other than AccECN, e.g., [RFC5690] or some future standards
deployment experience will tell. action. Again, only deployment experience will tell.
* The main concern with preserving correct AccECN operation involves * The main concern with preserving correct AccECN operation involves
leaving enough ACKs for the Data Sender to work out whether the leaving enough ACKs for the Data Sender to work out whether the
3-bit ACE field has wrapped. In the worst case, in feedback about 3-bit ACE field has wrapped. In the worst case, in feedback about
a run of received packets that were all ECN-marked, the ACE field a run of received packets that were all ECN-marked, the ACE field
will wrap every 8 acknowledged packets. ACE field wrap might be will wrap every 8 acknowledged packets. ACE field wrap might be
of less concern if packets also carry AccECN TCP Options. of less concern if packets also carry AccECN TCP Options.
However, note that logic to read an AccECN TCP Option is optional However, note that logic to read an AccECN TCP Option is optional
to implement (albeit recommended -- see Section 3.2.3). So one to implement (albeit recommended -- see Section 3.2.3). So one
end writing an AccECN TCP Option into a packet does not end writing an AccECN TCP Option into a packet does not
skipping to change at line 2240 skipping to change at line 2243
direction. Therefore, currently available TSO hardware with direction. Therefore, currently available TSO hardware with
[RFC3168] support may need some minor driver changes, to adjust the [RFC3168] support may need some minor driver changes, to adjust the
bitmask for the first, middle, and last segments processed with TSO. bitmask for the first, middle, and last segments processed with TSO.
Initially, when Classic ECN [RFC3168] and Accurate ECN flows coexist Initially, when Classic ECN [RFC3168] and Accurate ECN flows coexist
on the same offloading engine, the host software may need to work on the same offloading engine, the host software may need to work
around incompatibilities (e.g., when only global configurable TSO TCP around incompatibilities (e.g., when only global configurable TSO TCP
Flag bitmasks are available), otherwise this would cause some issues. Flag bitmasks are available), otherwise this would cause some issues.
One way around this could be to only negotiate for Accurate ECN, but One way around this could be to only negotiate for Accurate ECN, but
not offer a fall back to [RFC3168] ECN. Another way could be to not offer a fall back to Classic ECN [RFC3168]. Another way could be
allow TSO only as long as the CWR flag in the TCP header is not set to allow TSO only as long as the CWR flag in the TCP header is not
-- at the cost of more processing overhead while the ACE field has set -- at the cost of more processing overhead while the ACE field
this bit set. has this bit set.
For LRO in the receive direction, a different issue may get exposed For LRO in the receive direction, a different issue may get exposed
with [RFC3168] ECN supporting hardware. with Classic ECN [RFC3168] supporting hardware.
The ACE field changes with every received CE marking, so today's The ACE field changes with every received CE marking, so today's
receive offloading could lead to many interrupts in high congestion receive offloading could lead to many interrupts in high congestion
situations. Although that would be useful (because congestion situations. Although that would be useful (because congestion
information is received sooner), it could also significantly increase information is received sooner), it could also significantly increase
processor load, particularly in scenarios such as DCTCP or L4S where processor load, particularly in scenarios such as DCTCP or L4S where
the marking rate is generally higher. the marking rate is generally higher.
Current offload hardware ejects a segment from the coalescing process Current offload hardware ejects a segment from the coalescing process
whenever the TCP ECN flags change. In data centres, it has been whenever the TCP ECN flags change. In data centres, it has been
skipping to change at line 2304 skipping to change at line 2307
of the present specification. of the present specification.
* In Section 6.1.2 of [RFC3168], all mentions of a congestion * In Section 6.1.2 of [RFC3168], all mentions of a congestion
response to an ECN-Echo (ECE) ACK packet are updated by response to an ECN-Echo (ECE) ACK packet are updated by
Section 3.2 of the present specification to mean an increment to Section 3.2 of the present specification to mean an increment to
the sender's count of CE-marked packets, s.cep. And the the sender's count of CE-marked packets, s.cep. And the
requirements to set the CWR flag no longer apply, as specified in requirements to set the CWR flag no longer apply, as specified in
Section 3.1.5 of the present specification. Otherwise, the Section 3.1.5 of the present specification. Otherwise, the
remaining requirements in Section 6.1.2 of [RFC3168] still stand. remaining requirements in Section 6.1.2 of [RFC3168] still stand.
It will be noted that [RFC8311] already updates, or potentially It will be noted that [RFC8311] already updates a number of the
updates, a number of the requirements in Section 6.1.2 of requirements in Section 6.1.2 of [RFC3168]. Section 6.1.2 of RFC
[RFC3168]. Section 6.1.2 of RFC 3168 extended standard TCP 3168 extended standard TCP congestion control [RFC5681] to cover
congestion control [RFC5681] to cover ECN marking as well as ECN marking as well as packet drop. Whereas, [RFC8311] enables
packet drop. Whereas, [RFC8311] enables experimentation with experimentation with alternative responses to ECN marking, if
alternative responses to ECN marking, if specified for instance by specified for instance by an Experimental RFC produced by the IETF
an Experimental RFC produced by the IETF Stream. [RFC8311] also Stream. [RFC8311] also strengthened the statement that "ECT(0)
strengthened the statement that "ECT(0) SHOULD be used" to a SHOULD be used" to a "MUST" (see [RFC8311] for the details).
"MUST" (see [RFC8311] for the details).
* The whole of Section 6.1.3 of [RFC3168] is updated by Section 3.2 * The whole of Section 6.1.3 of [RFC3168] is updated by Section 3.2
of the present specification, with the exception of the last of the present specification, with the exception of the last
paragraph (about congestion response to drop and ECN in the same paragraph (about congestion response to drop and ECN in the same
round trip), which still stands. Incidentally, this last round trip), which still stands. Incidentally, this last
paragraph is in the wrong section, because it relates to "TCP paragraph is in the wrong section, because it relates to "TCP
Sender" behaviour. Sender" behaviour.
* The following text within Section 6.1.5 of [RFC3168]: * The following text within Section 6.1.5 of [RFC3168]:
skipping to change at line 2384 skipping to change at line 2386
with the value 0b000 or 0b001, these values indicate that the TCP with the value 0b000 or 0b001, these values indicate that the TCP
Client did not request support for AccECN; therefore, the Server does Client did not request support for AccECN; therefore, the Server does
not enter AccECN mode for this connection. Further, 0b001 on the ACK not enter AccECN mode for this connection. Further, 0b001 on the ACK
implies that the Server sent an ECN-capable SYN/ACK, which was marked implies that the Server sent an ECN-capable SYN/ACK, which was marked
CE in the network, and the non-AccECN TCP Client fed this back by CE in the network, and the non-AccECN TCP Client fed this back by
setting ECE on the ACK of the SYN/ACK. setting ECE on the ACK of the SYN/ACK.
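As a rough illustration of the rule above, the following Python
sketch classifies the 3-bit value that the Server reads from the ACK.
The function name and the result type are hypothetical. The sketch
assumes the value is formed from the three TCP ECN flags, with ECE as
the least significant bit. Only the meanings of 0b000 and 0b001 are
taken from the text; every other value is deliberately left
unclassified.

```python
from typing import NamedTuple, Optional


class HandshakeAckVerdict(NamedTuple):
    # False: Client did not request AccECN. None: not covered by this sketch.
    client_requested_accecn: Optional[bool]
    # True when the non-AccECN Client fed back a CE-marked SYN/ACK via ECE.
    synack_ce_fed_back: bool


def classify_handshake_ack(flags: int) -> HandshakeAckVerdict:
    """Classify the 3-bit value seen on the ACK (hypothetical helper)."""
    if not 0 <= flags <= 0b111:
        raise ValueError("expected a 3-bit value")
    if flags == 0b000:
        # Client did not request AccECN; Server stays out of AccECN mode.
        return HandshakeAckVerdict(False, False)
    if flags == 0b001:
        # As 0b000, and additionally ECE reports a CE mark on the SYN/ACK.
        return HandshakeAckVerdict(False, True)
    # Other values are outside the scope of the passage above.
    return HandshakeAckVerdict(None, False)
```

A Server built along these lines would skip AccECN mode whenever
client_requested_accecn is False.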
5.2. Compatibility with TCP Experiments and Common TCP Options 5.2. Compatibility with TCP Experiments and Common TCP Options
AccECN is compatible (at least on paper) with the most commonly used AccECN is compatible (at least on paper) with the most commonly used
TCP options: MSS, time-stamp, window scaling, SACK, and TCP-AO. It TCP Options: MSS, timestamp, window scaling, SACK, and TCP-AO. It is
is also compatible with Multipath TCP (MPTCP [RFC8684]) and the also compatible with Multipath TCP (MPTCP [RFC8684]) and the
experimental TCP option TCP Fast Open (TFO [RFC7413]). AccECN is experimental TCP Option TCP Fast Open (TFO [RFC7413]). AccECN is
friendly to all these protocols, because space for TCP options is friendly to all these protocols, because space for TCP Options is
particularly scarce on the SYN, where AccECN consumes zero additional particularly scarce on the SYN, where AccECN consumes zero additional
header space. header space.
When option space is under pressure from other options, Because option space is limited, Section 3.2.3.3 provides guidance on
Section 3.2.3.3 provides guidance on how important it is to send an how important it is to send an AccECN Option relative to other
AccECN Option relative to other options, and which fields are more options and specifies which fields are more important to include.
important to include.
Implementers of TFO need to take careful note of the recommendation Implementers of TFO need to take careful note of the recommendation
in Section 3.2.2.1. That section recommends that, if the TCP Client in Section 3.2.2.1. That section recommends that, if the TCP Client
has successfully negotiated AccECN, when acknowledging the SYN/ACK, has successfully negotiated AccECN, when acknowledging the SYN/ACK,
even if it has data to send, it sends a pure ACK immediately before even if it has data to send, it sends a pure ACK immediately before
the data. Then it can reflect the IP-ECN field of the SYN/ACK on the data. Then it can reflect the IP ECN field of the SYN/ACK on
this pure ACK, which allows the Server to detect ECN mangling. Note this pure ACK, which allows the Server to detect ECN mangling. Note
that, as specified in Section 3.2, any data on the SYN (SYN=1, ACK=0) that, as specified in Section 3.2, any data on the SYN (SYN=1, ACK=0)
is not included in any of the byte counters held locally for each ECN is not included in any of the byte counters held locally for each ECN
marking, nor in the AccECN Option on the wire. marking, nor in the AccECN Option on the wire.
AccECN feedback is compatible with the ECN++ experiment [ECN++], AccECN feedback is compatible with the ECN++ experiment [ECN++],
which allows TCP control packets and retransmissions to be ECN- which allows TCP control packets and retransmissions to be ECN-
capable ([RFC3168] was updated by [RFC8311] to permit such capable ([RFC3168] was updated by [RFC8311] to permit such
experiments). AccECN is likely to inherently support any experiment experiments). AccECN is likely to inherently support any experiment
with ECN-capable packets, because it feeds back the contents of the with ECN-capable packets, because it feeds back the contents of the
skipping to change at line 2424 skipping to change at line 2425
an earlier experimental protocol with narrower scope than ECN++ and a an earlier experimental protocol with narrower scope than ECN++ and a
5-way handshake. 5-way handshake.
5.3. Compatibility with Feedback Integrity Mechanisms 5.3. Compatibility with Feedback Integrity Mechanisms
Three alternative mechanisms are available to assure the integrity of Three alternative mechanisms are available to assure the integrity of
ECN and/or loss signals. AccECN is compatible with any of these ECN and/or loss signals. AccECN is compatible with any of these
approaches: approaches:
* The Data Sender can test the integrity of the receiver's ECN (or * The Data Sender can test the integrity of the receiver's ECN (or
loss) feedback by occasionally setting the IP-ECN field to a value loss) feedback by occasionally setting the IP ECN field to a value
normally only set by the network (and/or deliberately leaving a normally only set by the network (and/or deliberately leaving a
sequence number gap). Then it can test whether the Data sequence number gap). Then it can test whether the Data
Receiver's feedback faithfully reports what it expects (similar to Receiver's feedback faithfully reports what it expects (similar to
paragraph 2 of Section 20.2 of [RFC3168]). Unlike the ECN-nonce paragraph 2 of Section 20.2 of [RFC3168]). Unlike the ECN-nonce
[RFC3540], this approach does not waste the ECT(1) codepoint in [RFC3540], this approach does not waste the ECT(1) codepoint in
the IP header, it does not require standardization, and it does the IP header, it does not require standardization, and it does
not rely on misbehaving receivers volunteering to reveal feedback not rely on misbehaving receivers volunteering to reveal feedback
information that allows them to be detected. However, setting the information that allows them to be detected. However, setting the
CE mark by the sender might conceal actual congestion feedback CE mark by the sender might conceal actual congestion feedback
from the network and therefore ought to only be done sparingly. from the network and therefore ought to only be done sparingly.
skipping to change at line 2455 skipping to change at line 2456
ConEx is an experimental change to the Data Sender that would be ConEx is an experimental change to the Data Sender that would be
most useful when combined with AccECN. Without AccECN, the ConEx most useful when combined with AccECN. Without AccECN, the ConEx
behaviour of a Data Sender would have to be more conservative than behaviour of a Data Sender would have to be more conservative than
would be necessary if it had the accurate feedback of AccECN. would be necessary if it had the accurate feedback of AccECN.
* The Standards Track TCP authentication option (TCP-AO [RFC5925]) * The Standards Track TCP authentication option (TCP-AO [RFC5925])
can be used to detect any tampering with AccECN feedback between can be used to detect any tampering with AccECN feedback between
the Data Receiver and the Data Sender (whether malicious or the Data Receiver and the Data Sender (whether malicious or
accidental). The AccECN fields are immutable end to end, so they accidental). The AccECN fields are immutable end to end, so they
are amenable to TCP-AO protection, which covers TCP options by are amenable to TCP-AO protection, which covers TCP Options by
default. However, TCP-AO is often too brittle to use on many end- default. However, TCP-AO is often too brittle to use on many end-
to-end paths, where middleboxes can make verification fail in to-end paths, where middleboxes can make verification fail in
their attempts to improve performance or security, e.g., Network their attempts to improve performance or security, e.g., Network
Address Translation (NAT) and Network Address Port Translation Address Translation (NAT) and Network Address Port Translation
(NAPT), resegmentation, or shifting the sequence space. (NAPT), resegmentation, or shifting the sequence space.
6. Summary: Protocol Properties 6. Summary: Protocol Properties
This section is informative, not normative. It describes how well This section is informative, not normative. It describes how well
the protocol satisfies the agreed requirements for a more Accurate the protocol satisfies the agreed requirements for a more Accurate
skipping to change at line 2477 skipping to change at line 2478
Accuracy: From each ACK, the Data Sender can infer the number of new Accuracy: From each ACK, the Data Sender can infer the number of new
CE-marked segments since the previous ACK. This provides better CE-marked segments since the previous ACK. This provides better
accuracy on CE feedback than Classic ECN. In addition, if an accuracy on CE feedback than Classic ECN. In addition, if an
AccECN Option is present (not blocked by the network path), the AccECN Option is present (not blocked by the network path), the
number of bytes marked with CE, ECT(1), and ECT(0) are provided. number of bytes marked with CE, ECT(1), and ECT(0) are provided.
Overhead: The AccECN scheme is divided into two parts. The Overhead: The AccECN scheme is divided into two parts. The
essential feedback part reuses the three flags already assigned to essential feedback part reuses the three flags already assigned to
ECN in the TCP header. The supplementary feedback part adds an ECN in the TCP header. The supplementary feedback part adds an
additional TCP option consuming up to 11 bytes. However, no TCP additional TCP Option consuming up to 11 bytes. However, no TCP
option space is consumed in the SYN. Option space is consumed in the SYN.
Ordering: The order in which marks arrive at the Data Receiver is Ordering: The order in which marks arrive at the Data Receiver is
preserved in AccECN feedback, because the Data Receiver is preserved in AccECN feedback, because the Data Receiver is
expected to send an ACK immediately whenever a different mark expected to send an ACK immediately whenever a different mark
arrives. arrives.
Timeliness: While the same ECN markings are arriving continually at Timeliness: While the same ECN markings are arriving continually at
the Data Receiver, it can defer ACKs as TCP does normally, but it the Data Receiver, it can defer ACKs as TCP does normally, but it
will immediately send an ACK as soon as a different ECN marking will immediately send an ACK as soon as a different ECN marking
arrives. arrives.
skipping to change at line 2545 skipping to change at line 2546
can assure the integrity of ECN feedback. If AccECN Options are can assure the integrity of ECN feedback. If AccECN Options are
stripped, the resolution of the feedback is degraded, but the stripped, the resolution of the feedback is degraded, but the
integrity of this degraded feedback can still be assured. integrity of this degraded feedback can still be assured.
Backward Compatibility: If only one endpoint supports the AccECN Backward Compatibility: If only one endpoint supports the AccECN
scheme, it will fall back to the most advanced ECN feedback scheme scheme, it will fall back to the most advanced ECN feedback scheme
supported by the other end. supported by the other end.
If AccECN Options are stripped by a middlebox, AccECN still If AccECN Options are stripped by a middlebox, AccECN still
provides basic congestion feedback in the ACE field. Further, provides basic congestion feedback in the ACE field. Further,
AccECN can be used to detect mangling of the IP-ECN field; AccECN can be used to detect mangling of the IP ECN field;
mangling of the TCP ECN flags; blocking of ECT-marked segments; mangling of the TCP ECN flags; blocking of ECT-marked segments;
and blocking of segments carrying an AccECN Option. It can detect and blocking of segments carrying an AccECN Option. It can detect
these conditions during TCP's three-way handshake so that it can these conditions during TCP's three-way handshake so that it can
fall back to operation without ECN and/or operation without AccECN fall back to operation without ECN and/or operation without AccECN
Options. Options.
Forward Compatibility: The behaviour of endpoints and middleboxes is Forward Compatibility: The behaviour of endpoints and middleboxes is
carefully defined for all reserved or currently unused codepoints carefully defined for all reserved or currently unused codepoints
in the scheme. Then, the designers of security devices can in the scheme. Then, the designers of security devices can
understand which currently unused values might appear in the understand which currently unused values might appear in the
skipping to change at line 2581 skipping to change at line 2582
+=====+==============+===========+==============================+ +=====+==============+===========+==============================+
| Bit | Name | Reference | Assignment Notes | | Bit | Name | Reference | Assignment Notes |
+=====+==============+===========+==============================+ +=====+==============+===========+==============================+
| 7 | AE (Accurate | RFC 9768 | Previously used as NS (Nonce | | 7 | AE (Accurate | RFC 9768 | Previously used as NS (Nonce |
| | ECN) | | Sum) by [RFC3540], which is | | | ECN) | | Sum) by [RFC3540], which is |
| | | | now Historic [RFC8311] | | | | | now Historic [RFC8311] |
+-----+--------------+-----------+------------------------------+ +-----+--------------+-----------+------------------------------+
Table 6: TCP Header Flag Reassignment Table 6: TCP Header Flag Reassignment
This document also defines two new TCP options for AccECN from the This document also defines two new TCP Options for AccECN from the
TCP option space. These values are defined as the following in the TCP Option space. These values are defined as the following in the
"TCP Option Kind Numbers" registry in the "Transmission Control "TCP Option Kind Numbers" registry in the "Transmission Control
Protocol (TCP) Parameters" registry group: Protocol (TCP) Parameters" registry group:
+======+========+================================+===========+ +======+========+================================+===========+
| Kind | Length | Meaning | Reference | | Kind | Length | Meaning | Reference |
+======+========+================================+===========+ +======+========+================================+===========+
| 172 | N | Accurate ECN Order 0 (AccECN0) | RFC 9768 | | 172 | N | Accurate ECN Order 0 (AccECN0) | RFC 9768 |
+------+--------+--------------------------------+-----------+ +------+--------+--------------------------------+-----------+
| 174 | N | Accurate ECN Order 1 (AccECN1) | RFC 9768 | | 174 | N | Accurate ECN Order 1 (AccECN1) | RFC 9768 |
+------+--------+--------------------------------+-----------+ +------+--------+--------------------------------+-----------+
Table 7: New TCP Option assignments Table 7: New TCP Option Assignments
Early experimental implementations of the two AccECN Options used Early experimental implementations of the two AccECN Options used
experimental option 254 per [RFC6994] with the 16-bit magic numbers experimental option 254 per [RFC6994] with the 16-bit magic numbers
0xACC0 and 0xACC1, respectively, for Order 0 and 1, as allocated in 0xACC0 and 0xACC1, respectively, for Order 0 and 1, as allocated in
the IANA "TCP/UDP Experimental Option Experiment Identifiers (TCP/UDP the IANA "TCP/UDP Experimental Option Experiment Identifiers (TCP/UDP
ExIDs)" registry. Even earlier experimental implementations used the ExIDs)" registry. Even earlier experimental implementations used the
single magic number 0xACCE (16 bits). Uses of these experimental single magic number 0xACCE (16 bits). Uses of these experimental
options SHOULD migrate to use the new option kinds (172 and 174). options SHOULD migrate to use the new option kinds (172 and 174).
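The migration described above could be supported by a receiver that
recognizes both the assigned kinds and the older experimental
encodings. The Python sketch below is one possible way to do that,
not a required one. The function name, the label strings, and the
convention that "body" holds the octets after the Kind and Length
octets are all assumptions made for illustration. The 16-bit
identifier is assumed to be carried in network byte order, as is usual
for [RFC6994] experiment identifiers.

```python
from typing import Optional

ACCECN0_KIND = 172        # Accurate ECN Order 0 (Table 7)
ACCECN1_KIND = 174        # Accurate ECN Order 1 (Table 7)
EXPERIMENTAL_KIND = 254   # experimental option kind used by early code

# 16-bit magic numbers named in the text above.
EXID_LABELS = {
    0xACC0: "experimental AccECN0",
    0xACC1: "experimental AccECN1",
    0xACCE: "early experimental (single magic number)",
}


def accecn_option_variant(kind: int, body: bytes) -> Optional[str]:
    """Return a label for an AccECN-related option, or None if unrelated."""
    if kind == ACCECN0_KIND:
        return "AccECN0"
    if kind == ACCECN1_KIND:
        return "AccECN1"
    if kind == EXPERIMENTAL_KIND and len(body) >= 2:
        # First two octets of the body are taken as the magic number.
        exid = int.from_bytes(body[:2], "big")
        return EXID_LABELS.get(exid)
    return None
```

An implementation might log any label that starts with "experimental"
or "early" in order to track how many peers have yet to migrate.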
8. Security and Privacy Considerations 8. Security and Privacy Considerations
If ever the supplementary feedback part of AccECN that is based on If ever the supplementary feedback part of AccECN that is based on
one of the new AccECN TCP Options is unusable (due for example to one of the new AccECN TCP Options is unusable (due for example to
middlebox interference), the essential feedback part of AccECN's middlebox interference), the essential feedback part of AccECN's
congestion feedback offers only limited resilience to long runs of congestion feedback offers only limited resilience to long runs of
ACK loss (see Section 3.2.2.5). These problems are unlikely to be ACK loss (see Section 3.2.2.5). These problems are unlikely to be
due to malicious intervention (because if an attacker could strip a due to malicious intervention (because if an attacker could strip a
TCP option or discard a long run of ACKs, it could wreak other TCP Option or discard a long run of ACKs, it could wreak other
arbitrary havoc). However, it would be of concern if AccECN's arbitrary havoc). However, it would be of concern if AccECN's
resilience could be indirectly compromised during a flooding attack. resilience could be indirectly compromised during a flooding attack.
AccECN is still considered safe though, because if AccECN Options are AccECN is still considered safe though, because if AccECN Options are
not present, the AccECN Data Sender is then required to switch to not present, the AccECN Data Sender is then required to switch to
more conservative assumptions about wrap of congestion indication more conservative assumptions about wrap of congestion indication
counters (see Section 3.2.2.5 and Appendix A.2). counters (see Section 3.2.2.5 and Appendix A.2).
Section 5.1 describes how a TCP Server can negotiate AccECN and use Section 5.1 describes how a TCP Server can negotiate AccECN and use
the SYN cookie method for mitigating SYN flooding attacks. the SYN cookie method for mitigating SYN flooding attacks.
skipping to change at line 2639 skipping to change at line 2640
will be degraded, but the integrity of this degraded information can will be degraded, but the integrity of this degraded information can
still be assured. Assuring that Data Senders respond appropriately still be assured. Assuring that Data Senders respond appropriately
to ECN feedback is possible, but the scope of the present document is to ECN feedback is possible, but the scope of the present document is
confined to the feedback protocol and excludes the response to this confined to the feedback protocol and excludes the response to this
feedback. feedback.
In Section 3.2.3, a Data Sender is allowed to ignore an unrecognized In Section 3.2.3, a Data Sender is allowed to ignore an unrecognized
TCP AccECN Option length and read as many whole 3-octet fields from TCP AccECN Option length and read as many whole 3-octet fields from
it as possible up to a maximum of 3, treating the remainder as it as possible up to a maximum of 3, treating the remainder as
padding. This opens up a potential covert channel of up to 29B (40 - padding. This opens up a potential covert channel of up to 29B (40 -
(2+3*3)) B. However, it is really an overt channel (not hidden) and (2+3*3)). However, it is really an overt channel (not hidden) and it
it is no different than the use of unknown TCP options with unknown is no different than the use of unknown TCP Options with unknown
option lengths in general. Therefore, where this is of concern, it option lengths in general. Therefore, where this is of concern, it
can already be adequately mitigated by regular TCP normalizer can already be adequately mitigated by regular TCP normalizer
technology (see Section 3.3.2). technology (see Section 3.3.2).
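The 29-byte figure above follows from the parsing rule it describes.
The Python sketch below shows that rule and the arithmetic under
stated assumptions: the function and constant names are hypothetical,
"body" means the option octets after Kind and Length, and each field
is decoded as a big-endian integer purely for illustration.

```python
from typing import List, Tuple

FIELD_LEN = 3              # each counter field is 3 octets
MAX_FIELDS = 3             # at most 3 fields are read
OPTION_HEADER_LEN = 2      # Kind octet plus Length octet
MAX_TCP_OPTION_SPACE = 40  # the "40" in the arithmetic above


def split_accecn_option(body: bytes) -> Tuple[List[int], bytes]:
    """Read whole 3-octet fields (up to 3); the rest is treated as padding."""
    count = min(len(body) // FIELD_LEN, MAX_FIELDS)
    fields = [
        int.from_bytes(body[i * FIELD_LEN:(i + 1) * FIELD_LEN], "big")
        for i in range(count)
    ]
    padding = body[count * FIELD_LEN:]
    return fields, padding


# Largest possible remainder: 40 - (2 + 3*3) = 29 octets.
MAX_PADDING = MAX_TCP_OPTION_SPACE - (OPTION_HEADER_LEN + MAX_FIELDS * FIELD_LEN)
```

A normalizer that is concerned about this channel could, for example,
drop or zero the returned padding before forwarding.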
The AccECN protocol is not believed to introduce any new privacy The AccECN protocol is not believed to introduce any new privacy
concerns, because it merely counts and feeds back signals at the concerns, because it merely counts and feeds back signals at the
transport layer that had already been visible at the IP layer. A transport layer that had already been visible at the IP layer. A
covert channel can be used to compromise privacy. However, as covert channel can be used to compromise privacy. However, as
explained above, undefined TCP options in general open up such explained above, undefined TCP Options in general open up such
channels, and common techniques are available to close them off. channels, and common techniques are available to close them off.
There is a potential concern that a Data Receiver could deliberately There is a potential concern that a Data Receiver could deliberately
omit AccECN Options pretending that they had been stripped by a omit AccECN Options pretending that they had been stripped by a
middlebox. No known way can yet be contrived for a receiver to take middlebox. Currently, there is no known way for a receiver to take
advantage of this behaviour, which seems to always degrade its own advantage of this behaviour, which seems to always degrade its own
performance. However, the concern is mentioned here for performance. However, the concern is mentioned here for
completeness. completeness.
A generic privacy concern of any new protocol is that for a while it A generic privacy concern of any new protocol is that for a while it
will be used by a small population of hosts, and thus show up more will be used by a small population of hosts, and thus those hosts
easily. However, it is expected that AccECN will become available in could be more easily identified. However, it is expected that AccECN
operating systems over time and that it will eventually be turned on will become available in operating systems over time and that it will
by default. Thus, an individual identification of a particular user eventually be turned on by default. Thus, an individual
is less of a concern than the fingerprinting of specific versions of identification of a particular user is less of a concern than the
operating systems. However, the latter can be done using different fingerprinting of specific versions of operating systems. However,
means independent of Accurate ECN. the latter can be done using different means independent of Accurate
ECN.
As Accurate ECN exposes more bits in the TCP header that could be As Accurate ECN exposes more bits in the TCP header that could be
tampered with without interfering with the transport excessively, it tampered with without interfering with the transport excessively, it
may allow an additional way to identify specific data streams across may allow an additional way to identify specific data streams across
a virtual private network (VPN) to an attacker that has access to the a virtual private network (VPN) to an attacker that has access to the
datastream before and after the VPN tunnel endpoints. This may be datastream before and after the VPN tunnel endpoints. This may be
achieved by injecting or modifying the ACE field in specific patterns achieved by injecting or modifying the ACE field in specific patterns
that can be recognized. that can be recognized.
Overall, Accurate ECN does not change the risk profile on privacy to Overall, Accurate ECN does not change the risk profile on privacy to
skipping to change at line 2722 skipping to change at line 2724
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)", [RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022, STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
<https://www.rfc-editor.org/info/rfc9293>. <https://www.rfc-editor.org/info/rfc9293>.
9.2. Informative References 9.2. Informative References
[BCP69] Best Current Practice 69,
<https://www.rfc-editor.org/info/bcp69>.
At the time of writing, this BCP comprises the following:
Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M.
Sooriyabandara, "TCP Performance Implications of Network
Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449,
December 2002, <https://www.rfc-editor.org/info/rfc3449>.
[ECN++] Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit [ECN++] Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
Congestion Notification (ECN) to TCP Control Packets", Congestion Notification (ECN) to TCP Control Packets",
Work in Progress, Internet-Draft, draft-ietf-tcpm- Work in Progress, Internet-Draft, draft-ietf-tcpm-
generalized-ecn-17, 21 April 2025, generalized-ecn-17, 21 April 2025,
<https://datatracker.ietf.org/doc/html/draft-ietf-tcpm- <https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-
generalized-ecn-17>. generalized-ecn-17>.
[Mandalari18] [Mandalari18]
Mandalari, A., Lutu, A., Briscoe, B., Bagnulo, M., and Ö. Mandalari, A., Lutu, A., Briscoe, B., Bagnulo, M., and Ö.
Alay, "Measuring ECN++: Good News for ++, Bad News for ECN Alay, "Measuring ECN++: Good News for ++, Bad News for ECN
over Mobile", IEEE Communications Magazine , March 2018, over Mobile", IEEE Communications Magazine , March 2018,
<http://www.it.uc3m.es/amandala/ <http://www.it.uc3m.es/amandala/
ecn++/ecn_commag_2018.html>. ecn++/ecn_commag_2018.html>.
[RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M.
Sooriyabandara, "TCP Performance Implications of Network
Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449,
December 2002, <https://www.rfc-editor.org/info/rfc3449>.
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
Congestion Notification (ECN) Signaling with Nonces", Congestion Notification (ECN) Signaling with Nonces",
RFC 3540, DOI 10.17487/RFC3540, June 2003, RFC 3540, DOI 10.17487/RFC3540, June 2003,
<https://www.rfc-editor.org/info/rfc3540>. <https://www.rfc-editor.org/info/rfc3540>.
[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common
Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007, Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007,
<https://www.rfc-editor.org/info/rfc4987>. <https://www.rfc-editor.org/info/rfc4987>.
[RFC5562] Kuzmanovic, A., Mondal, A., Floyd, S., and K. [RFC5562] Kuzmanovic, A., Mondal, A., Floyd, S., and K.
skipping to change at line 2852 skipping to change at line 2858
(L4S) Internet Service: Architecture", RFC 9330, (L4S) Internet Service: Architecture", RFC 9330,
DOI 10.17487/RFC9330, January 2023, DOI 10.17487/RFC9330, January 2023,
<https://www.rfc-editor.org/info/rfc9330>. <https://www.rfc-editor.org/info/rfc9330>.
[RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed., [RFC9438] Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed.,
"CUBIC for Fast and Long-Distance Networks", RFC 9438, "CUBIC for Fast and Long-Distance Networks", RFC 9438,
DOI 10.17487/RFC9438, August 2023, DOI 10.17487/RFC9438, August 2023,
<https://www.rfc-editor.org/info/rfc9438>. <https://www.rfc-editor.org/info/rfc9438>.
[RoCEv2] InfiniBand Trade Association, "InfiniBand Architecture [RoCEv2] InfiniBand Trade Association, "InfiniBand Architecture
Specification", Volume 1, Release 1.4, 2020, Specification",
<https://www.infinibandta.org/ibta-specification/>. <https://www.infinibandta.org/ibta-specification/>.
Appendix A. Example Algorithms Appendix A. Example Algorithms
This appendix is informative, not normative. It gives example This appendix is informative, not normative. It gives example
algorithms that would satisfy the normative requirements of the algorithms that would satisfy the normative requirements of the
AccECN protocol. However, implementers are free to choose other ways AccECN protocol. However, implementers are free to choose other ways
to implement the requirements. to satisfy the requirements.
A.1. Example Algorithm to Encode/Decode the AccECN Option A.1. Example Algorithm to Encode/Decode the AccECN Option
The example algorithms below show how a Data Receiver in AccECN mode The example algorithms below show how a Data Receiver in AccECN mode
could encode its CE byte counter r.ceb into the ECEB field within an could encode its CE byte counter r.ceb into the ECEB field within an
AccECN TCP Option, and how a Data Sender in AccECN mode could decode AccECN TCP Option, and how a Data Sender in AccECN mode could decode
the ECEB field into its byte counter s.ceb. The other counters for the ECEB field into its byte counter s.ceb. The other counters for
bytes marked ECT(0) and ECT(1) in an AccECN Option would be similarly bytes marked ECT(0) and ECT(1) in an AccECN Option would be similarly
encoded and decoded. encoded and decoded.
skipping to change at line 2893 skipping to change at line 2899
where '%' is the remainder operator. where '%' is the remainder operator.
On the arrival of an AccECN Option, the Data Sender first makes sure On the arrival of an AccECN Option, the Data Sender first makes sure
the ACK has not been superseded in order to avoid winding the s.ceb the ACK has not been superseded in order to avoid winding the s.ceb
counter backwards. It uses the TCP acknowledgement number and any counter backwards. It uses the TCP acknowledgement number and any
SACK options [RFC2018] to calculate newlyAckedB, the amount of new SACK options [RFC2018] to calculate newlyAckedB, the amount of new
data that the ACK acknowledges in bytes (newlyAckedB can be zero but data that the ACK acknowledges in bytes (newlyAckedB can be zero but
not negative). If newlyAckedB is zero, either the ACK has been not negative). If newlyAckedB is zero, either the ACK has been
superseded or CE-marked packet(s) without data could have arrived. superseded or CE-marked packet(s) without data could have arrived.
To break the tie for the latter case, the Data Sender could use time- To break the tie for the latter case, the Data Sender could use
stamps [RFC7323] (if present) to work out newlyAckedT, the amount of timestamps [RFC7323] (if present) to work out newlyAckedT, the amount
new time that the ACK acknowledges. If the Data Sender determines of new time that the ACK acknowledges. If the Data Sender determines
that the ACK has been superseded, it ignores the AccECN Option. that the ACK has been superseded, it ignores the AccECN Option.
Otherwise, the Data Sender calculates the minimum non-negative Otherwise, the Data Sender calculates the minimum non-negative
difference d.ceb between the ECEB field and its local s.ceb counter, difference d.ceb between the ECEB field and its local s.ceb counter,
using modulo arithmetic as follows: using modulo arithmetic as follows:
if ((newlyAckedB > 0) || (newlyAckedT > 0)) { if ((newlyAckedB > 0) || (newlyAckedT > 0)) {
d.ceb = (ECEB + DIVOPT - (s.ceb % DIVOPT)) % DIVOPT d.ceb = (ECEB + DIVOPT - (s.ceb % DIVOPT)) % DIVOPT
s.ceb += d.ceb s.ceb += d.ceb
} }
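The pseudocode above can be exercised with a small runnable version.
The following Python sketch is illustrative only. It assumes DIVOPT
is 2^24, matching the 3-octet width of the option fields, and it
passes the counters as plain integers rather than as fields of a
connection structure. The function name is hypothetical.

```python
DIVOPT = 2 ** 24  # assumed wrap value of a 3-octet (24-bit) field


def update_s_ceb(s_ceb: int, eceb: int,
                 newly_acked_b: int, newly_acked_t: int = 0) -> int:
    """Return the Data Sender's CE byte counter after one AccECN Option.

    s_ceb          local (unwrapped) CE byte counter before this ACK
    eceb           value of the ECEB field carried in the option
    newly_acked_b  new bytes acknowledged by this ACK (never negative)
    newly_acked_t  new time acknowledged, if timestamps are in use
    """
    if newly_acked_b > 0 or newly_acked_t > 0:
        # Minimum non-negative difference, modulo the field width.
        d_ceb = (eceb + DIVOPT - (s_ceb % DIVOPT)) % DIVOPT
        s_ceb += d_ceb
    # Otherwise the ACK is treated as superseded and the option is ignored.
    return s_ceb
```

For instance, with s_ceb = 2^24 - 10 and ECEB = 5, the difference is
15 and the counter advances past the wrap point to 2^24 + 5.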
skipping to change at line 2982 skipping to change at line 2988
of the missing ACKs were piggy-backed on data (i.e., not pure ACKs) of the missing ACKs were piggy-backed on data (i.e., not pure ACKs)
retransmissions will not repair the lost AccECN information, because retransmissions will not repair the lost AccECN information, because
AccECN requires retransmissions to carry the latest AccECN counters, AccECN requires retransmissions to carry the latest AccECN counters,
not the original ones. not the original ones.
The phrase 'under prevailing conditions' allows for implementation- The phrase 'under prevailing conditions' allows for implementation-
dependent interpretation. A Data Sender might take account of the dependent interpretation. A Data Sender might take account of the
prevailing size of data segments and the prevailing CE marking rate prevailing size of data segments and the prevailing CE marking rate
just before the sequence of missing ACKs. However, we shall start just before the sequence of missing ACKs. However, we shall start
with the simplest algorithm, which assumes segments are all full- with the simplest algorithm, which assumes segments are all full-
sized and ultra-conservatively it assumes that ECN marking was 100% sized, and ultra-conservatively it assumes that ECN marking was 100%
on the forward path when ACKs on the reverse path started to all be on the forward path when ACKs on the reverse path started to all be
dropped. Specifically, if newlyAckedB is the amount of data that an dropped. Specifically, if newlyAckedB is the amount of data that an
ACK acknowledges since the previous ACK, then the Data Sender could ACK acknowledges since the previous ACK, then the Data Sender could
assume that this acknowledges newlyAckedPkt full-sized segments, assume that this acknowledges newlyAckedPkt full-sized segments,
where newlyAckedPkt = newlyAckedB/MSS. Then it could assume that the where newlyAckedPkt = newlyAckedB/MSS. Then it could assume that the
ACE field incremented by ACE field incremented by
dSafer.cep = newlyAckedPkt - ((newlyAckedPkt - d.cep) % DIVACE) dSafer.cep = newlyAckedPkt - ((newlyAckedPkt - d.cep) % DIVACE)
For example, imagine an ACK acknowledges newlyAckedPkt=9 more full- For example, imagine an ACK acknowledges newlyAckedPkt=9 more full-
size segments than any previous ACK, and that ACE increments by a size segments than any previous ACK, and that ACE increments by a
minimum of 2 CE marks (d.cep=2). The above formula works out that it minimum of 2 CE marks (d.cep=2). The above formula indicates that it
would still be safe to assume 2 CE marks (because 9 - ((9-2) % 8) = would still be safe to assume 2 CE marks (because 9 - ((9-2) % 8) =
2). However, if ACE increases by a minimum of 2 but acknowledges 10 2). However, if ACE increases by a minimum of 2 but acknowledges 10
full-sized segments, then it would be necessary to assume that there full-sized segments, then it would be necessary to assume that there
could have been 10 CE marks (because 10 - ((10-2) % 8) = 10). could have been 10 CE marks (because 10 - ((10-2) % 8) = 10).
Note that checks would need to be added to the above pseudocode for Note that checks would need to be added to the above pseudocode for
(d.cep > newlyAckedPkt), which could occur if newlyAckedPkt had been (d.cep > newlyAckedPkt), which could occur if newlyAckedPkt had been
wrongly estimated using an inappropriate packet size. wrongly estimated using an inappropriate packet size.
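The dSafer.cep formula and the two worked examples above could be sketched in Python as follows. DIVACE = 8 is taken from the "% 8" in the examples. The guard for d.cep > newlyAckedPkt is only one possible form of the check the text says is needed, and the function name is hypothetical.

```python
DIVACE = 2 ** 3  # the worked examples above reduce modulo 8


def d_safer_cep(newly_acked_pkt, d_cep):
    """Conservative estimate of the ACE increment, per the formula above."""
    if d_cep > newly_acked_pkt:
        # Hypothetical guard: newlyAckedPkt was probably underestimated,
        # so fall back to the minimum increment rather than apply the
        # formula to a negative difference.
        return d_cep
    return newly_acked_pkt - ((newly_acked_pkt - d_cep) % DIVACE)


# The two cases worked through in the text
print(d_safer_cep(9, 2))   # 9 - ((9-2) % 8)  = 2
print(d_safer_cep(10, 2))  # 10 - ((10-2) % 8) = 10
```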
ACKs that acknowledge a large stretch of packets might be common in ACKs that acknowledge a large stretch of packets might be common in
skipping to change at line 3024 skipping to change at line 3030
average segment size and prevailing ECN marking. For instance, average segment size and prevailing ECN marking. For instance,
newlyAckedPkt in the above formula could be replaced with newlyAckedPkt in the above formula could be replaced with
newlyAckedPktHeur = newlyAckedPkt*p*MSS/s, where s is the prevailing newlyAckedPktHeur = newlyAckedPkt*p*MSS/s, where s is the prevailing
segment size and p is the prevailing ECN marking probability. segment size and p is the prevailing ECN marking probability.
However, ultimately, if TCP's ECN feedback becomes inaccurate, it However, ultimately, if TCP's ECN feedback becomes inaccurate, it
still has loss detection to fall back on. Therefore, it would seem still has loss detection to fall back on. Therefore, it would seem
safe to implement a simple algorithm, rather than a perfect one. safe to implement a simple algorithm, rather than a perfect one.
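The heuristic substitution mentioned above could be sketched in Python as follows. The text does not say how to round the result; rounding up here is an assumption, chosen to stay on the conservative side. The function name is hypothetical.

```python
import math


def newly_acked_pkt_heur(newly_acked_pkt, p, mss, s):
    """newlyAckedPktHeur = newlyAckedPkt * p * MSS / s.

    p is the prevailing ECN marking probability (0..1) and s is the
    prevailing segment size in bytes.
    """
    if s <= 0:
        raise ValueError("prevailing segment size must be positive")
    # Assumption: round up so the estimate never undercounts.
    return math.ceil(newly_acked_pkt * p * mss / s)
```

With p = 1 and s = MSS this reduces to newlyAckedPkt, which is the simple algorithm's assumption.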
The simple algorithm for dSafer.cep above requires no monitoring of The simple algorithm for dSafer.cep above requires no monitoring of
prevailing conditions and it would still be safe if, for example, prevailing conditions and it would still be safe if, for example,
segments were on average at least 5% of full-sized as long as ECN segments were on average at least 5% of a full-sized packet as long
marking was 5% or less. Assuming it was used, the Data Sender would as ECN marking was 5% or less. Assuming it was used, the Data Sender
increment its packet counter as follows: would increment its packet counter as follows:
s.cep += dSafer.cep s.cep += dSafer.cep
If missing acknowledgement numbers arrive later (due to reordering), If missing acknowledgement numbers arrive later (due to reordering),
Section 3.2.2.5.2 says "the Data Sender MAY attempt to neutralize the Section 3.2.2.5.2 says "the Data Sender MAY attempt to neutralize the
effect of any action it took based on a conservative assumption that effect of any action it took based on a conservative assumption that
it later found to be incorrect". To do this, the Data Sender would it later found to be incorrect". To do this, the Data Sender would
have to store the values of all the relevant variables whenever it have to store the values of all the relevant variables whenever it
made assumptions, so that it could re-evaluate them later. Given made assumptions, so that it could re-evaluate them later. Given
this could become complex and it is not required, we do not attempt this could become complex and it is not required, we do not attempt
skipping to change at line 3063 skipping to change at line 3069
if (dSafer.cep > d.cep) { if (dSafer.cep > d.cep) {
if (d.ceb <= MSS * d.cep) { % Same as (s <= MSS), but no DBZ if (d.ceb <= MSS * d.cep) { % Same as (s <= MSS), but no DBZ
sSafer = d.ceb/dSafer.cep sSafer = d.ceb/dSafer.cep
if (sSafer < MSS/SAFETY_FACTOR) if (sSafer < MSS/SAFETY_FACTOR)
dSafer.cep = d.cep % d.cep is a safe enough estimate dSafer.cep = d.cep % d.cep is a safe enough estimate
} % else } % else
% No need for else; dSafer.cep is already correct, % No need for else; dSafer.cep is already correct,
% because d.cep must have been too small % because d.cep must have been too small
} }
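The same check could be written as a small Python function. This is a sketch only: the function name is hypothetical, and the default SAFETY_FACTOR of 2 is an assumed example value.

```python
def refine_d_safer_cep(d_cep, d_safer_cep, d_ceb, mss, safety_factor=2):
    """Return the packet-count increment to use, per the pseudocode above.

    d_ceb is the increase in CE-marked bytes fed back by the option.
    """
    if d_safer_cep > d_cep:
        # Same test as (s <= MSS) with s = d_ceb / d_cep, rearranged to
        # avoid dividing by zero when d_cep == 0.
        if d_ceb <= mss * d_cep:
            # d_safer_cep > d_cep >= 0 here, so the division is safe.
            s_safer = d_ceb / d_safer_cep
            if s_safer < mss / safety_factor:
                # The larger count would imply implausibly small marked
                # segments, so d_cep is a safe enough estimate.
                return d_cep
        # Otherwise d_cep must have been too small; keep d_safer_cep.
    return d_safer_cep
```

For example, with MSS = 1000, d.cep = 2, dSafer.cep = 10 and d.ceb = 2000, the implied size is 200 bytes per segment, which is below MSS/2, so the sketch returns 2.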
The chart below shows when the above algorithm will consider d.cep The chart below shows when the above algorithm will replace
can replace dSafer.cep as a safe enough estimate of the number of CE- dSafer.cep with d.cep as a safe enough estimate of the number of CE
marked packets: marked packets:
^ ^
sSafer| sSafer|
| |
MSS+ MSS+
| |
| dSafer.cep | dSafer.cep
| is | is
MSS/SAFETY_FACTOR+--------------+ safest MSS/SAFETY_FACTOR+--------------+ safest
skipping to change at line 3113 skipping to change at line 3119
than below MSS/2. than below MSS/2.
If pure ACKs were allowed to be ECN-capable, missing ACKs would be If pure ACKs were allowed to be ECN-capable, missing ACKs would be
far less likely. However, because [RFC3168] currently precludes far less likely. However, because [RFC3168] currently precludes
this, the above algorithm assumes that pure ACKs are not ECN-capable. this, the above algorithm assumes that pure ACKs are not ECN-capable.
A.3. Example Algorithm to Estimate Marked Bytes from Marked Packets A.3. Example Algorithm to Estimate Marked Bytes from Marked Packets
If AccECN Options are not available, the Data Sender can only decode If AccECN Options are not available, the Data Sender can only decode
a CE marking from the ACE field in packets. Every time an ACK a CE marking from the ACE field in packets. Every time an ACK
arrives, to convert this into an estimate of CE-marked bytes, it arrives, to convert the number of CE markings into an estimate of CE-
needs an average of the segment size, s_ave. Then it can add or marked bytes, it needs an average of the segment size, s_ave. Then
subtract s_ave from the value of d.ceb as the value of d.cep it can add or subtract s_ave from the value of d.ceb as the value of
increments or decrements. Some possible ways to calculate s_ave are d.cep increments or decrements. Some possible ways to calculate
outlined below. The precise details will depend on why an estimate s_ave are outlined below. The precise details will depend on why an
of marked bytes is needed. estimate of marked bytes is needed.
The implementation could keep a record of the byte numbers of all the The implementation could keep a record of the byte numbers of all the
boundaries between packets in flight (including control packets), and boundaries between packets in flight (including control packets), and
recalculate s_ave on every ACK. However, it would be simpler to recalculate s_ave on every ACK. However, it would be simpler to
merely maintain a counter packets_in_flight for the number of packets merely maintain a counter packets_in_flight for the number of packets
in flight (including control packets), which is reset once per RTT. in flight (including control packets), which is reset once per RTT.
Either way, it would estimate s_ave as: Either way, it would estimate s_ave as:
s_ave ~= flightsize / packets_in_flight, s_ave ~= flightsize / packets_in_flight,
skipping to change at line 3179 skipping to change at line 3185
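The first s_ave estimate in A.3, and its use to adjust d.ceb, could be sketched in Python as follows. The function names, the None result for an empty flight, and the signed d_cep_change parameter are all assumptions made for illustration.

```python
def estimate_s_ave(flightsize, packets_in_flight):
    """s_ave ~= flightsize / packets_in_flight (bytes per packet)."""
    if packets_in_flight <= 0:
        # Assumption: no estimate is possible with nothing in flight.
        return None
    return flightsize / packets_in_flight


def adjust_d_ceb(d_ceb, d_cep_change, s_ave):
    """Add (or subtract) s_ave for each increment (or decrement) of d.cep."""
    return d_ceb + d_cep_change * s_ave
```

With 14600 bytes in flight across 10 packets, s_ave is 1460 bytes, so three newly reported CE marks would add 4380 bytes to d.ceb.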
B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake B.1. Three TCP Header Flags in the SYN-SYN/ACK Handshake
AccECN uses a rather unorthodox approach to negotiate the highest AccECN uses a rather unorthodox approach to negotiate the highest
version TCP ECN feedback scheme that both ends support, as justified version TCP ECN feedback scheme that both ends support, as justified
below. It follows from the original TCP ECN capability negotiation below. It follows from the original TCP ECN capability negotiation
[RFC3168], in which the Client set the 2 least significant of the [RFC3168], in which the Client set the 2 least significant of the
original reserved flags in the TCP header, and fell back to No ECN original reserved flags in the TCP header, and fell back to No ECN
support if the Server responded with the 2 flags cleared, which had support if the Server responded with the 2 flags cleared, which had
previously been the default. previously been the default.
Classic ECN used header flags rather than a TCP option because it was Classic ECN used header flags rather than a TCP Option because it was
considered more efficient to use a header flag for 1 bit of feedback considered more efficient to use a header flag for 1 bit of feedback
per ACK, and this bit could be overloaded to indicate support for per ACK, and this bit could be overloaded to indicate support for
Classic ECN during the handshake. During the development of ECN, 1 Classic ECN during the handshake. During the development of ECN, 1
bit crept up to 2, in order to deliver the feedback reliably and to bit crept up to 2, in order to deliver the feedback reliably and to
work round some broken hosts that reflected the reserved flags during work round some broken hosts that reflected the reserved flags during
the handshake. the handshake.
In order to be backward compatible with RFC 3168, AccECN continues In order to be backward compatible with RFC 3168, AccECN continues
this approach, using the 3rd least significant TCP header flag that this approach, using the 3rd least significant TCP header flag that
had previously been allocated for the ECN-nonce (now historic). had previously been allocated for the ECN-nonce (now historic).
Then, whatever form of Server an AccECN Client encounters, the Then, whatever form of Server an AccECN Client encounters, the
connection can fall back to the highest version of feedback protocol connection can fall back to the highest version of feedback protocol
that both ends support, as explained in Section 3.1. that both ends support, as explained in Section 3.1.
If AccECN capability negotiation had used the more orthodox approach If AccECN capability negotiation had used the more orthodox approach
of a TCP option, it would still have had to set the two ECN flags in of a TCP Option, it would still have had to set the two ECN flags in
the main TCP header, in order to be able to fall back to Classic ECN the main TCP header, in order to be able to fall back to Classic ECN
[RFC3168], or to disable ECN support, without another round of [RFC3168], or to disable ECN support, without another round of
negotiation. Then AccECN would also have had to handle all the negotiation. Then AccECN would also have had to handle all the
different ways that Servers currently respond to settings of the ECN different ways that Servers currently respond to settings of the ECN
flags in the main TCP header, including all of the conflicting cases flags in the main TCP header, including all of the conflicting cases
where a Server might have said it supported one approach in the flags where a Server might have said it supported one approach in the flags
and another approach in a new TCP option. And AccECN would have had and another approach in a new TCP Option. And AccECN would have had
to deal with all of the additional possibilities where a middlebox to deal with all of the additional possibilities where a middlebox
might have mangled the ECN flags, or removed TCP options. Thus, might have mangled the ECN flags, or removed TCP Options. Thus,
usage of the 3rd reserved TCP header flag simplified the protocol. usage of the 3rd reserved TCP header flag simplified the protocol.
The third flag was used in a way that could be distinguished from the The third flag was used in a way that could be distinguished from the
ECN-nonce, in case any nonce deployment was encountered. Previous ECN-nonce, in case any nonce deployment was encountered. Previous
usage of this flag for the ECN-nonce was integrated into the original usage of this flag for the ECN-nonce was integrated into the original
ECN negotiation. This further justified the third flag's use for ECN negotiation. This further justified the third flag's use for
AccECN, because a non-ECN usage of this flag would have had to use it AccECN, because a non-ECN usage of this flag would have had to use it
as a separate single bit, rather than in combination with the other 2 as a separate single bit, rather than in combination with the other 2
ECN flags. ECN flags.
skipping to change at line 3232 skipping to change at line 3238
indicate on the SYN/ACK, four already indicated earlier (or broken) indicate on the SYN/ACK, four already indicated earlier (or broken)
versions of ECN support, one now being Historic. In the early design versions of ECN support, one now being Historic. In the early design
of AccECN, an AccECN Server could use only 2 of the 4 remaining of AccECN, an AccECN Server could use only 2 of the 4 remaining
codepoints. They both indicated AccECN support, but one fed back codepoints. They both indicated AccECN support, but one fed back
that the SYN had arrived marked as CE. Even though ECN support on a that the SYN had arrived marked as CE. Even though ECN support on a
SYN is not yet on the Standards Track, the idea is for either end to SYN is not yet on the Standards Track, the idea is for either end to
act as a mechanistic reflector, so that future capabilities can be act as a mechanistic reflector, so that future capabilities can be
unilaterally deployed without requiring 2-ended deployment (justified unilaterally deployed without requiring 2-ended deployment (justified
in Section 2.5). in Section 2.5).
During traversal testing, it was discovered that the IP-ECN field in During traversal testing, it was discovered that the IP ECN field in
the SYN was mangled on a non-negligible proportion of paths. the SYN was mangled on a non-negligible proportion of paths.
Therefore, it was necessary to allow the SYN/ACK to feed all four IP- Therefore, it was necessary to allow the SYN/ACK to feed all four IP
ECN codepoints that the SYN could arrive with back to the Client. ECN codepoints that the SYN could arrive with back to the Client.
Without this, the Client could not know whether to disable ECN for Without this, the Client could not know whether to disable ECN for
the connection due to mangling of the IP-ECN field (also explained in the connection due to mangling of the IP ECN field (also explained in
Section 2.5). This development consumed the remaining two codepoints Section 2.5). This development consumed the remaining two codepoints
on the SYN/ACK that had been reserved for future use by AccECN in on the SYN/ACK that had been reserved for future use by AccECN in
earlier versions. earlier draft versions of this document.
B.3. Space for Future Evolution B.3. Space for Future Evolution
Despite availability of usable TCP header space being extremely Despite availability of usable TCP header space being extremely
scarce, the AccECN protocol has taken all possible steps to ensure scarce, the AccECN protocol has taken all possible steps to ensure
that there is space to negotiate possible future variants of the that there is space to negotiate possible future variants of the
protocol, either if a variant of AccECN is required, or if a protocol, either if a variant of AccECN is required, or if a
completely different ECN feedback approach is needed. completely different ECN feedback approach is needed.
Future AccECN variants: When the AccECN capability is negotiated Future AccECN variants: When the AccECN capability is negotiated
skipping to change at line 3300 skipping to change at line 3306
equivalent to AccECN negotiation with (1,1,1) on the SYN. These equivalent to AccECN negotiation with (1,1,1) on the SYN. These
codepoints would not allow fall-back to Classic ECN support for a codepoints would not allow fall-back to Classic ECN support for a
Server that did not understand them, but this approach ensures Server that did not understand them, but this approach ensures
they are available in the future, perhaps for uses other than ECN they are available in the future, perhaps for uses other than ECN
alongside the AccECN scheme. All possible combinations of SYN/ACK alongside the AccECN scheme. All possible combinations of SYN/ACK
could be used in response except either (0,0,0) or reflection of could be used in response except either (0,0,0) or reflection of
the same values sent on the SYN. the same values sent on the SYN.
In order to extend AccECN or ECN in the future, other ways could In order to extend AccECN or ECN in the future, other ways could
be resorted to, although their traversal properties are likely to be resorted to, although their traversal properties are likely to
be inferior. They include a new TCP option; using the remaining be inferior. They include a new TCP Option; using the remaining
reserved flags in the main TCP header (preferably extending the reserved flags in the main TCP header (preferably extending the
3-bit combinations used by AccECN to 4-bit combinations, rather 3-bit combinations used by AccECN to 4-bit combinations, rather
than burning one bit for just one state); a non-zero urgent than burning one bit for just one state); a non-zero urgent
pointer in combination with the URG flag cleared; or some other pointer in combination with the URG flag cleared; or some other
unexpected combination of fields yet to be invented. unexpected combination of fields yet to be invented.
Acknowledgements Acknowledgements
We want to thank Koen De Schepper, Praveen Balasubramanian, Michael We want to thank Koen De Schepper, Praveen Balasubramanian, Michael
Welzl, Gorry Fairhurst, David Black, Spencer Dawkins, Michael Scharf, Welzl, Gorry Fairhurst, David Black, Spencer Dawkins, Michael Scharf,
 End of changes. 149 change blocks. 
234 lines changed or deleted 240 lines changed or added
