rfc9840xml2.original.xml | rfc9840.xml | |||
---|---|---|---|---|
<?xml version="1.0" encoding="US-ASCII"?> | <?xml version='1.0' encoding='UTF-8'?> | |||
<!-- edited with XMLSPY v5 rel. 3 U (http://www.xmlspy.com) | ||||
by Daniel M Kohn (private) --> | ||||
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [ | ||||
<!DOCTYPE rfc [ | ||||
<!ENTITY nbsp "&#160;"> | ||||
<!ENTITY zwsp "&#8203;"> | ||||
<!ENTITY nbhy "&#8209;"> | ||||
<!ENTITY wj "&#8288;"> | ||||
]> | ]> | |||
<rfc category="exp" docName="draft-irtf-iccrg-rledbat-10" | ||||
ipr="trust200902"> | ||||
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> | ||||
<?rfc toc="yes" ?> | ||||
<?rfc symrefs="yes" ?> | <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="exp" docName="draft-ir | |||
tf-iccrg-rledbat-10" number="9840" consensus="true" ipr="trust200902" obsoletes= | ||||
<?rfc sortrefs="yes"?> | "" updates="" submissionType="IRTF" xml:lang="en" tocInclude="true" symRefs="tru | |||
e" sortRefs="true" version="3"> | ||||
<?rfc iprnotified="no" ?> | ||||
<?rfc strict="yes" ?> | ||||
<front> | <front> | |||
<title abbrev="rLEDBAT">rLEDBAT: receiver-driven Low Extra Delay Background | <title abbrev="rLEDBAT">rLEDBAT: Receiver-Driven Low Extra Delay Background | |||
Transport for TCP | Transport for TCP</title> | |||
</title> | <seriesInfo name="RFC" value="9840"/> | |||
<author fullname="Marcelo Bagnulo" initials="M." surname="Bagnulo"> | <author fullname="Marcelo Bagnulo" initials="M." surname="Bagnulo"> | |||
<organization>Universidad Carlos III de Madrid</organization> | <organization>Universidad Carlos III de Madrid</organization> | |||
<address> | <address> | |||
<email>marcelo@it.uc3m.es</email> | <email>marcelo@it.uc3m.es</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author fullname="Alberto Garcia-Martinez" initials="A." surname="Garcia-Martinez"> | <author fullname="Alberto Garcia-Martinez" initials="A." surname="Garcia-Martinez"> | |||
<organization>Universidad Carlos III de Madrid</organization> | <organization>Universidad Carlos III de Madrid</organization> | |||
<address> | <address> | |||
<email>alberto@it.uc3m.es</email> | <email>alberto@it.uc3m.es</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<!-- [rfced] Alberto, would you prefer that we use accented letters | ||||
in your name in this and subsequent RFCs? We ask because we see | ||||
"García-Martínez" in [COMNET1], [COMNET2], and [COMNET3]. We are | ||||
fine either way, but we ask because some authors prefer that the | ||||
accents be used. If you prefer that we use the accented letters | ||||
going forward, we will note your preference for future reference. | ||||
Original: | ||||
A. Garcia-Martinez | ||||
... | ||||
Alberto Garcia-Martinez --> | ||||
<author fullname="Gabriel Montenegro" initials="G." surname="Montenegro"> | <author fullname="Gabriel Montenegro" initials="G." surname="Montenegro"> | |||
<address> | <address> | |||
<email>g.e.montenegro@hotmail.com</email> | <email>g.e.montenegro@hotmail.com</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author fullname="Praveen Balasubramanian " initials="P." surname="Balasubramanian"> | <author fullname="Praveen Balasubramanian " initials="P." surname="Balasubramanian"> | |||
<organization>Confluent</organization> | <organization>Confluent</organization> | |||
<address> | <address> | |||
<email>pravb.ietf@gmail.com</email> | <email>pravb.ietf@gmail.com</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<date year="2025" month="August"/> | ||||
<date year="2025" /> | <workgroup>Internet Congestion Control</workgroup> | |||
<!-- [rfced] Please insert any keywords (beyond those that appear in the | ||||
title) for use on <https://www.rfc-editor.org/search>. --> | ||||
<!-- [rfced] Please ensure that the guidelines listed in Section 2.1 | ||||
of RFC 5743 have been adhered to in this document. See | ||||
<https://www.rfc-editor.org/rfc/rfc5743.html#section-2.1>. --> | ||||
<abstract> | <abstract> | |||
<t> This document specifies rLEDBAT, a set of mechanisms that enable the execution of a less-than-best-effort congestion control algorithm for TCP at the receiver end. This document is a product of the Internet Congestion Control Research Group (ICCRG) of the Internet Research Task Force (IRTF). | <t> This document specifies receiver-driven Low Extra Delay Background Transport (rLEDBAT) -- a set of mechanisms that enable the execution of a less-than-best-effort congestion control algorithm for TCP at the receiver end. This document is a product of the Internet Congestion Control Research Group (ICCRG) of the Internet Research Task Force (IRTF). | |||
</t> | </t> | |||
</abstract> | </abstract> | |||
</front> | </front> | |||
<middle> | <middle> | |||
<section title="Introduction"> | <section numbered="true" toc="default"> | |||
<name>Introduction</name> | ||||
<t>LEDBAT (Low Extra Delay Background Transport) <xref target="RFC6817" fo | ||||
rmat="default"/> is a congestion control algorithm used for less-than-best-effor | ||||
t (LBE) traffic.</t> | ||||
<t>When LEDBAT traffic shares a bottleneck with other traffic using standa | ||||
rd congestion control algorithms (for example, TCP traffic using CUBIC <xref tar | ||||
get="RFC9438" format="default"/>, hereafter referred to as "standard-TCP" for sh | ||||
ort), it reduces its sending rate earlier and more aggressively than standard-TC | ||||
P congestion control, allowing other non-background traffic to use more of the a | ||||
vailable capacity. In the absence of competing traffic, LEDBAT aims to make effi | ||||
cient use of the available capacity, while keeping the queuing delay within pred | ||||
efined bounds. | ||||
<t>LEDBAT (Low Extra Delay Background Transport) <xref target="RFC6817" / | <!-- [rfced] Section 1: Is there a distinction between | |||
> is a congestion-control algorithm used for less-than-best-effort (LBE) traffic | "standard-TCP" and "standard TCP" (e.g., "standard TCP sender", | |||
.</t> | "standard-TCP flow") as used in this document, or do they mean the | |||
same thing? We ask because we see "hereafter referred (to) as | ||||
standard-TCP for short" in the second paragraph of Section 1. | ||||
If "standard-TCP" and "standard TCP" mean the same thing, we suggest | ||||
removing the hyphen*. | ||||
<t>When LEDBAT traffic shares a bottleneck with other traffic using stand | * Please note that we also see "standard TCP" but not "standard-TCP" | |||
ard congestion control algorithms (for example, TCP traffic using Cubic<xref tar | in RFC 6817, and the only published RFC to date that uses | |||
get="RFC9438" />, hereafter referred as standard-TCP for short), it reduces its | "standard-TCP" appears to be RFC 1687 ("A Large Corporate User's View | |||
sending rate earlier and more aggressively than standard-TCP congestion control, | of IPng"), published in August 1994. | |||
allowing other non-background traffic to use more of the available capacity. In | ||||
the absence of competing traffic, LEDBAT aims to make an efficient use of the a | ||||
vailable capacity, while keeping the queuing delay within predefined bounds.</t> | ||||
<t>LEDBAT reacts both to packet loss and to variations in delay. With re | Original: | |||
spect to packet loss, LEDBAT reacts with a multiplicative decrease, similar to m | When LEDBAT traffic shares a bottleneck with other traffic using | |||
ost TCP congestion controllers. Regarding delay, LEDBAT aims for a target queuei | standard congestion control algorithms (for example, TCP traffic | |||
ng delay. When the measured current queueing delay is below the target, LEDBAT i | using Cubic[RFC9438], hereafter referred as standard-TCP for short), | |||
ncreases the sending rate and when the delay is above the target, it reduces the | it reduces its sending rate earlier and more aggressively than | |||
sending rate. LEDBAT estimates the queuing delay by subtracting the measured cu | standard-TCP congestion control, allowing other non-background | |||
rrent one-way delay from the estimated base one-way delay (i.e. the one-way dela | traffic to use more of the available capacity. | |||
y in the absence of queues). </t> | ... | |||
rLEDBAT assumes that the sender is a standard TCP sender. | ||||
... | ||||
This guarantees | ||||
that the rLEDBAT flow will never transmit more aggressively than a | ||||
standard-TCP flow, as the sender's congestion window limits the | ||||
sending rate. --> | ||||
<t>The LEDBAT specification <xref target="RFC6817" /> defines the LEDBAT | </t> | |||
congestion-control algorithm, implemented in the sender to control its sending r | <t>LEDBAT reacts to both packet loss and variations in delay. With respec | |||
ate. LEDBAT is specified in a protocol and layer agnostic manner.</t> | t to packet loss, LEDBAT reacts with a multiplicative decrease, similar to most | |||
TCP congestion controllers. Regarding delay, LEDBAT aims for a target queuing de | ||||
lay. When the measured current queuing delay is below the target, LEDBAT increas | ||||
es the sending rate, and when the delay is above the target, it reduces the send | ||||
ing rate. LEDBAT estimates the queuing delay by subtracting the measured current | ||||
one-way delay from the estimated base one-way delay (i.e., the one-way delay in | ||||
the absence of queues). </t> | ||||
<t>The LEDBAT specification <xref target="RFC6817" format="default"/> defi | ||||
nes the LEDBAT congestion control algorithm, implemented in the sender to contro | ||||
l its sending rate. LEDBAT is specified in a protocol-agnostic and layer-agnosti | ||||
c manner.</t> | ||||
<t>LEDBAT++ <xref target="I-D.irtf-iccrg-ledbat-plus-plus" format="default | ||||
"/> is also an LBE congestion control algorithm that is inspired by LEDBAT while | ||||
addressing several problems identified with the original LEDBAT specification. | ||||
In particular, the differences between LEDBAT and LEDBAT++ include the following | ||||
:</t> | ||||
<t>LEDBAT++ <xref target="I-D.irtf-iccrg-ledbat-plus-plus" /> is also an | <ol spacing="normal" type="%i)"> | |||
LBE congestion control algorithm which is inspired by LEDBAT while addressing se | <li>LEDBAT++ uses the round-trip time (RTT) (as opposed to the one-way delay us | |||
veral problems identified with the original LEDBAT specification. In particular | ed in LEDBAT) to estimate the queuing delay.</li> | |||
the differences between LEDBAT and LEDBAT++ include: i) LEDBAT++ uses the round- | <li>LEDBAT++ uses an additive increase/multiplicative decrease algorithm to ach | |||
trip-time (RTT) (as opposed to the one way delay used in LEDBAT) to estimate the | ieve inter-LEDBAT++ fairness and avoid the latecomer advantage observed in LEDBA | |||
queuing delay; ii) LEDBAT++ uses an Additive Increase/Multiplicative Decrease a | T.</li> | |||
lgorithm to achieve inter-LEDBAT++ fairness and avoid the late-comer advantage o | <li>LEDBAT++ performs periodic slowdowns to improve the measurement of the base | |||
bserved in LEDBAT; iii) LEDBAT++ performs periodic slowdowns to improve the meas | delay.</li> | |||
urement of the base delay; iv) LEDBAT++ is defined for TCP.</t> | <li>LEDBAT++ is defined for TCP.</li> | |||
</ol> | ||||
<t>In this specification, we describe receiver-driven Low Extra Delay Back | ||||
ground Transport (rLEDBAT) -- a set of mechanisms that enable the execution of a | ||||
n LBE delay-based congestion control algorithm such as LEDBAT or LEDBAT++ at the | ||||
receiver end of a TCP connection.</t> | ||||
<t> The consensus of the Internet Congestion Control Research Group (ICCRG | ||||
) is to publish this document to encourage further experimentation and review of | ||||
rLEDBAT. This document is not an IETF product and is not an Internet Standards | ||||
Track specification. The status of this document is Experimental. In <xref targe | ||||
t="sect-5" format="default"/> ("<xref target="sect-5" format="title"/>"), we des | ||||
cribe the purpose of the experiment and its current status. </t> | ||||
</section> | ||||
<t>In this specification, we describe rLEDBAT, a set of mechanisms that e | <section numbered="true" toc="default"> | |||
nable the execution of an LBE delay-based congestion control algorithm such as L | <name>Conventions and Terminology</name> | |||
EDBAT or LEDBAT++ at the receiver end of a TCP connection.</t> | <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", | |||
"<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", | ||||
"<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", | ||||
"<bcp14>SHOULD NOT</bcp14>", | ||||
"<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", | ||||
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document | ||||
are to be interpreted as described in BCP 14 | ||||
<xref target="RFC2119"/> <xref target="RFC8174"/> when, and only | ||||
when, they appear in all capitals, as shown here.</t> | ||||
<t>We use the following abbreviations throughout the text and include them | ||||
here for the reader's convenience:</t> | ||||
<dl spacing="normal" newline="false"> | ||||
<dt>RCV.WND:</dt><dd>The value included in the Receive Window field of | ||||
the TCP header (the computation of which is modified by this | ||||
specification). | ||||
<t> The consensus of the Internet Congestion Control Research Group (ICCRG) is to publish this document to encourage further experimentation and review of rLEDBAT. This document is not an IETF product and is not a standard. The status of this document is experimental. In section 4 titled Experiment Considerations, we describe the purpose of the experiment and its current status. </t> | <!-- [rfced] Appendix A (moved to Section 2, as noted below): | |||
</section> | a) Please note the following: | |||
<section title="Motivations for rLEDBAT"> | * Because we found "RFC 2119 key words" (e.g., "MUST", "SHOULD") in | |||
this document, per our standard process we added the appropriate | ||||
boilerplate text and Normative Reference listings. | ||||
<t>rLEDBAT enables new use cases and new deployment models, fostering the | * We moved the contents of Appendix A to a new Section 2, so that | |||
use of LBE traffic. The following scenarios are enabled by rLEDBAT: | readers can read the definitions of the terms before they are used | |||
<list> | in this document (e.g., "RCV.WND" in Section 4.1). | |||
<t>Content Delivery Networks and more sophisticated file | ||||
distribution scenarios: Consider the case where the source of a file to be distr | ||||
ibuted (e.g., a software developer that wishes to distribute a software update) | ||||
would prefer to use LBE and it enables LEDBAT/LEDBAT++ in the servers containing | ||||
the source file. However, because the file is being distributed through a CDN t | ||||
hat does not implement LBE congestion control, the result is that the file trans | ||||
fers originated from CDN surrogates will not be using LBE. Interestingly enough, | ||||
in the case of the software update, the developer may also control the software | ||||
performing the download in the client, the receiver of the file, but because cu | ||||
rrent LEDBAT/LEDBAT++ are sender-based algorithms, controlling the client is not | ||||
enough to enable LBE congestion control in the communication. rLEDBAT would ena | ||||
ble the use of LBE traffic class for file distribution in this setup. </t> | ||||
<t>Interference from proxies and other middleboxes: Proxies and o | ||||
ther middleboxes are commonplace in the Internet. For instance, in the case of m | ||||
obile networks, proxies are frequently used. In the case of enterprise networks, | ||||
it is common to deploy corporate proxies for filtering and firewalling. In the | ||||
case of satellite links, Performance Enhancement Proxies (PEPs) are deployed to | ||||
mitigate the effect of the long delay in TCP connection. These proxies terminate | ||||
the TCP connection on both ends and prevent the use of LBE congestion control | ||||
in the segment between the proxy and the sink of the content, the client. By ena | ||||
bling rLEDBAT, clients would be able to enable LBE traffic between them and the | ||||
proxy.</t> | ||||
<t>Receiver-defined preferences. It is frequent that the bottlene | ||||
ck of the communication is the access link. This is particularly true in the cas | ||||
e of mobile devices. It is then especially relevant for mobile devices to proper | ||||
ly manage the capacity of the access link. With current technologies, it is poss | ||||
ible for the mobile device to use different congestion control algorithms expres | ||||
sing different preferences for the traffic. For instance, a device can choose to | ||||
use standard-TCP for some traffic and to use LEDBAT/LEDBAT++ for other traffic. | ||||
However, this would only affect the outgoing traffic since both standard-TCP an | ||||
d LEDBAT/LEDBAT++ are sender-driven. The mobile device has no means to manage th | ||||
e traffic in the down-link, which is in most cases, the communication bottleneck | ||||
for a typical eye-ball end-user. rLEDBAT enables the mobile device to selective | ||||
ly use LBE traffic class for some of the incoming traffic. For instance, by usin | ||||
g rLEDBAT, a user can use regular standard-TCP/UDP for video stream (e.g., Youtu | ||||
be) and use rLEDBAT for other background file download.</t> | ||||
</list></t> | ||||
</section> | b) We had trouble following the meaning of "(which computation is | |||
modified by this specification)". Does "which computation" mean | ||||
"the computation of which", and does "this specification" refer to | ||||
this document or the specification of the value? If the suggested | ||||
text is not correct, please clarify. | ||||
<section title="rLEDBAT mechanisms"> | Original: | |||
RCV.WND: the value included in the Receive Window field of the TCP | ||||
header (which computation is modified by this specification) | ||||
<t>rLEDBAT provides the mechanisms to implement an LBE congestion control | Suggested: | |||
algorithm at the receiver-end of a TCP connection. The rLEDBAT receiver control | RCV.WND: The value included in the Receive Window field of the TCP | |||
s the sender's rate through the Receive Window announced by the receiver in the | header (the computation of which is modified by its specification). --> | |||
TCP header.</t> | ||||
<t>rLEDBAT assumes that the sender is a standard TCP sender. rLEDBAT does | </dd> | |||
not require any rLEDBAT-specific modifications to the TCP sender. The envisione | <dt>SND.WND:</dt><dd>The TCP sender's window.</dd> | |||
d deployment model for rLEDBAT is that the clients implement rLEDBAT and this en | <dt>cwnd:</dt><dd>The congestion window as computed by the congestion | |||
ables rLEDBAT in communications with existent standard TCP senders. In particul | control algorithm running at the TCP sender.</dd> | |||
ar, the sender MUST implement <xref target="RFC9293" /> and it also MUST impleme | <dt>RLWND:</dt><dd>The window value calculated by the rLEDBAT algorithm. | |||
nt the Time Stamp Option as defined in <xref target="RFC7323" />. Also, the send | </dd> | |||
er should implement some of the standard congestion control mechanisms, such as | <dt>fcwnd:</dt><dd>The value that a standard RFC793bis TCP receiver | |||
Cubic <xref target="RFC9438" /> or New Reno <xref target="RFC5681" />. </t> | calculates to set in the receive window for flow control | |||
purposes.</dd> | ||||
<dt>RCV.HGH:</dt><dd>The highest sequence number corresponding to a | ||||
received byte of data at one point in time.</dd> | ||||
<dt>TSV.HGH:</dt><dd>The Timestamp Value (TSval) <xref target="RFC7323" | ||||
format="default"/> corresponding to the | ||||
segment in which RCV.HGH was carried at that point in time.</dd> | ||||
<dt>SEG.SEQ:</dt><dd>The sequence number of the last received segment.</ | ||||
dd> | ||||
<dt>TSV.SEQ:</dt><dd>The TSval value of the last received segment.</dd> | ||||
</dl> | ||||
<t>rLEDBAT does not define a new congestion control algorithm. The LBE co | <!-- [rfced] Appendix A and Section 3.1: Regarding "RFC793bis (TCP) | |||
ngestion control algorithm executed in the rLEDBAT receiver is defined in other | receiver": Should RFC 9293 ("Transmission Control Protocol (TCP)"), | |||
documents. The rLEDBAT receiver MUST use an LBE congestion control algorithm. Be | which obsoletes RFC 793, be cited in the text as suggested below? | |||
cause rLEDBAT assumes a standard TCP sender, the sender will be using a "best ef | ||||
fort" congestion control algorithm (such as Cubic or New Reno). Since rLEDBAT us | ||||
es the Receive Window to control the sender's rate and the sender calculates the | ||||
sender's window as the minimum of the Receive window and the congestion window, | ||||
rLEDBAT will only be effective as long as the congestion control algorithm exec | ||||
uted in the receiver yields a smaller window than the one calculated by the send | ||||
er. This is normally the case when the receiver is using an LBE congestion contr | ||||
ol algorithm. The rLEDBAT receiver SHOULD use the LEDBAT congestion control algo | ||||
rithm <xref target="RFC6817" /> or the LEDBAT++ congestion control algorithm <xr | ||||
ef target="I-D.irtf-iccrg-ledbat-plus-plus" />. The rLEDBAT MAY use other LBE co | ||||
ngestion control algorithms defined elsewhere. Irrespective of which congestion | ||||
control algorithm is executed in the receiver, an rLEDBAT connection will never | ||||
be more aggressive than standard-TCP since it is always bounded by the congestio | ||||
n control algorithm executed at the sender.</t> | ||||
<t>rLEDBAT is essentially composed of three types of mechanisms, namely, | Original: | |||
those that provide the means to measure the packet delay (either the round trip | fcwnd: the value that a standard RFC793bis TCP receiver calculates | |||
time or the one way delay, depending on the selected algorithm), mechanisms to d | to set in the receive window for flow control purposes. | |||
etect packet loss and the means to manipulate the Receive Window to control the | ... | |||
sender's rate. The former provide input to the LBE congestion control algorithm | In order to avoid confusion, we | |||
while the latter uses the congestion window computed by the LBE congestion contr | will call fcwnd the value that a standard RFC793bis TCP receiver | |||
ol algorithm to manipulate the Receive window, as depicted in the figure.</t> | calculates to set in the receive window for flow control purposes. | |||
We call RLWND the window value calculated by rLEDBAT algorithm and we | ||||
call RCV.WND the value actually included in the Receive Window field | ||||
of the TCP header. For a RFC793bis receiver, RCV.WND == fcwnd. | ||||
<figure title="The rLEDBAT architecture."> | Suggested: | |||
<artwork align="center"><![CDATA[ | fcwnd: The value that a standard TCP receiver compliant with | |||
[RFC9293] calculates to set in the receive window for flow | ||||
control purposes. | ||||
... | ||||
In order to avoid confusion, we will call | ||||
fcwnd the value that a standard TCP receiver compliant with | ||||
[RFC9293] calculates to set in the receive window for flow control | ||||
purposes. We call RLWND the window value calculated by the rLEDBAT | ||||
algorithm, and we call RCV.WND the value actually included in the | ||||
Receive Window field of the TCP header. For a receiver compliant | ||||
with [RFC9293], RCV.WND == fcwnd. --> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>Motivations for rLEDBAT</name> | ||||
<t>rLEDBAT enables new use cases and new deployment models, fostering the | ||||
use of LBE traffic. The following scenarios are enabled by rLEDBAT: | ||||
</t> | ||||
<dl spacing="normal" newline="true"> | ||||
<dt>Content Delivery Networks (CDNs) and more sophisticated file distrib | ||||
ution scenarios:</dt> | ||||
<dd>Consider the case where the source of a file to be distributed (e.g., a soft | ||||
ware developer that wishes to distribute a software update) would prefer to use | ||||
LBE and enables LEDBAT/LEDBAT++ in the servers containing the source file. Howev | ||||
er, because the file is being distributed through a CDN that does not implement | ||||
LBE congestion control, the result is that the file transfers originated from CD | ||||
N surrogates will not be using LBE. Interestingly enough, in the case of the sof | ||||
tware update, the developer may also control the software performing the downloa | ||||
d in the client (the receiver of the file), but because current LEDBAT/LEDBAT++ | ||||
are sender-based algorithms, controlling the client is not enough to enable LBE | ||||
congestion control in the communication. rLEDBAT would enable the use of a | ||||
n LBE traffic class for file distribution in this setup.</dd> | ||||
<dt>Interference from proxies and other middleboxes:</dt> | ||||
<dd>Proxies and other middleboxes are commonplace in the Internet. For instance, | ||||
in the case of mobile networks, proxies are frequently used. In the case of ent | ||||
erprise networks, it is common to deploy corporate proxies for filtering and fir | ||||
ewalling. In the case of satellite links, Performance Enhancing Proxies (PEPs) a | ||||
re deployed to mitigate the effect of long delays in a TCP connection. These pro | ||||
xies terminate the TCP connection on both ends and prevent the use of LBE conges | ||||
tion control in the segment between the proxy and the sink of the content, the c | ||||
lient. By enabling rLEDBAT, clients can then enable LBE traffic between them and | ||||
the proxy.</dd> | ||||
<dt>Receiver-defined preferences:</dt> | ||||
<dd>Frequently, the access link is the communication bottleneck. This is particu | ||||
larly true in the case of mobile devices. It is then especially relevant for mob | ||||
ile devices to properly manage the capacity of the access link. With current tec | ||||
hnologies, it is possible for the mobile device to use different congestion cont | ||||
rol algorithms expressing different preferences for the traffic. For instance, a | ||||
device can choose to use standard-TCP for some traffic and use LEDBAT/LEDBAT++ | ||||
for other traffic. However, this would only affect the outgoing traffic, since b | ||||
oth standard-TCP and LEDBAT/LEDBAT++ are driven by the sender. The mobile device | ||||
has no means to manage the traffic in the downlink, which is, in most cases, th | ||||
e communication bottleneck for a typical "eyeball" end user. rLEDBAT enabl | ||||
es the mobile device to selectively use an LBE traffic class for some of the inc | ||||
oming traffic. For instance, by using rLEDBAT, a user can use regular standard-T | ||||
CP/UDP for a video stream (e.g., YouTube) and use rLEDBAT for other background f | ||||
ile downloads.</dd> | ||||
</dl> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>rLEDBAT Mechanisms</name> | ||||
<t>rLEDBAT provides the mechanisms to implement an LBE congestion control | ||||
algorithm at the receiver end of a TCP connection. The rLEDBAT receiver controls | ||||
the sender's rate through the Receive Window announced by the receiver in the T | ||||
CP header.</t> | ||||
<t>rLEDBAT assumes that the sender is a standard TCP sender. rLEDBAT | ||||
does not require any rLEDBAT-specific modifications to the TCP sender. The envi | ||||
sioned deployment model for rLEDBAT is that the clients implement rLEDBAT and th | ||||
is enables rLEDBAT in communications with existing standard TCP senders. In par | ||||
ticular, the sender <bcp14>MUST</bcp14> implement <xref target="RFC9293" format= | ||||
"default"/> and also <bcp14>MUST</bcp14> implement the TCP Timestamps (TS) optio | ||||
n as defined in <xref target="RFC7323" format="default"/>. Also, the sender shou | ||||
ld implement some of the standard congestion control mechanisms, such as CUBIC < | ||||
xref target="RFC9438" format="default"/> or NewReno <xref target="RFC5681" forma | ||||
t="default"/>. | ||||
<!-- [rfced] Sections 3, 3.2.1, and 3.2.2: | ||||
a) We changed "Time Stamp Option", "Time Stamp (TS) option", and | ||||
"TimeStamp option" to "TCP Timestamps option" or "TS option", per | ||||
RFC 7323 and "TS option generation rules [RFC7323]" used elsewhere in | ||||
this document. Please let us know any concerns. | ||||
Original: | ||||
In particular, the sender MUST | ||||
implement [RFC9293] and it also MUST implement the Time Stamp Option | ||||
as defined in [RFC7323]. | ||||
... | ||||
In order to measure RTT, the rLEDBAT client MUST enable the Time | ||||
Stamp (TS) option [RFC7323]. | ||||
... | ||||
In the case of TCP, the receiver can use the TimeStamp option to | ||||
measure the one way delay by subtracting the timestamp contained in | ||||
the incoming packet from the local time at which the packet has | ||||
arrived. | ||||
Currently: | ||||
In particular, the sender MUST | ||||
implement [RFC9293] and also MUST implement the TCP Timestamps (TS) | ||||
option as defined in [RFC7323]. | ||||
... | ||||
In order to measure RTT, the rLEDBAT client MUST enable the TS | ||||
option [RFC7323]. | ||||
... | ||||
In the case of TCP, the receiver can use the TS option to measure the | ||||
one-way delay by subtracting the timestamp contained in the incoming | ||||
packet from the local time at which the packet has arrived. | ||||
b) We do not see "New Reno", "NewReno", or "Reno" mentioned anywhere | ||||
in RFC 5681. May we also cite RFC 6582 ("The NewReno Modification to | ||||
TCP's Fast Recovery Algorithm"), which obsoletes RFC 3782 (which we | ||||
see mentioned in RFC 5681), for ease of the reader? | ||||
Original: | ||||
Also, the sender should implement some of | ||||
the standard congestion control mechanisms, such as Cubic [RFC9438] | ||||
or New Reno [RFC5681]. | ||||
Suggested: | ||||
Also, the sender should implement | ||||
some of the standard congestion control mechanisms, such as CUBIC | ||||
[RFC9438] or NewReno [RFC5681] [RFC6582]. | ||||
... | ||||
[RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | ||||
NewReno Modification to TCP's Fast Recovery Algorithm", | ||||
RFC 6582, DOI 10.17487/RFC6582, April 2012, | ||||
<https://www.rfc-editor.org/info/rfc6582>. --> | ||||
</t> | ||||
<t>rLEDBAT does not define a new congestion control algorithm. The LBE con | ||||
gestion control algorithm executed in the rLEDBAT receiver is defined in other d | ||||
ocuments. The rLEDBAT receiver <bcp14>MUST</bcp14> use an LBE congestion control | ||||
algorithm. Because rLEDBAT assumes a standard TCP sender, the sender will be us | ||||
ing a "best effort" congestion control algorithm (such as CUBIC or NewReno). Sin | ||||
ce rLEDBAT uses the Receive Window to control the sender's rate and the sender c | ||||
alculates the sender's window as the minimum of the Receive window and the conge | ||||
stion window, rLEDBAT will only be effective as long as the congestion control a | ||||
lgorithm executed in the receiver yields a smaller window than the one calculate | ||||
d by the sender. This is normally the case when the receiver is using an LBE con | ||||
gestion control algorithm. The rLEDBAT receiver <bcp14>SHOULD</bcp14> use the LE | ||||
DBAT congestion control algorithm <xref target="RFC6817" format="default"/> or t | ||||
he LEDBAT++ congestion control algorithm <xref target="I-D.irtf-iccrg-ledbat-plu | ||||
s-plus" format="default"/>. The rLEDBAT <bcp14>MAY</bcp14> use other LBE congest | ||||
ion control algorithms defined elsewhere. Irrespective of which congestion contr | ||||
ol algorithm is executed in the receiver, an rLEDBAT connection will never be mo | ||||
re aggressive than standard-TCP, since it is always bounded by the congestion co | ||||
ntrol algorithm executed at the sender. | ||||
<!-- [rfced] Section 3: | ||||
a) Will "other documents" be clear to readers? Should one or more | ||||
specific documents be cited here? | ||||
Original: | ||||
The LBE | ||||
congestion control algorithm executed in the rLEDBAT receiver is | ||||
defined in other documents. | ||||
b) Does "The rLEDBAT MAY use other LBE congestion control algorithms | ||||
defined elsewhere" mean "The rLEDBAT receiver MAY use other LBE | ||||
congestion control algorithms defined elsewhere" or something else? | ||||
We ask because we see "the rLEDBAT node", "the rLEDBAT receiver", | ||||
"the rLEDBAT host", etc. | ||||
We have the same question re. "the rLEDBAT in host A" | ||||
(Section 3.2.1.1) and "How the rLEDBAT should resume" (Section 4). | ||||
Original: | ||||
The rLEDBAT MAY | ||||
use other LBE congestion control algorithms defined elsewhere. | ||||
... | ||||
This limitation of the sender's window can come either from the TCP | ||||
congestion window in host B or from the announced receive window from | ||||
the rLEDBAT in host A. | ||||
... | ||||
- How the rLEDBAT should resume after a period during which there | ||||
was no incoming traffic and the information about the rLEDBAT | ||||
state information is potentially dated. --> | ||||
</t> | ||||
      <t>rLEDBAT is essentially composed of three types of mechanisms: those | ||||
that measure the packet delay (either the RTT or the one-way delay, depending | ||||
on the selected algorithm), those that detect packet loss, and those that | ||||
manipulate the Receive Window to control the sender's rate. The first two | ||||
provide input to the LBE congestion control algorithm, while the third uses | ||||
the congestion window computed by the LBE congestion control algorithm to | ||||
manipulate the Receive Window, as depicted in <xref target="fig1"/>.</t> | ||||
<figure anchor="fig1"> | ||||
<name>The rLEDBAT Architecture</name> | ||||
<artwork align="center" name="" type="" alt=""><![CDATA[ | ||||
+------------------------------------------+ | +------------------------------------------+ | |||
| TCP receiver | | | TCP Receiver | | |||
| +-----------------+ | | | +-----------------+ | | |||
| | +------------+ | | | | | +------------+ | | | |||
| +---------------------| RTT | | | | | +---------------------| RTT | | | | |||
| | | | Estimation | | | | | | | | Estimation | | | | |||
| | | +------------+ | | | | | | +------------+ | | | |||
| | | | | | | | | | | | |||
| | | +------------+ | | | | | | +------------+ | | | |||
| | +--------------| Loss, RTX | | | | | | +--------------| Loss, RTX | | | | |||
| | | | | Detection | | | | | | | | | Detection | | | | |||
| | | | +------------+ | | | | | | | +------------+ | | | |||
| v v | | | | | v v | | | | |||
| +----------------+ | | | | | +----------------+ | | | | |||
| | LBE Congestion | | rLEDBAT | | | | | LBE Congestion | | rLEDBAT | | | |||
| | Control | | | | | | | Control | | | | | |||
| +----------------+ | | | | | +----------------+ | | | | |||
| | | +------------+ | | | | | | +------------+ | | | |||
| | | | RCV-WND | | | | | | | | RCV.WND | | | | |||
| +---------------->| Control | | | | | +---------------->| Control | | | | |||
| | +------------+ | | | | | +------------+ | | | |||
| +-----------------+ | | | +-----------------+ | | |||
+------------------------------------------+ | +------------------------------------------+ | |||
]]></artwork> | ]]></artwork> | |||
</figure> | ||||
<t>We next describe each of the rLEDBAT components.</t> | ||||
<section numbered="true" toc="default"> | ||||
<name>Controlling the Receive Window</name> | ||||
        <t>rLEDBAT uses the TCP Receive Window (RCV.WND) to enable the receiver | ||||
to control the sender's rate. <xref target="RFC9293" format="default"/> | ||||
specifies that the RCV.WND is used to announce the available receive buffer to | ||||
the sender for flow control purposes. In order to avoid confusion, we will | ||||
call fcwnd the value that a standard TCP receiver | ||||
<xref target="RFC9293" format="default"/> calculates to set in the receive | ||||
window for flow control purposes. We call RLWND the window value calculated by | ||||
the rLEDBAT algorithm, and we call RCV.WND the value actually included in the | ||||
Receive Window field of the TCP header. For a standard TCP receiver, | ||||
RCV.WND == fcwnd.</t> | ||||
<t>In the case of the rLEDBAT receiver, this receiver <bcp14>MUST NOT</b | ||||
cp14> set the RCV.WND to a value larger than fcwnd and <bcp14>SHOULD</bcp14> set | ||||
the RCV.WND to the minimum of RLWND and fcwnd, honoring both. | ||||
</figure> | <!-- [rfced] Sections 3.1 and 3.1.1: We had trouble following the | |||
meaning of "honoring both", "may fall short to honor", "honoring | ||||
that", and "sufficient to honor the window output" in these | ||||
sentences. Please clarify. | ||||
<t>We describe each of the rLEDBAT components next.</t> | Original: | |||
This | ||||
may fall short to honor the new calculated value of the RLWND | ||||
immediately. However, the receiver SHOULD progressively reduce the | ||||
advertised RCV.WND, always honoring that the reduction is less or | ||||
equal than the received bytes, until the target window determined by | ||||
the rLEDBAT algorithm is reached. | ||||
... | ||||
In the case of rLEDBAT receiver, the rLEDBAT receiver MUST NOT set | ||||
the RCV.WND to a value larger than fcwnd and it SHOULD set the | ||||
RCV.WND to the minimum of RLWND and fcwnd, honoring both. | ||||
... | ||||
In order to avoid window shrinking, the receiver MUST only reduce | ||||
RCV.WND by the number of bytes upon of a received data packet. This | ||||
may fall short to honor the new calculated value of the RLWND | ||||
immediately. However, the receiver SHOULD progressively reduce the | ||||
advertised RCV.WND, always honoring that the reduction is less or | ||||
equal than the received bytes, until the target window determined by | ||||
the rLEDBAT algorithm is reached. This implies that it may take up | ||||
to one RTT for the rLEDBAT receiver to drain enough in-flight bytes | ||||
to completely close its receive window without shrinking it. This is | ||||
sufficient to honor the window output from the LEDBAT/LEDBAT++ | ||||
algorithms since they only allow to perform at most one | ||||
multiplicative decrease per RTT. --> | ||||
<section title="Controlling the receive window"> | </t> | |||
<t>When using rLEDBAT, two congestion controllers are in action in the f | ||||
low of data from the sender to the receiver, namely the TCP congestion control a | ||||
lgorithm on the sender side and the LBE congestion control algorithm executed i | ||||
n the receiver and conveyed to the sender through the RCV.WND. In the normal TCP | ||||
operation, the sender uses the minimum of the cwnd and the RCV.WND to calculate | ||||
the SND.WND. This is also true for rLEDBAT, as the sender is a regular TCP send | ||||
er. This guarantees that the rLEDBAT flow will never transmit more aggressively | ||||
than a standard-TCP flow, as the sender's congestion window limits the sending r | ||||
ate. Moreover, because an LBE congestion control algorithm such as LEDBAT/LEDBAT | ||||
++ is designed to react earlier and more aggressively to congestion than regular | ||||
TCP congestion control, the RLWND contained in the TCP RCV.WND field will gener | ||||
ally be smaller than the congestion window calculated by the TCP sender, implyin | ||||
g that the rLEDBAT congestion control algorithm will be effectively controlling | ||||
the sender's window. One exception to this scenario is that at the beginning of | ||||
the connection, when there is no information to set RLWND, RLWND is set to its | ||||
maximum value, so that the sending rate of the sender is governed by the flow co | ||||
ntrol algorithm of the receiver and the TCP slow start mechanism of the sender. | ||||
<t>rLEDBAT uses the Receive Window (RCV.WND) of TCP to enable the receive | <!-- [rfced] Section 3.1: We had trouble parsing this sentence. | |||
r to control the sender's rate. <xref target="RFC9293" /> defines that the RCV. | We updated it as follows. If this is incorrect, please clarify the | |||
WND is used to announce the available receive buffer to the sender for flow cont | text. | |||
rol purposes. In order to avoid confusion, we will call fcwnd the value that a s | ||||
tandard RFC793bis TCP receiver calculates to set in the receive window for flow | ||||
control purposes. We call RLWND the window value calculated by rLEDBAT algorithm | ||||
and we call RCV.WND the value actually included in the Receive Window field of | ||||
the TCP header. For a RFC793bis receiver, RCV.WND == fcwnd.</t> | ||||
<t>In the case of rLEDBAT receiver, the rLEDBAT receiver MUST NOT set the | ||||
RCV.WND to a value larger than fcwnd and it SHOULD set the RCV.WND to the minim | ||||
um of RLWND and fcwnd, honoring both.</t> | ||||
<t>When using rLEDBAT, two congestion controllers are in action in the fl | Original: | |||
ow of data from the sender to the receiver, namely, the congestion control algor | One exception to this | |||
ithm of TCP in the sender side and the LBE congestion control algorithm execute | is at the beginning of the connection, when there is no information | |||
d in the receiver and conveyed to the sender through the RCV.WND. In the normal | to set RLWND, then, RLWND is set to its maximum value, so that the | |||
TCP operation, the sender uses the minimum of the congestion window cwnd and the | sending rate of the sender is governed by the flow control algorithm | |||
receiver window RCV.WND to calculate the sender's window SND.WND. This is also | of the receiver and the TCP slow start mechanism of the sender. | |||
true for rLEDBAT, as the sender is a regular TCP sender. This guarantees that th | ||||
e rLEDBAT flow will never transmit more aggressively than a standard-TCP flow, a | ||||
s the sender's congestion window limits the sending rate. Moreover, because a LB | ||||
E congestion control algorithm such as LEDBAT/LEDBAT++ is designed to react earl | ||||
ier and more aggressively to congestion than regular TCP congestion control, the | ||||
RLWND contained in the RCV.WND field of TCP will be in general smaller than the | ||||
congestion window calculated by the TCP sender, implying that the rLEDBAT conge | ||||
stion control algorithm will be effectively controlling the sender's window. On | ||||
e exception to this is at the beginning of the connection, when there is no info | ||||
rmation to set RLWND, then, RLWND is set to its maximum value, so that the sendi | ||||
ng rate of the sender is governed by the flow control algorithm of the receiver | ||||
and the TCP slow start mechanism of the sender.</t> | ||||
<t>In summary, the sender's window is: SND.WND = min(cwnd, RLWND, fcwnd)< | Currently: | |||
/t> | One exception to | |||
this scenario is that at the beginning of the connection, when there | ||||
is no information to set RLWND, RLWND is set to its maximum value, | ||||
so that the sending rate of the sender is governed by the flow | ||||
control algorithm of the receiver and the TCP slow start mechanism | ||||
of the sender. --> | ||||
<section title="Avoiding window shrinking"> | </t> | |||
<t>In summary, the sender's window is SND.WND = min(cwnd, RLWND, fcwnd)< | ||||
/t> | ||||
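<t>The window relationship summarized above can be sketched as follows. This is a minimal illustration, not part of the specification: the function names are hypothetical, and all windows are assumed to be expressed in bytes.</t>

```python
def advertised_window(rlwnd: int, fcwnd: int) -> int:
    """RCV.WND announced by an rLEDBAT receiver.

    It never exceeds fcwnd (flow control) and, following the SHOULD
    in the text, is the minimum of RLWND and fcwnd.
    """
    return min(rlwnd, fcwnd)


def sender_window(cwnd: int, rcv_wnd: int) -> int:
    """SND.WND as computed by an unmodified TCP sender."""
    return min(cwnd, rcv_wnd)


def effective_window(cwnd: int, rlwnd: int, fcwnd: int) -> int:
    """SND.WND = min(cwnd, RLWND, fcwnd)."""
    return sender_window(cwnd, advertised_window(rlwnd, fcwnd))
```

<t>For instance, with cwnd = 100000, RLWND = 20000, and fcwnd = 65535, the rLEDBAT window is the limiting factor. When the sender's cwnd is the smallest of the three (e.g., during slow start), the sender's own congestion control governs the rate.</t>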
<section numbered="true" toc="default"> | ||||
<name>Avoiding Window Shrinking</name> | ||||
          <t>The LEDBAT/LEDBAT++ algorithm executed in an rLEDBAT receiver | ||||
increases or decreases RLWND according to congestion signals (variations in | ||||
the estimated queuing delay and packet loss). | ||||
<t>The LEDBAT/LEDBAT++ algorithm executed in a rLEDBAT receiver i | If RLWND is decreased and directly announced in RCV.WND, | |||
ncreases or decreases RLWND according to congestion signals (variations on the e | this could lead to an announced window that is smaller than what is currently in | |||
stimated queueing delay and packet loss). | use. This so-called "shrinking the window" is discouraged as per <xref target=" | |||
RFC9293" format="default"/>, as it may cause unnecessary packet loss and perform | ||||
ance penalties. To be consistent with <xref target="RFC9293" format="default"/>, | ||||
the rLEDBAT receiver <bcp14>SHOULD NOT</bcp14> shrink the receive window. </t> | ||||
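<t>One way to apply a reduced RLWND without shrinking the window is to never let the right edge of the announced window (RCV.NXT + RCV.WND) move backwards. The sketch below assumes windows in bytes; the function name and parameters are hypothetical, and the additional cap imposed by fcwnd is omitted for brevity.</t>

```python
def next_rcv_wnd(prev_wnd: int, bytes_received: int, target_wnd: int) -> int:
    """Return the RCV.WND to announce after receiving `bytes_received` bytes.

    prev_wnd       -- RCV.WND announced previously
    bytes_received -- new in-order payload bytes just received
    target_wnd     -- window the receiver would like to announce (e.g., RLWND)

    The window may be reduced by at most `bytes_received`, which keeps
    the right edge of the window from moving backwards.
    """
    lowest_allowed = max(prev_wnd - bytes_received, 0)
    return max(target_wnd, lowest_allowed)
```

<t>With a previous window of 30000 bytes, a target of 15000 bytes, and 1460-byte segments, the announced window converges to the target over several received segments rather than in a single step.</t>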
          <t>In order to avoid window shrinking, the receiver <bcp14>MUST</bcp14> | ||||
reduce RCV.WND, upon receiving a data packet, by no more than the number of | ||||
bytes in that packet. This may not be enough to reach the newly calculated | ||||
value of RLWND immediately. However, the receiver <bcp14>SHOULD</bcp14> | ||||
progressively reduce the advertised RCV.WND, always ensuring that each | ||||
reduction is less than or equal to the number of bytes received, until the | ||||
target window determined by the rLEDBAT algorithm is reached. | ||||
This implies that it may take up to one RTT for the rLEDBAT receiver to drain | ||||
enough in-flight bytes to completely close its receive window without | ||||
shrinking it. This is sufficient to follow the window output from the | ||||
LEDBAT/LEDBAT++ algorithms, since they perform at most one multiplicative | ||||
decrease per RTT. | ||||
If RLWND is decreased and directly announced in RCV.WND, this could lead to an announced window that is smaller than what is currently in use. This so called 'shrinking the window' is discouraged as per <xref target=" RFC9293" />, as it may cause unnecessary packet loss and performance penalty. To be consistent with <xref target="RFC9293" />, the rLEDBAT receiver SHOULD NOT s hrink the receive window. </t> | <!-- [rfced] Section 3.1.1: | |||
<t>In order to avoid window shrinking, the receiver MUST only red | a) Please clarify "upon of" in this sentence. Are some words | |||
uce RCV.WND by the number of bytes upon of a received data packet. This may fall | missing, or should either "upon" or "of" be removed? | |||
short to honor the new calculated value of the RLWND immediately. However, the | ||||
receiver SHOULD progressively reduce the advertised RCV.WND, always honoring tha | ||||
t the reduction is less or equal than the received bytes, until the target windo | ||||
w determined by the rLEDBAT algorithm is reached. | ||||
This implies that it may take up to one RTT for the rLEDBAT receiver to drain en | ||||
ough in-flight bytes to completely close its receive window without shrinking it | ||||
. This is sufficient to honor the window output from the LEDBAT/LEDBAT++ algorit | ||||
hms since they only allow to perform at most one multiplicative decrease per RTT | ||||
.</t> | ||||
</section> | ||||
<section title="Setting the Window Scale Option"> | Original: | |||
In order to avoid window shrinking, the receiver MUST only reduce | ||||
RCV.WND by the number of bytes upon of a received data packet. | ||||
<t>The Window Scale (WS) option <xref target="RFC7323" /> is a me | b) Does "they only allow to perform" mean "they are only allowed to | |||
ans to increase the maximum window size permitted by the Receive Window. The WS | perform", "they only permit performing", or something else? | |||
option defines a scale factor which restricts the granularity of the receive win | ||||
dow that can be announced. This means that the rLEDBAT client will have to accum | ||||
ulate the increases resulting from multiple received packets, and only convey a | ||||
change in the window when the accumulated sum of increases is equal or higher th | ||||
an one increase step as imposed by the scaling factor according to the WS option | ||||
in place for the TCP connection.</t> | ||||
<t>Changes in the receive window that are smaller than 1 MSS are | Original: | |||
unlikely to have any immediate impact on the sender's rate, as usual TCP's segme | This is | |||
ntation practice results in sending full segments (i.e., segments of size equal | sufficient to honor the window output from the LEDBAT/LEDBAT++ | |||
to the MSS). Current WS option specification <xref target="RFC7323" /> defines t | algorithms since they only allow to perform at most one | |||
hat allowed values for the WS option are between 0 and 14. Assuming a MSS around | multiplicative decrease per RTT. --> | |||
1500 bytes, WS option values between 0 and 11 result in the receive window bein | ||||
g expressed in units that are about 1 MSS or smaller. So, WS option values betwe | ||||
en 0 and 11 have no impact in rLEDBAT (unless packets smaller than the MSS are b | ||||
eing exchanged).</t> | ||||
<t>WS option values higher than 11 can affect the dynamics of rLE | </t> | |||
DBAT, since control may become too coarse (e.g., with WS of 14, a change in one | </section> | |||
unit of the receive window implies a change of 10 MSS in the effective window).< | <section numbered="true" toc="default"> | |||
/t> | <name>Setting the Window Scale Option</name> | |||
<t>For the above reasons, the rLEDBAT client SHOULD set WS option | <t>The Window Scale (WS) option <xref target="RFC7323" format="default | |||
values lower than 12. Additional experimentation is required to explore the imp | "/> is a means to increase the maximum window size permitted by the Receive Wind | |||
act of larger WS values on rLEDBAT dynamics.</t> | ow. The WS option defines a scale factor that restricts the granularity of the r | |||
<t>Note that the recommendation for rLEDBAT to set the WS option | eceive window that can be announced. This means that the rLEDBAT client will hav | |||
value to lower values does not precludes the communication with servers that set | e to accumulate the increases resulting from multiple received packets and only | |||
the WS option values to larger values, since the WS option value is set indepen | convey a change in the window when the accumulated sum of increases is equal to | |||
dently for each direction of the TCP connection.</t> | or higher than one increase step as imposed by the scaling factor according to t | |||
</section> | he WS option in place for the TCP connection.</t> | |||
</section> | <t>Changes in the receive window that are smaller than 1 MSS (Maximum | |||
Segment Size) are unlikely to have any immediate impact on the sender's rate, | ||||
because TCP's usual segmentation practice results in sending full segments | ||||
(i.e., segments of size equal to the MSS). | ||||
<xref target="RFC7323" format="default"/>, which defines the WS option, | ||||
specifies that allowed values for the WS option are between 0 and 14. Assuming | ||||
an MSS of around 1500 bytes, WS option values between 0 and 11 result in the | ||||
receive window being expressed in units that are about 1 MSS or smaller. So, | ||||
WS option values between 0 and 11 have no impact on rLEDBAT (unless packets | ||||
smaller than the MSS are being exchanged).</t> | ||||
<t>WS option values higher than 11 can affect the dynamics of rLEDBAT, | ||||
since control may become too coarse (e.g., with a WS option value of 14, a chan | ||||
ge in one unit of the receive window implies a change of 10 MSS in the effective | ||||
window). | ||||
<section title="Measuring delays"> | <!-- [rfced] Section 3.1.2: We changed "with WS of 14" to "with a WS | |||
option value of 14" here, to indicate the option value as opposed to | ||||
the concept of window scale. If this is incorrect, please clarify. | ||||
<t>Both LEDBAT and LEDBAT++ measure base and current delays to estimate t | Original: | |||
he queueing delay. LEDBAT uses the one way delay while LEDBAT++ uses the round t | WS option values higher than 11 can affect the dynamics of rLEDBAT, | |||
rip time. In the next sections we describe how rLEDBAT mechanisms enable the rec | since control may become too coarse (e.g., with WS of 14, a change in | |||
eiver to measure the one way delay or the round trip time, whatever is needed de | one unit of the receive window implies a change of 10 MSS in the | |||
pending on the congestion control algorithm used.</t> | effective window). | |||
<section title="Measuring RTT to estimate the queueing delay"> | Currently: | |||
WS option values higher than 11 can affect the dynamics of rLEDBAT, | ||||
since control may become too coarse (e.g., with a WS option value of | ||||
14, a change in one unit of the receive window implies a change of 10 | ||||
MSS in the effective window). --> | ||||
<t>LEDBAT++ uses the round trip time (RTT) to estimate the queueing delay | </t> | |||
. In order to estimate the queueing delay using RTT, the rLEDBAT receiver estima | <t>For the above reasons, the rLEDBAT client <bcp14>SHOULD</bcp14> set | |||
tes the base RTT (i.e., the constant components of RTT) and also measures the cu | WS option values lower than 12. Additional experimentation is required to explo | |||
rrent RTT. By subtracting these two values, we obtain the queuing delay to be us | re the impact of larger WS values on rLEDBAT dynamics.</t> | |||
ed by the rLEDBAT controller.</t> | <t>Note that the recommendation for rLEDBAT to set the WS option value | |||
s to lower values does not preclude communication with servers that set the WS o | ||||
ption values to larger values, since WS option values are set independently for | ||||
each direction of the TCP connection.</t> | ||||
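<t>The granularity argument can be made concrete with a short sketch. With a WS option value of ws, one unit of the Receive Window field represents 2^ws bytes, so increases must be accumulated until they add up to at least one unit. The class and function names below are hypothetical and only illustrate the bookkeeping.</t>

```python
def window_unit_bytes(ws: int) -> int:
    """Bytes represented by one unit of the Receive Window field."""
    if not 0 <= ws <= 14:
        raise ValueError("WS option value must be between 0 and 14")
    return 1 << ws


class ScaledWindow:
    """Accumulates window increases until one scaled unit can be announced."""

    def __init__(self, ws: int, initial_bytes: int):
        self.unit = window_unit_bytes(ws)
        self.announced_units = initial_bytes // self.unit
        self.pending = 0  # increase (in bytes) not yet visible to the sender

    def increase(self, nbytes: int) -> int:
        """Add `nbytes` to the window; return the announced window in bytes."""
        self.pending += nbytes
        units, self.pending = divmod(self.pending, self.unit)
        self.announced_units += units
        return self.announced_units * self.unit
```

<t>With ws = 12 (units of 4096 bytes), three successive increases of 1460 bytes are needed before the announced window grows by one unit. With ws = 11, the unit is 2048 bytes, while with ws = 14, it is 16384 bytes.</t>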
</section> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>Measuring Delays</name> | ||||
<t>Both LEDBAT and LEDBAT++ measure base and current delays to estimate | ||||
the queuing delay. LEDBAT uses the one-way delay, while LEDBAT++ uses the RTT. I | ||||
n the next sections, we describe how rLEDBAT mechanisms enable the receiver to m | ||||
easure the one-way delay or the RTT -- whichever is needed, depending on the con | ||||
gestion control algorithm used.</t> | ||||
<section numbered="true" toc="default"> | ||||
<name>Measuring RTT to Estimate the Queuing Delay</name> | ||||
<t>LEDBAT++ uses the RTT to estimate the queuing delay. In order to es | ||||
timate the queuing delay using RTT, the rLEDBAT receiver estimates the base RTT | ||||
(i.e., the constant components of RTT) and also measures the current RTT. By sub | ||||
tracting these two values, we obtain the queuing delay to be used by the rLEDBAT | ||||
controller.</t> | ||||
<t>LEDBAT++ discovers the base RTT (RTTb) by taking the minimum value | ||||
of the measured RTTs over a period of time. The current RTT (RTTc) is estimated | ||||
using a number of recent samples and applying a filter, such as the minimum (or | ||||
the mean) of the last k samples. Using RTT to estimate the queuing delay has a n | ||||
umber of shortcomings and difficulties, as discussed below.</t> | ||||
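<t>The estimation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the base RTT is taken as the minimum over a bounded history of samples (standing in for "a period of time"), the current RTT uses a minimum filter over the last k samples, and the class name and default parameter values are hypothetical.</t>

```python
from collections import deque


class RttDelayEstimator:
    """Estimates queuing delay as RTTc - RTTb from RTT samples."""

    def __init__(self, k: int = 4, base_history: int = 100):
        self.recent = deque(maxlen=k)             # last k samples, for RTTc
        self.history = deque(maxlen=base_history)  # longer history, for RTTb

    def add_sample(self, rtt_ms: int) -> None:
        self.recent.append(rtt_ms)
        self.history.append(rtt_ms)

    def queuing_delay(self):
        """Return RTTc - RTTb, or None if no sample has been seen yet."""
        if not self.recent:
            return None
        rtt_b = min(self.history)  # base RTT: minimum over the history
        rtt_c = min(self.recent)   # current RTT: min filter over last k
        return rtt_c - rtt_b
```

<t>Feeding the samples 50, 52, 80, 81, 79, and 83 ms with k = 4 yields a base RTT of 50 ms, a current RTT of 79 ms, and therefore an estimated queuing delay of 29 ms.</t>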
          <t>The queuing delay measured using RTT also includes the queuing | ||||
delay experienced by the return packets in the direction from the rLEDBAT | ||||
receiver to the sender. This is a fundamental limitation of this approach. The | ||||
impact of this measurement error is that the rLEDBAT controller will also | ||||
react to congestion in the reverse path direction, resulting in an even more | ||||
conservative mechanism. | ||||
<t>LEDBAT++ discovers the base RTT (RTTb) by taking the minimum value of | <!-- [rfced] Section 3.2.1: Please confirm that "error" is the | |||
the measured RTTs over a period of time. The current RTT (RTTc) is estimated usi | correct word here. The approach discussed in this section does not | |||
ng a number of recent samples and applying a filter, such as the minimum (or the | seem to otherwise be considered an error - only an approach with a | |||
mean) of the last k samples. Using RTT to estimate the queueing delay has a num | limitation (per the previous sentence). Please confirm that calling | |||
ber of shortcomings and difficulties that we discuss next.</t> | this approach an error will be clear to readers. | |||
<t>The queuing delay measured using RTT includes also the queueing delay | Original (the previous sentence is included for context): | |||
experienced by the return packets in the direction from the rLEDBAT receiver to | This is a fundamental limitation of this | |||
the sender. This is a fundamental limitation of this approach. The impact of thi | approach. The impact of this error is that the rLEDBAT controller | |||
s error is that the rLEDBAT controller will also react to congestion in the reve | will also react to congestion in the reverse path direction which | |||
rse path direction which results in an even more conservative mechanism.</t> | results in an even more conservative mechanism. | |||
<t>In order to measure RTT, the rLEDBAT client MUST enable the Time Stamp | Perhaps ("this limitation"): | |||
(TS) option <xref target="RFC7323" />. By matching the TSVal value carried in o | This is a fundamental limitation of this | |||
utgoing packets with the TSecr value observed in incoming packets, it is possibl | approach. The impact of this limitation is that the rLEDBAT | |||
e to measure RTT. This allows the rLEDBAT receiver to measure RTT even if it is | controller will also react to congestion in the reverse path | |||
acting as a pure receiver. In a pure receiver there is no data flowing from the | direction, resulting in an even more conservative mechanism. | |||
rLEDBAT receiver to the sender, making impossible to match data packets with ack | ||||
nowledgements packets to measure RTT, as it is usually done in TCP for other pur | ||||
poses.</t> | ||||
<t>Depending on the frequency of the local clock used to generate the val | Or possibly ("this issue"): | |||
ues included in the TS option, several packets may carry the same TSVal value. I | This is a fundamental limitation of this | |||
f that happens, the rLEDBAT receiver will be unable to match the different outgo | approach. The impact of this issue is that the rLEDBAT controller | |||
ing packets carrying the same TSVal value with the different incoming packets ca | will also react to congestion in the reverse path direction, | |||
rrying also the same TSecr value. However, it is not necessary for rLEDBAT to us | resulting in an even more conservative mechanism. --> | |||
e all packets to estimate RTT and sampling a subset of in-flight packets per RTT | ||||
is enough to properly assess the queueing delay. RTT MUST then be calculated as | ||||
the time since the first packet with a given TSVal was sent and the first packe | ||||
t that was received with the same value contained in the TSecr. Other packets wi | ||||
th repeated TS values SHOULD NOT be used for RTT calculation. </t> | ||||
<t>Several issues must be addressed in order to avoid an artificial incre | </t> | |||
ase of the observed RTT. Different issues emerge depending whether the rLEDBAT c | <t>In order to measure RTT, the rLEDBAT client <bcp14>MUST</bcp14> ena | |||
apable host is sending data packets or pure ACKs to measure RTT. We next conside | ble the TS option <xref target="RFC7323" format="default"/>. By matching the TSv | |||
r the issues separately.</t> | al value carried in outgoing packets with the Timestamp Echo Reply (TSecr) value | |||
<xref target="RFC7323" format="default"/> observed in incoming packets, it is p | ||||
ossible to measure RTT. This allows the rLEDBAT receiver to measure RTT even if | ||||
it is acting as a pure receiver. In a pure receiver, there is no data flowing fr | ||||
om the rLEDBAT receiver to the sender, making it impossible to match data packet | ||||
s with Acknowledgment packets to measure RTT, as it is usually done in TCP for o | ||||
ther purposes. | ||||
<section title="Measuring RTT sending pure ACKs"> | <!-- [rfced] Section 3.2.1: Does "as it is usually done in TCP" | |||
indicate a comparison or a contrast? If the suggested text is not | ||||
correct, please clarify. | ||||
<t>In this scenario, the rLEDBAT node (node A) sends a pure ACK t | Original: | |||
o the other endpoint of the TCP connection (node B), including the TS option. Up | In a pure | |||
on the reception of the TS Option, host B will copy the value of the TSVal into | receiver there is no data flowing from the rLEDBAT receiver to the | |||
the TSecr field of the TS option and include that option into the next data pack | sender, making impossible to match data packets with acknowledgements | |||
et towards host A. However, there are two reasons why B may not send a packet im | packets to measure RTT, as it is usually done in TCP for other | |||
mediately back to A, artificially increasing the measured RTT. The first reason | purposes. | |||
is when A has no data to send. | ||||
The second is when A has no available window to put more packets in-flight. We d | ||||
escribe next how each of these cases is addressed.</t> | ||||
<t>The case where the host B has no data to send when it receives the pur | Suggested (guessing a contrast): | |||
e Acknowledgement is expected to be rare in the rLEDBAT use cases. rLEDBAT will | In a pure | |||
be used mostly for background file transfers so the expected common case is that | receiver, there is no data flowing from the rLEDBAT receiver to the | |||
the sender will have data to send throughout the lifetime of the communication. | sender, making it impossible to match data packets with | |||
However, if, for example, the file is structured in blocks of data, it may be t | Acknowledgment packets to measure RTT, in contrast to what is | |||
he case that the sender seldomly will have to wait until the next block is avail | usually done in TCP for other purposes. --> | |||
able to proceed with the data transfer. To address this situation, the filter us | ||||
ed by the congestion control algorithm executed in the receiver SHOULD discard o | ||||
utliers (e.g. a min filter would achieve this) when measuring RTT using pure ACK | ||||
packets.</t> | ||||
<t>This limitation of the sender's window can come either from the TCP co | <!-- [rfced] Sections 3.2.1 and subsequent: Because "TSval" stands | |||
ngestion window in host B or from the announced receive window from the rLEDBAT | for "Timestamp Value" per RFC 7323, may we change the instances of | |||
in host A. Normally, the receive window will be the one to limit the sender's tr | "TSval value" to "TSval", to avoid the appearance of "Timestamp Value | |||
ansmission rate, since the LBE congestion control algorithm used by the rLEDBAT | value"? --> | |||
node is designed to be more restrictive on the sender's rate than standard-TCP. | ||||
If the limiting factor is the congestion window in the sender, it is less releva | ||||
nt if rLEDBAT further reduces the receive window due to a bloated RTT measuremen | ||||
t, since the rLEDBAT node is not actively controlling the sender's rate. Neverth | ||||
eless, the proposed approach to discard larger samples would also address this i | ||||
ssue.</t> | ||||
<t>To address the case in which the limiting factor is the receive window | </t> | |||
announced by rLEDBAT, the congestion control algorithm at the receiver SHOULD d | <t>Depending on the frequency of the local clock used to generate the | |||
iscard RTT measurements during the window reduction phase that are triggered by | values included in the TS option, several packets may carry the same TSval value | |||
pure ACK packets. The rLEDBAT receiver is aware whether a given TSVal value was | . If that happens, the rLEDBAT receiver will be unable to match the different ou | |||
sent in a pure ACK packet where the window was reduced, and if so, it can discar | tgoing packets carrying the same TSval value with the different incoming packets | |||
d the corresponding RTT measurement. </t> | also carrying the same TSecr value. However, it is not necessary for rLEDBAT to | |||
</section> | use all packets to estimate RTT, and sampling a subset of in-flight packets per | |||
<section title="Measuring RTT when sending data packets"> | RTT is enough to properly assess the queuing delay. RTT <bcp14>MUST</bcp14> the | |||
n be calculated as the time since the first packet with a given TSval was sent a | ||||
nd the first packet that was received with the same value contained in the TSecr | ||||
. Other packets with repeated TS values <bcp14>SHOULD NOT</bcp14> be used for RT | ||||
T calculations. </t> | ||||
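The matching rule above can be illustrated with a minimal sketch. It assumes that the receiver can observe its own outgoing TSval values and the incoming TSecr values; the class and method names (TsRttSampler, on_send, on_receive) are hypothetical and are not part of this specification.

```python
class TsRttSampler:
    """Keeps at most one RTT sample per distinct TSval (illustrative only)."""

    def __init__(self):
        self._first_sent = {}    # TSval -> local send time of the first packet carrying it
        self._last_tsval = None  # most recent TSval for which a send time was recorded

    def on_send(self, tsval, now):
        # Record the send time only for the first packet carrying a given TSval.
        # TSval is assumed to be non-decreasing, so comparing against the most
        # recently recorded value is enough to detect a repeated value.
        if tsval != self._last_tsval:
            self._last_tsval = tsval
            self._first_sent[tsval] = now

    def on_receive(self, tsecr, now):
        # The first packet echoing a TSval yields a sample and consumes the
        # entry; later packets echoing the same value return None.
        sent = self._first_sent.pop(tsecr, None)
        if sent is None:
            return None
        return now - sent
```

In this sketch, entries for TSval values that are never echoed are not expired; an implementation would likely need to bound that state.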
<t>Several issues must be addressed in order to avoid an artificial in | ||||
crease in the observed RTT. Different issues emerge, depending on whether the | ||||
rLEDBAT-capable host is sending data packets or pure ACKs to measure RTT. We nex | ||||
t consider these issues separately.</t> | ||||
<section numbered="true" toc="default"> | ||||
<name>Measuring RTT When Sending Pure ACKs</name> | ||||
<t>In this scenario, the rLEDBAT node (node A) sends a pure ACK to t | ||||
he other endpoint of the TCP connection (node B), including the TS option. Upon | ||||
the reception of the TS option, host B will copy the value of the TSval into the | ||||
TSecr field of the TS option and include that option in the next data packet to | ||||
wards host A. However, there are two reasons why B may not send a packet immedia | ||||
tely back to A, artificially increasing the measured RTT. The first reason is wh | ||||
en B has no data to send.
The second is when B has no available window to put more packets in flight. We n | ||||
ext describe how each of these cases is addressed.</t> | ||||
<t>The case where host B has no data to send when it receives the pu | ||||
re Acknowledgment is expected to be rare in the rLEDBAT use cases. rLEDBAT | ||||
will be used mostly for background file transfers, so the expected common case | ||||
is that the sender will have data to send throughout the lifetime of the communi | ||||
cation. However, if, for example, the file is structured in blocks of data, it m | ||||
ay be the case that the sender will seldom have to wait until the next block is | ||||
available to proceed with the data transfer. To address this situation, the filt | ||||
er used by the congestion control algorithm executed in the receiver <bcp14>SHOU | ||||
LD</bcp14> discard outliers (e.g., a MIN filter <xref target="RFC6817"/> would a | ||||
chieve this) when measuring RTT using pure ACK packets. | ||||
<t>In the case that the rLEDBAT node is sending data packets and matching | <!-- [rfced] Sections 3.2.1.1 and 3.2.1.2: For ease of the reader, | |||
them with pure ACKs to measure RTT, a factor that can artificially increase the | we changed "min filter" to "MIN filter" and cited RFC 6817 here | |||
RTT measured is the presence of delayed Acknowledgements. | (where "MIN filter" is first used). Please let us know any concerns. | |||
According to the TS option generation rules <xref target="RFC7323 | ||||
" />, | Original: | |||
the value included in the TSecr for a delayed ACK is the one in t | To address this | |||
he TSVal field of the earliest unacknowledged segment. | situation, the filter used by the congestion control algorithm | |||
executed in the receiver SHOULD discard outliers (e.g. a min filter | ||||
would achieve this) when measuring RTT using pure ACK packets. | ||||
... | ||||
Also, applying a filter that | ||||
discards outliers would also address this issue (e.g. a min filter). | ||||
Currently: | ||||
To address this | ||||
situation, the filter used by the congestion control algorithm | ||||
executed in the receiver SHOULD discard outliers (e.g., a MIN filter | ||||
[RFC6817] would achieve this) when measuring RTT using pure ACK | ||||
packets. | ||||
... | ||||
Applying a filter (e.g., a MIN | ||||
filter) that discards outliers would also address this issue. --> | ||||
</t> | ||||
<t>This limitation of the sender's window can come from either the T | ||||
CP congestion window in host B or the announced receive window from the rLEDBAT | ||||
in host A. Normally, the receive window will be the one to limit the sender's tr | ||||
ansmission rate, since the LBE congestion control algorithm used by the rLEDBAT | ||||
node is designed to be more restrictive on the sender's rate than standard-TCP. | ||||
If the limiting factor is the congestion window in the sender, it is less releva | ||||
nt if rLEDBAT further reduces the receive window due to a bloated RTT measuremen | ||||
t, since the rLEDBAT node is not actively controlling the sender's rate. Neverth | ||||
eless, the proposed approach to discard larger samples would also address this i | ||||
ssue.</t> | ||||
<t>To address the case in which the limiting factor is the receive w | ||||
indow announced by rLEDBAT, the congestion control algorithm at the receiver <bc | ||||
p14>SHOULD</bcp14> discard RTT measurements during the window reduction phase th | ||||
at are triggered by pure ACK packets. The rLEDBAT receiver is aware of whether a | ||||
given TSval value was sent in a pure ACK packet where the window was reduced, a | ||||
nd if so, it can discard the corresponding RTT measurement. </t> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>Measuring RTT When Sending Data Packets</name> | ||||
<t>In the case that the rLEDBAT node is sending data packets and mat | ||||
ching them with pure ACKs to measure RTT, a factor that can artificially increas | ||||
e the RTT measured is the presence of delayed Acknowledgments. | ||||
According to the TS option generation rules <xref target="RFC7323 | ||||
" format="default"/>, | ||||
the value included in the TSecr for a delayed ACK is the one in t | ||||
he TSval field of the earliest unacknowledged segment. | ||||
This may artificially increase the measured RTT. </t> | This may artificially increase the measured RTT. </t> | |||
<t>If both endpoints of the connection are sending data packets, Ack | ||||
nowledgments are piggybacked onto the data packets and they are not delayed. Del | ||||
ayed ACKs only increase RTT measurements in the case that the sender has no data | ||||
to send. Since the expected use case for rLEDBAT is that the sender will be sen | ||||
ding background traffic to the rLEDBAT receiver, the cases where delayed ACKs in | ||||
crease the measured RTT are expected to be rare.</t> | ||||
<t>Nevertheless, measurements based on data packets from the rLEDBAT | ||||
node matching pure ACKs from the other end will result in an increased RTT samp | ||||
le. The additional increase in the measured RTT will be up to 500 ms. This is be | ||||
cause delayed ACKs are generated every second data packet received and not delay | ||||
ed more than 500 ms according to <xref target="RFC9293" format="default"/>. The | ||||
rLEDBAT receiver <bcp14>MAY</bcp14> discard RTT measurements done using data pac | ||||
kets from the rLEDBAT receiver and matching pure ACKs, especially if it has rece | ||||
nt measurements done using other packet combinations. Applying a filter (e.g., a | ||||
MIN filter) that discards outliers would also address this issue.</t> | ||||
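As an illustration of how a MIN filter discards inflated samples such as those caused by delayed ACKs, the following sketch keeps the minimum over the most recent RTT samples. The window size of 4 and the name MinFilter are assumptions made for this example only; they are not values defined by this document.

```python
from collections import deque


class MinFilter:
    """Returns the minimum of the last `size` samples (illustrative only)."""

    def __init__(self, size=4):
        # A deque with maxlen silently drops the oldest sample when full.
        self._samples = deque(maxlen=size)

    def update(self, sample):
        # Add the new sample and report the filtered (minimum) value.
        self._samples.append(sample)
        return min(self._samples)
```

A sample inflated by up to 500 ms is ignored as long as at least one uninflated sample remains in the window.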
</section> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>Measuring One-Way Delay to Estimate the Queuing Delay</name> | ||||
<t>The LEDBAT algorithm uses the one-way delay of packets as input. A | ||||
TCP receiver can measure the delay of incoming packets directly (as opposed to t | ||||
he sender-based LEDBAT, where the receiver measures the one-way delay and needs | ||||
to convey it to the sender).</t> | ||||
<t>In the case of TCP, the receiver can use the TS option to measure t | ||||
he one-way delay by subtracting the timestamp contained in the incoming packet f | ||||
rom the local time at which the packet has arrived. As noted in <xref target="RF | ||||
C6817" format="default"/>, the clock offset between the sender's clock and the r | ||||
eceiver's clock does not affect the LEDBAT operation, since LEDBAT uses the diff | ||||
erence between the base one-way delay and the current one-way delay to estimate | ||||
the queuing delay, effectively "canceling out" the clock offset error in the que | ||||
uing delay estimation. There are, however, two other issues that the rLEDBAT rec | ||||
eiver needs to take into account in order to properly estimate the one-way delay | ||||
, namely the units in which the received timestamps are expressed and the clock | ||||
skew. These issues are addressed below. | ||||
<t>If both endpoints of the connection are sending data packets, | <!-- [rfced] Section 3.2.2: We changed 'effectively canceling the | |||
Acknowledgments are piggybacked into the data packets and they are not delayed. | clock offset error' to 'effectively "canceling out" the clock offset | |||
Delayed ACKs only increase RTT measurements in the case that the sender has no d | error' per Appendix A.1 of RFC 6817 (which says 'the offsets cancel | |||
ata to send. Since the expected use case for rLEDBAT is that the sender will be | each other out in the queuing delay estimate'). Please let us know | |||
sending background traffic to the rLEDBAT receiver, the cases where delayed ACKs | any objections. | |||
increase the measured RTT are expected to be rare.</t> | ||||
<t>Nevertheless, measurements based on data packets from the rLED | ||||
BAT node matching pure ACKs from the other end will result in an increased RTT s | ||||
ample. The additional increase in the measured RTT will be up to 500 ms. The rea | ||||
son for this is that delayed ACKs are generated every second data packet receive | ||||
d and not delayed more than 500 ms according to <xref target="RFC9293" />. The r | ||||
LEDBAT receiver MAY discard RTT measurements done using data packets from the rL | ||||
EBDAT receiver and matching pure ACKs, especially if it has recent measurements | ||||
done using other packet combinations. Also, applying a filter that discards outl | ||||
iers would also address this issue (e.g. a min filter).</t> | ||||
</section> | ||||
</section> | ||||
<section title="Measuring one way delay to estimate the queueing | Original: | |||
delay"> | As noted in [RFC6817] the clock offset between the clock of | |||
<t>The LEDBAT algorithm uses the one-way delay of packets as inpu | the sender and the clock in the receiver does not affect the LEDBAT | |||
t. A TCP receiver can measure the delay of incoming packets directly (as opposed | operation, since LEDBAT uses the difference between the base one way | |||
to the sender-based LEDBAT, where the receiver measures the one-way delay and n | delay and the current one way delay to estimate the queuing delay, | |||
eeds to convey it to the sender).</t> | effectively canceling the clock offset error in the queueing delay | |||
<t>In the case of TCP, the receiver can use the TimeStamp option | estimation. | |||
to measure the one way delay by subtracting the timestamp contained in the incom | ||||
ing packet from the local time at which the packet has arrived. As noted in <xre | ||||
f target="RFC6817" /> the clock offset between the clock of the sender and the c | ||||
lock in the receiver does not affect the LEDBAT operation, since LEDBAT uses the | ||||
difference between the base one way delay and the current one way delay to esti | ||||
mate the queuing delay, effectively canceling the clock offset error in the queu | ||||
eing delay estimation. There are however two other issues that the rLEDBAT recei | ||||
ver needs to take into account in order to properly estimate the one way delay, | ||||
namely, the units in which the received timestamps are expressed and the clock s | ||||
kew. We address them next.</t> | ||||
<t>In order to measure the one way delay using TCP timestamps, th | Currently: | |||
e rLEDBAT receiver, first, needs to discover the units of values in the TS optio | As noted | |||
n and, second, needs to account for the skew between the two endpoint clocks. No | in [RFC6817], the clock offset between the sender's clock and the | |||
te that a mismatch of 100 ppm (parts per million) in the estimation of the sende | receiver's clock does not affect the LEDBAT operation, since LEDBAT | |||
r's clock rate accounts for 6 ms of variation per minute in the measured delay. | uses the difference between the base one-way delay and the current | |||
This just one order of magnitude below the target delay set by rLEDBAT (or poten | one-way delay to estimate the queuing delay, effectively "canceling | |||
tially more if the target is set to lower values, which is possible). Typical sk | out" the clock offset error in the queuing delay estimation. --> | |||
ew for untrained clocks is reported to be around 100-200 ppm <xref target="RFC68 | ||||
17" />.</t> | ||||
<t>In order to learn both the TS units and the clock skew, the rL | ||||
EDBAT receiver measures how much local time has elapsed between two packets with | ||||
different TS values issued by the sender. By comparing the local time differenc | ||||
e and the TS value difference, the receiver can assess the TS units and relative | ||||
clock skews. In order for this to be accurate, the packets carrying the differe | ||||
nt TS values should experience equal (or at least similar delay) when traveling | ||||
from the sender to the receiver, as any difference in the experienced delays wou | ||||
ld introduce error in the unit/skew estimation. One possible approach is to sele | ||||
ct packets that experienced the minimum delay (i.e. close to zero queueing delay | ||||
) to make the estimations.</t> | ||||
<t>An additional difficulty regarding the estimation of the TS un | ||||
its and clock skew in the context of (r)LEDBAT is that the LEDBAT congestion con | ||||
troller actions directly affect the (queueing) delay experienced by packets. In | ||||
particular, if there is an error in the estimation of the TS units/skew, the LED | ||||
BAT controller will attempt to compensate it by reducing/increasing the load. Th | ||||
e result is that the LEDBAT operation interferes with the TS units/clock skew me | ||||
asurements. Because of this, measurements are more accurate when there is no tra | ||||
ffic in the connection (in addition to the packets used for the measurements). T | ||||
he problem is that the receiver is unaware if the sender is injecting traffic at | ||||
any point in time, and so, it is unable to use these quiet intervals to perform | ||||
measurements. The receiver can however, force periodic slowdowns, reducing the | ||||
announced receive window to a few packets and perform the measurements then.</t> | ||||
<t>It is possible for the rLEDBAT receiver to perform multiple me | ||||
asurements to assess both the TS units and the relative clock skew during the li | ||||
fetime of the connection, in order to obtain more accurate results. Clock skew m | ||||
easurements are more accurate if the time period used to discover the skew is la | ||||
rger, as the impact of the skew becomes more apparent. It is a reasonable appro | ||||
ach for the rLEDBAT receiver to perform an early discovery of the TS units (and | ||||
the clock skew) using the first few packets of the TCP connection and then impro | ||||
ve the accuracy of the TS units/clock skew estimation using periodic measurement | ||||
s later in the lifetime of the connection. </t> | ||||
</section> | </t> | |||
<t>In order to measure the one-way delay using TCP timestamps, the rLE | ||||
DBAT receiver first needs to discover the units of values in the TS option and t | ||||
hen needs to account for the skew between the two endpoint clocks. Note that a m | ||||
ismatch of 100 ppm (parts per million) in the estimation of the sender's clock r | ||||
ate accounts for 6 ms of variation per minute in the measured delay. This is jus | ||||
t one order of magnitude below the target delay set by rLEDBAT (or potentially m | ||||
ore if the target is set to lower values, which is possible). Typical skew for u | ||||
ntrained clocks is reported to be around 100-200 ppm <xref target="RFC6817" form | ||||
at="default"/>.</t> | ||||
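The figure of 6 ms per minute follows directly from the definition of ppm, as the following worked arithmetic shows. The function name drift_ms is hypothetical.

```python
def drift_ms(skew_ppm, interval_s):
    """Apparent delay variation, in milliseconds, accumulated over interval_s
    seconds when the sender's clock rate is misestimated by skew_ppm."""
    return skew_ppm * 1e-6 * interval_s * 1000.0


# 100 ppm over 60 s gives roughly 6 ms; 200 ppm gives roughly 12 ms.
per_minute_100 = drift_ms(100, 60)
per_minute_200 = drift_ms(200, 60)
```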
<t>In order to learn both the TS units and the clock skew, the rLEDBAT | ||||
receiver measures how much local time has elapsed between two packets with diff | ||||
erent TS values issued by the sender. By comparing the local time difference and | ||||
the TS value difference, the receiver can assess the TS units and relative cloc | ||||
k skews. In order for this to be accurate, the packets carrying the different TS | ||||
values should experience equal (or at least similar) delay when traveling from | ||||
the sender to the receiver, as any difference in the experienced delays would in | ||||
troduce an error in the unit/skew estimation. One possible approach is to select | ||||
packets that experienced minimal delay (i.e., queuing delay close to zero) to m | ||||
ake the estimations.</t> | ||||
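A minimal sketch of this estimation is shown below. It assumes that the two samples were chosen among packets that experienced similar (ideally minimal) queuing delay and that the receiver compares the measured tick duration against a candidate nominal value such as 1 ms. The function names and the nominal value are assumptions made for illustration.

```python
TS_MODULUS = 2 ** 32  # TSval is a 32-bit field, so differences wrap around


def estimate_tick_seconds(sample_a, sample_b):
    """Estimate the duration of one sender timestamp tick, in local seconds.

    Each sample is a (local_arrival_time_s, tsval) pair; sample_b must have
    been generated by the sender after sample_a.
    """
    (t_a, ts_a), (t_b, ts_b) = sample_a, sample_b
    delta_ts = (ts_b - ts_a) % TS_MODULUS
    if delta_ts == 0:
        raise ValueError("samples must carry different TSval values")
    return (t_b - t_a) / delta_ts


def skew_ppm(measured_tick_s, nominal_tick_s):
    """Relative skew, in ppm, of the measured tick against a nominal tick."""
    return (measured_tick_s / nominal_tick_s - 1.0) * 1e6
```

For example, two samples 60000 ticks apart that arrive 60.006 s apart in local time yield a tick of about 1.0001 ms, that is, a skew of about 100 ppm relative to a nominal 1 ms tick.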
<t>An additional difficulty regarding the estimation of the TS units a | ||||
nd clock skew in the context of (r)LEDBAT is that the LEDBAT congestion controll | ||||
er actions directly affect the (queuing) delay experienced by packets. In partic | ||||
ular, if there is an error in the estimation of the TS units/skew, the LEDBAT co | ||||
ntroller will attempt to compensate for it by reducing/increasing the load. The | ||||
result is that the LEDBAT operation interferes with the TS units/clock skew meas | ||||
urements. Because of this, measurements are more accurate when there is no traff | ||||
ic in the connection (in addition to the packets used for the measurements). The | ||||
problem is that the receiver is unaware of whether the sender is injecting | ||||
traffic at any point in time; it is therefore unable to use these quiet | ||||
intervals to perform measurements. The receiver can, however, force periodic | ||||
slowdowns, reducing the announced receive window to a few packets and | ||||
performing the measurements at that time.
</section> | <!-- [rfced] Section 3.2.2: We had trouble parsing these sentences. | |||
If the suggested text is not correct, please clarify the meaning of | ||||
"the receiver is unaware if the sender is injecting traffic" and | ||||
"reducing the announced receive window to a few packets and perform". | ||||
<section title="Detecting packet losses and retransmissions"> | Original: | |||
The problem is that the receiver is unaware if the | ||||
sender is injecting traffic at any point in time, and so, it is | ||||
unable to use these quiet intervals to perform measurements. The | ||||
receiver can however, force periodic slowdowns, reducing the | ||||
announced receive window to a few packets and perform the | ||||
measurements then. | ||||
<t>The rLEDBAT receiver is capable of detecting retransmitted packets in | Suggested: | |||
the following way. We call RCV.HGH the highest sequence number corresponding to | The problem is that the receiver is unaware of whether the | |||
a received byte of data (not assuming that all bytes with smaller sequence numbe | sender is injecting traffic at any point in time; it is therefore | |||
rs have been received already, there may be holes) and we call TSV.HGH the TSVal | unable to use these quiet intervals to perform measurements. The | |||
value corresponding to the segment in which that byte was carried. SEG.SEQ stan | receiver can, however, force periodic slowdowns, reducing the | |||
ds for the sequence number of a newly received segment and we call TSV.SEQ the T | announced receive window to a few packets and performing the | |||
SVal value of the newly received segment.</t> | measurements at that time. --> | |||
<t>If SEG.SEQ &lt; RCV.HGH and TSV.SEQ &gt; TSV.HGH then the newly received | ||||
segment is a retransmission. This is so because the newly received segment was g | ||||
enerated later than another already received segment which contained data with a | ||||
larger sequence number. This means that this segment was lost and was retransmi | ||||
tted.</t> | ||||
<t>The proposed mechanism to detect retransmissions at the receiver fails | </t> | |||
when there are window tail drops. If all packets in the tail of the window are | <t>It is possible for the rLEDBAT receiver to perform multiple measure | |||
lost, the receiver will not be able to detect a mismatch between the sequence nu | ments to assess both the TS units and the relative clock skew during the lifetim | |||
mbers of the packets and the order of the timestamps. In this case, rLEDBAT will | e of the connection, in order to obtain more accurate results. Clock skew measur | |||
not react to losses but the TCP congestion controller at the sender will, most | ements are more accurate if the time period used to discover the skew is larger, | |||
likely reducing its window to 1MSS and take over the control of the sending rate | as the impact of the skew becomes more apparent. It is a reasonable approach f | |||
, until slow start ramps up and catches the current value of the rLEDBAT window. | or the rLEDBAT receiver to perform an early discovery of the TS units (and the c | |||
</t> | lock skew) using the first few packets of the TCP connection and then improve th | |||
e accuracy of the TS units/clock skew estimation using periodic measurements lat | ||||
er in the lifetime of the connection. </t> | ||||
</section> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>Detecting Packet Losses and Retransmissions</name> | ||||
<t>The rLEDBAT receiver is capable of detecting retransmitted packets as | ||||
follows. We call RCV.HGH the highest sequence number corresponding to a receive | ||||
d byte of data (without assuming that all bytes with smaller sequence numbers | ||||
have already been received; there may be holes), and we call TSV.HGH the TSval value c | ||||
orresponding to the segment in which that byte was carried. SEG.SEQ stands for t | ||||
he sequence number of a newly received segment, and we call TSV.SEQ the TSval va | ||||
lue of the newly received segment.</t> | ||||
      <t>If SEG.SEQ &lt; RCV.HGH and TSV.SEQ &gt; TSV.HGH, then the newly rece | ||||
ived segment is a retransmission. This is so because the newly received segment | ||||
was generated later than another already-received segment that contained data wi | ||||
th a larger sequence number. This means that this segment was lost and was retra | ||||
nsmitted.</t> | ||||
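The check above can be sketched as follows. The sketch assumes that only data-carrying segments are passed in, that RCV.HGH is tracked as the sequence number of the highest received byte, and that both sequence numbers and TSval values are compared using 32-bit modular arithmetic. The names seq_lt and RetransmissionDetector are hypothetical.

```python
MOD = 2 ** 32


def seq_lt(a, b):
    """True if a precedes b in 32-bit modular (wrapping) arithmetic."""
    return a != b and ((b - a) % MOD) < 2 ** 31


class RetransmissionDetector:
    """Tracks RCV.HGH and TSV.HGH to flag retransmissions (illustrative only)."""

    def __init__(self):
        self.rcv_hgh = None  # highest sequence number of a received byte
        self.tsv_hgh = None  # TSval of the segment that carried that byte

    def on_segment(self, seg_seq, seg_len, tsv_seq):
        """Return True if the segment is classified as a retransmission."""
        if seg_len == 0:
            return False  # pure ACKs carry no data and are ignored here
        last_byte = (seg_seq + seg_len - 1) % MOD
        if self.rcv_hgh is None:
            self.rcv_hgh, self.tsv_hgh = last_byte, tsv_seq
            return False
        # SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH
        retransmission = (seq_lt(seg_seq, self.rcv_hgh)
                          and seq_lt(self.tsv_hgh, tsv_seq))
        if seq_lt(self.rcv_hgh, last_byte):
            self.rcv_hgh, self.tsv_hgh = last_byte, tsv_seq
        return retransmission
```

Note that a segment that is merely reordered carries a TSval that is not larger than TSV.HGH, so it is not flagged by this check.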
<t>The proposed mechanism to detect retransmissions at the receiver fail | ||||
s when there are window tail drops. If all packets in the tail of the window are | ||||
lost, the receiver will not be able to detect a mismatch between the sequence n | ||||
umbers of the packets and the order of the timestamps. In this case, rLEDBAT | ||||
will not react to losses; however, the TCP congestion controller at the sender | ||||
will, most likely reducing its window to 1 MSS and taking over the control of | ||||
the sending rate until slow start ramps up and catches up to the current | ||||
value of the rLEDBAT window.
</section> | <!-- [rfced] Section 3.3: This sentence does not parse. If the | |||
suggested text is not correct, please clarify "reducing its window to | ||||
1MSS and take over the control". | ||||
</section> | Original (the previous sentence is included for context): | |||
If all packets in the tail | ||||
of the window are lost, the receiver will not be able to detect a | ||||
mismatch between the sequence numbers of the packets and the order of | ||||
the timestamps. In this case, rLEDBAT will not react to losses but | ||||
the TCP congestion controller at the sender will, most likely | ||||
reducing its window to 1MSS and take over the control of the sending | ||||
rate, until slow start ramps up and catches the current value of the | ||||
rLEDBAT window. | ||||
<section title="Experiment Considerations"> | Suggested (the missing space in "1MSS" has been added): | |||
<t>The status of this document is Experimental. The general purpose of th | In this case, rLEDBAT will not react to losses; however, | |||
e proposed experiment is to gain more experience running rLEDBAT over different | the TCP congestion controller at the sender will, most likely | |||
network paths to see if the proposed rLEDBAT parameters perform well in differen | reducing its window to 1 MSS and taking over the control of the | |||
t situations. Specifically, we would like to learn about the following aspects o | sending rate until slow start ramps up and catches the current | |||
f the rLEDBAT mechanism: </t> | value of the rLEDBAT window. --> | |||
<t><list> | ||||
<t>- Interaction between the sender and the receiver Congestion c | ||||
ontrol algorithms. rLEDBAT posits that because the rLEDBAT receiver is using a l | ||||
ess-than-best-effort congestion control algorithm, the receiver congestion contr | ||||
ol algorithm will expose a smaller congestion window (conveyed though the Receiv | ||||
e Window) than the one resulting from the congestion control algorithm executed | ||||
at the sender. One of the purposes of the experiment is learn how these two inte | ||||
ract and if the assumption that the receiver side is always controlling the send | ||||
er's rate (and making rLEDBAT effective) holds. The experiment should include th | ||||
e different congestion control algorithms that are currently widely used in the | ||||
Internet, including Cubic, BBR and LEDBAT(++).</t> | ||||
<t>- Interaction between rLEDBAT and Active Queue Management tech | ||||
niques such as Codel, PIE and L4S.</t> | ||||
<t>- How the rLEDBAT should resume after a period during which th | ||||
ere was no incoming traffic and the information about the rLEDBAT state informat | ||||
ion is potentially dated.</t> | ||||
</list></t> | ||||
<section title="Status of the experiment at the time of this writing."> | ||||
<t>Currently there are the following implementations of rLEDBAT t | ||||
hat can be used for experimentation: | ||||
<list> | ||||
<t>- Windows 11. rLEDBAT is available in Microsof | ||||
t's Windows 11 22H2 since October 2023 <xref target="Windows11" />.</t> | ||||
<t>- Windows Server 2022. rLEDBAT is available in | ||||
Microsoft's Windows Server 2022 since September 2022 <xref target="WindowsServe | ||||
r" />.</t> | ||||
<t>- Apple. rLEDBAT is available in MacOS and iOS | ||||
since 2021 <xref target="Apple" />.</t> | ||||
<t>- Linux implementation, open source, available | ||||
since 2022 at https://github.com/net-research/rledbat_module.</t> | ||||
<t>- ns3 implementation, open source, available s | ||||
ince 2020 at https://github.com/manas11/implementation-of-rLEDBAT-in-ns-3.</t> | ||||
</list></t> | ||||
<t>In addition, rLEDBAT has been deployed by Microsoft in | </t> | |||
wide scale in the following services: | </section> | |||
<list> | </section> | |||
<t>- BITS (Background Intelligent Transfe | <section numbered="true" anchor="sect-5" toc="default"> | |||
r Service)</t> | <name>Experiment Considerations</name> | |||
<t>- DO (Delivery Optimization) service</ | <t>The status of this document is Experimental. The general purpose of the | |||
t> | proposed experiment is to gain more experience running rLEDBAT over different n | |||
<t>- Windows update # using DO</t> | etwork paths to see if the proposed rLEDBAT parameters perform well in different | |||
<t>- Windows Store # using DO</t> | situations. Specifically, we would like to learn about the following aspects of | |||
<t>- OneDrive</t> | the rLEDBAT mechanism: </t> | |||
<t>- Windows Error Reporting # wermgr.exe | <ul spacing="normal"> | |||
; werfault.exe</t> | <li> | |||
<t>- System Center Configuration Manager | <t>Interaction between the sender's and receiver's congestion control | |||
(SCCM)</t> | algorithms. rLEDBAT posits that because the rLEDBAT receiver is using a le | |||
<t>- Windows Media Player</t> | ss-than-best-effort congestion control algorithm, the receiver's congestion cont | |||
<t>- Microsoft Office</t> | rol algorithm will expose a smaller congestion window (conveyed through the Rece | |||
<t>- Xbox (download games) # using DO</t> | ive Window) than the one resulting from the congestion control algorithm execute | |||
</list> </t> | d at the sender. One of the purposes of the experiment is to learn how these two | |||
algorithms | ||||
interact and if the assumption that the receiver side is always controlling the | ||||
sender's rate (and making rLEDBAT effective) holds. The experiment should includ | ||||
e the different congestion control algorithms that are currently widely used in | ||||
the Internet, including CUBIC, Bottleneck Bandwidth and Round-trip propagation t | ||||
ime (BBR), and LEDBAT(++). | ||||
<t> Some initial experiments involving rLEDBAT have been | <!-- [rfced] Section 4: We (1) changed "the sender and the receiver | |||
reported in <xref target="COMNET3" />. Experiments involving the interaction of | Congestion control algorithms" to "the sender's and receiver's | |||
LEDBAT++ and BBR are presented in <xref target="COMNET2" />. An experimental eva | congestion control algorithms" per the next sentence and | |||
luation of the LEDBAT++ algorithm is presented in <xref target="COMNET1" />. As | (2) clarified that "these two" means "these two algorithms". | |||
LEDBAT++ is one of the less-than-best-effort congestion control algorithms that | Please let us know if anything is incorrect. | |||
rLEDBAT relies on, the results regarding LEDBAT++ interaction with other congest | ||||
ion control algorithms are relevant for the understanding of rLEDBAT as well.</t | ||||
> | ||||
</section> | ||||
</section> | Original (the next sentence is included for context): | |||
- Interaction between the sender and the receiver Congestion | ||||
control algorithms. rLEDBAT posits that because the rLEDBAT | ||||
receiver is using a less-than-best-effort congestion control | ||||
algorithm, the receiver congestion control algorithm will expose a | ||||
smaller congestion window (conveyed though the Receive Window) | ||||
than the one resulting from the congestion control algorithm | ||||
executed at the sender. One of the purposes of the experiment is | ||||
learn how these two interact and if the assumption that the | ||||
receiver side is always controlling the sender's rate (and making | ||||
rLEDBAT effective) holds. | ||||
<section title="Security Considerations"> | Currently ("conveyed though the" has also been corrected): | |||
<t>Overall, we believe that rLEDBAT does not introduce any new vu | * Interaction between the sender's and receiver's congestion control | |||
lnerabilities to existing TCP endpoints, as it relies on existing TCP knobs, not | algorithms. rLEDBAT posits that because the rLEDBAT receiver is | |||
ably the Receive Window and timestamps. </t> | using a less-than-best-effort congestion control algorithm, the | |||
receiver's congestion control algorithm will expose a smaller | ||||
congestion window (conveyed through the Receive Window) than the | ||||
one resulting from the congestion control algorithm executed at | ||||
the sender. One of the purposes of the experiment is to learn how | ||||
these two algorithms interact and if the assumption that the | ||||
receiver side is always controlling the sender's rate (and making | ||||
rLEDBAT effective) holds. --> | ||||
<t>Specifically, rLEDBAT uses RCV.WND to modulate the rate of the sender | </t> | |||
. An attacker wishing to starve a flow can simply reduce the RCV.WND, irrespecti | </li> | |||
ve of whether rLEDBAT is being used or not.</t> | <li> | |||
<t>Interaction between rLEDBAT and Active Queue Management techniques | ||||
such as Controlled Delay (CoDel); Proportional Integral controller Enhanced (PIE | ||||
); and Low Latency, Low Loss, and Scalable Throughput (L4S). | ||||
</t> | ||||
</li> | ||||
<li> | ||||
          <t>How rLEDBAT should resume after a period during which there was | ||||
no incoming traffic and the rLEDBAT state information is | ||||
potentially dated.</t> | ||||
</li> | ||||
</ul> | ||||
<section numbered="true" toc="default"> | ||||
<name>Status of the Experiment at the Time of This Writing</name> | ||||
<t>Currently, the following implementations of rLEDBAT can be used for e | ||||
xperimentation:</t> | ||||
<ul spacing="normal"> | ||||
<li> | ||||
<t>Windows 11. rLEDBAT is available in Microsoft's Windows 11 | ||||
22H2 since October 2023 <xref target="Windows11" format="default"/>.</t> | ||||
</li> | ||||
<li> | ||||
<t>Windows Server 2022. rLEDBAT is available in Microsoft's Wi | ||||
ndows Server 2022 since September 2022 <xref target="WindowsServer" format="defa | ||||
ult"/>.</t> | ||||
</li> | ||||
<li> | ||||
<t>Apple. rLEDBAT is available in macOS and iOS since 2021 < | ||||
xref target="Apple" format="default"/>.</t> | ||||
</li> | ||||
<li> | ||||
<t>Linux implementation, open source, available since 2022 at <eref | ||||
target="https://github.com/net-research/rledbat_module" brackets="angle"/>.</t> | ||||
</li> | ||||
<li> | ||||
<t>ns3 implementation, open source, available since 2020 at <eref ta | ||||
rget="https://github.com/manas11/implementation-of-rLEDBAT-in-ns-3" brackets="an | ||||
gle"/>. | ||||
<t> We can further ask ourselves whether the attacker can use the rLEDBAT | <!-- [rfced] Section 4.1: | |||
mechanisms in place to force the rLEDBAT receiver to reduce the RCV WND. There | ||||
are two ways an attacker can do that. One would be to introduce an artificial de | ||||
lay to the packets either by actually delaying the packets or modifying the Time | ||||
stamps. This would cause the rLEDBAT receiver to believe that a queue is buildin | ||||
g up and reduce the RCV.WND. Note that an attacker to do that must be on path, s | ||||
o if that is the case, it is probably more direct to simply reduce the RCV.WND.< | ||||
/t> | ||||
<t> The other option would be for the attacker to make the rLEDBAT | ||||
receiver believe that a loss has occurred. To do that, it basically needs to re | ||||
transmit an old packet (to be precise, it needs to transmit a packet with the ri | ||||
ght sequence number and the right port and IP numbers). This means that the atta | ||||
cker can achieve a reduction of incoming traffic to the rLEDBAT receiver not onl | ||||
y by modifying the RCV.WND field of the packets originated from the rLEDBAT host | ||||
, but also by injecting packets with the proper sequence number in the other dir | ||||
ection. This may slightly expand the attack surface.</t> | ||||
</section> | ||||
<section title="IANA Considerations"> | a) Because the latest version of [Windows11] is dated October 2024 | |||
<t>No actions are required from IANA.</t> | and "2023" is not mentioned on the page, we cannot verify "since | |||
</section> | October 2023". A Google search for "Windows 11 22H2 ledbat 2023" | |||
does not provide any information. Will "October 2023" be clear to | ||||
readers, or should this item be rephrased? If you would like to | ||||
rephrase, please provide clarifying text. | ||||
<section title="Acknowledgements"> | Original: | |||
- Windows 11. rLEDBAT is available in Microsoft's Windows 11 22H2 | ||||
since October 2023 [Windows11]. | ||||
<t>This work was supported by the EU through the StandICT projects RXQ, C | b) Would you like us to cite these GitHub pages and list them in the | |||
CI and CEL6, the NGI Pointer RIM project and the H2020 5G-RANGE project and by t | Informative References section, as suggested below? | |||
he Spanish Ministry of Economy and Competitiveness through the 5G-City project ( | ||||
TEC2016-76795-C6-3-R).</t> | ||||
<t>We would like to thank ICCRG chairs Reese Enghardt and Vidhi Goel for | Original: | |||
their support on this work. We would also like to thank Daniel Havey for his hel | - Linux implementation, open source, available since 2022 at | |||
p. We would like to thank Colin Perkins, Mirja Kuehlewind, and Vidhi Goel for th | https://github.com/net-research/rledbat_module. | |||
eir reviews and comments on earlier versions of this document.</t> | ||||
- ns3 implementation, open source, available since 2020 at | ||||
https://github.com/manas11/implementation-of-rLEDBAT-in-ns-3. | ||||
Suggested: | ||||
* Linux implementation, open source, available since 2022 | ||||
[rledbat_module]. | ||||
* ns3 implementation, open source, available since 2020 | ||||
[rLEDBAT-in-ns3]. | ||||
... | ||||
[rledbat_module] "rledbat_module", commit d82ff20, September 2022, | ||||
<https://github.com/net-research/rledbat_module>. | ||||
[rLEDBAT-in-ns3] "Implementation-of-rLEDBAT-in-ns-3", commit | ||||
2ab34ad, June 2020, | ||||
<https://github.com/manas11/ | ||||
implementation-of-rLEDBAT-in-ns-3>. --> | ||||
</t> | ||||
</li> | ||||
</ul> | ||||
<t>In addition, rLEDBAT has been deployed by Microsoft at wide scale in | ||||
the following services: | ||||
</t> | ||||
<ul spacing="normal"> | ||||
<li> | ||||
<t>BITS (Background Intelligent Transfer Service)</t> | ||||
</li> | ||||
<li> | ||||
<t>DO (Delivery Optimization) service</t> | ||||
</li> | ||||
<li> | ||||
<t>Windows update # using DO</t> | ||||
</li> | ||||
<li> | ||||
<t>Windows Store # using DO</t> | ||||
</li> | ||||
<li> | ||||
<t>OneDrive</t> | ||||
</li> | ||||
<li> | ||||
<t>Windows Error Reporting # wermgr.exe; werfault.exe</t> | ||||
</li> | ||||
<li> | ||||
<t>System Center Configuration Manager (SCCM)</t> | ||||
</li> | ||||
<li> | ||||
<t>Windows Media Player</t> | ||||
</li> | ||||
<li> | ||||
<t>Microsoft Office</t> | ||||
</li> | ||||
<li> | ||||
<t>Xbox (download games) # using DO</t> | ||||
</li> | ||||
<!-- [rfced] Section 4.1: Do the "#" symbols mean "number" in these | ||||
items or something else? Will the text be clear "as is" to readers? | ||||
If not, please clarify. | ||||
Original: | ||||
- Windows update # using DO | ||||
- Windows Store # using DO | ||||
... | ||||
- Windows Error Reporting # wermgr.exe; werfault.exe | ||||
... | ||||
- Xbox (download games) # using DO --> | ||||
</ul> | ||||
<t> Some initial experiments involving rLEDBAT have been reported in <xr | ||||
ef target="COMNET3" format="default"/>. Experiments involving the interaction be | ||||
tween LEDBAT++ and BBR are presented in <xref target="COMNET2" format="default"/ | ||||
>. An experimental evaluation of the LEDBAT++ algorithm is presented in <xref ta | ||||
rget="COMNET1" format="default"/>. As LEDBAT++ is one of the less-than-best-effo | ||||
rt congestion control algorithms that rLEDBAT relies on, the results regarding h | ||||
ow LEDBAT++ interacts with other congestion control algorithms are relevant for | ||||
the understanding of rLEDBAT as well.</t> | ||||
</section> | ||||
</section> | </section> | |||
<section numbered="true" toc="default"> | ||||
<name>Security Considerations</name> | ||||
<t>Overall, we believe that rLEDBAT does not introduce any new vulnerabili | ||||
ties to existing TCP endpoints, as it relies on existing TCP knobs, notably the | ||||
Receive Window and timestamps. </t> | ||||
<t>Specifically, rLEDBAT uses RCV.WND to modulate the rate of the sender. | ||||
An attacker wishing to starve a flow can simply reduce the RCV.WND, irrespective | ||||
of whether rLEDBAT is being used or not.</t> | ||||
<t> We can further ask ourselves whether the attacker can use the rLEDBAT | ||||
mechanisms in place to force the rLEDBAT receiver to reduce the RCV.WND. There a | ||||
re two ways an attacker can do this:</t> | ||||
<ul spacing="normal"> | ||||
<li>One would be to introduce an artificial delay to the packets by either | ||||
actually delaying the packets or modifying the timestamps. This would cause the | ||||
rLEDBAT receiver to believe that a queue is building up and reduce the RCV.WND. | ||||
Note that to do so, an attacker must be on path, so if that is the case, it is | ||||
probably more direct to simply reduce the RCV.WND.</li> | ||||
<li>The other option would be for the attacker to make the rLEDBAT receive | ||||
r believe that a loss has occurred. To do this, it basically needs to retransmit | ||||
an old packet (to be precise, it needs to transmit a packet with the correct se | ||||
quence number and the correct port and IP numbers). This means that the attacker | ||||
can achieve a reduction of incoming traffic to the rLEDBAT receiver not only by | ||||
modifying the RCV.WND field of the packets originated from the rLEDBAT host but | ||||
also by injecting packets with the proper sequence number in the other directio | ||||
n. This may slightly expand the attack surface.</li> | ||||
</ul> | ||||
</section> | ||||
<section numbered="true" toc="default"> | ||||
<name>IANA Considerations</name> | ||||
<t>This document has no IANA actions.</t> | ||||
</section> | ||||
</middle> | </middle> | |||
<back> | <back> | |||
<references title="Informative References"> | <displayreference target="I-D.irtf-iccrg-ledbat-plus-plus" to="LEDBAT++"/> | |||
<?rfc include='reference.RFC.9293'?> | ||||
<?rfc include='reference.I-D.irtf-iccrg-ledbat-plus-plus" ?> | <references> | |||
<?rfc include="reference.RFC.6817" ?> | <name>References</name> | |||
<?rfc include="reference.RFC.7323" ?> | <references anchor="sec-normative-references"> | |||
<?rfc include="reference.RFC.9438"?> | <name>Normative References</name> | |||
<?rfc include="reference.RFC.5681"?> | <xi:include | |||
href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/> | ||||
<xi:include | ||||
href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/> | ||||
</references> | ||||
<references anchor="sec-informative-references"> | ||||
<name>Informative References</name> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.929 | ||||
3.xml"/> | ||||
<reference anchor="Windows11" > | <!-- draft-irtf-iccrg-ledbat-plus-plus (I-D Exists) --> | |||
<front> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ir | |||
<title>What's new in Delivery Optimization</title> | tf-iccrg-ledbat-plus-plus.xml"/> | |||
<author initials="C.F." surname="Forsmann" fullname=" | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.681 | |||
Carmen"> | 7.xml"/> | |||
<organization /> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.732 | |||
</author> | 3.xml"/> | |||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.943 | ||||
8.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.568 | ||||
1.xml"/> | ||||
<date year="2023" /> | <reference anchor="Windows11" target="https://learn.microsoft.com/en-us/wi | |||
</front> | ndows/deployment/do/whats-new-do"> | |||
<seriesInfo name="Microsoft Documentation" value="https:/ | <front> | |||
/learn.microsoft.com/en-us/windows/deployment/do/whats-new-do" /> | <title>What's new in Delivery Optimization</title> | |||
<refcontent></refcontent> | <author> | |||
</reference> | <organization>Microsoft</organization> | |||
</author> | ||||
<date month="October" year="2024"/> | ||||
</front> | ||||
<refcontent>Microsoft Windows Documentation</refcontent> | ||||
</reference> | ||||
<reference anchor="WindowsServer" > | <reference anchor="WindowsServer" target="https://techcommunity.microsoft. | |||
<front> | com/t5/networking-blog/ledbat-background-data-transfer-for-windows/ba-p/3639278" | |||
<title>LEDBAT Background Data Transfer for Wi | > | |||
ndows</title> | <front> | |||
<author initials="D.H." surname="Havey" fulln | <title>LEDBAT Background Data Transfer for Windows</title> | |||
ame="Daniel"> | <author initials="D" surname="Havey" fullname="Daniel"> | |||
<organization /> | <organization/> | |||
</author> | </author> | |||
<date month="September" year="2022"/> | ||||
</front> | ||||
<refcontent>Microsoft Networking Blog</refcontent> | ||||
</reference> | ||||
<date year="2022" /> | <reference anchor="Apple" target="https://developer.apple.com/videos/play/ | |||
</front> | wwdc2021/10239/"> | |||
<seriesInfo name="Microsoft Blog" value="https:// | <front> | |||
techcommunity.microsoft.com/t5/networking-blog/ledbat-background-data-transfer-f | <title>Reduce network delays for your app</title> | |||
or-windows/ba-p/3639278" /> | <author initials="S" surname="Cheshire" fullname="Stuart Cheshire"> | |||
<refcontent></refcontent> | <organization/> | |||
</reference> | </author> | |||
<author initials="V" surname="Goel" fullname="Vidhi Goel "> | ||||
<organization/> | ||||
</author> | ||||
<date year="2021"/> | ||||
</front> | ||||
<refcontent>Apple Worldwide Developers Conference (WWDC2021), Video</ref | ||||
content> | ||||
</reference> | ||||
<reference anchor="Apple" > | <reference anchor="COMNET3"> | |||
<front> | <front> | |||
<title>Reduce network delays for your | <title> Design, implementation and validation of a receiver-driven les | |||
app</title> | s-than-best-effort transport </title> | |||
<author initials="S.C." surname="Stua | <author initials="M" surname="Bagnulo" fullname="Marcelo Bagnulo"> | |||
rt" fullname="Cheshire"> | <organization/> | |||
<organization /> | </author> | |||
</author> | <author initials="A" surname="García-Martínez" fullname="Alberto Garcí | |||
<author initials="V.G." surname="Vidh | a-Martínez"> | |||
i" fullname=" Goel "> | <organization/> | |||
<organization /> | </author> | |||
</author> | <author initials="A.M." surname="Mandalari" fullname="Anna Maria Manda | |||
<date year="2021" /> | lari"> | |||
</front> | <organization/> | |||
<seriesInfo name="WWDC21" value="https:// | </author> | |||
developer.apple.com/videos/play/wwdc2021/10239/" /> | <author initials="P" surname="Balasubramanian" fullname="Praveen Balas | |||
<refcontent></refcontent> | ubramanian"> | |||
</reference> | <organization/> | |||
</author> | ||||
<author initials="D" surname="Havey" fullname="Daniel Havey"> | ||||
<organization/> | ||||
</author> | ||||
<author initials="G" surname="Montenegro" fullname="Gabriel Montenegro | ||||
"> | ||||
<organization/> | ||||
</author> | ||||
<date month="September" year="2023"/> | ||||
</front> | ||||
<refcontent>Computer Networks, vol. 233</refcontent> | ||||
<seriesInfo name="DOI" value="10.1016/j.comnet.2023.109841"/> | ||||
</reference> | ||||
<reference anchor="COMNET3" > | <reference anchor="COMNET2"> | |||
<front> | <front> | |||
<title> Design, implementation and va | <title>When less is more: BBR versus LEDBAT++</title> | |||
lidation of a receiver-driven less-than-best-effort transport </title> | <author initials="M" surname="Bagnulo" fullname="Marcelo Bagnulo"> | |||
<author initials="M.B." surname="Bagn | <organization/> | |||
ulo" fullname="Marcelo Bagnulo"> | </author> | |||
<organization /> | <author initials="A" surname="García-Martínez" fullname="Alberto Garcí | |||
</author> | a-Martínez"> | |||
<author initials="A.G." surname="Garc | <organization/> | |||
ia-Martinez" fullname="Alberto Garcia-Martinez"> | </author> | |||
<organization /> | <date month="December" year="2022"/> | |||
</author> | </front> | |||
<author initials="A.M." surname="Mand | <refcontent>Computer Networks, vol. 219</refcontent> | |||
alari" fullname="Anna Maria Mandalari"> | <seriesInfo name="DOI" value="10.1016/j.comnet.2022.109460"/> | |||
<organization /> | </reference> | |||
</author> | ||||
<author initials="P.B," surname="Bala | ||||
subramanian" fullname="Praveen Balasubramanian"> | ||||
<organization /> | ||||
</author> | ||||
<author initials="D.H." surname="Have | ||||
y" fullname="Daniel Havey"> | ||||
<organization /> | ||||
</author> | ||||
<author initials="G.M." surname="Mont | ||||
enegro" fullname="Gabriel Montenegro"> | ||||
<organization /> | ||||
</author> | ||||
<date year="2022" /> | ||||
</front> | ||||
<seriesInfo name="Computer Networks" valu | ||||
e="Volume 233" /> | ||||
<refcontent></refcontent> | ||||
</reference> | ||||
<reference anchor="COMNET2" > | <reference anchor="COMNET1"> | |||
<front> | <front> | |||
<title>When less is m | <title>An experimental evaluation of LEDBAT++ </title> | |||
ore: BBR versus LEDBAT++</title> | <author initials="M" surname="Bagnulo" fullname="Marcelo Bagnulo"> | |||
<author initials="M.B | <organization/> | |||
." surname="Bagnulo" fullname="Marcelo Bagnulo"> | </author> | |||
<organization /> | <author initials="A" surname="García-Martínez" fullname="Alberto Garcí | |||
</author> | a-Martínez"> | |||
<author initials="A.G | <organization/> | |||
." surname="Garcia-Martinez" fullname="Alberto Garcia-Martinez"> | </author> | |||
<organization /> | <date month="July" year="2022"/> | |||
</author> | </front> | |||
<date year="2022" /> | <refcontent>Computer Networks, vol. 212</refcontent> | |||
</front> | <seriesInfo name="DOI" value="10.1016/j.comnet.2022.109036"/> | |||
<seriesInfo name="Compute | </reference> | |||
r Networks" value="Volume 219" /> | ||||
<refcontent></refcontent> | ||||
</reference> | ||||
<reference anchor="COMNET | <!-- [rfced] References: We found and added DOIs for [COMNET1], | |||
1" > | [COMNET2], and [COMNET3]. The DOIs lead to open-access versions of | |||
<front> | those references. Please review our updates and the new links, and | |||
<titl | let us know if anything is incorrect. | |||
e>An experimental evaluation of LEDBAT++ </title> | ||||
<auth | ||||
or initials="M.B." surname="Bagnulo" fullname="Marcelo Bagnulo"> | ||||
< | ||||
organization /> | ||||
</aut | ||||
hor> | ||||
<auth | ||||
or initials="A.G." surname="Garcia-Martinez" fullname="Alberto Garcia-Martinez"> | ||||
< | ||||
organization /> | ||||
</aut | ||||
hor> | ||||
<date | ||||
year="2022" /> | ||||
</front> | ||||
<seri | ||||
esInfo name='Computer Networks' value="Volume 212"/> | ||||
<refconte | ||||
nt></refcontent> | ||||
</reference> | ||||
</references> | Original: | |||
[COMNET1] Bagnulo, M.B. and A.G. Garcia-Martinez, "An experimental | ||||
evaluation of LEDBAT++", Computer Networks Volume 212, | ||||
2022. | ||||
<section title="Terminology"> | [COMNET2] Bagnulo, M.B. and A.G. Garcia-Martinez, "When less is | |||
more: BBR versus LEDBAT++", Computer Networks Volume 219, | ||||
2022. | ||||
<t>We use the following abbreviations throughout the text. We include a sho | [COMNET3] Bagnulo, M.B., Garcia-Martinez, A.G., Mandalari, A.M., | |||
rt list for the reader's convenience:</t> | Balasubramanian, P.B,., Havey, D.H., and G.M. Montenegro, | |||
<t><list> | "Design, implementation and validation of a receiver- | |||
<t>RCV.WND: the value included in the Receive Window fiel | driven less-than-best-effort transport", Computer | |||
d of the TCP header (whose computation is modified by this specification)</t> | Networks Volume 233, 2022. | |||
<t>SND.WND: The TCP sender's window</t> | ||||
<t>cwnd: the congestion window as computed by the conges | ||||
tion control algorithm running at the TCP sender.</t> | ||||
<t>RLWND: the window value calculated by rLEDBAT algorith | ||||
m</t> | ||||
<t>fcwnd: the value that a standard RFC793bis TCP receive | ||||
r calculates to set in the receive window for flow control purposes.</t> | ||||
<t>RCV.HGH: the highest sequence number corresponding to | ||||
a received byte of data at one point in time</t> | ||||
<t>TSV.HGH: the TSVal value corresponding to the | ||||
segment in which RCV.HGH was carried at that point in time</t> | ||||
<t>SEG.SEQ: the sequence number of the last received segm | ||||
ent</t> | ||||
<t>TSV.SEQ: the TSVal value of the last received segment< | ||||
/t> | ||||
</list></t> | ||||
</section> | ||||
<section title="rLEDBAT pseudo-code"> | Currently: | |||
[COMNET1] Bagnulo, M. and A. García-Martínez, "An experimental | ||||
evaluation of LEDBAT++", Computer Networks, vol. 212, | ||||
DOI 10.1016/j.comnet.2022.109036, July 2022, | ||||
<https://doi.org/10.1016/j.comnet.2022.109036>. | ||||
<t>We next describe how to integrate the proposed rLEDBAT mechanisms and | [COMNET2] Bagnulo, M. and A. García-Martínez, "When less is more: | |||
an LBE delay-based congestion control algorithm such as LEDBAT or LEDBAT++. We | BBR versus LEDBAT++", Computer Networks, vol. 219, | |||
describe the integrated algorithm as two procedures, one that is executed when | DOI 10.1016/j.comnet.2022.109460, December 2022, | |||
a packet is received by a rLEDBAT-enabled endpoint (Figure 2) and another that i | <https://doi.org/10.1016/j.comnet.2022.109460>. | |||
s executed when the rLEDBAT-enabled endpoint sends a packet (Figure 3). At the b | ||||
eginning, RLWND is set to its maximum value, so that the sending rate of the sen | ||||
der is governed by the flow control algorithm of the receiver and the TCP slow s | ||||
tart mechanism of the sender, and the ackedBytes variable is set to 0. </t> | ||||
<t>We assume that the LBE congestion control algorithm defines a WindowIn | [COMNET3] Bagnulo, M., García-Martínez, A., Mandalari, A.M., | |||
crease() function and a WindowDecrease() function. For example, in the case of L | Balasubramanian, P., Havey, D., and G. Montenegro, | |||
EDBAT++, the WindowIncrease() function is an additive increase, while the Window | "Design, implementation and validation of a receiver- | |||
Decrease() function is a multiplicative decrease. In the case of the WindowIncre | driven less-than-best-effort transport", Computer | |||
ase(), we assume that it takes as input the current window size and the number o | Networks, vol. 233, DOI 10.1016/j.comnet.2023.109841, | |||
f bytes that were acknowledged since the last window update (ackedBytes) and ret | September 2023, | |||
urns as output the updated window size. In the case of WindowDecrease(), it take | <https://doi.org/10.1016/j.comnet.2023.109841>. --> | |||
s as input the current window size and returns the updated window size. </t> | ||||
<t>The data structures used in the algorithms are as follows. The sentLis | </references> | |||
t is a list that contains the TSval and the local send time of each packet sent | </references> | |||
by the rLEDBAT-enabled endpoint. The TSecr field of the packets received by the | <section numbered="true" toc="default"> | |||
rLEDBAT-enabled endpoint are matched with the sendList to compute the RTT.</t> | <name>rLEDBAT Pseudocode</name> | |||
<t>In this section, we describe how to integrate the proposed rLEDBAT mech | ||||
anisms and an LBE delay-based congestion control algorithm such as LEDBAT or LE | ||||
DBAT++. We describe the integrated algorithm as two procedures: one that | ||||
is executed when a packet is received by a rLEDBAT-enabled endpoint (<xref targe | ||||
t="fig2"/>) and another that is executed when the rLEDBAT-enabled endpoint sends | ||||
a packet (<xref target="fig3"/>). At the beginning, RLWND is set to its maximum | ||||
value, so that the sending rate of the sender is governed by the flow control a | ||||
lgorithm of the receiver and the TCP slow start mechanism of the sender, and the | ||||
ackedBytes variable is set to 0. </t> | ||||
<t>We assume that the LBE congestion control algorithm defines a WindowInc | ||||
rease() function and a WindowDecrease() function. For example, in the case of LE | ||||
DBAT++, the WindowIncrease() function is an additive increase, while the WindowD | ||||
ecrease() function is a multiplicative decrease. In the case of the WindowIncrea | ||||
se() function, we assume that it takes as input the current window size and the | ||||
number of bytes that were acknowledged since the last window update (ackedBytes) | ||||
and returns as output the updated window size. In the case of the WindowDecreas | ||||
e() function, it takes as input the current window size and returns the updated | ||||
window size. </t> | ||||
<t>The data structures used in the algorithms are as follows. The sentList | ||||
is a list that contains the TSval and the local send time of each packet sent | ||||
by the rLEDBAT-enabled endpoint. The TSecr field of the packets received by the | ||||
rLEDBAT-enabled endpoint is matched with the sendList to compute the RTT. | ||||
<t>The RTT values computed for each received packet are stored in the RTT | <!-- [rfced] Appendix B: As it appears that "TSecr field" should | |||
list, which contains also the received TSecr (to avoid using multiple packets wi | remain singular (i.e., not be "TSecr fields") and "TSecr field" is | |||
th the same TSecr for RTT calculations, only the first packet received for a giv | the subject of this sentence, we changed "are" to "is". Please let | |||
en TSecr is used to compute the RTT). It also contains the local time at which t | us know if "TSecr field" should be "TSecr fields" instead. | |||
he packet was received, to allow selecting the RTTs measured in a given period ( | ||||
e.g., in the last 10 minutes). RTTlist is initialized with all its values to its | ||||
maximum.</t> | ||||
<figure title="Procedure executed when a packet is received"> | Original: | |||
The TSecr field of | ||||
the packets received by the rLEDBAT-enabled endpoint are matched with | ||||
the sendList to compute the RTT. | ||||
<sourcecode> | Currently: | |||
The TSecr field of | ||||
the packets received by the rLEDBAT-enabled endpoint is matched with | ||||
the sendList to compute the RTT. --> | ||||
</t> | ||||
<t>The RTT values computed for each received packet are stored in the RTTl | ||||
ist, which also contains the received TSecr (to avoid using multiple packets wit | ||||
h the same TSecr for RTT calculations, only the first packet received for a give | ||||
n TSecr is used to compute the RTT). It also contains the local time at which th | ||||
e packet was received, to allow selecting the RTTs measured in a given period (e | ||||
.g., in the last 10 minutes). RTTlist is initialized with all its values to its | ||||
maximum.</t> | ||||
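The RTT bookkeeping described in the two paragraphs above can be sketched in Python as follows. This is an illustrative sketch, not part of the RFC: the class name RttTracker, its method names, and the use of infinity as the "maximum" initial value are all assumptions made for the example. The defaults K=4 and N=180 seconds are taken from the pseudocode in the figure that follows.

```python
from dataclasses import dataclass, field

RTT_MAX = float("inf")  # assumed stand-in for the "maximum" initial RTT value


@dataclass
class RttTracker:
    # sent_list: (TSval, local send time) for each packet sent
    sent_list: list = field(default_factory=list)
    # rtt_list: (RTT, TSecr, local receive time), one entry per distinct TSecr
    rtt_list: list = field(default_factory=list)

    def on_send(self, tsval, now):
        self.sent_list.append((tsval, now))

    def compute_rtt(self, tsecr, now):
        """RTT measured against the first sent packet whose TSval equals tsecr."""
        for tsval, sent_at in self.sent_list:
            if tsval == tsecr:
                return now - sent_at
        return None  # no matching packet was recorded

    def insert_rtt(self, rtt, tsecr, now):
        """Store a sample only if it is the first one seen for this TSecr."""
        if rtt is None or any(entry[1] == tsecr for entry in self.rtt_list):
            return
        self.rtt_list.append((rtt, tsecr, now))

    def min_last_k(self, k=4):
        """Minimum of the last k stored samples (the filtered RTT)."""
        samples = [rtt for rtt, _, _ in self.rtt_list[-k:]]
        return min(samples, default=RTT_MAX)

    def min_last_seconds(self, now, n=180.0):
        """Minimum of the samples received in the last n seconds (the base RTT)."""
        samples = [rtt for rtt, _, t in self.rtt_list if now - t <= n]
        return min(samples, default=RTT_MAX)
```

Because later packets echoing the same TSecr can only arrive later, keeping the first sample per TSecr is the same as keeping the minimum RTT for that TSecr, which is how the comment in the pseudocode describes the insertion.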
<figure anchor="fig2"> | ||||
<name>Procedure Executed When a Packet Is Received</name> | ||||
<sourcecode type="pseudocode"><![CDATA[ | ||||
procedure receivePacket() | procedure receivePacket() | |||
//Looks for first sent packet with same TSval as TSecr, and, | //Looks for first sent packet with same TSval as TSecr, and | |||
//returns time difference | //returns time difference | |||
receivedRTT = computeRTT(sentList, receivedTSecr, receivedTime) | receivedRTT = computeRTT(sentList, receivedTSecr, receivedTime) | |||
//Inserts minimum value for a given receivedTSecr | //Inserts minimum value for a given receivedTSecr | |||
//note that many received packets may contain same receivedTSecr | //Note that many received packets may contain same receivedTSecr | |||
insertRTT (RTTlist, receivedRTT, receivedTSecr, receivedTime) | insertRTT (RTTlist, receivedRTT, receivedTSecr, receivedTime) | |||
filteredRTT = minLastKMeasures(RTTlist, K=4) | filteredRTT = minLastKMeasures(RTTlist, K=4) | |||
baseRTT = minLastNSeconds(RTTlist, N=180) | baseRTT = minLastNSeconds(RTTlist, N=180) | |||
qd = filteredRTT - baseRTT | qd = filteredRTT - baseRTT | |||
//ackedBytes is the number of bytes that can be used to reduce | //ackedBytes is the number of bytes that can be used to reduce | |||
//the Receive Window - without shrinking it - if necessary | //the Receive Window - without shrinking it - if necessary | |||
ackedBytes = ackedBytes + receiveBytes | ackedBytes = ackedBytes + receiveBytes | |||
if retransmittedPacketDetected then | if retransmittedPacketDetected then | |||
RLWND = DecreaseWindow(RLWND) // Only once per RTT | RLWND = DecreaseWindow(RLWND) //Only once per RTT | |||
end if | end if | |||
if qd < T then | if qd < T then | |||
RLWND = IncreaseWindow(RLWND, ackedBytes) | RLWND = IncreaseWindow(RLWND, ackedBytes) | |||
else | else | |||
RLWND = DecreaseWindow(RLWND) | RLWND = DecreaseWindow(RLWND) | |||
end if | end if | |||
end procedure | end procedure | |||
</sourcecode> | ]]></sourcecode> | |||
</figure> | </figure> | |||
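As a hedged illustration of the receive procedure above, the following Python sketch mirrors its control flow. The window functions are placeholders only: the additive-increase and multiplicative-decrease formulas, the MSS value, the target delay TARGET, the two-segment floor, and the way "only once per RTT" is enforced are assumptions chosen for the example and are not taken from this document or from LEDBAT++.

```python
from dataclasses import dataclass

MSS = 1460      # assumed segment size, in bytes
TARGET = 0.060  # assumed target queueing delay T, in seconds


def increase_window(wnd, acked_bytes, mss=MSS):
    """Hypothetical additive increase: roughly one MSS per window of data."""
    return wnd + max(1, mss * acked_bytes // wnd)


def decrease_window(wnd, mss=MSS, factor=0.5):
    """Hypothetical multiplicative decrease with an assumed 2*MSS floor."""
    return max(2 * mss, int(wnd * factor))


@dataclass
class ReceiverState:
    rlwnd: int
    acked_bytes: int = 0
    last_loss_cut: float = float("-inf")  # time of the last loss-triggered cut


def on_receive(state, filtered_rtt, base_rtt, received_bytes,
               retransmission, now, target=TARGET):
    """Update RLWND for one received packet and return the queueing delay."""
    qd = filtered_rtt - base_rtt
    # Bytes received since the last packet was sent; reset by the send path.
    state.acked_bytes += received_bytes
    # Loss reaction, limited to once per (filtered) RTT.
    if retransmission and now - state.last_loss_cut >= filtered_rtt:
        state.rlwnd = decrease_window(state.rlwnd)
        state.last_loss_cut = now
    # Delay reaction: grow below the target, back off otherwise.
    if qd < target:
        state.rlwnd = increase_window(state.rlwnd, state.acked_bytes)
    else:
        state.rlwnd = decrease_window(state.rlwnd)
    return qd
```

The sketch keeps the structure of the figure, including the fact that the delay-based branch still runs after a loss-triggered decrease.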
<figure title="Procedure executed when a packet is sent"> | <!-- [rfced] Figures 2 and 3: Per the contents of the figures and | |||
<sourcecode> | the title of Appendix B, we set the sourcecode type to "pseudocode". | |||
Please let us know any concerns. | ||||
Please see | ||||
<https://www.rfc-editor.org/rpc/wiki/doku.php?id=sourcecode-types> | ||||
for a list of sourcecode types. --> | ||||
<figure anchor="fig3"> | ||||
<name>Procedure Executed When a Packet Is Sent</name> | ||||
<sourcecode type="pseudocode"><![CDATA[ | ||||
procedure SENDPACKET | procedure SENDPACKET | |||
if (RLWND > RLWNDPrevious) or (RLWND - RLWNDPrevious < ackedBytes) | if (RLWND > RLWNDPrevious) or (RLWND - RLWNDPrevious < ackedBytes) | |||
then | then | |||
RLWNDPrevious = RLWND | RLWNDPrevious = RLWND | |||
else | else | |||
RLWNDPrevious = RLWND - ackedBytes | RLWNDPrevious = RLWND - ackedBytes | |||
end if | end if | |||
ackedBytes = 0 | ackedBytes = 0 | |||
RLWNDPrevious = RLWND | RLWNDPrevious = RLWND | |||
//Compute the RWND to include in the packet | //Compute the RWND to include in the packet | |||
RLWND = min(RLWND, fcwnd) | RLWND = min(RLWND, fcwnd) | |||
end procedure | end procedure | |||
</sourcecode> | ]]></sourcecode> | |||
</figure> | </figure> | |||
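The comment in the receive procedure states that ackedBytes is the number of bytes that can be used to reduce the Receive Window without shrinking it. The Python sketch below shows one way to apply that constraint when a packet is sent. It is an interpretation for illustration, not a line-by-line transcription of the figure above; the function name advertised_window and its parameters are hypothetical.

```python
def advertised_window(rlwnd, prev_advertised, acked_bytes, fcwnd):
    """Return the window value to place in an outgoing segment.

    The right edge of the window must not move left, so the advertised
    value may fall below the previous one by at most the number of bytes
    received since the previous advertisement (acked_bytes).
    """
    # Lowest value that can be advertised without shrinking the window.
    floor = prev_advertised - acked_bytes
    wnd = max(rlwnd, floor)
    # Flow control still applies. fcwnd is assumed to come from a standard
    # TCP receiver that already respects the same no-shrink rule.
    return min(wnd, fcwnd)
```

For example, with a previous advertisement of 10000 bytes and 1460 bytes received since then, a computed RLWND of 5000 is advertised as 8540; the remaining reduction is applied over later packets as more data arrives. After each call, the caller would store the returned value as the previous advertisement and reset ackedBytes to 0, as the figure does.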
</section> | ||||
<!-- [rfced] Figure 3: Should "RWND" be "RLWND" here? We ask | ||||
because we do not see "RWND" used elsewhere in this document. | ||||
Original: | ||||
//Compute the RWND to include in the packet | ||||
RLWND = min(RLWND, fcwnd) --> | ||||
</section> | ||||
<section numbered="false" toc="default"> | ||||
<name>Acknowledgments</name> | ||||
<t>This work was supported by the EU through the StandICT projects RXQ, CC | ||||
I, and CEL6; the NGI Pointer RIM project; and the H2020 5G-RANGE project; and by | ||||
the Spanish Ministry of Economy and Competitiveness through the 5G-City project | ||||
(TEC2016-76795-C6-3-R).</t> | ||||
<t>We would like to thank ICCRG chairs <contact fullname="Reese Enghardt"/ | ||||
> and <contact fullname="Vidhi Goel"/> for their support on this work. We would | ||||
also like to thank <contact fullname="Daniel Havey"/> for his help. We would lik | ||||
e to thank <contact fullname="Colin Perkins"/>, <contact fullname="Mirja Kühlewi | ||||
nd"/>, and <contact fullname="Vidhi Goel"/> for their reviews and comments on ea | ||||
rlier draft versions of this document.</t> | ||||
</section> | ||||
</back> | </back> | |||
<!-- [rfced] FYI - We have added expansions for the following abbreviations | ||||
per Section 3.6 of RFC 7322 ("RFC Style Guide"). Please review each | ||||
expansion in the document carefully to ensure correctness. | ||||
Controlled Delay (CoDel) | ||||
Proportional Integral controller Enhanced (PIE) | ||||
Low Latency, Low Loss, and Scalable Throughput (L4S) | ||||
Maximum Segment Size (MSS) | ||||
Bottleneck Bandwidth and Round-trip propagation time (BBR) | ||||
--> | ||||
<!-- [rfced] Please review the "Inclusive Language" portion of the | ||||
online Style Guide at | ||||
<https://www.rfc-editor.org/styleguide/part2/#inclusive_language>, | ||||
and let us know if any changes are needed. Updates of this nature | ||||
typically result in more precise language, which is helpful for | ||||
readers. | ||||
Note that our script did not flag any words in particular, but this | ||||
should still be reviewed as a best practice. --> | ||||
<!-- [rfced] Please let us know if any changes are needed for the | ||||
following: | ||||
a) The following terms were used inconsistently in this document. | ||||
We chose to use the latter forms. Please let us know any objections. | ||||
Congestion control (1 instance) / congestion control (46 instances) | ||||
RCV-WND (Figure 1) / RCV WND (Section 5) / | ||||
RCV.WND (per the rest of this document and per published RFCs | ||||
to date) | ||||
TSVal / TSval (per published RFCs, including RFC 7323; we could not | ||||
find "TSVal" in any published RFC) | ||||
b) The following terms appear to be used inconsistently in this | ||||
document. Please let us know which form is preferred. | ||||
a rLEDBAT / an rLEDBAT | ||||
Receive window / Receive Window / receive window | ||||
(We see that "congestion window" is used consistently.) | ||||
sendList / sentList --> | ||||
</rfc> | </rfc> | |||
End of changes. 122 change blocks. | ||||
719 lines changed or deleted | 1366 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |