Internet-Draft | Aggregated Metrics On Egress | July 2024 |
Du | Expires 9 January 2025 | [Page] |
This document describes aggregated metrics on the Egress node and the corresponding routing mechanism for Computing-Aware Traffic Steering (CATS).¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 9 January 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
In [I-D.ldbc-cats-framework], a framework for Computing-Aware Traffic Steering (CATS) is described. The general procedure is as follows. The client sends out a packet with an anycast destination address, also called a Service ID in CATS, while many service points in the network can fulfill the request. The network acts like a virtual load-balancing device, steering the traffic to a suitable service point that has enough computing resources for the job and relatively low network latency to the client. The advantage is that the load-balancing point, normally the Ingress node, is distributed and close to the client. However, the Ingress node will normally only select an Egress node and tunnel the packet to that Egress. The routing mechanism beyond the Egress remains unspecified.¶
In [I-D.lbdd-cats-dp-sr], some mechanisms based on Segment Routing for CATS are proposed. It describes that when multiple service sites are connected to a single Egress, they can be distinguished by using a specific End.DX. The Egress needs to allocate an End.DX for each service site connected to it, and advertise these SRv6 functions to the Ingress. The Ingress node then selects an SR policy whose last segment is the specific End.DX. That is to say, the Ingress can select a service site directly.¶
However, in this document, we suggest that a two-phase mechanism can also work here, in which the metrics are aggregated on the Egress node and a second round of load balancing takes place on the Egress node.¶
In this document, the Ingress and Egress are PE (Provider Edge) routers in the network. The client accesses the service site, which provides computing services, across the network. The Ingress is the first router connected to the clients in the provider network, and the Egress is the first router connected to the service site in the provider network.¶
To enable the decision point in the network to make a decision considering both the computing metric and the network metric, the computing information needs to be advertised into the network. However, this is challenging and needs a careful design. To avoid announcing too much computing information into the network, we can consider aggregated metrics, and three levels of computing metrics can be considered here:¶
The metrics that a service point reports at the granularity of the service point.¶
The merged metrics that a service site reports at the granularity of the service site.¶
The merged metrics that an Egress node reports at the granularity of the Egress Node.¶
In the first case, the metrics are the original computing metrics with no merging. For example, in this case a service site may contain only one service point.¶
In the second case, a cluster of servers serving the same application behind a Layer 7 load balancer can be considered as one merged application server. In this case, the compute metrics from each server within the site are merged at the site level. A service site can contain several service points; it can merge or aggregate the metrics from the different service points and report the merged metrics into the network. The service site can do load balancing within the site, but the routing within the service site is out of scope of this document.¶
In the third case, multiple service sites are connected to a single Egress, and the Egress can do another round of load balancing among the local sites that connect directly to it. The first round of load balancing is done on the Ingress; however, it only selects an Egress without selecting a service site directly. In this case, the Egress can merge the compute metrics from the local sites once more, and the routing method is in the scope of this document.¶
Obviously, the third case is more complicated, but the advantage is that it reduces the amount of metric information that needs to be announced in the network. In this case, the information that the Egress sends to the Ingress would be one or more aggregated compute metric values. In addition, an identifier should be carried, indicating that the metrics are merged ones and that the merging granularity is the Egress node. After receiving the routes for the Service ID with this identifier, the Ingress would do the route selection and forwarding at the granularity of the Egress node accordingly.¶
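As an illustration only, the merging and advertisement in the third case might be sketched as follows. The metric representation, the merging rule (summing spare capacity across local sites), and the MERGED_AT_EGRESS flag are all hypothetical; this document does not define the encoding of the advertisement.

```python
from dataclasses import dataclass

# Hypothetical per-site compute metric as reported to the Egress.
@dataclass
class SiteMetric:
    site_id: str
    available_capacity: int   # abstract units of spare compute

# Hypothetical identifier marking the merging granularity as the Egress node.
MERGED_AT_EGRESS = 1

def merge_at_egress(service_id: str, egress_ip: str,
                    sites: list[SiteMetric]) -> dict:
    """Aggregate the local sites' metrics into one advertisement.

    Summing spare capacity is just one possible merging rule; an
    operator could equally advertise the minimum, a mean, etc.
    """
    return {
        "service_id": service_id,
        "egress": egress_ip,
        "metric": sum(s.available_capacity for s in sites),
        # Tells the Ingress these are merged, Egress-granularity metrics.
        "granularity": MERGED_AT_EGRESS,
    }

adv = merge_at_egress("2001:db8::svc1", "2001:db8::e1",
                      [SiteMetric("site-A", 40), SiteMetric("site-B", 25)])
```

With this advertisement the Ingress sees a single metric value per Egress for the Service ID, rather than one value per service site.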
In the previous section, we mentioned that the Egress can report merged metrics into the network, so the Ingress would only select the Egress accordingly, without detailed service site information. In this case, the routing mechanism needs to be reconsidered.¶
Normally, there are one or more tunnels between the Ingress node and the Egress node. In the SRv6 case, the Ingress encapsulates the original packet with an outer IPv6 header containing an SRH, and the Egress decapsulates the packet, obtaining the original packet and a related SID announced by itself.¶
On the Ingress, after it receives the original packet from the client with the anycast destination, i.e., the Service ID, the packet is encapsulated and sent to the target Egress through a tunnel/policy. That is, an underlay IP header is added in front of the original IP packet. In this round of load balancing, the Ingress selects a proper Egress among all candidates.¶
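The first round of load balancing could be sketched as below, assuming each candidate Egress has advertised one merged compute metric (higher meaning more spare capacity) and the Ingress knows a network cost toward it (e.g., an IGP metric or measured latency). The combination rule, dividing capacity by network cost, is purely illustrative; this document does not define a selection algorithm.

```python
# Hypothetical first-round selection on the Ingress: pick the Egress
# with the best ratio of merged compute capacity to network cost.
def select_egress(candidates: list[dict]) -> str:
    """candidates: [{"egress": ..., "metric": ..., "net_cost": ...}, ...]"""
    best = max(candidates, key=lambda c: c["metric"] / c["net_cost"])
    return best["egress"]

egress = select_egress([
    {"egress": "2001:db8::e1", "metric": 65, "net_cost": 10},
    {"egress": "2001:db8::e2", "metric": 30, "net_cost": 5},
])
```

Here 65/10 beats 30/5, so the first Egress is selected even though the second is closer in the network.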
After the encapsulation, the overlay IP header's &lt;SA, DA&gt; pair will be &lt;clientIP, anycastIP&gt;, and the underlay IP header's &lt;SA, DA&gt; pair will be &lt;IngressIP, EgressIP&gt;. In the SRv6 case, the underlay IP header will contain an SRH, and the last SID of the SRH's segment list should be a SID announced by the Egress.¶
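The resulting header layout can be illustrated with plain dictionaries standing in for real packet structures; all addresses and the SID value are documentation examples, not defined values.

```python
# Sketch of the encapsulation on the Ingress in the SRv6 case: the
# client's original (overlay) header is left untouched, and an
# underlay IPv6 header carrying an SRH is added in front of it.
def encapsulate(inner_packet: dict, ingress_ip: str, egress_ip: str,
                segment_list: list[str]) -> dict:
    """The last SID of the segment list is a SID announced by the
    selected Egress."""
    return {
        "underlay": {"sa": ingress_ip, "da": egress_ip,
                     "srh": {"segments": segment_list}},
        "overlay": inner_packet,  # <clientIP, anycastIP> pair unchanged
    }

pkt = encapsulate({"sa": "2001:db8::c1", "da": "2001:db8::svc1"},
                  "2001:db8::i1", "2001:db8::e1",
                  ["2001:db8::mid1", "2001:db8:e1::cats"])
```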
The Egress will receive the encapsulated packet with an overlay IP header and an underlay IP header. The Egress will decapsulate the packet, i.e., remove the underlay IP header to obtain the original IP packet, and forward it to a proper service site. This is the second round of load balancing. However, some trigger is needed for the second round of load balancing; otherwise, the Egress will forward the packet directly by using the DA of the original IP packet.¶
A straightforward way would be to add a specific SRv6 function on the Egress. It can be programmed as the last segment in the SRv6 SID list. Thus, the Egress can obtain the original IP packet and the SID, and act accordingly. When the Egress processes this specific SID, it will decapsulate the packet and obtain the original IP packet before the encapsulation. Meanwhile, according to the SID, the Egress will look up a local computing forwarding table to forward the packet. This local computing forwarding table only contains the routing information from the local service sites that connect directly to the Egress. Hence, according to the SID and the result of the lookup in the local computing forwarding table, the packet will be forwarded to a specific local service site.¶
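The Egress-side behavior could be sketched as follows. The trigger SID value, the table layout, and the site-selection rule (picking the site with the most spare capacity) are all hypothetical; this document only requires that the specific SID steers the lookup into the local computing forwarding table rather than the normal FIB.

```python
# Hypothetical SRv6 function on the Egress: when the last SID matches
# the locally programmed trigger SID, decapsulate and consult a local
# computing forwarding table that holds only directly connected
# service sites, then do the second round of load balancing.
CATS_SID = "2001:db8:e1::cats"              # hypothetical trigger SID
computing_fib = {                           # local sites behind this Egress
    CATS_SID: [("site-A", 40), ("site-B", 25)],   # (site, spare capacity)
}

def egress_process(pkt: dict) -> tuple[dict, str]:
    last_sid = pkt["underlay"]["srh"]["segments"][-1]
    inner = pkt["overlay"]                  # decapsulation: drop underlay header
    if last_sid in computing_fib:
        # Second round of load balancing among local sites only.
        sites = computing_fib[last_sid]
        site = max(sites, key=lambda s: s[1])[0]
        return inner, site
    # No trigger SID: fall back to the normal FIB lookup on the inner DA.
    return inner, "global-fib"

pkt = {"underlay": {"sa": "2001:db8::i1", "da": "2001:db8::e1",
                    "srh": {"segments": ["2001:db8::mid1", CATS_SID]}},
       "overlay": {"sa": "2001:db8::c1", "da": "2001:db8::svc1"}}
inner, target = egress_process(pkt)
```

The fallback branch corresponds to the comparison in the next paragraph: without the trigger SID, the anycast DA is looked up in the global FIB and may resolve to a site that is not locally connected.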
By comparison, without the specific SID as a trigger, an anycast destination will trigger the lookup of a global forwarding table, i.e., the normal FIB on the Egress. In that case, the packet may or may not be forwarded to a locally connected service site.¶
TBD.¶
TBD.¶