Placement of Logical Functionalities in 5G/B5G Networks

5G technology has brought tremendous growth in mobile traffic capacity, together with greater throughput, lower latency, ultra-high reliability, higher connectivity, and an expanded range of mobility. We present here a unified end-to-end (E2E) placement of logical functionalities, with the joint placement of distributed units (DU), centralized units (CU) and user plane functions (UPF). The problem is modeled as a large-scale integer linear program and solved using decomposition techniques. The model includes all key network and IT resources and optimizes the DU, CU and UPF placements and their numbers. Numerical results are presented on an open Montreal traffic dataset. A sensitivity analysis investigates the impact of the number of locations hosting DU, CU and UPF functionalities on the delays of the request flows.


I. INTRODUCTION
As 5G continues to evolve and we begin planning for 6G, difficult questions remain about where to place logical functionalities, considering cost, energy and latency requirements. On the one hand, it is desirable to host the logical functionalities as close as possible to the radio access network (RAN) in order to satisfy the latency requirements, in particular those of the Ultra-Reliable Low Latency Communications (URLLC) 5G traffic class. On the other hand, sharing computing resources is easier when moving away from the RAN. The question then arises as to how much computing resources we should provision close to the RAN, and how much we can move further away while still meeting user needs at any time, taking into account traffic fluctuations.
Moreover, considering the increasing quality of service (QoS) requirements of 5G applications and the increase in overall 5G traffic, we have to deal with the huge power consumption of the data centers involved in the different clouds hosting the growing software part of 5G networks. Energy consumption therefore also needs to be kept in mind when placing the logical functionalities, so as to favor the sharing of their compute resources.
We therefore study here the placement of the logical functionalities DU, CU and UPF from an E2E network point of view. We propose a comprehensive mathematical model that integrates the key constraints and solve it using large-scale optimization techniques. We then use that model to explore the trade-off between delays and the number of locations hosting the logical functionalities.
Several studies have been devoted to the placement of Virtual Network Functions and Service Function Chains, e.g., [1], [2], but far fewer to the placement of 5G logical functionalities. There are studies on the placement of DU and CU ([3], [4], [5]) on the one hand, and on the placement of UPF [6] on the other hand, but none on all the logical functionalities of 5G.

A. Notation and Problem Statement
We consider a 5G Software-Defined Network (SDN) represented by a graph $G = (V, L)$, where $V$ is the set of nodes and $L$ the set of physical links. We assume that only a subset of nodes $V^{\triangle} \subseteq V$, $\triangle \in \{\text{DU}, \text{CU}, \text{UPF}\}$, has computational resources available to host the DU, CU and UPF functionalities, respectively [7]. Following ITU-R, 5G traffic covers three categories: Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low Latency Communications (URLLC), and Massive Machine Type Communications (mMTC). On the user plane, each traffic flow request must follow a logical functionality chain made of an ordered set of logical functionalities, as described in Figure 1, where we only present the dominant part of the traffic. Demand is then defined by a set of flow requests, indexed by $k$. Each E2E logical functionality chain (LFC) $c$ is defined as an ordered sequence of 5G logical functionalities. We denote by $f_i$ the $i$th logical functionality in any given logical functionality chain, where $i \in I^{\text{CoS}}_k = \{1, \dots, |c^{\text{CoS}}_k|\}$. Each logical functionality chain is mapped to a request depending on its service category. Let $C$ be the set of all logical functionality chains.
Let $F$ be the set of 5G logical functionalities, indexed by $f$. Each functionality $f$ requires some compute resources, i.e., CPU, RAM and storage, for its processing. Let $\text{CPU}_f$, $\text{RAM}_f$ and $\text{STO}_f$ be the amounts of CPU, RAM and storage required by $f$ for each unit of bandwidth; note that the complexity of a functionality does not necessarily scale linearly with its bandwidth. Therefore, assuming $f$ is one occurrence of the LFC functions in request $k$, $\text{VM}^{\square}_{fk}$ represents the fraction of resource $\square \in \{\text{RAM}, \text{CPU}, \text{STO}\}$ of a virtual machine (VM) that $f$ needs for each of its occurrences (we assume that all VMs have the same configuration). Each logical functionality has a limited number of replicas, denoted by $\text{REPLICA}_f$. Note that each VM can serve several demands or functionalities, with each demand/functionality only using a fraction of the compute resources of the VM.
The Placement of 5G/B5G Logical Functionalities problem is critical for efficiently provisioning 5G requests and making sure they can go through their required logical functionality chain. We formally state it as follows. For a given set of request flows, where each flow $k$ is characterized by a 5-tuple $(s_k, d_k, c^{\text{CoS}}_k, b_k, \delta^{\text{CoS}}_k)$, identify the best function locations in order to provision the set of LFCs and requests, while minimizing the overall network load subject to the transport and compute capacities (i.e., $\text{CAP}_\ell$ for the links and $\text{CAP}^{\square}$, $\square \in \{\text{CPU}, \text{RAM}, \text{STO}\}$, for the nodes), and optionally to a limit on the number of logical functionality replicas.
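To illustrate the compute-capacity side of the problem statement, the following minimal Python sketch sums the VM fractions of the functionality occurrences hosted on one node and compares them with the node capacities. The function names, the dictionary layout and all numerical values are illustrative assumptions, not data from this paper.

```python
RESOURCES = ("CPU", "RAM", "STO")


def node_load(occurrences):
    """Total VM fraction used on one node, per resource type.

    `occurrences` lists the functionality occurrences hosted on the node;
    each entry maps a resource type to the VM fraction it needs.
    """
    return {r: sum(occ[r] for occ in occurrences) for r in RESOURCES}


def fits(occurrences, capacity):
    """True if the node capacity (in VM units) covers the load of every resource."""
    load = node_load(occurrences)
    return all(load[r] <= capacity[r] for r in RESOURCES)


# Hypothetical values: a DU occurrence and a CU occurrence sharing one node.
hosted = [
    {"CPU": 0.5, "RAM": 0.25, "STO": 0.125},   # DU occurrence of some request
    {"CPU": 0.75, "RAM": 0.25, "STO": 0.125},  # CU occurrence of some request
]
print(node_load(hosted))
print(fits(hosted, {"CPU": 2, "RAM": 1, "STO": 1}))
print(fits(hosted, {"CPU": 1, "RAM": 1, "STO": 1}))
```

With these made-up numbers the two occurrences need more than one VM worth of CPU, so the node fits them only when it offers at least two VMs.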
When provisioning a request $k$, its required logical functionalities are encountered in the same order as in $c^{\text{CoS}}_k$, with some functions possibly located at the same node.

B. Layered Graph
In the context of Virtual/Container Network Function (VNF/CNF) placement, many authors have used the concept of layered graphs; see [9] for one of the early references and, e.g., [1], [2] for later ones. Here, we propose a revisited layered graph adapted to the logical functionalities of 5G and, in particular, to certain location restrictions of the logical nodes, i.e., DU and CU in the access network, while UPF is either in the transport or in the core network, depending on latency requirements.
A layered graph $G^L$ is defined for each request flow $k$ and its associated E2E logical functionality chain as follows. For a complete E2E chain RU → DU → CU → UPF → CU → DU → RU, the layered graph has 5 layers, and only 3 layers if the request is an uplink or downlink one; see Figure 3 for an illustration of the layered graph for an E2E request.
The initial network graph $G$ is transformed into a layered graph $G^L$ (not counting the $\text{RU}_{\text{SRC}}$ and $\text{RU}_{\text{DST}}$ nodes), where each layer is associated either with the transport network graph (or a subgraph of it) or with the combination of the transport and core networks.
For every node $v \in V$, let $v_i$ be the corresponding node in the $i$th layer. A cross-layer link from $v_{i-1}$ to $v_i$ exists only if functionality $f_{i-1}$ can be installed and run on node $v$; see Figure 4 for an illustration. Therein, request $k$ first goes through its $\text{RU}_{\text{SRC}}$, determined by the base station to which it connects, and then enters the access network in order to go through a DU and then a CU. Next, it needs to reach a UPF, either in the transport or in the core network, and then again a CU and a DU, before reaching an RU and then the destination. Note that we do not need to consider the overall transport network: e.g., in Figure 2, we only need to consider $\text{TN}^{\text{SRC}}_k$ for the DU-UL/CU-UL layers, and similarly only $\text{TN}^{\text{DST}}_k$ for the CU-DL/DU-DL layers. Finding a path and a chain placement for a request $(s_k, d_k, c^{\text{CoS}}_k, b_k, \delta^{\text{CoS}}_k)$ consists in finding a path on the layered graph $G^L$ from node $\text{RU}_{\text{SRC}}$ to node $\text{RU}_{\text{DST}}$. Note that each layer represents the progression of the chain, e.g., being on the second layer means that the placement of the first function of the chain is already decided. The placement of the logical functionality associated with layer $i$ is given by the cross-layer link used to switch between layers $i$ and $i + 1$.
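The search for a service path on the layered graph can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: an uplink chain without repeated functionalities, fixed link costs, no capacity or delay constraints, and one layer per number of already placed functionalities (hence one more layer than the count used above). The toy topology, the node names and the helper functions are hypothetical.

```python
import heapq


def build_layered_graph(links, hosts, chain):
    """Adjacency lists of the layered graph; nodes are (physical node, layer)."""
    adj = {}
    for layer in range(len(chain) + 1):
        # Intra-layer links: a copy of the physical links in every layer.
        for u, v, cost in links:
            adj.setdefault((u, layer), []).append(((v, layer), cost))
    for layer, function in enumerate(chain):
        # Cross-layer links: placing `function` on v moves to the next layer.
        for v in hosts[function]:
            adj.setdefault((v, layer), []).append(((v, layer + 1), 0))
    return adj


def shortest_service_path(adj, src, dst, chain):
    """Dijkstra from (src, 0) to (dst, len(chain)); returns (cost, placement)."""
    start, goal = (src, 0), (dst, len(chain))
    dist, prev, heap = {start: 0}, {}, [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, cost in adj.get(node, []):
            if d + cost < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = d + cost, node
                heapq.heappush(heap, (d + cost, nxt))
    if goal not in dist:
        return None
    placement, node = {}, goal
    while node != start:
        parent = prev[node]
        if parent[1] != node[1]:  # cross-layer link: a functionality is placed
            placement[chain[parent[1]]] = node[0]
        node = parent
    return dist[goal], placement


# Hypothetical toy network: directed links given as (from, to, cost).
links = [("ru_src", "a", 1), ("a", "b", 1), ("b", "c", 1),
         ("a", "c", 3), ("c", "dst", 1)]
hosts = {"DU": {"a"}, "CU": {"b"}, "UPF": {"c"}}
chain = ["DU", "CU", "UPF"]
adj = build_layered_graph(links, hosts, chain)
cost, placement = shortest_service_path(adj, "ru_src", "dst", chain)
print(cost, placement)
```

In this toy instance the only feasible service path visits a, b and c in that order, so the DU, CU and UPF are placed on these three nodes.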

C. Mathematical Model
We propose a model, called the LFP CG model, which relies on the concept of configurations. Therein, a configuration is defined by a potential path provisioning, called a service path in the sequel, which is described formally below for a given request.
A Service Path $p$ for request flow $k$ is a path, i.e., an ordered set of nodes from the source node to the destination node of the request, going through the nodes hosting the logical functionalities of its chain. The objective is to minimize the network workload. For a given request provisioning and its associated routing (path), the network workload is calculated by multiplying the hop count of the path by the bandwidth requirement of the request.
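As a small worked example of this workload measure, the Python sketch below sums hop count times bandwidth over a set of provisioned requests. The paths and bandwidth values are hypothetical, and a path is represented simply as its sequence of physical nodes, so that co-located functionalities add no hop.

```python
def hop_count(path):
    """Number of links traversed by a service path given as a node sequence."""
    return len(path) - 1


def network_workload(provisioning):
    """Sum over requests of hop count times bandwidth requirement."""
    return sum(hop_count(path) * bandwidth for path, bandwidth in provisioning)


# Hypothetical provisioning: (service path, bandwidth requirement) per request.
provisioning = [
    (["ru1", "a", "b", "c"], 10),  # 3 hops, bandwidth 10
    (["ru2", "a", "c"], 4),        # 2 hops, bandwidth 4
]
print(network_workload(provisioning))  # 3*10 + 2*4 = 38
```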
The set of constraints can then be expressed as follows.

Constraints
• One path per demand.
• Link capacity.
• Compute node resource capacity.
• Limited number of logical functionality occurrences.
• Existence of a service path, i.e., all its functions are hosted on a node along it.
• Logical functionality placement.
• Limited number of UPF nodes.
• For a given RU, outgoing and incoming flows must go through the same DU and CU.

D. Solution of the Mathematical Model
The mathematical model of the previous section has an exponential number of variables and therefore requires a decomposition scheme based on column generation techniques [10] in order to scale. This scheme uses the mathematical model of the previous section as the so-called Master Problem (MP), together with the Restricted Master Problem (RMP), i.e., the MP with a very small subset of configurations/columns, and the so-called Pricing Problem (PP), i.e., a configuration generator. Consequently, the Restricted Master Problem corresponds to (1)-(11) with a very limited number of variables. Its role is to select the best provisioning, one for each request, while the pricing problem generates improving configurations, i.e., configurations that, if added to the current RMP, improve the value of its linear relaxation. Once the linear relaxation of the MP is solved, an integer solution is sought. Readers who are not familiar with column generation and with how to derive an integer solution are referred to, e.g., [11] and [12]. Figure 5 provides a flowchart of the solution process combining column generation and integer linear programming techniques.
We now describe the pricing problem, whose role is to generate a valid Service Path for a given request. Once again, the formulation relies on the layered graph $G^L$ introduced in Section II-B. Its objective is defined by the so-called reduced cost (see [11] if not familiar with linear programming concepts).
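As a reminder of how an improving configuration is recognized, the sketch below computes the generic linear programming reduced cost of a candidate column, i.e., its cost minus the dual-weighted sum of its constraint coefficients; in a minimization problem, the column is improving when this value is negative. The constraint labels, coefficients and dual values are made up for illustration and do not reproduce the exact reduced cost of the LFP CG pricing problem.

```python
def reduced_cost(column_cost, coefficients, duals):
    """Cost of a column minus the dual-weighted sum of its RMP coefficients."""
    return column_cost - sum(duals[j] * a for j, a in coefficients.items())


# Hypothetical candidate service path: 3 hops with bandwidth 10, so workload 30.
column_cost = 3 * 10
# Coefficients of the column in two illustrative RMP constraints.
coefficients = {"one_path[k]": 1, "link_cap[a-b]": 10}
# Made-up dual values; the dual of a <= capacity constraint is non-positive
# in a minimization problem.
duals = {"one_path[k]": 42.0, "link_cap[a-b]": -0.5}

rc = reduced_cost(column_cost, coefficients, duals)
print(rc, rc < 0)
```

Here the reduced cost is 30 - (42 - 5) = -7, so this hypothetical column would be added to the RMP.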
• $u^{(j)}$ represents the vector of dual variables of constraints $(j)$ in the RMP.
Variables:
• $\alpha^i_v \in \{0, 1\}$, where $\alpha^i_v = 1$ if the $i$th function of the chain is installed on node $v$, 0 otherwise. This allows a function to intervene several times (e.g., firewall application) in a service chain and to be run in different locations, i.e., potentially a different one for each occurrence.
• $\varphi^i_\ell \in \{0, 1\}$, where $\varphi^i_\ell = 1$ if the flow is forwarded on link $\ell$ on layer $i$, i.e., on the links of each layer in graph $G^L$, 0 otherwise.
• $\beta_{vf} \in \{0, 1\}$, where $\beta_{vf} = 1$ if function $f$ is installed in node $v$, 0 otherwise. Note that chains can contain multiple occurrences of the same function.
The service path generator (the so-called pricing problem in mathematical programming and decomposition models) is written for each request $k$. We describe it below for the case of an uplink request; it can be written in a similar way for the two other cases, i.e., for a downlink or a bidirectional request. Its objective function, the so-called reduced cost, includes the following terms:
• $\text{COST}_{\text{UL}} = \sum_{\triangle \in \{\text{DU}, \text{CU}\}} \sum_{v \in V^{\triangle}} \alpha_{\triangle, \text{UL}, v} \, u^{(10)}_{\text{RU}^{\text{UL}}_k, \triangle, v}$ for uplink and E2E requests and 0 for downlink requests.
• $\text{COST}_{\text{DL}} = \sum_{\triangle \in \{\text{DU}, \text{CU}\}} \sum_{v \in V^{\triangle}} \alpha_{\triangle, \text{DL}, v} \, u^{(10)}_{\text{RU}^{\text{DL}}_k, \triangle, v}$ for downlink and E2E requests and 0 for uplink requests.
Constraints are written as follows. Flow conservation: they correspond to flow constraints (i.e., routing) from one logical functionality to the next one in the 5G E2E logical functionality chain for the $\text{SRC}_k \leadsto \text{DST}_k$ request for which the pricing problem is solved (constraints (13)), then flow constraints from the source node to the location of the DU function of the service chain (constraints (14)), and similarly from the location of the last function of the service chain, i.e., UPF for an uplink request, to the destination node (constraints (15)). Note that $\alpha^i_v = 0$ for all nodes that do not host any logical node/functionality, and that we take care of the possibility that several logical functionalities can be located on the same node, including the source or destination nodes. Let $k \in K^{\text{UL}}$. Denote by $\text{TN}_k$ the transport network associated with $k$ as in Figure 2.
• Selecting the CU placement.
• Selecting the DU placement.
• Selecting the UPF placement.
• Link capacity.
• Compute node capacity, for $v \in V^{\text{LN}}$.
• Delay constraint for request $k$: it includes a link delay (propagation and average queuing delays; the latter corresponds to the contribution of $k$ to the overall queuing delay on link $\ell$, under the assumption that the queuing delay depends linearly on the bandwidth) and a node delay (function processing delay).
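The delay constraint can be illustrated with the following Python sketch, which adds up, along one service path, the propagation delay of each link, a queuing term assumed linear in the request bandwidth, and the processing delay of each functionality. All names, units (microseconds) and values are hypothetical and only meant to show the structure of the constraint.

```python
def path_delay(links, processing_delays, bandwidth):
    """End-to-end delay of one request along a service path.

    links: one (propagation delay, queuing delay per unit of bandwidth)
           pair per traversed link.
    processing_delays: one processing delay per functionality of the chain.
    """
    link_delay = sum(prop + queue_per_unit * bandwidth
                     for prop, queue_per_unit in links)
    return link_delay + sum(processing_delays)


def meets_delay(links, processing_delays, bandwidth, budget):
    """True if the path delay does not exceed the delay budget of the request."""
    return path_delay(links, processing_delays, bandwidth) <= budget


# Hypothetical values, all in microseconds.
links = [(200, 10), (300, 20)]   # two links on the path
processing = [500, 400, 300]     # DU, CU and UPF processing delays
print(path_delay(links, processing, bandwidth=10))          # 800 + 1200 = 2000
print(meets_delay(links, processing, bandwidth=10, budget=5000))
```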

III. NUMERICAL RESULTS
We now report on the numerical results. First, we describe the data set we used and present the placement produced by the proposed LFP CG model as well as the logical connectivity of the functions. Next, we examine the trade-off between the number of logical functionality locations and the path delays of the requests. Finally, we study the impact of reducing the number of logical functionality locations on the overall capacity utilization. The proposed LFP CG model was implemented in Python with the Gurobi library and run on an AMD Ryzen Threadripper 2990WX 32-Core processor @ 2.95 GHz with 128 GB of RAM.
To assess the performance of the proposed LFP CG model, we considered four scenarios with different numbers of CU, DU, and UPF in the network. The details of each scenario are given in Table III. Scenarios 1 and 2 share the same numbers of DU and CU, namely 18 and 9, respectively, while scenarios 3 and 4 use 12 and 6, respectively. In scenarios 1 and 3, $\text{UPF}_{\text{AR}}$, $\text{UPF}_{\text{MIoT}}$, and $\text{UPF}_{\text{I4.0}}$ can be placed in the core network, and two replicas are allowed per UPF type, while in scenarios 2 and 4, they cannot be placed in the core, and only one replica is allowed. As UPFs of different services can be collocated, the parameter UB indicates the overall number of admitted UPF locations.

A. Dataset
The data set used in this work is an extended version of the data set proposed by Ziazet et al. [13]. It contains E2E traffic generated by refactoring the urban data of the city of Montreal. The 5G cell locations are inspired by the real cell locations of a mobile operator in the city of Montreal. 5G transport and core nodes are mapped by aggregating cells in the same neighbourhood, and network connectivity is done as in Figure 6. The algorithm will select nodes where we can place processing points for logical functionalities. We considered six types of services, listed in Table I, with different bandwidth and delay requirements. Table II provides the compute resource modeling that we used, with the required amount of CPU (in terms of percentage of CPU per user), RAM, and storage for each logical node. These values do not come from real use cases (due to lack of access to real data) and only provide an order of magnitude. More details about the data set can be found in [13].

B. Placement analysis
We provide placement results in Figure 7, along with the logical functionality connectivity for Scenario 1, for UPF placement in the core network. As expected, UPFs of services that we associated with URLLC, i.e., MIoT, AR, and I4.0 in our use case, are placed in the MEC by the proposed approach. This is in line with what is done in real life, as these services are latency-sensitive and their UPFs have to be located near the end user. Some processing points co-locate DU, CU, and UPF and are located in densely populated areas (downtown Montreal). This is likely to accommodate the traffic peaks that can be encountered in such areas.
Fig. 6: 5G network where the placement will be done: radio, transport and core infrastructures and their hierarchy.
From Figure 6d, we can see that, while an RU is generally connected to its closest DU, the model does not necessarily make that choice. One reason might be the capacity constraints of that processing point. Another might be that the traffic pattern in the area is such that the selected connectivity is the one best aligned with the goal of minimizing the network workload while maintaining an acceptable delay. This is counter-intuitive, as one would expect each RU to be connected to its closest DU. However, a benefit of our approach is that it selects the connectivity that serves the objective, which is sometimes not trivial to find with heuristics or other rule-based approaches. The same interpretation can be made for the connectivity of DU and CU. The proposed LFP CG model concentrates the backhaul network in the downtown part of the city, where there is more traffic, and positions the UPFs in that area. Backhaul logical links show that a CU might be logically connected to multiple UPFs. This is because, depending on the type of requested service and the network status, physical paths connecting a CU to any UPF might be created in the core network.

C. Delay vs. number of functions
To examine the trade-off between the number of logical functionality locations and the path delay, we considered the four scenarios previously described and compared their spare delays. Figure 8 presents the resulting spare delay of each scenario for the augmented reality service. We can see that, in all cases, the LFP CG model finds solutions whose overall spare delay is more than 50% of the requirement. This is beneficial, as it guarantees that the delay QoS is met, and this spare delay, once converted into energy, could represent a significant reduction in energy consumption.
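For concreteness, the spare delay can be read as the part of the delay requirement left unused by the service path. The short Python sketch below computes it as a fraction of the requirement; this definition is our reading of the text, and the numbers are made up rather than taken from Figure 8.

```python
def spare_delay_ratio(path_delay, requirement):
    """Fraction of the delay requirement left unused by the service path."""
    return (requirement - path_delay) / requirement


# Hypothetical request: a 5000 microsecond budget and a 2000 microsecond path delay.
ratio = spare_delay_ratio(path_delay=2000, requirement=5000)
print(ratio)         # 0.6, i.e., 60% of the requirement is spare
print(ratio > 0.5)
```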
As we reduce the number of replicas of logical functionalities (i.e., DU from 18 to 12, CU from 9 to 6, and UPF from 2 to 1), we observe an increase in the path delay. This is because, with fewer available logical functionalities, requests need longer paths to access the available ones and reach the destination. According to the results, this increase remains moderate, which indicates the ability of the model to find an optimized placement under the given conditions.
Similar results are obtained for the five other services, i.e., cloud gaming, video streaming, VoIP, MIoT, and Industry 4.0.

D. Capacity utilization
Figure 9 shows the link capacity utilized in all four scenarios. Overall, the obtained results are in line with those of the delay comparison in the previous section. Indeed, for the same reasons as for the delay, reducing the number of replicas of logical functionalities results in an increase in link capacity utilization. Comparing scenarios 1 and 3, which differ in the number of DU and CU replicas, and scenarios 1 and 2, which differ in the number of UPFs, we see that, in our use case, reducing the number of UPFs has a smaller negative impact on the link capacity utilization than reducing the number of DUs and CUs. While being influenced by the characteristics of the generated traffic, the placement of DUs and CUs is also very dependent on the dimensioning of the transport network and the availability of computing resources.

IV. CONCLUSIONS
This paper formulates the DU, CU, and UPF placement problem as a large-scale integer linear program and solves it using decomposition techniques. The proposed solution finds not only the placement of the logical functionalities but also the paths serving the requests, which are then used to generate the logical connectivity of the 5G functions. Experiments with Montreal traffic data showed that the placement is done so as to reduce link capacity utilization, and that the requests are provisioned with relatively small delays. The next step would be to quantify the gain, i.e., spare capacity and spare delay, in terms of energy for a more energy-efficient network.
Figure panel titles: Example: Streaming (downlink from video server); Example: Massive IoT (data collection); Example: VoIP.

ACKNOWLEDGMENT
The first author was supported by an NSERC/INNOVEE internship in collaboration with Ericsson, GAIA Montreal. Special thanks to Jennie Diem Vo for clarifying some concepts of the current 5G technology.

TABLE II: CPU/RAM/Storage core usage for logical nodes

TABLE III: Input configurations per scenario: blue, green and yellow colors indicate that UPF can be located only in MEC, only in core, and in both MEC and core, respectively