A Low-overhead Network Monitoring for SDN-Based Edge Computing

Using Software-Defined Networking (SDN) in edge computing environments allows for more flexible flow monitoring than traditional networking methods. In SDN, the controller collects statistics from all switches and can communicate with switches to dynamically manage the entire network. However, using per-flow or per-switch mechanisms to obtain flow statistics from all of the switches may significantly increase bandwidth costs between the switches and the control plane. In this paper, we propose a Bandwidth Cost First (BCF) algorithm to reduce the number of monitored switches and thereby lower the monitoring cost. The experimental results show that our algorithm outperforms the existing technique by reducing the number of monitored switches by 56%, leading to a reduction of 41% in bandwidth overhead and 25% in switch processing delay.


I. INTRODUCTION
The advent of distributed computing concepts, such as fog or edge computing, has opened up new opportunities to process data from sources outside traditional data centers and to support low-latency, high-bandwidth applications. Software-Defined Networking (SDN) splits the network into a control plane and a data plane and relies on a central controller to control the whole network, which makes it well suited to multi-access edge computing environments [1]. Monitoring the network in such environments is essential to detect and suspend malicious traffic efficiently, especially since the volume of global Distributed Denial-of-Service (DDoS) attacks is growing rapidly. SDN provides flexible network resource management and programmable traffic control. Detecting and defending against network attacks in SDN can be easier with the global view of the network provided by the SDN controller.
However, monitoring the network is a resource-intensive activity. It is critical to monitor the network accurately and promptly for effective network management and to detect malicious traffic. Moreover, this is particularly challenging in edge computing environments, where resources may be limited and network traffic is more diverse and dynamic. Edge computing deployments are geo-distributed and may cover a vast region or even an entire country. High network traffic across such a large area wastes network bandwidth. As a result, network monitoring faces a tradeoff between accuracy and resource consumption.
In SDN, the controller polls switches to collect statistics on the active flows. There are two main approaches to flow statistics collection: per-flow collection [2], [3] and per-switch collection [4], [5]. However, both schemes may cause overhead on the control channel bandwidth and lengthy switch processing delays. In per-flow collection, the controller sends a request to a switch to collect traffic statistics for each individual flow. If there are many flows on the network, the collection requests and replies result in high overhead on bandwidth and switch CPU usage. In per-switch collection, on the other hand, the controller sends a request to a switch to gather the traffic statistics of all flow entries in the switch's flow table. However, this method may collect flow information that is also reported by other switches: when a flow passes through multiple switches, the same flow statistics are collected redundantly from each flow table. This wastes resources on collecting duplicate flow information. Excessive resource consumption overloads the control channel and may eventually saturate it, so that switches cannot connect to the controller for management.
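To make the difference between the two collection schemes concrete, the following minimal Python sketch counts control messages and returned flow records for a toy set of flow paths. The topology, the flow names, and both helper functions are hypothetical and only illustrate the redundancy argument; they are not part of any OpenFlow API.

```python
def per_flow_requests(flow_paths):
    """Per-flow collection: one request/reply pair for every flow."""
    return len(flow_paths)


def per_switch_records(flow_paths):
    """Per-switch collection: every switch on any path is polled and
    returns one record per flow entry in its table."""
    switches = {sw for path in flow_paths.values() for sw in path}
    records = sum(len(path) for path in flow_paths.values())
    return len(switches), records


# Hypothetical flows and the switches each one traverses.
flow_paths = {
    "f1": ["s1", "s2", "s3"],
    "f2": ["s2", "s3"],
    "f3": ["s3", "s4"],
}

n_requests = per_flow_requests(flow_paths)
n_polled, n_records = per_switch_records(flow_paths)
# Records beyond one per flow carry no new information.
redundant = n_records - len(flow_paths)
```

In this toy case, per-switch polling returns seven records for three flows, so four of them are duplicates, while per-flow polling needs a separate request for each of the three flows.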
To address this problem and minimize monitoring overhead, we propose a solution that uses OpenFlow multipart messages to monitor flows on switches and introduce the Bandwidth Cost First (BCF) algorithm, which focuses on reducing the number of monitored switches to lower the overall monitoring cost. To further reduce the number of monitored switches, BCF considers rerouting certain network flows, directing them from lower-usage switches to higher-usage switches. This approach enables a controller to monitor only the higher-usage switches, each of which can report statistics for multiple flows in a single control message.
Our evaluation shows that BCF can significantly reduce the number of monitored switches by selecting crucial switches and rerouting certain flows to gather statistical information. Compared to the state of the art, our algorithm reduces the number of monitored switches by 56%, which leads to a decrease of 41% in bandwidth overhead and 25% in switch processing delay.
The rest of this paper is organized as follows. Section II reviews related work. Section III presents the monitoring cost function and explains the details of BCF. Section IV evaluates the overhead of network monitoring and presents the simulation results. We conclude this paper in Section V.

II. RELATED WORK
Multi-access Edge Computing (MEC) or fog computing expands on cloud computing by incorporating computing resources closer to end-users. This concept has received much attention from academia. Many prior works have presented different facets of this field, including placement of jobs and services [6], seamless application migration [7], and monitoring of computing resources [8]. However, these works are based on traditional networks, which do not have enough flexibility to control the network. To overcome this issue, recent research works apply Software-Defined Networking (SDN) as the network solution for MEC [9]-[14].
The best-known monitoring solutions in traditional networks are NetFlow [15] and sFlow [16]. NetFlow is the most prevalent monitoring tool; it attaches to switches to collect complete or sampled traffic statistics. sFlow provides a network traffic sampling mechanism in which an sFlow agent collects traffic information. However, these solutions rely on specific hardware and software for monitoring and impose a high load on bandwidth and CPU resources [2], [17], [18]. To mitigate the monitoring overhead caused by specific hardware and software, employing SDN to monitor the network is a promising solution.
Within a software-defined networking environment, two types of traffic monitoring solutions exist: passive monitoring [19]-[21] and active monitoring [2], [3], [22]-[28]. Passive monitoring collects information about the network without injecting any additional traffic. This approach can be less intrusive and consume fewer resources than active monitoring. Active monitoring involves injecting additional traffic in the form of probe packets, which are sent into the network to collect information about its status. This approach can be more accurate in measuring performance metrics but also introduces additional traffic and consumes network resources.
Among the passive monitoring solutions, OpenNetMon [2] polls the flow information on the ingress and egress switches. MicroTE [20] is a passive monitoring mechanism that adapts to traffic variations by leveraging the short-term and partial predictability of network traffic. MicroTE considers that 80% of flows in the data center do not last longer than 10 seconds and that fewer than 0.1% of flows last longer than 200 seconds, if any elephant flow occurs. OpenSample [21] is a low-latency, sampling-based measurement platform that considers 99% of flows in the data center to be TCP. It captures TCP sequence numbers from header fields and uses the same packet samples to estimate port utilization and reconstruct flow statistics. OpenSketch [29] separates the measurement data plane from the control plane and provides a three-stage pipeline (hashing, filtering, and counting) in the data plane. With the three-stage pipeline, hashing can be used to reduce the bandwidth, and filtering can be applied to coarse-grained and fine-grained flows.
Among the active monitoring solutions, OpenTM [3] estimates the network traffic matrix by directly measuring the traffic. It distributes the measurement tasks to multiple switches in the network to reduce the overhead. However, this approach may lead to high bandwidth and delay costs, especially in large networks with high traffic volume. A similar method is PayLess [22], which offers an adaptive scheduling algorithm for polling switches from a controller. PayLess adjusts the polling rate of the switches by balancing monitoring accuracy against control channel communication overhead. Planck [24] is a network measurement architecture that uses port mirroring to gather network information, improving the accuracy of flow collection. This approach overcomes the limitations of traditional methods constrained by the sampling rate bottleneck, which can negatively impact monitoring accuracy. Lonely-Flow-First (LFF) [26] observes that polling unnecessary switches produces redundant flow statistics, so LFF reduces the number of monitored switches to lower monitoring costs. The Low-Cost Monitoring (LCM) [27] algorithm is an active monitoring method that lowers the monitoring cost by reducing the number of monitored switches. LCM considers minimizing both the monitoring bandwidth consumption and the reporting delay. However, neither LFF nor LCM makes good use of the flexible mechanisms of SDN. The SDN architecture allows flows to be rerouted easily, which can reduce the number of switches that need to be monitored and the associated costs.
To minimize the number of monitored switches and the bandwidth cost in SDN-based edge computing environments, we introduce the Bandwidth Cost First (BCF) algorithm, which monitors fewer switches and thereby obtains a knock-on reduction in monitoring overhead. Our approach also reroutes some network flows from low-usage switches to high-usage switches to reduce the number of monitored switches further.

III. ALGORITHM DESIGN
The objective of this paper is to reduce monitoring costs by decreasing the number of monitored switches and minimizing bandwidth consumption in the control channel. In this section, we discuss the design of the Bandwidth Cost First (BCF) algorithm and its time complexity.

A. Idea of BCF Algorithm
In this paper, we design an algorithm that operates on top of an OpenFlow controller to minimize monitoring costs. The algorithm aims to reduce the number of monitored switches, which corresponds to the number of control messages. By using OpenFlow, we also avoid installing unnecessary monitoring hardware or software, such as sFlow on SDN, which requires numerous ports to receive flows and flow rules on each switch in proportion to the number of monitored switches.
Switches rely on flow tables to record flow information, such as packet count, byte count, and duration. However, multiple switches along a flow's routing path may hold the same flow information. To avoid this redundancy, the controller can collect a flow's statistics from a single switch instead of gathering them from all switches along the path. Collecting flow statistics from fewer switches reduces the flow information sent to the controller, minimizing control channel bandwidth costs. Therefore, selecting a subset of switches for statistics collection, while ensuring that all flows are still monitored, is a more efficient approach.
To further reduce the monitoring cost and leverage the flexibility of SDN, BCF considers rerouting some network flows to minimize the number of monitored switches. In SDN, traffic can be rerouted through SDN policies, which can be easily updated and modified without requiring significant changes to the underlying network infrastructure. By rerouting traffic from switches with fewer flows to switches with more flows, we can minimize the number of monitored switches while still achieving comprehensive flow monitoring. Monitoring fewer switches that each carry more flows can lead to better cost-performance ratios in terms of bandwidth and delay cost. This approach is more efficient than traditional methods and does not require changes to network hardware or software configuration.

B. Bandwidth Cost First (BCF) Algorithm
This subsection formulates the problem that BCF addresses and proposes an algorithm to solve it.
1) Problem formulation: Given a network topology G(V, E) and a set of flows F, where V is a set of switches and E is a set of links between switches, the goal of BCF is to reduce the number of monitored switches. Reducing the number of monitored switches can improve the overall latency of monitoring task processing across all switches. To this end, the objective of the problem is
$$\min \sum_{v \in V} x_v \quad (1)$$
where $x_v = 1$ indicates that switch v is selected as a monitored switch.
Next, we formulate the constraints that must be met. Because BCF takes flow rerouting into consideration, these include traffic routing constraints. Equation 2 states that the number of output flows from source s should be exactly one more than the number of input flows, and Equation 4 states that the number of input flows to destination d should be exactly one more than the number of output flows. Moreover, the number of output flows from any other switch should equal the number of input flows, as shown in Equation 3.
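The routing constraints described above amount to standard flow conservation for a single path. As a minimal sketch, assuming a flow's route is given as a list of directed (u, v) links (a hypothetical representation, as are the function and variable names), the check can be written as:

```python
from collections import Counter


def satisfies_flow_conservation(edges, src, dst):
    """Check the routing constraints for one flow routed over `edges`.

    out - in must be +1 at the source, -1 at the destination,
    and 0 at every other switch.
    """
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    nodes = set(out_deg) | set(in_deg) | {src, dst}
    for node in nodes:
        balance = out_deg[node] - in_deg[node]
        if node == src:
            expected = 1
        elif node == dst:
            expected = -1
        else:
            expected = 0
        if balance != expected:
            return False
    return True


# A valid route s -> a -> d, and a broken route that stops at a.
valid_route = satisfies_flow_conservation([("s", "a"), ("a", "d")], "s", "d")
broken_route = satisfies_flow_conservation([("s", "a")], "s", "d")
```

The sketch assumes that the source and destination are distinct switches.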
BCF should satisfy all monitoring tasks by monitoring all target flows, which means the selected switches should cover all target flows. If a flow f goes through a switch v ($\varepsilon_{f,v,u} = 1$) that is selected as a monitored switch ($x_v = 1$), the flow is covered. Each flow should be covered by at least one monitored switch; thus, the summation of $\varepsilon_{f,v,u} x_v$ should be equal to or larger than one, as shown in Equation 5.
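The coverage requirement can be checked directly on the flow paths. The sketch below is illustrative only: the flow paths, switch names, and the helper `uncovered_flows` are hypothetical, and a flow's path is assumed to be a list of the switches it traverses.

```python
def uncovered_flows(flow_paths, monitored):
    """Return the flows whose path contains no monitored switch."""
    monitored = set(monitored)
    return sorted(
        flow for flow, path in flow_paths.items() if monitored.isdisjoint(path)
    )


# Hypothetical flows and the switches each one traverses.
flow_paths = {
    "f1": ["s1", "s2", "s3"],
    "f2": ["s2", "s3"],
    "f3": ["s3", "s4"],
}

# Monitoring only s2 misses f3; monitoring only s3 covers every flow.
missing_with_s2 = uncovered_flows(flow_paths, ["s2"])
missing_with_s3 = uncovered_flows(flow_paths, ["s3"])
```

A selection of monitored switches satisfies the coverage constraint exactly when this function returns an empty list.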
Finally, to limit data plane latency, the length of rerouted paths should be restricted by Equation 8, where α is a parameter that determines the rerouting limit. This equation states that a rerouted path cannot be longer than α times the corresponding shortest path.
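As a small illustration of the rerouting limit, the following sketch assumes that path length is measured in hops (an assumption made here for simplicity) and uses a hypothetical helper name and hypothetical paths:

```python
def within_reroute_limit(rerouted_path, shortest_path, alpha):
    """True if the rerouted path is at most alpha times the shortest path.

    Paths are lists of switches; length is counted in hops (assumption).
    """
    rerouted_hops = len(rerouted_path) - 1
    shortest_hops = len(shortest_path) - 1
    return rerouted_hops <= alpha * shortest_hops


shortest = ["s1", "s2", "s3"]        # 2 hops
detour = ["s1", "s4", "s5", "s3"]    # 3 hops

allowed_at_1_5 = within_reroute_limit(detour, shortest, 1.5)
allowed_at_1_0 = within_reroute_limit(detour, shortest, 1.0)
```

With α = 1.5 the three-hop detour is accepted because 3 ≤ 1.5 × 2, whereas with α = 1.0 only paths as short as the shortest path are allowed and the detour is rejected.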
2) Algorithm: The BCF algorithm reduces the number of monitored switches to minimize the bandwidth cost. The algorithm comprises two phases: (A) selecting the monitored switches and (B) rerouting the network flows. In Phase (A), the algorithm selects the minimum number of switches needed to monitor the traffic. First, the algorithm scans all switches in the network and counts the number of flows passing through each switch. Then, the algorithm sorts the switches in descending order according to the number of flows passing through them and the bandwidth consumption given by Equation 10. Considering the bandwidth consumption, the switch that is closest to the controller and carries the highest number of flows is prioritized. To determine the monitored switches, the algorithm iteratively selects switches until all flows in the network are covered. In each iteration, the algorithm selects the switch that covers the highest number of uncovered flows. After each iteration, the flows that have already been covered are removed from the switches that have not been selected. The monitored switch selection phase outputs a monitored switch set $V_c$; the controller only needs to monitor these switches.
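Phase (A) behaves like a greedy set cover. The following Python sketch shows one way it could look; it is not the authors' implementation. The tie-breaking rule (prefer the switch nearer the controller) is a simplifying assumption standing in for the bandwidth cost of Equation 10, and the function name, flows, and distances are all hypothetical.

```python
def select_monitored_switches(flow_paths, distance):
    """Greedy selection of monitored switches (Phase A sketch).

    flow_paths: flow -> list of switches on its path
    distance:   switch -> hop distance to the controller (assumption)
    Returns a dict: monitored switch -> set of flows it covers.
    """
    # Count which flows pass through each switch.
    flows_at = {}
    for flow, path in flow_paths.items():
        for sw in path:
            flows_at.setdefault(sw, set()).add(flow)

    uncovered = set(flow_paths)
    selected = {}
    while uncovered:
        # Most uncovered flows first; ties go to the closer switch.
        best = min(
            flows_at,
            key=lambda sw: (-len(flows_at[sw] & uncovered), distance[sw], sw),
        )
        newly_covered = flows_at[best] & uncovered
        if not newly_covered:
            raise ValueError("some flow does not traverse any switch")
        selected[best] = newly_covered
        # Covered flows no longer count for the remaining switches.
        uncovered -= newly_covered
    return selected


flow_paths = {
    "f1": ["s1", "s2", "s3"],
    "f2": ["s2", "s3"],
    "f3": ["s3", "s4"],
    "f4": ["s5"],
}
distance = {"s1": 1, "s2": 1, "s3": 2, "s4": 3, "s5": 2}
selected = select_monitored_switches(flow_paths, distance)
```

In this toy input, s3 is chosen first because it carries three flows, and s5 is then required because it is the only switch that sees f4, so two monitored switches cover all four flows.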
To further reduce the number of monitored switches, the algorithm runs Phase (B): rerouting network flows. The basic idea is to reroute flows from lower-usage switches to higher-usage switches. After rerouting, the controller can monitor a smaller number of switches. In the first step of Phase (B), the algorithm sorts the monitored switch set $V_c$ in ascending order by the number of covered flows. If the number of monitored flows $|f_{v_m}|$ in a monitored switch $v_m \in V_c$ is lower than that in another monitored switch $v_n \in V_c$, the algorithm iteratively tries to combine $v_m$ with $v_n$. After rerouting the flows $f_{v_m}$ to switch $v_n$, the controller polls the flow information from the single switch $v_n$ instead of both $v_n$ and $v_m$. Under the rerouting limit α, if the rerouting costs of all rerouted flows are within α, then $v_m$ is combined with $v_n$. Otherwise, the combination is rejected to avoid the extra end-to-end latency caused by a longer path. The flow rerouting phase keeps combining monitored switches until no monitored switch can be combined. Eventually, fewer monitored switches are required in the network to collect traffic information for security, and fewer control plane resources are consumed.
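The combination step of Phase (B) can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: links are bidirectional, path length is measured in hops, and a flow is rerouted through $v_n$ by joining the shortest path from its source to $v_n$ with the shortest path from $v_n$ to its destination. All function and variable names are hypothetical.

```python
from collections import deque


def bfs_hops(adj, src):
    """Hop distance from src to every reachable switch."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def can_detour(hops, endpoints, via, alpha):
    """True if routing src -> via -> dst stays within alpha x shortest."""
    src, dst = endpoints
    inf = float("inf")
    shortest = hops[src].get(dst, inf)
    detour = hops[src].get(via, inf) + hops[via].get(dst, inf)
    return shortest < inf and detour <= alpha * shortest


def merge_monitored_switches(adj, endpoints, assigned, alpha):
    """Combine lightly used monitored switches into busier ones.

    adj:       switch -> list of neighbouring switches
    endpoints: flow -> (source switch, destination switch)
    assigned:  monitored switch -> set of flows it covers
    """
    hops = {sw: bfs_hops(adj, sw) for sw in adj}
    assigned = {sw: set(flows) for sw, flows in assigned.items()}
    merged = True
    while merged:
        merged = False
        # Try to empty the least-loaded monitored switch first.
        for v_m in sorted(assigned, key=lambda sw: (len(assigned[sw]), sw)):
            for v_n in sorted(assigned, key=lambda sw: (-len(assigned[sw]), sw)):
                if v_n == v_m or len(assigned[v_n]) < len(assigned[v_m]):
                    continue
                if all(can_detour(hops, endpoints[f], v_n, alpha)
                       for f in assigned[v_m]):
                    # Every flow of v_m can be rerouted through v_n.
                    assigned[v_n] |= assigned.pop(v_m)
                    merged = True
                    break
            if merged:
                break
    return assigned


# Hypothetical four-switch ring: s1 - s2 - s3 - s4 - s1.
adj = {
    "s1": ["s2", "s4"],
    "s2": ["s1", "s3"],
    "s3": ["s2", "s4"],
    "s4": ["s3", "s1"],
}
endpoints = {"f1": ("s1", "s3"), "f2": ("s1", "s3"), "f3": ("s2", "s3")}
assigned = {"s2": {"f1", "f3"}, "s4": {"f2"}}
merged_result = merge_monitored_switches(adj, endpoints, assigned, alpha=1.5)
```

Here f2 can reach its destination through s2 in the same number of hops as through s4, so s4 is combined into s2 and a single monitored switch remains. Note that joining two shortest paths may revisit a switch in general; a production implementation would need a proper constrained path search.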
In the delay cost of Equation 9, $L_{df} \times |F|$ is a fixed cost that BCF cannot reduce. The delay reduction provided by BCF comes mainly from $|V_c| \times L_p$: fewer monitored switches result in a lower $|V_c| \times L_p$. For example, compared with traditional per-switch collection, BCF reduces the delay cost by
5) Bandwidth reduction: In the monitoring scheme, the controller sends an ofp_multipart_request message to a switch. The switch processes the request to aggregate the specified flow information and then sends an ofp_multipart_reply back to the controller. Based on [30] and [26], the bandwidth cost function in switch $v_k$ is given by Equation 10. The bandwidth cost function involves the ofp_multipart_request and the ofp_multipart_reply together with the distance $d_{v_k}$ between the controller and switch $v_k$. The parameters of the bandwidth cost function are as follows: $L_{req}$ is the length of the ofp_multipart_request, $L_{rpyh}$ is the length of the reply packet header, $L_{bf}$ is the length of a single flow statistic in the reply packet body, and $|F^c_{v_k}|$ is the number of monitored flows in $v_k$.
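The sketch below assumes one plausible form of the bandwidth cost built from the terms just listed: the request length plus the reply header plus one body entry per monitored flow, scaled by the distance to the controller. This functional form is an assumption made for illustration and may differ from Equation 10. The constants are the values given in the simulation setup of Section IV; the flow count and distance are hypothetical.

```python
# Message lengths in bits, as listed in the simulation setup.
L_REQ = 122   # ofp_multipart_request
L_RPYH = 78   # reply packet header
L_BF = 96     # one flow statistic in the reply body


def bandwidth_cost(num_monitored_flows, distance):
    """Assumed cost of polling one switch, in bit-hops.

    (request + reply header + per-flow body) x distance to controller.
    """
    message_bits = L_REQ + L_RPYH + L_BF * num_monitored_flows
    return message_bits * distance


# Polling 10 flows at one switch two hops away, versus polling
# each of the 10 flows with its own request at the same distance.
one_poll = bandwidth_cost(10, 2)
ten_polls = sum(bandwidth_cost(1, 2) for _ in range(10))
```

Under this assumed form, the fixed request and header bits are paid once per polled switch, which is why aggregating many flows behind a few monitored switches lowers the total cost.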

IV. PERFORMANCE EVALUATION
In this section, we evaluate the performance of BCF from different perspectives, including the number of monitored switches, delay cost, and bandwidth cost. To compare BCF with related algorithms, we choose the Lonely-Flow-First (LFF) algorithm [26], which prioritizes the collection of statistics from lonely flow switches to reduce polling costs. A lonely flow is defined as a flow that passes through only one switch, making it necessary for the controller to collect flow statistics from that switch. LFF assigns higher priority to lonely flow switches when collecting flow statistics and employs an algorithm to find monitored switches with better cost performance. By monitoring fewer switches, the bandwidth and delay costs are reduced.
A. Simulation Setup
1) Simulation scenarios: The experiment is conducted in two scenarios, which are listed as follows.
Scenario (A): Fixed number of switches and dynamic number of flows in simulation.
Scenario (B): Dynamic number of switches and fixed number of flows in simulation.
2) Network topologies and flows: In our simulation, we randomly generate two topologies. The first topology includes 500 switches and 500 to 5000 flows for scenario (A). The second topology includes 100 to 1000 switches with 1000 flows for scenario (B). Additionally, we run experiments on a real-world network topology, the Cogentco topology, to verify the performance of BCF in a real network environment. The routing path for each flow is generated using the shortest path algorithm.
3) Algorithm parameters setup: Based on the experimental results obtained from [31], we set the parameters $L_p = 1.21$ ms and $L_{df} = 0.198$ ms in the delay cost function, Equation 9. The bandwidth cost parameters follow [30]: we set $L_{req} = 122$ bits, $L_{rpyh} = 78$ bits, and $L_{bf} = 96$ bits in Equation 10. We vary the threshold α from 1.0 to 2.0, where α = 2.0 implies that the original flow path can be rerouted to a path up to twice as long. On the other hand, a rerouting threshold of α = 1.0 means that no flow path is rerouted. In our experiments, we set the rerouting threshold α to 1.5 because we believe it is a reasonable trade-off between bandwidth cost on the control plane and end-to-end latency on the data plane. To avoid extreme results, we run the experiment 100 times with different flow sets on the same topology and then average the results.
Figure 2(a) and Figure 2(b) show the average number of monitored switches in the real network topology and the simulated topology, respectively. The flow statistics collection method in LFF combines per-switch and then per-flow polling. In contrast, BCF collects flow statistics with multipart messages, which can collect the specified flows in a single request. Also, LFF does not consider rerouting flow paths. LFF selects the switches with better cost performance for monitoring, leading to performance improvements over the per-switch method. Our algorithm reduces the number of monitored switches even further, especially when compared to the per-switch method. The per-switch method is a traditional SDN monitoring approach that collects the statistics of all flows on all switches along the paths, resulting in high resource consumption. In contrast, our algorithm offers a more efficient polling mechanism that reduces the number of monitored switches. Figure 2 demonstrates that BCF can significantly reduce the number of monitored switches compared to the per-switch and LFF methods. Given its prioritization of monitored switches, BCF exhibits superior performance. By decreasing the number of monitored switches, the network bandwidth of the control channel can be conserved. Furthermore, in the real topology, the results indicate an improvement of over 56% compared to the LFF method and 85% compared to the per-switch method.
2) The performance results for delay cost: Figure 3 displays the performance results for the delay cost metric. Delay cost is defined as the average time it takes for a switch to process a monitoring request. In the delay cost function of an SDN switch, a monitoring request includes both the request processing time and the flow statistics processing time. However, processing the request takes significantly more time than processing flow statistics. Gathering flow statistics from fewer switches is therefore an effective way to reduce the global delay cost.
The LFF algorithm's performance improvement is weak because it needs more monitored switches and more tasks to gather flow statistics. In scenario (B), which involves a fixed number of flows and a dynamic number of switches, the delay cost depends on the number of monitored switches, meaning that a higher total number of switches does not significantly affect the delay cost. However, LFF exhibits a significant performance gap in scenario (B) because it does not reroute flows, so it needs to monitor more switches than our algorithm. As shown in Figure 3(a), our algorithm achieves a 25% better delay cost improvement than LFF.
3) The performance results for bandwidth cost: The bandwidth cost results are presented in Figure 4. For the sake of clarity, we only show the results of LFF and BCF. Bandwidth cost takes into account the monitoring packet length and the distance between the switches and the controller. In Figure 4, BCF reduces the bandwidth cost by up to 97% compared to the per-switch method. LFF performs poorly in reducing bandwidth overhead because it does not consider rerouting network flows to aggregate the flow statistics and reduce the bandwidth cost. Moreover, LFF's flow statistics collection method is per-switch and then per-

V. CONCLUSION
In this paper, we design a flow monitoring algorithm called Bandwidth Cost First (BCF) to reduce resource consumption by choosing crucial switches and rerouting network flows. By comparing BCF with the existing Lonely-Flow-First algorithm on two topologies under two scenarios, we show that BCF avoids more unnecessary resource consumption while monitoring the whole network. The experimental results show that our algorithm surpasses the existing technique by decreasing the number of monitored switches by 56%, which in turn reduces bandwidth overhead by 41% and switch processing delay by 25%.

VI. FUTURE WORK
In future work, we plan to propose an advanced network monitoring algorithm for edge computing architectures that is more efficient in resource-limited environments and considers network congestion and QoS. Apart from the network congestion issue, we have observed in [32] that machine learning was used to reduce network latency in SDN. This is worth considering and referencing for our algorithm.

3) Time complexity: BCF must assign each flow to a monitored switch, which requires scanning all flows and has a time complexity of $|F|$. For each flow, in the worst case, BCF checks all switches to determine which one can cover the most flows, resulting in a time complexity of $|V|$. After selecting a monitored switch, BCF appends the flows covered by that switch to the set of covered flows, which takes $|F|$ time. BCF therefore takes $|V| + |F|$ time to assign flows to a monitored switch, and the time complexity of the monitored switch selection phase is $|F|(|V| + |F|)$. In the rerouting phase, BCF sorts the monitored switches in $V_c$, which takes $|V| \log |V|$ time using a quicksort algorithm. Next, BCF checks pairs of monitored switches for combination, with a worst-case time complexity of $|V|^2$. Each monitored switch combination takes $|F|$ time to reroute flows. The time complexity of the rerouting phase is thus $|V| \log |V| + |V|^2 |F|$. Because $|V| > \log |V|$, this can be simplified to $|V|^2 |F|$. Thus, the overall time complexity of BCF is $|F|(|V| + |F|) + |V|^2 |F|$.
4) Delay reduction: BCF aims to reduce the number of monitored switches, which can lead to a lower overall delay. According to Equation 9, it costs $L_{df}$ to collect statistics for a single flow in a switch. With $|F|$ flows in a network, the total latency required is $L_{df} \times |F|$. In addition, each monitored switch incurs an additional latency of $L_p$ to process a control message. Therefore, if there are $|V_c|$ monitored switches, the total cost is $|V_c| \times L_p$. The overall delay cost for monitoring is
$$|V_c| \times L_p + L_{df} \times |F| \quad (9)$$
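As a worked example of the delay cost just derived, the sketch below evaluates $|V_c| \times L_p + L_{df} \times |F|$ with the parameter values from the simulation setup ($L_p = 1.21$ ms, $L_{df} = 0.198$ ms). The switch and flow counts are hypothetical and chosen only to show how the two terms behave; they are not results from the evaluation.

```python
L_P = 1.21    # ms, per monitored switch (request processing)
L_DF = 0.198  # ms, per flow statistic


def delay_cost(num_monitored_switches, num_flows):
    """Total monitoring delay in ms: |V_c| * L_p + L_df * |F|."""
    return num_monitored_switches * L_P + L_DF * num_flows


# Hypothetical network with 1000 flows, polled through
# 100 monitored switches versus 40 monitored switches.
baseline = delay_cost(100, 1000)
reduced = delay_cost(40, 1000)
saving = baseline - reduced
```

The per-flow term (198 ms for 1000 flows) is identical in both cases, so the entire saving, 60 switches times 1.21 ms, comes from polling fewer switches.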

Figure 2. Average number of monitored switches: (a) real network topology, (b) simulated topology.

TABLE I
NOTATIONS TABLE

Notation    Description
V           A set of switches {v_1, v_2, v_3, ...}
V_c         A set of switches that already covered
F           A set of flows {f_1, f_2, f_3, ...}
k