Random packet marking source tracking algorithm based on DoS attack

Author: Hungry Garfield (QQ120474)
      iojhgfti@

Abstract: In response to the increasingly rampant denial of service attack (DoS) on the Internet, the performance defects of the traditional random packet marking algorithm are analyzed, and a new return tracking algorithm based on hash message authentication code is proposed. By analyzing its performance indicators, it shows that the algorithm improves the efficiency and accuracy of return tracking DoS attacks.
Thanks to the experts who helped me, Brother Yuan [nsfocus], sunwear[], isno[xfocus], scz[nsfocus]
1. Introduction
Denial of Service Attack, referred to as DoS (Denial-of-Service), is a common hacking behavior. This attack behavior causes the network server to be filled with a large amount of information waiting for reply by sending packet requests with false source addresses, consumes network bandwidth or system resources, makes the network or system service load too heavy, the service quality decreases, and normal services are stopped until they are paralyzed. Sometimes, in order to improve the effectiveness of the attack, hackers often join forces with multiple attack sites to launch attacks on the victims. Because DoS attacks are easy to implement, difficult to defend, and difficult to return to track the source of the attack, it becomes a security issue that seriously intrudes into the normal operation of institutions such as network service providers, government agencies, and financial securities.
2. IP return tracking related technologies
In order to completely eliminate DoS attacks, it is necessary to trace the source and find the real attacking machine or attacker. This method is called IP return tracking technology (IP Traceback). Because DoS attackers often forge the source address when sending attack packets, it is very difficult to trace the IP back. Common IP return tracking methods are: ingress filtering, connection testing, ICMP tracking [7], login analysis, source path isolation engine (SPIE), IPSec tracking, and random packet marking tracking (PPM) [2]. Performance comparisons between various tracking technologies are shown in Table 1.
Management burden      Network burden     Router burden     Distributed capability      After-tracking capability      Prevention/response
Enter filter     Medium      Low    Medium    N/A            N/A                         �
Input Debugging    High      High     Good
Control Flow
Login Analysis
ICMP Tracking                                                            �
Packet Tags
Table 1 Comparison of performance of several IP return tracking technologies[2]
3. HPPM IP return tracking algorithm
3.1 PPM algorithm (Probabilistic Packet Marking)
The main principle of the random packet marking algorithm PPM is as follows: a router randomly marks the packet passing through it with a certain probability p (usually 1/25) [2] with a certain probability p (usually 1/25). When a DoS attack occurs, the victim reconstructs the attack path based on the tagged information in the attack packet it receives. Using the PPM algorithm, the router is less burdened, and the use of marker edge compression and sharding techniques greatly reduces additional network traffic. Moreover, this method can track the attack source after the attack is over. PPM has a good tracking effect on single-source DoS attacks.
However, due to its own flaws, the PPM algorithm cannot return well to track DDoS attacks (Distributed–Denial–of–Service). First, since the router randomly tags the packet with probability p, it gives the attacker an opportunity to write the forged tag information into the header of the attack packet (usually the Identifier field). As long as the packet is not marked by the router it passes through, until the target host, it can forge a side path in the attack path to prevent the victim from tracking the real attack source. Secondly, in order to save storage space and reduce network burden, PPM adopts edge mark compression and shard storage technology. However, shard storage may cause victims to group shards that are not originally part of the same packet, thereby generating an incorrect edge path. The labeled edge compression method mainly depends on (a○+b)○+b=a (a and b are the IP addresses of two adjacent routers on the attack path respectively), which will significantly aggravate this effect and further generate the wrong attack path. When a DDoS attack occurs, the explosion effect will be more serious because there are multiple attackers at the same distance [2].
3.2     HPPM Algorithm
In view of the above shortcomings of the PPM algorithm, we propose a packet marking algorithm HPPM based on hash message authentication code HMAC, and adopts a new mark overlapping sharding strategy to improve the performance of IP return tracking DoS attacks (especially DDoS attacks).
In this algorithm, router Ri randomly tags the packets passing through it with probability p. The tag information includes the IP address of Ri and the IP address of the next hop router Rj, a total of 64 bits. In order to save tag storage space and not cause too much impact on users, the algorithm uses the 16-bit identifier field (Identifier), 1-bit idle flags (Flags) [1] and 13-bit chip displacement field (according to statistics, less than 0.25% of data packets currently require sharding [2]), and the 8-bit TOS field (Type-of-Service) [1] that is rarely used, a total of 38 bits are used to store tag shard information. The 64-bit tag information is divided into k=8 pieces, each piece occupies 8 bits, the shard offset value occupies log2k=3 bits, the distance value from Ri to the target host occupies 5 bits (25-1=31 hops, suitable for most current networks [2]), and the HMAC value used for verification accounts for 22 bits.
Hash message authentication code, referred to as HMAC, is an authentication mechanism based on message authentication code MAC (Message Authentication Code). When using HMAC, both parties to the message communicate authenticate the authenticity of the message by verifying the authentication key K added in the message; HMAC also introduces a hash function H to encrypt the message, further ensuring the security and effectiveness of message authentication. The specific calculation method of HMAC is as follows:
HMAC(M,K) = H(K○+opad, H(K○+ipad, M ))
Among them, ipad = word 0x36 is repeated B times, opad = word 0x5C is repeated B times, M = message string to be encrypted, B = word length of message string. For more detailed contents of HMAC, please refer to RFC 2104[6].
In the HPPM algorithm, we use the one-time password mechanism OTP (One-Time Password)[4][5], and first let each router Ri generate a set of private key sequences {Kij} (j=0, 1, 2...). This set of private key sequences is generated by a one-way hash function f and has the following rules: Kij-1 = f(Kij). Since function f is one-way, the latest key Kij can be deduced in turn all Kij-1, Kij-2...Ki0 generated before it, but the next key cannot be deduced based on the existing key, which ensures that others cannot forge Ri's key. Every time the router Ri passes a fixed time interval, it replaces the private key Ki according to the above method, and then publishes the key that has just been stopped in a reliable way. When the data packet P passes through Ri, Ri uses the HMAC function H and the current private key Ki to encrypt Ri’s IP address and P’s destination address, that is: H(M,Ki), where M = IP(Ri)+IP(destination). The specific marking steps are as follows:
Marking procedure at router Ri:
let m be the marking massage ip(Ri) + ip(Rj)
let k be the number of fragments in m
let H be the HMAC function
let Ki be the private key of Ri at current time interval
for each packet w
let x be a random number from [0..1)
if x < p then
let hmac be the HMAC value of IP(Ri)+IP()
    hmac := H( IP(Ri)+IP(), Ki)
let o be a random integer from [0..k-1]
let f be the fragment of m at offset o
write f into
write 0 into
write o into
write hmac into
else
increment
The following will discuss the process of victims rebuilding the attack path. When attacked by DoS, the victim begins to collect marker shards and record their arrival time. Assuming that the attacker sends a large number of attack packets, the victim can collect enough marker shards, and then reorganize shards with the same attack distance d and hmac values according to the offset value of the shard, generate edge paths, and then generate the entire attack path. Since an attacker may forge tagged packets and interfere with the return tracking process, it is necessary to identify the authenticity of the generated edge path. The specific identification method is: the victim contacts the router Ri (service provider), obtains Ri's latest private key K, and then uses the same hash function f to calculate the private key Ki used when Ri tag P based on the arrival time of K and packet P (delay needs to be considered). Then, based on Ri and its own IP address, and Ki, the HMAC value is calculated and compared with the HMAC value in the tagged packet. If the same is true, it means that the edge path is valid, otherwise it will be discarded. The specific process of rebuilding the attack path is as follows:
Path reconstruction procedure at victim v:
let FragTbl be a table of tuples (frag, offset, distance, hmac, time)
let G be a tree with root v
let edges in G be tuples (start,end,distance,hmac,time)
let maxd := 0
let last := v
for each packet w from attacker
let rectime be the time when v receives the packet w
(,,,,rectime)
if  > maxd then
maxd :=
for d := 0 to maxd
for all ordered combinations of fragments having the same hmac value at distance d
construct edge z( IP(Ri), IP(Rj), , , rectime)
// Start of HMAC authentication
let K be the private key of Ri at current time interval
let Ki be the private key of Ri at the time interval when Ri marked the packet w
let f be the function to get Ki according to K and rectime
Ki := f(K, rectime)
if  = H(IP(Ri)+IP(v), Ki)
insert edge z( IP(Ri), IP(Rj), ) into G
        // End of HMAC authentication
remove any edge (x,y,d,hmac,time) with d ≠ distance from x to v in G
extract path (Ri..Rj ) by enumerating acyclic paths in G
4. HPPM algorithm performance analysis
4.1      Attack Convergence Packet Number
According to [2], the probability that the victim receives a tagged packet sent by the router at the farthest path on the path with a distance of d is: p(1-p)d-1. Conservatively assume that the probability of a victim receiving a tagged packet sent by any router on the path is the same as the farthest distance d and is independent of each other. Therefore, the probability of any packet received by the victim being tagged by some routers on the path through the packet will have the expected value: 1/dp(1-p)d-1.
According to the coupon-collector problem [8], the number of packets received by the victim from the path with a distance of d includes at least one tagged packet sent by each router in all d routers. The number of packets required to be received has the following expected value: 1+d/(d-1)+d/(d-2)+…+d/2+d = d(ln(d)+O(1)). Considering that each tag data packet is divided into k slices, with a total of kd slices, the above value is kd(ln(kd)+O(1)). Therefore, the number of packets required to reconstruct an attack path with a distance d (including d routers) has the expected value:
E(N)<kd(ln(kd)+O(1))
1/dp(1-p)d-1≈k
ln(kd)/p(1-p)d-1
Therefore, the number of converged packets attacks by this algorithm is the same as that of the PPM edge marking algorithm [2].
4.2       Fulfability
In the PPM algorithm [2], for a hash random function with output length h, the probability that the victim accepts any candidate edge path is 1/2h. Assuming there are m attackers, at a certain distance d, at the worst case, there will be m independent routers. Therefore, at the distance from victim d, the maximum probability that any candidate edge path that is not originally in the actual attack path is accepted by the victim (ie, positive error [3] is generated) will be: 1-(1-1/2h)n (where n=mk)[2]. Because, in the worst case, there will be mk mark shards at distance d. When the k value or m value is large, this probability will also become large. The HMAC authentication mechanism used by the HPPM algorithm can effectively identify the forged edge paths by the attacker and filter out the forged edge paths from the candidate edge paths, thereby fully reducing the positive errors generated by the algorithm and improving the accuracy of return tracking.
4.3 The burden on the router
Since the HMAC and one-time password mechanism are used to encrypt edge tags in attack packets, the intermediate router does not have to bear the exclusive OR operation (a○+b)○+b of each router in the PPM algorithm for its IP address and existing edge tags. The calculation process of HMAC is simple and has good scalability. When a hash function with faster or safer computing speed is found or requires, it is easy to replace the underlying hash function without modification. See literature [6]. To make the packet marking process more secure, the router needs to periodically replace its private key for encrypted edge marking. This cycle needs to be properly selected. Too short cycle will put additional burden on the router and will not be conducive to synchronization with the victim. Too long cycle will affect the security of the algorithm. You can consider taking 10 seconds as its order of magnitude [3].
4.4                                                             �
Due to the one-time password mechanism, the victim needs to obtain the private key used when encrypting edge marks of the upstream router. One feasible method is that the upstream router publishes the latest key through a trusted channel (such as published on the website). When the victim identifies the authenticity of the edge mark, he only needs to download the latest key of the router, and based on the latest key, he can calculate all the keys used by the router when encrypting the edge mark. This process can be completed in constant time.
4.5       Limitations of Algorithm
The HPPM algorithm requires victims to master the network topology and have maps of their upstream routers, which to some extent limits the development of the algorithm. The synchronization process of the victim and the intermediate router with regard to the key needs further consideration.
5. Summary and future work
This paper describes a new random packet marking algorithm HPPM based on hash message authentication code HMAC. This algorithm is used to return the attack source for tracking DoS attacks, which can fully reduce the error caused by the attacker forgery of packet markings and improve the security and accuracy of return tracking. However, this algorithm has some incompleteness. For example, like most return tracking algorithms, the HPPM algorithm is a reactive tracking algorithm, that is, tracking can only be done if an attack does occur. Moreover, the algorithm cannot actually track the real attack source and can only return to the boundary router that is closest to the attack source. These issues need to be further studied.

References
[1]    Written by Stevens, translated by Fan Jianhua, Xu Guanghui, Zhang Tao, et al., edited by Xie Xiren, "Detailed Explanation of TCP/IP - Volume 1: Agreement" [M], 1st Edition, Beijing: Machinery Industry Press, April 2000, pp. 24-27, 111-112.
[2] , ,  and . Practical network support for IP traceback[C]. In ACM SIGCOMM, Stockholm, Sweden, 2000, 295-306.
[3]  and . Advanced and authenticated marking schemes for IP traceback[C]. In IEEE INFOCOMM, April 2001.
[4] , The S/KEY One-Time Password System[Z], Internet RFC 1760, February 1995.
[5] , A One-Time Password System[Z], Internet RFC 2289, February 1998.
[6] , , and , HMAC: Keyed-Hashing for Message Authentication[Z], Internet RFC 2104, February 1997.
[7] . ICMP Traceback Messages[Z]. Internet Draft: 2000.
[8] . An Introduction to Probability Theory and Its Applications[M]. 2nd edition, volume 1. Wiley and Sons. 1966.
[9] , . On the Effectiveness of Probabilistic Packet Marking for IP Traceback under Denial
of Service Attack[C]. In IEEE INFOCOM, 2001.