tags: system
At present, network packet capture systems on Linux are generally built on libpcap. The name libpcap stands for Packet Capture Library. Applications use the C interface provided by the library to capture packets that pass through a network interface, whether or not those packets are addressed to the local machine. In a high-traffic network environment, however, libpcap-based capture is inefficient and cannot capture enough packets for the upper-layer application.
In libpcap, the kernel protocol stack provides the user with a SOCK_PACKET socket that works at the data link layer. Through the socket API, libpcap crosses from user mode into kernel mode, bypasses the TCP-layer and IP-layer processing in the kernel protocol stack, and captures raw network frames directly from the data link layer. To analyze the libpcap capture mechanism in depth, the message transmission mechanism and performance bottlenecks of the traditional TCP/IP kernel protocol stack are studied first, based on the linux-2.4.10 kernel.
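As an illustration of what "capturing at the data link layer" means, here is a minimal Python sketch, not libpcap's own code. It assumes a Linux host and uses an AF_PACKET raw socket, the modern replacement for the obsolete SOCK_PACKET type named above. The helper names open_link_layer_socket and parse_ethernet_header are hypothetical.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # from <linux/if_ether.h>: receive frames of every protocol


def open_link_layer_socket(ifname):
    """Open a raw link-layer socket bound to ifname (Linux only, needs root)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.bind((ifname, 0))
    return sock


def parse_ethernet_header(frame):
    """Split a raw frame into destination MAC, source MAC, EtherType and payload."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac):
        return ":".join(f"{b:02x}" for b in mac)

    return fmt(dst), fmt(src), ethertype, frame[14:]
```

With root privileges, a frame read with open_link_layer_socket("eth0").recv(65535) (the interface name is an assumption) can be passed straight to parse_ethernet_header(); the IP and TCP layers of the kernel never see it on this path.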
The kernel protocol stack of the Linux operating system can be divided roughly into the data link layer, the IP layer, the TCP/UDP layer, the INET Socket layer, the BSD Socket layer and the application layer. Taking TCP as the transport-layer protocol, a typical TCP/IP protocol stack hierarchy and module structure is given in Figure 1.
The kernel protocol stack consists of a set of functions and several key data structures. The function call relationships between the protocol layers are shown in Figure 1. The user-level socket library is backed in the kernel by the data structures struct socket and struct sock. While being sent or received, data is managed by two data structures, struct msghdr and struct sk_buff. The struct msghdr is maintained by the BSD Socket layer and the INET Socket layer and stores the address and size of the application-layer data buffer. Every layer from TCP/IP downward uses struct sk_buff as its data buffer. The kernel converts between the two structures with a memory copy in tcp_sendmsg()/tcp_recvmsg(). As a struct sk_buff passes through the layers below the TCP/UDP layer, its data is not copied; only the data pointer is moved from one protocol header to the next, which avoids unnecessary overhead.
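The idea of moving a data pointer instead of copying can be sketched in Python as follows. This is a toy model, not the kernel's struct sk_buff: the class PacketBuffer and its methods are hypothetical, and pull() only loosely mirrors what the kernel's skb_pull() does.

```python
class PacketBuffer:
    """Toy stand-in for struct sk_buff: one backing buffer plus a data offset."""

    def __init__(self, frame):
        self._buf = memoryview(frame)  # a view, not a copy, of the frame bytes
        self._offset = 0               # plays the role of the data pointer

    def pull(self, header_len):
        """Step past a protocol header and return a view onto that header."""
        if self._offset + header_len > len(self._buf):
            raise ValueError("header extends past end of buffer")
        header = self._buf[self._offset:self._offset + header_len]
        self._offset += header_len
        return header

    @property
    def data(self):
        """View of the bytes that the current layer still has to process."""
        return self._buf[self._offset:]
```

Each layer calls pull() with its own header length and hands the same object upward. The payload bytes stay where they are until the final copy into the application buffer.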

Taking the Intel 100 Mbit/s eepro100 network card as the physical-layer device and referring to the kernel protocol stack framework in Figure 1, the message receiving process of the Linux kernel protocol stack is analyzed in detail below.
Message reception in the Linux kernel protocol stack consists of two processes: a top-down process and a bottom-up process. The top-down process is passive. The application issues a read() or recv()/recvfrom() system call, which passes the request for data down through the upper protocol layers and enters the TCP layer through tcp_recvmsg(). The tcp_recvmsg() function reads data from the receive queue sk->receive_queue. If the required data is not in the receive queue, the process that called tcp_recvmsg() sleeps on sk->receive_queue. When the system wakes the suspended process, tcp_recvmsg() reads the required data and calls tcp_v4_do_rcv() to move the data in the sk->backlog queue into sk->receive_queue, waking the processes that are waiting on that queue. While reading a packet, tcp_recvmsg() copies it from the kernel buffer to the application buffer. The bottom-up process is as follows:
To measure the overhead (CPU time consumed) of the kernel protocol stack during message reception, the network card driver was modified under Linux and code was inserted into the kernel to record a timestamp when a message reached each stage, so that the time cost of each stage could be determined. The test results are shown in Table 1.

As can be seen from the table, hard interrupt processing (from speedo_rx() to net_rx_action()) accounts for about 20% of the time, soft interrupt processing (from net_rx_action() to tcp_recvmsg()) accounts for about 15%, and the stage from tcp_recvmsg() to inet_recvmsg() consumes about 55% of CPU time because of the memory copy. System calls take about 5% to 6% of CPU time. Memory copies are expensive mainly for the following reasons:
libpcap reads network packets through the SOCK_PACKET socket, which works at the data link layer. Taking the Intel 100 Mbit/s eepro100 network card as an example, the libpcap packet capture process is analyzed as follows:
The libpcap packet capture process is compared with the message receiving process of the kernel TCP/IP protocol stack in Figure 2. The figure shows that, unlike the traditional kernel TCP/IP receive path, libpcap bypasses the TCP (or UDP) layer and the IP layer and copies the packet directly from the data link layer into the application buffer. This saves part of the CPU time spent on a packet during reception; according to Table 1, about 10% of the processing time is saved. However, system calls, data copying and kernel interrupt handling remain the main performance bottlenecks of libpcap packet capture.

The time-consuming stages of the libpcap capture process can be optimized to reduce system resource consumption and to improve libpcap capture performance in high-traffic network environments. The main optimization measures are described below, together with their advantages and disadvantages.
Kernel filtering decides, at the level of the soft interrupt handler (the packet_rcv() function), whether a received packet is of interest to the application. If it is, the packet is copied to the application-layer buffer; otherwise it is discarded. This can greatly reduce the number of packets the system actually processes and so improve capture efficiency. However, the method helps only certain applications. It does little for applications such as traffic statistics and general protocol analysis, which usually have to process most of the packets on the network.
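The effect of filtering before the copy can be sketched with a small Python model. This is only an illustration of the principle: the function capture, the packet dictionaries and the wanted predicate are all hypothetical. In libpcap itself, filters are installed with pcap_compile() and pcap_setfilter() rather than passed as a callable.

```python
def capture(packets, wanted=None):
    """Return (delivered, copies) for a stream of packet dicts.

    wanted is an optional predicate standing in for the in-kernel filter.
    Packets it rejects are dropped before the kernel-to-user copy happens.
    """
    user_buffer = []
    copies = 0
    for pkt in packets:
        if wanted is not None and not wanted(pkt):
            continue  # discarded in the "kernel", so no copy is paid for
        user_buffer.append(bytes(pkt["data"]))  # models the copy to user space
        copies += 1
    return user_buffer, copies
```

An application interested only in TCP port 80 pays for one copy per matching packet, whereas a traffic-statistics tool that accepts everything gains nothing from the filter.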
A system call involves the int 0x80 software interrupt, saving the execution context, and process switching, so frequent system calls are very time-consuming in a high-traffic network environment. Many applications therefore move their main application-layer processing into the kernel as a loadable kernel module (LKM), which reduces the number of system calls. To do so, however, the application logic has to be written inside a driver module or the kernel itself. Kernel code is written differently from application code and uses different functions, and memory allocation and access to user data also work differently, so kernel programming is more complex and more difficult. In addition, kernel code must be highly stable: a fault in kernel code crashes the entire system. This method is therefore generally suited to applications that need high processing efficiency and perform relatively simple processing, such as firewalls.
libpcap reads packets through the recvfrom() system call, and each call passes only one packet to user space. It is therefore possible to accumulate packets in a kernel buffer and wake the user process only when a certain number have arrived, so that it reads all the packets in the kernel buffer at once. This saves the time spent switching to the user process and is very effective when processing high-traffic packet streams. However, it delays the arrival of a message at the user process. When the packet arrival rate is large, the time delay will increase significantly.
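The saving in wakeups from batching can be illustrated with a short Python sketch. The generator deliver_in_batches is hypothetical; each list it yields models one wakeup of the user process, and the batch size is a free parameter rather than a value from the text.

```python
def deliver_in_batches(packets, batch_size):
    """Yield lists of up to batch_size packets; each list models one wakeup."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for pkt in packets:
        batch.append(pkt)
        if len(batch) == batch_size:
            yield batch  # buffer is full: wake the reader once for the whole batch
            batch = []
    if batch:
        yield batch  # flush the final, partially filled batch
```

With batch_size set to 1 the model degenerates to one wakeup per packet, which corresponds to the one-packet-per-recvfrom() behavior described above.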
The memory copy from the kernel to the application layer is the main performance bottleneck of a packet capture system. Zero-copy techniques address this. A processing module added to the kernel converts the virtual address of the user buffer into a physical address that the network card can use, and locks that address in memory. The network card driver is modified to obtain the physical address of the user buffer and to use the card's asynchronous DMA to transfer packets from the network card directly into user space. This bypasses the kernel protocol stack and reduces the cost of kernel processing, data copying and system calls. The method can greatly improve capture performance and can even reach the performance limit of the network card. However, because the network card driver must be modified, the method is less general.
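The following Python sketch models only the data-sharing aspect of zero copy: a producer writes frames into a shared memory region and the consumer receives views onto that region instead of copies. It does not perform DMA or pin memory. The class SharedRing, its methods, and the constants SLOT_SIZE and NUM_SLOTS are all hypothetical.

```python
import mmap

SLOT_SIZE = 2048  # assumed slot size, large enough for one Ethernet frame
NUM_SLOTS = 4     # assumed ring length


class SharedRing:
    """Fixed-slot ring in one memory region shared by producer and consumer."""

    def __init__(self):
        self._mem = mmap.mmap(-1, SLOT_SIZE * NUM_SLOTS)  # anonymous mapping
        self._view = memoryview(self._mem)
        self._lengths = [0] * NUM_SLOTS
        self._head = 0   # next slot the producer fills
        self._tail = 0   # next slot the consumer reads
        self._count = 0  # number of filled slots

    def produce(self, frame):
        """Model the card writing a frame into the next free slot."""
        if self._count == NUM_SLOTS or len(frame) > SLOT_SIZE:
            return False  # ring full or frame too large: the frame is dropped
        start = self._head * SLOT_SIZE
        self._view[start:start + len(frame)] = frame
        self._lengths[self._head] = len(frame)
        self._head = (self._head + 1) % NUM_SLOTS
        self._count += 1
        return True

    def consume(self):
        """Return a view onto the oldest frame without copying, or None."""
        if self._count == 0:
            return None
        start = self._tail * SLOT_SIZE
        view = self._view[start:start + self._lengths[self._tail]]
        # Simplification: the slot is released at once. A real ring would keep
        # it reserved until the consumer signals that it has finished.
        self._tail = (self._tail + 1) % NUM_SLOTS
        self._count -= 1
        return view
```

The only write of packet bytes happens in produce(), which stands in for the DMA transfer. The consumer never triggers a second copy unless it explicitly converts the view to bytes.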
When packets arrive at the system at too high a rate, all CPU time can end up being spent on interrupt processing. The system then runs the hard interrupt handler so often that the soft interrupt processing above it never gets to run. The hard interrupt handler keeps moving packets from the network card buffer into the system buffer, so the system buffer fills quickly and the excess packets are discarded. At the same time, other processes cannot obtain the CPU. Reducing the interrupt frequency is therefore necessary to improve the performance of high-traffic packet capture systems. Some improved methods disable hardware interrupts when the interrupt frequency is found to be too high; some packets are lost, but the system gets a chance to run other processes. Other systems combine interrupts with polling to reduce the interrupt frequency. The polling watchdog mechanism proposed by Maquelin et al. is one such method.
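A generic hybrid of interrupts and polling can be modeled in a few lines of Python. This is not the polling watchdog mechanism itself, which relies on a timer; it is a simpler scheme assumed here for illustration, in which an interrupt starts a poll loop and packets arriving while the loop is busy are picked up without a further interrupt. The function count_interrupts and its parameters are hypothetical.

```python
def count_interrupts(arrivals, service_time):
    """Count interrupts raised under a simple interrupt-plus-polling scheme.

    arrivals is a sorted list of packet arrival times and service_time is the
    time needed to process one packet. An interrupt is raised only when a
    packet arrives while the CPU is not already polling.
    """
    interrupts = 0
    busy_until = float("-inf")  # time at which the poll loop runs out of work
    for t in arrivals:
        if t >= busy_until:
            interrupts += 1                # CPU idle: the packet raises an interrupt
            busy_until = t + service_time  # and starts a new poll loop
        else:
            busy_until += service_time     # poll loop absorbs the packet
    return interrupts
```

In this model, sparse traffic behaves like a purely interrupt-driven system, while a burst that arrives faster than it can be processed costs a single interrupt.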