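Windows Driver Programming: NDIS (VPN)
I. Introduction
This article shares some Windows VPN proxy techniques. It does not discuss VPN schemes or encrypted tunnel proxying; it looks at how packets travel from an NDIS miniport up to the application layer.
I have personally used two versions of the OpenVPN driver on Windows: tap-windows 5.0 (NDIS 5.0) on Windows 7 and earlier, and tap-windows 6.0 (NDIS 6.0) on Windows 8 and later.
Jason Donenfeld was not satisfied with the OpenVPN tap-windows driver, so he wrote Wintun for WireGuard to replace the tap-windows NDIS driver. Later optimizations, such as reducing ring-3/ring-0 context switches, produced WireGuard NT.
WireGuard has an excellent reputation: it was merged into the Linux kernel and has been called a work of art. Commercial products increasingly use WireGuard in place of OpenVPN. It also makes user-mode (ring-3) development efficient, since you can write everything in Go without touching the driver.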
II. Source Code
OpenVPN tap-windows6 download: https://github.com/OpenVPN/tap-windows6
WireGuard NT download: https://git.zx2c4.com/wireguard-nt/
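III. Tap-windows:
tap-windows 5.0 builds with WDK 7600, and tap-windows 6.0 builds with a newer WDK. Shipping both works well: tap-windows 5.0 supports XP through Windows 7 nicely but has problems on later systems; see the Issues on GitHub for details.
This section does not cover how OpenVPN itself builds its tunnel. Instead it uses a hypothetical proxy scheme as an example: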
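- Initialize the tap driver and register the miniport to create a virtual network adapter.
- The application configures the routing table and the virtual adapter so that the chosen IP addresses are routed to the virtual adapter.
- NDIS captures packets and completes IRPs to deliver them to the application, which forwards them through SOCKS5 or a private proxy protocol.
- The application writes the proxy's reply packets back to the virtual adapter.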
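Both tap-windows 5.0 and 6.0 transfer captured packets through I/O requests: asynchronous ReadFile/WriteFile, with MDL-based reads and writes. Using this scheme for proxying raises a question: NDIS hands up network-layer packets, so how does the application proxy them?
UDP is fairly simple. TCP proxying is more involved: every TCP segment (exactly what Wireshark shows) has to be handled by your own code, including the handshake, teardown, timeouts, and recovery.
lwIP can take over this work: running in the application, it maintains congestion control, RTT, fast recovery, and so on for you, and the resulting stream data is forwarded to the server through IOCP, Asio, or a similar framework. BadVPN appears to use the same approach.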
Packet flow implementation:
Application layer:
1) After the driver is initialized, start a thread that issues asynchronous ReadFile I/O:

    for (i = 0; i < 8; i++)
    {
        readBytes = 0;
        memset(&ol, 0, sizeof(ol));
        ol.hEvent = g_ioEvent;
        if (!ReadFile(hDevice, &rr, sizeof(rr), NULL, &ol))
        {
            if (GetLastError() != ERROR_IO_PENDING)
            {
                OutputDebugString(L"ReadFile Error!");
                goto finish;
            }
        }
        for (;;)
        {
            dwRes = WaitForMultipleObjects(
                sizeof(events) / sizeof(events[0]), events, FALSE, waitTimeout);
            ......
        }
        ......
    }
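Kernel layer:
1) The driver receives the I/O from the application and first pends the read IRP. The path is AdapterCreate->CreateTapDevice->TapDeviceRead (IRP_MJ_READ). It then tries to pull a copy of a packet captured by AdapterSendNetBufferLists and, if one is available, completes the I/O back to the application.

    IoCsqInsertIrp(&adapter->PendingReadIrpQueue.CsqQueue, Irp, NULL);
    tapProcessSendPacketQueue(adapter);
    ntStatus = STATUS_PENDING;

An alternative design: the read dispatch routine only pends the IRP and stores it in a linked list, then signals an event that wakes a dedicated kernel thread, which tries to pull a packet copy and complete the IRP.

    IoMarkIrpPending(irp);
    InsertTailList(&g_pendedIoRequests, &irp->Tail.Overlay.ListEntry);
    status = STATUS_PENDING;
    KeSetEvent(&ThreadEvent, IO_NO_INCREMENT, FALSE);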
2) AdapterSendNetBufferLists checks whether the tap device has been opened and whether the adapter is ready. NdisDeviceStateD0 is the adapter power state that means the device is fully powered and active. The switch cases filter out the states in which the adapter is pausing, paused, or halted.
```
if(adapter->TapFileObject == NULL)
{
    //
    // Complete all NBLs and return if adapter not ready.
    //
    tapSendNetBufferListsComplete(
        adapter,
        NetBufferLists,
        NDIS_STATUS_SUCCESS,
        DispatchLevel
        );
    return;
}

if(!Adapter->LogicalMediaState)
{
    status = NDIS_STATUS_MEDIA_DISCONNECTED;
}
else if(Adapter->CurrentPowerState != NdisDeviceStateD0)
{
    status = NDIS_STATUS_LOW_POWER_STATE;
}
else if(Adapter->ResetInProgress)
{
    status = NDIS_STATUS_RESET_IN_PROGRESS;
}
else
{
    switch(Adapter->Locked.AdapterState)
    {
    case MiniportPausingState:
    case MiniportPausedState:
        status = NDIS_STATUS_PAUSED;
        break;
    case MiniportHaltedState:
        status = NDIS_STATUS_INVALID_STATE;
        break;
    default:
        status = NDIS_STATUS_SUCCESS;
        break;
    }
}
```
3) Check the NET_BUFFER length range. It must be at least the Ethernet header size (14 bytes) plus the IPv4 header size (20 bytes), and at most the Ethernet header plus the MTU (1500 at most) plus a VLAN tag.

    // Minimum packet size is size of Ethernet plus IPv4 headers.
    ASSERT(packetLength >= (ETHERNET_HEADER_SIZE + IP_HEADER_SIZE));

    if(packetLength < (ETHERNET_HEADER_SIZE + IP_HEADER_SIZE))
    {
        return FALSE;
    }

    // Maximum size should be Ethernet header size plus MTU plus modest pad for
    // VLAN tag.
    ASSERT( packetLength <= (ETHERNET_HEADER_SIZE + VLAN_TAG_SIZE + Adapter->MtuSize));

    if(packetLength > (ETHERNET_HEADER_SIZE + VLAN_TAG_SIZE + Adapter->MtuSize))
    {
        return FALSE;
    }
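The length check in step 3 can be tried outside the kernel. The following is a minimal user-mode sketch, assuming a 14-byte Ethernet header, a 20-byte IPv4 header, and a 4-byte VLAN tag. The function name `frame_length_ok` is hypothetical, and the constants are redefined locally rather than taken from the driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sizes in bytes. The names mirror the driver's constants, but they are
 * redefined here (assumed values) so the sketch compiles on its own. */
#define ETHERNET_HEADER_SIZE 14u  /* 6 (dest) + 6 (src) + 2 (EtherType) */
#define IP_HEADER_SIZE       20u  /* IPv4 header without options */
#define VLAN_TAG_SIZE         4u  /* 802.1Q tag */

/* Returns true when a frame length falls inside the accepted range. */
static bool frame_length_ok(size_t packet_length, size_t mtu)
{
    assert(mtu > 0);
    if (packet_length < ETHERNET_HEADER_SIZE + IP_HEADER_SIZE)
        return false;   /* too short to hold Ethernet + IPv4 headers */
    if (packet_length > ETHERNET_HEADER_SIZE + VLAN_TAG_SIZE + mtu)
        return false;   /* longer than header + VLAN tag + MTU */
    return true;
}
```

With an MTU of 1500, this accepts lengths from 34 to 1518 bytes inclusive.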
4) Next, the different protocols are handled: the complete packet is copied and the copy is inserted into the PacketQueue transmit queue. From there the path is the same as the IRP completion in step 1: IoCompleteRequest completes the pending IRP and returns the data to the application.
The size of ETH_HEADER is 6 + 6 + 2 = 14 bytes.

    #define MACADDR_SIZE 6
    typedef unsigned char MACADDR[MACADDR_SIZE];

    typedef struct
    {
        MACADDR dest;   /* destination eth addr */
        MACADDR src;    /* source ether addr */
        USHORT proto;   /* packet type ID field */
    } ETH_HEADER, *PETH_HEADER;

The proto values handled here are:

    #define NDIS_ETH_TYPE_IPV4 0x0800 // IPV4
    #define NDIS_ETH_TYPE_ARP  0x0806 // ARP
    #define NDIS_ETH_TYPE_IPV6 0x86dd // IPV6
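To show how the proto field drives the dispatch, here is a small user-mode sketch that reads the EtherType from bytes 12 and 13 of a raw frame and classifies it. The `frame_kind` enum and `classify_frame` function are hypothetical names used for illustration; only the three EtherType values come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN     14u     /* 6 + 6 + 2 bytes */
#define NDIS_ETH_TYPE_IPV4 0x0800
#define NDIS_ETH_TYPE_ARP  0x0806
#define NDIS_ETH_TYPE_IPV6 0x86dd

enum frame_kind { FRAME_TOO_SHORT, FRAME_IPV4, FRAME_ARP, FRAME_IPV6, FRAME_OTHER };

/* Classifies a raw Ethernet frame by its EtherType field. */
static enum frame_kind classify_frame(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETH_HEADER_LEN)
        return FRAME_TOO_SHORT;
    /* Bytes 12..13 hold the EtherType in network (big-endian) byte order. */
    uint16_t proto = (uint16_t)((frame[12] << 8) | frame[13]);
    switch (proto) {
    case NDIS_ETH_TYPE_IPV4: return FRAME_IPV4;
    case NDIS_ETH_TYPE_ARP:  return FRAME_ARP;
    case NDIS_ETH_TYPE_IPV6: return FRAME_IPV6;
    default:                 return FRAME_OTHER;
    }
}
```

Reading the two bytes explicitly avoids depending on host byte order, which is the same job ntohs does in the driver code.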
Where is the DHCP flag set? In the TAP_WIN_IOCTL_CONFIG_DHCP_MASQ handler:

    case TAP_WIN_IOCTL_CONFIG_DHCP_MASQ:
        {
            if(inBufLength >= sizeof(IPADDR)*4)
            {
                adapter->m_dhcp_enabled = FALSE;
                adapter->m_dhcp_server_arp = FALSE;
                adapter->m_dhcp_user_supplied_options_buffer_len = 0;

                // Adapter IP addr / netmask
                adapter->m_dhcp_addr =
                    ((IPADDR*) (Irp->AssociatedIrp.SystemBuffer))[0];
                adapter->m_dhcp_netmask =
                    ((IPADDR*) (Irp->AssociatedIrp.SystemBuffer))[1];

                // IP addr of DHCP masq server
                adapter->m_dhcp_server_ip =
                    ((IPADDR*) (Irp->AssociatedIrp.SystemBuffer))[2];

                // Lease time in seconds
                adapter->m_dhcp_lease_time =
                    ((IPADDR*) (Irp->AssociatedIrp.SystemBuffer))[3];

                GenerateRelatedMAC(
                    adapter->m_dhcp_server_mac,
                    adapter->CurrentAddress,
                    2
                    );

                adapter->m_dhcp_enabled = TRUE;
                adapter->m_dhcp_server_arp = TRUE;

                CheckIfDhcpAndTunMode (adapter);

                Irp->IoStatus.Information = 1; // Simple boolean value

                DEBUGP (("[Boom] Configured DHCP MASQ.\n"));
            }
            else
            {
                NOTE_ERROR();
                Irp->IoStatus.Status = ntStatus = STATUS_INVALID_PARAMETER;
            }
        }
        break;
The copy code in AdapterSendNetBufferLists:
```
// Allocate TAP packet memory
tapPacket = (PTAP_PACKET )NdisAllocateMemoryWithTagPriority(
    Adapter->MiniportAdapterHandle,
    TAP_PACKET_SIZE (packetLength+addHeaderSize),
    TAP_PACKET_TAG,
    NormalPoolPriority
    );

// Extract the buffer data
packetData = NdisGetDataBuffer(NetBuffer,packetLength,tapPacket->m_Data+addHeaderSize,1,0);

// Copy the data to tapPacket->m_Data + addHeaderSize, leaving addHeaderSize bytes reserved in front
if(packetData != (tapPacket->m_Data+addHeaderSize))
{
    // Packet data was contiguous and not yet copied to m_Data.
    NdisMoveMemory(tapPacket->m_Data+addHeaderSize,packetData,packetLength);
}

// Fill in the reserved addHeaderSize bytes
if(addHeaderSize > 0)
{
    // Add an 802.1Q header between the ethernet header and the payload
    NdisMoveMemory(tapPacket->m_Data,tapPacket->m_Data+addHeaderSize,ETHERNET_HEADER_SIZE-2);
    PETH_HEADER header = (PETH_HEADER)tapPacket->m_Data;
    PETH_8021Q_HEADER tag = (PETH_8021Q_HEADER)(header+1);
    header->proto = htons(0x8100);
    USHORT tagValue = 0;
    tagValue |= packetPriority.TagHeader.UserPriority<<13;
    tagValue |= packetPriority.TagHeader.VlanId & 0xFFF;
    tag->Tag = tagValue;
    packetLength += addHeaderSize;
}

// DHCP handling walks from the link layer (ETH_HEADER) through the network
// layer (IPHDR) and the transport layer (UDPHDR) to DHCP
......
const ETH_HEADER *eth = (ETH_HEADER *) tapPacket->m_Data;
const IPHDR *ip = (IPHDR *) (tapPacket->m_Data + sizeof (ETH_HEADER));
const UDPHDR *udp = (UDPHDR *) (tapPacket->m_Data + sizeof (ETH_HEADER) + sizeof (IPHDR));
......
else if (packetLength >= sizeof (ETH_HEADER) + sizeof (IPHDR) + sizeof (UDPHDR) + sizeof (DHCP)
    && eth->proto == htons (NDIS_ETH_TYPE_IPV4)
    && ip->version_len == 0x45 // IPv4, 20 byte header
    && ip->protocol == IPPROTO_UDP
    && udp->dest == htons (BOOTPS_PORT))
......

// First determine the protocol from the Ethernet proto field
ETH_HEADER *e;
e = (ETH_HEADER *) tapPacket->m_Data;

switch (ntohs (e->proto))
{
// ARP handling
case NDIS_ETH_TYPE_ARP:
    if (packetLength != sizeof (ARP_PACKET))
    {
        goto no_queue;
    }

    ProcessARP (
        Adapter,
        (PARP_PACKET) tapPacket->m_Data,
        Adapter->m_localIP,
        Adapter->m_remoteNetwork,
        Adapter->m_remoteNetmask,
        Adapter->m_TapToUser.dest
        );
    ......

// IPv4/IPv6 handling
case NDIS_ETH_TYPE_IPV4:
    // Make sure that packet is large enough to be IPv4.
    if (packetLength < (ETHERNET_HEADER_SIZE + IP_HEADER_SIZE))
    {
        goto no_queue;
    }

    // Only accept directed packets, not broadcasts.
    if (memcmp (e, &Adapter->m_TapToUser, ETHERNET_HEADER_SIZE))
    {
        goto no_queue;
    }

    // Packet looks like IPv4, queue it. :-)
    tapPacket->m_SizeFlags |= TP_TUN;
    break;

case NDIS_ETH_TYPE_IPV6:
    // Make sure that packet is large enough to be IPv6.
    if (packetLength < (ETHERNET_HEADER_SIZE + IPV6_HEADER_SIZE))
    {
        goto no_queue;
    }

    // Broadcasts and multicasts are handled specially
    // (to be implemented)

    // Neighbor discovery packets to fe80::8 are special
    // OpenVPN sets this next-hop to signal "handled by tapdrv"
    if ( HandleIPv6NeighborDiscovery(Adapter,tapPacket->m_Data, packetLength) )
    {
        goto no_queue;
    }

    // Packet looks like IPv6, queue it. :-)
    tapPacket->m_SizeFlags |= TP_TUN;
}
```
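The 802.1Q tag value built in the excerpt above packs a 3-bit priority and a 12-bit VLAN ID into one 16-bit field. The following user-mode sketch shows that packing and an explicit big-endian serialization. The helper names `make_vlan_tci` and `tci_to_wire` are hypothetical, and the sketch leaves the DEI bit (bit 12) at zero as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Packs priority (PCP, 3 bits) and VLAN ID (12 bits) into a Tag Control
 * Information value. Bit 12 (DEI) is left at zero in this sketch. */
static uint16_t make_vlan_tci(unsigned priority, unsigned vlan_id)
{
    return (uint16_t)(((priority & 0x7u) << 13) | (vlan_id & 0x0FFFu));
}

/* Writes the TCI in network (big-endian) byte order, as it appears on the wire. */
static void tci_to_wire(uint16_t tci, uint8_t out[2])
{
    assert(out != 0);
    out[0] = (uint8_t)(tci >> 8);
    out[1] = (uint8_t)(tci & 0xFFu);
}
```

For example, priority 5 with VLAN ID 100 gives 0xA064, which is written as the bytes 0xA0 0x64 after the 0x8100 EtherType.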
6) IPv6 handling checks for packets addressed to the link-local unicast address fe80::8 or to its solicited-node multicast address ff02::1:ff00:8. Such packets are not passed up to the application; they typically belong to address autoconfiguration, neighbor discovery, and the like.

    static IPV6ADDR IPV6_NS_TARGET_MCAST =
        { 0xff, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
          0x00, 0x00, 0x00, 0x01, 0xff, 0x00, 0x00, 0x08 };
    static IPV6ADDR IPV6_NS_TARGET_UNICAST =
        { 0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
          0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08 };

    if ( memcmp( ipv6->daddr, IPV6_NS_TARGET_MCAST, sizeof(IPV6ADDR) ) != 0 &&
         memcmp( ipv6->daddr, IPV6_NS_TARGET_UNICAST, sizeof(IPV6ADDR) ) != 0 )
    {
        return FALSE; // wrong target address
    }

    // ICMPv6 type+code must be 135/0 for NS
    if ( icmpv6_ns->type != ICMPV6_TYPE_NS ||
         icmpv6_ns->code != ICMPV6_CODE_0 )
    {
        return FALSE; // wrong ICMPv6 type
    }

    // The checksum must be computed and filled in
    icmpv6_csum = icmpv6_checksum (
        (UCHAR*) &(na->icmpv6),
        icmpv6_len,
        na->ipv6.saddr,
        na->ipv6.daddr
        );

Finally, tapProcessSendPacketQueue is called to complete the pending read.
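The checksum mentioned in step 6 is the standard Internet checksum computed over an IPv6 pseudo-header followed by the ICMPv6 message. The sketch below is a user-mode illustration of that calculation, not the driver's own routine; `icmpv6_checksum_sketch` and `sum16` are hypothetical names, and the checksum field inside the message is assumed to be zero when the checksum is first computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds a byte range to a running one's-complement sum, 16 bits at a time.
 * An odd trailing byte is padded with zero, so only the last range summed
 * may have an odd length. */
static uint32_t sum16(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);
    return sum;
}

/* Internet checksum over the IPv6 pseudo-header plus the ICMPv6 message. */
static uint16_t icmpv6_checksum_sketch(const uint8_t src[16], const uint8_t dst[16],
                                       const uint8_t *msg, size_t len)
{
    assert(msg != NULL && len <= 0xFFFF);
    const uint8_t pseudo_tail[8] = {
        0, 0, (uint8_t)(len >> 8), (uint8_t)len,  /* 32-bit upper-layer length */
        0, 0, 0, 58                               /* 3 zero bytes + next header (ICMPv6) */
    };
    uint32_t sum = 0;
    sum = sum16(sum, src, 16);
    sum = sum16(sum, dst, 16);
    sum = sum16(sum, pseudo_tail, sizeof pseudo_tail);
    sum = sum16(sum, msg, len);
    while (sum >> 16)                             /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}
```

A handy self-check: after the computed value is stored big-endian in bytes 2 and 3 of the message, running the function again over the same message returns 0.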
7) Once the application has received a packet and proxied it, how is the proxy's reply (recv) handled? That is the job of the write path: dispatchTable[IRP_MJ_WRITE] = TapDeviceWrite;

    ............
    // Get the packet passed down from the application
    unsigned char* packetBuffer = (unsigned char *) Irp->AssociatedIrp.SystemBuffer;
    ULONG packetLength = irpSp->Parameters.Write.Length;
    PVOID packetPriority = 0;

    DUMP_PACKET ("IRP_MJ_WRITE ETH", packetBuffer, packetLength);

    // 802.1Q handling
    packetPriority = TapStrip8021Q(&packetBuffer, &packetLength);

    // Send the packet to the virtual adapter
    ntStatus = TapSharedSendPacket(adapter,Irp,packetBuffer,packetLength,packetPriority,NULL,0);
8) TapSharedSendPacket

    // Packets shorter than 60 bytes are copied into a padded buffer, which an MDL maps into the NetBufferList
    // Copy packet data to flat buffer.
    ......
    if(PrefixLength > 0)
    {
        NdisMoveMemory(allocBuffer, PrefixData, PrefixLength);
    }
    NdisMoveMemory (allocBuffer + PrefixLength, PacketBuffer, PacketLength);
    NdisZeroMemory(allocBuffer + fullLength, paddedLength - fullLength);
    ......
    // Mark the NBL as injected
    TAP_RX_NBL_FLAGS_CLEAR_ALL(netBufferList);
    TAP_RX_NBL_FLAG_SET(netBufferList,TAP_RX_NBL_FLAGS_IS_INJECTED);

    // Mark the IRP pending
    IoMarkIrpPending(Irp);
    IoSetCancelRoutine(Irp,NULL);

    // Stash IRP pointer in NBL MiniportReserved[0] field.
    netBufferList->MiniportReserved[0] = Irp;
    netBufferList->MiniportReserved[1] = NULL;

    NET_BUFFER_LIST_INFO(netBufferList, Ieee8021QNetBufferListInfo) = PacketPriority;

    // Increment in-flight receive NBL count.
    nblCount = NdisInterlockedIncrement(&Adapter->ReceiveNblInFlightCount);
    ASSERT(nblCount > 0 );

    //
    // Indicate the packet
    // -------------------
    // This NBL contains the complete packet including Ethernet header and payload.
    //
    // Indicate the complete packet to NDIS
    NdisMIndicateReceiveNetBufferLists(
        Adapter->MiniportAdapterHandle,
        netBufferList,
        NDIS_DEFAULT_PORT_NUMBER,
        1, // NumberOfNetBufferLists
        0  // ReceiveFlags: none set
        );

    // Note that the returned status is STATUS_PENDING
    return STATUS_PENDING;
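The copy-and-pad step at the top of that excerpt can be reproduced in user mode. This is a minimal sketch assuming a 60-byte minimum frame length (Ethernet minimum without the 4-byte FCS); the function `build_padded_frame` and the constant `MIN_ETHERNET_FRAME` are hypothetical names, not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_ETHERNET_FRAME 60u  /* assumed minimum frame length without FCS */

/* Copies an optional prefix plus the packet into out, then zero-pads up to
 * the minimum frame length. Returns the bytes written, or 0 if out is too small. */
static size_t build_padded_frame(uint8_t *out, size_t out_cap,
                                 const uint8_t *prefix, size_t prefix_len,
                                 const uint8_t *packet, size_t packet_len)
{
    size_t full = prefix_len + packet_len;
    size_t padded = full < MIN_ETHERNET_FRAME ? MIN_ETHERNET_FRAME : full;

    assert(out != NULL && packet != NULL);
    if (padded > out_cap)
        return 0;
    if (prefix_len > 0)
        memcpy(out, prefix, prefix_len);        /* e.g. a synthesized Ethernet header */
    memcpy(out + prefix_len, packet, packet_len);
    memset(out + full, 0, padded - full);       /* zero the padding bytes */
    return padded;
}
```

A 14-byte header followed by a 20-byte payload yields a 60-byte frame whose last 26 bytes are zero.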
9) The tap-windows 5.0 receive path differs somewhat from 6.0: it indicates the packet directly:

    if (!l_Adapter->m_tun && ((l_IrpSp->Parameters.Write.Length) >= ETHERNET_HEADER_SIZE))
    {
        ......
        NdisMEthIndicateReceive
            (l_Adapter->m_MiniportAdapterHandle,
             (NDIS_HANDLE)l_Adapter,
             (unsigned char*)p_IRP->AssociatedIrp.SystemBuffer,
             ETHERNET_HEADER_SIZE,
             (unsigned char*)p_IRP->AssociatedIrp.SystemBuffer + ETHERNET_HEADER_SIZE,
             l_IrpSp->Parameters.Write.Length - ETHERNET_HEADER_SIZE,
             l_IrpSp->Parameters.Write.Length - ETHERNET_HEADER_SIZE);
        NdisMEthIndicateReceiveComplete(l_Adapter->m_MiniportAdapterHandle);
        p_IRP->IoStatus.Status = l_Status = STATUS_SUCCESS;
        ......
    }
    else if (l_Adapter->m_tun && ((l_IrpSp->Parameters.Write.Length) >= IP_HEADER_SIZE))
    {
        ......
        if (IPH_GET_VER(((IPHDR*)p_IRP->AssociatedIrp.SystemBuffer)->version_len) == 6)
        {
            p_UserToTap = &l_Adapter->m_UserToTap_IPv6;
        }
        ......
        NdisMEthIndicateReceive
            (l_Adapter->m_MiniportAdapterHandle,
             (NDIS_HANDLE)l_Adapter,
             (unsigned char*)p_UserToTap,
             sizeof(ETH_HEADER),
             (unsigned char*)p_IRP->AssociatedIrp.SystemBuffer,
             l_IrpSp->Parameters.Write.Length,
             l_IrpSp->Parameters.Write.Length);
        NdisMEthIndicateReceiveComplete(l_Adapter->m_MiniportAdapterHandle);
        p_IRP->IoStatus.Status = l_Status = STATUS_SUCCESS;
        ......
    }

NdisMIndicateReceiveNetBufferLists is available only in NDIS 6.0 and later. The process is not complicated, but I ran into plenty of trouble when I wrote my own NDIS driver from scratch, so for later projects of this kind I simply used the open-source drivers.
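IV. WireGuard NT:
This section does not cover how WireGuard NT itself builds its tunnel. Instead it uses a hypothetical proxy scheme as an example:
- Initialize the WireGuard driver and register the miniport to create a virtual network adapter.
- The application configures the routing table and the virtual adapter so that the chosen IP addresses are routed to the virtual adapter.
- NDIS captures packets, and WSK in the kernel proxies the connection in both directions.
The WireGuard NT walkthrough follows the main line of SendNetBufferLists/ReturnNetBufferLists and WSK UDP handling.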
Packet flow implementation:
Application layer:
1) Two phases:
The first goes through WireGuardSetConfiguration --> DeviceIoControl(WG_IOCTL_SET) and sends WIREGUARD_INTERFACE_HAS_PRIVATE_KEY; it is not discussed here.
The second goes through WireGuardSetAdapterState --> DeviceIoControl(WG_IOCTL_SET_ADAPTER_STATE) and sends WIREGUARD_ADAPTER_STATE_UP.

    // WG_IOCTL_INTERFACE_HAS_LISTEN_PORT initializes WSK
    // WIREGUARD_INTERFACE_HAS_PRIVATE_KEY initializes the key
    // WireGuard must be given the key first
    struct
    {
        WIREGUARD_INTERFACE Interface;
        WIREGUARD_PEER DemoServer;
        WIREGUARD_ALLOWED_IP AllV4;
    } Config = { .Interface = { .Flags = WIREGUARD_INTERFACE_HAS_PRIVATE_KEY, .PeersCount = 1 },
                 .DemoServer = { .Flags = WIREGUARD_PEER_HAS_PUBLIC_KEY | WIREGUARD_PEER_HAS_ENDPOINT,
                                 .AllowedIPsCount = 1 },
                 .AllV4 = { .AddressFamily = AF_INET } };

    // Initialize by sending the key
    WIREGUARD_SET_CONFIGURATION_FUNC WireGuardSetConfiguration;
    _Use_decl_annotations_
    BOOL WINAPI
    WireGuardSetConfiguration(WIREGUARD_ADAPTER *Adapter, const WIREGUARD_INTERFACE *Config, DWORD Bytes)
    {
        HANDLE ControlFile = AdapterOpenDeviceObject(Adapter);
        if (ControlFile == INVALID_HANDLE_VALUE)
            return FALSE;
        if (!DeviceIoControl(ControlFile, WG_IOCTL_SET, NULL, 0, (VOID *)Config, Bytes, &Bytes, NULL))
        {
            DWORD LastError = GetLastError();
            CloseHandle(ControlFile);
            SetLastError(LastError);
            return FALSE;
        }
        CloseHandle(ControlFile);
        return TRUE;
    }

    // WireGuardSetAdapterState passes WIREGUARD_ADAPTER_STATE_UP to the kernel
    Log(WIREGUARD_LOG_INFO, L"Setting configuration and adapter up");
    if (!WireGuardSetConfiguration(Adapter, &Config.Interface, sizeof(Config)) ||
        !WireGuardSetAdapterState(Adapter, WIREGUARD_ADAPTER_STATE_UP))
    {
        LastError = LogError(L"Failed to set configuration and adapter up", GetLastError());
        goto cleanupAdapter;
    }
Kernel layer:
1) Leaving key initialization aside, the driver receives WIREGUARD_ADAPTER_STATE_UP (1), which equals WG_IOCTL_ADAPTER_STATE_UP.

    // Kernel handling of WIREGUARD_ADAPTER_STATE_UP
    case WG_IOCTL_SET_ADAPTER_STATE:
        AdapterState(DeviceObject, Irp);

    AdapterState
    {
        case WG_IOCTL_ADAPTER_STATE_UP:
            Irp->IoStatus.Status = Up(Wg);
    }

    WIREGUARD_ADAPTER_STATE_UP --> Up(Wg) --> SocketInit(Wg, Wg->IncomingPort);

    // As noted above, sending WG_IOCTL_INTERFACE_HAS_LISTEN_PORT also works; it runs the following code
    NTSTATUS Status;
    if (IoctlInterface.Flags & WG_IOCTL_INTERFACE_HAS_LISTEN_PORT)
    {
        Status = SetListenPort(Wg, IoctlInterface.ListenPort);
        if (!NT_SUCCESS(Status))
            goto cleanupLock;
    }

    WG_IOCTL_INTERFACE_HAS_LISTEN_PORT --> SetListenPort() --> SocketInit()
2) SocketInit initializes WSK and creates the UDP sockets. WireGuard's encrypted tunnel transport is UDP-based, and Windows is no exception.

    // Initialize the fixed-size SendCtx lookaside list
    Status = ExInitializeLookasideListEx(
        &SocketSendCtxCache, NULL, NULL, NonPagedPool, 0, sizeof(SOCKET_SEND_CTX), MEMORY_TAG, 0);

    // Initialize and register the WSK_CLIENT_NPI
    WSK_CLIENT_NPI WskClientNpi = { .Dispatch = &WskAppDispatchV1 };
    Status = WskRegister(&WskClientNpi, &WskRegistration);

    // After registering, capture the WSK provider NPI (Network Programming Interface).
    // WSK_INFINITE_WAIT waits until the WSK subsystem is ready.
    Status = WskCaptureProviderNPI(&WskRegistration, WSK_INFINITE_WAIT, &WskProviderNpi);
    if (!NT_SUCCESS(Status))
        goto cleanupWskRegister;

    // Once attached to the WSK subsystem, WSK_TRANSPORT_LIST_QUERY retrieves the list of available transports
    Status = WskProviderNpi.Dispatch->WskControlClient(
        WskProviderNpi.Client, WSK_TRANSPORT_LIST_QUERY, 0, NULL, WskTransportsSize, WskTransports, &WskTransportsSize, NULL);

    for (SIZE_T i = 0, n = WskTransportsSize / sizeof(*WskTransports); i < n; ++i)
    {
        if (WskTransports[i].SocketType == SOCK_DGRAM && WskTransports[i].Protocol == IPPROTO_UDP)
        {
            if (WskTransports[i].AddressFamily == AF_UNSPEC)
            {
                WskHasIpv4Transport = TRUE;
                WskHasIpv6Transport = TRUE;
            }
            else if (WskTransports[i].AddressFamily == AF_INET)
                WskHasIpv4Transport = TRUE;
            else if (WskTransports[i].AddressFamily == AF_INET6)
                WskHasIpv6Transport = TRUE;
        }
    }

    // Enable the event callback automatically on all sockets, using the WSK_EVENT_RECEIVE_FROM mask.
    WSK_EVENT_CALLBACK_CONTROL WskEventCallbackControl = { .NpiId = &NPI_WSK_INTERFACE_ID,
                                                           .EventMask = WSK_EVENT_RECEIVE_FROM };
    Status = WskProviderNpi.Dispatch->WskControlClient(
        WskProviderNpi.Client,
        WSK_SET_STATIC_EVENT_CALLBACKS,
        sizeof(WskEventCallbackControl),
        &WskEventCallbackControl,
        0,
        NULL,
        NULL,
        NULL);

    Status = NotifyRouteChange2(AF_INET, RouteNotification, &RoutingGenerationV4, FALSE, &RouteNotifierV4);
    if (!NT_SUCCESS(Status))
        goto cleanupWskProviderNPI;
    Status = NotifyRouteChange2(AF_INET6, RouteNotification, &RoutingGenerationV6, FALSE, &RouteNotifierV6);
    if (!NT_SUCCESS(Status))
        goto cleanupRouteNotifierV4;

    // For parameter details, see MSDN: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wsk/nc-wsk-pfn_wsk_control_client
3) As mentioned above, initialization uses the WSK_EVENT_RECEIVE_FROM event mask, whose corresponding callback is WskReceiveFromEvent.

    // If the transport list supports IPv4
    if (WskHasIpv4Transport)
    {
        Status = CreateAndBindSocket(Wg, (SOCKADDR *)&Sa4, &New4);
        if (!NT_SUCCESS(Status))
            goto out;
    }
    // If the transport list supports IPv6
    if (WskHasIpv6Transport)
    {
        Sa6.sin6_port = Sa4.sin_port;
        Status = CreateAndBindSocket(Wg, (SOCKADDR *)&Sa6, &New6);
        if (!NT_SUCCESS(Status))
        {
            CloseSocket(New4);
            New4 = NULL;
            if (Status == STATUS_ADDRESS_ALREADY_EXISTS && !Port && Retries++ < 100)
                goto retry;
            goto out;
        }
    }
4) CreateAndBindSocket wraps WSK socket creation and bind.

    // The address defaults to 0.0.0.0 with the port passed in from the configuration
    // Initialize the event and IRP; the IRP completion routine signals the event with KeSetEvent
    KeInitializeEvent(&Done, SynchronizationEvent, FALSE);
    IoInitializeIrp(&I.Irp, sizeof(I.IrpBuffer), 1);
    IoSetCompletionRoutine(&I.Irp, RaiseEventOnComplete, &Done, TRUE, TRUE, TRUE);

    // The client dispatch table handed to the provider sets WskReceiveFromEvent to Receive, the UDP receive callback
    static CONST WSK_CLIENT_DATAGRAM_DISPATCH WskClientDatagramDispatch = { .WskReceiveFromEvent = Receive };

    // Create a new UDP socket: SOCK_DGRAM, IPPROTO_UDP
    Status = WskProviderNpi.Dispatch->WskSocket(
        WskProviderNpi.Client,
        Sa->sa_family,
        SOCK_DGRAM,
        IPPROTO_UDP,
        WSK_FLAG_DATAGRAM_SOCKET,
        Socket,
        &WskClientDatagramDispatch,
        Wg->SocketOwnerProcess,
        NULL,
        NULL,
        &I.Irp);

    ULONG True = TRUE;
    if (Sa->sa_family == AF_INET)
    {
        // IP_PKTINFO enables or disables returning packet information for IPv4 (as with WSARecvMsg / LPFN_WSARECVMSG)
        Status = SetSockOpt(Sock, IPPROTO_IP, IP_PKTINFO, &True, sizeof(True));
        if (!NT_SUCCESS(Status))
            goto cleanupSocket;
    }
    else if (Sa->sa_family == AF_INET6)
    {
        Status = SetSockOpt(Sock, IPPROTO_IPV6, IPV6_V6ONLY, &True, sizeof(True));
        if (!NT_SUCCESS(Status))
            goto cleanupSocket;
        Status = SetSockOpt(Sock, IPPROTO_IPV6, IPV6_PKTINFO, &True, sizeof(True));
        if (!NT_SUCCESS(Status))
            goto cleanupSocket;
    }
    Status = ((WSK_PROVIDER_DATAGRAM_DISPATCH *)Sock->Dispatch)->WskBind(Sock, Sa, 0, &I.Irp);
    Status = ((WSK_PROVIDER_DATAGRAM_DISPATCH *)Sock->Dispatch)->WskGetLocalAddress(Sock, Sa, &I.Irp);
5) After Up() runs SocketInit, DeviceStart calls PacketSendStagedPackets. This function is also called from SendNetBufferLists, and it handles the with-key and no-key cases separately.

    ......
    Irql = RcuReadLock();
    Keypair = NoiseKeypairGet(RcuDereference(NOISE_KEYPAIR, Peer->Keypairs.CurrentKeypair));
    RcuReadUnlock(Irql);
    if (!Keypair)
        goto outNokey;
    // Key available: produce data for the consumers
    PeerGet(Keypair->Entry.Peer);
    _Analysis_assume_(NET_BUFFER_LIST_FIRST_NB(Packets.Head)); /* Checked in SendNetBufferLists(). */
    NET_BUFFER_LIST_KEYPAIR(Packets.Head) = Keypair;
    PacketCreateData(Peer, Packets.Head);
    outNokey:
    // No key
    ......
    ......
    }
6) PacketCreateData:

    Ret = QueueEnqueuePerDeviceAndPeer(&Wg->EncryptQueue, &Peer->TxQueue, &Wg->EncryptThreads, First);
    if (Ret == STATUS_PIPE_BROKEN)
    {
        // Failure handling
        QueueEnqueuePerPeer(&Peer->Device->TxQueue, &Peer->TxSerialEntry, First, PACKET_STATE_DEAD);
        MulticoreWorkQueueBump(&Wg->EncryptThreads);
    }
    if (NT_SUCCESS(Ret) || Ret == STATUS_PIPE_BROKEN)
        return;

7) QueueEnqueuePerDeviceAndPeer calls QueueEnqueuePerDevice to insert the NBL into the encryption queue; the PacketEncryptWorker threads then encrypt the NBLs.

    if (!QueueInsertPerPeer(PeerQueue, Nbl))
        return STATUS_BUFFER_TOO_SMALL;
    /* Then we queue it up in the device queue, which consumes the
     * packet as soon as it can. */
    // "Consuming" the queued packet here means encrypting it
    if (!QueueEnqueuePerDevice(DeviceQueue, DeviceThreads, Nbl))
        return STATUS_PIPE_BROKEN;
8) The PacketEncryptWorker threads encrypt the NBLs with ChaCha20-Poly1305; see RFC 7539: https://www.rfc-editor.org/rfc/rfc7539.

    ......
    EncryptPacket --> ChaCha20Poly1305EncryptMdl
    // If encryption fails, the packet state changes
    State = PACKET_STATE_DEAD;
    ......
    QueueEnqueuePerPeer(&Peer->Device->TxQueue, &Peer->TxSerialEntry, First, State);
    ProcessPerPeerWork(&Wg->TxQueue);

    ProcessPerPeerWork
    {
        PEER_SERIAL_ENTRY *Entry;
        while ((Entry = PeerSerialDequeue(WorkQueue)) != NULL)
            PeerSerialMaybeRetire(
                WorkQueue,
                Entry,
                PacketPeerTxWork(CONTAINING_RECORD(Entry, WG_PEER, TxSerialEntry), PEER_XMIT_PACKETS_PER_ROUND));
    }

    PacketPeerTxWork
    {
        // PacketCreateDataDone does the sending
        if (State == PACKET_STATE_CRYPTED)
            PacketCreateDataDone(Peer, First);
        else
            FreeSendNetBufferList(Peer->Device, First, 0);
    }

    PacketCreateDataDone
    {
        if (NT_SUCCESS(SocketSendNblsToPeer(Peer, First, &IsKeepalive)) && !IsKeepalive)
            TimersDataSent(Peer);
    }
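To give a feel for the cipher named in step 8, the sketch below implements only the ChaCha20 quarter round as defined in RFC 7539, section 2.1. It is an illustration of the basic building block, not WireGuard NT's implementation and not a complete cipher; the function names are chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotates a 32-bit value left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

/* One ChaCha20 quarter round on four state words (RFC 7539, section 2.1). */
static void chacha20_quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    assert(a && b && c && d);
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

The full block function applies this quarter round to columns and diagonals of a 4x4 word state for 20 rounds; the test vector in RFC 7539 section 2.1.1 is a convenient check for the function above.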
9) SocketSendNblsToPeer handles sending through WSK.

    PFN_WSK_SEND_MESSAGES WskSendMessages =
        ((WSK_PROVIDER_DATAGRAM_DISPATCH *)Socket->Sock->Dispatch)->WskSendMessages;
    #if NTDDI_VERSION == NTDDI_WIN7
    if (NoWskSendMessages)
        WskSendMessages = PolyfilledWskSendMessages;
    #endif
    Status = WskSendMessages(
        Socket->Sock,
        FirstWskBuf,
        0,
        (PSOCKADDR)&Peer->Endpoint.Addr,
        (ULONG)WSA_CMSGDATA_ALIGN(Peer->Endpoint.Cmsg.cmsg_len) + WSA_CMSG_SPACE(0),
        &Peer->Endpoint.Cmsg,
        &Ctx->Irp);
10) SendNetBufferLists is the producer here, that is, the source of captured packets. Skipping the details, it still ends up calling PacketSendStagedPackets (step 5).

    ............
    while (NetBufferListQueueLength(&Peer->StagedPacketQueue) > MAX_STAGED_PACKETS)
    {
        NET_BUFFER_LIST *NblToDiscard = NetBufferListDequeue(&Peer->StagedPacketQueue);
        _Analysis_assume_(NblToDiscard); /* NetBufferListQueueLength() > MAX_STAGED_PACKETS implies
                                            NetBufferListDequeue() returns a NBL. */
        NET_BUFFER_LIST_STATUS(NblToDiscard) = NDIS_STATUS_FAILURE;
        ++Wg->Statistics.ifOutDiscards;
        FreeSendNetBufferList(Wg, NblToDiscard, CompleteFlags | NDIS_SEND_COMPLETE_FLAGS_DISPATCH_LEVEL);
    }
    NetBufferListEnqueue(&Peer->StagedPacketQueue, Nbl);
    KeReleaseSpinLock(&Peer->StagedPacketQueue.Lock, Irql);

    PacketSendStagedPackets(Peer);
    ............
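The staging logic in step 10 is a bounded queue that discards its oldest entries when it grows past a limit. The following single-threaded sketch mimics that behavior with integers standing in for NBLs. The type `staged_queue`, its functions, and the limit of 4 are all hypothetical; the real driver protects the queue with a spin lock and uses its own limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STAGED_PACKETS 4u                 /* illustrative value only */
#define QUEUE_CAP (MAX_STAGED_PACKETS + 1u)   /* one slot above the limit */

struct staged_queue {
    int    items[QUEUE_CAP];  /* ring storage; ints stand in for NBLs */
    size_t head;              /* index of the oldest item */
    size_t len;               /* number of queued items */
    size_t discards;          /* counterpart of ifOutDiscards */
};

/* Drops the oldest items while the queue is over the limit, then appends. */
static void staged_enqueue(struct staged_queue *q, int packet)
{
    assert(q != NULL);
    while (q->len > MAX_STAGED_PACKETS) {
        q->head = (q->head + 1) % QUEUE_CAP;
        q->len--;
        q->discards++;
    }
    q->items[(q->head + q->len) % QUEUE_CAP] = packet;
    q->len++;
}

/* Removes the oldest item; returns false when the queue is empty. */
static bool staged_dequeue(struct staged_queue *q, int *out)
{
    assert(q != NULL && out != NULL);
    if (q->len == 0)
        return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->len--;
    return true;
}
```

Because the length check runs before the append, the queue can briefly hold one item more than the limit, which matches the order of operations in the excerpt.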
11) ReturnNetBufferLists is fairly simple: it hands the received datagram indications back to WSK.

    for (NET_BUFFER_LIST *Nbl = First, *NextNbl; Nbl; Nbl = NextNbl)
    {
        NextNbl = NET_BUFFER_LIST_NEXT_NBL(Nbl);
        NET_BUFFER_LIST_NEXT_NBL(Nbl) = NULL;
        WSK_DATAGRAM_INDICATION *DatagramIndication = NET_BUFFER_LIST_DATAGRAM_INDICATION(Nbl);
        SOCKET *Socket = (SOCKET *)DatagramIndication->Next;
        DatagramIndication->Next = NULL;
        ((WSK_PROVIDER_DATAGRAM_DISPATCH *)Socket->Sock->Dispatch)->WskRelease(Socket->Sock, DatagramIndication);
        MemFreeNetBufferList(Nbl);
        ExReleaseRundownProtection(&Socket->ItemsInFlight);
    }
12) Reply packets go through PacketConsumeData --> the PacketDecryptWorker threads for decryption, and are then handed to PacketPeerRxWork, which calls NdisMIndicateReceiveNetBufferLists.

    // The receive side indicates the NBLs with NdisMIndicateReceiveNetBufferLists:
    if (First)
        NdisMIndicateReceiveNetBufferLists(First->SourceHandle, First, NDIS_DEFAULT_PORT_NUMBER, NumNbls, 0);
13) On RCU usage: the code pairs PeerGet() with PeerPut(Peer), and Put invokes the release function KrefRelease() once the last reference is dropped. Separately, the worker threads are set up at initialization, in .InitializeHandlerEx = InitializeEx:

    // Encryption threads
    Status = MulticoreWorkQueueInit(&Wg->EncryptThreads, PacketEncryptWorker);
    if (!NT_SUCCESS(Status))
        goto cleanupHandshakeRxQueue;

    // Decryption threads
    Status = MulticoreWorkQueueInit(&Wg->DecryptThreads, PacketDecryptWorker);
    if (!NT_SUCCESS(Status))
        goto cleanupEncryptThreads;

    // On HANDSHAKE_TX_SEND, PacketSendHandshakeInitiation is called
    Status = MulticoreWorkQueueInit(&Wg->HandshakeTxThreads, PacketHandshakeTxWorker);
    if (!NT_SUCCESS(Status))
        goto cleanupDecryptThreads;

    PacketSendHandshakeInitiation
    {
        if (NoiseHandshakeCreateInitiation(&Packet, &Peer->Handshake))
        {
            CookieAddMacToPacket(&Packet, sizeof(Packet), Peer);
            TimersAnyAuthenticatedPacketTraversal(Peer);
            TimersAnyAuthenticatedPacketSent(Peer);
            WriteNoFence64(&Peer->LastSentHandshake, KeQueryInterruptTime());
            SocketSendBufferToPeer(Peer, &Packet, sizeof(Packet));
            TimersHandshakeInitiated(Peer);
        }
    }
    SocketSendBufferToPeer-->SocketResolvePeerEndpoint;

    // PacketHandshakeRxWorker handles WSK receive
    Status = MulticoreWorkQueueInit(&Wg->HandshakeRxThreads, PacketHandshakeRxWorker);
    if (!NT_SUCCESS(Status))
        goto cleanupHandshakeTxThreads;
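The PeerGet()/PeerPut() pairing from step 13 is a reference-counting pattern: each Get adds a reference, each Put drops one, and the release callback runs when the count reaches zero. The sketch below shows only that counting half in portable C11, under the assumption that it behaves like a conventional kref. All names (`kref_sketch`, `peer_sketch`, and so on) are hypothetical, and the RCU read-side protection used by the driver is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct kref_sketch { atomic_int count; };

typedef void (*release_fn)(struct kref_sketch *ref);

/* A new object starts with one reference held by its creator. */
static void kref_init(struct kref_sketch *r) { atomic_init(&r->count, 1); }

static void kref_get(struct kref_sketch *r) { atomic_fetch_add(&r->count, 1); }

/* Drops one reference; calls release and returns true when it was the last. */
static bool kref_put(struct kref_sketch *r, release_fn release)
{
    assert(r != 0 && release != 0);
    if (atomic_fetch_sub(&r->count, 1) == 1) {
        release(r);
        return true;
    }
    return false;
}

/* Stand-in for a peer object; ref must stay the first member. */
struct peer_sketch {
    struct kref_sketch ref;
    bool released;
};

static void peer_release(struct kref_sketch *ref)
{
    /* ref is the first member, so this cast recovers the containing peer. */
    struct peer_sketch *p = (struct peer_sketch *)ref;
    p->released = true;   /* a real driver would free resources here */
}

static struct peer_sketch *peer_get(struct peer_sketch *p)
{
    kref_get(&p->ref);
    return p;
}

static void peer_put(struct peer_sketch *p)
{
    kref_put(&p->ref, peer_release);
}
```

Every code path that takes a reference with `peer_get` must balance it with exactly one `peer_put`; the object is released only after the creator's initial reference is dropped as well.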