In this blog we’re going to cover troubleshooting PXE boot with Wireshark, using an existing capture of a PXE boot issue. PXE boot is the process of booting a network-connected device from the network using the Preboot eXecution Environment.
The client needs to be configured to boot from the network, either after failing to boot from other sources or, as is often the case, because it needs a rebuild and the build media is held on a network drive.
In this scenario, the PXE boot and build process had stopped working, so we needed to find out what the problem was.
Getting the capture
Since we couldn’t install Wireshark on the client or the server, the capture was performed on a third device. We knew the MAC addresses of the client machine and the Wireshark laptop, so after tracing them to their respective switch ports and setting up port mirroring, we were ready to start the capture.
1: Start Wireshark, choose (1) Capture and then Options, select (2) Promiscuous Mode for the capture interface, and make sure there is no capture filter. We could have entered the MAC address of the PXE client in the Capture Filter box, for example “ether host a1:b2:c3:d4:cf:0c”, but then I’d also include “ether host ff:ff:ff:ff:ff:ff” so as not to miss the broadcast element of the capture.
Click on the Start button to begin the capture.
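As a rough illustration of the capture filter mentioned in step 1, the Python sketch below builds the filter string for a client MAC address plus the broadcast address. The helper name build_capture_filter is hypothetical (it is not part of Wireshark); only the “ether host” primitive and the “or” operator are standard capture-filter syntax, and the MAC address is the example value from above.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def build_capture_filter(client_mac: str, include_broadcast: bool = True) -> str:
    """Return a capture filter string for frames to or from client_mac.

    Hypothetical helper for illustration; paste the result into the
    Capture Filter box if you decide to filter at capture time.
    """
    parts = [f"ether host {client_mac.lower()}"]
    if include_broadcast:
        # Keep broadcast frames so DHCP Discover/Offer traffic is not lost.
        parts.append(f"ether host {BROADCAST_MAC}")
    return " or ".join(parts)


capture_filter = build_capture_filter("a1:b2:c3:d4:cf:0c")
print(capture_filter)
```

Leaving the capture filter empty, as we did here, is the safer choice when you are not yet sure which hosts are involved.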
Applying display filters to a Wireshark capture
2: Once you’ve stopped the capture, save the file for future use. You can then start filtering with the display filter options. The simplest way is to right-click the desired field in the packet of interest: right-clicking the MAC address of the PXE client, highlighting “Apply as Filter” and then clicking “Selected” populates the display filter box at the top of the capture.
3: Amend the filter until it includes what’s needed. You can select other fields as above and add them using “...or Selected”, or you can type directly in the filter box.
4: To complete the filter (in the next image) I’ve added “or eth.dst == d4:81:d7:fd:d5:ee or dhcp”, although the expression “eth.addr == d4:81:d7:fd:d5:ee or dhcp” would have produced the same output.
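To see why the two expressions in step 4 select the same packets, here is a small Python model of the filter logic. It assumes the first half of the original filter was “eth.src == d4:81:d7:fd:d5:ee” (the result of applying the client's source MAC as a filter). The frames are made-up toy data, not packets from the capture, and the function names are hypothetical.

```python
CLIENT = "d4:81:d7:fd:d5:ee"

# Toy frames with invented addresses: source MAC, destination MAC, DHCP or not.
frames = [
    {"src": CLIENT, "dst": "ff:ff:ff:ff:ff:ff", "dhcp": True},               # Discover
    {"src": "00:11:22:33:44:55", "dst": "ff:ff:ff:ff:ff:ff", "dhcp": True},  # broadcast Offer
    {"src": "00:11:22:33:44:55", "dst": CLIENT, "dhcp": False},              # unicast to client
    {"src": "00:11:22:33:44:55", "dst": "66:77:88:99:aa:bb", "dhcp": False}, # unrelated
]


def src_or_dst_or_dhcp(frame: dict) -> bool:
    """Models: eth.src == CLIENT or eth.dst == CLIENT or dhcp"""
    return frame["src"] == CLIENT or frame["dst"] == CLIENT or frame["dhcp"]


def addr_or_dhcp(frame: dict) -> bool:
    """Models: eth.addr == CLIENT or dhcp (eth.addr matches source or destination)"""
    return CLIENT in (frame["src"], frame["dst"]) or frame["dhcp"]


selected_long = [i for i, f in enumerate(frames) if src_or_dst_or_dhcp(f)]
selected_short = [i for i, f in enumerate(frames) if addr_or_dhcp(f)]
print(selected_long, selected_short)
```

Note that the second toy frame is kept only because of the “dhcp” term, since neither of its MAC addresses belongs to the client.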
Examining the output of a Wireshark capture
5: If you don’t see the packet details in the Wireshark window, select View and make sure that Packet Details is selected in the drop-down. The output should appear as below.
The initial DHCP request from our client is packet 587, a DHCP Discover. Packet 588 is shown in the detail window below the packet list; if you look at its source and destination Ethernet addresses, neither is our PXE client, so we would have missed it without the “dhcp” option in the filter. The DHCP Discover and the responses are all sent using broadcasts.
In this DHCP Offer we can see the client IP address offered as well as the “Next server address”. Because the DHCP server (x.x.30.42, not shown) is on a different subnet, there is also a Relay agent IP address. Packet 589 contained the same information (including the DHCP server x.x.30.42), but the relay agent had a different address: x.x.31.3 instead of x.x.31.2. The “Next Server” IP address turned out to be the TFTP server offered in DHCP option 66.
6: Looking at packet 591 (a DHCP Offer from a different server), we can see that there is no Relay agent address because the DHCP server (x.x.31.29) is on the same subnet as the DHCP client.
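The “Your (client) IP address”, “Next server” and “Relay agent” values that Wireshark displays come from the fixed BOOTP header at the start of every DHCP message. The Python sketch below unpacks that header from raw bytes. The function name parse_bootp_header and the sample addresses (taken from documentation ranges) are made up for illustration and are not the values from the capture.

```python
import ipaddress
import struct

# Fixed-size BOOTP/DHCP header (236 bytes), network byte order:
# op, htype, hlen, hops, xid, secs, flags, ciaddr, yiaddr, siaddr, giaddr,
# chaddr, sname, file
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s"


def parse_bootp_header(payload: bytes) -> dict:
    """Extract the address fields discussed above from a BOOTP/DHCP payload."""
    size = struct.calcsize(BOOTP_FORMAT)
    (op, _htype, hlen, hops, _xid, _secs, _flags,
     _ciaddr, yiaddr, siaddr, giaddr, chaddr, _sname, _file) = struct.unpack(
        BOOTP_FORMAT, payload[:size])
    relay = str(ipaddress.IPv4Address(giaddr))
    return {
        "is_reply": op == 2,                                # 1 = request, 2 = reply
        "hops": hops,
        "your_ip": str(ipaddress.IPv4Address(yiaddr)),      # address being offered
        "next_server": str(ipaddress.IPv4Address(siaddr)),
        "relay_agent": relay,
        "relayed": relay != "0.0.0.0",                      # all zeros = no relay
        "client_mac": ":".join(f"{b:02x}" for b in chaddr[:hlen]),
    }


def ip(text: str) -> bytes:
    """Pack a dotted-quad string into 4 bytes."""
    return ipaddress.IPv4Address(text).packed


# Invented sample Offer that arrived through a relay agent.
sample_offer = struct.pack(
    BOOTP_FORMAT, 2, 1, 6, 1, 0x12345678, 0, 0x8000,
    ip("0.0.0.0"), ip("192.0.2.153"), ip("198.51.100.46"), ip("192.0.2.2"),
    bytes.fromhex("d481d7fdd5ee"), b"", b"")
offer = parse_bootp_header(sample_offer)
print(offer)
```

An Offer from a server on the client's own subnet, like packet 591, would show a relay agent address of 0.0.0.0 and “relayed” would be False.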
7: Packet 665 is the DHCP Request. We can see Option (54) DHCP Server Identifier x.x.30.42 as well as the requested address in Option (50), which was offered earlier by x.x.30.42.
There’s also an Option (55) Parameter request list which includes requests for Option (60), (66) and (67). Option (66) is requesting the TFTP server for the boot file named in Option (67).
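The options Wireshark lists in the DHCP Request are stored as simple code/length/value records after the fixed header and magic cookie. The Python sketch below walks such a record list and pulls out Options (50), (54) and (55). The function name parse_dhcp_options and the sample bytes are invented for illustration; the addresses are placeholders, not those in the capture.

```python
import ipaddress


def parse_dhcp_options(data: bytes) -> dict:
    """Parse DHCP options (the bytes after the magic cookie) into {code: value}."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:            # End option
            break
        if code == 0:              # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(data):     # truncated option, stop parsing
            break
        length = data[i + 1]
        options[code] = data[i + 2:i + 2 + length]
        i += 2 + length
    return options


# Invented DHCP Request options.
sample_request = (
    bytes([53, 1, 3])                     # (53) message type 3 = Request
    + bytes([50, 4, 192, 0, 2, 153])      # (50) requested IP address
    + bytes([54, 4, 192, 0, 2, 42])       # (54) DHCP server identifier
    + bytes([55, 5, 1, 3, 60, 66, 67])    # (55) parameter request list
    + bytes([255])                        # End
)
opts = parse_dhcp_options(sample_request)
requested_ip = str(ipaddress.IPv4Address(opts[50]))
server_id = str(ipaddress.IPv4Address(opts[54]))
parameter_request_list = list(opts[55])
print(requested_ip, server_id, parameter_request_list)
```

The parameter request list is only a list of option codes the client would like to receive; whether a server actually supplies (66) and (67) is visible in its Offer or ACK.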
8: Looking at the DHCP Ack in packet 666, it has confirmed the Requested IP address and DHCP server identifier, and it has given the Bootfile name and TFTP server.
9: The following ARP (Address Resolution Protocol) request (packet 667), though, is for x.x.31.29, and it’s from x.x.31.153, so it looks like our client has accepted the IP address offered by x.x.30.42 via the relays and is using it to communicate with the server at x.x.31.29.
10: After the ARP reply, the next packet (669) is a ProxyDHCP request (which uses UDP port 4011) directed to x.x.31.29. It’s a Boot Request, so the client is still looking for boot information. This packet won’t be seen by x.x.30.42 or by x.x.86.46, which is the expected PXE boot server, so why does our client choose the rogue server to boot from?
11: If we go back to packet 591, which is the DHCP Offer from x.x.31.29, we can see that option 60 is set to “PXEClient”.
The x.x.30.42 DHCP server didn’t have option 60 set, but then it was not running the PXE boot service; it provided the boot file location using options (66) and (67). Option 60 is only required if the DHCP server and the boot server are running on the same host, as is the case for x.x.31.29. It means the PXE client regards x.x.31.29 as a valid PXE boot server, but it needs to use port 4011 to connect (the bootp protocol would use the same ports as DHCP, but since they are already being used for DHCP, the bootp process needs to use another port).
So our client is trying to connect to the rogue server. The server hasn’t provided the bootfile information at this point.
12: The proxyDHCP ACK in packet 672 (1) provides the missing boot file location. It is followed (2) by a TFTP (Trivial File Transfer Protocol) request from the PXE client to the x.x.31.29 server for that file. The boot process would likely have continued, but we can see that the TFTP server returns a “File not found” error.
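For reference, the TFTP read request and the “File not found” reply are very small packets. The Python sketch below decodes both, using the opcodes and error codes defined in the TFTP specification (RFC 1350). The function name parse_tftp and the file name in the sample are hypothetical, not the boot file from the capture.

```python
import struct

# Error codes defined by the TFTP specification (RFC 1350).
TFTP_ERRORS = {
    0: "Not defined",
    1: "File not found",
    2: "Access violation",
    3: "Disk full or allocation exceeded",
    4: "Illegal TFTP operation",
    5: "Unknown transfer ID",
    6: "File already exists",
    7: "No such user",
}


def parse_tftp(payload: bytes) -> dict:
    """Decode a TFTP read request (opcode 1) or error packet (opcode 5)."""
    (opcode,) = struct.unpack("!H", payload[:2])
    if opcode == 1:
        # RRQ layout: filename, NUL, transfer mode, NUL
        filename, mode = payload[2:].split(b"\x00")[:2]
        return {"type": "RRQ", "filename": filename.decode("ascii"),
                "mode": mode.decode("ascii")}
    if opcode == 5:
        # ERROR layout: 2-byte error code, message string, NUL
        (code,) = struct.unpack("!H", payload[2:4])
        message = payload[4:].split(b"\x00")[0].decode("ascii", "replace")
        return {"type": "ERROR", "code": code,
                "meaning": TFTP_ERRORS.get(code, "Unknown"), "message": message}
    return {"type": "other", "opcode": opcode}


# Invented samples: a request for a hypothetical boot file and the error reply.
request = parse_tftp(b"\x00\x01example/bootfile.0\x00octet\x00")
error = parse_tftp(b"\x00\x05\x00\x01File not found\x00")
print(request)
print(error)
```

In Wireshark, the display filter “tftp.opcode == 5” is a quick way to jump straight to error packets like this one.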
The above scenario was repeatable. Each time, the DHCP Discover packet received a response from one of the relays first, followed by either the other relay or the rogue server. The relays offered an IP address; the rogue did not.
The following DHCP Request asked for the IP address offered by the relays, and once they had sent the DHCP ACK (acknowledgment), the PXE client started using that address to query the rogue server.
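The behaviour we observed can be summarised in a few lines of Python. This is only a simplified model of what this particular client appeared to do in the capture, not the selection logic from the PXE specification. The Offer class and the choose function are hypothetical names, and the addresses are the redacted ones used throughout this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    server: str                   # DHCP server identifier
    offered_ip: Optional[str]     # None when no address is offered
    vendor_class: Optional[str]   # option 60, if present


def choose(offers: List[Offer]) -> dict:
    """Model of the observed behaviour: take the address from the first offer
    that has one, but send the boot request to a server advertising PXEClient."""
    ip_offer = next((o for o in offers if o.offered_ip), None)
    pxe_offer = next((o for o in offers if o.vendor_class == "PXEClient"), None)
    return {
        "lease_from": ip_offer.server if ip_offer else None,
        "address": ip_offer.offered_ip if ip_offer else None,
        "proxy_dhcp_target": pxe_offer.server if pxe_offer else None,
        "proxy_dhcp_port": 4011 if pxe_offer else None,
    }


offers = [
    Offer(server="x.x.30.42", offered_ip="x.x.31.153", vendor_class=None),  # via relay
    Offer(server="x.x.31.29", offered_ip=None, vendor_class="PXEClient"),   # rogue
]
decision = choose(offers)
print(decision)
```

With the rogue offer removed from the list, the model has no proxyDHCP target, which matches the fix described later in this post.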
RFC 951 describes the bootstrap operation as consisting of two phases. Phase one is the IP address determination and boot file selection, and phase two is the transfer of the file.
It’s clear from the above packet captures that the PXE client can accept the IP address from one offer but either ignore other fields in the same packet or perhaps prefer options presented by other servers. It would make more sense to accept all the options from the same packet. In the end, the boot process failed because the file offered by the rogue server couldn’t be found on that same server.
As an aside, the Microsoft recommended method for booting from a PXE server that’s not on the same subnet as the client is to use the IP helper-address functionality available on many routers and avoid the use of DHCP options. See https://support.microsoft.com/en-gb/help/4471003/how-to-boot-from-a-pxe-server-on-a-different-network
To cut a long story short, a summary.
This blog post covered troubleshooting PXE Boot with Wireshark.
We started by opening Wireshark, choosing the interface and selecting the checkbox for promiscuous mode. For this capture, I chose not to use a capture filter and instead applied display filters once the capture had been saved.
The symptom of the fault was that PXE boot was failing. Given that the network appeared to be otherwise functional, we needed to look at what was happening with the traffic, and Wireshark was the tool used to do this.
After inspecting the traffic, we could see that an unexpected boot server was present on the network, and it didn’t have the files needed to complete the boot process. Removing the rogue server fixed the issue.