cb0f18d to
72ba56b
Compare
Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
72ba56b to
b604a8f
Compare
Courtney3141
requested changes
Mar 30, 2026
Contributor
Courtney3141
left a comment
I will see if I can tidy up the changes to the general echo server code, if you could address the minor comments on the Ethernet driver.
Also, if you could please add an issue for the lwip error message we get, so we can look into that later.
Once I am done with my changes, could you please review them? Then we can merge 👍
drivers/network/genet/ethernet.h
Outdated
#define ETH_HLEN (14)
#define VLAN_HLEN (4)
#define ETH_FCS_LEN (4)
/* Body(1500) + EH_SIZE(14) + VLANTAG(4) + BRCMTAG(6) + FCS(4) = 1528.
Contributor
Can you turn this comment into a sentence? I.e. MTU must be a multiple of 256, so we set ENET_PAD to 8. Also, is the 64 bit checksum working pad factored in here?
Contributor
Author
Are you referring to 32-bit CRC, which is included in ETH_FCS?
b604a8f to
1862749
Compare
This GENET driver is derived from Linux, U-Boot and RT-Thread source code due to the lack of public documentation. For simplicity, we use only the default ring (i.e. ring 16) for both Rx and Tx. Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
The rpi4 GENET hardware requires a pseudo header checksum calculated by software and a 64-byte configuration space prepended to each Rx/Tx packet, which means the actual payload is shifted to offset 64. For now, the pseudo checksum is implemented as a hack for this special case only. Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
Co-authored-by: Kurt Wu <rihui.wu@unsw.edu.au> Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
Signed-off-by: Terry Bai <tianyi.bai@unsw.edu.au>
1862749 to
7cfe7f4
Compare
The implementation mainly refers to Linux, U-Boot and RT-Thread source code due to the lack of documentation. Unlike Linux, which uses the second IRQ for the separate rings, we use only the default ring (i.e., ring 16) and the first IRQ for Rx/Tx status updates.
The GENET NIC supports only one checksum at a time, either network layer (e.g., IP) or transport layer (e.g., TCP/UDP), so the checksum of the other layer needs to be calculated in software. To enable hardware checksum offload, a 64-byte space needs to be prepended to each Rx/Tx packet, shifting the actual payload to byte offset 64. In addition, a pseudo header checksum needs to be calculated in software (at a constant time cost) and filled into the checksum field. A rough benchmark shows that offloading TCP/UDP checksum calculation to hardware saves around 8.5% total CPU utilisation.
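As a rough illustration of the software side of the offload, the sketch below sums an IPv4 pseudo header (source address, destination address, protocol, and L4 length) into a folded 16-bit one's-complement value, and shows how a frame pointer would be derived from a buffer with the 64-byte space in front. It is not the driver's code: the names `STATUS_SPACE_LEN`, `csum_fold`, `pseudo_header_sum` and `frame_start` are hypothetical, inputs are assumed to be in host byte order, and whether the hardware expects this value as-is or complemented is an assumption that would need checking against the reference drivers.

```c
#include <stdint.h>

/* Assumed from the description: 64 bytes are reserved in front of each packet. */
#define STATUS_SPACE_LEN 64u

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffffu) + (sum >> 16);
    }
    return (uint16_t)sum;
}

/*
 * Sum the IPv4 pseudo header as 16-bit words. The work is the same for every
 * packet regardless of payload size, which is why the cost is constant.
 * All arguments are in host byte order (an assumption of this sketch).
 */
static uint16_t pseudo_header_sum(uint32_t src_ip, uint32_t dst_ip,
                                  uint8_t protocol, uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += src_ip >> 16;
    sum += src_ip & 0xffffu;
    sum += dst_ip >> 16;
    sum += dst_ip & 0xffffu;
    sum += protocol;   /* zero byte followed by the protocol number */
    sum += l4_len;     /* TCP/UDP header plus payload length */

    return csum_fold(sum);
}

/* The Ethernet frame starts after the reserved space in the DMA buffer. */
static uint8_t *frame_start(uint8_t *buf)
{
    return buf + STATUS_SPACE_LEN;
}
```

For example, for 192.168.0.1 to 192.168.0.2 carrying a 20-byte TCP segment (protocol 6), `pseudo_header_sum(0xC0A80001u, 0xC0A80002u, 6, 20)` evaluates to `0x816E`.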
Unlike most other NICs, the driver needs to explicitly doorbell the device by updating prod_index of the Tx ring or cons_index of the Rx ring once there is some work to do. However, each doorbell takes over 300 cycles, so the hardware would likely have finished the work before the driver leaves the whole loop in handle_irq(), undermining the batching of packets.
The basic benchmark results: (The data has been added to the internal benchmark spreadsheets)
A tiny issue occurred during the full benchmark: "memp_malloc: out of memory in pool TCP_PCB" is printed after exactly 10 benchmarks, but this could be fixed in another commit later.
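To make the doorbell trade-off mentioned earlier more concrete, here is a minimal sketch contrasting a doorbell per packet with a single doorbell per batch. It models the device register as a plain struct field and counts simulated writes; `mock_tx_ring`, `ring_doorbell`, `tx_doorbell_each`, `tx_doorbell_once` and the 16-bit index wrap are all hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Tx ring; no real MMIO is performed. */
struct mock_tx_ring {
    uint32_t prod_index;       /* software copy of the producer index */
    uint32_t doorbell_reg;     /* stands in for the device-side prod_index */
    unsigned doorbell_writes;  /* number of simulated register writes */
};

/* Publish the current producer index to the (mock) device. */
static void ring_doorbell(struct mock_tx_ring *r)
{
    r->doorbell_reg = r->prod_index;
    r->doorbell_writes++;
}

/* Variant A: doorbell after every queued packet. */
static void tx_doorbell_each(struct mock_tx_ring *r, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        r->prod_index = (r->prod_index + 1) & 0xffffu; /* assumed 16-bit wrap */
        ring_doorbell(r);
    }
}

/* Variant B: queue the whole batch, then doorbell once. */
static void tx_doorbell_once(struct mock_tx_ring *r, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        r->prod_index = (r->prod_index + 1) & 0xffffu; /* assumed 16-bit wrap */
    }
    if (n > 0) {
        ring_doorbell(r);
    }
}
```

With eight packets, variant A performs eight register writes and variant B performs one. At over 300 cycles per write, variant A lets the hardware start (and likely finish) each packet while the loop is still running, which is the effect on batching described above; which placement is preferable depends on measurements this sketch does not provide.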