I'm debugging some network issues with Roadshow when used on the ZZ9000 and I'd like to clarify some behaviour.
Is Roadshow rejecting received packets larger than the MTU?
The behaviour I'm seeing is that if I set my MTU to 500 and then run `ping -s 600 <dest>`, the destination successfully receives the request split over 2 packets. The combined size is less than the destination's MTU, so it returns the response in a single packet larger than the Amiga's MTU. This never makes it back to the application layer on the Amiga (I do see it at the Ethernet layer).
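To make the sizes in that test concrete, here is a rough sketch of the fragmentation arithmetic. It assumes a 20-byte IPv4 header without options, an 8-byte ICMP echo header, and a destination MTU of 1500; `fragment_sizes` is a made-up helper name for illustration, not anything from Roadshow.

```python
IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo request/reply header


def fragment_sizes(icmp_payload, mtu):
    """Return the total size in bytes of each IP fragment of an ICMP echo."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = ((mtu - IP_HEADER) // 8) * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any data")
    remaining = icmp_payload + ICMP_HEADER   # data carried by the IP layer
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + IP_HEADER)
        remaining -= chunk
    return sizes


request = fragment_sizes(600, 500)    # sent by the Amiga, MTU 500
reply = fragment_sizes(600, 1500)     # sent by the destination, MTU 1500 assumed
print(request, reply)
```

Under these assumptions the request leaves the Amiga as two fragments, while the reply comes back as one datagram that is larger than the Amiga's configured MTU of 500.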
This should be fine - MTU should only be used for the transmission - but I've a feeling Roadshow may be using the value for the MRU (maximum receive unit). If this is the case, can it be fixed, or is there an interface option to set the MRU independently? If this isn't the case, I'll have to dig further into the whole stack.
Miami uses TCP/IP stack code which is at least 4-5 years younger than the one used by Roadshow. As things stand, the Roadshow TCP/IP stack makes no distinction between the maximum transmission unit and the maximum receive unit sizes. Each network interface only has a single maximum transmission unit record which governs both transmission and reception alike.
The other thing that might be a problem here is that the TCP/IP stack used by Roadshow lacks a feature known as "path MTU discovery" which allows packets to be fragmented and reassembled depending upon what the intermediate systems between your Amiga and remote hosts prefer. Normally, fragmentation issues are flagged through error messages passed up the transmission path, but not every TCP/IP stack along the way can adjust to the limitations reported. I'm sorry, but this is how things are for now.
Thanks Olsen, that's a shame. Is there any hope that support for this can be added to the roadmap for a future fix?
As it stands, if the Amiga has its MTU set lower than that of another machine on the network, it can drop valid packets. It wouldn't need path MTU discovery enabled: even just allocating a constant 1500 bytes for the RX buffer, irrespective of the configured MTU, and capping received packets at that (the maximum IP datagram size within a normal Ethernet frame), would fix it.
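The difference between the two policies can be sketched as follows. This is only an illustration of the idea, not Roadshow code: both function names are hypothetical, and the 628-byte figure assumes a 600-byte ping payload plus an 8-byte ICMP header and a 20-byte IP header.

```python
ETHERNET_MAX_PAYLOAD = 1500  # largest payload of a standard Ethernet frame


def accept_current(datagram_len, interface_mtu):
    """Behaviour described in this thread: the MTU doubles as the receive limit."""
    return datagram_len <= interface_mtu


def accept_proposed(datagram_len, interface_mtu):
    """Suggested behaviour: a fixed receive limit that ignores the configured MTU."""
    return datagram_len <= ETHERNET_MAX_PAYLOAD


# A 628-byte echo reply arriving on an interface configured with MTU 500.
print(accept_current(628, 500), accept_proposed(628, 500))
```

With the fixed limit the reply is accepted, while anything that cannot fit in a normal Ethernet frame is still rejected.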
Also, slightly off-topic, but would you be open to adding support for hardware TCP/IP/UDP checksum offloading in the future? The ZZ9000 ethernet hardware supports it, and there seems to be unused ios2_Flags bits free in the SANA-II spec that could be used to both signal the device supports it and to pass back checksum results. Would need some alignment across the Amiga hardware community, but pretty sure everyone would be keen as it could lower the CPU overhead of network traffic.
This is a tough one... The TCP/IP stack code which Roadshow uses was used in NetBSD, OpenBSD, FreeBSD, etc. at the time and the newer features which were added to these operating systems might be portable. A while ago I specifically downloaded the earliest versions of NetBSD which followed the 1994 release of the TCP/IP stack which Roadshow uses. I made little progress there, though.
Limiting the MTU is still an option. Editing the respective network interface configuration file should be the easiest approach to do this.
Working out how the feature could be used is still an unsolved problem. At the minimum, you would have to let the TCP/IP stack know that the incoming datagrams need not be checked because the hardware already took care of that. This would already help. The more difficult other half of the job is letting the TCP/IP stack know early on that it need not update the datagram and TCP/UDP checksums. There will be a point in the routing process which assigns a packet to a network interface at which the decision to deal with the checksums will have to be made. I don't know where that happens yet.
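The receive half might look roughly like the sketch below. The checksum itself is the standard one's-complement Internet checksum (RFC 1071); the `hw_verified` flag is purely hypothetical and stands in for whatever a SANA-II driver would use to signal that the hardware already checked the datagram. Nothing like it exists in the current spec or in Roadshow.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_valid(header: bytes, hw_verified: bool) -> bool:
    """Skip the software check when the (hypothetical) driver flag is set."""
    if hw_verified:
        return True
    # A header whose checksum field is correct sums to zero.
    return internet_checksum(header) == 0


# Sample IPv4 header with its checksum field (bytes 10-11) filled in.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(header_is_valid(sample, hw_verified=False))
```

The transmit half is not shown because, as noted above, it is not yet clear where in the stack that decision would have to be made.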
So, there's still a lot of work to be done, and there has been for years. We discussed this during OS4 development several years ago but made no progress...
Thanks olsen! The MTU thing isn't an issue in practice for me - I was only running at a smaller value to make it easier to dump the frames via kprintf and via the ARM code running on the ZZ9000 board in order to track down where some corruption was occurring.
As for a proof-of-concept for checksum offloading, I came across your https://github.com/obarthel/amiga-sana-ii-tftpclient project and am finding it very useful for testing the idea and also for debugging the stack/.device/hardware flow of data in general. Thank you!
You're welcome. This is exactly why I wrote the SANA-II tftpclient program: it's very hard to find a complete code example of network client software which talks to a SANA-II driver and which isn't 20-25 years old.
Did you make progress on your experiments? I am still pondering how to make hardware acceleration possible and so far I lack any test cases for it.
You may be right on the money here. The "ping" command which ships with Roadshow uses the ICMP protocol to perform its tasks, and the IP datagrams it transmits cannot be broken up into smaller fragments. Once you go beyond the minimum datagram size the ICMP messages may get discarded by default.
I know of the 68 byte minimum IP datagram size but I find it hard to make sense of it, given that IP datagrams shorter than 576 bytes already seem to be causing trouble.
Does Roadshow set and respect the TCP maximum segment size (MSS) option? PPPoE, VPNs or other tunnels may reduce the path MTU. In most cases, the box reducing the MTU applies MSS clamping, i.e., it rewrites the MSS to a lower value in order to make sure both sides don't send packets that are too large. Normally that is because PMTUD often breaks, but with Roadshow that would be important because otherwise a lot of fragmentation will occur which is bad for performance and may also be blocked by firewalls.
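For anyone not familiar with MSS clamping, the arithmetic is simple. The sketch below assumes 20-byte IP and TCP headers without options, and uses the usual PPPoE MTU of 1492 as the tunnel example; the function names are made up for illustration.

```python
IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20   # bytes, TCP header without options (assumption)


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented datagram."""
    return mtu - IP_HEADER - TCP_HEADER


def clamp_mss(advertised_mss, tunnel_mtu):
    """What a clamping router does to the MSS option in a passing SYN."""
    return min(advertised_mss, mss_for_mtu(tunnel_mtu))


ethernet_mss = mss_for_mtu(1500)            # what a host on Ethernet advertises
clamped = clamp_mss(ethernet_mss, 1492)     # after passing a PPPoE link
print(ethernet_mss, clamped)
```

An MSS that is already below the tunnel's limit passes through unchanged, which is why clamping is harmless for hosts that advertise small values.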
I wonder how much checksum offloading will help for performance. If the checksum calculation is done during a copy cycle that needs to happen anyway it's basically free on any decent CPU, and remember, we're almost entirely talking about 10 Mbps cards.
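To show what "checksum during the copy" means, here is a minimal sketch that moves the data and accumulates the RFC 1071 sum in the same loop. Python is only used for readability; a real stack would do this in C or 68k assembly, and `copy_with_checksum` is a hypothetical name.

```python
def copy_with_checksum(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and return the Internet checksum of src in one pass."""
    total = 0
    for i in range(0, len(src) - 1, 2):
        hi, lo = src[i], src[i + 1]
        dst[i] = hi                      # the copy that has to happen anyway
        dst[i + 1] = lo
        total += (hi << 8) | lo          # the checksum rides along with it
    if len(src) % 2:                     # trailing odd byte, padded with zero
        dst[len(src) - 1] = src[-1]
        total += src[-1] << 8
    while total >> 16:                   # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


packet = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
buffer = bytearray(len(packet))
checksum = copy_with_checksum(packet, buffer)
print(hex(checksum))
```

Each word is read from memory only once, so the extra cost over a plain copy is the add and the final fold.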
And I guess with TCP/IP code from 1994 IPv6 support is low on the backlog...
I have an Ariadne card in a machine with a 50MHz 68060 and Roadshow 1.14. Download speed using wget to RAM: is 800 KBytes/s, which puts my CPU at 90% load. My guess is that a lot of that is spent checking and calculating checksums. With the X-Surf 100 and USB Ethernet options there are a few 100 Mbps cards around...
I'd indeed love to see if the checksum routine can be moved to the ethernet card or done by another CPU in the machine (PPC, ZZ9000, ...). I'm not a skilled programmer but it would be nice to do something in this space...
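As a back-of-the-envelope check on those numbers, the figures above work out to a per-byte CPU budget. This assumes 1 KByte = 1000 bytes and that the 90% load is spent entirely on the transfer; it says nothing about how much of that is actually checksumming.

```python
CPU_HZ = 50_000_000             # 50 MHz 68060
CPU_LOAD = 0.90                 # reported load during the download
BYTES_PER_SECOND = 800 * 1000   # 800 KBytes/s, assuming 1 KByte = 1000 bytes

cycles_per_byte = CPU_HZ * CPU_LOAD / BYTES_PER_SECOND
print(f"{cycles_per_byte:.2f} CPU cycles per received byte")
```

That is roughly 56 cycles for every byte received, covering the driver, the stack, wget, and the copy to RAM: combined.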