z3sdram/dm9000: S2_CopyToBuff16 possible?

Support Roadshow

Moderators: AndreasM, olsen

tnt23
Grade reingestolpert
Posts: 3
Joined: 31.07.2013 - 15:34
Location: Saint Petersburg

z3sdram/dm9000: S2_CopyToBuff16 possible?

Post by tnt23 »

Hello,

I am working on a device driver for a Zorro II/III network card based on the Z3SDRAM project. The chip used is the DM9000B from Davicom. A pre-alpha version of the driver is ready, and I am looking to improve its performance.

It seems to me that the driver could benefit from using 16-bit access to the Ethernet chip. In Roadshow I can see that S2_CopyFromBuff16 is enabled by the COPYMODE=FAST option. Is it possible to have S2_CopyToBuff16 enabled in a similar way?
olsen
CygnusEd Developer
Posts: 167
Joined: 06.06.2006 - 16:27

Re: z3sdram/dm9000: S2_CopyToBuff16 possible?

Post by olsen »

tnt23 wrote: It seems to me that the driver could benefit from using 16-bit access to the Ethernet chip. In Roadshow I can see that S2_CopyFromBuff16 is enabled by the COPYMODE=FAST option. Is it possible to have S2_CopyToBuff16 enabled in a similar way?

There would be no practical difference between S2_CopyToBuff and the (currently unimplemented) S2_CopyToBuff16 functions.

All copying operations inside Roadshow are done by one and the same function, which works similarly to how exec.library/CopyMem does its job.

Specifically, that copying function first checks if the source and destination addresses are aligned to 16 bit address boundaries. If so, it proceeds to copy in portions of several 32 bit words at a time. If there is still data left to be copied (that would be 1-3 bytes), these will be copied one byte at a time.
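
As a rough illustration only (this is not Roadshow's actual code; the function name is made up and the fixed-size memcpy() calls merely stand in for a single 32-bit move), the strategy just described could be sketched in C like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT Roadshow's real copy routine. */
void copy_aligned_or_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Are both addresses on a 16-bit boundary? Then move the bulk
       of the data in 32-bit portions. */
    if ((((uintptr_t)d | (uintptr_t)s) & 1u) == 0) {
        while (len >= 4) {
            uint32_t w;
            memcpy(&w, s, sizeof w);   /* stands in for one 32-bit load  */
            memcpy(d, &w, sizeof w);   /* stands in for one 32-bit store */
            d += 4;
            s += 4;
            len -= 4;
        }
    }

    /* Whatever is left: the trailing 1-3 bytes in the aligned case,
       or the whole block if either address was odd (worst case). */
    while (len--)
        *d++ = *s++;
}
```

The point of the sketch is only the order of decisions: alignment check first, wide copies for the bulk, single bytes for the remainder.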

Roadshow's buffers are always allocated in such a way as to align them to 16 bit address boundaries (most of the time the alignment should be even better: 32 or 128 bits). I suppose your hardware's receive/transmit buffers will be aligned exactly in this manner, too. Hence, Roadshow will always end up copying data in 16 bit size portions to start with, if not 32 bit portions.

Now, if the source or the destination address were not aligned to a 16 bit boundary, then the copying routine would have to resort to copying the data one byte at a time (worst case). To the best of my knowledge this can never happen.

The S2_CopyFromBuff16 function is special. If you use it, it is guaranteed to copy data only in portions of 16 or 32 bits. It never ever copies data one byte at a time. In order to achieve this, it will copy data to a temporary buffer if either the source or the destination addresses are not aligned to 16 bit boundaries. This temporary buffer is always aligned to a 16 bit address boundary. Because the temporary buffer is used, S2_CopyFromBuff16 will never be faster than S2_CopyFromBuff, but it may be slower than S2_CopyFromBuff if the alignment restrictions come into play.

So, in a nutshell: there would be no advantage in implementing S2_CopyToBuff16, compared to how Roadshow already handles S2_CopyToBuff out of the box.

It looks like I ought to update the network interface configuration files, or change Roadshow not to use the "COPYMODE=SLOW" option by default. That option was only ever necessary for the "Ariadne" Ethernet card.
tnt23
Grade reingestolpert
Posts: 3
Joined: 31.07.2013 - 15:34
Location: Saint Petersburg

Re: z3sdram/dm9000: S2_CopyToBuff16 possible?

Post by tnt23 »

olsen wrote: So, in a nutshell: there would be no advantage in implementing S2_CopyToBuff16, compared to how Roadshow already handles S2_CopyToBuff out of the box.

It looks like I ought to update the network interface configuration files, or change Roadshow not to use the "COPYMODE=SLOW" option by default. That option was only ever necessary for the "Ariadne" Ethernet card.

Thank you for the thorough explanation.

Here's the catch with my card. The DM9000B registers are accessed in 16-bit words. The first register is located at IOBASE+0, the second at IOBASE+2, and so on, and every read or write must be a full word at a word-aligned address.
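
To make the addressing scheme concrete, here is a minimal C sketch. The accessor names and the dm9000_io_t type are made up for illustration and are not taken from the actual driver; the only thing carried over from the description above is that register n sits at IOBASE + 2*n and is touched with exactly one 16-bit access.

```c
#include <stdint.h>

/* Hypothetical type: the register window seen as an array of
   16-bit words, so index n corresponds to IOBASE + 2*n. */
typedef volatile uint16_t dm9000_io_t;

/* One aligned 16-bit read from register 'reg'. */
static inline uint16_t dm9000_read_reg(dm9000_io_t *iobase, unsigned reg)
{
    return iobase[reg];
}

/* One aligned 16-bit write to register 'reg'. */
static inline void dm9000_write_reg(dm9000_io_t *iobase, unsigned reg,
                                    uint16_t value)
{
    iobase[reg] = value;
}
```

The volatile qualifier keeps the compiler from merging, splitting or dropping the accesses, which matters once iobase points at real hardware instead of RAM.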

This is OK for the driver, which is quite happy with that addressing scheme. Since it was hacked from the 3c589 driver, mine also allocates its own buffers for RX/TX data and passes these to the S2_CopyToBuff/S2_CopyFromBuff hooks. These driver-owned buffers are later copied to/from the DM9000B packet RAM using word-wide, word-aligned access. So it does not actually matter whether the internal buffers are accessed in bytes, words or longs; everything fits.

What I had in mind was this: if the S2_CopyToBuff16/S2_CopyFromBuff16 routines really did align to words and strictly read/write in words, I could map the DM9000B packet RAM directly at some IOBASE+PKTRAM address and hand it to those routines. That would make the driver's internal buffers unnecessary and possibly speed things up a bit.

As it turns out, I was guessing wrong about 16-bit buffer management. Oh well :)
tnt23
Grade reingestolpert
Posts: 3
Joined: 31.07.2013 - 15:34
Location: Saint Petersburg

Post by tnt23 »

Still, I feel a bit confused regarding SANA-2 Revision 3. I am not sure this is the right source, but here is what http://wiki.amigaos.net/index.php/Revision_3 says in its comments on better buffer management:
These are optional callbacks presented to the device with the
same calling interface as for S2_CopyToBuff or S2_CopyFromBuff,
respectively. The difference to the original callbacks is the
required and guaranteed transfer size and alignment for
accessing the device's buffer for a single piece of a data of
either 16 or 32 bits, a data word. The copy function called may
only use 16/32 bit aligned read/write commands of 16/32 bits at
once to transfer the data words, respectively. If the buffer
data length is not a multiple of the required data word
transfer size, the last data word transfer may contain garbage
padding in either transfer direction.
My strong impression is that this is about both alignment and transfer size.
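
A small C sketch of how I read that paragraph, for the direction in which the device's buffer is written. The function name is made up, and padding the final word with a zero byte is my own choice (the quoted text allows garbage there); this shows the requirement as I understand it, not any existing implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 'len' bytes from ordinary memory into a
   word-only device window using nothing but aligned 16-bit stores. */
void copy_to_window16(volatile uint16_t *window,
                      const unsigned char *src, size_t len)
{
    while (len >= 2) {
        uint16_t w;
        memcpy(&w, src, 2);   /* gather two bytes, memory order kept */
        *window++ = w;        /* exactly one aligned 16-bit store    */
        src += 2;
        len -= 2;
    }

    if (len == 1) {
        /* Odd length: the last transfer is still a full word, so the
           second byte is padding (zero here, could be anything). */
        unsigned char pair[2] = { src[0], 0 };
        uint16_t w;
        memcpy(&w, pair, 2);
        *window = w;
    }
}
```

So a 5-byte copy results in three 16-bit stores, and the device sees one padding byte after the payload.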