HsUSBD Proper Read-out method for Bulk-OUT EPs

Posts: 11
Joined: 18 Apr 2017, 16:39

09 Jan 2019, 09:24

Dear fellows,

I've been working on my HsUSB device implementation project.
I'm now facing a problem in the read-out process for a Bulk-OUT endpoint,
especially in the case where ...
  • Buffer memory allocation size for that endpoint:
    For better bus utilization on USB, NYET/PING should occur
    as rarely as possible.
    If I allocate buffer memory for just one packet, the endpoint
    will always respond with NYET, never with ACK,
    and that makes the host issue a PING the next time it wants to send
    another packet.
    At least twice the packet size should be allocated for
    such endpoints if I want to suppress NYET/PING.
    So I allocated 1KB for a bulk endpoint whose max packet size
    is 512 bytes.
  • Read-out process deferred until the device firmware can handle it:
    Processing the received data may take a variable amount of time,
    or may even be delayed to yield the CPU to higher-priority tasks.
    Even in such cases, USB's handshake mechanism should work.
    When a new data packet is sent to the device while some
    unprocessed received data is still left in the endpoint FIFO, the endpoint
    control mechanism of the USB device for bulk OUT should still work fine.

    So my interrupt handler for HsUSBD just signals the
    new data reception to the consumer task, clears RXPKIF immediately
    and returns.
    Some time later, the consumer task is invoked and
    starts reading out the received data one by one from the allocated
    endpoint buffer inside the HsUSBD block, at the consumer's own pace.
    When the last data is read out by the consumer task, the endpoint
    controller can decide to change its handshake for the next reception,
    e.g., if the allocated buffer is completely empty, the next response can be ACK;
    if some data is still left, then NYET (or NAK).

    In this implementation, how can the consumer task know whether the
    buffer is empty after each read-out?
I have tried two different ways, but neither succeeded....
One is checking EPxDATCNT after reading EPxDAT and the other is
checking the BUFEMPTYIF flag after reading EPxDAT.

I've already confirmed that as long as the consumer task
keeps up (no more than one packet is held in the endpoint buffer, and every
packet is answered with ACK), both methods work as expected, with the same
successful results.
But once it falls behind (another new packet is received and
answered with NYET while the data of the first packet has
not been completely read), both methods fail.

I've also confirmed that EPxDATCNT indicates more than 0x200 when
another packet is received while the first one has not been read out.

The problem I experienced with the BUFEMPTYIF method is ...
BUFEMPTYIF is never raised once the endpoint has responded with NYET (or received PING).
So the consumer process can never detect the proper end of the received data and
keeps running in an endless loop....

In the case of the EPxDATCNT method,.....
EPxDATCNT is decremented by 4 at each word read access, but
never goes down past 0x200 toward zero.
So with this method too, the consumer process can NOT get out of
the endless loop.....

By the way, my implementation described above may include some
challenges which the technical reference manual does not explain in detail.
So I may be misunderstanding some very basic behavior of this HsUSBD HW block.

I allocated twice the packet size for a bulk OUT endpoint to suppress
NYET/PING. I confirmed it seems to be working well....
Is this the proper way to get the expected response behavior from the HsUSBD HW block?

I immediately clear RXPKIF before reading out from EPxDAT and
let the HsUSBD's endpoint control mechanism continue to respond
to new USB tokens properly.
Is this really OK for the HsUSBD HW block?

Any comments are welcome.


Dylan Hsieh
Posts: 31
Joined: 22 Mar 2017, 09:54

09 Jan 2019, 19:09

Hi Kojima-san,

According to the HSUSB specification, if the endpoint buffer does not have enough space to store one more packet, NYET will be returned. Allocating twice the packet size (in your case 512 x 2 = 1K bytes) to suppress NYET/PING is the correct way.

To make the USB transfer more efficient, I suggest using DMA to read out the RX data from the endpoint buffer:
Each DMA transfer is 512 bytes, and when the transfer finishes the DMA issues an interrupt, so you know when the RX data has been completely read out.

For the HSUSBD DMA operation, please visit the link below:
https://github.com/OpenNuvoton/M480BSP/ ... VENDOR_LBK

Thanks ;)

Posts: 11
Joined: 18 Apr 2017, 16:39

30 Jan 2019, 14:51

Dear Dylan,

Thank you for your reply and the suggestion to employ DMA.

>Allocate twice size memory (in your case 512 x 2 = 1K Bytes)
>to suppress the NYET/PING is correct way

This confirms that I have not misunderstood how to suppress NYET/PING.
Thank you for the information.

But employing DMA does NOT fit my case
or my point of view.
I would like to utilize the packet buffer RAM hardware resource
efficiently, since it is under control of the USB handshake mechanism.
Employing DMA can release the packet buffer RAM inside
the USBD IP block very rapidly. But to do this, I must
allocate a kind of RELAY buffer memory in front of the digester
process. Probably a software-coded ring-buffer FIFO mechanism is
required. When dequeuing from that ring-buffer FIFO, some address
calculation/management procedures are additionally required,
which is more complex than simply reading out through the USBD
IP block's DATA port, which sits at a fixed address.
The only thing I need to take care of is whether available received data
is still left inside that IP block.

Another feature I would like to utilize with that IP block is ....
even while digesting previously received data, the other free
RAM buffer area inside the IP block can be used to receive
the next data packet.
If this situation really happens, the USBD IP block MAY update
its data count inside the HW buffer during read-out.
I think the problem I'm facing lies in this process of checking "how many
bytes are still left inside the USBD IP block?".

Roughly speaking, there might be two ways to do that.
One is checking the data count through the register and the other
is just checking the EMPTY flag bit.
In the former way, the content of the data count register MAY
grow when a new packet is received, to the number of bytes
left plus the newly received packet.
In the latter way, it can be simpler because what is checked
is just a single bit indicating EMPTY or NOT, regardless of
whether a new packet is received or not.

But as far as I've tried, I could not get either algorithm to succeed.
So I think I should know more detail about the HW packet buffer
management dynamics of this USBD IP block.

How should I understand this USBD IP's buffer management
dynamics?
Is there any better document which describes it?


Posts: 11
Joined: 18 Apr 2017, 16:39

05 Feb 2019, 07:02

Dear Dylan,

I suppose I had better ask about this issue from a different point of
view for better understanding..

Is this HsUSBD block capable of holding the contents of two (or more) packets
in its allocated endpoint buffer memory?

As far as I browsed, every sample code employing this HsUSBD allocates just
one packet size of memory for the corresponding endpoint; none tries to allocate
twice that or more.

Generally speaking, allocating two (or more) packet sizes of memory for an
endpoint is intended for double-buffering, the so-called
'Ping-Pong' buffering. Is this idea covered by the design
philosophy/perspectives of Nuvoton's USB IP design?
(For Full-speed USB IPs, I could not see it either.)


Dylan Hsieh
Posts: 31
Joined: 22 Mar 2017, 09:54

11 Feb 2019, 16:30

Hi Kojima-san,

Allow me to simplify your question:

You want to operate the endpoint buffer like a "double buffer": allocate twice the packet size for an endpoint (BUF1, BUF2), so that the first packet from the HOST is stored into BUF1 and the next packet is stored into BUF2, and you can handle the previous packet and receive the next packet at the same time without responding to the HOST with NYET.

Am I right?

Posts: 11
Joined: 18 Apr 2017, 16:39

12 Feb 2019, 07:11

Dear Dylan,

Yes, you are correct.

When I have received the first packet (whose size is SIZE1) into BUF1 with
an ACK response, I can invoke the digester task for that endpoint.
At least until a new USB packet is received on the
same endpoint, the digester task can get the SIZE1 information
through the lower half word of the corresponding Data Available Count register.
At the beginning, before I read any received data through the Data
Register (_EPxDAT), _EPDATCNT should read as SIZE1.

But how does _DATCNT behave after I start reading out EPxDAT?
At least before a new USB packet comes, _DATCNT is decremented
by 1 (when I read EPxDAT by byte access) or by 4 (when I read by word access).
Am I correct?

Then, what will happen when a new USB packet is received onto the BUF1 area
while _DATCNT has not yet reached zero?
In this situation, several bits (such as NYETIF) in the corresponding EPxINTSTS
will be set to '1' by hardware.
Does this NYETIF==1 state change alter the content of _DATCNT?


Dylan Hsieh
Posts: 31
Joined: 22 Mar 2017, 09:54

12 Feb 2019, 14:23

Hi Kojima-san

_DATCNT always indicates how much data is in the endpoint buffer, and it decreases by "1" each time you read out a byte (byte access) from the endpoint buffer. The endpoint buffer works like a "ring buffer": if it has enough space to store another data packet, the USB hardware responds with ACK; otherwise the USB hardware responds with NYET.

For example:

Allocate 512 x 2 = 1K bytes for the endpoint buffer. When the 1st data packet (512 bytes) arrives, we still have another 512 bytes of space to store data:
_DATCNT = 512 (ACK returned)

Read out data from the endpoint buffer and _DATCNT decreases:
_DATCNT = 512 - 1 = 511

Then the 2nd data packet (512 bytes) arrives. Now we don't have enough space left, because we have read out only 1 byte of data:
_DATCNT = 511 + 512 = 1023 (NYET returned)

The USB hardware keeps responding with NYET until we free space in the endpoint buffer by reading out data.
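
The arithmetic in this example can be replayed on a host PC. The following is only a minimal model, not driver code: `ep_model`, `ep_receive` and `ep_read` are hypothetical names that do not exist in the Nuvoton BSP. The ACK/NYET rule follows the description above; the NAK branch (no room left for a full packet, so the data is not accepted) is an assumption based on the general USB 2.0 high-speed PING protocol and is not confirmed for this IP block.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model only; none of these names exist in the Nuvoton BSP. */
enum handshake { HS_ACK, HS_NYET, HS_NAK };

struct ep_model {
    uint32_t capacity;   /* bytes allocated to the endpoint buffer, e.g. 1024 */
    uint32_t max_packet; /* max packet size from the descriptor, e.g. 512 */
    uint32_t datcnt;     /* unread bytes, i.e. what _DATCNT would report */
};

/* Model the arrival of one OUT packet of `len` bytes. */
static enum handshake ep_receive(struct ep_model *ep, uint32_t len)
{
    assert(len <= ep->max_packet);
    uint32_t free_before = ep->capacity - ep->datcnt;
    if (free_before < ep->max_packet)
        return HS_NAK;              /* assumed: packet is not accepted */
    ep->datcnt += len;              /* packet is stored */
    uint32_t free_after = ep->capacity - ep->datcnt;
    /* ACK if another full packet would still fit, otherwise NYET. */
    return (free_after >= ep->max_packet) ? HS_ACK : HS_NYET;
}

/* Model reading up to `n` bytes through the data register.
 * Returns the number of bytes actually consumed. */
static uint32_t ep_read(struct ep_model *ep, uint32_t n)
{
    if (n > ep->datcnt)
        n = ep->datcnt;             /* the model never reads past the data */
    ep->datcnt -= n;
    return n;
}
```

With `capacity = 1024` and `max_packet = 512`, the sequence "receive 512, read 1, receive 512" leaves `datcnt` at 512, 511 and 1023 in turn, with ACK for the first packet and NYET for the second, matching the numbers above.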


As for double buffering, it may be easier to implement it outside the USB block: we allocate space in the internal SRAM for 2 to 3 layers of 512 bytes each (depending on how you deal with your data), or use a linked list.

When the endpoint interrupt occurs (a data packet arrives), we change the destination address of the USB's DMA to select which SRAM layer the data moves into, and trigger the USB's DMA to move the data out of the endpoint buffer so that there is enough space to store the next data packet. This way the endpoint buffer is always "free" for the next data packet, and we can handle the data after the USB's DMA transfer has finished.
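
The layer rotation described here can be sketched as a small pool of fixed-size slots. This is only an illustration under stated assumptions: `slot_pool`, `pool_push`, `pool_peek` and `pool_pop` are hypothetical names, `memcpy` stands in for the USB DMA transfer, and the slot count of 3 is just one of the values suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE  512u  /* one max-size packet per slot */
#define SLOT_COUNT 3u    /* number of SRAM "layers" (assumed) */

struct slot_pool {
    uint8_t  buf[SLOT_COUNT][SLOT_SIZE];
    uint32_t len[SLOT_COUNT];  /* valid bytes in each slot */
    uint32_t head;             /* next slot to fill */
    uint32_t tail;             /* next slot to process */
    uint32_t used;             /* filled, not yet processed slots */
};

/* Called where the endpoint interrupt would start a DMA transfer.
 * memcpy stands in for the DMA. Returns 0 on success, -1 if all slots
 * are full (the data would then stay in the endpoint buffer). */
static int pool_push(struct slot_pool *p, const uint8_t *src, uint32_t len)
{
    assert(len <= SLOT_SIZE);
    if (p->used == SLOT_COUNT)
        return -1;
    memcpy(p->buf[p->head], src, len);
    p->len[p->head] = len;
    p->head = (p->head + 1) % SLOT_COUNT;
    p->used++;
    return 0;
}

/* Called by the consumer: returns the oldest filled slot, or NULL. */
static const uint8_t *pool_peek(const struct slot_pool *p, uint32_t *len)
{
    if (p->used == 0)
        return NULL;
    *len = p->len[p->tail];
    return p->buf[p->tail];
}

/* Release the slot previously returned by pool_peek. */
static void pool_pop(struct slot_pool *p)
{
    assert(p->used > 0);
    p->tail = (p->tail + 1) % SLOT_COUNT;
    p->used--;
}
```

Unlike the endpoint buffer itself, this scheme keeps the length of each packet. On a real target, `used` is shared between the interrupt handler and the consumer task, so updates to it would need to be protected.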

BTW, could you tell us what kind of project you are dealing with? Any feedback is appreciated.


Posts: 11
Joined: 18 Apr 2017, 16:39

13 Feb 2019, 08:15

Dear Dylan,

Thank you for your response.
(I see you like employing the USB's DMA. But please put it aside.)

You showed an example explaining how _DATCNT behaves
when two full-sized (512-byte) packets come in while
their data is being read.

And you mentioned "enough space to store" and
_DATCNT indicating "how many data in the endpoint buffer".

* Checking emptiness:
In my case, my digesting task works like this ...
read out the proper number of data bytes,
process them,
until all the received data is done.
Checking _DATCNT (being read as 0) seems to be one of the ways.
But I'm thinking about another way.
For checking only empty-or-NOT, the BUFEMPTYIF bit of the corresponding
EPxINTSTS register can be used alternatively.
Am I correct?

* Memory space management done by the USBD IP:
Your example covered only the case of two full-size packets.
When I allocate 1KB and three or four short packets (each
256 bytes or less) arrive, can this USBD IP receive them regardless
of the number of packets?

If the size of each incoming packet varies, the data storing pointer inside
this USBD IP may hit the wrap-around border in the middle
of a packet. Does it work fine even in such a case?

* Over-reading case?
This is one of the doubtful situations in my current code....
This USBD provides only the EPxDAT register port to read/write
packet contents.
If I over-read, exceeding the number of bytes actually
received, ...
Does _DATCNT stay zero?
Does the EMPTYIF bit stay one?
Are there any undefined results for further new packet reception?

You wrote:
>BTW, could you tell us what kind of project you are dealing with
Sorry, I cannot explain it in detail in a few words.
Instead, I can show a very similar and popular USB application,
with its profile, to help you imagine it.
It is like a VCOM Rx usage case:
The packet size varies (mostly short packets?)
Packets are sent by the host occasionally
Sometimes just a single short packet,
Sometimes a series of full-sized packets very dense in time
It is not packet-border sensitive; contiguous byte-width FIFO
buffering is acceptable

It is like a MIDI Rx usage case:
very similar to the VCOM Rx case...
The exception is the data unit size: 4 bytes, not 1 byte.
Not packet-border sensitive; a simple word-width FIFO is acceptable.


Dylan Hsieh
Posts: 31
Joined: 22 Mar 2017, 09:54

15 Feb 2019, 15:22

Hi Kojima-san,

[Checking emptiness]
_BUFEMPTYIF is just a status flag indicating whether the endpoint buffer holds data or not. If the endpoint buffer has unread data inside, the flag stays "0", even if there is only 1 byte of unread data in the endpoint buffer.

[Memory space management done by USBD IP]
If the endpoint buffer has enough space, it will receive the data packet no matter what the length of the packet is. But "enough space" is determined by the maximum packet size in the USB descriptor.

For example:
We allocate 1K bytes of endpoint buffer and the maximum packet size of this endpoint is 512 bytes.
  • The 1st data packet, 256 bytes in size, arrives. The USB IP receives it and returns ACK because the free space of the endpoint buffer is 1K bytes, which is larger than the maximum packet size (512 bytes).
  • The 2nd data packet, 512 bytes in size, arrives. The USB IP receives it and returns NYET because the remaining free space of the endpoint buffer is 256 bytes (768 - 512), which is less than the maximum packet size (512 bytes).

[Border case]
As mentioned previously, the endpoint buffer works like a "ring buffer", so it takes care of the border case itself.
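
To illustrate what "taking care of the border case" means for a ring buffer, here is a minimal software model in which a packet that crosses the end of the memory is stored in two pieces. It is a sketch only: `ring`, `ring_put` and `ring_get` are hypothetical names, the 1024-byte size is the allocation discussed in this thread, and the real hardware may manage its pointers differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EP_BUF_SIZE 1024u  /* bytes allocated to the endpoint (assumed) */

struct ring {
    uint8_t  mem[EP_BUF_SIZE];
    uint32_t wr;     /* next write offset */
    uint32_t rd;     /* next read offset */
    uint32_t count;  /* unread bytes */
};

/* Store one packet; the copy is split in two when the packet crosses
 * the end of mem[]. The caller must ensure the packet fits. */
static void ring_put(struct ring *r, const uint8_t *pkt, uint32_t len)
{
    assert(len <= EP_BUF_SIZE - r->count);
    uint32_t first = EP_BUF_SIZE - r->wr;   /* room before the border */
    if (first > len)
        first = len;
    memcpy(&r->mem[r->wr], pkt, first);
    memcpy(&r->mem[0], pkt + first, len - first); /* wrapped part, may be 0 */
    r->wr = (r->wr + len) % EP_BUF_SIZE;
    r->count += len;
}

/* Read one byte, as a byte access to the data register would. */
static uint8_t ring_get(struct ring *r)
{
    assert(r->count > 0);                   /* guard against over-reading */
    uint8_t b = r->mem[r->rd];
    r->rd = (r->rd + 1) % EP_BUF_SIZE;
    r->count--;
    return b;
}
```

The reader sees one contiguous byte stream regardless of where each packet started, which is also why no per-packet length survives in such a scheme.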

[Over-reading case]
Over-reading will mess up the read pointer inside the endpoint buffer of the USB hardware; the USB hardware has no "mistake-proofing" for the over-reading condition. So do not read data when _DATCNT equals "0" and _EMPTYIF equals "1".
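
One way to respect this rule with 32-bit accesses is to check the byte count before every word read. The sketch below runs on a host against a stand-in for the endpoint: `sim_ep`, `ep_datcnt`, `ep_read32` and `drain_words` are hypothetical names, and on a target the two accessors would read the _DATCNT and EPxDAT registers instead. The little-endian word assembly is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the endpoint hardware so the loop can run on a host. */
struct sim_ep {
    const uint8_t *data;  /* bytes "received" so far */
    uint32_t rd;          /* read offset into data */
    uint32_t datcnt;      /* unread byte count, like _DATCNT */
};

static uint32_t ep_datcnt(const struct sim_ep *ep)
{
    return ep->datcnt;
}

static uint32_t ep_read32(struct sim_ep *ep)
{
    assert(ep->datcnt >= 4);        /* the real block has no such check */
    uint32_t w = 0;
    for (int i = 0; i < 4; i++)     /* little-endian assembly (assumed) */
        w |= (uint32_t)ep->data[ep->rd++] << (8 * i);
    ep->datcnt -= 4;
    return w;
}

/* Read whole words only while at least 4 bytes are reported available.
 * The count is re-read on every pass, so data that arrives during the
 * loop is picked up too. Returns the number of words stored in out[]. */
static size_t drain_words(struct sim_ep *ep, uint32_t *out, size_t max_words)
{
    size_t n = 0;
    while (n < max_words && ep_datcnt(ep) >= 4)
        out[n++] = ep_read32(ep);
    return n;
}
```

A trailing fragment of fewer than four bytes is left in place rather than read as a word, so the loop never reads more than the count reports.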

From your point of view, would it be helpful in your use case if Nuvoton's MCU implemented a double buffer inside the USB IP?
Any suggestion is welcome.


Posts: 11
Joined: 18 Apr 2017, 16:39

22 Feb 2019, 19:01

Dear Dylan-san,

Thank you for your updates and sorry for my short absence.

[Checking emptiness by _BUFEMPTY]
I'm always reading out the buffer with 32-bit accesses, because my host sends
word-unit information, an integer multiple of four bytes, every time.
Does this HW show/update _DATCNT progress even while a new reception is
ongoing?
If so, _DATCNT may show two or three (or those plus four) in
close-to-empty situations, with _BUFEMPTY = 0.
If my code just checks _BUFEMPTY, it may be too early to read a WORD.
There might be only two or three bytes available, so it would be an over-read.
Can this happen?

You asked a very generic question:
>From your point of view, Do Nuvoton's MCU implement the double buffer inside the
>USB IP is helpful in your use-case?
> Any suggestion is welcomed.

Basically, yes.

If you let me comment a little more deeply, the term 'double buffer' should be treated
more carefully, with design consideration for whether packet borders are respected.
If we say 'double buffer', most of us would understand it as a capacity to receive
two packets, including the size information of each packet. Even if those two packets are
short packets of different sizes, can we get the valid size of both packets or not....
To handle the size information of those two packets, we would need extra registers
inside the USB device IP. Even if you tried, you would want to limit the number of packets,
maybe to two.... If so, I would like to call such a design a 'Ping-Pong' buffer, to distinguish
it from a slightly different implementation.
I guess Nuvoton has never taken such a perspective in its USB device IP designs,
at least for FsUSB device IPs.
SOMETHING may have changed when Nuvoton designed the High-speed USB device IP.
For receiving, as I asked at the beginning, a new handshake 'NYET'
is introduced in the High-speed USB specification. How much that 'SOMETHING' amounts to
is what I did not understand very well.

Through this discussion, I understand this HsUSBD IP is designed around the
total number of bytes carried in the packet payloads vs. the physical buffer capacity
inside the HsUSBD IP block. It may be able to hold three, four or five very short packets
as long as the allocated area is big enough for them. But the actual size of each
packet is not held anywhere inside the HsUSBD IP block....
In some USB pipe usages, such packet-border information is NOT
needed, for example CDC/VCOM and MIDI....
As a result, what we can enjoy in Nuvoton's current design is packet buffer
capacity utilization (without regard to packet borders).
It may be called "multiple packet buffering" (including 'double' or 'triple').
If somebody allocates twice the MaxPacketSize, he may call it a 'double buffer'.
It is not a Ping-Pong buffer; the two should be distinguished.

At least for Nuvoton, High-speed USB would be a good opportunity to try
new designs. I've been watching since the NUC505. Only a very limited number of competitors
are releasing mass-produced High-speed USB capable MCUs so far.
So Nuvoton is still a unique HS-USB MCU vendor.
But I could not find out or understand very well how to utilize it efficiently.

