Continuing with our series on modular interfaces, today we’re going to talk about one of the most powerful features in Cynthesizer: the ability to create custom interfaces and use them at a high level. These are more than just the ready/valid handshakes or external memory interfaces I showed you last time. These are very complex streaming and buffer-style interfaces that transfer entire data structures.
Anyone who has designed a complex interface in RTL knows that one of the biggest chores is keeping track of the details, and it’s difficult to reuse the work you did before if any of the interface details change in any way. If the data is a multidimensional array, the interface needs to know things like the row and column parameters so the array is formatted properly. If the data is moving from one thread or module to another, the interface needs to understand how to synchronize between them. And most interfaces require some kind of internal memory to temporarily store data in case the reads and writes happen at different speeds.
All in all, when you design interfaces in RTL you spend all your time and many lines of code getting all these details resolved. There’s no way to just say “take the data from here and put it over there.”
But in Cynthesizer there is, and it does it in a way that drastically reduces the amount of code you have to write and the time you spend verifying it. The keys to this are interface generation and packaging the details up so that you can write your algorithm at the transaction level. Keep reading and I’ll show you some hard data on that.
Interfaces ‘R Us
So I want to create an interface, a real complicated one. What do I do?
Well, in Cynthesizer there’s a tool called the Interface Generator. The Interface Generator is a combination of really sophisticated IP and a graphical editing window where you can create any custom, fully parameterized interface. This interface then becomes a packaged component (actually a set of C++ classes) I can use in a high-level SystemC design. Here’s a rundown of familiar interface types you can create in the Interface Generator:
- Buffer: Transfers data between two modules or threads through a shared buffer.
- Line Buffer: Transfers an array and stores multiple rows of data for reading.
- Circular Buffer: Transfers data over a circular buffer where the reading and writing operations are tightly synchronized.
- Streaming: Transfers an array of streaming data over one or more clock cycles.
- Trigger/Done: A master/slave architecture with acknowledgement signals at the beginning and end of data transfer.
- P2P Stream: A general form of Forte’s CynWare point-to-point interface with features for creating specialized FIFOs.
For each type of interface, the Interface Editor window in Interface Generator shows you a diagram of the read/write structure, a diagram of the access pattern for the datatype being transferred, and many related parameters that you can define for your needs. Best of all, the interface you create will have transaction-level functions like get(), put(), x_done(), next_y(), etc. that let your algorithm execute entire interface accesses with a single function call.
A Real Example
Let’s take a look at creating an interface part and using it in a SystemC algorithm. I will describe what I want the interface to do in words because it would be too daunting to describe in RTL.
I want an interface that does the following:
- Allows a reader and a writer to share an array of 16-bit unsigned integers
- Allows the writer to write to the array in groups of values, working from the beginning of the array to the end
- Allows the reader to read the array in groups of values working from the beginning of the array to the end
- Coordinates the activities of the reader and writer so that there is no need to store the whole array in memory or registers
Okay, that’s the basic function and it seems easy enough. But for something like this to work in the real world you need to cover that stuff I mentioned before—the details. And the details are things like this:
- Implement the internal buffer to be 1024 words long
- The writer puts the first group of input values (let’s say 2 at a time) in the first two words of the buffer (words 0 and 1), puts the next group of inputs in the next two words of the buffer (words 2 and 3), etc.
- The reader gets the first group of output values (let’s say 8 at a time) from the first eight words of the buffer (words 0-7), gets the next group by shifting once and reading the next eight words (words 1-8), etc.
- When the writer has put inputs in the last two words of the buffer, circle around and put the next inputs in the first two words of the buffer
- When the reader has grabbed outputs that go beyond the edge of the buffer, circle around and get the remaining values from the beginning of the buffer
- Maintain input and output buffer pointers to make sure the correct buffer contents are read or written
- Synchronize the interface at the beginning and the end of the algorithm’s execution
- Synchronize the interface at the beginning and the end of any iterating loop in the algorithm
- Fully handshake all interface accesses
Suddenly this is not looking so easy. If I were designing the old way I would have to create a buffer array, set up a bunch of pointer variables, make complicated address assignments to keep everything pointing to the right place, keep track of where I was in the memory to handle that circular wraparound requirement, and harness all reads and writes with some kind of ready/valid handshake. Sigh.
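To give a feel for that bookkeeping, here is a minimal software sketch of just the pointer and wraparound part, written in plain C++ rather than RTL or SystemC. The class name ManualCircularBuffer and all of its members are hypothetical, and the sketch leaves out the handshaking and thread synchronization entirely, so treat it as an illustration of the chore and not as anything Cynthesizer generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical hand-written model of the buffer bookkeeping described above.
// Not Cynthesizer code: no handshaking or thread synchronization is modeled.
class ManualCircularBuffer {
public:
    static constexpr std::size_t kDepth = 1024;   // buffer length in words
    static constexpr std::size_t kWriteSet = 2;   // words written per access
    static constexpr std::size_t kReadSet = 8;    // words read per access
    static constexpr std::size_t kReadShift = 1;  // read pointer adjustment

    // Store one group of two values, then advance and wrap the write pointer.
    void write_group(const std::array<std::uint16_t, kWriteSet>& group) {
        for (std::size_t k = 0; k < kWriteSet; ++k) {
            mem_[(write_ptr_ + k) % kDepth] = group[k];
        }
        write_ptr_ = (write_ptr_ + kWriteSet) % kDepth;
    }

    // Return the current window of eight values, wrapping past the end of
    // the buffer if needed, then shift the read pointer by one word.
    std::array<std::uint16_t, kReadSet> read_window() {
        std::array<std::uint16_t, kReadSet> window{};
        for (std::size_t k = 0; k < kReadSet; ++k) {
            window[k] = mem_[(read_ptr_ + k) % kDepth];
        }
        read_ptr_ = (read_ptr_ + kReadShift) % kDepth;
        return window;
    }

    std::size_t write_ptr() const { return write_ptr_; }
    std::size_t read_ptr() const { return read_ptr_; }

private:
    std::array<std::uint16_t, kDepth> mem_{};
    std::size_t write_ptr_ = 0;
    std::size_t read_ptr_ = 0;
};

// Usage: fill the buffer with 0..1023, then read up to the window that
// starts at word 1020, which wraps around to words 0..3.
inline std::array<std::uint16_t, 8> wrapped_window_example() {
    ManualCircularBuffer buf;
    for (std::uint16_t v = 0; v < 1024; v += 2) {
        buf.write_group({v, static_cast<std::uint16_t>(v + 1)});
    }
    assert(buf.write_ptr() == 0);  // the writer has wrapped back to the start
    std::array<std::uint16_t, 8> window{};
    for (int i = 0; i <= 1020; ++i) {
        window = buf.read_window();
    }
    return window;
}
```

Even with the hard parts left out, every access has to carry its own modulo arithmetic and pointer update, and any change to the group sizes or buffer depth touches all of it.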
But now I’m using Cynthesizer. Here’s what the Interface Editor window looks like for an interface that does exactly what I need:
[Figure: Interface Editor window for my_if]
Note first that I have chosen to create a circular buffer interface because of the wraparound requirement. In the “Reader” and “Writer” sections I can choose the size of the working set on each side of the interface. I need an array of two input values on the writer side and an array of eight output values on the reader side, so you see 2 and 8 in those boxes. Also, the output pointer needs to shift by one on each read, so I entered 1 in the “Adjustment” box.
Now all that’s left is to define what is transferred over this interface. In the “Parameter” section I’ve specified an sc_uint datatype, which is the SystemC standard for an unsigned integer, and given it a width of 16 in the “#Bits” box. Then I sized a “1D”, 1024-word buffer for the needed internal storage. Finally, I specified RAM2 as the memory part I want the interface to use to implement the buffer. I created RAM2 myself previously using Cynthesizer’s Memory Editor window; see Part II of this blog series for information on creating memories. I saved my interface as a part named my_if.
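As a quick recap, the settings entered in the Interface Editor can be written down as a plain C++ record. The struct InterfaceParams, its field names, and the helper functions are all hypothetical and are not part of any Cynthesizer API; the sketch only collects the values from the text and derives two numbers from them, the total storage and the address width.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical summary of the Interface Editor settings; the struct and its
// field names are illustrative and not part of any Cynthesizer API.
struct InterfaceParams {
    std::size_t writer_working_set;  // values written per access
    std::size_t reader_working_set;  // values read per access
    std::size_t read_adjustment;     // read pointer shift per access
    std::size_t bits_per_word;       // "#Bits" of the sc_uint element
    std::size_t buffer_words;        // length of the 1D buffer
    std::string memory_part;         // memory used to implement the buffer
};

// Settings entered for my_if in the walkthrough above.
inline InterfaceParams my_if_params() {
    return InterfaceParams{2, 8, 1, 16, 1024, "RAM2"};
}

// Basic sanity checks: the read window must fit in the buffer, and the
// buffer must hold a whole number of write groups.
inline void check_params(const InterfaceParams& p) {
    assert(p.reader_working_set <= p.buffer_words);
    assert(p.buffer_words % p.writer_working_set == 0);
}

// Total storage the buffer needs, in bits.
inline std::size_t storage_bits(const InterfaceParams& p) {
    return p.buffer_words * p.bits_per_word;
}

// Number of address bits needed to index every word of the buffer.
inline std::size_t address_bits(const InterfaceParams& p) {
    std::size_t bits = 0;
    while ((static_cast<std::size_t>(1) << bits) < p.buffer_words) {
        ++bits;
    }
    return bits;
}
```

For the values above this works out to 1024 x 16 = 16384 bits of storage and a 10-bit address.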
So “my_if” now exists as an interface part in my library. Let’s put it to good use in an algorithm that does something like this:
My module DUT needs to do the following:
- Read an input port din, calculate two values and put them in the interface. I’ll create a thread called writer() to do this.
- Get eight values from the interface, make a calculation with them, and write the result of that calculation to an output port dout. I’ll create a thread called reader() to do this.
First, I have to define a DUT module, declare the ports and threads and instantiate my interface:
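SC_MODULE( dut ) {
    // Point-to-point input and output ports carrying 16-bit unsigned values
    cynw_p2p< sc_uint<16> >::in din;
    cynw_p2p< sc_uint<16> >::out dout;
    // Instance of the generated interface, used directly between two threads
    my_if::direct<> m_if;
    …
    void writer();
    void reader();
    …
    // Clocked threads take a clock edge, so they are registered with SC_CTHREAD
    SC_CTHREAD( writer, clk.pos() );
    SC_CTHREAD( reader, clk.pos() );
};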
The m_if declaration is an instance of my_if using the ::direct<> template class. This is one of the member classes in my_if, and it is used to represent an internal interface communicating directly between two threads. Conversely, there is a ::chan<> class to use when the interface is external and communicates between two modules.
Now in the DUT module we’ll define the writer() and reader() threads to interact with the interface.
void writer()
{
    …
    m_if.w_start_tx();
    // 512 gets (working set of 2; fills 1024 length buffer)
    for( int i = 0; i < (1024/2); i++ ) {
        m_if.w_start_iter();
        vin = din.get();
        m_if[i*2] = vin;        // first word of the working set
        m_if[i*2+1] = vin+1;    // second word of the working set
        m_if.w_end_iter();
    }
    m_if.w_end_tx();
}
void reader()
{
    …
    m_if.r_start_tx();
    // 1017 puts (working set of 8 with read adjustment of 1)
    for( int i = 0; i < (1024-8)+1; i++ ) {
        m_if.r_start_iter();
        v1 = m_if[i];           // first word of the 8-word working set
        v2 = m_if[i+8-1];       // last word of the 8-word working set
        val = (v1 + v2)/2;
        dout.put( val );
        m_if.r_end_iter();
    }
    m_if.r_end_tx();
}
That’s it. I’m done.
The threads above use some more of the transaction-level member functions that save me a lot of time and keep my algorithm at a high level. They are:
- w_start_tx(), w_end_tx(): Synchronize the algorithm on the writing side of the interface
- r_start_tx(), r_end_tx(): Synchronize the algorithm on the reading side of the interface
- w_start_iter(), w_end_iter(): Synchronize a loop iteration on the writing side of the interface
- r_start_iter(), r_end_iter(): Synchronize a loop iteration on the reading side of the interface
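The loop bounds in writer() and reader() follow from the working set sizes and pointer steps, and they are easy to check with a little arithmetic. The helper below is a hypothetical plain C++ function, not part of the generated interface; it assumes one full pass over the 1024-word buffer where the last access must not run past the end.

```cpp
#include <cassert>
#include <cstddef>

// Number of accesses needed to sweep a working set of `working_set` words
// across `buffer_words` words when the pointer moves `step` words per access
// and the last access must not run past the end of the buffer.
inline std::size_t access_count(std::size_t buffer_words,
                                std::size_t working_set,
                                std::size_t step) {
    assert(step > 0 && working_set <= buffer_words);
    return (buffer_words - working_set) / step + 1;
}
```

With a working set of 2 and a step of 2, the writer needs (1024 - 2)/2 + 1 = 512 iterations, matching the 1024/2 bound. With a working set of 8 and a step of 1, the reader needs (1024 - 8)/1 + 1 = 1017 iterations, which is what the (1024-8)+1 bound evaluates to.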
So with the Interface Generator I defined a few parameters and instantly had a SystemC interface part loaded with powerful classes and functions, and using them I was able to describe the algorithm at the transaction level in just 58 lines. This same design in RTL would have taken well over 4600 lines!
Next time, we’ll conclude our series on modular interfaces with some final suggestions of things you should consider.






