<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Forte CynCity &#187; Methodology</title>
	<atom:link href="http://cyncity.forteds.com/category/methodology/feed/" rel="self" type="application/rss+xml" />
	<link>http://cyncity.forteds.com</link>
	<description></description>
	<lastBuildDate>Mon, 28 Mar 2011 15:38:43 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Modular Interfaces Part III: Custom Interfaces</title>
		<link>http://cyncity.forteds.com/2011/03/28/modular-interfaces-part-iii-custom-interfaces/</link>
		<comments>http://cyncity.forteds.com/2011/03/28/modular-interfaces-part-iii-custom-interfaces/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 15:38:43 +0000</pubDate>
		<dc:creator>Mike Meredith</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/?p=160</guid>
		<description><![CDATA[Continuing with our series on modular interfaces, today we&#8217;re going to talk about one of the most powerful features in Cynthesizer&#8211;the ability to create custom interfaces and use them at a high level.  These are more than just ready/valid handshakes or external memory interfaces like I showed you last time.  These are very [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing with our series on modular interfaces, today we&#8217;re going to talk about one of the most powerful features in Cynthesizer&#8211;the ability to create custom interfaces and use them at a high level.  These are more than just ready/valid handshakes or external memory interfaces like I showed you last time.  These are very complex streaming and buffer-style interfaces that transfer entire data structures.</p>
<p>Anyone who has designed a complex interface in RTL knows that one of the biggest chores is keeping track of the details, and it’s difficult to reuse the work you did before if any of the interface details change in any way.   If the data is a multidimensional array, the interface needs to know things like the row and column parameters so the array is formatted properly.  If the data is moving from one thread or module to another, the interface needs to understand how to synchronize between them.  And most interfaces require some kind of internal memory to temporarily store data in case the reads and writes happen at different speeds.</p>
<p>All in all, when you design interfaces in RTL you spend all your time and many lines of code getting all these details resolved.  There’s no way to just say “take the data from here and put it over there.”</p>
<p>But in Cynthesizer there is, and it does it in a way that drastically reduces the amount of code you have to write and the time you spend verifying it. The keys to this are interface generation and packaging the details up so that you can write your algorithm at the transaction level. Keep reading and I’ll show you some hard data on that.</p>
<p><span style="color: #0000ff;"><strong>Interfaces &#8216;R Us</strong></span><br />
So I want to create an interface, a real complicated one.  What do I do?</p>
<p>Well, in Cynthesizer there’s a tool called the Interface Generator.  The Interface Generator is a combination of really sophisticated IP and a graphical editing window where you can create any custom, fully parameterized interface.  This interface then becomes a packaged component (actually a set of C++ classes) I can use in a high-level SystemC design.  Here’s a rundown of familiar interface types you can create in the Interface Generator:</p>
<ul>
<li>Buffer
<ul>
<li>Transfers data between two modules or threads through a shared buffer.</li>
</ul>
</li>
</ul>
<ul>
<li> Line Buffer
<ul>
<li>Transfers an array and stores multiple rows of data for reading.</li>
</ul>
</li>
</ul>
<ul>
<li> Circular Buffer
<ul>
<li>Transfers data over a circular buffer where the reading and writing operations are tightly synchronized.</li>
</ul>
</li>
</ul>
<ul>
<li> Streaming
<ul>
<li>Transfers an array of streaming data over one or more clock cycles.</li>
</ul>
</li>
</ul>
<ul>
<li> Trigger/Done
<ul>
<li>A master/slave architecture with acknowledgement signals at the beginning and end of data transfer.</li>
</ul>
</li>
</ul>
<ul>
<li> P2P Stream
<ul>
<li>A general form of Forte’s CynWare point-to-point interface with features for creating specialized FIFOs.</li>
</ul>
</li>
</ul>
<p>For each type of interface, the Interface Editor window in Interface Generator shows you a diagram of the read/write structure, a diagram of the access pattern for the datatype being transferred, and many related parameters that you can define for your needs.  Best of all, the interface you create will have transaction-level functions like <em>get()</em>, <em>put()</em>, <em>x_done()</em>, <em>next_y()</em>, etc. that let your algorithm execute entire interface accesses with a single function call.</p>
<p><strong><span style="color: #0000ff;">A Real Example</span></strong><br />
Let’s take a look at creating an interface part and using it in a SystemC algorithm.  I will describe what I want the interface to do in <em>words </em>because it would be too daunting to describe in RTL.</p>
<p>I want an interface that does the following:</p>
<ul>
<li>Allows a reader and a writer to share an array of 16-bit unsigned integers</li>
</ul>
<ul>
<li>Allows the writer to write to the array in groups of values, working from the beginning of the array to the end</li>
</ul>
<ul>
<li> Allows the reader to read the array in groups of values working from the beginning of the array to the end</li>
</ul>
<ul>
<li> Coordinates the activities of the reader and writer so that there is no need to store the whole array in memory or registers</li>
</ul>
<p>Okay, that’s the basic function and it seems easy enough.  But for something like this to work in the real world you need to cover that stuff I mentioned before—the details.  And the details are things like this:</p>
<ul>
<li> Implement the internal buffer to be 1024 words long</li>
</ul>
<ul>
<li> The writer puts the first group of input values (let’s say 2 at a time) in the first two words of the buffer (words 0 and 1), puts the next group of inputs in the next two words of the buffer (words 2 and 3), etc.</li>
</ul>
<ul>
<li> The reader gets the first group of output values (let’s say 8 at a time) from the first eight words of the buffer (words 0-7), gets the next group by shifting by one word and reading the next eight words (words 1-8), etc.</li>
</ul>
<ul>
<li> When the writer has put inputs in the last two words of the buffer, circle around and put the next inputs in the first two words of the buffer</li>
</ul>
<ul>
<li> When the reader has grabbed outputs that go beyond the edge of the buffer, circle around and get the remaining values from the beginning of the buffer</li>
</ul>
<ul>
<li> Maintain input and output buffer pointers to make sure the correct buffer contents are read or written</li>
</ul>
<ul>
<li> Synchronize the interface at the beginning and the end of the algorithm’s execution</li>
</ul>
<ul>
<li> Synchronize the interface at the beginning and the end of any iterating loop in the algorithm</li>
</ul>
<ul>
<li> Fully handshake all interface accesses</li>
</ul>
<p>Suddenly this is not looking so easy.  If I were designing the old way I would have to create a buffer array, set up a bunch of pointer variables, make complicated address assignments to keep everything pointing to the right place, keep track of where I was in the memory to handle that circular wraparound requirement, and harness all reads and writes with some kind of ready/valid handshake.  Sigh.</p>
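<p>To make that bookkeeping concrete, here is a minimal software sketch of the pointer and wraparound logic described above. It is plain C++, not RTL and not Cynthesizer code; the class <em>ManualCircularBuffer</em>, its member names, and the helper <em>edge_window()</em> are hypothetical names chosen for illustration, and the model ignores handshaking and reader/writer synchronization entirely.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical software model of the hand-written bookkeeping.
// Sizes follow the example: 1024 words, write groups of 2,
// read windows of 8, read pointer adjustment of 1.
class ManualCircularBuffer {
public:
    static constexpr std::size_t kDepth = 1024;
    static constexpr std::size_t kWriteGroup = 2;
    static constexpr std::size_t kReadWindow = 8;
    static constexpr std::size_t kReadAdjust = 1;

    // Store one group of two values, then advance the write pointer,
    // wrapping back to word 0 after the last word.
    void write_group(std::uint16_t a, std::uint16_t b) {
        mem_[wr_ptr_] = a;
        mem_[(wr_ptr_ + 1) % kDepth] = b;
        wr_ptr_ = (wr_ptr_ + kWriteGroup) % kDepth;
    }

    // Copy the current eight-word window, wrapping past the buffer edge
    // if needed, then shift the read pointer by the adjustment.
    std::array<std::uint16_t, kReadWindow> read_window() {
        std::array<std::uint16_t, kReadWindow> out{};
        for (std::size_t k = 0; k < kReadWindow; ++k) {
            out[k] = mem_[(rd_ptr_ + k) % kDepth];
        }
        rd_ptr_ = (rd_ptr_ + kReadAdjust) % kDepth;
        return out;
    }

    std::size_t write_pointer() const { return wr_ptr_; }
    std::size_t read_pointer() const { return rd_ptr_; }

private:
    std::array<std::uint16_t, kDepth> mem_{};
    std::size_t wr_ptr_ = 0;
    std::size_t rd_ptr_ = 0;
};

// Demo: write the whole array once (word i holds the value i), then
// return the read window that straddles the edge of the buffer.
std::array<std::uint16_t, ManualCircularBuffer::kReadWindow> edge_window() {
    ManualCircularBuffer buf;
    for (std::size_t i = 0; i < ManualCircularBuffer::kDepth; i += 2) {
        buf.write_group(static_cast<std::uint16_t>(i),
                        static_cast<std::uint16_t>(i + 1));
    }
    assert(buf.write_pointer() == 0);  // the writer wrapped back to word 0
    for (std::size_t n = 0; n < 1020; ++n) {
        buf.read_window();             // slide the read pointer to word 1020
    }
    return buf.read_window();          // words 1020..1023, then 0..3
}
```

<p>Even this stripped-down model needs modulo arithmetic on every access, and it still has no notion of whether the words being read have actually been written yet.</p>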
<p>But now I’m using Cynthesizer.  Here’s what the Interface Editor window looks like for an interface that does exactly what I need:</p>
<p><a href="http://cyncity.forteds.com/wp-content/uploads/2011/03/ifed.jpg"><img class="alignnone size-thumbnail wp-image-208" title="Interface Editor Window" src="http://cyncity.forteds.com/wp-content/uploads/2011/03/ifed-150x150.jpg" alt="" width="75" height="75" /></a></p>
<address>[Click to enlarge]</address>
<p>Note first I have chosen to create a circular buffer interface because of the wraparound requirement.  In the “Reader” and “Writer” sections I can choose the size of the working set on each side of the interface.  I need an array of two input values on the writer side and an array of eight output values on the reader side, so you see<em> 2</em> and <em>8 </em>in those boxes.  Also, the output pointer needs to shift by one when read, so I entered <em>1</em> in the “Adjustment” box.  Now all that’s left is to define what is transferred over this interface.  In the “Parameter” section I’ve specified a <em>sc_uint</em> datatype, which is the SystemC standard for an unsigned integer, and given it a width of <em>16</em> in the “#Bits” box.  Then I sized a “1D”, 1024-word  buffer for the needed internal storage.  Finally, I specify <em>RAM2 </em>as the memory part I want the interface to use to implement the buffer.  I created RAM2 myself previously using Cynthesizer’s Memory Editor window.  Please read <a href="http://cyncity.forteds.com/2011/01/27/modint-part-ii-extmem/" target="_self">Part II of this blog series</a> for information on creating memories.  I saved my interface as a part named <em>my_if</em>.<br />
So “my_if” now exists as an interface part in my library.  Let’s put it to good use in an algorithm that does something like this:</p>
<p><a href="http://cyncity.forteds.com/wp-content/uploads/2011/03/blog3.jpg"><img class="alignnone size-full wp-image-205" title="High-Level Algorithm" src="http://cyncity.forteds.com/wp-content/uploads/2011/03/blog3.jpg" alt="" width="440" height="358" /></a></p>
<p>My module DUT needs to do the following:</p>
<ul>
<li> Read an input port din, calculate two values and put them in the interface.  I’ll create a thread called <em>writer()</em> to do this.</li>
<li> Get eight values from the interface, make a calculation with them, and write the result of that calculation to an output port dout.  I’ll create a thread called <em>reader() </em>to do this.</li>
</ul>
<p>First, I have to define a DUT module, declare the ports and threads and instantiate my interface:</p>
<pre>SC_MODULE( dut ) {
    cynw_p2p&lt; sc_uint&lt;16&gt; &gt;::in		din;
    cynw_p2p&lt; sc_uint&lt;16&gt; &gt;::out	dout;
    <span style="color: #ff0000;">my_if::direct&lt;&gt;	 		m_if;</span>
    …
    void writer();
    void reader();
    …
    SC_CTHREAD( writer, clk.pos() );
    SC_CTHREAD( reader, clk.pos() );
};</pre>
<p>In red I show an instance of my_if using the <em>::direct&lt;&gt;</em> template class.  This is one of the member classes in my_if, and it is used to represent an internal interface communicating directly between two threads.  Conversely, there is a <em>::chan&lt;&gt;</em> class to use when the interface is external and communicates between two modules.</p>
<p>Now in the DUT module we’ll define the writer() and reader() threads to interact with the interface.</p>
<pre>void writer()
{
    …
    <span style="color: #ff0000;">m_if.w_start_tx();</span>
    // 512 gets (working set of 2; fills 1024 length buffer)
    for( int i=0; i &lt; (1024/2); i++ ) {
        <span style="color: #ff0000;">m_if.w_start_iter();</span>
        vin = din.get();
        m_if[i*2] = vin;
        m_if[i*2+1] = vin+1;
        <span style="color: #ff0000;">m_if.w_end_iter();</span>
    }
    <span style="color: #ff0000;">m_if.w_end_tx();</span>
}</pre>
<pre>void reader()
{
    …
    <span style="color: #ff0000;">m_if.r_start_tx();</span>
    // 1017 puts (working set of 8 with read adjustment of 1)
    for( int i = 0; i &lt; (1024-8)+1; i++ ) {
        <span style="color: #ff0000;">m_if.r_start_iter();</span>
        v1 = m_if[i];      // first word of the current 8-word window
        v2 = m_if[i+8-1];  // last word of the current 8-word window
        val = (v1 + v2)/2;
        dout.put( val );
        <span style="color: #ff0000;">m_if.r_end_iter();</span>
    }
    <span style="color: #ff0000;">m_if.r_end_tx();</span>
}</pre>
<p>That&#8217;s it.  I&#8217;m done.</p>
<p>In red I have highlighted some more of the transaction-level member functions that save me a lot of time and keep my algorithm at a high level.  They are:</p>
<ul>
<li> <em>w_start_tx(), w_end_tx() </em>
<ul>
<li>Synchronizes the algorithm on the writing side of the interface</li>
</ul>
</li>
</ul>
<ul>
<li> <em>r_start_tx(), r_end_tx() </em>
<ul>
<li>Synchronizes the algorithm on the reading side of the interface</li>
</ul>
</li>
</ul>
<ul>
<li> <em>w_start_iter(), w_end_iter() </em>
<ul>
<li>Synchronizes a loop iteration on the writing side of the interface</li>
</ul>
</li>
</ul>
<ul>
<li> <em>r_start_iter(), r_end_iter() </em>
<ul>
<li>Synchronizes a loop iteration on the reading side of the interface</li>
</ul>
</li>
</ul>
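<p>To get a feel for what the iteration-level synchronization has to guarantee, here is a small single-threaded sketch of the dependency between the two loops. It is plain C++ and only an assumption about the ordering constraint implied by the example (a reader iteration cannot start until every word in its window has been written); it says nothing about how the generated interface actually implements this. The names <em>writes_needed()</em> and <em>simulate_lockstep()</em> are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kArrayWords = 1024;  // words in the shared array
constexpr std::size_t kWriteGroup = 2;     // words written per writer iteration
constexpr std::size_t kReadWindow = 8;     // words read per reader iteration

// Number of writer iterations that must have finished before reader
// iteration `read_iter` can safely read words read_iter .. read_iter+7.
constexpr std::size_t writes_needed(std::size_t read_iter) {
    return (read_iter + kReadWindow + kWriteGroup - 1) / kWriteGroup;
}

struct ScheduleStats {
    std::size_t writer_iters;
    std::size_t reader_iters;
    std::size_t reader_stalls;
};

// Step both sides one slot at a time. The writer completes one iteration
// per slot until it is done; the reader completes one iteration only if
// its whole window has been written, otherwise it stalls for that slot.
ScheduleStats simulate_lockstep() {
    const std::size_t total_writes = kArrayWords / kWriteGroup;      // 512
    const std::size_t total_reads = kArrayWords - kReadWindow + 1;   // 1017
    ScheduleStats s{0, 0, 0};
    while (s.reader_iters < total_reads) {
        if (s.writer_iters < total_writes) {
            ++s.writer_iters;
        }
        if (s.writer_iters >= writes_needed(s.reader_iters)) {
            ++s.reader_iters;
        } else {
            ++s.reader_stalls;
        }
    }
    assert(s.writer_iters == total_writes);  // the writer always finishes first
    return s;
}
```

<p>In this idealized schedule the reader only waits while its first window fills; after that the writer stays ahead because it produces two words per slot while the read window advances by one.</p>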
<p>So with the Interface Generator I defined a few parameters and instantly had a SystemC interface part loaded with powerful classes and functions, and using them I was able to describe the algorithm at the transaction level in just <strong>58</strong> lines.  This same design in RTL would have taken well over <strong>4600</strong> lines!</p>
<p>Next time, we&#8217;ll conclude our series on modular interfaces with some final suggestions of things you should consider.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2011/03/28/modular-interfaces-part-iii-custom-interfaces/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Modular Interfaces  Part I: Benefits</title>
		<link>http://cyncity.forteds.com/2011/01/06/modular-interfaces-part-i-benefits/</link>
		<comments>http://cyncity.forteds.com/2011/01/06/modular-interfaces-part-i-benefits/#comments</comments>
		<pubDate>Fri, 07 Jan 2011 00:15:59 +0000</pubDate>
		<dc:creator>Mike Meredith</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/?p=101</guid>
		<description><![CDATA[High-level synthesis (HLS) is just that&#8211;high level&#8211;a design approach that lets you work at a level above having to wade through pins and wires and state machines.  There are many factors to consider in choosing an HLS tool, but one of them is so fundamental that it often gets overlooked.
It&#8217;s interfaces. I&#8217;m not just [...]]]></description>
			<content:encoded><![CDATA[<p>High-level synthesis (HLS) is just that&#8211;high level&#8211;a design approach that lets you work at a level above having to wade through pins and wires and state machines.  There are many factors to consider in choosing an HLS tool, but one of them is so fundamental that it often gets overlooked.</p>
<p>It&#8217;s interfaces. I&#8217;m not just talking about declaring ports and hooking them up. I&#8217;m talking about high-level, modular interfaces that encapsulate very complex I/O protocols into easy-to-use function calls. Most of the tools out there offer some kind of interface solution, but we think they miss the mark in giving you what you need to be successful.  Some let you describe I/O behavior with standard ANSI C only to add the actual pin-level interface details later in synthesis (which, unfortunately, will be the first time you can actually verify the interface).  Some offer only off-the-shelf interface IP but then provide no means of creating custom interfaces.  We developed Cynthesizer with all of this in mind.  Cynthesizer designs are written in SystemC, where the clock and pin level activity of an interface can be simulated before the tool is run.  And Cynthesizer has a family of standard interface IP in addition to an editor where you can create any interface you want.</p>
<p>This is the first in a four-part series on designing and working with modular interfaces in Cynthesizer. Today I&#8217;ll be detailing the benefits of modular interfaces. In the coming weeks we&#8217;ll be talking about how useful they are with external memories, we&#8217;ll detail some of the techniques we use to build them, and we&#8217;ll show what exactly is available interface-wise from Forte.</p>
<p><span style="color: #0000ff;"><strong>What Is A Modular Interface?</strong></span><br />
When we talk to customers or prospects we use the word &#8220;transaction&#8221; a lot. And this is probably the best word to use in describing a modular interface. A transaction, in essence, is what is being communicated across a modular interface. A modular interface can be, for example, a burst write to a standard AHB bus model. Or it can be an exchange of data that follows a strict protocol in a fixed number of clock cycles. Or it can simply be the writing of a vector or datatype value with ready/valid handshaking. The modular interface combines the I/O protocol of your transaction (i.e. ports) with the actual functionality of the transaction. Whatever the case, to you as a user it is a single function call alongside your other high-level code.</p>
<p><strong><span style="color: #0000ff;">Why Use Them?</span></strong><br />
There are many benefits to designing this way. Your code is much simpler and you can describe an algorithm in fewer total lines. Your design effort can concentrate solely on the core of the algorithm at transaction level, all while retaining the flexibility to write custom interface protocols. Your connections become easier because all the pins involved in the interfaces are encapsulated in high-level channels. But the real benefit is in verification.  Since handshaking is built into a modular interface, all you need is one testbench to verify all of the RTL architectures your HLS tool produces.  Also, as mentioned above, Cynthesizer interfaces correctly simulate the interaction of their clocks and pins at the behavioral level, before you run any synthesis.  Remember that SystemC supports both TLM and pin-level I/O configurations.  We can&#8217;t state enough how critical this is to your HLS success&#8211;once your interface is verified behaviorally, it stays verified all the way down to gates or anywhere you reuse it.</p>
<p>Without modular interfaces you would spend a lot of time down in the trenches of I/O protocol design. You would have to declare whatever ports are needed for your transaction, make sure the direction of the ports is correct, declare signals in the parent module to connect them with, and then make sure all the connections are correct. But then comes the hard part: writing some kind of handshaking or acknowledgement scheme so that you only read input data when it is valid and only write output data when the downstream module is ready for it. This means a manual assertion of a ready signal, followed by a loop that sits and waits on a valid signal, followed by reading the data and storing it properly. And remember, you will repeat this for every interface you have.</p>
<p>Just look at the difference of this code with modular interfaces:</p>
<pre style="padding-left: 30px;">in_data = inp.get();
out_data = my_function( in_data );
outp.put( out_data );</pre>
<p>as compared to the same code written <em>without </em>modular interfaces:</p>
<pre style="padding-left: 30px;">inp_rdy = 1;
do {
    wait();
} while( !inp_vld );
in_data = inp.read();
inp_rdy = 0;
out_data = my_function( in_data );
do {
    wait();
} while( !outp_rdy );
outp.write( out_data );
outp_vld = 1;
wait();
outp_vld = 0;</pre>
<p>And this is only for a basic ready/valid handshake.  As the interface becomes more complex, the difference in the amount of code becomes more dramatic.</p>
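<p>For readers who want to see the rule behind that handshake in isolation, here is a minimal cycle-by-cycle sketch in plain C++ (no SystemC). It assumes the usual ready/valid convention that a value moves only on a cycle where both signals are high. The struct <em>Link</em>, the function <em>simulate()</em>, and the <em>ready_every</em> parameter are hypothetical names used only for this illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-in for the three wires of a ready/valid link.
struct Link {
    bool vld = false;  // driven by the sender: data is meaningful
    bool rdy = false;  // driven by the receiver: data can be accepted
    int data = 0;
};

// Run the link for `cycles` clock cycles. The sender offers the values
// of `to_send` in order; the receiver is ready only on every
// `ready_every`-th cycle, which models a slow consumer.
std::vector<int> simulate(const std::vector<int>& to_send,
                          std::size_t cycles,
                          std::size_t ready_every) {
    assert(ready_every != 0);
    Link link;
    std::vector<int> received;
    std::size_t next = 0;
    for (std::size_t cycle = 0; cycle < cycles; ++cycle) {
        // Sender holds valid high while it still has something to send.
        link.vld = next < to_send.size();
        if (link.vld) {
            link.data = to_send[next];
        }
        // Receiver raises ready only on the cycles it can accept data.
        link.rdy = (cycle % ready_every) == 0;
        // A transfer happens only when valid and ready are both high.
        if (link.vld && link.rdy) {
            received.push_back(link.data);
            ++next;
        }
    }
    return received;
}
```

<p>Calling <em>simulate({1, 2, 3}, 9, 3)</em> delivers all three values in order even though the receiver accepts data only one cycle in three; nothing is lost or duplicated while the sender waits. That guarantee is what a <em>get()</em> or <em>put()</em> call packages up for you.</p>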
<p><strong><span style="color: #0000ff;">How Does SystemC Help?</span></strong><br />
We&#8217;ve touched on it already, but SystemC really lends itself to modular interface design.  A SystemC modular interface has two sides: a tidy, transaction-level side like the <em>.get()</em> and <em>.put()</em> calls you see above; and a rougher, pin-level side where the gritty details of the cycle-accurate protocol are defined.  Some people criticize SystemC because the OSCI synthesizable subset requires modules to have pin-level ports like <em>sc_in&lt;&gt;</em> and <em>sc_out&lt;&gt;</em>, but in this case it&#8217;s an advantage.  While you as a designer work at a high level, it is the pin-level ports that are presented to your HLS tool.  With the pins broken out, the tool can more optimally combine your interface protocol and datapath with the control FSM.  Using modular interfaces does not mean sacrificing quality or reusability.</p>
<p>Next time, we&#8217;ll get into designing with external memories&#8211;a specific case where modular interfaces save loads of time and effort.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2011/01/06/modular-interfaces-part-i-benefits/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SystemC Hits A High Note</title>
		<link>http://cyncity.forteds.com/2010/07/14/systemc-hits-a-high-note/</link>
		<comments>http://cyncity.forteds.com/2010/07/14/systemc-hits-a-high-note/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 19:54:08 +0000</pubDate>
		<dc:creator>Brett Cline</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[General/Misc.]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/?p=73</guid>
		<description><![CDATA[Two weeks ago in Yokohama, several companies hosted SystemC 2010 Japan. Over 200 design and verification engineers, architecture and algorithm engineers, EDA specialists, managers, and software designers attended the event.  Presentations were made by EDA vendors and users alike, with leading companies like Renesas, Ricoh, and Sony discussing their experiences with SystemC, ESL and [...]]]></description>
			<content:encoded><![CDATA[<p>Two weeks ago in Yokohama, several companies hosted SystemC 2010 Japan. Over 200 design and verification engineers, architecture and algorithm engineers, EDA specialists, managers, and software designers attended the event.  Presentations were made by EDA vendors and users alike, with leading companies like Renesas, Ricoh, and Sony discussing their experiences with SystemC, ESL and high-level synthesis. By all reports it was a great success (on a scale of 1 to 5, almost 75% of the attendees gave the seminar a rating of 4 or 5!) and speaks to the continued momentum that SystemC has in the market.</p>
<p>In a survey, 65% of the attendees said they have used or are now using SystemC and SystemC tools in their current flow. More than 52% said that they are using high-level synthesis (HLS) &#8212; the most of any category. So what does all of this tell us?</p>
<p>Well, it’s clear that SystemC and SystemC HLS have moved well beyond the trial phase. Some of the most respected companies in the world were discussing their methodologies in use today and their vision for the next couple of years. The user community for SystemC design is now setting the direction. Japan has been a leader in SystemC HLS since 2000, and the attendees made it clear they intend to continue the push forward.</p>
<p><span style="color: #0000ff;"><strong>Why SystemC?</strong></span><br />
This is a question that never seems to get old.  Maybe it&#8217;s because for years EDA marketeers have told us <a href="http://www.edadesignline.com/showArticle.jhtml?articleID=218400038">ANSI-C is all you&#8217;ll ever need</a>! The fact is that ANSI-C cannot handle real design and verification needs and the designers in Japan know it. Those surveyed at SystemC 2010 Japan said their SystemC usage falls into these 5 categories (multiple answers allowed):</p>
<ul>
<li> 52.7% HLS</li>
<li>44.8% Functional verification</li>
<li> 33.5% Virtual platform / software design</li>
<li> 30.5% Architecture design</li>
<li> 1.5% Other</li>
</ul>
<p>So why use SystemC instead of ANSI-C?  It’s true that ANSI-C and SystemC can both be synthesized, but that’s not where the problem lies. The problem is in the modeling and design environment that you need to build. Sure you can synthesize a serial algorithm in ANSI-C &#8212; but real designs have hierarchy, concurrency, control and complex interfaces where data is exchanged in parallel. What about modeling a virtual platform and the design architecture? You could add non-standard language extensions or develop your own event manager (effectively creating a custom simulator), but is that what you want to spend your time doing?  Let’s face it &#8212; ANSI-C just doesn’t cut it. SystemC was developed specifically to rectify ANSI-C’s shortcomings for hardware design.</p>
<p>SystemC is a C++ class library that is “hardware aware,” and there’s a lot of online material to help you learn about its advantages. Forte’s CTO, John Sanguinetti, has written a couple of articles about the topic. I’d recommend reading <a href="http://www.eetimes.com/design/other/4199781/Transitioning-from-C-C--to-SystemC-in-high-level-design">“Transitioning from C/C++ to SystemC in high-level design”</a> that appeared recently at Embedded.com, as well as <a href="http://cyncity.forteds.com/2010/05/24/i-can%E2%80%99t-imagine-designing-hardware-in-c-%E2%80%A6/">“I can&#8217;t imagine designing hardware in C&#8230;”</a> that appeared right here in Forte’s <a href="http://cyncity.ForteDS.com">CynCity blog</a>.  Read these and you’ll see why ANSI-C doesn’t work for anything more than trivial hardware design.</p>
<p>Want more?  There was <a href="http://www2.dac.com/technical+program.aspx?event=38&amp;topic=10">a panel about it at DAC this year</a>!  And while they probably won’t admit it, even the EDA vendors who trashed SystemC a year ago are starting to sing a different tune.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2010/07/14/systemc-hits-a-high-note/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I can’t imagine designing hardware in C …</title>
		<link>http://cyncity.forteds.com/2010/05/24/i-can%e2%80%99t-imagine-designing-hardware-in-c-%e2%80%a6/</link>
		<comments>http://cyncity.forteds.com/2010/05/24/i-can%e2%80%99t-imagine-designing-hardware-in-c-%e2%80%a6/#comments</comments>
		<pubDate>Tue, 25 May 2010 01:27:03 +0000</pubDate>
		<dc:creator>John Sanguinetti</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[General/Misc.]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/?p=68</guid>
		<description><![CDATA[I recently had a conversation with an old friend from my graduate school days who, when I told him about Forte and high-level synthesis, said, “I can’t imagine designing hardware in C, or any other programming language.” This guy is one of the best programmers I know &#8212; he wrote the user interface code for [...]]]></description>
			<content:encoded><![CDATA[<p>I recently had a conversation with an old friend from my graduate school days who, when I told him about Forte and high-level synthesis, said, “I can’t imagine designing hardware in C, or any other programming language.” This guy is one of the best programmers I know &#8212; he wrote the user interface code for the Apollo workstation in 1981, one of the first graphical user interfaces &#8212; so I take his opinions seriously.</p>
<p>He went on to say that good hardware had such a high level of parallelism, he didn’t see how a compiler could produce that from inherently sequential code like what is written in a standard, general-purpose programming language. In his words, “though I wouldn’t write in assembly language any more, the output of gcc is crap. If it can’t do a good job with sequential code, why should I believe that a compiler can do a good job producing parallel code?”</p>
<p>My answer to him was along the lines of, “well, it really isn’t all that hard, and our customers have produced lots of real designs using Cynthesizer,” but that wasn’t a very satisfying answer. Here is the answer that I didn’t have time to give my friend:</p>
<ol>
<li>You’re right, you can’t use C. We need to define what we mean by a programming language. Verilog (and VHDL) is a hardware description language, not a “hardware programming language.” However, it is also a simulation language, with a lot of similarities to <em>Simula </em>(one of the groundbreaking early simulation languages). We’ve been designing hardware in a simulation language for 20 years, so the basic idea is not all that radical.</li>
<li>Though Verilog (and VHDL) is a simulation language, it is not a particularly rich one. It is object-oriented (modules are classes, and instantiations are objects), but it lacks a lot of features that object-oriented programming languages provide. What it does have, however, are bit-wise operators and data types, and a standard module class which has ports, threads, and submodules.</li>
<li>Actually, Verilog (and VHDL) is capable of representing hardware at a higher level of abstraction than RTL, but it has one big drawback: it isn’t used for algorithm development, implementation or publication. Really, the only language that doesn’t have that drawback is C (or C++). Java is a distant second, and the other popular languages aren’t really candidates (M?, Fortran?, Perl?, Python?, Ruby on Rails?, …).</li>
<li>This is where SystemC comes in. C++ is a rich object-oriented language which is nearly universal and, since it is a superset of C, the vast majority of algorithms which are published are in C++. Being an object-oriented language, C++ has the ability to support a hierarchy of layers of abstraction, implemented by libraries of classes. SystemC is a library of classes which implements a “hardware abstraction layer” in   C++. You can write the same code in SystemC/C++ as you would write in Verilog, at the netlist level or at RTL. That’s not particularly interesting &#8212; algorithms aren’t written at RTL in any language. What’s important is that you can also write code at a higher level than RTL, using C++ constructs. And, you can take a C algorithm, stick it unchanged in a SystemC module (that is, make it a method in an SC_MODULE class), and you have a starting point for hardware implementation.</li>
<li>Now that we have a structure for describing concurrent processes (a hierarchy of modules is a collection of concurrent processes), we’ve got something that a compiler (or high-level synthesis tool) can work with. Extracting fine-grain parallelism from a sequential process is easy (at least conceptually) &#8212; compilers have been doing that for years. Managing concurrency between processes requires a bit more work, but there is enough information in the source code for the compiler to do a good job.</li>
<li>Finally, the world of parallel vector supercomputers spawned a rich set of code transformations in the compiler world which could expose possible parallelism in sequential loops. Many of these can be applied to hardware synthesis. The end result is that a great deal of parallelism can be realized from a hardware description in SystemC.</li>
<li>So, implementing parallel hardware from a SystemC description can be done effectively. Generally, this part of the problem falls into the category of <em>scheduling</em>. That is, creating an optimal (by some definition) state machine, or set of state machines, to implement the described functions. Scheduling is a critical part of the high-level synthesis process. But that is only part of the problem. The other part is to implement the desired schedule in the smallest amount of hardware possible. That generally is called <em>allocation</em>, and that is the subject for another article.</li>
</ol>
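<p>As a rough illustration of the loop transformations mentioned above, here is a minimal plain C++ sketch (not Cynthesizer output, and not tied to any particular tool). The function names are hypothetical. The first loop carries a dependency through a single accumulator, so each addition must wait for the previous one. The second is unrolled by four with independent partial sums, which is the kind of rewrite that lets a scheduler place four additions in the same clock cycle. Integers are used so that reordering the additions cannot change the result.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sequential form: every addition depends on the previous one
// (a loop-carried dependency through `sum`).
std::int32_t sum_sequential(const std::int32_t* data, std::size_t n) {
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

// Transformed form: unrolled by four with independent partial sums.
// The four additions in the loop body do not depend on each other,
// so they could be scheduled in parallel. Assumes n is a multiple of 4.
std::int32_t sum_unrolled4(const std::int32_t* data, std::size_t n) {
    assert(n % 4 == 0);
    std::int32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (std::size_t i = 0; i < n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    // Final reduction as a balanced tree rather than a chain.
    return (s0 + s1) + (s2 + s3);
}
```

<p>Both functions return the same value for the same input; only the dependency structure differs, and that structure is what a scheduler exploits.</p>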
<p>The conclusion is that you not only can imagine creating hardware automatically from a C-related programming language, it is being done successfully by today’s high-level synthesis programs. At least, we can say it is being done successfully by Cynthesizer. It has taken quite a bit of time and effort to develop the synthesis technology to the point we are at today, where the vast majority of hardware designs can be done successfully  with high-level synthesis. This will have a profound effect on how people do hardware design in the coming years.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2010/05/24/i-can%e2%80%99t-imagine-designing-hardware-in-c-%e2%80%a6/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What to expect from high-level synthesis</title>
		<link>http://cyncity.forteds.com/2010/04/13/what-to-expect-from-high-level-synthesis/</link>
		<comments>http://cyncity.forteds.com/2010/04/13/what-to-expect-from-high-level-synthesis/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 23:35:44 +0000</pubDate>
		<dc:creator>Brett Cline</dc:creator>
				<category><![CDATA[FAQ]]></category>
		<category><![CDATA[General/Misc.]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/?p=55</guid>
		<description><![CDATA[For nearly a decade Forte Design Systems has been working with customers to design and implement high-level synthesis (HLS) strategies and methodologies. These customers ask a lot of questions as they try to determine if HLS is right for them. But there&#8217;s one question they almost never ask: “What should I expect from high-level synthesis?”
They [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;">For nearly a decade Forte Design Systems has been working with customers to design and implement high-level synthesis (HLS) strategies and methodologies. These customers ask a lot of questions as they try to determine if HLS is right for them. But there&#8217;s one question they almost never ask: “What should I expect from high-level synthesis?”</p>
<p style="text-align: left;">They don&#8217;t ask it because it seems too obvious, or they already think they know what to expect.  But the answer is not so obvious. Pose the question to yourself right now before reading on&#8230; what was <em>your</em> answer?</p>
<p style="text-align: left;">If you chose “productivity benefits” or some form of “faster time-to-RTL,” you’d be right&#8211;well, sort of.  There are several values and we’ll discuss five of them here.</p>
<p style="text-align: left;"><span style="color: #0000ff;"><strong>Productivity</strong></span><br />
Certainly HLS promises and delivers great productivity improvements over RTL. Raising the abstraction level of hardware design means that the functional intent of the design can be expressed in fewer lines of code. Fewer lines of code should, in theory, take less time to write. The HLS tool then does a lot of the work in transforming the high-level code into RTL, creating the control FSM and the datapath. Our customers often see productivity improvements of 2-10X over RTL on real designs. Roughly speaking, that works out to somewhere between 300,000 and 1,000,000 gates per engineer-year. Not bad!</p>
<p style="text-align: left;"><strong><span style="color: #0000ff;">Predictable Timing Closure</span></strong><br />
It&#8217;s usually unexpected, but during the RTL creation process users find that HLS has great value in timing closure. State-of-the-art HLS tools extract information from the user&#8217;s technology library (.lib) and, given the target clock frequency, characterize the performance and cost of each part.  Based on user constraints, the tool then chooses which parts to use when building the FSM and datapath. Different tools do this differently, of course, with varied levels of accuracy. Forte’s Cynthesizer uses an internal synthesis engine to estimate the area, speed, and power of each part at the gate level, and then efficiently packs the operators and control into each clock cycle.  The result is an RTL architecture that gets through logic synthesis and place-and-route smoothly, because Cynthesizer scheduled it with the target technology and process in mind. Timing closure problems are virtually eliminated, saving months of time and effort.</p>
<p style="text-align: left;"><span style="color: #0000ff;"><strong>Less Time In Verification</strong></span><br />
As many a marketing person has stated, “verification is greater than two-thirds of the RTL design process.” Sure, that’s probably true, but the reason is the RTL design process itself. Think about it. RTL designs have thousands of lines of code, are insanely complex, and the code doesn’t get to the verification team until the end of the design schedule, leaving little time to verify the design.  And with incredibly slow RTL simulation times, it&#8217;s no surprise that RTL verification takes so long.</p>
<p style="text-align: left;">What if the verification team had a super-fast, accurate model of the design months earlier in the design process, a model where the functionality, architecture, and even the interfaces are fully specified and “simulatable”?  A SystemC HLS model can provide just that&#8211;an early, accurate, fast, executable specification. Cynthesizer can then compile this model and produce a unique RTL architecture for each set of user directives (more on that later). More importantly, the RTL design can be automatically validated using the same SystemC testbench, significantly reducing the verification effort. There are even formal tools that can check equivalency between the high-level SystemC and the RTL. The result? An improved design and verification flow that lets you start verification much earlier in the process and reduces the overall design time.</p>
<p style="text-align: left;"><strong><span style="color: #0000ff;">True Design Reuse</span></strong><br />
Everybody reuses RTL, but the same comments about reuse come up every time: old RTL designs are difficult to modify, they have limited shelf life when changing clock speeds, and a lot of area is wasted when changing geometries because of inefficient silicon use. HLS provides true design reuse by eliminating the majority of implementation details from the model. A SystemC HLS model doesn’t include any technology-specific details of the design. It is focused on the functional aspects and relies on the HLS tool to add the implementation details.  Because of this, many of the issues mentioned above in reusing RTL are never encountered:</p>
<ul style="text-align: left;">
<li>Modification of designs is only necessary if the function changes, and with fewer lines of code, the functional intent is much clearer.</li>
<li>The shelf life of the IP is longer because it is not tied to a specific process and speed. The HLS tool can easily retarget the design for different technologies (even FPGA!) simply by changing the directives.</li>
</ul>
<p><span style="color: #0000ff;"><strong>Quality of Results</strong></span><br />
Of course, no amount of productivity and verification improvements will really help if you can’t meet your performance, power and silicon area targets.  Designers looking to HLS for high-volume production designs are naturally skeptical that they can get better results with an HLS tool than by hand coding RTL. This is not unreasonable, and some of the other HLS tools are pretty good on productivity but force users to sacrifice QoR. This is simply not true for Cynthesizer. You can expect to achieve just as good a result using Cynthesizer as you would if you wrote RTL by hand. This is not done by waving a magic wand. The user has to apply engineering skill to organize the design and drive the tool to get the best results, but it’s been proven on numerous production chips that Cynthesizer lets users take HLS all the way to the finish line.</p>
<p style="text-align: left;">So, there you have it.  If you are looking to adopt HLS design, you now know what you can, and should, expect.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2010/04/13/what-to-expect-from-high-level-synthesis/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hierarchy in SystemC: Why it&#8217;s so important for HLS!</title>
		<link>http://cyncity.forteds.com/2010/03/02/hierarchy-in-systemc-why-its-so-important-for-hls/</link>
		<comments>http://cyncity.forteds.com/2010/03/02/hierarchy-in-systemc-why-its-so-important-for-hls/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 04:41:44 +0000</pubDate>
		<dc:creator>Mike Meredith</dc:creator>
				<category><![CDATA[General/Misc.]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/2010/02/28/hierarchy-in-systemc-why-its-so-important-for-hls/</guid>
		<description><![CDATA[Last time, I looked at the verification advantages of using SystemC for HLS.  This time, I want to explore another important capability of SystemC that makes it far superior to ANSI C for hardware design.
I&#8217;m talking about structural hierarchy. SystemC supports hierarchy while ANSI C does not.  Structural hierarchy means submodules, connected together [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://cyncity.forteds.com/2010/02/22/need-another-reason-to-use-systemc-for-hls-the-verification-advantage-is-the-best-of-all/">Last time</a>, I looked at the verification advantages of using SystemC for HLS.  This time, I want to explore another important capability of SystemC that makes it far superior to ANSI C for hardware design.</p>
<p>I&#8217;m talking about structural hierarchy. SystemC supports hierarchy while ANSI C does not.  Structural hierarchy means submodules, connected together and executing concurrently. Use of hierarchy is the traditional mainstay of hardware designers for breaking down complex designs into a group of smaller, more manageable designs that are easier to design and verify.</p>
<p>Here are some of the advantages that come from SystemC&#8217;s support for hierarchy:</p>
<p><strong><span style="color: #0000ff;">Unit-Level Verification</span></strong><br />
It&#8217;s easier to build a testbench that can stimulate all the critical corner cases in a design when it&#8217;s a smaller block.  Also, signoff requirements like code coverage are easier to meet when you have controllability at the smaller block boundary.  Making sure that every line of source code is covered requires a designer to find the correct sequence of input values to exercise every code branch.  With a small block you don&#8217;t have to try to trick the upstream blocks into producing the right stimulus; you can just have the testbench present whatever stimulus you need.  With large blocks, finding that sequence may be impossible altogether.</p>
<p><strong><span style="color: #0000ff;">Connections and Interfaces</span></strong><br />
Working at the submodule level allows designers to isolate the complex interfaces and channel connections between them.  ANSI C HLS tools, as mentioned in the previous post, do not accurately represent concurrent hardware. This can really cause problems designing and verifying interfaces because the handshaking and transactions all occur simultaneously. The simulation semantics of SystemC let you examine interfaces and channels at the pin-level and make it straightforward to code your own interfaces with whatever protocol you need. SystemC-based HLS also allows you to encapsulate the details of a particular protocol in a set of classes and easily switch from one interface to another without changing your module&#8217;s source code&#8211;but that&#8217;s the topic of a future posting!</p>
<p><strong><span style="color: #0000ff;">Architecture Design</span></strong><br />
In HLS design there is a step between algorithmic design and RTL generation: architectural design. This step takes the untimed algorithmic code and decides how major portions will be implemented in hardware to best meet QoR requirements, e.g., whether a particular array should be an external memory or be flattened into registers. Some of this is done through synthesis directives or constraints, but a handy tactic is to partition a section of code into a hierarchical submodule.  This tactic, and its verification advantage, is discussed further in <a href="http://www.edadesignline.com/howto/222900653;jsessionid=5YNCEVM025M2FQE1GHPCKH4ATMY32JVN">John Sanguinetti&#8217;s recent EETimes EDA DesignLine article</a>.</p>
<p><strong><span style="color: #0000ff;">Faster Runtimes</span></strong><br />
Everything will run faster with smaller modules.  The blocks will get through behavioral synthesis scheduling more quickly and the generated RTL will run through logic synthesis tools faster.  The block-level testing will go faster because smaller blocks compile and run in simulators faster as well.</p>
<p><span style="color: #0000ff;"><strong>Teams of Multiple Designers</strong></span><br />
Design teams usually work in parallel, with multiple designers developing and verifying different blocks at the same time. Hierarchy makes design maintenance easier, makes it possible to keep a consistent set of code for HLS and verification, and keeps designers from stepping on each other&#8217;s toes.</p>
<p><span style="color: #0000ff;"><strong>Reuse</strong></span><br />
When the next-generation design is derived from your current design, having the design broken into manageable blocks makes it much easier to reuse some of those blocks without an entirely new verification effort. It also improves your ability to figure out which blocks will have to be changed, or how to fit together a combination of old, new and modified blocks to meet the new requirements.</p>
<p>SystemC supports our familiar friend&#8211;structural hierarchy&#8211;and allows you to use many of the same techniques you are accustomed to for managing the complexity of design and verification tasks. Gee, using SystemC for HLS is just like having a real hardware language, only with higher levels of abstraction available.  No wait, it&#8217;s not like that&#8211;it&#8217;s <em>exactly</em> that!  And that&#8217;s what you really need for practical high-level hardware design.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2010/03/02/hierarchy-in-systemc-why-its-so-important-for-hls/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Need another reason to use SystemC for HLS?  The verification advantage is the best of all.</title>
		<link>http://cyncity.forteds.com/2010/02/22/need-another-reason-to-use-systemc-for-hls-the-verification-advantage-is-the-best-of-all/</link>
		<comments>http://cyncity.forteds.com/2010/02/22/need-another-reason-to-use-systemc-for-hls-the-verification-advantage-is-the-best-of-all/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 21:42:15 +0000</pubDate>
		<dc:creator>Mike Meredith</dc:creator>
				<category><![CDATA[General/Misc.]]></category>
		<category><![CDATA[Methodology]]></category>

		<guid isPermaLink="false">http://cyncity.forteds.com/2010/02/22/need-another-reason-to-use-systemc-for-hls-the-verification-advantage-is-the-best-of-all/</guid>
		<description><![CDATA[The &#8220;language war&#8221; in high-level synthesis (HLS) design has been raging for a while now.  You&#8217;ve probably read a lot of online publications touting the advantages of using SystemC over ANSI C to design at an abstract level. If you were to take what everyone is saying and boil it down to a few [...]]]></description>
			<content:encoded><![CDATA[<p>The &#8220;language war&#8221; in high-level synthesis (HLS) design has been raging for a while now.  You&#8217;ve probably read a lot of online publications touting the advantages of using SystemC over ANSI C to design at an abstract level. If you were to take what everyone is saying and boil it down to a few key points, they might sound something like this:</p>
<ul>
<li>ANSI C is a sequential language.</li>
<li>ANSI C cannot execute two subroutines or functions concurrently.</li>
<li>ANSI C executes your code in a single flow, one line after another.</li>
<li>ANSI C-based HLS tools either limit themselves to single-block designs or provide a proprietary mechanism to mimic concurrency.</li>
<li>ANSI C HLS tools don&#8217;t give you a way to accurately simulate what a real piece of hardware does.</li>
</ul>
<ul>
<li>SystemC, on the other hand, is a standardized C++ class library that supports multiple concurrent processes.</li>
<li>SystemC supports hierarchy and modules.</li>
<li>SystemC allows communication between those modules at the transaction or pin level.</li>
<li>SystemC gives hardware designers the ability to tackle the very complex design and interface tasks they face every day.</li>
</ul>
<p>Notice first I said &#8220;C++ class library&#8221; instead of &#8220;language.&#8221;  That&#8217;s because SystemC is not a language of its own: it&#8217;s a family of C++ classes specifically geared toward hardware design constructs like modules, ports, concurrency, clocks, resets and channels.</p>
<p>The list above is short, but to see what far-reaching consequences these points have for hardware designers, consider the following:</p>
<p>Let&#8217;s say you use an ANSI C tool that can only work with single blocks.  If you have a multiple block system, you can produce each block one at a time.  But there&#8217;s a catch: how do you verify that system?  According to the tool, that&#8217;s your problem.  It&#8217;s up to you to somehow stitch these blocks together in RTL and do all the verification in RTL.</p>
<p>If you use an ANSI C tool that mimics concurrent simulation using proprietary libraries or other non-standard techniques, you are locked into that proprietary flow.  Want to drive yourself crazy?  Just try using this flow to create some IP cells and distribute them to your customers.  Those customers will have to use (and own) the same tool just to run a simulation.</p>
<p>The proprietary library approach also takes liberties to get your ANSI C code to act like real hardware.  To emulate the dataflow of your design, these tools will typically have you partition the design so each block is a subroutine, and require you to call proprietary APIs inside the subroutines to manage that dataflow. One common approach is to have you conditionally execute the algorithms inside the subroutines depending on the availability of data reported by these API calls.</p>
<p>Sound messy?  It is.  And when you consider that it takes an API call (also proprietary) to determine if there is data in the channel, then you start to understand how these libraries are rudimentary static simulators at best.</p>
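<p>To make the pattern concrete, here is a minimal plain C++ sketch of that style. It does not reproduce any vendor&#8217;s library: <code>Channel</code>, <code>has_data()</code>, <code>block_scale</code>, <code>block_offset</code> and <code>run_pipeline</code> are all hypothetical names invented for illustration. Each block is a subroutine that runs its algorithm only when the channel API reports that data is available, and a sequential driver keeps calling the blocks in a fixed order until none of them can make progress.</p>

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for a proprietary channel API (not a real library).
struct Channel {
    std::queue<int> fifo;
    bool has_data() const { return !fifo.empty(); }
    void put(int v) { fifo.push(v); }
    int get() {
        assert(has_data());  // callers must poll has_data() first
        int v = fifo.front();
        fifo.pop();
        return v;
    }
};

// Each "block" is a subroutine that executes its algorithm only if the
// API reports input data. Returns true when it did some work.
bool block_scale(Channel& in, Channel& out) {
    if (!in.has_data()) return false;
    out.put(in.get() * 2);
    return true;
}

bool block_offset(Channel& in, Channel& out) {
    if (!in.has_data()) return false;
    out.put(in.get() + 1);
    return true;
}

// Sequential driver: the blocks never run concurrently. They are called
// one after another, in a fixed order, until no block can make progress.
std::vector<int> run_pipeline(const std::vector<int>& inputs) {
    Channel a, b, c;
    for (int v : inputs) a.put(v);
    bool progressed = true;
    while (progressed) {
        bool scaled = block_scale(a, b);
        bool shifted = block_offset(b, c);
        progressed = scaled || shifted;
    }
    std::vector<int> results;
    while (c.has_data()) results.push_back(c.get());
    return results;
}
```

<p>Note that the order in which the driver calls the blocks is fixed by the programmer rather than by events in the design, which is why this style only mimics concurrency.</p>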
<p>If you have a fairly simple design that has a sequential algorithm, can be controlled by a single finite state machine, processes the same number of inputs and outputs every cycle, and uses a limited set of interfaces to communicate, then ANSI C may work fine for you.  But the truth is that real projects aren&#8217;t that simple.  Complex designs have multiple processes operating concurrently, they communicate with things like external memories and they send data through interfaces that must be customizable.</p>
<p>If you use the SystemC classes for HLS design, you have a real built-in, event-driven simulator to support your verification efforts. SystemC supports multiple modules that can execute concurrently, can share data in memories, and can synchronize their execution using real signal-level protocols.  In other words, it allows you to attack the biggest design problems by breaking them down into multiple blocks and connecting them with channels that follow a protocol.</p>
<p>And with SystemC, you can easily simulate these blocks together to make sure everything is working correctly.</p>
<p>Can you do all this with ANSI C? You could, but how much time are you willing to spend writing the specialized hardware classes that already exist with SystemC?</p>
<p>There&#8217;s a lot more I could say about this, but I&#8217;ll let someone else do it for me.  John Sanguinetti, Forte&#8217;s CTO, just published <a title="High-level synthesis, verification and language" href="http://www.edadesignline.com/howto/222900653;jsessionid=5YNCEVM025M2FQE1GHPCKH4ATMY32JVN" target="_blank">this article</a> in EETimes&#8217; EDA DesignLine that takes another look at the verification angle.</p>
<p>Next time, I&#8217;ll take a little deeper look into another important advantage of SystemC in HLS design: its support of <a href="http://cyncity.forteds.com/2010/02/28/hierarchy-in-systemc-why-its-so-important-for-hls/">structural hierarchy</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://cyncity.forteds.com/2010/02/22/need-another-reason-to-use-systemc-for-hls-the-verification-advantage-is-the-best-of-all/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

