
SystemC Hits A High Note

Two weeks ago in Yokohama, several companies hosted SystemC 2010 Japan. Over 200 design and verification engineers, architecture and algorithm engineers, EDA specialists, managers, and software designers attended the event. Presentations were made by EDA vendors and users alike, with leading companies like Renesas, Ricoh, and Sony discussing their experiences with SystemC, ESL and high-level synthesis. By all reports it was a great success (on a scale of 1 to 5, almost 75% of the attendees gave the seminar a rating of 4 or 5!), and it speaks to the continued momentum that SystemC has in the market.

In a survey, 65% of the attendees said they have used or are now using SystemC and SystemC tools in their current flow. More than 52% said that they are using high-level synthesis (HLS) — the most of any category. So what does all of this tell us?

Well, it’s clear that SystemC and SystemC HLS have moved well beyond the trial phase. Some of the most respected companies in the world were discussing their methodologies in use today and their vision for the next couple of years. The user community for SystemC design is now setting the direction. Japan has been a leader in SystemC HLS since 2000, and the attendees made it clear they intend to continue the push forward.

Why SystemC?
This is a question that never seems to get old.  Maybe it’s because for years EDA marketeers have told us ANSI-C is all you’ll ever need! The fact is that ANSI-C cannot handle real design and verification needs and the designers in Japan know it. Those surveyed at SystemC 2010 Japan said their SystemC usage falls into these 5 categories (multiple answers allowed):

  • 52.7% HLS
  • 44.8% Functional verification
  • 33.5% Virtual platform / software design
  • 30.5% Architecture design
  • 1.5% Other

So why use SystemC instead of ANSI-C? It’s true that ANSI-C and SystemC can both be synthesized, but that’s not where the problem lies. The problem is in the modeling and design environment that you need to build. Sure you can synthesize a serial algorithm in ANSI-C — but real designs have hierarchy, concurrency, control and complex interfaces where data is exchanged in parallel. What about modeling a virtual platform and the design architecture? You could add non-standard language extensions or develop your own event manager (effectively creating a custom simulator), but is that what you want to spend your time doing? Let’s face it — ANSI-C just doesn’t cut it. SystemC was developed specifically to rectify ANSI-C’s shortcomings for hardware design.

SystemC is a C++ class library that is “hardware aware,” and there’s a lot of online material to help you learn about its advantages. Forte’s CTO, John Sanguinetti, has written a couple of articles about the topic. I’d recommend reading “Transitioning from C/C++ to SystemC in high-level design” that appeared recently at Embedded.com, as well as “I can’t imagine designing hardware in C…” that appeared right here in Forte’s CynCity blog. Read these and you’ll see why ANSI-C doesn’t work for anything more than trivial hardware design.

Want more? There was a panel about it at DAC this year! And while they probably won’t admit it, even the EDA vendors who trashed SystemC a year ago are starting to sing a different tune.

I can’t imagine designing hardware in C …

I recently had a conversation with an old friend from my graduate school days who, when I told him about Forte and high-level synthesis, said, “I can’t imagine designing hardware in C, or any other programming language.” This guy is one of the best programmers I know — he wrote the user interface code for the Apollo workstation in 1981, one of the first graphical user interfaces — so I take his opinions seriously.

He went on to say that good hardware had such a high level of parallelism, he didn’t see how a compiler could produce that from inherently sequential code like what is written in a standard, general-purpose programming language. In his words, “though I wouldn’t write in assembly language any more, the output of gcc is crap. If it can’t do a good job with sequential code, why should I believe that a compiler can do a good job producing parallel code?”

My answer to him was along the lines of, “well, it really isn’t all that hard, and our customers have produced lots of real designs using Cynthesizer,” but that wasn’t a very satisfying answer. Here is the answer that I didn’t have time to give my friend:

  1. You’re right, you can’t use C. We need to define what we mean by a programming language. Verilog (and VHDL) is a hardware description language, not a “hardware programming language.” However, it is also a simulation language, with a lot of similarities to Simula (one of the groundbreaking early simulation languages). We’ve been designing hardware in a simulation language for 20 years, so the basic idea is not all that radical.
  2. Though Verilog (and VHDL) is a simulation language, it is not a particularly rich one. It is object-oriented (modules are classes, and instantiations are objects), but it lacks a lot of features that object-oriented programming languages provide. What it does have, however, are bit-wise operators and data types, and a standard module class which has ports, threads, and submodules.
  3. Actually, Verilog (and VHDL) is capable of representing hardware at a higher level of abstraction than RTL, but it has one big drawback: it isn’t used for algorithm development, implementation or publication. Really, the only language that doesn’t have that drawback is C (or C++). Java is a distant second, and the other popular languages aren’t really candidates (M?, Fortran?, Perl?, Python?, Ruby on Rails?, …).
  4. This is where SystemC comes in. C++ is a rich object-oriented language which is nearly universal and, since it is a superset of C, the vast majority of algorithms which are published are in C++. Being an object-oriented language, C++ has the ability to support a hierarchy of layers of abstraction, implemented by libraries of classes. SystemC is a library of classes which implements a “hardware abstraction layer” in C++. You can write the same code in SystemC/C++ as you would write in Verilog, at the netlist level or at RTL. That’s not particularly interesting — algorithms aren’t written at RTL in any language. What’s important is that you can also write code at a higher level than RTL, using C++ constructs. And, you can take a C algorithm, stick it unchanged in a SystemC module (that is, make it a method in an SC_MODULE class), and you have a starting point for hardware implementation.
  5. Now that we have a structure for describing concurrent processes (a hierarchy of modules is a collection of concurrent processes), we’ve got something that a compiler (or high-level synthesis tool) can work with. Extracting fine-grain parallelism from a sequential process is easy (at least conceptually) — compilers have been doing that for years. Managing concurrency between processes requires a bit more work, but there is enough information in the source code for the compiler to do a good job.
  6. Finally, the world of parallel vector supercomputers spawned a rich set of code transformations in the compiler world which could expose possible parallelism in sequential loops. Many of these can be applied to hardware synthesis. The end result is that a great deal of parallelism can be realized from a hardware description in SystemC.
  7. So, implementing parallel hardware from a SystemC description can be done effectively. Generally, this part of the problem falls into the category of scheduling. That is, creating an optimal (by some definition) state machine, or set of state machines, to implement the described functions. Scheduling is a critical part of the high-level synthesis process. But that is only part of the problem. The other part is to implement the desired schedule in the smallest amount of hardware possible. That generally is called allocation, and that is the subject for another article.
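
To make point 6 a little more concrete, here is a minimal sketch in plain C++ (no SystemC, and not the output of Cynthesizer or any other tool). The names `dot_sequential` and `dot_tree` and the 8-tap size are made up for illustration. The first function is the loop as an algorithm developer would write it; the second is the same computation after two classic loop transformations, full unrolling and reassociation of the sum into a balanced tree, which is roughly the shape a synthesis tool would want before mapping it onto parallel multipliers and adders.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTaps = 8;  // illustrative size, not from the article
using Vec = std::array<std::int32_t, kTaps>;

// As written by the algorithm developer: one accumulator, so each
// addition depends on the result of the previous one (a chain of 8).
std::int32_t dot_sequential(const Vec& a, const Vec& b) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// After unrolling and reassociation: 8 independent multiplies followed by
// an adder tree of depth 3. Nothing within a tree level depends on anything
// else in that level, so each level could be built as parallel hardware.
std::int32_t dot_tree(const Vec& a, const Vec& b) {
    std::array<std::int32_t, kTaps> p{};
    for (std::size_t i = 0; i < kTaps; ++i) {
        p[i] = a[i] * b[i];  // independent of every other iteration
    }
    const std::int32_t s0 = p[0] + p[1];
    const std::int32_t s1 = p[2] + p[3];
    const std::int32_t s2 = p[4] + p[5];
    const std::int32_t s3 = p[6] + p[7];
    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same value for integer inputs as long as no intermediate result overflows, which is what makes the transformation legal. The sequential form has a dependency chain eight additions long, while the tree form has a chain of three.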

The conclusion is that you not only can imagine creating hardware automatically from a C-related programming language, it is being done successfully by today’s high-level synthesis programs. At least, we can say it is being done successfully by Cynthesizer. It has taken quite a bit of time and effort to develop the synthesis technology to the point we are at today, where the vast majority of hardware designs can be done successfully with high-level synthesis. This will have a profound effect on how people do hardware design in the coming years.

What to expect from high-level synthesis

For nearly a decade Forte Design Systems has been working with customers to design and implement high-level synthesis (HLS) strategies and methodologies. These customers ask a lot of questions as they try to determine if HLS is right for them. But there’s one question they almost never ask: “What should I expect from high-level synthesis?”

They don’t ask it because it seems too obvious, or they already think they know what to expect. But the answer is not so obvious. Pose the question to yourself right now before reading on… what was your answer?

If you chose “productivity benefits” or some form of “faster time-to-RTL,” you’d be right–well, sort of.  There are several values and we’ll discuss five of them here.

Productivity
Certainly HLS promises and delivers great productivity improvements over RTL. Raising the abstraction level of hardware design means that the functional intent of the design can be expressed in fewer lines of code. Fewer lines of code should, in theory, take less time to write. The HLS tool then does a lot of the work in transforming the high-level code into RTL, creating the control FSM and the datapath. Oftentimes our customers see productivity improvements for real designs of 2-10X over RTL. Roughly speaking, that’s a gate count somewhere between 300,000 and 1,000,000 gates per engineer-year. Not bad!!!

Predictable Timing Closure
It’s usually unexpected, but during the RTL creation process users find that HLS has great value in timing closure. State-of-the-art HLS tools extract information from the user’s technology library (.lib) and, together with the target clock frequency, use it to characterize the performance and cost of each part. Based on user constraints, the tool can then choose which parts to use when building the FSM and datapath. Different tools do this differently, of course, with varied levels of accuracy. Forte’s Cynthesizer uses an internal synthesis engine to estimate the area, speed, and power of each part at the gate level, and then efficiently packs the operators and control into each clock cycle. The result is an RTL architecture that gets through logic synthesis and place-and-route smoothly, because Cynthesizer scheduled it with the target technology and process in mind. Timing closure problems are virtually eliminated, saving months of time and effort.
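
The idea of packing operators into clock cycles based on their characterized delays can be sketched in a few lines of plain C++. This is a textbook-style "as soon as possible" scheduler with operator chaining, not a description of how Cynthesizer works internally. The `Op` and `Slot` types, the `schedule` function, and the delay numbers used with it are assumptions made up for this example. It also assumes that no single operator is slower than the clock period and that results crossing a cycle boundary are registered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Op {
    double delay_ns;                 // delay assumed to come from library characterization
    std::vector<std::size_t> preds;  // indices of the ops whose results this op consumes
};

struct Slot {
    int cycle;       // clock cycle the op is placed in
    double done_ns;  // time within that cycle at which its result is stable
};

// Places each op in the earliest cycle where its inputs are ready and its
// own delay still fits before the clock edge. Ops must be in topological order.
std::vector<Slot> schedule(const std::vector<Op>& ops, double clock_ns) {
    std::vector<Slot> slots;
    slots.reserve(ops.size());
    for (std::size_t i = 0; i < ops.size(); ++i) {
        const Op& op = ops[i];
        assert(op.delay_ns <= clock_ns);  // multi-cycle operators are not modeled
        int cycle = 0;
        double start = 0.0;
        for (std::size_t p : op.preds) {
            assert(p < i);  // topological order
            const Slot& s = slots[p];
            if (s.cycle > cycle) {
                cycle = s.cycle;   // a later producer dominates
                start = s.done_ns;
            } else if (s.cycle == cycle) {
                start = std::max(start, s.done_ns);
            }
        }
        if (start + op.delay_ns > clock_ns) {
            ++cycle;      // does not fit: register the inputs and
            start = 0.0;  // start at the beginning of the next cycle
        }
        slots.push_back({cycle, start + op.delay_ns});
    }
    return slots;
}
```

With two 3.0 ns multipliers feeding a 1.5 ns adder that feeds a second 1.5 ns adder, a 5 ns clock lets the multiplies and the first add chain within cycle 0 and pushes the second add to cycle 1. Tightening the clock to 4 ns moves both adds into cycle 1. Because the schedule is built from the delays up front, the generated RTL is much less likely to surprise you in logic synthesis.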

Less Time In Verification
As many a marketing person has stated, “verification is greater than two-thirds of the RTL design process.” Sure, that’s probably true, but the reason is the RTL design process itself. Think about it. RTL designs have thousands of lines of code, are insanely complex, and the code doesn’t get to the verification team until the end of the design schedule, leaving little time to verify the design. And with incredibly slow RTL simulation times, it’s no surprise that RTL verification takes so long.

What if the verification team had a super-fast, accurate model of the design months earlier in the design process, a model where the functionality, architecture, and even the interfaces are fully specified and “simulatable”?  A SystemC HLS model can provide just that–an early, accurate, fast, executable specification. Cynthesizer can then compile this model and produce a unique RTL architecture for each set of user directives (more on that later). More importantly, the RTL design can be automatically validated using the same SystemC testbench, significantly reducing the verification effort. There are even formal tools that can check equivalency between the high-level SystemC and the RTL. The result? An improved design and verification flow that lets you start verification much earlier in the process and reduces the overall design time.
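
Here is a rough sketch of the "one testbench, two models" idea in plain C++. It is only an analogy: in a real flow the high-level model would be a SystemC module and the second model would be tool-generated RTL running in an HDL simulator. The class names, the 4-sample moving-sum function, and the stimulus values are all invented for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// High-level model: a 4-sample moving sum, written as a plain algorithm.
class MovingSumAlgo {
public:
    int push(int x) {
        history_.push_back(x);
        const std::size_t n = history_.size();
        int sum = 0;
        for (std::size_t i = (n > 4 ? n - 4 : 0); i < n; ++i) {
            sum += history_[i];
        }
        return sum;
    }
private:
    std::vector<int> history_;
};

// Stand-in for the implementation: a 4-deep shift register plus a running
// accumulator, which is closer to what the hardware would look like.
class MovingSumRegs {
public:
    int push(int x) {
        acc_ += x - regs_[3];  // add the new sample, drop the oldest
        regs_[3] = regs_[2];
        regs_[2] = regs_[1];
        regs_[1] = regs_[0];
        regs_[0] = x;
        return acc_;
    }
private:
    std::array<int, 4> regs_{};
    int acc_ = 0;
};

// One testbench, reused unchanged for every model that offers push().
template <typename Dut>
bool run_testbench(Dut dut) {
    const std::vector<int> stimulus{1, 2, 3, 4, 5, 6, -7, 0};
    const std::vector<int> expected{1, 3, 6, 10, 14, 18, 8, 4};
    for (std::size_t i = 0; i < stimulus.size(); ++i) {
        if (dut.push(stimulus[i]) != expected[i]) {
            return false;
        }
    }
    return true;
}
```

Calling `run_testbench(MovingSumAlgo{})` and `run_testbench(MovingSumRegs{})` applies exactly the same stimulus and checks to both models. The point is that the investment in the testbench is made once, early, against the fast model, and then carried forward.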

True Design Reuse
Everybody reuses RTL, but the same comments about reuse come up every time: old RTL designs are difficult to modify, they have limited shelf life when changing clock speeds, and a lot of area is wasted when changing geometries because of inefficient silicon use. HLS provides true design reuse by eliminating the majority of implementation details from the model. A SystemC HLS model doesn’t include any technology-specific details of the design. It is focused on the functional aspects and relies on the HLS tool to add the implementation details. Because of this, many of the issues mentioned above in reusing RTL are never encountered:

  • Modification of designs is only necessary if the function changes, and with fewer lines of code, the functional intent is much more clear.
  • The shelf life of the IP is longer because it is not tied to a specific process and speed. The HLS tool can easily retarget the design for different technologies (even FPGA!) simply by changing the directives.

Quality of Results
Of course, no amount of productivity and verification improvements will really help if you can’t meet your performance, power and silicon area targets.  Designers looking to HLS for high-volume production designs are naturally skeptical that they can get better results with an HLS tool than by hand coding RTL. This is not unreasonable, and some of the other HLS tools are pretty good on productivity but force users to sacrifice QoR. This is simply not true for Cynthesizer. You can expect to achieve just as good a result using Cynthesizer as you would if you wrote RTL by hand. This is not done by waving a magic wand. The user has to apply engineering skill to organize the design and drive the tool to get the best results, but it’s been proven on numerous production chips that Cynthesizer lets users take HLS all the way to the finish line.

So, there you have it.  If you are looking to adopt HLS design, you now know what you can, and should, expect.

Hierarchy in SystemC: Why it’s so important for HLS!

Last time, I looked at the verification advantages of using SystemC for HLS. This time, I want to explore another important capability of SystemC that makes it far superior to ANSI C for hardware design.

I’m talking about structural hierarchy. SystemC supports hierarchy while ANSI C does not. Structural hierarchy means submodules, connected together and executing concurrently. Use of hierarchy is the traditional mainstay of hardware designers for breaking down complex designs into a group of smaller, more manageable designs that are easier to design and verify.

Here are some of the advantages that come from SystemC’s support for hierarchy:

Unit-Level Verification
It’s easier to build a testbench that can stimulate all the critical corner cases in a design when it’s a smaller block. Also, signoff requirements like code coverage are easier to meet when you have controllability at the smaller block boundary. Making sure that every line of source code is covered requires a designer to find the correct sequence of input values to exercise every code branch. With a small block you don’t have to trick the upstream blocks into producing the right stimulus; you can just have the testbench present whatever stimulus you need. In large blocks this may be impossible altogether.

Connections and Interfaces
Working at the submodule level allows designers to isolate the complex interfaces and channel connections between them. ANSI C HLS tools, as mentioned in the previous post, do not accurately represent concurrent hardware. This can really cause problems designing and verifying interfaces because the handshaking and transactions all occur simultaneously. The simulation semantics of SystemC let you examine interfaces and channels at the pin-level and make it straightforward to code your own interfaces with whatever protocol you need. SystemC-based HLS also allows you to encapsulate the details of a particular protocol in a set of classes and easily switch from one interface to another without changing your module’s source code–but that’s the topic of a future posting!
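
As a small taste of the idea of hiding a protocol behind a class, here is a sketch in plain C++ rather than SystemC. The two source classes and the `scale_by_three` function are hypothetical names invented for this example; in a real SystemC design the interface classes would wrap ports and signal-level handshaking. What the sketch shows is only the structural point: the processing code is written once against `empty()` and `get()`, and the interface behind those calls can be swapped without touching it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

// Interface 1: a FIFO-style stream that hands out values in arrival order.
class FifoSource {
public:
    explicit FifoSource(std::deque<int> data) : q_(std::move(data)) {}
    bool empty() const { return q_.empty(); }
    int get() {
        assert(!q_.empty());
        const int v = q_.front();
        q_.pop_front();
        return v;
    }
private:
    std::deque<int> q_;
};

// Interface 2: a memory-style source that reads words by increasing address.
class MemorySource {
public:
    explicit MemorySource(std::vector<int> mem) : mem_(std::move(mem)) {}
    bool empty() const { return addr_ >= mem_.size(); }
    int get() {
        assert(addr_ < mem_.size());
        return mem_[addr_++];
    }
private:
    std::vector<int> mem_;
    std::size_t addr_ = 0;
};

// The "module" body: it only knows about empty() and get(), so it works
// with either interface class without any change to its source.
template <typename Source>
std::vector<int> scale_by_three(Source& src) {
    std::vector<int> out;
    while (!src.empty()) {
        out.push_back(3 * src.get());
    }
    return out;
}
```

Feeding the values 1, 2, 3 through either `FifoSource` or `MemorySource` produces 3, 6, 9. Choosing the interface becomes a one-line decision at the point where the module is instantiated.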

Architecture Design
In HLS design there is a step between algorithmic design and RTL generation: architectural design. This step takes the untimed algorithmic code and decides how major portions will be implemented in hardware to best meet QoR requirements, for example whether a particular array should be implemented as an external memory or flattened into registers. Some of this is done through synthesis directives or constraints, but a handy tactic is being able to partition a section of code into a nice hierarchical submodule. This, and its verification advantage, is talked about more in John Sanguinetti’s recent EETimes EDA DesignLine article.

Faster Runtimes
Everything will run faster with smaller modules. The blocks will get through behavioral synthesis scheduling more quickly and the generated RTL will run through logic synthesis tools faster. The block-level testing will go faster because smaller blocks compile and run in simulators faster as well.

Teams of Multiple Designers
Teams usually work in parallel, with multiple teams designing and verifying multiple blocks at the same time. Hierarchy makes design maintenance easier, makes it possible to keep a consistent set of code for HLS and verification, and keeps designers from stepping on each other’s toes.

Reuse
When the next-generation design is derived from your current design, having the design broken into manageable blocks makes it much easier to reuse some of those blocks without an entirely new verification effort. It also improves your ability to figure out which blocks will have to be changed, or how to fit together a combination of old, new and modified blocks to meet the new requirements.

SystemC supports our familiar friend–structural hierarchy–and allows you to use many of the same techniques you are accustomed to for managing the complexity of design and verification tasks. Gee, using SystemC for HLS is just like having a real hardware language, only with higher levels of abstraction available. No wait, it’s not like that–it’s exactly that! And that’s what you really need for practical high-level hardware design.

Need another reason to use SystemC for HLS? The verification advantage is the best of all.

The “language war” in high-level synthesis (HLS) design has been raging for a while now. You’ve probably read a lot of online publications touting the advantages of using SystemC over ANSI C to design at an abstract level. If you were to take what everyone is saying and boil it down to a few key points, they might sound something like this:

  • ANSI C is a sequential language.
  • ANSI C cannot execute two subroutines or functions concurrently.
  • ANSI C executes your code in a single flow, one line after another.
  • ANSI C-based HLS tools either limit themselves to single-block designs or provide a proprietary mechanism to mimic concurrency.
  • ANSI C HLS tools don’t give you a way to accurately simulate what a real piece of hardware does.
  • SystemC, on the other hand, is a standardized superset of C++ that supports multiple concurrent processes.
  • SystemC supports hierarchy and modules.
  • SystemC allows communication between those modules at the transaction or pin level.
  • SystemC gives hardware designers the ability to tackle the very complex design and interface tasks they face every day.

Notice first I said “superset of C++” instead of “language.” That’s because SystemC is indeed not a language: it’s a family of C++ classes specifically geared toward hardware design constructs like modules, ports, concurrency, clocks, resets and channels.

The list above is short, but to see what far-reaching consequences these points have for hardware designers, consider the following:

Let’s say you use an ANSI C tool that can only work with single blocks. If you have a multiple block system, you can produce each block one at a time. But there’s a catch: how do you verify that system? According to the tool, that’s your problem. It’s up to you to somehow stitch these blocks together in RTL and do all the verification in RTL.

If you use an ANSI C tool that mimics concurrent simulation using proprietary libraries or other non-standard techniques, you are locked into that proprietary flow. Want to drive yourself crazy? Just try using this flow to create some IP cells and distribute them to your customers. Those customers will have to use (and own) the same tool just to run a simulation.

The proprietary library approach also takes liberties to get your ANSI C code to act like real hardware. To emulate the dataflow of your design they will typically have you partition the design so each block is a subroutine, and require you to call proprietary APIs inside the subroutines to manage that dataflow. One common approach is to have you conditionally execute the algorithms inside the subroutines depending on the availability of data reported by these API calls.
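
To picture what that style looks like, here is a sketch in plain C++. The `Channel` type and its `has_data`, `read` and `write` calls are hypothetical stand-ins invented for this example, not the API of any real vendor library. Each block is an ordinary subroutine that does work only when its input channel reports data, and a hand-written loop has to keep calling the blocks in a fixed order to make anything move.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for a proprietary channel library.
struct Channel {
    std::queue<int> q;
    bool has_data() const { return !q.empty(); }
    int read() {
        assert(has_data());
        const int v = q.front();
        q.pop();
        return v;
    }
    void write(int v) { q.push(v); }
};

// Each "block" is a plain subroutine whose algorithm is guarded by an
// availability check on its input channel.
void block_double(Channel& in, Channel& out) {
    if (in.has_data()) {
        out.write(2 * in.read());
    }
}

void block_increment(Channel& in, Channel& out) {
    if (in.has_data()) {
        out.write(in.read() + 1);
    }
}

// The caller plays the role of the simulator: it invokes the blocks one
// after another, in a fixed order, until no channel holds any data.
std::vector<int> run_pipeline(const std::vector<int>& inputs) {
    Channel a, b, c;
    for (int v : inputs) {
        a.write(v);
    }
    std::vector<int> results;
    while (a.has_data() || b.has_data() || c.has_data()) {
        block_double(a, b);
        block_increment(b, c);
        if (c.has_data()) {
            results.push_back(c.read());
        }
    }
    return results;
}
```

For the inputs 1, 2, 3 this returns 3, 5, 7, so the dataflow is right. But nothing here runs concurrently, there is no notion of a clock or of signal-level timing, and the order in which the blocks execute is whatever the calling loop happens to say.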

Sound messy? It is. And when you consider that it takes an API call (also proprietary) to determine if there is data in the channel, then you start to understand how these libraries are rudimentary static simulators at best.

If you have a fairly simple design that has a sequential algorithm, can be controlled by a single finite state machine, processes the same number of inputs and outputs every cycle, and uses a limited set of interfaces to communicate, then ANSI C may work fine for you. But the truth is that real projects aren’t that simple. Complex designs have multiple processes operating concurrently, they communicate with things like external memories and they send data through interfaces that must be customizable.

If you use the SystemC classes for HLS design, you have a real built-in, event-driven simulator to support your verification efforts. SystemC supports multiple modules that can execute concurrently, can share data in memories, and can synchronize their execution using real signal-level protocols. In other words, it allows you to attack the biggest design problems by breaking them down into multiple blocks and connecting them with channels that follow a protocol.

And with SystemC, you can easily simulate these blocks together to make sure everything is working correctly.
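
To see why simulation semantics matter, consider the smallest possible concurrent design: two registers that swap values on a clock edge. The sketch below is plain C++ and the `Signal` class is a toy written for this post, not the SystemC implementation, but it imitates the evaluate-then-update behavior that SystemC signals provide: writes made by a process become visible only after every process has run.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy signal: reads see the current value, writes are deferred until update().
template <typename T>
class Signal {
public:
    explicit Signal(T init) : cur_(init), next_(init) {}
    T read() const { return cur_; }
    void write(T v) { next_ = v; }
    void update() { cur_ = next_; }
private:
    T cur_;
    T next_;
};

// One clock edge: run every process (evaluate), then commit all writes (update).
std::pair<int, int> swap_on_clock_edge(bool reverse_order) {
    Signal<int> a(1);
    Signal<int> b(2);
    std::vector<std::function<void()>> procs{
        [&] { a.write(b.read()); },  // process 1: a takes b's value
        [&] { b.write(a.read()); },  // process 2: b takes a's value
    };
    if (reverse_order) {
        std::reverse(procs.begin(), procs.end());
    }
    for (auto& p : procs) {
        p();  // evaluate phase: nobody sees anyone else's write yet
    }
    a.update();  // update phase
    b.update();
    return {a.read(), b.read()};
}

// The same two assignments as ordinary sequential C code.
std::pair<int, int> swap_with_plain_variables() {
    int a = 1;
    int b = 2;
    a = b;  // a is overwritten before b can read the old value
    b = a;
    return {a, b};
}
```

`swap_on_clock_edge` returns (2, 1) whichever process runs first, just as real registers would behave. `swap_with_plain_variables` returns (2, 2), because sequential code has no notion of "at the same time".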

Can you do all this with ANSI C? You could, but how much time are you willing to spend writing the specialized hardware classes that already exist with SystemC?

There’s a lot more I could say about this, but I’ll let someone else do it for me.  John Sanguinetti, Forte’s CTO, just published this article in EETimes’ EDA DesignLine that takes another look at the verification angle.

Next time, I’ll take a little deeper look into another important advantage of SystemC in HLS design: its support of structural hierarchy.

Welcome to CynCity!

Hello and welcome to CynCity, the all new Forte Design Systems blog located here at http://cyncity.forteds.com. We are doing this so we can share our ideas with you more publicly – ideas we think you should consider in the fast-evolving world of high-level design. Forte now has a decade of experience helping people achieve great results. We hope that you will find CynCity to be both entertaining and informative, and above all, give you a clearer understanding of innovative approaches to design techniques and methodologies.

For our inaugural blog posting, I thought there was no better place to start than at the beginning.

Forte was founded on a vision with 4 core beliefs:

  • Abstraction – critical to improve the productivity and maintainability of new designs.
  • Verification – an essential component of any design solution.
  • Quality Silicon – tight coupling to downstream tools to produce high-quality results.
  • Standards – support for standard languages is best for our customers and our industry as a whole.

In 2001 we introduced Cynthesizer to the market, the first SystemC-based high-level synthesis tool. Cynthesizer raised the level of abstraction for designers, but it also delivered a series of verification integrations that made it easier for customers to adopt this new, high-level design flow. We chose SystemC at the time because it was the clear momentum winner in C++ modeling for EDA, and have since evolved that standard through technical and executive participation in the OSCI standards body. We also provided very close ties to existing silicon implementation flows (starting with logic synthesis tools) and tight integration of specialized datapath engines for generation of high-performance, quality silicon.

Our approach seems to be working. Cynthesizer has rapidly been adopted by the world’s leading electronics companies and our online Cynthesizer Knowledge Base now has over 600 users.

Things got even better last September when we acquired the CellMath Designer (CMD) technologies from Arithmatica. This brought to Forte a very high-performance datapath optimization technology, a number of world-class IP blocks and a team of world-renowned datapath experts. This step exemplifies our commitment to a core belief in quality silicon.

One of the challenges Forte faces, as a small company, is the fear on the part of designers to be “the guinea pig” – a fear of being the first user of new technology that might not work. Rest easy. Cynthesizer and CMD have both been through the wringer, having been used by many customers to tape out numerous designs. If you go to your local electronics store and buy a digital camera, TV, printer, MP3 player or one of the most popular smart phones on the market, chances are you will walk out holding a piece of hardware that was designed with a Forte product.

Each month we plan to bring you all kinds of helpful information no matter what your current tools or usage may be. If you are an existing Forte customer, we’ll help you get the most out of Cynthesizer and CMD to do things like create complex line buffer interfaces, get that external memory interface synthesized or design with floating-point data types. If you are a user of a competing high-level design tool, we hope to challenge some of the restrictions we know you are facing and point out ways you could achieve better results. And if you’re new to high-level synthesis but don’t know where to start, we’ll give you the information you need to examine your current design approaches and ask yourself a few tough questions that we’ve already solved.

If you’ve read other blogs or message boards, you’ve heard a lot of banter and conflicting opinions about high-level synthesis. Should I use SystemC or ANSI C? How exactly do I verify the results? What do I do with the RTL you generate? How do I analyze what you’ve done with my code? What about control-oriented designs? What about low power and ECOs?

It’s okay. We hear this all the time and you’ll see the answers right here on CynCity.

So welcome again. Please take a moment to bookmark this page and sign up for an email subscription. This way you’ll get the most out of your experience with Cynthesizer and CellMath Designer.

And, as always, we’re here to listen. If you have topics you would like to see covered or any suggestions to make our blog as usable as possible for you, don’t hesitate to send your ideas to us at cyncity@forteds.com.