I&#39;d love to see a performance-oriented, pure C++ synthesis library come to prominence. STK gets partway there with its vectorized tick methods, but to be both performant and highly modular, parameter setting needs to be vectorized as well. I&#39;ve taken steps toward this goal with my EZPlug wrapper for STK. In my own code, I use a sine oscillator that can take an stk::Generator as its frequency input. <div>

<br></div><div><a href="https://github.com/morganpackard/EZPlug/blob/master/EZPlug/EZPlugGenerators/SineWaveMod.h">https://github.com/morganpackard/EZPlug/blob/master/EZPlug/EZPlugGenerators/SineWaveMod.h</a></div><div><br>

</div><div>This way, I can work at a somewhat higher level, in more of a &quot;patching&quot; style, without the added layer of complexity and obfuscation that would come with using Pd. Creating an efficient FM synth becomes a matter of just patching together the right sine waves and helper generators (Multiplier, Adder, FixedValue).</div>
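<div>A minimal sketch of the patching idea described above, for illustration only. It is not the EZPlug or STK code: the Generator interface, the SineMod class, the renderFmBlock function, and all parameter values are hypothetical stand-ins, chosen to show how a block-based frequency input lets one oscillator modulate another with one virtual call per buffer rather than one per sample.</div>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical block-based interface (an assumption, not STK's or EZPlug's API):
// every generator fills a whole buffer per call.
struct Generator {
    virtual ~Generator() = default;
    virtual void tick(std::vector<float>& out) = 0;
};

// Emits the same value for every sample in the block.
struct FixedValue : Generator {
    explicit FixedValue(float v) : value(v) {}
    void tick(std::vector<float>& out) override {
        for (float& s : out) s = value;
    }
    float value;
};

// Sample-wise product of two generators.
struct Multiplier : Generator {
    Multiplier(Generator& a, Generator& b) : a_(a), b_(b) {}
    void tick(std::vector<float>& out) override {
        a_.tick(out);
        scratch_.resize(out.size());
        b_.tick(scratch_);
        for (std::size_t i = 0; i < out.size(); ++i) out[i] *= scratch_[i];
    }
    Generator& a_;
    Generator& b_;
    std::vector<float> scratch_;
};

// Sample-wise sum of two generators.
struct Adder : Generator {
    Adder(Generator& a, Generator& b) : a_(a), b_(b) {}
    void tick(std::vector<float>& out) override {
        a_.tick(out);
        scratch_.resize(out.size());
        b_.tick(scratch_);
        for (std::size_t i = 0; i < out.size(); ++i) out[i] += scratch_[i];
    }
    Generator& a_;
    Generator& b_;
    std::vector<float> scratch_;
};

// Sine oscillator whose frequency (in Hz) is itself a buffer-producing generator.
struct SineMod : Generator {
    SineMod(Generator& freq, float sampleRate)
        : freq_(freq), sampleRate_(sampleRate) {
        assert(sampleRate > 0.0f);
    }
    void tick(std::vector<float>& out) override {
        const float twoPi = 6.28318530718f;
        freqBuf_.resize(out.size());
        freq_.tick(freqBuf_);  // the parameter is fetched once per block
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] = std::sin(phase_);
            phase_ += twoPi * freqBuf_[i] / sampleRate_;
            phase_ -= twoPi * std::floor(phase_ / twoPi);  // wrap into [0, 2*pi)
        }
    }
    Generator& freq_;
    float sampleRate_;
    float phase_ = 0.0f;
    std::vector<float> freqBuf_;
};

// Patches a two-operator FM voice and renders one block of it.
// The carrier, modulator and depth values are arbitrary example numbers.
std::vector<float> renderFmBlock(std::size_t numFrames) {
    const float sampleRate = 44100.0f;
    FixedValue carrierHz(440.0f), modHz(110.0f), depthHz(200.0f);
    SineMod modulator(modHz, sampleRate);
    Multiplier deviation(modulator, depthHz);   // modulator scaled to Hz
    Adder carrierFreq(carrierHz, deviation);    // carrier + deviation
    SineMod carrier(carrierFreq, sampleRate);
    std::vector<float> out(numFrames);
    carrier.tick(out);
    return out;
}
```

<div>Calling renderFmBlock(512) would render 512 frames of the patched voice. Each node owns its scratch buffer here to keep the sketch short; a real library would more likely share preallocated buffers to avoid resizing inside the audio callback.</div>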

<div><br></div><div>-Morgan</div><div><br><div class="gmail_quote">On Tue, Sep 18, 2012 at 6:37 PM, Stephen Sinclair <span dir="ltr">&lt;<a href="mailto:sinclair@music.mcgill.ca" target="_blank">sinclair@music.mcgill.ca</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is difficult.  I have been playing with the code and gcc options<br>

and gprof, and it seems there is no specific bottleneck.  HevyMetl is<br>

about 15% the speed of Clarinet.  I managed to get it down to about<br>

the same speed, but I had to do several things:<br>

<br>

- moved several functions into their headers for inlining<br>

(FileLoop::setRate, FileWvIn::tick, etc., i.e. anything that is referenced<br>

from HevyMetl::tick.)<br>

<br>

- set some gcc options to force inlining of as much as possible,<br>

e.g. -Winline --param inline-unit-growth=65536 -finline-limit=65536<br>

<br>

- used link-time optimisation available in gcc 4.5 and up  (-flto on all code)<br>

<br>

- set -ffast-math<br>

<br>

Even then it is not quite as fast.  I found fmod used in FileLoop was<br>

a bit of a bottleneck.<br>

<br>

In general I find it pretty surprising that gcc doesn&#39;t succeed in<br>

speeding this up more, but FileLoop seems to be a bit of a problem for<br>

reasons that aren&#39;t clear to me. I sprinkled the code with checks for<br>

denormals and came up empty.  I checked the assembler and used gprof<br>

and -Winline to make sure inlining was working as expected.<br>

<br>

Oh, I should mention this was on my fast desktop computer, not an ARM<br>

tablet, so proper profiling on the target hardware may be warranted.<br>

<br>

In any case, Morgan&#39;s right in that the vectorised versions are<br>

probably better to use &quot;in production,&quot; since the whole &quot;inline&quot; thing<br>

in C/C++ is not supposed to be fully relied on for efficiency.  (e.g.<br>

the compiler might choose not to inline due to code size rather than<br>

speed.)  The per-sample tick functions are however important for<br>

certain algorithms, and generally useful as a teaching tool, so their<br>

presence in STK is desirable.  That said, usually a vectorised<br>

approach is preferred in application code.<br>

<br>

Just a couple of notes...<br>

<div class="im"><br>

<br>

On Tue, Sep 18, 2012 at 11:41 AM, Morgan Packard<br>

&lt;<a href="mailto:morgan@morganpackard.com">morgan@morganpackard.com</a>&gt; wrote:<br>

&gt; I did some of my own experimentation, which seems to point to method calls<br>

&gt; themselves (even with all of the calculation inside them commented out)<br>

&gt; being responsible for much of the cpu use of HevyMetl.<br>

&gt;<br>

&gt; I&#39;ve been using STK all along with the assumption that any call to tick()<br>

&gt; without frames, or any other per-sample function call, was going to be<br>

&gt; significantly less efficient than operating on buffers. I&#39;m aware of the<br>

&gt; existence of inlining, but not savvy enough to understand if it&#39;s happening<br>

&gt; or not, and under what conditions it can happen. It seems suspicious to me<br>

&gt; to think that inlining could happen on pointers. I mean, if you have a<br>

&gt; pointer to an stk::Generator, and you call tick() on it, I don&#39;t see how the<br>

&gt; compiler could know ahead of time which subclass of Generator it should be<br>

&gt; inlining.<br>

<br>

</div>That&#39;s true, but most STK code uses the final child class, not the<br>

superclass, so the compiler should have all the information it needs.<br>

<div class="im"><br>

&gt; I&#39;d love to find out that all my meticulous buffer passing in order to get<br>

&gt; reasonably performant code is unnecessary, but until I understand otherwise,<br>

&gt; or better, I&#39;m working with the assumption that method calls are expensive<br>

&gt; and best to minimize. Another thing I like about calculating samples in<br>

&gt; batches/buffers/StkFrames is that it allows me to use Apple&#39;s Accelerate<br>

&gt; framework, which offers some very nice performance boosts.<br>

&gt;<br>

&gt; However, I certainly trust Gary&#39;s assertion that this HevyMetl ran just fine<br>

&gt; on 90&#39;s machines, and I&#39;m very curious about what has changed. Has the code<br>

&gt; itself changed, breaking inlining? Is there something about method calls on<br>

&gt; the Apple hardware that makes them much more expensive than on Gary&#39;s 90&#39;s<br>

&gt; hardware?<br>

<br>

</div>I think one experiment would be to compare run-times of<br>

HevyMetl::tick() for previous versions of STK.  My meticulous building<br>

of a git archive of the previous tarballs might finally pay off!<br>

<br>

<a href="https://github.com/radarsat1/stk/commits/upstream" target="_blank">https://github.com/radarsat1/stk/commits/upstream</a><br>

<br>

If a really old version does turn out to be faster, I was thinking<br>

maybe a well-crafted &quot;git bisect&quot; command might help get to the bottom<br>

of this.<br>

<br>

Steve<br>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div><font size="1" color="#999999">===============</font></div><div><font size="1" color="#999999">Morgan Packard</font></div><div><font size="1" color="#999999">cell: (720) 891-0122</font></div>

<div><font size="1" color="#999999">aim: mpackardatwork</font></div><div><font size="1" color="#999999">twitter: @morganpackard</font></div><br>

</div>