SmartSDR performance thread

  • Problem
  • Updated 5 years ago
  • Acknowledged
  • (Edited)
Hi!  Figured I'd throw this in the API forum since most people probably don't care or have no idea what this topic is about.

So basically, I'm a nerd (I apologize in advance).

I've been seeing "choppiness" in the panadapters. I came across the Perforator utility for analyzing the performance of SmartSDR and discovered that when the choppy panadapter behavior surfaces (not as often as it used to), the cause is a fallback to software (CPU) rendering instead of GPU rendering.

I know that in the main forums, most people have been complaining about the Intel 4000 series drivers as the source of the problem, but I'm not convinced (yet).

After some googling, I came across another diagnostic app, the "Intel GPA System Analyzer," so being the dork that I am, I downloaded it and attached it to the SmartSDR process.

So far the only metric that has jumped out at me is the "EUs Stalled in PS %" metric, which translates to "Execution Units stalled executing Pixel Shader instructions."

That value is around 20%, while the "Active pixel shader" is around 6%.  According to the Intel documentation, this indicates a potential performance issue in the pixel shading rendering code. (see: http://software.intel.com/sites/products/documentation/gpa/13.4/win/EUs_Stalled_in_PS.htm )

Additionally, according to this link: (http://software.intel.com/sites/products/documentation/gpa/13.4/win/GPU_EUs_Active.htm) if GPU EUs Active + GPU EUs Stalled is significantly lower than 100%, there are stalls elsewhere in the rendering pipeline.  In my case, the sum is about 27% (20% + 7%), but I haven't been able to locate the source of the other pipeline stalls.
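
As a rough worked check of the arithmetic above, here is a minimal sketch that sums the two readings and shows how much of the EU time is left unaccounted for. The function name and the dictionary keys are made up for illustration; they are not part of Intel GPA.

```python
def eu_budget(active_pct: float, stalled_pct: float) -> dict:
    """Summarize EU utilization from two percentage readings (hypothetical helper)."""
    if not (0 <= active_pct <= 100 and 0 <= stalled_pct <= 100):
        raise ValueError("percentages must be between 0 and 100")
    accounted = active_pct + stalled_pct
    return {
        "accounted_pct": accounted,          # EUs active + EUs stalled
        "unaccounted_pct": 100 - accounted,  # time spent stalled elsewhere in the pipeline
    }

# Readings quoted in the post: about 7% active and about 20% stalled.
reading = eu_budget(active_pct=7, stalled_pct=20)
print(reading)
```

With the figures from the post, the two metrics cover 27% of the EU time and leave 73% unexplained, which is why the Intel note about stalls elsewhere in the pipeline applies here.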

Obviously I don't know jack about WPF but I've been trying to learn.  Let me know if there is anything else I should be looking at to help track down the issue.  I am more than happy to complain very loudly to Intel if I can find a smoking gun to point to, if it is their problem to fix.  

If I find any other interesting data, I'll post it here. :-)

Thanks!

-Robbie

Edit: those values above were observed while the GPU was being used, not when it fell back to CPU.  I'm still not sure what can trigger the fallback other than being out of memory.



Robbie - KI4TTZ

  • 480 Posts
  • 77 Reply Likes
  • happy

Posted 5 years ago

Steve - N5AC, VP Engineering / CTO

  • 1031 Posts
  • 1002 Reply Likes
I've forwarded this to the engineers that work on the SmartSDR-Windows.  Currently they are embroiled in the waterfall work, but they have looked at this in the past.  We have a number of tools to look at performance and we periodically pull them out and scrub the code again to be sure we're doing things efficiently.  Thanks for sending this, Robbie.

Steve

Robbie - KI4TTZ


Thanks Steve!  I'm looking forward to the waterfall!  I don't want to detract from that work or I may have a lot of people on here coming after me! :-)

This isn't a *huge* problem for me, it's just an annoyance when the fallback to software rendering happens.  I wish there was a programmatic way to tell WPF/DirectX to *not* fall back, but I haven't found it.  At least if it threw an error, it would be easier to find the source of the problem.

I figured I'd grab a screenshot just for fun (follow link for larger version).

http://screencast.com/t/K7zxTC54

After it fell back to software rendering and flaked out for a while, sometime yesterday I reattached the GPA analyzer to SmartSDR-Win and most of the metrics were working again, but the metrics under "Execution Units" (like pixel shader stalled/active, etc.) were all showing zero, which I thought was weird. At that point, it definitely appeared to be using GPU acceleration though.  The only way I was able to get those metrics to work again was to reboot the computer.  This might not mean anything but I figured I'd mention it.

Also, just for the sake of testing, I forced software rendering via a registry setting, and that confirmed it: the choppy panadapter updates happened immediately when launching the app.
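
The post does not name the registry setting that was used. The sketch below assumes it was the documented WPF value `DisableHWAcceleration` under `HKEY_CURRENT_USER\SOFTWARE\Microsoft\Avalon.Graphics`, and only reads it to check whether software rendering is currently being forced. The function name is made up for illustration.

```python
def read_wpf_software_rendering_flag():
    """Return the DisableHWAcceleration DWORD for the current user, or None.

    None means the value is not set, or this is not a Windows machine.
    Assumption: this is the setting referred to in the post.
    """
    try:
        import winreg  # standard library, but only available on Windows
    except ImportError:
        return None
    try:
        with winreg.OpenKey(
            winreg.HKEY_CURRENT_USER, r"SOFTWARE\Microsoft\Avalon.Graphics"
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "DisableHWAcceleration")
            return value
    except OSError:
        return None  # the key or the value does not exist

flag = read_wpf_software_rendering_flag()
if flag == 1:
    print("software rendering is forced via the registry")
else:
    print("hardware acceleration is not disabled via the registry")
```

Checking this value before a test run rules out the case where a leftover test setting, and not a driver fallback, is what is causing the choppy rendering.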

I found a blog post that lists the various reasons DirectX would fall back to software rendering (I'm sure your developers already know this stuff, figured I'd post it anyway to add to the report):

http://blogs.msdn.com/b/jgoldb/archive/2010/06/22/software-rendering-usage-in-wpf.aspx?Redirected=tr...

I haven't been able to find any way to enable logging or debugging for DirectX or WPF.  It looks like that stuff can only be done through Visual Studio, unless I've been going down the wrong path.

Hope this helps!

-Robbie

Robbie - KI4TTZ


Just some more info.  I noticed a slightly different behavior, but with the same end result.  Over the last hour or so, on multiple occasions all of the panadapters would just freeze for no reason. If I click at the top of the window and just drag it around for a sec, they start moving again.  Here are some WPF analyzer screenshots to show what it was seeing at the time of the freeze, and then when I click-dragged the SmartSDR window to make it start updating again.

http://screencast.com/t/qQaH7UqEKK

And here is what it saw after I moved the window.

http://screencast.com/t/G6Iroyg4kE

The difference here is that instead of just falling straight back to software rendering, it froze first.  As usual, I have no idea if this info is helpful.  Just wanted to pass it along in case it is. :-)

-Robbie

Robbie - KI4TTZ

In taskmgr, I brought up several different metrics  (paged pool, handles, threads, etc) just to see if anything would jump out at me.

Well, it did. :-)  When I launched SmartSDR, page faults climbed at a rate of about 3,000 per second (give or take) but performance appeared to be OK.  I noticed it happening, but it wasn't too crazy until the panadapter freeze happened.  Then it jumped from 800,000 page faults (total for smartsdr.exe) to about 3,000,000 page faults within about 3 or 4 seconds (seriously, not even exaggerating).  In the time it has taken me to type this post, page faults are at 19,000,000 and climbing, and the panadapters are still frozen. (And right as I'm about to hit send, it's over 28,000,000 faults.)
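
To put those numbers side by side, here is a small sketch that turns the counts quoted above into average rates. The helper name is made up, and the 3 and 4 second figures simply bracket the "about 3 or 4 seconds" estimate.

```python
def fault_rate(start_faults: int, end_faults: int, seconds: float) -> float:
    """Average page faults per second over an interval (hypothetical helper)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (end_faults - start_faults) / seconds

BASELINE = 3_000  # faults/s reported while the panadapters were still updating

# The jump from 800,000 to 3,000,000 faults took "about 3 or 4 seconds".
slow = fault_rate(800_000, 3_000_000, 4)  # 550,000 faults/s
fast = fault_rate(800_000, 3_000_000, 3)  # roughly 733,000 faults/s

print(f"{slow:,.0f} to {fast:,.0f} faults/s during the freeze")
print(f"{slow / BASELINE:.0f}x to {fast / BASELINE:.0f}x the baseline rate")
```

Even with the slower estimate, the rate during the freeze is well over a hundred times the roughly 3,000 faults per second seen during normal operation.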

Windows still thinks I have 3 GB free out of 8 GB total.

Any ideas/thoughts?

-Robbie

Eric - KE5DTO, Official Rep

  • 718 Posts
  • 211 Reply Likes
Robbie,

Thanks for your notes here.  There is a problem that we have addressed where the Panadapter would freeze while resizing.  I am not convinced that this is the same issue based on your description but it is possible that our resolution to that issue may help with the one you are seeing.

To my knowledge this is the first Panadapter freezing that I have heard about that is not resize related.  I will be interested to know whether you have the problem with SmartSDR v1.2 due to be released this month.

It is worth noting that we haven't written any shader code directly, as this is handled in the .NET library.  So any performance issues in the shader code are likely either us using the library incorrectly or they are in the library itself.  We'll continue to tweak for performance as we move forward.

Robbie - KI4TTZ

Thanks again Eric, looking forward to the new version!  I'll keep my fingers crossed that the resize fix helps things a bit.

Robbie - KI4TTZ

Hi Eric, it doesn't look like 1.2 changed the freeze/CPU fallback behavior.

Also, I recently purchased a new Intel NUC i5 (to replace an old Mac mini connected to my TV).  The NUC i5 has the 5000 graphics chipset vs. the 4000 chipset in my Surface Pro.  The NUC i5 also experiences the exact same freeze/CPU fallback as the Surface Pro.  I tried it with a fresh install of Windows 8.1 with no other software installed except SmartSDR.  It was also connected directly to my router (not Wi-Fi), so that takes wireless issues out of the picture.

Hope this helps,

-Robbie

Eric - KE5DTO, Official Rep

OK.  Thanks for the feedback.  We'll keep looking and I'll let you know when we find something.