
Flex Lockup on 1.10.16


Comments

  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    David Livingston - as noted the issue is a software problem and not a hardware issue so returning it is not covered under the hardware warranty.  Downgrading is a temporary recommendation until we can definitively determine the root cause of the issue. 
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    Paul - it is difficult to answer your question because the sample sizes are vastly different and there aren't enough data points to accurately extrapolate the data, but empirically, the 2.0 code we are currently testing does seem to have a lower incident rate.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    Rick - Your suggestion to have everyone who has experienced the problem post that they are having it isn't going to accelerate the process.  What we need is a reproducible set of events or actions that causes the problem.  That is constructive feedback we can use to fix it.  Reports without constructive feedback are just complaints, and we are already aware of the issue as noted above.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    N1MW - Resolving the DAX issues is also high on our fix list.
  • Paul M
    Paul M Member ✭✭
    edited June 2017
    Tim - Thanks for the reply, but I am not asking for a statistically accurate answer.  Many of us on this board have either owned companies or worked with companies that have had hardware/software issues that were difficult to track down, and we all appreciate the difficulty when the issue is not easily reproducible.

    Have any of your Alpha or Beta testers had premature terminations with 1.16 that required a long press or removal of power?  Of that subset, if any, have they experienced the same issue with 2.0 Alpha or Beta?  

    As for my system, I ran for over a month without a stoppage and then the other day, while transmitting in WSJT-X I had a sudden stop.  Yesterday I had both a radio freeze and a computer freeze in that order.   There was a delay in the computer freeze of about 30 seconds.  Again I was using WSJT-X but in receive only.  I had not transmitted since the radio started about 30 minutes before the stop.    It seems that putting the Flexcontrol at port 240 has stopped the SSDR quitting with the error related to the Flexcontrol.  

    I still have random issues with the USB ports not controlling the SteppIR controller and even less frequent issues with the USB ports not controlling the SPE amplifier.  Disabling and enabling normally starts the process.

    Best regards,

    Paul


  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    Tim,

    Since I see this problem quite regularly, can I get a copy of the 2.0 BETA to test on my setup?  If you want to really put it to the test where this problem is concerned, there are several of us here who should be included in final testing.

    Thanks,

    Rick, W7YP

  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    Paul, Any answer to your question may be misinterpreted, so let me try to explain it in detail.

    1.) Yes, alpha team members do experience radio crashes, freezes and occasionally a "bricked" radio.  This is the risk they assume by being on the alpha team.  They are black box testing alpha code and by its very nature, it may have issues we did not anticipate or encounter in white box testing.

    2.) Now, the $64k question.  Are these incidents related to the issue that people in the field are experiencing?  Honestly, we don't know for certain.  This is the reason for my answer above.

    This particular issue is not a systemic bug.  If it were, it would have been fixed immediately because it would have been reproducible in our lab and we could have determined root cause very quickly.

    To date, we have not had a radio in our lab (and we have lots of them) running 1.10.16 lock up in the manner described on the Community.  If we can't catch the issue in action while debugging, then determining root cause becomes much more challenging, because we have to start doing thought experiments on how the fault might hypothetically occur and then go look at that section of code to see if anything there might be a contributing factor.

    We have tried in-the-field logging, but the crash is a seg fault, meaning that when the crash occurs, the logging facility crashes too before we can catch any "last gasp" console feedback.  Therefore we get no useful forensics from the logging we have tried.

    In addition, the crashing is really random.  It happens whether the radio is in use or idle, and whether it has been running for 10 minutes or 10 days, so there is no pattern to help us zero in on the root cause.  Also, we have not received any feedback from users having the problem that describes a particular set of steps or events that can reproducibly generate the failure.  If we had this information, the issue could be addressed in very short order.

    There is another dynamic in play here too: a lot of reports of issues that are not applicable have been lumped into this post.  I wrote a response that differentiates other types of fault behaviors from the one associated with this post.  There are situations where a network issue can look like a crash, but isn't.  These "false positive" reports can muddy the waters when trying to determine the root cause of the issue.

    It is possible that, through the process of building SmartSDR v2.0, we have serendipitously squashed the bug that is causing the problems and 2.0 users will not experience it.  We will not know for certain until we can get more data points (more people using the code) and validate that the issue has been mitigated.  If this turns out to be the case, then we have something to work with, as we can look at the changes we made for 2.0 and if any look to be good candidates for a fix, we'll incorporate them into the planned 1.x maintenance release.

    I also want to close by saying that none of this hampers our desire and drive to find this bug and squash it.  As hard as it may be to believe, this issue gnaws at us more than you because it is an enigma we can't unravel.  As noted earlier, we are seriously committed to fixing this issue.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    Rick - I am afraid that Alpha releases are not available for public distribution.
  • k0eoo
    k0eoo Member ✭✭
    edited June 2017
    Tim - I apologize if this has been mentioned before. Back in the day, we used to find very intermittent bugs (sometimes called race conditions) by running timing margins and voltage margins...

    Again, I apologize if this has already been done....  Just what FRS needs is some 72-year-old yahoo looking over your shoulder.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    No problem. We have a variety of tools to help identify these types of issues.
  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    Tim,

    Please don't take this personally; but, if your existing tools have yet to reproduce the problem in your lab, yet it's occurring on many customers' installations, then your tool kit is less than totally adequate.  That leaves you with two options:  (1) Expand your alpha/beta testing to include many of those who are frequently experiencing the problem, or (2) Look for the deficiencies in the tools and testing methods you're currently using.  Especially when close to a major release, we found the first approach to be the most productive and least taxing on our own resources.

  • Don, VE2HJ
    Don, VE2HJ Member ✭✭
    edited June 2019

    Hi Tim, from your post on 19 April I want to clarify this:

     "do a normal power off on the radio by pressing and releasing the power button.  Give it a good 15-30 seconds to respond. "

    If the radio shuts down after the delay without indicating that it is shutting down, is this a lockup that you want us to report? (On my last 4 events, my 6500 shut down after 45, 60, 90, and 60 s.)

    73,  Donald, VE2HJ

  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    It has always taken my 6700 roughly a minute to shut down following a crash and, so far, it has always eventually reported "Shutting down" on its display.  Only a couple of times have I had to do the long push and hold on the on/off button to shut it down.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    "Only a couple of times have I had to do the long push and hold on the on/off button to shut it down."

    That condition indicates a crash.  The other condition does not.
  • Tim - W4TME
    Tim - W4TME Administrator, FlexRadio Employee admin
    edited June 2017
    Thanks for your feedback.  We'll take it under consideration.
  • Eric-KE5DTO
    Eric-KE5DTO Administrator, FlexRadio Employee admin
    edited June 2017
    To clarify, there are two separate shutdown measures that the radio takes.  If the radio firmware is up and running properly, a single short press of the power button should initiate a shutdown within 5-10 sec.  Otherwise, there is a 60 sec timer in the power control chip that will shut things down if the firmware does not respond.  

    The latter is what indicates the firmware was not running properly.  Holding the power button down for ~4 seconds performs the same shutdown as the end of the 60 sec timer.
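
The two shutdown paths described above can be turned into a rough rule of thumb for deciding what a given power-off actually indicates. The following is only a sketch, not FlexRadio code: the names `ShutdownPath` and `classify_shutdown` are hypothetical, the thresholds are taken from the figures quoted in the post (5-10 sec, 60 sec, ~4 seconds), and treating any short-press shutdown slower than 10 seconds as the timer path is an assumption made for illustration.

```python
from enum import Enum


class ShutdownPath(Enum):
    FIRMWARE = "firmware-initiated shutdown (firmware was running)"
    POWER_TIMER = "60 sec power-control timer (firmware did not respond)"
    FORCED = "long-press shutdown (same action as the timer expiring)"


# Thresholds taken from the post above; hand-timed values will be approximate.
LONG_PRESS_S = 4.0      # holding the button ~4 seconds forces the shutdown
FIRMWARE_MAX_S = 10.0   # a healthy firmware shutdown starts within 5-10 sec


def classify_shutdown(press_seconds: float, seconds_to_power_off: float) -> ShutdownPath:
    """Guess which shutdown path was taken from two hand-timed observations."""
    if press_seconds >= LONG_PRESS_S:
        return ShutdownPath.FORCED
    if seconds_to_power_off <= FIRMWARE_MAX_S:
        return ShutdownPath.FIRMWARE
    # Assumption: anything slower than a normal firmware shutdown is
    # attributed to the power-control timer.
    return ShutdownPath.POWER_TIMER


def firmware_was_responsive(path: ShutdownPath) -> bool:
    """Only the firmware-initiated path shows the firmware was still running."""
    return path is ShutdownPath.FIRMWARE
```

For example, a short press followed by power-off about a minute later would be classified as `POWER_TIMER`, which by the description above means the firmware was not running properly.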
  • Mike W8MM
    Mike W8MM Member ✭✭
    edited July 2019
    I've lately had a couple of "freeze-ups" running 1.10.16 where my VPN (SoftEther) remote session of SmartSDR just quit and the radio chooser option for that particular radio vanished from the radio-setup screen on my remote CPU running SmartSDR under Windows 10 through Parallels on my office iMac.

    After the event, I tried other ways to connect remotely, as well as locally by WiFi/lan, using SmartSDR for iOS, etc., and the radio remained not discoverable.

    The situation was remedied by a long press of the power button once and only needed a short press another time.  It's possible that the long press was not needed the first time, I just did it for fun.

    These "freeze-ups" took less than an hour to occur over VPN.  Last night, I ran the same radio for hours using local LAN control and front-panel microphone w/powered speakers from the rear panel without any incident.

    My observed version of the problem seems VPN/remote dependent.
  • Don, VE2HJ
    Don, VE2HJ Member ✭✭
    edited June 2019

    It would be interesting to know if any radio shipped with the factory firmware version 1.10.16.**** ever had a lockup.

    IF NOT

    this would suggest it is not the firmware but something else (upgrading process, virus, etc.).

    So I think that we should indicate the serial no. of the radio when we report a crash.

    73, Donald, VE2HJ

  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    I think it's fair to say that, to most people, once the radio becomes unresponsive to SmartSDR and will not reconnect, it has "crashed", especially when most of the time a corrupt profile is the 'parting gift'.

  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    It might also be informative to find out, from those experiencing the problem on 1.10.16 after upgrading to it, whether they had reset to factory defaults BEFORE doing the upgrade, as well as what version they were running before they upgraded to 1.10.16.
  • NM1W
    NM1W Member ✭✭
    edited July 2017
    6700 crash with tone this am while sitting idle on 40m cw.  Radio had been on about an hour;
    Upon reboot radio had 6 of 8 slices apparently randomly distributed to the 8 pans.

    Yesterday radio was on about 12 hours doing cw, no issues... Nothing different between how I ran yesterday or today. Radio has default profiles.

  • James Skala
    James Skala Member
    edited June 2017
    Lockup occurred on a freshly rebooted PC and radio about 10 mins into reboot.  But this time the lockup occurred while TX'ing.  The TX was locked up even with no connection to the radio.  Red TX was on the power button; tried to reconnect to the radio by relaunching SmartSDR, but the radio was not listed.  A short press of the power button did not turn off the radio.  Long-press power off was required.

    Flex 6300 1.10.16 No USB's
    SPE 1.3K-FA
    PALSTAR HF AUTO
  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    Welcome to the club.  I've been experimenting with 1.10.15 for the past 8 days and it's been far more stable on my radio than was 1.10.16.  If you decide to give it a try, be sure to reset your radio to factory defaults on 1.10.16 BEFORE doing the back-rev to 1.10.15. 
  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    After several days running without a radio lockup on 1.10.15, but only doing SSB or CW, I fired up WSJT-X to see if DAX might be a factor in precipitating the radio crash.  2.5 hours later, I had my answer.  SSDR had lost connection to the radio and the radio had to be restarted, restored to factory defaults and my profiles reloaded.

    Virtually all other crashes I saw on 1.10.16 occurred either while running some kind of digital mode or after having done so for some period of time.  DAX would seem to be a contributor to this lockup problem.

  • Bill W2PKY
    Bill W2PKY Member ✭✭
    edited June 2017
    I have been running 1.10.16 w/DAX but with only 4 channels active. Will activate the remaining channels to see if problems return.
  • DaveC
    DaveC Member ✭✭
    edited May 2018
    Just had a lock up with long button reset. Been a long while since I have had a lock up (months). BTW it is almost 90 degrees here today.


    Tim, have you tried putting the radio in a heat chamber (hot or cold) to see if it has any effect?
  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    It may well have nothing to do with DAX.  After this morning's crash, I restarted the radio, forced factory defaults and then reloaded my profiles.  Did nothing more with the radio but it crashed while idling within 20 minutes.  The profile I loaded was newly created 9 days ago, right after a reset to factory defaults.  Then it was saved.
  • Rick W7YP
    Rick W7YP Member
    edited June 2017
    Dave, my 6700 is in a temperature-controlled room, kept to 70F year-round.
  • NM1W
    NM1W Member ✭✭
    edited July 2017
    Just booted the PC, turned on the rig, fired up dxlab, sdrbridge, frstack.... and was just about to get started looking around when I heard that annoying tone and saw SSDR drop connection... Rig hadn't been on 10 minutes... long button push to recover...
  • NM1W
    NM1W Member ✭✭
    edited June 2019
    Banner night;  I now can state there are at least two levels of tone; Earlier had a more subdued not as loud grocky kind of tone.. 

    Just now, running 4 wsjtx in a relatively cool room (it's 73), I had a much louder tone;
    The rig starts emitting the tone, I observe it's not in xmit mode, so I start loading this page; After a few seconds SSDR lost comm;
    I hit the rig on/off button once, and began navigating here and typing. The rig shut down; I didn't notice if it ever said "shutting down", and it took probably a good minute.
    I thought Tim had said this wasn't a crash (since the button push "worked"), but the rig wailing and being non-responsive to SSDR sure seems to me to be a crash...
    Rig had been on 1.5 hours (or however long it was from my last crash)

