Publish Blameless Post-Mortems

With SmartLink service restored, it would be good for FlexRadio to put to rest community speculation (some clear-eyed, some wild-eyed) about the root causes and aggravating factors of the outage, and to describe the process changes it has put in place both to restore service and to reduce the severity and frequency of future outages. See, for example, https://www.etsy.com/codeascraft/blameless-postmortems/

These post-mortems should be:

  • blameless (no personnel publicly identified as at fault or at risk),
  • specific (i.e. not "a cert expired" but "The intermediate CA cert responsible for authenticating radio clients and software clients to the SmartLink service expired before it was replaced"),
  • detailed in:
    • the problem space (see above),
    • the restoration space ("A new intermediate CA was signed by our internal root CA and installed on the SmartLink servers, which restored service."), and
    • the preventative space ("To prevent this from recurring, we have put in place two independent internal processes: one which automatically renews the cert 60 days before the old cert's expiration, and a separate one which checks the deployed cert and alerts engineering if the cert will expire within 45 days, indicating a failure of the primary system. We have further added an automated process to generate a 'cert-update-only' point release for all current releases, so that customers can avoid service issues without changing the current functionality of their devices."),
  • comprehensive, providing a full picture of the incident, the root and aggravating causes, and other related or confounding problems (e.g. addressing in a similar manner the SmartLink backoff timer challenges, implementing some kind of source-IP-based rate limiting at the SmartLink service front end, etc.).
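To make the "preventative space" example concrete, here is a minimal sketch of the independent check described in that hypothetical write-up: given a certificate's expiry date, report whether it is inside the 60-day renewal window or the 45-day alert window. The function names and status strings are my own illustration, not anything FlexRadio has published; the only library call relied on is Python's `ssl.cert_time_to_seconds`, which parses the `notAfter` format returned by `getpeercert()`.

```python
import ssl
from datetime import datetime, timezone

RENEW_DAYS = 60  # primary process renews this far ahead (figure from the example above)
ALERT_DAYS = 45  # independent check alerts inside this window (figure from the example above)


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days remaining before the cert expires.

    not_after uses the text format of ssl's getpeercert()['notAfter'],
    e.g. "Jan  1 00:00:00 2030 GMT". now must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def cert_status(not_after: str, now: datetime) -> str:
    """Classify a deployed cert; the status names are hypothetical."""
    remaining = days_until_expiry(not_after, now)
    if remaining <= 0:
        return "expired"
    if remaining <= ALERT_DAYS:
        # Renewal should have happened at 60 days, so the primary process failed.
        return "alert"
    if remaining <= RENEW_DAYS:
        return "renew-due"
    return "ok"
```

The point of the second threshold is that it sits inside the first: if the check ever returns "alert", the automatic renewal has already been overdue for at least 15 days, which is exactly the failure the alert is meant to surface.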

Having read a bunch of the Facebook group posts, I'm sure that, to put it concisely, haters gonna hate. However, I hope that doesn't dissuade Flex from speaking to customers who, having put in the hard work and busted knuckles to learn their respective professions and hobbies, and having made their own mistakes along the way, can respect Flex's transparency and efforts to improve their products and services.

18 votes

Completed · Last Updated


  • Alan
    Alan Member ✭✭✭✭

    Well laid out and well said.

    Constructive problem identification and resolution.

    Alan. WA9WUD

  • Pete La
    Pete La Member ✭✭

    I expect a company like FRS to be more proactive and less reactive. They have been caught with their pants down too many times recently.

    Pete K1OYQ

  • km8v
    km8v Member ✭✭
    I wish I could vote for this 100 times.
  • WK2X
    WK2X Member
    Yes, absolutely.
  • Geoff AB6BT
    Geoff AB6BT Member ✭✭✭

    This should also apply to folks who post a question with a problem on this forum.

    Instead of just saying that the problem is solved, and leaving us all hanging, tell us the solution.

    Sorry...one of my pet peeves.

  • Eric-KE5DTO
    Eric-KE5DTO Administrator, FlexRadio Employee admin

    This is such a great suggestion both in the way that it was posted (constructive) and with the excellent link that explains why this is a good practice (driving the right behaviors). I can't make any promises, but I'll do what I can from my end to make sure that we publish something like this. Thanks for the suggestion.

  • KD0RC
    KD0RC Member, Super Elmer Moderator

    Thanks for this Eric. It helps to know what happened and what is being done. I appreciate the effort that goes into a report like this (been there...).

  • Trucker
    Trucker Member ✭✭✭

    Eric, if I understand correctly from the information in the link you posted, even radios not set up for and not using SmartLink were pinging the SmartLink servers? Why would that be the case? I run v3.x software but don't use SmartLink. I can understand my radio checking for firmware and software updates on startup (if an internet connection is available). But otherwise, my radio and others not using SmartLink should have no effect on the SmartLink system.

    I have monitored my radio with Wireshark in the past and have only seen the initial update ping and nothing else. Has something changed? I understand about the certificates needed for accessing the SmartLink authentication servers. But why would this even be needed for someone not using SmartLink?

    As an aside, I think that, as some have requested, there should be an explicit way to log off of the SmartLink system, with no pings or attempted connections between the radio and the SmartLink server until the user wants to use SmartLink again (just the firmware/software update check on startup, if an internet connection is available).

    Just my thoughts.



  • Eric-KE5DTO
    Eric-KE5DTO Administrator, FlexRadio Employee admin

    The intention is that if your radio isn't registered for SmartLink, it won't contact the SmartLink server (unless it is initiating the registration process). That is the on/off switch as far as our intended design goes. As stated in the postmortem:

    ...radios that were not registered for SmartLink were making an initial connection to the SmartLink server.

    This is a bug that will be addressed.

  • Trucker
    Trucker Member ✭✭✭

    Eric, I think I understand better. One other question. You mentioned registering for SmartLink as the on/off method. Does that mean that once a SmartLink user is finished connecting to their remote PC, the user is no longer logged into SmartLink for authentication until the next session? And if they connect locally over their home network, is SmartLink not involved in connecting to the radio at all, with no attempt by SmartSDR to look for a connection to SmartLink?

    Thanks for the information. (Just trying to understand the process.)



  • Eric-KE5DTO
    Eric-KE5DTO Administrator, FlexRadio Employee admin

    There are multiple pieces of the puzzle here. When you "log in", you are simply getting an authentication token. That token is then used when connecting to the SmartLink server. Once you are connected to the radio, the SmartLink server is no longer in the mix with regard to the connection to the radio.

    For local connections, the SmartLink server is not involved at all. If you are logged into SmartLink, the radio chooser may show SmartLink-connected radios if any are available outside of your local network. But connecting to a local radio doesn't involve SmartLink at all. SmartLink acts as a broker, and a broker isn't necessary on a LAN because the radio is broadcasting "here I am" with discovery packets, which help the client know about the available radios.
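For readers who find a toy model easier to follow, the description above can be sketched roughly as follows. This is not FlexRadio code or the SmartLink API: `Radio`, `Broker`, and `radio_chooser` are hypothetical names, and the sketch only models the idea that local radios come from discovery packets while remote radios come from the broker, and only when the client holds a token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Radio:
    name: str
    address: str


class Broker:
    """Hypothetical stand-in for the SmartLink server: it only lists radios
    for a client that presents a valid token. It carries no radio traffic."""

    def __init__(self, registered, valid_tokens):
        self._registered = list(registered)
        self._valid_tokens = set(valid_tokens)

    def list_radios(self, token):
        if token not in self._valid_tokens:
            raise PermissionError("invalid or missing token")
        return list(self._registered)


def radio_chooser(local_discovered, broker=None, token=None):
    """Return (source, radio) pairs for a radio chooser.

    Local radios are listed from discovery packets alone. The broker is
    consulted only when the client is logged in (has a token).
    """
    choices = [("local", radio) for radio in local_discovered]
    if broker is not None and token is not None:
        local = set(local_discovered)
        choices += [
            ("smartlink", radio)
            for radio in broker.list_radios(token)
            if radio not in local
        ]
    return choices
```

With no token, `radio_chooser` never touches the broker, which mirrors the point that a purely local connection does not involve SmartLink.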

  • Dan Trainor
    Dan Trainor Member ✭✭✭

    Cheers to Flex Team for this Transparency. Very good.

  • Trucker
    Trucker Member ✭✭✭

    Thank you, Eric, for clarifying how the system works. I have had people argue with me who thought the system had to phone home for every single setup, whether or not the user was using SmartLink. I always thought that was wrong, as I have used Wireshark several times in the past and never could verify any activity outside of my LAN beyond the check for new updates on startup.


