SmartLink Post-mortem for October 14, 2023 Outage
Background
The SmartLink system uses a secrets storage system to store sensitive information securely. These secrets might be, for example, credentials for accessing a resource such as a database. This complies with industry best practices for securing online services like SmartLink. The secrets storage system has a permissions system that allows only authorized users and applications to access the secrets. A third-party library allows the SmartLink server application to access the appropriate secrets when the server application first starts up.
A known issue currently causes the SmartLink server application to crash approximately every two days. This is not a large issue, since systems are in place that automatically restart the application after a crash, and the failover features of the server mean there is no noticeable interruption to users. Nevertheless, Software Engineering has been working on fixes to this issue. Since the issue only seems to happen in production and not in local test instances, Software Engineering has been working on deploying a development environment at our cloud provider.
Incident Description
On Friday, 2023-10-13, Software Engineering was working on deploying a test environment to our cloud provider. As a part of this work, a new secret was added to the secrets storage system. Because the production environment did not need access to this secret, it was not given permissions to it.
On Saturday, 2023-10-14 at approximately 11:18 UTC, the SmartLink application server restarted because of a crash. This caused the third-party library to attempt to reload the secrets from secrets storage during startup. Because the production application did not have permissions to the newly added secret, it crashed shortly after startup. The system began restarting the application repeatedly, but each restart failed in the same way and no user requests were serviced.
At approximately 14:30 UTC, Engineering responded to reports from support indicating issues with SmartLink and began debugging. Once the issue was identified, a new version of the server application was deployed to the production environment. At approximately 16:56 UTC, the issue was resolved and normal operation was restored.
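The startup failure described above can be illustrated with a small, self-contained sketch. It is not the SmartLink code or the actual third-party library: `SecretStore`, `load_all`, `load_required`, and the secret names are all hypothetical, and the sketch assumes the library fetched every secret it could see rather than only the ones production needed.

```python
class PermissionDenied(Exception):
    """Raised when a principal is not allowed to read a secret."""


class SecretStore:
    """Hypothetical in-memory stand-in for a secrets storage system."""

    def __init__(self):
        self._values = {}
        self._readers = {}

    def put(self, name, value, readers):
        self._values[name] = value
        self._readers[name] = set(readers)

    def list_names(self):
        return sorted(self._values)

    def get(self, name, principal):
        if principal not in self._readers[name]:
            raise PermissionDenied(f"{principal} may not read {name}")
        return self._values[name]


def load_all(store, principal):
    """Eager load (assumed behaviour): one denied secret aborts startup."""
    return {name: store.get(name, principal) for name in store.list_names()}


def load_required(store, principal, required):
    """Scoped load: fetch only the secrets the application declares it needs."""
    return {name: store.get(name, principal) for name in required}


store = SecretStore()
store.put("prod/db-password", "s3cret", readers={"prod-app"})
# Secret added for the new test environment; production is not a reader.
store.put("test/db-password", "t3st", readers={"test-app"})

try:
    load_all(store, "prod-app")
    eager_result = "started"
except PermissionDenied:
    eager_result = "crashed"

scoped = load_required(store, "prod-app", ["prod/db-password"])
```

Under these assumptions, the eager load fails as soon as the unrelated test secret exists, while the scoped load is unaffected by secrets that production never asks for.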
Remedial Actions
- The third-party library used to access the secrets does so in a way that is prone to breaking. There are alternative methods to pass the secrets to the application software, and Engineering restored normal operation by switching to one of them. Some refinements are still needed to improve the security of this process, and Software Engineering will continue to make them.
- There needs to be more separation between the production and development or testing instances. Software Engineering will be creating a more isolated environment according to our cloud provider’s best practices. There are also mitigations that can be performed in the meantime to attempt to protect the production environment from similar unintended conflicts.
- Software Engineering continues to work on the precipitating issue causing the application software to crash every few days.
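The post does not say which alternative method was used to pass secrets to the application. As one common possibility only, the sketch below assumes the deployment platform injects secrets as environment variables and the application reads just the names it requires. The function name, the exception, and the variable names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret was not provided."""


def read_secrets_from_env(required, environ=None):
    """Return the required secrets from the environment (assumed injection method).

    Only the names listed in `required` are read, so unrelated secrets added
    for other environments cannot affect startup.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Fail once with a clear message instead of crashing deep in a library.
        raise MissingSecretError("missing required secrets: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

A call such as `read_secrets_from_env(["DB_PASSWORD"])` would run once at startup. Passing a plain dictionary as `environ` makes the function easy to exercise in tests without touching the real environment.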
Comments
- Thank you for the transparency.
- Mike, could this ongoing issue have any bearing on the problem I have had ever since the initial outage? I only get a partial login on my Maestro when remoting my Flex 6500 (basic display, no panadapter, no audio).
  Yesterday I got four successful remote connections. Today, no complete connections.
  Bob, KN4HH
- No, not at all.
  Something is blocking the VITA49 packets, which are not controlled by SmartLink at all.
  Please open a support ticket.
  73