SmartLink Post-mortem for October 14, 2023 Outage
Background
The SmartLink system uses a secrets storage system to hold sensitive information securely. A secret might be, for example, the credentials for accessing a resource such as a database. Storing secrets this way follows industry best practices for securing online services like SmartLink. The secrets storage system has a permissions system that allows only authorized users and applications to access each secret. A third-party library allows the SmartLink server application to retrieve the secrets it needs when the server application first starts up.
A known issue currently causes the SmartLink server application to crash approximately every two days. This is not a large issue, since there are systems that automatically restart the application after a crash, and the failover features of the server mean there is no noticeable interruption to users. Nevertheless, Software Engineering has been working on fixes to this issue. Since the issue only seems to happen in production and not in local test instances, Software Engineering has been working on deploying a development environment at our cloud provider.
Incident Description
On Friday, 2023-10-13, Software Engineering was working on deploying a test environment to our cloud provider. As part of this work, a new secret was added to the secrets storage system. Since the production environment did not need access to this secret, it was not given permissions for it.
On Saturday, 2023-10-14, at approximately 11:18 UTC, the SmartLink application server restarted because of a crash. This caused the third-party library to attempt to reload the secrets from secrets storage during startup. Since the production application did not have permissions to the newly added secret, the application crashed shortly after startup. The system began restarting the application repeatedly, and each restart failed the same way, so no user requests were serviced.
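The failure described above is consistent with a loader that reads every secret in the store at startup, so that a single unreadable entry stops the whole application. The following is a minimal sketch of that failure mode, not the actual third-party library or the SmartLink code. `FakeSecretStore`, `load_all`, `load_required`, and the secret names are all hypothetical, invented for illustration.

```python
class PermissionDenied(Exception):
    """Raised when the caller is not allowed to read a secret."""


class FakeSecretStore:
    """Hypothetical in-memory stand-in for a secrets storage system."""

    def __init__(self, secrets, readable):
        self._secrets = dict(secrets)
        self._readable = set(readable)  # names this caller may read

    def list_names(self):
        return sorted(self._secrets)

    def get(self, name):
        if name not in self._readable:
            raise PermissionDenied(name)
        return self._secrets[name]


def load_all(store):
    """Fragile: reads every secret, so one unreadable entry aborts startup."""
    return {name: store.get(name) for name in store.list_names()}


def load_required(store, required):
    """Reads only the secrets this application actually needs."""
    return {name: store.get(name) for name in required}


# Production may read its own secret but not the newly added test secret.
store = FakeSecretStore(
    secrets={"prod-db-password": "s3cret", "dev-db-password": "devpass"},
    readable={"prod-db-password"},
)
```

In this sketch, `load_all(store)` raises `PermissionDenied` as soon as it reaches the secret that production cannot read, while `load_required(store, ["prod-db-password"])` succeeds because it never touches that secret.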
At approximately 14:30 UTC, Engineering responded to reports from support indicating issues with SmartLink, and began debugging. Once the issue was identified, a new version of the server application was deployed to the production environment. At approximately 16:56 UTC, the issue was resolved and normal operation restored.
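For reference, the durations implied by the timestamps above can be worked out as follows. This is only arithmetic on the three times reported in this post, all of which are approximate; the variable names are chosen here for illustration.

```python
from datetime import datetime, timezone


def utc(hhmm):
    """Build a UTC timestamp on 2023-10-14 from an 'HH:MM' string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2023, 10, 14, hour, minute, tzinfo=timezone.utc)


crash = utc("11:18")     # application restarted and began crash-looping
response = utc("14:30")  # Engineering began debugging
resolved = utc("16:56")  # normal operation restored

time_to_respond = response - crash
time_to_repair = resolved - response
total_outage = resolved - crash

print(time_to_respond, time_to_repair, total_outage)
# prints: 3:12:00 2:26:00 5:38:00
```

So roughly 3 hours 12 minutes passed before Engineering responded, the repair took about 2 hours 26 minutes, and the total outage was about 5 hours 38 minutes.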
Remedial Actions
- The third-party library used to access the secrets does so in a way that is prone to breaking. There are alternative methods of passing the secrets to the application software, and Engineering restored normal operation by switching to one of them. Some refinements to this process are still needed to improve security, and Software Engineering will continue to make them.
- There needs to be more separation between the production and development or testing instances. Software Engineering will be creating a more isolated environment according to our cloud provider’s best practices. There are also mitigations that can be performed in the meantime to attempt to protect the production environment from similar unintended conflicts.
- Software Engineering continues to work on the precipitating issue causing the application software to crash every few days.
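The post does not say which alternative method of passing secrets was adopted. As one common possibility, the deployment platform can inject each secret into the process environment, and the application can read only the names on an explicit list. The sketch below assumes that approach. `REQUIRED_SECRETS`, `SMARTLINK_DB_PASSWORD`, `MissingSecretError`, and `read_secrets` are hypothetical names and do not come from the SmartLink code.

```python
import os

# Hypothetical list of the secrets this application needs; nothing else is read.
REQUIRED_SECRETS = ("SMARTLINK_DB_PASSWORD",)


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret was not provided."""


def read_secrets(environ=os.environ, required=REQUIRED_SECRETS):
    """Return the required secrets from the environment, ignoring all others."""
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Fail with a clear message naming exactly what is absent.
        raise MissingSecretError("missing required secrets: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

With this shape, a secret added for a development or test instance has no effect on production startup, because production never attempts to read a name that is not on its own list.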
Comments
- Thank you for the transparency.
- Mike, could this ongoing issue have any bearing on the problem I have had ever since the initial outage? I only get a partial login on my Maestro when remoting my Flex 6500 (basic display, no panadapter, no audio).
Yesterday I got four successful remote connections. Today, no complete connections.
Bob, KN4HH
- No, not at all.
Something is blocking the VITA49 packets, which are not controlled by SmartLink at all.
Please open a support ticket.
73