2.x upgrade has been a nightmare. Several problems but this is a show stopper.

  • Question
  • Updated 2 years ago
  • Answered
I am getting the square of death in the waterfall when I access SmartSDR or the iOS app via 2.x SmartLink.  The iOS app via SmartLink is useless unless I enable a separate VPN to the network that the radio resides on.

Everything works fine if I use the VPN.

According to the Facebook experts, the radio is fragmenting packets, and that is why things are broken only in SmartLink mode. OK, I was remote last week and decided to wait until I was physically at the radio to test.

I ran a Wireshark/tcpdump capture and I see the fragmentation.
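For anyone repeating this capture, a minimal sketch of what "seeing fragmentation" means at the byte level: an IPv4 packet is a fragment when its More Fragments flag is set or its fragment offset is nonzero. The function name `ipv4_fragment_info` is made up for illustration. The equivalent capture filter in tcpdump syntax is `ip[6:2] & 0x3fff != 0`.

```python
import struct

def ipv4_fragment_info(header: bytes) -> dict:
    """Decode the fragmentation fields of a raw IPv4 header (first 20 bytes)."""
    if len(header) < 20:
        raise ValueError("need at least 20 bytes of IPv4 header")
    if header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 6-7 hold 3 flag bits followed by a 13-bit fragment offset.
    (flags_and_offset,) = struct.unpack("!H", header[6:8])
    dont_fragment = bool(flags_and_offset & 0x4000)
    more_fragments = bool(flags_and_offset & 0x2000)
    # The offset field counts in units of 8 bytes.
    offset_bytes = (flags_and_offset & 0x1FFF) * 8
    return {
        "dont_fragment": dont_fragment,
        "more_fragments": more_fragments,
        "offset_bytes": offset_bytes,
        "is_fragment": more_fragments or offset_bytes > 0,
    }
```

A first fragment shows `more_fragments` set with offset 0; later fragments show a nonzero offset, and the last one has `more_fragments` cleared.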

Here is the problem. I have jumbo frames enabled and the MTU set to 9000 along the entire path to the cable modem (I have a 1 Gbps symmetric connection).

According to my switch, the radio is connecting at 1000 full duplex. It is impossible to hard-set the network settings on the radio, so it is impossible to troubleshoot.

The only support advice I have seen on this is that people changed a setting and things started working?

I do not run consumer gear. So, any ideas where to go?

1. I have jumbo frames enabled the entire path.
2. MTU is set to 9000 along the entire path except the radio, which I cannot verify.
3. The radio Ethernet port is connecting at 1000 full duplex.
4. I ran a sniffer via a spanned port on the local switch and see fragmentation on the local port. This means the radio is probably using a low MTU for some odd reason.
5. I have 1 Gbps symmetric with static IP space.
6. I am running a complete Ubiquiti stack, and all of the firmware is at the current version.
7. Other UDP fragmentation-sensitive protocols like VoIP and L2TP work fine.
8. Everything works fine when accessing the radio via L2TP, OpenVPN, and SoftEther VPNs.
9. The radio is installed on a secure IoT LAN segment because the OS is not accessible and thus its security level cannot be verified.


Ideas?

This particular 2.x problem is a showstopper since I can have the same working functionality with a 1.x firmware.  I am also having the audio drop problem when running digital modes, but that has a workaround.

David H Hickman


Posted 2 years ago


Tim - W4TME, Customer Experience Manager

Michael, VA3MW wrote...

The black box of death is just a symptom of something else and engineering may be able to figure it out once they know what is causing it.


The number of bytes that comprise a frame of display data can be, and usually is, greater than 1500.  We send display data to the client in single frames for rendering.  The display data transport is UDP (VITA-49), and the maximum Ethernet frame size on the Internet is 1518 bytes, which includes a 1500-byte data payload. Since the display frame data can exceed the maximum size of a single Ethernet frame, the radio's TCP/IP stack fragments the data, meaning that the display data frame is divided up into multiple VITA-49 packets and the client host's TCP/IP stack reassembles them into a single display frame.  IP fragmentation/reassembly is an Internet Protocol (IP) process that breaks datagrams into smaller pieces (fragments), which are reassembled by the receiving host running SmartSDR.  This capability is standard and required in all TCP/IP protocol stacks as per RFCs 791, 815 and 1191.
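To make the arithmetic concrete, here is a rough sketch of how an IPv4 stack would split one UDP datagram for a given MTU. It assumes a 20-byte IP header with no options and an 8-byte UDP header; the function name `ipv4_fragments` and the 4000-byte display frame are illustrative assumptions, not figures from the radio.

```python
def ipv4_fragments(udp_payload_len: int, mtu: int = 1500, ip_header_len: int = 20) -> list:
    """Return the IP payload size carried by each fragment of one UDP datagram."""
    udp_header_len = 8
    total = udp_payload_len + udp_header_len  # bytes the IP layer must carry
    max_per_packet = mtu - ip_header_len      # IP payload that fits in one packet
    if total <= max_per_packet:
        return [total]                        # fits: no fragmentation needed
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    per_fragment = (max_per_packet // 8) * 8
    sizes = []
    remaining = total
    while remaining > max_per_packet:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes

print(ipv4_fragments(1472))            # largest UDP payload that fits in one 1500-byte packet
print(ipv4_fragments(4000))            # hypothetical 4000-byte display frame
print(ipv4_fragments(4000, mtu=9000))  # same frame on a jumbo-frame link
```

With a 1500-byte MTU, anything over 1472 bytes of UDP payload is split. A jumbo-frame LAN only avoids this while every hop carries the larger MTU; any 1500-byte hop still forces fragmentation (or a drop, if the Don't Fragment bit is set).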

If any point along the connection, whether the radio end (including its LAN and LAN router), the multitude of transport providers on the Internet, or the LAN and LAN router at the client end, does not allow fragmented packets to pass through so that the client host can reassemble them into a single display frame, then you will experience the frozen display and no waterfall data.
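The all-or-nothing nature of reassembly can be sketched as follows. This is a simplified model, not the actual stack code: real implementations also track an identification field and a reassembly timeout. The function name `reassemble` and the sample payload are illustrative assumptions.

```python
from typing import List, Optional, Tuple

def reassemble(fragments: List[Tuple[int, bytes]], total_len: int) -> Optional[bytes]:
    """Rebuild a datagram from (offset, data) fragments.

    Returns None if any byte range is missing, which is what the receiver
    effectively sees when a device on the path drops fragments.
    """
    buf = bytearray(total_len)
    next_expected = 0
    for offset, data in sorted(fragments):
        if offset + len(data) > total_len:
            raise ValueError("fragment extends past the datagram")
        if offset > next_expected:
            return None  # gap: an earlier fragment never arrived
        buf[offset:offset + len(data)] = data
        next_expected = max(next_expected, offset + len(data))
    return bytes(buf) if next_expected >= total_len else None

# Hypothetical 3072-byte datagram split at 1480-byte boundaries.
payload = bytes(range(256)) * 12
frags = [(0, payload[:1480]), (1480, payload[1480:2960]), (2960, payload[2960:])]

print(reassemble(frags, len(payload)) == payload)  # every fragment arrived
print(reassemble([frags[0]], len(payload)))        # later fragments were dropped
```

Losing any single fragment discards the whole datagram, so a firewall that passes only the first fragment still leaves the client with no display frame at all.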

Now, it is very unlikely that the transport providers are not in compliance with RFCs 791 and 815, so that leaves the radio and client ends (endpoints) of the connection as the probable areas where the issue is occurring.  Edge routers that are used for these endpoints usually have firewall features and by their very nature block different traffic types.  With some routers you have control over the features, and with others you have very limited control.  Some client locations block fragmented packets as a security feature because, in the past, hackers have used this part of the TCP/IP protocol as a DoS attack vector.  Rather than using a stateful process to inspect the packets, they use the brute-force method of denying all of them.  Other client locations deny UDP fragments because they represent a possible high-bandwidth usage scenario that they want to prevent; again, this is a brute-force method of controlling data flow, used in place of a stateful firewall that operates at layers 6 and 7. If one of these features is preventing the multiple VITA-49 packet fragments from traversing the firewall/router, then the SmartSDR client is not getting the data it needs to reassemble the packet fragments and build a complete frame of display data.

This behavior is easy to validate.  If you have this issue, start reducing the size of the SmartSDR application so that the panafall's geometry begins to get smaller.  At a certain point, it will reach a size where all of the data for a single display frame is 1500 bytes or less.  In this scenario we do not need multiple VITA-49 packets to construct a display frame; fragmentation is not required and the display will begin rendering.
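The same check can be approximated outside SmartSDR with a small UDP probe: send datagrams of increasing size and see which ones come back. The sketch below is an assumption-laden illustration, not a FlexRadio tool. It needs a UDP echo responder at the far end of the path being tested (`run_echo_server` here, which you would have to run yourself on a host behind the suspect firewall), and both function names are hypothetical.

```python
import socket
import threading

def run_echo_server(sock: socket.socket, stop: threading.Event) -> None:
    """Echo every UDP datagram back to its sender until stop is set."""
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(data, addr)

def probe(host: str, port: int, sizes: list, timeout: float = 1.0) -> dict:
    """Send one datagram per size; True means the full payload came back."""
    results = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for size in sizes:
            payload = bytes([size % 251]) * size  # fill byte varies with size
            try:
                s.sendto(payload, (host, port))
                reply, _ = s.recvfrom(65535)
                results[size] = reply == payload
            except OSError:  # covers timeouts and "message too long"
                results[size] = False
    return results

if __name__ == "__main__":
    # Loopback demonstration only; point probe() at a remote echo host
    # to exercise a real path.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    stop = threading.Event()
    worker = threading.Thread(target=run_echo_server, args=(server, stop), daemon=True)
    worker.start()
    print(probe("127.0.0.1", server.getsockname()[1], [1000, 1472, 1473, 3000]))
    stop.set()
    worker.join()
    server.close()
```

On a path that blocks fragments, sizes up to 1472 bytes would be expected to succeed and larger ones to time out, which mirrors the point at which the shrinking panafall starts rendering again.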

When we engineer a network feature, we do so in order to achieve the most efficient transfer of data while being compatible with the greatest number of client configurations. In the case of how we transfer display frame data, we chose the method described above because it uses the least amount of CPU resources on the radio where they are finite (the fragmentation/reassembly is performed using the TCP/IP kernel mode daemon) and fragmentation is a standard component of the TCP/IP protocol stack, so it is universal. We have been using this method of rendering display data since before SmartSDR v1.0.0 was released and it has worked flawlessly.

Since the majority of customers who are using SmartLink are not experiencing this particular problem, meaning that they connect to their radio from a different location or Internet provider and it functions as intended, our primary design goal has been met.  For those instances where this was not the case, turning off certain firewall/router or client-based Internet security features has resolved the problems. This does not mean that every use case will experience success.  Cases where the radio or the client end of the connection (the LANs) uses a variety of firewall policies (either deliberately configured or, more usually, unknown to the user), or a configuration that would be considered advanced in nature, may not operate properly.  In these cases, we will work to identify the cause of the problem, but FlexRadio's support mandate does not extend to advanced network troubleshooting, analysis or issue resolution.

Since we are aware of the situation, we have a defect listed in our bug tracker to investigate possible remedies for data paths that are blocking VITA-49 fragments.  We have some ideas, but until we investigate and analyze the impact they would have on the radio's operational performance, we are not ready to commit to an alternate solution or a time frame.  I hope this clears up the nature of the situation at hand.

And I hope David can find a resolution to his issue as he has been provided the key information as to why his issue is occurring.  

David, if you believe the radio is at fault, then open a HelpDesk ticket and send it in for validation.  We'll put it on our LAN and I'll do the client testing myself, since both networks are known good and run SmartLink daily.  I use my location for worldwide demo purposes, most recently from Germany and Japan.  If there is a problem, we'll correct it.
