We don't know the exact root cause of Fox-1E (AO-109)'s current behavior. We do know that the behavior could be explained if the IHU never started up. Further, we know a few ways that this could happen:
1) The 'attached' line (which normally senses whether a USB umbilical is attached) is held high for some reason (a short, for example).
2) The flash program memory is corrupted. Even a single flipped bit will cause the CRC check to fail, and the processor will power cycle. If power cycling does not fix the problem, it will loop forever.
3) Some other diagnostic test of the processor (a memory test, for example) fails, which also causes the power cycle loop.
It was decided to behave this way long before AO-85 was launched, and the code remains even though we have never seen any of these problems. While there is no knowing whether one of these really is the cause of AO-109's problems, I'd like to reduce the probability that a false signal or a flipped bit totally stops the processor, especially for LTM-1, but also for Golf-TEE. I consider the latter a slightly lower priority since we also have the RT-IHU. And yes, I know the LIHU is supposed to be a backup to the RT-IHU.
Problem 1 could be worked around by adding code in the boot loader and startup to continue after several hours even when attached is high. We would also need an "ignore umbilical" uplink command similar to the one that currently exists as a console command.
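As a rough sketch of the Problem 1 workaround (not the actual flight code), the decision could look something like the following. The function name, the parameter names, and the 4-hour value are all hypothetical; the only things taken from the discussion above are "continue after several hours" and the idea of an "ignore umbilical" flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value: the text above says only "several hours". */
#define ATTACHED_TIMEOUT_SECONDS (4u * 60u * 60u)

/*
 * Decide whether startup should keep holding because an umbilical
 * appears to be attached. All names here are illustrative assumptions.
 *
 * attached_line_high: raw state of the 'attached' sense line
 * ignore_umbilical:   set by the proposed uplink (or console) command
 * seconds_since_boot: uptime from whatever timer is available this early
 */
bool should_hold_for_umbilical(bool attached_line_high,
                               bool ignore_umbilical,
                               uint32_t seconds_since_boot)
{
    if (!attached_line_high) {
        return false;   /* nothing attached: start normally */
    }
    if (ignore_umbilical) {
        return false;   /* commanded to ignore the sense line */
    }
    if (seconds_since_boot >= ATTACHED_TIMEOUT_SECONDS) {
        return false;   /* line stuck high too long: continue anyway */
    }
    return true;        /* plausibly a real umbilical: keep holding */
}
```

On the ground the timeout would rarely matter, since a real umbilical session can use the console command; in orbit a stuck line would delay startup by the timeout rather than block it forever.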
Problems 2 and 3 are harder. Because these tests come immediately after boot, we have no way to "remember" across a power cycle what caused the power cycle. We have not yet started the SPI bus to write to MRAM, and we have made it a point not to write to the flash memory in orbit due to possible charge pump damage by radiation. If we could remember a count, we might try power cycling 3 or 4 times and then continue anyway if that did not fix the problem.
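To make the "try a few times, then continue anyway" idea concrete, here is a minimal sketch of the policy. It assumes a counter that survives a power cycle, which is exactly the piece we do not currently have, so the cycle_count pointer below stands in for hypothetical persistent storage. The names and the limit of 3 are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* The text above suggests "3 or 4"; 3 is an arbitrary pick. */
#define MAX_POWER_CYCLE_ATTEMPTS 3u

typedef enum {
    BOOT_CONTINUE,      /* go on with normal startup */
    BOOT_POWER_CYCLE    /* request another power cycle */
} boot_action_t;

/*
 * Decide what to do after the early self-tests (CRC, memory test, ...).
 * cycle_count points at a counter ASSUMED to persist across power
 * cycles; no such storage is available at this point in boot today.
 */
boot_action_t decide_after_self_test(bool tests_passed, uint8_t *cycle_count)
{
    if (tests_passed) {
        *cycle_count = 0;           /* healthy boot: clear the history */
        return BOOT_CONTINUE;
    }
    if (*cycle_count < MAX_POWER_CYCLE_ATTEMPTS) {
        (*cycle_count)++;           /* remember this attempt */
        return BOOT_POWER_CYCLE;
    }
    /* Power cycling has not helped: run anyway rather than loop forever.
       The count is left at the limit, so later failing boots continue
       immediately instead of cycling again. */
    return BOOT_CONTINUE;
}
```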
We could also remove these tests entirely, and "let come what may" on the assumption that many bit flips will have only minor effects.
Another possibility is to provide a hardware command that could be uplinked to tell the processor to continue in the face of these errors. The disadvantage is that this would use up our last possible hardware command on Golf-TEE and leave only a single hardware command on LTM-1.
I'm just recording my thoughts for posterity on the mailing list, but any ideas are appreciated.
Burns Fisher, WB1FJ *AMSAT(R) Flight Software*