Re: Latchup Protection and Watchdog Parts

25 Jan 2023


Thank you for responding with your insight, Bob!
On 1/24/23 19:15, Bob Stricklin via pacsat-dev wrote:
...
First, the Excel spreadsheet I sent is an early look at the currents 
needed. Since I put that together, some of the parts have changed and 
some have been added.
I am sure we have a power issue, but I am taking the position of first 
trying to get everything we want done; then we can back down on 
capability and reduce power later.
There is not a limit or budget on power at this time.
Thank you. I meant to mention that I considered the spreadsheet a 
first-order approximation, but I may have missed that in my revisions.
...
Each time you add one of these current monitors to the design, you 
introduce another part that can fail due to latch-up and other causes.
The action taken for each monitor added may be different. Latch-ups 
are possible from radiation exposure. These can be single events, or 
they can result in a hard failure of a part. When there is an event 
with high current, the plan may be to power down, wait for a period 
of time, and then try to restart. If it is the processor with an issue, 
then you are restarting everything; if it is a subcircuit, then you may 
be able to do a quick recycle. There are different types of current 
monitors to help you with your action plan. It may also be necessary 
to build a subcircuit to get the results needed.
We're not necessarily dealing with hard failure of a part with this 
current switch. We are specifically dealing with single-event upsets 
leading to latchup from a radiation effect, which in turn results in 
unregulated power consumption. That condition is considered transient and 
is resolved with a power cycle, hence the use of this part in Fox and 
now Golf. From our recent experience, hard failure of a part seems 
relatively rare, and we haven't had a recent satellite with batteries 
that lasted long enough to deal with total ionizing dose, for example. 
(I don't know for sure which AMSAT satellites used non-hardened 
integrated circuits and thus would be susceptible to that effect.)
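
To make the power-down, wait, and retry behavior concrete, here is a minimal sketch of how I picture it. It is only a model written for this discussion, not the behavior of the actual current switch part or of any Fox or Golf code. The class name, the 500 mA trip threshold, and the three-tick hold-off are all made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass
class LatchupGuard:
    """Hypothetical model of a current-limit switch with a retry hold-off."""
    trip_ma: float        # assumed trip threshold in milliamps
    holdoff_ticks: int    # assumed number of ticks to keep the rail off
    powered: bool = True
    off_ticks: int = 0
    trips: int = 0

    def step(self, current_ma: float) -> bool:
        """Advance one tick with the measured rail current; return rail state."""
        if self.powered:
            if current_ma > self.trip_ma:
                # Overcurrent: treat it as a possible latchup and cut power.
                self.powered = False
                self.off_ticks = 0
                self.trips += 1
        else:
            # Rail is off: count down the hold-off, then try to restart.
            self.off_ticks += 1
            if self.off_ticks >= self.holdoff_ticks:
                self.powered = True
        return self.powered


# Illustrative samples: normal draw, one overcurrent event, then recovery.
guard = LatchupGuard(trip_ma=500, holdoff_ticks=3)
samples_ma = [120, 130, 900, 0, 0, 0, 125]
states = [guard.step(s) for s in samples_ma]
```

The rail drops on the 900 mA sample, stays off for the hold-off, and comes back on its own. A hard failure would show up in this model as the guard tripping again on every restart, which is where a retry limit or a longer wait would come in.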
...
<snip>
I worked on optical ICs, and since these were exposed to light we had 
to be careful not to create an issue with latch-up. When a new design 
comes out of wafer fab, it is one of the early tests you do to see if 
you have issues. If you find a problem, you have to try to fix it by 
changing the die layout, adding more metal, or modifying the circuit. When 
a device is “radiation hardened” this should also be done, and hopefully 
the TMS570 had this done. It still could fail with radiation, though.
One thing to point out... I don't believe the TMS570 is radiation 
hardened. I understand it's used in safety critical equipment and has 
special circuitry to detect failure modes. But I wouldn't expect it to 
be immune to single-event upsets. In the case of bit flips that impact 
processing, the TMS570 could detect that as a failure when comparing the 
results of the two cores and assert a failure. In the case of the RT-IHU 
this would result in failover to the mirror processor. In the case of 
the PACSAT payload, which I believe is running a single TMS570, the 
failure line could be tied to the power circuit to force a reset. If the power 
circuitry of the TMS570 suffers a single-event upset that latches up a 
power rail, I'd expect we'll depend on the current switch to detect it and 
cycle power to recover. (On a related topic, it's pretty fascinating 
to examine the Fox telemetry and observe the impact of the SAA. I don't 
know if Fox reset every time it traversed the SAA but it was quite 
impactful.)
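
Spelled out as a small decision function, my reading of the recovery paths above looks roughly like this. It is a sketch of the reasoning only, not a description of actual RT-IHU or PACSAT firmware. The enum names and the `has_mirror` flag are hypothetical.

```python
from enum import Enum, auto


class Fault(Enum):
    LOCKSTEP_MISMATCH = auto()  # processor flags a core-compare failure
    RAIL_OVERCURRENT = auto()   # latched-up rail caught by the current switch


class Action(Enum):
    FAIL_OVER = auto()          # hand control to the mirror processor
    POWER_CYCLE = auto()        # remove and restore power


def recovery_action(fault: Fault, has_mirror: bool) -> Action:
    """Pick a recovery path; has_mirror is an assumed per-board flag."""
    if fault is Fault.LOCKSTEP_MISMATCH and has_mirror:
        # RT-IHU style: a second processor is available to take over.
        return Action.FAIL_OVER
    # Single-processor board, or a latched rail: only a power cycle clears it.
    return Action.POWER_CYCLE


rt_ihu = recovery_action(Fault.LOCKSTEP_MISMATCH, has_mirror=True)
pacsat = recovery_action(Fault.LOCKSTEP_MISMATCH, has_mirror=False)
latched = recovery_action(Fault.RAIL_OVERCURRENT, has_mirror=True)
```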
As long as we're talking about radiation effects, nothing we're doing 
will mitigate the cumulative radiation effects that will ultimately 
degrade our chips and cause them to fail.
Jonathan
-- 
Jonathan Brandenburg
Radio Amateur Satellite Corporation
1-214-213-1066
