[AR] Re: LEO radiation shielding
- From: Henry Spencer <hspencer@xxxxxxxxxxxxx>
- To: Arocket List <arocket@xxxxxxxxxxxxx>
- Date: Sat, 21 Dec 2019 02:30:09 -0500 (EST)
(Catching up on some postings I set aside "for a bit later" when I got
busy earlier...)
On Thu, 5 Dec 2019, Elliot Robert wrote:
> Just to add more data points to the discussion. The first Mexican
> cubesat, aztec-sat, which launched today, is using a BeagleBone Black
> board for its primary computer. That's a decidedly terrestrial design.
> About $65 USD from Digi-Key.
That's not the first BBB in orbit, either, although it may be the first
that's mission-critical. It'll be interesting to see how well that works.
> I wonder if those of you with the experience could elaborate on the
> memory error correction software being flown routinely.
Proper memory error correction, alas, *really* has to be in hardware.
You can sort of fake it a little in software for things like control
variables, but there isn't a practical way to do software error correction
on, e.g., the program's stack -- at least not without serious compiler
modifications and a lot of overhead -- and a one-bit error there can make
an awful mess. Been there, done that, not doing it again.
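As a rough illustration of the "fake it a little in software" approach for
control variables, here is a minimal Python sketch of a triple-redundant
variable with majority voting. The class name VotedValue is made up for this
example, and real flight code would be in C or similar with the three copies
placed in physically separate memory. It shows the idea only; it does nothing
for the stack, which is exactly the limitation described above.

```python
from collections import Counter


class VotedValue:
    """Hypothetical triple-redundant holder for one control variable."""

    def __init__(self, value):
        # Three independent copies of the same value.
        self.copies = [value, value, value]

    def write(self, value):
        # Every write refreshes all three copies.
        self.copies = [value, value, value]

    def read(self):
        # Majority vote across the copies.
        value, votes = Counter(self.copies).most_common(1)[0]
        if votes < 2:
            raise RuntimeError("no two copies agree; value is unrecoverable")
        # Scrub: overwrite any dissenting copy with the voted value.
        self.copies = [value, value, value]
        return value


mode = VotedValue(0x5A)
mode.copies[1] ^= 0x04                      # simulate a single-bit upset
assert mode.read() == 0x5A                  # the vote masks the upset
assert mode.copies == [0x5A, 0x5A, 0x5A]    # and the bad copy is repaired
```

Every read and write pays for three accesses plus a vote, which is tolerable
for a handful of mode flags and setpoints but not for everything a program
touches.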
Hardware error correction is fairly easy to do if you're building your own
board, but it is admittedly scarce in off-the-shelf boards. (With one
caveat: some modern memory chips appear to have undocumented internal
error correction to make less-than-perfect memory arrays usable. This may
also help with errors due to proton hits etc., but it's hard to say how
good it is when the manufacturer refuses to discuss it at all.)
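For readers who haven't seen what such hardware computes, here is a small
Python sketch of a Hamming(7,4) single-error-correcting code. This is an
assumption chosen for illustration: the post doesn't name a code, and real
memory ECC typically uses a wider single-error-correct, double-error-detect
code over 32- or 64-bit words, done in logic on every access rather than in
software. The function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as 7 bits; bit i of the result is position i+1."""
    d = [(nibble >> i) & 1 for i in range(4)]
    # Positions 1..7 hold: p1 p2 d0 p4 d1 d2 d3
    p1 = d[0] ^ d[1] ^ d[3]   # parity over positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]   # parity over positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]   # parity over positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(word):
    """Return (nibble, syndrome); syndrome is the flipped position, or 0."""
    bits = [(word >> i) & 1 for i in range(7)]
    # XOR of the positions of all set bits is 0 for a valid codeword,
    # and equals the position of the error after a single bit flip.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos
    if syndrome:
        bits[syndrome - 1] ^= 1   # correct the flipped bit
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome


word = hamming74_encode(0b1011)
upset = word ^ (1 << 4)                          # flip position 5
assert hamming74_decode(upset) == (0b1011, 5)    # corrected, error located
```

Two flipped bits in the same word defeat this code, which is why practical
designs add an overall parity bit to at least detect double errors.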
The good news is, if you design the hardware so a software failure can't
damage it -- e.g., minimal solar arrays on all sides so temporary loss of
attitude control is not a mission killer -- then if you're willing to live
with a somewhat higher rate of outages, you can cross your fingers and
just ignore the issue. Whether you get a usable satellite that way is a
bit of a gamble -- even proton-beam testing isn't really a good simulation
of the space environment -- but sometimes you get lucky and the chips stay
up fairly well. Witness all the off-the-shelf stuff, like BBBs, flown in
space with some success. (I say "sometimes" and "some success" because
you seldom see press releases about the times when it *didn't* work. A
lot of cubesats either are DOA or work so poorly that their owners quickly
give up on them; either way, such failures usually aren't publicized.)
> ...are we really just talking about multiple CPUs with their own
> short-term memory that are constantly running basic arithmetic equations
> like 1+2=3? If one of the CPUs' results doesn't match the other two for
> a given instance of time...
Unfortunately, what you really want to know is whether the CPU's results
in doing *real work* match the other two. Typically we're not talking
about a CPU that gets a little tipsy and does everything not-quite-right,
but about transient errors in memory or internal registers that mess up
particular computations only. Catching that tends to require, again,
comparisons done in hardware. That can get complicated and messy -- e.g.,
do the CPUs always take interrupts with exactly the same timing?
One bright spot: we're starting to see high-end microcontrollers for
high-rel applications that have two lockstep CPU cores on the *same chip*,
with interrupt handling etc. carefully synchronized by the hardware, and
internal comparisons done on everything, and any disagreement causing the
whole assembly to reset; those might be useful for cheap spacecraft.
(This requires either (a) a spacecraft that can't be hurt by software
outages, or (b) software like that in the Apollo LM, which *doesn't* just
give up and do a cold start when a reset occurs, but rather makes an
organized effort to pick up where it left off.)
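Option (b) can be sketched roughly as follows: the software periodically
saves a checksummed checkpoint, and on reset it resumes from that checkpoint
if it is intact, falling back to a cold start only if it is not. This is a
minimal Python illustration under my own assumptions (JSON state, CRC32, the
function names, and the "safe_mode" default are all hypothetical); it is not
a description of how the Apollo software did it.

```python
import json
import zlib


def save_checkpoint(state):
    """Serialize state and append a CRC32 so corruption can be detected."""
    payload = json.dumps(state, sort_keys=True).encode()
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def load_checkpoint(blob):
    """Return the saved state, or None if the blob is missing or corrupt."""
    if blob is None or len(blob) < 4:
        return None
    payload, stored = blob[:-4], int.from_bytes(blob[-4:], "big")
    if zlib.crc32(payload) != stored:
        return None
    return json.loads(payload)


def boot(blob):
    """Resume from a valid checkpoint; otherwise fall back to a cold start."""
    state = load_checkpoint(blob)
    if state is None:
        return {"phase": "safe_mode", "step": 0}, "cold"
    return state, "warm"


blob = save_checkpoint({"phase": "descent", "step": 42})
assert boot(blob) == ({"phase": "descent", "step": 42}, "warm")

damaged = bytes([blob[0] ^ 0x01]) + blob[1:]     # one flipped bit
assert boot(damaged)[1] == "cold"                # corruption forces cold start
```

The hard part in practice is deciding what belongs in the checkpoint and
making sure a reset in the middle of writing one can't leave a
plausible-looking but inconsistent state behind.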
Henry