Let's discuss our ideas here for advanced data structures that would let us handle very large datasets.
One option that came to mind for 0/1/2 data would be to use two bit vectors (BitSet in Java) per individual. Each value would then take two bits instead of eight, which could yield approximately a 4-fold reduction in memory usage compared to using a full byte per value.
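
To make the idea concrete, here is a minimal sketch of how the two-BitSet encoding could look for one individual. The class name `PackedGenotypes` and its methods are hypothetical, and the bit assignment (one BitSet marking the positions with value 1, the other marking the positions with value 2) is just one possible choice; nothing here is an existing API beyond `java.util.BitSet` itself.

```java
import java.util.BitSet;

// Hypothetical sketch: stores values 0/1/2 for one individual in two BitSets.
final class PackedGenotypes {
    private final BitSet ones;  // bit i is set when the value at position i is 1
    private final BitSet twos;  // bit i is set when the value at position i is 2
    private final int size;     // number of positions stored

    PackedGenotypes(int size) {
        this.size = size;
        this.ones = new BitSet(size);
        this.twos = new BitSet(size);
    }

    // Stores a value (0, 1 or 2) at the given position.
    void set(int index, int value) {
        checkIndex(index);
        if (value < 0 || value > 2) {
            throw new IllegalArgumentException("value must be 0, 1 or 2: " + value);
        }
        ones.set(index, value == 1);
        twos.set(index, value == 2);
    }

    // Reads the value back; neither bit set means 0.
    int get(int index) {
        checkIndex(index);
        if (twos.get(index)) {
            return 2;
        }
        return ones.get(index) ? 1 : 0;
    }

    // Counts how many positions hold the given value.
    int count(int value) {
        switch (value) {
            case 1: return ones.cardinality();
            case 2: return twos.cardinality();
            case 0: return size - ones.cardinality() - twos.cardinality();
            default: throw new IllegalArgumentException("value must be 0, 1 or 2: " + value);
        }
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index out of range: " + index);
        }
    }
}
```

A side benefit of this layout is that per-value counts come almost for free through `BitSet.cardinality()`. Note that the 4-fold figure ignores the fixed per-object overhead of each BitSet, so the saving should only approach that ratio for long vectors.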