It is a three-dimensional image of a teapot suspended in the air in front of a vertical backdrop and a slanting base.
Because we are binocular, we can see in three dimensions. This is called "single binocular vision". It has three "steps". The first is simultaneous vision: both eyes independently see the same image. The second is fusion. When our brain detects the two similar (not identical, see below) images, which is the sensory part of fusion, it automatically and unconsciously aligns them precisely into a single image by adjusting the direction of the eyes, which is the motor part of fusion. We do not want to see double. Once the images of the two eyes are aligned, our brain notices that they are not in fact identical. This is because the right eye sees the object slightly from the right side and the left eye slightly from the left. This difference is called disparity. Within half a second or so after aligning the two images, our brain calculates a three-dimensional view from this disparity. This is the third step of single binocular vision, called stereopsis.
But this three-dimensionality is limited to a relatively narrow zone called Panum's space, which is narrowest directly in front of us and a bit wider in our peripheral visual field. Outside this space we continuously see double. This does not bother us because our brain is accustomed to these physiological diplopic images. You can try it: put a finger in front of you and focus on it, thus placing it inside Panum's space, and everything behind your finger will be seen double. Now focus on the objects behind your finger, thus moving Panum's space there, and you will see your finger doubled.
The APOD of today plays with single binocular vision. It has three components. The first is a repeating vertical motif that is there to stimulate fusion. The second is the two disparate images that our brain will spot within half a second or so once fusion is activated. It will find them different and will calculate the three-dimensional image for us. Once the image is ready, fusion keeps it visible and fixed even if we tilt the screen or look at it sideways. The third component is the random "noise" added to confuse us and so hide the two disparate images. The three-dimensional view arises from the fact that the two images lie at slightly different distances from the margin of the repeating vertical motif.
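To make the idea of a repeating motif with small shifts concrete, here is a minimal sketch of a generic random-dot autostereogram in Python. It is not how the APOD image was made; the function name make_autostereogram, the depth map and the period parameter are all made up for illustration. In this simplified version the random noise itself serves as the repeating motif: each pixel copies the pixel one repeat to its left, and the repeat gets shorter where the hidden shape is meant to look nearer.

```python
import random


def make_autostereogram(depth, period=8, seed=0):
    """Build a tiny black/white autostereogram from a depth map.

    depth  -- list of rows of integers; 0 is the backdrop, larger is nearer.
              Every value must satisfy 0 <= value < period (an assumption of
              this sketch, not a rule taken from the APOD image).
    period -- width in pixels of the repeating vertical motif on the backdrop.
    """
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            if not 0 <= d < period:
                raise ValueError("depth values must be in [0, period)")
            separation = period - d  # nearer points repeat at a shorter interval
            if x < separation:
                row.append(rng.randint(0, 1))  # random noise starts the motif
            else:
                row.append(row[x - separation])  # copy the pixel one repeat to the left
        image.append(row)
    return image


if __name__ == "__main__":
    # Hypothetical depth map: a flat backdrop with a raised block in the middle.
    demo_depth = [[0] * 48 for _ in range(8)]
    for y in range(2, 6):
        for x in range(18, 30):
            demo_depth[y][x] = 2
    for line in make_autostereogram(demo_depth, period=8, seed=1):
        print("".join("#" if pixel else "." for pixel in line))
```

On the flat backdrop every row repeats exactly every 8 pixels, which is what gives fusion something to lock on to. Inside the raised block the repeat is only 6 pixels, and that small difference is the disparity from which the brain works out the depth.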
There are two ways of getting the stereoscopic view. If you have a latent outward squint, as many of us do (almost all shortsighted, that is myopic, individuals have this exophoria), you can just let your eyes wander a bit so that one vertical motif moves approximately on top of the next one. The brain will notice that they are similar but not identical, fusion will align them perfectly, and the brain will calculate the stereopsis. If you let your eyes wander too much, so that they skip one vertical motif and align with the one after it, your stereopsis will produce an abstract three-dimensional image, a Salvador Dali teapot if you wish.
If you are not blessed with exophoria, it will be more difficult. You will need to look somewhat beyond Panum's space and then put your screen within it. If you have a latent inward squint (esophoria), or you align your eyes on the wrong side of Panum's space, the three-dimensionality will reverse because the eyes are crossed the unintended way.