Continuous bag of words, symbols to images and text?

Can “continuous bag of words” neural networks be used to browse semantic space?

For example, perhaps visually producing a zoomable map of images and text collapsed onto a 2D hyperbolic plane. I think this would work well, as projections such as the Poincaré disc are not too visually confusing; both design/configuration spaces and higher-dimensional objects will fit into them nicely without being too counterintuitive or squashed together.
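A rough sketch of what I mean, in Python with gensim and scikit-learn (the toy corpus, PCA step, and tanh radial squashing are just illustrative assumptions, not a principled hyperbolic embedding method):

```python
# Train CBOW embeddings on a toy corpus, then squash a 2D PCA projection
# onto the unit (Poincaré) disc so everything lands inside the disc.
import numpy as np
from gensim.models import Word2Vec
from sklearn.decomposition import PCA

corpus = [
    ["walk", "jog", "run", "hurry", "stroll"],
    ["walk", "stroll", "amble", "wander"],
    ["run", "sprint", "dash", "hurry"],
]  # stand-in for a real text corpus

# sg=0 selects the CBOW training objective in gensim
model = Word2Vec(corpus, vector_size=50, window=3, min_count=1, sg=0, epochs=200)

words = list(model.wv.index_to_key)
vectors = np.array([model.wv[w] for w in words])

# Project to 2D, then map radii into [0, 1) so every point lies inside the disc
xy = PCA(n_components=2).fit_transform(vectors)
radii = np.linalg.norm(xy, axis=1, keepdims=True)
disc = xy / (radii + 1e-9) * np.tanh(radii)

for w, p in zip(words, disc):
    print(f"{w:>8}: ({p[0]:+.3f}, {p[1]:+.3f})")
```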

For example, semantically between “walk” and “run” is “jog”; between “walk” and “jog” is perhaps “hurry”; between “walk” and “hurry” is “waurry” (a made-up word), etc…
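A minimal sketch of that interpolation idea, assuming the gensim CBOW model from the sketch above (or any pretrained `KeyedVectors`): take the midpoint of two word vectors and ask which vocabulary word sits closest to it. The lookup is restricted to existing vocabulary, so invented words like “waurry” could only appear as unlabeled points on the map, not as results.

```python
def between(wv, a, b):
    # Midpoint of the two word vectors in the Euclidean embedding space
    midpoint = (wv[a] + wv[b]) / 2.0
    # most_similar accepts raw vectors; drop the endpoints themselves
    candidates = wv.most_similar(positive=[midpoint], topn=5)
    return [(w, s) for w, s in candidates if w not in (a, b)]

print(between(model.wv, "walk", "run"))  # hoping for something like "jog"
```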