Looking at your HTML, your caption seems to be similar to "tagging" (this photo is "tagged" under the subject "Coast" and there are x other images also so tagged), so I think those are good.
I'm going to dig around for the article I read about a specific issue with the Back button and usability/accessibility. If I find it I will post it here.
*edit: it might have been this one: The Trouble with Lightbox (and its Variants) - Monday By Noon
Depending on what kind of gallery this is and how you think most users will expect it to function (does hitting Back bring you back to the gallery page or the previous image?), choose that type of functionality and then...
Is it a good idea to add some off-screen content, just a brief description, that outlines the purpose of the page or informs users that the anchors will open images in a modal overlay?
Here you'd have, as Ryan suggested, some ON-screen info that basically states how the images are best navigated. It doesn't need to be more than a sentence or two (especially since many people visiting already have experience with lightbox-type things), and so long as it's keyboard-accessible and available as text you shouldn't need to explain how the rest works. Because your links in the modal box are... links, I don't see this as something that needs ARIA widget attributes. Those are more for using elements that were meant to do one thing (like spans) as widget tools to do something else (turning a span into a scroller-thingie).
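On the Back button question from earlier, here's a rough sketch of how the two behaviours could differ in script. It assumes the browser History API; the function name `recordImageInHistory`, the `backTarget` values and the `#image-id` URL scheme are all made up for illustration and aren't from your code or any lightbox library.

```javascript
// Hypothetical helper: records an opened image in the session history.
// backTarget "previous-image": every image gets its own entry, so Back
//   steps through the images one at a time.
// backTarget "gallery": only the first image adds an entry; later images
//   replace it, so a single Back returns to the plain gallery page.
function recordImageInHistory(history, imageId, backTarget, overlayAlreadyOpen) {
  const state = { imageId: imageId };
  const url = "#" + encodeURIComponent(imageId);
  if (backTarget === "previous-image" || !overlayAlreadyOpen) {
    history.pushState(state, "", url);
  } else {
    history.replaceState(state, "", url);
  }
}
```

In a page you'd pass `window.history` as the first argument, and you'd still need a `popstate` listener that shows or closes the overlay to match whatever entry the user lands on.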
Of course, make sure it all works as expected for those not getting the modal box (no JS)... and I mean this in general, not specifically for screen reader users (who, like everyone else, usually have JS enabled).
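One way to get that no-JS fallback more or less for free is to keep each thumbnail as a normal link to the full image and only hijack the click when the script actually runs. A minimal sketch, assuming that setup; `enhanceGalleryLinks` and the `openOverlay` callback are hypothetical names, not something from your page:

```javascript
// Hypothetical helper: upgrades plain image links to open in an overlay.
// Without JS this never runs, and each link simply loads its href.
function enhanceGalleryLinks(links, openOverlay) {
  for (const link of links) {
    link.addEventListener("click", (event) => {
      // Leave modified clicks alone so "open in new tab/window" still works.
      if (event.ctrlKey || event.metaKey || event.shiftKey) {
        return;
      }
      event.preventDefault();
      openOverlay(link.href);
    });
  }
}
```

You'd call it with something like `enhanceGalleryLinks(document.querySelectorAll("a.gallery-item"), showInOverlay)`, where the selector and `showInOverlay` are whatever your own markup and lightbox code use.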
Some more points: Building a Better Lightbox | habdas.org