Totally prevent crawling of anything which isn't something

Server Config
#1

I have a MediaWiki website.
MediaWiki generates at least 15 auxiliary webpages for each article webpage:

  • Talk webpage
  • History webpage
  • Revision webpages
  • Diff webpages
  • What-links-here webpage
  • Recent-changes-in-webpages-linked-from-here
  • Printable version webpage
  • Permalink version
  • Information about this webpage — webpage
  • Source code webpage / Edit webpage
  • Statistics webpages
  • And probably more

The total number of webpages might reach 150-1500 or much more.

Having so many webpages per article has severely slowed the crawling of my website, to the extent that SEO is damaged, even though software performance is decent and the content is abundant and gets good feedback from readers.

Although most of the webpages I’ve listed carry a noindex attribute, I believe I should still restrict crawler access to them on the server side somehow.

I thought of using robots.txt to allow access only to article and category pages:

Allow: /index.php/article/
Allow: /index.php/category/
Disallow: *

Is this syntax correct? Would you do it differently?

#2

It looks like there is already a detailed page dedicated to this on the MediaWiki site:
https://www.mediawiki.org/wiki/Manual:Robots.txt

#3

Any instruction to robots, whether through a meta tag or robots.txt, is only ever advisory; it does not enforce anything.
This means that “good” robots should obey your rules, but there is nothing to stop “bad” robots from ignoring them.
Chances are that if robots are ignoring the rules in one place (the meta tag), they will ignore them elsewhere too.
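
If you want to see how a rule-abiding robot would read a given rule set before you deploy it, Python's standard urllib.robotparser can evaluate it offline. The following is only a rough sketch under assumptions: the /index.php/article/ and /index.php/category/ paths are copied from the first post and may not match your real MediaWiki URL layout, example.org is a placeholder host, and is_allowed is a made-up helper name. I have also assumed a "User-agent: *" line (rules need to sit in a user-agent group) and "Disallow: /", which is the conventional way to block everything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: the paths come from the first post and are an
# assumption about the wiki's URL layout, not a verified configuration.
RULES = """\
User-agent: *
Allow: /index.php/article/
Allow: /index.php/category/
Disallow: /
"""


def is_allowed(url: str, agent: str = "*") -> bool:
    """Return True if a rule-abiding crawler may fetch `url` under RULES."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())  # parse in memory, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.org is a placeholder host used only for illustration.
    for url in (
        "https://example.org/index.php/article/Foo",
        "https://example.org/index.php/category/Bar",
        "https://example.org/index.php?title=Foo&action=history",
        "https://example.org/index.php/Talk:Foo",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

Bear in mind that this only tells you what a compliant parser decides. Python's parser applies the first matching rule in file order, while some search engines pick the most specific match, so results can differ for other rule sets. A robot that ignores robots.txt is not affected at all.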

Have you established whether this is really the actual problem?
Have you seen that pages that should not be indexed are indexed?
It may just be that you have unrealistic expectations about how quickly all your pages will be indexed.