I don't think you need to run your body through `htmlspecialchars` before you validate it. Remember that `htmlspecialchars` exists primarily to convert special characters to HTML entities so that they can be displayed safely in an HTML document. You can accept what the user inputs as is, validate that, and run it through that function only when you print it back out to the user. Remember: sanitize input, escape output.
`htmlspecialchars` and the related `htmlentities` are for escaping output. TIP: Always specify an encoding when you call those functions.
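As a rough sketch of what escaping at output time could look like: the helper name `e()` and the sample `$body` value are made up for illustration, and I'm assuming your pages are UTF-8. Only `htmlspecialchars` and its `ENT_QUOTES` flag come from PHP itself.

```php
<?php
// Hypothetical helper: escape a value only at the point where it is
// printed into an HTML document.
function e(string $value): string
{
    // ENT_QUOTES escapes both double and single quotes.
    // The encoding is passed explicitly instead of relying on the default.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Sample value standing in for the raw, already validated user input.
$body = 'Tom & "Jerry" <b>rule</b>';

// Escaping happens here, at output time, and nowhere earlier.
echo '<p>' . e($body) . '</p>';
// prints: <p>Tom &amp; &quot;Jerry&quot; &lt;b&gt;rule&lt;/b&gt;</p>
```

The stored value stays exactly as the user typed it, so the same data can later be escaped differently for another context if needed.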
If you take out that use of `htmlspecialchars`, your `validate` function will probably be a lot easier to implement, and your `preg_match` pattern can be a lot simpler as well.
I feel that because you are encoding early, your validation and `preg_match` patterns become much more error-prone than they need to be, since they then have to account for entities such as `&amp;` instead of the characters the user actually typed. Consider some of this and see if it makes things a bit more straightforward.
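To illustrate the point, here is a minimal sketch of validating the raw input. I don't know your actual rules, so the function name `validateBody()`, the allowed character set, and the 200-character limit are all assumptions; swap in whatever your application really requires.

```php
<?php
// Hypothetical validator: works on the raw input, before any escaping.
// Assumed rule: 1 to 200 letters, digits, whitespace or basic punctuation.
function validateBody(string $body): bool
{
    // \A and \z anchor the whole string; the u modifier treats it as UTF-8.
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match('/\A[\p{L}\p{N}\s.,!?&-]{1,200}\z/u', $body) === 1;
}

$raw = 'Fish & chips, please!';

// The raw text passes: every character is in the allowed set.
var_dump(validateBody($raw));

// The same text encoded early fails: "&" became "&amp;", and ";"
// is not in the allowed set, so the pattern would need to grow to cope.
var_dump(validateBody(htmlspecialchars($raw, ENT_QUOTES, 'UTF-8')));
```

With the raw value the pattern only has to describe what the user is allowed to type; once the value is encoded first, the same pattern also has to know about entity syntax.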