How can I simplify this validation script

Initially I had a simple script that did not validate URL parameters and believe it could have been a security risk.

The following script seems to work OK and I think could be simplifed:

//=========================================
// get URL parameters and return $PAGE name 
// OR return warning page not found.
//=========================================
function getPage()
:string 
{
	$PAGE = strrchr($_SERVER['REQUEST_URI'], '/');
	$PAGE = str_replace('.php', '', $PAGE);

	$PAGE = '/'===$PAGE  ? 'home' : $PAGE;
	if('/'===substr($PAGE, 0,1)):
		$PAGE = substr($PAGE, 1);
	endif;		
	$PAGE = 'index'===$PAGE ? 'home' : $PAGE;

	if( file_exists('sitepages/' . $PAGE . '.php')) :
		# No problem :)
	else:
		$PAGE = 'missing-file';
	endif;

	return $PAGE;
}

Well lets see.

Short version: Here’s how i would have it.

function getPage()
:string 
{
//I added a safety check here in case strrchr returns FALSE.
	$PAGE = substr(strrchr($_SERVER['REQUEST_URI'], '/') ?: " ",1);
	$PAGE = (''!==$PAGE && 'index.php' !==$PAGE) ? $PAGE : 'home.php';	
	if( !file_exists('sitepages/' . $PAGE)) :		
		$PAGE = 'missing-file';
	endif;
	return str_replace(".php","",$PAGE);
}

Now, I admit there is some loss of readability in my compression. But let me isolate one block that i think could be simplified in your original code without as much disruption:

Before the first line, you know that the first character of the string is a /, because you created the string by reverse searching for a slash. Furthermore, you know that there is exactly one slash.

The only way this is not the case is if strrchr returned FALSE, but that would have failed immediately anyway, because you wouldn’t be able to str_replace a boolean without causing typecasting. (See my extra check in my version as to how I force this to be a valid string regardless)

Simplification: Substring the initial result of strrchr, remove the entire if block; check if page is the empty string at the same time you check if it’s set to index, because the action in both cases is the same.

Any time you’ve got an empty clause in an if statement, you know you can simplify. Since this is an empty If, invert the test and make your Else clause your If clause.

3 Likes

I would do it like this:

// decouple from superglobals, makes it easier if you ever want to unit test this
function getPage(string $requestUri): string
{
    $PAGE = strrchr($requestUri, '/');
    if (false === $PAGE):
        // invalid input, no sense continuing
        return 'missing-file';
    endif;
    
    // normalize PAGE - use preg_replace instead of str_replace because it should
    // only remove .php from the END of the string, not just anywhere
    // in the string
    $PAGE = substr(preg_replace('~.php$~', '', $PAGE), 1);

    // detect all variants of "home"
    if ('' === $PAGE || 'index' === $PAGE):
        $PAGE = 'home';
    endif;

    // Check if the page exists, if so: we're done
    if (file_exists('sitepages/' . $PAGE . '.php')):
        return $PAGE;
    endif;

    // When all else fails ...
    return 'missing-file';
}
1 Like

couple comments…
should the logic for invalid input be missing-file or home?.. just from a UX perspective.

if we’re gonna use preg_replace, lets just go whole-hog on it, and do the substr at the same time…
preg_replace(["~^/~",'~.php$~'], '', $PAGE)

condensable, since we’re not running any other code between the two returns.
return (file_exists('sitepages/' . $PAGE . '.php')) ? $PAGE : 'missing-file';

1 Like

That would depend on what John wants, but I personally would not render anything sensible when a simple criterion isn’t met. Don’t sweep errors under the rug, deal with them.

Sure, I don’t have a strong opinion on either of those solutions. They both work fine, and since this probably only needs to be done once per HTTP request (once you know the page you can continue and don’t need to check it again later - probably) performance is not a consideration IMO.

… yet. If we ever need to do more checks the diff will be a lot easier with the additional if check (just add code inbetween), rather than breaking a ternary apart to get more code in there.

Plus I like the simplicity of having the default result on it’s own on the last line. It clearly tells me that whatever is happening above, if none of that actually works we’re going to default to missing-file.

Like in a method that checks if a certain user has enough privileges to do something I would always start with a return false; and then above that add any cases for when it should be true. You don’t have access, unless you can satisfy any of the if statements above.

Lastly, I’m of the opinion that code should be easy to read and debug if needed, not as terse as possible.

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

3 Likes