PHP and Standards: arg_separator.output

    Thomas Rutter

    PHP’s configuration directive arg_separator.output allows you to tell PHP how it should separate arguments in a URL and has a default value of ‘&’.

    The directive affects all URLs that are generated or modified automatically by PHP. The only time this is likely to affect us is when we use PHP Session Handling along with session.use_trans_sid to auto-generate URLs with session IDs. So, if you don’t use this, the following problem may not affect you.

    Theoretically, if your web application preferred separating arguments in a URL with another character, such as ‘;’, then you could tell PHP to use that character instead:

    http://www.example.com/url?variable1=value1;variable2=value2
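
    As a rough illustration, the snippet below sets the directive at runtime and lets http_build_query() (one of the functions that honours it) assemble the query string. The host and variable names are simply the ones from the example URL above; changing the setting with ini_set() rather than in php.ini is an assumption made to keep the sketch self-contained.

    ```php
    <?php
    // Ask PHP to join generated arguments with ';' for the rest of this script.
    ini_set('arg_separator.output', ';');

    // http_build_query() falls back to arg_separator.output when no
    // separator is passed explicitly.
    $query = http_build_query([
        'variable1' => 'value1',
        'variable2' => 'value2',
    ]);

    $url = 'http://www.example.com/url?' . $query;
    echo $url, PHP_EOL;
    ```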

    However, the following note in the PHP manual’s Session Handling section indicates a problem.

    Note: The arg_separator.output php.ini directive allows to customize the argument seperator. For full XHTML conformance, specify &amp; there.

    This seems very strange. Firstly, the issue has nothing to do with XHTML conformance. A quick glance at the HTML 4.01 Specification, the HTML 3.2 Specification or even this Introduction to SGML, upon which HTML is based, should remind you that all occurrences of ‘&’ must be escaped (for example, with &amp;), regardless of the version of HTML or XHTML in use. I believe the myth that this is only an XHTML issue may be due to the fact that validating one’s markup became trendy at about the same time as using XHTML did.

    Secondly, the description of the arg_separator.output directive in the PHP manual indicates that the separator will be used in the URL:

    arg_separator.output string The separator used in PHP generated URLs to separate arguments.


    PHP has glossed over the distinction between a URL and a URL represented within an HTML attribute value. In the second case, a small selection of characters (& and ' or ") must be escaped. We ought to be able to set arg_separator.output to ‘&’, and PHP should escape it appropriately whenever it writes the URL into an attribute value in HTML (or XHTML, for that matter).
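
    The distinction can be made concrete with a small sketch. The URL below is made up for illustration; the point is only that the raw URL keeps a plain ‘&’, while the copy destined for an attribute value is passed through htmlspecialchars().

    ```php
    <?php
    // A raw URL: '&' separates the arguments, exactly as it would in an
    // HTTP request or a Location header.
    $url = 'page.php?variable1=value1&variable2=value2';

    // The same URL prepared for use inside an HTML or XHTML attribute value.
    // ENT_QUOTES makes htmlspecialchars() escape both quote characters too.
    $attribute = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    echo '<a href="' . $attribute . '">link</a>', PHP_EOL;
    ```

    The raw form is what belongs in redirects and logs; only the escaped form belongs in markup.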

    Sure enough, when using a value of ‘&’, PHP wrongly turns your links into something like:

    Tony


    This is PHP’s default behaviour, and it is incorrect in all versions of both HTML and XHTML.

    A comment in the PHP manual demonstrates the confusion this has caused:

    arg_separator.output set to “&” is bad when you want to work with xhtml. Xhtml requires &amp; instead to be written out. This for example prevents validation of xhtml using php sessions. I hope the default value will be changed


    The implied difference between XHTML and HTML here is incorrect. The requirement for ampersands in attribute values to be escaped applies equally to HTML (based on SGML) and XHTML (based on XML). Also, the proposed solution of setting the default value to ‘&amp;’ is inelegant. Ideally, this value should be ‘&’ and PHP should realise that when it is dealing with attribute values, it needs to convert ‘&’ to ‘&amp;’ itself.

    If you use PHP’s session handling and session.use_trans_sid to auto-generate URLs with session IDs, for now you can set arg_separator.output to ‘&amp;’ in your PHP configuration in order to remain well-formed in HTML or XHTML.

    Lachlan pointed out to me that changing this value will also affect http_build_query, which is used to generate raw URLs, which should not contain HTML entities. So if you do set arg_separator.output to ‘&amp;’ to work around this problem, avoid using http_build_query, and vice versa.
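
    The side effect is easy to reproduce. The sketch below uses made-up data and sets the directive with ini_set() purely for demonstration. It also shows a narrower alternative to avoiding http_build_query altogether: the function accepts the separator as its third argument, and an explicit value there takes precedence over the ini setting.

    ```php
    <?php
    // The workaround: make PHP emit an escaped separator in generated URLs.
    ini_set('arg_separator.output', '&amp;');

    $data = ['a' => 1, 'b' => 2];

    // With no explicit separator, http_build_query() picks up the entity,
    // which is wrong for a raw URL (for example in a Location header).
    $affected = http_build_query($data);

    // Passing the separator as the third argument overrides the ini setting.
    $raw = http_build_query($data, '', '&');

    echo $affected, PHP_EOL;
    echo $raw, PHP_EOL;
    ```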

    The most elegant solution would be to insert your own session IDs in URLs and avoid session.use_trans_sid completely, because session.use_trans_sid is voodoo.
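
    One way this might look is sketched below. Both helper functions are hypothetical names invented for this example, and the session name and ID are hard-coded stand-ins; in a real page they would come from session_name() and session_id() after session_start(). The sketch also assumes the URL carries no #fragment.

    ```php
    <?php
    // Hypothetical helper: append a session ID to a raw URL.
    // Assumes $url has no #fragment.
    function appendSessionId(string $url, string $name, string $id): string
    {
        // Use '?' for the first argument and '&' for any later one.
        $separator = (strpos($url, '?') === false) ? '?' : '&';

        return $url . $separator . rawurlencode($name) . '=' . rawurlencode($id);
    }

    // Hypothetical helper: escape a raw URL for use in an attribute value.
    function escapeForAttribute(string $url): string
    {
        return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    }

    // Stand-in values; a real page would use session_name() and session_id().
    $raw  = appendSessionId('page.php?name=Tony', 'PHPSESSID', 'abc123');
    $href = escapeForAttribute($raw);

    echo '<a href="' . $href . '">Tony</a>', PHP_EOL;
    ```

    Keeping the raw URL and its escaped form as two separate steps means arg_separator.output can stay at its default of ‘&’.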

    Frequently Asked Questions about PHP and Standards: arg_separator.output

    What is the purpose of arg_separator.output in PHP?

    arg_separator.output is a PHP configuration directive. It sets the separator PHP places between arguments in the URLs it generates. The default value is “&”, but it can be changed to another string. The most common change is to “&amp;”, so that URLs PHP writes into HTML or XHTML attribute values are correctly escaped.

    How can I change the value of arg_separator.output?

    You can change the value of arg_separator.output in the php.ini file. You just need to find the line where arg_separator.output is defined and replace the default value “&” with the character you want to use. Remember to restart your server after making changes to the php.ini file for the changes to take effect.

    Can I change the value of arg_separator.output at runtime?

    Yes, you can change the value of arg_separator.output at runtime using the ini_set() function. For example, you can use ini_set('arg_separator.output', ';') to change the value of arg_separator.output to “;”.

    What is the difference between arg_separator.input and arg_separator.output?

    While arg_separator.output sets the separator PHP writes into the URLs it generates, arg_separator.input lists the separator characters PHP accepts when it parses incoming URL-encoded data, such as the query string, into variables. By default, the value of arg_separator.input is “&”.

    Why should I care about arg_separator.output?

    If you are developing a web application in PHP, it’s important to understand how arg_separator.output works because it affects how URLs are generated. If you rely on PHP to rewrite URLs inside your pages (for example, with session.use_trans_sid), setting it to “&amp;” keeps the ampersands in those attribute values escaped, which both HTML and XHTML require.

    What happens if I use an invalid character as the value of arg_separator.output?

    PHP does not check that the separator makes sense in a URL. Whatever string you supply is inserted as-is between arguments, so a poorly chosen value simply produces URLs that other software may fail to parse.

    Can I use multiple characters as the value of arg_separator.output?

    Yes. The value is a string rather than a single character. The “&amp;” workaround described above, for instance, is five characters long, and PHP inserts the whole string between arguments.

    Is it possible to use a non-alphanumeric character as the value of arg_separator.output?

    Yes, you can use a non-alphanumeric character as the value of arg_separator.output. However, you should be careful when choosing the character because some characters may have special meanings in URLs.

    How does arg_separator.output affect the performance of my PHP application?

    The value of arg_separator.output does not have a significant impact on the performance of your PHP application. However, if you change the value of arg_separator.output frequently at runtime, it may slightly slow down your application.

    Is arg_separator.output case-sensitive?

    Yes. PHP directive names are case-sensitive, so write the name in lowercase, exactly as arg_separator.output, both in php.ini and in calls such as ini_set() and ini_get().