I have an application where users can write their own configuration files. These are plain text files with one entry per line. I load the config file using file(), which reads it into an array, with each array element being one line of the file. This is nice to work with.
I need to compare the text on each line with a string. The problem is that the first line might start with a BOM, e.g. if the user saved the config file as UTF-8 in Notepad. So in the first comparison the two strings may look the same to the naked eye, but to PHP they aren’t: string + BOM is three bytes longer than string (even though those three bytes are invisible).
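To make the mismatch concrete, here is a minimal sketch. The config line "name=value" is made up for illustration; the three bytes in $bom are the UTF-8 byte order mark.

```php
<?php
// UTF-8 BOM: the three bytes Notepad puts at the start of a UTF-8 file.
$bom = "\xEF\xBB\xBF";

// Hypothetical config entry, and the same entry as read from a BOM-prefixed file.
$expected  = "name=value";
$firstLine = $bom . $expected;

var_dump($firstLine === $expected);                // bool(false)
var_dump(strlen($firstLine) - strlen($expected));  // int(3), strlen() counts bytes
```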
Apparently this will be fixed for PHP 6, but for now how can I deal with this problem? I have no idea what encoding the text file will be saved as. Starting each file with a carriage return works, but it’s too much to ask of the users.
Ah, you got me going! I think I figured it out though:
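// Removes BOM (Byte order mark) from file (if necessary)
function bomStrip($path, $output)
{
    $bufsize = 65536;
    $utf8bom = "\xef\xbb\xbf";
    $inf = fopen($path, 'r');
    $outf = fopen($output, 'w');
    // Read the first three bytes; copy them through unless they are the BOM
    $buf = fread($inf, strlen($utf8bom));
    if ($buf != $utf8bom)
        fwrite($outf, $buf);
    if ($buf == "")
        exit();
    // Copy the rest of the file in 64 KB chunks until nothing is left
    while (true) {
        $buf = fread($inf, $bufsize);
        if ($buf == "")
            exit();
        fwrite($outf, $buf);
    }
}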
I did a quick test and it seemed to work. Let me know how it goes for you.
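For anyone who wants to reuse the idea, here is a sketch of a variant that returns to the caller instead of ending the script, and closes both file handles. The name bomStripCopy() and the boolean return value are my own assumptions, not part of the function above.

```php
<?php
// Hypothetical variant of bomStrip(): copies $path to $output without a
// leading UTF-8 BOM. Returns false if either file cannot be opened.
function bomStripCopy(string $path, string $output): bool
{
    $utf8bom = "\xEF\xBB\xBF";

    $inf = fopen($path, 'rb');
    if ($inf === false) {
        return false;
    }
    $outf = fopen($output, 'wb');
    if ($outf === false) {
        fclose($inf);
        return false;
    }

    // Read the first three bytes; write them back only if they are not the BOM.
    $head = fread($inf, strlen($utf8bom));
    if ($head !== false && $head !== $utf8bom) {
        fwrite($outf, $head);
    }

    // Copy the remainder in 64 KB chunks.
    while (!feof($inf)) {
        $chunk = fread($inf, 65536);
        if ($chunk === false || $chunk === '') {
            break;
        }
        fwrite($outf, $chunk);
    }

    fclose($inf);
    fclose($outf);
    return true;
}
```

Usage would be something like bomStripCopy('config.txt', 'config-clean.txt'), where both file names are placeholders.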
Thanks Hamish, that gave me the idea to simply use trim():
$lines is what file('config.txt') returns. Very simple and clean with trim() and it works well.
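The snippet itself is not shown above, so the following is only a guess at what it might look like. Note that trim() with no second argument strips only whitespace and NUL bytes, so presumably the BOM bytes were passed as the character mask. The $lines array here is a made-up stand-in for file('config.txt'), and stripUtf8Bom() is a hypothetical helper added for comparison.

```php
<?php
$bom = "\xEF\xBB\xBF";

// Stand-in for file('config.txt'); the contents are invented for illustration.
$lines = [$bom . "name=value\n", "colour=blue\n"];

// Option 1: trim() with a mask of the BOM bytes plus the usual whitespace.
// The mask is a set of single bytes, stripped from BOTH ends of the string.
$trimmed = trim($lines[0], $bom . " \t\n\r\0\x0B");

// Option 2 (hypothetical helper): remove the BOM only when it is an exact prefix.
function stripUtf8Bom(string $s): string
{
    return strncmp($s, "\xEF\xBB\xBF", 3) === 0 ? substr($s, 3) : $s;
}
$stripped = rtrim(stripUtf8Bom($lines[0]), "\r\n");

var_dump($trimmed === "name=value");   // bool(true)
var_dump($stripped === "name=value");  // bool(true)
```

One thing to be aware of with option 1: because the mask is a set of bytes rather than a sequence, a line ending in a multi-byte UTF-8 character whose last byte is 0xBF (for example "¿") would lose that byte. The prefix check in option 2 avoids this.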
I’m curious about your function, though. I don’t really understand it. The first ‘if’ will output the first 3 bytes of $inf to $outf (when there’s no BOM). In the next ‘if’, if the file is empty, exit the script (makes sense, but why not make it the first ‘if’?). This means $outf will only be three bytes long. Then comes an infinite loop where $inf is read in chunks of 64 KB. Again, if empty, exit the script (why again?). Then write the current 64 KB chunk of $inf to $outf (overwriting what was there before). So $outf will only contain the last 64 KB of $inf.
I’m sure I’m wrong, but I’d like to know why as I’m pretty confused by it.
Hey, that’s an even better solution!
The function is a rough translation of some python code I found (for the same purpose). So to be honest, I don’t fully understand some of the bits myself; but I was in a rush to finish it (just leaving work, had to catch a train ;p) and since the first test worked, I just posted it as is.
I ran over it again, and I believe it works as I expected. Consecutive fwrite calls don’t (at least in my tests) overwrite previous data, they append.
Well, anyhow, I’m glad it gave you the hint you needed.
Oh yeah, of course, fwrite appends. I got confused - I was thinking of fopen() in 'w' mode, which truncates the file to zero length.
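For the record, here is a small sketch of the difference. Strictly speaking, fwrite() writes at the handle's current position, which moves forward after every write, so consecutive writes on one handle end up one after another. The truncation happens once, in fopen() with 'w'. The scratch file and variable names are just for the demo.

```php
<?php
$tmp = tempnam(sys_get_temp_dir(), 'fw');   // scratch file for the demo

$fh = fopen($tmp, 'w');      // 'w' truncates the file once, at open time
fwrite($fh, "first ");
fwrite($fh, "second");       // continues where the previous write stopped
fclose($fh);
$afterTwoWrites = file_get_contents($tmp);  // "first second"

$fh = fopen($tmp, 'w');      // reopening in 'w' mode discards the old contents
fwrite($fh, "third");
fclose($fh);
$afterReopen = file_get_contents($tmp);     // "third"

unlink($tmp);
```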
Anyway, thanks again.