Handling misspelled words

I have a form in which people can enter their company name and then an email is sent. The client has a list of companies they’ll accept, but they don’t want to use a dropdown or typeahead because the user can then see all of the companies, which means I have to guess what people are going to enter and then send an email to the appropriate contact.

For instance, an accepted company name is: ‘A.C. Widgets & Son, Inc.’ So if somebody enters ‘AC Widgets’, or ‘A C Widgets Inc’, or ‘AC Widgets and Son’ I have to be able to match those to ‘A.C. Widgets & Son, Inc.’ and send the email to that contact.

Keep in mind that there could be another company called ‘Widgets, Inc.’, so I can’t just regex the word “widget,” as most of the companies will have the word widget in their name.

Is there a way in PHP to do this, other than filling an array with a ton of options and hoping somebody enters the right one?

Thanks!

You might want to look into AJAX autofill/autocomplete solutions with a JavaScript framework (jQuery, MooTools, Prototype…whichever one you happen to already be using).

That’s what I was referring to with the typeahead… :slight_smile:

Sounds like a bit of a pain that. So they basically don’t want a dropdown because they don’t want to show the complete list of companies to others, but you only contact people who are coming from specific companies?

There are a few solutions I thought of, but probably the simplest one is to have a “default” category for anyone who wasn’t identified as belonging to one of the specified companies, and to deal with that default in a set way: perhaps emailing all contacts, or just storing the submission in the database to be filtered later?

In terms of working out which companies are a match, I’d do something like take the company name entered, then strip it of all whitespace and non-alphanumeric characters, then lowercase it, and then compare that against company names that you’ve put through the same process…
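A rough sketch of that idea, assuming the accepted names live in a plain PHP array. The function names, the contact addresses and the catch-all address are made up for illustration:

```php
<?php
// Hypothetical helper: lowercase the name, then drop everything
// that is not a letter or a digit (spaces, dots, commas, ampersands...).
function normalize_company(string $name): string
{
    return preg_replace('/[^a-z0-9]/', '', strtolower($name));
}

// Hypothetical lookup: compare the normalized input against each
// normalized accepted name; fall back to $default when nothing matches.
function find_contact(string $input, array $contacts, string $default): string
{
    $needle = normalize_company($input);
    foreach ($contacts as $company => $email) {
        if (normalize_company($company) === $needle) {
            return $email;
        }
    }
    return $default;
}

// Placeholder data: accepted company name => contact address.
$contacts = [
    'A.C. Widgets & Son, Inc.' => 'widgets@example.com',
    'Widgets, Inc.'            => 'other@example.com',
];

echo find_contact('a c widgets & son inc', $contacts, 'catchall@example.com'), "\n";
echo find_contact('AC Widgets', $contacts, 'catchall@example.com'), "\n";
```

The first call matches because only punctuation, spacing and case differ. The second falls through to the catch-all, since a shortened name like ‘AC Widgets’ no longer equals the full normalized name.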

Something along those lines, anyway…

I don’t think you’ll ever be able to match all possible variations with that approach :frowning:

Why not make a compromise between a dropdown and an ordinary text input? Put an autocomplete feature in there, and once the user has entered 4 or more letters you can perform a custom search and fill the autocomplete dropdown with only the matching company names.

avram: already tried that. no dice.

aaarrrggh: that was my original thought too, then I just realized something: each one of the companies has a distinct keyword in the name, so it’ll be much easier if I just do something like

if(stristr($_POST['company'], 'son') !== false) {
// do something
}

of course, each one of the keywords in the company name is not as ambiguous as ‘son’ so it should work :slight_smile:

Sounds quite messy though to be honest. I’d go with some sensible defaults and strip the string as I said above, then have a default option to catch anything that isn’t matched, and deal with that by sending it to either everyone in the group or by sending to a set email address created just for that purpose. It should be quite easy to explain to the client that you’ve set up this ‘default’ catch option to catch stuff coming through that has spelling mistakes in it…

Also, if you’re storing in the database, you could provide a view for your clients to look at all the information so they can filter it out themselves.

Is it possible for you to use MySQL’s SOUNDS LIKE operator? Not guaranteed to give you a match every time, but it could be useful. Alternatively, maybe see if you can use a dedicated search engine like Sphinx (http://sphinxsearch.com)? You should be able to configure that to match approximate strings. If it’s good enough for Craigslist and MySQL.com… :slight_smile:
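For reference, MySQL’s `a SOUNDS LIKE b` is shorthand for `SOUNDEX(a) = SOUNDEX(b)`. If a database isn’t an option, PHP’s built-in soundex() gives a comparable phonetic check, though the two implementations may not produce identical codes. A minimal sketch, where the helper name and the sample strings are invented for illustration:

```php
<?php
// Hypothetical helper: true if any word in $input has the same
// soundex code as $keyword (PHP's built-in soundex(), not MySQL's).
function sounds_like_any_word(string $input, string $keyword): bool
{
    $target = soundex($keyword);
    // Split on anything that is not a letter and skip empty pieces.
    $words = preg_split('/[^a-z]+/i', $input, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if (soundex($word) === $target) {
            return true;
        }
    }
    return false;
}

var_dump(sounds_like_any_word('AC Widgits and Son', 'Widgets'));
var_dump(sounds_like_any_word('Acme Gadgets', 'Widgets'));
```

Soundex ignores vowels after the first letter, so ‘Widgits’ and ‘Widgets’ get the same code, while ‘Gadgets’ starts with a different letter and doesn’t match. It won’t catch every typo, so a catch-all is still worth having.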

Antnee: I’m not storing the company names in a db. :slight_smile:

aaarrrggh: it’s actually a lot cleaner than the alternative, which was to fill up a bunch of arrays with anticipated misspellings. For instance, one company name could be A.C. Fishberger & Sons, Inc., which could easily be misspelled as Fishburger, or AC Fishburger and Son, and so on… However, if I just look for ‘fish’ I should be ok.

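function check_company($keyword) {
    // Fall back to an empty string if the form field was not submitted
    $company = isset($_POST['company']) ? $_POST['company'] : '';
    // stripos() is already case-insensitive, so no strtolower() is needed
    return stripos($company, $keyword) !== false;
}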


if(check_company('fish')) {
    $to = 'email1@email.com';
}
elseif(check_company('team') || check_company('usa')) {
    $to = 'email2@email.com';
}
elseif(check_company('prog') || check_company('class')) {
    $to = 'email3@email.com';
} else {
    $to = 'catchall@email.com';
}

It sounds a lot like you’re trying to simplify logic that has taken Google (and the like) many years and many $millions to develop. Oh, and you don’t need to store the company names in a database, you just need a valid DB connection to do a simple query against if that helps?

I’d recommend removing the elseif statement and replacing it with a switch. They’re much more readable, and also have a nice “default:” option, which allows you to nicely deal with a form submission that doesn’t match anything you’re looking for…
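Something like this, as a sketch only. It uses the `switch (true)` idiom so that each case can be a keyword check. The function names are made up, the company name is passed in as a parameter instead of being read from $_POST, and the addresses are the placeholders from the if/elseif version:

```php
<?php
// Hypothetical helper: case-insensitive substring check.
function company_contains(string $company, string $keyword): bool
{
    return stripos($company, $keyword) !== false;
}

// Hypothetical router: the first case that evaluates to true wins;
// default catches anything that matches no keyword.
function pick_recipient(string $company): string
{
    switch (true) {
        case company_contains($company, 'fish'):
            return 'email1@email.com';
        case company_contains($company, 'team'):
        case company_contains($company, 'usa'):
            return 'email2@email.com';
        case company_contains($company, 'prog'):
        case company_contains($company, 'class'):
            return 'email3@email.com';
        default:
            return 'catchall@email.com';
    }
}

echo pick_recipient('AC Fishburger and Son'), "\n";
```

Stacking two case labels with no body between them behaves like the `||` in the original.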