Categorizing deals with tags! - Major issue

Not going to lie, this is a massive script, and I’m close to having it done and working. The major issue seems to be my use of arrays. I’ve done a ton of troubleshooting and can’t figure it out, but I believe that by the end of the script my array looks something like this:

array('Array','Array', 'Array', 'Array', 'Array', 'Array'); 

So after a few hours, I decided to post it in case anyone wants to try and figure it out.

Functions:
max_value = Finds the highest integer in an array and returns it together with the key it is stored under.
isTag = Uses a regex to check whether the tag appears in the description as a whole word.
reallyseriouslysortdealshardcore = GOAL: Search a description for a list of predefined tags and count how often each is used, then return the category whose tags are used most often, along with the list of tags found.

function max_value($args)
{
	$results = array();
	$max_value = max($args);
	$results[] = $max_value;

	// now get the key for the first occurrence of max value
	$found_key = false;
	foreach ($args as $key => $value)
	{
		if ($value == $max_value && !$found_key)
		{
			$results[] = $key;
			$found_key = true;
		}
	}
	return $results;
}

function isTag($tag, $descrip)
			{
				$matches=array();
				preg_match_all('/\\b'.$tag.'\\b/', $descrip, $matches);
				// If the tag 'car' is found 10 times, it only returns car once
				$matches = array_unique($matches);
				// No tags found = return false, otherwise return all tags found
				if (count($matches)==0)
					{
						return 'false';
					} else {
						return $matches;
					}
			}

function reallyseriouslysortdealshardcore($description)
			{
	
		$apparelTags = array('apparel','clothe','shoe','jacket','shirt','lingerie');
		$autoTags = array('auto','car','audio','speaker','tire','receiver','gps','tomtom','vehicle');
		$beautyTags = array('beauty','makeup','nail polish','hair','nails','mascara','eye liner','foundation','lipstick','lotion','skin','skin care','pendant','sterling','gold');
		$booksTags = array('books','ebook','novel','paperback','hardback','kindle','nook','ibook');
		$electronicsTags = array('electronics','television','tv','hd','hdtv','speaker','home theatre','home theater','receiver','pc','hard drive','camera','projector','watch','watches','intel','amd','asus',
		'samsung','corsair','lite-on','gigabyte','logitech','wii','ps3','xbox','laptop','netbook','ipad','xps');
		$entertainmentTags = array('entertainment','video games','music','movie','blu-ray','dvd');
		$foodTags = array('food','healthy','diet','recipe','chocolate','organic','fryer');
		$musicTags = array('music','cd','mp3','itune','headphones','album');
		$sportsTags = array('sports','football','basketball','soccer','tennis','softball','hockey','jerseys','baseball');
		$toysTags = array('toys','legos','barbie','hot wheels');
		$travelTags = array('travel','airfare');
		
		// These are the tags the current description contains
		$runningTags = array();
		$apparelCount = 0;
		$autoCount = 0;
		$beautyCount = 0;
		$booksCount = 0;
		$electronicsCount = 0;
		$entertainmentCount = 0;
		$foodCount = 0;
		$musicCount = 0;
		$sportsCount = 0;
		$toysCount = 0;
		$travelCount = 0;
		
		
		// Run through apparel tags and add any tags found to running tags
		foreach ($apparelTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$apparelCount++;
							}
					}
			}
			
		foreach ($autoTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$autoCount++;
							}
					}
			}
			
		foreach ($beautyTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$beautyCount++;
							}
					}
			}
			
		foreach ($booksTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$booksCount++;
							}
					}
			}
			
		foreach ($electronicsTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								//echo $arraytag;
								//
								$runningTags[] = $arraytag;
								$electronicsCount++;
							}
					}
			}
		
		foreach ($entertainmentTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$entertainmentCount++;
							}
					}
			}
			
		foreach ($foodTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$foodCount++;
							}
					}
			}
			
		foreach ($musicTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$musicCount++;
							}
					}
			}
			
		foreach ($sportsTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$sportsCount++;
							}
					}
			}
			
		foreach ($toysTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$toysCount++;
							}
					}
			}
		
		foreach ($travelTags as $tag)	
			{
				// Returns either false, or an array of tags
				$findTags = isTag($tag, $description);
				
				if ($findTags!='false')
					{
						// For each result in the array -- add it to the runningTags array
						// and add a count to that tags category
						foreach ($findTags as $arraytag)
							{
								$runningTags[] = $arraytag;
								$travelCount++;
							}
					}
			}
			
		// Trim all whitespace in each tag in the running tags array to make tag display nice
		/*foreach($runningTags as $tag)
			{
				$tag = trim($tag);
				
				//$runningTags[] = $tag;
			}*/
		
		// formatted running tags
		//$runningTags = implode(', ', $runningTags);
		//echo $runningTags;
			
		echo print_r($runningTags).'<Br />';
		
		
		
		// Figure which category the tags most fit...
		$counts = array(
			'apparel' => $apparelCount,
			'auto' => $autoCount,
			'beauty' => $beautyCount,
			'books' => $booksCount,
			'electronics' => $electronicsCount,
			'entertainment' => $entertainmentCount,
			'food' => $foodCount,
			'music' => $musicCount,
			'sports' => $sportsCount,
			'toys' => $toysCount,
			'travel' => $travelCount);
		
		$results = max_value($counts);
		
		// proper category
		$properCategory = $results[1];
		//echo $properCategory.'<br>';
		//echo $results[1];
		
		
		return $runningTags.'|'.$properCategory;
		
		
		
		
		
		
	}
	

This is the code I use to test it:

$description = "Shine some light on the holiday. Known for their ruggedness and dependability, Hella auxiliary lamps must pass a series of endurance tests before branded with the \"Hella\" name. This Xenon lamp provides long-range, powerful illumination that resembles daylight for added safety and peace of mind.  It includes a magnesium reflector and housing in a size that is ideal for mounting in the front grill or air dam. This kit includes two Micro DE Xenon lamps with D2S Xenon capsules, two Generation 3 ballast units, one wiring harness with relay and mount, and a one year limited warranty.";
	$returns = reallyseriouslysortdealshardcore($description);
	
	$returnspieces = explode('|', $returns);
	$theTags = $returnspieces[0];
	$theCat = $returnspieces[1];
	
	echo 'Tags: '.$theTags.'<br>';
	echo 'Category: '.$theCat.'<br>';

Thank you for any assistance.
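
A side note on the "Array" symptom described above: it can be reproduced in isolation. The sketch below is not part of the original script; the sample sentence and the variable names are made up for illustration. It shows what preg_match_all() writes into its third argument, which is an array of arrays, so counting or looping over the outer array does not give the matched strings directly.

```php
<?php
// What preg_match_all() writes into $matches (default PREG_PATTERN_ORDER).
$descrip = 'A car stereo for your car.'; // made-up sample text

preg_match_all('/\bcar\b/', $descrip, $matches);
// $matches is an array of arrays; $matches[0] lists every full-pattern match:
// array(0 => array('car', 'car'))

preg_match_all('/\btire\b/', $descrip, $none);
// With zero hits the outer array still holds one (empty) inner array,
// so a count($none) == 0 check never fires:
// array(0 => array())

$outerCount = count($none);               // 1, even though nothing matched
$hits       = count($matches[0]);         // 2
$uniqueTags = array_unique($matches[0]);  // array('car')

// Arrays have to be joined with implode(); concatenating an array with '.'
// produces the literal string "Array".
$tagList = implode(', ', $uniqueTags);
echo $tagList, "\n";
```

If isTag returned $matches[0] (or simply the count), the code that builds the final string would have plain strings to work with rather than nested arrays.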

Wouldn’t it be easier if you did something like this:

$tags = array(
	'apparel'    => array('apparel', 'shoes', 'jacket', 'shirt', 'lingerie'),
	'auto'	   => array('auto','car','audio','speaker','tire','receiver','gps','tomtom','vehicle')
);

That way, you can add more categories to your array without copy/pasting the foreach loop that’s used for traversing through it.

Finally, the comparison you’re doing: there are many ways to go about it, but a sequential one should suffice here. For each tag in each category, attempt to find the occurrences of that tag within the description.

function isTag($tag, $description)
{
	return preg_match_all("#$tag#", $description);
}

foreach($tags as $category => $taglist)
{
	foreach($taglist as $tag)
	{
		if(($num_occurences = isTag($tag, $description)))
		{
			$tag_count[$category][$tag] = $num_occurences; // number of occurrences for the current tag within the category - it could be expanded to suit your needs
		}
	}
}
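
Putting the pieces above together, here is a self-contained sketch of the nested-array approach. Several things in it are assumptions rather than part of the thread: the helper name countTag, the sample description, the case-insensitive whole-word matching, and the use of array_key_first (PHP 7.3+) to pick the winning category. It passes a third argument to preg_match_all so it also works on PHP versions older than 5.4, where that argument was mandatory.

```php
<?php
// Hypothetical helper: counts whole-word, case-insensitive hits of $tag.
function countTag(string $tag, string $description): int
{
    // preg_quote() escapes regex metacharacters and the '#' delimiter in the tag.
    $pattern = '#\b' . preg_quote($tag, '#') . '\b#i';
    $matches = array();
    $count = preg_match_all($pattern, $description, $matches);
    return $count === false ? 0 : $count; // treat a regex error as "no hits"
}

$tags = array(
    'apparel' => array('apparel', 'shoes', 'jacket', 'shirt', 'lingerie'),
    'auto'    => array('auto', 'car', 'audio', 'speaker', 'tire', 'receiver', 'gps', 'tomtom', 'vehicle'),
);

// Made-up sample text for illustration.
$description = 'This GPS mount fits any car or vehicle. Car not included.';

$tag_count = array();
foreach ($tags as $category => $taglist) {
    foreach ($taglist as $tag) {
        $n = countTag($tag, $description);
        if ($n > 0) {
            $tag_count[$category][$tag] = $n;
        }
    }
}

// Sum the hits per category, sort descending, and take the first key.
$totals = array_map('array_sum', $tag_count);
arsort($totals);
$bestCategory = array_key_first($totals); // null when nothing matched

echo 'Category: ', $bestCategory, "\n";
echo 'Tags: ', implode(', ', array_keys($tag_count[$bestCategory] ?? array())), "\n";
```

Keeping the counts in one nested array means the category with the most hits falls out of a sum and a sort, so no per-category counter variables are needed.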

Also, why would you use double quotes for a string that has no PHP variables in it? You had to escape the double-quoted word inside it, which also makes the string harder to read.

You could have done something like:

$description = 'Shine some light on the holiday. Known for their ruggedness and dependability, Hella auxiliary lamps must pass a series of endurance tests before branded with the "Hella" name. This Xenon lamp provides long-range, powerful illumination that resembles daylight for added safety and peace of mind.  It includes a magnesium reflector and housing in a size that is ideal for mounting in the front grill or air dam. This kit includes two Micro DE Xenon lamps with D2S Xenon capsules, two Generation 3 ballast units, one wiring harness with relay and mount, and a one year limited warranty.';

or


$description = <<<EOF
Shine some light on the holiday. Known for their ruggedness and dependability, Hella auxiliary lamps must pass a series of endurance tests before branded with the "Hella" name. This Xenon lamp provides long-range, powerful illumination that resembles daylight for added safety and peace of mind.  It includes a magnesium reflector and housing in a size that is ideal for mounting in the front grill or air dam. This kit includes two Micro DE Xenon lamps with D2S Xenon capsules, two Generation 3 ballast units, one wiring harness with relay and mount, and a one year limited warranty.
EOF;
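
One thing worth knowing when choosing between these forms: a heredoc still interpolates variables the way a double-quoted string does, while single quotes and the nowdoc form (heredoc syntax with the label in single quotes) leave the text untouched. A small sketch, using a made-up $name variable and a shortened sentence:

```php
<?php
$name = 'Hella'; // hypothetical variable, only here to show interpolation

// Single quotes: no interpolation, and " needs no escaping.
$single = 'Branded with the "$name" name.';

// Heredoc: " needs no escaping, but $name IS interpolated.
$heredoc = <<<EOF
Branded with the "$name" name.
EOF;

// Nowdoc: like heredoc, but nothing is interpolated.
$nowdoc = <<<'EOF'
Branded with the "$name" name.
EOF;

echo $single, "\n", $heredoc, "\n", $nowdoc, "\n";
```

For text pasted verbatim from a feed, single quotes or nowdoc are the safer choice, since a stray $ in the text will not be read as a variable.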

First of all Blue, I want to thank you for taking the time to help me out with this matter, your post did open my eyes to a few things. I had contemplated setting the predefined tags up into a multi-dimensional array yet hadn’t done it, now I have.

As for this:


function isTag($tag, $description)
{
    return preg_match_all("#$tag#", $description);
}

I’ve realized that this currently returns the number of occurrences of $tag in $description. However, after looking through the PHP manual and running into errors of my own, it seems preg_match_all requires a third parameter, an array to write the matches into. No big deal, I’ll just add an empty array named $matches to the function:

function isTag($tag, $description)
			{
				 $matches=array();
   				 return preg_match_all("#$tag#", $description, $matches);
			}

Moving on to the nested foreach loops…
I understand for the most part how they work with two minor exceptions:

if(($num_occurences = isTag($tag, $description)))

In that line, since $num_occurences is being assigned inside the if statement, the condition will always be true. I assume that is the intention?
The other part is building this:

$tag_count[$category][$tag] = $num_occurences;

This would produce something like:
$tag_count['auto']['tomtom'] = number of times 'tomtom' (a tag in the auto category) shows up in the description

Alright, after wrapping my head around this a bit, I’m gonna head back and try to implement it in my scenario and see how it all works out.

One last thing (promise!)
About the double quoting: that description value is usually going to be pulled from an RSS feed. I just took one off the internet and copy/pasted it as an example.

Since I haven’t tested any of this, I did miss the 3rd parameter in preg_match_all.
However, the line:

if(($num_occurences = isTag($tag, $description))) 

will not always evaluate to true. What happens is that the variable $num_occurences receives the return value of the isTag function.
That return value can be false, 0, or a positive integer.

So if 0 or false is returned, the if evaluates to false and no count is recorded for that tag.
The if statement could have been written more clearly; I was just rushing, so I did it this way.
You can also use

$num_occurences = isTag($tag, $description);

if($num_occurences){ }
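
To see the truthiness rule in action, here is a minimal sketch. The function countMatches is a hypothetical stand-in for isTag, and the sample text and patterns are made up. An assignment expression evaluates to the value that was assigned, so the if tests the count itself.

```php
<?php
// Hypothetical stand-in for isTag(): returns the number of matches,
// or false if the regex itself fails.
function countMatches(string $pattern, string $text)
{
    return preg_match_all($pattern, $text, $matches);
}

$text = 'car audio for your car'; // made-up sample text
$seen = array();

$patterns = array(
    'car'  => '#\bcar\b#',   // matches twice
    'tire' => '#\btire\b#',  // matches zero times
);

foreach ($patterns as $tag => $pattern) {
    // 0 and false are falsy, any positive count is truthy,
    // so only tags that occur at least once are recorded.
    if (($num = countMatches($pattern, $text))) {
        $seen[$tag] = $num;
    }
}

print_r($seen);
```

Writing the condition as $num > 0 after a separate assignment does the same thing and may read more clearly.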

Good luck with your feature and remember to keep things simple :)