How to filter data?

What happens if (if it can happen!) the data looks like:


lorem ipsum

Your Name: Joe Bloggs
Your Address: 1 Big Street
Myhamlet
Mytown
Preferred Method of receiving information: Carrier Pidgeon
Comments: This is a comment
but this one is too: we know that
comments can span multiple lines but
what about colons?

lorem ipsum dolor sit amet

The code I have works for this type, but I am not able to process the last field if it's multiline. That's what I am looking for a solution to. :(

Thanks.

Great. Have the other code snippets here not helped? Could you show your code, as far as you’ve got so far? (:

Kind of combining some previous answers, but possibly helpful:

$str = '
lorem ipsum

Your Name: Joe Bloggs
Your Address: 1 Big Street
Myhamlet
Mytown
Preferred Method of receiving information: Carrier Pidgeon
Comments: This is a comment
But this one is too: we know that
comments can span multiple lines but
what about colons?

lorem ipsum dolor sit amet
';

$myKeys = array(
    'Your Address' => "Address",
    "Your Name" => "Name",
    "Preferred Method of receiving information" => "PrefMethod",
    'Comments' => 'comments'
);

$terms = implode(':|', array_keys($myKeys));
$results = array();

foreach($myKeys as $key => $val) {
    if (preg_match('#(?:^|\n)('.$key.'):\s*(.+?)\n('.$terms.':|$)#s', $str, $m)) {
        $results[$val] = $m[2];
    }
}

// results array should contain what you need?

Phew, I knew someone would come to the aid of the party.

A negative look-behind regex match. I knew how to say it, I just couldn’t make one. :)

Hi,

Great code, small and efficient, but it's including the last line ("lorem ipsum dolor sit amet") in the comments when it should not. How can I fix that?

And here’s my code. It's not efficient, but it works, except for the last (comments) field. :(

<?php
if (isset($_POST["btnTest"]))
{
	$fields[] = array ("fld", "Name:", "Address:", "City:", "State:", "Post Code:", "Phone:", "Mobile:", "E-Mail:", "Comments:", PHP_EOL);
	$fields[] = array ("fld", "Name:", "Email:", "Telephone:", Chr(13));
	$fields[] = array ("fld", "Your Name:", "Your Address:", "Telephone:", "Telephone:", "Email:", "Friend's Name:", "Friend's Address:", "Post Code:", chr(13));
	$fields[] = array ("fld", "TITLE:", "NAME:", "EMAIL:", "DESIRED_LOCATION:", "CAPITAL_TO_INVEST:", "TELEPHONE_NUMBER:", "ALT_TELEPHONE_NUMBER:", "BEST_TIME_TO_CALL:", "PREFERRED_METHOD_OF_CONTACT:", "ADDRESS_LINE_1:", "ADDRESS_LINE_2:", "ADDRESS_CITY:", "ADDRESS_REGION:", "ADDRESS_POSTCODE:", "ADDRESS_COUNTRY:", "WHEN_WANTS_TO_START_NEW_BUSINESS:", "COMMENTS:", "SOURCE:", "DATE:", Chr(13));
	$fields[] = array ("fld", "Name:", "Email:", "Telephone:", "Address 1:", "Address 2:", "Town/City:", "Postcode:", Chr(13));
	$fields[] = array ("fld", "Name:", "Surname:", "Address1:", "Address2:", "Address3:", "Postcode:", "Country:", "Phone1:", "Phone2:", "E-mail:", "Available Budget:", "Geographic areas of interest:", "Current employment:", "Specific skills:", "Willing to relocate", "Willing to relocate", Chr(13));
	$fields[] = array ("csv", "Comma-separated data:", "To import this");
	$fields[] = array ("fld", "Title:", "Forename:", "Surname:", "Age:", "Address Line", "Town/City:", "County:", "Postcode:", "Email Address:", "Preferred Method", "Home Phone:", "Employment situation:", "Approx Timeframe:", "Approx Funding Level:", Chr(13));
	$fields[] = array ("fld", "Title:", "Forename:", "Surname:", "Age:", "Address Line", "Address Line 2:", "Town/City:", "County:", "Postcode:", "Email Address:", "Preferred Method", "Home Phone:", "Mobile Phone:", "Approx Timeframe:", "Approx Funding Level:", Chr(13));
	$fields[] = array ("fld", "First Name:", "Last Name:", "Email Address:", "Address:", "City:", "County:", "Country:", "Postcode:", "Telephone:", "Mobile Phone:", "Preferred calling time:", "Capital:", "Time Frame:", "Preferred contact:", Chr(13));
	$fields[] = array ("fld", "Name:", "Address:", "Tel:", "Email:", Chr(13));

        if(!empty($_FILES["filemail"]["name"]))
        {
		$find = array ("=20", "=A");
		$rplc = array (" ", "");
		
		$email = str_replace ($find, $rplc, file_get_contents($_FILES["filemail"]["tmp_name"]));
	}
        else { $error[] = "Invalid E-Mail Message File"; }

	if (empty($error))
	{
		for ($f = 0; $f <= count($fields) - 1; $f++)
		{
			$find_pos = "";
			$error = "";
			$pos = array(); // reset as an array: $pos[] on a string is a fatal error in PHP 7.1+
			
			for ($i = 1; $i <= count($fields[$f]) - 1; $i++)
			{
				if ($find_pos == "")
				{
					$find_pos = strpos($email, $fields[$f][$i]);
				}
				else
				{
					$find_pos = strpos($email, $fields[$f][$i], ($find_pos + strlen($fields[$f][$i])));
				}
				
				if ($find_pos === false) 
				{ 
					$error = array("Unknown Format"); 
				} 
				else 
				{ 
					$pos[] = $find_pos;
				}
				if ($_POST["debug"] == "1") { echo $fields[$f][$i] . " : " . $find_pos . "<br />"; }
			}
			
			if ($_POST["debug"] == "1") { echo "<hr />"; }
			if (empty($error)) { break; }
		}
		
		if ($_POST["debug"] == "1") { 
			echo "<pre>";
			print_r($pos);
			echo "</pre>";
		}

		if (empty($error))
		{
			$find = array (chr(13), "<br />", "<br>", "<br/>");
			$rplc = array ("", "", "", "");
			
			for ($i = 0; $i <= count($pos) - 1; $i++)
			{
				if ($i == (count($pos) - 1))
				{
					$data[] = str_replace ($find, $rplc, substr ($email, $pos[$i], 1));
				}
				else
				{
					$data[] = str_replace ($find, $rplc, substr ($email, $pos[$i], $pos[$i + 1] - $pos[$i]));
				}
			}

			if ($fields[$f][0] == "fld")
			{
				// Its field based data
				
				for ($i = 0; $i <= count($data) - 1; $i++)
				{
					$exp2 = explode (": ", $data[$i]);
					if (trim($exp2[0]) <> "") 
					{ 
						echo str_replace ($find, $rplc, strtolower($exp2[0])) . " - " . "Value: " . $exp2[1] . "<br>\n"; 
					}
				}
			}
			else
			{
				// Its CSV based data
				$find = array ("Comma-separated data:", "=", Chr(10));
				$rplc = array ("", "", "", "");
				
				$data = str_replace($find, $rplc, $data[0]);
				$data = str_replace('""', '","', $data);
				
				
				$data = str_getcsv($data, ",");
				//$data = explode (",", str_replace ("\"Relocate Internationally?\"", "\"Relocate Internationally?\"" . chr(13), $data));
				
				echo "<pre>";
				print_r($data);
				echo "</pre>\n";
				
			}
		}
	}
}
?>
<h3 align="center">E-Mail Filter System</h3>
<?php if (!empty($error)) { require_once ("error.php"); } ?>
<form method="post" enctype="multipart/form-data">
<table border="0" cellpadding="5" cellspacing="0" align="center">
<tr>
	<td align="right">E-Mail:</td>
	<td><input type="file" name="filemail" /></td>
</tr>
<tr>
	<td align="right">Debug Mode:</td>
	<td>
		<input type="radio" value="0" checked name="debug" /> No
		<input type="radio" value="1" name="debug" /> Yes
	</td>
</tr>
<tr>
	<td colspan="2">&nbsp;</td>
</tr>
<tr>
	<td colspan="2" align="center"><input type="submit" name="btnTest" value="Test" /></td>
</tr>
</table>
</form>

Thanks.
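
A side note on the `=20` replacement in the script above: that looks like quoted-printable encoding from the raw mail file. Assuming that is the case, PHP's built-in `quoted_printable_decode()` may be a safer way to clean it up than a hand-made `str_replace` list. A minimal sketch (the `$raw` sample is made up for illustration):

```php
<?php
// Hypothetical sample: a fragment of a quoted-printable encoded mail body.
$raw = "Name: Joe=20Bloggs\r\nComments: a long line that was wrapped =\r\nby the mail client\r\n";

// quoted_printable_decode() turns =20 into a space and removes
// soft line breaks (a "=" at the very end of a line).
$decoded = quoted_printable_decode($raw);

// Normalise line endings so later matching only has to deal with "\n".
$email = str_replace(array("\r\n", "\r"), "\n", $decoded);

echo $email;
```

With the line endings normalised up front, the later `chr(13)` handling should not be needed either.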

If your original split used a regex, then perhaps that can be modified rather than creating a second or third one. Or maybe we can treat the whole thing as lines turned into array elements and work through them.
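
To make the "lines to array elements" idea concrete, here is a rough sketch. It assumes you know the field labels in advance; the function name `parseFieldsByLine` and the sample text are only for illustration. Any line that does not start with a known label is treated as a continuation of the previous field, so colons inside a comment do no harm.

```php
<?php
// Walk the text line by line. A line that starts with a known label opens
// a new field; any other non-empty line is appended to the field that is
// currently open. Lines before the first label are ignored.
function parseFieldsByLine(string $text, array $labels): array
{
    $results = array();
    $current = null;

    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $matched = false;
        foreach ($labels as $label) {
            if (strpos($line, $label . ':') === 0) {
                $current = $label;
                // Keep whatever follows "Label:" on the same line.
                $results[$current] = trim(substr($line, strlen($label) + 1));
                $matched = true;
                break;
            }
        }
        if (!$matched && $current !== null && trim($line) !== '') {
            $results[$current] .= "\n" . trim($line);
        }
    }
    return $results;
}

$text = "Your Name: Joe Bloggs\nYour Address: 1 Big Street\nMyhamlet\nMytown\n"
      . "Comments: This is a comment\nBut this one is too: we know that\n";

$fields = parseFieldsByLine($text, array('Your Name', 'Your Address', 'Comments'));
print_r($fields);
```

Note that trailing text after the last field would still be attached to it; this sketch does nothing to detect where the form data ends.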

Unfortunately it’s impossible to get rid of that last line - there is no way to know if it is part of the comment field or not. You could alter the code to ignore everything after a double line break, but in a comment field a double line break would be perfectly possible. I would suggest trying it on some real world data and see how it goes.

How would I find a double line break?

Thanks.

$str = 'Your Name: Joe Bloggs
Your Address: 1 Big Street
Myhamlet
Mytown
Preferred Method of receiving information: Carrier Pidgeon
Comments: This is a comment
But this one is too: we know that
comments can span multiple lines but
what about colons?

lorem ipsum dolor sit amet
';

$myKeys = array(
    'Your Address' => "Address",
    "Your Name" => "Name",
    "Preferred Method of receiving information" => "PrefMethod",
    'Comments' => 'comments'
);

$terms = implode(':|', array_keys($myKeys));
$results = array();

foreach($myKeys as $key => $val) {
    if (preg_match('#(?:^|\n)('.$key.'):\s*(.+?)\n('.$terms.':|\n|$)#s', $str, $m)) {
        $results[$val] = $m[2];
    }
}
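
If you only want the double line break check on its own, without the regex, something like the following might do. It is a sketch under two assumptions: line endings may be `\r\n`, and the first blank line after the last label marks the end of the data. The function name `cutAtBlankLine` is made up for this example.

```php
<?php
// Return everything before the first blank line that follows $startLabel.
function cutAtBlankLine(string $text, string $startLabel): string
{
    // Normalise Windows (\r\n) and old Mac (\r) line endings to \n.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);

    $start = strpos($text, $startLabel);
    if ($start === false) {
        return $text; // label not found: leave the text alone
    }

    // A double line break is simply two consecutive newlines.
    $end = strpos($text, "\n\n", $start);
    return $end === false ? $text : substr($text, 0, $end);
}

$sample = "Name: Joe\r\nComments: line 1\r\nline 2\r\n\r\nlorem ipsum\r\n";
$trimmed = cutAtBlankLine($sample, 'Comments:');
echo $trimmed;
```

A "blank" line that contains spaces or tabs would not be caught by this, and as said above, a genuine blank line inside a comment would cut the comment short.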

Hi,

This also seems to be working fine. Specify 1st field name and the last field name in the $fields array.

<?php
$fields = array ("Name:", "Comments:");
$first = "";
$second = "";
$last = "";
$rec = 0;
$data = ""; // collected lines; initialised so the first append does not raise a notice

$handle = @fopen("emails/0.eml", "r");
if ($handle) 
{
	while (!feof($handle)) 
	{
		$buffer = fgets($handle, 4096);
		if ($first == "") 
		{ 
			$pos = strpos($buffer, $fields[0]);
			if (is_numeric($pos)) { $first = $pos; $rec = "1"; }
		}
		if ($second == "") 
		{ 
			$pos = strpos($buffer, $fields[1]);
			if (is_numeric($pos)) { $second = $pos; $rec = "1"; }
		}
		
		if (is_numeric($first) and is_numeric($second))
		{
			if (trim($buffer) == "")
			{ 
				$rec = 0; 
			}
		} 
		
		if ($rec == "1") { $data = $data . $buffer; }
	}
	fclose($handle);
}

echo $data;
?> 

Thanks.

Hi,

Your code works fine but a small issue:

If the data has blank fields, then it's not splitting them:

Eg.

$segments = array( 
  0   => 'Name: Some Name', 
  1   => 'Address: Line 1', 
  2   => 'Line 2', 
  3   => 'Line 3', 
  4   => 'City: Some City', 
  5   => 'State: Some State', 
  6   => 'Post Code: 12345', 
  7   => 'Phone: 123457980', 
  8   => 'Mobile: 123457980', 
  9   => 'E-Mail: some@someone.com', 
  10  => 'Comments: Some comments line 1', 
  11  => 'Some comments line 2', 
  12  => 'Some comments line 3',
  13  => 'Current Employment:',
  14  => 'Specific Skills:'
);

So in the end it won’t split the last two fields (Current Employment and Specific Skills); it includes both of them in the Comments value.

Thanks.
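
One possible way around the blank fields problem, assuming the full list of labels is known: split on the labels themselves with `preg_split()` and keep the delimiters, so a label with nothing after it still produces an (empty) entry. This is only a sketch, and `splitByLabels` is a made-up name.

```php
<?php
function splitByLabels(string $text, array $labels): array
{
    // Escape the labels so characters like "/" or "." are matched literally.
    $quoted = array_map(function ($label) {
        return preg_quote($label, '/');
    }, $labels);

    // A label only counts at the start of a line (m flag) and must be
    // followed by a colon; spaces or tabs after the colon are swallowed.
    $pattern = '/^(' . implode('|', $quoted) . '):[ \t]*/m';

    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $results = array();
    // $parts[0] is whatever came before the first label;
    // after that, labels and values alternate.
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        $results[$parts[$i]] = trim($parts[$i + 1]);
    }
    return $results;
}

$text = "Name: Some Name\nComments: Some comments line 1\nSome comments line 2\n"
      . "Current Employment:\nSpecific Skills:\n";

$fields = splitByLabels($text, array('Name', 'Comments', 'Current Employment', 'Specific Skills'));
print_r($fields);
```

Because the value is simply whatever sits between one label and the next, an empty field comes out as an empty string instead of being swallowed by the comments.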