Hello all,
I am trying to grab some content from a website. The website's charset is iso-8859-1. This is a sample from a page on that website:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
This is my code:
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Accept-Encoding: *";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Connection: Keep-Alive";
$c = curl_init();
curl_setopt($c, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($c, CURLOPT_HEADER, 0);
curl_setopt($c, CURLOPT_URL, $url);
curl_setopt($c, CURLOPT_TIMEOUT, 30);
curl_setopt($c, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($c, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($c, CURLOPT_HTTPHEADER, $header);
curl_setopt($c, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.1)");
curl_setopt($c, CURLOPT_HTTPGET, 1);
$w= curl_exec($c);
curl_close($c);
preg_match("/<td class=\"details\">(.*)<div class=\"div\">(.*)<\/div>/isUS", $w, $matches);
$post['fullpage'] = $matches[2];
$fullpage= mysql_real_escape_string($post['fullpage']);
$query="INSERT INTO `file` VALUES ('', '".$title."', '".$fullpage."', '".$no."', '')";
If I echo $fullpage before inserting it into the database, it looks OK. There is no problem.
My database setting is:
ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
If I insert it into the database, the output on the site looks like this.
I have problems with some characters. I have tried everything and spent the last 5 hours trying to sort this out. Please help me. I tried changing the charset and the collation, but the result is still the same.
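One thing I have not ruled out: the page is iso-8859-1 while the table is utf8, so the scraped bytes may need converting before the INSERT. Below is a minimal sketch of that idea, not a confirmed fix. The helper name latin1_to_utf8 is hypothetical (it is not part of my code above), and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper, for illustration only: converts a string that holds
// ISO-8859-1 bytes (as fetched from the site) into UTF-8 bytes.
function latin1_to_utf8($text)
{
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "cafe" with an accented e, written as ISO-8859-1 bytes:
// the accented e is the single byte 0xE9.
$latin1 = "caf\xE9";
$utf8 = latin1_to_utf8($latin1);

// In UTF-8 the same character is the two bytes 0xC3 0xA9.
var_dump($utf8 === "caf\xC3\xA9");
```

In my script this would mean calling it on $matches[2] before mysql_real_escape_string. It may also matter which charset the MySQL connection itself uses, which I have not checked.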