Parse String

I need a bit of assistance with an ASPX page. This page needs to parse the domain and the keywords (q=) from the string below.

What I want to end up with is:

Domain: www.google.com
Keywords: big apple pie

http://www.google.com/#hl=en&sugexp=ppwl&cp=13&gs_id=1k&xhr=t&q=big+apple+pie&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=big+apple+pie&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=b447ab81d5cbad17&biw=1244&bih=751

I’d probably use Regular Expressions:

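static void Main(string[] args)
{
	// Requires: using System; using System.Text.RegularExpressions;
	string inz = "http://www.google.com/#hl=en&sugexp=ppwl&cp=13&gs_id=1k&xhr=t&q=big+apple+pie&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=big+apple+pie&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=b447ab81d5cbad17&biw=1244&bih=751";

	// Host: everything between "http://" (or "https://") and the next "/".
	Regex domainName = new Regex(@"https?://(.+?)/");
	// q value: preceded by ?, & or #, and running up to the next & or the end.
	Regex queryItems = new Regex(@"[?&#]q=([^&]*)");

	Match matchDomain = domainName.Match(inz);
	Console.WriteLine(matchDomain.Groups[1].Value);
	Match matchQuery = queryItems.Match(inz);
	// Search engines encode spaces as '+', so turn them back into spaces.
	Console.WriteLine(matchQuery.Groups[1].Value.Replace('+', ' '));
}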

Is it always going to be a URL there? If so, I’d use new Uri(string), then use the properties on that object to grab the various pieces you want.

I see that you can use .Host to get the domain name but how do you get the specific query item information?

I believe there is a .Query property you can use, IIRC. But it has been ages since I worked with this, so I cannot remember exactly how it’s done.

I think .Query will give you the entire query string but it doesn’t allow me to get the [“q”] value unless I’m forgetting something. There is no query, or at least no question mark, in the original post, so that may screw things up a bit, too.

Missed the lack of a question mark, though that is easily fixable in this case: replace the hash (#) with a question mark.

Could have sworn the Uri object gave nice access to the query string, but it seems I was mistaken. Or I was using a custom extension at the time.
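For what it’s worth, here is a minimal sketch of the Uri route discussed above. It assumes the project references System.Web, so that HttpUtility.ParseQueryString is available. ReferrerSketch and GetParameter are names made up for illustration, and the URL is a shortened version of the one in the original post. Since that URL has no question mark, the helper falls back to the fragment (the part after the #).

```csharp
using System;
using System.Web; // HttpUtility; assumes a reference to System.Web

static class ReferrerSketch
{
    // Hypothetical helper: returns the decoded value of one parameter,
    // looking in the query string first and then in the fragment.
    public static string GetParameter(Uri uri, string name)
    {
        // ParseQueryString tolerates a leading '?', so uri.Query can go in as-is.
        string value = HttpUtility.ParseQueryString(uri.Query)[name];
        if (value == null && uri.Fragment.Length > 1)
        {
            // uri.Fragment starts with '#', which has to be stripped by hand.
            value = HttpUtility.ParseQueryString(uri.Fragment.Substring(1))[name];
        }
        return value;
    }

    static void Main()
    {
        var uri = new Uri("http://www.google.com/#hl=en&q=big+apple+pie&pf=p");
        Console.WriteLine("Domain: " + uri.Host);
        Console.WriteLine("Keywords: " + GetParameter(uri, "q"));
    }
}
```

ParseQueryString also decodes the + signs, so the keywords come back with spaces and no extra Replace call is needed.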

Thanks for all the replies, I appreciate your time.

I’m a PHP guy and really need someone to guide me on this .Net stuff, as it is Greek to me. Also, the code on the site I’m working on is in VB, so C-Sharp might not be the right solution.

I only care whether it is a Google, Bing or Yahoo URL; if so, I want to parse q= for Google and Bing, and p= for Yahoo.

Here’s what I have so far, don’t laugh.

<%
Dim http_referer As String
Dim http_user_agent As String
Dim searchengine As String

'http_referer = Request.ServerVariables("HTTP_REFERER")
http_referer = "http://www.google.com/?q=This+is+the+keyword+string"

http_user_agent = Request.ServerVariables("HTTP_USER_AGENT")

If Session("SearchEngine") = "" Then

	If InStr(http_referer, "www.google.com") > 0 Then
		Session("SearchEngine") = "Google.com"
	ElseIf InStr(http_referer, "www.bing.com") > 0 Then
		Session("SearchEngine") = "Bing.com"
	ElseIf InStr(http_referer, "search.yahoo.com") > 0 Then
		Session("SearchEngine") = "Yahoo.com"
	Else
		Session("SearchEngine") = "Other"
	End If

	Session("http_referer") = http_referer
	Session("user_agent") = http_user_agent

End If
%>

So how can I get the keyword list from the string?

It is pretty simple; this should work:


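            string http_referer = "http://www.google.com/#hl=en&sugexp=ppwl&cp=13&gs_id=1k&xhr=t&q=big+apple+pie&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=big+apple+pie&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=b447ab81d5cbad17&biw=1244&bih=751";

            // Find the q parameter whether it follows '?', '&' or '#'.
            int i = http_referer.IndexOf("?q=");
            if (i == -1)
            {
                i = http_referer.IndexOf("&q=");
            }
            if (i == -1)
            {
                i = http_referer.IndexOf("#q=");
            }

            string s = "";
            if (i >= 0)
            {
                s = http_referer.Remove(0, i + 3);

                // Cut off any parameters that follow the q value.
                i = s.IndexOf("&");
                if (i >= 0)
                {
                    s = s.Remove(i);
                }
                // '+' stands for a space in the search terms.
                s = s.Replace('+', ' ');
            }
            MessageBox.Show(s);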
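
To tie this back to the Google/Bing/Yahoo requirement, here is a rough sketch that picks the engine from the host and then reads q or p accordingly. It is in C# like the snippet above, so it would need porting to VB for the actual page. It assumes a reference to System.Web for HttpUtility.ParseQueryString; SearchReferrerSketch and GetKeywords are hypothetical names, and the Yahoo URL is an invented example.

```csharp
using System;
using System.Web; // HttpUtility; assumes a reference to System.Web

static class SearchReferrerSketch
{
    // Hypothetical helper: sets engine to "Google.com", "Bing.com",
    // "Yahoo.com" or "Other" and returns the decoded keywords (or null).
    public static string GetKeywords(string referer, out string engine)
    {
        engine = "Other";
        Uri uri;
        if (!Uri.TryCreate(referer, UriKind.Absolute, out uri))
        {
            return null; // missing or malformed referrer
        }

        string param;
        switch (uri.Host.ToLowerInvariant())
        {
            case "www.google.com": engine = "Google.com"; param = "q"; break;
            case "www.bing.com": engine = "Bing.com"; param = "q"; break;
            case "search.yahoo.com": engine = "Yahoo.com"; param = "p"; break;
            default: return null;
        }

        // Prefer the query string; fall back to the fragment (the part after '#').
        string raw = uri.Query.Length > 1 ? uri.Query : uri.Fragment;
        return HttpUtility.ParseQueryString(raw.TrimStart('?', '#'))[param];
    }

    static void Main()
    {
        string engine;
        string keywords = GetKeywords("http://search.yahoo.com/search?p=big+apple+pie", out engine);
        Console.WriteLine(engine + ": " + keywords);
    }
}
```

Unlike the InStr checks in the VB code, this compares the whole host name, so a URL that merely contains "www.google.com" somewhere in its path or query would count as "Other".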