  1. #1
    SitePoint Enthusiast
    Join Date
    Oct 2006
    Posts
    41

    web scraping from a page that comes after using post method?

    Hi everyone,
    I don't have much experience in web development, and I don't know if this is even possible. Check out this page: http://dsebd.org/news_archive.php
    Click "search by symbol".
    The next page that comes up shows a "news database" for that symbol.
    I want to read this information automatically using regular expressions and build an offline database of all the symbols.

    It sounds easy: I will run a loop that goes through each symbol, reads the contents of its news page, and stores them in my database.
    The only problem is that the database page is returned in response to a POST request.

    So I was wondering: for pages that come up after a submit button is pushed, i.e. via the POST method rather than GET, is there a way to screen scrape them?


    P.S. I posted this in the PHP forum as well. This is not a double post; I need to know this for .NET as well.

  2. #2
    SitePoint Mentor NightStalker-DNS's Avatar
    Join Date
    Jul 2004
    Location
    Cape Town, South Africa
    Posts
    2,877
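    You are going to have to make an HttpWebRequest to that page, with the method set to POST and the form fields written to the request body. Then read the response stream into a string and run your regex on that string. You can also load the response into an XML document, but then the markup has to be strict (well-formed XHTML).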
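
    The same steps (build a POST request, read the response into a string, run a regex over it) can be sketched roughly as follows. This is written in Python rather than .NET purely for illustration. The form field name "symbol" and the two-cell table-row pattern are guesses, not taken from the actual page, so check the form's HTML source for the real field names and adjust the regex to the real markup.

```python
import re
import urllib.parse
import urllib.request

# URL from the original post.
ARCHIVE_URL = "http://dsebd.org/news_archive.php"

# Hypothetical pattern: assumes each news item is a table row with two cells.
ROW_PATTERN = re.compile(
    r"<tr[^>]*>\s*<td[^>]*>(.*?)</td>\s*<td[^>]*>(.*?)</td>\s*</tr>",
    re.IGNORECASE | re.DOTALL,
)


def build_post_request(url, fields):
    """Build a POST request carrying form fields, as a submit button would."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_html(request, timeout=30):
    """Send the request and return the response body as a string."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_rows(html):
    """Return (first cell, second cell) pairs for every matching table row."""
    return [(a.strip(), b.strip()) for a, b in ROW_PATTERN.findall(html)]


def scrape_symbols(symbols, field_name="symbol"):
    """Loop over symbols; field_name is an assumed form field name."""
    results = {}
    for symbol in symbols:
        request = build_post_request(ARCHIVE_URL, {field_name: symbol})
        results[symbol] = extract_rows(fetch_html(request))
    return results
```

    Calling scrape_symbols with a list of symbols would return a dictionary of rows per symbol, ready to be written to a database. A regex is fragile against markup changes, so an HTML parser may be the safer choice if the page layout turns out to be irregular.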

