Web scraping a page that is returned by a POST request?
Hi everyone,
I don't have much experience with web development, and I don't know if this is even possible. Check out this page http://dsebd.org/news_archive.php
and click "search by symbol".
The next page that comes up shows a "news database" of the symbol.
I want to read this information automatically using regular expressions and create an offline database of all the symbols.
It sounds easy: I will run a loop that goes through each symbol, reads the text of its news page, and stores it in my database.
The only problem is that the database page is returned by a POST request.
So I was wondering: for pages that come up after pushing a submit button (i.e. the form uses the POST method, not GET), is there a way to screen scrape that page?
P.S. I posted this in the PHP forum as well. This is not a double post! I need to know how to do this in .NET too.
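You are going to have to send an HttpWebRequest to that page, with its Method set to "POST" and the form fields written to the request stream. Then read the response stream into a string and run your regex on that string. You can also load the response into an XmlDocument, but then the markup has to be well-formed XML.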
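The same idea works in any language: build a POST request by hand, read the response body into a string, then run the regex on it. Here is a rough sketch in Python using only the standard library. The form field name ("symbol"), the assumption that the form posts back to the same URL, and the <td> pattern are all guesses for illustration; check the page's HTML source for the real field names and markup before relying on any of it.

```python
import re
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Build a POST request whose body is the URL-encoded form fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying `data` makes urllib send a POST instead of a GET.
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_html(request, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical pattern: assumes each news item sits in its own <td> cell.
CELL_PATTERN = re.compile(r"<td[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)


def extract_cells(html):
    """Return the stripped contents of every <td> cell found in the HTML."""
    return [cell.strip() for cell in CELL_PATTERN.findall(html)]
```

To scrape every symbol, you would loop over your symbol list and, for each one, call build_post_request with something like {"symbol": the_symbol}, pass the result to fetch_html, and store whatever extract_cells returns in your database.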