Web scraping questions

Hi everyone,

I have some questions about web scraping, which I will try to explain:

1. The scraping method I use

I know that there are different ways to do web scraping. In my case, I use this method to convert different currencies to euros. For example:

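// file_get_html() comes from the Simple HTML DOM library, which has to be
// included first (the path here assumes it sits next to this script).
require_once 'simple_html_dom.php';

$url = 'http://www.xe.com/es/currencyconverter/convert/?Amount=1&From=USD&To=EUR';

// Download and parse the remote page.
$html = file_get_html($url);

$posts = $html->find('span[class=uccAmountWrap]');

foreach ($posts as $post) {
    // Index 2 selects the third <span> inside the wrapper.
    $link = $post->find('span', 2);
    $divisa_usa = $link->plaintext;
    // Replace the decimal comma with a dot.
    $divisa_usa = str_replace(',', '.', $divisa_usa);
}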

The problem is that with this method the page takes too long to load, between 3 and 6 seconds on every refresh. I don't know exactly why, or how to reduce this time.
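One common way to reduce this kind of delay, assuming it comes from the remote request being repeated on every page load, is to cache the scraped value and only fetch it again once it is old enough. The following is only a minimal sketch: the helper name cached_value, the cache file location, and the one-hour lifetime are all assumptions made for illustration, and the closure stands in for the real scraping code.

```php
<?php
// Hypothetical helper: returns the cached value while it is still fresh,
// otherwise calls $fetch (e.g. the scraping code) and stores the result.
function cached_value(string $cacheFile, int $ttlSeconds, callable $fetch): string
{
    // Serve from the cache while the file is younger than the lifetime.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        return (string) file_get_contents($cacheFile);
    }

    // Cache is missing or stale: do the slow work once and save it.
    $value = (string) $fetch();
    file_put_contents($cacheFile, $value, LOCK_EX);

    return $value;
}

// Usage sketch: the closure is a stand-in for the real scraping call.
$cacheFile = sys_get_temp_dir() . '/usd_eur_rate.txt'; // assumed location
$rate = cached_value($cacheFile, 3600, function () {
    return '0.92'; // placeholder value, not a real exchange rate
});
```

With something like this, only the first visitor after the cache expires would wait for the remote page; everyone else reads a local file.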

2. Can't scrape Xbox.com

The reason I want to scrape Xbox.com is that I have a website where I collect the prices of all the games in 12 different countries. I have already created a database that holds all the prices and other details.

The problem is that I have to update the prices every day, and it would help me a lot if they were updated automatically. I have tried the code shown above, but it doesn't work on Xbox.com. Maybe that is because it is an "https" website.

I assume it is possible to scrape Xbox.com because other sites do it, but as I am inexperienced I haven't been able to.

If it is possible, what is the best method to collect all the prices and save them in a database? I started learning web programming 3 months ago, but I cannot find useful information on this.
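To make the database part of the question concrete, here is a rough sketch of how one scraped price per game and country could be stored with PDO so that each daily run overwrites the previous value. Everything in it is an assumption for illustration: the prices table, its columns, the save_price function, and the sample data are hypothetical, and it uses an in-memory SQLite database only so that it runs on its own.

```php
<?php
// Hypothetical helper: stores one price per game and country,
// replacing the row left by the previous run.
function save_price(PDO $db, string $game, string $country, float $price): void
{
    // SQLite syntax; MySQL would use INSERT ... ON DUPLICATE KEY UPDATE.
    $stmt = $db->prepare(
        'INSERT OR REPLACE INTO prices (game, country, price, updated_at)
         VALUES (:game, :country, :price, :updated_at)'
    );
    $stmt->execute([
        ':game'       => $game,
        ':country'    => $country,
        ':price'      => $price,
        ':updated_at' => date('Y-m-d H:i:s'),
    ]);
}

// In-memory database so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    'CREATE TABLE prices (
        game       TEXT NOT NULL,
        country    TEXT NOT NULL,
        price      REAL NOT NULL,
        updated_at TEXT NOT NULL,
        PRIMARY KEY (game, country)
    )'
);

save_price($db, 'Example Game', 'ES', 59.99); // placeholder data
save_price($db, 'Example Game', 'ES', 49.99); // a later run replaces the row
```

A script like this could then be run once a day by a scheduled task such as cron, so the prices would not need to be updated by hand.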

Thanks in advance.

Hi everyone,

I have been looking for information, but I keep having the same problem. I am now trying with PHP and cURL, but it still does not work with Xbox.com. This approach is supposed to get around the https issue so that you can scrape this kind of website, but it is not working.
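For context, a minimal sketch of the kind of cURL request meant here is shown below. The function names build_curl_options and fetch_html and the user agent string are made up for illustration. cURL normally handles https by itself when PHP is built with SSL support, so if a request like this returns a page without the prices, one possible explanation (not confirmed for Xbox.com) is that the prices are filled in by JavaScript after the page loads.

```php
<?php
// Hypothetical helper: builds the cURL options for a plain GET request.
function build_curl_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 15,   // give up after 15 seconds
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; PriceBot/0.1)', // made-up agent
    ];
}

// Hypothetical helper: returns the page HTML, or null if the request fails.
function fetch_html(string $url): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url));
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and non-200 responses as failures.
    if ($body === false || $status !== 200) {
        return null;
    }

    return $body;
}
```

Calling fetch_html() with a product page URL and checking whether the result is null, and whether the returned HTML actually contains a price, would show which step is failing.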

Can anybody give me some tips? I would be really grateful.

Thanks in advance and merry Christmas.
