Trying to scrape a report with JSON / JSONP

At the office, we have a product that generates all sorts of reports. I need to scrape one of them and print it into a web page on another domain. The report is basically nothing more than a text file hosted on the IIS app server that the product is served from. It needs to be fetched from a different domain (i.e., using JSONP?) and then output on a page on that other domain.

I’ve used JSONP before to do such things but I’m having a problem this time as it keeps failing with the following indications in Firebug’s console:

  • Error: undefined was not called
  • SyntaxError: missing ; before statement

The only formatting I can do with the report data from within the product is adding prefixes and/or suffixes to each row that the report generates. Beyond that, I’m powerless to do much.

The code I’m currently poking a stick at is as follows:

//This JSONP callback keeps indicating that all 3 parameters are "undefined"... Why is this?
function test(data, status, jqXHR){
    console.log('inside test()');
    console.log('data:'+data);
    console.log('status:'+status);
    console.log('jqXHR:'+jqXHR);
}

//Grab contents from the report...
$.ajax({
    url:"https://mydomain/x?report=DP5BZWOYD5LJU4TMD4YPV2L7EQK53Y2VI7BVCUBXTP3OQGVYIBT7VHHQ6GWCBDCERA53J6G4YBIEM",
    jsonpCallback: test,
    contentType: 'application/json',
    processData: false,
    dataType:"jsonp",
    success: function(data){
        console.log(data);
    },
    error: function(a, b, c){
        console.log(a);
        console.log(b);
        console.log(c);
    }
});

As the callback comment indicates, Firebug shows the 3 params it expects as “undefined”. This tells me that the callback isn’t being used as it should be. The funny thing is that the “Net” panel in Firebug appears to show the contents of the report.txt file, as if it’s being extracted (or at least read) by my script. (Maybe that’s a false observation?)

How can I scrape that report? What am I doing wrong here?

Thanks in advance.

Try setting the parameter jsonp: false.

As per: http://stackoverflow.com/a/6215286/1312971

As of jQuery 1.5, setting the jsonp option to false prevents jQuery from adding the "?callback" string to the URL or attempting to use "=?" for transformation. In this case, you should also explicitly set the jsonpCallback setting. For example, { jsonp: false, jsonpCallback: "callbackName" }. If you don’t trust the target of your Ajax requests, consider setting the jsonp property to false for security reasons.
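
One more thing worth checking alongside this: the docs example above passes jsonpCallback as a string, whereas the code in the question passes the function test itself. jQuery does accept a function for this setting, but it calls that function and uses its return value as the callback name. The sketch below is a simplified model of that behaviour, not jQuery's actual source; resolveCallbackName and seenArgs are hypothetical names made up for illustration.

```javascript
// Simplified model of how the jsonpCallback setting is resolved
// (an illustration only, not jQuery's real implementation).
function resolveCallbackName(jsonpCallback) {
  // A function is invoked with no arguments and its return value becomes the name.
  return typeof jsonpCallback === "function" ? jsonpCallback() : jsonpCallback;
}

// Stand-in for the handler from the question; it records what it was called with.
const seenArgs = [];
function test(data, status, jqXHR) {
  seenArgs.push([data, status, jqXHR]);
  // No return statement, so the "callback name" ends up being undefined.
}

const nameFromFunction = resolveCallbackName(test);   // calls test() with no arguments
const nameFromString = resolveCallbackName("test");   // keeps the name as given

console.log(nameFromFunction + " was not called");    // prints "undefined was not called"
console.log(nameFromString);                           // prints "test"
```

If that is what is happening here, it would account for both the three undefined arguments and the "undefined was not called" message. Passing the name as a string, jsonpCallback: "test", avoids it.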

Thanks for the suggestion, mawburn.

I tried what you suggested and here’s what the code looks like now:

function test(data, status, jqXHR){
    console.log('inside test()');
    console.log('data:'+data);
    console.log('status:'+status);
    console.log('jqXHR:'+jqXHR);
}


//Grab contents from the Argos report...
$.ajax({
    url: "https://mydomain/x?report=DP5BZWOYD5LJU4TMD4YPV2L7EQK53Y2VI7BVCUBXTP3OQGVYIBT7VHHQ6GWCBDCERA53J6G4YBIEM",
    jsonpCallback: test,
    contentType: 'application/json',
    processData: false,
    dataType:"jsonp",
    jsonp: false,
    success: function(data){
        console.log(data);
    },
    error: function(a, b, c){
        console.log(a);
        console.log(b);
        console.log(c);
    }
});

When I refreshed the page to test it, Firebug spit out the following indications (in the exact order that I see them):

  • inside test()
  • data:undefined
  • status:undefined
  • jqXHR:undefined
  • Object { readyState=4, status=200, statusText=“success”, more…}
  • parsererror
  • Error: undefined was not called
  • SyntaxError: missing ; before statement

(The last item preceded a sample line of the payload I was trying to scrape, as if it was able to see the contents of the text file.)

It’s really odd… I hope my use of mixed double and single quotes in various AJAX settings above wouldn’t cause something like this. Every article you see in Google has people posting their “working examples” that make mixed usage of things being double-quoted, single-quoted, or no quotes at all…

I wouldn’t worry about mixing single and double quotes; that’s perfectly valid JavaScript syntax!

I’m thinking the error you are getting has to do with the format of the data returned from the other domain.

The way JSONP works is that it expects the response received to be valid JavaScript:

The server should return valid JavaScript that passes the JSON response into the callback function.
http://api.jquery.com/jquery.ajax/

Since you mentioned you are receiving a “txt” file, I guess that’s where your error comes from.

Try wrapping the report data in a call to your callback function (in your case, the test function), like this:

test({ reportData: "...." });
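
To see why the wrapper matters, here is a rough sketch that mimics what the browser does with a JSONP response: it runs the response body as a script. It uses new Function in place of a real script tag so it can run outside a browser; runAsScript and the sample report rows are made up for illustration.

```javascript
// Mimics a <script> tag: the response body is executed as JavaScript.
// "globals" stands in for functions that would exist on the page (e.g. test).
function runAsScript(body, globals) {
  const names = Object.keys(globals);
  const script = new Function(...names, body); // throws SyntaxError if body is not valid JS
  return script(...names.map((name) => globals[name]));
}

let received = null;
const page = { test: (data) => { received = data; } };

// 1) A plain text report is not valid JavaScript, so it fails to parse.
let plainTextError = null;
try {
  runAsScript("row one of the report\nrow two of the report", page);
} catch (err) {
  plainTextError = err;
}
console.log(plainTextError instanceof SyntaxError); // true

// 2) The same text wrapped in a call to test() parses and invokes the callback.
runAsScript('test({ reportData: "row one of the report\\nrow two of the report" });', page);
console.log(received.reportData);
```

The first case is the same kind of failure as the SyntaxError in Firebug: the request succeeds and the text arrives, but it cannot be executed as a script.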

Spot on, ucorina. Adding to that, I was told by requinix over at devnetwork.net that JSONP has to be explicitly enabled on the server one is trying to scrape from in order for it to work. Otherwise, the only way to really do what I’m trying to do here is via something like cURL (which I don’t think there’s going to be a need for, since my script works ON the app server, just not from my workstation, due to the CORS stuff).

Thanks for your input, though. Just out of curiosity, are you saying that JSONP will always fail if the payload isn’t retrieved in the JSON format?

Yeah, the server has to return the data wrapped in a function call, as in @ucorina’s example above.

Because the server needs to pass the data as an argument to a JS function, the data can be any valid JavaScript value, including a string or a number, but it is usually JSON.
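
As a rough sketch of that server-side step, the function below turns plain report text into a JSONP body by passing it to the callback as a string. toJsonp is a hypothetical helper, not part of jQuery or of the reporting product, and the sample text is invented.

```javascript
// Hypothetical helper: wraps plain text in a call to the named callback.
function toJsonp(callbackName, reportText) {
  // Only allow simple identifiers so the name cannot inject arbitrary script.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("Invalid callback name: " + callbackName);
  }
  // JSON.stringify escapes quotes and newlines, producing a valid string literal.
  return callbackName + "(" + JSON.stringify(reportText) + ");";
}

const reportText = 'line "one"\nline two';
const jsonpBody = toJsonp("test", reportText);
console.log(jsonpBody);

// Running the body as a script hands the original text back to the callback.
let roundTripped = null;
new Function("test", jsonpBody)((data) => { roundTripped = data; });
console.log(roundTripped === reportText); // true
```

Something like this has to run on the server, or in a proxy in front of it, because the browser cannot wrap the response itself.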

There’s a SitePoint article explaining how JSONP works, with examples, that goes into more detail and explains alternatives like setting CORS headers (if you have control of the server) or using a proxy (for 3rd party servers that don’t support JSONP).
