  1. #1
    SitePoint Enthusiast pyro.699's Avatar
    Join Date
    Jan 2006
    Location
    Canada
    Posts
    39

    [python] Copy Objects & Classes

    Hey,

    I don't know how many of you know the ClientForm module. It basically takes a urllib2 response object and extracts all the forms from it, so you can submit POST data more easily.

    What I'm having trouble with right now is reading the HTML data and parsing the forms from the same response.

    Code Python:
    html = response.read()
    forms = ParseResponse(response, backwards_compat=False, request_class=request)

    print html
    print forms
    In this case, html holds the page's HTML as expected, but forms is []

    Code Python:
    forms = ParseResponse(response, backwards_compat=False, request_class=request)
    html = response.read()

    print html
    print forms
    In this case, html comes back empty, and forms is [<ClientForm.HTMLForm instance at 0x00000000033203C8>], which is what it should be.

    So what I need is to have the same urllib2 response object available for two different uses: reading and parsing.

    Thanks
    ~Cody Woolaver

  2. #2
    SitePoint Enthusiast pyro.699's Avatar
    Join Date
    Jan 2006
    Location
    Canada
    Posts
    39
    So I found out some interesting information.

    This happens because both pieces of code (at some point) call response.read(). The response is a file-like stream, so once a read reaches the end of the data it stays there; the position is not reset the next time you call it.
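
    The read-once behaviour can be reproduced without any network access. This is only a sketch: io.BytesIO is used here as a stand-in for the urllib2 response (an assumption made for illustration), since both are file-like objects that keep a read position.

```python
import io

# Stand-in for a urllib2 response: a file-like object with a read position.
response = io.BytesIO(b"<html><body><form></form></body></html>")

first = response.read()   # reads everything and leaves the position at EOF
second = response.read()  # nothing is left, so this returns an empty string

print(len(first))
print(len(second))
```

    Whichever consumer reads first gets the whole page; the one that comes second sees an empty stream.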

    This means that if I call response.read() twice in a row, the second call returns nothing because the stream has already reached EOF. So what needs to be done is to reset the stream, or something close to that. I'm not 100% sure I got that right; I was talking with a friend and here's what he said:

    Quote Originally Posted by Friend
    *One option is to wrap your file like object in another object
    *ie.
    *You store the results of the read inside the class
    *and then you override the read method
    *In the read method, you return the data if you already have it or use the file object to get it if you don't
    That didn't make a great deal of sense to me, but maybe you guys can help (he's offline now, so I can't ask for more detail).

    Thanks once again Sitepoint
    ~Cody Woolaver

  3. #3
    SitePoint Wizard
    Join Date
    Mar 2001
    Posts
    3,537
    For future searchers, Friend's solution would look something like this:
    Code Python:
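    import urllib
     
    class Wrapper(object):
        """Caches the body of a response so read() can be called repeatedly."""
        def __init__(self, resp):
            self.resp = resp
            self.contents = None  # nothing read yet
     
        def read(self):
            # Read the underlying response only once; afterwards return the cached copy.
            if self.contents is None:
                self.contents = self.resp.read()
     
            return self.contents
     
        def __getattr__(self, name):
            # Delegate everything else (geturl(), info(), ...) to the real response.
            return getattr(self.resp, name)
     
     
     
    resp = urllib.urlopen("http://www.google.com")
    w = Wrapper(resp)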
     
    x = w.read()
    y = w.read()
     
    print x[:50]
    print
    print y[:50]
     
    print "-" * 20
     
    print w.geturl()
     
    print "_" * 20
     
    print w.info()
     
    --output:--
    <!doctype html><html><head><meta http-equiv="conte
     
    <!doctype html><html><head><meta http-equiv="conte
    --------------------
    http://www.google.com
    ____________________
    Date: Tue, 11 Aug 2009 03:41:00 GMT
    Expires: -1
    Cache-Control: private, max-age=0
    Content-Type: text/html; charset=ISO-8859-1
    Set-Cookie: PREF=ID=50cd394f9bcfacf7:TM=1249962060:LM=1249962060:S=iNwurvPUxCz6oDHl; expires=Thu, 11-Aug-2011 03:41:00 GMT; path=/; domain=.google.com
    Server: gws
    Ideally, you would just subclass the response object's class and override one method: read(). However, to create a full-fledged response object (of the perplexing type urllib.addinfourl), you have to do something like this:


    Code Python:
    class MyResponse(urllib.addinfourl):
        def __init__(self, fp, headers, url):
            urllib.addinfourl.__init__(self, fp, headers, url)

    Unfortunately, creating the argument fp is very complex.
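
    One way to sidestep subclassing is to keep wrapping, but buffer the body in memory so it can be rewound. The following is a minimal sketch, assuming the second consumer may call read(size) in chunks or needs the stream rewound with seek(0); whether ClientForm reads that way is an assumption here, not something confirmed in this thread. RewindableResponse and FakeResponse are hypothetical names, and FakeResponse only stands in for a real urllib response so the example runs offline.

```python
import io


class RewindableResponse(object):
    """Hypothetical wrapper: reads the body once, then serves it from memory."""

    def __init__(self, resp):
        self._resp = resp
        self._buffer = None  # created on the first read() or seek()

    def _load(self):
        # Drain the real response exactly once into an in-memory buffer.
        if self._buffer is None:
            self._buffer = io.BytesIO(self._resp.read())
        return self._buffer

    def read(self, size=-1):
        return self._load().read(size)

    def seek(self, pos, whence=0):
        return self._load().seek(pos, whence)

    def __getattr__(self, name):
        # Delegate everything else (geturl(), info(), ...) to the real response.
        return getattr(self._resp, name)


class FakeResponse(object):
    """Stand-in for a urllib response so the sketch runs without a network."""

    def __init__(self, body, url):
        self._fp = io.BytesIO(body)
        self._url = url

    def read(self):
        return self._fp.read()

    def geturl(self):
        return self._url


w = RewindableResponse(FakeResponse(b"<html><form></form></html>", "http://example.com/"))

html = w.read()    # first consumer reads everything
w.seek(0)          # rewind before handing w to a second consumer
chunk = w.read(6)  # a chunked read starts from the beginning again

print(html == b"<html><form></form></html>")
print(chunk == b"<html>")
print(w.geturl())
```

    With a wrapper along these lines, the code from the first post would read the HTML, call seek(0), and only then pass the wrapper to the parser. The cost is that the whole body is held in memory, which is usually acceptable for ordinary web pages.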

  4. #4
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    7
    How do you guys all like using Python?

    Worth learning?

  5. #5
    SitePoint Wizard
    Join Date
    Mar 2001
    Posts
    3,537
    Yes. It's an excellent language to start with, or to migrate to if you are experienced in other languages. Python can be used for just about anything: text file processing, number crunching, retrieving information from web pages, building web sites with CGI or frameworks such as Django, Pylons, and TurboGears, making sophisticated games (with pygame), GUI programming (wxPython, Tkinter, etc.), whatever.

    There's a very active python forum here:

    http://python-forum.org/pythonforum/index.php

    SitePoint missed out by combining Perl and Python into one forum, which resulted in this very dead forum. You might get more responses if you start your own thread.

