Using WGET


Wget is a tasty utility on Linux and Mac OS X systems that can come in handy for web system administrators.

Wget — found on the GNU.org site — is a command-line application for file retrieval over FTP, HTTP and HTTPS connections.

I find it useful for downloading files directly to a server I am working on in a shell session, saving time instead of downloading to my local desktop and uploading. Additionally, since it can pass user names and passwords, it is powerful for use in web site migrations, setting up mirrored sites and more.

Finally, Wget can be scheduled using cron, so if files or directories need to be replicated on a regular basis, it can be set to do so without administrator intervention.
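As a sketch of what that might look like (the schedule, URL and paths here are placeholders, not from any real setup), a crontab entry could fetch a file nightly:

```shell
# Hypothetical crontab entry: fetch a remote file every night at 2:00 a.m.
# Use the full path to wget, since cron runs with a minimal PATH.
# -q keeps the job quiet; -O sets the local destination file.
0 2 * * * /usr/bin/wget -q -O /var/backups/latest.tar.gz http://somedomain.com/public/remotefilename.tar.gz
```

Add the entry with crontab -e for the user that should own the downloaded files.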

Some useful examples for utilizing Wget:

1) Downloading a remote file – Perhaps you are downloading an update to an application and have been sent the URL. In this case you could use either FTP or HTTP to retrieve it:


wget http://somedomain.com/public/remotefilename.tar.gz
or wget ftp://somedomain.com/public/remotefilename.tar.gz

Wget over FTP defaults to binary transfer (‘I’ mode in FTP lingo); however, if you need to use ASCII mode, you simply add ‘;type=a’ (without quotes) onto the end of the FTP URL in the example above.
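For instance (hypothetical host and file name), an ASCII-mode retrieval would look like:

```shell
# ';type=a' asks the FTP server for an ASCII (text) transfer,
# which converts line endings between platforms. Quote the URL
# so the shell does not treat the semicolon as a command separator.
wget "ftp://somedomain.com/public/remotefilename.txt;type=a"
```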

2) Downloading with authentication – you may be updating a registered application requiring a user name and password to access. Change the syntax as shown below:


wget http://username:password@somedomain.com/reg/remotefilename.tar.gz
or wget ftp://username:password@somedomain.com/reg/remotefilename.tar.gz

3) Inserting custom ports into the wget request – perhaps your download will require a custom port along with authentication. Wget easily handles this as well: insert a colon and port number after the host name and before the /path to the file(s):


wget http://username:password@somedomain.com:portnumber/reg/remotefilename.tar.gz
or wget ftp://username:password@somedomain.com:portnumber/reg/remotefilename.tar.gz

4) Entire directories can also be migrated from one server to another, i.e. moving a web site to new hardware. I have found FTP access to be most effective for this. I also log the transfer (the -o option) in case debugging or verification of file retrieval is needed, and use the recursive option (-r) to recreate the directory structure on the new server.

So if I am moving mydomain.com — I would use:


wget -o mylogfile -r ftp://myuser:mypass@mydomain.com/

If you have an FTP user that can see more than one domain, ensure you specify the path to the files and directories for the domain you are moving.

There are several other interesting and useful options including:

--passive-ftp: for using Wget behind firewalls.

-nd: does not recreate the remote directory structure locally and instead simply saves all retrieved files into the current directory.

--cookies=on/off: if the remote site requires cookies to be on or off to retrieve files (helpful with authentication at times).

--retr-symlinks: retrieves the files pointed to by symbolic links.
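These options can be combined in a single invocation. A sketch of a mirroring run (host, credentials and path are placeholders):

```shell
# Mirror a remote directory over passive-mode FTP, following
# symbolic links, flattening all files into the current local
# directory (-nd), and logging the transfer to mylogfile (-o).
wget --passive-ftp --retr-symlinks -r -nd -o mylogfile ftp://myuser:mypass@mydomain.com/public_html/
```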

There are several other powerful features in Wget, and fortunately, the included manual offers excellent examples. Simply run man wget on the command line to review them.


  • http://roderick.dk MRoderick

    Wget is also available as a Windows binary in a larger package called UNXUTILS.

    http://unxutils.sourceforge.net/

  • sylozof

    Deep Vacuum http://www.hexcat.com/deepvacuum/index.html offers Mac OS X users a GUI for wget.
    It installs both wget (as wget is not installed on Mac OS X by default) and the GUI to use it, so you can choose between the command line and the graphical interface.

  • Smit-tay!

    WGET can use an HTTP or FTP proxy. See the descriptions for the Wget-specific environment variables http_proxy and ftp_proxy, as well as the command line options for proxy authentication: --proxy-user and --proxy-passwd
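    A minimal sketch of that proxy setup (the proxy host, port and credentials here are placeholders):

    ```shell
    # Point Wget at an HTTP proxy via the environment...
    export http_proxy="http://proxyhost:8080/"
    export ftp_proxy="http://proxyhost:8080/"
    # ...and supply proxy credentials on the command line.
    wget --proxy-user=myuser --proxy-passwd=mypass http://somedomain.com/public/remotefilename.tar.gz
    ```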

  • Conrad

    >>Deep Vacuum http://www.hexcat.com/deepvacuum/index.html offers Mac OS X users a GUI for wget.

    ? Isn’t a GUI for WGET called “a browser”?

  • Bob

    >>Isn’t a GUI for WGET called “a browser”?

    Ya, because when you want to download 400 images from a server, God knows it’s a whole lot easier to browse them directly, click on each thumbnail, and save them individually instead of typing a one-line command and waiting 2 minutes.

  • Gesiel

    Nice, however you could also have mentioned the -c (resume) option.

  • count zero

    I have an apache 1.3x server with some files protected by a .htaccess file–a username and passwd is required for access.

    The syntax in the article: luser:passwd@http://foo.bar.com
    gives the following error: Authorization failed.
    luser:passwd@http://foo.bar.com Unsupported scheme.

    The alternate syntax given by man wget:
    wget http://foo.bar.com --http-user=user --http-passwd=password
    does work. ???

  • Simon

    Using standard ftp user/password syntax works, ie:

    ftp://user:password@ftp.host.org/xyz

    I would hazard a guess that it would work for http as well.

  • http://www.practicalapplications.net bwarrene

    Yes – while it is feasible that some platforms could hiccup on the syntax depending on the version of the app, the syntax used in the blog post worked on the Linux servers I manage running the 2.2, 2.4 and even a daring newer kernel – in both production and development environments.

  • BrianCoogan

    Unfortunately the syntax given for authentication is completely wrong according to the manual, which gives an example (for user hniksic with password ‘mypassword’) of:

    wget ftp://hniksic:mypassword@unix.server.com/.emacs

    My question though is, what’s the syntax for wget when the FTP username contains an ‘@’ sign?? This breaks everything horribly! :)

    - Brian

  • http://www.practicalapplications.net bwarrene

    [QUOTE=BrianCoogan]Unfortunately the syntax given for authentication is completely wrong according to the manual, which gives an example (for user hniksic with password ‘mypassword’) of:

    wget ftp://hniksic:mypassword@unix.server.com/.emacs

    My question though is, what’s the syntax for wget when the FTP username contains an ‘@’ sign?? This breaks everything horribly! :)

    - Brian[/QUOTE]

    Actually Brian – I always use just that method to wget from either my OS X machine or a Linux/Unix box. You can also try:

    wget ftp://name:password@hostname.com/myfile

  • Nate

    try replacing the at sign with %40, as if it were escaped in a querystring:

    ftp://name%40domain.dom:password@hostname.com/myfile

    assuming “name@domain.dom” is your username

  • Nilesh

    Is it possible to transfer free RapidShare files to my webhost server using Wget? If yes, then what command should I use? How can I run Wget on my external webhost server, so I can use the speed of the external server to transfer files from RapidShare to my webhost account’s file manager?

    Simply put, is Wget capable of external server-to-server transfer?

    Example: I have an HTTP download link for some file, say
    http://www.tyr.com/dady.mov
    and want to transfer this file to my webhost directory using a browser-based control for Wget.

    Is it possible?

  • carlos alberto

    I have never used it before; I am going to try it right now. This tip is very interesting.

  • webchalkboard

    Hi,

    I’m a little confused, I can get wget to work when I enter a url like: wget http://username:password@www.mysite.co.uk/admin/

    But it doesn’t work when I try to get:
    http://username:password@www.mysite.co.uk:2082/getbackup/backup-mysite.co.uk-7-14-2006.tar.gz

    I’m trying to grab the cpanel backup for my website from a different server. I thought this was a cool way to do it, but it doesn’t work… maybe because of the port?

    I’m confused, nothing is working today! :(

  • Yuksel Kurtbas

    >>I’m a little confused, I can get wget to work when I enter a url like: wget http://username:password@www.mysite.co.uk/admin/

    >>But it doesn’t work when I try to get:
    >>http://username:password@www.mysite.co.uk:2082/getbackup/backup-mysite.co.uk-7-14-2006.tar.gz

    >>I’m trying to grab the cpanel backup for my website from a different server. I thought this was a cool way to do it, but it doesn’t work… maybe because of the port?

    You can’t access port 2082 by passing the password in the URL.

  • vidhya

    Hi,

    When I tried to download a file using wget from my office network, I got the following message:

    Connecting to myserver:8080… connected.
    Proxy request sent, awaiting response… 407 Proxy Authentication Required
    14:14:30 ERROR 407: Proxy Authentication Required.

    When I access it through the browser, I use a username and password.

    How can I download from my user home?

    Regards,
    vidhya

  • lx

    vidhya: try “man wget” to see all the options.

    Try these for proxy authentication:
    --proxy-user=USER
    --proxy-passwd=PASS

    Greetz,
    lx

  • Akash Mehta

    @webchalkboard: The cPanel user authentication system is not a proper server-based HTTP authentication method; it’s generated through scripting, and as a result wget can’t handle it. I’m not sure the port should affect the outcome, as you have still specified the protocol as http.