Let’s say that I have an HTML page which has a video in it. If I were to download it manually, I would open the source code and click the link that leads to the .mp4 file, then click the download icon and choose where it should be saved. Doing it once isn’t that difficult, but I have to download more than 50 such videos from a website. The pattern for reaching the .mp4 file is always the same; it doesn’t change. So how can I automate this process? Which tool/language should I use?
With that many files, I think the best approach would be to contact the site and ask them to zip them up for you.
I want to do this just for learning purposes. I think there are ways to do it with Beautiful Soup in Python, but I don’t know exactly how, as I’m not really experienced with Python.
I do not know of an off-the-shelf tool for this, but the process is not too difficult using any of the popular server-side languages. I will give you a quick walkthrough of what needs to be done.
1. Send an HTTP GET request to fetch the page; in most cases the response is HTML. (cURL is a popular choice for this.)
2. Parse the HTML looking for video tags and their src attributes. This is a web crawler of sorts, and you can find many examples and tutorials on building one in many different languages. Once you find a src value, you can validate that it is a video URL by checking its extension.
3. Once you have the video URL, you can use cURL again to download it to a destination of your choice.
Most popular server side languages can accomplish the above.
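The steps above can be sketched in Python using only the standard library (Beautiful Soup, mentioned earlier, would make the parsing step shorter, but this keeps the example dependency-free). The URL and filenames below are placeholders, not from the original site:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve

class VideoLinkParser(HTMLParser):
    """Collect URLs from <video src>, <source src>, and <a href> that end in .mp4."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.video_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("video", "source"):
            url = attrs.get("src")
        elif tag == "a":
            url = attrs.get("href")
        else:
            url = None
        # Step 2: validate by extension before accepting the URL
        if url and url.lower().endswith(".mp4"):
            self.video_urls.append(urljoin(self.base_url, url))

def find_video_urls(html, base_url):
    """Return absolute .mp4 URLs found in the given HTML."""
    parser = VideoLinkParser(base_url)
    parser.feed(html)
    return parser.video_urls

if __name__ == "__main__":
    page_url = "https://example.com/videos"  # placeholder
    # Step 1: fetch the page
    html = urlopen(page_url).read().decode("utf-8", errors="replace")
    # Step 3: download each video found
    for i, url in enumerate(find_video_urls(html, page_url)):
        urlretrieve(url, f"video_{i}.mp4")
```

The same parsing could be done with Beautiful Soup’s `soup.find_all("video")`; the extension check is the simple validation described in step 2.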
Check out this script that I wrote a while ago; there’s a download-file function in it that should help. Just comment out the part where the script checks the file size and makes sure it’s under 1100 KB:
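The script itself isn’t shown in the post, but a minimal Python sketch of such a download-file function, with the 1100 KB size check modeled as a flag you can switch off (the function and parameter names here are mine, not from the original script), might look like this:

```python
from urllib.request import urlopen

MAX_BYTES = 1100 * 1024  # the 1100 KB cap mentioned above

def download_file(url, dest, enforce_size_cap=True):
    """Stream a URL to a local file; optionally skip files over the size cap."""
    with urlopen(url) as resp:
        length = resp.headers.get("Content-Length")
        # Pass enforce_size_cap=False (the "comment it out" step) for large videos
        if enforce_size_cap and length and int(length) > MAX_BYTES:
            raise ValueError(f"{url} exceeds {MAX_BYTES} bytes")
        with open(dest, "wb") as f:
            while chunk := resp.read(64 * 1024):
                f.write(chunk)
```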
Does that help?