I need to retrieve the latest version of an application on a download page:
But I’m not quite sure how to go about doing this. I imagine it would start with a curl retrieval but once I have the code, how do I go about getting that first version number?
Any thoughts or suggestions on the matter would be most welcome.
In the example I’ve downloaded the file and am reading it from disk, but downloading it and parsing that is not that hard. I’ll leave that as an exercise for the reader.
Also, this code covers only the happy path; you’d have to add guards for anything unexpected (node cannot be found, etc.).
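For what it’s worth, here is a rough sketch of what such guards could look like. The function name, the sample markup, and the assumption that the wanted href starts with digits followed by a hyphen are all mine, not taken from the real page.

```php
<?php
// Hypothetical helper: returns the version number from the first link whose
// href looks like "1045-<something>/", or null when no such link exists.
function firstVersionFromHtml(string $html): ?int
{
    // Guard: DOMDocument::loadHTML() rejects an empty string.
    if ($html === '') {
        return null;
    }

    $doc = new DOMDocument();
    // "@" silences warnings about imperfect real-world markup.
    if (!@$doc->loadHTML($html)) {
        return null;
    }

    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        // Assumption: the href begins with the version digits and a hyphen.
        if (preg_match('/^(\d+)-/', $href, $m) === 1) {
            return (int) $m[1];
        }
    }

    return null; // Guard: no matching node was found.
}

// Made-up sample that only resembles a directory listing.
$sample = '<html><body><a href="../">Parent directory/</a>'
        . '<a href="1045-abc123/">1045-abc123/</a>'
        . '<a href="1044-def456/">1044-def456/</a></body></html>';

var_dump(firstVersionFromHtml($sample)); // int(1045)
```

A null return means "could not find it", so the caller can decide whether to retry, log, or keep the last known value.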
Standard Disclaimer: Only use these tools on sites where you have permission to do so, after checking their terms and conditions.
Assuming the structure of the page remains the same, and that the desired link is always the top one, then it would be the 8th <a> tag on the page. Or maybe the 5th. Can’t tell if those arrows are separate links or not.
I just read back on my OP and realized I didn’t explain myself clearly.
I do not need the zip file itself. I only need to know which version is the latest, which is the highlighted number in the image. I need to get that 1045 into a var.
Would these suggestions still be usable for what I’m trying to do?
Take the text content of the link (@rpkamp’s post #3) and, instead of substr’ing it (@John_Betong’s post #2), explode it on the hyphen and take the first element of the resulting array.
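A minimal sketch of that split. It uses explode(), since that is the PHP function that splits on a delimiter (str_split() cuts a string into fixed-length chunks). The link text here is made up; the real text on the page may differ.

```php
<?php
// Hypothetical link text, assumed to look like "<version>-<something>/".
$linkText = '1045-abcdef1234/';

// explode() splits on the hyphen; element 0 is everything before it.
$parts = explode('-', $linkText);
$version = (int) $parts[0];

var_dump($version); // int(1045)
```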
Looks fine. Just take care that it won’t work when allow_url_fopen is disabled on a host, so it would be more portable if you used something like curl.
Is it less likely for a server to have curl disabled than it is allow_url_fopen? I ask because I wonder if it would be worth the effort to have a fallback of trying allow_url_fopen if curl fails or if curl is common / expected.
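If a fallback does turn out to be worth the effort, it could look roughly like this. It is only a sketch: the function name and the timeout value are my own choices, and it tries curl first, then file_get_contents() when allow_url_fopen is on.

```php
<?php
// Hypothetical helper: returns the response body, or null if both
// transports are unavailable or fail.
function fetchPage(string $url): ?string
{
    // Preferred transport: curl, when the extension is loaded.
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        if ($ch !== false) {
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true, // return the body, don't print it
                CURLOPT_FOLLOWLOCATION => true, // follow redirects
                CURLOPT_TIMEOUT        => 10,   // seconds; an arbitrary choice
            ]);
            $body = curl_exec($ch);
            if (is_string($body)) {
                return $body;
            }
        }
    }

    // Fallback: file_get_contents(), only if allow_url_fopen is enabled.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $body = @file_get_contents($url);
        if ($body !== false) {
            return $body;
        }
    }

    return null;
}
```

Calling fetchPage() with the listing URL and checking for null before parsing keeps the rest of the script independent of which transport the host allows.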
The only issue I see with this is that you’re relying on the latest one being at the top of the list. Is that something you have control over, or could the server owner change that without warning? I think I’d be tempted to get all the entries and their corresponding dates, then find the latest date, and get the version associated with it.
But of course if you’re sure it will always be at the top, that’s fine.
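A sketch of the "find the latest date" idea, assuming the entries have already been scraped into version => date pairs. The helper name and the sample dates are invented for illustration.

```php
<?php
// Hypothetical helper: given [version => date string] pairs, return the
// version with the most recent date, or null if nothing usable is found.
function latestVersionByDate(array $entries): ?int
{
    $latestVersion = null;
    $latestTime = null;

    foreach ($entries as $version => $date) {
        $time = strtotime($date);
        if ($time === false) {
            continue; // skip dates that cannot be parsed
        }
        if ($latestTime === null || $time > $latestTime) {
            $latestTime = $time;
            $latestVersion = (int) $version;
        }
    }

    return $latestVersion;
}

// Made-up entries, deliberately out of order.
$entries = [
    1043 => '2019-01-05 09:30',
    1045 => '2019-01-12 14:02',
    1044 => '2019-01-08 17:45',
];

var_dump(latestVersionByDate($entries)); // int(1045)
```

This no longer depends on the order in which the server lists the entries.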
That looks to be the current default order. Maybe, just to make sure, the query string bit could be added, i.e. https://runtime.fivem.net/artifacts/fivem/build_server_windows/master/?C=M&O=D
C, M, O, and D don’t give much idea as to what they are, and I guess those too could be changed at any time. The numbering and dates look to be incremental, maybe putting a “new is greater than last known” check in there somewhere would be a good idea.
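The “new is greater than last known” check could be as small as this. The function name is hypothetical, and where the last known version is stored (file, database, etc.) is left open.

```php
<?php
// Hypothetical check: accept a scraped version only if it is greater than
// the last one we stored. A null $lastKnown means nothing is stored yet.
function isNewerVersion(int $candidate, ?int $lastKnown): bool
{
    return $lastKnown === null || $candidate > $lastKnown;
}

var_dump(isNewerVersion(1046, 1045)); // bool(true)
var_dump(isNewerVersion(1044, 1045)); // bool(false)
```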
I was curious and after further investigation discovered:
the link supplied shows a directory listing
a. Apache creates a web-page with a table of the directory contents
b. directory contents can be file names, dates, links to sub-directories, etc
c. Apache uses a style sheet to format the content
d. Apache lists the contents in a specified order.
I used a couple of browsers to view the directory listing and they were all very similar so…
I used PHP to extract the latest version number, which comes after the Parent directory/ string and the next href= that follows it.
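A sketch of that string-based approach, under the assumption that the first href= after “Parent directory/” starts with the version digits. The function name and sample markup are invented for illustration.

```php
<?php
// Hypothetical helper: find "Parent directory/", then the next href=,
// then read the digits at the start of that href value.
function versionAfterParentDirectory(string $html): ?string
{
    $marker = strpos($html, 'Parent directory/');
    if ($marker === false) {
        return null; // the listing does not look as expected
    }

    $href = strpos($html, 'href=', $marker);
    if ($href === false) {
        return null; // no link after the parent directory entry
    }

    // Skip past href= and an opening quote, then take the leading digits.
    $rest = ltrim(substr($html, $href + strlen('href=')), "\"'");
    if (preg_match('/^\d+/', $rest, $m) !== 1) {
        return null;
    }

    return $m[0];
}

// Made-up sample that only resembles a directory listing.
$listing = '<a href="../">Parent directory/</a>' . "\n"
         . '<a href="1045-abc123/">1045-abc123/</a>' . "\n"
         . '<a href="1044-def456/">1044-def456/</a>';

var_dump(versionAfterParentDirectory($listing)); // string(4) "1045"
```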
I took your fantastic code and made a function of it. The function works if I comment out the strict_types declaration but will not work if it’s included.
I looked up the strict_types declaration and don’t fully understand it. The function seems to work just fine without it, but I wanted to ask about its purpose, whether it’s truly necessary in my case and, if so, what I can do to use it in the function.
Try adding these lines to the file, because it looks as though your default php.ini file has the error display values set to Off.
Please note that with these lines errors and warnings should be displayed in the browser rather than being logged in the /log/errors.log file.
<?php
declare(strict_types=1); // must be the first declaration
error_reporting(-1); // maximum error reporting
ini_set('display_errors', 'true'); // display results to screen instead of errors.log
// remaining script
One reason the script works when declare(…) is removed or commented out is that the declaration must be the first statement in the file. Your php.ini defaults are also preventing the errors from showing in the browser. That is the usual default for live sites, because displayed errors could show users sensitive information, passwords, etc.
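As for what strict_types is for, here is a small illustration. With the declaration in place, PHP stops silently converting scalar arguments to the declared parameter type and throws a TypeError instead. The function below is made up purely for the demonstration.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

// Hypothetical function with a typed parameter.
function doubleIt(int $n): int
{
    return $n * 2;
}

var_dump(doubleIt(21)); // int(42)

try {
    // Without strict_types the string '21' would be coerced to int 21.
    // With strict_types=1 this call throws a TypeError instead.
    doubleIt('21');
} catch (TypeError $e) {
    echo 'TypeError: the string argument was rejected', PHP_EOL;
}
```

So the declaration is not required for the function to work; it is a safety net that surfaces type mix-ups early instead of letting them pass quietly.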