JavaScript to fetch file content and check for a term

mohamedmelbourne786 · December 10, 2023, 6:09pm

Hello James,

Yes, I do have access to a web server. I guess the PHP file should be saved in the same folder as index.html?

I found the code below when I did a search:

```
$data = file_get_contents('http://api.somesite.com');
```

Am I correct, or should more code go into this? (My apologies, I am not familiar with PHP.)

Thanks a lot.

SamuelCalifornia · December 10, 2023, 6:28pm

I am not sure I understand what you are asking, but you could probably do it with a browser extension. Browser extensions request specific permissions that users can choose to accept; if they do not accept the permissions, the extension does not work. The important thing is that browser extensions can do more than non-extension clients can do.

James_Hibbard · December 10, 2023, 6:50pm

Hey,

Yeah, so save the following as proxy/index.php on your server:

<?php
$url = 'https://www.youtube.com/watch?v=94PLgLKcGW8';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$htmlContent = curl_exec($ch);
curl_close($ch);

$searchWord = 'Hilarious Cat Fails';

$found = strpos($htmlContent, $searchWord) !== false;
$response = json_encode(array('found' => $found));

header('Content-Type: application/json');
echo $response;

Then hit up http://localhost/proxy and you should see something like this:

{ "found": true }

What is happening is that we are fetching the content of a page on YouTube and checking for the presence of a string ‘Hilarious Cat Fails’. Based on the presence of that string, we are then returning some JSON which indicates the result.
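
For anyone more at home in JavaScript, the check-and-respond step can be sketched as below. This is only an illustration of the logic, not a replacement for the PHP script: `buildFoundResponse` is a hypothetical name, and the HTML strings are made-up sample input rather than real YouTube markup.

```javascript
// Sketch of the PHP logic above, expressed in JavaScript.
// buildFoundResponse is a hypothetical helper name, not part of the PHP script.
function buildFoundResponse(htmlContent, searchWord) {
  // Like strpos(...) !== false, includes() is an exact, case-sensitive substring test
  const found = htmlContent.includes(searchWord);
  // Mirrors json_encode(array('found' => $found))
  return JSON.stringify({ found });
}

// Made-up sample input, for illustration only
console.log(buildFoundResponse('<title>Hilarious Cat Fails</title>', 'Hilarious Cat Fails')); // {"found":true}
console.log(buildFoundResponse('<title>Hilarious Cat Fails</title>', 'SitePoint rocks!')); // {"found":false}
```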

Once you have that working, try changing the search term, e.g.:

$searchWord = 'SitePoint rocks!';

When you reload the PHP page, this should return:

{ "found": false }

I know this isn’t yet what you set out to do, but I’m trying to illustrate how this would work in practice.

In the next couple of iterations, we can change it to search for the correct term and we can make it so that you can consume the endpoint from within your JavaScript.

LMK if the above all makes sense and is working as I described.

mohamedmelbourne786 · December 11, 2023, 2:49am

Hello James,
Thank you so much for this. It works perfectly fine.

I guess the next step is to write some JavaScript to:
(a) call the proxy/index.php
(b) get the PHP variable $found into a JavaScript variable

Once I can get up to (b) above (i.e. as true/false), I should be able to proceed. I tried to do (b) by reading a lot on the internet, but had no luck. Could you kindly guide me on this, please? Thank you so much for all the help you gave…very kind of you.

mohamedmelbourne786 · December 11, 2023, 5:09am

Hello James,
Thank you so much for your code. I made a slight change as:
$status = json_encode($found);
echo $status;

and then wrote the code below in index.html, and it seems to be working.

But the only problem is that the PHP output seems to be cached. If I run proxy/index.php and refresh the browser, it gives the correct result, and if I access index.html afterwards, that works too. Otherwise index.html gives the previous result even after I change the “search term” in the PHP file… Any ideas on how to call proxy/index.php FRESH each time index.html loads, and how to prevent browser caching of index.html? Thank you very much.

window.onload = function checkForWord() {
  fetch("proxy/index.php")
    .then(function (response) {
      return response.json();
    })
    .then(function (data) {
      const hasWord = data;
      alert(hasWord);
    });
};

NOTE: My previous reply thanking you for the code you sent was held by the anti-spam filter…don’t know why…
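
One option for the "call it FRESH each time" question is the `cache` option of `fetch`. The sketch below is a minimal illustration, assuming the proxy lives at `proxy/index.php` as in the post; `fetchFound` and its `fetchImpl` parameter are hypothetical names, and the injectable `fetchImpl` exists only so the function can be tried without a server.

```javascript
// Hypothetical helper: request the proxy while bypassing the browser's HTTP cache.
// fetchImpl defaults to the global fetch; it is injectable only for testing.
async function fetchFound(url, fetchImpl = fetch) {
  // cache: 'no-store' asks fetch not to read from or write to the HTTP cache
  const response = await fetchImpl(url, { cache: 'no-store' });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // Resolves to whatever JSON the PHP script echoes
  return response.json();
}
```

In the page this could be called as `fetchFound('proxy/index.php').then((data) => alert(data));`. It only affects the request made by `fetch`; caching of index.html itself is a separate matter.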

James_Hibbard · December 11, 2023, 6:38am

Hmmm, I tried your script (including your modifications) and couldn’t reproduce the problem. So there are a couple of things we need to try to find out where the caching is taking place.

First of all it would be better if we both run the same code. Pls change the PHP back to this so that we are returning valid JSON:

index.php

<?php
$url = 'https://www.youtube.com/watch?v=94PLgLKcGW8';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$htmlContent = curl_exec($ch);
curl_close($ch);

$searchWord = 'Hilarious Cat Fails';

$found = strpos($htmlContent, $searchWord) !== false;
$response = json_encode(array('found' => $found));

header('Content-Type: application/json');
echo $response;

Then a couple of modifications to the JavaScript.

index.html

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>YouTube Status</title>
</head>
<body>

  <script>
    document.addEventListener('DOMContentLoaded', async () => {
      const res = await fetch('./index.php');
      const json = await res.json();
      console.log(json);
    });
  </script>
</body>
</html>

Now make sure the PHP script contains the term ‘Hilarious Cat Fails’ and load the HTML page in the browser.

What is the output of the HTML page?
What is the output of loading the PHP page directly?

Now change the term to ‘SitePoint Rocks!’. Use this term and not something else (at first I was tempted to just change ‘cat’ to ‘dog’, but this might actually be present in the comments or the suggested videos loaded by the YouTube algorithm). Then load the HTML page in the browser.

What is the output of the HTML page?
What is the output of loading the PHP page directly?

Post back here with the relevant output and we’ll take it from there.
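
To make that comparison easier, a small diagnostic along these lines may help. It is a sketch only: `describeResponse` is a hypothetical helper, and it simply reports the status, the content type, the first characters of the body, and the parsed JSON (or `null` if the body is not JSON, for example an HTML error page).

```javascript
// Hypothetical diagnostic helper: report what the endpoint actually returned.
// fetchImpl defaults to the global fetch; it is injectable only for testing.
async function describeResponse(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, { cache: 'no-store' });
  const contentType = response.headers.get('content-type') || '';
  const body = await response.text();
  let json = null;
  try {
    json = JSON.parse(body);
  } catch (error) {
    // Body was not valid JSON; leave json as null
  }
  return { status: response.status, contentType, preview: body.slice(0, 80), json };
}
```

For example, `describeResponse('./index.php').then(console.log);` run from the HTML page shows whether the page is really receiving the JSON that the PHP script produces.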

soniya-01A · December 11, 2023, 6:49am

To fetch the content of a file and check for a term using JavaScript, you can use the fetch API for file retrieval and then search for the term within the content. Here’s a simple example:

File Content Checker

<script>
    // Function to fetch file content and check for a term
    async function checkFileContent(filePath, searchTerm) {
        try {
            // Fetch the file content using the fetch API
            const response = await fetch(filePath);

            // Check if the fetch was successful (status code 200)
            if (!response.ok) {
                throw new Error(`Failed to fetch the file. Status: ${response.status}`);
            }

            // Get the text content of the file
            const fileContent = await response.text();

            // Check if the term exists in the file content
            if (fileContent.includes(searchTerm)) {
                console.log(`The term '${searchTerm}' was found in the file.`);
            } else {
                console.log(`The term '${searchTerm}' was not found in the file.`);
            }

        } catch (error) {
            console.error(`Error: ${error.message}`);
        }
    }

    // Example usage
    const filePath = 'path/to/your/file.txt'; // Replace with your file path
    const searchTerm = 'yourTerm'; // Replace with the term you want to check for

    // Call the function with your file path and search term
    checkFileContent(filePath, searchTerm);
</script>

This script defines a function `checkFileContent` that uses the `fetch` API to get the content of a file. It then checks if a specific term exists in the file content using the `includes` method.

Replace 'path/to/your/file.txt' with the actual path to your file and 'yourTerm' with the term you want to check for. When you open this HTML file in a web browser, it will fetch the content of the specified file and log whether the term is found or not.

mohamedmelbourne786 · December 11, 2023, 7:29am

Hi James,
The outputs directly from index.php are {"found":true} and {"found":false}, respectively, which are of course correct. However, the output of the index.html file is always false (surprisingly)…

Not sure why this is happening.

(BTW, even when I call the PHP file directly, the correct result is displayed only after I hit the browser refresh button.)

James_Hibbard · December 11, 2023, 7:33am

Me neither.

Change this in the PHP:

$found = true;

What does your HTML page output to the console now?

mohamedmelbourne786 · December 11, 2023, 8:16am

Hi James,
Now index.html is not producing any output at all and the console is giving errors such as: DOCTYPE "… is not valid JSON. I am not sure where the error is, but please don’t worry, because I got the earlier code working (i.e. the caching problem is solved) after I included the following in the PHP file:
// set expires header
header('Expires: Thu, 1 Jan 1970 00:00:00 GMT');
// set cache-control header
header('Cache-Control: no-store, no-cache, must-revalidate, max-age=0');
header('Cache-Control: post-check=0, pre-check=0', false);
// set pragma header
header('Pragma: no-cache');

AND the following in the HTML file:
<meta http-equiv="refresh" content="10">

Now everything is working perfectly. I just don’t have enough words to thank you for the PHP code you sent, without which I simply wouldn’t have come to this point.
If I require any assistance again, I shall post to this forum again, and I am sure you will give me a helping hand…Thank you SO MUCH again for all your help…very kind of you…
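
The meta refresh reloads the whole page every 10 seconds. As a possible alternative, the check could be repeated from JavaScript without reloading. The following is a rough sketch under that assumption; `startPolling` is a hypothetical helper, and the 10-second default simply mirrors the `content="10"` value above.

```javascript
// Hypothetical helper: run an async check now, then again every intervalMs milliseconds.
function startPolling(check, onResult, intervalMs = 10000) {
  const run = () => check().then(onResult).catch((error) => console.error(error));
  run(); // first check immediately
  const timerId = setInterval(run, intervalMs);
  // Return a function that stops the polling
  return () => clearInterval(timerId);
}

// Possible usage in the page (assumes the proxy/index.php endpoint from this thread):
// const stop = startPolling(
//   () => fetch('proxy/index.php', { cache: 'no-store' }).then((res) => res.json()),
//   (data) => console.log(data)
// );
```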

mohamedmelbourne786 · December 11, 2023, 8:51am

Just want to say a BIG THANK YOU to everyone who helped me to solve my problem…I was trying for more than a week to find out whether a YouTube channel is live or not and got everything done within 1.5 days of joining SitePoint. Have no words to thank all of you.

Just a very tiny contribution from me on the words to search for to check whether a YouTube channel is live or not: when it is live, the page contains the sequence '"text":" watching"', while the page of a channel that is not live does not contain this sequence.
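
As a rough illustration of that check in JavaScript (for page source that has already been fetched, e.g. by the proxy), a sketch could look like this. `looksLive` and `LIVE_MARKER` are hypothetical names, the sample strings are invented, and the marker is only what was observed in this thread; YouTube's markup may change.

```javascript
// Marker reported in the post above; treat it as an assumption, not a stable API.
const LIVE_MARKER = '"text":" watching"';

// Hypothetical helper: true when the page source contains the marker exactly,
// including the quotes and the space before "watching".
function looksLive(pageSource) {
  return pageSource.includes(LIVE_MARKER);
}

// Invented sample strings, for illustration only
console.log(looksLive('{"runs":[{"text":"1,234"},{"text":" watching"}]}')); // true
console.log(looksLive('{"text":"watching"}')); // false (no space before "watching")
```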

The advantage of the code James provided is that you can straightaway use the search term as a single string, spaces included. (JavaScript’s includes() also matches spaces literally, so there is no need to strip them with replaceAll() first.)

Anyway, once again, Thank you VERY MUCH all !

James_Hibbard · December 11, 2023, 8:57am

Glad you got it working
