fs.access, fs.write and potential race conditions

Hi,

So I have just been asking ChatGPT to create some code that will check for the existence of a folder, make one if it doesn’t exist, and finally write a file to that destination.

It is quite an interesting process as you try to guide ChatGPT: ‘can we break that down into separate functions?’, ‘how about using fs.access instead of fs.stat?’ etc.

This is what it came up with.

const fs = require('fs').promises;
const path = require('path');

const ensureDirectoryExists = async (directoryPath) => {
  try {
    await fs.access(directoryPath);
  } catch (err) {
    if (err.code === 'ENOENT') {
      await fs.mkdir(directoryPath, { recursive: true });
    } else {
      throw err;
    }
  }
};

const writeFile = async (filePath, data) => {
  try {
    await fs.writeFile(filePath, data);
    console.log('File saved successfully.');
  } catch (err) {
    console.log('Error saving file: ', err);
  }
};

const saveTest = async () => {
  const filePath = 'public/non-existant-folder/some.txt';
  const directoryPath = path.dirname(filePath);

  try {
    await ensureDirectoryExists(directoryPath);
    await writeFile(filePath, 'Hello there!');
  } catch (err) {
    console.log('Error: ', err);
  }
};

That looks pretty good to me, but this line from the Node.js docs throws a spanner in the works.
https://nodejs.org/api/fs.html#fspromisesaccesspath-mode

Using fsPromises.access() to check for the accessibility of a file before calling fsPromises.open() is not recommended. Doing so introduces a race condition, since other processes may change the file’s state between the two calls. Instead, user code should open/read/write the file directly and handle the error raised if the file is not accessible.

I mentioned this to ChatGPT and it came up with an alternative.

const fs = require('fs').promises;
const path = require('path');

const writeFile = async (filePath, data) => {
  try {
    const fileHandle = await fs.open(filePath, 'w');
    await fileHandle.writeFile(data);
    console.log('File saved successfully.');
    await fileHandle.close();
  } catch (err) {
    console.log('Error saving file: ', err);
  }
};

const saveTest = async () => {
  const filePath = 'public/non-existant-folder/some.txt';
  const directoryPath = path.dirname(filePath);

  try {
    await fs.mkdir(directoryPath, { recursive: true });
    await writeFile(filePath, 'Hello there!');
  } catch (err) {
    console.log('Error: ', err);
  }
};
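For what it's worth, here is a minimal sketch of the "just do the operation and handle the error" approach the docs describe. It is not a definitive implementation. The name saveFile is hypothetical, and it differs from the version above in two ways: the file handle is closed in a finally block, so it is not left open if the write fails, and errors are passed up to the caller rather than logged and swallowed.

```javascript
const fs = require('fs').promises;
const path = require('path');

// Hypothetical helper: create the parent directory if needed, then write.
// There is no access()/stat() check beforehand; any failure surfaces as a
// rejected promise that the caller can handle.
const saveFile = async (filePath, data) => {
  // With recursive: true, mkdir does not reject if the directory already exists.
  await fs.mkdir(path.dirname(filePath), { recursive: true });

  const fileHandle = await fs.open(filePath, 'w');
  try {
    await fileHandle.writeFile(data);
  } finally {
    // Close the handle even if the write fails.
    await fileHandle.close();
  }
};
```

The caller would then wrap saveFile(filePath, 'Hello there!') in its own try/catch and decide what to do with the error.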

I’m starting to feel like I am going down the rabbit hole with this now. I can’t find any up-to-date documentation or tutorials that give a clear-cut way of performing this task.

Any input would be greatly appreciated.

To be honest I don’t know if your example will have a real use case, but if so, you can for example use semaphores. A simple example of creating a semaphore is here

Semaphores are fine for avoiding contention within one process, but they don’t help at all to avoid it between separate processes, which I think is what the documentation alludes to.

Wrt the original question, I would check if it is realistic in your scenario. I mean, the chances of you creating a directory and some other process deleting that directory in the few nanoseconds that it takes your process to create a file in that directory are really slim for regular programs. Sure, if you handle really high volumes of traffic like FB or LinkedIn, stuff like that does happen. It’s a matter of scale.

Imo it’s fine to know, keep it in mind, ignore and move on.

Sorry, but I think you are totally wrong here. Semaphores exist especially to avoid race conditions between multiple processes accessing the same resource.

In general yes, but the implementation you link to does not. It’s just an in-memory counter, which can’t be shared between processes. You’d need some sort of persistent backend for that (like Redis, or MySQL, or something like that).
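To illustrate the difference, here is a rough sketch of one way separate processes can coordinate through something they all share, in this case a lock file on disk rather than Redis or MySQL. This is a different technique from the linked semaphore and only a sketch under assumptions: tryAcquireLock and releaseLock are hypothetical names, it assumes a local filesystem, and it does nothing about stale locks left behind by a crashed process.

```javascript
const fs = require('fs').promises;

// Hypothetical helper: try to take a lock by creating a file exclusively.
// The 'wx' flag makes open() fail with EEXIST if the file already exists,
// so checking and creating happen in a single call instead of two.
const tryAcquireLock = async (lockPath) => {
  try {
    const handle = await fs.open(lockPath, 'wx');
    await handle.close();
    return true; // we created the file, so we hold the lock
  } catch (err) {
    if (err.code === 'EEXIST') {
      return false; // the lock file is already there
    }
    throw err;
  }
};

// Hypothetical helper: release the lock by deleting the lock file.
const releaseLock = (lockPath) => fs.unlink(lockPath);
```

A process that gets false back would have to wait and retry, or give up. An in-memory counter cannot do this, because each process has its own copy of the counter.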

Yes, that’s true, but Node.js is JavaScript and therefore has no multiple processes. It just queues events, and if you have an await or other asynchronous code, it is added to the queue after it has finished. So there is no real multiprocessing at all.

That’s an interesting point, and I think keeping it simple at this point may be the best idea. I’m overthinking this. Unfortunately there is so much contradictory information out there, you end up going round in circles.

Why is that? You would never check a destination folder before writing to it? What would you suggest instead?

I am doing a bit of a re-write of a php implementation.

// check if directory is ok
if ( ! is_dir( $uploads_dir ) )  {
    die( "The {$uploads_dir} directory does not exist." );
}

// check if directory is ok
if ( ! is_writable( $uploads_dir ) ) {
    die( "The {$uploads_dir} directory is not writable." );
}

// save new avatar
if ( ! file_put_contents( $uploads_dir . $file_name, $data ) ) {
    die( 'Error writing avatar on a disk.' );
}

These checks did come in handy on the Mac, as I didn’t have write permissions set up on my local server. Is this not the way to do things in Node?
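As a rough idea of how the same feedback could be had in Node without checking first, here is a sketch that attempts the write and turns the error code into a message like the PHP ones. saveAvatar is a hypothetical name. The mapping is an assumption too: ENOENT can mean any part of the path is missing, and EACCES or EPERM can also mean the file itself is not writable.

```javascript
const fs = require('fs').promises;
const path = require('path');

// Hypothetical helper mirroring the PHP checks, but after the fact:
// try the write, then translate the error code into a readable message.
const saveAvatar = async (uploadsDir, fileName, data) => {
  try {
    await fs.writeFile(path.join(uploadsDir, fileName), data);
  } catch (err) {
    if (err.code === 'ENOENT') {
      // Some part of the path is missing, most likely the directory.
      throw new Error(`The ${uploadsDir} directory does not exist.`);
    }
    if (err.code === 'EACCES' || err.code === 'EPERM') {
      // Permission problem on the directory (or on an existing file in it).
      throw new Error(`The ${uploadsDir} directory is not writable.`);
    }
    throw new Error(`Error writing avatar to disk: ${err.message}`);
  }
};
```

There is no window between a check and the write, because there is no separate check.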

But this code does not create race conditions as you do not create the folder. So even if another event is creating the directory while you are testing, this is not running into an error.
I prefer to set up the server with an installation script that runs once and creates all needed files and directories with the correct access rights. But even if you want to create the folder dynamically, you can easily catch the error when the directory already exists and proceed.
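A minimal sketch of that second option, creating the directory and carrying on if it is already there. createDirectory is a hypothetical name. One caveat: EEXIST is also raised when a regular file already sits at that path, which this sketch does not distinguish.

```javascript
const fs = require('fs').promises;

// Hypothetical helper: create a single directory level without checking first.
// Resolves to true if it was created, false if something already existed there.
const createDirectory = async (directoryPath) => {
  try {
    await fs.mkdir(directoryPath); // no recursive option, so EEXIST is reported
    return true;
  } catch (err) {
    if (err.code === 'EEXIST') {
      return false; // already there, nothing to do
    }
    throw err; // permissions, missing parent, etc.
  }
};
```

With { recursive: true }, as in post #1, mkdir does not reject for an existing directory, so the catch is not needed for that case.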

That’s true.

That makes sense. Maybe a daft question, but how would you implement that? Where and when? If and when you have time, could you show a cut down example? Sorry I have become quite demanding since working with Chat GPT :slight_smile:

What do you mean? The installation file or the directory creation with error handling?

This is a good example of how to implement check-and-create nowadays. I did not even know about this new recursive feature until now.

All of it really, Thallius. How would you organise the installation file? Would it be imported into app.js, and where in the order of things (setting up routers etc.) would you call that init file?

Sorry if I am being vague. Only looking for a rough idea/convention as it were.

Yes, you can see it being used in the ensureDirectoryExists function in post #1. It was new to me too :slight_smile:

reads all the back and forth

So… just how often do you expect something other than your code to be creating these directories? Do you call ensureDirectoryExists asynchronously repeatedly?

Are you fighting a fire that doesn’t exist?

From post #7

That’s an interesting point, and I think keeping it simple at this point may be the best idea. I’m overthinking this. Unfortunately there is so much contradictory information out there, you end up going round in circles.

It’s not knowing, m_hutley, that is the issue for me. Is this going to bite me in the backside down the line? I have had dealings with previous projects where the developers have relied on sticking-plaster setTimeouts to deal with these types of issues, and ideally I want to avoid that.

Being better informed will mean I can say this is overkill and do X instead. The answer to ‘Are you fighting a fire that doesn’t exist?’ is probably yes!

I think at some point you get to the idea of inevitability: as soon as you put a line break in your code, you’ve got a “possible race condition”. Eliminating all of that is impossible, because a JavaScript script isn’t an atomic operation.

You can try and semaphore it, you can try and tighten the code up as much as you like. The simple fact is, your script isn’t in control of the file system or the CPU or other programs running on the system, and so you can never completely eliminate the conditions. The best I think you can do is… well, what you started out with: try, and handle errors cleanly.

To be honest, there is no kind of installation file. We use Jenkins for CI/CD. So the servers (dev, stage, prod) are set up once by a server admin, then we do all the rest with Jenkins. If we need changes in the server setup, we ask the admin to make them. He is responsible for all the work on the server itself. In the end he is only doing what we ask for (installing needed applications like npm and Apache and configuring them), but he documents it carefully and takes care to always have the newest versions installed to match the newest security standards, updating SSL certificates, etc.

This is of course something you have to do yourself when you work with local servers.

@Thallius

That sounds like the proper way to do it, and it’s great that you have someone in the know to set this up for you. Jenkins is now another one on the list to at least familiarise myself with.

I need more time in the day to educate myself about all this stuff. It’s good to be learning though :slight_smile:
