Check if a file is open by another process before opening it

Good morning. I need to check whether a file is already open for writing or reading before reopening it from another Python program. If I do `os.path.isfile("namefile")` the answer is yes, but I must not read the file until the other Python process has finished writing it, or I could lose a lot of data.

I would like to ask whether there is a way to check if a file is already open before opening it in read mode from another Python program.
Thank you very much
Best regards

Usually we don’t do such a check that way - it is racy. Consider
this code:

if not-in-use:
    use-the-file

Now imagine 2 programmes doing that same check, running at once. Maybe
their execution looks like this:

if not-in-use:
                            if not-in-use:
    use-the-file
                                use-the-file

Observe that both decide that the file is not in use (because it isn’t
when they do the check), then both go ahead and use the file, exactly
what you wanted to avoid. This kind of situation is called a race.
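As a concrete Python rendering of that race, the same check-then-act shape built on `os.path.isfile()` (the check from the original question) can be invalidated in the gap between the two steps; the filename here is illustrative:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.txt")

# RACY: another process could create or open the file in the gap
# between the isfile() check and the open() call below.
if not os.path.isfile(path):
    with open(path, "w") as f:
        f.write("created\n")
```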

Instead, the usual approach is a lock of some kind, enforcing mutual
exclusion. What’s available depends on the platform. The pattern looks
like this:

lock-the-file
    use-the-file
unlock-the-file

The lock is such that you can’t obtain it if someone else has the lock.
The “lock-the-file” step might come in 2 flavours: blocking - you always
get the lock, but not until it is released, or nonblocking - you may
fail to get the lock, in which case you should try again later.

On UNIX platforms the fcntl module has lockf and flock functions - the
pattern is that you open the file nondestructively, then take the lock
before doing any work.
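A minimal sketch of both flavours using `fcntl.flock`, assuming a POSIX system (the file here is a throwaway temp file standing in for your shared file):

```python
import fcntl
import tempfile

path = tempfile.mkstemp()[1]  # stand-in for the shared file

# Blocking flavour: wait until the lock is free.
with open(path, "a+") as f:        # open nondestructively
    fcntl.flock(f, fcntl.LOCK_EX)  # take an exclusive lock
    # ... use the file ...
    fcntl.flock(f, fcntl.LOCK_UN)  # release (also released on close)

# Nonblocking flavour: fail immediately if someone holds the lock.
with open(path, "a+") as f:
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        # ... use the file ...
    except BlockingIOError:
        pass  # locked by someone else - try again later
```

Note that flock locks are advisory: they only exclude other processes that also take the lock.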

In Windows you often can’t open a file already in use. I don’t know what
the accepted method is there.

Cheers,
Cameron Simpson cs@cskk.id.au

A file open in Windows specifies the allowed read/execute (1), write/append (2), and delete/rename (4) access sharing. The file’s share access is checked and updated atomically by the filesystem driver in the kernel, so there is no race condition.

If you don’t want to share write and delete access, then call CreateFile() with dwShareMode = FILE_SHARE_READ. If the file is currently open with FILE_WRITE_DATA, FILE_APPEND_DATA, or DELETE access, the open will fail with ERROR_SHARING_VIOLATION (32). Of course, turnabout is fair play, so if you’re requesting FILE_READ_DATA or FILE_EXECUTE access, and the file is currently open without read access sharing, the open will also fail with a sharing violation. If the open succeeds, then as long as the open exists (i.e. as long as the kernel file object has one or more handles), the file cannot be opened again with write, append, or delete (rename) access.

Python’s builtin open() shares read and write access, but not delete access. If you need a different share mode, you’ll have to call CreateFile directly via ctypes or PyWin32’s win32file module.
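As a hedged sketch of the ctypes route (the function name `open_denying_writers` is mine; the constants are the documented winbase.h/winerror.h values):

```python
import sys

# Windows constants from winbase.h / winerror.h
GENERIC_READ            = 0x80000000
FILE_SHARE_READ         = 0x00000001
OPEN_EXISTING           = 3
ERROR_SHARING_VIOLATION = 32

def open_denying_writers(path):
    """Open `path` read-only, denying other writers (Windows only).

    Returns a kernel handle on success, or None if the open fails
    with ERROR_SHARING_VIOLATION.
    """
    if sys.platform != "win32":
        raise OSError("CreateFileW is only available on Windows")
    import ctypes
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value
    h = kernel32.CreateFileW(path, GENERIC_READ, FILE_SHARE_READ,
                             None, OPEN_EXISTING, 0, None)
    if h == INVALID_HANDLE_VALUE:
        if ctypes.get_last_error() == ERROR_SHARING_VIOLATION:
            return None
        raise ctypes.WinError(ctypes.get_last_error())
    return h  # caller closes with kernel32.CloseHandle(h)
```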

Hi Eryk Sun,

You’re referring to CreateFile, but what about opening an existing file?

Thank you very much,

In detail, I’m interested in Linux systems.

Linux is basically POSIX compatible, like the UNIXen are.

Cheers,
Cameron Simpson cs@cskk.id.au

CreateFile is also called to open a Device object or filesystem file/directory. The “File” that’s created is actually a kernel File object that references the Device object or filesystem file/directory. In the latter case, it’s either newly created or already existing. The dwCreationDisposition parameter determines the result:

  • CREATE_NEW - create a new file or fail if it already exists
  • CREATE_ALWAYS - create a new file or overwrite an existing file
  • OPEN_ALWAYS - open an existing file or create a new file
  • OPEN_EXISTING - open an existing file or fail if it doesn’t exist
  • TRUNCATE_EXISTING - open an existing file and truncate it, or fail if it doesn’t exist

If the creation disposition is CREATE_ALWAYS or OPEN_ALWAYS, and the call opens an existing file, it succeeds with the thread’s last error value set to ERROR_ALREADY_EXISTS. This is one of the few cases for which it’s correct to call GetLastError() after a successful call. Usually only a failed call sets the thread’s last error value.
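For readers on POSIX (per the Linux interest above), the closest analogue of CREATE_NEW is `os.open()` with `O_CREAT | O_EXCL`, where the existence check and the creation happen atomically in the kernel; a sketch (the helper name is illustrative):

```python
import os

def create_new(path):
    """Create `path` for writing; fail if it already exists.

    Mirrors Windows CREATE_NEW: O_CREAT | O_EXCL makes the kernel
    check-and-create atomically, so there is no race window.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    return os.fdopen(fd, "w")
```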

On Linux, you can get a list of open files from the lsof command:

https://www.man7.org/linux/man-pages/man8/lsof.8.html

Note that this is not a Python command – you will need to write some
code to call that external command and parse the output. Do you need
help with that?
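A minimal sketch of driving lsof from Python, assuming the lsof command is installed (`lsof -t` is its terse mode, printing one PID per line and exiting nonzero when no process has the file open):

```python
import subprocess

def file_is_open(path):
    """Return True if some process currently holds `path` open.

    Requires the external lsof command; see its man page for the
    meaning of -t (terse, PIDs only) and -- (end of options).
    """
    result = subprocess.run(
        ["lsof", "-t", "--", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```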

Keep in mind that this leaves you open to “time of check to time of use”
bugs:

  • Process 1: you check that the file is not open; it is not open so you
    think it is safe to open it;

  • Process 2 opens the file;

  • Process 1: you open the file as well.

Using locks may help a bit, but they are only advisory: other
processes may not honour your locks.

You may be able to enable mandatory file locking:

https://www.thegeekstuff.com/2012/04/linux-file-locking-types/

although some Linux kernels allow you to disable mandatory locking
altogether.

What specifically are you concerned about? Are you worried about random
other programs and processes writing to your files while you are trying
to read them? If you explain your scenario, we may be able to offer some
good advice.

Thank you very much Steven, I will try it soon.

Note that the psutil project includes a Process.open_files method, which can give you the same info as the lsof command (with similar caveats).
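A sketch of that approach, guarded for environments without the third-party psutil package (the helper name `processes_using` is mine; `process_iter` and `Process.open_files` are psutil's documented API):

```python
try:
    import psutil  # third-party: pip install psutil
except ImportError:
    psutil = None

def processes_using(path):
    """Return PIDs of processes that hold `path` open (needs psutil)."""
    if psutil is None:
        raise RuntimeError("psutil is not installed")
    pids = []
    for proc in psutil.process_iter():
        try:
            if any(f.path == path for f in proc.open_files()):
                pids.append(proc.pid)
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue  # can't inspect other users' processes
    return pids
```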


How about this: first read the file, then write True into another file; when your original file is about to be read, the reader first checks that other file to see whether it contains True, or False, or None.

Hi Devyansh,

How does that process work?

I don’t see how having another file written that contains True will
prevent another process from opening the original file.

If I write a file with “True”, why would it change to “False” or “None”?

it won’t make a difference unless your original file first checks. Here’s an example:
files in the directory:
file.txt
file2.txt
file.py
file2.py
code in file.py:

with open('file.txt', 'w') as f:
    f.write('True')
with open('file2.txt', 'w') as f:
    f.write('your text')

code in file2.py:

with open('file.txt', 'r') as f:
    f = f.read()
if f != 'True':
    pass  # read/write the file or whatever you're doing with it

Sorry for replying late.

I’m running into this same issue and also running on UNIX systems. I have one process creating a database backup (which can take some time) and one process waiting for files to upload. Obviously, my ‘upload’ process needs to leave my backup file alone until it has finished. Here’s what I’m going with…

On the file write side:

# create backup with a hidden filename (starts with a '.')
filename = ".dump_file"
dump_db(filename)

# rename file after backup has finished, removing the '.'
os.rename(filename, filename.lstrip('.'))

On the file read side:

# list files and ignore any hidden filenames
files = os.listdir()
files = [i for i in files if i[0] != '.']

# upload and delete 'visible' files
for file in files:
    upload_file(file)
    os.remove(file)

Since the filename change comes after the file is completely written, the read side will ignore any in-process files.
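For reference, the two sides above can be condensed into a small runnable sketch (names are illustrative). On POSIX, `os.rename()` within one filesystem is atomic, so the read side sees either no file or a complete file, never a partial one:

```python
import os

def write_then_publish(dirpath, name, data):
    # Write under a hidden name, then rename into place atomically.
    hidden = os.path.join(dirpath, "." + name)
    with open(hidden, "w") as f:
        f.write(data)
    os.rename(hidden, os.path.join(dirpath, name))

def visible_files(dirpath):
    # The read side ignores hidden (still-in-progress) files.
    return [n for n in os.listdir(dirpath) if not n.startswith(".")]
```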

By Jared Simons via Discussions on Python.org at 14Jun2022 16:20:

I’m running into this same issue and also running on UNIX systems. I
have one process creating a database backup (which can take some time)
and one process waiting for files to upload. Obviously, my ‘upload’
process needs to leave my backup file alone until it has finished.
Here’s what I’m going with…

On the file write side:

# create backup with a hidden filename (starts with a '.')
filename = ".dump_file"
dump_db(filename)

# rename file after backup has finished, removing the '.'
os.rename(filename, filename.lstrip('.'))

Aye. This is worth making a function for. I’ve got an atomic_filename
context manager for this:

You’re welcome to reuse it.
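(Cameron’s actual code isn’t shown above; the following is an illustrative reconstruction of what such an atomic_filename context manager might look like, not his implementation:)

```python
import os
from contextlib import contextmanager

@contextmanager
def atomic_filename(final_path):
    # Yield a hidden working name; rename it to the final name only
    # if the with-block completes without an exception.
    dirpath, name = os.path.split(final_path)
    workname = os.path.join(dirpath, "." + name)
    try:
        yield workname
        os.rename(workname, final_path)
    except BaseException:
        if os.path.exists(workname):
            os.remove(workname)  # clean up the partial file
        raise
```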

Cheers,
Cameron Simpson cs@cskk.id.au