Yes, duping for direct access like a mmap and yes, that presumably
means that caching can spell doom also (I dunno what happens when you
dupe right now).
Does this mean you do not know what os.dup() does? It obtains another
file descriptor for the original descriptor (the int from f.fileno()).
They are not independent.
In POSIX, a file descriptor is just an int we use to talk about an
open file via the OS interfaces. A process has a mapping from the file
descriptors (these ints) to the file handle inside the kernel, which
represents the open file. In particular, it has a seek position for the
file. If you read from one file descriptor, that advances the seek
position, and that change will be visible via the other file descriptor
(because they both point at the same file handle).
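A minimal sketch of that shared seek position, assuming a POSIX-like
system and a scratch temporary file (the function name
demo_shared_offset is made up for illustration):

```python
import os
import tempfile


def demo_shared_offset():
    """Read via one descriptor, observe the offset via its dup."""
    with tempfile.TemporaryFile() as f:
        f.write(b"abcdefghij")
        f.flush()

        fd1 = f.fileno()
        fd2 = os.dup(fd1)  # second descriptor, same kernel file handle
        try:
            os.lseek(fd1, 0, os.SEEK_SET)
            first = os.read(fd1, 4)  # moves the shared offset to 4

            # Ask for the current offset through the *other* descriptor.
            pos_via_dup = os.lseek(fd2, 0, os.SEEK_CUR)

            # Reading via fd2 continues where fd1 left off.
            second = os.read(fd2, 3)
            return first, pos_via_dup, second
        finally:
            os.close(fd2)  # the dup is ours to close
```

The read through fd1 is visible as a changed offset on fd2, which is
what "not independent" means here.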
Note there’s an os.pread call which doesn’t move the file pointer.
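As a rough illustration (os.pread is only available on Unix-like
platforms; demo_pread is a hypothetical name), the offset reported by
lseek is the same before and after the call:

```python
import os
import tempfile


def demo_pread():
    """Show that os.pread leaves the descriptor's offset alone."""
    with tempfile.TemporaryFile() as f:
        f.write(b"abcdefghij")
        f.flush()

        fd = f.fileno()
        os.lseek(fd, 2, os.SEEK_SET)
        before = os.lseek(fd, 0, os.SEEK_CUR)

        # Read 3 bytes starting at absolute offset 5.
        chunk = os.pread(fd, 3, 5)

        after = os.lseek(fd, 0, os.SEEK_CUR)
        return before, chunk, after
```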
But if you’re eg mmapping the file, you need not worry about that
because you’re not doing anything which moves the file pointer that way
either.
The purpose of the dup() is to obtain a secondary file descriptor so
that if the first is closed (eg by closing the file) you’ve still got
valid access to the file (and, eg, its mmap) because there’s still a
file descriptor sitting around (you’ll need to close it yourself when
you’re done with it).
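A small sketch of that pattern, assuming a POSIX system (the function
name is invented for the example): the file object is closed, yet the
dup'd descriptor can still be mapped and read.

```python
import mmap
import os
import tempfile


def demo_dup_outlives_close():
    """Keep access to a file after the original file object is closed."""
    f = tempfile.TemporaryFile()
    f.write(b"hello world")
    f.flush()

    fd2 = os.dup(f.fileno())  # secondary descriptor we own
    f.close()                 # the original descriptor is now gone

    try:
        # Map the whole file read-only through the surviving descriptor.
        with mmap.mmap(fd2, 0, access=mmap.ACCESS_READ) as m:
            return bytes(m[:5])
    finally:
        os.close(fd2)  # our responsibility, nobody else will close it
```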
Note that if you’re not using the file outside of the original
open/close sequence, and not moving its read pointer, you don’t need to
use os.dup() at all!
The point was whether there is a simple way to remove the worst trap in 20+ year old code, not to make it fully safe. Few users will pass used files, but they may open a gzip file and pass that.
I would test stat.S_ISREG(os.fstat(f.fileno()).st_mode). If that’s
True you’ve got a regular (data) file and you’re probably just fine.
And I’d expect it to raise some exception for some non-regular files or
“pseudofiles” of whatever kind.
If you have a file-like object giving you uncompressed gzip data, I
don’t expect there to be a working f.fileno(). (I could be wrong, some
gunzipping wrapper might keep the fileno hanging around.)
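The check above could be wrapped up roughly like this. The helper name
is_regular_file is hypothetical, and treating a missing or failing
fileno() as "not a regular file" is an assumption of this sketch, not
something settled in the thread:

```python
import os
import stat


def is_regular_file(f):
    """Return True if f is backed by a regular on-disk file."""
    try:
        fd = f.fileno()
    except (AttributeError, OSError, ValueError):
        # No fileno() at all, or one that refuses to work
        # (io.UnsupportedOperation is both an OSError and a ValueError).
        return False
    return stat.S_ISREG(os.fstat(fd).st_mode)
```

One caveat worth testing against: gzip.GzipFile.fileno() forwards to the
underlying file object's fileno(), so a gzip file opened from disk will
pass this check even though the descriptor refers to the compressed
bytes. An in-memory object such as io.BytesIO fails it.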
Is that good API? Maybe not from that point of view. Although, there
are some convenient things like support for memory-mapping a file’s data
based on a kwarg, things that will get cluttered (up to hard) if you
force a fileno/path. (The point of which is: I am not willing to
deprecate that, at least not at this point.)
I think I need to see the source code, or more explanation; I still
don’t understand in enough detail, I think.
Cheers,
Cameron Simpson cs@cskk.id.au