Script required

Hi folks. First post here. I am looking for a python script to rename files in a folder based on various rules. I have a VBA version that works but it’s slow and I want to scale this up, and I think python might be the right tool for the job.

Is it okay to post this kind of request here? Please delete if not, but read on if you are able to help.

All files will be in a single folder and there are three different file types.

  1. PDF. These just need renamed by removing and reordering some parts of the existing name
  2. Invoice header csv. These need to be renamed using the invoice number which is inside the file. The file is csv format with quotes around strings. The invoice number is in row 2 column six and the file needs renamed as invoicenumber-header.csv
  3. Invoice details csv. Same as above. The invoice number is in row 2 column six and the file needs to be renamed as invoicenumber-details.csv

Thanks in advance for any help.

Rgds
Peter.

Hi folks. First post here. I am looking for a python script to rename
files in a folder based on various rules. I have a VBA version that
works but it’s slow and I want to scale this up, and I think python
might be the right tool for the job.

Is it okay to post this kind of request here? Please delete if not, but read on if you are able to help.

Sure. But because we don’t do homework (which this may not be) and in
order to help you learn, we won’t write the whole script. We’ll help you
with suggestions and address queries.

All the Python functions and modules mentioned below are described in
the Python documentation:

https://docs.python.org/3/

Have that open. Both the “modules” and “index” pages are particularly
useful for finding things.

All files will be in a single folder and there are three different file types.

  1. PDF. These just need renamed by removing and reordering some parts of the existing name
  2. Invoice header csv. These need to be renamed using the invoice number which is inside the file. The file is csv format with quotes around strings. The invoice number is in row 2 column six and the file needs renamed as invoicenumber-header.csv
  3. Invoice details csv. Same as above. The invoice number is in row 2
    column six and the file needs to be renamed as
    invoicenumber-details.csv

Suggestion 1: give your script a “no action” mode. Renaming files can be
destructive when you make mistakes, so have the no action mode just
print the suggested change. Later, apply the change (by changing modes):

import os
from os.apth import eixsts

# change this later, maybe with a command line option
doit = False

def rename_file(filename, doit):
    ... decide what the new name should be based on your rules ...
    if new_filename != filename:
        print("rename", filename, "=>", new_filename)
        if exists(new_filename):
            print("new filename already exists, rejecting!")
        elif doit:
            os.rename(filename, new_filename)

The easiest way to read a single folder is os.listdir:

import os

foldername = 'your-folder'here'
for filename in os.listdir(foldername):
    ... process the filename ...

Note that you get the bare names from listdir, do the actual filename
will be foldername/filename:

from os.path import join

real_filename = join(folder_name, filename)

but for ease of running your rules you probably want to work against the
filename itself. Then add in the foldername when you decide to do the
rename (since it will need the real pathname).

There’s a useful method in os.path called splitext:

from os.path import splitext

filebase, ext = splitext(filename)

After that, ext will be ‘.csv’ or whatever. filebase will be the part
before the file extension.

Start with that and make a little script, then come back with questions.

You may find the “re” module useful for matching what’s in filebase.

Finally, you sound like you’re using Windows. In Windows the path
separator is the backslash (‘\’). Which is also special in Python
strings. So use “raw” strings when writing literal Windows paths because
of the conflict:

foldername = r'C:\Users\blah\this\that'

Do the same with regular expressing (in the 're" module) because
backslashes are special in those, too.

Cheers,
Cameron Simpson cs@cskk.id.au

Hi Peter, and welcome!

Sorry, your request is unclear to me. Are you:

  • offering to hire somebody to write this script for you?

  • hoping somebody will do it for free?

  • looking for some hints how to write it yourself?

  • or just looking for reassurance that Python is capable of this?

If the last, the answer is definitely Yes.

A sketch of one possible solution might be:

get the list of file names ending with ".pdf" in the given folder,
and rename them;

get the list of file names ending with ".csv", read them and decide 
what action to take.

As with most such programming tasks, the challenge is mostly deciding on
how to deal with errors, and in particular, to avoid overwriting
existing files by accident. How do you deal with that in your VB script?

Here are some useful libraries you may want to look at:

https://docs.python.org/3/library/glob.html#module-glob

Here is a sketch of one solution:

import csv
import glob
import os

for oldname in glob.glob(path, "*.pdf"):
    newname = ...
    os.rename(oldname, newname)

for filename in glob.glob(path, "*.csv):
    with open(filename) as csvfile:
        invoicefile = csv.reader(csvfile, newline='')
        # Skip the first row.
        ignore = next(invoicefile)
        secondrow = next(invoicefile)
        sixthcolumn = secondrow[5]  # Python starts counting at 0
        invoice_number = int(sixthcolumn)
    newname = ...
    os.rename(filename, newname)

I leave the calculation of the new file names as an exercise :slight_smile:

Hi both,

Thanks for the help. Maybe I could have been clearer - I am looking for help, as a Python newbie, to create the script myself. Googling brings up plenty of options, and it’s good to get some suggestions of the best way to approach it. Happy to take what you have given me and work on the fine details myself. Of course, I will probably be back with more questions :slight_smile: !! Will be interesting to see how much more quickly this task can be done with python.

Thanks,
Peter.

Your questions will be welcome. Include your code inline in the message
between triple backtick lines, like:

```
your code here
```

and if you’re getting an exception/failure you don’t udnerstand, include
the whole traceback likewise: inline text.

Cheers,
Cameron Simpson cs@cskk.id.au