Reading a .wav file

Hello,
I’m relatively new to Python. I’ve a lot of experience in C and C++ and I’ve worked with WAVE files in C before. Now I want to open and read a .wav file in Python. I can open the file and read the text headers for RIFF and WAVE and such. Where I run into problems is that I need to read in the binary values and have them end up in a list. These values are 16-bit integers (shorts), 2 bytes each, in little-endian format. Do I have to use something like pickle?
Joe

Hm, doesn’t the wave module (“wave: Read and write WAV files” in the Python 3.9.1 documentation) fulfill your needs? Note that the source is pure Python.
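
For reference, here is a minimal sketch of what the wave module gives you. To stay self-contained it first writes a tiny 16-bit mono file into an in-memory buffer (the sample values and the 44100 Hz rate are made up for illustration), then reads the header fields and the raw frames back:

```python
import io
import struct
import wave

# Build a tiny WAV in memory so the sketch needs no file on disk.
samples = [0, 1000, -1000, 32767, -32768]   # illustrative values only
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)        # mono
    out.setsampwidth(2)        # 2 bytes per sample = 16 bit
    out.setframerate(44100)
    out.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# Read it back the same way you would read a file opened by name.
buf.seek(0)
with wave.open(buf, "rb") as w:
    channels = w.getnchannels()
    sample_width = w.getsampwidth()
    frame_rate = w.getframerate()
    n_frames = w.getnframes()
    frames = w.readframes(n_frames)   # raw bytes, not numbers yet

print(channels, sample_width, frame_rate, n_frames, len(frames))
```

Passing a filename instead of the buffer to wave.open works the same way.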

See also

https://docs.python.org/3/library/chunk.html

and

https://docs.python.org/3.10/library/audioop.html#audioop.byteswap

Thanks, I’ll take a look. I didn’t know such a thing existed. But the readframes returns a byte object. I’m still having trouble wrapping my mind around Python’s typing compared to what I could do in C. So bear with me. I can experiment with this a bit. But do the samples from the wave.readframes come back truly as bytes so I have to combine them into shorts for a 16 bit sample?
Since writing the above, I’ve tried the wave module and have had success with most of the methods, like getnframes. But I’m having difficulty with wave_read.readframes(n). I’ve looked up the bytes object and tried to understand how to get it working, but with no success.

Thanks, I’ll take a look. I didn’t know such a thing existed.

Me either!

But the readframes returns a byte object. I’m still having trouble
wrapping my mind around Python’s typing compared to what I could do in
C.

Think of it as a read only unsigned char.
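
A small sketch of that analogy, using made-up byte values: indexing a bytes object gives plain ints from 0 to 255, and the object cannot be modified in place (bytearray is the writable counterpart).

```python
data = bytes([0x10, 0xFF, 0x00, 0x80])   # illustrative values

first = data[0]    # indexing gives an int in range(256)
high = data[1]     # 0xFF reads as 255, never -1: the values are unsigned

try:
    data[0] = 0    # bytes is immutable, so item assignment fails
except TypeError:
    mutable = bytearray(data)   # copy into a writable buffer instead
    mutable[0] = 0

print(first, high, list(mutable))
```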

So bear with me. I can experiment with this a bit. But do the samples
from the wave.readframes come back truly as bytes so I have to combine
them into shorts for a 16 bit sample?

Having glanced at the doc URL, I’d say so. So you want the struct
module:

https://docs.python.org/3/library/struct.html#module-struct

Probably the iter_unpack function which will yield stuff as required.

Note that the struct module’s error messages are a bit cryptic, usually
around the struct format strings, which lay out what C-like structure
you want to read. The leading character is a ‘>’ indicating big-endian
or ‘<’ indicating little-endian. Then the following characters specify
ints etc of various sizes, all of the same endianness.

All these functions take a buffer, which is any object supporting the
“buffer protocol”. The bytes and bytearray classes are the usual
examples of these, so you’re good with the frames you get from the wave
module.
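
A minimal sketch of iter_unpack on 16-bit little-endian data. The bytes here are packed by hand to stand in for what readframes would return, and the sample values are made up:

```python
import struct

# Four little-endian 16-bit samples standing in for readframes() output.
frames = struct.pack("<4h", 0, 1, -1, 12345)

# "<h" = little-endian signed short. iter_unpack yields one tuple per
# repetition of the format, so each item here is a 1-tuple.
samples = [value for (value,) in struct.iter_unpack("<h", frames)]

# For interleaved stereo data, "<hh" yields (left, right) pairs instead.
stereo = list(struct.iter_unpack("<hh", frames))

print(samples)
print(stereo)
```

The buffer length has to be an exact multiple of the format size, otherwise struct raises struct.error.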

Cheers,
Cameron Simpson cs@cskk.id.au

Many thanks for all the help. I finally got something to work. My goal was to read in a waveform so I could do statistics on it. I’ve gotten the first part of reading the data. I thought I’d share my solution but the upload says I can’t send that type of file. How do you folks share Python scripts?

Just paste it inline into the message between triple backtick markers: one line of three backticks before the code and another after it.

We’d love to see it.

Cheers,
Cameron Simpson cs@cskk.id.au

Paste the code, select it and mark it up with the </> button in the edit toolbar.

Thanks. Keep in mind that this was the first shot and I’m still learning. The wave file I read was made using Audacity to generate one cycle of 1000Hz so that I could view it as a plot and know that I had read it correctly. The file I will do statistics on is one I recorded from the wife’s metronome. She thought it was not regular. I’m curious to see how irregular it is. So… here’s the code

    import wave
    import matplotlib.pyplot as plt
    import struct
    file_name = 'sample-1000-1ms.wav' #file containing 1 cycle of 1000Hz

    w = wave.open(file_name,mode='rb')

    ch = w.getnchannels()
    BytesPerSample = w.getsampwidth()
    FrameRate = w.getframerate()
    NumberFrames = w.getnframes()   #number of samples
    Frames = bytes() #next instruction returns a bytes object
    Frames = w.readframes(NumberFrames)
    print(NumberFrames)
    Wave_Form = struct.unpack('44h',Frames)  #converts from bytes to shorts

    print(Frames)
    print(Wave_Form)
    x = range(0,44,1)
    plt.plot(x,Wave_Form)  #plots the wave

    w.close()

Note that there’s a ‘magic’ number in there of 44. That was the number of samples or frames in my wave file. I’ll need to fix that to make it more general. It was nice to find the wave module so that I could better understand how it works. I found that when a wave object is instantiated, all the important numbers are gathered in the constructor. My biggest problem was trying to understand the unpacking, but found examples to go by.
Thanks everyone.

Thanks for the code. Some minor remarks below…

As a matter of the common style, in Python we tend to use lower case
names for variables. CamelCase tends to be used for class names.

You could use “NumberFrames” to compute the number of samples, yes? Then
use that to size the struct format string?

Also, you’re assuming the samples are in your native byte order (on
Intel, little endian). WAV (RIFF) data is little endian regardless of
the machine, so you really should wire that into the format with a
leading ‘<’.

So maybe:

    struct_format = f'<{NumberFrames}h'
    print("struct format spec =", struct_format)
    Wave_Form = struct.unpack(struct_format,Frames)
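
Taking that one step further, here is a sketch of a general reader. The function name read_samples is hypothetical, and it assumes 16-bit PCM data. One thing to watch: a frame holds one sample per channel, so for a stereo file the sample count is the frame count times the channel count.

```python
import io
import struct
import wave

def read_samples(source):
    """Return all 16-bit samples from a WAV file as a tuple of ints.

    `source` is a filename or a binary file object. Hypothetical helper:
    it assumes 16-bit PCM and raises ValueError for other sample widths.
    """
    with wave.open(source, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        n_frames = w.getnframes()
        n_channels = w.getnchannels()
        frames = w.readframes(n_frames)
    # One frame = one sample per channel, interleaved in the byte stream.
    count = n_frames * n_channels
    return struct.unpack(f"<{count}h", frames)
```

With the file from the post this would be called as read_samples('sample-1000-1ms.wav'), and range(len(result)) replaces the hard-coded 44 for the x axis.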

You don’t need this line:

    Frames = bytes() #next instruction returns a bytes object

There are no declarations in Python. (“global” and friends aren’t really
declarations.) Any variable may refer to a value of any type, so you can
just drop that line. It actually allocates an empty bytes instance,
which you immediately discard.

In Python, values are strongly typed. Variables aren’t - they just refer
to values.
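
A short sketch of the difference, with made-up values: the same name can be rebound to values of different types, but the values themselves refuse to be mixed without an explicit conversion.

```python
frames = b"\x01\x00"                 # the name refers to a bytes value...
kind_before = type(frames).__name__
frames = [1]                         # ...and can be rebound to a list
kind_after = type(frames).__name__

# The values are strict about their types: no silent conversion happens.
try:
    result = "samples: " + 44
except TypeError as exc:
    result = f"TypeError: {exc}"

print(kind_before, kind_after)
print(result)
```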

Aside: there are type annotations, but they’re not checked at runtime;
they’re an aid to linters. (They’re kept in the programme though, and
there are modules such as “typeguard” you can use to do runtime type
checking using the annotations.)
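
To illustrate the aside, a minimal sketch with a hypothetical function named scale: the call with the wrong argument type runs without complaint, and the annotations remain available on the function object.

```python
def scale(sample: int, gain: int) -> int:
    """Multiply a sample by a gain; the annotations are hints only."""
    return sample * gain

# Python does not enforce the hints at runtime: a str is accepted here
# and simply repeated by the * operator.
unexpected = scale("ab", 2)

# The annotations are still stored on the function object.
hints = scale.__annotations__

print(unexpected)
print(hints)
```

A static type checker such as mypy is the kind of tool that would flag the scale("ab", 2) call before the program runs.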

Cheers,
Cameron Simpson cs@cskk.id.au

Thanks for the suggestions. “Any variable may refer to a value of any type” will take me a long time to get used to. My old boss was a stickler for how we wrote C. Old habits die hard.
Actually, the reason the line “Frames = bytes()” is in there is that while I was trying to get this to work I kept getting an error message telling me that readframes returned a bytes object. For whatever reason, after I declared it the error message went away. I’ll try it again though and see what happens. Maybe it was caused by something else and I just thought I fixed it.

You may want to look at type hints and type checker tools:

I am also used to writing a lot of C/C++, and I found that using type hints helps reconcile the two.

Many thanks. I took a quick look. All help appreciated.