How to summarize API call?

Castor · March 9, 2023, 3:42pm

I have written some Python code that integrates the new gpt-3.5-turbo API from OpenAI.

The code reads in a document of approximately 6000 characters, but as ChatGPT can only handle around 2000 characters at a time, the code splits up the code into chunks of 2000 characters, which are then sent to the API. (see screenshot)

This ‘chunking’ works very well. Previously, I used a library to summarize the content from 6000 characters down to 2000. However, I felt too much information was being lost, as only a third of the original characters were left - hence ‘chunking’.

However, the advantage of the summarization approach was that I could prompt the API to list the 10 main points of the content, as the API only had to process the prompt once.

Now if I do this, it will give me 10 main points of ‘Chunk 1’, 10 main points of ‘Chunk 2’ and so forth.

My question is: any ideas how I can have all the output summarized into a single list of 10 points? I know I can just copy/paste it into the ChatGPT web interface, but that defeats the purpose of using automation.

I hope I was more or less able to explain my scenario.

Thanks in advance for all responses.

steven.daprano · March 9, 2023, 8:19pm

Please don’t post screenshots of code, unless you use Photoshop to write your programs.

Copy and paste the code into your post as text, formatted correctly.

I presume you mean that the code splits up the document you have read into chunks.

I think you need to ask the ChatGPT people how you can overcome the 2000 character limit on their API. This is not a Python problem, it is a ChatGPT problem and has absolutely nothing to do with Python.

Maybe you could ask ChatGPT how to use the ChatGPT API to summarise a 6000 word document?

Or maybe you could ask for the main three points (not ten) of the first and last chunk, and the main four points of the middle chunk. That will give you ten points altogether.
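A minimal sketch of that idea, generalised to any number of chunks. The function names and the prompt wording are made up for illustration, and nothing here calls the API. Note that this version hands the spare points to the earliest chunks, so three chunks get 4, 3 and 3 points rather than 3, 4 and 3.

```python
def points_per_chunk(total_points: int, n_chunks: int) -> list[int]:
    """Split total_points across n_chunks as evenly as possible."""
    if n_chunks < 1:
        raise ValueError("need at least one chunk")
    base, extra = divmod(total_points, n_chunks)
    # The first `extra` chunks each get one additional point.
    return [base + 1 if i < extra else base for i in range(n_chunks)]


def build_prompts(chunks: list[str], total_points: int = 10) -> list[str]:
    """Build one prompt per chunk; the wording is only an example."""
    counts = points_per_chunk(total_points, len(chunks))
    return [
        f"List the {n} main points of the following text:\n\n{chunk}"
        for n, chunk in zip(counts, chunks)
    ]
```

Each prompt would then be sent to the API separately and the replies concatenated. The drawback is that every chunk is treated as equally important, which may not match the document.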

Or you could get the 30 main points from the three chunks and pass them back to ChatGPT and ask it to summarise those 30 points into 10.
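That two-pass approach might look roughly like the sketch below. Here `ask` is a hypothetical stand-in for whatever function in your code sends a prompt to the API and returns the reply as a string; the prompt wording is also just an assumption.

```python
from typing import Callable


def summarize_in_two_passes(
    chunks: list[str],
    ask: Callable[[str], str],
    points_per_chunk: int = 10,
    final_points: int = 10,
) -> str:
    """Summarise each chunk, then summarise the combined summaries."""
    # First pass: one request per chunk.
    partial_summaries = [
        ask(f"List the {points_per_chunk} main points of the following text:\n\n{chunk}")
        for chunk in chunks
    ]
    # Second pass: one request over all the collected points.
    combined = "\n".join(partial_summaries)
    return ask(
        f"Condense the following points into the {final_points} "
        f"most important points:\n\n{combined}"
    )
```

The combined points from the first pass have to fit within the same size limit as any other request, so with many chunks you may need to ask for fewer points per chunk or repeat the second pass.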

rob42 · March 9, 2023, 11:16pm

I can see a situation where f.read(2000) is going to return an incomplete word (say, the first three letters of a seven-letter word that just happens to fall at the end of the 2000-character limit). Does your code account for that?
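One way to avoid that, as a rough sketch: read the whole file first and build the chunks word by word, so that a chunk never ends in the middle of a word. The function name is made up, and this version collapses newlines and repeated spaces into single spaces, which may or may not matter for your document.

```python
def split_on_whitespace(text: str, limit: int = 2000) -> list[str]:
    """Split text into chunks of at most `limit` characters without cutting words."""
    chunks = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than `limit` is kept whole,
            # so that one chunk can exceed the limit.
            current = word
    if current:
        chunks.append(current)
    return chunks
```

For example, `split_on_whitespace("aaa bbb ccc", limit=7)` gives `["aaa bbb", "ccc"]`.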