I am using PyPDF2 to extract text from PDFs and process it. I ran into a problem with a for loop: it does not print the expected output. Here is my code:
from PyPDF2 import PdfReader
from PyPDF2 import PdfFileWriter
from PyPDF2 import PdfWriter
pdf = PdfReader('Test.pdf')
print(len(pdf.pages))
page = pdf.pages[1]
Text = str(page.extract_text())
No = int(Text.count("Ans:"))
for No in range(1,No+1):
    if "Ans: (a)" in Text or "Ans: (a)" in Text:
        print("Found a ")
        Text.replace("Ans: (a)", "No Answer")
    elif "Ans: (b)" in Text or "Ans: (b)" in Text:
        print("Found b")
        Text.replace("Ans: (b)","No Answer")
    elif "Ans: (c)" in Text or "Ans: (c)" in Text:
        print("Found c")
        Text.replace("Ans: (c)","No Answer")
    elif "Ans: (d)" in Text or "Ans: (d)" in Text:
        print("Found d")
        Text.replace("Ans: (d)","No Answer")
    else:
        print("Error")
print(Text)
What I want this program to do is find every Ans: (a), Ans: (b), Ans: (c), and Ans: (d) in the text and replace it with "No Answer", so that an answer key document for a test is converted into a question paper.
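For reference, the intended transformation can be sketched on a short hard-coded string. This is only a minimal sketch under assumptions: `sample`, `ANSWER_PATTERN`, and `strip_answers` are hypothetical names, the sample is a stand-in for the text returned by `extract_text()`, and every marker is assumed to have the form `Ans:` followed by one letter from a to d in parentheses.

```python
import re

# Hypothetical stand-in for the text returned by page.extract_text()
sample = "Sol: ... Ans: (c) - Let ... Sol: ... Ans: (d) - In the set"

# "Ans:", optional whitespace, then one option letter (a-d) in parentheses
ANSWER_PATTERN = re.compile(r"Ans:\s*\([a-d]\)")


def strip_answers(text: str, placeholder: str = "No Answer") -> str:
    """Return a copy of text with every answer marker replaced by placeholder."""
    # re.sub returns a new string; the input string itself is not modified
    return ANSWER_PATTERN.sub(placeholder, text)


print(strip_answers(sample))
```

The function returns the converted text instead of changing its argument, so the caller has to keep the returned value.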
The text extracted by PyPDF2:
Multiple Choice Questions with one correct answer. A correct answer carries 1 mark. No negative
mark. 60 x 1 = 60
- Which of the following is an equivalence relation?
(a)
ab (b)
ab
(c)
ab− is divisible by
5 (d)
a divides
b
Sol:
ab− is divisible by
5 is the only option satisfying reflexive, symmetric, transitive.
Hence it is an equivalence relation.
Ans: (c) - Let
, A x y z= then an equivalence relation on
A is
(a)
()()()() 1 , , , , , , , R x y y z x z x x= (b)
()()()() 2 , , , , , , , R z y z x z z y y=
(c)
()()()() 3 , , , , , , , R x x y y z z x y= (d) None of these
Sol:
1R is not reflexive since
(),y y R
2R
is not reflexive since
(),x x R
3R
is not symmetric since
(),y x R
Ans: (d) - In the set
6,7,8,9,10 A= a relation
R is defined by
() , : , and a , R a b a b A b= then
R is
(a) Reflexive (b) Symmetric (c)Transitive (d) None of these
Sol:
ab is not possible hence not reflexive.
a b b a
hence not symmetric relation.
, a b b c a c
is transitive relation.
Ans: (c) - The relation
()()() 4,4 , 5,5 , 6,6 R= on the set
4,5,6 is
(a) transitive only (b) an equivalence relation
(c) reflexive only (d) symmetric only
Sol:
()()() 4,4 , 5,5 , 6,6 R=
It is an equivalence relation
Ans: (b) - The range of function
()3
3xfxx−=− is
(a)
R (b)
1R− (c)
1− (d)
1 R−−
Sol: Let
()33133xxyxx−−= = =−− − −
Ans: (c) - The domain of the function
()()2log 2 4 f x x x= − + − is
(a)
2, 2− (b)
()2, (c)
()0, 2 (d)
(,2− −
As you can see, the Ans: markers are extracted from the PDF successfully.
But when I try to identify them in the extracted text with my code, it only ever finds Ans: (b), hence the output:
Found b
Found b
Found b
Found b
Found b
Could anyone please tell me what is going wrong? Thank you!