I have JSON data, one object per line, that looks like this:
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}
Objective: write a function that returns every distinct set of lines sharing the same values of "processId", "userId" and "reportName".
So, in this particular example, the function should return three different sets.
Set1 ("processId":"p1","userId":"user1","reportName":"report1"):
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12234", "some_other_keys":"respective values"}
{"userId":"user1","processId":"p1","reportName":"report1","threadId":"12335", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report1","threadId":"12834", "some_other_keys":"respective values"}
Set2 ("processId":"p1","userId":"user1","reportName":"report2"):
{"reportName":"report2","processId":"p1","userId":"user1","threadId":"12434", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12934", "some_other_keys":"respective values"}
{"processId":"p1","userId":"user1","reportName":"report2","threadId":"12634", "some_other_keys":"respective values"}
Set3 ("processId":"p1","userId":"user2","reportName":"report1"):
{"threadId":"12734", "some_other_keys":"respective values", "processId":"p1","userId":"user2","reportName":"report1"}
{"processId":"p1","reportName":"report1","threadId":"12534", "some_other_keys":"respective values","userId":"user2"}
So a single call returns three sets here; the number can be higher or lower depending on how many distinct combinations of the three values occur.
I need a solution that is (a) fast and (b) short, because I'll be processing a large number of lines.
I already have a solution that uses multiple if conditions and for loops (I'm using Python's json module to parse each line and read the fields), but I'd like something more efficient.
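For reference, the grouping described above could be sketched as a single pass that collects lines in a dictionary keyed by the three values. This is only a minimal sketch, not a benchmarked solution: it assumes one JSON object per line and that every line contains all three keys. The names `group_lines` and `KEY_FIELDS` are made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical constant: the fields whose combined values define a set.
KEY_FIELDS = ("processId", "userId", "reportName")


def group_lines(lines, key_fields=KEY_FIELDS):
    """Group raw JSON lines by the values of key_fields.

    Returns a dict mapping (processId, userId, reportName) tuples to the
    list of original lines that carry those values. Key order inside each
    JSON object does not matter, because fields are looked up by name.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: all key fields are present; a missing one raises KeyError.
        key = tuple(record[field] for field in key_fields)
        groups[key].append(line)
    return dict(groups)
```

Because `lines` can be any iterable of strings, an open file object can be passed directly, so the whole input never has to be held in memory as a list. Each line is parsed once and placed with one dictionary lookup, so the work grows linearly with the number of lines.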