Unable to append data to Json array object with desired output

I tried getting help for the same issue on Stack Overflow but got no replies. I’m re-posting here in the hope that someone can guide me, as the delay is keeping me from pushing the code to the repository.

My code


import json
import re
from http.client import responses

import vt
import requests

with open('/home/asad/Downloads/ssh-log-parser/ok.txt', 'r') as file:
    file = file.read()

pattern = re.compile(r'\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}')
ips = pattern.findall(file)
unique_ips = list(set(ips))

# print(unique_ips)

# print(len(unique_ips))

headers = {
    "accept": "application/json",
    "x-apikey": "9765ba5d9fd52f5747dde5240606019f83d32758cb664abc63a43488aa42812d"
}
i = 0
url = "https://www.virustotal.com/api/v3/ip_addresses/"

# messages = []

f = open('formater.json')

# returns JSON object as

# a dictionary

data = json.load(f)
f.close()
no = 0
while i < len(unique_ips):

    furl = url + str(unique_ips[i])
    # response = requests.get(furl, headers=headers)
    
    # data_ = response.json()
    
    # print(data_)
    
    # messages = [data_['data']['attributes']['last_analysis_results']]
    messages = [data['data']['attributes']['last_analysis_results']]
    
    y = json.dumps(messages)
    y1 = json.loads(y)
    
    # print(y1)
    a = []
    r = []
    v = []
    cnt = 0
    store = len(y1[0])
    out_json_new = []
    out_json1 ={}
    
    
    for o in y1:
            for k, vv in o.items():
    
                a_ = vv['result']
                a.append(a_)
    
                r_ = vv['engine_name']
                r.append(r_)
    
                v_ = vv['category']
    
                v.append(v_)
    
                out_json = {
                    "indicators": [{
                        "value": str(unique_ips[i]),
                        "type": 'ip',
    
                    }]
                }
    
                out_json1 ={
                                       "providers":[{
                                       "provider": str(r),
                                                 "verdict": str(a),
                                                 "score": str(v)
    
                                    }]             }
    
                out_json1['providers'].append(out_json1)
    
    i += 1
    
    print(out_json,out_json1)

# for aaa in a:
# print(a[0])

Outputs as

{'indicators': [{'value': '192.91.72.201', 'type': 'ip'}]} {'providers': [{'provider': "['Bkav', 'CMC Threat Intelligence', 'CMC sarah ']", 'verdict': "['clean', 'legs', 'hate']", 'score': "['harmless', 'harmless', 'sarah']"}, {...}]}
 {'indicators': [{'value': '192.91.72.101', 'type': 'ip'}]} {'providers': [{'provider': "['Bkav', 'CMC Threat Intelligence', 'CMC sarah ']", 'verdict': "['clean', 'legs', 'hate']", 'score': "['harmless', 'harmless', 'sarah']"}, {...}]}

I want to change the output to this format.

 {
 "providers":[
 {
 "provider":"['Bkav']",
 "verdict":"['clean']",
 "score":"['harmless']"
 },
 {
 "provider":"['CMC Threat Intelligence']",
 "verdict":"['clean']",
 "score":"['harmless']"
 },
 {
 "provider":"['CMC sarah']",
 "verdict":"['hate']",
 "score":"['harmless']"
}
 ]
 }

My current code groups everything under one key (e.g. provider); instead, the entries should be appended one after another, as in the output above. I tried to use append logic but it’s not working as I intended. With the following append call, the output is:

            `out_json1['providers'].append(out_json1)`

{'indicators': [{'value': '192.91.72.101', 'type': 'ip'}]} {'providers': [{'provider': "['Bkav', 'CMC Threat Intelligence', 'CMC sarah ']", 'verdict': "['clean', 'legs', 'hate']", 'score': "['harmless', 'harmless', 'sarah']"}, {...}]}

RELEVANT FILES

In order to run the code these files are required.
ok.txt

Aug 22 09:45:08 ip-170-32-23-64 sshd[1546]: Invalid user HPSupport from 192.91.72.201
Aug 22 09:45:08 ip-170-32-23-64 sshd[1546]: Invalid user HPSupport from 192.91.72.101

formater.json

{
"data": {
"attributes": {
"country": "US",
"last_analysis_stats": {
"harmless": 86,
"malicious": 0,
"suspicious": 0,
"undetected": 0,
"timeout": 0
},
"last_analysis_results": {
"Bkav": {
"category": "harmless",
"result": "clean",
"method": "blacklist",
"engine_name": "Bkav"
},
"CMC Threat Intelligence": {
"category": "harmless",
"result": "legs",
"method": "blacklist",
"engine_name": "CMC Threat Intelligence"
},
"CMC sarah ": {
"category": "sarah",
"result": "hate",
"method": "you",
"engine_name": "CMC sarah "
}
}
}
}
}

I haven’t tried running your code. When asking questions online, try to condense your problem description down to the bare minimum needed to demonstrate your problem.

However, it looks like you are appending a dictionary to itself. Instead of

out_json1['providers'].append(out_json1)

try

out_json1['providers'].extend(out_json1['providers'])

Does that get you closer to what you want?
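
To see where the {...} in the printed output comes from, here is a minimal sketch of a dict being appended to its own list. The name record is a hypothetical stand-in for out_json1, not something from the original code.

```python
import json

# Hypothetical stand-in for out_json1: a dict holding a list under "providers".
record = {"providers": [{"provider": "Bkav"}]}

# Appending the dict to its own list makes the structure refer to itself.
record["providers"].append(record)

# The second list element is the very same object as the outer dict.
is_self_reference = record["providers"][1] is record

# repr() marks the cycle with {...} instead of recursing forever.
text = repr(record)
print(text)

# json.dumps() refuses to serialise a circular structure.
try:
    json.dumps(record)
    dumps_failed = False
except ValueError:
    dumps_failed = True
print(dumps_failed)
```

So the {...} is Python's way of printing a container that contains itself, which suggests the append call is adding the wrong object rather than a new per-provider dict.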


Skipping ahead to your current and desired outputs:

 {'indicators': [{'value': '192.91.72.201', 'type': 'ip'}]} {'providers': [{'provider': "['Bkav', 'CMC Threat Intelligence', 'CMC sarah ']", 'verdict': "['clean', 'legs', 'hate']", 'score': "['harmless', 'harmless', 'sarah']"}, {...}]}
 {'indicators': [{'value': '192.91.72.101', 'type': 'ip'}]} {'providers': [{'provider': "['Bkav', 'CMC Threat Intelligence', 'CMC sarah ']", 'verdict': "['clean', 'legs', 'hate']", 'score': "['harmless', 'harmless', 'sarah']"}, {...}]}

Desired:

 {
 "providers":[
 {
 "provider":"['Bkav']",
 "verdict":"['clean']",
 "score":"['harmless']"
 },
 {
 "provider":"['CMC Threat Intelligence']",
 "verdict":"['clean']",
 "score":"['harmless']"
 },
 {
 "provider":"['CMC sarah']",
 "verdict":"['hate']",
 "score":"['harmless']"
 }
 ]
 }

Your desired output is JSON format. But your current output isn’t
JSON. It is actually Python dicts and lists from this print()
call:

 print(out_json,out_json1)

So what’s going on is a little confusing, because they’re quite similar.
It isn’t helped by the fact that you’ve named your variables out_json
and out_json1 and similar. They’re not JSON, they’re plain old
Python dicts. But the names will be misleading you.

So let’s hardwire a small example:

 import json
 d = {'a': 1, 'b': [3,4,5]}
 print(d)
 print(json.dumps(d))

Let’s dissect that:

 import json

Import the json module, for obvious reasons.

 d = {'a': 1, 'b': [3,4,5]}

Define a small dict, with an entry with key 'a' whose value is the
value 1 and an entry with key 'b' whose value is the list [3,4,5].

Now, the dict isn’t “in JSON format”, or any format really. It is just
a data structure. But the Python syntax for writing out such a dict
directly looks a bit JSONish:

 {'a': 1, 'b': [3,4,5]}

If you wrote that value in JSON format (see the Wikipedia article on JSON)
you might write:

 {"a": 1, "b": [3,4,5]}

and the most obvious difference is the quote marks: JSON only accepts
double quotes around strings, because it is modelled on JavaScript
rather than Python. Anyway, JSON’s a file format for
reading/writing values. The variable d itself isn’t “in JSON format”
or “Python syntax”; it is just a value.

Let’s get to the output:

 print(d)

When you call print(), it converts each of its arguments to a string,
and writes the strings with a space between each string. The variable
d is a dict, and str(d) is the same as repr(d) which is the
Python syntax for the dict. Thus the Pythonesque output.
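
A small sketch of that behaviour, capturing what print() writes into an in-memory buffer so it can be inspected. The buffer and the extra list argument are only there for illustration.

```python
import io

d = {'a': 1, 'b': [3, 4, 5]}

# For a dict, str() and repr() produce the same Python-syntax text.
same = str(d) == repr(d)

# Capture what print() writes: each argument is converted with str()
# and the pieces are joined with a single space, then a newline.
buf = io.StringIO()
print(d, [1, 2], file=buf)
printed = buf.getvalue()

print(same)
print(printed, end="")
```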

Now the second print():

 print(json.dumps(d))

When that runs, it prints this:

 {"a": 1, "b": [3, 4, 5]}

That’s because json.dumps(d) returns a string with the value in JSON
format. And print converts that string into… exactly the same string,
because it’s already a string.
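
As a quick check of that point, the sketch below shows that json.dumps() hands back a str and that json.loads() turns it back into ordinary Python objects. The variable names encoded and decoded are just for illustration.

```python
import json

d = {'a': 1, 'b': [3, 4, 5]}

encoded = json.dumps(d)        # a str containing JSON text
decoded = json.loads(encoded)  # back to plain Python dicts and lists

print(type(encoded).__name__)
print(encoded)
print(decoded == d)
```

This is also why a json.dumps() call followed straight away by json.loads(), as in the original loop, gives back a structure equal to the one you started with for data like this.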

You want line breaks in your output. If you look at the documentation for
json.dumps, you’ll see it has indent and separators parameters. Have a
fiddle with supplying values for those parameters in your json.dumps()
call and see how things look.
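
As a rough sketch of how the formatting parameters of json.dumps() change the output: indent adds line breaks and indentation, while separators controls the punctuation between items and between keys and values. The names compact and pretty are hypothetical.

```python
import json

d = {'a': 1, 'b': [3, 4, 5]}

# separators=(item_separator, key_separator): drop the default spaces.
compact = json.dumps(d, separators=(",", ":"))

# indent=2: one item per line, nested levels indented by two spaces.
pretty = json.dumps(d, indent=2)

print(compact)
print(pretty)
```

Both strings describe the same value; only the whitespace differs.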

Cheers,
Cameron Simpson cs@cskk.id.au


Here is a way to get the output you want. I have removed all the file handling, url getting, regexing, etc. that isn’t relevant to your actual problem:

import json

data = {
    "data": {
        "attributes": {
            "last_analysis_results": {
                "Bkav": {
                    "category": "harmless",
                    "result": "clean",
                    "engine_name": "Bkav",
                },
                "CMC Threat Intelligence": {
                    "category": "harmless",
                    "result": "legs",
                    "engine_name": "CMC Threat Intelligence",
                },
                "CMC sarah ": {
                    "category": "sarah",
                    "result": "hate",
                    "engine_name": "CMC sarah ",
                },
            },
        },
    },
}

analysis = data["data"]["attributes"]["last_analysis_results"]
providers = {"providers": []}

for k, v in analysis.items():
    providers["providers"].append(
        {
            "provider": str([v["engine_name"]]),
            "verdict": str([v["result"]]),
            "score": str([v["category"]]),
        },
    )

print(json.dumps(providers, indent=2))

Output:

{
  "providers": [
    {
      "provider": "['Bkav']",
      "verdict": "['clean']",
      "score": "['harmless']"
    },
    {
      "provider": "['CMC Threat Intelligence']",
      "verdict": "['legs']",
      "score": "['harmless']"
    },
    {
      "provider": "['CMC sarah ']",
      "verdict": "['hate']",
      "score": "['sarah']"
    }
  ]
}
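
If the same structure is needed for every IP address in the original loop, one possible way to combine the indicators and providers parts is sketched below. The helper build_report and the trimmed analysis sample are hypothetical, and the same sample is reused for both addresses just as the original code reuses formater.json. Storing plain strings rather than stringified one-element lists is also an assumption.

```python
import json

# Trimmed sample in the shape of last_analysis_results from formater.json.
analysis = {
    "Bkav": {"category": "harmless", "result": "clean", "engine_name": "Bkav"},
    "CMC Threat Intelligence": {
        "category": "harmless",
        "result": "legs",
        "engine_name": "CMC Threat Intelligence",
    },
}


def build_report(ip, analysis):
    """Hypothetical helper: build one report dict for a single IP address."""
    return {
        "indicators": [{"value": ip, "type": "ip"}],
        # One new dict per engine. Plain strings are an assumption;
        # wrap each value in str([...]) to match the format shown above.
        "providers": [
            {
                "provider": entry["engine_name"],
                "verdict": entry["result"],
                "score": entry["category"],
            }
            for entry in analysis.values()
        ],
    }


unique_ips = ["192.91.72.201", "192.91.72.101"]
reports = [build_report(ip, analysis) for ip in unique_ips]
print(json.dumps(reports, indent=2))
```

Building a fresh dict per engine inside the comprehension avoids both the grouped lists and the self-reference seen in the original output.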

Sorry for the late reply. Your code helped me correct the issue with my logic, which was appending the data in the wrong way. The providers["providers"].append( call was missing from my logic: I was trying to append after the insertion, while you do it inline as part of the main loop. Thank you once again for your assistance.

Thank you for your explanation. Now I know most of my errors were due to bad declarations, and that relying on print instead of json.dumps was a mistake.