How to create a field in a JSON string from a tuple

I have this code:

        ResponseAssets = requests.get(self.collibraRestURL + '/assets',
                                      auth=HTTPBasicAuth(self.username, self.password),
                                      params = userParams2)

        asset_data = create_response_tuple(ResponseAssets)
        if asset_data[0] != 200:
            messages.append('ERROR: Asset Data extract has encountered errors!')
            http_error(asset_data, messages, indent=indent1, debug=1)
            quit()
        result_list_Assets = asset_data[1].get("results")

        if result_list_Assets[ii]['domain'].get('id') not in technical_data_assets_set:
            modified_dictionary = {'id': '',
                               'createdBy': '',
                               'createdOn': '',
                               'lastModifiedBy': '',
                               'lastModifiedOn': '',
                               'system': '',
                               'resourceType': '',
                               'name': '',
                               'displayName': '',
                               'articulationScore': float(0),  # (highlighted)
                               'excludedFromAutoHyperlinking': bool,
                               'domain': {'id': '', 'resourceType':'', 'name':''},
                               'type': {'id':'', 'resourceType':'', 'name':''},
                               'status': {'id': '', 'resourceType': '', 'name': ''},
                               'avgRating': float(0),
                               'ratingsCount': int(0)}
            modified_dictionary['id']                            = result_list_Assets[ii]['id']
            modified_dictionary['createdBy']                     = result_list_Assets[ii]['createdBy']
            modified_dictionary['createdOn']                     = result_list_Assets[ii]['createdOn']
            modified_dictionary['lastModifiedBy']                = result_list_Assets[ii]['lastModifiedBy']
            modified_dictionary['lastModifiedOn']                = result_list_Assets[ii]['lastModifiedOn']
            modified_dictionary['system']                        = result_list_Assets[ii]['system']
            modified_dictionary['resourceType']                  = result_list_Assets[ii]['resourceType']
            modified_dictionary['name']                          = result_list_Assets[ii]['name']


            # (highlighted)
            if result_list_Assets[ii]['articulationScore']:
                modified_dictionary['articulationScore'] = result_list_Assets[ii]['articulationScore']
            else:
                modified_dictionary['articulationScore'] = 0

            modified_dictionary['displayName']                   = result_list_Assets[ii]['displayName']

The problem I have is there are records that do not have an articulationScore field in them. I tried (in the highlighted code) to create that field in the modified_dictionary, but it is not working. I don’t get any errors, but the modified_dictionary for those records does not contain the articulationScore.

How can I add that field to the modified dictionary if it does not exist in the source record?

By Michael Walter via Discussions on Python.org at 25Mar2022 19:37:

I have this code:
[…]
        result_list_Assets = asset_data[1].get("results")
        if result_list_Assets[ii]['domain'].get('id') not in ta_assets_set:
            modified_dictionary = {'id': '',
                               'createdBy': '',
                               'createdOn': '',

Remark: I’m a fan of this style for wordy dicts:

modified_dictionary = dict(
    id='',
    createdBy='',
    createdOn='',
    .......
)

How about:

modified_dictionary['articulationScore'] = result_list_Assets[ii].get('articulationScore', 0)

Cheers,
Cameron Simpson cs@cskk.id.au
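
To illustrate the `dict.get` suggestion, here is a minimal sketch using two made-up records (the values are invented for illustration and are not taken from the Collibra API). `dict.get(key, default)` returns the default when the key is absent, so the assignment creates the key in the target dictionary either way.

```python
# Two hypothetical source records; the second one lacks 'articulationScore'.
records = [
    {'id': 'a1', 'articulationScore': 87.5},
    {'id': 'b2'},
]

modified = []
for record in records:
    modified_dictionary = {
        'id': record['id'],
        # .get() returns 0 when the key is missing instead of raising KeyError
        'articulationScore': record.get('articulationScore', 0),
    }
    modified.append(modified_dictionary)

print(modified)
```

One caveat: if the key is present but its value is `None`, `.get()` returns `None` rather than the default.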

That did not work. It still blows up at a certain point because it did not put an articulationScore field in the modified_dictionary.

Can you show the full code (inside triple backticks to keep the formatting) and the full traceback? The code example shown above will always insert an articulationScore key into the modified_dictionary.

Here is the log:

Operating System: Windows_NT
Current User: M111843
Getpass User: M111843
starting main program

number of input parms : 1
Retrieving env from input, valid choices are ( "SAND" "DEV" "TEST" "TEST2" "PROD")!
Enter environment:

Environment parameter is missing, manually setting environment to SAND
Environment set to SAND
Enter a comment to uniquely identify this run:
acquire_Collibra_credentials:: 2022-03-28 06:26:10.619949
Environment set to SAND
user: APIsand00006MW
Note: plugging default filename "H:.netrc" for NETRC method!
Debug option set to: 2
extract_n_insert_catalog_entries:: 2022-03-28 06:26:11.091686
Schema used to access the database is: sandbox

 acquire_environment_datetime::                                 2022-03-28 06:26:11.858770
	 get_application::                                          2022-03-28 06:26:11.858770

Starting Program; database key values are:  SAND 2022-03-28 06:26:12.469140 

 asset_dataz::                                                  2022-03-28 06:26:12.469140
	 get_asset_data::                                           2022-03-28 06:26:12.469140

Total domain records: 79363
Total domain loops: 80
Total asset records: 533424
Total asset loops: 534
Asset Data Status Code: 200
append_environment_datetime_results:: 2022-03-28 06:28:36.385347
Insert asset data
insert_asset_data_entries:: 2022-03-28 06:28:36.388336
Traceback (most recent call last):
  File "C:\GIT\dda-datacatalog-python\save_metamodel_configuration.py", line 110, in <module>
    rc = object.Save_Collibra_env(debug=debug)
  File "C:\GIT\dda-datacatalog-python\save_metamodel_configuration.py", line 17, in Save_Collibra_env
    rc, messages = Comparison.extract_n_insert_catalog_entries(self,
  File "C:\GIT\dda-datacatalog-python\Collibra.py", line 6179, in extract_n_insert_catalog_entries
    rc, messages = asset_dataz(indent=indent+1, debug=debug)
  File "C:\GIT\dda-datacatalog-python\Collibra.py", line 5382, in asset_dataz
    rc, messages = Comparison.insert_asset_data_entries(self,
  File "C:\GIT\dda-datacatalog-python\Collibra.py", line 6316, in insert_asset_data_entries
    + "\t\t'" + str(result["articulationScore"]) + "',\n"
KeyError: 'articulationScore'

Process finished with exit code 1
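
The traceback shows the KeyError being raised in `insert_asset_data_entries` when it reads `result["articulationScore"]`, not in the code that builds `modified_dictionary`. One way to check which records reach that point without the key is sketched below; `find_records_missing_key` is a hypothetical helper and the sample list is invented to stand in for the real results.

```python
def find_records_missing_key(results, key):
    """Return the indexes of the records in `results` that lack `key`."""
    return [index for index, record in enumerate(results) if key not in record]


# Hypothetical data standing in for asset_data[1]["results"].
sample_results = [
    {'id': 'a1', 'articulationScore': 0},
    {'id': 'b2'},
    {'id': 'c3'},
]

missing = find_records_missing_key(sample_results, 'articulationScore')
print(len(missing), 'records lack the key; first few indexes:', missing[:10])
```

Running a check like this just before the insert step would show whether only a handful of records were given the default or none of them were.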

Here is the full code:

def get_asset_data(self,
                    indent=0,
                    offset=0,
                    size=17,
                    limit=5000,
                    debug=0,
                    **kwargs):

    get_asset_data_dttm = timestamp()
    if debug >= 1:
        print('\t' * indent,
              'get_asset_data::'.ljust(size),
              get_asset_data_dttm)
    rc = int(0)
    messages = []
    indent1 = indent+1

    #######################################################################################
    # Before I can get the final asset data, I need to get a list of the assets to exclude
    # This will give me all of the assets for the Technical Data Asset community
    #######################################################################################
    userParams = {}
    userParams['communityId'] = 'b23eedc6-ffa3-4c67-8974-e5d5a36b513c'
    params = {}
    for k, v in kwargs.items():
        params[k] = v

    collibraResponseTechnicalAssets = requests.get(self.collibraRestURL + '/assets',
                                                   auth=HTTPBasicAuth(self.username, self.password),
                                                   params = userParams)

    # Get total asset records for technical Data Community from restAPI
    myTotalTechnicalAssets= collibraResponseTechnicalAssets.json()['total']
    print("Total domain records: " + str(myTotalTechnicalAssets))

    # Get the number of times the 'for' loop has to execute to get all the records
    loopCountTA = (-(-myTotalTechnicalAssets//1000))
    print("Total domain loops: " + str(loopCountTA))
    #quit()

    ############################################################################
    # Now to go back and loop for loopCountTA times to create a file of assets
    # I don't want
    ############################################################################

    technical_data_assets_set = {''}
    for i in range(loopCountTA + 1):
        #for i in range(loopCountTA + 1):
        userParams = {}
        userParams['communityId'] = 'b23eedc6-ffa3-4c67-8974-e5d5a36b513c'
        userParams['offset'] = i * 1000

        params = {}
        for k, v in kwargs.items():
            params[k] = v

        collibraResponseTechnicalAssets2 = requests.get(self.collibraRestURL + '/assets',
                                                        auth=HTTPBasicAuth(self.username, self.password),
                                                        params = userParams)
        #print(userParams)

        technical_data_assets = create_response_tuple(collibraResponseTechnicalAssets2)
        result_list_TA = technical_data_assets[1].get("results")

        for j in range(len(result_list_TA)):
            if 'domain' in result_list_TA[j]:
                technical_data_assets_set.add(result_list_TA[j]['domain'].get('id'))
    #print(technical_data_assets_set)
    #quit()
    #######################################################################################
    # I need a loop counter for the assets
    #######################################################################################
    params = {}
    for k, v in kwargs.items():
        params[k] = v

    collibraResponseAssetsCount = requests.get(self.collibraRestURL + '/assets',
                                                   auth=HTTPBasicAuth(self.username, self.password))

    # Get total asset records
    myTotalAssetsCount = collibraResponseAssetsCount.json()['total']
    print("Total asset records: " + str(myTotalAssetsCount))

    # Get the number of times the 'for' loop has to execute to get all the records
    loopCountAssets = (-(-myTotalAssetsCount // 1000))
    print("Total asset loops: " + str(loopCountAssets))
    #quit()
    #####################################################################################
    # Now get asset data
    #####################################################################################
    #for k in range(loopCountAssets + 1):
    for ii in range(2):
        userParams2 = {}
        userParams2['limit'] = 5000
        userParams2['offset'] = ii * 1000

        params = {}
        for k, v in kwargs.items():
            params[k] = v
        collibraResponseAssets = requests.get(self.collibraRestURL + '/assets',
                                    auth=HTTPBasicAuth(self.username, self.password),
                                    params = userParams2)

        asset_data = create_response_tuple(collibraResponseAssets)
        if asset_data[0] != 200:
            messages.append('ERROR: Asset Data Collibra extract has encountered errors!')
            http_error(asset_data, messages, indent=indent1, debug=1)
            quit()
        result_list_Assets = asset_data[1].get("results")

        if result_list_Assets[ii]['domain'].get('id') not in technical_data_assets_set:
            modified_dictionary = {'id': '',
                               'createdBy': '',
                               'createdOn': '',
                               'lastModifiedBy': '',
                               'lastModifiedOn': '',
                               'system': '',
                               'resourceType': '',
                               'name': '',
                               'displayName': '',
                               'articulationScore': float(0),
                               'excludedFromAutoHyperlinking': bool,
                               'domain': {'id': '', 'resourceType':'', 'name':''},
                               'type': {'id':'', 'resourceType':'', 'name':''},
                               'status': {'id': '', 'resourceType': '', 'name': ''},
                               'avgRating': float(0),
                               'ratingsCount': int(0)}
            modified_dictionary['id']                            = result_list_Assets[ii]['id']
            modified_dictionary['createdBy']                     = result_list_Assets[ii]['createdBy']
            modified_dictionary['createdOn']                     = result_list_Assets[ii]['createdOn']
            modified_dictionary['lastModifiedBy']                = result_list_Assets[ii]['lastModifiedBy']
            modified_dictionary['lastModifiedOn']                = result_list_Assets[ii]['lastModifiedOn']
            modified_dictionary['system']                        = result_list_Assets[ii]['system']
            modified_dictionary['resourceType']                  = result_list_Assets[ii]['resourceType']
            modified_dictionary['name']                          = result_list_Assets[ii]['name']
            modified_dictionary['displayName']                   = result_list_Assets[ii]['displayName']
            modified_dictionary['articulationScore']             = result_list_Assets[ii].get('articulationScore',0)
            modified_dictionary['excludedFromAutoHyperlinking']  = result_list_Assets[ii]['excludedFromAutoHyperlinking']
            modified_domain = {'id': '', 'resourceType': '', 'name': ''}
            if 'domain' in result_list_Assets[ii]:
                modified_domain['id']                 = result_list_Assets[ii]['domain'].get('id')
                modified_domain['resourceType']       = result_list_Assets[ii]['domain'].get('resourceType')
                modified_domain['name']               = result_list_Assets[ii]['domain'].get('name')
            modified_dictionary['domain']             = modified_domain
            if debug >= 3: print('domain type:', type(modified_dictionary['domain']))

            modified_type = {'id': '', 'resourceType': '', 'name': ''}
            if 'type' in result_list_Assets[ii]:
                modified_type['id']                 = result_list_Assets[ii]['type'].get('id')
                modified_type['resourceType']       = result_list_Assets[ii]['type'].get('resourceType')
                modified_type['name']               = result_list_Assets[ii]['type'].get('name')
            modified_dictionary['type']             = modified_type
            if debug >= 3: print('type:', type(modified_dictionary['type']))

            modified_status = {'id': '', 'resourceType': '', 'name': ''}
            if 'status' in result_list_Assets[ii]:
                modified_status['id']                 = result_list_Assets[ii]['status'].get('id')
                modified_status['resourceType']       = result_list_Assets[ii]['status'].get('resourceType')
                modified_status['name']               = result_list_Assets[ii]['status'].get('name')
            modified_dictionary['status']             = modified_status
            if debug >= 3: print('status type:', type(modified_dictionary['status']))

            modified_dictionary['avgRating']          = result_list_Assets[ii]['avgRating']
            modified_dictionary['ratingsCount']       = result_list_Assets[ii]['ratingsCount']
            result_list_Assets[ii]                    = modified_dictionary
            if debug >= 3: print()

    #dump(result_list_Assets)
    #quit()
    if debug >= 2:
        print('\t\t\tAsset Data Status Code: ', asset_data[0])
    if debug >= 4:
        print('\t\t\tAsset Data: ', asset_data)
        #dump(asset_data[1], caller='List of all Asset Data fields')
        #dump(asset_data[1].get("results")[0], caller='Contents of one individual Asset Data result record')
    if abs(debug) >= 5:
       timestamp(get_asset_data_dttm,
                 label='get_asset_data',
                 indent=indent,
                 debug=1)
    return(rc, messages, asset_data)

I get the final error because there is no articulation score in the record.
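
A possible explanation, judging only from the code as posted: the block that builds `modified_dictionary` runs once per page on `result_list_Assets[ii]`, where `ii` is the page counter from `for ii in range(2)`. That would rewrite at most one record per page and pass every other record in `results` on unchanged, which would match the KeyError seen later. A minimal sketch of applying the default to every record follows. `normalize_asset`, `excluded_domain_ids`, and the sample data are hypothetical, only a few of the fields are shown, and the sketch assumes that records in an excluded domain should be dropped.

```python
def normalize_asset(record):
    """Copy selected fields of one asset record, filling defaults for missing ones."""
    domain = record.get('domain', {})
    return {
        'id': record.get('id', ''),
        'name': record.get('name', ''),
        'articulationScore': record.get('articulationScore', 0),
        'domain': {
            'id': domain.get('id', ''),
            'resourceType': domain.get('resourceType', ''),
            'name': domain.get('name', ''),
        },
    }


# Hypothetical page of results; only the first record carries a score.
result_list_assets = [
    {'id': 'a1', 'name': 'Customer', 'articulationScore': 50.0,
     'domain': {'id': 'd1', 'resourceType': 'Domain', 'name': 'Sales'}},
    {'id': 'b2', 'name': 'Order',
     'domain': {'id': 'd2', 'resourceType': 'Domain', 'name': 'Ops'}},
]
excluded_domain_ids = {'d9'}  # stands in for technical_data_assets_set

# Loop over every record instead of indexing the list with the page counter.
normalized = [
    normalize_asset(record)
    for record in result_list_assets
    if record.get('domain', {}).get('id') not in excluded_domain_ids
]

print(normalized[1]['articulationScore'])
```

Note also that the function as posted returns only the `asset_data` from the final loop iteration, so the records would need to be accumulated across pages if all of them are wanted downstream.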