# Delete First Row - NameError

Andie19 · January 24, 2023, 9:36pm

Hi, any help is appreciated. I am trying to delete the first row in CSV files before running the full script. I added the # Delete First Row line (before the lines that define the columns), but I am receiving a NameError: name ‘df’ is not defined error. I can’t manage to resolve it. Any ideas? Thanks

```python
# import libraries

import json
import pandas as pd
import os
import sys

# specify data directories

locfile = "c:/files/test/"
raw_files = os.listdir(locfile)

# Delete First Row
df.drop(index=df.index[0], axis=0, inplace=True)

# Define columns - these are from JSON attributes

cols = ["Id","Activity","CreationTime","OrganizationId","RecordType","WorkspaceId","WorkSpaceName","Workload","DataflowType","DatasetId","DatasetName","IsSuccess","ObjectId","ItemName","ReportId","ReportName","ReportType","UserKey","UserId","UserAgent","ClientIP","AcitveUser"]

# In[306]:

# sub routine to accept json data and return table format array

def convDelim(j):
    try:
        js = json.loads(j, strict=False)
        df.columns = df.columns.str.strip()
        return [(js.get("Id"), js.get("Activity"), js.get("CreationTime"), js.get("OrganizationId"),
                 str((js.get("RecordType") if js.get("RecordType") else "")),
                 (js.get("WorkspaceId") if js.get("WorkspaceId") else ""),
                 (js.get("WorkSpaceName") if js.get("WorkSpaceName") else ""),
                 (js.get("Workload") if js.get("Workload") else ""),
                 (js.get("DataflowType") if js.get("DataflowType") else ""),
                 (js.get("DatasetId") if js.get("DatasetId") else ""),
                 (js.get("DatasetName") if js.get("DatasetName") else ""),
                 str(js.get("IsSuccess")),
                 (js.get("ObjectId") if js.get("ObjectId") else ""),
                 (js.get("ItemName") if js.get("ItemName") else ""),
                 (js.get("ReportId") if js.get("ReportId") else ""),
                 (js.get("ReportName") if js.get("ReportName") else ""),
                 (js.get("ReportType") if js.get("ReportType") else ""),
                 js.get("UserKey"),
                 js.get("UserId"),
                 js.get("UserAgent"),
                 js.get("ClientIP"),
                 js.get("AcitveUser"))]
    except:
        return [("", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "")]

# In[304]:

final_df = pd.DataFrame(columns=cols)

for file in (raw_files):
    if file.endswith("csv"):
        print("Processing: " + file)
        df = pd.read_csv(locfile + file)
        #print(df.info())
        df["JsonData"] = df["AuditData"].map(lambda x: convDelim(x))
        js_arr = df["JsonData"]
        a = []
        for x in (js_arr):
            a.append(x[0])
        tmp_df = pd.DataFrame(data=a, columns=cols)
        final_df = pd.concat([final_df, tmp_df])

final_df = final_df.drop_duplicates()

# In[303]:

final_df.to_csv(locfile + "data_2022.csv", index=False, header=True)
```

MRAB · January 24, 2023, 9:56pm

Please wrap code in triple backticks to preserve the formatting:

```python
if True:
    print('Hello world!')
```

As to your problem, you’re trying to delete the first row of df without first saying what df is.
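A minimal sketch of one way around that, assuming the row should be dropped from each file right after it has been read (the helper name `read_without_first_row` is made up for illustration). Note that `pd.read_csv` already treats the first line of the file as the header, so "first row" here means the first data row after it, and the sketch assumes the file has at least one data row:

```python
import pandas as pd

def read_without_first_row(path):
    """Read a CSV, then drop its first data row (hypothetical helper)."""
    # df has to be created before anything can be dropped from it.
    df = pd.read_csv(path)
    # Drop the row at position 0, then renumber the remaining rows from 0.
    return df.drop(index=df.index[0]).reset_index(drop=True)
```

In the posted script, the equivalent change would be to move the drop inside the loop, after the line `df = pd.read_csv(locfile + file)`.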

I also notice another thing: your code has a bare except, i.e. except without specifying any error. That’s a bad idea because it’ll swallow all errors, even ones such as NameError caused by misspelling a name. You should catch only those errors that you’re going to handle.
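As a rough illustration of catching only what you intend to handle, here is a sketch that assumes the only expected failures are malformed JSON text and non-string input such as an empty CSV cell (the function name `parse_audit` is made up for the example):

```python
import json

def parse_audit(raw):
    """Return the decoded JSON value, or None if raw cannot be decoded."""
    try:
        return json.loads(raw, strict=False)
    except (json.JSONDecodeError, TypeError):
        # JSONDecodeError: the text is not valid JSON.
        # TypeError: raw is not str/bytes (e.g. NaN from an empty cell).
        return None
```

With this version, a misspelled name inside the try block would still raise NameError instead of being silently turned into an empty result.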

Andie19 · February 2, 2023, 8:49pm

Thanks for the reply. I am brand new to Python. What I would like to do is write an if statement:

If the first row contains the string “test”, skip that row and use the second row as the column names. This is what I have, but I am stuck:

```python
def file():
    df = pd.read_csv(locfile + file)
    for file in (raw_files):
        if file.endswith("csv"):
            print("Processing: " + file)
            df = pd.read_csv(locfile + file)
            #print(df.info())

def somenewfunction(row):
    if row['a'].contains('test')==True:
        return skipLine(f, 1)

    return

# Define columns - these are from JSON attributes

cols = ["ID","Activity","CreationTime","OrganizationId","RecordType"]
```

MRAB · February 2, 2023, 9:41pm

To preserve the formatting of the code you post, select it and then click on </>.

```python
# Read the CSV file.
with open(path, encoding='utf-8') as file:
    lines = list(file)

# Delete the first row if it contains "test".
if 'test' in lines[0]:
    del lines[0]

# read_csv requires a file for input, so first write the lines into a StringIO buffer...
from io import StringIO
sio = StringIO()
sio.writelines(lines)

# ...then rewind to the start of the buffer...
sio.seek(0)

# ...and read it in.
df = pd.read_csv(sio)
```
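A possible variation on the same idea, sketched here under the assumption that the files are UTF-8 and that only the very first line ever needs skipping: peek at the first line, then let `read_csv` skip it through its `skiprows` parameter. The function name `read_csv_skipping_marker` and its `marker` parameter are made up for illustration:

```python
import pandas as pd

def read_csv_skipping_marker(path, marker="test"):
    """Read a CSV, skipping line 1 if it contains marker (hypothetical helper)."""
    # Peek at the first line only; pandas reads the file itself afterwards.
    with open(path, encoding="utf-8") as f:
        first_line = f.readline()
    # skiprows=1 makes pandas ignore line 1 and take line 2 as the header.
    skip = 1 if marker in first_line else 0
    return pd.read_csv(path, skiprows=skip, encoding="utf-8")
```

Inside the loop over the files, this could be called as `df = read_csv_skipping_marker(locfile + file)`. As with the version above, a genuine header line that happens to contain "test" would also be skipped.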

Andie19 · February 3, 2023, 3:49pm

Thank you for your help.
