Hello all, I’m somewhat new to data analysis (and Python in specific)
I have an Excel spreadsheet in the following format:
N | M | E | A |
---|---|---|---|
n1593 | m1037 | E192 | a10 |
n8322 | m1037 | E192 | a10 |
n7499 | m17 | E961 | a59 |
n1931 | m1205 | E380 | a59 |
n3953 | m1205 | E380 | a59 |
n4279 | m1205 | E380 | a59 |
n5738 | m1205 | E380 | a59 |
n6689 | m1205 | E386 | a51 |
n6689 | m1205 | E380 | a59 |
n7171 | m1205 | E1234 | a10 |
n7171 | m1205 | E119 | a59 |
n8280 | m1205 | E380 | a59 |
n8388 | m1205 | E380 | a59 |
n8737 | m1205 | E380 | a59 |
n12263 | m1205 | E386 | a51 |
n12263 | m1205 | E380 | a59 |
n12364 | m1205 | E380 | a59 |
n1793 | m491 | E386 | a51 |
n1793 | m491 | E408 | a54 |
n1793 | m491 | E467 | a54 |
n1793 | m491 | E1086 | a54 |
n1793 | m491 | E1112 | a54 |
n1793 | m491 | E216 | a55 |
n1793 | m491 | E1056 | a55 |
n2093 | m491 | E408 | a54 |
n2093 | m491 | E467 | a54 |
n2093 | m491 | E1086 | a54 |
n2093 | m491 | E1112 | a54 |
n3553 | m491 | E408 | a54 |
The ‘N’ column represents names of users
The ‘M’ column represents their managers
The ‘E’ column represents an entitlement (or permission) the user holds
And the ‘A’ column represents the application (or system) the entitlement resides on
The data is represented in numbers like this to scrub the data and ensure NPI/PII does not leak.
Here is a portion of a pivot table of the data with managers representing the rows and application names representing the columns (with the counts of how many users that manager has with access to that application):
I have highlighted some examples in boxes of patterns I would like to identify in the data but I am unsure of how to logically do this (I hope the sheet comes out big enough to see in the final post). I would imagine the vertical patterns would have different logic than the horizontal patterns.
Can someone tell me the logic of how to accomplish this in Python (or perhaps there are tools out there to get this done even simpler)?
Much appreciated!