Shared student IDs between courses

Robertglenn · August 30, 2022, 2:15pm

Imagine I have a spreadsheet with a campus’ courses and rosters for a specific semester. The spreadsheet has 2 columns: 1 column with course IDs, and the other column with student IDs for all students enrolled in each course.

How would I build a heat map / correlation matrix / something that will identify pairs of courses with the highest number of shared student IDs?

cameron · August 30, 2022, 10:29pm

By Robert Brace via Discussions on Python.org at 30Aug2022 14:25:

Imagine I have a spreadsheet with a campus’ courses and rosters for a
specific semester. The spreadsheet has 2 columns: 1 column with
course IDs, and the other column with student IDs for all students
enrolled in each course.

How would I build a heat map / correlation matrix / something that will
identify pairs of courses with the highest number of shared student
IDs?

I’d do the following:

load the data from the spreadsheet using the builtin csv module if
it is CSV data, openpyxl is it is an Excel spereadsheet, work
something out for other formats
shared students is a set relationship between the student ids of each
course; Python has a builtin set type for stuff like this; make a
mapping (probably a dict) of course id to set-of-student-ids; find
common students with a set intersection between the sets for chosen
pairs of courses
highest shared might be absolute (length of the interection set) or as
a ratio (that length versus the membership count of the courses); it
is easy to compute both
you can make a grid of courses as a 2-dimensional pandas
DataFrame, with shared sizes as the data values
you can plot that grid using the DataFrame’s plot method

Cheers,
Cameron Simpson cs@cskk.id.au

Topic		Replies	Views
Help with finding only non-matching rows Python Help	0	709	November 11, 2021
Help Please! Looking for code for visualisation Python Help	2	285	October 9, 2022
Total no. of rows in a csv Python Help	9	366	June 27, 2023
Correlation Matrix Python Help help	2	2309	August 9, 2023
How to merge several datasets, when variables differ but are similar? Python Help help	3	301	December 5, 2023

Shared student IDs between courses

Related Topics