Average with csv file

Good evening

I am asking for your help with a problem that I have not been able to solve.

I have a CSV file with the following records

year ; sex ; age_range ; career ; duration_semester; time_to_entitle_in_semest; total_semester

2015; female; 20 to 24 years; electrical engineering;9;2;11
2015;male;25 to 30 years; medicine;10;2;12
2015;male;31 to 34 years; teacher;9;1;10
2015;female;40 to 45 years;programmer;5;1;6
2015; female; 20 to 24 years; electrical engineering;9;4;13

I have written several of the functions that were requested, but I have not been able to write the following one:

I must calculate, by sex, the average time that people between the ages of 20 and 24 take to complete their studies. For example, I must take from the CSV all the females between the ages of 20 and 24 and calculate the average time to graduate (total_semester) for the students in that age range. That total combines the duration of the program (duration_semester) and the time it took to finish the thesis and receive the degree (time_to_entitle_in_semest).

For example, in case 1 the student took 2 semesters to finish the thesis and receive the degree, giving a total of 11 semesters, and in case 5 the student took 4 semesters, giving a total of 13 semesters. I must then average these totals to find approximately how many semesters it takes to graduate.

I have read a lot, but I still cannot understand how to do it.

I appreciate your help, and I apologize for bothering you with this problem, but I’m new to this topic and I’m learning how to solve this.

Thanks for your comments

I assume you already know how to read from the csv file, but let us know.

You can have variables total_time_female: float, number_of_females: int, total_time_male: float, number_of_males: int.

Then, for each row of the CSV, current_sex = row.get('sex'), and you separate the two cases: if current_sex == 'female' you increment number_of_females += 1 and total_time_female += float(row.get('total_semester')). Here I am not sure if I understood correctly what you want to average. It might be a different column of the table. If current_sex == 'male', then you increment number_of_males and total_time_male in the same way.

Finally, your averages are total_time_female / number_of_females and total_time_male / number_of_males.
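
Here is a minimal sketch of those steps, in case it helps. It assumes the file is semicolon-separated with the header shown in the question, and it keeps the running totals in dictionaries keyed by sex instead of four separate variables. The function name average_total_by_sex and the inline SAMPLE text are only for illustration; with a real file you would pass the open file object instead.

```python
import csv
import io

# Sample rows copied from the question, standing in for the real file.
SAMPLE = """year ; sex ; age_range ; career ; duration_semester; time_to_entitle_in_semest; total_semester
2015; female; 20 to 24 years; electrical engineering;9;2;11
2015;male;25 to 30 years; medicine;10;2;12
2015;male;31 to 34 years; teacher;9;1;10
2015;female;40 to 45 years;programmer;5;1;6
2015; female; 20 to 24 years; electrical engineering;9;4;13
"""


def average_total_by_sex(lines, age_range="20 to 24 years"):
    """Return {sex: average total_semester} for rows in the given age range."""
    reader = csv.reader(lines, delimiter=";")
    # The header and some fields have stray spaces around them, so strip them.
    header = [name.strip() for name in next(reader)]
    totals = {}  # sex -> sum of total_semester
    counts = {}  # sex -> number of matching rows
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, (field.strip() for field in fields)))
        if row["age_range"] != age_range:
            continue  # only keep the requested age range
        sex = row["sex"]
        totals[sex] = totals.get(sex, 0.0) + float(row["total_semester"])
        counts[sex] = counts.get(sex, 0) + 1
    return {sex: totals[sex] / counts[sex] for sex in totals}


averages = average_total_by_sex(io.StringIO(SAMPLE))
print(averages)
```

With the five sample rows only the two females fall in the 20 to 24 range, so the result contains a single entry for 'female', the average of 11 and 13. A sex with no rows in the range simply does not appear, which avoids dividing by zero.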

Use pandas groupby (split-apply-combine) functions.
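
A rough sketch of what that could look like, assuming pandas is installed and the file is semicolon-separated with the header shown in the question. The inline SAMPLE text stands in for the real file, and the variable names are only for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question, standing in for the real file.
SAMPLE = """year ; sex ; age_range ; career ; duration_semester; time_to_entitle_in_semest; total_semester
2015; female; 20 to 24 years; electrical engineering;9;2;11
2015;male;25 to 30 years; medicine;10;2;12
2015;male;31 to 34 years; teacher;9;1;10
2015;female;40 to 45 years;programmer;5;1;6
2015; female; 20 to 24 years; electrical engineering;9;4;13
"""

# skipinitialspace drops the blanks that follow each ';' in the data.
df = pd.read_csv(io.StringIO(SAMPLE), sep=";", skipinitialspace=True)
# The header names still carry trailing blanks ("sex "), so strip them.
df.columns = df.columns.str.strip()

# Split: keep one age range. Apply and combine: mean of total_semester per sex.
in_range = df[df["age_range"] == "20 to 24 years"]
avg_by_sex = in_range.groupby("sex")["total_semester"].mean()
print(avg_by_sex)
```

To get the averages for every age range at once, you could group by both columns instead of filtering first, for example df.groupby(["age_range", "sex"])["total_semester"].mean().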

Well, what do you imagine are the logical steps involved in solving the problem? How would you do it by hand? Do you understand, at least, what an average is, and how it is calculated? Where exactly are you stuck?