My first python project - Create a home budget forecast

I’m looking to create a python script balance.py that reads the recurring income and expenses from my budget.csv, then produces an output csv within a given forecast period (1yr, 2yrs etc.)
I have looked at using Pandas datetime to produce the forecast, but would appreciate any suggestions of how I might structure it. Perhaps there are already some existing python projects which I could use as inspiration?

budget.csv

comment, amount, period,startdate
rent,800,monthly,2022/10/1
food/200, weekly,2022/10/1

usage:
balance(startbalance,period)
For example a start balance of 10000, and a 2 year forecast.

balance(10000,2)

Just some ideas:

  • For such a simple task you do not need Pandas.
  • The Python builtins and standard library have the tools for that: datetime, csv
  • Represent dates using datetime.date and passed time (time period) using datetime.timedelta.
  • For date please try to avoid weird and ambiguous formats like 2022/10/1. Use the well established standard ISO 8601 - Wikipedia , example: 2022-10-01 (yyyy-mm-dd).
  • Internally represent money amounts using fixed point represented by int (not float) - e.g. 1 = 1 cent or use the module decimal from the standard library.
  • In the budget besides startdate you will probably need enddate too.
  • balance()
    • Should accept the budget table as a parameter or implement the financial forecast as a class with budget being one of its attributes.
    • Should accept start_date (type datetime.date), probably defaulting to today.
    • Should accept the period as datetime.timedelta object or alternative parameter as end_date.
  • From the budget table you can generate a budget dict keyed by date which would contain all the dates with transactions for the given time period. After generating it you can transform the dict to one sorted by date to get chronological list of every individual transaction.
    • dict in current Python versions keep the sort order but you cannot (sanely) sort it in-place. You will generate a new sorted dict from the original one.
  • If you do not need the list of individual transactions, you can avoid creating the dict by simply counting how many times will every budget item event happen in the given time period and multiplying the amount by the event count.

Thanks for the tips. This is what I have so far, cashflow.py and balance.py
cashflow.py

import datetime
import pandas as pd
import csv
import os


class CashFlow(object):

    def __init__(self, name='cashflow'):
        self.name = name
        self.columns = ['date_of_activity', 'description', 'amount']
        self.series = pd.DataFrame(columns=self.columns)

    def generate_series(self, description, amount, frequency='MS',
                        start=datetime.date(datetime.date.today().year, 1, 1),
                        end=datetime.date(datetime.date.today().year, 12, 31)):
        """Generate a new data series and append it to the existing data"""
        data = [(date_of_activity, description, amount)
                for date_of_activity in pd.date_range(start, end, freq=frequency)]
        self.series = pd.concat(
            [self.series, pd.DataFrame(data, columns=self.columns)])

    def add_event(self, description, amount, date_of_activity):
        """Add a single event"""
        self.generate_series(description, amount, frequency='D',
                             start=date_of_activity, end=date_of_activity),

    def export_to_csv(self, dirpath, filename):
        """ Export the data to a csv file."""
        """ Sort data by 1) date_of_activity ascending and 2) amount ascending before the export."""

        path = os.path.join(dirpath, filename)
        with open(path, 'w') as csvfile:
            csv_writer = csv.writer(csvfile, delimiter=',')
            csv_writer.writerow(self.columns)
            self.series.sort_values(by=['date_of_activity', 'amount'], ascending=[
                                    True, True], inplace=True)
            for index, record in self.series.iterrows():
                csv_writer.writerow(record)

balance.py

import pandas as pd
import plotly.express as px
import datetime
import cashflow
  
cf = cashflow.CashFlow('Cash flow forecast')

# Balance
balance = 98000

# start_date
start_year = 2022
start_month= 6
start_day = 2

# End date
end_year = 2022
end_month = 12
end_day = 29 
 
# Generate data
cf.add_event('Original balance', balance, datetime.date(start_year, start_month, start_day))

# weekly
cf.generate_series('Food', -100, frequency='W',
                   start=datetime.date(2022, 6, 3), end=datetime.date(end_year, end_month, end_day))

# Monthly
cf.generate_series('Rent', -1500, frequency='M',
                   start=datetime.date(2022, 6, 3), end=datetime.date(end_year, end_month, end_day))

# Quarterly
cf.generate_series('subscription', -1500, frequency='Q',
                   start=datetime.date(2022, 6, 15), end=datetime.date(end_year, end_month, end_day))

# One-offs
cf.add_event('One off payment - auto', -5000, datetime.date(2023, 5, 15))

#
# Show the data
print(cf.series)
cf.export_to_csv('', 'balance.csv')

However the output still does not match up with the startdates that I have in balance.py. :

date_of_activity,description,amount
2022-06-02 00:00:00,Original balance,98000
2022-06-05 00:00:00,Food,-100
2022-06-12 00:00:00,Food,-100
2022-06-19 00:00:00,Food,-100
2022-06-26 00:00:00,Food,-100
2022-07-01 00:00:00,Rent,-1500
2022-07-01 00:00:00,subscription,-1500

Food should start at the 3-6-2022, but in the output starts at the 5-6-2022. It seems that the date frequency for week is fixed to Sundays?

Solution was to set freq=SUN

This is really nice @yogibjorn.

Re your comment:

Solution was to set freq=SUN

I am guessing you are referring to the pd.date_range function referring to the offsets here: Time series / date functionality — pandas 2.1.4 documentation

Or have I misunderstood.

Be good to have an option to have weekly based on the start date, not a specific day of the week.

Sorry, looks like you can use ‘7D’ as the frequency to have the event occur every 7 days.