Simulation Engine with NumPy

  • You have N products and K customers. Assume you know the preferences of each customer for each product. (Create a matrix where the rows are the items, the columns are the customers, and the entries are the ratings that each customer gives to an item. You can start by filling the preferences with random numbers.)
  • Assume you also know the budget of each customer. (You have an array with the budgets of each customer. You can start by filling it with random numbers.)
  • Now assume you randomly select M products from the N products.
  • Compute for each customer which items are bought assuming that the customer buys the highest-rated products until its budget runs out.

I need help with the fourth step.

I think some specifications of your assignment are unclear.

  • How are the prices of the products defined?
  • How is the product rating / preference defined? Is it up to you?
  • Is the amount of items of every product limited?
  • What is “budget of each customer”? A simple amount of money available to spend?

You did not say what exactly your difficulties with the fourth step are.

The best way would be to show us your runnable code for the first three steps and demonstrate the exact problems with the fourth step - e.g. the code you attempted and the result obtained vs. the result expected.


Hi Václav, here is the code.
https://app.datacamp.com/workspace/w/aaf734e8-78f1-4d6a-b0b2-c68870bcb7de/edit

As you can see, the complexity is minimal. Preferences and budgets are simply random numbers between 0 and 1.
I’m a beginner and I can’t find a way to get a result in step 4, which should follow this rule:

#Compute the profit for each customer:
##IF budget <= preference → BUY → New budget = budget - preference → IF new budget <= preference = BUY → (repeat) ELSE stop
##Requires loop, I think

Feel free to edit the script assuming that the three first steps are correct.

Hi Andre,

OK, I did not know datacamp.com provides Jupyter notebooks, good :slight_smile:

I made these changes to your code:

  • Made it use the constants K, M, N. It does not make sense to repeat the numbers in many places in your code.
  • Removed the repeated import numpy as np. Maybe you were confused when you opened the notebook later and it did not know np? You need to run the notebook cells from the beginning when you open it again (all the variables are forgotten).
  • Added some structure - headers. Maybe you can replace your prints by Markdown fields too - they are intended to include text between your pieces of code.
  • Defined NumPy print options: np.set_printoptions() so the matrices do not annoyingly wrap and you do not need to round the numbers.
  • I added some additional suggestions as comments.

Now to the step 4:

  • I do not understand how you can compare or subtract budget and preference. As I understand it:
    • Budget is the amount of money a customer is willing to spend.
    • Preference is some dimensionless number - meaning hmm… just a preference for buying a certain product.
  • So, as I pointed out earlier, I think you are missing a 1D array of prices of the products. If I am right, add the creation of this array to your code.
  • You need to do certain operations for every customer, so you will need to loop over every customer (using a for loop). …or maybe NumPy has a different way to do this which I do not know.
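A minimal sketch of the inputs discussed so far, with the suggested price array added. The sizes, value ranges, seed, and the names budgets, prices, and selected are illustrative assumptions, not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch reproducible

N = 5   # products (assumed size)
K = 20  # customers (assumed size)
M = 3   # products randomly selected from the N (assumed size)

# Step 1: preferences[i, j] is the rating customer j gives to product i.
preferences = rng.random((N, K))

# Step 2: one budget per customer, assumed here to be an amount of money.
budgets = rng.uniform(80, 100, size=K)

# Assumed extra input: one price per product, in the same unit as the budgets.
prices = rng.uniform(15, 50, size=N)

# Step 3: pick M distinct product indexes out of the N.
selected = rng.choice(N, size=M, replace=False)

print(preferences.shape, budgets.shape, prices.shape, selected)
```

With prices and budgets in the same unit, comparing and subtracting them in step 4 becomes meaningful, while the preferences are only used to decide the order in which products are considered.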

For the benefit of others (if they do not open your notebook), here is an example of how to define formatting for NumPy printing:

import numpy as np

LINE_WIDTH = 160
DECIMALS = 3

FORMATTER = {"float": lambda x: format(x, f".{DECIMALS}f")}
np.set_printoptions(formatter=FORMATTER, linewidth=LINE_WIDTH)
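As a small usage sketch of the same options: np.printoptions() is the context-manager form, which applies the settings only inside the with block instead of globally. The sample array is made up for illustration.

```python
import numpy as np

LINE_WIDTH = 160
DECIMALS = 3
FORMATTER = {"float": lambda x: format(x, f".{DECIMALS}f")}

# The options are active only inside the with block.
with np.printoptions(formatter=FORMATTER, linewidth=LINE_WIDTH):
    text = str(np.array([0.123456, 1.0, 2.5]))

print(text)  # every float is shown with exactly three decimals
```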

Hello Václav
I am very grateful for the assistance you gave me. I’ll speak with my boss to find out how to develop this final phase.
If you’re interested, I can post an update here once I have figured out the answer, because as soon as this exercise is finished there are further tasks to work on.

Bests,

Andre

I am glad that I could help. I would be interested to see how the exercise evolves, and feel free to ask if any question comes up.

I hope you will enjoy learning Python.

Hi Václav,
You were completely right in suggesting the creation of a new array for the prices. I had a call with my supervisor and we decided to extend the engine. The next step is exactly that one: it means using prices in € that can be compared with the budgets, and treating the preferences as a utility scale.
I will start working on this tomorrow, but in the meantime you can find the solution we found to sort and compute which products are bought at lines 38 and 27.
We will be in touch again soon, if you like the idea.
Have a nice week :slightly_smiling_face:

Andre


Hi Andre,

that is good that you are making the specification of the problem more precise.

I have just glanced through the notebook:

  • It is good that you changed the budgets from the degenerate 2D matrix (with one dimension of size 1) to a true 1D array. …but you left the old code there. I recommend removing the old code because it makes the notebook unclear. If you do not want to get rid of the old code for good, create a new notebook and move the old cells there.
  • Your failing cell max(budgets[0]) is probably left over from the old code. The new budgets is 1-dimensional, so budgets[0] is directly the budget of the first customer (index 0). You cannot give max() a single value. If you want to find the largest budget, simply use max(budgets).
  • For the 4th step I recommend writing it first for a single customer. (As I understand it, every customer is independent of the others. Correct me if I am wrong.) Keep the customer in a variable like customer = 0. Once it works, you can make a function from it or enclose it directly in a for loop iterating over all the customers.
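A minimal sketch of the single-customer step, under two assumptions that the thread has not confirmed: a price array exists, and a customer who cannot afford a product skips it and keeps checking lower-rated ones. The function name purchases_for_customer and the toy data are hypothetical.

```python
import numpy as np

def purchases_for_customer(customer, preferences, prices, budgets, selected):
    """Return the product indexes one customer buys, in purchase order."""
    budget = budgets[customer]                 # local running budget; the array is not modified
    ratings = preferences[selected, customer]  # this customer's ratings of the selected products
    # Visit the selected products from highest to lowest rating.
    order = selected[np.argsort(ratings)[::-1]]
    bought = []
    for product in order:
        if prices[product] <= budget:          # assumption: skip unaffordable products, keep going
            budget -= prices[product]
            bought.append(int(product))
    return bought

# Toy data: 3 products (rows) x 2 customers (columns).
preferences = np.array([[0.9, 0.1],
                        [0.5, 0.8],
                        [0.7, 0.3]])
prices = np.array([40.0, 30.0, 50.0])
budgets = np.array([80.0, 100.0])
selected = np.array([0, 1, 2])

customer = 0
print(purchases_for_customer(customer, preferences, prices, budgets, selected))
```

For customer 0 this prints [0, 1]: product 0 is bought first (budget 80 to 40), product 2 at price 50 is skipped, and product 1 is bought with the remaining 40. If the customer should instead stop at the first product they cannot afford, the if would need an else branch with break.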

Fantastic Václav,
Your advice is a great help to me as always!
I have grasped everything you said and am editing the code now. I’ll keep you updated on the next steps :slight_smile:

Hi Václav,
I’ve done that step for the customer K=1 and the function works!
At this point I should extend the algorithm to the whole set of customers.
Which is the best/easiest/most functional way to do so?
Feel free to leave some notes if you feel like doing it :slight_smile:

Hello Andre,

It is great that you have made progress. I will give you just some general advice:

As I understand it, you wrote the code for customer 0. For example the two zeros here refer to the customer 0:

budget = budgets[0]
do_something_else_with_customer(0)

As I suggested earlier, put the id number of the customer into a variable in a single place instead of repeating it many times in the code:

customer = 0  # Only here you set the customer id.

budget = budgets[customer]  # Here you refer to the customer id only through the variable.
do_something_else_with_customer(customer)
...

Then if the variable K contains the number of all the customers, you can simply apply the computations sequentially to all the customers in a loop:

for customer in range(K):
    budget = budgets[customer]
    do_something_else_with_customer(customer)
    ...

That is it. The loop will execute the code in its body multiple times. Each time the variable customer will have a new value: 0, 1, 2,… 19. In effect the code will be sequentially executed for every customer.
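A small runnable sketch of the same pattern, collecting one result per customer. The function count_affordable is a hypothetical stand-in for the per-customer computation, and the numbers are made up.

```python
import numpy as np

K = 4                                   # number of customers in this toy example
budgets = np.array([80, 45, 35, 10])    # one budget per customer
prices = np.array([40, 30, 50])         # one price per product

def count_affordable(customer):
    """Hypothetical per-customer computation: how many products fit the budget individually."""
    return int(np.sum(prices <= budgets[customer]))

results = []
for customer in range(K):               # customer takes the values 0, 1, 2, 3
    results.append(count_affordable(customer))

print(results)
```

The list results ends up with one entry per customer, in customer order, so results[customer] can be looked up afterwards.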

Hi Václav,
I extended the mechanism to the whole set of customers but I cannot get the operation budg -= price to work.
Now the algorithm does what it is supposed to but it doesn’t subtract the price from the current budget when the customer buys the product.
Do you have suggestions? I think I’m missing a detail.

Bests!!

Hi Andre,

I am not able to access your notebook any more. In the past it was possible to switch the interface to Jupyter notebooks. Now it seems that Datacamp removed this option.

I see these basic options:

  1. Put the code here - it will be also more accessible to others.
  2. Try to make the notebook accessible (maybe it is enough to copy it to a file named notebook.ipynb).
  3. Use a different service for sharing Python notebooks like https://colab.research.google.com/ or https://mybinder.org/ (and there are more).

Hi Václav,
Datacamp automatically made the session private by default, but I changed it. You can now access it.

The summary of the exercise is:

K = 20  # customers
M = 3   # chosen products for a further analysis
N = 5   # products

Each customer has preferences and a budget, as follows:

import random
import numpy as np

preferences = np.random.rand(N, K)
print(preferences)

real_bdgt = random.sample(range(80, 100), K)
real_bdgs = np.array(real_bdgt)

And each product has a price:

pric = random.sample(range(15, 50), N)
price = np.array(pric)

The goal is to develop a model that computes the purchasing choices of the customers. I’m missing a detail because now the budget is not reduced when the consumer buys.

customer = 0
budg = real_bdgs[customer]
selection = []

for customer in range(K):
 for ix in largest_indexes:
    budg = real_bdgs[customer]
    print(str(ix) + ".", preferences[:,0][ix], price[ix], budg)
    if price[ix] < budg:
        budg -= price[ix]
        print("Bought")
    else:
        print("Can't afford")

The code below is redundant - probably remnant of an old testing code. You should remove it or move it to a separate notebook.


As I understand it, budg will be assigned a single number (a NumPy integer here, since the budgets are sampled from range(80, 100)).

    budg = real_bdgs[customer]

Numbers in Python are immutable (int and float, and likewise str, tuple etc.). It means that through the name budg you cannot change the value inside the real_bdgs container.

For example, if you execute budg += 1 then budg will be bound to a new value (budg + 1), but this assignment will have no effect on the value in the container real_bdgs; the previous value will stay there.

So instead of

        budg -= price[ix]

you should do the following to change the value inside the container

        real_bdgs[customer] -= price[ix]
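A short sketch of the difference, with made-up numbers. It also shows an alternative that keeps the original budgets intact: a local running budget that is read once per customer, before the product loop. In the posted loop, budg is re-read from real_bdgs inside the product loop, so any subtraction is overwritten on the next iteration.

```python
import numpy as np

real_bdgs = np.array([90, 85])   # made-up budgets
price = np.array([20, 30])       # made-up prices
customer = 0

# Rebinding a local name: budg changes, the array entry does not.
budg = real_bdgs[customer]
budg -= price[0]
entry_after_rebind = int(real_bdgs[customer])
print(budg, entry_after_rebind)

# Assigning through the index changes the array itself.
real_bdgs[customer] -= price[0]
print(real_bdgs[customer])

# Alternative: a local running budget, read once before the product loop.
customer = 1
remaining = real_bdgs[customer]
for p in price:
    if p <= remaining:
        remaining -= p
print(remaining, real_bdgs[customer])
```

Which variant fits better depends on whether the original budgets are still needed afterwards: updating real_bdgs in place overwrites them, while the local variable leaves the array as it was.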