How to fix a bug that only occurs on Windows?

Hi, I’m here to ask for some help fixing a bug that only occurs on Windows.

Before I explain the problem: the code I wrote is part of my company’s projects, so I cannot share the source code, but I will try my best to explain with some sample code.
I hope you will understand my situation.

The program simply calculates and records a sequence of values from an initial value and a list of changes.
For example, given an initial value of 10_000 and the changes [1_000, 2_000, 3_000, 4_000, 5_000],
the program is expected to record:

first object(previous value: 10_000, change: 1_000, current_value: 9_000),
second object(previous value: 9_000, change: 2_000, current_value: 7_000),
third object(previous value: 7_000, change: 3_000, current_value: 4_000),
fourth object(previous value: 4_000, change: 4_000, current_value: 0),
fifth object(previous value: 0, change: 5_000, current_value: -5_000)
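
To make the expected arithmetic concrete, here is a minimal sketch in plain Python with no database involved. The names `Record` and `expected_records` are made up for illustration and are not part of the real project.

```python
from dataclasses import dataclass


@dataclass
class Record:
    previous_value: int
    change: int
    current_value: int


def expected_records(initial_value: int, changes: list[int]) -> list[Record]:
    """Subtract each change in turn, carrying the result forward."""
    records = []
    previous = initial_value
    for change in changes:
        current = previous - change
        records.append(Record(previous, change, current))
        previous = current  # the next record starts where this one ended
    return records


for record in expected_records(10_000, [1_000, 2_000, 3_000, 4_000, 5_000]):
    print(record)
```

Each record's previous value should equal the current value of the record before it.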

However, when I test this calculation, the previous value sometimes reverts to an earlier value in the middle of the calculation.
For instance, in the above sequence, the third object’s previous value must be 7_000, but it becomes 9_000, so the current value becomes 6_000.

This only occurs in the integration test, with SQLAlchemy objects fetched from the database, not in the unit test (where I mocked the repository to return instantiated model objects).

It only occurs on Windows on x86-64, not on Linux (I haven’t tested other architectures…).
I tested with Python 3.11.9.

Please advise me on where to look to resolve this problem… any hints for debugging are also welcome.

Here’s some sample code showing the structure of my code.

Thanks in advance.

class Manager:
	def __init__(self, repository: SomeRepository):
		self._repository = repository
		
	def run(self) -> None:
		records = self._repository.get_records()
		latest_record = self._repository.get_latest_record(records[0].id)
		
		calculator = Calculator(
			records=[record.to_entity() for record in records],
			latest_value=latest_record.current_value
		)
		models = calculator.calculate_and_get_models()
		
		self._repository.add_all(models)
		
		
class SomeRepository:
	...
	def get_records(self) -> list[RecordModel]:
		pass
	
	def get_latest_record(self, id: int) -> RecordModel:
		pass
	...
	

class RecordModel(sqlalchemy.orm.DeclarativeBase):
	...
	previous_value: int
	change: int
	current_value: int
	
	def to_entity(self) -> RecordEntity:
		return RecordEntity(
			previous_value=self.previous_value,
			change=self.change,
			current_value=self.current_value,
		)
	
		
class RecordEntity:
	previous_value: int
	change: int
	current_value: int
	
	def update(self, previous_value: int):
		self.previous_value = previous_value
		self.current_value = previous_value - self.change
		

class Calculator:
	def __init__(self, records: list[RecordEntity], latest_value: int):
		self.records = records
		self.latest_value = latest_value
		
	def calculate_and_get_models(self):
		models = []
		current_value = self.latest_value
		
		for record in self.records:
			record.update(current_value)
			
			current_value = record.current_value
			models.append(record.to_model())
			
		return models

Does this code reproduce the bug?
If so, how do you call it?

As for how to debug: I would add extensive logging to find out where the Windows run diverges from the Linux run.
By extensive I mean a log entry for every line until you find the bug.
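
As a rough sketch of what that logging could look like, assuming a simplified stand-in for the real entity (the `Entity` class and `calculate` function here are hypothetical): each line records the thread name and the object's `id()`, so the Windows and Linux logs can be compared side by side and an unexpected second thread or a duplicated object would show up.

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(threadName)s %(message)s",
)
log = logging.getLogger("calc")


class Entity:
    """Hypothetical stand-in for the real entity class."""

    def __init__(self, change: int):
        self.previous_value = 0
        self.change = change
        self.current_value = 0


def calculate(entities, latest_value):
    current_value = latest_value
    for index, entity in enumerate(entities):
        # Log the state before touching anything.
        log.debug("step %d before: id=%#x prev=%r change=%r carried=%r",
                  index, id(entity), entity.previous_value,
                  entity.change, current_value)
        entity.previous_value = current_value
        entity.current_value = current_value - entity.change
        # Log again straight after the write, then carry the value forward.
        log.debug("step %d after:  id=%#x prev=%r current=%r",
                  index, id(entity), entity.previous_value,
                  entity.current_value)
        current_value = entity.current_value
    return entities


calculate([Entity(1_000), Entity(2_000), Entity(3_000)], 10_000)
```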


This is a very good point! I guess I didn’t think much about this.
I will try to reproduce the same bug with minimal code.

I also put logging lines between every line where values change, but previous_value just changes out of nowhere…
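
For a minimal reproduction, a small check like the following might help pinpoint the first record where the chain breaks. This is only a sketch: `Row` and `find_chain_breaks` are hypothetical names, and the sample rows simply mirror the failure described above.

```python
from typing import NamedTuple


class Row(NamedTuple):
    previous_value: int
    change: int
    current_value: int


def find_chain_breaks(rows, initial_value):
    """Return (index, expected_previous, actual_previous) for every row
    whose previous_value differs from the prior row's current_value."""
    breaks = []
    expected_previous = initial_value
    for index, row in enumerate(rows):
        if row.previous_value != expected_previous:
            breaks.append((index, expected_previous, row.previous_value))
        expected_previous = row.current_value
    return breaks


rows = [
    Row(10_000, 1_000, 9_000),
    Row(9_000, 2_000, 7_000),
    Row(9_000, 3_000, 6_000),  # previous_value fell back to 9_000
]
print(find_chain_breaks(rows, 10_000))  # [(2, 7000, 9000)]
```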

Look closely at the line that produces the surprise.
Break the line into small pieces and check every piece.
Also, I wonder whether you have dunder methods being run?
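
One possible way to combine both hints, sketched with a hypothetical `TracedEntity` class standing in for the real one: the update is split into single-step statements, and a `__setattr__` hook records every write to `previous_value` together with the thread and the calling function.

```python
import threading
import traceback


class TracedEntity:
    """Hypothetical stand-in that records every write to previous_value."""

    def __init__(self, change: int):
        self.writes = []  # set first; __setattr__ appends to it below
        self.previous_value = 0
        self.change = change
        self.current_value = 0

    def __setattr__(self, name, value):
        if name == "previous_value":
            # The frame just below __setattr__ is whoever did the assignment.
            caller = traceback.extract_stack(limit=2)[0]
            self.writes.append(
                (value, threading.current_thread().name, caller.name)
            )
        super().__setattr__(name, value)

    def update(self, previous_value: int) -> None:
        # One operation per line, so each intermediate can be inspected.
        self.previous_value = previous_value
        difference = previous_value - self.change
        self.current_value = difference


entity = TracedEntity(change=2_000)
entity.update(9_000)
for write in entity.writes:
    print(write)
```

If a write shows up from an unexpected function or thread, that is the place to look next.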

Thanks for your advice, I will look into it soon.

By the way, I already checked the bytecode using dis, and there does not seem to be any problem in the logic or in the way it assigns and calculates the values.
Given that, it is hard for me to see how breaking the lines into smaller pieces and printing everything would help me catch the suspicious part.

Could you give me more hints on what to look for after breaking the lines into smaller pieces?

The dunder methods used in the program are __init__, __repr__ and __str__.

I need code to examine before I can help with specifics.

What you describe can only happen if you have a thread racing against your code and changing data from outside the thread running the algorithm.
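
To illustrate the kind of interference meant here, below is a contrived sketch in which a second thread overwrites an entity after the main thread has updated it. The `Entity` class is a hypothetical stand-in, and the two `threading.Event` objects force the ordering purely so that the demonstration is repeatable; in a real race the timing would vary from run to run.

```python
import threading


class Entity:
    """Hypothetical stand-in for the real entity class."""

    def __init__(self, change: int):
        self.previous_value = 0
        self.change = change
        self.current_value = 0


entity = Entity(change=3_000)
updated = threading.Event()
interfered = threading.Event()


def interfering_writer():
    updated.wait()                 # wait until the main thread has written
    entity.previous_value = 9_000  # stale value written from another thread
    entity.current_value = 9_000 - entity.change
    interfered.set()


worker = threading.Thread(target=interfering_writer, name="interferer")
worker.start()

# The "algorithm" thread writes the correct values...
entity.previous_value = 7_000
entity.current_value = 7_000 - entity.change
updated.set()

# ...and by the time it looks again, they have been replaced.
interfered.wait()
worker.join()
print(entity.previous_value, entity.current_value)  # 9000 6000
```

Printing `threading.enumerate()` inside the integration test would show whether any threads other than the main one are alive while the calculation runs.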