I am currently trying to write a small library that makes calls to a certain JSON-RPC API. As usual when working with an API, the HTTP requests end up all over the place and are difficult to isolate. Having an HTTP call in the middle of a function makes testing the output much more laborious, and I would like to avoid that.
Another point is typing. Specifically (maybe inspired by Go's interfaces), I want the purpose of the data to be unambiguous and clearly associated with its related methods.
Here is some pseudo code:
input_data = {...}
# io operations 1
id = uuid()
# statements 1: generate data to upload
payload = {"id": id, ...}
# io operations 2: make an API request
response = requests.post(..., json=payload)
# statements 2: produce a local representation
output_resources = {...}
Regarding that code, I have a few ideas on how to structure it, and I would be interested in your thoughts.
1. An impure function
input_data_ = {...}

def do_something(input_data):
    # io operations 1
    id = uuid()
    # statements 1: generate data to upload
    payload = {"id": id, ...}
    # io operations 2: make an API request
    response = requests.post(..., json=payload)
    # statements 2: produce a local representation
    output_resources = {...}
    return output_resources

output_resources = do_something(input_data=input_data_)
This seems to be the easiest and most readable option, with little overhead. To associate the data with the function, I could just use Python type hints.
The main issue is that the code is cumbersome to test because of the IO at the beginning and at the end: I would have to mock the requests library. I would also have to duplicate a lot of code if I wanted to use something other than requests, e.g. httpx or an async library.
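To illustrate the testing overhead, here is a minimal sketch of such a test. A hypothetical `http_post` stands in for `requests.post`, and the names and payload shape are illustrative; the point is that every IO boundary has to be patched before the core logic can be checked:

```python
from unittest import mock
import uuid

def http_post(url, json):
    # stand-in for requests.post; would perform real network IO
    raise NotImplementedError("no network calls in this sketch")

def do_something(input_data):
    # io operations 1
    id_ = str(uuid.uuid4())
    # statements 1: generate data to upload
    payload = {"id": id_, "name": input_data["name"]}
    # io operations 2: make an API request
    response = http_post(input_data["url"], json=payload)
    # statements 2: produce a local representation
    return {"id": id_, "remote": response}

# testing forces us to patch both IO boundaries before touching the core logic
with mock.patch(f"{__name__}.http_post", return_value={"ok": True}), \
        mock.patch("uuid.uuid4", return_value="fixed-id"):
    result = do_something({"url": "https://api.example.com/rpc", "name": "a"})

assert result == {"id": "fixed-id", "remote": {"ok": True}}
```

Two `mock.patch` calls for one small function, and the test now depends on how the function does IO rather than on what it computes.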
2. Split the code with side effects from the core logic
# statements 1: generate data to upload
def process_input(input_data, id):
    ...
    payload = {"id": id, ...}
    request_args = {"method": "POST", "json": payload, ...}
    return request_args

# statements 2: produce a local representation
def process_response(response):
    output_resources = {...}
    return output_resources

def do_something(input_data):
    # io operations 1
    id = uuid()
    request_args = process_input(input_data, id)
    # io operations 2: make an API request
    response = requests.request(**request_args)
    output_resources = process_response(response)
    return output_resources

output_resources = do_something(input_data=input_data_)
After this refactoring we have two pure functions that contain the core logic and are trivial to test. do_something()
will be covered by integration tests, which I am happy with.
That being said, I am not fond of the function being split into three. It definitely breaks the reading flow, because the smaller functions don't really make sense on their own. For smaller operations this means that a big proportion of the code is just passing arguments from one function to another.
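To back up the "trivial to test" claim, a sketch of the two pure functions with concrete (illustrative) payload fields; testing them is plain function calls with no mocking at all:

```python
# statements 1: generate data to upload (pure)
def process_input(input_data, id_):
    payload = {"id": id_, "name": input_data["name"]}
    return {"method": "POST", "url": input_data["url"], "json": payload}

# statements 2: produce a local representation (pure)
def process_response(response_json):
    return {"id": response_json["id"], "data": response_json}

# tests are plain calls on plain data
request_args = process_input({"url": "https://api.example.com/rpc", "name": "a"}, "fixed-id")
assert request_args["method"] == "POST"
assert request_args["json"] == {"id": "fixed-id", "name": "a"}
assert process_response({"id": "fixed-id"})["id"] == "fixed-id"
```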
3. Split the code with side effects from the core logic, using generators
# statements 1 and 2: the pure core, written as a generator
def _do_something(input_data):
    # io operations 1: receive the id from the driver
    id = yield
    # statements 1: generate data to upload
    payload = {"id": id, ...}
    request_args = {"method": "POST", "json": payload, ...}
    response = yield request_args
    # statements 2: produce a local representation
    output_resources = {...}
    yield output_resources

def do_something(input_data):
    generator = _do_something(input_data)
    next(generator)  # advance to the first yield
    # io operations 1
    id = uuid()
    request_args = generator.send(id)
    # io operations 2: make an API request
    response = requests.request(**request_args)
    output_resources = generator.send(response)
    return output_resources

output_resources = do_something(input_data=input_data_)
Here I like the fact that _do_something
preserves the original flow of the code, as in the 1st example, while all the IO is isolated in do_something
.
What I don't like is that it is somewhat difficult to understand what is happening, and do_something
also looks ugly.
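For reference, the generator protocol can be exercised end to end with fake IO. This sketch injects the IO functions into the driver purely for demonstration (the field names and fakes are illustrative); in production the driver would call uuid and the HTTP library directly:

```python
def _do_something(input_data):
    # io operations 1: receive the id from the driver
    id_ = yield
    # statements 1: generate data to upload
    payload = {"id": id_, "name": input_data["name"]}
    request_args = {"method": "POST", "url": input_data["url"], "json": payload}
    # io operations 2: the driver performs the request and sends back the response
    response = yield request_args
    # statements 2: produce a local representation
    yield {"id": id_, "remote": response}

def do_something(input_data, uuid_func, request_func):
    generator = _do_something(input_data)
    next(generator)                            # advance to the first yield
    request_args = generator.send(uuid_func())   # io operations 1
    response = request_func(**request_args)      # io operations 2
    return generator.send(response)

# driving the pure core with fakes instead of real IO
result = do_something(
    {"url": "https://api.example.com/rpc", "name": "a"},
    uuid_func=lambda: "fixed-id",
    request_func=lambda **kw: {"echo": kw["json"]},
)
assert result == {"id": "fixed-id", "remote": {"echo": {"id": "fixed-id", "name": "a"}}}
```

Note that `generator.send(x)` both delivers `x` into the paused `yield` expression and returns the next yielded value, so no extra `next()` calls are needed between steps.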
4. Passing functions with side effects as arguments
input_data_ = {...}

def _do_something(uuid_func, request_func, input_data):
    # io operations 1
    id = uuid_func()
    # statements 1: generate data to upload
    payload = {"id": id, ...}
    # io operations 2: make an API request
    response = request_func(..., json=payload)
    # statements 2: produce a local representation
    output_resources = {...}
    return output_resources

do_something = partial(_do_something, uuid_func, request_func)
output_resources = do_something(input_data=input_data_)
Here we preserve the original flow and there is not too much boilerplate. It is also fairly easy to test.
It would be perfect if not for static type checking completely giving up on uuid_func
and request_func
. That feels like asking for trouble.
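The typing concern may be addressable, for what it's worth: a plain `Callable` alias or a `typing.Protocol` gives the checker full signatures for the injected functions. A sketch with illustrative signatures and stand-in lambdas instead of real `uuid`/`requests` calls:

```python
from functools import partial
from typing import Callable, Protocol

class RequestFunc(Protocol):
    # keyword-aware callable type for the HTTP function (illustrative signature)
    def __call__(self, method: str, url: str, *, json: dict) -> dict: ...

def _do_something(
    uuid_func: Callable[[], str],
    request_func: RequestFunc,
    input_data: dict,
) -> dict:
    # io operations 1
    id_ = uuid_func()
    # statements 1: generate data to upload
    payload = {"id": id_, "name": input_data["name"]}
    # io operations 2: make an API request
    response = request_func("POST", input_data["url"], json=payload)
    # statements 2: produce a local representation
    return {"id": id_, "remote": response}

# stand-ins for uuid and the HTTP library, so the sketch runs without IO
do_something = partial(
    _do_something,
    lambda: "fixed-id",
    lambda method, url, *, json: {"echo": json},
)
assert do_something({"url": "u", "name": "a"})["id"] == "fixed-id"
```

With the `Protocol` in place, a type checker can verify both the injected implementations and every call site, instead of giving up on the bare function arguments.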
5. Split the code with side effects from the core logic, using Data Classes or Attrs
input_data_ = {...}

@define(frozen=True)
class Request:
    method: str
    url: str
    payload: dict

    def __call__(self, ...):
        # io operations 2: make an API request
        return requests.request(self.method, self.url, json=self.payload)

@define(frozen=True)
class Resource:
    id: str
    data: dict
    ...

    @classmethod
    def setup(cls, input_data):
        # io operations 1
        return cls(uuid(), input_data)

    def _prepare_request(self):
        # statements 1: generate data to upload
        payload = {"id": self.id, ...}
        return Request(method="POST", url=..., payload=payload)

    def _process_the_response(self, response):
        # statements 2: produce a local representation
        ...
        return Resource(response, ...)

    def do_something(self):
        request = self._prepare_request()
        response = request()
        new_state = self._process_the_response(response)
        return new_state

resource = Resource.setup(input_data_)
output_resources = resource.do_something()
Now hold on to your seat, buddy, because when I see those Attrs, I pull out my cyberpunk glasses and hack directly into hyperspace. With Attrs it is easier for me to think about what I want to do, and I immediately start adding things that... I don't really need right now.
The good part is that it is easy to relate the logic to the data. The data itself and its purpose are unambiguous. In the end, for me, these are just structs with methods.
The bad part, jokes aside, is that while it is easier to think about what I am trying to write, it is also very verbose and sparse on meaning. If I compare it to the 1st example, I am not exactly adding anything new. Remembering the "Stop Writing Classes" talk (on PyVideo.org), I can see that at twice the amount of code I don't receive twice the benefit.
6. Deferred computation or Command Pattern-like solution
input_data_ = {...}

def do_something(id, input_data):
    # statements 1: generate data to upload
    payload = {"id": id, ...}
    deferred_action = {
        "request_args": {"method": "POST", "url": input_data["url"], "json": payload},
        "build_resource": lambda json: {...},
    }
    return deferred_action

def apply(deferred_action):
    # io operations 2: make an API request
    response = requests.request(**deferred_action["request_args"])
    # statements 2: produce a local representation
    resource = deferred_action["build_resource"](response.json())
    return resource

# io operations 1
id_ = uuid()
deferred_action_ = do_something(id=id_, input_data=input_data_)
output_resources = apply(deferred_action_)
To make this example simpler, I moved the uuid()
call out of do_something.
I like this because all the key logic is still encapsulated in do_something
, but the side effects are outside. I think the flow of the code makes it more readable than examples 2 through 5. Maybe the lambda
could be a named external function.
The only thing missing is that the deferred action as a dictionary is not transparent. But here I think a data class would be an acceptable solution.
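A sketch of that data-class variant, with illustrative field names and a fake request function standing in for the real HTTP call:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class DeferredAction:
    # everything needed to perform the IO, plus the pure follow-up step
    request_args: dict
    build_resource: Callable[[dict], dict]

def do_something(id_, input_data):
    # statements 1: generate data to upload (pure)
    payload = {"id": id_, "name": input_data["name"]}
    return DeferredAction(
        request_args={"method": "POST", "url": input_data["url"], "json": payload},
        build_resource=lambda json: {"id": id_, "remote": json},
    )

def apply_action(action, request_func):
    # io operations 2: make an API request
    response_json = request_func(**action.request_args)
    # statements 2: produce a local representation
    return action.build_resource(response_json)

# the pure part is testable without any IO...
action = do_something("fixed-id", {"url": "u", "name": "a"})
assert action.request_args["json"] == {"id": "fixed-id", "name": "a"}
# ...and the IO boundary is a single, swappable function
resource = apply_action(action, request_func=lambda **kw: {"echo": kw["json"]})
assert resource == {"id": "fixed-id", "remote": {"echo": {"id": "fixed-id", "name": "a"}}}
```

Because `DeferredAction` is just data, the same object could be handed to a requests-based, httpx-based, or async executor without touching the core logic.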
Here is where I am asking for your input:
Could anyone give some constructive criticism of the approaches I have proposed? Suggestions for articles or videos to read and watch are also welcome.
Thanks in advance! Happy coding!