Developing such a PEP without user feedback certainly won’t lead anywhere. I just feel that shooting down the effort at the “how to get started” point is not a good idea, since it does have merits.
I agree with everyone! I have been working with Python for about 15 years, but only about three years with NoSQL, so I don’t have the same depth of experience there. I made this proposal because I noticed some similarities between the existing libraries and, applying them, found that I could develop a single model for all the databases.
I started with an API decorator that wraps classes already written by library authors, mapping only the API names above onto them. Later, I wrote API-compliant classes.
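To make the idea concrete, here is a minimal sketch of what such a decorator could look like. All names here (`api`, `FakeDriverSession`, `run_query`) are hypothetical illustrations, not taken from any real driver:

```python
def api(**name_map):
    """Class decorator: alias common API names onto a driver's own methods.

    name_map maps the common API name to the driver's existing method name.
    """
    def decorate(cls):
        for common_name, driver_name in name_map.items():
            # Create an alias so callers can use the common name.
            setattr(cls, common_name, getattr(cls, driver_name))
        return cls
    return decorate


# Stand-in for a third-party driver class with its own method names.
class FakeDriverSession:
    def run_query(self, query):
        return f"ran: {query}"


# Wrap it so callers can use the common name `get`.
FakeDriverSession = api(get="run_query")(FakeDriverSession)

session = FakeDriverSession()
print(session.get("MATCH (n) RETURN n"))  # ran: MATCH (n) RETURN n
```

The point is that the wrapped class keeps its original behavior; the decorator only adds the shared vocabulary on top.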
And here is my proposal. In the build section of the docs, I wrote (and use in my development environment) a simple library for CouchDB (there is no official one), and it works well while respecting the API.
The only problem I see is that I don’t know any of the official library developers personally. So I am asking some of you for help.
This morning I wrote and posted this thread on the official CouchDB Slack (and on the user mailing list as well). But that is all I know of …
The packages on PyPI usually have contact emails for the packages. That’s a good start. You could invite them to this Discourse topic or open a new one. If you want to have a little more privacy, you could also ask for creation of a SIG mailing list, e.g. nosql-sig@python.org, and then discuss there before coming back here.
OK thank you. I will try to involve the various development teams of the various libraries starting from pypi. Maybe, before I create a special mailing list, I will try to involve them in this discussion. We’ll see.
Hey all, I’m one of the maintainers of PyMongo. We’re certainly interested in being involved in the discussion. I’ve opened https://jira.mongodb.org/browse/PYTHON-3211 to help coordinate.
Something you may also want to do is look into how Python BI tools handle interfacing to NoSQL databases.
- Redash: redash/redash/query_runner at master · getredash/redash · GitHub
- Superset: superset/superset/db_engine_specs at master · apache/superset · GitHub
The interesting part here is that the BI tools can interface to SQL databases as well, so they also address the important aspect of bridging between the two worlds.
In the redash package, libraries found on PyPI such as the Cassandra and MongoDB drivers are used. Cassandra’s `Connection` and `Session` look a lot like the API I wrote:
...
connection = Cluster(
    [self.configuration.get("host", "")],
    port=self.configuration.get("port", ""),
    protocol_version=self.configuration.get("protocol", 3),
    ssl_options=self._get_ssl_options(cert_path),
)
session = connection.connect()
...
For MongoDB, instead, the `Connection` object corresponds to `MongoClient`, which, once the database name is accessed through `__getitem__`, returns an object similar to the `Session` object:
...
db_connection = pymongo.MongoClient(
    self.configuration["connectionString"], **kwargs
)
return db_connection[self.db_name]
...
Even in these libraries, just as in the ones I wrote in the past on top of today’s NoSQL drivers, I keep finding the same pattern, which suggests a specific common API.
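The recurring pattern could be captured as two abstract classes: something connection-like that yields something session-like. This is only a sketch of the shape, with an illustrative in-memory implementation; the names `Connection`, `Session`, and `get` follow the API discussed above, everything else is hypothetical:

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """Something connection-like: knows how to open a Session."""

    @abstractmethod
    def connect(self):
        """Return a Session bound to this connection."""


class Session(ABC):
    """Something session-like: runs queries against the database."""

    @abstractmethod
    def get(self, query):
        """Run a read query and return the result."""


# Toy in-memory implementation showing the shape of a conforming driver.
class MemoryConnection(Connection):
    def __init__(self, data):
        self.data = data

    def connect(self):
        return MemorySession(self.data)


class MemorySession(Session):
    def __init__(self, data):
        self.data = data

    def get(self, query):
        return self.data.get(query)


conn = MemoryConnection({"users": ["ada", "bob"]})
session = conn.connect()
print(session.get("users"))  # ['ada', 'bob']
```

Cassandra’s `Cluster`/`Session` pair and pymongo’s `MongoClient`/database objects both fit this two-step shape, even though the concrete method names differ.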
UPDATES:
I studied for a moment what the Neo4j driver does with the data:
with driver.session() as session:
    result = session.run("MATCH (n:Person) RETURN n.name AS name")
    # do something with the result...
and it looks a lot like the APIs I wrote with the context manager:
# Graph type: use like a context manager
with graphconnection.connect() as sess:
    response = sess.get("MATCH (n:Person) RETURN n.name AS name")
    # do something with the response...
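One way to get that context-manager behavior generically is a thin session adapter that delegates queries to whatever driver session it wraps and closes it on exit. This is a hypothetical sketch; `SessionAdapter`, `FakeRaw`, and the delegated `run`/`close` method names are assumptions for illustration, not any real driver’s API:

```python
class SessionAdapter:
    """Wrap a driver session so it supports `with ...` and a common `get`."""

    def __init__(self, raw_session):
        self.raw = raw_session
        self.closed = False

    def get(self, query):
        # Delegate to the wrapped driver session (method name is assumed).
        return self.raw.run(query)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always release the underlying session, even on error.
        self.raw.close()
        self.closed = True
        return False  # do not swallow exceptions


# Toy stand-in for a real driver session.
class FakeRaw:
    def run(self, query):
        return [{"name": "Alice"}]

    def close(self):
        pass


with SessionAdapter(FakeRaw()) as sess:
    response = sess.get("MATCH (n:Person) RETURN n.name AS name")
print(response)  # [{'name': 'Alice'}]
```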
I believe that Bernie Hackett did not understand what the APIs I wrote are for …
He mentions having a standard for NoSQL queries, but the standard should not be defined at the query-language level (even SQL has dialects that differ from each other), but rather at the level of the Python libraries written for NoSQL databases.
Unfortunately, I believe he did not read the docs…