Request for comment: Making PEP for NOSQL databases
Hello everybody,
this is my first time writing in this section and I hope I’m not out of place.
I wanted to propose a new PEP that describes the APIs that a NOSQL database library should have, as I am going to describe below.
These types of databases are spreading very quickly and I believe it is necessary to have common interfaces between libraries. In fact, all four types of NOSQL databases are similar in their most common operations (CRUD) and in their database-level operations.
This approach has already been defined for SQL databases in PEP 249. Many Python SQL libraries follow it, using the same names for objects, methods, functions and properties.
This makes these libraries much easier to develop and more consistent to use.
I want to create something similar for NOSQL databases. Although these databases differ in many ways, they can be grouped into four categories:
- Column database, such as Apache Cassandra
- Key/Value database, such as Redis
- Document database, such as MongoDB
- Graph database, such as Neo4j
These four types of databases differ in how data is requested and inserted, but they also share some common traits.
Based on this, I have created a library of interfaces that represents these common characteristics. I have also written a series of tests that simulate the behavior of existing libraries by wrapping their methods in these interfaces.
nosqlapi is an interface/ORM/utility library meant for writing Python libraries for NOSQL databases, so that they follow the interfaces and therefore the API.
In this documentation you will find in detail what I will briefly explain below.
Abstract
The PEP introduces an API that describes the interfaces and the names of the classes, methods, properties and functions that a NOSQL Python library should have.
The API covers all four types of NOSQL databases. The PEP will also provide extended APIs for the features unique to each database type.
The goal of the API is simplicity and ease of use.
Motivation
The libraries that exist today for NOSQL databases use inconsistent names. For example, it is easy to find objects dealing with database connections called Database, DatabaseConn or DBClient. These objects produce the same result: an object that allows you to work directly with the database and its data.
These objects are instantiated with the same arguments, but under different names. Some use host for the server name, others use hostname, and others servers. The same goes for the other arguments.
Furthermore, there is no clear distinction between the database layer and the data layer, as is the case for SQL databases.
It is therefore necessary, for consistency and ease of development and use, to have APIs that unify all this.
Furthermore, instantiating an object that deals at the database level and one at the data level allows for a very clear separation of duties.
For this, the API will provide a Connection object, which will take care of database-level operations, and a Session object, which will take care of the data.
Rationale
Separating tasks into two different objects has the advantage of isolating programming and execution errors.
Another advantage is that a user connecting to a database may not have permission to work directly on the data. With this separation it is possible to restrict some information to authorized users.
The Connection object deals directly with databases. It never touches the data inside the database you are working on.
The Session object, on the other hand, works directly with the data that a user can request, insert, modify or delete (CRUD operations). This object will also offer additional methods, such as creating, modifying and deleting users, and executing Selector and Batch objects.
The response to each operation can be encapsulated in a further object, Response, which can contain the result of the operation along with additional information.
In other languages, such as Java, a library covering the NOSQL database types has already been implemented: JNoSQL.
Specification
The connection to the database will be made through a Connection object. Once this object is instantiated, you will not have an immediate connection to the data, but only to the outermost layer of the server hosting the database, which deals exclusively with the databases themselves.
To work on data (CRUD operations), you need to create a new object, similar to the Cursor object of relational databases: the Session object.
Calling the connect() method of the Connection object will return a Session object.
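As a rough illustration of this split, here is a minimal sketch with an in-memory backend. Only Connection, Session and connect() come from the proposal; the method names create_database, databases, insert, get and delete, the InMemory* classes and the constructor arguments are assumptions made for the example, and return values are left as plain Python objects for brevity.

```python
from abc import ABC, abstractmethod


class Session(ABC):
    """Data-level object: CRUD operations on one database."""

    @abstractmethod
    def insert(self, key, value): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def delete(self, key): ...


class Connection(ABC):
    """Database-level object: knows about databases, not about their data."""

    @abstractmethod
    def create_database(self, name): ...

    @abstractmethod
    def databases(self): ...

    @abstractmethod
    def connect(self):
        """Return a Session bound to the selected database."""


class InMemorySession(Session):
    """Hypothetical key/value session backed by a plain dict."""

    def __init__(self, store):
        self._store = store

    def insert(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def delete(self, key):
        self._store.pop(key, None)


class InMemoryConnection(Connection):
    """Hypothetical connection that keeps every database in memory."""

    def __init__(self, host, database=None):
        self.host = host
        self.database = database
        self._databases = {}

    def create_database(self, name):
        self._databases.setdefault(name, {})

    def databases(self):
        return sorted(self._databases)

    def connect(self):
        # Data access only becomes available through the returned Session.
        if self.database not in self._databases:
            raise LookupError(f"unknown database: {self.database!r}")
        return InMemorySession(self._databases[self.database])


conn = InMemoryConnection(host="localhost", database="app")
conn.create_database("app")
session = conn.connect()
session.insert("user:1", {"name": "Ada"})
print(session.get("user:1"))
```

The Connection subclass never reads or writes records itself; it only hands the selected database over to the Session.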
Each operation performed by a Connection object or a Session object should return an object of type Response.
This type of object can be instantiated with extra information in addition to the returned data, such as the call header, or an exit code and the related exception object.
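A Response along these lines could be sketched as a small dataclass. The field names data, code, header and error, the ok property and the safe_get helper are hypothetical, chosen only to mirror the extra information mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Response:
    """Hypothetical container for the result of one operation."""
    data: Any
    code: int = 0                       # exit code, 0 meaning success
    header: Optional[dict] = None       # e.g. headers of the underlying call
    error: Optional[Exception] = None   # exception object, if any

    @property
    def ok(self) -> bool:
        return self.error is None


def safe_get(store: dict, key: str) -> Response:
    """Wrap a dict lookup so that failures are reported in the Response."""
    try:
        return Response(data=store[key])
    except KeyError as exc:
        return Response(data=None, code=1, error=exc)


store = {"a": 1}
found = safe_get(store, "a")
missing = safe_get(store, "b")
print(found.ok, found.data)
print(missing.ok, missing.code)
```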
Data read operations (SELECTs in relational databases) can be implemented directly through a special object, the Selector object, which has a build() method that builds its query string based on the database dialect. The object can be passed directly to the find() method of the Session object.
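To make the Selector idea concrete, here is a minimal sketch that builds a query string for an invented SQL-like dialect. Only Selector, build() and find() are named in the proposal; the attributes selector, fields and condition, the dialect itself and the EchoSession class are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Selector:
    """Hypothetical read-query description for an SQL-like dialect."""
    selector: str                                   # collection or table name
    fields: list = field(default_factory=list)      # empty list means all fields
    condition: dict = field(default_factory=dict)   # equality filters

    def build(self) -> str:
        columns = ", ".join(self.fields) or "*"
        query = f"SELECT {columns} FROM {self.selector}"
        if self.condition:
            # repr() quoting is for illustration only; a real library
            # should use its driver's parameter binding instead.
            where = " AND ".join(
                f"{name} = {value!r}" for name, value in self.condition.items()
            )
            query += f" WHERE {where}"
        return query


class EchoSession:
    """Hypothetical session whose find() just returns the built query."""

    def find(self, selector):
        # Accept either a Selector object or a raw query string.
        if isinstance(selector, Selector):
            return selector.build()
        return selector


sel = Selector("users", fields=["name", "age"], condition={"age": 30})
print(EchoSession().find(sel))
```

A library for another dialect would keep the same attributes and override only build().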
In addition, some operations exist only in one vendor's database. These operations can be implemented as extensions of the core API classes, or through a Batch object that carries a series of instructions together with an instance of the Session class.
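One possible shape for such a Batch is sketched below. Apart from the names Batch and Session, everything here is an assumption: the (method name, arguments) instruction format, the execute() method and the DictSession stand-in are made up for the example.

```python
class DictSession:
    """Hypothetical session storing key/value pairs in a dict."""

    def __init__(self):
        self.store = {}

    def insert(self, key, value):
        self.store[key] = value
        return key

    def delete(self, key):
        return self.store.pop(key, None)


class Batch:
    """Hypothetical batch: a list of instructions bound to one session."""

    def __init__(self, session, instructions):
        self.session = session
        self.instructions = list(instructions)

    def execute(self):
        # Run each (method_name, args) pair in order and collect the results.
        results = []
        for method_name, args in self.instructions:
            method = getattr(self.session, method_name)
            results.append(method(*args))
        return results


session = DictSession()
batch = Batch(session, [
    ("insert", ("a", 1)),
    ("insert", ("b", 2)),
    ("delete", ("a",)),
])
print(batch.execute())
print(session.store)
```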
Backwards Compatibility
Existing libraries do not have the structure and nomenclature required to comply with these APIs. The nosqlapi library includes a decorator that maps the names of existing methods to API-compliant names.
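The general idea behind such a mapping can be sketched with a small class decorator. This is not the decorator shipped with nosqlapi; api_alias, LegacyClient and its methods are hypothetical names used only to show the technique.

```python
def api_alias(**mapping):
    """Class decorator exposing existing methods under API-compliant names.

    Each keyword maps an existing method name to the new name to add.
    """
    def decorate(cls):
        for old_name, new_name in mapping.items():
            # The original method stays available under its old name.
            setattr(cls, new_name, getattr(cls, old_name))
        return cls
    return decorate


@api_alias(open_session="connect", fetch="find")
class LegacyClient:
    """Hypothetical existing library class with non-compliant names."""

    def open_session(self):
        return "session"

    def fetch(self, query):
        return f"result of {query}"


client = LegacyClient()
print(client.connect())
print(client.find("q"))
print(client.fetch("q"))
```

Because the old names remain in place, existing code that depends on them keeps working while new code can use the common names.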
Potential Problems and their Solutions
This section outlines some pitfalls that can arise from using the API.
Reference Implementation
Nosqlapi documentation site: nosqlapi docs
Nosqlapi GitHub: nosqlapi repo
Nosqlapi production usage: example library
Copyright/license
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.