DSC Weekly Digest 7 September 2021
Over the last year or so, I’ve been watching the emergence of a new paradigm in application development, one that is changing not only the way we process data but also our view of what an application is.
This change rests on a critical idea: the notion that you can represent any information as a graph of relationships. This idea goes a long way back: database pioneer Ted Codd alluded to graph-based data systems in the 1970s, though he felt (likely for legitimate reasons) that the technology of the time was insufficient to do much with the notion.
In the early 2000s, Tim Berners-Lee explored this notion more fully with RDF and the Semantic Web, which made use of an assertion-based language as a way of building inferential knowledge graphs. This technology waxed and waned and waxed again over the next twenty years, but even as such systems have become part of enterprise data implementations, the relatively static nature of these relationships has hampered adoption.
In the mid-2010s, Facebook released a new language called GraphQL, making it open source. GraphQL was intended primarily to deal with the various social graphs that were at the heart of the social media giant’s products, but it also provided a generalized way of both discovering and retrieving data through a set of abstract interfaces.
Discovery has long been a problem with service APIs, primarily because they put the onus of deciding how data is structured on the provider, often with complex parameterizations and poorly defined output sets. What’s more, a query often required multiple steps to retrieve hierarchical data, reducing performance and forcing deeply nested asynchronous calls (which ultimately drove the development of a whole promise infrastructure for web applications).
GraphQL, on the other hand, lets a client process retrieve the model of how information is stored in the graph and use that model to construct a query that avoids multiple calls to the server. Moreover, while a GraphQL server can take advantage of an abstract query language to write simplified queries, resolvers can also be written to calculate assertions dynamically (the lack of which is a major limitation of RDF).
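To make the discovery step concrete, here is a minimal sketch of the two request bodies a client might send. The `__schema` introspection fields are part of GraphQL itself; the `author`, `posts`, and `comments` fields, the `AuthorWithPosts` operation name, and the `build_request_body` helper are hypothetical names chosen for illustration. Nothing is sent over the network.

```python
import json
from typing import Optional

# Introspection query: asks the server to describe its own types and fields.
INTROSPECTION_QUERY = """
{
  __schema {
    types {
      name
      fields { name }
    }
  }
}
"""

# Hypothetical nested query: an author, their posts, and each post's comments,
# all requested in one round trip instead of three chained calls.
NESTED_QUERY = """
query AuthorWithPosts($id: ID!) {
  author(id: $id) {
    name
    posts {
      title
      comments { text }
    }
  }
}
"""


def build_request_body(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a query into the JSON shape commonly accepted by GraphQL HTTP servers."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


discovery_body = build_request_body(INTROSPECTION_QUERY)
data_body = build_request_body(NESTED_QUERY, {"id": "42"})
```

The client would send `discovery_body` first to learn the shape of the graph, then use what it learned to write something like `NESTED_QUERY`, which replaces the chain of dependent calls a typical service API would need.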
For instance, with GraphQL it is possible to pass a parameter indicating a time zone to a property and have the system return the current time in that time zone, pulled not from a database but from a clock function, even when everything else comes from a database. This ability to create dynamic and contextual properties (at both the atomic and structural levels) may also provide a mechanism to interact consistently with machine learning models as if they too were databases. It may, in turn, simplify updating reinforcement learning-based systems in a consistent manner, without the need to write complicated scripts in Python or any other language.
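The time zone case can be sketched without any GraphQL library at all. The toy below is not a GraphQL server, only an illustration of the resolver idea: stored fields are read from a record, while a computed field is produced by a function that takes an argument. All names (`STORED_OFFICE`, `resolve_local_time`, `execute`) are hypothetical, and the time zone is simplified to a fixed UTC offset in hours.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for a row that would normally come from a database.
STORED_OFFICE = {"name": "Tokyo office", "city": "Tokyo"}


def resolve_local_time(parent, utc_offset_hours, now=None):
    """Computed field: read the clock rather than the stored record."""
    if now is None:
        now = datetime.now(timezone.utc)
    zone = timezone(timedelta(hours=utc_offset_hours))
    return now.astimezone(zone).isoformat()


# Fields listed here are calculated; every other field is read from storage.
RESOLVERS = {"localTime": resolve_local_time}


def execute(parent, selection):
    """Resolve each requested field; `selection` maps field name to its arguments."""
    result = {}
    for field, args in selection.items():
        resolver = RESOLVERS.get(field)
        if resolver is not None:
            result[field] = resolver(parent, **args)
        else:
            result[field] = parent[field]
    return result


# One request mixes a stored property with a dynamic, parameterized one.
office = execute(STORED_OFFICE, {"name": {}, "localTime": {"utc_offset_hours": 9}})
```

From the caller's side, `name` and `localTime` look identical, which is the point: the caller cannot tell, and does not need to know, which values were stored and which were computed.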
As GraphQL becomes more pervasive, application development will become simpler: connect to an endpoint, build a query, bind it to a web component (in React, Angular, or the emerging Web Components framework), and act upon the results. Because the GraphQL endpoint is an abstraction layer, actions are reduced to traditional CRUD operations, with the difference being that a read (GET) operation passes a query construct rather than a parameterized URL, while a write (POST) operation passes a mutation construct. Rather than trying to support hundreds of microservice APIs, organizations can let users connect to a single GraphQL endpoint that provides both access to data and protection against exposing potentially dangerous data, the holy grail of application developers and database managers alike.
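As a rough sketch of that single-endpoint pattern, the snippet below builds (but does not send) a read request and a write request against the same URL using Python's standard library. The endpoint URL, the `articles` and `addArticle` fields, and the helper names are assumptions for illustration. In practice, many clients send queries by POST as well; GET is used here only to mirror the read/write split described above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://example.com/graphql"  # hypothetical endpoint

# Hypothetical read and write documents against the same graph.
READ_QUERY = "{ articles(first: 3) { title } }"
WRITE_MUTATION = 'mutation { addArticle(title: "Hello") { id } }'


def as_get(query: str) -> Request:
    """Read: the query document travels as a URL parameter."""
    return Request(f"{ENDPOINT}?{urlencode({'query': query})}", method="GET")


def as_post(document: str) -> Request:
    """Write: the mutation document travels in a JSON body."""
    body = json.dumps({"query": document}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


read_request = as_get(READ_QUERY)
write_request = as_post(WRITE_MUTATION)
```

Both requests target the one endpoint; only the document being passed changes, which is what lets a single GraphQL layer stand in front of many underlying services.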
In medias res,