This article is about how we came across the widely used open-source API platform and how we decided to go with CubeJs for the implementation of our critical dashboards.
Things turned around when a new feature request came from the product team to implement filters on our dashboards. It appeared to be a simple requirement but handling it on the back end with current system implementation became challenging, I’ll soon tell you how.
Though MySQL is an incredible DBMS with a lot of great capabilities, performing complex analytics and twisting queries becomes challenging beyond a certain point. Specifically, heavy real-time dynamic queries remained unresolved even after trying various MYSQL options like materialized view and stored procedures.
The hunt now was to search for open source tools and libraries that can help us in building dynamic queries with high performance.
And we found our savior!!!!!!
Cube.js adds a Semantic API layer on top of data to speed up the development. It provides the necessary infrastructure for any analytics application that relies heavily on caching and a pre-aggregation layer to provide insights from raw data in a short span of time. This would be very hard to achieve if we had to implement it on our own, thoughwe are yet to use their pre-aggregation feature.
A Cube can be queried using REST, GraphQL, or SQL interfaces. Cube generates SQL queries based on your data model and incoming queries. Besides orchestrating queries, connecting to databases, and caching, Cube also handles access control.
Cube.js provides a playground to build queries as per the requirement which is later used as query parameter while calling REST API. By default, this playground is enabled at port number 4000, but it is highly recommended to disable this playground on production, you can do it using CUBEJS_DEV_MODE=false in the .env file. There is an excellent production checklist from Cube.js, which we had referred before going live with cube.
Cube acts as a data access layer on top of your actual DAL, It translates API requests into SQL, and manages caching, queuing, and database connection.
Below 3 simple steps that made our life easier:
- Install Cube.js: It can run anywhere: Kubernetes, Serverless, and with Node.js. We have used CubeJS via Node.js.
- Connect to the data sources: It supports many different types of data sources, from Serverless Cloud Data Warehouses to traditional RDBMS.
- Use of REST API: REST API is used to communicate with Cube.js back-end
In Cube.js, every cube is defined as a dataset that is logically similar to the view in MySQL. For every cube, there is a separate file. By default, Cube.js creates cubes intelligently as per the schemas of the configured data source. Each table will have an equivalent cube created by Cube.js. One can have a custom cube as well depending upon the requirement. In Zipy, we have a mixture of default and custom cubes.
How Zipy is using Cube.js?
On the back-end, the Cube.js API instance is hosted as an independent server. MySQL is connected as a data source to which Cube.js is making calls. For performance, we have kept RefreshKey as 10 secs by default. We have secured these APIs using JWT
On its front-end, Zipy uses the React framework with movable/resizable components. Zipy uses a data visualization library ApexCharts.js.
An extensive range of filters are available in Zipy, which the user can use to slice and dice data as required, and Zipy dashboard, which can retrieve and visualize large amounts of data from the back-end. These are the two use cases that are getting handled via cube in Zipy.
Cube.js provides a variety of REST APIs, among that Zipy, has mainly used two APIs till now. Let’s understand each of them in detail with sample requests:
/v1/load
This call is made from the front-end and contains a variety of complex filters which I have talked about earlier. All those filters are passed in the filters section from the front-end and then everything is taken care by the Cube.js. It creates an equivalent SQL query intelligently with dynamic WHERE clauses and joins. In response to this query, we get all required columns mentioned in measures and dimensions.
/v1/meta
This API fetches all meta-information for cubes defined in the data schema, Zipy has used this API to get all dimensions and measures dynamically to display all possible values in the filter drop-down.
While defining a schema there is a parameter ‘show’ which can be set as true or false. Those parameters are set to true and are fetched via this API. Therefore the parameters which were required to be shown on the front-end we set shown value to true for them.
Conclusion
Choosing any tool for our system is very difficult, it becomes more difficult when we have to choose something for our core features. It is very necessary to check if all our use cases are getting handled by that tool. We did a lot of research before choosing the cube, we tried all our use cases on the playground which is provided by Cube.js. We are looking forward to using more extensive features of the cube.js as we grow up.