Encore Tooling, Inc.
Toolbox Product Overview
Bill DeStein, Founder
Encore Tooling, Inc.
Toolbox is a SaaS product for analysts, especially those working in a multi-cloud environment. I’m one of those multi-cloud analysts. I routinely work in the AWS, Azure and GCP clouds, and with the Athena, BigQuery, Glue, PostgreSQL, Redshift and Synapse processing engines. Working in a multi-cloud environment brings with it two pain points: (1) tools proliferation, and (2) federated queries.
​​
Tools Proliferation
​
For each of the six processing engines mentioned above, the cloud providers offer three classes of tools: one for composing and running SQL queries, a second for working with notebooks, and a third for data visualization and dashboarding. So a multi-cloud analyst can find themselves switching among eighteen different engine-tool pairs. That’s a lot of tools to learn and a lot of context switching. Analysts find themselves spending more time dealing with tools than dealing with the data.
Toolbox offers a novel approach to the tools proliferation problem. Embedded within Toolbox are three well-established, time-tested, open source tools: Jupyter for notebooks, Apache Superset for data visualization and dashboarding, and Apache Zeppelin, also for notebooks. Toolbox also includes an internally developed tool, Encore Studio, for composing and running SQL queries. Toolbox includes all of the drivers needed to get the four tools working with the six processing engines. With Toolbox, the analyst only needs to learn four tools rather than eighteen.
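The arithmetic behind the tool count can be sketched in a few lines. This is only an illustration: the tool-class labels are shorthand introduced here, and the engine and tool names are taken from the text above.

```python
from itertools import product

# The six processing engines and three tool classes named in the text.
ENGINES = ["Athena", "BigQuery", "Glue", "PostgreSQL", "Redshift", "Synapse"]
TOOL_CLASSES = ["SQL editor", "notebook", "dashboard"]  # shorthand labels (assumption)

# The four tools bundled with Toolbox.
TOOLBOX_TOOLS = ["Encore Studio", "Jupyter", "Superset", "Zeppelin"]

# Without Toolbox: one vendor tool per (engine, tool class) combination.
native_pairs = list(product(ENGINES, TOOL_CLASSES))

print(len(native_pairs))   # number of engine-tool pairs to learn natively
print(len(TOOLBOX_TOOLS))  # number of tools to learn with Toolbox
```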
​
Federated Queries
​
Say I want to join three tables living in three different processing engines, each hosted by a different cloud provider. The current approach goes like this. I select one of the three processing engines to act in a primary role, and the other two act in a secondary role. In the primary processing engine I declare two external tables, one for each of the secondary processing engines. Then I configure a JDBC or ODBC connection from the primary to each of the secondary processing engines. And finally I code my SQL query. That’s where the pain really goes up a notch. My SQL query may need to include code snippets written in three different SQL dialects. That’s a hard problem for the developer of the processing engine to get right, and it’s a hard problem for the analyst to get right.
Toolbox offers a novel approach to federated queries.
​
Core to the Toolbox approach is the notion of a 'hub'. A hub is a PostgreSQL database running in a Docker container in the cloud. Each Toolbox user has their own dedicated hub. The hub is spun up at the start of a user session, and it is gracefully shut down at the end of the session.
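The session-scoped lifecycle of a hub can be pictured as a context manager. The sketch below is an illustration only, not how Toolbox is implemented: `hub_session` is a hypothetical name, and an in-memory SQLite database stands in for the PostgreSQL container so the example runs anywhere.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def hub_session():
    """Hypothetical per-user hub: created at session start, closed at session end."""
    # Stand-in for spinning up the user's dedicated PostgreSQL container.
    hub = sqlite3.connect(":memory:")
    try:
        yield hub
    finally:
        # Stand-in for the graceful shutdown at the end of the session.
        hub.close()

with hub_session() as hub:
    hub.execute("CREATE TABLE scratch (n INTEGER)")
    hub.execute("INSERT INTO scratch VALUES (1), (2), (3)")
    total = hub.execute("SELECT SUM(n) FROM scratch").fetchone()[0]

print(total)
```

The `finally` clause ensures the hub is closed even if a query inside the session raises an error.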
​​​
The analyst uses the Encore Studio tool to compose and run federated SQL queries. Encore Studio is similar to Jupyter and Zeppelin in that the analyst composes a notebook, and each paragraph in the notebook contains one or more SQL queries. For each SQL query, the user specifies:
- The processing engine on which the query is to be executed, and
- The name of a table in the hub where the result set is to be saved
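One way to picture these per-query settings is as a small record attached to each SQL query. The sketch below is an assumption for illustration: `QuerySpec`, its field names, and the lowercase engine identifiers are hypothetical, not Encore Studio's actual format.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the six engines named earlier, plus the hub itself.
KNOWN_ENGINES = {"athena", "bigquery", "glue", "postgresql", "redshift", "synapse", "hub"}

@dataclass(frozen=True)
class QuerySpec:
    engine: str        # where the query is executed
    result_table: str  # hub table that receives the result set
    sql: str           # the query text, in that engine's SQL dialect

    def __post_init__(self):
        # Reject engines the sketch does not know about.
        if self.engine not in KNOWN_ENGINES:
            raise ValueError(f"unknown engine: {self.engine}")

# A notebook paragraph holds one or more such queries.
paragraph = [
    QuerySpec("bigquery", "orders_rs", "SELECT customer_id, amount FROM orders"),
    QuerySpec("redshift", "customers_rs", "SELECT id, name FROM customers"),
]

print([q.result_table for q in paragraph])
```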
​​
An Encore Studio notebook typically begins with one or more ingestion queries, and ends with one or more mutation queries. Ingestion queries run on one of the data lake processing engines and write a result set into a table in the hub. Mutation queries run in the hub. They join result set tables and perform any other transformations. The user can also choose to use Jupyter, Superset or Zeppelin to join and explore result sets in the hub.
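The ingest-then-mutate flow can be sketched end to end. This is a minimal illustration under stated assumptions: in-memory SQLite databases stand in for both the data lake engines and the PostgreSQL hub, and the `ingest` and `mutate` helpers, the engine names, and the sample tables are all hypothetical.

```python
import sqlite3

def make_engine(setup_sql):
    """Create a stand-in 'processing engine' preloaded with sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)
    return conn

engines = {
    "engine_a": make_engine(
        "CREATE TABLE customers (id INTEGER, name TEXT);"
        "INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');"
    ),
    "engine_b": make_engine(
        "CREATE TABLE orders (customer_id INTEGER, amount REAL);"
        "INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);"
    ),
}
hub = sqlite3.connect(":memory:")  # stand-in for the user's hub

def ingest(engine_name, sql, result_table):
    """Run sql on a remote engine and save its result set as a hub table."""
    cur = engines[engine_name].execute(sql)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    # Table and column names are trusted here; a real tool would validate them.
    hub.execute(f"CREATE TABLE {result_table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    hub.executemany(f"INSERT INTO {result_table} VALUES ({placeholders})", rows)

def mutate(sql, result_table):
    """Run sql inside the hub and save its result set as a new hub table."""
    hub.execute(f"CREATE TABLE {result_table} AS {sql}")

# Ingestion queries: one per source engine.
ingest("engine_a", "SELECT id, name FROM customers", "customers_rs")
ingest("engine_b", "SELECT customer_id, amount FROM orders", "orders_rs")

# Mutation query: join the result sets in the hub.
mutate(
    "SELECT c.name, SUM(o.amount) AS total "
    "FROM customers_rs c JOIN orders_rs o ON o.customer_id = c.id "
    "GROUP BY c.name",
    "totals",
)

result = hub.execute("SELECT name, total FROM totals ORDER BY name").fetchall()
print(result)
```

Because the join runs only in the hub, the mutation query is written in a single SQL dialect, whichever engines the ingestion queries ran on.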
A few closing comments …
Not surprisingly, cloud providers show little interest in the idea of a common user interface across clouds and across processing engines. Also not surprisingly, the cloud providers show little interest in offering a robust federated query solution. If these are indeed real pain points, it won’t be the cloud providers who deliver the pain relief.
Toolbox is a SaaS product hosted by Encore Tooling, so there’s nothing for the analyst to install on their desktop, and nothing for their DevOps coworkers to install in the cloud. The analyst simply goes to encore-tooling.com, creates an account, enters their cloud provider credentials, and they’re ready to go.