How To

Google Sheets and Snowflake automation

Ian Yanusko
March 1, 2023
4
min read

Intro

The world needs Google Sheets. Don’t try to fight it. No matter how much you try to move away from it, new use cases will always crop up. Business teams always revert back to the infinitely flexible spreadsheet. Let’s be real, there are about 1 billion spreadsheet users (not kidding) in the world – it’s bound to happen. But data gets siloed.

The Problem

Google Sheets enables nontechnical teams to easily work with data. These business teams are running everyday ops and analyses in spreadsheets, but there’s almost always 2 key problems:

  1. Business teams don’t have all the data and context that a company stores in a data warehouse like Snowflake, and they’re not technical enough to get it on their own
  2. The rest of the company will never have access to the data in this spreadsheet because we all know how spreadsheets get lost into the abyss that is a company google drive. So how are these valuable insights and data supposed to be stored permanently? 

Here are some real examples across teams facing this problem:

  1. Sales & Partnerships teams need to close customers. They need an easy way to toggle certain features on/off for customers in the sales cycle or adjust / extend trial periods. Today they need to ping engineering to make these bespoke adjustments and it always gets deprioritized on an engineer’s backlog – especially when you have tons of sales folks making these requests
  2. Customer service teams when dealing with customers in real-time need the latest information on customer activity. Customers may have very recently purchased or engaged with the product, or have their order in transit right then and there. Most other data syncing companies don’t refresh data faster than unless on the enterprise plan. But if the customer service needs the latest data because it affects the support given, then the team needs that info in real-time. And if a change or cancellation is requested by the customer, the CS team needs to send that data back to Snowflake in real-time too.

Regardless of the scenario, incomplete / mismatching data or constant requests to engineers are inevitable unless the gap between these tools are bridged. Below, we talk through the options to sync data between Google Sheets and Snowflake.

Method #1: Write an internal script

For some teams, the knee-jerk method is to build an internal script. After all, why pay for something you could do yourself?

If you’re just doing one-way syncs from Snowflake to Sheets, it may be enough to work with Snowflake and the Sheets API. However, the Sheets API comes with a few drawbacks, such as strict rate limiting. It can also be challenging to maintain a proper record mapping, since Sheets does not mandate a unique primary key.

If you need to sync both ways, things will get more complex. You need to develop a way to map records from Sheets to Snowflake, control the schema so that extraneous data doesn’t pollute your warehouse, and figure out a method for handling merge conflicts.

Even looking past these limitations, internal scripts invariably require development and maintenance time, and as any developer knows, can break at the worst of times :) 

Pros: Maximum flexibility & control, tailored to your use case

Cons: Requires valuable engineering resources to build and maintain, potentially not scalable, no customer service when things go wrong or your data schema changes‍

Method #2: Use Zapier?

Want to use Zapier to do this? You can’t. Not only because it doesn’t track deletes, updates to existing records, and only does one way syncs. But because Snowflake isn’t supported.

Pros: None

Cons: It doesn’t work

Method #3: ETL/reverse ETL

Your next best bet is to use a more technically savvy solution: ETL (or ELT) and reverse ETL. ETL and reverse ETL can help ensure a single source of truth for your company. With an ETL provider, you can extract from those apps, transform the data so that it’s standardized & cleaned, and load it into the data warehouse. Some common tools are Fivetran, Stitch, and Airbyte. Reverse ETL is very similar to ETL, except you’re pulling from your data warehouse into your SaaS apps. Reverse ETL is handy if data in your warehouse is updating frequently, and you want your nontechnical teams to see the most up-to-date data in their SaaS tools. Some common tools are Hightouch, Census, and Grouparoo (now a part of Airbyte).

However, ETL and reverse ETL have three drawbacks: 

  1. It’s pricey! If you need data to simply sync both ways, ETL and reverse ETL can get complicated: you’ll need two separate providers, meaning two separate sales and billing processes and double payments. You will also need to figure out how to navigate the infinite loop problem.
  2. Not optimized for speed. If you need data to be updated faster than 1 hour, these tools will not fit your use case. This is especially problematic when you need your business teams to have access to data as soon as possible.
  3. You still need a method for handling merge conflicts. 

Pros: Robust tools, many options available, a ton of integrations

Cons: Slow syncing speeds, too complicated for many companies and use cases, not able to provide two-way syncs under one roof, and the infinite loop problem.

Method #4: Use Bracket for two-way syncs

Sometimes, you just need the same data to show up in two different places in real-time. Getting your G-sheets and Snowflake to mirror each other can significantly reduce data sharing frictions on your team and cut down reliance on engineers. But as shown by the options above, this is surprisingly hard to do!

Our frustrations led us to build Bracket, which does fast bidirectional syncing between tools like Google Sheets and Snowflake (and any other database for that matter). 

Bracket is built for data teams and engineers, but requires zero code to maintain. To set up a sync, all you need to do is:

  1. Create an empty G sheet with field names but no data.
  2. Go to app.usebracket.com
  3. Enter the connection details for your Snowflake and Google Sheet.
  4. Select syncing direction and double check that the field mapping is accurate. Click sync, and voila! 

The full step-by-step guide is in our docs. And now you’ll never have to worry about cobbling together multiple software solutions or relying on internal scripts again.

Pros: built for bidirectionality, extremely tailored to your use case. Only pay for 1 tool to do syncs in either direction

Cons: Still building out integrations for new tools, but our Snowflake and Google Sheets connectors are live!

Want to get your data syncing? Get started with Bracket here.