Harnessing the Full Potential of BigQuery Table Functions

Harnessing the Full Potential of BigQuery Table Functions

I come from SQL Server land, it was a great place to live. It’s an even better place to live if you know what you are doing (and I like to think I knew what I was doing — from time to time).

I tell you this because, for the better part of my career, it’s been SQL Server, PostgreSQL, and the occasional flavour of MySQL thrown in for some luck — your classic relational database systems.

That said, over the years, I’d dabble with BigQuery — nothing major, just a few simple pipelines, loading some data into a table or two, fooling around with it for a bit, shrugging, and then I’d scoot back to SQL Server and forget all about the goings-on in BigQuery.

Things Have Changed For Me

For the last 6 to 8 months, I haven’t touched SQL Server, not once. I’ve gone cold turkey. It’s been BigQuery for me every day — tinkering, trying things, and learning as I go.

Working with BigQuery is like being unplugged from the Matrix and realizing anything and everything is possible. It’s the wild west versus the safe haven that is SQL Server — that is how I feel about it, anyway.

I enjoy working in BigQuery more than I thought I would, I also like all the neat little things you can do in it (more on this in future posts).

That said, when I got into BigQuery, I promised I’d write up the things I’m doing in BigQuery in the hope it helps others and solidifies my learning. So here lies something I’ve used — Table Functions.

What are they?

Table functions or table-valued functions (TVF) are user-defined functions that return a table. Yeah, I know that’s a mouthful — this is nothing new in the database world, they’ve been around long enough for me to forget the version of SQL Server they were shipped with.

The way I see them is it’s basically a glorified view. A view that’s been pumped up to allow you to query it like you would a table, but the cool bit is you can pass a parameter to it. Essentially, it’s a parameterized view.

I’ll take you through a basic use case and how it all comes together when used with Google Sheets.

How to Create a Table Function

I’m going with a really simple example here. You can go wild with Table Functions, but in this case, pick a table — any table. Something you would select from often, a table on which you would likely apply filters, like a date or something useful. In my case, I’m using the BigQuery public data for Austin Bike Share data and filtering on start_time â€” a TIMESTAMP column.

This is the query I am going to convert into a table function:

SELECT
bike_type,
COUNT(*) AS bike_trips
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE CAST(EXTRACT(YEAR FROM start_time) AS STRING) = "2022"
GROUP BY bike_type

It’s nothing fancy, but you get the jist.

The result of this query looks like this:

Results of the SQL query I am going to convert to a Table Function

Side note: I’m surprised at how many people use electric bicycles vs. regular bicycles, but I digress.

To convert the query to a Table Function you use the CREATE TABLE FUNCTION command, give your Table Function a name, and the fun part: a parameter with a data type for said parameter.

In my case, I am creating a bike_rides_by_year Table Function with the parameter “year” as a STRING data type.

The script looks like this:

CREATE OR REPLACE TABLE FUNCTION testing.bike_rides_by_year(year STRING)
AS (
SELECT
bike_type,
COUNT(*) AS bike_trips
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips`
WHERE CAST(EXTRACT(YEAR FROM start_time) AS STRING) = year
GROUP BY bike_type
);

Note the parameter “year” is used in the WHERE clause.

How to Query a Table Function

Once the Table Function is created, you can query it much the same way you would query a view or a table, with the only difference being that you pass the parameter into the Table Function when selecting from it.

Here is how we select the Table Function we just created:

SELECT *
FROM testing.bike_rides_by_year("2022")

The results, as you can see, are the same as earlier when using “2022”:

Now let’s change the year to “2021” and there you have it a Table function in action.

Below are the results for “2021”:

Using Table Functions with Google Sheets

I’ve found Table functions pretty handy when you combine them with Looker or Google Sheets. This is how I use them in Google Sheets:

1 — Create a sheet (tab) called “parameter”.

2 — Set up a cell for use as your parameter value, in my case, “year” and “2022” (note: this needs to be a text cell in my case).

3 — Next, connect your Google Sheet to BigQuery.

4 — Choose a cloud project, and then select “write custom query”.

5 — In the BigQuery editor, paste in your “SELECT” query using your Table Function, and be sure to pass in the parameter you created in the next step.

6 — Set up the Google SheetParameter, as you can see below, I created a parameter called YEAR, and I’m referencing the “parameters” sheet B1 cell (which is set to “2021”):

7 — Hit “connect,” and voilà!

You will now have a Table Function plugged into a Google Sheet that accepts a parameter of YEAR.

8 — An optional step is to extract the table to a new sheet so you can make it look pretty.

Finally, to see all of this in action, change the year value in the parameters sheet and refresh your BigQuery data source for the changes to take effect.

See the video below of me demonstrating all this:

And there you have it, Table Functions in action. Obviously, this is as basic as it gets. And, as I said, it’s nothing new — SQL Server has Table Value Functions too (for ages), but I never found a need to use them.

I do like them. They save you from having to create a view and then use WHERE filters on the view. However, there are some limitations.

Limitations

Table functions must be stored in the same location as the tables they reference. So, you can’t have a table in the US region and then create a Table Function in, say, the EU, it won’t work.

Apart from that, I think these could prove pretty useful. I’m looking at other ways to implement these into some pipelines I’m building.

To Sum Up

If you made it this far, I thank you. If you followed along, even better. Wherever you are on your data journey, find time to explore and try things out. The best way to learn is by doing and documenting. The more hands-on you get with things, the higher the chance of new things sticking.

As always, I point you to Google Cloud’s official documentation on Table Functions. Do check it out, as they explain things better than I ever could.

Thanks for reading! If you liked it, don’t forget to subscribe to my latest articles. Want to get in touch? feel free to find me on LinkedIn.