More complete Update table example from BigQuery

Quick post here, but it very, very slightly addresses one of the biggest complaints I have with technical product documentation.

This complaint can be best explained by the phenomena where every example in the documentation is the absolute simplest possible example to show something.
Now I am not advocating getting rid of these examples! They are a great starting point, but all to frequently they are the only examples listed. This is a huge detriment to users, because this simplest-path use case is rarely how it plays out in reality. Simple examples are a springboard for more complex examples.

I have been doing alot of work in Google Cloud BigQuery lately. They have pretty thorough documentation but their examples are rarely complex.

For example, when they talk about updating a table in BigQuery, here is the example* they give:

UPDATE sample_db.UserSessions  
SET ip_address = REGEXP_REPLACE(ip_address, r"(\.[0-9]+)$", ".0")  

In many databases you can update a table based on the values in another table. Any examples of this multi-table UPDATE statement are nowhere to be found in the documentation, and there are multiple ways to format it, based on the database engine that you are using. Here is an example of a multi-table UPDATE statement, working off the simpler example above:

UPDATE sample_db.UserSessions  
SET UserSessions.ip_address = CAST(hosts.ipV6_addr as STRING)  
from sample_db.hosts  
WHERE UserSessions.host_name =  
AND UserSessions.loginname is not NULL  

BigQuery's Standard SQL is fairly strict, in terms of column data type comparisons. This is a great thing, but be prepared to write some CAST functions! This is a very limited coercion capability, which is an implicit conversion, while a CAST statement is an explicit conversion. See this [page]( for more details, under Conversion Rules.

(*) All examples here are in Standard SQL. BigQuery has an option called Legacy SQL, which I am not covering here. Most people know standard SQL and not Legacy SQL.

Pro-tip - when you open a query window in BigQuery it defaults to LegacySQL. You have to open the options and click to enable StandardSQL. If you have a Data Manipulation Language (DML) statement that you swear is correct but for some reason is not running, check that you have LegacySQL turned off. It will save you alot of headaches; particularly since the error statement is not helpful in this regard.

(**) Pro-tip2 - UPDATE statements in BigQuery always require a WHERE clause. If you want to update all rows in a table, use WHERE TRUE.

Pitt Fagan

Greetings! I'm passionate about data; specifically the big data and data science ecosystems! It's such an exciting time to be working in these spaces. I run the BigDataMadison meetup where I live.