Custom column cheatsheet

The Mighty Cheatsheet 

Welcome to the Mighty Cheatsheet! This guide will walk you through some of the different formulas and expressions that make Trevor such a powerful tool. If you have any questions or find something below a little tricky to write (even we do sometimes!) just message us via our chat in the bottom right.

Not familiar with your database, including data types? Check this post

For specific methods (e.g. how to count unique values), head to the How do I...? section.

How to use the cheatsheet:

cheatsheet_guide_.gif

Intro: Data types

Each column has a data type, which defines the kind of values the column can contain:

  • String = text (e.g. "Hello to you") 
  • Int = a whole number (e.g. 10, 20, 30)
  • Timestamp = date and time of day (e.g. 25/01/2019 14:25:03)
  • Float = a number with a fractional part (e.g. 4.7)
  • Bool = a value of either true or false.
  • Interval  = a time interval (e.g. 5 days)
  • JSON = a data type used for storing and transporting data. JSON is made up of key-value pairs between curly brackets (e.g. {"Book Title": "Ready Player One}).
  • Array - a data type used for storing multiple values, enclosed in square brackets ([]). 

 

Numbers

You can use Trevor as a calculator to perform arithmetic across columns

Examples

(column1/column2) * 100

column1 - column2

( 5 * ( column1 * 8 ) ) - 10

Convert a negative number to a positive one: number * (0-1)

 

Text strings

You can add to, manipulate, and compare text columns.

 

Examples:

Remove spaces at the start or end of a string: column1.trim()

Add a word to a column: column1 + "hello"

Combine text from two columns, e.g. column1 is first names and column2 is second names:
column1+("")+column2

Replace a character: hello world".replace("", "_")

You can also chain functions. For example, if you have a column structured like: ["column1"] and you want to remove [" and "]  to get just column1 you can use the following:

column1.replace("[\"","").replace("\"]","")

Extract the domain from an email address: 

column1.extract("@(.*)")

column.substring(column1.position("@")+1

Extract everything before a space: 

column1.extract("(.*)")

column1.truncate(column1.position(" "))

Extract everything after a space: 

column1.extract(" (.*)")

column1.substring(column1.position(" ")+1

Extract the first word

column1.split(" ").at(0)

Extract the characters after the 5th character: column1.substring(4)

Extract 4 characters after the 5th character: column1.substring(4, 4) (The second parameter is the number of characters to extract)

Extract only the 5th character: 

column1.substring(4).truncate(1)

column1.substring(4,1)

Extract only the 1st character: 

column1.substring(0).truncate(1)

column1.substring(0, 1)

Extract the last 2 characters: column1.substring(column1.length() -2)

Extract the first 5 characters after a space:column1.substring(column1.position(" ")+1, 5)

Extract everything before a number: column1.extract("^([^0-9]*)")

(Note: Not supported by SQL Server databases)

Extract the last word in a sentence: column1.extract("\S+\$")

Calculate length of text: column1.length()

Convert to lowercase:column1.lowercase()

Convert to uppercase: column1.uppercase()

Return true if the value in one column equals that of another:column1 == column 2

Return true if the value in one column contains a specific value: column1.contains("hello")

Return true if the value in one column contains the value of another column: column1.contains(column2)

 

Advanced

u.email + " " _ u.status

"one two three".split(" ")

(Note: Not supported by SQL Server databases)

r.column1 == r.column2

r.column1 != r.column2

 

Redshift databases

column1.regexSubstring(".*,")extracts everything after a comma

(you can replace the comma with any other character) 

column1.regexSubstring(",.*") extracts everything before a comma

 

If then else

Trevor supports if then else statements. Check out an example of using these in a query here.

Examples

If the value in a column is 2, return “two!”. If it isn’t, return ‘not two!”: 

if (column1 == 2) then "two!" else "not two"

-------------------------------------------

If the value in a column is “hello” AND the value of another column is “to you”, return 1, otherwise return 0.

if (column1 == "hello" and column2 == "to you") then 1 else 0)

Common next step:

Using the calculate button: Sum the values of the new column. This will give us the total number of rows where column1 is “hello” AND column2 is “to you”

-------------------------------------------

If the value in a column is “hello” AND the value of another column is “to you” OR “to me”, return 1, otherwise return 0.

if (column1 == "hello") and (column2 == "to you" or column3 == "to me")) then 1 else 0

 

If you want to make multiple IF ‘x’ THEN ‘y’ statements, you may want to create a ‘nested if’ statement. This will allow you to create more complex rules in your filtering and other functions.
E.g. if the value in a column is greater than 90, return "A+", if it's greater than 80, return "A", if it's greater than 70, return "B", and so on. 

if ((column1 > 90)) then "A+" else if (column 1 > 80) then "A" else if (column 1 > 70) then "B" else "Other score"

If the value of the middlename column is empty, return “EMPTY!”, otherwise return the existing column value.

if (middlename.isEmpty() then "EMPTY" else "middlename"

 

If the value in a column is less than 0, return “negative”. If it’s greater than 0, return “positive”. If it’s 0, return “zero”.

if (column1 <0) then "negative" else if (column1 > 0) then "positive" else "zero"

 

If the value in a column is true, return “woo”, otherwise return “boo”

if (column1) then "woo!" else "boo"

 

If the value in a column is false, return “oh look, it’s false”, otherwise return “nothing to see here...”

if (column1.isFalse()) then "oh look, it’s false!" else "nothing to see here..."

 

If the value in a column is true, return 1, otherwise return 0

if (column1) then 1 else 0

 

If the value in a column is null, return 0, otherwise just return the existing value.

if (column1.isNull()) then 0 else column1

 

If the value in a column is not null return true else false

column1.isNull().isFalse()

 

If the value in a column is null or the value is not “something”, AND the value in another column is less than or equal to 5, return 1, otherwise return 0.

if (column1.isNull() or column1 != "something" and column2 <=5)

 

If the value in a column is something, say “hello!”, otherwise set the value as null

if (column1 == "something") then "hello!" else column1.null()

If a column contains "a certain text string" then "a certain text string" should be returned, else if a column contains "a different text string" then "a different text string" should be returned. (This allows you to show whether certain text is present.

if column1.contains("something") then "something"
else if column1.contains ("something else") then "something else"

 

A note on negative numbers: If you want to use negative numbers, i.e. you want to see if the values in column1 are greater than -2, you'll need to write it as:
if (column1 > (0-2)) then "greater than -2" else "less than -2"

 

Aggregate functions 

This is the equivalent of using the Summarize button, but adding the result to a new column.

Count rows

count_rows()

 

Count rows, grouped by a column

count_rows_by(column1)

 

Sum values across rows

This is useful if you want to calculate the percentage and need to sum the values in a particular column. 

sum(column1)

 

Sum values across rows, grouped by a column

sum_by(column1, column2)

 

Can also do:

max(column1)

max_by(column1, column2)

min(column1)

min_by(column1)

Timestamps

Note: Some of the following functions (relating to intervals e.g. 3 days / 2 months) won’t work for SQL Server and Big Query databases. Pleaser scroll to the bottom of this section for Big Query and SQL Server specific functions, or chat with us for a workaround.

Examples

Convert to datetime

column1.toDateTime()

 

Convert to a given timezone

u.created_at.toTimeZone("America/Los_Angelse")

Support timezones: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones

 

Take one away from another 

diff_in_seconds(date1, date2)
Check out a sample query using this here

 

Convert a date to datetime:

date.toDateTime()

 

Convert a datetime to just date (i.e. remove the time from the column):

column1.toDate()

 

Subtract the created_at date from the current time

now() - u.created_at

 

Add 1 hour to a datetime

column1 + 1.hour()

 

Add 1 day to a datetime

column1 + 1.days()

 

*If the date column is not datetime and you want to add an hour / date to it, just convert it to datetime first using .toDateTime()

 

For SQL Server and Big Query databases, please use the following functions:

add_years(1, created_at)

add_months(5, created_at)

add_weeks(2, created_at)

add_days(3, created_at)

add_hours(1, created_at)

add_minutes(5, created_at)

add_seconds(20, created_at)

subtract_year(3, created_at)

---

Check if a date is in the future. If it is, return “the future”, otherwise return “the past”

if (u.created_at > now()) then "the future" else "the past"

 

Formatting timestamps as text strings

r.created_at.day()

r.created_at.week()

r.created_at.month()

r.created_at.year()

 

Rounding down to closest day/week/month/year

r.created_at.startOfDay()

r.created_at.startOfWeek()

r.created_at.startOfMonth()

r.created_at.startOfYear()

Nullable values (any data type)

Examples

r.customer_id.isNull() r.customer_id.isNull()

r.customer_id.isNull().isFalse()

Assign a value if a column has a null value.

column1.ifNull("hello")

Assign the value of another column if a column has a null value. 

r.column1.ifNull(column2)

If the other column has a NULL value, assign the value “unknown”

r.column.ifNull(column2).ifNull("unknown")

JSON

Access key values

column1.key("key-name").key("key-name")

Access array values

my_json_array.at(0)

MySQL

json_extract(column, "\$[0]")

json_extract(column, "\$.name")

See this official guide on what you can pass to the second argument.

Convert to other column types

To convert to a string (from any column type):  

column1 + ""

To convert from an Int to a float: 

column2 + 0.0

Common errors

Expected type FLOAT, but found INT

It’s expecting a floating point number e.g. (5.0) but finding an integer (e.g. 5). 

Solution: if your number is 5, change it to 5.0. If your number is 10, change it to 10.0 etc. 

 

Glossary

STRING

.truncate(INT) returns STRING

.substring(INT) returns STRING

.substring(INT, INT) returns STRING

.position(STRING) returns INT

.length() returns INT 

.isEmpty() returns BOOL

.trim() returns STRING

.ascii() returns INT

.uppercase() returns STRING

.lowercase() returns STRING

.contains(STRING) returns BOOL

.startsWith(STRING) returns BOOL

.endsWith(STRING) returns BOOL

.toInt() returns INT

.toFloat() returns FLOAT

.toDate() returns DATE

.toUUID() returns UUID

.extract(STRING) returns STRING [postgres only] using regular expression

.regexSubstring(STRING) returns STRING [redshift only]

.split(STRING) returns ARRAY

.replace(STRING,STRING) returns STRING

.properCaps() returns String

.toJson() returns JSON

.isOneOf(STRING, STRING, STRING, ...) returns BOOL

 

Example:

"abracadabra".substring(2).truncate(3) returns STRING

 

INT

sum(INT) returns INT

sum_by(INT, OTHER) returns INT

count_rows() returns INT

count_rows_by(OTHER) returns INT

max(INT) returns INT

max_by(INT, OTHER) returns INT

min(INT) returns INT

min_by(INT, OTHER) returns INT

.pretty() returns STRING

.toString() returns STRING

.seconds() returns INTERVAL

.minutes() returns INTERVAL

.hours() returns INTERVAL

.days() returns INTERVAL

.weeks() returns INTERVAL

.months() returns INTERVAL

.years() returns INTERVAL

.isOneOf(INT, INT, INT, ...) returns BOOL

.abs() returns INT

(The above don’t work for SQL Server databases)

 

Example:

now() - 5.days() returns DATETIME

 

Note: if you get the error 'integer out of range' you can do a little hack and convert the column to a float. 

 

FLOAT

sum(FLOAT) returns FLOAT

sum_by(FLOAT, OTHER) returns FLOAT

count_rows() returns INT 

count_rows_by(OTHER) returns INT

max(FLOAT) returns FLOAT

max_by(FLOAT, OTHER) returns FLOAT

min(FLOAT) returns FLOAT

min_by(FLOAT, OTHER) returns FLOAT

.pretty() returns STRING

.round() returns INT

.round(INT) returns FLOAT

.floor() returns INT

.ceil() returns INT

.abs() returns FLOAT

 

Example:

(53.98).round() returns INT

 

Convert a float to an Int

Round up using .ceil() or down using .floor()

 

BOOL

.isTrue() returns BOOL

.isFalse() returns BOOL

 

Example:

t.created_at < now() and t.id < 5 returns BOOL

 

NULLABLE

.isNull() returns BOOL

.ifNull(NULLABLE) returns NULLABLE

 

Example:

t.email.ifNull("no-email") returns STRING

 

DATE and DATETIME

.format(STRING) returns STRING:

Postgres: see official TO_CHAR documentation for format options

MySQL: see official DATE_FORMAT documentation for format options

Redshift: see official TO_CHAR documentation for format options

Snowflake: see official TO_CHAR documentation for format options

SQLServer: see official FORMAT documentation for format options

BigQuery: see official FORMAT_DATETIME documentation for format options

 

diff_in_seconds(DATE, DATE) return INT

diff_in_days(DATE, DATE) return INT

diff_in_weeks(DATE, DATE) return INT

diff_in_months(DATE, DATE) return INT

diff_in_years(DATE, DATE) return INT

now() returns DATETIME

date(STRING) returns DATE

datetime(STRING) returns DATETIME

max(DATETIME) returns DATETIME

max_by(DATETIME, OTHER) returns DATETIME

min(DATETIME) returns DATETIME

min_by(DATETIME, OTHER) returns DATETIME

.secondsSinceEpoch() returns INT

.second() returns INT

.minute() returns INT

.hour() returns INT

.date() returns INT

.month() returns INT

.weekOfYear() returns INT

.day() returns STRING

.dayOfWeek() returns INT

.weekday() returns STRING

.week() returns STRING

.year() returns STRING

.startOfDay() returns DATETIME

.startOfWeek() returns DATETIME

.startOfMonth() returns DATETIME

.startOfYear() returns DATETIME

.toDate() returns DATE

.toDateTime() returns DATETIME

.toString() returns STRING

.toTimeZone(STRING) returns DATETIME

 

Example:

now() - date("2018-01-25").toDateTime() returns INTERVAL

 

Input your own date

(“25/02/2018”).toDate()

 

INTERVAL

.toSeconds() .toSecond() returns INT

.getMonths() .getMonths() returns INT

.getYears() .getYears() returns INT

 

Example:

((now() - t.created_t.toSeconds() /60/60/24).round() returns INT

 

JSON

.at(INT) returns JSON

.key(STRING) returns JSON

.toString() returns STRING

.length() returns INT

 

Example:

t.json_data.key("items").at(0).key("name").toString() returns STRING

 

UUID

.toString() returns STRING

 

ARRAY

.toString() .toString() returns STRING

.length() .length() return INT

.at(INT) .at(INT) returns STRING

 

Example:

t.items.length() > 0 then t.items.at(0) else "missing"

 

Use case

Find the average of multiple dates 

(E.g. average of 1/06/18 and 30/06/18 = 15/06/18).

This is currently a fun one :)

 - Click on your date column header, select more options, and select seconds since 1970.

 - This will add a new column with the number of seconds between 1970 and each date. 

 - Calculate the average of that column. 

 - Click the new column button and add this big ol’ custom expression:

date("1970-01-01").toDateTime() + column1.round().seconds()

(where column1 = the average column we created in step 2)

 

Done! You’ve found the average of multiple dates 🎉

Help! I'm getting an error 
mismatched input '.' expecting OPEN_BRACE
This error is referring to a '.' within a column name. Our "new column" builder does not recognise funky column names with dots in them. If you use the "rename" step first to change the name of your column (just click the column and "rename" will be in the dropdown), then you should be able to complete the "new column" function. 

Was this article helpful?

0 out of 0 found this helpful

Have more questions? Submit a request

Comments

0 comments

Please sign in to leave a comment.