The Mighty Cheatsheet
Welcome to the Mighty Cheatsheet! This guide will walk you through some of the different formulas and expressions that make Trevor such a powerful tool. If you have any questions or find something below a little tricky to write (even we do sometimes!) just message us via our chat in the bottom right.
Not familiar with your database, including data types? Check this post.
For specific methods (e.g. how to count unique values), head to the How do I...? section.
How to use the cheatsheet:
Intro: Data types
Each column has a data type, which defines the kind of values the column can contain:
- String = text (e.g. "Hello to you")
- Int = a whole number (e.g. 10, 20, 30)
- Timestamp = date and time of day (e.g. 25/01/2019 14:25:03)
- Float = a number with a fractional part (e.g. 4.7)
- Bool = a value of either true or false.
- Interval = a time interval (e.g. 5 days)
- JSON = a data type used for storing and transporting data. JSON is made up of key-value pairs between curly brackets (e.g. {"Book Title": "Ready Player One}).
- Array - a data type used for storing multiple values, enclosed in square brackets ([]).
Numbers
You can use Trevor as a calculator to perform arithmetic across columns.
Examples
(column1/column2) * 100
column1 - column2
( 5 * ( column1 * 8 ) ) - 10
Convert a negative number to a positive one: number * (0-1)
Text strings
You can add to, manipulate, and compare text columns.
Examples:
Remove spaces at the start or end of a string: column1.trim()
Add a word to a column: column1 + "hello"
Combine text from two columns, e.g. column1 is first names and column2 is second names:column1+("")+column2
Replace a character: hello world".replace("", "_")
You can also chain functions. For example, if you have a column structured like: ["column1"] and you want to remove [" and "] to get just column1 you can use the following:
column1.replace("[\"","").replace("\"]","")
Extract the domain from an email address:
column1.extract("@(.*)")
column.substring(column1.position("@")+1
Extract everything before a space:
column1.extract("(.*)")
column1.truncate(column1.position(" "))
Extract everything after a space:
column1.extract(" (.*)")
column1.substring(column1.position(" ")+1
Extract the first word
column1.split(" ").at(0)
Extract the characters after the 5th character: column1.substring(4)
Extract 4 characters after the 5th character: column1.substring(4, 4)
(The second parameter is the number of characters to extract)
Extract only the 5th character:
column1.substring(4).truncate(1)
column1.substring(4,1)
Extract only the 1st character:
column1.substring(0).truncate(1)
column1.substring(0, 1)
Extract the last 2 characters: column1.substring(column1.length() -2)
Extract the first 5 characters after a space:column1.substring(column1.position(" ")+1, 5)
Extract everything before a number: column1.extract("^([^0-9]*)")
(Note: Not supported by SQL Server databases)
Extract the last word in a sentence: column1.extract("\S+\$")
Calculate length of text: column1.length()
Convert to lowercase:column1.lowercase()
Convert to uppercase: column1.uppercase()
Return true if the value in one column equals that of another:column1 == column 2
Return true if the value in one column contains a specific value: column1.contains("hello")
Return true if the value in one column contains the value of another column: column1.contains(column2)
Advanced
u.email + " " _ u.status
"one two three".split(" ")
(Note: Not supported by SQL Server databases)
r.column1 == r.column2
r.column1 != r.column2
Redshift databases
column1.regexSubstring(".*,")
extracts everything after a comma
(you can replace the comma with any other character)
column1.regexSubstring(",.*")
extracts everything before a comma
If then else
Trevor supports if then else statements. Check out an example of using these in a query here.
Examples
If the value in a column is 2, return “two!”. If it isn’t, return ‘not two!”:
if (column1 == 2) then "two!" else "not two"
-------------------------------------------
If the value in a column is “hello” AND the value of another column is “to you”, return 1, otherwise return 0.
if (column1 == "hello" and column2 == "to you") then 1 else 0)
Common next step:
Using the calculate button: Sum the values of the new column. This will give us the total number of rows where column1 is “hello” AND column2 is “to you”
-------------------------------------------
If the value in a column is “hello” AND the value of another column is “to you” OR “to me”, return 1, otherwise return 0.
if (column1 == "hello") and (column2 == "to you" or column3 == "to me")) then 1 else 0
If you want to make multiple IF ‘x’ THEN ‘y’ statements, you may want to create a ‘nested if’ statement. This will allow you to create more complex rules in your filtering and other functions.
E.g. if the value in a column is greater than 90, return "A+", if it's greater than 80, return "A", if it's greater than 70, return "B", and so on.
if ((column1 > 90)) then "A+" else if (column 1 > 80) then "A" else if (column 1 > 70) then "B" else "Other score"
If the value of the middlename column is empty, return “EMPTY!”, otherwise return the existing column value.
if (middlename.isEmpty() then "EMPTY" else "middlename"
If the value in a column is less than 0, return “negative”. If it’s greater than 0, return “positive”. If it’s 0, return “zero”.
if (column1 <0) then "negative" else if (column1 > 0) then "positive" else "zero"
If the value in a column is true, return “woo”, otherwise return “boo”
if (column1) then "woo!" else "boo"
If the value in a column is false, return “oh look, it’s false”, otherwise return “nothing to see here...”
if (column1.isFalse()) then "oh look, it’s false!" else "nothing to see here..."
If the value in a column is true, return 1, otherwise return 0
if (column1) then 1 else 0
If the value in a column is null, return 0, otherwise just return the existing value.
if (column1.isNull()) then 0 else column1
If the value in a column is not null return true else false
column1.isNull().isFalse()
If the value in a column is null or the value is not “something”, AND the value in another column is less than or equal to 5, return 1, otherwise return 0.
if (column1.isNull() or column1 != "something" and column2 <=5)
If the value in a column is something, say “hello!”, otherwise set the value as null
if (column1 == "something") then "hello!" else column1.null()
If a column contains "a certain text string" then "a certain text string" should be returned, else if a column contains "a different text string" then "a different text string" should be returned. (This allows you to show whether certain text is present.
if column1.contains("something") then "something"
else if column1.contains ("something else") then "something else"
A note on negative numbers: If you want to use negative numbers, i.e. you want to see if the values in column1 are greater than -2, you'll need to write it as:
if (column1 > (0-2)) then "greater than -2" else "less than -2"
Aggregate functions
This is the equivalent of using the Summarize button, but adding the result to a new column.
Count rows
count_rows()
Count rows, grouped by a column
count_rows_by(column1)
Sum values across rows
This is useful if you want to calculate the percentage and need to sum the values in a particular column.
sum(column1)
Sum values across rows, grouped by a column
sum_by(column1, column2)
Can also do:
max(column1)
max_by(column1, column2)
min(column1)
min_by(column1)
Timestamps
Note: Some of the following functions (relating to intervals e.g. 3 days / 2 months) won’t work for SQL Server and Big Query databases. Pleaser scroll to the bottom of this section for Big Query and SQL Server specific functions, or chat with us for a workaround.
Examples
Convert to datetime
column1.toDateTime()
Convert to a given timezone
u.created_at.toTimeZone("America/Los_Angelse")
Support timezones: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
Take one away from another
diff_in_seconds(date1, date2)
Check out a sample query using this here
Convert a date to datetime:
date.toDateTime()
Convert a datetime to just date (i.e. remove the time from the column):
column1.toDate()
Subtract the created_at date from the current time
now() - u.created_at
Add 1 hour to a datetime
column1 + 1.hour()
Add 1 day to a datetime
column1 + 1.days()
*If the date column is not datetime and you want to add an hour / date to it, just convert it to datetime first using .toDateTime()
For SQL Server and Big Query databases, please use the following functions:
add_years(1, created_at)
add_months(5, created_at)
add_weeks(2, created_at)
add_days(3, created_at)
add_hours(1, created_at)
add_minutes(5, created_at)
add_seconds(20, created_at)
subtract_year(3, created_at)
---
Check if a date is in the future. If it is, return “the future”, otherwise return “the past”
if (u.created_at > now()) then "the future" else "the past"
Formatting timestamps as text strings
r.created_at.day()
r.created_at.week()
r.created_at.month()
r.created_at.year()
Rounding down to closest day/week/month/year
r.created_at.startOfDay()
r.created_at.startOfWeek()
r.created_at.startOfMonth()
r.created_at.startOfYear()
Nullable values (any data type)
Examples
r.customer_id.isNull() r.customer_id.isNull()
r.customer_id.isNull().isFalse()
Assign a value if a column has a null value.
column1.ifNull("hello")
Assign the value of another column if a column has a null value.
r.column1.ifNull(column2)
If the other column has a NULL value, assign the value “unknown”
r.column.ifNull(column2).ifNull("unknown")
JSON
Access key values
column1.key("key-name").key("key-name")
Access array values
my_json_array.at(0)
MySQL
json_extract(column, "\$[0]")
json_extract(column, "\$.name")
See this official guide on what you can pass to the second argument.
Convert to other column types
To convert to a string (from any column type):
column1 + ""
To convert from an Int to a float:
column2 + 0.0
Common errors
Expected type FLOAT, but found INT
It’s expecting a floating point number e.g. (5.0) but finding an integer (e.g. 5).
Solution: if your number is 5, change it to 5.0. If your number is 10, change it to 10.0 etc.
Glossary
STRING
.truncate(INT)
returns STRING
.substring(INT)
returns STRING
.substring(INT, INT)
returns STRING
.position(STRING)
returns INT
.length()
returns INT
.isEmpty()
returns BOOL
.trim()
returns STRING
.ascii()
returns INT
.uppercase()
returns STRING
.lowercase()
returns STRING
.contains(STRING)
returns BOOL
.startsWith(STRING)
returns BOOL
.endsWith(STRING)
returns BOOL
.toInt()
returns INT
.toFloat()
returns FLOAT
.toDate()
returns DATE
.toUUID()
returns UUID
.extract(STRING)
returns STRING [postgres only] using regular expression
.regexSubstring(STRING)
returns STRING [redshift only]
.split(STRING)
returns ARRAY
.replace(STRING,STRING)
returns STRING
.properCaps()
returns String
.toJson()
returns JSON
.isOneOf(STRING, STRING, STRING, ...)
returns BOOL
Example:
"abracadabra".substring(2).truncate(3)
returns STRING
INT
sum(INT)
returns INT
sum_by(INT, OTHER)
returns INT
count_rows()
returns INT
count_rows_by(OTHER)
returns INT
max(INT)
returns INT
max_by(INT, OTHER)
returns INT
min(INT)
returns INT
min_by(INT, OTHER)
returns INT
.pretty()
returns STRING
.toString()
returns STRING
.seconds()
returns INTERVAL
.minutes()
returns INTERVAL
.hours()
returns INTERVAL
.days()
returns INTERVAL
.weeks()
returns INTERVAL
.months()
returns INTERVAL
.years()
returns INTERVAL
.isOneOf(INT, INT, INT, ...)
returns BOOL
.abs()
returns INT
(The above don’t work for SQL Server databases)
Example:
now() - 5.days()
returns DATETIME
Note: if you get the error 'integer out of range' you can do a little hack and convert the column to a float.
FLOAT
sum(FLOAT)
returns FLOAT
sum_by(FLOAT, OTHER)
returns FLOAT
count_rows()
returns INT
count_rows_by(OTHER)
returns INT
max(FLOAT)
returns FLOAT
max_by(FLOAT, OTHER)
returns FLOAT
min(FLOAT)
returns FLOAT
min_by(FLOAT, OTHER)
returns FLOAT
.pretty()
returns STRING
.round()
returns INT
.round(INT)
returns FLOAT
.floor()
returns INT
.ceil()
returns INT
.abs()
returns FLOAT
Example:
(53.98).round()
returns INT
Convert a float to an Int
Round up using .ceil()
or down using .floor()
BOOL
.isTrue()
returns BOOL
.isFalse()
returns BOOL
Example:
t.created_at < now() and t.id < 5
returns BOOL
NULLABLE
.isNull()
returns BOOL
.ifNull(NULLABLE)
returns NULLABLE
Example:
t.email.ifNull("no-email")
returns STRING
DATE and DATETIME
.format(STRING)
returns STRING:
Postgres: see official TO_CHAR documentation for format options
MySQL: see official DATE_FORMAT documentation for format options
Redshift: see official TO_CHAR documentation for format options
Snowflake: see official TO_CHAR documentation for format options
SQLServer: see official FORMAT documentation for format options
BigQuery: see official FORMAT_DATETIME documentation for format options
diff_in_seconds(DATE, DATE)
return INT
diff_in_days(DATE, DATE)
return INT
diff_in_weeks(DATE, DATE)
return INT
diff_in_months(DATE, DATE)
return INT
diff_in_years(DATE, DATE)
return INT
now()
returns DATETIME
date(STRING)
returns DATE
datetime(STRING)
returns DATETIME
max(DATETIME)
returns DATETIME
max_by(DATETIME, OTHER)
returns DATETIME
min(DATETIME)
returns DATETIME
min_by(DATETIME, OTHER)
returns DATETIME
.secondsSinceEpoch()
returns INT
.second()
returns INT
.minute()
returns INT
.hour()
returns INT
.date()
returns INT
.month()
returns INT
.weekOfYear()
returns INT
.day()
returns STRING
.dayOfWeek()
returns INT
.weekday()
returns STRING
.week()
returns STRING
.year()
returns STRING
.startOfDay()
returns DATETIME
.startOfWeek()
returns DATETIME
.startOfMonth()
returns DATETIME
.startOfYear()
returns DATETIME
.toDate()
returns DATE
.toDateTime()
returns DATETIME
.toString()
returns STRING
.toTimeZone(STRING)
returns DATETIME
Example:
now() - date("2018-01-25").toDateTime()
returns INTERVAL
Input your own date
(“25/02/2018”).toDate()
INTERVAL
.toSeconds() .toSecond()
returns INT
.getMonths() .getMonths()
returns INT
.getYears() .getYears()
returns INT
Example:
((now() - t.created_t.toSeconds() /60/60/24).round()
returns INT
JSON
.at(INT)
returns JSON
.key(STRING)
returns JSON
.toString()
returns STRING
.length()
returns INT
Example:
t.json_data.key("items").at(0).key("name").toString()
returns STRING
UUID
.toString()
returns STRING
ARRAY
.toString() .toString()
returns STRING
.length() .length()
return INT
.at(INT) .at(INT)
returns STRING
Example:
t.items.length() > 0 then t.items.at(0) else "missing"
Use case
Find the average of multiple dates
(E.g. average of 1/06/18 and 30/06/18 = 15/06/18).
This is currently a fun one :)
- Click on your date column header, select more options, and select seconds since 1970.
- This will add a new column with the number of seconds between 1970 and each date.
- Calculate the average of that column.
- Click the new column button and add this big ol’ custom expression:
date("1970-01-01").toDateTime() + column1.round().seconds()
(where column1 = the average column we created in step 2)
Done! You’ve found the average of multiple dates 🎉
Help! I'm getting an error
mismatched input '.' expecting OPEN_BRACE
This error is referring to a '.' within a column name. Our "new column" builder does not recognise funky column names with dots in them. If you use the "rename" step first to change the name of your column (just click the column and "rename" will be in the dropdown), then you should be able to complete the "new column" function.
Comments
0 comments
Please sign in to leave a comment.