In this short article, I cover some key concepts that will help you understand your database.
To provide context for examples, I'll refer to a fictitious company called GBC that sells ridiculously expensive giant beanbag chairs.
Let’s jump in.
What is your database?
Your database is a rich store of information about your business, organised into one or multiple tables. In Trevor, you can see these tables in the left menu (click one to view its contents). You might have, for example:
- A Customer table, with information about your customers.
- An Orders table, with the details of each of your sales.
- A Marketing Channels table, with each of the potential marketing channels sell through.
The tables in your database contain rows and columns. The rows represent instances of what the table itself represents, and the columns represent values attributed to each instance.
So, for example, in the Customer table, each row might represent – you guessed it – a different customer, and the columns might be "name", "email address", "age" and so on. In the Transaction table, each row might represent a transaction, and the columns might be "date", "amount paid" etc.
Why store data in separate tables?
Put another way, why not just have one really massive table with lots of rows and columns?
In a nutshell, it's because, ideally, you only want to have one version/instance of each piece of data, to make it simpler to maintain and easier to ensure consistency. For example, consider the following table:
As you can see, this table contains the details of each of GBC’s customers, the super expensive beanbag chair they purchased, and the price they paid.
Now, imagine it had 20,000 rows instead of just 3 (excluding the header). If we wanted to change the name or details of one of the products, for instance, we’d have to change it in every single row in which it appeared. How annoying would that be? And, where would we add a new product that no-one had purchased yet (like a bright orange, massage bean bag chair, ribbed for your pleasure)? The whole thing would be a pain.
So, rather than creating one table containing all the information above, we might create two, like so:
A Product Table, in which each row represents a different product:
And a Customer Table, in which each row represents a different customer:
Having the information in two tables, like this, makes retrieving, modifying, and adding to the data a much simpler process for the code whisperers in your wonderful engineering team (your database is also more secure, as tables containing certain sensitive information, for instance, can be hidden). Win!
Connecting related data
But wait, a wee challenge has popped up. With the two tables above, how can we see which product is associated with which customer?
The answer is straightforward: you may have noticed that the Product Table has a column called ID – this is just a unique reference, called a primary key, for each of the rows in this table. If you check the Product table, you'll see that we’ve used this ID again in the Product_ID column (when used in another table, a primary key becomes known as a foreign key). This shows that the rows in the Product table are linked to the rows in the Customer table as per the unique values in these columns.
So, for example, looking at both tables, we can see that Rachel Hathaway, whose email address is [email protected], purchased a Red Bean Bag Chair for $10,100.
You can also see that the Customer Table contains its own primary key, which could be used in other tables to link rows to these. For instance, if we had another table containing details of any coupon codes we’d launched, we could add a Customer_ID column to that table to show which customers had used them.
And BOOM! This is the basis for what’s called a relational database.
Two key takeaways
We’ve used very simple examples, but hopefully it’s helped you understand that:
- Your data is organised into one or more tables.
- Relationships exist between some of these tables. Trevor makes it really easy for you to understand these relationships and find answers to questions that rely on data from multiple tables.