|
|
||
|---|---|---|
| .. | ||
| 14.1. Multiple apartments.md | ||
| README.md | ||
| cheatsheet.md | ||
README.md
14. Databases
Basic SQL
SELECT CITY FROM STATION WHERE CODE = '123'
Get distinct cities:
SELECT DISTINCT CITY FROM STATION WHERE CODE = '123'
Count number of distinct cities
SELECT count(DISTINCT CITY) FROM STATION WHERE CODE = '123'
Apart from count, there is also sum, avg, ceil, floor.
Get the shortest city name and longest city name
SELECT CITY, LENGTH(CITY) FROM STATION
ORDER BY LENGTH(CITY), CITY asc
limit 1;
SELECT CITY, LENGTH(CITY) FROM STATION
ORDER BY LENGTH(CITY), CITY desc
limit 1;
Get cities ending with vowels
SELECT DISTINCT CITY FROM STATION WHERE city REGEXP "[aeiou]$"
Select substring of column. The first two characters. 1 means first character, 2 means length. SQL has an index system starting from 1. To get the last ones, use -X.
SELECT SUBSTR(NAME, 1, 2) FROM TABLE
Output a string for each case
SELECT
CASE
WHEN A >= (B + C) OR B >= (A + C) OR C >= (A + B) THEN 'Not A Triangle'
WHEN A = B AND A = C THEN 'Equilateral'
WHEN A = B OR B = C OR A = C THEN 'Isosceles'
ELSE 'Scalene'
END
FROM TRIANGLES;
Get the list of top earning employees, and the number of employees that earn this amount
SELECT earnings, count(*) FROM Employee
GROUP BY earnings
ORDER BY earnings desc
limit 1;
SELECT COUNTRY.Continent, FLOOR(AVG(CITY.Population))
FROM CITY INNER JOIN COUNTRY
ON CITY.CountryCode=COUNTRY.Code
GROUP BY COUNTRY.Continent;
Book SQL
SELECT CourseName, TeacherName FROM Courses, Teachers WHERE Courses.TeacherID = Teachers.TeacherID
Normalized databases are designed to minimize redundancy, while denormalized databases are designed to optimize read time. We can denormalize the databases by storing redundant data and avoid doing many joins.
As an example, we have this database. * indicates a primary key.
Courses: CourseID*, CourseName, TeacherID
Teachers: TeacherID*, TeacherName
Students: StudentID*, StudentName
StudentCourses: CourseID*, StudentID*
Query 1: student enrollment
Get a list of all students and how many courses each student is enrolled in
SELECT StudentName, Students.StudentID, count(StudentCourses.CourseID) as [Cnt]
FROM Students LEFT JOIN StudentCourses
ON Students.StudentID = StudentCourses.StudentID
GROUP BY Students.StudentID, Students.StudentName
For reasons and incorrect implementations and their justification, see chaper 14 in book.
Query 2: Teacher class size
Get a list of all teachers and how many students they teach. If a teacher teaches the same student in two courses, double count the student. Sort the list in descending order of the number of students a teacher teaches
SELECT TeacherName, isnull(StudentSize.Number, 0)
FROM Teachers LEFT JOIN
(SELECT TeacherID, count(StudentCourses.CourseID) AS [Number]
FROM Courses INNER JOIN StudentCourses
ON Courses.CourseID = StudentCourses.CourseID
GROUP BY Courses.TeacherID) StudentSize
ON Teachers.TeacherID = StudentSize.TeacherID
ORDER BY StudentSize.Number DESC
Small database design
How to design a small database
- Handle ambiguity: understand exactly what you need to design, consult with the interviewer
- Define the core objects: typically each core object translates into a table
- Analyze relationships: how tables are connected to each other
- Investigate actions: walk through the common actions that will be taken and understand how to store and retrieve the relevant data.
Large database design
When designing a large, scalable database, joins are generally very slow. You must denormalize your data. Duplicate the relevant data in many tables.