To combine data from two tables we use an SQL SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 clause, which comes after the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
8 clause.Database tables are used to organize and group data by common characteristics or principles. Often, we need to combine elements from separate tables into a single tables or queries for analysis and visualization. A JOIN is a means for combining columns from multiple tables by using values common to each. The JOIN keyword combined with ON is used to combine fields from separate tables. A SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 clause on its own will result in a cross product, where each row in the first table is paired with each row in the second table. Usually this is not what is desired when combining two tables with data that is related in some way.For that, we need to tell the computer which columns provide the link between the two tables using the word SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
0. What we want is to join the data with the same species id.SELECT *
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
0 is like SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
2. It filters things out according to a test condition. We use the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
3 format to tell the manager what column in which table we are referring to.The output from using the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 clause will have columns from the first table plus the columns from the second table. For the above statement, the output will be a table that has the following column names:record_idmonthdayyearplot_idspecies_idsexhindfoot_lengthweightspecies_idgenusspeciestaxa … 96820199712DMM3641DMDipodomysmerriamiRodent … Alternatively, we can use the word SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
5, as a short-hand. SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
5 only works on columns which share the same name. In this case we are telling the manager that we want to combine SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
7 with SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
8 and that the common column is SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
9.SELECT *
FROM surveys
JOIN species
USING (species_id);
The output will only have one species_id column record_idmonthdayyearplot_idspecies_idsexhindfoot_lengthweightgenusspeciestaxa … 96820199712DMM3641DipodomysmerriamiRodent … We often won’t want all of the fields from both tables, so anywhere we would have used a field name in a non-join query, we can use SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
3.For example, what if we wanted information on when individuals of each species were captured, but instead of their species ID we wanted their actual species names. SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
yearmonthdaygenusspecies … 1977716Neotomaalbigula 1977716Dipodomysmerriami … Many databases, including SQLite, also support a join through the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
2 clause of a query. For example, you may see the query above written without an explicit JOIN.SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
For the remainder of this lesson, we’ll stick with the explicit use of the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 keyword for joining tables in SQL.Challenge:- Write a query that returns the genus, the species name, and the weight of every individual captured at the site
SolutionSELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
Different join typesWe can count the number of records returned by our original join query. SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
Notice that this number is smaller than the number of records present in the survey data. SELECT COUNT(*) FROM surveys;
This is because, by default, SQL only returns records where the joining value is present in the joined columns of both tables (i.e. it takes the intersection of the two join columns). This joining behaviour is known as an SELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
3. In fact the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 keyword is simply shorthand for SELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
3 and the two terms can be used interchangably as they will produce the same result.We can also tell the computer that we wish to keep all the records in the first table by using a SELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
6 clause, or SELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 for short.Challenge:- Re-write the original query to keep all the entries present in the
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
7 table. How many records are returned by this query?
SolutionSELECT * FROM surveys
LEFT JOIN species
USING (species_id);
Challenge:- Count the number of records in the
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
7 table that have a SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 value in the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
9 column.
SolutionSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
Remember: In SQL a SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 value in one table can never be joined to a SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 value in a second table because SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 is not equal to anything, not even itself.Combining joins with sorting and aggregationJoins can be combined with sorting, filtering, and aggregation. So, if we wanted average mass of the individuals on each different type of treatment, we could do something like SELECT plots.plot_type, AVG(surveys.weight)
FROM surveys
JOIN plots
ON surveys.plot_id = plots.plot_id
GROUP BY plots.plot_type;
Challenge:- Write a query that returns the number of animals caught of each genus in each plot. Order the results by plot number (ascending) and by descending number of individuals in each plot.
SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
0
Challenge:- Write a query that finds the average weight of each rodent species (i.e., only include species with Rodent in the taxa field).
SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
1
Functions SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 and SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
6 and moreSQL includes numerous functions for manipulating data. You’ve already seen some of these being used for aggregation ( SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
7 and SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
8) but there are functions that operate on individual values as well. Probably the most important of these are SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 and SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
6. SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 allows us to specify a value to use in place of SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0.We can represent unknown sexes with SELECT COUNT(*) FROM surveys;
3 instead of SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0:SELECT *
FROM surveys
JOIN species
USING (species_id);
2The lone “sex” column is only included in the query above to illustrate where SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 has changed values; this isn’t a usage requirement.Challenge:SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
3
Challenge:- Write a query that calculates the average hind-foot length of each species, assuming that unknown lengths are 30 (as above).
SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
4
SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 can be particularly useful in SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7. When joining the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
8 and SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
7 tables earlier, some results were excluded because the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
9 was SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 in the surveys table. We can use SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 to include them again, re-writing the SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 to a valid joining value:SELECT *
FROM surveys
JOIN species
USING (species_id);
5Challenge:- Write a query that returns the number of animals caught of each genus in each plot, assuming that unknown species are all of the genus “Rodent”.
SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
6
The inverse of SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
5 is SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
6. This returns SELECT COUNT(*)
FROM surveys
JOIN species
USING (species_id);
0 if the first argument is equal to the second argument. If the two are not equal, the first argument is returned. This is useful for “nulling out” specific values.We can “null out” plot 7: SELECT *
FROM surveys
JOIN species
USING (species_id);
7Some more functions which are common to SQL databases are listed in the table below: FunctionDescriptionSELECT * FROM surveys
LEFT JOIN species
USING (species_id);
9Returns the absolute (positive) value of the numeric expression nSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
0Returns the first of its parameters that is not NULLSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
1Returns the length of the string expression sSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
2Returns the string expression s converted to lowercaseSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
3Returns NULL if x is equal to y, otherwise returns xSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
4 or SELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
5Returns the numeric expression n rounded to x digits after the decimal point (0 by default)SELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
6Returns the string expression s without leading and trailing whitespace charactersSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
7Returns the string expression s converted to uppercaseFinally, some useful functions which are particular to SQLite are listed in the table below: FunctionDescriptionSELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
8Returns a random integer between -9223372036854775808 and +9223372036854775807.SELECT COUNT(*)
FROM surveys
WHERE species_id IS NULL;
9Returns the string expression s in which every occurrence of f has been replaced with rSELECT plots.plot_type, AVG(surveys.weight)
FROM surveys
JOIN plots
ON surveys.plot_id = plots.plot_id
GROUP BY plots.plot_type;
0 or SELECT plots.plot_type, AVG(surveys.weight)
FROM surveys
JOIN plots
ON surveys.plot_id = plots.plot_id
GROUP BY plots.plot_type;
1Returns the portion of the string expression s starting at the character position x (leftmost position is 1), y characters long (or to the end of s if y is omitted)Challenge:Write a query that returns genus names (no repeats), sorted from longest genus name down to shortest. SolutionSELECT *
FROM surveys
JOIN species
USING (species_id);
8
As we saw before, aliases make things clearer, and are especially useful when joining tables. SELECT *
FROM surveys
JOIN species
USING (species_id);
9To practice we have some optional challenges for you. Challenge (optional):SQL queries help us ask specific questions which we want to answer about our data. The real skill with SQL is to know how to translate our scientific questions into a sensible SQL query (and subsequently visualize and interpret our results). Have a look at the following questions; these questions are written in plain English. Can you translate them to SQL queries and give a suitable answer? How many plots from each type are there? How many specimens are of each sex are there for each year, including those whose sex is unknown? How many specimens of each species were captured in each type of plot, excluding specimens of unknown species? What is the average weight of each taxa? What are the minimum, maximum and average weight for each species of Rodent? What is the average hindfoot length for male and female rodent of each species? Is there a Male / Female difference? What is the average weight of each rodent species over the course of the years? Is there any noticeable trend for any of the species?
Proposed solutions:- Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
0 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
1 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
2 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
3 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
4 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
5 - Solution:
SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
6
Key PointsUse a SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 clause to combine data from two tables—the SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
0 or SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys, species
WHERE surveys.species_id = species.species_id;
5 keywords specify which columns link the tables.Regular SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 returns only matching rows. Other join clauses provide different behavior, e.g., SELECT species.genus, species.species, surveys.weight
FROM surveys
JOIN species
ON surveys.species_id = species.species_id;
7 retains all rows of the table on the left side of the clause.
Which of the following are used to combine the records from multiple tables?
A JOIN clause is used to combine rows from two or more tables, based on a related column between them.
Which of the following can be used to combine multiple tables in a single query?
Joins are used to combine the rows from multiple tables using mutual columns.
Which query is used to retrieve data from multiple tables?
In SQL, to fetch data from multiple tables, the join operator is used.
Which of the following query combines data records from two or more tables?
Answer: C.
A Cartesian join between two tables returns every possible combination of rows from the tables. A Cartesian join can be produced by not including a join operation in the query or by using a CROSS JOIN.
|