SQL Tutorial: Analyzing Cereal Nutrition Data

In this article, we will analyze a dataset of cereal nutrition information using SQL. The focus lies in identifying cereals with high protein content, a crucial factor for a healthy and satisfying breakfast.

This dataset contains nutritional information for various Cereal brands. Source

Run the SQL commands here to populate the data.

The CSV dataset was converted to SQL using the CSV to JSON converter.


Finding the Top Protein-Rich Cereals


The primary objective is to find cereals with the highest protein content relative to their calorie count. To achieve this, the we will calculate the protein to calorie ratio for each cereal.
select id, name, calories, protein, round(protein/calories, 2) as protein_cal_ratio from cereal 
	order by protein_cal_ratio desc limit 5;

This query performs the following steps:
  • Calculates Protein to Calorie Ratio: ROUND(protein / calories, 2) calculates the ratio of protein to calories for each cereal and rounds the result to two decimal places. The calculated ratio is assigned to the alias protein_cal_ratio.
  • Orders Results: ORDER BY protein_cal_ratio DESC sorts the results in descending order based on the calculated ratio. This places cereals with the highest protein to calorie ratio at the top.
  • Limits Results: LIMIT 5 restricts the output to the top 5 cereals with the highest protein to calorie ratio.

Result

All-Bran with Extra Fiber, 4 grams of protein for just 50 cals.

id name calories protein protein_cal_ratio
4 All-Bran with Extra Fiber 50 4 0.08
3 All-Bran 70 4 0.06
1 100% Bran 70 4 0.06
12 Cheerios 110 6 0.05
58 Quaker Oatmeal 100 5 0.05

SQL Output