Forensic Files is a documentary-style, true crime series which reveals how forensic science is used to solve violent crimes, mysterious accidents, and even outbreaks of illness. Being a fan and most likely having watched all of the episodes in the series (more than once!) I was inspired to build a database to be used as a source of information about the show.

What I had in mind was to have the ability to assign words and/or phrases to each episode so that I can perform searches based in terms or phrases – such as “anthropology”, “dumpster”. “ravine” or even “cold case” and have my query return a list of all episodes that include content related to those topics.

The following outlines the solution I came up with.

Choosing the Technology

For this exercise I decided to use PostgreSQL, which is a powerful, open-source object-relational database system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance. I’ve used PostgreSQL before, but I wanted to make sure to use the most recent version so the first thing I did was to run the following SQL command:

SELECT version();

Which yields the following result and yes, as of this writing, 15.2 is the latest version:

My other choice was which SQL management tool to use, and I opted for DBeaver  (my favorite). I’ve written about both PostgreSQL and DBeaver in the past so I will not spend any time describing them here and just get down to work.

First Step

The first step (I already had a database to use) was to create a new table to hold the Forensic Files episode data:

  • Season number (which season did this episode first aired)
  • The number of the episode within the entire series (all seasons) – this will be the episodes unique ID)
  • The number of the episode within the season
  • The title of the episode
  • The original air date of the episode
  • Data” – which I am planning to use to hold words and/or phrases I want to associate with the episode (more on this in a bit)

The following SQL statement was used to create my “episode” table:

DROP TABLE IF EXISTS public.Forensic_Files_Episodes;

CREATE TABLE public.Forensic_Files_Episodes (

season INT NOT NULL,

number_in_series INT NOT NULL,

number_in_season INT NOT NULL,

title VARCHAR ( 255 ) UNIQUE NOT NULL,

air_date DATE NOT NULL,

popularity  numeric,

data jsonb

);

 

Once I had defined my table, I wanted to load it with data. In DBeaver, you can easily import CSV’s  to a table that matches the columns that the comma-separated values or “CSV” file contains. To do this, you right click the table, and choose “Import Data“:

My file of Forensic Files Episode data contains all of the fields I am interested in (season, the number of episode within the series (all seasons), the number of the episode within the season, the episode title and the original air date) so I can easily import the data :

After the data import completes, I wanted to check the table’s content, so I used this SQL Statement:

select * from public.forensic_files_episodes;

 

Adding Popularity Ratings

I had a second data file which contained the viewer popularity rating for each episode and wanted to add that information to my table so I again used the DBeaver Import Data feature and this time, indicated CREATE for the target table so I ended up with a new table named “forensic_files_popularity”.

After this import (completes), I checked the new table with the following SL statement:

select * from forensic_files_popularity;

Now that I had loaded the episode popularity ratings, I updated the forensic_files_popularity.popularity column from the forensic_files_popularity table (matching on the “number_in_series” column) with the following SQL Statement:

 

UPDATE forensic_files_episodes

SET popularity  = forensic_files_popularity.popularity

FROM forensic_files_popularity

WHERE forensic_files_episodes.number_in_series  = forensic_files_popularity.number_in_series;

 

At that point I can query the list episodes by a variety of “filters” such as air date, season, number in series, the title or even popularity:

 

select number_in_series, title, popularity from public.Forensic_Files_Episodes

where public.Forensic_Files_Episodes.popularity > 5.0

order by popularity DESC;

 

Adding Tags

JSON is an open standard file format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and “arrays”. PostgreSQL introduced the JSONB data type in version 9.4 (a while back!) and warrants a closer look from the reader.

 

I initially defined a “data” column of this type as part of my forensic_files_episodes table:

So, this now allowed me to define and associate key phrases “tagging” each episode. Below are some sample SQL statements I used to assign an “alternate name” as well as associate multiple key phrases (or tags) for each episode:

 

UPDATE forensic_files_episodes set data =

(‘{“name”: “mothers diary”, “tags”: [“Haunted”, “disappearance of her mother”, “twenty years earlier”, “young woman”, “undertook an investigation”, “cold case”], “finished”: true}’)

where forensic_files_episodes.number_in_series = 198;

 

UPDATE forensic_files_episodes set data =

(‘{“name”: “Killigraphy”, “tags”: [“autopsy”,”suspicious”,”death”,”Georgia”,”mentally-unstable”,”husband”,”murder”,”medical examiner”], “finished”: true}’)

where forensic_files_episodes.number_in_series = 61;

 

UPDATE forensic_files_episodes set data =

(‘{“name”: “A Leg to Stand On”, “tags”: [“Severed leg”,”dumpster”,”investigators”,”identifying the victim”,”anthropology”,”toxicology”,”DNA”,”police”,”victim”,”follow the trail”,”evidence”], “finished”: true}’)

where forensic_files_episodes.number_in_series = 115;

Searching by Key Phrase

After adding the “tags” I was ready to do some searching! I can search by any phrase, such as “blood splatter” or “DNA” or how about a partial phrase like “invest”:

 

select t.title

from forensic_files_episodes t

where exists (select *

from jsonb_array_elements_text(t.data -> ‘tags’) as t(tagme)

where t.tagme like ‘%invest%’);

 

Conclusion

This exercise really didn’t take very long to complete and demonstrates a simple solution for using JSONb data types with PostgreSQL to allow searching by tags – and now I am ready for next time I want to binge watch ONLY those Forensic Files episodes that deal with square dancing doctors from New York!