A guide to Interactive Python dashboards using Bokeh

11 min read · Jun 3, 2021

I have spent a lot of time during the last couple of weeks at work sourcing and analyzing data to track business performance. However, when I finished the analysis, I realized that I had no way to present it other than as a CSV file.

A CSV file is not a good visualization tool when you want to communicate important findings. An easy alternative is to set up an Excel spreadsheet with some charts that communicate the findings. Excel might work fine if you are presenting a one-off analysis to your boss, but if the analysis is recurring you are far better off using Power BI or Bokeh for Python, as presented in this essay.

With Bokeh you can host interactive dashboards written in Python that enable end users to dive into the data themselves, within the boundaries set by you, the creator.

To create a simple functioning Bokeh dashboard you need to do the following:

Create the different widgets (sliders, buttons, etc.) used to sort and filter the data in your data source.
Create a ColumnDataSource (data source), which feeds the different visual elements in your dashboard.
Create your visual elements (plots, tables, etc.) and link them to your data source.
Connect your widgets to a function that updates the data source whenever a widget changes.
Decide the layout of the widgets and visual elements in the dashboard.
Run a .bat file to host your dashboard and check it out in your browser.

Imports:

import pandas as pd
import numpy as np
import random
from datetime import datetime as dt
from bokeh.io import output_file, show
from bokeh.layouts import gridplot, layout
from bokeh.palettes import Category20
from bokeh.plotting import figure, curdoc
from bokeh.models import (ColumnDataSource, CDSView, RadioButtonGroup, GroupFilter, DataTable,
                          TableColumn, DateFormatter, CategoricalColorMapper, CheckboxGroup,
                          TextInput, Column, Row, Div, HoverTool, Slider, RangeSlider, MultiChoice)

Let’s get to work!

For this example I will use an S&P 500 stock dataset from Kaggle (LINK) to create a simple dashboard with Bokeh. The dataset consists of market price data for 505 ticker symbols in the S&P 500 from 2013 to 2018. I have added Month and Year columns using the pandas dt.month and dt.year accessors.

# Read the data:
data = pd.read_csv('all_stocks_5yr.csv')
# Format the date variable to datetime and sort the data by date:
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')
data.sort_values(by='date', ascending=False, inplace=True)
# Create the month and year variables (the date column is already datetime):
data['Month'] = data['date'].dt.month
data['Year'] = data['date'].dt.year

Create the different widgets

Before creating your ColumnDataSource (data source), I recommend initializing the widgets first, because their initial values are used to filter the data that goes into the data source.

For this dashboard, I will use three simple widgets: MultiChoice, Slider and RangeSlider.

MultiChoice

The MultiChoice widget is used in this example to filter the data source by the different ticker symbols.

The two MultiChoice arguments used here are:
value (initial value of the widget)
options (all the value options available)

I create a list of all ticker symbols called tickers, pass the full list as options, and pass its first two entries as value. You can decide for yourself how many ticker values you want the widget to show when the dashboard is first launched.

# Create a sorted list of ticker names:
tickers = sorted(list(data.Name.unique()))
# Initialize the ticker choice:
ticker_button = MultiChoice(value=tickers[:2], options=tickers)

Slider

The Slider widget is used in this example to filter the data source by year.

Slider widget with year as value

A Slider widget takes five arguments:
start (minimum value)
end (maximum value)
value (initial value of the widget)
step (step size of slider increment)
title (title of your slider)

I create a list with all the years, and use this list as input to the majority of the arguments. Creating a list of years from the data frame instead of writing them down manually will make the slider able to adapt if new data is added to the dataset.

# Create a sorted list of years:
year = sorted(list(data.Year.unique()))
# Initialize the year slider:
year_slider = Slider(start=year[0], end=year[-1], value=year[1], step=1, title='Year')

RangeSlider

The RangeSlider is used to filter the data source based on months in this example.

RangeSlider widget with months as value

A RangeSlider widget takes the same arguments as a regular Slider widget. The only difference is that the value argument holds two numbers, the lower and upper end of the selected range, rather than a single value.
start (minimum value)
end (maximum value)
value (initial value of the widget as list)
step (step size of slider increment)
title (title of your slider)

# Create a sorted list of months:
month = sorted(list(data.Month.unique()))
# Initialize the month range slider:
month_slider = RangeSlider(start=month[0], end=month[-1], value=month[:2], step=1, title='Month')

Create a ColumnDataSource (data source)

The ColumnDataSource is used by the different dashboard elements, and needs to be defined initially. It is easy to define, as the ColumnDataSource class takes a DataFrame as data input.

First, however, let's filter our initial S&P 500 DataFrame with the values of the initialized widgets. To access the value of a widget, use “widget name”.value. Other widgets, such as CheckboxButtonGroup, do not have a value; instead you use .active to access the state of the widget.

I define the new filtered DataFrame as df, using the widget filters on the current DataFrame data.

# Filter the initial data source:
df = data[(data['Name'].isin(ticker_button.value)) & (data['Month'] >= month_slider.value[0]) & (data['Month'] <= month_slider.value[1]) & (data['Year'] == year_slider.value)]
# Pass the filtered data source to the ColumnDataSource class:
source = ColumnDataSource(data=df)
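
To see what the boolean mask does, here is a minimal sketch on a made-up four-row frame. The helper name filter_rows and the toy values are my own illustrations and are not part of the dashboard script; the widget values are passed in as plain arguments so the snippet runs without Bokeh.

```python
import pandas as pd

# Toy stand-in for the S&P 500 frame (values are made up for illustration).
toy = pd.DataFrame({
    "Name":  ["AAL", "AAL", "AAPL", "MSFT"],
    "Month": [1, 3, 2, 2],
    "Year":  [2014, 2014, 2014, 2015],
    "close": [25.0, 37.0, 75.0, 43.0],
})


def filter_rows(frame, names, month_range, year):
    """Hypothetical helper: keep rows matching the three widget values."""
    low, high = month_range
    mask = (
        frame["Name"].isin(names)
        & frame["Month"].between(low, high)  # inclusive on both ends
        & (frame["Year"] == year)
    )
    return frame[mask]


subset = filter_rows(toy, names=["AAL", "AAPL"], month_range=(1, 2), year=2014)
print(subset)
```

Wrapping the mask in a function like this is only one possible design; it would let the initial filtering and the later update callback share the same logic.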

Create the visual elements of the dashboard

Now that you have your widgets and your data source, it is time to use the data source to create some visual elements. In this example I am making a Plot and a Table.

Plot

Making a plot is fairly simple with bokeh, and if you have used Seaborn or other Python plotting modules then you will find this a bit similar.

The plot should have dates on the x-axis and the closing price on the y-axis. The plot lines should be grouped by ticker name, such that each ticker has its own line.

When creating a plot like this with 505 different groups (tickers), there are some considerations that need to be taken into account. How are you going to find enough different colors for all of the lines? What are you going to do with the legend? And so on.

The function below creates a plot with the closing price per day of all the 505 tickers in the dataset. It adds a hovertool, which makes it possible to see the closing price, highest price, lowest price and volume of a given ticker on a given day.

It starts by drawing one random color per ticker from the 20-color Category20 palette and storing the colors in a list. Then the hovertool is created as TOOLTIPS. Afterwards the figure p is created, which holds all the plot lines.

Then a loop is used to create the plot line of each ticker. The grouping is done using CDSView, which is a way to add different filters to the data in a figure. In this example we use it to group the data by ticker name. All the plot lines of the 505 tickers are created within the loop. Lastly, the hovertool is added to the figure. Note that this code was written for Bokeh 2.x; to my knowledge, Bokeh 3.x changed CDSView so that it no longer accepts source and takes a single filter argument instead of filters.

def plot_function(tickers):
    # Pick one random color per ticker from the 20-color Category20 palette:
    colors = Category20[20]
    random_colors = [random.choice(colors) for _ in tickers]

    # Create the hovertool:
    TOOLTIPS = HoverTool(
        tooltips=[
            ('date', '$x{%Y-%m-%d}'),
            ('close', '$@{close}{0.0}'),
            ('high', '$@{high}{0.0}'),
            ('low', '$@{low}{0.0}'),
            ('volume', '@volume{0.00 a}'),
        ],
        formatters={'$x': 'datetime'},
    )

    # Create the figure to store all the plot lines in:
    p = figure(x_axis_type='datetime', width=1000)

    # Loop through the tickers and colors and create a plot line for each:
    for t, rc in zip(tickers, random_colors):
        view = CDSView(source=source, filters=[GroupFilter(column_name='Name', group=t)])
        p.line(x='date', y='close', source=source, view=view, line_color=rc, line_width=4)

    # Add the hovertool to the figure:
    p.add_tools(TOOLTIPS)
    return p


p = plot_function(tickers)
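
Because random.choice samples with replacement, two tickers shown together can end up with the same color. One possible alternative, sketched here with a hypothetical assign_colors helper and a shortened stand-in palette, is to cycle through the palette so that a color only repeats after every palette entry has been used. This is a variation on the approach above, not what the dashboard script does.

```python
from itertools import cycle

# Stand-in palette; in the dashboard this would be the 20-color Category20 palette.
palette = ["#1f77b4", "#aec7e8", "#ff7f0e"]


def assign_colors(tickers, palette):
    """Hypothetical helper: map each ticker to a palette color, cycling as needed."""
    return dict(zip(tickers, cycle(palette)))


ticker_colors = assign_colors(["AAL", "AAPL", "ABT", "ADBE"], palette)
print(ticker_colors)
```

The mapping is also deterministic, so a ticker keeps its color between restarts of the dashboard.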

Table

A table is very useful when showing multiple measures. In this case we keep it simple and only show the date, ticker name and closing price. However, it is very easy to add more columns to the table if needed.

A table takes two inputs: a ColumnDataSource and a list of columns with formatting specified. We have already created the data source, so we only need to create the list of columns now. Recall that the ColumnDataSource was assigned to the variable source. The code below initializes the table with the specified columns. If you need to format numbers in a specific way, I would recommend looking here (LINK).

# Creating the list of columns:
columns = [
        TableColumn(field="date", title="Date", formatter=DateFormatter(format="%Y-%m-%d")),
        TableColumn(field="Name", title="Name"),
        TableColumn(field="close", title="Close"),
    ]
# Initializing the table:
table = DataTable(source=source, columns=columns)
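
As a small illustration of number formatting, the sketch below gives the Close column two decimals using Bokeh's NumberFormatter. The format string "0.00" and the variable name close_column are my own choices for the example.

```python
from bokeh.models import NumberFormatter, TableColumn

# Show the closing price with two decimals in the table.
close_column = TableColumn(
    field="close",
    title="Close",
    formatter=NumberFormatter(format="0.00"),
)
```

A column defined like this could replace the plain Close entry in the columns list.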

Connect the widgets to a function that updates the data source

The great thing about bokeh is that it is interactive. This is facilitated by the connection between your widgets and the data source.

Every time you interact with your widgets on the dashboard, they send an update to a function which changes the data source, if you set it up correctly.

With the three widgets MultiChoice, Slider and RangeSlider it is easy to set up a function which reads the value of the widget and filters the data source. Recall that the three widgets are named ticker_button, year_slider and month_slider, respectively, in the code.

The idea is simple: you make the widget call a function, registered through its on_change method, whenever the widget is changed (interacted with). This function changes the data source as you specify.

The function which filters the data uses the same filtering code as was used to initialize the data source. The values of the widgets are accessed through .value, and the new values are used to create a new data frame, which then replaces the old data in the data source. Because source.data expects a dict that maps column names to columns of values, the data frame is converted with to_dict('series').

def filter_function():
    # Filter the data according to the widgets:
    new_src = data[(data['Name'].isin(ticker_button.value)) & (data['Month'] >= month_slider.value[0]) & (data['Month'] <= month_slider.value[1]) & (data['Year'] == year_slider.value)]
    
    # Replace the data in the current data source with the new data:
    source.data = new_src.to_dict('series')
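
The filter function still has to be registered with the widgets. Bokeh widgets expose on_change, which expects a callback with the signature (attr, old, new). The sketch below shows the wiring on a deliberately tiny example with a single Slider and made-up data; the names toy and update and all the values are for illustration only.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, Slider

# Made-up data standing in for the stock frame.
toy = pd.DataFrame({
    "Year":  [2014, 2014, 2015, 2015],
    "close": [10.0, 11.0, 12.0, 30.0],
})

year_slider = Slider(start=2014, end=2015, value=2014, step=1, title="Year")
source = ColumnDataSource(data=toy[toy["Year"] == year_slider.value])


def update(attr, old, new):
    """Called by Bokeh with the changed property name, its old value and its new value."""
    filtered = toy[toy["Year"] == new]
    # Swap the whole dict at once so all columns keep the same length.
    source.data = filtered.to_dict("series")


# Run update() every time the slider's value property changes.
year_slider.on_change("value", update)

# Setting the value from Python also fires the callback.
year_slider.value = 2015
print(list(source.data["close"]))
```

In the dashboard itself the same pattern applies to all three widgets: each one registers a callback on its value property, and that callback calls filter_function.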

Make the layout of your dashboard

It is very simple to create a layout of the dashboard in bokeh. The widgets and visual elements are easy to move around with a little bit of code, and after some trial and error the perfect layout is found.

Firstly, we give the dashboard a header. This is done with the Div model, which takes HTML code as input. The dashboard is now called Stock Dashboard.

# Header
title = Div(text='<h1 style="text-align: center">Stock Dashboard</h1>')

Now we need to create a layout for the widgets. This is done with the Column and Row models, which stack the widgets in columns and rows. First, the month_slider and the year_slider are stacked on top of each other using the Column model and assigned to widgets_col. Then widgets_col is placed on the same row as ticker_button and assigned to widgets_row. Now widgets_row contains all the widgets.

widgets_col = Column(month_slider, year_slider)
widgets_row = Row(widgets_col, ticker_button)

Now we want to create a layout which has both the widgets and the visual elements (plot and table). The layout is shown in code below. Each inner list works as a horizontal layer, in the order in which the lists are written. This means that the title is at the top of the page, then the widgets, and lastly the plot and table side by side.

layout = layout([[title],
                 [widgets_row],
                 [p,table]])

Finally, at the end of your Python script, you need to add the layout to the current document with curdoc().add_root(), which is what the Bokeh server renders. I have also set a title, which is shown as the name of the tab in the browser.

curdoc().title = 'Stock Dashboard'
curdoc().add_root(layout)

Host your dashboard and access it through your browser

Lastly, you need to host your dashboard. This is done by creating a .bat file, which runs the script through bokeh.
My .bat file is written below. It activates my virtual environment (if you have not created one, then just delete the path from the code), and makes bokeh host (serve) the python script (main.py).

C:\Users\OMDJ\PycharmProjects\dash_template\Scripts\activate.bat && bokeh serve --show main.py

Enjoy! 😃

Python Script:

import pandas as pd
import numpy as np
import random
from datetime import datetime as dt
from bokeh.io import output_file, show
from bokeh.layouts import gridplot, layout
from bokeh.palettes import Category20
from bokeh.plotting import figure, curdoc
from bokeh.models import (ColumnDataSource, CDSView, RadioButtonGroup, GroupFilter, DataTable,
                          TableColumn, DateFormatter, CategoricalColorMapper, CheckboxGroup,
                          TextInput, Column, Row, Div, HoverTool, Slider, RangeSlider, MultiChoice)
# Read the data:
data = pd.read_csv('all_stocks_5yr.csv')
# Format the date variable to datetime and sort the data by date:
data['date'] = pd.to_datetime(data['date'], format='%Y-%m-%d')
data.sort_values(by='date', ascending=False, inplace=True)
# Create the month and year variables:
data['Month'] = pd.to_datetime(data['date']).dt.month
data['Year'] = pd.to_datetime(data['date']).dt.year

# INITIAL VARIABLES:
# Create a sorted list of ticker names:
tickers = sorted(list(data.Name.unique()))
# Create a sorted list of months:
month = sorted(list(data.Month.unique()))
# Create a sorted list of years:
year = sorted(list(data.Year.unique()))

# WIDGETS:
# Initialize the ticker choice:
ticker_button = MultiChoice(value=tickers[:2], options=tickers)
# Initialize the year slider:
year_slider = Slider(start=year[0], end=year[-1], value=year[1], step=1, title='Year')
# Initialize the month range slider:
month_slider = RangeSlider(start=month[0], end=month[-1], value=month[:2], step=1, title='Month')


# Filter the initial data source:
df = data[(data['Name'].isin(ticker_button.value)) & (data['Month'] >= month_slider.value[0]) & (data['Month'] <= month_slider.value[1]) & (data['Year'] == year_slider.value)]
# Pass the filtered data source to the ColumnDataSource class:
source = ColumnDataSource(data=df)

# TABLE
# Creating the list of columns:
columns = [
        TableColumn(field="date", title="Date", formatter=DateFormatter(format="%Y-%m-%d")),
        TableColumn(field="Name", title="Name"),
        TableColumn(field="close", title="Close"),
    ]
# Initializing the table:
table = DataTable(source=source, columns=columns, height=500)

# PLOT
def plot_function(tickers):
    # Pick one random color per ticker from the 20-color Category20 palette:
    colors = Category20[20]
    random_colors = [random.choice(colors) for _ in tickers]

    # Create the hovertool:
    TOOLTIPS = HoverTool(
        tooltips=[
            ('date', '$x{%Y-%m-%d}'),
            ('close', '$@{close}{0.0}'),
            ('high', '$@{high}{0.0}'),
            ('low', '$@{low}{0.0}'),
            ('volume', '@volume{0.00 a}'),
        ],
        formatters={'$x': 'datetime'},
    )

    # Create the figure to store all the plot lines in:
    p = figure(x_axis_type='datetime', width=1000, height=500)

    # Loop through the tickers and colors and create a plot line for each:
    for t, rc in zip(tickers, random_colors):
        view = CDSView(source=source, filters=[GroupFilter(column_name='Name', group=t)])
        p.line(x='date', y='close', source=source, view=view, line_color=rc, line_width=4)

    # Add the hovertool to the figure:
    p.add_tools(TOOLTIPS)
    return p


p = plot_function(tickers)

# Note: text_function is not registered with any widget, so it never runs in this dashboard.
def text_function(attr, old, new):
    new_text = new
    old_text = old
    text_data = pd.read_json('text_data.json')

def filter_function():
    # Filter the data according to the widgets:
    new_src = data[(data['Name'].isin(ticker_button.value)) & (data['Month'] >= month_slider.value[0]) & (data['Month'] <= month_slider.value[1]) & (data['Year'] == year_slider.value)]

    # Replace the data in the current data source with the new data:
    source.data = new_src.to_dict('series')

def change_function(attr, old, new):
    filter_function()

ticker_button.on_change('value' ,change_function)
month_slider.on_change('value', change_function)
year_slider.on_change('value', change_function)

# Header
title = Div(text='<h1 style="text-align: center">Stock Dashboard</h1>')

widgets_col = Column(month_slider, year_slider)
widgets_row = Row(widgets_col, ticker_button)
layout = layout([[title],
                 [widgets_row],
                 [p,table]])
curdoc().title = 'Stock Dashboard'
curdoc().add_root(layout)