Histogram

A histogram is a chart that uses rectangles to represent the frequency of values within each range (bin) of a continuous attribute. This folder goes over how to create histograms with Python and Plotly.

Files

The following scripts are used in this chapter:

SimpleHistogram.py
NormalizedHistogram.py
OverlaidHistogram.py
StackedHistogram.py
CumulativeHistogram.py
AggregatedHistogram.py

Packages Needed

This chapter requires the following packages for the scripts used:

Plotly
Pandas

Syntax

Data

Data is a list of go.Histogram() objects; each go.Histogram() represents one histogram (one data set).

go.Histogram() has the following parameters:

x: Values to bin (vertical histogram)
y: Values to bin in a horizontal histogram
histnorm: Normalization applied to the bars (raw counts when not set)
- probability: Fraction of all occurrences that fall in each bin (bars sum to 1)
- percent: Same as probability, expressed as a percentage of the total number
- density: Number of occurrences in a bin divided by the size of the bin interval
- probability density: Probability of a bin divided by the size of the bin interval (bar areas sum to 1)
xbins: Bin settings along the x-axis
- size: Interval size of each bin, as a number
ybins: Same as xbins, but for horizontal histograms
cumulative_enabled: Enable cumulative histogram, True/False
opacity: Bar opacity, from 0-1
marker_color: Bar colour (a colour name or an RGB value, given as a string)
hoverinfo: Information displayed when the user hovers over a bar. Histograms accept all, none, skip, or any of the following flags joined with + (for example x+y):
- x
- y
- text
- name
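
The histnorm modes differ only in how the raw bin counts are rescaled. The following is a minimal pure-Python sketch of that arithmetic, not Plotly's own implementation; the sample values, the bin size of 10 and the helper name `normalize` are assumptions made for illustration.

```python
from collections import Counter


def normalize(values, bin_size, histnorm=""):
    """Return {bin_start: bar_height} for one histnorm mode (illustrative only)."""
    # Assign each value to the bin [start, start + bin_size) that contains it.
    counts = Counter((v // bin_size) * bin_size for v in values)
    total = len(values)
    heights = {}
    for start, count in sorted(counts.items()):
        if histnorm == "":                        # raw counts, no normalization
            heights[start] = count
        elif histnorm == "probability":           # fraction of all samples
            heights[start] = count / total
        elif histnorm == "percent":               # the same fraction, in %
            heights[start] = 100 * count / total
        elif histnorm == "density":               # count per unit of bin width
            heights[start] = count / bin_size
        elif histnorm == "probability density":   # probability per unit of bin width
            heights[start] = count / total / bin_size
        else:
            raise ValueError(f"unknown histnorm: {histnorm!r}")
    return heights


sample = [3, 7, 12, 15, 18, 25, 27, 29, 31, 38]  # hypothetical data
for mode in ("", "probability", "percent", "density", "probability density"):
    print(repr(mode), normalize(sample, bin_size=10, histnorm=mode))
```

With probability the bar heights add up to 1, while with probability density it is the bar areas (height times bin size) that add up to 1.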

Layout

Generic Layout parameters suggested for use:

title (Dictionary): Chart title and fonts
- text: Chart title to be displayed
- x: Horizontal position of the title, from 0-1 (0.5 centres it)
- y: Vertical position of the title, from 0-1
xaxis (Dictionary): X-axis settings
- tickmode: How ticks are placed (auto, linear or array)
- tickangle: Angle of the tick labels, in degrees (-: Anticlockwise, +: Clockwise)
- categoryorder: Order of the categories on the X-axis
  - category ascending: Sort by category name, in ascending order
  - category descending: Sort by category name, in descending order
  - total ascending: Sort by total value, in ascending order
  - total descending: Sort by total value, in descending order
  - min ascending/min descending: Sort by minimum value
  - max ascending/max descending: Sort by maximum value
  - sum ascending/sum descending: Sort by summed value
  - mean ascending/mean descending: Sort by average value
  - median ascending/median descending: Sort by median value
  - array: Follow the order defined in categoryarray
- categoryarray: List of categories defining the order when categoryorder is array
yaxis (Dictionary): Y-axis settings
- tickmode: How ticks are placed (auto, linear or array)
- tickangle: Angle of the tick labels, in degrees (-: Anticlockwise, +: Clockwise)

Histogram-exclusive parameters:

cumulative_enabled (set on go.Histogram): Enable cumulative histogram, True/False
marker_color (set on go.Histogram): Bar colour (a colour name or an RGB value, given as a string)
barmode (set on the layout): How multiple histograms are displayed
- stack: Bars of the different data sets are stacked on top of one another
- overlay: Histograms are drawn over one another on the same axes; lower the opacity so every data set stays visible
bargap (set on the layout): Gap between adjacent bars, as a fraction of the bar width, from 0-1
histfunc (set on go.Histogram): Specifies the binning function (count: Count occurrences, sum: Sum the values, avg: Average the values, min/max: Display minimum or maximum value within the bin)
histnorm (set on go.Histogram): Type of normalization used for the histogram, none by default (probability: Bar shows the fraction of occurrences, percent: Bar shows the fraction in %, density: Bar is occurrences divided by bin size, probability density: Bar is probability divided by bin size, so the bar areas sum to 1)
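
As a rough sketch of how the layout parameters above fit together, the dictionary below combines several of them. The title text, the category names and the chosen values are hypothetical, and categoryorder only has an effect when the x-axis holds categories rather than numbers.

```python
# Hypothetical layout dictionary built from the parameters described above.
layout = {
    "title": {"text": "Salary by Group", "x": 0.5, "y": 0.95},
    "xaxis": {
        "tickangle": -45,                      # rotate labels 45 degrees anticlockwise
        "categoryorder": "array",              # use the explicit order below
        "categoryarray": ["Friends", "Family", "Colleagues"],
    },
    "yaxis": {"tickmode": "auto"},
    "barmode": "overlay",                      # or "stack"
    "bargap": 0.1,                             # fraction of the bar width, 0-1
}

print(layout)
```

A dictionary like this can be passed as the layout argument of go.Figure together with the data list.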

Examples

Example 1 - Simple Histogram

# Data
data = []
data.append(go.Histogram(x=df['salary']))
# Layout
layout = {'title':{'text':'Histogram of Salary among Friends', 'x':0.5}}
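
The snippet above stops at data and layout. One way to turn them into a figure is sketched below, assuming Plotly is installed; the function name `build_simple_histogram` and the salary values are hypothetical stand-ins for the script's own DataFrame.

```python
def build_simple_histogram(salaries):
    """Build the Example 1 figure from any sequence of numbers."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where Plotly is not installed.
    import plotly.graph_objects as go

    data = [go.Histogram(x=salaries)]
    layout = {"title": {"text": "Histogram of Salary among Friends", "x": 0.5}}
    return go.Figure(data=data, layout=layout)


# Hypothetical salaries standing in for df['salary'].
salaries = [32000, 41000, 45000, 45500, 52000, 58000, 61000, 75000]

# fig = build_simple_histogram(salaries)
# fig.show()                                # opens the chart in a browser or notebook
# fig.write_image("simple_histogram.png")   # one way to save it; needs the kaleido package
```

A pandas column such as df['salary'] can be passed in place of the list.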

Example 2 - Normalized Histogram

# Data
data = []
data.append(go.Histogram(x=x, histnorm='probability'))
# Layout
layout = {'title':{'text':'Distribution of 500 Random Numbers', 'x':0.5}}

Example 3 - Overlaid Histogram

# Data
data = []
for group in df['group'].unique():
    df_temp = df[df['group']==group]
    # Semi-transparent bars keep every group visible where the histograms overlap
    data.append(go.Histogram(x=df_temp['salary'], name=group, opacity=0.6))
# Layout
layout = {'title':{'text':'Everybody\'s Salary', 'x':0.5},
          'barmode':'overlay'}

Example 4 - Stacked Histogram

# Data
data = []
for group in df['group'].unique():
    df_temp = df[df['group']==group]
    data.append(go.Histogram(x=df_temp['salary'],name=group))
# Layout
layout = {'title':{'text':'Everybody\'s Salary', 'x':0.5},
          'barmode':'stack'}
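
Examples 3 and 4 use the same data and differ only in barmode. The pure-Python sketch below shows how the bar tops differ in each bin; it is an illustration of the idea rather than Plotly's code, and the group names, salaries and bin size are hypothetical.

```python
from collections import Counter

BIN_SIZE = 10
groups = {  # hypothetical salaries (in thousands) per group
    "Friends": [22, 25, 31, 38],
    "Family": [24, 33, 35, 47],
}


def bin_counts(values):
    """Count how many values fall in each bin [start, start + BIN_SIZE)."""
    return Counter((v // BIN_SIZE) * BIN_SIZE for v in values)


counts = {name: bin_counts(values) for name, values in groups.items()}
bins = sorted(set().union(*counts.values()))

overlay_tops = {}  # overlay: every bar starts at zero, so the tallest group sets the top
stack_tops = {}    # stack: bars pile up, so the top is the sum over all groups
for b in bins:
    per_group = [counts[name][b] for name in groups]
    overlay_tops[b] = max(per_group)
    stack_tops[b] = sum(per_group)

print("overlay:", overlay_tops)
print("stack:  ", stack_tops)
```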

Example 5 - Cumulative Histogram

# Data
data = []
data.append(go.Histogram(x=df['salary'], cumulative_enabled=True))
# Layout
layout = {'title':{'text':'Everybody\'s Salary', 'x':0.5}}
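
With cumulative_enabled=True each bar shows the running total of the bins up to and including its own. A minimal sketch of that running total in plain Python follows, assuming a fixed bin size of 10 and made-up salary values.

```python
from collections import Counter
from itertools import accumulate

salaries = [12, 18, 23, 27, 29, 35, 41, 44, 48, 55]  # hypothetical values, in thousands
bin_size = 10

# Plain histogram: number of salaries in each bin [start, start + bin_size).
counts = Counter((s // bin_size) * bin_size for s in salaries)
bin_starts = sorted(counts)
plain = [counts[b] for b in bin_starts]

# Cumulative histogram: running sum of the plain counts.
cumulative = list(accumulate(plain))

for start, count, running in zip(bin_starts, plain, cumulative):
    print(f"{start}-{start + bin_size}: count={count}, cumulative={running}")
```

The last cumulative bar always equals the total number of values.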

Example 6 - Aggregated Histogram

# Data
data = []
data.append(go.Histogram(x=df['group'], y=df['salary'], histfunc='sum'))
# Layout
layout = {'title':{'text':'Everybody\'s Salary (Summation by Group)', 'x':0.5}}

Note: histfunc aggregates the values passed on the y-axis. A numeric column must be provided for the y argument; otherwise Plotly treats the chart as a simple histogram.
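
To make the histfunc options concrete, the sketch below applies each one to the y values that share the same x value. It assumes x holds a category such as a group name; the rows and the helper name `aggregate` are hypothetical, and the code mirrors the idea rather than Plotly's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (group, salary) rows; group plays the role of x, salary of y.
rows = [
    ("Friends", 30), ("Friends", 50),
    ("Family", 40), ("Family", 60), ("Family", 80),
]

# One plain-Python function per histfunc option.
HISTFUNCS = {"count": len, "sum": sum, "avg": mean, "min": min, "max": max}


def aggregate(rows, histfunc="count"):
    """Return {group: aggregated y} for the chosen histfunc."""
    grouped = defaultdict(list)
    for group, salary in rows:
        grouped[group].append(salary)
    func = HISTFUNCS[histfunc]
    return {group: func(values) for group, values in grouped.items()}


for name in HISTFUNCS:
    print(name, aggregate(rows, name))
```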

Reference

Plotly Documentation Histograms
