Real Python Matplotlib Tutorial

In [33]:
import matplotlib.pyplot as plt
In [34]:
import numpy as np
In [35]:
np.random.seed(444)
  • One important big-picture matplotlib concept is its object hierarchy.

  • A “hierarchy” here means that there is a tree-like structure of matplotlib objects underlying each plot.

  • A Figure object is the outermost container for a matplotlib graphic, which can contain multiple Axes objects. One source of confusion is the name: an Axes actually translates into what we think of as an individual plot or graph (rather than the plural of “axis,” as we might expect).
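
The tree described above can be walked from the top down. The following is a minimal sketch, assuming a non-interactive backend ("Agg") so that it runs without a display; the variable names are ours and only illustrate the Figure, Axes, Axis, tick chain.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

fig, axs = plt.subplots(nrows=1, ncols=2)

# Figure -> Axes: the Figure keeps a list of the Axes it contains.
print(len(fig.axes))

# Axes -> Axis: each Axes owns an x-axis and a y-axis object.
first_axes = fig.axes[0]
print(type(first_axes.xaxis).__name__, type(first_axes.yaxis).__name__)

# Axis -> ticks: individual tick objects sit at the bottom of the tree.
ticks = first_axes.yaxis.get_major_ticks()
print(type(ticks[0]).__name__)

plt.close(fig)
```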

In [36]:
fig, _ = plt.subplots()
In [37]:
type(one_tick := fig.axes[0].yaxis.get_major_ticks()[0])
Out[37]:
matplotlib.axis.YTick
  • Notice that we didn’t pass arguments to subplots() here. The default call is subplots(nrows=1, ncols=1), which returns a Figure and a single Axes.

In [38]:
fig, ax = plt.subplots()
In [39]:
type(ax)
Out[39]:
matplotlib.axes._subplots.AxesSubplot
  • We can call its instance methods to manipulate the plot, much as we would call pyplot’s functions. Let’s illustrate with a stacked area graph of three time series:

In [40]:
(rng := np.arange(50))
Out[40]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
In [41]:
(rnd := np.random.randint(0, 10, size=(3, rng.size)))
Out[41]:
array([[3, 0, 7, 8, 3, 4, 7, 6, 8, 9, 2, 2, 2, 0, 3, 8, 0, 6, 6, 0, 3, 0,
        6, 7, 9, 3, 8, 7, 3, 2, 6, 9, 2, 9, 8, 9, 3, 2, 2, 8, 1, 5, 6, 7,
        6, 0, 0, 0, 0, 4],
       [8, 1, 9, 8, 5, 8, 9, 4, 6, 6, 4, 1, 8, 2, 7, 9, 3, 4, 2, 5, 0, 0,
        8, 1, 0, 9, 9, 3, 2, 7, 6, 0, 5, 5, 4, 8, 3, 4, 9, 4, 7, 1, 5, 4,
        4, 0, 2, 2, 5, 8],
       [5, 6, 6, 1, 1, 6, 8, 4, 1, 0, 9, 2, 3, 7, 3, 3, 2, 7, 8, 6, 6, 7,
        5, 7, 3, 9, 1, 3, 0, 4, 7, 5, 1, 5, 1, 4, 9, 7, 2, 4, 3, 7, 9, 2,
        2, 0, 1, 5, 2, 4]])
In [42]:
(yrs := 1950 + rng)
Out[42]:
array([1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960,
       1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971,
       1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982,
       1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993,
       1994, 1995, 1996, 1997, 1998, 1999])
In [43]:
fig, ax = plt.subplots(figsize=(5, 3))
In [44]:
ax.stackplot(yrs, rng + rnd, labels=["Eastasia", "Eurasia", "Oceania"])
Out[44]:
[<matplotlib.collections.PolyCollection at 0x7f7dfd31a1e0>,
 <matplotlib.collections.PolyCollection at 0x7f7dfd327a50>,
 <matplotlib.collections.PolyCollection at 0x7f7e20978690>]
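
The expression rng + rnd in the stackplot call relies on NumPy broadcasting: the one-dimensional trend is added to every row of the two-dimensional noise array, giving one series per row. A small sketch of just that step, reusing the names from the cells above:

```python
import numpy as np

np.random.seed(444)
rng = np.arange(50)                                  # shape (50,): upward trend
rnd = np.random.randint(0, 10, size=(3, rng.size))   # shape (3, 50): noise

# Broadcasting stretches rng across the three rows of rnd.
stacked_input = rng + rnd
print(stacked_input.shape)

# Each row is the trend plus that row's own noise.
print(np.array_equal(stacked_input[1], rng + rnd[1]))
```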
In [45]:
ax.set_title("Combined debt growth over time")
Out[45]:
Text(0.5, 1, 'Combined debt growth over time')
In [46]:
ax.legend(loc="upper left")
Out[46]:
<matplotlib.legend.Legend at 0x7f7dfedb7b40>
In [47]:
ax.set_ylabel("Total debt")
Out[47]:
Text(3.200000000000003, 0.5, 'Total debt')
In [48]:
ax.set_xlim(xmin=yrs[0], xmax=yrs[-1])
Out[48]:
(1950, 1999)
In [49]:
fig.tight_layout()
In [50]:
fig
Out[50]:

Let’s look at an example with multiple subplots (Axes) within one Figure, plotting two correlated arrays that are drawn from the discrete uniform distribution:

In [51]:
(x := np.random.randint(low=1, high=11, size=50))
Out[51]:
array([ 9,  1,  5,  6, 10,  9,  7,  7, 10,  6,  8,  6,  4,  9,  3,  7,  6,
        9,  2, 10,  7,  2,  2,  5,  7,  9,  5,  9,  9,  8,  6,  3,  4,  3,
        1,  1,  5,  7,  6,  4,  4,  1,  9,  5, 10,  3,  5,  4,  1,  9])
In [52]:
(y := x + np.random.randint(1, 5, size=x.size))
Out[52]:
array([11,  5,  6,  7, 14, 13,  8, 10, 11,  8, 10,  7,  6, 11,  6, 10,  8,
       13,  6, 11, 10,  6,  6,  9,  9, 13,  8, 12, 12, 11,  9,  4,  6,  5,
        4,  2,  9,  8,  7,  8,  6,  3, 13,  8, 12,  4,  9,  7,  4, 11])
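
To check that the two arrays really are correlated, we can compute the Pearson correlation coefficient. This is a sketch only: it reseeds the generator, so the draws will not match the values printed above, which came after other random calls in this notebook.

```python
import numpy as np

np.random.seed(444)
x = np.random.randint(low=1, high=11, size=50)
y = x + np.random.randint(1, 5, size=x.size)  # y is x plus noise in 1..4

# Off-diagonal entry of the 2x2 correlation matrix is corr(x, y).
r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))
```

Because the noise added to x is small relative to the spread of x, the coefficient should come out close to 1.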
In [53]:
(data := np.column_stack((x, y)))
Out[53]:
array([[ 9, 11],
       [ 1,  5],
       [ 5,  6],
       [ 6,  7],
       [10, 14],
       [ 9, 13],
       [ 7,  8],
       [ 7, 10],
       [10, 11],
       [ 6,  8],
       [ 8, 10],
       [ 6,  7],
       [ 4,  6],
       [ 9, 11],
       [ 3,  6],
       [ 7, 10],
       [ 6,  8],
       [ 9, 13],
       [ 2,  6],
       [10, 11],
       [ 7, 10],
       [ 2,  6],
       [ 2,  6],
       [ 5,  9],
       [ 7,  9],
       [ 9, 13],
       [ 5,  8],
       [ 9, 12],
       [ 9, 12],
       [ 8, 11],
       [ 6,  9],
       [ 3,  4],
       [ 4,  6],
       [ 3,  5],
       [ 1,  4],
       [ 1,  2],
       [ 5,  9],
       [ 7,  8],
       [ 6,  7],
       [ 4,  8],
       [ 4,  6],
       [ 1,  3],
       [ 9, 13],
       [ 5,  8],
       [10, 12],
       [ 3,  4],
       [ 5,  9],
       [ 4,  7],
       [ 1,  4],
       [ 9, 11]])
In [54]:
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(8, 4))
In [55]:
ax1.scatter(x=x, y=y, marker="o", c="r", edgecolor="b")
Out[55]:
<matplotlib.collections.PathCollection at 0x7f7dfd209370>
In [58]:
ax1.set_title("Scatter: $x$ versus $y$")
Out[58]:
Text(0.5, 1, 'Scatter: $x$ versus $y$')
In [59]:
ax1.set_xlabel("$x$")
Out[59]:
Text(0.5, 3.1999999999999993, '$x$')
In [60]:
ax1.set_ylabel("$y$")
Out[60]:
Text(3.200000000000003, 0.5, '$y$')
In [61]:
ax2.hist(data, bins=np.arange(data.min(), data.max()), label=("x", "y"))
Out[61]:
([array([5., 3., 4., 5., 6., 6., 6., 2., 9., 4., 0., 0.]),
  array([0., 1., 1., 4., 2., 8., 4., 7., 5., 4., 6., 7.])],
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13]),
 <a list of 2 Lists of Patches objects>)
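
One detail worth checking in the histogram call: np.arange excludes its stop value, so the bin edges above end at 13 while y contains a 14. That value falls outside the last bin, which is why the y counts sum to 49 rather than 50. The sketch below uses made-up integers to compare the two choices of edges.

```python
import numpy as np

values = np.array([1, 2, 2, 13, 14])  # made-up integers; the maximum is 14

# Same pattern as the cell above: the edges stop one short of the maximum.
short_edges = np.arange(values.min(), values.max())
short_counts, _ = np.histogram(values, bins=short_edges)

# Extending the edges past the maximum gives every integer its own bin.
full_edges = np.arange(values.min(), values.max() + 2)
full_counts, _ = np.histogram(values, bins=full_edges)

print(short_counts.sum(), full_counts.sum())
```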
In [62]:
ax2.legend(loc=(0.65, 0.8))
Out[62]:
<matplotlib.legend.Legend at 0x7f7dfc62ac80>
In [63]:
ax2.set_title("Frequencies of $x$ and $y$")
Out[63]:
Text(0.5, 1, 'Frequencies of $x$ and $y$')
In [64]:
ax2.yaxis.tick_right()
  • Text inside dollar signs uses TeX markup to put variables in italics.

  • Because we’re creating a “1x2” Figure, the returned result of plt.subplots(1, 2) is now a Figure object and a NumPy array of Axes objects. (You can inspect this with fig, axs = plt.subplots(1, 2) and taking a look at axs.)
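
A quick way to see what plt.subplots() hands back for different grid shapes. This sketch assumes the "Agg" backend so that it runs without a display; squeeze=False is a standard subplots() parameter that always returns a two-dimensional array.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# 1x2 grid: a one-dimensional array holding two Axes.
fig, axs = plt.subplots(1, 2)
print(type(axs).__name__, axs.shape)

# Default 1x1 grid: the array is squeezed away and a bare Axes is returned.
fig_single, ax_single = plt.subplots()
print(isinstance(ax_single, np.ndarray))

# squeeze=False keeps the array two-dimensional regardless of grid shape.
fig_2d, axs_2d = plt.subplots(1, 2, squeeze=False)
print(axs_2d.shape)

plt.close("all")
```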

In [66]:
fig
Out[66]:
In [69]:
tuple(fig.axes[i] is ax for i, ax in zip(range(2), (ax1, ax2)))
Out[69]:
(True, True)
  • Taking this one step further, we could alternatively create a figure that holds a 2x2 grid of Axes objects:

In [75]:
fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(7, 7))
  • Now, what is ax? It’s no longer a single Axes, but a two-dimensional NumPy array of them:

In [76]:
type(ax)
Out[76]:
numpy.ndarray
In [77]:
ax
Out[77]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f7dfd2ddb90>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f7dfed5b280>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x7f7dfc1619b0>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x7f7dfc119eb0>]],
      dtype=object)
In [78]:
ax1, ax2, ax3, ax4 = ax.flatten()
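
Since ax is an ordinary NumPy array, it can be indexed by row and column or iterated in row-major order. A short sketch, assuming the "Agg" backend; the titles are placeholders of ours.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=2, ncols=2)

# Row/column indexing and flattening refer to the same Axes objects.
top_right = ax[0, 1]
flat = ax.flatten()
print(flat[1] is top_right)

# ax.flat iterates row by row without building a new array.
for index, axes in enumerate(ax.flat):
    axes.set_title(f"Axes {index}")

plt.close(fig)
```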

To illustrate some more advanced subplot features, let’s pull some macroeconomic California housing data from a compressed tar archive, using tarfile and urllib.request from Python’s Standard Library.

In [116]:
import tarfile
from urllib.request import urlretrieve
from pathlib import Path
In [117]:
filepath, response = urlretrieve("http://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.tgz")
In [118]:
filepath
Out[118]:
'/tmp/tmp91u1hf25'
In [132]:
with tarfile.open(name=filepath, mode='r') as archive:
    housing = np.loadtxt(archive.extractfile("CaliforniaHousing/cal_housing.data"), delimiter=",")
In [133]:
housing
Out[133]:
array([[-1.2223e+02,  3.7880e+01,  4.1000e+01, ...,  1.2600e+02,
         8.3252e+00,  4.5260e+05],
       [-1.2222e+02,  3.7860e+01,  2.1000e+01, ...,  1.1380e+03,
         8.3014e+00,  3.5850e+05],
       [-1.2224e+02,  3.7850e+01,  5.2000e+01, ...,  1.7700e+02,
         7.2574e+00,  3.5210e+05],
       ...,
       [-1.2122e+02,  3.9430e+01,  1.7000e+01, ...,  4.3300e+02,
         1.7000e+00,  9.2300e+04],
       [-1.2132e+02,  3.9430e+01,  1.8000e+01, ...,  3.4900e+02,
         1.8672e+00,  8.4700e+04],
       [-1.2124e+02,  3.9370e+01,  1.6000e+01, ...,  5.3000e+02,
         2.3886e+00,  8.9400e+04]])
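
The same extract-and-load pattern can be tried without a network connection. The sketch below builds a tiny gzip-compressed tar archive in memory and reads a member into NumPy; the member name and numbers are made up, and the bytes are wrapped in a text reader before being passed to np.loadtxt.

```python
import io
import tarfile

import numpy as np

csv_bytes = b"1.0,2.0,3.0\n4.0,5.0,6.0\n"  # made-up stand-in for the data file

# Write a gzip-compressed archive with one member into an in-memory buffer.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    info = tarfile.TarInfo(name="demo/values.data")
    info.size = len(csv_bytes)
    archive.addfile(info, io.BytesIO(csv_bytes))

# Read it back: mode "r" detects the compression automatically.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as archive:
    member = archive.extractfile("demo/values.data")
    values = np.loadtxt(io.TextIOWrapper(member, encoding="utf-8"), delimiter=",")

print(values.shape)
```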

Use JSON Web Tokens

Encode and decode a token with PyJWT, which is imported as jwt.

In [1]:
import jwt

jwt.encode?

Signature:
jwt.encode(
    payload,
    key,
    algorithm='HS256',
    headers=None,
    json_encoder=None,
)
Type:      method
In [2]:
(encoded := jwt.encode({"some": "payload"}, "secret"))
Out[2]:
b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzb21lIjoicGF5bG9hZCJ9.Joh1R2dYzkRvDkqv3sygm5YyK8Gi4ShZqbhK2gxcs2U'
In [3]:
jwt.decode(encoded, "secret", algorithms=["HS256"])
Out[3]:
{'some': 'payload'}
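
The encoded token is three base64url segments joined by dots: header, payload, and signature. The sketch below takes apart the token printed above using only the standard library and recomputes the HMAC-SHA256 signature with the same "secret" key. The helper name b64url_decode is ours; this illustrates the format, not how the jwt package is implemented.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs omit."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


# Token copied from the output of the cell above.
token = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9"
    ".eyJzb21lIjoicGF5bG9hZCJ9"
    ".Joh1R2dYzkRvDkqv3sygm5YyK8Gi4ShZqbhK2gxcs2U"
)
header_b64, payload_b64, signature_b64 = token.split(".")

header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

# The signature covers the first two segments exactly as transmitted.
signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
expected = hmac.new(b"secret", signing_input, hashlib.sha256).digest()
signature_matches = hmac.compare_digest(expected, b64url_decode(signature_b64))

print(header, payload, signature_matches)
```

Note that the payload is only encoded, not encrypted: anyone holding the token can read it, and the key is needed only to verify or produce the signature.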

Set an expiration date.

In [4]:
from datetime import datetime, timedelta
import time
import arrow
In [5]:
(
    encoded := jwt.encode(
        {"exp": (exp := datetime.utcnow() + timedelta(seconds=3))}, "secret"
    )
)
Out[5]:
b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE1Nzc3MDg2ODV9.3sInaD8S16T9Iva3I-OI0-4BtXbxh7PGIkHfepXNGGQ'
In [6]:
print(len(encoded))
105
In [7]:
time.sleep(4)  # Allow time to expire.
In [8]:
try:
    jwt.decode(encoded, "secret", algorithms=["HS256"])
except jwt.ExpiredSignatureError:
    print(f"Signature expired {arrow.get(exp).humanize()}.")
Signature expired just now.
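
The exp claim is stored in the payload as a Unix timestamp. The sketch below decodes the payload segment of the token printed above and applies the kind of comparison an expiry check amounts to. The is_expired helper and its leeway_seconds parameter are hypothetical names of ours, not the jwt package's code.

```python
import base64
import json
from datetime import datetime, timedelta, timezone

# Payload segment copied from the token in the output above.
payload_b64 = "eyJleHAiOjE1Nzc3MDg2ODV9"
padded = payload_b64 + "=" * (-len(payload_b64) % 4)
claims = json.loads(base64.urlsafe_b64decode(padded))

# Interpret the timestamp as an aware UTC datetime.
expires_at = datetime.fromtimestamp(claims["exp"], tz=timezone.utc)
print(claims, expires_at.isoformat())


def is_expired(exp_timestamp, now, leeway_seconds=0):
    """Hypothetical check: expired once now (minus leeway) reaches exp."""
    return now.timestamp() - leeway_seconds >= exp_timestamp


print(is_expired(claims["exp"], expires_at - timedelta(seconds=1)))
print(is_expired(claims["exp"], expires_at + timedelta(seconds=1)))
```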

Insert a Menu and Anchor Tags in a Long Jupyter Notebook Output Cell

Download a previously stored dataframe

In [1]:
from pathlib import Path
import urllib.request
from urllib import parse
import pickle
from string import digits
from functools import partial

from IPython.display import HTML, Image, Markdown
In [2]:
(df,) = (
    pickle.loads(Path(fp).read_bytes())
    for fp, _ in (
        urllib.request.urlretrieve(
            (Path.home() / ".texpander" / "iowa_sports_pk_url").read_text()
        ),
    )
)

Display data frame

In [3]:
display(df)
    xpath                                              url                                                sport          sport_id                              sex
 0  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/BBSB/TeamStand...  Baseball       B25923B5-D303-41CA-B9B3-DF2527D84CDD  boys
 1  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Basketball/Tea...  Basketball     57C38F60-B323-4087-A557-9ED925DC546D  boys
 2  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Basketball/Tea...  Basketball     B657ECDF-ECD0-4429-810A-9F9274EC4AAA  girls
 3  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Bowling/TeamSt...  Bowling        DA3506E8-E4CA-4175-BF69-BEBBDC2FD878  boys
 4  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Bowling/TeamSt...  Bowling        0C6DFBCF-98C4-4B01-9F56-17B02E9E47E1  girls
 5  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Golf/TeamStand...  Fall Golf      92A34DE4-ACB3-4282-BF29-571A97DE1946  boys
 6  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Football/TeamS...  Football       91A308DE-5763-4DAA-8C03-9AF66611E0BC  boys
 7  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Golf/TeamStand...  Golf           6DC124A1-D8C4-4F88-84EF-5C6B4FD4A688  girls
 8  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Soccer/TeamSta...  Soccer         9D4214D2-EBE6-429E-9005-C11D2A29C89B  boys
 9  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Soccer/TeamSta...  Soccer         65E5DA09-90C6-45F5-847A-F9A84FD9C5B0  girls
10  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/BBSB/TeamStand...  Softball       D97DD7D0-0BEF-404A-B041-7E51ACFDBD16  girls
11  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Golf/TeamStand...  Spring Golf    FC614ADE-B5DA-4012-A95E-0FD2A594FE9D  boys
12  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Swimming/Indiv...  Swimming       139DCB57-4343-4FB8-BAF9-970E5D64597F  boys
13  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Swimming/Indiv...  Swimming       71F7113B-576F-4372-9E9B-4C746F251946  girls
14  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Tennis/TeamSta...  Tennis         19786FF3-ADA3-4C7A-A94F-FAC0811118F5  boys
15  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Tennis/TeamSta...  Tennis         6086C2DF-4661-4701-BFF1-3BB32C081B88  girls
16  /html/body/form/div/div[3]/table/tr[2]/td[1]/t...  http://quikstatsiowa.com/Public/Track/Individu...  Track & Field  EB178641-26F1-464D-97F1-A1D101AE35D6  boys
17  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Track/Individu...  Track & Field  93AAC882-9E72-4621-B16F-95F389BA7F15  girls
18  /html/body/form/div/div[3]/table/tr[2]/td[2]/t...  http://quikstatsiowa.com/Public/Volleyball/Tea...  Volleyball     83298383-D7D7-4670-9C6B-24DDB8B2E773  girls
In [4]:
from chamelboots import ChameleonTemplate as CT
from chamelboots import TalStatement as TS
from chamelboots.constants import FAKE, JoinWith
from chamelboots.html.utils import prettify_html

Define TAL statements

In [15]:
TSCS, TSR, TSA = (
    TS(*args)
    for args in (
        ("content", "structure content"),
        ("repeat", "content items"),
        ("attributes", "attributes"),
    )
)
In [14]:
LINK = CT("a", (TSCS, TSA)).render
SPAN = partial(
    CT("span", (TSCS, TSA)).render, attributes=dict(style="font-size: 2.5rem;")
)
SPAN(content="foo")
Out[14]:
'<span style="font-size: 2.5rem;">foo</span>'
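
SPAN above uses functools.partial to pre-bind a style attribute, so later calls only supply the content. The sketch below shows the same idea with a plain function; render_tag is a hypothetical stand-in of ours and is not part of chamelboots.

```python
from functools import partial
from html import escape


def render_tag(tag, content, attributes=None):
    """Hypothetical stand-in for a template's render method."""
    attrs = "".join(
        f' {name}="{escape(str(value))}"'
        for name, value in (attributes or {}).items()
    )
    return f"<{tag}{attrs}>{content}</{tag}>"


# Pre-bind the tag name and the style; callers pass only the content.
big_span = partial(render_tag, "span", attributes={"style": "font-size: 2.5rem;"})
rendered = big_span(content="foo")
print(rendered)
```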

Create anchor tags

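An HTML id that starts with a digit is invalid in HTML 4 and awkward to target with CSS selectors, so strip the digits. Note that str.strip(digits) removes digits from both ends of the id, not only from the start.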

In [7]:
menu_items, anchors = zip(
    *(
        (
            LINK(
                content=f"{item.sex}' {item.sport}",
                attributes={
                    "href": f"#{(id_ := item.sport_id.strip(digits))}",
                    "id": (menu_id := f"menu-{id_}"),
                },
            ),
            LINK(
                content=SPAN(content="back to menu"),
                attributes={"id": id_, "href": f"#{menu_id}"},
            ),
        )
        for item in df.itertuples()
    )
)
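
The effect of strip(digits) is easiest to see next to lstrip(digits), which only touches the start of the string. The two ids below are taken from the sport_id column of the data frame above.

```python
from string import digits

sport_ids = [
    "57C38F60-B323-4087-A557-9ED925DC546D",  # starts with digits
    "DA3506E8-E4CA-4175-BF69-BEBBDC2FD878",  # ends with digits
]

for sport_id in sport_ids:
    both_ends = sport_id.strip(digits)    # what the cell above does
    start_only = sport_id.lstrip(digits)  # leaves trailing digits alone
    print(both_ends, start_only)
```

If only a leading digit is the problem, lstrip would keep more of the original id intact.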
In [8]:
list_items = prettify_html(
    CT("ul", (TSCS,)).render(content=CT("li", (TSR, TSCS)).render(items=menu_items))
)

Display truncated portion of HTML

In [9]:
print(JoinWith.LINES(list_items.splitlines()[:10]))
<ul>
 <li>
  <a href="#B25923B5-D303-41CA-B9B3-DF2527D84CDD" id="menu-B25923B5-D303-41CA-B9B3-DF2527D84CDD">
   boys' Baseball
  </a>
 </li>
 <li>
  <a href="#C38F60-B323-4087-A557-9ED925DC546D" id="menu-C38F60-B323-4087-A557-9ED925DC546D">
   boys' Basketball
  </a>

Anchors

In [10]:
anchors[:5]
Out[10]:
('<a id="B25923B5-D303-41CA-B9B3-DF2527D84CDD" href="#menu-B25923B5-D303-41CA-B9B3-DF2527D84CDD"><span style="font-size: 2.5rem;">back to menu</span></a>',
 '<a id="C38F60-B323-4087-A557-9ED925DC546D" href="#menu-C38F60-B323-4087-A557-9ED925DC546D"><span style="font-size: 2.5rem;">back to menu</span></a>',
 '<a id="B657ECDF-ECD0-4429-810A-9F9274EC4AAA" href="#menu-B657ECDF-ECD0-4429-810A-9F9274EC4AAA"><span style="font-size: 2.5rem;">back to menu</span></a>',
 '<a id="DA3506E8-E4CA-4175-BF69-BEBBDC2FD" href="#menu-DA3506E8-E4CA-4175-BF69-BEBBDC2FD"><span style="font-size: 2.5rem;">back to menu</span></a>',
 '<a id="C6DFBCF-98C4-4B01-9F56-17B02E9E47E" href="#menu-C6DFBCF-98C4-4B01-9F56-17B02E9E47E"><span style="font-size: 2.5rem;">back to menu</span></a>')
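
Each menu entry and its "back to menu" anchor point at one another: the menu link's href targets the anchor's id, and the anchor's href targets the menu link's id. A minimal sketch of that pairing with plain f-strings; the menu_and_anchor helper and the sample id are made up for illustration.

```python
def menu_and_anchor(label, target_id):
    """Return a (menu link, back link) pair whose href and id cross-reference."""
    menu_id = f"menu-{target_id}"
    menu_link = f'<a href="#{target_id}" id="{menu_id}">{label}</a>'
    back_link = f'<a id="{target_id}" href="#{menu_id}">back to menu</a>'
    return menu_link, back_link


menu_link, back_link = menu_and_anchor("boys' Baseball", "B25923B5")
print(menu_link)
print(back_link)
```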

Display list of links.

Display scaled and cropped screenshots of each website page

In [12]:
from chamelboots.imageutils import get_scaled_screenshot
In [13]:
for anchor, item in zip(anchors, df.itertuples()):
    # Use a separate name for the inner loop so the row in `item` is not shadowed.
    for output in (HTML(anchor), Image(filename=get_scaled_screenshot(item.url))):
        display(output)