Create HTML with python-chamelboots: An Experiment

Experiment with python-chamelboots to create HTML.

Resources

Replicate an HTML document using chamelboots.

Specs

Replace the href and integrity attributes in the link tag and the src and integrity attributes in the script tags with different values, without editing the starter_html string.

The new result should be a list of strings that would replace a range of lines in starter_html.

In [1]:
from chamelboots.constants import HTML_PARSER, Join
from chamelboots import ChameleonTemplate as CT
from chamelboots import TalStatement as TS
In [2]:
from functools import reduce
import operator as op
from pprint import pprint
import itertools as it
from subprocess import check_call
import shlex
from pathlib import Path
import tempfile
In [3]:
from lxml import etree
from bs4 import BeautifulSoup
from IPython.display import display, IFrame
In [4]:
starter_html = """<!doctype html>
<html lang="en">
  <head>
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <!-- Bootstrap CSS -->
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
    <!-- Optional JavaScript -->
    <!-- jQuery first, then Popper.js, then Bootstrap JS -->
    <script defer="defer" src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
    <script defer="defer" src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
    <script defer="defer" src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
    <title>Bootstrap title</title>
  </head>
  <body>
    <div>
        <h1>Hello, world!{nested_span}</h1>
        {list_}
    </div>
  </body>
</html>""".format(  # add some extra HTML using chamelboots
    list_=CT(
        "ul", (TS("content", "structure content"), TS("attributes", "attributes"))
    ).render(
        attributes={"class": "list-group"},
        content=CT(
            "li",
            (TS("repeat", "item items"), TS("attributes", "attributes")),
            "${item}",
        ).render(
            items=(f"foo item number {i}" for i in range(10)),
            attributes={"class": "list-group-item"},
        ),
    ),
    nested_span=CT("span", (), "I am a nested span."),
)
print(BeautifulSoup(starter_html, "html.parser").prettify())
<!DOCTYPE doctype html>
<html lang="en">
 <head>
  <!-- Required meta tags -->
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/>
  <!-- Bootstrap CSS -->
  <link crossorigin="anonymous" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" rel="stylesheet"/>
  <!-- Optional JavaScript -->
  <!-- jQuery first, then Popper.js, then Bootstrap JS -->
  <script crossorigin="anonymous" defer="defer" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" src="https://code.jquery.com/jquery-3.3.1.slim.min.js">
  </script>
  <script crossorigin="anonymous" defer="defer" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js">
  </script>
  <script crossorigin="anonymous" defer="defer" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js">
  </script>
  <title>
   Bootstrap title
  </title>
 </head>
 <body>
  <div>
   <h1>
    Hello, world!
    <span>
     I am a nested span.
    </span>
   </h1>
   <ul class="list-group">
    <li class="list-group-item">
     foo item number 0
    </li>
    <li class="list-group-item">
     foo item number 1
    </li>
    <li class="list-group-item">
     foo item number 2
    </li>
    <li class="list-group-item">
     foo item number 3
    </li>
    <li class="list-group-item">
     foo item number 4
    </li>
    <li class="list-group-item">
     foo item number 5
    </li>
    <li class="list-group-item">
     foo item number 6
    </li>
    <li class="list-group-item">
     foo item number 7
    </li>
    <li class="list-group-item">
     foo item number 8
    </li>
    <li class="list-group-item">
     foo item number 9
    </li>
   </ul>
  </div>
 </body>
</html>

Upload starter_html to my static web server so it can be displayed in an IFrame.

In [5]:
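def save_to_minio(text):
    # Write through a context manager so the file handle is closed before upload;
    # tempfile.mkstemp would leave its file descriptor open.
    with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as fh:
        fh.write(text)
    tmpfile = Path(fh.name)
    url = f"https://minio.apps.selfip.com/mymedia/html/{tmpfile.name}"
    check_call(shlex.split(f"mc cp {tmpfile} dokkuminio/mymedia/html/"))
    return url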

Display the template HTML document.

In [6]:
url = save_to_minio(starter_html)
print(url)
display(IFrame(src=url, width="auto", height=500))
https://minio.apps.selfip.com/mymedia/html/tmpixne_sks.html
In [7]:
tree = etree.fromstring(starter_html, HTML_PARSER)

Flat structure.

Flat is better than nested, but without the nesting it is difficult to reconstruct the original HTML.

In [8]:
groups = [
    (e.tag, tuple(e.attrib.items()), e.text.strip() if e.text is not None else "")
    for e in tree.iter()
    if isinstance(e.tag, str)
]
groups
Out[8]:
[('html', (('lang', 'en'),), ''),
 ('head', (), ''),
 ('meta', (('charset', 'utf-8'),), ''),
 ('meta',
  (('name', 'viewport'),
   ('content', 'width=device-width, initial-scale=1, shrink-to-fit=no')),
  ''),
 ('link',
  (('rel', 'stylesheet'),
   ('href',
    'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css'),
   ('integrity',
    'sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T'),
   ('crossorigin', 'anonymous')),
  ''),
 ('script',
  (('defer', 'defer'),
   ('src', 'https://code.jquery.com/jquery-3.3.1.slim.min.js'),
   ('integrity',
    'sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo'),
   ('crossorigin', 'anonymous')),
  ''),
 ('script',
  (('defer', 'defer'),
   ('src',
    'https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js'),
   ('integrity',
    'sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1'),
   ('crossorigin', 'anonymous')),
  ''),
 ('script',
  (('defer', 'defer'),
   ('src',
    'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js'),
   ('integrity',
    'sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM'),
   ('crossorigin', 'anonymous')),
  ''),
 ('title', (), 'Bootstrap title'),
 ('body', (), ''),
 ('div', (), ''),
 ('h1', (), 'Hello, world!'),
 ('span', (), 'I am a nested span.'),
 ('ul', (('class', 'list-group'),), ''),
 ('li', (('class', 'list-group-item'),), 'foo item number 0'),
 ('li', (('class', 'list-group-item'),), 'foo item number 1'),
 ('li', (('class', 'list-group-item'),), 'foo item number 2'),
 ('li', (('class', 'list-group-item'),), 'foo item number 3'),
 ('li', (('class', 'list-group-item'),), 'foo item number 4'),
 ('li', (('class', 'list-group-item'),), 'foo item number 5'),
 ('li', (('class', 'list-group-item'),), 'foo item number 6'),
 ('li', (('class', 'list-group-item'),), 'foo item number 7'),
 ('li', (('class', 'list-group-item'),), 'foo item number 8'),
 ('li', (('class', 'list-group-item'),), 'foo item number 9')]
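
To make the loss of nesting concrete, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree as a stand-in for the lxml tree above. The sample markup and the names flat, parent_of and flat_with_parent are illustrative and not part of this notebook. Recording each element's parent next to its tag is one way a flat listing could still carry the hierarchy.

```python
import xml.etree.ElementTree as ET

# Illustrative markup, not the notebook's starter_html.
sample = "<div><h1>Hello<span>nested</span></h1><ul><li>a</li><li>b</li></ul></div>"
root = ET.fromstring(sample)

# Flat listing: document order survives, but not which element contains which.
flat = [e.tag for e in root.iter()]

# Map every child element to its parent so the hierarchy can be rebuilt later.
parent_of = {child: parent for parent in root.iter() for child in parent}
flat_with_parent = [
    (e.tag, parent_of[e].tag if e in parent_of else None) for e in root.iter()
]

print(flat)
print(flat_with_parent)
```

The first list cannot tell that span sits inside h1; the second can.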

Define some constants.

In [9]:
INNER_CONTENT, ATTRIBS, ATTRIBUTES, TAIL = (
    "inner_content",
    "attribs",
    "attributes",
    "tail",
)

Define functions to recursively walk the element tree and convert to nested dictionaries and lists.

In [10]:
def dictdata(node):
    res = {}
    res[node.tag] = []
    html_to_dict(node, res[node.tag])
    reply = {}
    reply[node.tag] = {
        INNER_CONTENT: res[node.tag],
        ATTRIBS: node.attrib,  # note: only the root uses "attribs"; child nodes use "attributes"
        TAIL: node.tail,
    }
    return reply


def html_to_dict(node, res):
    rep = {}
    if len(node):
        for n in list(node):
            rep[node.tag] = []
            value = html_to_dict(n, rep[node.tag])
            if len(n):

                value = {
                    INNER_CONTENT: rep[node.tag],
                    ATTRIBUTES: n.attrib,
                    TAIL: n.tail,
                }
                res.append({n.tag: value})
            else:
                res.append(rep[node.tag][0])
    else:
        value = {INNER_CONTENT: node.text, ATTRIBUTES: node.attrib, TAIL: node.tail}
        res.append({node.tag: value})
    return None
In [11]:
data = dictdata(tree.getroottree().getroot())
In [12]:
data
Out[12]:
{'html': {'inner_content': [{'head': {'inner_content': [{<cyfunction Comment at 0x7f2c140317a0>: {'inner_content': ' Required meta tags ',
        'attributes': <lxml.etree._ImmutableMapping at 0x7f2c1401c780>,
        'tail': '\n    '}},
      {'meta': {'inner_content': None,
        'attributes': {'charset': 'utf-8'},
        'tail': '\n    '}},
      {'meta': {'inner_content': None,
        'attributes': {'name': 'viewport', 'content': 'width=device-width, initial-scale=1, shrink-to-fit=no'},
        'tail': '\n    '}},
      {<cyfunction Comment at 0x7f2c140317a0>: {'inner_content': ' Bootstrap CSS ',
        'attributes': <lxml.etree._ImmutableMapping at 0x7f2c1401c780>,
        'tail': '\n    '}},
      {'link': {'inner_content': None,
        'attributes': {'rel': 'stylesheet', 'href': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css', 'integrity': 'sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T', 'crossorigin': 'anonymous'},
        'tail': '\n    '}},
      {<cyfunction Comment at 0x7f2c140317a0>: {'inner_content': ' Optional JavaScript ',
        'attributes': <lxml.etree._ImmutableMapping at 0x7f2c1401c780>,
        'tail': '\n    '}},
      {<cyfunction Comment at 0x7f2c140317a0>: {'inner_content': ' jQuery first, then Popper.js, then Bootstrap JS ',
        'attributes': <lxml.etree._ImmutableMapping at 0x7f2c1401c780>,
        'tail': '\n    '}},
      {'script': {'inner_content': None,
        'attributes': {'defer': 'defer', 'src': 'https://code.jquery.com/jquery-3.3.1.slim.min.js', 'integrity': 'sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo', 'crossorigin': 'anonymous'},
        'tail': '\n    '}},
      {'script': {'inner_content': None,
        'attributes': {'defer': 'defer', 'src': 'https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js', 'integrity': 'sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1', 'crossorigin': 'anonymous'},
        'tail': '\n    '}},
      {'script': {'inner_content': None,
        'attributes': {'defer': 'defer', 'src': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js', 'integrity': 'sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM', 'crossorigin': 'anonymous'},
        'tail': '\n    '}},
      {'title': {'inner_content': 'Bootstrap title',
        'attributes': {},
        'tail': '\n  '}}],
     'attributes': {},
     'tail': '\n  '}},
   {'body': {'inner_content': [{'div': {'inner_content': [{'h1': {'inner_content': [{'span': {'inner_content': 'I am a nested span.',
              'attributes': {},
              'tail': None}}],
           'attributes': {},
           'tail': '\n        '}},
         {'ul': {'inner_content': [{'li': {'inner_content': 'foo item number 0',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 1',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 2',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 3',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 4',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 5',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 6',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 7',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 8',
              'attributes': {'class': 'list-group-item'},
              'tail': '\n'}},
            {'li': {'inner_content': 'foo item number 9',
              'attributes': {'class': 'list-group-item'},
              'tail': None}}],
           'attributes': {'class': 'list-group'},
           'tail': '\n    '}}],
        'attributes': {},
        'tail': '\n  '}}],
     'attributes': {},
     'tail': '\n'}}],
  'attribs': {'lang': 'en'},
  'tail': None}}
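
The same idea can be sketched more compactly. The version below is an assumption-laden simplification, not the functions used in this notebook: it relies on the standard library's xml.etree.ElementTree instead of lxml, the name element_to_dict is hypothetical, and it returns a value instead of appending to a list passed in by the caller. Like the functions above, it does not keep a parent's own text once the parent has children.

```python
import xml.etree.ElementTree as ET


def element_to_dict(node):
    """Return {tag: {"inner_content", "attributes", "tail"}} for node, recursively."""
    children = [element_to_dict(child) for child in node]
    return {
        node.tag: {
            # Leaves keep their text; parents keep a list of child dictionaries.
            "inner_content": children if children else node.text,
            "attributes": dict(node.attrib),
            "tail": node.tail,
        }
    }


# Illustrative markup, not the notebook's starter_html.
sample = '<ul class="list-group"><li>a</li><li>b</li></ul>'
nested = element_to_dict(ET.fromstring(sample))
print(nested)
```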

Define functions for getting all the "paths" to item leaves in the nested dictionary and for getting the leaf using the path.

See the Stack Overflow question "Access nested dictionary items via a list of keys?" for the approach used here.

In [13]:
def paths_in_data(data, parent=()):
    """Calculate keys and/or indices in dict."""

    if not any(isinstance(data, type_) for type_ in (dict, list, tuple)):
        return (parent,)
    else:
        try:
            return reduce(
                op.add,
                (paths_in_data(v, op.add(parent, (k,))) for k, v in data.items()),
                (),
            )
        except AttributeError:
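            return reduce(
                op.add,
                # enumerate gives each item its own index; data.index(v) would
                # return the index of the first equal item when duplicates exist.
                (paths_in_data(v, op.add(parent, (i,))) for i, v in enumerate(data)),
                (),
            )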


def get_from(data, path):
    """Get a leaf from iterable of keys and/or indices.
    
    :data: Collection where nodes are either a dict or list.
    :path: Collection of keys and/or indices leading to a leaf.
    """
    return reduce(op.getitem, path, data)
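
A small self-contained illustration of the path idea follows. It is a sketch under assumptions: leaf_paths is a hypothetical generator-based variant of the function above, and doc is made-up data, not the dictionary built from starter_html.

```python
from functools import reduce
import operator as op


def leaf_paths(data, parent=()):
    """Yield a tuple of keys and/or indices for every leaf in nested dicts and lists."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, (list, tuple)):
        items = enumerate(data)
    else:
        # Anything that is not a container counts as a leaf.
        yield parent
        return
    for key, value in items:
        yield from leaf_paths(value, parent + (key,))


def get_from(data, path):
    """Follow path (keys and/or indices) down to a leaf."""
    return reduce(op.getitem, path, data)


# Made-up data for illustration only.
doc = {"head": [{"link": {"href": "a.css"}}, {"script": {"src": "a.js"}}]}
paths = list(leaf_paths(doc))
print(paths)
print(get_from(doc, ("head", 1, "script", "src")))
```

Each path is just the sequence of subscripts that would be typed by hand, so reduce with op.getitem walks it one step at a time.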

Get the items to change.

In [14]:
WANTED_TAGS = ("link", "script")
paths_to_mutables = [
    item for item in paths_in_data(data) if any(tag in item for tag in WANTED_TAGS)
]

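Group the paths by HTML element. The grouping key is the element's position in its parent's list (index 5 of each path), not the tag name at index 6.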

In [15]:
TAG_INDEX = 5
mutables = it.groupby(paths_to_mutables, key=op.itemgetter(TAG_INDEX))
for key, group in mutables:
    for row in group:
        print(row)
('html', 'inner_content', 0, 'head', 'inner_content', 4, 'link', 'inner_content')
('html', 'inner_content', 0, 'head', 'inner_content', 4, 'link', 'attributes')
('html', 'inner_content', 0, 'head', 'inner_content', 4, 'link', 'tail')
('html', 'inner_content', 0, 'head', 'inner_content', 7, 'script', 'inner_content')
('html', 'inner_content', 0, 'head', 'inner_content', 7, 'script', 'attributes')
('html', 'inner_content', 0, 'head', 'inner_content', 7, 'script', 'tail')
('html', 'inner_content', 0, 'head', 'inner_content', 8, 'script', 'inner_content')
('html', 'inner_content', 0, 'head', 'inner_content', 8, 'script', 'attributes')
('html', 'inner_content', 0, 'head', 'inner_content', 8, 'script', 'tail')
('html', 'inner_content', 0, 'head', 'inner_content', 9, 'script', 'inner_content')
('html', 'inner_content', 0, 'head', 'inner_content', 9, 'script', 'attributes')
('html', 'inner_content', 0, 'head', 'inner_content', 9, 'script', 'tail')
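
Why group on the position and not on the tag name? itertools.groupby only merges consecutive items with equal keys, so adjacent script elements would collapse into one group if the tag were the key. The sketch below uses shortened, made-up paths to show the difference.

```python
import itertools as it
import operator as op

# Illustrative paths, shaped like the ones printed above but shortened.
paths = [
    ("head", 4, "link", "attributes"),
    ("head", 4, "link", "tail"),
    ("head", 7, "script", "attributes"),
    ("head", 7, "script", "tail"),
    ("head", 8, "script", "attributes"),
]

# Keyed on the tag name: the two adjacent script elements fuse into one group.
by_tag = [(k, len(list(g))) for k, g in it.groupby(paths, key=op.itemgetter(2))]

# Keyed on the list position: one group per element.
by_position = [(k, len(list(g))) for k, g in it.groupby(paths, key=op.itemgetter(1))]

print(by_tag)
print(by_position)
```
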
In [16]:
items_to_edit = [
    [get_from(data, row) for row in group][1:]  # drop inner_content; keep attributes and tail
    for key, group in it.groupby(paths_to_mutables, key=op.itemgetter(TAG_INDEX))
]
items_to_edit
Out[16]:
[[{'rel': 'stylesheet', 'href': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css', 'integrity': 'sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T', 'crossorigin': 'anonymous'},
  '\n    '],
 [{'defer': 'defer', 'src': 'https://code.jquery.com/jquery-3.3.1.slim.min.js', 'integrity': 'sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo', 'crossorigin': 'anonymous'},
  '\n    '],
 [{'defer': 'defer', 'src': 'https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js', 'integrity': 'sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1', 'crossorigin': 'anonymous'},
  '\n    '],
 [{'defer': 'defer', 'src': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js', 'integrity': 'sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM', 'crossorigin': 'anonymous'},
  '\n    ']]
In [17]:
INTEGRITY = "integrity"
link_keys = ("href", "rel", INTEGRITY, "crossorigin")
script_keys = ("defer", "src", *link_keys[link_keys.index(INTEGRITY):])
TAIL_DEFAULT = "\n    "
DEFER = "defer"

Bootswatch CSS breaks the basic Bootstrap view.

In [18]:
STYLESHEET = "stylesheet"
BOOTSWATCH_LINK_DATA = (
    None,
    dict(
        zip(
            link_keys,
            (
                "http://netdna.bootstrapcdn.com/bootswatch/4.3.1/cerulean/bootstrap.min.css",
                STYLESHEET,
                None,
                None,
            ),
        )
    ),
    TAIL_DEFAULT,
)
MY_LINK_DATA = (
    None,
    dict(
        zip(
            link_keys,
            (
                "https://static.apps.selfip.com/bootstrap/4.3.1/css/boostrap.min.css",
                STYLESHEET,
                None,
                None,
            ),
        )
    ),
    TAIL_DEFAULT,
)
ALTERNATE_LINK_DATA = (
    None,
    dict(
        zip(
            link_keys,
            (
                "https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/css/bootstrap.min.css",
                STYLESHEET,
                None,
                None,
            ),
        )
    ),
    TAIL_DEFAULT,
)

LINK_DATA = (
    None,
    items_to_edit[0][0],
    TAIL_DEFAULT,
)
LINK_DATA = ALTERNATE_LINK_DATA
ALTERNATE_LINK_DATA, items_to_edit[0][0]
Out[18]:
((None,
  {'href': 'https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/css/bootstrap.min.css',
   'rel': 'stylesheet',
   'integrity': None,
   'crossorigin': None},
  '\n    '),
 {'rel': 'stylesheet', 'href': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css', 'integrity': 'sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T', 'crossorigin': 'anonymous'})
In [19]:
new_values = (
    ("link", LINK_DATA),
    *(
        ("script", [None, dict(zip(script_keys, values)), TAIL_DEFAULT])
        for values in (
            (
                DEFER,
                "https://code.jquery.com/jquery-3.3.1.slim.min.js",
                "sha256-3edrmyuQ0w65f8gfBsqowzjJe2iM6n0nKciPUp8y+7E=",
                "anonymous",
            ),
            (
                DEFER,
                "https://unpkg.com/popper.js@1.14.7/dist/umd/popper.min.js",
                None,
                None,
            ),
            (
                DEFER,
                "https://ajax.aspnetcdn.com/ajax/bootstrap/4.3.1/bootstrap.min.js",
                None,
                None,
            ),
        )
    ),
)
In [20]:
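TAG_INDEX = 5  # position of the element within its parent's list; used as the grouping key
grouped = (
    tuple(group)
    for key, group in it.groupby(paths_to_mutables, key=op.itemgetter(TAG_INDEX))
)
TAG_INDEX_ = 6  # the tag name itself ("link" or "script")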
values = tuple(
    (paths[0][TAG_INDEX_], [get_from(data, path) for path in paths])
    for paths in grouped
)
values
Out[20]:
(('link',
  [None,
   {'rel': 'stylesheet', 'href': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css', 'integrity': 'sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T', 'crossorigin': 'anonymous'},
   '\n    ']),
 ('script',
  [None,
   {'defer': 'defer', 'src': 'https://code.jquery.com/jquery-3.3.1.slim.min.js', 'integrity': 'sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo', 'crossorigin': 'anonymous'},
   '\n    ']),
 ('script',
  [None,
   {'defer': 'defer', 'src': 'https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js', 'integrity': 'sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1', 'crossorigin': 'anonymous'},
   '\n    ']),
 ('script',
  [None,
   {'defer': 'defer', 'src': 'https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js', 'integrity': 'sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM', 'crossorigin': 'anonymous'},
   '\n    ']))
In [21]:
previous_parts = [
    (
        CT(
            **dict(
                zip(
                    ("tag", "tal_statements", INNER_CONTENT),
                    (tag, (TS(ATTRIBUTES, ATTRIBUTES),), value[2],),
                )
            )
        ).render(attributes=value[1])
    )
    for tag, value in values
]
previous_parts
Out[21]:
['<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">',
 '<script defer="defer" src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous">\n    </script>',
 '<script defer="defer" src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous">\n    </script>',
 '<script defer="defer" src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous">\n    </script>']
In [22]:
new_parts = [
    (
        CT(
            **dict(
                zip(
                    ("tag", "tal_statements", INNER_CONTENT),
                    (tag, (TS(ATTRIBUTES, ATTRIBUTES),), value[2],),
                )
            )
        ).render(attributes=value[1])
    )
    for tag, value in new_values
]
new_parts
Out[22]:
['<link href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/css/bootstrap.min.css" rel="stylesheet">',
 '<script defer="defer" src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha256-3edrmyuQ0w65f8gfBsqowzjJe2iM6n0nKciPUp8y+7E=" crossorigin="anonymous">\n    </script>',
 '<script defer="defer" src="https://unpkg.com/popper.js@1.14.7/dist/umd/popper.min.js">\n    </script>',
 '<script defer="defer" src="https://ajax.aspnetcdn.com/ajax/bootstrap/4.3.1/bootstrap.min.js">\n    </script>']
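
The output above shows that attributes whose value is None are left out of the rendered tag. As a rough illustration of that behaviour, here is a standard-library stand-in. It is not the chamelboots or Chameleon API: render_tag is a hypothetical helper, and the example.com URL is a placeholder.

```python
from html import escape


def render_tag(tag, attributes, void=False):
    """Render a tag from a dict of attributes, skipping any whose value is None."""
    attrs = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in attributes.items()
        if value is not None
    )
    # Void elements such as link have no closing tag.
    return f"<{tag}{attrs}>" if void else f"<{tag}{attrs}></{tag}>"


link_keys = ("href", "rel", "integrity", "crossorigin")
# Placeholder URL for illustration only.
link = dict(
    zip(link_keys, ("https://example.com/bootstrap.min.css", "stylesheet", None, None))
)
print(render_tag("link", link, void=True))
```

Building the dictionary with zip keeps the attribute order under the caller's control, and None acts as "omit this attribute".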

Get the lines from starter_html that need replacing.

In [23]:
lines_to_replace = (
    (
        i,
        line
        if any(
            item.tag in WANTED_TAGS for item in tuple(element.iterdescendants())[-1:]
        )
        else None,
    )
    for i, line in enumerate(starter_html.splitlines())
    if (element := etree.fromstring(line, HTML_PARSER)) is not None
)
indices, _ = zip(*((i, _) for i, _ in lines_to_replace if _))
indices
Out[23]:
(7, 10, 11, 12)
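
The same line-by-line search can be sketched without lxml. The version below is only an approximation of the cell above: it uses the standard library's html.parser, the names FirstTag and first_tag are hypothetical, and the sample text is made up. Like the cell above, it assumes every wanted element starts on its own line.

```python
from html.parser import HTMLParser

WANTED_TAGS = ("link", "script")


class FirstTag(HTMLParser):
    """Remember the first start tag seen in the text fed to the parser."""

    def __init__(self):
        super().__init__()
        self.first = None

    def handle_starttag(self, tag, attrs):
        if self.first is None:
            self.first = tag


def first_tag(line):
    """Return the name of the first start tag on the line, or None."""
    parser = FirstTag()
    parser.feed(line)
    parser.close()
    return parser.first


# Made-up sample, not the notebook's starter_html.
sample = """<head>
  <!-- Bootstrap CSS -->
  <link rel="stylesheet" href="a.css">
  <script src="a.js"></script>
  <title>t</title>
</head>"""

indices = tuple(
    i for i, line in enumerate(sample.splitlines()) if first_tag(line) in WANTED_TAGS
)
print(indices)
```
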
In [24]:
new_parts_iter = iter(new_parts)
new_html = Join.LINES(
    line if i not in indices else next(new_parts_iter)
    for i, line in enumerate(starter_html.splitlines())
)
print(BeautifulSoup(new_html, "html.parser").prettify())
<!DOCTYPE doctype html>
<html lang="en">
 <head>
  <!-- Required meta tags -->
  <meta charset="utf-8"/>
  <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/>
  <!-- Bootstrap CSS -->
  <link href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.3.1/css/bootstrap.min.css" rel="stylesheet"/>
  <!-- Optional JavaScript -->
  <!-- jQuery first, then Popper.js, then Bootstrap JS -->
  <script crossorigin="anonymous" defer="defer" integrity="sha256-3edrmyuQ0w65f8gfBsqowzjJe2iM6n0nKciPUp8y+7E=" src="https://code.jquery.com/jquery-3.3.1.slim.min.js">
  </script>
  <script defer="defer" src="https://unpkg.com/popper.js@1.14.7/dist/umd/popper.min.js">
  </script>
  <script defer="defer" src="https://ajax.aspnetcdn.com/ajax/bootstrap/4.3.1/bootstrap.min.js">
  </script>
  <title>
   Bootstrap title
  </title>
 </head>
 <body>
  <div>
   <h1>
    Hello, world!
    <span>
     I am a nested span.
    </span>
   </h1>
   <ul class="list-group">
    <li class="list-group-item">
     foo item number 0
    </li>
    <li class="list-group-item">
     foo item number 1
    </li>
    <li class="list-group-item">
     foo item number 2
    </li>
    <li class="list-group-item">
     foo item number 3
    </li>
    <li class="list-group-item">
     foo item number 4
    </li>
    <li class="list-group-item">
     foo item number 5
    </li>
    <li class="list-group-item">
     foo item number 6
    </li>
    <li class="list-group-item">
     foo item number 7
    </li>
    <li class="list-group-item">
     foo item number 8
    </li>
    <li class="list-group-item">
     foo item number 9
    </li>
   </ul>
  </div>
 </body>
</html>
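
The replacement step depends on new_parts being in the same order as indices. A slightly more defensive variant, sketched here with made-up lines and hypothetical names, pairs each index with its replacement explicitly and fails loudly when the counts differ.

```python
# Made-up lines for illustration only.
lines = ["<head>", "<link old>", "<title>t</title>", "<script old></script>", "</head>"]
indices = (1, 3)
new_parts = ["<link new>", "<script new></script>"]

# One replacement per index, otherwise stop before producing a broken document.
if len(indices) != len(new_parts):
    raise ValueError("need exactly one replacement per index")

replacements = dict(zip(indices, new_parts))
new_lines = [replacements.get(i, line) for i, line in enumerate(lines)]
print("\n".join(new_lines))
```

With a dictionary lookup, a line that is not in indices simply falls through to its original text.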

Verify that new_html displays Bootstrap styling.

In [25]:
url = save_to_minio(new_html)
print(url)
https://minio.apps.selfip.com/mymedia/html/tmp8rtlpbmx.html

All values were replaced programmatically by the code above.

In [26]:
display(IFrame(src=url, width="auto", height=500))