2.3. Building
In this section we explain the various options for building your project. These options can be grouped into four categories:
Sanity checks
  d2lbook build linkcheck: check that all internal and external links are accessible.
  d2lbook build outputcheck: check that no notebook contains code outputs.
Building results
  d2lbook build html: build the HTML version into _build/html.
  d2lbook build pdf: build the PDF version into _build/pdf.
  d2lbook build pkg: build a zip file that contains all .ipynb notebooks.
Additional features
  d2lbook build colab: convert all notebooks so that they can be run on Google Colab, and save them into _build/colab. See more in Section 2.9.
  d2lbook build lib: build a Python package so that code can be reused in other notebooks. See more in XXX.
Internal stages, which are often triggered automatically
  d2lbook build eval: evaluate all notebooks and save them as .ipynb notebooks into _build/eval.
  d2lbook build rst: convert all notebooks into rst files and create a Sphinx project in _build/rst.
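The targets listed above can also be driven from a script. The following is a minimal sketch and not part of d2lbook itself: the names TARGETS, build_command and run_build are hypothetical, and only the target names are taken from the list above.

```python
import subprocess

# Build targets as listed in this section, grouped by category.
TARGETS = {
    "sanity checks": ["linkcheck", "outputcheck"],
    "building results": ["html", "pdf", "pkg"],
    "additional features": ["colab", "lib"],
    "internal stages": ["eval", "rst"],
}


def build_command(target):
    """Return the argument list for `d2lbook build <target>`."""
    known = {t for group in TARGETS.values() for t in group}
    if target not in known:
        raise ValueError(f"unknown build target: {target!r}")
    return ["d2lbook", "build", target]


def run_build(target, project_dir="."):
    """Run one build target inside project_dir; raises if the build fails."""
    subprocess.run(build_command(target), cwd=project_dir, check=True)
```

With d2lbook installed, run_build("html", "cache") would correspond to running d2lbook build html inside the cache directory.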
2.3.1. Building Cache
We encourage you to evaluate your notebooks during the build to obtain code cell results, instead of keeping these results in the source files, for two reasons:
1. Stored results make code review difficult, especially when they vary between runs because of numerical precision or random number generators.
2. A notebook that hasn’t been evaluated for a while may be broken by package upgrades.
Evaluation, however, adds overhead to every build. We recommend limiting the runtime of each notebook to a few minutes. In addition, d2lbook reuses the previous build and evaluates only the modified notebooks.
For example, the average runtime of a notebook (section) in Dive into Deep Learning is about 2 minutes on a GPU machine, because it trains neural networks. The book contains more than 100 notebooks, which brings the total runtime to 2-3 hours. In practice, each code change modifies only a few notebooks, so the build time is often less than 10 minutes.
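One way to picture the reuse of previous results is as a timestamp comparison between each source file and its evaluated notebook. The sketch below illustrates that idea only. It assumes modification times are the criterion, which this section does not confirm for d2lbook, and the function name outdated_notebooks is hypothetical.

```python
import os


def outdated_notebooks(src_dir, eval_dir):
    """List .md sources whose evaluated .ipynb is missing or older."""
    outdated = []
    for root, dirs, files in os.walk(src_dir):
        # Do not descend into the build output directory.
        dirs[:] = [d for d in dirs if d != "_build"]
        for name in files:
            if not name.endswith(".md"):
                continue
            src = os.path.join(root, name)
            rel = os.path.relpath(src, src_dir)
            out = os.path.join(eval_dir, os.path.splitext(rel)[0] + ".ipynb")
            # Assumption: a notebook is outdated when its output is
            # missing or was written before the source was last modified.
            if not os.path.exists(out) or os.path.getmtime(out) < os.path.getmtime(src):
                outdated.append(rel)
    return sorted(outdated)
```

Under this assumption, editing a single source file makes only that file outdated, so only its notebook has to be evaluated again.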
Let’s see how it works. First create a project as we did in Section 2.1.
!mkdir -p cache
%%writefile cache/index.md
# My Book
The starting page of my book with `d2lbook`.
````toc
get_started
````
Writing cache/index.md
%%writefile cache/get_started.md
# Getting Started
Please first install my favorite package `numpy`.
Writing cache/get_started.md
!cd cache; d2lbook build html
[d2lbook:build.py:L147] INFO 2 notebooks are outdated
[d2lbook:build.py:L149] INFO [1] ./get_started.md
[d2lbook:build.py:L149] INFO [2] ./index.md
[d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
[d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [0]
[d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 1 not started
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [0] is running for 00:00:00
[d2lbook:resource.py:L196] INFO Starting task "Evaluating ./index.md" on CPU [3]
[d2lbook:resource.py:L159] INFO Status: 2 running tasks, 0 done, 0 not started
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [0] is running for 00:00:02
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./index.md" on CPU [3] is running for 00:00:00
[d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [0] is finished in 00:00:03
[d2lbook:resource.py:L223] INFO Task "Evaluating ./index.md" on CPU [3] is finished in 00:00:02
[d2lbook:resource.py:L142] INFO All 2 tasks are done, sorting by runtime:
[d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [3] for Evaluating ./index.md
[d2lbook:resource.py:L148] INFO - 00:00:03 on CPU [0] for Evaluating ./get_started.md
[d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:13
[d2lbook:build.py:L322] INFO 2 rst files are outdated
[d2lbook:build.py:L324] INFO Convert _build/eval/index.ipynb to _build/rst/index.rst
[d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
[d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:14
[d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
Running Sphinx v5.3.0
making output directory... done
checking bibtex cache... out of date
parsing bibtex file /home/d2l-worker/workspace/d2l-book/docs/_build/eval/user/cache/_build/rst... WARNING: could not open bibtex file /home/d2l-worker/workspace/d2l-book/docs/_build/eval/user/cache/_build/rst.
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 2 source files that are out of date
updating environment: [new config] 2 added, 0 changed, 0 removed
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
generating indices... genindex done
writing additional pages... search done
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 1 warning.
The HTML pages are in _build/html.
[d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:15
You can see that index.md is evaluated. (Although it contains no code, it is fine to evaluate it as a Jupyter notebook.)
If we build again, no notebook will be evaluated.
!cd cache; d2lbook build html
[d2lbook:build.py:L147] INFO 0 notebooks are outdated
[d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
[d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:00
[d2lbook:build.py:L322] INFO 0 rst files are outdated
[d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
Running Sphinx v5.3.0
loading pickled environment... checking bibtex cache... up to date
done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 0 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
no targets are out of date.
build succeeded.
The HTML pages are in _build/html.
[d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:00
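The log lines above report how many notebooks were outdated, which makes the cache behavior easy to check from a script. The following is a small sketch that extracts that number from captured log text. The function name count_outdated is hypothetical, and the pattern relies on the wording shown in the output above, which may differ in other d2lbook versions.

```python
import re

# Matches lines such as:
# [d2lbook:build.py:L147] INFO 2 notebooks are outdated
OUTDATED = re.compile(r"INFO (\d+) notebooks are outdated")


def count_outdated(log_text):
    """Return the number of outdated notebooks in a build log, or None."""
    match = OUTDATED.search(log_text)
    return int(match.group(1)) if match else None
```

For the second build above, this function would return 0, which confirms that every notebook was served from the cache.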
Now let’s modify get_started.md. You will see that it is re-evaluated, but index.md is not.
%%writefile cache/get_started.md
# Getting Started
Please first install my favorite package `numpy>=1.18`.
Overwriting cache/get_started.md
!cd cache; d2lbook build html
[d2lbook:build.py:L147] INFO 1 notebooks are outdated
[d2lbook:build.py:L149] INFO [1] ./get_started.md
[d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
[d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [7]
[d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 0 not started
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [7] is running for 00:00:00
[d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [7] is finished in 00:00:02
[d2lbook:resource.py:L142] INFO All 1 tasks are done, sorting by runtime:
[d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [7] for Evaluating ./get_started.md
[d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:03
[d2lbook:build.py:L322] INFO 1 rst files are outdated
[d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
[d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:03
[d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
Running Sphinx v5.3.0
loading pickled environment... checking bibtex cache... up to date
done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 1 changed, 0 removed
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
generating indices... genindex done
writing additional pages... search done
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.
The HTML pages are in _build/html.
[d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:04
One way to trigger a full rebuild is to remove the saved notebooks in _build/eval, or simply to delete _build. Another way is to specify dependencies. For example, in the following cell we add config.ini to the dependencies. Every time config.ini is modified, the cache of all notebooks is invalidated and a build from scratch is triggered.
%%writefile cache/config.ini
[build]
dependencies = config.ini
Writing cache/config.ini
!cd cache; d2lbook build html
[d2lbook:config.py:L12] INFO Load configure from config.ini
[d2lbook:build.py:L147] INFO 2 notebooks are outdated
[d2lbook:build.py:L149] INFO [1] ./get_started.md
[d2lbook:build.py:L149] INFO [2] ./index.md
[d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
[d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [5]
[d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 1 not started
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [5] is running for 00:00:00
[d2lbook:resource.py:L196] INFO Starting task "Evaluating ./index.md" on CPU [2]
[d2lbook:resource.py:L159] INFO Status: 2 running tasks, 0 done, 0 not started
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [5] is running for 00:00:02
[d2lbook:resource.py:L164] INFO - Task "Evaluating ./index.md" on CPU [2] is running for 00:00:00
[d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [5] is finished in 00:00:03
[d2lbook:resource.py:L223] INFO Task "Evaluating ./index.md" on CPU [2] is finished in 00:00:02
[d2lbook:resource.py:L142] INFO All 2 tasks are done, sorting by runtime:
[d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [2] for Evaluating ./index.md
[d2lbook:resource.py:L148] INFO - 00:00:03 on CPU [5] for Evaluating ./get_started.md
[d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:05
[d2lbook:build.py:L322] INFO 2 rst files are outdated
[d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
[d2lbook:build.py:L324] INFO Convert _build/eval/index.ipynb to _build/rst/index.rst
[d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:05
[d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
[d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
Running Sphinx v5.3.0
loading pickled environment... checking bibtex cache... up to date
done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 2 source files that are out of date
updating environment: 0 added, 2 changed, 0 removed
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
generating indices... genindex done
writing additional pages... search done
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.
The HTML pages are in _build/html.
[d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:06
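The effect of a dependency can be sketched as follows: an output counts as stale when it is older than its source or older than any listed dependency. This is an illustration under assumptions, not d2lbook's implementation. The names read_dependencies and needs_rebuild are hypothetical, and treating multiple entries as comma- or whitespace-separated is a guess that this section does not confirm.

```python
import configparser
import os


def read_dependencies(config_path):
    """Return the paths listed under [build] dependencies in config.ini."""
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file is silently ignored
    raw = parser.get("build", "dependencies", fallback="")
    # Assumption: multiple entries are separated by commas or whitespace.
    names = raw.replace(",", " ").split()
    base = os.path.dirname(config_path)
    return [os.path.join(base, name) for name in names]


def needs_rebuild(src, out, deps):
    """True if out is missing or older than src or any existing dependency."""
    if not os.path.exists(out):
        return True
    out_time = os.path.getmtime(out)
    inputs = [src] + [d for d in deps if os.path.exists(d)]
    return any(os.path.getmtime(p) > out_time for p in inputs)
```

Because every notebook shares the same dependency list, touching a single dependency such as config.ini makes needs_rebuild true for all of them, which matches the two outdated notebooks reported in the build above.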
Finally, let’s clean up our workspace.
!rm -rf cache