Building
========

This section explains the various options for building your project.
These options can be grouped into four categories:

1. Sanity check

   - ``d2lbook build linkcheck`` checks that all internal and external
     links are accessible.
   - ``d2lbook build outputcheck`` checks that no notebook contains code
     outputs.

2. Building results

   - ``d2lbook build html``: build the HTML version into ``_build/html``
   - ``d2lbook build pdf``: build the PDF version into ``_build/pdf``
   - ``d2lbook build pkg``: build a zip file containing all ``.ipynb``
     notebooks

3. Additional features

   - ``d2lbook build colab``: convert all notebooks so that they can be
     run on Google Colab, and save them into ``_build/colab``. See more
     in :numref:`sec_colab`
   - ``d2lbook build lib``: build a Python package so we can reuse code
     in other notebooks. See more in XXX.

4. Internal stages, which are often triggered automatically

   - ``d2lbook build eval``: evaluate all notebooks and save them as
     ``.ipynb`` notebooks into ``_build/eval``
   - ``d2lbook build rst``: convert all notebooks into ``rst`` files and
     create a Sphinx project in ``_build/rst``

Building Cache
--------------

We encourage you to evaluate your notebooks to obtain code cell results,
instead of keeping these results in the source files, for two reasons:

1. These results make code review difficult, especially when they vary
   between runs because of numerical precision or random number
   generators.
2. A notebook that hasn't been evaluated for a while may be broken by
   package upgrades.

Evaluation does, however, add overhead to each build. We recommend
limiting the runtime of each notebook to a few minutes. In addition,
``d2lbook`` reuses the previous build and only evaluates the modified
notebooks.

For example, the average runtime of a notebook (section) in Dive into
Deep Learning is about 2 minutes on a GPU machine, because it trains
neural networks. The book contains more than 100 notebooks, which brings
the total runtime to 2-3 hours.
In reality, each code change modifies only a few notebooks, and
therefore the build time is often less than 10 minutes.

Let's see how it works. First, create a project as we did in
:numref:`sec_create`.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !mkdir -p cache

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    %%writefile cache/index.md
    # My Book

    The starting page of my book with `d2lbook`.

    ````toc
    get_started
    ````

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    Writing cache/index.md

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    %%writefile cache/get_started.md
    # Getting Started

    Please first install my favorite package `numpy`.

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    Writing cache/get_started.md

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !cd cache; d2lbook build html

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    [d2lbook:build.py:L147] INFO 2 notebooks are outdated
    [d2lbook:build.py:L149] INFO [1] ./get_started.md
    [d2lbook:build.py:L149] INFO [2] ./index.md
    [d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
    [d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [0]
    [d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 1 not started
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [0] is running for 00:00:00
    [d2lbook:resource.py:L196] INFO Starting task "Evaluating ./index.md" on CPU [3]
    [d2lbook:resource.py:L159] INFO Status: 2 running tasks, 0 done, 0 not started
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [0] is running for 00:00:02
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./index.md" on CPU [3] is running for 00:00:00
    [d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [0] is finished in 00:00:03
    [d2lbook:resource.py:L223] INFO Task "Evaluating ./index.md" on CPU [3] is finished in 00:00:02
    [d2lbook:resource.py:L142] INFO All 2 tasks are done, sorting by runtime:
    [d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [3] for Evaluating ./index.md
    [d2lbook:resource.py:L148] INFO - 00:00:03 on CPU [0] for Evaluating ./get_started.md
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:13
    [d2lbook:build.py:L322] INFO 2 rst files are outdated
    [d2lbook:build.py:L324] INFO Convert _build/eval/index.ipynb to _build/rst/index.rst
    [d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:14
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
    Running Sphinx v5.3.0
    making output directory... done
    checking bibtex cache... out of date
    parsing bibtex file /home/d2l-worker/workspace/d2l-book/docs/_build/eval/user/cache/_build/rst...
    WARNING: could not open bibtex file /home/d2l-worker/workspace/d2l-book/docs/_build/eval/user/cache/_build/rst.
    building [mo]: targets for 0 po files that are out of date
    building [html]: targets for 2 source files that are out of date
    updating environment: [new config] 2 added, 0 changed, 0 removed
    looking for now-outdated files... none found
    pickling environment... done
    checking consistency... done
    preparing documents... done
    generating indices... genindex done
    writing additional pages... search done
    copying static files... done
    copying extra files... done
    dumping search index in English (code: en)... done
    dumping object inventory... done
    build succeeded, 1 warning.
    The HTML pages are in _build/html.
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:15

You can see that ``index.md`` is evaluated as well. (Although it
contains no code, it is fine to evaluate it as a Jupyter notebook.)
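
The log above first reports which notebooks are outdated and then
evaluates only those. As a rough illustration of how such a check could
work, the sketch below compares file modification times. This is only an
assumption made for the example: ``find_outdated`` and the one-to-one
mapping from ``X.md`` to ``_build/eval/X.ipynb`` are hypothetical, and
the actual logic inside ``d2lbook`` may differ.

.. code:: python

    import os
    import tempfile
    import time
    from pathlib import Path


    def find_outdated(src_dir, out_dir):
        """Return source notebooks whose evaluated copy is missing or older.

        Hypothetical helper: assumes each ``X.md`` in ``src_dir`` maps to
        ``X.ipynb`` in ``out_dir`` and that modification times are a good
        enough signal of staleness.
        """
        outdated = []
        for src in sorted(Path(src_dir).glob('*.md')):
            out = Path(out_dir) / (src.stem + '.ipynb')
            if not out.exists() or out.stat().st_mtime < src.stat().st_mtime:
                outdated.append(src.name)
        return outdated


    # Demonstrate on a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        src_dir, out_dir = Path(tmp), Path(tmp) / '_build' / 'eval'
        out_dir.mkdir(parents=True)
        for name in ('index', 'get_started'):
            (src_dir / f'{name}.md').write_text('# ' + name)
        first = find_outdated(src_dir, out_dir)   # nothing evaluated yet

        # Pretend both notebooks were evaluated after their sources were written.
        now = time.time()
        for name in ('index', 'get_started'):
            out = out_dir / f'{name}.ipynb'
            out.write_text('{}')
            os.utime(out, (now, now))
            os.utime(src_dir / f'{name}.md', (now - 10, now - 10))
        second = find_outdated(src_dir, out_dir)  # everything is up to date

        # Make one source newer than its evaluated copy.
        os.utime(src_dir / 'get_started.md', (now + 10, now + 10))
        third = find_outdated(src_dir, out_dir)

    print(first, second, third)

The three calls correspond to a first build, a repeated build with no
changes, and a build after editing a single source file.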

If we build again, no notebook will be evaluated.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !cd cache; d2lbook build html

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    [d2lbook:build.py:L147] INFO 0 notebooks are outdated
    [d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:00
    [d2lbook:build.py:L322] INFO 0 rst files are outdated
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
    Running Sphinx v5.3.0
    loading pickled environment... checking bibtex cache... up to date
    done
    building [mo]: targets for 0 po files that are out of date
    building [html]: targets for 0 source files that are out of date
    updating environment: 0 added, 0 changed, 0 removed
    looking for now-outdated files... none found
    no targets are out of date.
    build succeeded.
    The HTML pages are in _build/html.
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:00

Now let's modify ``get_started.md``. You will see that it is
re-evaluated, but ``index.md`` is not.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    %%writefile cache/get_started.md
    # Getting Started

    Please first install my favorite package `numpy>=1.18`.

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    Overwriting cache/get_started.md

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !cd cache; d2lbook build html

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    [d2lbook:build.py:L147] INFO 1 notebooks are outdated
    [d2lbook:build.py:L149] INFO [1] ./get_started.md
    [d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
    [d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [7]
    [d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 0 not started
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [7] is running for 00:00:00
    [d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [7] is finished in 00:00:02
    [d2lbook:resource.py:L142] INFO All 1 tasks are done, sorting by runtime:
    [d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [7] for Evaluating ./get_started.md
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:03
    [d2lbook:build.py:L322] INFO 1 rst files are outdated
    [d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:03
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
    Running Sphinx v5.3.0
    loading pickled environment... checking bibtex cache... up to date
    done
    building [mo]: targets for 0 po files that are out of date
    building [html]: targets for 1 source files that are out of date
    updating environment: 0 added, 1 changed, 0 removed
    looking for now-outdated files... none found
    pickling environment... done
    checking consistency... done
    preparing documents... done
    generating indices... genindex done
    writing additional pages... search done
    copying static files... done
    copying extra files... done
    dumping search index in English (code: en)... done
    dumping object inventory... done
    build succeeded.
    The HTML pages are in _build/html.
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:04

One way to trigger a full rebuild is to remove the saved notebooks in
``_build/eval``, or simply to delete ``_build``. Another way is to
specify dependencies. For example, in the following cell we add
``config.ini`` to the dependencies. Every time ``config.ini`` is
modified, it invalidates the cache of all notebooks and triggers a build
from scratch.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    %%writefile cache/config.ini
    [build]
    dependencies = config.ini

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    Writing cache/config.ini

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !cd cache; d2lbook build html

.. raw:: latex

   \diilbookstyleoutputcell

.. parsed-literal::
    :class: output

    [d2lbook:config.py:L12] INFO Load configure from config.ini
    [d2lbook:build.py:L147] INFO 2 notebooks are outdated
    [d2lbook:build.py:L149] INFO [1] ./get_started.md
    [d2lbook:build.py:L149] INFO [2] ./index.md
    [d2lbook:build.py:L153] INFO Evaluating notebooks in parallel with 8 CPU workers and 8 GPU workers
    [d2lbook:resource.py:L196] INFO Starting task "Evaluating ./get_started.md" on CPU [5]
    [d2lbook:resource.py:L159] INFO Status: 1 running tasks, 0 done, 1 not started
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [5] is running for 00:00:00
    [d2lbook:resource.py:L196] INFO Starting task "Evaluating ./index.md" on CPU [2]
    [d2lbook:resource.py:L159] INFO Status: 2 running tasks, 0 done, 0 not started
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./get_started.md" on CPU [5] is running for 00:00:02
    [d2lbook:resource.py:L164] INFO - Task "Evaluating ./index.md" on CPU [2] is running for 00:00:00
    [d2lbook:resource.py:L223] INFO Task "Evaluating ./get_started.md" on CPU [5] is finished in 00:00:03
    [d2lbook:resource.py:L223] INFO Task "Evaluating ./index.md" on CPU [2] is finished in 00:00:02
    [d2lbook:resource.py:L142] INFO All 2 tasks are done, sorting by runtime:
    [d2lbook:resource.py:L148] INFO - 00:00:02 on CPU [2] for Evaluating ./index.md
    [d2lbook:resource.py:L148] INFO - 00:00:03 on CPU [5] for Evaluating ./get_started.md
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build eval" in 00:00:05
    [d2lbook:build.py:L322] INFO 2 rst files are outdated
    [d2lbook:build.py:L324] INFO Convert _build/eval/get_started.ipynb to _build/rst/get_started.rst
    [d2lbook:build.py:L324] INFO Convert _build/eval/index.ipynb to _build/rst/index.rst
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build rst" in 00:00:05
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build ipynb" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build colab" in 00:00:00
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build sagemaker" in 00:00:00
    Running Sphinx v5.3.0
    loading pickled environment... checking bibtex cache... up to date
    done
    building [mo]: targets for 0 po files that are out of date
    building [html]: targets for 2 source files that are out of date
    updating environment: 0 added, 2 changed, 0 removed
    looking for now-outdated files... none found
    pickling environment... done
    checking consistency... done
    preparing documents... done
    generating indices... genindex done
    writing additional pages... search done
    copying static files... done
    copying extra files... done
    dumping search index in English (code: en)... done
    dumping object inventory... done
    build succeeded.
    The HTML pages are in _build/html.
    [d2lbook:build.py:L56] INFO === Finished "d2lbook build html" in 00:00:06

Finally, let's clean up our workspace.

.. raw:: latex

   \diilbookstyleinputcell

.. code:: python

    !rm -rf cache
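
To recap the caching rule used throughout this section, the sketch below
extends a plain timestamp comparison with a list of dependency files: if
the source or any dependency is newer than a notebook's evaluated copy,
the notebook is treated as outdated. It assumes that staleness is
decided by modification times, which may not match what ``d2lbook`` does
internally, and ``is_outdated`` is a hypothetical helper rather than
part of the ``d2lbook`` API.

.. code:: python

    def is_outdated(src_mtime, out_mtime, dep_mtimes=()):
        """Decide whether a notebook needs re-evaluation.

        Hypothetical rule: rebuild when there is no output yet
        (``out_mtime`` is None), or when the source or any dependency
        is newer than the output.
        """
        if out_mtime is None:
            return True
        newest_input = max([src_mtime, *dep_mtimes])
        return newest_input > out_mtime


    # Timestamps are plain numbers here to keep the example deterministic.
    unchanged = is_outdated(src_mtime=100, out_mtime=200)
    edited = is_outdated(src_mtime=300, out_mtime=200)
    config_touched = is_outdated(src_mtime=100, out_mtime=200, dep_mtimes=[250])
    never_built = is_outdated(src_mtime=100, out_mtime=None)
    print(unchanged, edited, config_touched, never_built)

Under this rule, a dependency such as ``config.ini`` acts as an extra
input shared by every notebook, which is why touching it forces all of
them to be rebuilt.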