Nbdev
Here is our main process using nbdev
Create a new project
Create a new GitHub project with Poetry for dependency management (or using pip)
Create the .gitignore file
Install Nbdev
poetry install
poetry shell
poetry add nbdev --group dev
nbdev_install_quarto
poetry add jupyterlab-quarto --group dev
Initiate Nbdev project
nbdev_new
The PyPI library name conventionally uses - rather than _, while the importable package path can't contain -.
nbdev_new assumes that your package name is the same as your repo name (with - replaced by _). Use the --lib_name option if that isn't the case.
So lib_name can be nbdev-cards and lib_path can be nbdev_cards.
If you encounter a No module named 'pkg_resources' error, run poetry add setuptools (or pip install setuptools).
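The naming rule above can be checked with a few lines of Python. This is only a sketch of the convention described in these notes, not nbdev's own code; lib_path_from_repo is a hypothetical helper name.

```python
def lib_path_from_repo(repo_name: str) -> str:
    """Derive the importable package name from a repo name (hypothetical helper)."""
    # A hyphen is not allowed in a Python identifier, so it becomes an underscore.
    return repo_name.replace("-", "_")


repo_name = "nbdev-cards"
lib_path = lib_path_from_repo(repo_name)

# str.isidentifier() tells us whether a name can be used in an import statement.
print(repo_name, "importable:", repo_name.isidentifier())
print(lib_path, "importable:", lib_path.isidentifier())
```

The hyphenated name is fine for the repo and the published library, while only the underscore form can be imported.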
Run poetry install (in lieu of pip install -e .); this needs to be done only once.
Develop the project
- nbdev_export converts notebooks to modules.
- nbdev_docs updates the docs, including the README.
- nbdev_preview starts a server to display the documentation.
- nbdev_readme renders the README.
- nbdev_test runs the local tests defined in the notebooks.
- nbdev_pypi publishes to PyPI.
- nbdev_prepare runs nbdev_export, nbdev_clean, nbdev_test, and nbdev_readme all together; useful before we push changes.
Then
git status
git commit -am "update"
git push
Consider using a pre-commit hook to make notebook version control cleaner. If the hook fails and modifies files, we need to add those modifications to the commit, and then it will pass. Running
git add -u
and committing again will do the trick!
Consider using the various methods in the fastcore.test module for tests.
What goes where (doc vs. module)
- To add a cell to the module file, use #| export.
- To use a cell for development only, without exposing it to the module or the docs, use #| hide. This is useful, e.g., for inline tests that don't need to be in the docs.
- To show documentation for a class: for a class cell with #| export, its signature will automatically appear in the docs, but the methods inside will NOT.
- To show documentation for methods, define them in their own cells and then use show_doc(method_name). For methods that belong to a class, use @patch. For static methods, use:
@staticmethod
@patch_to(Class)
- If a cell has neither #| export nor #| hide, it will still appear in the docs in its literal form.
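The rules above can be summarized as a small decision function. This is a simplified model of the behavior described in these notes, not nbdev's actual implementation; cell_destination and the in_module/in_docs keys are hypothetical names, and only the two directives discussed here are handled.

```python
def cell_destination(source: str) -> dict:
    """Say where a notebook cell ends up, given its source text (simplified model)."""
    directives = set()
    for line in source.splitlines():
        stripped = line.strip()
        # Directives are the leading "#|" comment lines of a cell.
        if not stripped.startswith("#|"):
            break
        words = stripped[2:].split()
        if words:
            directives.add(words[0])

    if "export" in directives:
        # Exported cells go to the module; the signature is documented.
        return {"in_module": True, "in_docs": True}
    if "hide" in directives:
        # Development-only cells appear nowhere.
        return {"in_module": False, "in_docs": False}
    # No directive: the cell is rendered literally in the docs.
    return {"in_module": False, "in_docs": True}


print(cell_destination("#| export\ndef f(): pass"))
print(cell_destination("#| hide\nassert 1 + 1 == 2"))
print(cell_destination("print('demo')"))
```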
Misc tips
- From time to time, restart, and run all cells. Restart the Jupyter kernel after making changes to dependent modules.
- For debugging, simply add a new cell containing %debug. Note that the pop-up input window appears at the top of the VSCode window.
- To suspend a process and put it into the background, use ^Z, then bg.
- Assert not level error when running nbdev_export: make sure all the import statements in all notebooks under the "nbs" folder are valid. If necessary, move the ones that are not working elsewhere to check.
- If nbdev_docs raises an error and things get cluttered, try deleting the _docs and _proc directories and re-running.
- I still encounter an nbdev_export SyntaxError from time to time, so I made a check_syntax gist to solve it.
FastHTML
Display code
from fasthtml.common import *

def example():
    return Div(
        H1("FastHTML APP"),
        P("Let's do this"),
        cls="go"
    )

ft_code = example()
print(ft_code)
print(to_xml(ft_code))
ft_code.__repr__()
ft_code.__html__()
print(ft_code.__repr__())
print(ft_code.__html__())
Conversion between the fasttag representation and the raw html representation
Markdown
from fasthtml.common import *
hdrs = (MarkdownJS(), )
app = FastHTML(hdrs=hdrs)

content = """
Here are some _markdown_ elements.
- This is a list item
- This is another list item
- And this is a third list item
**Fenced code blocks work here.**
"""

# @rt('/')
@app.route("/")
def get(req):
    # Indented together with the function body
    code_content = """
    Here are some code _markdown_ elements.
    - This is a list item
    - This is another list item
    - And this is a third list item
    **Fenced code blocks work here.**
    """
    # Deliberately flush left
    normal_content = """
Here are some normal _markdown_ elements.
- This is a list item
- This is another list item
- And this is a third list item
**Fenced code blocks work here.**
"""
    return Titled("Markdown rendering example", Div(content,cls="marked"), Div(normal_content, cls="marked"), Div(code_content, cls="marked"))

serve()
Note the difference between normal_content and code_content! This is because the Markdown parser interprets the leading spaces before the lines of code_content as a code block.
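One way to keep the string indented with the function body and still avoid the code-block rendering is to strip the common indentation before passing it on. The sketch below uses textwrap.dedent from the standard library; get_content is a hypothetical name, and this is a suggestion rather than something the FastHTML example itself does.

```python
import textwrap


def get_content() -> str:
    # Indented to line up with the function body, like code_content above.
    code_content = """
    Here are some _markdown_ elements.

    - This is a list item
    - This is another list item
    """
    # dedent removes the whitespace that all non-blank lines share.
    return textwrap.dedent(code_content)


cleaned = get_content()
print(cleaned)
```

After dedenting, no line starts with spaces, so a Markdown parser no longer treats the text as an indented code block.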
<script type="module">
import { marked } from "https://cdn.jsdelivr.net/npm/marked/lib/marked.esm.js";
import { proc_htmx } from "https://cdn.jsdelivr.net/gh/answerdotai/fasthtml-js/fasthtml.js";
proc_htmx('.marked', e => e.innerHTML = marked.parse(e.textContent));</script>
Test in notebook
from fasthtml.common import *
# Setting up the Starlette test client
from starlette.testclient import TestClient
# Notebook display helpers
from IPython.display import HTML, display

hdrs = (MarkdownJS(), )
app = FastHTML(hdrs=hdrs)

@app.route("/")
def get(req):
    content = """
Here are some _markdown_ elements.
- This is a list item
- This is another list item
- And this is a third list item
**Fenced code blocks work here.**
"""
    return Titled("Markdown rendering example", Div(content,cls="marked"))

client = TestClient(app)
print(client.get("/").text)
display(HTML(client.get("/").text)) # note that js-rendered markdown will not be shown in notebook output
# Loading tailwind and daisyui
headers = (Script(src="https://cdn.tailwindcss.com"),
           Link(rel="stylesheet", href="https://cdn.jsdelivr.net/npm/daisyui@4.11.1/dist/full.min.css"))

# Displaying a single message
d = Div(
    Div("Chat header here", cls="chat-header"),
    Div("My message goes here", cls="chat-bubble chat-bubble-primary"),
    cls="chat chat-start"
)
print(to_xml(d))
show(Html(*headers, d))
Jupyter Notebook
Absolute File Path
The __file__ attribute is not available in Jupyter notebooks. It's a built-in attribute in Python scripts that contains the path of the script that is currently being executed.
In a Jupyter notebook, you can use the os and IPython libraries to get the directory the notebook was started from:
import os
from IPython.core.getipython import get_ipython

# Get the current notebook's path
notebook_path = os.path.join(os.getcwd(), get_ipython().starting_dir)

# Specify the relative path to the directory
relative_path = "data/test_data"

# Construct the absolute path
file_dir = os.path.join(notebook_path, relative_path)

# assert the file_dir exists
assert os.path.isdir(file_dir)

# list 5 files in the file_dir
os.listdir(file_dir)[:5]
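A variant of the same idea with pathlib, which also runs outside a notebook, might look like the sketch below. It assumes the working directory is the notebook's directory (usually true in Jupyter, not guaranteed in every editor); resolve_relative and the temporary demo directory are hypothetical, used only to make the example runnable.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_relative(relative_path: str, base_dir: Optional[Path] = None) -> Path:
    """Resolve relative_path against base_dir, defaulting to the working directory."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / relative_path).resolve()


# Demo with a throwaway directory standing in for the project folder.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "data" / "test_data").mkdir(parents=True)

file_dir = resolve_relative("data/test_data", base_dir=demo_root)
print(file_dir.is_dir(), file_dir.is_absolute())
```

Passing base_dir explicitly makes the function easy to test; leaving it out falls back to the current working directory.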
VSCode mode
settings.json
"notebook.lineNumbers": "on",
"notebook.output.wordWrap": true
Conda
conda: error: argument COMMAND: invalid choice: 'activate'
Run conda init first, then restart the terminal and run conda activate xxx again. But this will initialize conda every time you start the shell, because it places the related commands in your shell init scripts such as bash_profile. If we also have other environment management tools like poetry or venv in use, we might not want to initialize conda all the time. We can instead activate conda explicitly:
conda env list
source /opt/homebrew/Caskroom/miniconda/base/bin/activate /opt/homebrew/Caskroom/miniconda/base/envs/venv_name
instead of
conda activate venv_name
Use the actual paths for activate and the virtual environment.
Alternatively, we can leave the conda initialization scripts on, but use:
conda config --set auto_activate_base false
to make sure conda does not automatically activate its base environment.
Streamlit
Streamlit re-runs the entire script each time an input changes. To avoid unnecessarily re-running one-time statements, we can use a caching decorator such as @st.cache_data.
For example, to cache the logger we import from logging_config:
import streamlit as st
from config.logging_config import get_logger

@st.cache_data()
def cached_get_logger(name):
    return get_logger(name)

logger = cached_get_logger(__name__)
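To see the effect of caching without running a Streamlit app, here is a minimal sketch that uses functools.lru_cache from the standard library as a stand-in for the Streamlit decorator. The call counter is a hypothetical addition for illustration, and Streamlit's own caching has extra behavior (such as argument hashing and persistence across reruns) that this does not model.

```python
import functools
import logging

call_count = 0  # counts how often the function body actually runs


@functools.lru_cache(maxsize=None)
def cached_get_logger(name: str) -> logging.Logger:
    global call_count
    call_count += 1
    return logging.getLogger(name)


# The second call is served from the cache, so the body runs only once.
first = cached_get_logger("app")
second = cached_get_logger("app")
print(call_count, first is second)
```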
Poetry
Configure lint
E501 stands for “line too long”. By default, PEP 8 recommends that lines should not exceed 79 characters.
[tool.autopep8]
ignore = [ "E501" ]
If using black instead of autopep8, you will need to configure it differently to disable the check (not recommended):
[tool.black]
line-length = 999 # very large number
When using black together with isort, we use
[tool.black]
line-length = 88
[tool.isort]
profile = "black"
to ensure that both isort and Black use compatible rules, particularly regarding line length and import style.
Brew
Purpose | Code | Note |
---|---|---|
Update brew | brew update | |
Upgrade brew installed packages | brew upgrade | |
Check installed Python versions | brew list \| grep python | |
Install a particular Python version | brew install python@3.11 | |
Upgrade a particular Python version | brew upgrade python@3.11 | |
Common logger setup
config.logging.py
import logging.config
import os
from pathlib import Path
import yaml

def configure_logging(log_config_path, log_directory):
    # Load the YAML config and point the file handler at the logs directory
    with open(log_config_path, 'r') as f:
        config = yaml.safe_load(f)
    config['handlers']['file']['filename'] = str(log_directory / config['handlers']['file']['filename'])
    logging.config.dictConfig(config)

project_root = Path(__file__).resolve().parent.parent
logs_directory = project_root / 'logs'
logs_directory.mkdir(exist_ok=True)

logging_config_path = project_root / 'config' / 'logging.yaml'
configure_logging(logging_config_path, logs_directory)

logger = logging.getLogger("package_name")
logger.info("Logging started")
config.logging.yml
version: 1
formatters:
  simple:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
  file:
    class: logging.FileHandler
    level: INFO
    formatter: simple
    filename: package_name.log
loggers:
  root:
    level: INFO
    handlers: [console, file]
  package_name:
    level: INFO
    handlers: [console, file]
root:
  level: INFO
  handlers: [console, file]
We can then use it by:
import logging
logger = logging.getLogger(__name__)
This will place the package_name.log logfile under the project_root/logs directory.
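The following self-contained sketch shows why logging.getLogger(__name__) in a submodule picks up this setup: a logger named package_name.submodule passes its records up to the configured package_name logger. It builds the config as a Python dict instead of loading YAML and writes to a temporary directory; build_config, the temporary directory, and the propagate and disable_existing_loggers settings are my assumptions, not part of the original config.

```python
import logging
import logging.config
import tempfile
from pathlib import Path


def build_config(log_directory: Path) -> dict:
    """Dict equivalent of the YAML above, trimmed to one named logger."""
    return {
        "version": 1,
        "disable_existing_loggers": False,  # assumption: keep other loggers alive
        "formatters": {
            "simple": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
        },
        "handlers": {
            "console": {"class": "logging.StreamHandler", "level": "INFO", "formatter": "simple"},
            "file": {
                "class": "logging.FileHandler",
                "level": "INFO",
                "formatter": "simple",
                "filename": str(log_directory / "package_name.log"),
            },
        },
        "loggers": {
            "package_name": {
                "level": "INFO",
                "handlers": ["console", "file"],
                "propagate": False,  # assumption: avoid duplicate lines via root
            }
        },
    }


log_dir = Path(tempfile.mkdtemp())  # stands in for project_root / 'logs'
logging.config.dictConfig(build_config(log_dir))

# A module inside the package would call logging.getLogger(__name__).
logger = logging.getLogger("package_name.submodule")
logger.info("Logging started")

log_text = (log_dir / "package_name.log").read_text()
print(log_text)
```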
clear ANSI codes
Most modern terminal emulators support ANSI escape codes and can interpret them to display colored and formatted text. If we have the logs saved in a text file, we can examine them in the terminal with cat or less -R, which will display the formats and colors.
If we use a text editor or some other program that doesn't interpret these codes, we can strip the codes from the output. For example:
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" output.log > clean_output.log
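The same cleanup can be done in Python when sed is not convenient. This is a sketch with a hypothetical strip_ansi helper; its pattern matches ANSI CSI sequences in general, so it is somewhat broader than the sed expression above.

```python
import re

# CSI sequence: ESC [ then parameter bytes, intermediate bytes, and a final byte.
ANSI_CSI_RE = re.compile(r"\x1B\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove ANSI CSI escape sequences (colors, cursor moves) from text."""
    return ANSI_CSI_RE.sub("", text)


colored = "\x1b[1;31mERROR\x1b[0m disk full"
print(strip_ansi(colored))
```

To clean a whole log file, read it, pass the text through strip_ansi, and write the result to a new file.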
TMUX
Function | Command | Note |
---|---|---|
create a session | tmux new -s Session1 | |
list sessions | tmux ls | |
attach to a session | tmux attach -t Session1 | |
kill a session | tmux kill-session -t Session1 | |
leader key within tmux | ctrl a | |
split window into panes v/h | ctrl a \| or - | |
resize pane | ctrl a then i/j/h/k | |
maximize pane | ctrl a m | |
create a window in a session | ctrl a c | |
navigate windows | ctrl a 0/1/p/w | |
list windows | ctrl a w | |
rename a window | ctrl a , | |
list sessions | ctrl a s | |
navigate sessions | k or j | |
exit a session | tmux detach | |
install plugin | ctrl a I | |
reload configuration | ctrl a r | |
copy mode | ctrl a [ | use k/j/ctrl+u/d/b/f/shift+k/j to select and v/y to copy, or mouse scrolling; use ctrl c to exit copy mode |
detach from a session | ctrl a then d | |
avoid nested tmux sessions on remote server | ssh -t user@remote 'tmux attach-session -t 0' or ssh -t user@remote 'tmux attach' | |
The above uses ctrl a and other custom settings from the great TMUX Dev Flow configuration by Josean Martinez.
Diagramming
Excalidraw
To run and use Excalidraw locally:
git clone https://github.com/excalidraw/excalidraw.git
cd excalidraw
yarn
yarn start
Mermaid
Quarto
Commonly used commands after setup
quarto preview # to preview the website
quarto render # to render the website (recommended before push)
Use quarto preview or shift+cmd+k to preview side-by-side.
When publishing from gh-pages instead of the _docs directory, use quarto publish gh-pages in the "main" branch.
Cross references
Cross-references can also be created in floats or with divs.
Code content
There are several ways to publish content that includes code. For a simple blog post, we can embed code (and its corresponding results) directly into its qmd file. Alternatively, we can create a separate Jupyter notebook and embed parts of it and its results into the blog's qmd file.
For more comprehensive projects, we might consider other formats offered by Quarto, such as Manuscripts and Books. For instance, a Quarto manuscript has an article view that only shows the results of code execution (like graph visualizations) in the article body, and a notebook view that reveals all the underlying notebook code. This includes the notebook version of the article itself, along with additional notebooks as needed. It is essentially a qmd file creating the index page with embedded cells from companion Jupyter notebooks. We can also download these notebooks directly. In comparison, Quarto books support multiple chapters and cross-referencing in their HTML format. They also provide a normal view with its companion source code repository.
Manuscripts and Books are really just different Quarto project types that have their own customized behavior, similar to other types like websites and blogs. A key file that specifies the behavior is the _quarto.yml configuration file.
Using qmd or ipynb?
Both file formats allow us to embed code. The following guidelines can be helpful:
- If the subject is primarily a non-coding topic and we are simply embedding code to supplement the presentation, such as creating graph visualizations, then it is good to go with qmd and embed code in it.
- If the subject is primarily a coding topic, it is easier to start with ipynb natively.
For qmd files, it is easier to go with VSCode or text editors like NeoVim. For ipynb files, Jupyter Lab/Notebook is the native approach, but VSCode also offers a surprisingly good Notebook Editor.
Here are examples of authoring a manuscript in qmd with VSCode and in ipynb in Jupyter Lab.
Tex errors
If compilation fails, quarto publish gh-pages will not publish to the website!
If quarto render reports TeX-related errors, check the index.tex and index.log files.
! LaTeX Error: Something's wrong--perhaps a missing \item.
# References {.unnumbered}
::: {#refs}
:::
This empty references block could be the cause. To locate the list environments in the generated TeX file:
❯ grep -n -A 1 '\\begin{itemize}\|\\begin{enumerate}\|\\begin{description}' index.tex
Draft setting bugs
The latest Quarto 1.5 seems to have bugs with setting posts as drafts. After each render and preview, the draft posts reappear in the list. One way to get rid of them is to comment out or add back the drafts: attribute in the front matter of the listing qmd file or in the site's _quarto.yml file.
But this could reset the custom domain setting and may require adding the custom domain again.
Quarto slides
Note: We can use the following format settings:
format:
  # html: default
  revealjs:
    multiplex: true
    self-contained: true
    slide-number: true
    # chalkboard:
    #   buttons: false
    preview-links: auto
Then it will produce a local index-speaker.html used for presentation, and an index.html on the server for download.
Note: the chalkboard plugin cannot be combined with self-contained output:
ERROR: Reveal plugin 'RevealChalkboard' is not compatible with self-contained output
Quarto subscription
AWS concepts
- Resources: S3 bucket, KMS key, EC2 instances
- Principals: users, services, or accounts
- Policies: permissions
- Roles: list of permissions
- User: persona associated with roles
- Resource-based policies: specify who (the principal) has permission to access the resources.
- Trust policies (relationships): specify which principals are allowed to assume the role.
- User/role-based policies: attach directly to users or roles (no Principal is needed here).
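The three policy kinds differ mainly in whether a Principal is named. Here is a sketch of the JSON shapes written as Python dicts; the account ID, role name, and bucket name are placeholders I made up for illustration, and names_principal is a hypothetical helper.

```python
import json

ACCOUNT_ID = "123456789012"  # placeholder account ID
ROLE_ARN = f"arn:aws:iam::{ACCOUNT_ID}:role/ExampleRole"  # placeholder role
BUCKET_OBJECTS = "arn:aws:s3:::example-bucket/*"  # placeholder bucket

# Resource-based policy: attached to the bucket, names who may access it.
resource_based_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ROLE_ARN},
        "Action": "s3:GetObject",
        "Resource": BUCKET_OBJECTS,
    }],
}

# Trust policy: attached to the role, names who may assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# User/role-based policy: attached to the identity, so no Principal is listed.
identity_based_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": BUCKET_OBJECTS,
    }],
}


def names_principal(policy: dict) -> bool:
    """True if every statement in the policy names a Principal."""
    return all("Principal" in stmt for stmt in policy["Statement"])


print(json.dumps(trust_policy, indent=2))
```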
Fasthtml
Getting started
How to run the demo app under examples?
pip install . # install from source
uvicorn examples.app:app --reload
Installation
brew install railway # install railway Cli
pip install -U python-fasthtml # install FastHTML
railway login # login to `railway`
Move to the project directory (this is important, otherwise the railway up -c command will fail to recognize what type of project it is)
cd simple
railway init -n project_name
railway up -c # this takes several minutes, wait for finish
railway domain
fh_railway_link
railway volume add -m /app/data
Rancher for Docker
If we got:
RuntimeError: Docker is not running. Please start Docker and try again
Try to set the current environment variable (in the specific virtual environment if applicable).
export DOCKER_HOST=unix:///Users/username/.rd/docker.sock
Check and confirm in Python:
os.environ['DOCKER_HOST'] = 'unix:///Users/username/.rd/docker.sock'
We need to rebuild the specific containers.
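To avoid hard-coding the user name, the socket URL can be built from the home directory. This is a small sketch assuming Rancher Desktop keeps its socket at .rd/docker.sock under the home directory, as in the export line above; rancher_docker_host is a hypothetical helper.

```python
from pathlib import PurePosixPath


def rancher_docker_host(home_dir: str) -> str:
    """Build the DOCKER_HOST value for a socket under <home>/.rd/docker.sock."""
    socket_path = PurePosixPath(home_dir) / ".rd" / "docker.sock"
    return f"unix://{socket_path}"


host = rancher_docker_host("/Users/username")
print(host)
```

In a real session, the argument could come from os.path.expanduser("~") and the result could be assigned to os.environ["DOCKER_HOST"] before creating the Docker client.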
Common commands:
docker context create rancher-desktop
docker context use rancher-desktop
docker context ls
docker context inspect rancher-desktop
curl --unix-socket /Users/username/.rd/docker.sock http://v1.24/version
A simple script to test Docker
import docker
import os

print("DOCKER_HOST:", os.getenv('DOCKER_HOST'))
print("Initializing Docker client...")
try:
    client = docker.from_env()
    if client.ping():
        print("Docker client initialized and server ping successful")
except Exception as e:
    print(f"Docker client initialization or ping failed: {e}")
    raise e
If we initiated a Docker client
docker_client = docker.from_env()
and need to get the container, we should use
docker_client.containers.get(container_name)
and avoid using
docker.DockerClient().containers.get(container_name)
The latter might work for Docker Desktop but could cause issues with Rancher.
(An alternative to the above might be to expose Rancher's Docker-compatible API over the default Unix socket: docker context create rancher-desktop --docker "host=unix:///var/run/docker.sock")
Misc
Use vars() to display local variables or an object's attributes.
try:
    del obj
    print("Deleted obj")
except NameError:
    pass
if 'obj' in locals():
    del obj
assert 'obj' not in locals(), "obj not deleted"
import os
from IPython.core.getipython import get_ipython

def get_absolute_path_in_notebook(relative_path: str) -> str:
    # Get the current notebook's path
    notebook_path = os.path.join(os.getcwd(), get_ipython().starting_dir)
    # Construct the absolute path
    absolute_path = os.path.join(notebook_path, relative_path)
    return absolute_path

file_path_abs = get_absolute_path_in_notebook(file_path_relative)
assert os.path.isfile(file_path_abs), f"File not found: {file_path_abs}"