

Image by Author
If you train models from a single notebook, you have probably hit the same headaches: you tweak five knobs, retrain, and by Friday you cannot remember which run produced the "good" ROC curve or which data slice you used. Weights & Biases (W&B) exists to fix exactly that.
Below is a practical tour. It is opinionated, light on ceremony, and aimed at teams who want a clean experiment history without building their own platform. Call it a crash course.
. Why W&B at All?
Notebooks grow into experiments. Experiments multiply. Soon you are asking: which run used this data slice? Why is today's ROC curve better? Can I reproduce last week's baseline?
W&B gives you one place to:
- Log metrics, configurations, plots, and system stats
- Version datasets and models with artifacts
- Run hyperparameter sweeps
- Share dashboards without screenshots
You can start small and layer in features as needed.
. Setup in 60 seconds
Start by installing the library and logging in with your API key. If you don't have one yet, you can find it here.
pip install wandb
wandb login  # paste your API key once
!! A minimal sanity check
import wandb, random, time
wandb.init(project="kdn-crashcourse", name="hello-run", config={"lr": 0.001, "epochs": 5})
for epoch in range(wandb.config.epochs):
    loss = 1.0 / (epoch + 1) + random.random() * 0.05
    wandb.log({"epoch": epoch, "loss": loss})
    time.sleep(0.1)
wandb.finish()
Now you should see something like this:


Image by Author
Now let's get to the useful bits.
. Keeping Track of Experiments
!! Log hyperparameters and metrics
Treat wandb.config as the single source of truth for your experiment's knobs. Give metrics clear names so the charts group automatically.
cfg = dict(arch="resnet18", lr=3e-4, batch=64, seed=42)
run = wandb.init(project="kdn-mlops", config=cfg, tags=["baseline"])
# training loop ...
for step, (x, y) in enumerate(loader):
    # ... compute loss, acc
    wandb.log({"train/loss": loss.item(), "train/acc": acc, "step": step})
# log a final summary
run.summary["best_val_auc"] = best_auc
A few pointers:
- Use namespaces like train/loss or val/auc so charts group automatically
- Add tags like "lr-finder" or "fp16" so you can filter runs later; tags are also queryable from code, as in the sketch after this list
- Use run.summary[...] for the headline results you want to see on the run card
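Tags pay off when you want to query runs from code as well as from the UI. Here is a minimal sketch using the public API; the entity/project path and the MongoDB-style filter are illustrative assumptions:
import wandb

# Hypothetical entity/project path; replace with your own.
api = wandb.Api()
runs = api.runs("my-entity/kdn-mlops", filters={"tags": {"$in": ["baseline"]}})

# Print the headline summary metric for each matching run.
for r in runs:
    print(r.name, r.summary.get("best_val_auc"))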
!! Log images, confusion matrices, and custom plots
wandb.log({
    "val/confusion": wandb.plot.confusion_matrix(
        preds=preds, y_true=y_true, class_names=classes)
})
You can also log any Matplotlib figure:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(history)
wandb.log({"training/curve": fig})!! Model with version datasis and samples
Artifacts answer questions such as "Which files did this run use?" and "Which data did we train on?" No more final_final_v3.parquet mystery.
import wandb
run = wandb.init(project="kdn-mlops")
# Create a dataset artifact (run once per version)
raw = wandb.Artifact("imdb_reviews", type="dataset", description="raw dump v1")
raw.add_dir("data/raw") # or add_file("path")
run.log_artifact(raw)
# Later, consume the latest version
artifact = run.use_artifact("imdb_reviews:latest")
data_dir = artifact.download()  # folder path pinned to a hash
Log your model similarly:
import torch
import wandb
run = wandb.init(project="kdn-mlops")
model_path = "models/resnet18.pt"
torch.save(model.state_dict(), model_path)
model_art = wandb.Artifact("sentiment-resnet18", type="model")
model_art.add_file(model_path)
run.log_artifact(model_art)
Now the lineage is clear: this model came from this data, under this commit of the code.
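If you want to walk that lineage from code instead of the UI, the public API can do it. A sketch, assuming an artifact path like the one above (adjust entity and project to your own):
import wandb

api = wandb.Api()
model = api.artifact("my-entity/kdn-mlops/sentiment-resnet18:latest")

producer = model.logged_by()      # the run that created this model version
print("trained by:", producer.name)
for consumer in model.used_by():  # runs that have consumed it since
    print("used by:", consumer.name)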
!! Tables for diagnostics and error analysis
wandb.Table is a lightweight data frame for predictions and slices.
table = wandb.Table(columns=["id", "text", "pred", "true", "prob"])
for r in batch_results:
    table.add_data(r.id, r.text, r.pred, r.true, r.prob)
wandb.log({"eval/preds": table})
Filter the table in the UI to find failure patterns (e.g., short reviews, rare classes).
!! Hyperparameter sweeps
Describe the search space in YAML, launch agents, and let W&B coordinate.
# sweep.yaml
method: bayes
metric: {name: val/auc, goal: maximize}
parameters:
  lr: {min: 1.0e-5, max: 1.0e-2}
  batch: {values: [32, 64, 128]}
  dropout: {min: 0.0, max: 0.5}
Start the sweep:
wandb sweep sweep.yaml  # returns a SWEEP_ID
wandb agent <entity>/<project>/<SWEEP_ID>  # run one or more agents
Your training script should read lr, batch, and the rest from wandb.config. The sweep dashboard shows top trials, parallel coordinates, and parameter importance.
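For reference, the training script itself stays ordinary. A minimal sweep-compatible shape, where build_model and train_one_epoch are hypothetical stand-ins for your own code:
# train.py
import wandb

def main():
    run = wandb.init(project="kdn-mlops")  # the agent injects the sweep's values
    cfg = wandb.config                     # lr, batch, dropout come from sweep.yaml
    model = build_model(dropout=cfg.dropout)                 # hypothetical helper
    for epoch in range(10):
        val_auc = train_one_epoch(model, cfg.lr, cfg.batch)  # hypothetical helper
        wandb.log({"val/auc": val_auc})

if __name__ == "__main__":
    main()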
. Drop-in Integrations
Pick the ones you use and keep moving.
!! PyTorch Lightning
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
logger = WandbLogger(project="kdn-mlops")
trainer = pl.Trainer(logger=logger, max_epochs=10)
!! Keras
import wandb
from wandb.keras import WandbCallback
wandb.init(project="kdn-mlops", config={"epochs": 10})
model.fit(X, y, epochs=wandb.config.epochs, callbacks=[WandbCallback()])
!! Scikit-learn
from sklearn.metrics import roc_auc_score
wandb.init(project="kdn-mlops", config={"C": 1.0})
# ... fit model
wandb.log({"val/auc": roc_auc_score(y_true, y_prob)}). Model Registry and Staging
Think of the registry as a shortlist of your best models. You push an artifact once, then point alias names like staging or production at it, so downstream code can pull the right model without hard-coding file paths.
run = wandb.init(project="kdn-mlops")
art = run.use_artifact("sentiment-resnet18:latest")
registry = wandb.sdk.artifacts.model_registry.ModelRegistry()
entry = registry.push(art, name="sentiment-classifier")
entry.aliases.add("staging")When you promote a new construction, flip the alias. Users always read sentiment-classifier:production.
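On the consuming side, serving code asks only for the alias. A sketch; the exact registry path format depends on your entity and registry setup:
import wandb

run = wandb.init(project="kdn-mlops", job_type="inference")
# "my-entity" and the registry path are illustrative.
model_art = run.use_artifact("my-entity/model-registry/sentiment-classifier:production")
model_dir = model_art.download()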
. Reproducibility Checklist
- Config: store every hyperparameter in wandb.config
- Code and commit: use wandb.init(settings=wandb.Settings(code_dir=".")) to attach a code snapshot, or rely on CI to record the git SHA
- Environment: log requirements.txt or your Docker tag and add it to an artifact, as in the sketch after this list
- Seeds: log them and set them
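The environment bullet in code form, as a minimal sketch (it assumes a requirements.txt at the project root):
import wandb

run = wandb.init(project="kdn-mlops")
env = wandb.Artifact("run-environment", type="environment")
env.add_file("requirements.txt")  # or a file recording your Docker tag
run.log_artifact(env)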
A minimal seed helper:
def set_seeds(s=42):
    import random, numpy as np, torch
    random.seed(s)
    np.random.seed(s)
    torch.manual_seed(s)
    torch.cuda.manual_seed_all(s)
. Collaboration and Sharing Without Screenshots
Add notes and tags so colleagues can find your runs. Use Reports to stitch charts, tables, and commentary into a link you can drop into Slack or a PR. Stakeholders can follow along without opening a notebook.
. CI and Automation Tips
- Run wandb agent on training nodes to drive sweeps from CI
- Log a dataset artifact after your ETL job; training jobs can then depend on that exact version
- After evaluation passes, promote the model alias (staging → production)
- Set WANDB_API_KEY as a secret, and group related runs with WANDB_RUN_GROUP, as in the sketch after this list
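The grouping tip, sketched in Python; the fallback group name is just an example:
import os
import wandb

# WANDB_RUN_GROUP is picked up automatically; passing group= explicitly also works.
run = wandb.init(
    project="kdn-mlops",
    group=os.environ.get("WANDB_RUN_GROUP", "nightly-eval"),
    job_type="ci-eval",
)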
. Privacy and Reliability Notes
- Default to private projects for teams
- Use offline mode for air-gapped runs: export WANDB_MODE=offline, then train as usual and run wandb sync later
- Don't log raw PII; if you need an identifier, hash it before logging, as in the sketch after this list
- For large files, store them as artifacts instead of attaching them to wandb.log
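For the PII point, a tiny helper like this keeps raw identifiers out of your logs (a sketch; the truncation length is arbitrary):
import hashlib

def hash_id(user_id: str) -> str:
    # One-way hash so rows can still be joined without exposing the raw ID.
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]

# wandb.log({"user": hash_id(raw_user_id), "pred": pred})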
. Common Snags (and Quick Fixes)
- "My run didn't log." The script may have crashed before wandb.finish() was called. Also check that you haven't set WANDB_DISABLED=true in your environment.
- Logging feels slow. Logging every step is fine, but save heavy assets such as images or tables for the end of an epoch. You can also pass commit=False to wandb.log() to batch several calls into one step.
- Seeing duplicate runs in the UI? If you restart from a checkpoint, set id and resume="allow" in wandb.init() to continue the same run, as in the sketch after this list.
- Experiments drifting as the data changes? Put each dataset snapshot in an artifact and pin your runs to an explicit version.
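The resume fix, concretely. A sketch; generate and persist your own stable run id instead of the literal shown here:
import wandb

# Reusing the same id across restarts continues one run instead of creating duplicates.
run = wandb.init(
    project="kdn-mlops",
    id="exp-042",      # illustrative; load this from a checkpoint or env var
    resume="allow",
)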
. Pocket Cheat Sheet
!! 1. Start a run
wandb.init(project="proj", config=cfg, tags=["baseline"])
!! 2. Log metrics, images, or tables
wandb.log({"train/loss": loss, "img": wandb.Image(img)})
!! 3. Version a dataset or model
art = wandb.Artifact("name", type="dataset")
art.add_dir("path")
run.log_artifact(art)
!! 4. Use an artifact
path = run.use_artifact("name:latest").download()
!! 5. Run a sweep
wandb sweep sweep.yaml && wandb agent <entity>/<project>/<SWEEP_ID>
. Wrapping Up
Start small: launch a run, log a few metrics, and save your model file as an artifact. When that feels natural, add a sweep and a short report. You will end up with reproducible experiments, versioned data and models, and a dashboard that explains your work without a slide deck.
Josep Ferrer is an analytics engineer from Barcelona. He graduated in physics engineering and currently works in the data science field applied to human mobility. He is a part-time content creator focused on data science and technology. Josep writes on all things AI, covering the application of the ongoing explosion in the field.