Weights & Biases Weave

Open-source toolkit for tracing, evaluating, and iterating on LLM apps.

Open source

Best when an ML team already uses W&B and wants LLM traces, evals, and comparisons to sit alongside its existing experiment tracking.

Selection advice

Choose Weave when your team already treats model iteration as tracked experiments, not one-off debugging sessions.

Best for

  • LLM trace inspection in notebooks
  • eval-driven iteration loops
  • teams with existing W&B workflows

Not ideal for

  • teams with no experiment-tracking culture
  • production-only ops teams that avoid notebook workflows

Core concepts

  • traces
  • evals
  • scorers
  • datasets
  • experiments

Minimal implementation shape

Wrap an agent function with Weave tracing, inspect intermediate steps, and compare eval scores across three prompt variants.
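
The shape described above might look roughly like the following sketch. It assumes the `weave` Python package, using its `weave.init` and `weave.op` entry points; everything else (the `traced` helper, the `WEAVE_PROJECT` environment variable, the prompt variants, the tiny dataset, and the stand-in `fake_model`) is hypothetical and exists only for illustration. The comparison is a plain Python loop so the example runs offline; Weave also provides its own `weave.Evaluation` class, which would be the more natural choice in a real project.

```python
import os

try:
    import weave  # assumed installed via `pip install weave`
except ImportError:  # keep the sketch runnable without the library
    weave = None


def traced(fn):
    """Hypothetical helper: wrap fn as a Weave op if Weave is available."""
    return weave.op()(fn) if weave is not None else fn


# WEAVE_PROJECT is a made-up variable name; traces are only logged once
# weave.init has been called with a project name.
if weave is not None and os.environ.get("WEAVE_PROJECT"):
    weave.init(os.environ["WEAVE_PROJECT"])

# Three illustrative prompt variants to compare.
PROMPTS = {
    "terse": "Answer: {question}",
    "polite": "Please answer briefly: {question}",
    "stepwise": "Think step by step, then answer: {question}",
}

# Tiny illustrative dataset.
DATASET = [
    {"question": "2+2", "expected": "4"},
    {"question": "3+5", "expected": "8"},
]


@traced
def build_prompt(variant: str, question: str) -> str:
    return PROMPTS[variant].format(question=question)


@traced
def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: adds the two numbers after the last colon.

    Contrived on purpose: a "step by step" prompt yields a wordier answer,
    so the variants end up with different scores.
    """
    left, right = prompt.rsplit(":", 1)[1].strip().split("+")
    total = str(int(left) + int(right))
    return f"The answer is {total}" if "step by step" in prompt else total


@traced
def agent(variant: str, question: str) -> str:
    # Each nested call is a separate traced step when Weave is active.
    return fake_model(build_prompt(variant, question))


@traced
def exact_match(expected: str, output: str) -> float:
    return 1.0 if output.strip() == expected else 0.0


def evaluate(variant: str) -> float:
    """Mean exact-match score for one prompt variant over DATASET."""
    scores = [
        exact_match(row["expected"], agent(variant, row["question"]))
        for row in DATASET
    ]
    return sum(scores) / len(scores)


results = {variant: evaluate(variant) for variant in PROMPTS}
for variant, score in results.items():
    print(f"{variant}: {score:.2f}")
```

Because `build_prompt` and `fake_model` are decorated separately, they should appear as child steps under each `agent` call in the trace view, which is what makes intermediate steps inspectable. Swapping `fake_model` for a real model call leaves the rest of the structure unchanged.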
