CLI
Watch Mode
You can run Evalite in watch mode by running evalite watch:
evalite watch
This will watch for changes to your .eval.ts files and re-run the evals when they change.
[!IMPORTANT]
I strongly recommend implementing a caching layer in your LLM calls when using watch mode. This will keep your evals running fast and avoid burning through your API credits.
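As a rough illustration of what such a caching layer could look like, here is a minimal sketch that stores LLM responses on disk, keyed by a hash of the prompt. The names withFileCache, LLMCall, and the .llm-cache directory are hypothetical and are not part of Evalite. The sketch assumes your LLM call takes a single prompt string and returns a string, so adapt it to whatever client you actually use.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape of the function that calls your LLM (an assumption).
type LLMCall = (prompt: string) => Promise<string>;

// Wraps an LLM call so that repeated prompts are served from disk.
// The default cache directory name is an arbitrary choice for this sketch.
export function withFileCache(call: LLMCall, cacheDir = ".llm-cache"): LLMCall {
  mkdirSync(cacheDir, { recursive: true });

  return async (prompt) => {
    // Key each entry by a hash of the prompt so file names stay short and safe.
    const key = createHash("sha256").update(prompt).digest("hex");
    const file = join(cacheDir, `${key}.json`);

    // Cache hit: return the stored response without calling the LLM.
    if (existsSync(file)) {
      return JSON.parse(readFileSync(file, "utf8")) as string;
    }

    // Cache miss: call the LLM once and persist the response.
    const result = await call(prompt);
    writeFileSync(file, JSON.stringify(result));
    return result;
  };
}
```

Because the cache lives on disk rather than in memory, it should survive each re-run that watch mode triggers. Delete the cache directory whenever you want fresh responses.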
Running Specific Files
You can run specific files by passing them as arguments:
evalite my-eval.eval.ts
This also works in watch mode:
evalite watch my-eval.eval.ts
Threshold
You can tell Evalite that your evals must reach a minimum score by passing --threshold:
evalite --threshold=50 # Score must be greater than or equal to 50
evalite watch --threshold=70 # Also works in watch mode
This is useful for running on CI. If the score falls below the threshold, Evalite fails the process.
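If you drive CI from a Node script instead of a shell step, the same check might be sketched as follows. This assumes that "failing the process" means a non-zero exit code and that evalite can be launched through npx. The passed helper is a hypothetical name for illustration, not an Evalite API.

```typescript
import { spawnSync } from "node:child_process";

// Runs a command and reports whether it exited with code 0.
// Output is streamed to the parent terminal so CI logs show the eval results.
export function passed(command: string, args: string[]): boolean {
  const result = spawnSync(command, args, { stdio: "inherit" });
  return result.status === 0;
}

// Example usage (assumes evalite is installed and reachable via npx):
// if (!passed("npx", ["evalite", "--threshold=50"])) process.exit(1);
```

In a plain shell-based CI step no wrapper is needed, since the failing process should stop the job by itself.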