Cerebras/vscode-cerebras-chat v0.1.18
v0.1.18 - Rate Limit Optimization & Model Updates
Repository: Cerebras/vscode-cerebras-chat
Tag: v0.1.18
Published: 2025-12-08T21:35:19Z
Prerelease: no
Release notes:
Features
- Use conservative `max_completion_tokens` defaults (8192) to prevent premature rate limiting
- Cerebras rate limiter estimates quota based on `max_completion_tokens` upfront, not actual usage
- Lower defaults preserve rate limit headroom for agentic tools
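
To illustrate why a lower default helps, here is a rough sketch of upfront quota accounting. It assumes the limiter reserves prompt tokens plus `max_completion_tokens` per request against a tokens-per-minute quota. That formula, the quota value, the prompt size, and the function names are all assumptions for illustration; the release notes only state that the estimate is based on `max_completion_tokens` upfront rather than on actual usage.

```typescript
// Hypothetical model of an upfront rate limiter. The real Cerebras
// accounting may differ; this only illustrates the effect of the cap.
function reservedTokens(promptTokens: number, maxCompletionTokens: number): number {
  // Assumption: the full completion cap is reserved before generation starts,
  // regardless of how many tokens the response actually uses.
  return promptTokens + maxCompletionTokens;
}

function requestsPerMinute(
  quota: number,
  promptTokens: number,
  maxCompletionTokens: number,
): number {
  // Number of whole requests that fit into the per-minute quota.
  return Math.floor(quota / reservedTokens(promptTokens, maxCompletionTokens));
}

const quota = 100000; // assumed tokens-per-minute quota, illustration only
const prompt = 2000; // assumed prompt size, illustration only

// A cap equal to a large model maximum reserves most of the quota at once.
const withLargeCap = requestsPerMinute(quota, prompt, 65536);
// The conservative 8192 default leaves room for several requests.
const withDefaultCap = requestsPerMinute(quota, prompt, 8192);

console.log({ withLargeCap, withDefaultCap });
```

Under these assumed numbers, the smaller cap lets an agentic tool issue several requests per minute where the larger cap would allow only one, even if every response is short.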
Fixes
- Update `llama-3.3-70b`: maxInputTokens to 131072, maxOutputTokens to 65536
- Update `qwen-3-235b-a22b-instruct-2507`: maxOutputTokens to 40960
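
The updated limits and the 8192 default could interact roughly as in the sketch below. The table shape, `DEFAULT_MAX_COMPLETION_TOKENS`, and `resolveMaxCompletionTokens` are hypothetical names, not the extension's actual code; only the numeric limits come from these release notes.

```typescript
// Hypothetical limits table; field names mirror the release notes,
// but the structure is an assumption, not the extension's real source.
interface ModelLimits {
  maxInputTokens?: number; // left undefined where the notes give no value
  maxOutputTokens: number;
}

const DEFAULT_MAX_COMPLETION_TOKENS = 8192;

const MODEL_LIMITS: Record<string, ModelLimits> = {
  "llama-3.3-70b": { maxInputTokens: 131072, maxOutputTokens: 65536 },
  "qwen-3-235b-a22b-instruct-2507": { maxOutputTokens: 40960 },
};

// Choose the completion cap to send: the caller's value if given, otherwise
// the conservative default, never exceeding the model's output limit.
function resolveMaxCompletionTokens(model: string, requested?: number): number {
  const limits = MODEL_LIMITS[model];
  const wanted = requested ?? DEFAULT_MAX_COMPLETION_TOKENS;
  return limits ? Math.min(wanted, limits.maxOutputTokens) : wanted;
}

console.log(resolveMaxCompletionTokens("llama-3.3-70b"));
console.log(resolveMaxCompletionTokens("qwen-3-235b-a22b-instruct-2507", 100000));
```

Keeping the default separate from the per-model maximum means a caller can still opt in to longer outputs, while ordinary requests reserve only the smaller amount.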