What does this repo signal mean?

Fireworks AI published the Python repository fw-ai/llm_eval_meta. Repo signals like this one surface tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo fw-ai/llm_eval_meta · language Python · New repo, negligible traction. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Fireworks AI Repo: fw-ai/llm_eval_meta

Captured source

GitHub · github.com/fw-ai/llm_eval_meta

fw-ai/llm_eval_meta repository metadata

published Aug 3, 2024 · seen Jun 5 · captured Jun 11 · http 200 · method plain

fw-ai/llm_eval_meta

Description: Repro for official Llama 3.1 Benchmarks

Language: Python

License: Apache-2.0

Stars: 2

Forks: 0

Open issues: 1

Created: 2024-08-03T20:48:40Z

Pushed: 2025-04-08T22:55:55Z

Default branch: main

Fork: no

Archived: no

README:

Example script to run

OPENAI_API_KEY= python run_meta_benchmarks.py --model-size 8b --provider fw --output-dir gsm8k/fw_3p1_8b/ --eval-set evals__gsm8k__details

# Note: if this crashes because of a rate limit or some other error, rerun the same command to continue; all previous requests are persisted.
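
The resume behaviour described in the note above could be implemented in many ways; the README does not show how the repo does it. The following is a minimal sketch under stated assumptions: one JSON file per request, named by its index in the eval set. The function name `run_with_resume`, the `call_model` callback, and the file layout are all hypothetical and are not taken from `run_meta_benchmarks.py`.

```python
import json
from pathlib import Path
from typing import Callable, Sequence


def run_with_resume(
    prompts: Sequence[str],
    output_dir: str,
    call_model: Callable[[str], str],
) -> int:
    """Send each prompt once, skipping prompts whose response is already on disk.

    Returns the number of requests actually sent during this run.
    NOTE: hypothetical helper; the real script may persist responses differently.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    sent = 0
    for i, prompt in enumerate(prompts):
        path = out / f"{i}.json"  # assumed layout: one file per request index
        if path.exists():
            continue  # answered in an earlier run, so do not resend
        response = call_model(prompt)
        # Write to a temporary file first, then rename, so that a crash
        # mid-write never leaves a half-written file that looks complete.
        tmp = path.with_suffix(".json.tmp")
        tmp.write_text(json.dumps({"prompt": prompt, "response": response}))
        tmp.replace(path)
        sent += 1
    return sent
```

With this shape, rerunning the same command after a crash only pays for the requests that never completed, which matches the behaviour the note describes.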

python analyze_answers.py --task evals__gsm8k__details --response-path gsm8k/fw_3p1_8b/

> Accuracy: 0.8529188779378317 evals__gsm8k__details gsm8k/fw_3p1_8b/

Tasks supported so far are evals__mmlu__details, evals__mmlu__0_shot__cot__details, evals__gsm8k__details, evals__mmlu_pro__details.

Note - we don't know the exact answer extraction logic Meta uses, so we rolled our own. Discrepancies may be a result of this.
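
To illustrate why extraction logic can shift the reported accuracy, here is a minimal sketch of one common heuristic for GSM8K-style answers: take the last number in the response and compare it to the gold answer. This is an assumption for illustration only. It is neither Meta's logic nor the logic in `analyze_answers.py`, and the names `extract_final_number` and `accuracy` are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches integers and decimals, optionally negative, with thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(response: str) -> Optional[str]:
    """Return the last number in the response as a normalized string, or None."""
    matches = _NUMBER.findall(response)
    if not matches:
        return None
    num = matches[-1].replace(",", "")  # "1,234" -> "1234"
    if "." in num:
        num = num.rstrip("0").rstrip(".")  # "18.00" -> "18"
    return num


def accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (response, gold) pairs whose extracted answer equals gold.

    Assumes gold answers are already normalized the same way (no commas,
    no trailing zeros). Responses with no number count as incorrect.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(extract_final_number(resp) == gold for resp, gold in pairs)
    return correct / len(pairs)
```

A different choice, such as taking the first number or only accepting a number after a fixed phrase, would score some responses differently, which is the kind of discrepancy the note warns about.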

Notability

notability 2.0/10

New repo, negligible traction