Repo: AI21 Labs, published Apr 27, 2022

AI21Labs/MRKL_synthetic_data

Description: All synthetic data used for the experiments described in the MRKL white paper: https://www.ai21.com/blog/jurassic-x-crossing-the-neuro-symbolic-chasm-with-the-mrkl-system

License: Apache-2.0

Stars: 5

Forks: 1

Open issues: 0

Created: 2022-04-27T06:51:01Z

Pushed: 2022-05-01T06:51:08Z

Default branch: main

Fork: no

Archived: yes

README:

MRKL_synthetic_data

This repository contains all of the synthetic data used for the experiments described in the MRKL white paper: https://www.ai21.com/blog/jurassic-x-crossing-the-neuro-symbolic-chasm-with-the-mrkl-system

The repository is divided into folders that contain the data necessary to replicate the experiments described in the white paper. The folders are numbered to match those experiments.

Each data file is in JSON Lines format (one JSON object per line), and each record contains the following columns:

  • prompt: the text to be evaluated by the model.
  • completion: the correct text prediction for the model.
  • execution: the correct result of evaluating the completion formula with a calculator.

Example: {"prompt":"How much is 27395 plus 87971 plus 25658?","execution":141024.0,"completion":"X=(27395+87971+25658)"}
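A minimal sketch of how a record in this format might be read and checked. It assumes that every completion has the form X=(<arithmetic expression>), as in the example above, and that the expression uses only +, -, *, / and parentheses; other files in the repository may not fit that assumption. The names parse_record and evaluate_completion are hypothetical helpers, not part of the repository.

```python
import ast
import json
import operator

# Binary operators allowed in a completion formula (an assumption based on the example).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval_node(node):
    """Recursively evaluate a restricted arithmetic AST node."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    raise ValueError("unsupported expression")


def evaluate_completion(completion):
    """Evaluate a completion such as 'X=(1+2)' and return the result as a float."""
    # Keep only the part after the first '=' (the formula itself).
    _, _, expression = completion.partition("=")
    tree = ast.parse(expression.strip(), mode="eval")
    return float(_eval_node(tree.body))


def parse_record(line):
    """Parse one JSON Lines row into a dict."""
    return json.loads(line)


line = '{"prompt":"How much is 27395 plus 87971 plus 25658?","execution":141024.0,"completion":"X=(27395+87971+25658)"}'
record = parse_record(line)
result = evaluate_completion(record["completion"])

# Compare the recomputed value with the stored "execution" column.
print(result == record["execution"])
```

Parsing the formula with ast instead of calling eval keeps the check restricted to plain arithmetic, which seems a reasonable precaution when reading data files from disk.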

Some of the files contain extra columns that may be used to further stratify the data. Those columns are not required to replicate any of the experiments described in the white paper, but they enable a more detailed analysis.
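As a rough sketch of how such extra columns could be used, the snippet below lists the columns beyond the three core ones and groups records by the value of one of them. The column name num_digits and the sample rows are invented for illustration only; the actual extra column names depend on the file and are not listed in this README.

```python
import json
from collections import defaultdict

CORE_COLUMNS = {"prompt", "completion", "execution"}


def extra_columns(records):
    """Return the sorted names of all columns beyond the three core ones."""
    names = set()
    for record in records:
        names.update(record.keys() - CORE_COLUMNS)
    return sorted(names)


def stratify(records, column):
    """Group records by the value of `column`; records lacking it go under None."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get(column)].append(record)
    return dict(groups)


# Hypothetical rows: "num_digits" is an invented column name, not taken from the repository.
sample_lines = [
    '{"prompt":"How much is 2 plus 3?","execution":5.0,"completion":"X=(2+3)","num_digits":1}',
    '{"prompt":"How much is 12 plus 30?","execution":42.0,"completion":"X=(12+30)","num_digits":2}',
    '{"prompt":"How much is 4 plus 4?","execution":8.0,"completion":"X=(4+4)","num_digits":1}',
]
records = [json.loads(line) for line in sample_lines]

print(extra_columns(records))
groups = stratify(records, "num_digits")
for value, rows in sorted(groups.items()):
    print(value, len(rows))
```

With real files, the same two helpers could be applied after reading each line of a data file with json.loads.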