{"schema_version":"onlylabs.public_analysis_evidence.v1","title":"Amazon (Nova) analysis evidence pack","description":"Public onlylabs evidence pack for cited agent analysis: captured pages, ranked public signals, and stored web-search provenance used by the background analysis workflow.","url":"https://onlylabs.fyi/analysis/amazon","json_url":"https://onlylabs.fyi/analysis/amazon/evidence.json","generated_at":"2026-06-11T18:16:21.466Z","org":{"slug":"amazon","name":"Amazon (Nova)","category":"frontier-lab","category_label":"Frontier lab","dossier_url":"https://onlylabs.fyi/labs/amazon"},"analysis":{"url":"https://onlylabs.fyi/analysis/amazon","json_url":"https://onlylabs.fyi/analysis/amazon/analysis.json","generated_at":"2026-06-10T08:03:29.623+00:00"},"workflow":{"version":"onlylabs-deepagents-analysis-v3","provider":"deepseek","model":"deepseek-v4-pro","agent":"deepagents","public_pack_mode":"local-pages-and-events","live_web_fetches":false,"note":"Public evidence exports do not trigger live Exa calls; stored Exa provenance is included when analysis metadata contains it."},"stats":{"pages":28,"events":140,"web":0,"evidence":88,"signal_desks":{"hiring":0,"forks":0,"releases":19,"talking":23,"repos":18},"data_radar_lanes":{"data":10,"evals":6,"infrastructure":8,"safety":6,"product":3},"data_radar_matches":23,"stored_analysis_evidence":92,"stored_analysis_web":4,"stored_analysis_signal_desks":{"forks":0,"repos":19,"hiring":0,"talking":22,"releases":19},"stored_analysis_data_radar_lanes":{"data":10,"evals":6,"safety":5,"product":2,"infrastructure":8},"stored_analysis_data_radar_matches":21},"stored_web_provenance":{"queries":["\"Amazon (Nova)\" frontier AI lab recent model release research hiring GitHub Hugging Face","\"Amazon (Nova)\" AI lab what they are building talking about hiring releasing 
forking"],"request_ids":["d8103f865fbe9fa7a233e64b81e678f8","7745f32a4c2a9e481d52c64c2be1d4f5"],"skipped":null},"evidence":[{"ref":"P1","kind":"page","title":"amazon-science/concurry repository metadata","date":"2026-06-11T03:58:54.344303+00:00","date_source":null,"source_url":"https://github.com/amazon-science/concurry","signal_url":null,"signal_json_url":null,"text":"# amazon-science/concurry\n\nDescription: Scaling made stupid: Accelerate your AI research and production workloads\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 18\n\nForks: 1\n\nOpen issues: 1\n\nCreated: 2025-04-01T05:31:25Z\n\nPushed: 2026-06-10T10:09:34Z\n\nDefault branch: mainline\n\nFork: no\n\nArchived: no\n\nREADME:\n# Concurry\n\n<p align=\"center\">\n<img src=\"docs/concurry-landscape.png\" alt=\"Concurry\" width=\"800\">\n</p>\n\n<p align=\"center\">\n<a href=\"https://amazon-science.github.io/concurry/\"><img src=\"https://img.shields.io/badge/docs-latest-blue.svg\" alt=\"Documentation\"></a>\n<a href=\"https://pypi.org/project/concurry/\"><img src=\"https://img.shields.io/pypi/v/concurry.svg\" alt=\"PyPI Version\"></a>\n<a href=\"https://pypi.org/project/concurry/\"><img src=\"https://img.shields.io/pypi/pyversions/concurry.svg\" alt=\"Python Versions\"></a>\n<a href=\"LICENSE\"><img src=\"https://img.shields.io/badge/license-Apache%202.0-blue.svg\" alt=\"License\"></a>\n<a href=\"https://github.com/amazon-science/concurry/actions\"><img src=\"https://img.shields.io/github/actions/workflow/status/amazon-science/concurry/tests.yml?branch=main\" alt=\"Build Status\"></a>\n</p>\n\n## **Parallelism made simple, both for humans and AI agents.**\n\nConcurry is a unified, delightful concurrency library for Python. It replaces the fragmented landscape of `threading`, `multiprocessing`, `asyncio`, and `Ray` with a single, elegant API. 
Write your code once, and run it on a single thread, multiple cores, or a distributed cluster—without changing a line of business logic.\n\n---\n\n## 🚀 Quickstart: 50x Speedup in 3 Lines of Code\n\nCalling LLMs sequentially is painfully slow. With Concurry, you can parallelize your existing code instantly.\n\n**Prerequisites:** `pip install concurry litellm`\n\n```python\nfrom pydantic import BaseModel\nimport litellm\n# Line 1. Import concurry\nfrom concurry import worker, gather\n\n# Line 2. Add the @worker decorator to an existing class\n@worker(mode=\"thread\", max_workers=50)\nclass LLM(BaseModel):\nmodel: str\n\ndef call(self, prompt: str) -> str:\n# This runs in a separate thread!\nreturn litellm.completion(\nmodel=self.model,\nmessages=[{\"role\": \"user\", \"content\": prompt}]\n).choices[0].message.content\n\n# Initialize y"},{"ref":"P2","kind":"page","title":"amazon-science/AI-Reinforced-Recommendations repository metadata","date":"2026-06-11T03:58:52.74626+00:00","date_source":null,"source_url":"https://github.com/amazon-science/AI-Reinforced-Recommendations","signal_url":null,"signal_json_url":null,"text":"# amazon-science/AI-Reinforced-Recommendations\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 5\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-04-09T23:19:23Z\n\nPushed: 2025-04-09T23:22:29Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n## My Project\n\nTODO: Fill this README out!\n\nBe sure to:\n\n* Change the title in this README\n* Edit your repository description on GitHub\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## License\n\nThis project is licensed under the Apache-2.0 License."},{"ref":"P3","kind":"page","title":"amazon-science/document-haystack repository metadata","date":"2026-06-11T03:58:51.887135+00:00","date_source":null,"source_url":"https://github.com/amazon-science/document-haystack","signal_url":null,"signal_json_url":null,"text":"# 
amazon-science/document-haystack\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 4\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2025-04-23T17:27:24Z\n\nPushed: 2025-07-30T21:59:33Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Document Haystack Benchmark\n\nThis repository contains the inference and evaluation scripts for the paper “[Document Haystack: A Long Context Multimodal Image/Document Understanding Vision LLM Benchmark](https://arxiv.org/abs/2507.15882)”.\n\n## 📑 Abstract Paper\n\nThe proliferation of multimodal Large Language Models has significantly advanced the ability to analyze and understand complex data inputs from different modalities. However, the processing of long documents remains under-explored, largely due to a lack of suitable benchmarks. To address this, we introduce Document Haystack, a comprehensive benchmark designed to evaluate the performance of Vision Language Models (VLMs) on long, visually complex documents. Document Haystack features documents ranging from 5 to 200 pages and strategically inserts pure text or multimodal text+image \"needles\" at various depths within the documents to challenge VLMs' retrieval capabilities. Comprising 400 document variants and a total of 8,250 questions, it is supported by an objective, automated evaluation framework. 
We detail the construction and characteristics of the Document Haystack dataset, present results from prominent VLMs and discuss potential research avenues in this area.\n\n## 🎯 Benchmark Overview\n\n### Dataset\n\nThe Document Haystack dataset can be found at [AmazonScience/document-haystack](https://huggingface.co/datasets/AmazonScience/document-haystack) on Hugging Face.\n\n### Key Features\n- **Document Formats**: Text, Image, PDF\n- **Document Range**: 5-200 pages\n- **Dataset Size**: 400 document variants\n- **Question Pool**: 8,250 evaluation questions\n- **Needle Types**:\n- Pure text\n- Multimodal (text + image)\n- **Automated Evaluation Framework**\n\n### Benchmark Structure\n- Strategic needle placement at various document depths\n- Three inference and evaluation settings:\n1. TextNeedlesFromParsedText\n2. TextNeedlesFromDocumentImages\n3. TextImagesNeedlesFromDocumentImages\n\n## 📁 Project Structure"},{"ref":"P4","kind":"page","title":"amazon-science/mxfp4-llm repository metadata","date":"2026-06-11T03:58:51.747108+00:00","date_source":null,"source_url":"https://github.com/amazon-science/mxfp4-llm","signal_url":null,"signal_json_url":null,"text":"# amazon-science/mxfp4-llm\n\nDescription: Official implementation for Training LLMs with MXFP4\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 127\n\nForks: 18\n\nOpen issues: 2\n\nCreated: 2025-04-23T21:11:42Z\n\nPushed: 2025-04-25T06:28:35Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# [Training LLMs with MXFP4](https://arxiv.org/abs/2502.20586)\n\n[![preprint](https://img.shields.io/static/v1?label=arXiv&message=2405.03637&color=B31B1B&logo=arXiv)](https://arxiv.org/abs/2502.20586)\n[![License: MIT](https://img.shields.io/badge/License-Apache--2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)\n\n## Introduction\n\nThis repo contains official implementation for Training LLMs with MXFP4. 
Our MXFP4 training recipe achieves near-lossless training by computing unbiased gradient estimates (with stochastic rounding and random Hadamard transformation) using MXFP4-accelerated GEMMs. This allows us compute the backward pass in MXFP4, which constitutes $>1/2$ of the FLOPs during training.\n\nWe support training with [`NVIDIA/Megatron-LM`](https://github.com/NVIDIA/Megatron-LM/tree/main) and [`NVIDIA/TransformerEngine`](https://github.com/NVIDIA/TransformerEngine/tree/main). Due to lack of MXFP4 hardware supports (Blackwell GPUs), we use [`microsoft/microxcaling`](https://github.com/microsoft/microxcaling) to perform emulation of MXFP4 GEMMS [(OCP MX specification)](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).\n\n## Requirements\nWe recommend using [NGC's PyTorch container](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) with released tag ``pytorch:24.04-py3``\n```bash\ndocker pull nvcr.io/nvidia/pytorch:24.04-py3\n```\nWe support MXFP4 backward passes with both BF16 and FP8 forward passes, leveraging TransformerEngine for the latter. Currently, we only supported FP8 + MXFP4 training with [TransformerEngine-Version('1.5.0+6a9edc3')](https://github.com/NVIDIA/TransformerEngine/tree/release_v1.5), which comes pre-installed in the ``pytorch:24.04-py3`` container.\n\n## Datasets\nWe used the `GPT2BPETokenizer` preprocessed Wikipedia dataset (around 3.28 billion tokens). 
Please follow [AWS-Neuron-Examples-Megatron-LM-GPT](https://a"},{"ref":"P5","kind":"page","title":"amazon-science/JavaMigration repository metadata","date":"2026-06-11T03:58:51.745371+00:00","date_source":null,"source_url":"https://github.com/amazon-science/JavaMigration","signal_url":null,"signal_json_url":null,"text":"# amazon-science/JavaMigration\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 8\n\nForks: 0\n\nOpen issues: 7\n\nCreated: 2025-05-01T22:20:35Z\n\nPushed: 2026-04-26T00:51:21Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# ☕ JavaMigration\n<table>\n<tr>\n<td style=\"padding: 0;\">\n<a href=\"https://huggingface.co/collections/AmazonScience/migrationbench-68125452fc21a4564b92b6c3\">\n<img src=\"https://img.shields.io/badge/-🤗 MigrationBench-4d5eff?style=flatten&labelColor\" alt=\"MigrationBench (Hugging Face)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://github.com/amazon-science/MigrationBench\">\n<img src=\"https://img.shields.io/badge/MigrationBench-000000?style=flatten&logo=github\" alt=\"MigrationBench (GitHub)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://github.com/amazon-science/JavaMigration\">\n<img src=\"https://img.shields.io/badge/JavaMigration-000000?style=flatten&logo=github&logoColor=white\" alt=\"JavaMigration (GitHub)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://arxiv.org/abs/2505.09569\">\n<img src=\"https://img.shields.io/badge/arXiv-2505.09569-b31b1b.svg?style=flatten\" alt=\"MigrationBench (arXiv)\">\n</a>\n</td>\n<td style=\"padding: 0; padding-left: 10px; vertical-align: middle;\">\n<a href=\"https://huggingface.co/datasets/AmazonScience/migration-bench-java-full\">\n<img src=\"https://img.shields.io/badge/-🤗 java--full-8a98ff?style=flat&labelColor\" alt=\"java-full\">\n</a>\n</td>\n<td style=\"padding: 0; vertical-align: middle;\">\n<a href=\"https://huggingface.co/datasets/AmazonScience/migration-bench-java-selected\">\n<img 
src=\"https://img.shields.io/badge/-🤗 java--selected-8a98ff?style=flat&labelColor\" alt=\"java-selected\">\n</a>\n</td>\n</tr>\n</table>\n\n🚀 Repository for automated Java code migration research, part of the [MigrationBench](https://huggingface.co/collections/AmazonScience/migrationbench-68125452fc21a4564b92b6c3) project.\n\n## 📦 Packages\n\nThis repository contains **two** sub-packages to conduct Java code migration with LLMs:\n\n### 1. 🤖 [JavaMigrationAgent](./java_migration_agent)\n\n[**JavaMigrationAgent**](./java_migration_agent) is an LLM-based agent to automate Java 8 to Java 17 (21) migration,\nbuilt on top of the [Strands Agents](https://st"},{"ref":"P6","kind":"page","title":"amazon-science/TurboFuzzLLM repository metadata","date":"2026-06-11T03:58:51.739754+00:00","date_source":null,"source_url":"https://github.com/amazon-science/TurboFuzzLLM","signal_url":null,"signal_json_url":null,"text":"# amazon-science/TurboFuzzLLM\n\nDescription: TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 24\n\nForks: 2\n\nOpen issues: 0\n\nCreated: 2025-04-28T17:26:38Z\n\nPushed: 2025-11-24T18:41:17Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# TurboFuzzLLM\n\n**Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking LLMs in Practice**\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![GitHub](https://img.shields.io/badge/GitHub-Repo-black?logo=github)](https://github.com/amazon-science/TurboFuzzLLM)\n\nA state-of-the-art tool for automatic red teaming of Large Language Models (LLMs) that generates effective adversarial prompt templates to identify vulnerabilities and improve AI safety.\n\n### ⚠️ Responsible Use\n\nThis tool is designed for improving AI 
safety through systematic vulnerability testing. It should be used responsibly for defensive purposes and developing better safeguards for LLMs.\n\nOur primary goal is to advance the development of more robust and safer AI systems by identifying and addressing their vulnerabilities. We believe this research will ultimately benefit the AI community by enabling the development of better safety measures and alignment techniques.\n\n## 📖 Table of Contents\n\n- [🚀 Getting Started](#-getting-started)\n- [🎯 Key Features](#-key-features)\n- [🔧 Method Overview](#-method-overview)\n- [🔄 Architecture and Data Flow](#-architecture-and-data-flow)\n- [📊 Results](#-results)\n- [🛡️ Applications](#️-applications)\n- [⚙️ Configuration](#️-configuration)\n- [🤖 Supported Models](#-supported-models)\n- [🧑‍💻 Development](#-development)\n- [📁 Codebase Structure](#-codebase-structure)\n- [📂 Understanding Output](#-understanding-output)\n- [🔧 Troubleshooting](#-troubleshooting)\n- [👥 Meet the Team](#-meet-the-team)\n- [Security](#security)\n- [License](#license)\n- [Citation](#citation)\n\n## 🚀 Getting Started\n\n### Prerequisites\n\n- Python 3.8+ and `pip`\n- Provider acce"},{"ref":"P7","kind":"page","title":"amazon-science/information-preservation-in-prompt-compression repository metadata","date":"2026-06-11T03:58:49.544016+00:00","date_source":null,"source_url":"https://github.com/amazon-science/information-preservation-in-prompt-compression","signal_url":null,"signal_json_url":null,"text":"# amazon-science/information-preservation-in-prompt-compression\n\nDescription: Understanding and Improving the Information Preservation in Prompt Compression for LLMs\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 2\n\nForks: 1\n\nOpen issues: 6\n\nCreated: 2025-05-16T17:39:58Z\n\nPushed: 2026-04-18T02:15:16Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n<!--\nCopyright Amazon.com, Inc. or its affiliates. 
All Rights Reserved.\nSPDX-License-Identifier: CC-BY-NC-4.0\n-->\n\n# Understanding and Improving the Information Preservation in Prompt Compression for LLMs\n\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n![Python version](https://img.shields.io/badge/python-3.9-blue)\n\nThis repository provides resources developed within the following article:\n\n> W. Łajewska, M. Hardalov, L. Aina, N. A. John, H. Su., L. Màrquez **Understanding and Improving the Information Preservation in Prompt Compression for LLMs.** In: Findings of the Association for Computational Linguistics: EMNLP 2025.\n\nThe preprint of this paper is available on [arXiv](https://arxiv.org/abs/2503.19114).\n\n## Summary\n\nRecent advancements in large language models (LLMs) have enabled their successful application to a broad range of tasks. However, in information-intensive tasks, the prompt length can grow fast, leading to increased computational requirements, performance degradation, and induced biases from irrelevant or redundant information. Recently, various prompt compression techniques have been introduced to optimize the trade-off between reducing input length and retaining performance. We propose a holistic evaluation framework that allows for in-depth analysis of prompt compression methods. We focus on three key aspects, besides compression ratio: (i) downstream task performance, (ii) grounding in the input context, and (iii) information preservation. Using our framework, we analyze state-of-the-art soft and hard compression methods and show that some fail to preserve key details from the original prompt, limiting performance on complex tasks. 
By identifying these limitations, we are able to improve one soft prompting method by controllin"},{"ref":"P8","kind":"page","title":"amazon-science/MigrationBench repository metadata","date":"2026-06-11T03:58:49.544016+00:00","date_source":null,"source_url":"https://github.com/amazon-science/MigrationBench","signal_url":null,"signal_json_url":null,"text":"# amazon-science/MigrationBench\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 14\n\nForks: 6\n\nOpen issues: 4\n\nCreated: 2025-05-14T22:11:19Z\n\nPushed: 2026-06-10T00:57:35Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# MigrationBench\n<table>\n<tr>\n<td style=\"padding: 0;\">\n<a href=\"https://huggingface.co/collections/AmazonScience/migrationbench-68125452fc21a4564b92b6c3\">\n<img src=\"https://img.shields.io/badge/-🤗 MigrationBench-4d5eff?style=flatten&labelColor\" alt=\"MigrationBench (Hugging Face)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://github.com/amazon-science/MigrationBench\">\n<img src=\"https://img.shields.io/badge/MigrationBench-000000?style=flatten&logo=github\" alt=\"MigrationBench (GitHub)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://github.com/amazon-science/JavaMigration\">\n<img src=\"https://img.shields.io/badge/JavaMigration-000000?style=flatten&logo=github&logoColor=white\" alt=\"JavaMigration (GitHub)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://arxiv.org/abs/2505.09569\">\n<img src=\"https://img.shields.io/badge/arXiv-2505.09569-b31b1b.svg?style=flatten\" alt=\"MigrationBench (arXiv)\">\n</a>\n</td>\n<td style=\"padding: 0;\">\n<a href=\"https://amazon-science.github.io/MigrationBench\">\n<img src=\"https://img.shields.io/badge/📖_Docs-Site-4d9eff?style=flatten\" alt=\"Documentation Site\">\n</a>\n</td>\n</tr>\n</table>\n\n<!--\nnpm install -g markdown-toc\nmarkdown-toc -i README.md\n-->\n\n<!-- toc -->\n\n- [1. 
📖 Overview](#1--overview)\n* [1.1 MigrationBench: Dataset and Evaluation Framework](#11-migrationbench-dataset-and-evaluation-framework)\n* [1.2 JavaMigration: Migration with LLMs](#12-javamigration-migration-with-llms)\n- [2. 🤗 MigrationBench Datasets](#2--migrationbench-datasets)\n- [3. Code Migration Evaluation](#3-code-migration-evaluation)\n* [3.1 Docker Mode (Recommended)](#31-docker-mode-recommended)\n+ [3.1.1 Setup Docker](#311-setup-docker)\n+ [3.1.2 Single Repository Evaluation](#312-single-repository-evaluation)\n+ [3.1.3 Batch Evaluation](#313-batch-evaluation)\n* [3.2 Local Mode](#32-local-mode)\n+ [3.2.1 Install Java and Maven](#321-install-java-and-maven)\n+ [3.2.2 Install MigrationBench](#322-install-migrationbench)\n+ [3.2.3 Single Rep"},{"ref":"P9","kind":"page","title":"amazon-science/acibench-hallucination-annotations repository metadata","date":"2026-06-11T03:58:49.312549+00:00","date_source":null,"source_url":"https://github.com/amazon-science/acibench-hallucination-annotations","signal_url":null,"signal_json_url":null,"text":"# amazon-science/acibench-hallucination-annotations\n\nDescription: Expert hallucination labels from the ACI-bench dataset\n\nLicense: CC-BY-4.0\n\nStars: 7\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-05-23T16:45:26Z\n\nPushed: 2025-05-23T16:51:49Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Natural Hallucination Dataset - ACI-bench Clinical Note Hallucination Annotations\n\nThis repository contains expert-annotated hallucination labels from the ACI-bench dataset for evaluating hallucination detection in medical text summarization.\n\n## Dataset Overview\n\nThe Natural Hallucination (NH) dataset contains expert annotations of hallucinations in clinical summaries, focused on SOAP notes from the ACI-bench collection of clinical conversations.\n\n### Annotation Categories & Counts\n\nExpert clinical scribes annotated statements into 4 categories with the following distribution:\n- No Error: 
12,365\n- Hallucination: 106\n- Inference: 87\n- Misunderstanding: 72\n\n### Error Severity Distribution\n\nThe errors were classified by severity:\n- Low Severity: 138\n- High Severity: 87 \n- Not Medically Relevant (NMR): 40\n\n### High Severity Categories\n\nThe following categories are marked as high severity errors:\n- Diagnosis\n- Exam Findings\n- Lab Testing and Imaging \n- Medical History\n- Symptoms\n- Treatment Plan\n\nAge & Sex errors are considered low severity.\n\n### Dataset Format\n\nThe released dataset contains:\n- Original ACI-bench conversation transcripts\n- Expert annotations of factual errors marked by category\n- Severity labels for each error\n- Aggregated error scores per subject\n\n## Usage\n\nThe annotations can be used to:\n- Evaluate hallucination detection methods\n- Analyze different types of factual errors in clinical summarization\n- Study high vs low severity errors in medical text generation\n\n## Citation\n\nIf you use this dataset, please cite:\n\nFact-Controlled Diagnosis of Hallucinations in Medical Text Summarization. \nBN, S., Shing, H.-C., Xu, L., Strong, M., Burnsky, J., Ofor, J., Mason, J. R., Chen, S., Srinivasan, S., Shivade, C., Moriarty, J., & Cohen, J. P. 
\nInterspeech 2025\n\n```\n@inproceedings{BN2024fact,\ntitle={Fact-Controlled Diagnosis of Hallucinations in Medical Text Summarizatio"},{"ref":"P10","kind":"page","title":"amazon-science/XRAG repository metadata","date":"2026-06-11T03:58:49.155354+00:00","date_source":null,"source_url":"https://github.com/amazon-science/XRAG","signal_url":null,"signal_json_url":null,"text":"# amazon-science/XRAG\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 5\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2025-05-23T18:40:28Z\n\nPushed: 2025-05-25T16:25:57Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# XRAG: Cross-lingual Retrieval-Augmented Generation\n\nThis repository contains the code and data for the paper **\"XRAG: Cross-lingual Retrieval-Augmented Generation\"** ([arxiv](https://arxiv.org/abs/2505.10089)). \n\n---\n## Data\nSee our [paper](https://arxiv.org/abs/2505.10089) for a description of the data, and `data/README.md` for details on the jsonlines data format. \n\n## Implementation\n\nSee `src/README.md` for details on how to run the code to reproduce dataset generation and experiments.\n\n## License\n\nThis repository is released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) Licence.\nAlthough this repository is released under the CC BY-NC 4.0 license, it requires the third party OpenAI API for question generation, subject to OpenAI's [terms of use](https://openai.com/policies/row-terms-of-use/)."},{"ref":"P11","kind":"page","title":"amazon-science/confetti repository metadata","date":"2026-06-11T03:58:48.970396+00:00","date_source":null,"source_url":"https://github.com/amazon-science/confetti","signal_url":null,"signal_json_url":null,"text":"# amazon-science/confetti\n\nLicense: CC-BY-4.0\n\nStars: 4\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2025-05-29T03:22:32Z\n\nPushed: 2025-05-29T03:42:46Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n## ConFETTI: Conversational Function-Calling Evaluation Through 
Turn-Level Interactions \n\nConFETTI is a Conversational Function-Calling\nbenchmark that works on the turn-level. \nThe benchmark is designed to evaluate function-calling \ncapabilities and response quality of large language models\n(LLMs). Current benchmarks lack comprehensive\nassessment of LLMs in complex conversational\nscenarios. CONFETTI addresses this gap through\n109 human-simulated conversations, comprising\n313 user turns and covering 86 APIs.\n\n### Conversation Complexities \n\nThese conversations explicitly target various conversational\ncomplexities, such as follow-ups, goal correction\nand switching, ambiguous and implicit goals. Below is a list of all included complexities \nand the number of dialogs covering those complexities. \n\n| COMPLEXITY | # DIALOGS | DESCRIPTION |\n|---------------------------|-----------|-------------|\n| EXCEPTION_IN_EXECUTION | 5 | Errors or exceptions that occur during the execution of an action |\n| FAILED_CONVERSATION | 5 | Interactions where the intended goal is not achieved |\n| CONFIRMATION | 6 | Requesting user approval before executing an action |\n| GOAL_SWITCHING | 6 | When the user changes their objective during the conversation |\n| NO_TARGET_COMPLEXITY | 6 | Conversations without specific complexity requirements |\n| GOAL_CORRECTION | 7 | Adjusting or refining the user's goal based on feedback |\n| GOAL_STACKING | 7 | Managing multiple user objectives simultaneously |\n| AMBIGUOUS_GOAL | 9 | When the user's intention is unclear and requires clarification |\n| FOLLOWUP_QUESTION | 10 | Additional queries or requests for information after the initial response |\n| IMPLICIT_DESCRIPTIVE_GOAL | 10 | The user describes a problem/background without directly stating their goal |\n| OVERFILL | 11 | Providing more information than required for an action |\n| UNDERFILL | 11 | Missing required arguments or information for an action |\n| GOAL_NOT_SUPPORTED | 15 | The user's request is not 
supp"},{"ref":"P12","kind":"page","title":"amazon-science/PersonaLens repository metadata","date":"2026-06-11T03:58:48.650749+00:00","date_source":null,"source_url":"https://github.com/amazon-science/PersonaLens","signal_url":null,"signal_json_url":null,"text":"# amazon-science/PersonaLens\n\nDescription: Code repository for PersonaLens paper (ACL 2025).\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 5\n\nForks: 2\n\nOpen issues: 1\n\nCreated: 2025-05-30T10:12:45Z\n\nPushed: 2025-06-06T10:19:36Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants\n<p align=\"center\" width=\"100%\">\n<img src=\"./res/personalens.png\" alt=\"PersonaLens\" style=\"width: 100%; min-width: 300px; display: block; margin: auto;\">\n</p>\n\nPersonaLens is a comprehensive benchmark designed to evaluate how well AI assistants can personalize their responses while completing tasks. Unlike existing benchmarks that focus on chit-chat, non-conversational tasks, or narrow domains, PersonaLens captures the complexities of personalized task-oriented assistance through rich user profiles, diverse tasks, and an innovative multi-agent evaluation framework.\n\n## Overview\nPersonaLens features:\n\n- Rich user profiles with diverse preferences and interaction histories\n- 100+ tasks spanning 20 domains\n- Two specialized LLM-based agents:\n- User agent that simulates realistic task-oriented dialogues\n- Judge agent that evaluates personalization quality, response quality, and task success\n\n## Project Structure\n\n```bash\nPersonaLens/\n│\n├── src/\n│ ├── generate_dialogue.py # Generate dialogues between user agent and AI assistant\n│ └── evaluate_dialogue.py # Evaluate dialogues using judge agent\n│\n├── data/\n│ ├── profile/ # User profiles with preferences and interaction history\n│ └── task/ # Task specifications across multiple domains\n│\n└── util/\n├── <utility_files> # Helper functions 
and utilities \n\n```\n\n## Installation\n\nEnsure you have Python 3.11+ installed. Install dependencies using:\n\n```bash\npip install -r requirements.txt\n```\nYou also need to use Amazon Bedrock in order to run the code. Please refer to the [Amazon Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) for setup instructions. And please make sure you have the necessary permissions to access the models used in this benchmark.\n\n## Usage\n### 1. Dialogue Generation\nUse the `generate_dialogue."},{"ref":"P13","kind":"page","title":"amazon-science/TISER repository metadata","date":"2026-06-11T03:58:48.597152+00:00","date_source":null,"source_url":"https://github.com/amazon-science/TISER","signal_url":null,"signal_json_url":null,"text":"# amazon-science/TISER\n\nDescription: [ACL 2025] Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models\n\nLicense: MIT-0\n\nStars: 13\n\nForks: 4\n\nOpen issues: 1\n\nCreated: 2025-06-03T12:34:49Z\n\nPushed: 2025-06-03T12:58:39Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n## TISER\n\n### Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models\n\nThis repository contains the data for the paper (ACL 2025 Main): [Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models](https://arxiv.org/pdf/2504.05258).\n\nTISER incorporates a multi-stage inference pipeline that combines explicit reasoning, timeline construction, and iterative self-reflection. The key idea behind our approach is to empower LLMs to adapt by scaling their internal reasoning process during inference. 
TISER enables models to systematically organize temporal information, verify their inferences, and refine their outputs.\n\n## Train Data Format\n\nEach entry in the [TISER train dataset](data/TISER_train.json) is a JSON object containing six fields: `dataset_name`, `question_id`, `question`, `answer`, `prompt`, and `output`. The `question` field specifies the temporal question being asked, while `answer` contains the expected short-form response (e.g., an entity or number). The `prompt` provides detailed instructions for a Chain of Thought (CoT) reasoning process with reflection, guiding the model to reason step-by-step, extract temporal events, reflect on its logic, and produce a final answer. The `output` field contains the full model-generated response adhering to this reasoning format. This structure supports supervised training of models to perform temporal reasoning and answer generation.\n\n## Test Data Format\n\nEach test example in the [TISER test dataset](data/TISER_test.json) is represented as a single JSON object containing five fields: `dataset_name`, which specifies the split or task; `question_id`, a unique identifier for the query; `question`, the temporal reasoning prompt itself; `prompt`, which embeds the full Chain-of-Thought template (including `<reas"},{"ref":"P14","kind":"page","title":"amazon-science/GaRAGe repository metadata","date":"2026-06-11T03:58:48.375948+00:00","date_source":null,"source_url":"https://github.com/amazon-science/GaRAGe","signal_url":null,"signal_json_url":null,"text":"# amazon-science/GaRAGe\n\nDescription: [ACL 2025] GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation.\n\nLicense: NOASSERTION\n\nStars: 13\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2025-06-04T10:10:27Z\n\nPushed: 2025-06-10T10:17:52Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n## GaRAGe\n\n### A Benchmark with Grounding Annotations for RAG Evaluation\n\nThis repository contains the data for the paper (ACL 2025 
Findings): [GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation](https://arxiv.org/abs/2506.07671).\n\nGaRAGe is a large RAG benchmark with human-curated long-form answers and annotations of each grounding passage, allowing a fine-grained evaluation of whether LLMs can identify relevant grounding when generating RAG answers. This benchmark contains 2366 questions of diverse complexity, dynamism, and topics, and includes over 35K annotated passages retrieved from both private document sets and the Web, to reflect real-world RAG use cases. This makes it an ideal test bed to evaluate an LLM's ability to identify only the relevant information necessary to compose a response, or provide a deflective response when there is insufficient information. \n\n## Data Format\n\nEach entry in the [GaRAGe dataset](data/GaRAGe_benchmark.jsonl) is a JSON object containing the following fields: `sample_id`, `question_date`, `grounding`, `question`, `question_valid`, `question_false_premise`, `question_seeking`, `question_sensitive`, `question_type`, `question_complexity`, `question_category`, `question_popularity`, `evidence_relevant`, `evidence_correct`, `answer_generate`, `answer_related_info`, `answer_validate`, `comments`, `evidence_cited`, `question_tag` and `topic_tag`. Each field is explained below:\n\n- sample_id: a unique identifier of the datapoint.\n- question_date: the timestamp of the question.\n- grounding: a list of passages. For each passage, we provide the text prefixed by a citation marker, the age of the passage, the date of the passage and the provider (either web or ent).\n- question: the question for the current datapoint.\n- question_valid: whether the question is valid or not. 
A valid question is understandable, complete, answerable, not ha"},{"ref":"P15","kind":"page","title":"amazon-science/Query-Conditioned-NLI repository metadata","date":"2026-06-11T03:58:48.302944+00:00","date_source":null,"source_url":"https://github.com/amazon-science/Query-Conditioned-NLI","signal_url":null,"signal_json_url":null,"text":"# amazon-science/Query-Conditioned-NLI\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 0\n\nForks: 0\n\nOpen issues: 2\n\nCreated: 2025-06-06T22:01:16Z\n\nPushed: 2026-04-08T08:55:51Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Query-conditioned Natural Language Inference\n\nThis repository contains the dataset and code for the paper \"Benchmarking Query-conditioned Natural Language Inference\" (Canby et al., 2025).\n\n<div align=\"center\">\n<img src=\"imgs/qcnli.png\" width=\"600px\">\n<br>\n<em><b>Natural language inference (NLI).</b> (a) Sentence-level NLI has a label &ell; indicating the semantic relationship between a premise sentence s<sub>p</sub> and hypothesis sentence s<sub>h</sub>. (b) Document-level NLI conditions &ell; on a premise document d<sub>p</sub> and a hypothesis document d<sub>h</sub>. (c) Query-conditioned NLI conditions label &ell;<sub>i</sub> on premise document d<sub>p</sub>, hypothesis document d<sub>h</sub>, and a query q<sub>i</sub>, which indicates the aspect of the documents the semantic relationship should be based on.</em>\n</div>\n\n## Table of Contents\n- [Environment Setup](#environment-setup)\n- [Dataset](#dataset)\n- [Usage](#usage)\n- [Running QC-NLI Task](#running-qc-nli-task)\n- [Converting Datasets to QC-NLI Format](#converting-datasets-to-qc-nli-format)\n- [Adding New Datasets](#adding-new-datasets)\n- [Citation](#citation)\n- [Security](#security)\n- [Contact](#contact)\n\n## Environment Setup\n\n### Prerequisites\n- Python 3.8+\n- Required API keys (OpenAI, Google AI)\n\n### Installation\n\n1. 
Clone this repository:\n```bash\ngit clone https://github.com/amazon-science/Query-Conditioned-NLI.git\ncd Query-Conditioned-NLI\n```\n\n2. Create and activate a virtual environment:\n```bash\npython -m venv venv\nsource venv/bin/activate # On Windows: venv\\Scripts\\activate\n```\n\n3. Install required packages:\n```bash\npip install -r requirements.txt\n```\n\n4. Set up API keys:\n```bash\nexport OPENAI_API_KEY=\"your-openai-key\"\nexport GOOGLE_API_KEY=\"your-google-key\"\n```\n\n## Dataset\n\nThe QC-NLI dataset is located in the `data/` folder and includes adaptations from four existing datasets:\n\n| Dataset | Task | Size | Label Set |\n|------------------------------|--------------------"},{"ref":"P16","kind":"page","title":"amazon-science/omnimatch repository metadata","date":"2026-06-11T03:58:48.111227+00:00","date_source":null,"source_url":"https://github.com/amazon-science/omnimatch","signal_url":null,"signal_json_url":null,"text":"# amazon-science/omnimatch\n\nDescription: OmniMatch: Joinability Discovery in Data Products\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 9\n\nForks: 2\n\nOpen issues: 5\n\nCreated: 2025-06-19T23:04:50Z\n\nPushed: 2026-06-10T18:46:19Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# ΟmniΜatch\n\nThis repo includes the code used for implementing OmniMatch, as described in \"OmniMatch: Joinability Discovery in Data Products\".\n\n## Repo structure\n\n* [`src`]() Contains Python source files for developing OmniMatch and baselines used in the paper:\n- `training_generator.py` contains the code for generating training dataset pairs for self-supervision.\n- `featurizer.py` contains the code for computing column pairwise similarity metrics\n- `omnimatch_predictors.py` contains code for training and testing OmniMatch models, as described in the paper.\n- `rf_predictor.py` contains code for training and testing the Random Forest model baseline.\n- other source files needed for execution.\n* [`config_files`]() 
Contains configuration files for each python script included in `src`.\n\n## Datasets and other files\n\nThe dataset can be downloaded from this location: https://zenodo.org/records/15705578\n\nDetails:\n\n* `data-products-matching/datasets` contains test and train datasets for both our join benchmarks.\n* `data-products-matching/assets/features` contains column-pairwise similarity metrics for each measure used in the paper (in .pickle format) for both our join benchmarks and their corresponding test and train datasets.\n* `data-products-matching/assets/samples` contains samples of training datasets that can be used for training, for both join benchmarks.\n* `data-products-matching/assets/matches` contains all join and non-join pairs of training and test datasets for both our join benchmarks (in .pickle format).\n\n## Running OmniMatch\n\n1. In the absence of training data, use `src/training_generator.py` to generate training dataset pairs based on the test data. Make sure after generating the data to compute the full lists of join/non_join pairs between the generated dataset pairs in the format of [((filename1.csv, column1), (filename2.csv, column2)), etc.] 
and store them into two separate p"},{"ref":"P17","kind":"page","title":"amazon-science/LARCQ repository metadata","date":"2026-06-11T03:58:47.781136+00:00","date_source":null,"source_url":"https://github.com/amazon-science/LARCQ","signal_url":null,"signal_json_url":null,"text":"# amazon-science/LARCQ\n\nDescription: Codes of LARCQ Paper (Interspeech 2025)\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 0\n\nForks: 0\n\nOpen issues: 10\n\nCreated: 2025-08-12T22:35:31Z\n\nPushed: 2026-02-05T18:59:38Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# 🚀 Official codes of our Interspeech paper *On Retrieval of Long Audios with Complex Text Queries*\n\n* Project website https://sites.google.com/view/larcq\n* Paper https://www.isca-archive.org/interspeech_2025/yang25n_interspeech.html\n```\n@inproceedings{yang25n_interspeech,\ntitle = {On Retrieval of Long Audios with Complex Text Queries},\nauthor = {Ruochu Yang and Milind Rao and Harshavardhan Sundar and Anirudh Raju and Aparna Khare and Srinath Tankasala and Di He and Venkatesh Ravichandran},\nyear = {2025},\nbooktitle = {Interspeech 2025},\npages = {2660--2664},\ndoi = {10.21437/Interspeech.2025-2085},\nissn = {2958-1796},\n}\n```\n\n# Prerequisite\n\n## 1. Configure environments\n\n```\nconda create -n larcq python=3.10\nconda activate larcq\npip install -r requirements.txt\npip install -e hf-dev-train/transformers-main\npip install -e peft-main\n```\n\n## 2. Download benchmarks\nSave the benchmarks in the `datasets` folder.\n\nDue to license restriction, we cannot open-source our Clotho_LARCQ and SoundDescs_LARCQ benchmarks. However, we provide the codes of generating the benchmarks. Actually, you can use our codes to generate any LARCQ-style benchmark you want.\n\n## 3. Download models\n* Download the `clap-htsat-fused` model from the Hugging Face [model link](https://huggingface.co/laion/clap-htsat-fused). 
Save the model in the `models` folder.\n\n* Download the `gpt2` model from the Hugging Face [model link](https://huggingface.co/openai-community/gpt2). Save the model in the `models` folder.\n\n* Download the `Llama-2-7b-chat-hf-qformer` folder from the Google Drive [website link](https://drive.google.com/drive/u/0/folders/1W8ZtlhXNZ2IdVcKWsQpLD4jVw98brYDM). Save the folder in the `models` folder.\n\n* Download the `stage5_epoch2` folder from the Google Drive [website link](https://drive.google.com/drive/u/0/folders/1W8ZtlhXNZ2IdVcKWsQpLD4jVw98brYDM). Unzip and save the folder in the `models` folder.\n\n* Download the `clapcap"},{"ref":"P18","kind":"page","title":"amazon-science/TN-Eval repository metadata","date":"2026-06-11T03:58:47.731888+00:00","date_source":null,"source_url":"https://github.com/amazon-science/TN-Eval","signal_url":null,"signal_json_url":null,"text":"# amazon-science/TN-Eval\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 1\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-06-20T19:33:11Z\n\nPushed: 2025-06-23T15:19:55Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# TN-Eval\n\nThis repository contains the code for our ACL 2025 paper: [TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy\nNotes](https://arxiv.org/abs/2503.20648).\n\n**Authors**: \n[Raj Sanjay Shah](https://raj-sanjay-shah.github.io/), \n[Lei Xu](leixx.io), \n[Qianchu Liu](https://qianchu.github.io/), \n[Jon Burnsky](https://jburnsky.github.io/linguist/), \nDrew Bertagnolli,\n[Chaitanya Shivade](https://cshivade.github.io/)\n\n## Introduction\nTN-Eval provides tools for generating behavioral therapy notes using large language models (LLMs) and evaluating them via automatic, rubric-based protocols.\n\n## Quick Start\n\n**Download Data**\n\nDownload [AnnoMI](https://github.com/uccollab/AnnoMI) data from https://github.com/uccollab/AnnoMI/raw/refs/heads/main/AnnoMI-full.csv and save it as `data/AnnoMI-full.csv`. 
\n\n**Generate Notes**\n\n```bash\npython3 src/generate_soap_note.py --input data/AnnoMI-full.csv --output data/llm_notes/\n```\n\n**Run Automatic Evaluations**\n```bash\npython3 src/run_metrics_reference_free.py \\\n--note data/llm_notes/outputs_annomi_llama31_70B_high.json \\\n--output data/llm_notes/outputs_annomi_llama31_70B_high_with_eval.json\n```\n\n## Human Notes and Evaluations\nYou can find all data artifacts in our companion repository: [TN-Eval-Data](https://github.com/amazon-science/TN-Eval-Data).\n\nThis includes:\n- Human-written therapy notes\n- Human evaluations of human notes and LLM-generated notes\n- Automatic evaluations using LLaMA and Mistral models\n\n## Citation\n\nIf you use our data, please cite\n\n```\n@inproceedings{shah2025tneval,\ntitle={TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes},\nauthor={Shah, Raj Sanjay and Xu, Lei and Liu, Qianchu and Burnsky, Jon and Bertagnolli, Drew and Shivade, Chaitanya},\nbooktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics: Industry Track},\nyear={2025}\n}\n```\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING"},{"ref":"P19","kind":"page","title":"amazon-science/TN-Eval-Data repository metadata","date":"2026-06-11T03:58:47.515296+00:00","date_source":null,"source_url":"https://github.com/amazon-science/TN-Eval-Data","signal_url":null,"signal_json_url":null,"text":"# amazon-science/TN-Eval-Data\n\nStars: 0\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-06-20T19:44:03Z\n\nPushed: 2025-06-23T15:20:09Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# TN-Eval-Data\n\nThis repository contains data for the [TN-Eval](https://github.com/amazon-science/TN-Eval) project.\n\nWe use 50 conversations from the [AnnoMI](https://github.com/uccollab/AnnoMI) dataset and collect corresponding notes and evaluations.\n\nFor each conversation, the following note versions are included:\n- Human-written notes\n- Llama 
3.1 70B generated notes\n- Mistral Large V2 generated notes\n\nEach note is evaluated by:\n- 2 human annotators\n- Llama 3.1 70B\n- Mistral Large V2\n\n## Data Format\nEach entry in the dataset is structured as follows:\n```\n{\n\"id\": \"0\", // Conversation ID\n\"mi_quality\": \"high\", // AnnoMI dataset split (only high-quality used)\n\"human\": { // Human-written note and evaluations\n\"note\": { ... }, // SOAP-style clinical note\n\"metrics_llama31_70B\": { ... }, // Automatic evaluation by Llama 3.1\n\"metrics_mistral_large_v2\": { ... }, // Automatic evaluation by Mistral\n\"align_score\": { ... }, // AlignScore metric\n\"metrics_human\": { ... } // Human evaluation (2 annotators)\n},\n\"llm_llama31_70B\": { ... }, // Llama 3.1 generated note and evaluations\n\"llm_mistral_large_v2\": { ... } // Mistral Large V2 generated note and evaluations\n}\n```\n\nEvaluations are conducted per note section. For both human and automatic evaluations, the structure is consistent. Below is an example for the subjective section:\n```\n\"subjective\": {\n\"rubric_completeness_raw\": { ... }, // Raw rubric scores (completeness)\n\"rubric_conciseness_raw\": { ... 
}, // Raw rubric scores (conciseness)\n\"likert_completeness\": 2, // Likert-scale score for completeness\n\"likert_conciseness\": 4, // Likert-scale score for conciseness\n\"likert_faithfulness\": 5, // Likert-scale score for faithfulness\n\"rubric_completeness\": 0.33, // Aggregated rubric completeness score\n\"rubric_conciseness\": 0.89 // Aggregated rubric conciseness score\n}\n```\n\n## Conversation Data\nPlease refer to https://github.com/uccollab/AnnoMI for the conversation transcripts.\n\n## Citation\nIf you use our data, please cite\n```\n@inproceedings{shah2025tneval,\n"},{"ref":"P20","kind":"page","title":"amazon-science/Spherical_Diffusion_Policy repository metadata","date":"2026-06-11T03:58:47.240744+00:00","date_source":null,"source_url":"https://github.com/amazon-science/Spherical_Diffusion_Policy","signal_url":null,"signal_json_url":null,"text":"# amazon-science/Spherical_Diffusion_Policy\n\nDescription: [ICML 2025] Official implementation of Spherical Diffusion Policy: A SE(3) Equivariant Visuomotor Policy with Spherical Fourier Representation\n\nLanguage: Python\n\nLicense: MIT\n\nStars: 43\n\nForks: 7\n\nOpen issues: 3\n\nCreated: 2025-07-01T15:22:16Z\n\nPushed: 2025-07-08T15:51:59Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Spherical Diffusion Policy\nBy [Xupeng Zhu](https://zxp-s-works.github.io/), [Fan Wang](https://faninedinburgh.wixsite.com/mysite-1/publications), [Robin Walters](https://www.robinwalters.com/), and [Jane Shi]()\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/se-3-equivariant-diffusion-policy-in/robot-manipulation-on-mimicgen)](https://paperswithcode.com/sota/robot-manipulation-on-mimicgen?p=se-3-equivariant-diffusion-policy-in)\n\nOfficial implementation for [**Spherical Diffusion Policy: A SE(3) Equivariant Visuomotor Policy with Spherical Fourier Representation**](https://arxiv.org/abs/2507.01723), to appear at **ICML 
2025**.\n\n[Arxiv](https://arxiv.org/abs/2507.01723) ｜ [5min summary video](https://recorder-v3.slideslive.com/?share=102381&s=e55f418c-393e-451f-a47e-c25b41d009e5) | [OpenReview](https://openreview.net/forum?id=U5nRMOs8Ed)\n\n![](image/SDP.png)\n\nSpherical Diffusion Policy (SDP) is a SE(3) equivariant and T(3) invariant visuomotor policy that leverages spherical Fourier representations to achieve strong 3D generalization in robotic manipulation tasks. SDP introduces three key components:\n1. **Spherical Fourier Representations** for encoding the robot's state and actions with continuous rotational equivariance.\n2. **Spherical FiLM Conditioning** to inject scene embeddings from the vision encoder into the denoising process in an equivariant manner.\n3. **Spherical Denoising Temporal Unet (SDTU)** that supports spatiotemporal equivariant denoising of trajectories.\n\nOur method generalizes well across diverse 3D scene configurations and is benchmarked on 20 simulation tasks using [MimicGen](https://github.com/NVlabs/mimicgen_environments) and 5 physical single arm or bi-manual robot tasks, consistently outperforming strong baselines like EquiDi"},{"ref":"P21","kind":"page","title":"amazon-science/ProxSparse repository metadata","date":"2026-06-11T03:58:47.14889+00:00","date_source":null,"source_url":"https://github.com/amazon-science/ProxSparse","signal_url":null,"signal_json_url":null,"text":"# amazon-science/ProxSparse\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 11\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2025-06-23T20:46:04Z\n\nPushed: 2025-06-24T21:19:52Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs\n\n<p align=\"center\">\n<img src=\"assets/semi-structured-sparsity-pattern.png\" alt=\"2-4 Structured Sparsity\" width=\"750\">\n<br>\n<em>Semi-structured sparsity (e.g., 2:4 sparsity) provides a middle ground between structured pruning (removing 
entire sub-structures like neurons or attention heads) and non-uniform unstructured sparsity. However, finding the optimal 2:4 mask is NP-hard and non-differentiable. <a href=\"https://developer.nvidia.com/blog/structured-sparsity-in-the-nvidia-ampere-architecture-and-applications-in-search-engines/\">[Image Source]</a> </em>\n</p>\n\nProxSparse is a learning-based framework for semi-structured (2:4) pruning of Large Language Models using only a few hundred calibration samples. The optimal mask selection is enabled by regularized optimization, which transforms the rigid, non-differentiable mask selection process into a smoother optimization procedure, allowing gradual mask exploration with flexibility. ProxSparse does not involve additional weight updates once the mask is determined.\n\n🔗 You can find our paper (ICML'25) [here](https://arxiv.org/abs/2502.00258).\n\n## 🛠 Setup Instructions\n\nThe required environment to run ProxSparse is stored in ``requirement.txt``. You can run the below command to install them.\n\n``` \nconda create --name proxsparse python==3.10\nconda activate proxsparse\npip install -r requirement.txt\n```\n\n## 💾 Model checkpoints produced by ProxSparse\n\nWe release our 2:4 pruned models induced by ProxSparse in the Huggingface repository [aladinggit/proxsparse_models](https://huggingface.co/aladinggit/proxsparse_models/tree/main). The repo contains 2:4 pruned checkpoints of Llama-2-7b, Llama-2-13b, Llama-3.1-8b, Mistral-v0.1-7b, Mistral-v0.3-7b, Openllama-v2-7b and Qwen-2.5-14b. 
\n\nThe downloading scripts will download those pruned models from the huggingface repository.\n\n```python proxsparse_pruned_model_download.py```\n\n``ev"},{"ref":"P22","kind":"page","title":"amazon-science/CiteEval repository metadata","date":"2026-06-11T03:58:46.849955+00:00","date_source":null,"source_url":"https://github.com/amazon-science/CiteEval","signal_url":null,"signal_json_url":null,"text":"# amazon-science/CiteEval\n\nDescription: Official repository for CiteEval: Principle-Driven Citation Evaluation for Source Attribution\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 7\n\nForks: 2\n\nOpen issues: 5\n\nCreated: 2025-07-13T21:51:49Z\n\nPushed: 2026-01-22T20:41:05Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# CiteEval: Principle-Driven Citation Evaluation for Source Attribution\nThis repository releases the code and data for **CiteEval**:\n\n> 📃 Title: **Principle-Driven Citation Evaluation for Source Attribution** <br>\n> 🔗 Link: https://arxiv.org/pdf/2506.01829 <br>\n> 🤔 Abstract: Citation quality is crucial in information-seeking systems, directly influencing trust and the effectiveness of information access. Current evaluation frameworks, both human and automatic, mainly rely on Natural Language Inference (NLI) to assess binary or ternary supportiveness from cited sources, which we argue is a suboptimal proxy for citation evaluation. In this work we introduce CiteEval, a citation evaluation framework driven by principles focusing on fine-grained citation assessment within a broad context, encompassing not only the cited sources but the full retrieval context, user query, and generated text. Guided by the proposed framework, we construct CiteBench, a multi-domain benchmark with high-quality human annotations on citation quality. To enable efficient evaluation, we further develop CiteEval-Auto, a suite of model-based metrics that exhibit strong correlation with human judgments. 
Experiments across diverse systems demonstrate CiteEval-Auto's superior ability to capture the multifaceted nature of citations compared to existing metrics, offering a principled and scalable approach to evaluate and improve model-generated citations.\n\n## Environment Setup\n1. Set up Python environment\nCreate a virtualenv or conda environment with Python≥3.10\n\n```bash\nconda create -n citeeval python=3.10\nconda activate citeeval\n```\n\n2. Install dependencies via:\n```bash\npip install -r requirments.txt\n```\n\n3. Set up environment variables\n```bash\n# Set OPENAI_API_KEY \nexport OPENAI_API_KEY='YOUR-OPENAI-API-KEY'\n\n# Set CITEEVAL_ROOT\nexport CITEEVAL_ROOT=\"PATH-TO-CITEEVAL\"\n\n# "},{"ref":"P23","kind":"page","title":"amazon-science/QualityFlow repository metadata","date":"2026-06-11T03:58:46.587832+00:00","date_source":null,"source_url":"https://github.com/amazon-science/QualityFlow","signal_url":null,"signal_json_url":null,"text":"# amazon-science/QualityFlow\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 2\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-08-28T17:53:51Z\n\nPushed: 2025-08-28T17:57:36Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# QualityFlow\n\n## Abstract\n\nWe introduce QualityFlow, a dynamic agentic workflow for program synthesis.\nGiven the English description of a programming problem and a set of unit tests, the model's goal is to synthesize the\ncorrect program that solves the problem and passes the tests.\nQualityFlow includes large language model (LLM) agents resembling a software development team, including code\ngeneration, testing, and self-debugging.\nWe propose the LLM Quality Checker, which explicitly ``imagines'' whether the synthesized programs' execution would\nconform to the unit tests.\nThe Quality Checks dynamically control the workflow, including actions to submit the final answer, clarify the problem\nstatement, and revert previous workflow steps.\nOur experiments show that the 
Quality Checker can precisely accept any correct program, mitigate faulty synthesized\ntests, and prevent potential workflow deviation.\nQualityFlow establishes the state-of-the-art results on four program synthesis benchmarks: MBPP, HumanEval, and stricter\nevaluations from MBPP-EvalPlus and HumanEval-EvalPlus.\n\n## Paper\n\nQualityFlow: An Agentic Workflow for Program Synthesis Controlled by LLM Quality Checks\nhttps://arxiv.org/pdf/2501.17167\n\n## Setup instruction\n\nInstall dependencies\n\n```bash\nconda create -n agentic python=3.12\nconda activate agentic\npip3 install openai boto3 tqdm pandas datasets sqlalchemy anthropic psutil transformers deepdiff seaborn pymysql jsonlines\npip3 install simple_parsing scikit-learn \npip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121\n```\n\nInstall mxeval: https://github.com/amazon-science/mxeval\nInstall codegeex: https://github.com/THUDM/CodeGeeX\n\nSet your Anthropic API KEY as environment variable\n\n```bash\nexport ANTHROPIC_API_KEY=[YOUR KEY HERE]\n```\n\nAdd CodeGeeX to your python path\n\n```bash\nexport PYTHONPATH=.:lib/CodeGeeX/\n```\n\n## How to run\n\nFor example, running QualityFlow on MBPP:\n\n```bash\n### without cacher (afte"},{"ref":"P24","kind":"page","title":"amazon-science/MixtureOfAdapters repository metadata","date":"2026-06-11T03:58:46.236597+00:00","date_source":null,"source_url":"https://github.com/amazon-science/MixtureOfAdapters","signal_url":null,"signal_json_url":null,"text":"# amazon-science/MixtureOfAdapters\n\nLanguage: Jupyter Notebook\n\nLicense: Apache-2.0\n\nStars: 2\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-07-18T22:03:56Z\n\nPushed: 2025-07-18T22:51:59Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Mixture of Adapters: Context-Aware Adaptation of Representations\n\nThis repo contains code for reproducing the experiments in our paper and using our proposed architecture. 
\n\n![Overview of Mixture of Adapters](moa.png)\n\n**Fig. 1:** The left diagram shows the proposed Mixture of Adapters (MoA) architecture for generating context-aware embeddings $\\phi(x, c)$. The input $x$ is encoded and processed by $K$ adapters in parallel. The context $c$ is independently encoded and used by a gating module to compute mixing weights, producing a weighted sum of adapter outputs as the final embedding. The right diagram shows an example: given the input \"blue t-shirt\" and the context \"color,\" the model produces a context-aware embedding that emphasizes \"blue,\" isolating the specified feature.\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## License\n\nThis project is licensed under the Apache-2.0 License."},{"ref":"P25","kind":"page","title":"amazon-science/h3-indexer repository metadata","date":"2026-06-11T03:58:46.190379+00:00","date_source":null,"source_url":"https://github.com/amazon-science/h3-indexer","signal_url":null,"signal_json_url":null,"text":"# amazon-science/h3-indexer\n\nDescription: The h3-indexer is an open source package for indexing geospatial data using PySpark, Apache Sedona and the H3 hierarchical spatial indexing system. The h3-indexer maps any number of vector-type geospatial data sets to H3 grids for efficient spatial analysis and querying.\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 17\n\nForks: 2\n\nOpen issues: 0\n\nCreated: 2025-07-29T21:22:56Z\n\nPushed: 2025-08-19T01:43:12Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# h3-indexer\n\nThe **h3-indexer** is an open source package for indexing geospatial data using PySpark, Apache Sedona and Uber's open source H3 hierarchical spatial indexing system. 
The h3-indexer maps any number of vector-type geospatial data sets to H3 grids for efficient spatial analysis and querying.\n\n![H3 Data Flow](data_flow.png)\n\nThe h3-indexer contains 3 stages, and users can [provide command line arguments](#usage) to run the stages one at a time, or all together. \n1. [Validator](#input-requirements)\n2. [Indexer](#methodology-indexer)\n3. [Resolver](#methodology-resolver)\n\n## Features\n- Supports vector point, line & polygon data types\n- Supports the following inputs:\n- Parquet files & shapefiles in AWS S3\n- AWS Glue catalog tables\n- Outputs are written to AWS S3 in parquet format\n- Configurable [H3 resolution](https://h3geo.org/docs/core-library/restable/) (3-10)\n- PySpark for H3 indexing operations using Apache Sedona\n- YAML & JSON-based configuration supported\n\n## Developer Setup\nDeveloper setup is currently supported only on Linux ARM64 machines (not on x86_64 or macOS).\n\n### Versions:\nThis tool requires the following versions:\n- Python: 3.10\n- AWS Glue: 4.0\n- Spark: 3.3.0\n- Apache Sedona: 1.7.1\n\n### Setup\n\nRun the following commands from inside the h3-indexer root directory to set up your run environment:\n\n```\nchmod +x scripts/env_setup.sh\nsource ./scripts/env_setup.sh\n```\n\nWhen this executable finishes running, it will print out the environment variable paths that you should set in your .env file - for example:\n```\nYour SPARK_HOME path is <path>\nYour GLUE_JARS path is <path>\nYour JAVA_HOME path is <path>\n```\n\n### Environment\n\nYou need to include the foll"},{"ref":"P26","kind":"page","title":"amazon-science/Cyber-Zero repository metadata","date":"2026-06-11T03:58:45.967602+00:00","date_source":null,"source_url":"https://github.com/amazon-science/Cyber-Zero","signal_url":null,"signal_json_url":null,"text":"# amazon-science/Cyber-Zero\n\nDescription: Cyber-Zero: Training Cybersecurity Agents Without Runtime\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 94\n\nForks: 17\n\nOpen 
issues: 38\n\nCreated: 2025-07-28T17:57:55Z\n\nPushed: 2026-02-13T14:29:44Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Cyber-Zero: Training Cybersecurity Agents without Runtime\n\n<p align=\"left\">\n<a href=\"https://arxiv.org/abs/2508.00910\"><img src=\"https://img.shields.io/badge/arXiv-2508.00910-b31b1b.svg?style=for-the-badge\">\n</p>\n\n<div align=\"center\">\n<h2>🎉 Check out our latest work!<br>\n<a href=\"https://arxiv.org/abs/2508.18370\" title=\"CTF-Dojo: Training Language Model Agents to Find Vulnerabilities with CTF-Dojo\"><img src=\"https://img.shields.io/badge/CTF--Dojo-First%20runtime%20for%20cybersecurity%20agents-orange?style=for-the-badge\"></a><br>\n<strong>🚀 First runtime environment for cybersecurity agents 🚀<br>\n</div>\n\n<p align=\"left\">\n🧐&nbsp;<a href=\"#overview\">Overview</a>\n| 🏆&nbsp;<a href=\"#benchmark-suite\">Benchmark Suite</a>\n| 🚀&nbsp;<a href=\"#quick-start\">Quick Start</a>\n| 🏗️&nbsp;<a href=\"#architecture\">Architecture</a>\n| ⚙️&nbsp;<a href=\"#configuration\">Configuration</a>\n| 📊&nbsp;<a href=\"#generation\">Generation</a>\n| 📝&nbsp;<a href=\"#validation\">Validation</a>\n| 📝&nbsp;<a href=\"#cli-interface\">CLI Interface</a>\n| 📝&nbsp;<a href=\"#citation\">Citation</a>\n</p>\n\nCyber-Zero is a comprehensive framework for training cybersecurity agents without requiring runtime execution environments.\n\n## Overview\n\n<p align=\"center\">\n<img src=\"asset/tease.svg\" alt=\"Cyber-Zero Teaser\" width=\"800\"/>\n</p>\n\nLarge Language Models (LLMs) have achieved remarkable success in software engineering tasks when trained with executable runtime environments, but such environments are often unavailable in cybersecurity domains where challenge configurations and execution contexts are ephemeral or restricted. 
Cyber-Zero addresses this fundamental limitation by leveraging publicly available CTF writeups and employing persona-driven LLM simulation to reverse-engineer runtime behaviors and generate realistic, long-horizon interaction sequences without actual execution environments.\n\nThe key innovation is generati"},{"ref":"P27","kind":"page","title":"amazon-science/plan-guided-summarization repository metadata","date":"2026-06-11T03:58:45.319522+00:00","date_source":null,"source_url":"https://github.com/amazon-science/plan-guided-summarization","signal_url":null,"signal_json_url":null,"text":"# amazon-science/plan-guided-summarization\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 0\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-08-20T20:38:15Z\n\nPushed: 2025-08-20T22:30:54Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models\n\n[![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b)](https://arxiv.org/abs/2504.09071)\n[![Python](https://img.shields.io/badge/Python-3.10-blue)](https://www.python.org/downloads/)\n[![PyTorch](https://img.shields.io/badge/PyTorch-2.3.1-red)](https://pytorch.org/)\n[![Transformers](https://img.shields.io/badge/Transformers-4.41.2-yellow)](https://huggingface.co/transformers/)\n[![License: CC BY-NC 4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc/4.0/)\n\n## 🚀 Installation\n\n### Instructions for plan_guided_summ python environment:\n\n```\nconda create -n plan_guided_summ python=3.10\n```\n\nInstall following packages:\n\n```\npip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121\npip install transformers==4.41.2\npip install flash_attn==2.5.8\npip install accelerate\npip install datasets\npip install deepspeed\npip install peft\npip install evaluate\npip install rouge-score\npip install 
absl-py\npip install spacy==3.7.4\n```\n\n## 🔧 Usage\n\nSee shell scripts in the scripts directory for commands to run different experiments.\n\n## 🤝 Citation\n\nPlease consider citing the paper if you use these sources.\n\n```\n@misc{grenander2025explorationplanguidedsummarizationnarrative,\ntitle={Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models}, \nauthor={Matt Grenander and Siddharth Varia and Paula Czarnowska and Yogarshi Vyas and Kishaloy Halder and Bonan Min},\nyear={2025},\neprint={2504.09071},\narchivePrefix={arXiv},\nprimaryClass={cs.CL},\nurl={https://arxiv.org/abs/2504.09071}, \n}\n```"},{"ref":"P28","kind":"page","title":"amazon-science/weak-supervision-for-few-shot-absa repository metadata","date":"2026-06-11T03:58:45.134034+00:00","date_source":null,"source_url":"https://github.com/amazon-science/weak-supervision-for-few-shot-absa","signal_url":null,"signal_json_url":null,"text":"# amazon-science/weak-supervision-for-few-shot-absa\n\nLanguage: Python\n\nLicense: NOASSERTION\n\nStars: 1\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2025-08-26T19:32:13Z\n\nPushed: 2025-08-28T06:01:06Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n## Code Repository\n\nThis is the code repository for the papers:\n\n- [A Weak Supervision Approach for Few-Shot Aspect Based Sentiment Analysis](https://aclanthology.org/2024.eacl-long.167/) (Vacareanu et al., EACL 2024)\n\n- [Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis](https://aclanthology.org/2023.wassa-1.3/) (Varia et al., WASSA 2023)\n\n## Subdirectories\n\nThe repository has the following directories:\n\n1. data: contains the necessary python scripts to create the few-shot data. The source code in the data directory is same as the source code in this repository: https://github.com/amazon-science/instruction-tuning-for-absa\n2. 
baselines: contains the necessary python scripts to create the weak supervision data as described in the first paper above\n3. instruction_tuning: contains the necessary python scripts to instruction tune the T5 model on the weak supervision data\n4. utils: contains other utility python scripts\n\nFor the data, check the README inside the data directory.\n\nIf you find the sources useful, please consider citing our work:\n\n## Citations\n\nBibTeX for the first paper:\n\n```\n@inproceedings{vacareanu-etal-2024-weak,\ntitle = \"A Weak Supervision Approach for Few-Shot Aspect Based Sentiment Analysis\",\nauthor = \"Vacareanu, Robert and\nVaria, Siddharth and\nHalder, Kishaloy and\nWang, Shuai and\nPaolini, Giovanni and\nAnna John, Neha and\nBallesteros, Miguel and\nMuresan, Smaranda\",\neditor = \"Graham, Yvette and\nPurver, Matthew\",\nbooktitle = \"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)\",\nmonth = mar,\nyear = \"2024\",\naddress = \"St. 
Julian{'}s, Malta\",\npublisher = \"Association for Computational Linguistics\",\nurl = \"https://aclanthology.org/2024.eacl-long.167/\",\ndoi = \"10.18653/v1/2024.eacl-long.167\",\npages = \"2734--2752\",\nabstract = \"We explore how weak supervision on abundant unlabeled data can be leveraged to improve few-shot p"},{"ref":"E1","kind":"event","title":"amazon/chronos-2","date":"2025-10-30T14:54:39+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/chronos-2","signal_url":"https://onlylabs.fyi/signals/67c71940-2d8b-42d5-b23c-282513cd1e7f","signal_json_url":"https://onlylabs.fyi/signals/67c71940-2d8b-42d5-b23c-282513cd1e7f/signal.json","text":"model_released · amazon/chronos-2 · signal_desk=releases · occurred_at=2025-10-30T14:54:39+00:00 · url=https://huggingface.co/amazon/chronos-2 · hf_downloads=12421045 · hf_likes=318 · hf_params=119477664 · pipeline=time-series-forecasting · license=apache-2.0"},{"ref":"E2","kind":"event","title":"How flat is replacing fat in AWS data center networks","date":"2026-05-28T10:30:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/how-flat-is-replacing-fat-in-aws-data-center-networks","signal_url":"https://onlylabs.fyi/signals/870cabf1-8131-4b1a-ad15-4b6173ce2bdc","signal_json_url":"https://onlylabs.fyi/signals/870cabf1-8131-4b1a-ad15-4b6173ce2bdc/signal.json","text":"post_published · How flat is replacing fat in AWS data center networks · signal_desk=talking · occurred_at=2026-05-28T10:30:00+00:00 · url=https://www.amazon.science/blog/how-flat-is-replacing-fat-in-aws-data-center-networks · hn=4 points/2 comments · data_radar_lanes=Data demand · data_radar_terms=data · data_radar_reason=Amazon (Nova) has a writing signal matching data demand. 
· raw={\"excerpt\":\"“Quasi-random” network topologies and new passive optical components called ShuffleBoxes make more-efficient flat networks as practical as traditional “fat-tree” networks.\"}"},{"ref":"E3","kind":"event","title":"Graviton5’s improved design increases speed and energy efficiency — beyond Moore’s law","date":"2026-06-10T15:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/graviton5s-improved-design-increases-speed-and-energy-efficiency-beyond-moores-law","signal_url":"https://onlylabs.fyi/signals/54aa0928-e2d8-42a7-bd35-35ffd86960ff","signal_json_url":"https://onlylabs.fyi/signals/54aa0928-e2d8-42a7-bd35-35ffd86960ff/signal.json","text":"post_published · Graviton5’s improved design increases speed and energy efficiency — beyond Moore’s law · signal_desk=talking · occurred_at=2026-06-10T15:00:00+00:00 · url=https://www.amazon.science/blog/graviton5s-improved-design-increases-speed-and-energy-efficiency-beyond-moores-law · data_radar_lanes=Product and customer · data_radar_terms=support · data_radar_reason=Amazon (Nova) has a writing signal matching product and customer. 
· raw={\"excerpt\":\"A new chiplet architecture, custom die-to-die connectivity, and support for DDR5-8800 memory and the latest PCIe gen6 interconnects improve performance by 25% for general-purpose and agentic AI workloads.\"}"},{"ref":"E4","kind":"event","title":"EC2’s formally verified “isolation engine” provides mathematical assurance of virtual-machine isolation","date":"2026-06-10T15:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/ec2s-formally-verified-isolation-engine-provides-mathematical-assurance-of-virtual-machine-isolation","signal_url":"https://onlylabs.fyi/signals/50af815d-c9e4-4f62-8f18-fb20ed2c559c","signal_json_url":"https://onlylabs.fyi/signals/50af815d-c9e4-4f62-8f18-fb20ed2c559c/signal.json","text":"post_published · EC2’s formally verified “isolation engine” provides mathematical assurance of virtual-machine isolation · signal_desk=talking · occurred_at=2026-06-10T15:00:00+00:00 · url=https://www.amazon.science/blog/ec2s-formally-verified-isolation-engine-provides-mathematical-assurance-of-virtual-machine-isolation · data_radar_lanes=Safety and policy · data_radar_terms=security · data_radar_reason=Amazon (Nova) has a writing signal matching safety and policy. 
· raw={\"excerpt\":\"Splitting the “separation kernel” off from the rest of the Nitro security system and using only a subset of the Rust programming language to code it enabled its formal verification.\"}"},{"ref":"E5","kind":"event","title":"Real-world grounding in agentic AI","date":"2026-06-08T19:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/real-world-grounding-in-agentic-ai","signal_url":"https://onlylabs.fyi/signals/ca203ab5-0b26-4cf4-842a-b6974d959a3c","signal_json_url":"https://onlylabs.fyi/signals/ca203ab5-0b26-4cf4-842a-b6974d959a3c/signal.json","text":"post_published · Real-world grounding in agentic AI · signal_desk=talking · occurred_at=2026-06-08T19:00:00+00:00 · url=https://www.amazon.science/blog/real-world-grounding-in-agentic-ai · data_radar_lanes=Safety and policy · data_radar_terms=trust · data_radar_reason=Amazon (Nova) has a writing signal matching safety and policy. · raw={\"excerpt\":\"Four approaches can dramatically improve the performance and trustworthiness of AI agents in operational environments.\"}"},{"ref":"E6","kind":"event","title":"Bridging intent and execution in agentic systems","date":"2026-06-08T17:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/bridging-intent-and-execution-in-agentic-systems","signal_url":"https://onlylabs.fyi/signals/7aa6e2f0-9f45-4d83-ae57-da5d76968332","signal_json_url":"https://onlylabs.fyi/signals/7aa6e2f0-9f45-4d83-ae57-da5d76968332/signal.json","text":"post_published · Bridging intent and execution in agentic systems · signal_desk=talking · occurred_at=2026-06-08T17:00:00+00:00 · url=https://www.amazon.science/blog/bridging-intent-and-execution-in-agentic-systems · data_radar_lanes=Infrastructure · data_radar_terms=systems · data_radar_reason=Amazon (Nova) has a writing signal matching infrastructure. 
· raw={\"excerpt\":\"The harnesses that mediate between models and tools in agentic systems are becoming their own performance bottleneck, but a few simple design principles can fix what ails them.\"}"},{"ref":"E7","kind":"event","title":"amazon-science/reskill","date":"2026-06-04T02:13:35+00:00","date_source":"source","source_url":"https://github.com/amazon-science/reskill","signal_url":"https://onlylabs.fyi/signals/087c32a2-6ad0-4981-9315-11fdd32a0153","signal_json_url":"https://onlylabs.fyi/signals/087c32a2-6ad0-4981-9315-11fdd32a0153/signal.json","text":"repo_new · amazon-science/reskill · signal_desk=repos · occurred_at=2026-06-04T02:13:35+00:00 · url=https://github.com/amazon-science/reskill · stars=6 · data_radar_lanes=Infrastructure · data_radar_terms=training · data_radar_reason=Amazon (Nova) has a repo signal matching infrastructure. · raw={\"repo\":\"amazon-science/reskill\",\"description\":\"An easy-to-configure and extensible veRL extension for agent RL training with skill co-evolution.\",\"language\":\"Python\"}"},{"ref":"E8","kind":"event","title":"Ground truth is a process, not a dataset","date":"2026-06-03T15:56:57+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/ground-truth-is-a-process-not-a-dataset","signal_url":"https://onlylabs.fyi/signals/1745a0a9-a045-456c-a1f7-f4123168fe17","signal_json_url":"https://onlylabs.fyi/signals/1745a0a9-a045-456c-a1f7-f4123168fe17/signal.json","text":"post_published · Ground truth is a process, not a dataset · signal_desk=talking · occurred_at=2026-06-03T15:56:57+00:00 · url=https://www.amazon.science/blog/ground-truth-is-a-process-not-a-dataset · data_radar_lanes=Data demand, Evals and quality · data_radar_terms=data, dataset, benchmark · data_radar_reason=Amazon (Nova) has a writing signal matching data demand, evals and quality. 
· raw={\"excerpt\":\"Automatically fact-checking long, AI-generated research reports poses new challenges — including benchmarking.\"}"},{"ref":"E9","kind":"event","title":"amazon-science/dualkv-flash-attn-for-rl","date":"2026-05-27T17:38:58+00:00","date_source":"source","source_url":"https://github.com/amazon-science/dualkv-flash-attn-for-rl","signal_url":"https://onlylabs.fyi/signals/e5701aed-6cd3-48dd-bfa6-ef839031e2e8","signal_json_url":"https://onlylabs.fyi/signals/e5701aed-6cd3-48dd-bfa6-ef839031e2e8/signal.json","text":"repo_new · amazon-science/dualkv-flash-attn-for-rl · signal_desk=repos · occurred_at=2026-05-27T17:38:58+00:00 · url=https://github.com/amazon-science/dualkv-flash-attn-for-rl · stars=2 · data_radar_lanes=Infrastructure · data_radar_terms=training · data_radar_reason=Amazon (Nova) has a repo signal matching infrastructure. · raw={\"repo\":\"amazon-science/dualkv-flash-attn-for-rl\",\"description\":\"Implementation of DualKV: Shared-Prompt Flash Attention for Efficient RL Training with Large Rollouts and Long Contexts\",\"language\":\"Python\"}"},{"ref":"E10","kind":"event","title":"Amazon Research Awards recipients announced","date":"2026-05-27T17:21:51+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/research-awards/latest-news/fall-2025-amazon-research-awards-recipients-announced","signal_url":"https://onlylabs.fyi/signals/175148e8-db47-458e-a21b-166f4a96c8e8","signal_json_url":"https://onlylabs.fyi/signals/175148e8-db47-458e-a21b-166f4a96c8e8/signal.json","text":"post_published · Amazon Research Awards recipients announced · signal_desk=talking · occurred_at=2026-05-27T17:21:51+00:00 · url=https://www.amazon.science/research-awards/latest-news/fall-2025-amazon-research-awards-recipients-announced · data_radar_lanes=Data demand · data_radar_terms=data, dataset, datasets · data_radar_reason=Amazon (Nova) has a writing signal matching data demand. 
· raw={\"excerpt\":\"Awardees represent more than 49 universities in 11 countries. Recipients have access to Amazon public datasets, along with AWS AI/ML services and tools.\"}"},{"ref":"E11","kind":"event","title":"Diverse reasoning traces teach LLMs to make better decisions","date":"2026-05-26T15:17:06+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/diverse-reasoning-traces-teach-llms-to-make-better-decisions","signal_url":"https://onlylabs.fyi/signals/885b930c-18c1-480d-83f9-d3ca51e7f697","signal_json_url":"https://onlylabs.fyi/signals/885b930c-18c1-480d-83f9-d3ca51e7f697/signal.json","text":"post_published · Diverse reasoning traces teach LLMs to make better decisions · signal_desk=talking · occurred_at=2026-05-26T15:17:06+00:00 · url=https://www.amazon.science/blog/diverse-reasoning-traces-teach-llms-to-make-better-decisions · raw={\"excerpt\":\"How to train language models to generate diverse, accurate reasoning paths using tokens that control distinct reasoning strategies.\"}"},{"ref":"E12","kind":"event","title":"amazon-science/concurry v0.13.2","date":"2026-05-21T12:25:34+00:00","date_source":"source","source_url":"https://github.com/amazon-science/concurry/releases/tag/v0.13.2","signal_url":"https://onlylabs.fyi/signals/48b07581-1baa-451f-a141-2e3d7a984fcf","signal_json_url":"https://onlylabs.fyi/signals/48b07581-1baa-451f-a141-2e3d7a984fcf/signal.json","text":"release · amazon-science/concurry v0.13.2 · signal_desk=releases · occurred_at=2026-05-21T12:25:34+00:00 · url=https://github.com/amazon-science/concurry/releases/tag/v0.13.2 · raw={\"repo\":\"amazon-science/concurry\"}"},{"ref":"E13","kind":"event","title":"amazon-science/concurry 
v0.13.1","date":"2026-05-21T11:48:18+00:00","date_source":"source","source_url":"https://github.com/amazon-science/concurry/releases/tag/v0.13.1","signal_url":"https://onlylabs.fyi/signals/8d82442b-99a9-48ca-a544-ce326351d487","signal_json_url":"https://onlylabs.fyi/signals/8d82442b-99a9-48ca-a544-ce326351d487/signal.json","text":"release · amazon-science/concurry v0.13.1 · signal_desk=releases · occurred_at=2026-05-21T11:48:18+00:00 · url=https://github.com/amazon-science/concurry/releases/tag/v0.13.1 · raw={\"repo\":\"amazon-science/concurry\"}"},{"ref":"E14","kind":"event","title":"amazon-science/EvoMAS","date":"2026-05-19T19:23:29+00:00","date_source":"source","source_url":"https://github.com/amazon-science/EvoMAS","signal_url":"https://onlylabs.fyi/signals/8af28f0c-7331-4b08-b517-e18b3555e503","signal_json_url":"https://onlylabs.fyi/signals/8af28f0c-7331-4b08-b517-e18b3555e503/signal.json","text":"repo_new · amazon-science/EvoMAS · signal_desk=repos · occurred_at=2026-05-19T19:23:29+00:00 · url=https://github.com/amazon-science/EvoMAS · stars=3 · data_radar_lanes=Data demand, Infrastructure · data_radar_terms=rag, systems · data_radar_reason=Amazon (Nova) has a repo signal matching data demand, infrastructure. 
· raw={\"repo\":\"amazon-science/EvoMAS\",\"description\":\"Evolutionary Generation of Multi-Agent Systems; Yuntong Hu, Yuting Zhang, Matthew Trager, Yi Zhang, Shuo Yang, Wei Xia, Stefano Soatto, ICML 2026\",\"language\":\"Python\"}"},{"ref":"E15","kind":"event","title":"amazon/gpt-oss-20b-p-eagle-long-context","date":"2026-05-14T23:11:10+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/gpt-oss-20b-p-eagle-long-context","signal_url":"https://onlylabs.fyi/signals/65dd73c3-f938-468a-9777-d0d3fbbcbc23","signal_json_url":"https://onlylabs.fyi/signals/65dd73c3-f938-468a-9777-d0d3fbbcbc23/signal.json","text":"model_released · amazon/gpt-oss-20b-p-eagle-long-context · signal_desk=releases · occurred_at=2026-05-14T23:11:10+00:00 · url=https://huggingface.co/amazon/gpt-oss-20b-p-eagle-long-context · hf_downloads=54 · hf_likes=2 · hf_params=1799955776 · license=apache-2.0"},{"ref":"E16","kind":"event","title":"amazon/Qwen3-Coder-30B-A3B-Instruct-P-EAGLE-long-context","date":"2026-05-14T23:50:24+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/Qwen3-Coder-30B-A3B-Instruct-P-EAGLE-long-context","signal_url":"https://onlylabs.fyi/signals/774e55d2-87c5-4b3e-91d9-79271c21322b","signal_json_url":"https://onlylabs.fyi/signals/774e55d2-87c5-4b3e-91d9-79271c21322b/signal.json","text":"model_released · amazon/Qwen3-Coder-30B-A3B-Instruct-P-EAGLE-long-context · signal_desk=releases · occurred_at=2026-05-14T23:50:24+00:00 · url=https://huggingface.co/amazon/Qwen3-Coder-30B-A3B-Instruct-P-EAGLE-long-context · hf_downloads=67 · hf_likes=1 · hf_params=972884736 · 
license=apache-2.0"},{"ref":"E17","kind":"event","title":"amazon/gpt-oss-120b-p-eagle-long-context","date":"2026-05-14T17:52:34+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/gpt-oss-120b-p-eagle-long-context","signal_url":"https://onlylabs.fyi/signals/c255adba-13d0-4522-a0af-63d9665b4c93","signal_json_url":"https://onlylabs.fyi/signals/c255adba-13d0-4522-a0af-63d9665b4c93/signal.json","text":"model_released · amazon/gpt-oss-120b-p-eagle-long-context · signal_desk=releases · occurred_at=2026-05-14T17:52:34+00:00 · url=https://huggingface.co/amazon/gpt-oss-120b-p-eagle-long-context · hf_downloads=140 · hf_likes=1 · hf_params=1702264640 · license=apache-2.0"},{"ref":"E18","kind":"event","title":"Making LLMs faster without sacrificing accuracy","date":"2026-05-15T13:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/making-llms-faster-without-sacrificing-accuracy","signal_url":"https://onlylabs.fyi/signals/6fb87847-fe93-4f46-88d7-d43f2b49a0de","signal_json_url":"https://onlylabs.fyi/signals/6fb87847-fe93-4f46-88d7-d43f2b49a0de/signal.json","text":"post_published · Making LLMs faster without sacrificing accuracy · signal_desk=talking · occurred_at=2026-05-15T13:00:00+00:00 · url=https://www.amazon.science/blog/making-llms-faster-without-sacrificing-accuracy · data_radar_lanes=Infrastructure · data_radar_terms=scaling · data_radar_reason=Amazon (Nova) has a writing signal matching infrastructure. 
· raw={\"excerpt\":\"A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.\"}"},{"ref":"E19","kind":"event","title":"amazon-science/adaptive-layerwise-perturbation","date":"2026-05-14T17:44:17+00:00","date_source":"source","source_url":"https://github.com/amazon-science/adaptive-layerwise-perturbation","signal_url":"https://onlylabs.fyi/signals/e3ff8718-7daa-4ebd-a3e6-3d825c538b74","signal_json_url":"https://onlylabs.fyi/signals/e3ff8718-7daa-4ebd-a3e6-3d825c538b74/signal.json","text":"repo_new · amazon-science/adaptive-layerwise-perturbation · signal_desk=repos · occurred_at=2026-05-14T17:44:17+00:00 · url=https://github.com/amazon-science/adaptive-layerwise-perturbation · stars=1 · raw={\"repo\":\"amazon-science/adaptive-layerwise-perturbation\",\"language\":\"Python\"}"},{"ref":"E20","kind":"event","title":"Promptimus: Improving already good LLM prompts with zero manual engineering","date":"2026-05-14T13:47:45+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/promptimus-improving-already-good-llm-prompts-with-zero-manual-engineering","signal_url":"https://onlylabs.fyi/signals/dcbc2a9d-c6ae-423a-b562-ba1252b7843e","signal_json_url":"https://onlylabs.fyi/signals/dcbc2a9d-c6ae-423a-b562-ba1252b7843e/signal.json","text":"post_published · Promptimus: Improving already good LLM prompts with zero manual engineering · signal_desk=talking · occurred_at=2026-05-14T13:47:45+00:00 · url=https://www.amazon.science/blog/promptimus-improving-already-good-llm-prompts-with-zero-manual-engineering · data_radar_lanes=Product and customer · data_radar_terms=solutions · data_radar_reason=Amazon (Nova) has a writing signal matching product and customer. 
· raw={\"excerpt\":\"By focusing on specific failure points and suggesting targeted solutions, a new automated prompt-engineering framework improves prompt performance without compromising existing functionality.\"}"},{"ref":"E21","kind":"event","title":"amazon-science/temporal-reasoning-dataset","date":"2026-05-13T13:07:08+00:00","date_source":"source","source_url":"https://github.com/amazon-science/temporal-reasoning-dataset","signal_url":"https://onlylabs.fyi/signals/9afcd328-0124-485c-8ace-9c3ad546e316","signal_json_url":"https://onlylabs.fyi/signals/9afcd328-0124-485c-8ace-9c3ad546e316/signal.json","text":"repo_new · amazon-science/temporal-reasoning-dataset · signal_desk=repos · occurred_at=2026-05-13T13:07:08+00:00 · url=https://github.com/amazon-science/temporal-reasoning-dataset · stars=2 · data_radar_lanes=Data demand, Evals and quality · data_radar_terms=data, dataset, benchmark · data_radar_reason=Amazon (Nova) has a repo signal matching data demand, evals and quality. · raw={\"repo\":\"amazon-science/temporal-reasoning-dataset\",\"description\":\"🔬 Replication Package for the paper: \\\"Benchmarking Multilingual Temporal Reasoning in LLMs: The Temporal Reasoning Dataset\\\"\",\"language\":\"Python\"}"},{"ref":"E22","kind":"event","title":"amazon-science/PROF-GRPO","date":"2026-05-12T19:43:55+00:00","date_source":"source","source_url":"https://github.com/amazon-science/PROF-GRPO","signal_url":"https://onlylabs.fyi/signals/e19ce80b-3d6a-4aaf-9b1a-82d1b19ab682","signal_json_url":"https://onlylabs.fyi/signals/e19ce80b-3d6a-4aaf-9b1a-82d1b19ab682/signal.json","text":"repo_new · amazon-science/PROF-GRPO · signal_desk=repos · occurred_at=2026-05-12T19:43:55+00:00 · url=https://github.com/amazon-science/PROF-GRPO · stars=3 · 
raw={\"repo\":\"amazon-science/PROF-GRPO\",\"language\":\"Python\"}"},{"ref":"E23","kind":"event","title":"amazon-science/hallucination-benchmark-trivialplus","date":"2026-05-11T21:50:38+00:00","date_source":"source","source_url":"https://github.com/amazon-science/hallucination-benchmark-trivialplus","signal_url":"https://onlylabs.fyi/signals/9b75c29a-37a5-4bf7-90ee-0934ca0fa407","signal_json_url":"https://onlylabs.fyi/signals/9b75c29a-37a5-4bf7-90ee-0934ca0fa407/signal.json","text":"repo_new · amazon-science/hallucination-benchmark-trivialplus · signal_desk=repos · occurred_at=2026-05-11T21:50:38+00:00 · url=https://github.com/amazon-science/hallucination-benchmark-trivialplus · stars=3 · data_radar_lanes=Data demand, Evals and quality · data_radar_terms=rag, eval, evaluation, benchmark · data_radar_reason=Amazon (Nova) has a repo signal matching data demand, evals and quality. · raw={\"repo\":\"amazon-science/hallucination-benchmark-trivialplus\",\"description\":\"[ACL 2026 main] Long-Context Hallucination Detection Benchmark: Rethinking Evaluation for LLM Hallucination Detection: A Desiderata, A New RAG-based Benchmark, New Insights\",\"language\":\"Python\"}"},{"ref":"E24","kind":"event","title":"Navigating uncertainty in Amazon's middle-mile network","date":"2026-05-06T13:37:38+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/navigating-uncertainty-in-amazons-middle-mile-network","signal_url":"https://onlylabs.fyi/signals/748cd73d-00db-498b-a191-237703fba550","signal_json_url":"https://onlylabs.fyi/signals/748cd73d-00db-498b-a191-237703fba550/signal.json","text":"post_published · Navigating uncertainty in Amazon's middle-mile network · signal_desk=talking · occurred_at=2026-05-06T13:37:38+00:00 · url=https://www.amazon.science/blog/navigating-uncertainty-in-amazons-middle-mile-network · raw={\"excerpt\":\"Amazon engineers and scientists have created new tools to optimize delivery networks under uncertainty — and keep 
them adapting without missing a beat.\"}"},{"ref":"E25","kind":"event","title":"amazon-science/RecArena","date":"2026-05-06T06:43:02+00:00","date_source":"source","source_url":"https://github.com/amazon-science/RecArena","signal_url":"https://onlylabs.fyi/signals/167fd29a-5f0f-49c7-be08-5e284c7ab423","signal_json_url":"https://onlylabs.fyi/signals/167fd29a-5f0f-49c7-be08-5e284c7ab423/signal.json","text":"repo_new · amazon-science/RecArena · signal_desk=repos · occurred_at=2026-05-06T06:43:02+00:00 · url=https://github.com/amazon-science/RecArena · stars=1 · raw={\"repo\":\"amazon-science/RecArena\",\"language\":\"Python\"}"},{"ref":"E26","kind":"event","title":"amazon-science/compagent","date":"2026-05-05T23:09:22+00:00","date_source":"source","source_url":"https://github.com/amazon-science/compagent","signal_url":"https://onlylabs.fyi/signals/f823a5c0-e909-4f5a-8241-af329648db0c","signal_json_url":"https://onlylabs.fyi/signals/f823a5c0-e909-4f5a-8241-af329648db0c/signal.json","text":"repo_new · amazon-science/compagent · signal_desk=repos · occurred_at=2026-05-05T23:09:22+00:00 · url=https://github.com/amazon-science/compagent · stars=1 · data_radar_lanes=Safety and policy · data_radar_terms=compliance · data_radar_reason=Amazon (Nova) has a repo signal matching safety and policy. 
· raw={\"repo\":\"amazon-science/compagent\",\"description\":\"CompAgent: An Agentic Framework for Visual Compliance Verification\",\"language\":\"Jupyter Notebook\"}"},{"ref":"E27","kind":"event","title":"amazon-science/SWAN","date":"2026-05-05T17:17:14+00:00","date_source":"source","source_url":"https://github.com/amazon-science/SWAN","signal_url":"https://onlylabs.fyi/signals/be9e75a4-5f0a-4bc4-84a0-3c0bbafc1533","signal_json_url":"https://onlylabs.fyi/signals/be9e75a4-5f0a-4bc4-84a0-3c0bbafc1533/signal.json","text":"repo_new · amazon-science/SWAN · signal_desk=repos · occurred_at=2026-05-05T17:17:14+00:00 · url=https://github.com/amazon-science/SWAN · raw={\"repo\":\"amazon-science/SWAN\",\"description\":\"Code for ACL 2026 Paper \\\"SWAN: Semantic Watermarking with Abstract Meaning Representation\\\"\",\"language\":\"Python\"}"},{"ref":"E28","kind":"event","title":"How mechanism design theory helps optimize Amazon-vendor collaboration","date":"2026-05-05T13:11:23+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/how-mechanism-design-theory-helps-optimize-amazon-vendor-collaboration","signal_url":"https://onlylabs.fyi/signals/e6719951-46a3-47cd-916b-810758df6bac","signal_json_url":"https://onlylabs.fyi/signals/e6719951-46a3-47cd-916b-810758df6bac/signal.json","text":"post_published · How mechanism design theory helps optimize Amazon-vendor collaboration · signal_desk=talking · occurred_at=2026-05-05T13:11:23+00:00 · url=https://www.amazon.science/blog/how-mechanism-design-theory-helps-optimize-amazon-vendor-collaboration · raw={\"excerpt\":\"Agentic mechanism enables Amazon and vendors to optimize supply chain management without disclosing private information.\"}"},{"ref":"E29","kind":"event","title":"Building trust into 
AI","date":"2026-05-04T15:07:58+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/building-trust-into-ai","signal_url":"https://onlylabs.fyi/signals/907e5456-d6a1-40ad-b684-af668587083a","signal_json_url":"https://onlylabs.fyi/signals/907e5456-d6a1-40ad-b684-af668587083a/signal.json","text":"post_published · Building trust into AI · signal_desk=talking · occurred_at=2026-05-04T15:07:58+00:00 · url=https://www.amazon.science/blog/building-trust-into-ai · data_radar_lanes=Safety and policy · data_radar_terms=safety, trust, policy · data_radar_reason=Amazon (Nova) has a writing signal matching safety and policy. · raw={\"excerpt\":\"Amazon scientists and policy experts discuss how the company’s responsible-AI pipeline embeds safety and values throughout the AI development lifecycle.\"}"},{"ref":"E30","kind":"event","title":"amazon-science/rmir","date":"2026-05-01T16:28:38+00:00","date_source":"source","source_url":"https://github.com/amazon-science/rmir","signal_url":"https://onlylabs.fyi/signals/72ed8e30-7140-4825-a6d8-3025d6e595aa","signal_json_url":"https://onlylabs.fyi/signals/72ed8e30-7140-4825-a6d8-3025d6e595aa/signal.json","text":"repo_new · amazon-science/rmir · signal_desk=repos · occurred_at=2026-05-01T16:28:38+00:00 · url=https://github.com/amazon-science/rmir · stars=4 · data_radar_lanes=Data demand, Evals and quality · data_radar_terms=data, dataset, retrieval, eval, benchmark · data_radar_reason=Amazon (Nova) has a repo signal matching data demand, evals and quality. 
· raw={\"repo\":\"amazon-science/rmir\",\"description\":\"Code for the paper RMIR: A Benchmark Dataset for Reasoning-Intensive Multimodal Image Retrieval.\",\"language\":\"Python\"}"},{"ref":"E31","kind":"event","title":"amazon-science/azcausal v0.2.5","date":"2026-04-30T18:25:05+00:00","date_source":"source","source_url":"https://github.com/amazon-science/azcausal/releases/tag/v0.2.5","signal_url":"https://onlylabs.fyi/signals/8470f354-8d69-4dbf-905a-d1a59df46142","signal_json_url":"https://onlylabs.fyi/signals/8470f354-8d69-4dbf-905a-d1a59df46142/signal.json","text":"release · amazon-science/azcausal v0.2.5 · signal_desk=releases · occurred_at=2026-04-30T18:25:05+00:00 · url=https://github.com/amazon-science/azcausal/releases/tag/v0.2.5 · raw={\"repo\":\"amazon-science/azcausal\"}"},{"ref":"E32","kind":"event","title":"Preserving the privacy of AI training data","date":"2026-04-29T17:59:07+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/preserving-the-privacy-of-ai-training-data","signal_url":"https://onlylabs.fyi/signals/12f0579b-c9ec-47fd-963f-1f7cf7cb43e7","signal_json_url":"https://onlylabs.fyi/signals/12f0579b-c9ec-47fd-963f-1f7cf7cb43e7/signal.json","text":"post_published · Preserving the privacy of AI training data · signal_desk=talking · occurred_at=2026-04-29T17:59:07+00:00 · url=https://www.amazon.science/blog/preserving-the-privacy-of-ai-training-data · data_radar_lanes=Data demand, Infrastructure, Safety and policy · data_radar_terms=data, training, privacy · data_radar_reason=Amazon (Nova) has a writing signal matching data demand, infrastructure, safety and policy. 
· raw={\"excerpt\":\"How we reproduced three attacks that extract private training data from AI models and the cryptographic defenses that stop them.\"}"},{"ref":"E33","kind":"event","title":"How catastrophic is your LLM?","date":"2026-04-27T19:01:26+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/how-catastrophic-is-your-llm","signal_url":"https://onlylabs.fyi/signals/7d1377b6-dd1b-43b3-884f-92d37b339d7e","signal_json_url":"https://onlylabs.fyi/signals/7d1377b6-dd1b-43b3-884f-92d37b339d7e/signal.json","text":"post_published · How catastrophic is your LLM? · signal_desk=talking · occurred_at=2026-04-27T19:01:26+00:00 · url=https://www.amazon.science/blog/how-catastrophic-is-your-llm · raw={\"excerpt\":\"A new framework provides a statistical method for estimating the likelihood of catastrophic failures in large language models in adversarial conversations.\"}"},{"ref":"E34","kind":"event","title":"Isabelle/HOL: The proof assistant behind the Nitro Isolation Engine","date":"2026-04-17T13:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/isabelle-hol-the-proof-assistant-behind-the-nitro-isolation-engine","signal_url":"https://onlylabs.fyi/signals/2f38f1a4-8624-4219-8543-76625029d14c","signal_json_url":"https://onlylabs.fyi/signals/2f38f1a4-8624-4219-8543-76625029d14c/signal.json","text":"post_published · Isabelle/HOL: The proof assistant behind the Nitro Isolation Engine · signal_desk=talking · occurred_at=2026-04-17T13:00:00+00:00 · url=https://www.amazon.science/blog/isabelle-hol-the-proof-assistant-behind-the-nitro-isolation-engine · raw={\"excerpt\":\"Isabelle/HOL's balance of expressiveness, automation, and scalability enabled the world's first formally verified cloud 
hypervisor.\"}"},{"ref":"E35","kind":"event","title":"amazon-science/expert-upcycling","date":"2026-04-15T23:52:22+00:00","date_source":"source","source_url":"https://github.com/amazon-science/expert-upcycling","signal_url":"https://onlylabs.fyi/signals/bb1e997e-59bd-4b94-af47-e7b256e8114c","signal_json_url":"https://onlylabs.fyi/signals/bb1e997e-59bd-4b94-af47-e7b256e8114c/signal.json","text":"repo_new · amazon-science/expert-upcycling · signal_desk=repos · occurred_at=2026-04-15T23:52:22+00:00 · url=https://github.com/amazon-science/expert-upcycling · stars=14 · raw={\"repo\":\"amazon-science/expert-upcycling\",\"language\":\"Python\"}"},{"ref":"E36","kind":"event","title":"amazon-science/CodeStruct","date":"2026-04-15T21:34:24+00:00","date_source":"source","source_url":"https://github.com/amazon-science/CodeStruct","signal_url":"https://onlylabs.fyi/signals/d7306143-c6f9-4905-a7bd-d8488ea1598b","signal_json_url":"https://onlylabs.fyi/signals/d7306143-c6f9-4905-a7bd-d8488ea1598b/signal.json","text":"repo_new · amazon-science/CodeStruct · signal_desk=repos · occurred_at=2026-04-15T21:34:24+00:00 · url=https://github.com/amazon-science/CodeStruct · stars=4 · raw={\"repo\":\"amazon-science/CodeStruct\",\"language\":\"Python\"}"},{"ref":"E37","kind":"event","title":"Customized Amazon Nova models improve molecular-property prediction in drug discovery","date":"2026-04-15T16:10:55+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/customized-amazon-nova-models-improve-molecular-property-prediction-in-drug-discovery","signal_url":"https://onlylabs.fyi/signals/ca3f9aec-02af-41ce-ad0e-4d975b44e9f9","signal_json_url":"https://onlylabs.fyi/signals/ca3f9aec-02af-41ce-ad0e-4d975b44e9f9/signal.json","text":"post_published · Customized Amazon Nova models improve molecular-property prediction in drug discovery · signal_desk=talking · occurred_at=2026-04-15T16:10:55+00:00 · 
url=https://www.amazon.science/blog/customized-amazon-nova-models-improve-molecular-property-prediction-in-drug-discovery · raw={\"excerpt\":\"A single, optimized LLM unifies what previously required multiple models and can serve as a reasoning partner for medical chemists.\"}"},{"ref":"E38","kind":"event","title":"AWS and Hopkins Engineering announce groundbreaking database for AI/ML antibody design","date":"2026-04-14T14:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/news/aws-gray-lab-johns-hopkins-announce-groundbreaking-database-for-ai-ml-antibody-design","signal_url":"https://onlylabs.fyi/signals/6334cb01-177d-42a9-982a-d92b7980f2dd","signal_json_url":"https://onlylabs.fyi/signals/6334cb01-177d-42a9-982a-d92b7980f2dd/signal.json","text":"post_published · AWS and Hopkins Engineering announce groundbreaking database for AI/ML antibody design · signal_desk=talking · occurred_at=2026-04-14T14:00:00+00:00 · url=https://www.amazon.science/news/aws-gray-lab-johns-hopkins-announce-groundbreaking-database-for-ai-ml-antibody-design · data_radar_lanes=Data demand, Evals and quality · data_radar_terms=data, dataset, datasets, eval, evaluation, benchmark · data_radar_reason=Amazon (Nova) has a writing signal matching data demand, evals and quality. · raw={\"excerpt\":\"Built in collaboration with the Gray Lab at Johns Hopkins Whiting School of Engineering, the Antibody Developability Benchmark is powered by one of the most diverse antibody datasets in public literature, enabling transparent performance evaluation for AI-guided antibody design.\"}"},{"ref":"E39","kind":"event","title":"Intelligence isn’t about parameter count. 
It’s about time.","date":"2026-02-25T13:59:12+00:00","date_source":"source","source_url":"https://www.amazon.science/blog/intelligence-isnt-about-parameter-count-its-about-time","signal_url":"https://onlylabs.fyi/signals/573d898b-3aa9-4371-8633-bea4b659902a","signal_json_url":"https://onlylabs.fyi/signals/573d898b-3aa9-4371-8633-bea4b659902a/signal.json","text":"post_published · Intelligence isn’t about parameter count. It’s about time. · signal_desk=talking · occurred_at=2026-02-25T13:59:12+00:00 · url=https://www.amazon.science/blog/intelligence-isnt-about-parameter-count-its-about-time · hn=3 points/0 comments · data_radar_lanes=Infrastructure · data_radar_terms=inference · data_radar_reason=Amazon (Nova) has a writing signal matching infrastructure. · raw={\"excerpt\":\"As AI models grow larger, they become less insightful, not more. To ensure that they continue to learn, we need to reduce their inference time.\"}"},{"ref":"E40","kind":"event","title":"amazon-science/TransitionFlowMatching","date":"2026-04-10T19:19:08+00:00","date_source":"source","source_url":"https://github.com/amazon-science/TransitionFlowMatching","signal_url":"https://onlylabs.fyi/signals/a71e26e0-15de-45cc-a8ec-272b31707e07","signal_json_url":"https://onlylabs.fyi/signals/a71e26e0-15de-45cc-a8ec-272b31707e07/signal.json","text":"repo_new · amazon-science/TransitionFlowMatching · signal_desk=repos · occurred_at=2026-04-10T19:19:08+00:00 · url=https://github.com/amazon-science/TransitionFlowMatching · stars=12 · raw={\"repo\":\"amazon-science/TransitionFlowMatching\",\"description\":\"Official implementation of \\\"Demystifying Transition Matching: When and Why It Can Beat Flow Matching\\\" (AISTATS 2026). 
Code for image and video generation using Transition Matching.\",\"language\":\"Python\"}"},{"ref":"E41","kind":"event","title":"amazon/GKA-primed-HQwen3-32B-Instruct","date":"2026-03-31T17:47:35+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GKA-primed-HQwen3-32B-Instruct","signal_url":"https://onlylabs.fyi/signals/d98160b8-ebe7-4863-af6b-c8eadaf028e4","signal_json_url":"https://onlylabs.fyi/signals/d98160b8-ebe7-4863-af6b-c8eadaf028e4/signal.json","text":"model_released · amazon/GKA-primed-HQwen3-32B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:47:35+00:00 · url=https://huggingface.co/amazon/GKA-primed-HQwen3-32B-Instruct · hf_downloads=60480 · hf_likes=2 · hf_params=34439095296 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E42","kind":"event","title":"amazon/Mamba2-primed-HQwen3-8B-Instruct","date":"2026-03-31T17:51:01+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/Mamba2-primed-HQwen3-8B-Instruct","signal_url":"https://onlylabs.fyi/signals/7ba3f73b-c564-42f5-8c88-db38a44cdf91","signal_json_url":"https://onlylabs.fyi/signals/7ba3f73b-c564-42f5-8c88-db38a44cdf91/signal.json","text":"model_released · amazon/Mamba2-primed-HQwen3-8B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:51:01+00:00 · url=https://huggingface.co/amazon/Mamba2-primed-HQwen3-8B-Instruct · hf_downloads=199 · hf_likes=5 · hf_params=8495712960 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E43","kind":"event","title":"How Amazon uses agentic AI for vulnerability detection at global 
scale","date":"2026-04-08T16:17:20+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/how-amazon-uses-agentic-ai-for-vulnerability-detection-at-global-scale","signal_url":"https://onlylabs.fyi/signals/2c49bc6a-c02f-4750-bf89-479730dc8670","signal_json_url":"https://onlylabs.fyi/signals/2c49bc6a-c02f-4750-bf89-479730dc8670/signal.json","text":"post_published · How Amazon uses agentic AI for vulnerability detection at global scale · signal_desk=talking · occurred_at=2026-04-08T16:17:20+00:00 · url=https://www.amazon.science/blog/how-amazon-uses-agentic-ai-for-vulnerability-detection-at-global-scale · data_radar_lanes=Product and customer · data_radar_terms=product · data_radar_reason=Amazon (Nova) has a writing signal matching product and customer. · raw={\"excerpt\":\"Amazon’s RuleForge system uses agentic AI to generate production-ready detection rules 336% faster than traditional methods.\"}"},{"ref":"E44","kind":"event","title":"Verifying and optimizing post-quantum cryptography at Amazon","date":"2026-04-07T15:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/verifying-and-optimizing-post-quantum-cryptography-at-amazon","signal_url":"https://onlylabs.fyi/signals/26fabe91-d726-4848-a1e0-4538ec0fd290","signal_json_url":"https://onlylabs.fyi/signals/26fabe91-d726-4848-a1e0-4538ec0fd290/signal.json","text":"post_published · Verifying and optimizing post-quantum cryptography at Amazon · signal_desk=talking · occurred_at=2026-04-07T15:00:00+00:00 · url=https://www.amazon.science/blog/verifying-and-optimizing-post-quantum-cryptography-at-amazon · data_radar_lanes=Safety and policy · data_radar_terms=security · data_radar_reason=Amazon (Nova) has a writing signal matching safety and policy. 
· raw={\"excerpt\":\"How automated reasoning reconciles the demands of security, performance, and maintainability.\"}"},{"ref":"E45","kind":"event","title":"amazon/GKA-primed-HQwen3-8B-Reasoner","date":"2026-03-31T17:37:59+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GKA-primed-HQwen3-8B-Reasoner","signal_url":"https://onlylabs.fyi/signals/0bf4648a-945c-40b7-a1e5-7c4fcd204a97","signal_json_url":"https://onlylabs.fyi/signals/0bf4648a-945c-40b7-a1e5-7c4fcd204a97/signal.json","text":"model_released · amazon/GKA-primed-HQwen3-8B-Reasoner · signal_desk=releases · occurred_at=2026-03-31T17:37:59+00:00 · url=https://huggingface.co/amazon/GKA-primed-HQwen3-8B-Reasoner · hf_downloads=3937 · hf_likes=3 · hf_params=8500245504 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E46","kind":"event","title":"amazon/GKA-primed-HQwen3-32B-Reasoner","date":"2026-03-31T17:46:09+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GKA-primed-HQwen3-32B-Reasoner","signal_url":"https://onlylabs.fyi/signals/588f6aad-c968-4b05-a6a7-0be01f6f4db8","signal_json_url":"https://onlylabs.fyi/signals/588f6aad-c968-4b05-a6a7-0be01f6f4db8/signal.json","text":"model_released · amazon/GKA-primed-HQwen3-32B-Reasoner · signal_desk=releases · occurred_at=2026-03-31T17:46:09+00:00 · url=https://huggingface.co/amazon/GKA-primed-HQwen3-32B-Reasoner · hf_downloads=2208 · hf_likes=3 · hf_params=34137072640 · pipeline=text-generation · license=apache-2.0 · 
raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E47","kind":"event","title":"amazon/GDN-primed-HQwen3-32B-Instruct","date":"2026-03-31T17:55:09+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GDN-primed-HQwen3-32B-Instruct","signal_url":"https://onlylabs.fyi/signals/349c8649-b8fc-4b89-85ca-a8bed8bb979e","signal_json_url":"https://onlylabs.fyi/signals/349c8649-b8fc-4b89-85ca-a8bed8bb979e/signal.json","text":"model_released · amazon/GDN-primed-HQwen3-32B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:55:09+00:00 · url=https://huggingface.co/amazon/GDN-primed-HQwen3-32B-Instruct · hf_downloads=25 · hf_likes=3 · hf_params=34428605440 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E48","kind":"event","title":"amazon-science/uniqsketch v1.2.1","date":"2026-04-06T16:09:09+00:00","date_source":"source","source_url":"https://github.com/amazon-science/uniqsketch/releases/tag/v1.2.1","signal_url":"https://onlylabs.fyi/signals/ea1a9158-32c6-4058-ae2a-7de8a659751d","signal_json_url":"https://onlylabs.fyi/signals/ea1a9158-32c6-4058-ae2a-7de8a659751d/signal.json","text":"release · amazon-science/uniqsketch v1.2.1 · signal_desk=releases · occurred_at=2026-04-06T16:09:09+00:00 · url=https://github.com/amazon-science/uniqsketch/releases/tag/v1.2.1 · raw={\"repo\":\"amazon-science/uniqsketch\"}"},{"ref":"E49","kind":"event","title":"amazon/GKA-primed-HQwen3-8B-Instruct","date":"2026-03-31T17:49:35+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GKA-primed-HQwen3-8B-Instruct","signal_url":"https://onlylabs.fyi/signals/4afc0b8c-2d15-4535-a6ec-9c24a5426a6d","signal_json_url":"https://onlylabs.fyi/signals/4afc0b8c-2d15-4535-a6ec-9c24a5426a6d/signal.json","text":"model_released · amazon/GKA-primed-HQwen3-8B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:49:35+00:00 · url=https://huggingface.co/amazon/GKA-primed-HQwen3-8B-Instruct · 
hf_downloads=3456 · hf_likes=2 · hf_params=8500245504 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E50","kind":"event","title":"amazon/GDN-primed-HQwen3-8B-Instruct","date":"2026-03-31T17:48:43+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GDN-primed-HQwen3-8B-Instruct","signal_url":"https://onlylabs.fyi/signals/f670cf61-a6c1-43a9-9f60-bfc331981f03","signal_json_url":"https://onlylabs.fyi/signals/f670cf61-a6c1-43a9-9f60-bfc331981f03/signal.json","text":"model_released · amazon/GDN-primed-HQwen3-8B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:48:43+00:00 · url=https://huggingface.co/amazon/GDN-primed-HQwen3-8B-Instruct · hf_downloads=1334 · hf_likes=2 · hf_params=8497885056 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E51","kind":"event","title":"amazon/BMOJOF-primed-HQwen3-8B-Instruct","date":"2026-03-31T17:48:17+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/BMOJOF-primed-HQwen3-8B-Instruct","signal_url":"https://onlylabs.fyi/signals/79c45959-e0b1-4d92-a653-a66361fb1537","signal_json_url":"https://onlylabs.fyi/signals/79c45959-e0b1-4d92-a653-a66361fb1537/signal.json","text":"model_released · amazon/BMOJOF-primed-HQwen3-8B-Instruct · signal_desk=releases · occurred_at=2026-03-31T17:48:17+00:00 · url=https://huggingface.co/amazon/BMOJOF-primed-HQwen3-8B-Instruct · hf_downloads=69 · hf_likes=2 · hf_params=9255224832 · pipeline=text-generation · license=apache-2.0 · 
raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E52","kind":"event","title":"amazon/GDN-primed-HQwen3-8B-Reasoner","date":"2026-03-31T17:43:05+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/GDN-primed-HQwen3-8B-Reasoner","signal_url":"https://onlylabs.fyi/signals/c27b90dc-22c4-43ce-8f67-1e59f7a69fb8","signal_json_url":"https://onlylabs.fyi/signals/c27b90dc-22c4-43ce-8f67-1e59f7a69fb8/signal.json","text":"model_released · amazon/GDN-primed-HQwen3-8B-Reasoner · signal_desk=releases · occurred_at=2026-03-31T17:43:05+00:00 · url=https://huggingface.co/amazon/GDN-primed-HQwen3-8B-Reasoner · hf_downloads=32 · hf_likes=2 · hf_params=8497885056 · pipeline=text-generation · license=apache-2.0 · raw={\"derived_reason\":\"first-party-finetune\"}"},{"ref":"E53","kind":"event","title":"amazon-science/storm-referring-multi-object-grounding","date":"2026-04-03T21:51:07+00:00","date_source":"source","source_url":"https://github.com/amazon-science/storm-referring-multi-object-grounding","signal_url":"https://onlylabs.fyi/signals/dd34a26c-6be6-449b-8a88-4bfc7623643a","signal_json_url":"https://onlylabs.fyi/signals/dd34a26c-6be6-449b-8a88-4bfc7623643a/signal.json","text":"repo_new · amazon-science/storm-referring-multi-object-grounding · signal_desk=repos · occurred_at=2026-04-03T21:51:07+00:00 · url=https://github.com/amazon-science/storm-referring-multi-object-grounding · stars=1 · raw={\"repo\":\"amazon-science/storm-referring-multi-object-grounding\"}"},{"ref":"E54","kind":"event","title":"amazon-science/azcausal v0.2.4.3","date":"2026-04-02T21:01:34+00:00","date_source":"source","source_url":"https://github.com/amazon-science/azcausal/releases/tag/v0.2.4.3","signal_url":"https://onlylabs.fyi/signals/bdebc990-63a9-42ae-9f23-696688262a98","signal_json_url":"https://onlylabs.fyi/signals/bdebc990-63a9-42ae-9f23-696688262a98/signal.json","text":"release · amazon-science/azcausal v0.2.4.3 · signal_desk=releases · 
occurred_at=2026-04-02T21:01:34+00:00 · url=https://github.com/amazon-science/azcausal/releases/tag/v0.2.4.3 · raw={\"repo\":\"amazon-science/azcausal\"}"},{"ref":"E55","kind":"event","title":"Improving quality and robustness in LLM-based text-to-speech systems","date":"2026-04-01T18:13:19+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/improving-quality-and-robustness-in-llm-based-text-to-speech-systems","signal_url":"https://onlylabs.fyi/signals/ee6e1769-b151-4449-8196-efc8d4a812d4","signal_json_url":"https://onlylabs.fyi/signals/ee6e1769-b151-4449-8196-efc8d4a812d4/signal.json","text":"post_published · Improving quality and robustness in LLM-based text-to-speech systems · signal_desk=talking · occurred_at=2026-04-01T18:13:19+00:00 · url=https://www.amazon.science/blog/improving-quality-and-robustness-in-llm-based-text-to-speech-systems · data_radar_lanes=Data demand, Evals and quality, Infrastructure · data_radar_terms=data, quality, systems · data_radar_reason=Amazon (Nova) has a writing signal matching data demand, evals and quality, infrastructure. 
· raw={\"excerpt\":\"Low-rank adaptation, data augmentation, and chain-of-thought reasoning are among the techniques enabling accent-free polyglot outputs, improved expressiveness, and reliable synthesis.\"}"},{"ref":"E56","kind":"event","title":"amazon/gpt-oss-120b-p-eagle","date":"2026-02-09T21:23:11+00:00","date_source":"source","source_url":"https://huggingface.co/amazon/gpt-oss-120b-p-eagle","signal_url":"https://onlylabs.fyi/signals/445f6122-72fc-4c6c-af5f-f23360f7428d","signal_json_url":"https://onlylabs.fyi/signals/445f6122-72fc-4c6c-af5f-f23360f7428d/signal.json","text":"model_released · amazon/gpt-oss-120b-p-eagle · signal_desk=releases · occurred_at=2026-02-09T21:23:11+00:00 · url=https://huggingface.co/amazon/gpt-oss-120b-p-eagle · hf_downloads=267 · hf_likes=9 · hf_params=1702264640 · license=apache-2.0"},{"ref":"E57","kind":"event","title":"amazon-science/acclaim","date":"2026-03-30T17:23:07+00:00","date_source":"source","source_url":"https://github.com/amazon-science/acclaim","signal_url":"https://onlylabs.fyi/signals/cb1d9c56-42cd-48a5-9f1e-c264e3bfbc1e","signal_json_url":"https://onlylabs.fyi/signals/cb1d9c56-42cd-48a5-9f1e-c264e3bfbc1e/signal.json","text":"repo_new · amazon-science/acclaim · signal_desk=repos · occurred_at=2026-03-30T17:23:07+00:00 · url=https://github.com/amazon-science/acclaim · stars=3 · raw={\"repo\":\"amazon-science/acclaim\",\"language\":\"Python\"}"},{"ref":"E58","kind":"event","title":"amazon-science/agentic-forking-path","date":"2026-03-30T13:15:53+00:00","date_source":"source","source_url":"https://github.com/amazon-science/agentic-forking-path","signal_url":"https://onlylabs.fyi/signals/23ac9a51-441b-43cd-8435-84e9cd0c9018","signal_json_url":"https://onlylabs.fyi/signals/23ac9a51-441b-43cd-8435-84e9cd0c9018/signal.json","text":"repo_new · amazon-science/agentic-forking-path · signal_desk=repos · occurred_at=2026-03-30T13:15:53+00:00 · url=https://github.com/amazon-science/agentic-forking-path · stars=1 · 
raw={\"repo\":\"amazon-science/agentic-forking-path\",\"language\":\"Python\"}"},{"ref":"E59","kind":"event","title":"amazon-science/papercode-coordinating-spot-and-contracts","date":"2026-03-28T14:47:56+00:00","date_source":"source","source_url":"https://github.com/amazon-science/papercode-coordinating-spot-and-contracts","signal_url":"https://onlylabs.fyi/signals/b124a4a8-abb6-474b-af95-af80bc85307c","signal_json_url":"https://onlylabs.fyi/signals/b124a4a8-abb6-474b-af95-af80bc85307c/signal.json","text":"repo_new · amazon-science/papercode-coordinating-spot-and-contracts · signal_desk=repos · occurred_at=2026-03-28T14:47:56+00:00 · url=https://github.com/amazon-science/papercode-coordinating-spot-and-contracts · raw={\"repo\":\"amazon-science/papercode-coordinating-spot-and-contracts\",\"description\":\"Code for Paper: \\\"Coordinating Spot and Contract Supply in Freight Marketplaces\\\"\",\"language\":\"Python\"}"},{"ref":"E60","kind":"event","title":"Formally verified AES-XTS: The first AES algorithm to join s2n-bignum","date":"2026-03-20T16:38:44+00:00","date_source":"rss.item_date","source_url":"https://www.amazon.science/blog/formally-verified-aes-xts-the-first-aes-algorithm-to-join-s2n-bignum","signal_url":"https://onlylabs.fyi/signals/2e8af20a-fd68-4946-918a-8b9217bdf626","signal_json_url":"https://onlylabs.fyi/signals/2e8af20a-fd68-4946-918a-8b9217bdf626/signal.json","text":"post_published · Formally verified AES-XTS: The first AES algorithm to join s2n-bignum · signal_desk=talking · occurred_at=2026-03-20T16:38:44+00:00 · url=https://www.amazon.science/blog/formally-verified-aes-xts-the-first-aes-algorithm-to-join-s2n-bignum · raw={\"excerpt\":\"Simplifying and clarifying the assembly code for core operations enabled automated optimization and verification.\"}"}]}