What does this repo signal mean?

Upstage (Solar) published UpstageAI/mcp-upstage-server (TypeScript). Repository signals like this one expose tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. Key details: repo UpstageAI/mcp-upstage-server · language TypeScript · low stars, routine repo. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Upstage (Solar) Repo: UpstageAI/mcp-upstage-server

Captured source

GitHub · github.com/UpstageAI/mcp-upstage-server

UpstageAI/mcp-upstage-server repository metadata

published Sep 3, 2025 · seen Jun 5 · captured Jun 11 · http 200 · method plain

UpstageAI/mcp-upstage-server

Description: Node.js/TypeScript MCP server for Upstage AI document processing with parsing, information extraction, schema generation, and classification tools

Language: TypeScript

Stars: 2

Forks: 1

Open issues: 5

Created: 2025-09-03T11:00:24Z

Pushed: 2026-02-11T17:36:52Z

Default branch: main

Fork: no

Archived: no

README:

MCP-Upstage-Server

Node.js/TypeScript implementation of the MCP server for Upstage AI services.

Features

Document Parsing: Extract structure and content from various document types (PDF, images, Office files)
Information Extraction: Extract structured information using custom or auto-generated schemas
Schema Generation: Automatically generate extraction schemas from document analysis
Document Classification: Classify documents into predefined categories (invoice, receipt, contract, etc.)
Built with TypeScript for type safety
Dual transport support: stdio (default) and HTTP Streamable
Async/await pattern throughout
Comprehensive error handling and retry logic
Progress reporting support

Installation

Prerequisites

Node.js 18.0.0 or higher
Upstage API key from Upstage Console

Install from npm

# Install globally
npm install -g mcp-upstage-server

# Or use with npx (no installation required)
npx mcp-upstage-server

Install from source

# Clone the repository
git clone https://github.com/UpstageAI/mcp-upstage.git
cd mcp-upstage/mcp-upstage-node

# Install dependencies
npm install

# Build the project
npm run build

# Set up environment variables
cp .env.example .env
# Edit .env and add your UPSTAGE_API_KEY

Usage

Running the server

# With stdio transport (default)
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server

# With HTTP Streamable transport
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server --http

# With HTTP transport on custom port
UPSTAGE_API_KEY=your-api-key npx mcp-upstage-server --http --port 8080

# Show help
npx mcp-upstage-server --help

# Development mode (from source)
npm run dev

# Production mode (from source)
npm start

Integration with Claude Desktop

Option 1: stdio transport (default)

{
  "mcpServers": {
    "upstage": {
      "command": "npx",
      "args": ["mcp-upstage-server"],
      "env": {
        "UPSTAGE_API_KEY": "your-api-key-here"
      }
    }
  }
}

Option 2: HTTP Streamable transport

{
  "mcpServers": {
    "upstage-http": {
      "command": "npx",
      "args": ["mcp-upstage-server", "--http", "--port", "3000"],
      "env": {
        "UPSTAGE_API_KEY": "your-api-key-here"
      }
    }
  }
}
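
The two configuration options above differ only in the entry name and the extra arguments. As a minimal sketch, the same JSON can be generated from code. The helper names upstageEntry and claudeDesktopConfig are hypothetical and not part of mcp-upstage-server; the command, flags, entry names, and environment variable are taken from the examples above.

```typescript
// Hypothetical helpers (not shipped with mcp-upstage-server).
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds one server entry; passing httpPort switches to the HTTP transport flags.
function upstageEntry(apiKey: string, httpPort?: number): McpServerEntry {
  const args = ["mcp-upstage-server"];
  if (httpPort !== undefined) {
    args.push("--http", "--port", String(httpPort));
  }
  return { command: "npx", args, env: { UPSTAGE_API_KEY: apiKey } };
}

// Wraps the entry under "mcpServers", using the entry names from the examples above.
function claudeDesktopConfig(
  apiKey: string,
  httpPort?: number,
): { mcpServers: Record<string, McpServerEntry> } {
  const name = httpPort === undefined ? "upstage" : "upstage-http";
  return { mcpServers: { [name]: upstageEntry(apiKey, httpPort) } };
}
```

Calling JSON.stringify(claudeDesktopConfig("your-api-key-here"), null, 2) yields the Option 1 document; adding 3000 as the second argument yields Option 2.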

Transport Options

stdio Transport (Default)

Pros: Simple setup, direct process communication
Cons: Single client connection only
Usage: Default mode, no additional configuration needed

HTTP Streamable Transport

Pros: Multiple client support, network accessible, RESTful API
Cons: Requires port management, network configuration
Endpoints:
POST /mcp - Main MCP communication endpoint
GET /mcp - Server-Sent Events stream
GET /health - Health check endpoint
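
A client that connects over HTTP may want to probe the health endpoint first. The sketch below is one way to do that, under stated assumptions: the server is reachable over plain http, and a 2xx response from GET /health means it is up (the README excerpt does not document the response body). The names upstageEndpoints, isServerHealthy, and HttpGet are hypothetical. The HTTP call is injected so the logic does not depend on a particular runtime.

```typescript
// Minimal response shape this sketch relies on; a fetch Response satisfies it.
interface MinimalResponse {
  ok: boolean;
  status: number;
}
type HttpGet = (url: string) => Promise<MinimalResponse>;

// Builds the endpoint URLs listed above for a given host and port.
function upstageEndpoints(host: string, port: number): { mcp: string; health: string } {
  const base = `http://${host}:${port}`;
  return { mcp: `${base}/mcp`, health: `${base}/health` };
}

// Resolves to true when GET /health answers with a 2xx status (assumption),
// and to false on any non-2xx status or transport failure.
async function isServerHealthy(host: string, port: number, httpGet: HttpGet): Promise<boolean> {
  try {
    const response = await httpGet(upstageEndpoints(host, port).health);
    return response.ok;
  } catch (_err) {
    return false; // connection refused, timeout, DNS failure, and so on
  }
}
```

On Node.js 18 or later the global fetch can be passed in directly, for example isServerHealthy("localhost", 3000, (url) => fetch(url)).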

Available Tools

parse_document

Parse a document using Upstage AI's document digitization API.

Parameters:

file_path (required): Path to the document file
output_formats (optional): Array of output formats (e.g., ['html', 'text', 'markdown'])

Supported formats: PDF, JPEG, PNG, TIFF, BMP, GIF, WEBP
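
MCP clients invoke tools through a JSON-RPC 2.0 request with the method tools/call. As a rough sketch, a request for parse_document could be assembled as follows. The tool name and parameter names come from the list above; the helper parseDocumentRequest, the interfaces, and the example path are hypothetical, and an MCP client library would normally build this message for you.

```typescript
interface ParseDocumentArgs {
  file_path: string;          // required
  output_formats?: string[];  // optional, e.g. ["html", "text", "markdown"]
}

interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical helper: wraps the arguments in a JSON-RPC tools/call envelope.
function parseDocumentRequest(
  id: number,
  args: ParseDocumentArgs,
): ToolCallRequest<ParseDocumentArgs> {
  if (args.file_path.trim() === "") {
    throw new Error("file_path is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "parse_document", arguments: args },
  };
}
```

For example, parseDocumentRequest(1, { file_path: "/docs/a.pdf", output_formats: ["markdown"] }) produces a message that a client would send after the usual MCP initialization handshake.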

extract_information

Extract structured information from documents using Upstage Universal Information Extraction.

Parameters:

file_path (required): Path to the document file
schema_path (optional): Path to JSON schema file
schema_json (optional): JSON schema as string
auto_generate_schema (optional, default: true): Auto-generate schema if none provided

Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
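
The parameters interact: auto_generate_schema defaults to true, and a custom schema is needed once it is set to false. A client-side pre-check along those lines might look like the sketch below. The function checkExtractArgs is hypothetical, and the server's own validation may be stricter or differ in detail.

```typescript
interface ExtractInformationArgs {
  file_path: string;
  schema_path?: string;
  schema_json?: string;
  auto_generate_schema?: boolean;
}

// Hypothetical pre-check: returns a list of problems; an empty list means
// the arguments look usable according to the parameter descriptions above.
function checkExtractArgs(args: ExtractInformationArgs): string[] {
  const problems: string[] = [];
  if (args.file_path.trim() === "") {
    problems.push("file_path is required");
  }
  const hasSchema = args.schema_path !== undefined || args.schema_json !== undefined;
  const autoGenerate = args.auto_generate_schema ?? true; // documented default
  if (!autoGenerate && !hasSchema) {
    problems.push("auto_generate_schema is false but no schema_path or schema_json was given");
  }
  if (args.schema_json !== undefined) {
    try {
      JSON.parse(args.schema_json);
    } catch (_err) {
      problems.push("schema_json is not valid JSON");
    }
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a caller report every issue at once before spending an API call.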

generate_schema

Generate an extraction schema for a document using Upstage AI's schema generation API.

Parameters:

file_path (required): Path to the document file to analyze

Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX

This tool analyzes a document and automatically generates a JSON schema that defines the structure and fields that can be extracted from similar documents. The generated schema can then be used with the extract_information tool when auto_generate_schema is set to false.

Use cases:

Create reusable schemas for multiple similar documents
Have more control over extraction fields
Ensure consistent field naming across extractions

The tool returns both a readable schema object and a schema_json string that can be directly copied and used with the extract_information tool.
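
The reuse workflow described above (generate a schema once, then apply it to similar documents) can be sketched as a small helper that turns one generate_schema result into arguments for several extract_information calls. The result shape is an assumption: the source names the schema_json string, while the field name schema for the readable object is a guess. The helper reuseSchema is hypothetical.

```typescript
// Assumed result shape; only `schema_json` is named in the source.
interface GenerateSchemaResult {
  schema?: unknown;
  schema_json: string;
}

interface ExtractWithSchemaArgs {
  file_path: string;
  schema_json: string;
  auto_generate_schema: false;
}

// Hypothetical helper: builds extract_information arguments that reuse one
// generated schema for every file in `filePaths`.
function reuseSchema(
  result: GenerateSchemaResult,
  filePaths: string[],
): ExtractWithSchemaArgs[] {
  JSON.parse(result.schema_json); // throws SyntaxError if the string is not valid JSON
  return filePaths.map((file_path): ExtractWithSchemaArgs => ({
    file_path,
    schema_json: result.schema_json,
    auto_generate_schema: false,
  }));
}
```

Passing the same schema_json to every call is what keeps field names consistent across extractions.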

classify_document

Classify a document into predefined categories using Upstage AI's document classification API.

Parameters:

file_path (required): Path to the document file to classify
schema_path (optional): Path to JSON file containing custom classification schema
schema_json (optional): JSON string containing custom classification schema

Supported formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX

This tool analyzes a document and classifies it into categories. By default, it uses a comprehensive set of document types, but you can provide custom classification categories.

Default categories:

invoice, receipt, contract, cv, bank_statement, tax_document, insurance, business_card, letter, form, certificate, report, others

Use cases:

Automatically sort and organize documents by type
Filter documents for specific processing workflows
Build document management systems with automatic categorization
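
For the sorting use case, the default categories can be modeled as a TypeScript union so that downstream code is checked against the documented labels. This is a sketch only: the README excerpt does not show the tool's output format, so the { file_path, category } record is a hypothetical shape, and routing unknown labels to "others" is a design choice of this example.

```typescript
// Default categories as listed above.
const DEFAULT_CATEGORIES = [
  "invoice", "receipt", "contract", "cv", "bank_statement", "tax_document",
  "insurance", "business_card", "letter", "form", "certificate", "report", "others",
] as const;

type DefaultCategory = (typeof DEFAULT_CATEGORIES)[number];

// Type guard: narrows an arbitrary label to one of the default categories.
function isDefaultCategory(label: string): label is DefaultCategory {
  return (DEFAULT_CATEGORIES as readonly string[]).indexOf(label) !== -1;
}

// Groups file paths by category; labels outside the default set go to "others".
function groupByCategory(
  results: Array<{ file_path: string; category: string }>,
): Map<DefaultCategory, string[]> {
  const groups = new Map<DefaultCategory, string[]>();
  for (const { file_path, category } of results) {
    const key: DefaultCategory = isDefaultCategory(category) ? category : "others";
    const bucket = groups.get(key) ?? [];
    bucket.push(file_path);
    groups.set(key, bucket);
  }
  return groups;
}
```

With a custom classification schema the label set would differ, so the constant above would need to be replaced by the custom categories.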

Schema Guide for Information Extraction

When auto_generate_schema is false, you need to provide a custom schema. Here's how to format it correctly:

📋 Basic Schema Structure

The schema must follow this exact structure:

{
  "type": "json_schema",
  "json_schema": {
    "name": "document_schema",
    "schema": {
      "type": "object",
      "properties": {
        "field_name": {
          "type": "string|number|array|object",
          "description": "Description of what to extract"
        }
      }
    }
  }
}
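
Because the nesting is easy to get wrong by hand, one option is to build the wrapper from a flat field map and to check candidates structurally before sending them. The sketch below assumes only the structure shown above; buildExtractionSchema and hasRequiredNesting are hypothetical helpers, and the check covers nesting only, not whether the API accepts a particular field definition.

```typescript
type FieldType = "string" | "number" | "array" | "object";

interface FieldSpec {
  type: FieldType;
  description: string;
}

interface ExtractionSchema {
  type: "json_schema";
  json_schema: {
    name: string;
    schema: { type: "object"; properties: Record<string, FieldSpec> };
  };
}

// Wraps a flat field map in the required json_schema envelope.
function buildExtractionSchema(
  fields: Record<string, FieldSpec>,
  name = "document_schema",
): ExtractionSchema {
  return {
    type: "json_schema",
    json_schema: { name, schema: { type: "object", properties: fields } },
  };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Structural check: true only when every level of the envelope is present.
function hasRequiredNesting(candidate: unknown): boolean {
  if (!isRecord(candidate) || candidate.type !== "json_schema") return false;
  const wrapper = candidate.json_schema;
  if (!isRecord(wrapper) || typeof wrapper.name !== "string") return false;
  const schema = wrapper.schema;
  return isRecord(schema) && schema.type === "object" && isRecord(schema.properties);
}
```

JSON.stringify(buildExtractionSchema({ company_name: { type: "string", description: "Name of the company" } })) gives a string suitable for the schema_json parameter, while a flat object such as { company_name: { type: "string" } } fails hasRequiredNesting.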

❌ Common Mistakes

Wrong: Missing nested structure

{
  "company_name": {
    "type": "string"
  }
}

Wrong: Incorrect response_format

{
  "schema": {
    "company_name": "string"
  }
}

Wrong: Missing properties wrapper

{
"type": "json_schema",...

Excerpt shown — open the source for the full document.

Notability

notability 2.0/10

Low stars, routine repo