SyGra Studio v2.0.0 debuts visual no-code editor for synthetic data generation workflows
release · feature · platform · open-source · huggingface.co

Overview

SyGra Studio v2.0.0 introduces a unified visual environment for synthetic data generation, replacing hand-edited configuration files with an interactive canvas-based editor. Studio remains backward compatible with existing SyGra workflows while letting users compose data generation pipelines graphically.

Key Features

Visual Workflow Builder

  • Drag-and-drop node composition for LLM-based pipelines
  • Support for multiple LLM providers (OpenAI, Azure OpenAI, Ollama, Vertex AI, Bedrock, vLLM, custom endpoints)
  • Auto-discovery of state variables from data sources for prompt templating
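
To make the link between state variables and prompt templating concrete, here is a minimal sketch that extracts placeholder names from a prompt and checks them against a data row. It assumes Python-style `{name}` placeholders, which the announcement does not confirm; `template_variables` and `missing_variables` are hypothetical helpers, not SyGra functions.

```python
from string import Formatter


def template_variables(prompt: str) -> list[str]:
    """Return placeholder names in order of first appearance (assumes {name} syntax)."""
    names: list[str] = []
    for _, field, _, _ in Formatter().parse(prompt):
        # `field` is None for trailing literal text and "" for a bare {}.
        if field and field not in names:
            names.append(field)
    return names


def missing_variables(prompt: str, row: dict) -> list[str]:
    """List the placeholders that the given data row cannot fill."""
    return [name for name in template_variables(prompt) if name not in row]


# Hypothetical prompt and row, for illustration only.
prompt = "Question: {question}\nDraft answer: {answer}\nCritique the draft."
row = {"question": "How do I reverse a list?", "answer": "Use list.reverse()."}

print(template_variables(prompt))
print(missing_variables(prompt, row))
print(prompt.format(**row))
```

A check like this can flag a prompt that references a variable the data source does not provide before any model call is made.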

Data Source Integration

  • Direct connectors for Hugging Face datasets, file systems, and ServiceNow instances
  • Row preview capability before execution
  • Automatic column-to-variable mapping
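
Row preview and column-to-variable mapping can be pictured with a small sketch like the one below. It reads a CSV-like stream with the standard library only; the function names and the sample data are assumptions for illustration and do not reflect how Studio's connectors are implemented.

```python
import csv
import io
from itertools import islice


def preview_rows(source, limit: int = 3) -> list[dict]:
    """Read at most `limit` rows from a CSV text stream without loading the rest."""
    return list(islice(csv.DictReader(source), limit))


def columns_to_variables(rows: list[dict]) -> list[str]:
    """Collect column names, in first-seen order, as candidate state variables."""
    seen: dict[str, None] = {}
    for row in rows:
        for column in row:
            seen.setdefault(column)
    return list(seen)


# Hypothetical sample data standing in for a real data source.
sample = io.StringIO(
    "question,answer\n"
    "What is a tuple?,An immutable sequence.\n"
    "What is a set?,An unordered collection of unique items.\n"
)

rows = preview_rows(sample, limit=1)
print(rows)
print(columns_to_variables(rows))
```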

Execution & Monitoring

  • Real-time node progress tracking during workflow execution
  • Per-run visibility into token costs, latency, and guardrail outcomes
  • Inline debugging with logs and breakpoints
  • Monaco-backed code editor for advanced customization
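
As an illustration of what per-run visibility into token costs and latency could involve, the sketch below aggregates per-node measurements into a run summary. The `NodeRun` record, the field names, and the prices are all hypothetical; real rates depend on the provider, and Studio's own metrics format is not described in the announcement.

```python
from dataclasses import dataclass


@dataclass
class NodeRun:
    """One node execution within a workflow run (hypothetical record shape)."""
    node: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float


# Hypothetical prices in USD per 1,000 tokens.
PRICE_PER_1K = {"prompt": 0.5, "completion": 1.5}


def summarize(runs: list[NodeRun]) -> dict:
    """Aggregate token usage, estimated cost, and latency across node runs."""
    prompt = sum(r.prompt_tokens for r in runs)
    completion = sum(r.completion_tokens for r in runs)
    cost = (
        prompt * PRICE_PER_1K["prompt"] + completion * PRICE_PER_1K["completion"]
    ) / 1000
    return {
        "total_tokens": prompt + completion,
        "cost_usd": round(cost, 6),
        "total_latency_s": sum(r.latency_s for r in runs),
        "slowest_node": max(runs, key=lambda r: r.latency_s).node,
    }


runs = [
    NodeRun("generate", prompt_tokens=1000, completion_tokens=2000, latency_s=1.5),
    NodeRun("critique", prompt_tokens=1000, completion_tokens=0, latency_s=0.5),
]
print(summarize(runs))
```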

Artifact Generation

  • Visual designs automatically generate production-ready YAML/JSON configuration files
  • Generated files stored in the standard tasks/ directory structure for version control
  • Execution history preserved in .executions/ directory
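
One reason generated configuration suits version control is that it can be written deterministically, so repeated saves of the same design produce identical files. The sketch below shows that idea with JSON and the standard library. The file name `graph_config.json`, the per-task subdirectory, and the config keys are assumptions for illustration, not SyGra's documented schema or layout.

```python
import json
import tempfile
from pathlib import Path


def save_task_config(root: Path, task_name: str, config: dict) -> Path:
    """Write a config under <root>/tasks/<task_name>/ with stable key ordering."""
    # Hypothetical file name; the real artifact name is not given in the source.
    path = root / "tasks" / task_name / "graph_config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys keeps diffs small when the same design is saved again.
    text = json.dumps(config, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


# Hypothetical workflow description, written to a temporary directory.
config = {"nodes": ["generate", "critique"], "data_source": "example.csv"}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_task_config(Path(tmp), "code_assistant", config)
    print(saved.relative_to(tmp))
    print(saved.read_text(encoding="utf-8"))
```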

Developer Experience

The platform maintains API compatibility with existing SyGra workflows: anything built in Studio produces the same configuration format as a hand-written SyGra configuration. Users can preview generated artifacts in a code panel before execution, which keeps the underlying configuration transparent and under their control.

Getting Started

New workflows can be created by selecting data sources, configuring LLM nodes with prompts and output schemas, and chaining multiple generation steps through state variables. The included example demonstrates a code-assistant workflow that ingests the Glaive Code Assistant dataset and iteratively refines outputs through critique feedback loops.
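
The critique feedback loop described above can be sketched in plain Python as a generate step and a critique step that share a state dictionary. This is a minimal illustration under stated assumptions: the stub functions stand in for LLM nodes, and the state keys (`answer`, `critique`, `rounds`) and the "OK" stop signal are hypothetical rather than taken from the SyGra example.

```python
from typing import Callable

State = dict


def refine(
    state: State,
    generate: Callable[[State], str],
    critique: Callable[[State], str],
    max_rounds: int = 3,
) -> State:
    """Alternate generation and critique until the critic accepts or rounds run out."""
    state = dict(state)  # work on a copy so the caller's state is untouched
    for round_no in range(1, max_rounds + 1):
        state["answer"] = generate(state)
        state["critique"] = critique(state)
        state["rounds"] = round_no
        if state["critique"] == "OK":
            break
    return state


def stub_generate(state: State) -> str:
    """Stand-in for an LLM node: adds a docstring once the critic asks for one."""
    if "docstring" in state.get("critique", ""):
        return 'def add(a, b):\n    """Return a + b."""\n    return a + b'
    return "def add(a, b):\n    return a + b"


def stub_critique(state: State) -> str:
    """Stand-in for a critic node: accepts only answers that contain a docstring."""
    return "OK" if '"""' in state["answer"] else "Add a docstring."


result = refine({"question": "Write an add function."}, stub_generate, stub_critique)
print(result["rounds"])
print(result["answer"])
```

Capping the loop with `max_rounds` keeps a run bounded even when the critic never accepts an answer.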