# Workspaces

A **Workspace** is Sync's compute layer: a scalable cluster that handles all document processing, AI-powered queries, ingestion pipelines, and API access to your data. Workspaces are the engines that transform raw documents in dataspaces into AI-ready, searchable, queryable content.

## What is a Workspace?

Think of a workspace as a **dedicated compute cluster** that sits between your application and your data. While dataspaces store your content and metadata, workspaces do all the computational heavy lifting:

- **API Hosting**: Serve the REST APIs your applications use to interact with content
- **Document Processing**: Extract text, generate embeddings, create thumbnails
- **AI Operations**: Execute queries, run research agents, perform semantic search
- **Workflow Execution**: Orchestrate multi-step ingestion and extraction pipelines
- **Batch Operations**: Process large volumes of documents in parallel

Workspaces are **stateless**: all persistent data lives in dataspaces. This means you can scale workspaces up or down, create new ones, or destroy existing ones without any data loss.

## Key Characteristics

**Scalable Compute Clusters**: Each workspace is a fully managed compute cluster (typically running on Kubernetes) that can be scaled independently based on workload:

- Scale up for high-volume ingestion
- Scale down during idle periods
- Create dedicated workspaces for different teams or use cases
- No impact on data storage when scaling

**Multi-Dataspace Access**: A single workspace can access **multiple dataspaces**, subject to user permissions.
This allows you to:

- Query across multiple dataspaces simultaneously
- Move content between dataspaces
- Aggregate data from different departments or projects
- Apply consistent processing pipelines across dataspaces

**Network Isolation**: Workspaces run in isolated virtual private clouds (VPCs) with strict network segmentation:

- Each workspace has its own VPC (or shares a VPC with other workspaces in the same account)
- No direct network connectivity between different customer workspaces
- Encrypted connections to dataspaces and external services
- Private networking for all database and blob storage access

**Model Provider Flexibility**: By default, workspaces perform all AI computations **within the VPC** using self-hosted models. However, customers can optionally configure workspaces to use **external model providers** (e.g., OpenAI, Anthropic, Cohere) for specific operations:

- **Default (VPC-only)**: All models run within the VPC, so data never leaves your environment
- **External Providers**: Optionally send text to external APIs for embeddings, extractions, or queries
- **Bring Your Own Model (BYOM)**: Use your own API keys for external providers to control costs and usage

## Creating a Workspace

Workspaces are created via the Admin API and deployed in the cloud provider's infrastructure.
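
Because creation is asynchronous, a client normally sends the create request and then polls the workspace until its `clusterStatus` (covered under Workspace Status in this section) reaches `READY` or `ERROR`. The Python sketch below shows one way to structure that wait loop. It is an illustration, not part of a Sync SDK: `wait_for_workspace`, `fetch_status`, and the interval and timeout defaults are assumptions, and `fetch_status` stands in for whatever HTTP call your client makes to the workspace GET endpoint.

```python
import time
from typing import Callable, Dict

# States after which polling should stop, taken from the status list in this page.
TERMINAL_STATES = {"READY", "ERROR"}


def wait_for_workspace(
    fetch_status: Callable[[], Dict[str, str]],
    interval_s: float = 30.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until clusterStatus is READY or ERROR, or raise TimeoutError.

    fetch_status is a hypothetical callable that returns the parsed JSON
    body of the workspace GET request (a dict containing "clusterStatus").
    """
    waited = 0.0
    while True:
        status = fetch_status()["clusterStatus"]
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"workspace still {status} after {waited:.0f}s")
        # CREATING and UPDATING are transitional, so wait and check again.
        sleep(interval_s)
        waited += interval_s
```

Passing `fetch_status` and `sleep` in as arguments keeps the loop testable without network access. In a real client, `fetch_status` would issue the authenticated GET request and return the decoded response body.
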
### Example: Create a Workspace

```bash
POST https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces
Authorization: Bearer {token}
Content-Type: application/json

{
  "name": "Production Workspace",
  "description": "Primary workspace for production applications",
  "plan": "us-east-1",
  "resources": {
    "nodeCount": 3,
    "nodeSize": "medium"
  }
}
```

**Response**:

```json
{
  "id": "sws-12345678-abcd-1234-efgh-123456789012",
  "accountId": "acc-98765432-dcba-4321-hgfe-987654321098",
  "name": "Production Workspace",
  "description": "Primary workspace for production applications",
  "plan": "us-east-1",
  "clusterStatus": "CREATING",
  "apiUrl": "https://sws-12345678.cloud.syncdocs.ai",
  "publicDbURL": "postgres://user:pass@host:5432/workspace_sws_12345678",
  "createdAt": "2024-10-28T14:30:00Z",
  "updatedAt": "2024-10-28T14:30:00Z"
}
```

**Parameters**:

- `name` (required): Human-readable name for the workspace
- `description` (optional): Description of the workspace's purpose
- `plan` (optional): Cloud region for deployment (e.g., `us-east-1`, `eu-west-1`)
- `resources` (optional): Compute resource configuration
  - `nodeCount`: Number of compute nodes (default: 2)
  - `nodeSize`: Size of each node (`small`, `medium`, `large`)

### Workspace Status

Workspace creation is asynchronous and typically takes 5 to 10 minutes. The `clusterStatus` field indicates the current state:

- `CREATING`: Cluster infrastructure is being provisioned
- `READY`: Workspace is fully operational
- `UPDATING`: Configuration changes are being applied
- `ERROR`: Provisioning or configuration failed

You can poll the workspace status:

```bash
GET https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Authorization: Bearer {token}
```

## Using a Workspace

Once a workspace is `READY`, you access it via its dedicated API URL.

### Workspace API URL

Each workspace has a unique subdomain-based URL:

```
https://{workspaceId}.cloud.syncdocs.ai
```

All Workspace API operations use this base URL.
For example:

```bash
# Upload content to a dataspace
POST https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}

# Query content with AI
POST https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}/query

# List content in a dataspace
GET https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}
```

### Access Control

Workspaces enforce object-level access control. When you call a workspace API with your authentication token, the workspace:

1. Validates your token with the control plane
2. Determines which objects, and therefore which dataspaces, your API request is trying to access
3. Restricts the operation to the dataspaces you have permission to access

This means:

- ✅ Users can only access dataspaces they have permissions for
- ✅ Multiple users can share the same workspace with different access levels
- ✅ Workspace compute resources are shared, but data access is isolated per user

## Model Providers & API Keys

Workspaces offer flexible options for AI model usage.

### Default: VPC-Only Compute

By default, all AI operations run **within the VPC**:

- Text extraction, chunking, and embedding generation happen locally
- No data leaves your VPC
- No external API calls for document processing
- Ideal for compliance, data sovereignty, and cost control

### Optional: External Model Providers

You can configure a workspace to use **external model providers** (OpenAI, Anthropic, Cohere, etc.) for specific operations:

- **Embeddings**: Use OpenAI's `text-embedding-3-small` for vector generation
- **Query Agents**: Use GPT-4 or Claude for complex research queries
- **Data Extraction**: Use advanced models for structured data extraction

**Bring Your Own Model (BYOM)**: Customers can provide their own API keys for external providers.

**Privacy Considerations**: When using external providers, the workspace sends text content and any metadata to the external API. Raw files remain within the VPC.
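
To make the provider rules in this section concrete, the following Python sketch models the per-operation choice between in-VPC models and an external provider. It is purely illustrative: `ModelRouting`, `ProviderChoice`, the operation names, and the `api_key_env` field are hypothetical and do not reflect Sync's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

VPC = "vpc"  # default: self-hosted models running inside the VPC


@dataclass
class ProviderChoice:
    """Which provider handles one kind of operation (hypothetical shape)."""
    provider: str = VPC
    model: Optional[str] = None
    # BYOM: name of an environment variable holding the customer's own API key.
    api_key_env: Optional[str] = None


@dataclass
class ModelRouting:
    """Per-operation overrides; anything not listed stays in the VPC."""
    overrides: Dict[str, ProviderChoice] = field(default_factory=dict)

    def resolve(self, operation: str) -> ProviderChoice:
        # Fall back to the VPC-only default when no override is configured.
        return self.overrides.get(operation, ProviderChoice())

    def leaves_vpc(self, operation: str) -> bool:
        # True when text and metadata for this operation go to an external API.
        return self.resolve(operation).provider != VPC


if __name__ == "__main__":
    routing = ModelRouting({
        "embeddings": ProviderChoice("openai", "text-embedding-3-small", "OPENAI_API_KEY"),
    })
    for op in ("embeddings", "query", "extraction"):
        print(op, "-> external" if routing.leaves_vpc(op) else "-> in-VPC")
```

Modelling the default as "no override" mirrors the behavior described here: an operation only sends text to an external API when it has been explicitly configured to do so.
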
## Scaling Workspaces

Workspaces can be scaled independently of dataspaces:

**Vertical Scaling** (Resize Nodes):

```bash
PATCH https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Content-Type: application/json

{
  "resources": {
    "nodeSize": "large"
  }
}
```

**Horizontal Scaling** (Add Nodes):

```bash
PATCH https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Content-Type: application/json

{
  "resources": {
    "nodeCount": 5
  }
}
```

**Use Cases for Scaling**:

- Increase capacity during bulk ingestion
- Scale down during off-peak hours to reduce costs
- Temporarily boost resources for complex research queries
- Create dedicated high-performance workspaces for critical applications

## Multiple Workspaces

Organizations often create multiple workspaces for different purposes:

**Development vs. Production**:

- `dev-workspace`: Testing new workflows and ontologies
- `prod-workspace`: Production applications with SLA guarantees

**Department-Specific Workspaces**:

- `hr-workspace`: Optimized for HR department use cases
- `legal-workspace`: Configured for legal team workflows

**Geographic Distribution**:

- `us-workspace`: Deployed in US region for low-latency US access
- `eu-workspace`: Deployed in EU region for GDPR compliance

Each workspace can access the same dataspaces (subject to permissions), but provides isolated compute resources.

## Next Steps

- [Understand Dataspaces](/concepts/dataspaces) - Learn how workspaces access dataspace content
- [Explore Content](/concepts/content) - Upload and process content via workspaces
- [Learn about Queries](/concepts/queries) - Use workspace query engines
- [API Reference](/api) - Complete API documentation for workspaces