# Workspaces

A **Workspace** is Sync's compute layer—a scalable cluster that handles all document processing, AI-powered queries, ingestion pipelines, and API access to your data. Workspaces are the engines that transform raw documents in dataspaces into AI-ready, searchable, queryable content.

## What is a Workspace?

Think of a workspace as a **dedicated compute cluster** that sits between your application and your data. While dataspaces store your content and metadata, workspaces do all the computational heavy lifting:

- **API Hosting**: Serve the REST APIs your applications use to interact with content
- **Document Processing**: Extract text, generate embeddings, create thumbnails
- **AI Operations**: Execute queries, run research agents, perform semantic search
- **Workflow Execution**: Orchestrate multi-step ingestion and extraction pipelines
- **Batch Operations**: Process large volumes of documents in parallel


Workspaces are **stateless**—all persistent data lives in dataspaces. This means you can scale workspaces up or down, create new ones, or destroy existing ones without any data loss.

## Key Characteristics

**Scalable Compute Clusters**:
Each workspace is a fully managed compute cluster (typically running on Kubernetes) that can be scaled independently based on workload:

- Scale up for high-volume ingestion
- Scale down during idle periods
- Create dedicated workspaces for different teams or use cases
- No impact on data storage when scaling


**Multi-Dataspace Access**:
A single workspace can access **multiple dataspaces**, subject to user permissions. This allows you to:

- Query across multiple dataspaces simultaneously
- Move content between dataspaces
- Aggregate data from different departments or projects
- Apply consistent processing pipelines across dataspaces


**Network Isolation**:
Workspaces run in isolated virtual private clouds (VPCs) with strict network segmentation:

- Each workspace has its own VPC (or shares a VPC with other workspaces in the same account)
- No direct network connectivity between different customer workspaces
- Encrypted connections to dataspaces and external services
- Private networking for all database and blob storage access


**Model Provider Flexibility**:
By default, workspaces perform all AI computations **within the VPC** using self-hosted models. Customers can also configure workspaces to use **external model providers** (e.g., OpenAI, Anthropic, Cohere) for specific operations:

- **Default (VPC-only)**: All models run within the VPC—data never leaves your environment
- **External Providers**: Optionally send text to external APIs for embeddings, extractions, or queries
- **Bring Your Own Model (BYOM)**: Use your own API keys for external providers to control costs and usage


## Creating a Workspace

Workspaces are created via the Admin API and deployed in the cloud provider's infrastructure.

### Example: Create a Workspace


```bash
POST https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces
Authorization: Bearer <token>
Content-Type: application/json

{
  "name": "Production Workspace",
  "description": "Primary workspace for production applications",
  "plan": "us-east-1",
  "resources": {
    "nodeCount": 3,
    "nodeSize": "medium"
  }
}
```

**Response**:


```json
{
  "id": "sws-12345678-abcd-1234-efgh-123456789012",
  "accountId": "acc-98765432-dcba-4321-hgfe-987654321098",
  "name": "Production Workspace",
  "description": "Primary workspace for production applications",
  "plan": "us-east-1",
  "clusterStatus": "CREATING",
  "apiUrl": "https://sws-12345678.cloud.syncdocs.ai",
  "publicDbURL": "postgres://user:pass@host:5432/workspace_sws_12345678",
  "createdAt": "2024-10-28T14:30:00Z",
  "updatedAt": "2024-10-28T14:30:00Z"
}
```

**Parameters**:

- `name` (required): Human-readable name for the workspace
- `description` (optional): Description of the workspace's purpose
- `plan` (optional): Cloud region for deployment (e.g., `us-east-1`, `eu-west-1`)
- `resources` (optional): Compute resource configuration
  - `nodeCount`: Number of compute nodes (default: 2)
  - `nodeSize`: Size of each node (`small`, `medium`, `large`)
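
The request above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name `build_create_workspace_request` is hypothetical, and the client-side check on `nodeSize` is an assumption based only on the values listed above. It builds the request without sending it.

```python
import json
from urllib.request import Request

# Admin API base URL, taken from the example request above.
ADMIN_BASE = "https://cloud.syncdocs.ai/api"
# Node sizes listed in the parameter reference above.
NODE_SIZES = {"small", "medium", "large"}


def build_create_workspace_request(account_id, token, name, description=None,
                                   plan=None, node_count=None, node_size=None):
    """Build (but do not send) the POST request that creates a workspace."""
    if not name:
        raise ValueError("name is required")
    if node_size is not None and node_size not in NODE_SIZES:
        raise ValueError(f"nodeSize must be one of {sorted(NODE_SIZES)}")

    # Only include optional fields the caller actually set, so that
    # server-side defaults (e.g. nodeCount) still apply.
    body = {"name": name}
    if description is not None:
        body["description"] = description
    if plan is not None:
        body["plan"] = plan
    resources = {}
    if node_count is not None:
        resources["nodeCount"] = node_count
    if node_size is not None:
        resources["nodeSize"] = node_size
    if resources:
        body["resources"] = resources

    return Request(
        url=f"{ADMIN_BASE}/accounts/{account_id}/workspaces",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response body.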


### Workspace Status

Workspace creation is asynchronous and typically takes 5-10 minutes. The `clusterStatus` field indicates the current state:

- `CREATING`: Cluster infrastructure is being provisioned
- `READY`: Workspace is fully operational
- `UPDATING`: Configuration changes are being applied
- `ERROR`: Provisioning or configuration failed


You can poll the workspace until `clusterStatus` reaches `READY` or `ERROR`:


```bash
GET https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Authorization: Bearer <token>
```
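
A polling loop around this request might look like the sketch below. It is illustrative only: `wait_until_ready` and its parameters are hypothetical names, the 15-minute timeout and 15-second interval are assumptions chosen to sit above the typical 5-10 minute creation time, and the HTTP call itself is injected as `fetch_workspace` so the loop stays independent of any particular HTTP library.

```python
import time


def wait_until_ready(fetch_workspace, timeout_s=900.0, interval_s=15.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until clusterStatus is READY; raise on ERROR or timeout.

    fetch_workspace is any zero-argument callable that returns the workspace
    JSON as a dict, e.g. a wrapper around the GET request shown above.
    """
    deadline = clock() + timeout_s
    while True:
        workspace = fetch_workspace()
        status = workspace.get("clusterStatus")
        if status == "READY":
            return workspace
        if status == "ERROR":
            raise RuntimeError(
                f"workspace {workspace.get('id')} failed to provision")
        # CREATING or UPDATING: keep waiting unless we are out of time.
        if clock() >= deadline:
            raise TimeoutError(
                f"workspace still {status} after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` parameters default to the standard library functions and exist so the loop can be exercised in tests without real delays.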

## Using a Workspace

Once a workspace is `READY`, you access it via its dedicated API URL.

### Workspace API URL

Each workspace has a unique subdomain-based URL, returned in the `apiUrl` field when the workspace is created:


```
https://{workspaceId}.cloud.syncdocs.ai
```

All Workspace API operations use this base URL. For example:


```bash
# Upload content to a dataspace
POST https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}

# Query content with AI
POST https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}/query

# List content in a dataspace
GET https://sws-12345678.cloud.syncdocs.ai/api/content/{dataspaceId}
```
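
These endpoint URLs follow one pattern, so a small helper can derive them from the workspace's `apiUrl`. This is a sketch under that assumption; `content_url` is a hypothetical helper name, and only the three paths shown above are taken from this page.

```python
from urllib.parse import quote


def content_url(api_url, dataspace_id, action=None):
    """Build a Workspace API content URL from the workspace's apiUrl.

    action is an optional trailing path segment such as "query".
    """
    # Percent-encode the dataspace ID so unexpected characters cannot
    # alter the path structure.
    url = f"{api_url.rstrip('/')}/api/content/{quote(dataspace_id, safe='')}"
    if action:
        url += f"/{quote(action, safe='')}"
    return url


base = "https://sws-12345678.cloud.syncdocs.ai"
upload_or_list = content_url(base, "my-dataspace")        # POST to upload, GET to list
query = content_url(base, "my-dataspace", action="query")  # POST to query
```

Reading the base URL from `apiUrl` avoids hard-coding the subdomain format in application code.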

### Access Control

Workspaces enforce object-level access control. When you call a workspace API with your authentication token, the workspace:

1. Validates your token with the control plane
2. Determines which objects, and therefore which dataspaces, your API request is trying to access
3. Restricts the operation to the dataspaces you have permission to access


This means:

- ✅ Users can only access dataspaces they have permissions for
- ✅ Multiple users can share the same workspace with different access levels
- ✅ Workspace compute resources are shared, but data access is isolated per user


## Model Providers & API Keys

Workspaces offer flexible options for AI model usage.

### Default: VPC-Only Compute

By default, all AI operations run **within the VPC**:

- Text extraction, chunking, and embedding generation happen locally
- No data leaves your VPC
- No external API calls for document processing
- Ideal for compliance, data sovereignty, and cost control


### Optional: External Model Providers

You can configure a workspace to use **external model providers** (OpenAI, Anthropic, Cohere, etc.) for specific operations:

- **Embeddings**: Use OpenAI's `text-embedding-3-small` for vector generation
- **Query Agents**: Use GPT-4 or Claude for complex research queries
- **Data Extraction**: Use advanced models for structured data extraction


**Bring Your Own Model (BYOM)**:
Customers can provide their own API keys for external providers.

**Privacy Considerations**:
When using external providers, the workspace sends text content and any metadata to the external API. Raw files remain within the VPC.

## Scaling Workspaces

Workspaces can be scaled independently of dataspaces:

**Vertical Scaling** (Resize Nodes):


```bash
PATCH https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Authorization: Bearer <token>
Content-Type: application/json

{
  "resources": {
    "nodeSize": "large"
  }
}
```

**Horizontal Scaling** (Add Nodes):


```bash
PATCH https://cloud.syncdocs.ai/api/accounts/{accountId}/workspaces/{workspaceId}
Authorization: Bearer <token>
Content-Type: application/json

{
  "resources": {
    "nodeCount": 5
  }
}
```
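
Both scaling requests share the same body shape, so the payload can be generated by one helper. The sketch below is a hedged example: `build_scale_patch` is a hypothetical name, the minimum of one node is an assumption, and whether `nodeCount` and `nodeSize` may be changed in a single PATCH is not stated on this page, so verify that before combining them.

```python
import json

# Node sizes listed in the workspace creation parameters.
NODE_SIZES = ("small", "medium", "large")


def build_scale_patch(node_count=None, node_size=None):
    """Return the JSON body for a scaling PATCH with only the changed fields."""
    resources = {}
    if node_count is not None:
        # Assumption: a workspace needs at least one node.
        if not isinstance(node_count, int) or node_count < 1:
            raise ValueError("nodeCount must be a positive integer")
        resources["nodeCount"] = node_count
    if node_size is not None:
        if node_size not in NODE_SIZES:
            raise ValueError(f"nodeSize must be one of {NODE_SIZES}")
        resources["nodeSize"] = node_size
    if not resources:
        raise ValueError("specify node_count, node_size, or both")
    return json.dumps({"resources": resources})


vertical = build_scale_patch(node_size="large")   # resize nodes
horizontal = build_scale_patch(node_count=5)      # add nodes
```

Send the resulting string as the body of the PATCH request, then poll `clusterStatus` until it leaves `UPDATING`.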

**Use Cases for Scaling**:

- Increase capacity during bulk ingestion
- Scale down during off-peak hours to reduce costs
- Temporarily boost resources for complex research queries
- Create dedicated high-performance workspaces for critical applications


## Multiple Workspaces

Organizations often create multiple workspaces for different purposes:

**Development vs. Production**:

- `dev-workspace`: Testing new workflows and ontologies
- `prod-workspace`: Production applications with SLA guarantees


**Department-Specific Workspaces**:

- `hr-workspace`: Optimized for HR department use cases
- `legal-workspace`: Configured for legal team workflows


**Geographic Distribution**:

- `us-workspace`: Deployed in US region for low-latency US access
- `eu-workspace`: Deployed in EU region for GDPR compliance


Different workspaces can access the same dataspaces (subject to permissions), but each one provides its own isolated compute resources.

## Next Steps

- [Understand Dataspaces](/concepts/dataspaces) - Learn how workspaces access dataspace content
- [Explore Content](/concepts/content) - Upload and process content via workspaces
- [Learn about Queries](/concepts/queries) - Use workspace query engines
- [API Reference](/api) - Complete API documentation for workspaces