
Getting Started with Sync

This quickstart walks through uploading a document and querying it with AI. In three API calls, you'll go from a local file to natural-language question answering over its content.

What You'll Do

  1. Upload a document to your dataspace
  2. Trigger ingestion to make it AI-ready
  3. Query your document using natural language

Time to complete: ~5 minutes

Prerequisites

For this quickstart, you'll need:

  • Account ID (format: scd-k2j8n4m1) - provided when you sign up
  • Workspace ID (format: sws-x9p3q7r5) - your compute cluster
  • Dataspace ID (format: sds-a1b2c3d4) - your data storage
  • Ontology configured - your dataspace should have an ontology with at least one category
  • API token - get this from the Sync Cloud web app

Don't have these yet? Follow the Account Setup Guide to create your workspace and dataspace.
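
As a minimal sketch, the prerequisites can be gathered in one place before you make any calls. The `SyncConfig` class and its field names are illustrative and not part of the Sync API. The ID patterns are guessed from the sample IDs above, and the base URL pattern is inferred from the example requests in this guide, so treat both as assumptions.

```python
import re
from dataclasses import dataclass

# Patterns guessed from the sample IDs in this guide (prefix plus 8 lowercase
# letters/digits). The real format may be looser.
ID_PATTERNS = {
    "account_id": re.compile(r"^scd-[a-z0-9]{8}$"),
    "workspace_id": re.compile(r"^sws-[a-z0-9]{8}$"),
    "dataspace_id": re.compile(r"^sds-[a-z0-9]{8}$"),
}


@dataclass(frozen=True)
class SyncConfig:
    """Hypothetical container for the quickstart prerequisites."""

    account_id: str
    workspace_id: str
    dataspace_id: str
    token: str

    def __post_init__(self) -> None:
        # Fail early on IDs that do not look like the documented samples.
        for field_name, pattern in ID_PATTERNS.items():
            value = getattr(self, field_name)
            if not pattern.match(value):
                raise ValueError(f"{field_name} looks malformed: {value!r}")

    @property
    def base_url(self) -> str:
        # Assumption: the API host is derived from the workspace ID,
        # as in the example requests.
        return f"https://{self.workspace_id}.syncdocs.ai/api"

    @property
    def auth_headers(self) -> dict:
        return {"Authorization": f"Bearer {self.token}"}


config = SyncConfig("scd-k2j8n4m1", "sws-x9p3q7r5", "sds-a1b2c3d4", "<your-token>")
print(config.base_url)
```

Validating the IDs up front turns a mistyped prefix into an immediate local error instead of a failed request later on.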


Step 1: Upload a Document

Let's upload a PDF document to your dataspace. We'll use a sample employee handbook for this example.

POST https://sws-x9p3q7r5.syncdocs.ai/api/content
Authorization: Bearer <your-token>
Content-Type: multipart/form-data

# Form fields:
file: <your-pdf-file>
dataspaceId: sds-a1b2c3d4
categoryId: 3fa85f64-5717-4562-b3fc-2c963f66afa6
fileName: employee-handbook-2024.pdf
fileFormat: application/pdf
metadata: {}

Response:

{
  "contentId": "7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
  "dataspaceId": "sds-a1b2c3d4",
  "categoryId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "fileName": "employee-handbook-2024.pdf",
  "fileFormat": "application/pdf",
  "metadata": {},
  "createdAt": "2025-01-30T10:00:00Z",
  "updatedAt": "2025-01-30T10:00:00Z"
}

✅ Your document is uploaded! Save the contentId - you'll need it for the next step.
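
The upload request above could be built in Python roughly as follows. This is a sketch using only the standard library. The helper names `encode_multipart` and `build_upload_request` are illustrative, and the form field names are copied from the example request rather than from a full API specification.

```python
import json
import uuid
import urllib.request


def encode_multipart(fields: dict, file_name: str, file_bytes: bytes, file_type: str):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"'.encode(),
        f"Content-Type: {file_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url, token, dataspace_id, category_id, file_name, file_bytes):
    """Build (but do not send) the POST /content upload request."""
    fields = {
        "dataspaceId": dataspace_id,
        "categoryId": category_id,
        "fileName": file_name,
        "fileFormat": "application/pdf",
        "metadata": json.dumps({}),
    }
    body, content_type = encode_multipart(fields, file_name, file_bytes, "application/pdf")
    return urllib.request.Request(
        f"{base_url}/content",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )


request = build_upload_request(
    "https://sws-x9p3q7r5.syncdocs.ai/api",
    "<your-token>",
    "sds-a1b2c3d4",
    "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "employee-handbook-2024.pdf",
    b"%PDF-1.7 placeholder bytes",  # in practice: open(path, "rb").read()
)
# To actually send it and keep the ID for Step 2:
#   with urllib.request.urlopen(request) as resp:
#       content_id = json.load(resp)["contentId"]
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the network.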


Step 2: Make It AI-Ready (Ingestion)

Now trigger ingestion to extract text, generate embeddings, and index your document for AI queries.

POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d/ingest?workflowId=456f1234-e89b-12d3-a456-426614174001
Authorization: Bearer <your-token>
Content-Type: application/json
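
A small sketch of how the ingestion URL above might be assembled from its parts. The function name `build_ingest_request` is illustrative. This guide does not say where the `workflowId` value comes from, so the sketch simply takes it as a parameter.

```python
import urllib.request
from urllib.parse import quote, urlencode


def build_ingest_request(base_url, token, dataspace_id, content_id, workflow_id):
    """Build (but do not send) the POST that triggers ingestion for one document."""
    path = f"/content/{quote(dataspace_id)}/{quote(content_id)}/ingest"
    query = urlencode({"workflowId": workflow_id})
    return urllib.request.Request(
        f"{base_url}{path}?{query}",
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


ingest_request = build_ingest_request(
    "https://sws-x9p3q7r5.syncdocs.ai/api",
    "<your-token>",
    "sds-a1b2c3d4",
    "7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
    "456f1234-e89b-12d3-a456-426614174001",
)
print(ingest_request.full_url)
```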

Ingestion usually takes 30 seconds to a few minutes depending on document size. Once complete, let's retrieve the content to see what Sync extracted:

GET https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d
Authorization: Bearer <your-token>

Response:

{
  "contentId": "7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
  "dataspaceId": "sds-a1b2c3d4",
  "categoryId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "fileName": "employee-handbook-2024.pdf",
  "fileFormat": "application/pdf",
  "createdAt": "2025-01-30T10:00:00Z",
  "updatedAt": "2025-01-30T10:02:30Z",  // Updated after ingestion
  "fileSize": 2458624,  // Extracted during ingestion
  "fileUrl": "https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d/employee-handbook-2024.pdf",  // Generated during ingestion
  "metadata": {
    // These fields were extracted by AI during ingestion based on your ontology's metadata queries
    "Document Type": "Employee Handbook",
    "Publication Year": 2024,
    "Department": "Human Resources",
    "Version": "3.2"
  },
  "inferenceTaskExecutions": {
    // Maps each metadata field to the AI task that extracted it (for audit/attribution)
    "Document Type": "9c0d1e2f-3a4b-5c6d-7e8f-9a0b1c2d3e4f",
    "Publication Year": "9c0d1e2f-3a4b-5c6d-7e8f-9a0b1c2d3e4f",
    "Department": "9c0d1e2f-3a4b-5c6d-7e8f-9a0b1c2d3e4f",
    "Version": "9c0d1e2f-3a4b-5c6d-7e8f-9a0b1c2d3e4f"
  }
}

✅ Your document is now AI-ready! Notice how Sync automatically:

  • Extracted structured metadata from the PDF
  • Generated a file URL for downloading
  • Tracked which AI tasks created each metadata field
  • Indexed the content for semantic search
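
Because ingestion is asynchronous, a client usually has to wait before the enriched record is available. The sketch below polls the GET endpoint until the record looks ingested. This guide does not describe a status field, so the completion check (a `fileUrl` plus non-empty `metadata`) is an assumption based on the sample response, and `wait_for_ingestion` is a hypothetical helper.

```python
import time
from typing import Callable


def wait_for_ingestion(
    fetch_content: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the content record looks ingested, or raise TimeoutError.

    Assumption: a record counts as ingested once it has a "fileUrl" and a
    non-empty "metadata" object, as in the sample response in this guide.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        record = fetch_content()
        if record.get("fileUrl") and record.get("metadata"):
            return record
        if time.monotonic() >= deadline:
            raise TimeoutError("ingestion did not finish in time")
        sleep(interval_s)


# Demo with canned responses standing in for the real GET call.
canned = iter([
    {"metadata": {}},
    {"metadata": {"Version": "3.2"}, "fileUrl": "https://example.invalid/handbook.pdf"},
])
record = wait_for_ingestion(lambda: next(canned), sleep=lambda _seconds: None)
print(record["metadata"])
```

In real use, `fetch_content` would issue the GET request shown in Step 2 and return the parsed JSON. If your ontology defines no metadata queries, the `metadata` check would never pass, so a different completion signal would be needed.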

Step 3: Query Your Document

Now the fun part! Ask questions about your document using natural language.

POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/query
Authorization: Bearer <your-token>
Content-Type: application/json

{
  "query": "What is the vacation policy for new employees?"
}

Response:

{
  "query": "What is the vacation policy for new employees?",
  "response": "According to the employee handbook, new employees receive the following vacation benefits:\n\n- **Year 1**: 10 days of paid vacation\n- **Years 2-5**: 15 days of paid vacation\n- **Years 6+**: 20 days of paid vacation\n\nVacation days accrue monthly and can be used after completing 90 days of employment. Unused vacation days can be rolled over up to a maximum of 5 days per year【0】.",
  "analyzedDocumentCount": 1,
  "citedContent": [
    {
      "contentId": "7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
      "dataspaceId": "sds-a1b2c3d4",
      "fileName": "employee-handbook-2024.pdf",
      "fileFormat": "application/pdf",
      "categoryId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
      "metadata": {},
      "createdAt": "2025-01-30T10:00:00Z",
      "updatedAt": "2025-01-30T10:02:30Z"
    }
  ],
  "webSearchResults": [],
  "citedLibraryPages": []
}

🎉 That's it! You've uploaded a document, made it AI-ready, and queried it with natural language. The AI found the relevant information, synthesized an answer, and provided citations.
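
The answer text carries inline markers such as 【0】 alongside a `citedContent` array. One plausible way to connect the two is sketched below. It assumes each marker number is a zero-based index into `citedContent`, which this guide does not confirm, and `resolve_citations` is a hypothetical helper.

```python
import re


def resolve_citations(result: dict) -> list:
    """Return (marker, fileName) pairs for each 【n】 marker in the answer.

    Assumption: the number inside the marker is a zero-based index into
    the "citedContent" array of the same response.
    """
    cited = result.get("citedContent", [])
    pairs = []
    for match in re.finditer(r"【(\d+)】", result.get("response", "")):
        index = int(match.group(1))
        if index < len(cited):
            pairs.append((match.group(0), cited[index]["fileName"]))
    return pairs


# Trimmed copy of the sample response from this step.
sample = {
    "response": "Unused vacation days can be rolled over up to a maximum of 5 days per year【0】.",
    "citedContent": [
        {
            "contentId": "7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
            "fileName": "employee-handbook-2024.pdf",
        }
    ],
}
print(resolve_citations(sample))
```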


What You Just Experienced

In three simple API calls, Sync:

  • ✅ Accepted your document in its native format (no preprocessing required)
  • ✅ Extracted and indexed text, tables, and metadata automatically
  • ✅ Generated embeddings for semantic search
  • ✅ Answered your question using AI with proper citations
  • ✅ Tracked everything in audit logs for compliance

All of this happened in your own infrastructure with complete data isolation.


Try More Queries

Your document is now AI-ready. Try asking follow-up questions:

# Ask a different question
{
  "query": "What are the requirements for remote work?"
}

# Ask a follow-up question (include conversationId from previous response)
{
  "query": "Are there any restrictions on where I can work remotely?",
  "conversationId": "conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

# Filter by metadata (if you have multiple documents)
{
  "query": "What changed in the 2024 policy updates?",
  "context": {
    "contentFilters": {
      "metadata": {
        "year": 2024,
        "documentType": "Policy Update"
      }
    }
  }
}
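
The three request bodies above share one shape, so they could be produced by a single helper. This is a sketch: `build_query_payload` is a hypothetical function, and the JSON keys are copied from the examples in this guide. The filter keys presumably need to match the metadata field names defined in your ontology, which is an assumption worth checking against the API reference.

```python
import json
from typing import Optional


def build_query_payload(
    query: str,
    conversation_id: Optional[str] = None,
    metadata_filters: Optional[dict] = None,
) -> dict:
    """Build the JSON body for POST /content/{dataspaceId}/query."""
    payload: dict = {"query": query}
    if conversation_id:
        # Continue an existing conversation for follow-up questions.
        payload["conversationId"] = conversation_id
    if metadata_filters:
        # Restrict the search to documents whose metadata matches.
        payload["context"] = {"contentFilters": {"metadata": dict(metadata_filters)}}
    return payload


follow_up = build_query_payload(
    "Are there any restrictions on where I can work remotely?",
    conversation_id="conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
)
filtered = build_query_payload(
    "What changed in the 2024 policy updates?",
    metadata_filters={"year": 2024, "documentType": "Policy Update"},
)
print(json.dumps(filtered, indent=2))
```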

What's Next?

Now that you've seen how easy it is to query documents with Sync, explore these guides to unlock more powerful features:

  • Learn how to provision workspaces and dataspaces, and configure your account for production use.
  • Automatically extract structured metadata from documents using AI-powered ontologies. Turn unstructured PDFs into queryable, structured data.
  • Create specialized AI agents with custom instructions, combine private documents with public reference libraries, and implement metadata-based access controls.


Core Concepts

Want to understand how Sync works under the hood?

  • Architecture - How Sync's three-tier architecture (control plane, data plane, compute plane) enables scalability and isolation
  • Content - How documents are processed, indexed, and versioned
  • Dataspaces - Data organization and isolation strategies
  • Workspaces - Compute clusters and how they scale
  • Ontologies - Define categories and metadata schemas for your content
  • Queries - Advanced query techniques and filtering strategies
  • Agents - Configure specialized AI assistants for different use cases

API Reference

For complete API documentation with all endpoints and parameters, see the API Reference.