Queries

In Sync, every AI-powered inference on unstructured content is modeled as a query, whether it answers a question, extracts data, summarizes documents, or classifies content. This single abstraction covers everything from single-document lookups to large-scale research agents that span millions of documents. Users can write queries in natural language without having to manage the underlying AI processing of each document.

What is a Query?

A query is a natural language question or instruction executed against your content. Unlike traditional database queries that operate on structured data with predefined schemas, Sync queries work directly on unstructured documents (PDFs, images, videos, etc.) and use AI to understand and extract information.

Examples of Queries:

  • "What is the effective date of this contract?"
  • "Summarize all customer feedback from Q1 2024"
  • "Find contracts with auto-renewal clauses worth over $500k"
  • "Extract the policyholder's age and medical history from this insurance application"
  • "Which permits were approved in the last 30 days?"

Every query in Sync runs through the same pipeline:

  1. Scope Selection: Determine which content to analyze (1 document, 100 documents, or millions)
  2. Context Retrieval: Fetch relevant text chunks using vector similarity search
  3. AI Execution: Process the query using an AI model with retrieved context
  4. Response Generation: Return structured answers with citations

This unified model means you can use the same API endpoint whether you're querying a single PDF or running a research agent across your entire knowledge base.
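
The Context Retrieval step above relies on vector similarity search. As a rough illustration of that idea only, and not of Sync's internal implementation, the sketch below ranks text chunks by cosine similarity to a query vector. The function names and the three-dimensional embeddings are made up for the example; real embeddings come from a model and have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec, chunks, k=2):
    """Return the k chunk texts most similar to the query vector.

    `chunks` is a list of (text, embedding) pairs.
    """
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy data: hypothetical chunks with hand-written embeddings.
chunks = [
    ("This agreement is effective as of March 1, 2024.", [0.9, 0.1, 0.0]),
    ("Either party may terminate with 30 days notice.", [0.1, 0.9, 0.1]),
    ("Payment is due within 45 days of invoice.", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]  # stand-in embedding for "What is the effective date?"
top = retrieve_top_k(query_vec, chunks, k=1)
print(top)
```

The retrieved chunks are what the AI Execution step would receive as context alongside the query text.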

Query Parameters

Queries are highly configurable through parameters that control their scope, context, and behavior.

Core Parameters

query (string, required): The natural language question or instruction.

{
  "query": "What is the total contract value?"
}

agentId (UUID, optional): Specifies which agent's instructions to use as the system prompt. Agents define the AI's behavior, personality, and domain expertise.

{
  "query": "Analyze this permit application",
  "agentId": "agent-550e8400-uuid"
}

### Query Context Parameters

**`contentFilters`**:
Filters which content from the dataspace is available for the query. Uses the same filtering syntax as content fetch operations.

**`libraries`**:
Array of library IDs to include as additional context. Libraries are pre-indexed external knowledge sources (e.g., legal codes, product documentation, research papers).

**`includeWebSearchResults`**:
Boolean flag to include real-time web search results as context. Useful for queries requiring up-to-date external information.
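
The parameters above can be assembled into a request body programmatically. The helper below is a minimal sketch, not part of any Sync SDK: `build_query_body` is a hypothetical name, and placing `includeWebSearchResults` inside `context` next to `contentFilters` and `libraries` is an assumption based on how these parameters are grouped on this page.

```python
import json

def build_query_body(query, agent_id=None, content_filters=None,
                     libraries=None, include_web_search=None):
    """Build a query request body, omitting parameters that were not set."""
    if not query or not query.strip():
        raise ValueError("query is required")

    body = {"query": query}
    if agent_id is not None:
        body["agentId"] = agent_id

    # Assumption: all three context parameters live under "context".
    context = {}
    if content_filters:
        context["contentFilters"] = content_filters
    if libraries:
        context["libraries"] = list(libraries)
    if include_web_search is not None:
        context["includeWebSearchResults"] = bool(include_web_search)
    if context:
        body["context"] = context
    return body

body = build_query_body(
    "Does this application comply with state underwriting requirements?",
    content_filters={"contentId": "app-550e8400-uuid"},
    libraries=["lib-insurance-regulations-uuid"],
    include_web_search=False,
)
print(json.dumps(body, indent=2))
```

Leaving unset parameters out of the body, rather than sending `null`, keeps the request close to the hand-written examples on this page.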

## Query Scope: From Single Documents to Millions

The same query model applies whether the scope is a single document or an entire dataspace; only the content filters change.

### Single Document Query

Query a specific document by filtering to one content ID:

```bash
POST https://sws-{workspaceId}.cloud.syncdocs.ai/api/content/{dataspaceId}/query
Authorization: Bearer <token>

{
  "query": "What is the effective date of this contract?",
  "context": {
    "contentFilters": {
      "contentId": "550e8400-e29b-41d4-a716-446655440000"
    }
  }
}
```

Use Case: Extract specific data from a single uploaded document, validate information, or generate a summary.
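
For readers calling the endpoint from code, the following sketch builds the same single-document request with Python's standard library. The URL pattern and `Authorization` header come from the example above; the function name, the `Content-Type` header on this endpoint, and the workspace ID and token values are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
import urllib.request

def build_query_request(workspace_id, dataspace_id, token, body):
    """Construct (but do not send) a POST request to the query endpoint."""
    url = (
        f"https://sws-{workspace_id}.cloud.syncdocs.ai"
        f"/api/content/{dataspace_id}/query"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the endpoint expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_query_request(
    "my-workspace",        # hypothetical workspace ID
    "sds-abc12345",
    "example-token",       # placeholder, not a real credential
    {
        "query": "What is the effective date of this contract?",
        "context": {
            "contentFilters": {
                "contentId": "550e8400-e29b-41d4-a716-446655440000"
            }
        },
    },
)
print(req.get_method(), req.full_url)

# To send the request:
# with urllib.request.urlopen(req) as resp:
#     answer = json.load(resp)
```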

Project-Scoped Query (100+ Documents)

Query all content in a specific project:

POST https://sws-{workspaceId}.cloud.syncdocs.ai/api/content/{dataspaceId}/query
Authorization: Bearer <token>

{
  "query": "What are the common themes across all customer feedback in this project?",
  "context": {
    "contentFilters": {
      "projectId": "proj-12345678-uuid"
    }
  }
}

Use Case: Research agent for a specific initiative, summarize findings across a curated set of documents, or analyze trends within a bounded scope.

Dataspace-Wide Query (Millions of Documents)

Query across your entire dataspace with metadata filters:

POST https://sws-{workspaceId}.cloud.syncdocs.ai/api/content/{dataspaceId}/query
Authorization: Bearer <token>

{
  "query": "Find all contracts with auto-renewal clauses expiring in 2025 where annual value exceeds $500,000",
  "agentId": "agent-contract-analysis-uuid",
  "context": {
    "contentFilters": {
      "categoryId": "cat-contract-uuid",
      "metadata": {
        "autoRenewal": true,
        "annualValue": { "gte": 500000 },
        "effectiveDate": { "between": ["2024-01-01", "2025-12-31"] }
      }
    }
  }
}

Use Case: Enterprise-wide research, compliance audits, large-scale data extraction, or building AI-powered analytics dashboards.
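
To make the metadata filter in this example easier to reason about, here is a small local model of how such a filter might be evaluated. Sync applies filters server-side; this sketch only illustrates one plausible reading, assuming that `gte` means "greater than or equal to", that `between` is inclusive on both ends, and that a plain value means equality. The `matches` function and the sample records are hypothetical.

```python
def matches(metadata, filters):
    """Return True if a metadata record satisfies every filter condition."""
    for field, condition in filters.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # Assumed operator semantics: gte is >=, between is inclusive.
            if "gte" in condition and not value >= condition["gte"]:
                return False
            if "between" in condition:
                low, high = condition["between"]
                if not low <= value <= high:
                    return False
        elif value != condition:
            return False
    return True

filters = {
    "autoRenewal": True,
    "annualValue": {"gte": 500000},
    # ISO 8601 date strings sort chronologically, so string comparison works.
    "effectiveDate": {"between": ["2024-01-01", "2025-12-31"]},
}

contract_a = {"autoRenewal": True, "annualValue": 750000, "effectiveDate": "2024-06-15"}
contract_b = {"autoRenewal": True, "annualValue": 120000, "effectiveDate": "2024-06-15"}

print(matches(contract_a, filters))
print(matches(contract_b, filters))
```

Filtering on precomputed metadata narrows the candidate set before any AI processing runs, which is what makes dataspace-wide queries practical.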

Query Logging

Sync automatically logs all query operations to provide visibility, auditability, and usage analytics.

Example Query Log:

{
  "id": "log-550e8400-uuid",
  "operationType": "QUERY",
  "operationScope": "single",
  "userId": "user-12345678-uuid",
  "contentId": "550e8400-e29b-41d4-a716-446655440000",
  "dataspaceId": "sds-abc12345",
  "query": {
    "method": "POST",
    "path": "/api/content/sds-abc12345/query",
    "body": {
      "query": "What is the effective date?",
      "context": { "contentFilters": { "contentId": "550e8400..." } }
    }
  },
  "queriedAt": "2024-10-28T14:30:00Z"
}

Use Cases for Query Logs

  • Compliance: Track who accessed what documents and when
  • Usage Analytics: Understand which queries are most common
  • Cost Attribution: Allocate AI compute costs to departments or users
  • Security Auditing: Detect unusual access patterns
  • Quality Improvement: Identify poorly performing queries to refine ontologies
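
As one example of the usage-analytics and cost-attribution use cases, the sketch below counts queries per user from a list of log entries shaped like the example log above. How logs are retrieved is not covered here, so the entries are passed in as plain dictionaries; the function names and sample entries are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

def parse_queried_at(timestamp):
    """Parse a timestamp such as 2024-10-28T14:30:00Z into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def queries_per_user(logs, since=None):
    """Count QUERY operations per userId, optionally only those at or after `since`."""
    counts = Counter()
    for entry in logs:
        if entry.get("operationType") != "QUERY":
            continue
        if since is not None and parse_queried_at(entry["queriedAt"]) < since:
            continue
        counts[entry["userId"]] += 1
    return counts

# Hypothetical log entries, reduced to the fields used here.
logs = [
    {"operationType": "QUERY", "userId": "user-a", "queriedAt": "2024-10-28T14:30:00Z"},
    {"operationType": "QUERY", "userId": "user-a", "queriedAt": "2024-09-02T09:00:00Z"},
    {"operationType": "QUERY", "userId": "user-b", "queriedAt": "2024-10-29T08:15:00Z"},
]

october = datetime(2024, 10, 1, tzinfo=timezone.utc)
counts = queries_per_user(logs, since=october)
print(counts.most_common())
```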

Connecting Queries to Other Concepts

Queries integrate with several other Sync concepts to create powerful, composable AI workflows.

Queries + Ontologies: Precomputed Metadata Extraction

Ontologies define metadata queries that run automatically during content ingestion. These are queries too: they are executed in batch, and their results are appended as metadata to each content item.

Example: Legal contract ontology with precomputed queries:

{
  "category": "Contract",
  "metadataQueries": [
    {
      "name": "Effective Date",
      "instructions": "Extract the effective date in ISO 8601 format"
    },
    {
      "name": "Contracting Parties",
      "instructions": "List all parties to the contract"
    }
  ]
}

During ingestion, these queries run on every uploaded contract, extracting structured metadata. Later, you can query that extracted metadata:

POST /api/content/{dataspaceId}/query
{
  "query": "Show me all contracts effective in Q1 2025",
  "context": {
    "contentFilters": {
      "categoryId": "cat-contract-uuid",
      "metadata": {
        "effectiveDate": { "between": ["2025-01-01", "2025-03-31"] }
      }
    }
  }
}
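
Date ranges like the Q1 2025 filter above are easy to get wrong by hand. The helper below is a small sketch that computes the inclusive first and last day of a calendar quarter in ISO 8601 format, matching the `between` values shown in the request. `quarter_range` is a hypothetical helper, not part of the Sync API.

```python
import calendar
from datetime import date

def quarter_range(year, quarter):
    """Return [first_day, last_day] of a calendar quarter as ISO 8601 strings."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    last_day = calendar.monthrange(year, last_month)[1]
    return [
        date(year, first_month, 1).isoformat(),
        date(year, last_month, last_day).isoformat(),
    ]

content_filters = {
    "categoryId": "cat-contract-uuid",
    "metadata": {
        "effectiveDate": {"between": quarter_range(2025, 1)},
    },
}
print(content_filters)
```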

Queries + Libraries: External Knowledge Context

Libraries are pre-indexed external knowledge sources that can be included as context in queries.

Example: Insurance application processing with legal context

Setup:

  1. Create a library of relevant regulations:
POST https://cloud.syncdocs.ai/api/accounts/{accountId}/libraries
Authorization: Bearer <token>
Content-Type: application/json

{
  "name": "Insurance Regulations",
  "rootUrl": "https://insurance.gov/",
  "urlFilter": "https://insurance\\.gov/(regulations|compliance)/.*"
}
  2. Query insurance applications with regulatory context:
POST /api/content/{dataspaceId}/query
{
  "query": "Does this application comply with state underwriting requirements?",
  "context": {
    "contentFilters": {
      "categoryId": "cat-insurance-application-uuid",
      "contentId": "app-550e8400-uuid"
    },
    "libraries": ["lib-insurance-regulations-uuid"]
  }
}

The AI has access to both the application document and the indexed regulatory library, enabling it to answer compliance questions accurately.
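
Before creating a library, it can help to check which URLs a `urlFilter` pattern would admit. The sketch below tests the pattern from the setup step against a few sample URLs using Python's `re` module. It assumes the filter must match the whole URL and that Sync's regex dialect behaves like Python's for this pattern; neither is confirmed here, and the sample URLs are invented.

```python
import json
import re

# The JSON payload escapes the backslash, so "\\." decodes to the regex "\.".
payload = r'{"urlFilter": "https://insurance\\.gov/(regulations|compliance)/.*"}'
pattern = re.compile(json.loads(payload)["urlFilter"])

def url_in_scope(url):
    """Return True if the whole URL matches the filter (assumed full-match semantics)."""
    return pattern.fullmatch(url) is not None

samples = [
    "https://insurance.gov/regulations/title-1",
    "https://insurance.gov/compliance/checklists/2024",
    "https://insurance.gov/news/press-release",
]
for url in samples:
    print(url, url_in_scope(url))
```

Escaping the dot matters: an unescaped `.` would also match hosts such as `insuranceXgov`.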

Queries + Agents: Reusable Query Behavior

Agents package together instructions, context, and behavior into reusable query configurations.

Example: Insurance Underwriting Agent

Create the Agent:

POST /api/accounts/{accountId}/agents
{
  "name": "Insurance Underwriting Agent",
  "description": "Analyzes insurance applications for underwriting decisions",
  "instructions": "You are an expert insurance underwriter. Analyze insurance applications thoroughly, checking for completeness, accuracy, and compliance with underwriting guidelines. Extract key risk factors including applicant age, medical history, occupation, and lifestyle factors. Cross-reference application details with regulatory requirements and flag any inconsistencies or missing information. Provide clear recommendations on whether to approve, deny, or request additional information.",
  "defaultContext": {
    "libraries": ["lib-insurance-regulations-uuid", "lib-underwriting-guidelines-uuid"],
    "includeWebSearchResults": false
  }
}

Use the Agent:

POST /api/content/{dataspaceId}/query
{
  "query": "Analyze this insurance application and provide an underwriting recommendation",
  "agentId": "agent-underwriting-uuid",
  "context": {
    "contentFilters": {
      "contentId": "app-550e8400-uuid"
    }
  }
}

Response:

{
  "query": "Analyze this insurance application...",
  "response": "Based on analysis of the application:\n\n**Applicant Profile**:\n- Age: 42 years old\n- Occupation: Office worker (low risk)\n- Medical history: Controlled hypertension, no other conditions\n- Lifestyle: Non-smoker, exercises regularly\n\n**Compliance Check**: ✅ All required fields completed per state regulations (ref: Insurance Regulations Library)\n\n**Risk Assessment**: Medium-low risk profile\n\n**Recommendation**: APPROVE with standard premium tier. Medical underwriting notes that controlled hypertension poses minimal risk given applicant's age and lifestyle.",
  "analyzedDocumentCount": 1,
  "citedContent": ["app-550e8400-uuid"],
  "citedLibraryPages": [
    "lib-insurance-regulations-uuid/page-14",
    "lib-underwriting-guidelines-uuid/page-8"
  ]
}

The agent automatically applies its domain expertise (underwriting), uses its configured libraries (regulations + guidelines), and provides a structured, actionable recommendation.
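
Client code will often want to group the citations in a response by library. The sketch below does that for the `citedLibraryPages` field, assuming each entry has the `libraryId/page` shape shown in the sample response. The function name is hypothetical, and entries without a slash are skipped rather than guessed at.

```python
def group_library_citations(response):
    """Group citedLibraryPages entries into {library_id: [page, ...]}."""
    grouped = {}
    for ref in response.get("citedLibraryPages", []):
        library_id, separator, page = ref.rpartition("/")
        if not separator:
            continue  # not in the assumed "libraryId/page" form
        grouped.setdefault(library_id, []).append(page)
    return grouped

# Trimmed copy of the sample response above.
response = {
    "analyzedDocumentCount": 1,
    "citedContent": ["app-550e8400-uuid"],
    "citedLibraryPages": [
        "lib-insurance-regulations-uuid/page-14",
        "lib-underwriting-guidelines-uuid/page-8",
    ],
}

citations = group_library_citations(response)
for library_id, pages in citations.items():
    print(library_id, pages)
```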

Complete Workflow: Insurance Application Processing

Combining all concepts:

  1. Ontology: Defines categories ("Insurance Application") and metadata queries ("Applicant Age", "Medical History", "Occupation")
  2. Ingestion: Applications are uploaded and precomputed queries extract structured metadata
  3. Libraries: External regulations and underwriting guidelines are indexed
  4. Agent: Underwriting agent is configured with domain expertise and library context
  5. Query: Underwriter asks the agent to analyze an application
  6. Result: Agent provides a recommendation based on extracted metadata, document text, and external regulatory context

This workflow turns a manual underwriting review that can take hours into an AI-powered analysis that completes in seconds, while maintaining auditability and compliance.

Next Steps