This guide shows you how to build an AI-powered legal research agent for a fictitious San Francisco law firm. You'll create an agent that can answer questions by searching across private client case files AND public legal reference libraries—demonstrating how Sync's unified platform enables you to combine private and public data with built-in governance, privacy controls, and comprehensive audit trails.
You'll build an "SF Legal Research Assistant" that paralegals and attorneys can query about client cases while automatically pulling relevant California state statutes and San Francisco municipal regulations. The agent will:
- Search across private client documents with metadata-based filters
- Incorporate authoritative legal reference libraries
- Maintain conversation history automatically
- Provide full citation tracking for compliance and audit trails
In this guide, you'll learn:
- How to create Libraries for external reference content (laws, regulations, public documentation)
- How to create an Agent with custom instructions and default context
- How to link libraries to agents for expanded search scope
- How to query agents with metadata filters to scope queries to specific clients or case types
- How automatic conversation history enables multi-turn research sessions
- How to access query logs for compliance and audit purposes
- Why this architecture uniquely enables privacy, governance, and ease of use
Before starting, make sure you have:
- Completed the Account Setup Guide
- An active workspace (e.g., sws-x9p3q7r5)
- A dataspace (e.g., sds-a1b2c3d4) containing client case documents
- Client documents already uploaded and categorized with metadata including:
  - clientId (e.g., "CLIENT-2024-789")
  - caseNumber (e.g., "CASE-2024-1523")
  - caseType (e.g., "Eviction Defense", "Tenant Rights", "Housing Dispute")
  - jurisdiction (e.g., "San Francisco", "California")
- Your authentication token ready
- Your Account ID (format: scd-k2j8n4m1)
Note: if an ontology is configured, Sync adds this metadata automatically.
Before we dive into building, let's briefly understand the key concepts:
A Library allows you to ingest and index external content (websites, documentation, regulations) that should be available to agents as read-only reference material. Unlike dataspaces which store YOUR private documents, libraries store PUBLIC or LICENSED content that provides context for queries.
Why Use Libraries?
- Separation of concerns: Keep client data (dataspace) separate from reference materials (libraries)
- Reusability: One library can be used by multiple agents across different dataspaces
- Governance: Libraries have different access controls than dataspaces—they're not tied to specific clients or cases
An Agent is a configured AI assistant with:
- Instructions: Specific behavior, tone, domain expertise, and guidelines
- Default context: Pre-configured libraries and search settings
- Permissions: Implicitly defined by the dataspaces to which the querying user has access
Why Use Agents?
- Customization: Different agents for different use cases (research, classification, extraction)
- Access control: Agents act as a permission and behavior layer—users query agents, agents query data
- Consistency: Same agent instructions across all queries ensure predictable, high-quality behavior
- Audit trails: All queries are logged with agent ID for governance and compliance
- Domain expertise: Encode specialized knowledge into agent instructions (legal, medical, financial, etc.)
Sync automatically manages conversation history for multi-turn dialogues:
- Each new conversation gets a unique conversationId
- Pass the conversationId to continue a conversation thread
- Sync intelligently manages token limits and context summarization
- Full conversation threads are stored and auditable
Why This Matters:
- User experience: Natural, ChatGPT-like conversations without client-side state management
- Efficiency: No need to manually pass chat history in your application
- Governance: Full conversation threads are logged and auditable for compliance
- Token optimization: Sync handles context window management automatically
Let's create a library that indexes California state law for your legal research agent.
POST https://cloud.syncdocs.ai/api/accounts/scd-k2j8n4m1/libraries
Authorization: Bearer <your-token>
Content-Type: application/json
{
"name": "California State Law",
"rootUrl": "https://leginfo.legislature.ca.gov/",
"urlFilter": "https://leginfo\\.legislature\\.ca\\.gov/(faces/codes|statutes)/.*"
}
Request Parameters:
- name (required): Descriptive name for the library
- rootUrl (required): Base URL to crawl and index
- urlFilter (optional): Regular expression to limit which URLs are indexed (prevents crawling the entire domain)
Response:
{
"id": "c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"name": "California State Law",
"rootUrl": "https://leginfo.legislature.ca.gov/",
"urlFilter": "https://leginfo\\.legislature\\.ca\\.gov/(faces/codes|statutes)/.*",
"accountId": "scd-k2j8n4m1",
"createdAt": "2025-01-25T10:00:00Z",
"lastUpdatedAt": null
}
What Happens Next:
- Sync begins crawling the specified URL automatically
- Web pages are downloaded, text is extracted, and embeddings are generated
- Content is indexed for semantic search
- This process runs asynchronously in the background (can take minutes to hours depending on site size)
Save the library id (c7f85f64-9821-4a62-b8fc-1c963f66afa6) - you'll need it when configuring the agent.
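Because urlFilter decides which pages get crawled, it can be worth testing the pattern locally before you create the library. The sketch below is one way to do that with Python's re module. It assumes the filter must match the whole URL, which this guide does not state, so treat the results as a sanity check, not a guarantee of how Sync applies the pattern. The sample URLs are made up for illustration.

```python
import re

# The pattern after JSON decoding: "\\." in the JSON body becomes "\." in the regex.
URL_FILTER = r"https://leginfo\.legislature\.ca\.gov/(faces/codes|statutes)/.*"


def is_in_scope(url: str, pattern: str = URL_FILTER) -> bool:
    """Return True if the entire URL matches the filter pattern.

    ASSUMPTION: full-match semantics. Sync's actual matching rule is not
    documented in this guide.
    """
    return re.fullmatch(pattern, url) is not None


# Hypothetical sample URLs, used only to exercise the pattern.
samples = [
    "https://leginfo.legislature.ca.gov/faces/codes/civ.xhtml",
    "https://leginfo.legislature.ca.gov/statutes/2024/ch1.html",
    "https://leginfo.legislature.ca.gov/faces/billSearchClient.xhtml",
    "https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml",
]

for url in samples:
    print(is_in_scope(url), url)
```

Under this assumption the first two URLs are in scope and the last two are not. The last case is worth noticing: the pattern requires a `/` immediately after `faces/codes`, so a `faces/codes_displaySection.xhtml` page would be excluded. Check the pattern against the pages you actually want indexed.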
Now create a second library for San Francisco-specific regulations:
POST https://cloud.syncdocs.ai/api/accounts/scd-k2j8n4m1/libraries
Authorization: Bearer <your-token>
Content-Type: application/json
{
"name": "San Francisco Municipal Code",
"rootUrl": "https://codelibrary.amlegal.com/codes/san_francisco/",
"urlFilter": "https://codelibrary\\.amlegal\\.com/codes/san_francisco/.*"
}
Response:
{
"id": "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",
"name": "San Francisco Municipal Code",
"rootUrl": "https://codelibrary.amlegal.com/codes/san_francisco/",
"urlFilter": "https://codelibrary\\.amlegal\\.com/codes/san_francisco/.*",
"accountId": "scd-k2j8n4m1",
"createdAt": "2025-01-25T10:05:00Z",
"lastUpdatedAt": null
}
You now have two libraries that will provide legal reference context to your research agent:
- ✅ California State Law (c7f85f64-9821-4a62-b8fc-1c963f66afa6)
- ✅ San Francisco Municipal Code (d8a96g75-1932-5b73-c9gd-2d074g77bgb7)
Now let's create an agent designed for legal research, give it instructions that encode best practices, and link it to both libraries.
POST https://cloud.syncdocs.ai/api/accounts/scd-k2j8n4m1/agents
Authorization: Bearer <your-token>
Content-Type: application/json
{
"name": "SF Legal Research Assistant",
"description": "AI assistant specializing in California and San Francisco law for client case research",
"instructions": "You are an expert legal research assistant specializing in California state law and San Francisco municipal law. Your role is to help attorneys and paralegals research client cases.\n\nGuidelines:\n1. Always cite specific statutes, codes, or ordinances with section numbers\n2. Clearly distinguish between information from client case files vs. legal reference sources\n3. When answering questions about specific clients, prioritize documents from that client's files\n4. Provide practical guidance relevant to San Francisco jurisdiction\n5. If legal precedent or statutory interpretation is unclear, explicitly state the ambiguity\n6. Recommend consulting an attorney for case-specific legal advice when appropriate\n7. Use formal, professional language suitable for legal documentation\n\nYou have access to:\n- Client case files (contracts, correspondence, filings, evidence)\n- California state statutes and codes\n- San Francisco municipal ordinances and regulations",
"defaultContext": {
"libraries": [
"c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"d8a96g75-1932-5b73-c9gd-2d074g77bgb7"
],
"includeWebSearchResults": false
}
}
Request Parameters:
- name (required): Display name for the agent
- description (required): Short description of the agent's purpose
- instructions (required): Detailed system prompt that defines the agent's behavior, tone, expertise, and guidelines
- defaultContext.libraries (optional): Array of library IDs to include in every query by default
- defaultContext.includeWebSearchResults (optional): Whether to augment queries with web search results (default: false)
Response:
{
"agentId": "550e8400-e29b-41d4-a716-446655440000",
"accountId": "scd-k2j8n4m1",
"name": "SF Legal Research Assistant",
"description": "AI assistant specializing in California and San Francisco law for client case research",
"instructions": "You are an expert legal research assistant specializing in California state law and San Francisco municipal law...",
"defaultContext": {
"libraries": [
"c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"d8a96g75-1932-5b73-c9gd-2d074g77bgb7"
],
"includeWebSearchResults": false
},
"createdBy": "123e4567-e89b-12d3-a456-426614174001",
"createdAt": "2025-01-25T10:15:00Z",
"lastUpdatedAt": null
}
What You've Configured:
- ✅ An agent with legal expertise encoded in its instructions
- ✅ Default access to both CA state law and SF municipal code libraries
- ✅ Professional tone and behavior guidelines for legal research
- ✅ Clear citation and source attribution requirements
Save the agentId (550e8400-e29b-41d4-a716-446655440000) - you'll use it for all queries.
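If you create agents from a script instead of a REST client, the request above could be assembled roughly as follows. This is a minimal sketch using only the Python standard library. The helper name build_create_agent_request is hypothetical, and the function only builds the request object; it does not send it. The endpoint, headers, and field names are copied from the request shown in this step.

```python
import json
import urllib.request

# Base URL taken from the account-level requests shown in this guide.
API_BASE = "https://cloud.syncdocs.ai/api"


def build_create_agent_request(account_id, token, name, description,
                               instructions, library_ids):
    """Build (but do not send) the POST request that creates an agent."""
    payload = {
        "name": name,
        "description": description,
        "instructions": instructions,
        "defaultContext": {
            "libraries": list(library_ids),
            "includeWebSearchResults": False,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/accounts/{account_id}/agents",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_agent_request(
    account_id="scd-k2j8n4m1",
    token="<your-token>",
    name="SF Legal Research Assistant",
    description="AI assistant specializing in California and San Francisco law",
    instructions="You are an expert legal research assistant...",
    library_ids=[
        "c7f85f64-9821-4a62-b8fc-1c963f66afa6",
        "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",
    ],
)
print(request.get_method(), request.full_url)
```

Keeping construction separate from sending makes the payload easy to inspect or log before anything leaves your machine. To send it, pass the object to urllib.request.urlopen or translate it to your HTTP client of choice.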
Now let's query the agent with a general legal research question. This will search across BOTH client case files (in your dataspace) AND legal reference libraries.
POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/query
Authorization: Bearer <your-token>
Content-Type: application/json
{
"query": "What are the tenant notification requirements for eviction in San Francisco?",
"agentId": "550e8400-e29b-41d4-a716-446655440000"
}
Request Parameters:
- query (required): The question to ask
- agentId (optional but recommended): The agent to use for this query (provides instructions and default library context)
- context (optional): Additional context configuration (we'll see this in the next step)
What Happens Under the Hood:
- Sync loads the agent's instructions and default context (libraries)
- The agent searches across:
  - Your dataspace (sds-a1b2c3d4) for client case files
  - California State Law library for relevant statutes
  - SF Municipal Code library for local ordinances
- Vector embeddings are used to find semantically relevant content
- The AI synthesizes an answer using the agent's instructions as guidance
- Citations are extracted and full content objects are retrieved
- A new conversation is automatically created with a unique conversationId
Response:
{
"query": "What are the tenant notification requirements for eviction in San Francisco?",
"response": "In San Francisco, tenant notification requirements for eviction are governed by both California state law and San Francisco's Rent Ordinance.\n\n**California State Law (Civil Code §1946.1):**\nFor month-to-month tenancies, landlords must provide:\n- 30-day notice if the tenant has resided in the unit for less than one year\n- 60-day notice if the tenant has resided in the unit for one year or more\n\n**San Francisco Rent Ordinance (§37.9):**\nSan Francisco provides additional tenant protections. Landlords can only evict tenants for \"just cause\" as defined in the ordinance. Notice requirements depend on the grounds for eviction:\n- For at-fault evictions (e.g., non-payment of rent): 3-day notice\n- For no-fault evictions (e.g., owner move-in): 60-day notice (120 days for elderly or disabled tenants)\n\nAll eviction notices must include specific language informing tenants of their rights under San Francisco law【0】【1】.",
"analyzedDocumentCount": 47,
"citedContent": [],
"webSearchResults": [],
"citedLibraryPages": [
{
"id": "9b7c5d3a-4f2e-4a8b-9c1d-2e3f4a5b6c7d",
"libraryId": "c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"url": "https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?lawCode=CIV&sectionNum=1946.1",
"title": "California Civil Code § 1946.1 - Notice to Terminate Tenancy",
"domain": "leginfo.legislature.ca.gov",
"contentType": "text/html",
"scrapedAt": "2025-01-20T08:30:00Z",
"relevanceScore": 0.94
},
{
"id": "8a6b4c2d-3e1f-5b9a-8d0c-1e2f3a4b5c6e",
"libraryId": "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",
"url": "https://codelibrary.amlegal.com/codes/san_francisco/latest/sf_admin/0-0-0-17656",
"title": "San Francisco Administrative Code § 37.9 - Just Cause for Eviction",
"domain": "codelibrary.amlegal.com",
"contentType": "text/html",
"scrapedAt": "2025-01-22T14:15:00Z",
"relevanceScore": 0.91
}
]
}
Understanding the Response:
- response: AI-generated answer following the agent's instructions (formal legal tone, clear citations)
- analyzedDocumentCount: 47 documents were available for analysis (client files + library pages)
- citedContent: Empty in this case (no client case files were cited, only library pages)
- citedLibraryPages: Two library pages were cited - one from CA law, one from SF municipal code
- webSearchResults: Empty because includeWebSearchResults is false for this agent
Citation Format:
The citations 【0】 and 【1】 in the response reference indices in the citedLibraryPages array. This allows your UI to:
- Display clickable citations
- Show source documents in a sidebar
- Provide users with direct links to authoritative sources
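To turn the markers into clickable citations, a client needs to pull each 【n】 out of the response text and look up the matching source. The sketch below shows one way to do that. The function name resolve_citations is hypothetical, and it assumes all markers index into a single list, as in the library-only response above. This guide does not specify how indices are assigned when a response cites both client content and library pages, so verify that against real responses before relying on it.

```python
import re

# Matches markers such as 【0】 or 【12】 and captures the number.
CITATION_MARKER = re.compile(r"【(\d+)】")


def resolve_citations(response_text, sources):
    """Map each 【n】 marker to sources[n], in order of first appearance.

    ASSUMPTION: every marker indexes into the one list passed as `sources`.
    Markers that point past the end of the list resolve to None.
    """
    resolved = []
    seen = set()
    for match in CITATION_MARKER.finditer(response_text):
        index = int(match.group(1))
        if index in seen:
            continue  # report each cited source once
        seen.add(index)
        source = sources[index] if index < len(sources) else None
        resolved.append((index, source))
    return resolved


# Trimmed-down stand-ins for the citedLibraryPages entries shown above.
pages = [
    {"title": "California Civil Code § 1946.1 - Notice to Terminate Tenancy"},
    {"title": "San Francisco Administrative Code § 37.9 - Just Cause for Eviction"},
]
text = "All eviction notices must include specific language【0】【1】."

for index, page in resolve_citations(text, pages):
    print(index, page["title"] if page else "(unresolved)")
```

Returning None for an out-of-range marker lets the UI render a plain, non-clickable reference without failing the whole response.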
Why This Matters:
- Privacy: Client case files stayed in your dataspace and were searchable, but none were relevant for this general question
- Comprehensive research: Agent automatically pulled from both state and local law
- Audit trail: Full record of which sources were accessed
- Professional output: Response follows legal research conventions (citations, clear structure)
Now let's ask a question scoped to a SPECIFIC client using metadata filters. This demonstrates Sync's powerful ability to dynamically limit search scope while still accessing library content.
POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/query
Authorization: Bearer <your-token>
Content-Type: application/json
{
"query": "Based on our case files, what is our strongest defense strategy for the eviction case?",
"agentId": "550e8400-e29b-41d4-a716-446655440000",
"context": {
"contentFilters": {
"clientId": "CLIENT-2024-789",
"caseType": "Eviction Defense"
}
}
}
What Changed:
The context.contentFilters parameter tells Sync to ONLY search documents in the dataspace that match:
- clientId = CLIENT-2024-789
- caseType = Eviction Defense
Libraries are STILL searched (SF/CA law), but dataspace search is now narrowed to just this client's files.
Why Use Metadata Filters:
- Privacy by design: Ensure queries only access relevant client data
- Compliance: Implement data isolation rules (attorney-client privilege, ethical walls)
- Performance: Narrower search scope = faster, more relevant results
- Flexibility: Same agent, different filters = different effective behaviors
- Multi-tenancy: Support multiple clients/cases with one agent and one dataspace
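Since the same agent serves many clients, you will usually build the query body in code and vary only the filters. Here is a minimal sketch of such a builder. The function name build_query_payload is hypothetical, while the field names (query, agentId, conversationId, context.contentFilters, context.libraries) are the ones used in the requests in this guide.

```python
import json


def build_query_payload(query, agent_id=None, content_filters=None,
                        conversation_id=None, libraries=None):
    """Assemble the JSON body for a query, leaving out any optional part that is unset."""
    payload = {"query": query}
    if agent_id:
        payload["agentId"] = agent_id
    if conversation_id:
        payload["conversationId"] = conversation_id

    context = {}
    if content_filters:
        context["contentFilters"] = dict(content_filters)
    if libraries is not None:
        context["libraries"] = list(libraries)
    if context:
        payload["context"] = context
    return payload


payload = build_query_payload(
    "Based on our case files, what is our strongest defense strategy for the eviction case?",
    agent_id="550e8400-e29b-41d4-a716-446655440000",
    content_filters={"clientId": "CLIENT-2024-789", "caseType": "Eviction Defense"},
)
print(json.dumps(payload, indent=2))
```

Centralizing payload construction in one function also gives you a single place to require a clientId filter if your firm's policy demands one on every query.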
Response:
{
"query": "Based on our case files, what is our strongest defense strategy for the eviction case?",
"response": "Based on the case files for CLIENT-2024-789, I recommend a defense strategy focused on procedural defects in the eviction notice:\n\n**Primary Defense: Insufficient Notice Period**\n\nYour client received a 30-day notice to vacate dated December 15, 2024【0】. However, tenant occupancy records show the client has resided at the property since March 2022【1】, which exceeds one year of residency.\n\nUnder California Civil Code §1946.1, tenants who have resided in a unit for one year or more must receive 60-day notice, not 30-day notice【2】. This procedural defect makes the eviction notice invalid.\n\n**Secondary Defense: Lack of Just Cause**\n\nThe landlord's stated reason for eviction is \"breach of lease\" related to unauthorized occupants【0】. However, correspondence from July 2024 shows the landlord was informed of the additional occupant (client's elderly parent) and did not object at that time【3】. San Francisco Rent Ordinance §37.9 requires clear just cause for eviction【4】. The landlord's delayed enforcement may undermine the just cause requirement.\n\n**Recommendation:**\n\nFile a motion to quash the eviction notice based on insufficient notice period. This is your strongest technical defense and has clear statutory support.",
"analyzedDocumentCount": 23,
"citedContent": [
{
"contentId": "7f3e5b89-a1c2-4d3e-b5f6-8a9b0c1d2e3f",
"dataspaceId": "sds-a1b2c3d4",
"fileName": "CLIENT-2024-789_Eviction_Notice.pdf",
"fileFormat": "application/pdf",
"categoryId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"metadata": {
"clientId": "CLIENT-2024-789",
"caseNumber": "CASE-2024-1523",
"caseType": "Eviction Defense",
"documentType": "Eviction Notice",
"documentDate": "2024-12-15"
},
"createdAt": "2024-12-16T09:15:00Z",
"updatedAt": "2024-12-16T09:15:00Z"
},
...
],
"webSearchResults": [],
"citedLibraryPages": [
{
"id": "9b7c5d3a-4f2e-4a8b-9c1d-2e3f4a5b6c7d",
"libraryId": "c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"url": "https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?lawCode=CIV&sectionNum=1946.1",
"title": "California Civil Code § 1946.1 - Notice to Terminate Tenancy",
"domain": "leginfo.legislature.ca.gov",
"contentType": "text/html",
"scrapedAt": "2025-01-20T08:30:00Z",
"relevanceScore": 0.96
},
{
"id": "8a6b4c2d-3e1f-5b9a-8d0c-1e2f3a4b5c6e",
"libraryId": "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",
"url": "https://codelibrary.amlegal.com/codes/san_francisco/latest/sf_admin/0-0-0-17656",
"title": "San Francisco Administrative Code § 37.9 - Just Cause for Eviction",
"domain": "codelibrary.amlegal.com",
"contentType": "text/html",
"scrapedAt": "2025-01-22T14:15:00Z",
"relevanceScore": 0.89
}
]
}
Understanding the Filtered Response:
- analyzedDocumentCount: Only 23 documents analyzed (down from 47) because we filtered to this specific client
- citedContent: Now populated with 3 client documents - all matching our metadata filters
- citedLibraryPages: Still includes relevant legal references from libraries
- Agent behavior: The response now explicitly references client-specific facts ("Your client received...")
The Power of Metadata Filtering:
This query demonstrates Sync's unique value proposition:
- Privacy Protection: Only documents for CLIENT-2024-789 were searched
- Contextual Awareness: Agent combined client facts with legal standards
- Governance: Full audit trail of which client files were accessed
- Unified Platform: Single query interface for structured metadata filters + semantic search + library context
Without metadata filters, the agent might have accidentally mixed facts from different clients' cases—a serious ethical and legal violation. With filters, you get precision and compliance.
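For extra assurance, an application can check on its own side that every cited client document matches the filters it sent. The sketch below does this against the citedContent shape shown in the response above. The function name find_filter_violations is hypothetical, and it assumes filters mean exact equality on every key, which matches the "clientId = ..." description in this step but is not spelled out further in this guide.

```python
def find_filter_violations(response, content_filters):
    """Return the contentIds of cited items whose metadata fails any filter.

    ASSUMPTION: a document matches only if every filter key is present in its
    metadata with exactly the filter's value.
    """
    violations = []
    for item in response.get("citedContent", []):
        metadata = item.get("metadata", {})
        if any(metadata.get(key) != value for key, value in content_filters.items()):
            violations.append(item.get("contentId"))
    return violations


filters = {"clientId": "CLIENT-2024-789", "caseType": "Eviction Defense"}

# The second entry is a made-up document from another client, added to show a failure.
sample_response = {
    "citedContent": [
        {
            "contentId": "7f3e5b89-a1c2-4d3e-b5f6-8a9b0c1d2e3f",
            "metadata": {"clientId": "CLIENT-2024-789", "caseType": "Eviction Defense"},
        },
        {
            "contentId": "hypothetical-other-client-doc",
            "metadata": {"clientId": "CLIENT-2024-001", "caseType": "Eviction Defense"},
        },
    ]
}

print(find_filter_violations(sample_response, filters))
```

An empty result means every cited client document matched the filters. A check like this complements the server-side filtering and does not replace it.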
Sync automatically manages conversation history. Let's ask a follow-up question without repeating context.
Sync creates a conversation automatically when it answers your first query. To continue that conversation, keep track of the conversationId from the first query and pass it with each follow-up. For this example, let's assume the previous query created a conversation with ID conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890.
POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/query
Authorization: Bearer <your-token>
Content-Type: application/json
{
"query": "What evidence do we have to support the secondary defense?",
"agentId": "550e8400-e29b-41d4-a716-446655440000",
"conversationId": "conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"context": {
"contentFilters": {
"clientId": "CLIENT-2024-789",
"caseType": "Eviction Defense"
}
}
}
What's Different:
- Added conversationId from the previous query
- The question references "the secondary defense" without re-stating it
- Same metadata filters maintained for consistency
What Happens:
- Sync retrieves the conversation history
- The agent understands "the secondary defense" refers to the "Lack of Just Cause" argument from the previous response
- Context from the previous answer informs the new response
- The conversation continues naturally
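On the client side, the only state you need to carry between turns is the conversationId. The sketch below wraps that in a small class. The class name ResearchSession and the injected send callable are hypothetical. It also assumes the query response includes a conversationId field, which the example responses in this guide do not show, so adjust the lookup to wherever your workspace actually returns it.

```python
class ResearchSession:
    """Keeps one conversation thread going across several queries."""

    def __init__(self, send, agent_id, content_filters=None):
        self._send = send  # callable taking a payload dict and returning a response dict
        self._agent_id = agent_id
        self._filters = content_filters
        self.conversation_id = None

    def ask(self, query):
        payload = {"query": query, "agentId": self._agent_id}
        if self.conversation_id:
            payload["conversationId"] = self.conversation_id
        if self._filters:
            # Re-send the same filters on every turn for consistency.
            payload["context"] = {"contentFilters": dict(self._filters)}
        response = self._send(payload)
        # ASSUMPTION: the response carries a "conversationId" field.
        self.conversation_id = response.get("conversationId", self.conversation_id)
        return response


# Stand-in transport for demonstration; a real one would POST to the query endpoint.
sent_payloads = []


def fake_send(payload):
    sent_payloads.append(payload)
    return {
        "response": "(stand-in answer)",
        "conversationId": "conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    }


session = ResearchSession(
    fake_send,
    agent_id="550e8400-e29b-41d4-a716-446655440000",
    content_filters={"clientId": "CLIENT-2024-789", "caseType": "Eviction Defense"},
)
session.ask("Based on our case files, what is our strongest defense strategy?")
session.ask("What evidence do we have to support the secondary defense?")
print(sent_payloads[1]["conversationId"])
```

Injecting the transport keeps the session logic testable without network access, and the first payload goes out without a conversationId so that Sync starts a new thread.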
Response:
{
"query": "What evidence do we have to support the secondary defense?",
"response": "To support the secondary defense (lack of just cause due to delayed enforcement), we have strong documentary evidence:\n\n**Email Correspondence from July 15, 2024【0】:**\nYour client sent an email to the landlord notifying them that her elderly mother would be moving in temporarily for medical care. The landlord responded: \"Thanks for letting me know. I hope your mother recovers quickly.\"\n\nThis correspondence demonstrates:\n1. The landlord was explicitly informed of the additional occupant\n2. The landlord did not object or cite any lease violation at that time\n3. The landlord's acknowledgment could be construed as implicit consent\n\n**Current Eviction Notice (December 15, 2024)【1】:**\nCites \"unauthorized occupants\" as the breach of lease, but this occurred 5 months AFTER the landlord was informed and did not object.\n\n**Legal Analysis:**\n\nUnder San Francisco Rent Ordinance §37.9【2】, landlords must establish just cause for eviction. Delayed enforcement of lease terms (especially after apparent consent) may undermine the just cause requirement. Several tenant defense cases have successfully argued that landlord acquiescence waives the right to later enforce that provision.\n\n**Strategic Note:**\nWhile this defense is strong, it's secondary to the insufficient notice period defense, which has a clearer statutory basis and is less subject to judicial interpretation.",
"analyzedDocumentCount": 23,
"citedContent": [
{
"contentId": "9h5g7d01-c3e4-6f5g-d7h8-0c1d2e3f4g5h",
"dataspaceId": "sds-a1b2c3d4",
"fileName": "CLIENT-2024-789_Correspondence_July2024.pdf",
"fileFormat": "application/pdf",
"categoryId": "c3d4e5f6-g7h8-9012-cdef-gh3456789012",
"metadata": {
"clientId": "CLIENT-2024-789",
"caseNumber": "CASE-2024-1523",
"caseType": "Eviction Defense",
"documentType": "Correspondence",
"documentDate": "2024-07-15"
},
"createdAt": "2024-11-12T10:45:00Z",
"updatedAt": "2024-11-12T10:45:00Z"
},
...
],
"webSearchResults": [],
"citedLibraryPages": [
{
"id": "8a6b4c2d-3e1f-5b9a-8d0c-1e2f3a4b5c6e",
"libraryId": "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",
"url": "https://codelibrary.amlegal.com/codes/san_francisco/latest/sf_admin/0-0-0-17656",
"title": "San Francisco Administrative Code § 37.9 - Just Cause for Eviction",
"domain": "codelibrary.amlegal.com",
"contentType": "text/html",
"scrapedAt": "2025-01-22T14:15:00Z",
"relevanceScore": 0.87
}
]
}
Every query is automatically logged with comprehensive details for compliance, debugging, and usage tracking. Query logs provide:
- Which user made which queries
- Which documents were accessed
- Which libraries were searched
- Full request/response history
- Timestamps and performance metrics
While the specific API for accessing query logs may vary by workspace configuration, Sync maintains detailed audit trails that typically include:
What's Logged for Each Query:
- Query text and parameters
- User ID and agent ID
- Timestamp and response time
- Dataspace and library IDs accessed
- Content items accessed (with metadata filters applied)
- Citations and sources used in response
- Token usage and cost metrics
- Conversation ID (for multi-turn tracking)
Why Query Logs are Essential:
Legal Compliance:
- Prove which attorney accessed which client files
- Demonstrate attorney-client privilege boundaries
- Track conflicts of interest (ethical walls)
Audit & Security:
- Detect unusual access patterns
- Monitor for potential data exfiltration
- Review user behavior for training purposes
Billing & Usage:
- Track query volume per client or per user
- Allocate costs to specific matters or projects
- Monitor API usage against quotas
Quality Assurance:
- Review query patterns to improve agent instructions
- Identify common questions to build training materials
- Debug user-reported issues by replaying exact queries
Research & Improvement:
- Analyze which sources are most frequently cited
- Understand which libraries provide the most value
- Refine metadata schemas based on actual filter usage
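As an illustration of the compliance use case, the sketch below groups accessed client files by user. It assumes log entries shaped like the conceptual example entry in this section (userId, documentsAccessed, fileName). Because that shape is itself conceptual, the field names here are assumptions and may differ in your workspace. The function name files_accessed_by_user is hypothetical.

```python
from collections import defaultdict


def files_accessed_by_user(log_entries):
    """Group accessed file names by userId across a list of log entries.

    ASSUMPTION: each entry has "userId" and an optional "documentsAccessed"
    list whose items carry a "fileName", as in the conceptual log entry.
    """
    accessed = defaultdict(set)
    for entry in log_entries:
        for doc in entry.get("documentsAccessed", []):
            accessed[entry["userId"]].add(doc["fileName"])
    # Sort for stable, reviewable output.
    return {user: sorted(files) for user, files in accessed.items()}


# Made-up entries for demonstration only.
entries = [
    {
        "userId": "user-a",
        "documentsAccessed": [
            {"fileName": "CLIENT-2024-789_Eviction_Notice.pdf"},
            {"fileName": "CLIENT-2024-789_Lease_Agreement.pdf"},
        ],
    },
    {
        "userId": "user-a",
        "documentsAccessed": [{"fileName": "CLIENT-2024-789_Eviction_Notice.pdf"}],
    },
    {"userId": "user-b", "documentsAccessed": []},
]

for user, files in files_accessed_by_user(entries).items():
    print(user, files)
```

Users with no document access, such as user-b here, do not appear in the result, which keeps the report focused on actual file access.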
Example Log Entry (Conceptual):
{
"queryId": "q-7a8b9c0d-1e2f-3a4b-5c6d-7e8f9a0b1c2d",
"conversationId": "conv-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"userId": "123e4567-e89b-12d3-a456-426614174001",
"agentId": "550e8400-e29b-41d4-a716-446655440000",
"query": "Based on our case files, what is our strongest defense strategy for the eviction case?",
"timestamp": "2025-01-25T15:30:00Z",
"dataspaceId": "sds-a1b2c3d4",
"libraryIds": [
"c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"d8a96g75-1932-5b73-c9gd-2d074g77bgb7"
],
"contentFilters": {
"clientId": "CLIENT-2024-789",
"caseType": "Eviction Defense"
},
"documentsAccessed": [
{
"contentId": "7f3e5b89-a1c2-4d3e-b5f6-8a9b0c1d2e3f",
"fileName": "CLIENT-2024-789_Eviction_Notice.pdf",
"accessedAt": "2025-01-25T15:30:02Z"
},
{
"contentId": "8g4f6c90-b2d3-5e4f-c6g7-9b0c1d2e3f4g",
"fileName": "CLIENT-2024-789_Lease_Agreement.pdf",
"accessedAt": "2025-01-25T15:30:02Z"
}
],
"libraryPagesAccessed": [
{
"libraryId": "c7f85f64-9821-4a62-b8fc-1c963f66afa6",
"url": "https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?lawCode=CIV&sectionNum=1946.1",
"accessedAt": "2025-01-25T15:30:03Z"
}
],
"responseTime": "4.2s",
"tokensUsed": 3847,
"analyzedDocumentCount": 23
}
Sometimes you want to use an agent but change which libraries are searched. Sync allows you to override the agent's default libraries on a per-query basis:
POST https://sws-x9p3q7r5.syncdocs.ai/api/content/sds-a1b2c3d4/query
Authorization: Bearer <your-token>
Content-Type: application/json
{
"query": "What is the California statute of limitations for property damage claims?",
"agentId": "550e8400-e29b-41d4-a716-446655440000",
"context": {
"libraries": ["c7f85f64-9821-4a62-b8fc-1c963f66afa6"]
}
}
What Changed:
By specifying context.libraries, we override the agent's default library list. This query will:
- Use the agent's instructions (legal expertise, citation style)
- Search the dataspace for client documents
- Search ONLY the CA State Law library (NOT the SF Municipal Code library)
Why This Is Useful:
- Cost optimization: Don't search libraries you don't need for a specific query
- Performance: Fewer sources = faster responses
- Multi-jurisdiction: Easy to add/remove jurisdictions dynamically
- Specialized queries: Use different reference materials for different question types
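The override rule can be modeled in a few lines, which is handy if your application wants to display which libraries a query will search before sending it. This is a sketch of the behavior as described in this step, not Sync's implementation, and the function name effective_libraries is hypothetical. How Sync treats an empty context.libraries list is not covered in this guide; the sketch treats it as "search no libraries", which is an assumption.

```python
def effective_libraries(agent_default_libraries, query_context=None):
    """Return the library IDs a query would search.

    A "libraries" key in the query context replaces the agent defaults.
    ASSUMPTION: an explicit empty list means no libraries are searched.
    """
    if query_context and "libraries" in query_context:
        return list(query_context["libraries"])
    return list(agent_default_libraries)


agent_defaults = [
    "c7f85f64-9821-4a62-b8fc-1c963f66afa6",  # California State Law
    "d8a96g75-1932-5b73-c9gd-2d074g77bgb7",  # San Francisco Municipal Code
]

# No override: both default libraries apply.
print(effective_libraries(agent_defaults))

# Override: only the California State Law library is searched.
print(effective_libraries(
    agent_defaults,
    {"libraries": ["c7f85f64-9821-4a62-b8fc-1c963f66afa6"]},
))
```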
Throughout this guide, you've seen how Sync's research agent architecture uniquely enables:
Traditional Problems:
- Client data mixed with reference materials in a single database
- No easy way to ensure queries don't leak across clients
- Difficult to implement ethical walls or privilege boundaries
How Sync Solves This:
- ✅ Client case files (dataspace) are physically separate from public references (libraries)
- ✅ Metadata filters enforce client-specific data access at query time
- ✅ Query logs provide complete audit trails of which documents were accessed
- ✅ Libraries can be shared across clients without privacy concerns
Traditional Problems:
- No audit trail of which AI queries accessed which documents
- Difficult to prove compliance with privilege requirements
- Can't track which legal sources informed which advice
How Sync Solves This:
- ✅ Every query logged with full details (user, agent, filters, documents accessed)
- ✅ Conversation threads stored for review and dispute resolution
- ✅ Citations link responses to specific source documents
- ✅ Metadata filters create technical enforcement of privilege boundaries
Traditional Problems:
- Different systems for PDFs vs. Word docs vs. scanned images vs. web content
- Complex ETL pipelines to make documents "AI-ready"
- Separate interfaces for structured metadata vs. unstructured content
How Sync Solves This:
- ✅ Single query interface for PDFs, Word docs, scanned images, and scraped web content
- ✅ Libraries and dataspaces both searchable through the same API
- ✅ No preprocessing needed - upload documents and they're automatically processed
- ✅ Metadata queries combine with semantic search in one request
Traditional Problems:
- Hard-coded search scopes make systems inflexible
- Each new use case requires new infrastructure
- Difficult to combine multiple data sources in one query
How Sync Solves This:
- ✅ Metadata filters dynamically scope queries to specific clients/cases/time periods
- ✅ Libraries can be added/removed per query for flexible context
- ✅ Same agent serves multiple use cases by changing filters
- ✅ Conversation history automatically maintained across multi-turn dialogues
Traditional Problems:
- Document storage, AI processing, and search are separate products
- Complex integrations between systems
- Inconsistent governance models across tools
How Sync Solves This:
- ✅ Single platform for document storage, processing, metadata extraction, and querying
- ✅ One API for structured metadata AND semantic search AND library integration
- ✅ Consistent access control and audit logging across all operations
- ✅ No data movement between systems - everything happens in your VPC
Congratulations! You've created a production-ready legal research agent that:
- ✅ Searches across private client documents AND public legal references in one query
- ✅ Enforces client-specific data access through metadata filters
- ✅ Provides professional legal research output with proper citations
- ✅ Maintains conversation history for multi-turn research sessions
- ✅ Logs all activity for compliance and audit purposes
- ✅ Runs entirely in your infrastructure with complete data isolation
This same architecture can be adapted for:
- Medical research agents (patient records + medical literature)
- Financial analysis agents (client portfolios + market data)
- Engineering support agents (internal docs + technical specifications)
- Compliance review agents (company policies + regulatory databases)
Ensure your client case files have proper metadata (clientId, caseType, etc.) for effective filtering. Learn how to use ontologies to automatically extract structured data from uploaded documents.
- Concepts: Agents - Advanced agent configuration, instruction engineering, and multi-agent systems
- Concepts: Libraries - Library management, update strategies, and content curation
- Concepts: Queries - Advanced query techniques, filter composition, and performance optimization
- Concepts: Dataspaces - Data organization strategies and access control models