An IT leader decides to implement policy-based access management. The vision is clear: when someone joins, their HRIS record updates, policies calculate required access, and systems provision automatically.
Then the integration mapping begins:
HRIS (Workday)
→ Identity Provider (Okta)
→ Cloud (AWS, GCP, Azure)
→ SaaS (Google Workspace, Slack, Salesforce, GitHub, etc.)
→ On-prem (File shares, legacy apps)
Reality sets in: Every arrow represents a custom integration. Every system has different authentication, different APIs, different data models.
Welcome to the integration layer—the unsexy but critical plumbing that makes access management work.
The Integration Complexity Matrix
A typical mid-size company integrates 20-30 systems:
| Category | Systems | Integration Method |
|---|---|---|
| HRIS | Workday, BambooHR, Rippling, ADP | REST API, SCIM, Webhooks |
| Identity Provider | Okta, Azure AD, Auth0 | REST API, SCIM |
| Cloud (IaaS) | AWS, Azure, GCP | IAM APIs, Terraform |
| Collaboration | Slack, Microsoft Teams | REST API, Webhooks |
| Dev Tools | GitHub, GitLab, Jira | REST API, Webhooks |
| CRM | Salesforce, HubSpot | REST API, SOAP (legacy) |
| Productivity | Google Workspace, Microsoft 365 | Admin APIs |
| Finance | NetSuite, QuickBooks | REST API (limited) |
Total integration endpoints: 60-100+ (each system exposes multiple APIs).
The Six Integration Patterns
Pattern 1: Pull (Polling)
Mechanism: The access management system queries external systems on a schedule, compares current state to previous state, and processes changes.
Example: HRIS User Sync
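def sync_users_from_hris():
    # Fetch all users from HRIS, keyed by HRIS ID
    # (assumes each HRIS record exposes an 'id' field)
    hris_users = {u['id']: u for u in workday_api.get_users()}
    # Fetch all users from local database, keyed by the same ID
    local_users = {u.hris_id: u for u in db.query(User).all()}
    # Calculate diff on the ID sets (plain lists do not support set difference)
    added = hris_users.keys() - local_users.keys()
    removed = local_users.keys() - hris_users.keys()
    updated = detect_changes(hris_users, local_users)
    # Apply changes
    for hris_id in added:
        create_user(hris_users[hris_id])
    for hris_id in removed:
        deactivate_user(local_users[hris_id])
    for user in updated:
        update_user(user)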
Advantages: Simple to implement. Works with any API. No external dependencies on webhooks or message queues.
Disadvantages: Delayed updates based on poll frequency. Inefficient—fetches all data even when nothing changed. API rate limits constrain frequent polling.
Best fit: Systems lacking webhook support. Scenarios where real-time updates are non-critical. Data volumes under 10K records.
Pattern 2: Push (Webhooks)
Mechanism: External systems send HTTP POST requests to designated endpoints when data changes. The access management system processes immediately.
Example: HRIS User Update
@app.route('/webhooks/workday/user-updated', methods=['POST'])
def handle_user_update():
    # Verify webhook signature before trusting the payload
    if not verify_signature(request):
        return 'Unauthorized', 401
    # Parse payload
    event = request.json
    user_id = event['user_id']
    changes = event['changes']
    # Look up the user; an unknown ID would otherwise fail on setattr(None, ...)
    user = db.query(User).filter_by(hris_id=user_id).first()
    if user is None:
        return 'Unknown user', 404
    # Process immediately
    for field, new_value in changes.items():
        setattr(user, field, new_value)
    db.commit()
    # Trigger policy re-evaluation
    recalculate_access(user)
    return 'OK', 200
Advantages: Real-time updates. Efficient—only changed data transmits. No polling overhead.
Disadvantages: Requires public endpoint (security consideration). Webhook reliability depends on endpoint availability. Retry logic varies by vendor. Signature verification differs across systems.
Best fit: Real-time updates are critical. The external system supports webhooks. The organization can handle webhook security requirements.
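The webhook handler above relies on a `verify_signature` helper that is not shown. A minimal sketch of one common scheme follows, assuming the vendor signs the raw request body with HMAC-SHA256 and sends the hex digest in a header. The function signature here (raw body, header value, shared secret) is a hypothetical variant of the helper, and real vendors differ in header name, encoding, and signed content, so check each vendor's documentation.

```python
import hashlib
import hmac


def verify_signature(body: bytes, signature_header: str, secret: bytes) -> bool:
    """Return True if signature_header equals the hex HMAC-SHA256 of body.

    Assumption: the sender signs the raw body with a shared secret and
    hex-encodes the digest. This is illustrative, not any vendor's scheme.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time to avoid timing side channels
    return hmac.compare_digest(expected, signature_header)


# Hypothetical values for illustration only
secret = b"shared-webhook-secret"
body = b'{"user_id": "123", "changes": {"department": "Sales"}}'
valid_signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

Verifying against the raw bytes matters: re-serializing parsed JSON can change whitespace or key order and invalidate an otherwise correct signature.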
Pattern 3: Scheduled Batch
Mechanism: The source system exports data on a schedule (typically as a file). The access management system imports the file and processes the records as a batch.
Example: HRIS Daily Export
from datetime import date

def process_daily_hris_export():
    # Download today's CSV from SFTP (file name pattern: users_YYYY-MM-DD.csv)
    csv_data = sftp_client.download(f'/exports/users_{date.today().isoformat()}.csv')
    # Parse CSV
    users = parse_csv(csv_data)
    # Batch process
    for user in users:
        upsert_user(user)
    # Trigger downstream actions
    recalculate_all_access()
Advantages: Simple. Works with legacy systems (many only support file exports). Reliable—file-based, easy to retry.
Disadvantages: Significant delays (daily, nightly). Large data transfers. Requires file handling (SFTP, S3).
Best fit: Systems only supporting file exports. Real-time is non-critical. Batch windows are acceptable.
Pattern 4: Event Stream
Mechanism: Events publish to a message queue (Kafka, SQS). The access management system subscribes and processes events as they arrive.
Example: User Lifecycle Events
from datetime import datetime, timezone

# Publisher (HRIS system)
def publish_user_event(event_type, user_data):
    # Assumes kafka_producer was configured with a JSON value serializer
    kafka_producer.send('user-events', {
        'event_type': event_type,  # 'user.created', 'user.updated', 'user.deleted'
        'timestamp': datetime.now(timezone.utc).isoformat(),  # timezone-aware UTC
        'user': user_data
    })

# Consumer (access management system)
def consume_user_events():
    for message in kafka_consumer:
        event = message.value
        if event['event_type'] == 'user.created':
            handle_user_created(event['user'])
        elif event['event_type'] == 'user.updated':
            handle_user_updated(event['user'])
        elif event['event_type'] == 'user.deleted':
            handle_user_deleted(event['user'])
Advantages: Real-time. Decoupled—systems avoid direct calls to each other. Reliable—message queue handles retries. Scalable—multiple consumers.
Disadvantages: Infrastructure overhead (Kafka, SQS). Complexity (message serialization, dead letter queues). Requires both systems to support event streaming.
Best fit: Event streaming infrastructure exists. Multiple systems need the same data. High volume and high reliability are requirements.
Pattern 5: SCIM (Standard Protocol)
Mechanism: SCIM (System for Cross-domain Identity Management) provides a standard protocol. Identity providers push or pull user data through standardized endpoints.
Example: Okta to Application via SCIM
@app.route('/scim/v2/Users', methods=['POST'])
def scim_create_user():
    # Okta sends SCIM user payload
    data = request.json
    user = create_user(
        email=data['userName'],
        first_name=data['name']['givenName'],
        last_name=data['name']['familyName'],
        active=data.get('active', True)
    )
    # Return SCIM response
    return jsonify({
        'schemas': ['urn:ietf:params:scim:schemas:core:2.0:User'],
        'id': str(user.id),  # SCIM resource IDs are strings
        'userName': user.email,
        'name': {
            'givenName': user.first_name,
            'familyName': user.last_name
        },
        'active': user.active,
        'meta': {
            'resourceType': 'User',
            'created': user.created_at.isoformat(),
            'location': f'/scim/v2/Users/{user.id}'
        }
    }), 201
Advantages: Standard protocol with thorough documentation. Supported by major IdPs (Okta, Azure AD, Google). Bidirectional—supports push and pull.
Disadvantages: Complex specification with numerous edge cases. Not all systems support SCIM. Group mapping presents challenges.
Best fit: Enterprise IdP integrations. Standardized protocols preferred. User provisioning is the primary use case.
Pattern 6: API Proxying
Mechanism: The access management system acts as a proxy between two systems, translating requests and responses.
Example: HRIS to Okta via Proxy
def handle_hris_user_update(hris_user, okta_user_id):
    # Receive from HRIS (one format)
    hris_data = {
        'employee_id': hris_user['id'],
        'full_name': hris_user['name'],
        'work_email': hris_user['email'],
        'dept': hris_user['department']
    }
    # Naive name split: first and last token.
    # Single-word names yield the same value for both fields.
    name_parts = hris_data['full_name'].split()
    # Transform to Okta format
    okta_data = {
        'profile': {
            'login': hris_data['work_email'],
            'email': hris_data['work_email'],
            'firstName': name_parts[0],
            'lastName': name_parts[-1],
            'department': hris_data['dept']
        }
    }
    # Send to Okta (the caller supplies the Okta ID of the user to update)
    okta_api.update_user(okta_user_id, okta_data)
Advantages: Full control over transformations. Business logic can be added (validation, enrichment). Single point for monitoring.
Disadvantages: The access management system becomes a critical path. Maintenance burden—the organization owns all transformations. Performance bottleneck—all traffic flows through the proxy.
Best fit: Data transformations required. Business logic must be applied. Systems cannot communicate directly.
Integration patterns directly affect audit trail completeness. Webhook-based (push) patterns create real-time audit records; polling patterns create batch records with potential gaps. Auditors evaluating SOC 2 CC6.1 (logical access controls) prefer near-real-time evidence of access changes. Document the integration pattern for each connected system and the expected latency for audit events.
The Authentication Zoo
Each system requires different authentication:
| System | Auth Method | Credential Type |
|---|---|---|
| Workday | OAuth 2.0 | Client ID + Secret |
| BambooHR | API Key | Single key header |
| Okta | API Token | SSWS token header |
| AWS | IAM Credentials | Access Key + Secret |
| Azure | Service Principal | Client ID + Secret + Tenant |
| GCP | Service Account | JSON key file |
| Google Workspace | OAuth 2.0 + Domain-Wide Delegation | Service account JSON |
| Slack | OAuth 2.0 | Bot token |
| GitHub | Personal Access Token | PAT or OAuth app |
| Salesforce | OAuth 2.0 | Connected app credentials |
Organizations must manage 20-30 sets of credentials across these credential types, along with credential rotation (some expire), secure storage (encrypted secrets), and permission scopes (least privilege).
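Rotation tracking can start as a small inventory of credential metadata. The sketch below is illustrative only: the `Credential` record, the `due_for_rotation` helper, and the 14-day warning window are assumptions, not part of any vendor API. The inventory stores metadata only, and the secrets themselves belong in an encrypted secret store.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Credential:
    """Metadata about one credential (hypothetical record; no secret stored)."""
    system: str
    auth_method: str
    expires_on: Optional[date]  # None means the credential does not expire


def due_for_rotation(creds: List[Credential], today: date,
                     warn_days: int = 14) -> List[str]:
    """Return systems whose credential expires within warn_days or already has."""
    cutoff = today + timedelta(days=warn_days)
    return [c.system for c in creds
            if c.expires_on is not None and c.expires_on <= cutoff]


# Hypothetical inventory for illustration
inventory = [
    Credential("okta", "API Token", date(2025, 1, 10)),
    Credential("github", "Personal Access Token", date(2025, 3, 1)),
    Credential("bamboohr", "API Key", None),
]
```

Running `due_for_rotation(inventory, today=date(2025, 1, 1))` on this inventory flags only the Okta token, since its expiry falls inside the 14-day window.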
The Rate Limit Maze
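Each API enforces different rate limits. The figures below are illustrative; actual limits vary by endpoint, plan, and edition, so confirm them against each vendor's current documentation: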
| System | Rate Limit | Scope |
|---|---|---|
| Okta | 10,000 req/min | Per tenant |
| Google Admin API | 2,400 req/min | Per project |
| AWS IAM | 5,000 req/sec | Per account |
| Slack | 20-100 req/min | Per method type |
| GitHub | 5,000 req/hour | Per OAuth token |
| Salesforce | 100,000 req/day | Per org |
Organizations need rate limit tracking per API, exponential backoff retry logic, request queuing to avoid exceeding limits, and graceful degradation when limits are reached.
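One way to combine backoff with server guidance is to prefer the standard HTTP `Retry-After` header when a response carries it, and fall back to capped exponential backoff otherwise. The helper below is a sketch under assumptions: `next_delay` is a hypothetical name, only the numeric-seconds form of `Retry-After` is handled (the HTTP-date form falls through to backoff), and whether a given vendor sends the header at all needs to be checked per API.

```python
from typing import Optional


def next_delay(attempt: int, retry_after: Optional[str] = None,
               base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Prefer the server's hint when it is a plain number of seconds
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    # Otherwise use capped exponential backoff: base, 2*base, 4*base, ...
    return min(base * (2 ** attempt), cap)
```

The cap keeps a long outage from producing multi-hour sleeps, while the server-provided value is used as given, because retrying earlier than requested usually triggers another rejection.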
The Integration Layer Architecture
Recommended architecture:
┌──────────────────────────────────────────────┐
│ HRIS (Source of Truth) │
│ (Workday, BambooHR) │
└────────────────┬─────────────────────────────┘
│
▼ (Webhook or Poll)
┌──────────────────────────────────────────────┐
│ Integration Layer │
│ ┌────────────────────────────────────────┐ │
│ │ Event Processor │ │
│ │ - Receive HRIS events │ │
│ │ - Calculate access changes │ │
│ │ - Queue provisioning actions │ │
│ └────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────┐ │
│ │ Orchestration Engine │ │
│ │ - Execute provisioning in order │ │
│ │ - Handle dependencies │ │
│ │ - Retry failures │ │
│ └────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────┐ │
│ │ Adapter Registry │ │
│ │ - OktaAdapter │ │
│ │ - GoogleAdapter │ │
│ │ - AWSAdapter │ │
│ │ - SlackAdapter │ │
│ │ - ... 20 more │ │
│ └────────────────────────────────────────┘ │
└────────────────┬─────────────────────────────┘
│
├─────────────┬──────────────┬────────────┐
▼ ▼ ▼ ▼
┌───────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Okta │ │ Google │ │ AWS │ │ Slack │
└───────────┘ └──────────┘ └──────────┘ └──────────┘
Component 1: Adapter Interface
A standard interface ensures consistency across all adapters:
from abc import ABC, abstractmethod
from typing import List


class ProvisioningAdapter(ABC):
    """Base class for all provisioning adapters"""

    @abstractmethod
    def authenticate(self) -> bool:
        """Authenticate with target system"""
        pass

    @abstractmethod
    def create_user(self, user_data: dict) -> str:
        """Create user, return user ID"""
        pass

    @abstractmethod
    def update_user(self, user_id: str, user_data: dict) -> bool:
        """Update user"""
        pass

    @abstractmethod
    def deactivate_user(self, user_id: str) -> bool:
        """Deactivate user"""
        pass

    @abstractmethod
    def add_to_group(self, user_id: str, group_id: str) -> bool:
        """Add user to group"""
        pass

    @abstractmethod
    def remove_from_group(self, user_id: str, group_id: str) -> bool:
        """Remove user from group"""
        pass

    @abstractmethod
    def get_user_groups(self, user_id: str) -> List[str]:
        """Get list of groups user belongs to"""
        pass
Component 2: Concrete Adapter (Example: Okta)
import requests


class OktaAdapter(ProvisioningAdapter):
    def __init__(self, domain: str, api_token: str):
        self.domain = domain
        self.api_token = api_token
        self.base_url = f"https://{domain}/api/v1"
        self.session = requests.Session()
        # Okta API tokens use the SSWS authorization scheme
        self.session.headers.update({
            'Authorization': f'SSWS {api_token}',
            'Content-Type': 'application/json'
        })
        self.rate_limiter = RateLimiter(requests_per_minute=10000)

    def authenticate(self) -> bool:
        # A successful call to /users/me confirms the token is valid
        try:
            response = self.session.get(f"{self.base_url}/users/me")
            return response.status_code == 200
        except requests.RequestException:
            return False

    def create_user(self, user_data: dict) -> str:
        self.rate_limiter.wait_if_needed()
        payload = {
            'profile': {
                'login': user_data['email'],
                'email': user_data['email'],
                'firstName': user_data['first_name'],
                'lastName': user_data['last_name']
            }
        }
        response = self.session.post(
            f"{self.base_url}/users",
            json=payload,
            params={'activate': 'true'}
        )
        if response.status_code in [201, 200]:
            return response.json()['id']
        else:
            raise ProvisioningError(f"Failed to create user: {response.text}")

    def add_to_group(self, user_id: str, group_id: str) -> bool:
        self.rate_limiter.wait_if_needed()
        response = self.session.put(
            f"{self.base_url}/groups/{group_id}/users/{user_id}"
        )
        # Okta returns 204 No Content on success
        return response.status_code == 204

    # ... implement other methods
Component 3: Rate Limiter
import threading
import time


class RateLimiter:
    """Fixed-window limiter: at most requests_per_minute calls per 60 seconds"""

    def __init__(self, requests_per_minute: int):
        self.requests_per_minute = requests_per_minute
        self.requests_per_second = requests_per_minute / 60
        self.last_request_time = 0  # start of the current window
        self.request_count = 0
        self.lock = threading.Lock()

    def wait_if_needed(self):
        # The lock is held while sleeping so other threads also wait
        with self.lock:
            now = time.time()
            # Reset counter if minute passed
            if now - self.last_request_time > 60:
                self.request_count = 0
                self.last_request_time = now
            # Check if at limit
            if self.request_count >= self.requests_per_minute:
                sleep_time = 60 - (now - self.last_request_time)
                time.sleep(sleep_time)
                self.request_count = 0
                self.last_request_time = time.time()
            # Increment counter
            self.request_count += 1
Component 4: Retry Logic with Exponential Backoff
import time


def retry_with_backoff(func, max_retries=5, base_delay=1):
    """Retry function with exponential backoff"""
    for attempt in range(max_retries):
        try:
            return func()
        except (RateLimitError, TemporaryError):
            # Transient errors: re-raise once the final attempt has failed
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: 1s, 2s, 4s, 8s between the 5 default attempts
            delay = base_delay * (2 ** attempt)
            time.sleep(delay)
        except PermanentError:
            # Don't retry permanent errors
            raise
Component 5: Orchestrator
class ProvisioningOrchestrator:
    def __init__(self):
        self.adapters = {
            'okta': OktaAdapter(domain='company.okta.com', api_token='...'),
            'google': GoogleAdapter(credentials='...'),
            'aws': AWSAdapter(access_key='...', secret_key='...'),
            'slack': SlackAdapter(token='...')
        }

    def provision_user(self, user, access_grants):
        """Provision access for user across multiple systems"""
        results = []
        # Group grants by system
        grants_by_system = {}
        for grant in access_grants:
            grants_by_system.setdefault(grant.system, []).append(grant)
        # Provision in dependency order
        # (Okta first, then everything else)
        execution_order = ['okta', 'google', 'aws', 'slack']
        for system in execution_order:
            if system not in grants_by_system:
                continue
            adapter = self.adapters[system]
            for grant in grants_by_system[system]:
                try:
                    # Bind adapter and grant as defaults so the lambda does
                    # not depend on loop variables that change later
                    result = retry_with_backoff(
                        lambda a=adapter, g=grant: self._execute_grant(a, user, g)
                    )
                    results.append({
                        'system': system,
                        'grant': grant,
                        'status': 'success',
                        'result': result
                    })
                except Exception as e:
                    # Record the failure and continue with the remaining grants
                    results.append({
                        'system': system,
                        'grant': grant,
                        'status': 'failed',
                        'error': str(e)
                    })
                    # Log error
                    logger.error(f"Failed to provision {grant} for {user}: {e}")
        return results

    def _execute_grant(self, adapter, user, grant):
        if grant.action == 'create_user':
            return adapter.create_user(user)
        elif grant.action == 'add_to_group':
            return adapter.add_to_group(user.id, grant.group_id)
        # ... handle other actions
Common Integration Challenges
Challenge 1: Eventual Consistency
Problem: Systems do not update immediately.
Example: Create user in Okta. Add user to Okta group. Okta provisions to Google via SCIM. Google creates the user 5-10 minutes later.
Solution: Poll for completion. Use webhooks when available. Implement retry with longer delays.
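Polling for completion with progressively longer delays can be sketched as a small helper. Everything here is an assumption for illustration: `wait_until` is a hypothetical name, the 15-minute default timeout is simply chosen to exceed the 5-10 minute delay in the example, and `check` stands in for whatever call confirms the downstream account exists.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 900.0,
               initial_delay: float = 5.0, max_delay: float = 60.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Call check() until it returns True or timeout seconds elapse."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # caller decides whether to alert or re-queue
        # Never sleep past the deadline; double the delay up to max_delay
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters exist so the helper can be exercised in tests without real waiting. In production the defaults apply.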
Challenge 2: Circular Dependencies
Problem: System A depends on System B, which depends on System A.
Example: Okta provisions to Google. Google group membership triggers an Okta group rule. The Okta group rule triggers Google provisioning. Infinite loop.
Solution: Detect cycles in the dependency graph. Break cycles with explicit ordering. Use idempotency to prevent duplication.
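Cycle detection and explicit ordering can both come from a topological sort of the dependency graph. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The `provisioning_order` helper and the example graph are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def provisioning_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return systems ordered so each comes after the systems it depends on.

    depends_on maps a system to the set of systems it requires first.
    Raises ValueError naming the cycle if the graph is circular.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        cycle = exc.args[1]  # list of nodes forming the detected cycle
        raise ValueError("Circular dependency: " + " -> ".join(cycle)) from exc


# Hypothetical graph: Google and Slack both require the Okta account first
order = provisioning_order({"okta": set(), "google": {"okta"}, "slack": {"okta"}})
```

Running this check when integration configuration changes, rather than at provisioning time, surfaces a loop before it can trigger repeated provisioning.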
Challenge 3: Partial Failures
Problem: Some systems succeed while others fail.
Example: Create user in Okta (success). Add to Google group (API timeout). Add to Slack channels (success).
Solution: Track provisioning state per system. Retry failed operations separately. Avoid rollback of successful operations—use forward recovery.
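Per-system state tracking with forward recovery can be sketched as follows. The `Status` enum and `run_pending` helper are hypothetical names, and a real implementation would persist the state rather than hold it in a dictionary.

```python
from enum import Enum
from typing import Callable, Dict


class Status(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"


def run_pending(state: Dict[str, Status],
                actions: Dict[str, Callable[[], None]]) -> Dict[str, Status]:
    """Run every action that has not succeeded yet and record the outcome."""
    for system, action in actions.items():
        if state.get(system) is Status.SUCCESS:
            continue  # forward recovery: never redo or roll back a success
        try:
            action()
            state[system] = Status.SUCCESS
        except Exception:
            state[system] = Status.FAILED  # picked up again on the next run
    return state
```

Calling `run_pending` a second time with the same state retries only the failed systems, which is safe when each action is idempotent.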
Partial failures create audit trail complexity. SOC 2 and SOX auditors expect to see: (1) evidence that failures are detected, (2) documented retry/remediation procedures, and (3) reconciliation reports showing eventual consistency. Build reconciliation reports that compare intended access (per policy) with actual access (per system) and flag discrepancies for investigation.
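A reconciliation report of this kind reduces to a set comparison per user. The sketch below assumes access can be represented as sets of entitlement strings. The `reconcile` helper and the `system:group` naming are illustrative choices, not a prescribed format.

```python
from typing import Dict, Set


def reconcile(intended: Dict[str, Set[str]],
              actual: Dict[str, Set[str]]) -> Dict[str, Dict[str, Set[str]]]:
    """Compare intended access (per policy) with actual access (per system).

    Returns only users with discrepancies:
      missing    = granted by policy but absent in the system
      unexpected = present in the system but not granted by policy
    """
    report: Dict[str, Dict[str, Set[str]]] = {}
    for user in sorted(set(intended) | set(actual)):
        want = intended.get(user, set())
        have = actual.get(user, set())
        missing, unexpected = want - have, have - want
        if missing or unexpected:
            report[user] = {"missing": missing, "unexpected": unexpected}
    return report


# Hypothetical data for illustration
report = reconcile(
    intended={"alice": {"okta:eng", "slack:eng"}, "bob": {"okta:sales"}},
    actual={"alice": {"okta:eng"}, "bob": {"okta:sales", "aws:admin"}},
)
```

Here Alice is missing `slack:eng` and Bob holds an unexpected `aws:admin`. Users whose access matches policy do not appear in the report at all.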
Challenge 4: Data Format Mismatches
Problem: Each system uses different data formats.
Example: HRIS stores “John Smith”. Okta expects first_name=“John”, last_name=“Smith”. Google expects givenName=“John”, familyName=“Smith”. Slack expects display_name=“John Smith”, real_name=“John Smith”.
Solution: Define a canonical data model (internal format). Build transformation functions per system. Handle edge cases (single-word names, hyphens, non-ASCII characters).
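A canonical model with per-system transformation functions might look like the sketch below. The `CanonicalUser` type and function names are hypothetical. The target field names follow the example above, and the name-splitting rule (first token is the given name, the remainder is the family name) is a stated assumption that will not suit every naming convention.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CanonicalUser:
    """Internal format that every adapter transforms from (hypothetical)."""
    full_name: str
    email: str


def split_name(full_name: str) -> Tuple[str, str]:
    """Split into (given, family); assumes the first token is the given name."""
    parts = full_name.split()
    if not parts:
        return "", ""
    if len(parts) == 1:
        return parts[0], ""  # single-word name: family name left empty
    return parts[0], " ".join(parts[1:])


def to_google_name(user: CanonicalUser) -> Dict[str, str]:
    given, family = split_name(user.full_name)
    return {"givenName": given, "familyName": family}


def to_slack_profile(user: CanonicalUser) -> Dict[str, str]:
    # Slack takes the full name as-is, so no splitting is needed
    return {"display_name": user.full_name, "real_name": user.full_name}
```

Keeping the split logic in one function means an edge case such as a single-word or multi-part name is fixed once rather than in every adapter. Non-ASCII characters pass through unchanged because the functions operate on Python strings.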
Challenge 5: API Versioning
Problem: APIs change, breaking integrations.
Example: Okta releases v2 API. Deprecates v1 API with 6 months notice. The existing integration uses v1.
Solution: Pin to stable API versions. Monitor deprecation notices. Test against new versions before migrating. Abstract API calls to simplify version swaps.
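Abstracting API calls behind a small version-specific interface keeps a version swap to a one-line configuration change. The sketch below is illustrative: the class names, the registry, and especially the `/api/v2` path are hypothetical and mirror the example scenario above rather than any real vendor endpoint.

```python
from typing import Dict, Protocol


class UserApi(Protocol):
    """What the adapter needs from any API version."""
    def user_path(self, user_id: str) -> str: ...


class UserApiV1:
    def user_path(self, user_id: str) -> str:
        return f"/api/v1/users/{user_id}"


class UserApiV2:
    # Hypothetical path for a future version; not a real endpoint
    def user_path(self, user_id: str) -> str:
        return f"/api/v2/users/{user_id}"


API_VERSIONS: Dict[str, UserApi] = {"v1": UserApiV1(), "v2": UserApiV2()}


def get_api(pinned_version: str) -> UserApi:
    """Return the implementation for the pinned version, failing loudly."""
    try:
        return API_VERSIONS[pinned_version]
    except KeyError:
        raise ValueError(f"Unsupported API version: {pinned_version}") from None
```

Adapter code calls `get_api(config_version).user_path(...)` and never builds URLs itself, so both versions can be exercised side by side in tests before the pinned version changes.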
The Integration Reality
Integration represents 70% of the work in access management automation.
The six patterns:
- Pull (polling)—Simple, delayed
- Push (webhooks)—Real-time, complex
- Batch (file exports)—Legacy-friendly
- Event stream (Kafka)—Scalable, requires infrastructure
- SCIM (standard)—Enterprise IdP standard
- API proxying—Full control, maintenance burden
Key components:
- Adapter interface (standard contract)
- Concrete adapters (one per system)
- Rate limiters (respect API limits)
- Retry logic (handle transient failures)
- Orchestrator (coordinate across systems)
Common challenges:
- Eventual consistency
- Circular dependencies
- Partial failures
- Data format mismatches
- API versioning
Implementation timeline:
- First 3 adapters: 3-4 weeks
- Each additional adapter: 1-2 weeks
- Orchestration layer: 2-3 weeks
- Testing and hardening: 2-4 weeks
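Total: 3-4 months for 10 system integrations when adapter work runs partly in parallel. Run strictly in sequence, the ranges above add up to 14-25 weeks (3-4 for the first three adapters, 7-14 for seven more, 2-3 for orchestration, and 2-4 for testing).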
Organizations underestimating integration complexity discover the difference between “access management in theory” and “access management in production.”
Next up: Security considerations in access management—threat models, attack vectors, and defense strategies.