HR AI Evidence Has a Workspace Problem

On May 4, Google added an AI control center to the Workspace Admin console. The product note was written for administrators, not HR leaders, but it described a problem HR will inherit quickly: Gemini and agentic solutions now need centralized visibility, governance, and auditing when they touch Gmail, Drive, Docs, Sheets, Slides, Meet, Calendar, Chat, and the Gemini app.

Two weeks earlier, OpenAI had moved shared agents into ChatGPT Business, Enterprise, Edu, and Teachers plans. The examples were ordinary office workflows: software requests, weekly metrics, lead outreach, vendor risk, Slack responses. None were branded as HR products. That is why they matter. A recruiter, manager, or HR business partner will not always begin an AI-assisted employment workflow inside Workday, Greenhouse, iCIMS, Oracle, SAP, or ServiceNow. They may begin it in ChatGPT, Slack, Gmail, Teams, Copilot, or Docs.

The evidence trail changes when that happens.

An AI-assisted hiring note can start with a Slack request, pull candidate context from an ATS, summarize interview feedback in a shared document, route an approval through ServiceNow, update a Workday task, and generate a finance-facing cost record for the agent run. Each system can keep a log. Each log can be accurate. The employer can still fail the later audit because no one can reconstruct the whole employment decision in one readable file.

This creates the new format war in HR AI. It is not about whether a vendor can export a CSV. It is about whether HR, legal, security, finance, auditors, vendors, and affected workers can read the same evidence record after work has moved across workspace agents, HCM systems, service workflows, model providers, identity layers, and billing meters.

A Hiring Case Now Starts in the Work Apps

Picture a talent acquisition manager on a Monday morning. A hiring manager asks in Slack why three applicants for a customer support role were not moved forward. The manager tags a shared agent that was built by the recruiting operations team. The agent reads the job intake, checks candidate notes, drafts a short explanation, asks for approval before posting an update, and files a ticket because one rejection reason conflicts with the company’s policy language.

The operational work looks clean. The evidence does not.

OpenAI’s workspace agents launch says agents can gather context from the right systems, follow team processes, ask for approval when needed, and continue work across tools. The same launch says the Compliance API gives admins visibility into agent configuration, updates, and runs, and that admins can suspend agents if needed. The help center adds details that matter for HR: agents can connect to Google Calendar, Google Drive, Slack, SharePoint, custom MCPs, skills, files, schedules, and Slack channels; app connections can use end-user accounts or agent-owned accounts; write actions can be configured to ask every time, never ask, or use custom approvals.

Those details turn a simple recruiting explanation into a multi-party record. A future reviewer may need to know which Slack user invoked the agent, which workspace agent version ran, which app connections it used, whether it acted through an end-user account or an agent-owned account, which candidate records it accessed, whether the write action required approval, who approved it, and whether the posted explanation matched the final ATS disposition.

The old HR evidence model assumed the system of record held the decisive event. The new model has to assume the decisive event may be assembled across work apps before it lands in the HR system.

The shift is deeper than a new admin dashboard. It moves evidence collection upstream, into the same tools where employees already coordinate work.

OpenAI Turns Team Agents Into Shared Operational Records

OpenAI’s April 22 launch is useful because it describes team agents as repeatable business workflows rather than personal assistants. The examples are not speculative. Software review enforces policy and opens IT tickets. Product feedback routing monitors Slack and support channels. Weekly reporting pulls data and writes the narrative. Lead outreach scores records and updates the CRM. A third-party risk agent screens vendors and produces structured reports.

Each of those patterns has an HR version. Recruiting operations can review candidate-source exceptions. Employee relations can route intake notes. HR shared services can answer policy questions and file cases. Workforce planning can turn headcount requests into data pulls and manager packets. Vendor management can review assessment providers, background-check vendors, and sourcing tools.

OpenAI also gives buyers a clue about what it considers evidence. After a team shares an agent, analytics show runs and unique users. Admins can control connected tools and actions by user group. The Compliance API exposes configuration, updates, and runs. Soon, according to the launch, admins will be able to view every agent built across the organization, including usage patterns and connected data sources.

The enterprise release notes add a second clue. OpenAI has been moving compliance exports toward a single logs platform with time-windowed JSONL files across multiple log categories. That kind of ingestion pattern is attractive to security teams because it fits the tools they already use. HR still has to connect the log event to the employment object.

For ChatGPT governance, this is a strong start. It is not a complete HR evidence schema.

An employment decision record needs more than agent usage. It needs a decision object. It needs a person affected by the output. It needs the job, requisition, policy, worker, case, pay code, shift, manager, region, and vendor context. It needs the source data used and the source data intentionally excluded. It needs the human review state, the approval state, the correction state, and the final disposition in the HR system of record.

OpenAI can log the agent. The HR stack has to log the employment consequence.
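A minimal Python sketch of that hand-off follows. The field names (`event_id`, `run_id`, `agent_id`, `timestamp`) and the `RUN_TO_CASE` lookup are assumptions made for illustration, not a documented export schema from OpenAI or any other vendor. The point is only that a log line needs a join key before it says anything about an employment outcome.

```python
import json

# Hypothetical JSONL export lines. Field names are assumptions for
# illustration, not a documented vendor schema.
SAMPLE_JSONL = """\
{"event_id": "e1", "run_id": "run-100", "agent_id": "recruit-ops", "timestamp": "2026-05-11T09:02:00+00:00"}
{"event_id": "e2", "run_id": "run-101", "agent_id": "recruit-ops", "timestamp": "2026-05-11T09:05:00+00:00"}
"""

# Hypothetical lookup kept on the HR side: which agent run touched which
# employment object (here, a requisition and a candidate).
RUN_TO_CASE = {
    "run-100": {"requisition_id": "REQ-42", "candidate_id": "CAND-7"},
}


def link_events(jsonl_text, run_to_case):
    """Split log events into those joined to an employment object and orphans."""
    linked, orphaned = [], []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the export
        event = json.loads(line)
        case = run_to_case.get(event["run_id"])
        if case is None:
            orphaned.append(event)  # the agent was logged, the consequence was not
        else:
            linked.append({**event, **case})
    return linked, orphaned


linked, orphaned = link_events(SAMPLE_JSONL, RUN_TO_CASE)
```

In this sketch the second event ends up in `orphaned`: the platform recorded the run, but nothing ties it to a candidate or requisition.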

This gap will shape buyer behavior. HR will not be able to treat ChatGPT agent analytics, Slack channel history, ATS audit logs, Workday task history, and ServiceNow case logs as separate artifacts forever. A candidate complaint or employee appeal will ask one question: what happened to me? The employer will have to answer with one record, not five admin screenshots.

Google Puts Workspace Access Under an Admin Lens

Google’s May 4 Workspace update makes the same point from the opposite direction. The AI control center gives administrators a centralized view of security and governance settings for generative AI and agent actions. Google says it brings more granular governance and auditing for Gemini and agentic solutions accessing Workspace data. It starts with usage for Gmail, Drive, Docs, Sheets, Slides, Meet, Calendar, Chat, and the Gemini app.

For HR, that list is almost a map of sensitive evidence leakage.

Interview feedback often sits in Docs. Compensation planning spreadsheets sit in Sheets. Manager calibration notes sit in email. Candidate scheduling lives in Calendar. Employee service questions happen in Chat. Hiring committee discussion happens in Meet notes. A Gemini or third-party agent that touches those surfaces can influence an employment decision before the ATS or HRIS receives a final update.

The control center answers one administrative question: who or what is using AI against Workspace data? HR needs the harder follow-up: which Workspace-derived signal entered an employment workflow, and how was it represented in the final decision record?

The difference matters because Workspace is where informal work becomes formal enough to matter but not formal enough to be clean. A recruiter may paste a candidate summary into a Google Doc. A manager may ask Gemini to condense interview notes. A team may discuss a promotion slate in Chat. A policy agent may draft a response from Drive documents. None of that looks like a final decision in isolation.

Later, it may become evidence.

Google Cloud’s Gemini Enterprise materials push in the same direction. Gemini Enterprise offers centralized oversight and management of Google-made, third-party, and organization-built agents, with agents built through the Agent Development Kit and governed in Gemini Enterprise. Google has also emphasized open standards such as A2A for agent interoperability. For HR buyers, openness is useful only if the resulting record has enough shared fields to survive outside Google’s console.

Procurement has to test a narrower point: whether a Gemini-derived hiring, payroll, or employee service action can be joined to the rest of the record when the workflow crosses a non-Google system.

Microsoft and ServiceNow Join the Agent Map

Microsoft made Agent 365 generally available on May 1. The company framed the problem as agent sprawl: agents with delegated access, agents with their own credentials, local agents, SaaS agents, cloud agents, and unmanaged shadow agents. Agent 365 is designed to observe, govern, and secure them whether they act on behalf of users or behind the scenes with their own permissions.

That matters for HR because employment workflows contain both types of agents. A manager-facing Copilot agent may act with delegated access when it summarizes a team member’s feedback. A payroll exception agent may act with its own credentials because it triages variance cases in the background. A recruiting operations agent may sit in a SaaS platform outside Microsoft while still touching Teams or Outlook.

ServiceNow then expanded the map. At Knowledge 2026, ServiceNow announced that AI Control Tower will integrate with Microsoft Agent 365, extending visibility and governance across Microsoft Agent 365’s agent environment. The company said administrators can review and approve ServiceNow AI specialists before submission to the Microsoft Agent 365 Marketplace, where Microsoft publishing and policy controls apply. Charles Lamanna, Microsoft’s executive vice president for Copilot, Agents, and Platform, framed the joint work around linking intelligence and action inside a secure foundation. ServiceNow’s product team framed it around unified control for multi-agent enterprises.

The integration does more than announce a partnership. It draws a record boundary.

If a ServiceNow HR workflow is invoked inside Microsoft 365, which system owns the evidence? Microsoft may know the agent identity, user context, and workplace surface. ServiceNow may know the workflow, approval chain, case state, SLA timer, and governed action. The HRIS may hold the employee or candidate record. The model provider may hold a tool trace or prompt history. The finance team may see the meter.

A buyer that signs the contract without specifying a shared evidence format will discover the issue after the first dispute.

ServiceNow’s Action Fabric press release shows why. ServiceNow says its MCP Server lets any AI agent access governed enterprise actions headlessly. Every action runs through AI Control Tower, where it is identity-verified, permission-scoped, and auditable. The MCP Server Console includes governance, consumption metering, managed OAuth, enterprise audit trails, session management, and role-based tool packages. The MCP Server spans IT, HR, customer service, security, risk and compliance, and app development. Bill McDermott, ServiceNow’s CEO, has been explicit that the company wants to connect models, clouds, data sources, workflows, and trust in one operating layer.

The record is rich. It is also platform-specific.

ServiceNow will understandably want its action layer to become the place where the work is proved. Microsoft will want Agent 365 to be the control plane for agent identity and enterprise governance. Workday will want HR and finance buyers to see ASOR as the record for human and digital work. OpenAI and Google will keep improving the agent admin layer inside the work apps. The HR buyer sits in the middle.

Workday Keeps the HR Object Close

Workday saw the problem early. Its Agent System of Record, announced in February 2025, was built around managing AI agents in one place. The company said ASOR would onboard new agents, define roles and responsibilities, track impact, budget and forecast costs, support compliance, and provide real-time operational visibility. Workday positioned agents as part of the workforce, alongside employees and contingent workers. Aneel Bhusri and Carl Eschenbach both tied the product to Workday’s long-running claim that it understands people, money, roles, and workforce processes.

For HR buyers, that framing has force. Workday already holds sensitive people and money data. It understands jobs, workers, requisitions, compensation, payroll, skills, managers, organizations, and business processes. A generic workspace log does not.

Workday’s September 2025 partner network expansion made the control problem larger. More than 15 Workday Ventures portfolio companies joined the Agent Partner Network, and Workday said the network had grown more than fourfold to over 50 partners since its June launch. Those agents connect to ASOR. Examples included employee self-service, workforce intelligence, skill-building, leadership coaching, AP invoices, and time tracking.

That partner map shows the likely shape of HR AI. It will not be one vendor. It will be Workday plus partner agents, ServiceNow workflows, Microsoft workplace surfaces, Google Workspace documents, OpenAI agents, ATS data, assessment outputs, background-check vendors, identity providers, and model providers.

Workday can describe the HR object. ServiceNow can describe the work action. Microsoft and Google can describe the workspace access. OpenAI can describe the shared agent run. Finance can describe the meter. Legal may later need the whole chain.

No single system has every field.
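The claim can be written down as a simple coverage check. In the Python sketch below, the field names and the system-to-field assignments are assumptions that paraphrase the paragraph above; real coverage depends on what each product actually exports.

```python
# Hypothetical field coverage per system. These assignments paraphrase the
# text and are not taken from any vendor's documentation.
REQUIRED_FIELDS = {
    "employment_object", "work_action", "workspace_access",
    "agent_run", "cost_meter",
}

SYSTEM_FIELDS = {
    "hcm": {"employment_object"},
    "service_workflow": {"work_action"},
    "workspace_admin": {"workspace_access"},
    "agent_platform": {"agent_run"},
    "billing": {"cost_meter"},
}


def missing_by_system(required, system_fields):
    """For each system, list the required fields it cannot supply alone."""
    return {name: sorted(required - fields) for name, fields in system_fields.items()}


def uncovered(required, system_fields):
    """Return required fields that no system supplies at all."""
    covered = set().union(*system_fields.values())
    return required - covered


gaps = missing_by_system(REQUIRED_FIELDS, SYSTEM_FIELDS)
complete_alone = [name for name, missing in gaps.items() if not missing]
```

Under these assumptions `complete_alone` is empty while `uncovered(...)` is also empty: every field exists somewhere, and no system holds all of them.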

This does not make Workday weaker. It makes Workday’s data model more important. If an evidence schema is going to work for employment decisions, it needs a stable way to represent worker, candidate, requisition, job, pay, case, skill, and manager context. Without those fields, an agent trace can prove that software acted, but not what employment outcome the action affected.

Colorado Resets the Regulatory Clock

Regulation is making the schema problem less optional. According to the Colorado General Assembly bill record, SB26-189 became a signed act dated May 14, after a 34-1 Senate concurrence and a 57-6 House third reading vote. The bill replaces the earlier Colorado AI Act framework with rules around automated decision-making technology and consequential decisions. The definition covers technology that processes personal data and generates predictions, recommendations, classifications, rankings, scores, or other outputs used to make or assist decisions concerning individuals. Consequential decisions include access, eligibility, or compensation related to employment and employment opportunities, along with other domains.

Colorado’s rewrite is narrower than the original law. The narrowness matters.

The regulatory ground is not stable. A compliance team that hard-codes its evidence record only to one statute will keep rebuilding. California already moved differently. The California Civil Rights Council’s employment ADS regulations were approved on June 27, 2025 and took effect on October 1, 2025. The CRD announcement says employers and covered entities must maintain employment records, including automated-decision data, for at least four years.
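One way to avoid hard-coding a single statute is to keep retention periods in a table and compute dates from it. The Python sketch below assumes the retention clock starts at record creation, which is a simplification for illustration and not a reading of the California regulation; the jurisdiction key `US-CA` and the function names are hypothetical.

```python
from datetime import date

# Retention periods in years, keyed by jurisdiction. The California value
# comes from the text above. Any further entry would need confirmation
# from counsel before use.
RETENTION_YEARS = {"US-CA": 4}


def add_years(d, years):
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_deletion(created, jurisdictions, table=RETENTION_YEARS):
    """Earliest date a record may be deleted under the longest applicable period.

    Assumes the clock starts at record creation. Raises KeyError for a
    jurisdiction missing from the table, so gaps fail loudly.
    """
    years = max(table[j] for j in jurisdictions)
    return add_years(created, years)


keep_until = earliest_deletion(date(2025, 10, 1), ["US-CA"])
```

When a statute changes, the table changes and the evidence record does not.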

EU rules add another direction. Employment and worker management uses are high-risk under the EU AI Act, and affected people can seek explanations for certain decisions. Even when an employer is operating mostly in the United States, global HR systems and multinational hiring processes make EU-style documentation expectations hard to ignore.

This is why evidence schema matters more than policy language. Laws will keep changing. Local definitions will differ. Vendor control planes will evolve. But the employer still needs to answer a repeatable set of questions:

Which person, candidate, worker, job, requisition, case, shift, pay item, or employment action was affected?
Which agent, model, tool, data source, identity, and workspace surface participated?
Which human approved, reviewed, corrected, or overrode the output?
Which record became final, which records were superseded, and which downstream systems received the correction?
Which logs are retained, for how long, under whose custody, and in what exportable format?

A statute can change one deadline or one duty. It rarely changes the underlying evidence map.

SHRM and iCIMS Show the Readiness Gap

The demand side is not ready for this complexity. SHRM’s 2026 State of AI in HR report surveyed 1,908 HR professionals and found that 39% had adopted AI in their HR functions. Legal and compliance primarily led AI governance and oversight in 37% of organizations. More than half of organizations, 52%, did not directly or collaboratively involve HR in overall AI strategy and vision. SHRM also reported that 56% of organizations using or planning AI did not formally measure AI investment success.

The same organization can have a governance gap and a measurement gap at once.

iCIMS and Aptitude Research added a recruiting-specific signal on April 30. Their report found that 69% of companies were using AI in talent acquisition, 46% were using or planning to use agentic AI, and 45% did not yet have a formal AI governance framework. The report also said 82% considered transparency and explainability important.

Those numbers explain why the format war will not stay inside product teams. HR teams are adopting AI before they know how to measure it. Recruiting teams are moving toward agentic workflows before governance is mature. Legal and compliance are often leading oversight while HR owns the operational context. Finance is beginning to ask how many systems one workflow billed. Security is asking which agents had access. Candidates and employees will ask why a decision happened.

Each function will request evidence in its own language.

HR will ask for a candidate or worker story. Legal will ask for retention, duty, notice, and explanation. Security will ask for identity, access, data movement, and prompt injection exposure. Finance will ask for cost attribution and meter reconciliation. Procurement will ask for vendor obligations and export rights. The vendor will provide whatever its platform can export. The employee or candidate will not care which platform produced which log.

The first companies to scale HR agents will be the companies that make those requests converge.

Schema Becomes a Procurement Term

The phrase “evidence schema” sounds technical. In a contract, it becomes practical.

The buyer should not merely ask whether a vendor has audit logs. It should ask for the minimum evidence manifest produced by any AI-assisted employment workflow. That manifest needs stable identifiers and enough context to be joined across systems. It also needs a dictionary, because a “run” in ChatGPT, an “action” in ServiceNow, an “agent” in Workday, a “message” in Copilot Studio, and a “workflow” in an ATS do not mean the same thing.
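The dictionary can be as plain as a lookup table. In the Python sketch below, the vendor terms are the ones quoted above, while the canonical names on the right are assumptions a buyer would define for itself; none of them come from vendor documentation.

```python
# Hypothetical mapping from (system, vendor term) to a shared vocabulary.
# The canonical names are illustrative choices, not vendor definitions.
TERM_DICTIONARY = {
    ("chatgpt", "run"): "agent_execution",
    ("servicenow", "action"): "governed_action",
    ("workday", "agent"): "registered_agent",
    ("copilot_studio", "message"): "metered_message",
    ("ats", "workflow"): "hiring_process_step",
}


def normalize(system, vendor_term):
    """Translate a vendor-specific term into the shared vocabulary.

    Raises KeyError for an unmapped term instead of guessing, so a new
    vendor concept has to be defined before it enters the manifest.
    """
    key = (system.lower(), vendor_term.lower())
    if key not in TERM_DICTIONARY:
        raise KeyError(f"no dictionary entry for {vendor_term!r} in {system!r}")
    return TERM_DICTIONARY[key]


canonical = normalize("ChatGPT", "Run")
```

Failing on an unknown term is deliberate here: a silent default would let two systems use the same word for different things.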

A useful HR AI evidence manifest would contain at least seven blocks.

First, the employment object: candidate, employee, contingent worker, manager, requisition, job, organization, case, shift, pay item, performance cycle, or service request.

Second, the agent object: agent name, owner, system, version, configuration state, role, permission scope, credential type, memory state, schedule, channel, and approval policy.

Third, the tool and data object: connected apps, MCP servers, files, records, source systems, retrieval results, excluded sources, write-action settings, and data classification.

Fourth, the decision object: proposed output, final output, human reviewer, approval, override, reason code, policy basis, timestamp, and final system of record update.

Fifth, the custody object: log location, export format, hash or signature, retention period, legal hold state, subprocessor chain, and post-termination support commitment.

Sixth, the correction object: superseded output, affected population, downstream recipients, correction timestamp, acknowledgement state, and unresolved exceptions.

Seventh, the cost object: agent run, action, message, token, connector, integration, audit export, retry, fallback model, and cost owner.
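The seven blocks can be sketched as a checked structure. The Python below is illustrative only: the block keys, the sample identifiers, and the rule that an empty block counts as missing are all assumptions, not an existing standard.

```python
# The seven blocks listed above, in order. Key names are hypothetical.
MANIFEST_BLOCKS = (
    "employment", "agent", "tool_and_data", "decision",
    "custody", "correction", "cost",
)


def missing_blocks(manifest):
    """Return block names that are absent or empty, in canonical order."""
    return [b for b in MANIFEST_BLOCKS if not manifest.get(b)]


# Sample manifest with invented identifiers.
example = {
    "employment": {"candidate_id": "CAND-7", "requisition_id": "REQ-42"},
    "agent": {"name": "recruit-ops", "version": "3", "credential_type": "agent-owned"},
    "tool_and_data": {"connected_apps": ["slack", "ats"], "excluded_sources": []},
    "decision": {"final_output": "not advanced", "human_reviewer": "u-19",
                 "reason_code": "R-04"},
    "custody": {"export_format": "jsonl", "retention_years": 4},
    "correction": {},  # left empty on purpose to show the check firing
    "cost": {"agent_runs": 1, "cost_owner": "TA-ops"},
}

problems = missing_blocks(example)
```

Treating an empty block as missing forces the exporter to state "no corrections" explicitly, for example with an `unresolved_exceptions` count of zero, so that silence is never read as a clean record.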

The last block may look odd in an evidence schema. It belongs there because HR AI evidence and HR AI billing are now moving through the same workflow trail. A finance team cannot audit a workflow invoice without knowing which action happened. A legal team cannot audit an employment decision without knowing which action happened. The same action record will feed both disputes.

This is where procurement has leverage. A buyer can require vendors to export evidence manifests in a documented schema, preserve field-level lineage, support third-party ingestion, and maintain schema compatibility after contract termination. Without those terms, “audit-ready” will mean “ready inside our console.”

That is not enough.
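One way to test whether a manifest survived export unchanged is to compare digests computed inside and outside the console. The Python sketch below hashes a sorted-key JSON serialization with SHA-256. It is a simplification: a production design would likely want a formal canonicalization scheme and a signature, and the function names and sample data are hypothetical.

```python
import hashlib
import json


def manifest_digest(manifest):
    """SHA-256 hex digest over a deterministic JSON serialization.

    Sorting keys and fixing separators makes the digest independent of
    key order. This is not a full canonicalization standard.
    """
    canonical = json.dumps(
        manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches(manifest, expected_digest):
    """True if the manifest still hashes to the digest recorded at export."""
    return manifest_digest(manifest) == expected_digest


# Invented sample data: the same record with keys in a different order,
# and a copy whose reason code was changed after export.
original = {"decision": {"reason_code": "R-04", "reviewer": "u-19"}, "case_id": "CASE-9"}
reordered = {"case_id": "CASE-9", "decision": {"reviewer": "u-19", "reason_code": "R-04"}}
tampered = {"case_id": "CASE-9", "decision": {"reviewer": "u-19", "reason_code": "R-05"}}

recorded = manifest_digest(original)
```

Here `matches(reordered, recorded)` holds and `matches(tampered, recorded)` does not, which is the property a third-party ingester needs.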

Finance and Legal Will Fight Over the Same Trace

The last article in this series argued that CFOs will want receipts for HR agent workflows. The evidence schema war turns that finance problem into a legal problem.

Take a candidate-screening workflow. An external agent drafts a shortlist. A recruiter edits the summary. The ATS updates the disposition. A manager receives a packet. A rejected candidate asks for an explanation. Finance sees charges from the workspace agent, the ATS AI add-on, a model provider, and an audit export. Security sees a tool call that used a shared connection. Legal sees a state-law notice issue.

There is no clean separation between invoice audit and decision audit. The same trace has to answer both.
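A small Python sketch shows one action list serving both audits. The records, identifiers, and costs are invented for illustration, loosely following the candidate-screening example above.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical action records for one case. All values are invented.
ACTIONS = [
    {"action_id": "a1", "case_id": "CASE-9", "system": "workspace_agent",
     "kind": "draft_shortlist", "cost": Decimal("0.40")},
    {"action_id": "a2", "case_id": "CASE-9", "system": "ats_ai_addon",
     "kind": "update_disposition", "cost": Decimal("0.10")},
    {"action_id": "a3", "case_id": "CASE-9", "system": "workspace_agent",
     "kind": "audit_export", "cost": Decimal("0.25")},
]


def cost_by_system(actions):
    """Finance view: total cost per billing system."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for action in actions:
        totals[action["system"]] += action["cost"]
    return dict(totals)


def decision_trail(actions, case_id):
    """Legal view: the actions for one case, in recorded order."""
    return [(a["action_id"], a["kind"]) for a in actions if a["case_id"] == case_id]


invoice_view = cost_by_system(ACTIONS)
legal_view = decision_trail(ACTIONS, "CASE-9")
```

Both views read the same `action_id` values, so a disputed invoice line and a disputed decision can be traced to the same underlying event.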

That will create internal tension. Finance will want a compact cost record. Legal will want a defensible evidence record. Security will want low-friction log ingestion. HR will want a human-readable narrative. Vendors will want to keep customers inside their dashboards. Each preference is rational. Together, they can produce a record nobody can use.

This is why a cross-functional evidence schema should be negotiated before scaled deployment, not after a dispute. The fields do not need to be perfect on day one. They need to be stable enough that teams can join records across systems and durable enough that a vendor switch does not destroy historical meaning.

The buyer should test this with a drill. Pick one completed workflow, such as a candidate rejection, payroll correction, leave escalation, onboarding exception, or promotion recommendation. Ask each system for its record. Give the bundle to HR, legal, security, finance, and procurement. Ask each team to answer its own question without a vendor engineer on the call.

If the room cannot reconstruct the run, the company does not have evidence. It has fragments.
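The drill can be scored mechanically before anyone meets. In the Python sketch below, the per-team field lists and the sample bundle are assumptions chosen for illustration; each organization would define its own.

```python
# Hypothetical minimum fields each team needs to answer its own question.
TEAM_NEEDS = {
    "hr": {"employment_object", "final_disposition"},
    "legal": {"retention_period", "human_reviewer", "notice_sent"},
    "security": {"agent_identity", "credential_type", "data_sources"},
    "finance": {"cost_owner", "metered_actions"},
    "procurement": {"export_format", "vendor"},
}


def run_drill(bundle, team_needs=TEAM_NEEDS):
    """Map each team to the fields it still lacks; an empty list means it can answer."""
    # A field counts as present unless it is None or an empty container or string.
    # An explicit False (for example, notice not sent) is still an answer.
    present = {k for k, v in bundle.items() if v not in (None, "", [], {})}
    return {team: sorted(needs - present) for team, needs in team_needs.items()}


# Invented bundle assembled from each system's export for one workflow.
bundle = {
    "employment_object": "CAND-7",
    "final_disposition": "not advanced",
    "retention_period": "4y",
    "human_reviewer": "u-19",
    "agent_identity": "recruit-ops",
    "credential_type": "agent-owned",
    "data_sources": ["ats", "slack"],
    "cost_owner": "TA-ops",
    "metered_actions": [],
    "export_format": "jsonl",
    "vendor": "example-vendor",
}

result = run_drill(bundle)
blocked_teams = [team for team, lacking in result.items() if lacking]
```

With this sample bundle, legal and finance are blocked, which is exactly the kind of gap the drill is meant to surface before a dispute does.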

Renewal Meetings Need a Replay File

The evidence format war will not be settled by a standards body first. It will be settled in renewal meetings, security reviews, litigation holds, regulator requests, and budget disputes.

Vendors will arrive with strong claims. OpenAI will show team agents, configuration visibility, runs, analytics, and Compliance API coverage. Google will show Workspace data controls and admin-level auditing. Microsoft will show a broad agent control plane. ServiceNow will show governed action, enterprise audit trails, session management, and metering through Action Fabric. Workday will show a people, money, and agents record rooted in HR and finance context. ATS and assessment vendors will show their own decision logs.

The buyer should ask for one thing: replay the case.

Replay the hiring decision from Slack request to ATS disposition. Replay the payroll correction from employee complaint to downstream update. Replay the employee service case from Chat question to policy answer to ServiceNow ticket to Workday task. Replay the promotion recommendation from manager notes to performance summary to human review to final record. Replay the invoice line from the same workflow.
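At its simplest, a replay is a time-ordered merge of events from every system, filtered to one case. The Python sketch below assumes each system can export events carrying a shared `case_id` and an ISO 8601 timestamp; both assumptions, along with the sample events, are illustrative.

```python
from datetime import datetime

# Hypothetical events exported by different systems, deliberately out of order.
EVENTS = [
    {"system": "ats", "case_id": "CASE-9", "at": "2026-05-11T09:20:00+00:00",
     "what": "disposition updated"},
    {"system": "slack", "case_id": "CASE-9", "at": "2026-05-11T09:00:00+00:00",
     "what": "manager asked why applicants were not advanced"},
    {"system": "workspace_agent", "case_id": "CASE-9", "at": "2026-05-11T09:02:00+00:00",
     "what": "agent drafted explanation"},
    {"system": "workspace_agent", "case_id": "CASE-9", "at": "2026-05-11T09:10:00+00:00",
     "what": "reviewer approved post"},
    {"system": "ats", "case_id": "CASE-3", "at": "2026-05-11T09:05:00+00:00",
     "what": "unrelated case"},
]


def replay(events, case_id):
    """Return one readable line per event for a case, ordered by timestamp."""
    selected = [e for e in events if e["case_id"] == case_id]
    selected.sort(key=lambda e: datetime.fromisoformat(e["at"]))
    return [f'{e["at"]} [{e["system"]}] {e["what"]}' for e in selected]


timeline = replay(EVENTS, "CASE-9")
```

The hard part in practice is not the sort. It is getting every system to emit the shared case identifier in the first place.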

The replay file is the product.

Not the dashboard. Not the policy deck. Not the model card. Not the vendor’s assurance that logs exist somewhere.

The replay file is the only artifact that lets HR explain the decision, legal defend the process, security verify the access path, finance reconcile the cost, procurement enforce the contract, and a new vendor import historical context after a switch. If it cannot leave the console, it is not a durable employment record. If it cannot join to the HR object, it is not HR evidence. If it cannot show cost, it will not satisfy finance. If it cannot show correction, it will fail the first appeal.

HR AI evidence now lives where work happens: work apps, service workflows, HCM records, recruiting systems, security tools, and billing meters. The companies that scale agents responsibly will not be the ones with the longest audit-log menu. They will be the ones that can hand over a readable, portable, cross-system replay when someone asks what happened.

This article analyzes the evidence schema conflict created by workspace agents, HR systems, agent control planes, and employment AI regulation. Published May 18, 2026.