Classification and Tagging

How data becomes policy-enforceable — tag schemas, classification automation, and the handoff from governance to access control.

Before a policy can enforce rules on a data object, the object has to be classified in a way the policy can reference. Classification is the layer that bridges data governance (what this data is) and access control (what may be done with it). Everything downstream — policies, key release, audit — depends on it.

The tag schema

Every tenant defines a tag schema. A tag schema is the list of classifications and metadata attributes that data objects can be tagged with, and the rules for what constitutes a valid value.

A typical schema might include:

Sensitivity tier: Public, Internal, Confidential, Restricted.
Regulatory scope: PII, PHI, PCI, ITAR, CUI, none.
Business domain: Finance, Engineering, Legal, Customer, HR.
Retention class: Ephemeral, Standard, Long-term, Indefinite hold.
Origin: Internal-produced, Customer-provided, Third-party-provided.

The schema is tenant-scoped. Different organizations can use completely different taxonomies, and coalitions can standardize on a shared schema without requiring it across the whole platform.
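As a minimal sketch of the idea, a schema can be modeled as a mapping from attribute names to allowed values, with a validation step that rejects anything outside it. The class name TagSchema, the attribute keys, and the single-value-per-attribute simplification are assumptions made for illustration, not Lattix's actual schema format.

```python
from dataclasses import dataclass


@dataclass
class TagSchema:
    """Hypothetical tenant-scoped schema: attribute name -> allowed values."""
    tenant: str
    attributes: dict[str, frozenset[str]]

    def validate(self, tags: dict[str, str]) -> list[str]:
        """Return a list of problems; an empty list means the tags are valid."""
        problems = []
        for name, value in tags.items():
            allowed = self.attributes.get(name)
            if allowed is None:
                problems.append(f"unknown attribute: {name}")
            elif value not in allowed:
                problems.append(f"invalid value for {name}: {value}")
        return problems


# Illustrative schema using two of the attributes listed above.
EXAMPLE_SCHEMA = TagSchema(
    tenant="example-tenant",  # hypothetical tenant name
    attributes={
        "sensitivity": frozenset({"Public", "Internal", "Confidential", "Restricted"}),
        "retention": frozenset({"Ephemeral", "Standard", "Long-term", "Indefinite hold"}),
    },
)

valid = EXAMPLE_SCHEMA.validate({"sensitivity": "Restricted"})
invalid = EXAMPLE_SCHEMA.validate({"sensitivity": "confidential-restricted"})
```

Here valid is an empty list, while invalid reports that "confidential-restricted" is not a value the schema defines.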

Why a schema matters

A policy that says "Confidential-Restricted objects can only be accessed by principals with clearance level 3 or higher" only works if Confidential-Restricted and clearance level 3 are well-defined terms. The schema is what gives those terms meaning.

Without a schema, classifications are free-form strings. Ad hoc tags such as "confidential", "confidential-restricted", and "restricted" turn every policy into a guessing game about author intent. With a schema, the policy author knows exactly which values the policy can reference, and the classifying author knows exactly what they are declaring.

How objects get tagged

Lattix supports three modes, which organizations almost always combine:

Author-assigned. The person producing the data object assigns the tags at creation time — through Passport, a Data Room upload form, or an integration at the source application. This is the most accurate mode when the author has the context to classify correctly.

Schema-inferred. When data flows in through a configured connector or integration, the source context itself establishes a default classification. Data from a specific HR application inherits an HR tag; data from a specific engineering repository inherits a corresponding domain tag.

Content-inferred. Automated classification scans a data object and infers a classification from what it finds; for example, detecting that a document contains PII and applying the corresponding tag. Lattix supports pluggable classification engines for this mode: a tenant can supply its own classifier or use one available from partners.

All three modes produce the same kind of tag, so a policy does not need to know which mode classified a given object. A tenant can also require that tags be present before an object is allowed to leave a classification zone.
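A sketch of how the modes might be combined, and how a "tags must be present" requirement could be checked. The precedence order (source defaults, then content-inferred, then author-assigned) is an assumption chosen for illustration; the source does not specify how Lattix resolves disagreements between modes, and the function names are hypothetical.

```python
def resolve_tags(source_defaults, content_inferred, author_assigned):
    """Merge tags from the three modes into one tag set.

    Assumed precedence: later layers overwrite earlier ones, so an
    author-assigned value wins over an inferred one.
    """
    merged = {}
    for layer in (source_defaults, content_inferred, author_assigned):
        merged.update(layer)
    return merged


def missing_required(tags, required):
    """Return the required attribute names that the object does not carry."""
    return sorted(set(required) - set(tags))


tags = resolve_tags(
    source_defaults={"domain": "HR"},            # schema-inferred from the connector
    content_inferred={"regulatory_scope": "PII"},  # from a classifier
    author_assigned={"sensitivity": "Confidential"},
)

# An object may leave the zone only if nothing required is missing.
blocked_by = missing_required(tags, {"sensitivity", "retention"})
```

In this example blocked_by contains "retention", so the object would be held until that tag is supplied.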

Tag propagation through derivation

When a classified object is used to produce a derived object, classifications propagate by default. A redaction of a Restricted document inherits the Restricted tag until explicitly downgraded by an authorized principal; an extract of a Confidential dataset inherits the Confidential tag. This is handled by the content addressing layer's lineage graph (see Content Addressing): because the platform knows the source CID, it can propagate the source's tags to the new object.

Downgrades are an explicit action, logged on the ledger as a tag change event attributable to the principal who made it.
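The propagation and downgrade behavior can be sketched with a small in-memory catalog. The Catalog class, its field names, and the shape of the ledger entry are hypothetical; the sketch also ignores envelope re-wrapping and models only the catalog's view of tags.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical stand-in for the lineage graph and ledger."""
    tags: dict[str, dict[str, str]] = field(default_factory=dict)  # CID -> tags
    parent: dict[str, str] = field(default_factory=dict)           # derived CID -> source CID
    ledger: list[dict] = field(default_factory=list)               # append-only event list

    def register(self, cid, tags):
        self.tags[cid] = dict(tags)

    def derive(self, source_cid, new_cid):
        """Record lineage and copy the source's tags to the derived object."""
        self.parent[new_cid] = source_cid
        self.tags[new_cid] = dict(self.tags[source_cid])

    def downgrade(self, cid, attribute, new_value, principal, authorized):
        """Change a tag explicitly and log who did it."""
        if principal not in authorized:
            raise PermissionError(f"{principal} may not downgrade {cid}")
        old_value = self.tags[cid][attribute]
        self.tags[cid][attribute] = new_value
        self.ledger.append({
            "event": "tag_change",
            "cid": cid,
            "attribute": attribute,
            "old": old_value,
            "new": new_value,
            "principal": principal,
        })


catalog = Catalog()
catalog.register("cid-source", {"sensitivity": "Restricted"})
catalog.derive("cid-source", "cid-redacted")  # redaction inherits Restricted
catalog.downgrade("cid-redacted", "sensitivity", "Internal",
                  principal="alice", authorized={"alice"})
```

After these calls the source object is still Restricted, the redacted copy is Internal, and the ledger holds one tag change event attributed to alice.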

The handoff to access control

Once an object is tagged, the policy evaluation described in Policies and ABAC has data attributes to reason about. A policy like "Restricted objects require clearance level 3 in the requester's identity claims and must be unwrapped within the approved network zone" becomes expressible and evaluable.
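The example policy above can be written as a small predicate over object tags, identity claims, and request context. This is an illustrative sketch only: the attribute names (sensitivity, clearance_level, network_zone) and the zone name are assumptions, and it is not the policy language described in Policies and ABAC.

```python
# Hypothetical set of zones in which Restricted objects may be unwrapped.
APPROVED_ZONES = frozenset({"corp-secure"})


def may_unwrap(object_tags, identity_claims, context):
    """Evaluate the single rule: Restricted needs clearance >= 3 and an approved zone."""
    if object_tags.get("sensitivity") != "Restricted":
        return True  # this rule only constrains Restricted objects
    has_clearance = identity_claims.get("clearance_level", 0) >= 3
    in_zone = context.get("network_zone") in APPROVED_ZONES
    return has_clearance and in_zone


allowed = may_unwrap(
    {"sensitivity": "Restricted"},
    {"clearance_level": 3},
    {"network_zone": "corp-secure"},
)
denied = may_unwrap(
    {"sensitivity": "Restricted"},
    {"clearance_level": 3},
    {"network_zone": "public-wifi"},
)
```

The rule is only evaluable because the object carries a sensitivity tag with a value the policy recognizes.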

Tags are cryptographically bound to the object: once an object is wrapped, the tag set in its envelope is signed and cannot be modified without re-wrapping. A re-wrap produces a new version of the object with its own CID. This prevents an adversary who has obtained an encrypted envelope from silently reclassifying it to relax the policy that governs its unwrap.
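To illustrate why a signed tag set resists silent reclassification, the sketch below signs a canonical encoding of the tags and verifies it later. It uses an HMAC from the Python standard library as a stand-in; the actual envelope format and signature scheme are not described here, so the key handling and encoding are assumptions.

```python
import hashlib
import hmac
import json


def canonical(tags):
    """Encode tags deterministically so the same tag set always signs the same bytes."""
    return json.dumps(tags, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_tags(tags, key):
    """Return a hex MAC over the canonical tag encoding (stand-in for a signature)."""
    return hmac.new(key, canonical(tags), hashlib.sha256).hexdigest()


def verify_tags(tags, signature, key):
    """Check the tags against the signature using a constant-time comparison."""
    return hmac.compare_digest(sign_tags(tags, key), signature)


key = b"demo-key-not-for-production"  # hypothetical key for the example
original = {"sensitivity": "Restricted", "regulatory_scope": "ITAR"}
signature = sign_tags(original, key)

# An adversary edits the tag set but cannot produce a matching signature.
tampered = dict(original, sensitivity="Public")
intact_ok = verify_tags(original, signature, key)
tampered_ok = verify_tags(tampered, signature, key)
```

Verification succeeds for the original tag set and fails for the tampered one, which is the property that forces any reclassification to go through a re-wrap.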

Tagging as a first-class governance activity

Because downstream correctness depends on it, classification is not treated as a secondary concern. The Mesh Dashboard exposes tagging activity as a reviewable dataset: unclassified objects, objects with tag drift (where content-inferred tags disagree with author-assigned tags), and aging objects where the original classifier no longer has authority to re-classify.
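Tag drift, as described above, is a disagreement between author-assigned and content-inferred values for the same attribute. A minimal sketch of detecting it follows; the function name and the tuple layout of the result are assumptions, not the Mesh Dashboard's data model.

```python
def find_drift(author_tags, inferred_tags):
    """Return {attribute: (author_value, inferred_value)} where the two disagree.

    Attributes present in only one of the two tag sets are not drift
    under this definition and are skipped.
    """
    shared = author_tags.keys() & inferred_tags.keys()
    return {
        name: (author_tags[name], inferred_tags[name])
        for name in shared
        if author_tags[name] != inferred_tags[name]
    }


drift = find_drift(
    author_tags={"sensitivity": "Internal", "domain": "HR"},
    inferred_tags={"sensitivity": "Confidential", "regulatory_scope": "PII"},
)
```

Here drift flags the sensitivity attribute, where the author declared Internal but the classifier inferred Confidential.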

Administrators can require that:

Specific source systems cannot produce untagged objects.
Specific tenants cannot share objects externally without a tag review step.
Specific tag combinations require a second reviewer before the object becomes shareable.

These controls are configured in the Mesh Dashboard — see Configuration → Tag Schema.
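As an illustration of the third control, the sketch below checks whether an object's tags match any combination that calls for a second reviewer. The rule representation and the example combination are hypothetical and do not reflect the dashboard's actual configuration format.

```python
# Hypothetical rule set: each entry is a tag combination that triggers a second review.
SECOND_REVIEW_COMBINATIONS = [
    {"sensitivity": "Restricted", "regulatory_scope": "ITAR"},
]


def needs_second_reviewer(tags, combinations=SECOND_REVIEW_COMBINATIONS):
    """Return True if the object's tags contain every pair in any listed combination."""
    return any(
        all(tags.get(name) == value for name, value in combo.items())
        for combo in combinations
    )


flagged = needs_second_reviewer(
    {"sensitivity": "Restricted", "regulatory_scope": "ITAR", "domain": "Engineering"}
)
not_flagged = needs_second_reviewer(
    {"sensitivity": "Restricted", "regulatory_scope": "PII"}
)
```

An object is flagged only when it matches every tag in a listed combination, so extra tags on the object do not prevent a match.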

Relationship to other concepts

Classifications are attributes that policies evaluate.
Tag changes and downgrades are recorded on the Immutable Ledger.
Tag propagation relies on the content addressing lineage graph.
Tag schemas are configured under Configuration → Tag Schema.