Schematron
Schematron is a rule-based validation language with powerful capabilities and configurable error messages. It expresses rules, or constraints, in terms that domain experts or users can understand.
Further information about the language is available on schematron.com.
Schematron in a nutshell
Schematron defines constraints for your content using natural language that is meaningful to users.
It supports two kinds of constraints:
- Assertions – state a condition that the content must meet, and are flagged when the condition fails.
- Reports – let you diagnose and return useful information about the content.
Related assertions and reports can be grouped together in a pattern, which operates as follows:
- It finds a specific context (for example, a heading).
- It checks that context against an expression (for example, the text contains fewer than 100 characters).
- It reports which assertions have failed and which reports have succeeded alongside some diagnostic information.
The Schematron result is a list of failed assertions and successful reports.
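The three steps above can be modelled in a few lines of Python. This is only a sketch of the logic, not Schematron syntax: the sample document, its element names and the messages are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
LONG_TEXT = ("A very long heading " * 6).strip()  # 119 characters
DOC = f"""<document>
  <heading>Short heading</heading>
  <heading>{LONG_TEXT}</heading>
  <heading></heading>
</document>"""


def validate(root):
    """Return (failed_assertions, successful_reports) for one pattern."""
    failed, reported = [], []
    # Step 1: find the context (every heading element).
    for index, heading in enumerate(root.iter("heading"), start=1):
        text = (heading.text or "").strip()
        # Step 2: an assertion states what must be true and is listed when it fails.
        if not len(text) < 100:
            failed.append(f"heading {index}: must contain fewer than 100 characters")
        # A report describes something worth flagging and is listed when it is true.
        if text == "":
            reported.append(f"heading {index}: is empty")
    # Step 3: the result is the failed assertions plus the successful reports.
    return failed, reported


failed, reported = validate(ET.fromstring(DOC))
print(failed)    # failed assertions
print(reported)  # successful reports
```

Note the opposite polarity: the assertion is listed because its condition was false for the second heading, while the report is listed because its condition was true for the third.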
Example
Your style guide requires that all your headings use sentence-style capitalization. Conformance means you capitalize the first word but use lower case for all other words, except for proper nouns.
Your Schematron schema must define a rule whose context is “heading” and that includes an assertion that the text matches sentence-style capitalization.
When Schematron processes your document, it does the following:
- Finds all the headings in your document.
- Checks that each one matches the correct capitalization style.
- Returns a list of all the headings that didn’t match as failed assertions.
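As a rough illustration of this example, the following Python sketch finds the headings, tests each one and collects those that fail. It models the logic only; the sample headings, the `PROPER_NOUNS` list and the `is_sentence_style` helper are hypothetical, and a real schema would express the test as a Schematron assertion.

```python
import xml.etree.ElementTree as ET

# Assumption: proper nouns are listed explicitly; a real style guide supplies its own list.
PROPER_NOUNS = {"PageSeeder", "Schematron"}

DOC = """<document>
  <heading>Using Schematron in PageSeeder</heading>
  <heading>Getting Started With Validation</heading>
  <heading>result format</heading>
</document>"""


def is_sentence_style(text):
    """True if only the first word and listed proper nouns are capitalized."""
    words = text.split()
    if not words:
        return True
    first, rest = words[0], words[1:]
    # The first word must start with a capital when it starts with a letter.
    if first[0].isalpha() and not first[0].isupper():
        return False
    for word in rest:
        bare = word.strip(".,:;!?()\"'")
        if bare in PROPER_NOUNS:
            continue
        if bare != bare.lower():
            return False
    return True


root = ET.fromstring(DOC)
# Context: every heading. Assertion: the text uses sentence-style capitalization.
failed_headings = [h.text for h in root.iter("heading")
                   if not is_sentence_style(h.text or "")]
print(failed_headings)
```

The first heading passes because its capitalized words are either the first word or listed proper nouns; the other two are returned as failed assertions.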
Schematron in PageSeeder
You can use Schematron to validate any document or URL in PageSeeder.
- For PSML documents, Schematron validates the full content, edit notes and reverse references.
- For non-PSML documents, such as images, PDF and Office files, and URLs, Schematron validates the metadata PSML. This includes the metadata properties, edit notes and reverse references.
When viewing or editing a document:
- To validate the current document – the document validation panel is available from the right sidebar when you click the icon.
- To validate the documents in a folder (or only the documents that are in the current publication) – the validation report panel is available from the left sidebar when you click the icon.
You can also validate files as a batch action from the group search page and the documents page.
PSML schema and Schematron
PageSeeder uses Schematron in conjunction with the PSML schema to validate documents.
The PSML schema uses a grammar-based language to define constraints that are common to all PSML documents and enforced by PageSeeder. A document that doesn’t validate against the PSML schema is not considered to be PSML.
Schematron complements the PSML schema to check for system-specific constraints.
You can associate multiple Schematron schemas with each document type, media type or URL type. When viewing a document or URL, PageSeeder automatically validates your content using the default schema.
Result format
After processing a document, Schematron returns a list of failed assertions and successful reports. These are presented as a red cross or a green tick in PageSeeder by default.
To provide more nuanced results, the assertions and reports in your schema can be defined as:
- errors (for failed assertions),
- warnings,
- infos (for reports only),
- and tips (for reports only).
These levels can also be used to filter the results.
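To make the levels concrete, here is a small Python model of a result list and a level filter. The `Result` class, the sample messages and the `ALLOWED` table are assumptions for illustration; in particular, allowing warnings on both assertions and reports is one reading of the list above, and this is not how PageSeeder stores results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    kind: str     # "assert" (failed assertion) or "report" (successful report)
    level: str    # "error", "warning", "info" or "tip"
    message: str


# Which kinds may carry which level (assumption: warnings apply to both kinds).
ALLOWED = {
    "error": {"assert"},
    "warning": {"assert", "report"},
    "info": {"report"},
    "tip": {"report"},
}

# Hypothetical results from one validation run.
RESULTS = [
    Result("assert", "error", "Heading exceeds 100 characters"),
    Result("assert", "warning", "Heading is not in sentence-style capitalization"),
    Result("report", "info", "Document contains 3 images"),
    Result("report", "tip", "Consider adding a summary"),
]


def is_allowed(result):
    """True if the level is permitted for this kind of result."""
    return result.kind in ALLOWED.get(result.level, set())


def filter_by_level(results, *levels):
    """Keep only the results whose level is one of the requested levels."""
    return [r for r in results if r.level in levels]


problems = filter_by_level(RESULTS, "error", "warning")
print([r.message for r in problems])
```

Filtering on `"error"` and `"warning"` leaves the two items an author most likely needs to act on, while infos and tips stay available on request.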
Diagnostic information
Assertions and reports can be associated with diagnostic information. The most common diagnostic hint is the fragment where the failed assertion or successful report occurred. PageSeeder uses this hint to take you to that fragment when you click the icon in the validation results.
IDs for notes and filtering
If an assertion is given an ID in the schema, you can filter the results that match that ID by clicking the icon. You can also add an edit note with that ID as a label by clicking the Add note button.
Example
If you have a list of terms to avoid, but you want to allow some flexibility in case the term is unavoidable, you could use the edit note to indicate that the term is allowed in that particular situation.
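One way to picture this workflow is the Python sketch below, which separates findings that already have a matching edit note from those that still need attention. Everything here is hypothetical: the `Finding` and `EditNote` classes, the `avoid-term` ID and the assumption that a note is matched by fragment and label. PageSeeder's own handling may differ.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    assertion_id: str  # ID given to the assertion in the schema
    fragment: str      # fragment where the assertion failed
    message: str


@dataclass
class EditNote:
    fragment: str      # fragment the note is attached to (assumption)
    labels: set        # labels on the note, which may include an assertion ID


def split_by_notes(findings, notes):
    """Return (acknowledged, still_open) findings based on matching note labels."""
    acknowledged, still_open = [], []
    for finding in findings:
        noted = any(
            note.fragment == finding.fragment and finding.assertion_id in note.labels
            for note in notes
        )
        (acknowledged if noted else still_open).append(finding)
    return acknowledged, still_open


# Hypothetical data: the same assertion failed in two fragments, one has a note.
findings = [
    Finding("avoid-term", "2", "Avoid the term 'simply'"),
    Finding("avoid-term", "5", "Avoid the term 'simply'"),
]
notes = [EditNote("2", {"avoid-term"})]

acknowledged, still_open = split_by_notes(findings, notes)
print([f.fragment for f in acknowledged])
print([f.fragment for f in still_open])
```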
Using data for validation
Schematron can use data defined in other documents or PageSeeder search results as part of its validation process.
Example
You can define a controlled vocabulary in a PageSeeder document, and Schematron can check the terms in other documents against it. This flexible approach lets users manage the vocabulary, while authors modify their content so that it conforms to that vocabulary.
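The idea can be sketched in Python as a lookup of one document's terms in another document's list. The two sample documents and their `vocabulary` and `term` elements are hypothetical and are not PSML; the case-insensitive comparison is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary document maintained by users.
VOCABULARY = """<vocabulary>
  <term>fragment</term>
  <term>publication</term>
  <term>edit note</term>
</vocabulary>"""

# Hypothetical content document written by an author.
CONTENT = """<document>
  <para>Add an <term>edit note</term> to the <term>fragment</term>.</para>
  <para>Each <term>chunk</term> belongs to a <term>Publication</term>.</para>
</document>"""


def load_terms(xml_text):
    """Read the approved terms, lower-cased for case-insensitive matching."""
    root = ET.fromstring(xml_text)
    return {t.text.strip().lower() for t in root.iter("term") if t.text}


def unknown_terms(content_xml, approved):
    """Return the terms used in the content that are not in the vocabulary."""
    root = ET.fromstring(content_xml)
    return [
        t.text.strip()
        for t in root.iter("term")
        if t.text and t.text.strip().lower() not in approved
    ]


approved = load_terms(VOCABULARY)
print(unknown_terms(CONTENT, approved))
```

Because the vocabulary lives in its own document, adding `chunk` to it would clear the finding without any change to the validation logic.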
Quick fix
Assertions can be associated with a “quick fix”. A quick fix is a transformation that has been preconfigured by a developer to fix a particular issue in your content. By using Schematron to identify an issue and bind it to a quick fix, you can streamline the process of addressing content issues.
When a quick fix is associated with an assertion, the Quickfix button is displayed. It opens the quick fix dialog so that you can preview the effect of the quick fix and decide whether to apply it.
Example
You have a rule that requires your headings to use sentence-style capitalization (where only the first word is capitalized). Schematron can report which headings don’t follow the capitalization style and quick fix can automatically apply that style.
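The transformation behind such a quick fix could look like the Python sketch below, which also builds a before and after preview. The `to_sentence_style` function and the `PROPER_NOUNS` list are hypothetical; a developer would configure the real transformation for PageSeeder, and this simplified version does not handle proper nouns followed by punctuation.

```python
# Assumption: proper nouns are listed explicitly and must keep their capitalization.
PROPER_NOUNS = {"PageSeeder", "Schematron"}


def to_sentence_style(text):
    """Capitalize the first word, lower-case the rest, keep listed proper nouns."""
    fixed = []
    for position, word in enumerate(text.split()):
        if word in PROPER_NOUNS:
            fixed.append(word)
        elif position == 0:
            fixed.append(word[:1].upper() + word[1:].lower())
        else:
            fixed.append(word.lower())
    return " ".join(fixed)


def preview(headings):
    """Pair each heading with its proposed fix, skipping headings that would not change."""
    return [(h, to_sentence_style(h)) for h in headings if to_sentence_style(h) != h]


# Hypothetical headings flagged by validation.
changes = preview([
    "Getting Started With Schematron",
    "result format",
    "Using data for validation",
])
for before, after in changes:
    print(f"{before!r} -> {after!r}")
```

Returning the pairs rather than rewriting the text directly mirrors the preview step: nothing changes until the proposed fix has been reviewed.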
The Validate all button, available from the template configuration page, uses Schematron to check whether the document type configuration files are valid.