In the Kodexa platform, filtering documents effectively is key to processing only the data that is relevant. One powerful way to do this is with custom subscriptions defined in Spring Expression Language (SpEL). SpEL lets you specify complex conditions for selecting documents, so you can manage and process them according to your specific needs.
Understanding Spring Expression Language (SpEL)
Spring Expression Language (SpEL) is a powerful expression language that supports querying and manipulation of objects at runtime. It is a part of the Spring Framework and allows for dynamic evaluation of expressions.
For Kodexa users, SpEL can be used to create custom subscriptions that determine whether a document meets certain criteria.
Sample Expressions for Kodexa Subscriptions
Below are some examples of how you can use SpEL expressions to define custom subscriptions in Kodexa:
Specific Key and Value from the Metadata
containsKey('ShouldProcessXML') && ['ShouldProcessXML'].toLowerCase() != 'true'
Description: This expression selects documents whose metadata contains the key ShouldProcessXML but whose value is not true (the value is lower-cased before the comparison, so True and TRUE also count as true). It’s useful when you need to pick up documents that carry the flag but have not been set to true.
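The logic of this expression can be checked outside the platform. The Python sketch below mirrors it against a plain dictionary; the function name selects_unprocessed_xml and the sample metadata are hypothetical, and Kodexa evaluates the SpEL expression itself, not this code.

```python
def selects_unprocessed_xml(metadata: dict) -> bool:
    """Mirror of: containsKey('ShouldProcessXML') && ['ShouldProcessXML'].toLowerCase() != 'true'"""
    key = "ShouldProcessXML"
    # The key must be present, and its lower-cased value must differ from 'true'.
    return key in metadata and metadata[key].lower() != "true"


# Hypothetical metadata samples
print(selects_unprocessed_xml({"ShouldProcessXML": "false"}))  # True
print(selects_unprocessed_xml({"ShouldProcessXML": "TRUE"}))   # False
print(selects_unprocessed_xml({}))                             # False
```

Note that a document without the key is not selected: the containsKey check fails first, so the value is never read.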
Exclude Documents with Specific Mixins
!hasMixins('spatial')
Description: This expression filters out documents that have the mixin spatial. The ! operator negates the condition, meaning only documents without the spatial mixin will be included.
Include Documents with a Specific Label
hasLabel('PARSED')
Description: This expression is used to include documents that are labeled with PARSED. It’s ideal for processing documents that have been successfully parsed.
Complex Subscription with Multiple Labels
hasLabel('MODEL-APPLIED') && !hasLabel('DONT_PUBLISH')
Description: This expression includes documents that have the label MODEL-APPLIED but excludes those with the label DONT_PUBLISH. It’s useful when documents carrying MODEL-APPLIED should be processed unless they have also been flagged with DONT_PUBLISH.
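As a rough illustration of how the two label checks combine, the Python sketch below treats a document's labels as a set. The function name selects_publishable and the sample label sets are hypothetical; only the two label names come from the expression above.

```python
def selects_publishable(labels: set) -> bool:
    """Mirror of: hasLabel('MODEL-APPLIED') && !hasLabel('DONT_PUBLISH')"""
    # Require the first label and reject the document if the second is present.
    return "MODEL-APPLIED" in labels and "DONT_PUBLISH" not in labels


# Hypothetical label sets
print(selects_publishable({"MODEL-APPLIED"}))                  # True
print(selects_publishable({"MODEL-APPLIED", "DONT_PUBLISH"}))  # False
print(selects_publishable({"PARSED"}))                         # False
```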
Filtering Content Events:
// Accept content events with a specific label
type == 'content' and hasLabel('important')
// Accept content events with specific mixins
type == 'content' and hasMixins('mixin1', 'mixin2')
// Accept content events with specific file extensions
type == 'content' and hasExtensions('pdf', 'docx')
// Accept content events that are guidance documents
type == 'content' and isGuidance()
// Accept content events with a path matching a regex
type == 'content' and matchesPath('^/documents/.*\\.txt$')
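Path patterns are easy to get wrong, so it can help to try the regex against sample paths before saving the subscription. The Python sketch below does that with the standard re module. Two assumptions are made here: that matchesPath applies the regex to the document path, and that the doubled backslash in the expression above is string escaping from the host format (for example JSON or Java source), so the pattern that reaches the regex engine is ^/documents/.*\.txt$. The function matches_path and the sample paths are hypothetical.

```python
import re

# Assumed effective pattern once any host-format escaping of the backslash is resolved.
PATTERN = re.compile(r"^/documents/.*\.txt$")


def matches_path(path: str) -> bool:
    """Hypothetical stand-in for matchesPath(...): True when the regex matches the path."""
    return PATTERN.search(path) is not None


# Hypothetical sample paths
samples = [
    "/documents/notes.txt",
    "/documents/2023/q1/summary.txt",
    "/documents/notes.txt.bak",
    "/archive/documents/notes.txt",
    "/documents/notestxt",
]
for path in samples:
    print(path, matches_path(path))
```

Only the first two samples match: the ^ and $ anchors reject paths that start elsewhere or end with a different suffix, and the escaped dot requires a literal "." before txt.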
Filtering Document Family Events:
// Accept document family events with a specific label
type == 'documentFamily' and hasLabel('confidential')
// Accept document family events with specific mixins
type == 'documentFamily' and hasMixins('financial', 'report')
// Accept document family events with specific file extensions
type == 'documentFamily' and hasExtensions('xlsx', 'csv')
// Accept document family events that are guidance documents
type == 'documentFamily' and isGuidance()
// Accept document family events with a path matching a regex
type == 'documentFamily' and matchesPath('^/reports/2023/.*\\.pdf$')
Combining Multiple Conditions:
// Accept content or document family events with specific conditions
(type == 'content' and hasLabel('urgent')) or (type == 'documentFamily' and hasMixins('legal', 'contract'))
// Accept content events with multiple conditions
type == 'content' and hasLabel('important') and hasMixins('financial') and hasExtensions('pdf')
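When several clauses are joined with and, every clause must hold for the event to be accepted. The Python sketch below mirrors the last expression using a hypothetical Event class; its field names are assumptions made for illustration and are not Kodexa's event model. Calls with several arguments, such as hasMixins('legal', 'contract'), are deliberately not modeled, because the examples above do not say whether all or any of the listed values must be present.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for the event a subscription is evaluated against."""
    type: str
    labels: set = field(default_factory=set)
    mixins: set = field(default_factory=set)
    extension: str = ""


def accepts(event: Event) -> bool:
    # Mirror of:
    # type == 'content' and hasLabel('important') and hasMixins('financial') and hasExtensions('pdf')
    return (
        event.type == "content"
        and "important" in event.labels
        and "financial" in event.mixins
        and event.extension == "pdf"
    )


full_match = Event("content", {"important"}, {"financial"}, "pdf")
wrong_type = Event("documentFamily", {"important"}, {"financial"}, "pdf")
missing_label = Event("content", set(), {"financial"}, "pdf")

print(accepts(full_match), accepts(wrong_type), accepts(missing_label))
```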
Using External Data:
// Accept events based on external data
type == 'content' and hasLabel(#externalData['requiredLabel'])
// Accept events with a mix of event properties and external data
type == 'documentFamily' and hasMixins('report') and documentFamily.path.startsWith(#externalData['basePath'])
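In SpEL, a name prefixed with # refers to a variable supplied to the expression at evaluation time, so the same expression can behave differently depending on the external data it is given. The Python sketch below mirrors the second expression; the dictionary layout of the event, the function accepts_report and the sample values are hypothetical.

```python
def accepts_report(event: dict, external_data: dict) -> bool:
    # Mirror of:
    # type == 'documentFamily' and hasMixins('report')
    #   and documentFamily.path.startsWith(#externalData['basePath'])
    return (
        event["type"] == "documentFamily"
        and "report" in event["mixins"]
        and event["path"].startswith(external_data["basePath"])
    )


# Hypothetical event and external data
event = {"type": "documentFamily", "mixins": {"report"}, "path": "/reports/2023/q1.pdf"}

print(accepts_report(event, {"basePath": "/reports/"}))   # True
print(accepts_report(event, {"basePath": "/invoices/"}))  # False
```

The event is the same in both calls; only the external data changes, which is what lets one subscription expression be reused with different base paths.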
Filtering Assistant Events:
// Accept assistant events with a specific assistant ID
type == 'assistant' and assistant.id == 'specific-assistant-id'
// Accept assistant events with a specific event type
type == 'assistant' and eventType == 'specific-event-type'
Filtering Channel Events:
// Accept channel events with a specific channel ID
type == 'channel' and channel.id == 'specific-channel-id'
// Accept channel events with at least one message event
type == 'channel' and messageEvents.size() > 0
Filtering Workspace Events:
// Accept workspace events with a specific workspace ID
type == 'workspace' and workspace.id == 'specific-workspace-id'
// Accept workspace events with a specific update type
type == 'workspace' and workspaceUpdate.updateType == 'specific-update-type'
Conclusion
Using Spring Expression Language for custom subscriptions in Kodexa provides a powerful way to control document management based on dynamic and complex conditions. By leveraging SpEL, you can tailor your document handling processes to meet specific requirements and ensure efficient data processing.
For more details on SpEL and its capabilities, refer to the Spring Framework documentation.
If you have any questions or need assistance with creating or managing your SpEL-based subscriptions in Kodexa, feel free to reach out to our support team.
Happy document processing!