Managing Document Storage with Kodexa

Using the Kodexa SDK to work with Document Storage

Updated over a week ago

Kodexa offers several "Store" types for handling different kinds of data, including documents, structured data, and models. This guide walks you through accessing a document store within a demo organization on the Kodexa platform, uploading documents to it, and attaching metadata to those uploads.

Prerequisites

Before you begin, ensure you have:

- A Kodexa account.

- Created an organization within Kodexa.

- Initiated a new project within your organization.

Getting Started with KodexaClient

The first step involves obtaining a `KodexaClient` instance and accessing your organization and its associated stores. Here's how you can do it:

from kodexa.platform import KodexaClient

# Create a KodexaClient instance
client = KodexaClient()

# Find your organization by its slug
demo_organization = client.organizations.find_by_slug('kodexa-demo')

# Access a specific project by name
demo_project = demo_organization.projects.find_by_name('Document Demo')

# Get the document store by name
document_store = demo_project.document_stores.find_by_name('Processing')

# Print the name of the document store
print(document_store.name)

This snippet connects to the platform and resolves the document store that the rest of this guide works with.
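If any of the three lookups fails, the chained calls above can be hard to debug. The sketch below wraps them in a hypothetical helper, `find_document_store`, that uses only the calls shown in this guide. It assumes each lookup returns `None` when nothing matches; the SDK may raise an exception instead, in which case the checks are simply never reached.

```python
def find_document_store(client, org_slug, project_name, store_name):
    """Resolve a document store, failing with a clear message at the first missing piece.

    Assumption: each find_* call returns None when there is no match.
    """
    organization = client.organizations.find_by_slug(org_slug)
    if organization is None:
        raise LookupError(f"No organization found with slug {org_slug!r}")

    project = organization.projects.find_by_name(project_name)
    if project is None:
        raise LookupError(f"No project named {project_name!r} in {org_slug!r}")

    store = project.document_stores.find_by_name(store_name)
    if store is None:
        raise LookupError(f"No document store named {store_name!r} in {project_name!r}")

    return store
```

With a `KodexaClient` instance in hand, it could be called as `find_document_store(client, 'kodexa-demo', 'Document Demo', 'Processing')`.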

Uploading Documents

Once you have the document store, you can upload documents to it. For a PDF file, call the `upload_file` method, which returns a document family representing the uploaded document.

# Upload a PDF document
document_family = document_store.upload_file('sample.pdf')

The returned `document_family` object contains all relevant information about the uploaded file, including a unique identifier for future operations.
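To upload more than one file, the same call can be repeated in a loop. The following is a minimal sketch, assuming `upload_file` accepts a path string as in the example above; the helper name `upload_pdfs` and the folder layout are hypothetical, not part of the Kodexa SDK.

```python
from pathlib import Path


def upload_pdfs(document_store, folder):
    """Upload every PDF directly inside `folder`.

    Returns a dict mapping each file name to the document family
    returned by upload_file for that file.
    """
    families = {}
    # sorted() keeps the upload order predictable across runs
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        families[pdf_path.name] = document_store.upload_file(str(pdf_path))
    return families
```

Keeping the returned document families keyed by file name makes it easy to look up the identifier for a given upload later.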

Managing Uploaded Documents

Printing the document family shows the details recorded for the upload, including the unique identifier you will need for later document management tasks.

# Display the document family details
print(document_family)

Enhancing Uploads with Metadata

It's possible to delete an upload and re-upload it with additional metadata. For example, to attach JSON metadata to your document, you can incorporate it directly into the upload process:

# Upload a document with additional metadata
document_family = document_store.upload_file(
    'sample.pdf',
    additional_metadata={"document_type": "invoice", "document_date": "2020-01-01"},
)

The metadata is stored as a dictionary, so you can choose the keys and values attached to your document. Note that the dictionary must be flat: nested structures, such as a dictionary inside a dictionary, cannot be included.

# Access and print the metadata of the uploaded document
print(document_family.metadata)
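Because nested metadata is not supported, one workaround is to flatten a nested dictionary before passing it as `additional_metadata`. The sketch below is plain Python, not part of the Kodexa SDK; the helper name `flatten_metadata` and the dotted key convention are assumptions, and it has not been confirmed that Kodexa accepts dots in metadata keys.

```python
def flatten_metadata(metadata, parent_key="", separator="."):
    """Collapse nested dictionaries into a single-level dictionary.

    Nested keys are joined with `separator`, e.g. {"vendor": {"name": "Acme"}}
    becomes {"vendor.name": "Acme"}. Only dictionaries are flattened; other
    values, including lists, are copied through unchanged.
    """
    flat = {}
    for key, value in metadata.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            flat.update(flatten_metadata(value, full_key, separator))
        else:
            flat[full_key] = value
    return flat


nested = {"document_type": "invoice", "vendor": {"name": "Acme", "country": "US"}}
print(flatten_metadata(nested))
# {'document_type': 'invoice', 'vendor.name': 'Acme', 'vendor.country': 'US'}
```

If the separator causes problems, passing `separator="_"` yields keys such as `vendor_name` instead.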

By following these steps, you can upload documents to a document store in your Kodexa project and attach metadata that helps you organize them.
