Migrating from CALM to AtoM is a well-trodden path — but it is not a simple one. The data model differences, the quality of your existing data, and the complexity of your authority records all affect how long it takes and how much manual work is involved. This guide walks through the process step by step, from getting your data out of CALM through to a verified, live AtoM instance.

This is a guide for archivists, not just IT staff. Understanding what is happening to your data at each stage is important — not because you need to run the scripts yourself, but because you need to be able to review the output and sign off that it is right.

Before you start: Confirm that your CALM licence allows data export, and that you either have administrator access or can request a full database export from Axiell. Request the export well before your licence expiry date, so that you do not find yourself needing one after the contract ends.

1 Audit your CALM data before anything else

The single most valuable thing you can do before attempting a migration is to understand exactly what is in your CALM database. This sounds obvious, but many migration projects stall because unexpected data issues surface mid-process. A pre-migration audit should establish:

  • How many records exist at each level: fonds, series, sub-series, file, item. Comparing these counts with the imported counts is your primary verification check later, and any mismatch needs explaining.
  • The state of your authority records — are creators maintained as authority records linked to descriptions, or are they embedded as free text in Scope and Content fields? For proper authority control and ISAAR(CPF)-style description, creator data should be represented as linked authority records rather than left as free text. If your creators are all free text in CALM, building them out will be a manual step.
  • Digital objects — are there linked digital objects in CALM? If so, where are they stored and what format? These need to be migrated separately and re-linked after import.
  • Custom fields: has your CALM installation been customised with local fields that do not have a direct ISAD(G) equivalent? These need decisions: does each one map to an existing AtoM field such as access conditions, or does its content move into scope and content or a note?
  • Data quality issues — inconsistent date formats, encoding problems in legacy records, broken hierarchy (descriptions at wrong levels, missing parent records), duplicate authority entries.

Run a record count report by level before you begin, and keep it. You will need it to verify the migration at the end.
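If CALM's own reporting is not convenient, the same count can be taken from an export file. The following is a minimal sketch, assuming a CSV export with a column named "Level"; the column names and the inline sample data are hypothetical and will differ in your export.

```python
import csv
from collections import Counter
from io import StringIO


def count_by_level(csv_file, level_column="Level"):
    """Return a Counter of records per level of description."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Normalise case and whitespace so "File " and "file" count together.
        level = (row.get(level_column) or "").strip().lower() or "(no level)"
        counts[level] += 1
    return counts


# Small inline sample standing in for a real export file (hypothetical data).
sample = StringIO(
    "RefNo,Level,Title\n"
    "ABC,Fonds,Example estate papers\n"
    "ABC/1,Series,Deeds\n"
    "ABC/1/1,File,Deeds of Home Farm\n"
    "ABC/1/2,file ,Deeds of Mill Farm\n"
    "ABC/2,,Record with no level\n"
)

audit_counts = count_by_level(sample)
for level, number in sorted(audit_counts.items()):
    print(f"{level}: {number}")
```

Records that land in "(no level)" are worth listing separately, since they usually point to the hierarchy problems described above.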

2 Export your data from CALM

CALM supports export in XML and CSV formats. For migration purposes, XML export is preferable — it preserves the hierarchical structure of your descriptions more reliably than flat CSV exports, and it includes relationships between records rather than just the field values.

Request the following exports from Axiell, or generate them yourself if you have administrator access:

  • All catalogue records (all levels — do not export level by level if you can avoid it, as this loses parent-child relationships)
  • All authority files separately — names (personal, corporate, family), places, subjects
  • All accession records
  • Any location/strong room management data you want to preserve

Known issue: Direct database exports from CALM can produce encoding problems with non-ASCII characters, particularly in older records that predate Unicode support. Welsh accented characters (such as ŵ and ŷ) and text stored in pre-Unicode Latin character sets are common culprits in UK county record offices. Flag this to whoever is handling the technical migration before you start.
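One common form of this problem is UTF-8 text that has been read as Windows-1252, which turns a single accented character into two odd-looking ones. The sketch below shows one way to flag such values for review. It is a heuristic, not a guarantee: the function names and sample records are hypothetical, and other kinds of encoding damage will need different checks.

```python
def repair_if_mojibake(text):
    """Return a repaired string if text looks like UTF-8 misread as cp1252, else None."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text holds characters cp1252 cannot represent (so it is
        # probably genuine Unicode), or the bytes are not valid UTF-8.
        return None
    return repaired if repaired != text else None


def flag_records(records):
    """Yield (reference, field, suggestion) for every suspicious field value."""
    for ref, fields in records.items():
        for field, value in fields.items():
            if "\ufffd" in value:
                # Replacement character: the original byte is already lost.
                yield (ref, field, None)
                continue
            suggestion = repair_if_mojibake(value)
            if suggestion is not None:
                yield (ref, field, suggestion)


# Hypothetical sample records, written with escapes so the bytes are unambiguous.
sample = {
    "ABC/1": {"Title": "Plan of T\u00c5\u00b7 Mawr"},  # damaged form of "T\u0177 Mawr"
    "ABC/2": {"Title": "Plan of T\u0177 Mawr"},        # correct Unicode
    "ABC/3": {"Title": "Caf\u00e9 accounts"},          # correct accented character
}

flagged = list(flag_records(sample))
for ref, field, suggestion in flagged:
    print(ref, field, suggestion)
```

The suggestions are best reviewed by an archivist rather than applied automatically, because a short value can occasionally match the pattern by coincidence.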

3 Map your CALM fields to ISAD(G) and AtoM

AtoM is built around the ICA standards — ISAD(G) for archival descriptions, ISAAR(CPF) for authority records. Most CALM fields map relatively directly, but the mapping requires decisions on some fields. The core mappings are:

CALM "Reference" → AtoM "Identifier / Reference code" (ISAD 3.1.1)
CALM "Title" → AtoM "Title" (ISAD 3.1.2)
CALM "Date" → AtoM "Date(s) of creation" (ISAD 3.1.3) — normalise formats
CALM "Extent" → AtoM "Extent and medium of the unit" (ISAD 3.1.5)
CALM "Scope & Content" → AtoM "Scope and content" (ISAD 3.3.1)
CALM "Arrangement" → AtoM "System of arrangement" (ISAD 3.3.4)
CALM "Access Conditions" → AtoM "Conditions governing access" (ISAD 3.4.1)
CALM "Copyright" → AtoM "Conditions governing reproduction" (ISAD 3.4.2)
CALM "Language" → AtoM "Language/scripts of material" (ISAD 3.4.3)
CALM "Related Material" → AtoM "Related units of description" (ISAD 3.5.3)
CALM "Publication Note" → AtoM "Publication note" (ISAD 3.5.4)
CALM "Creator" → AtoM authority record linked via ISAAR relationship
CALM "Custodial History" → AtoM "Archival history" (ISAD 3.2.3)
CALM "Acquisition" → AtoM "Immediate source of acquisition" (ISAD 3.2.4)

TNA publishes guidance on ISAD(G) field usage for UK archives. It is worth reviewing their current guidance on access conditions and physical description fields before finalising your mapping — these are the areas where local practice varies most.

Custom CALM fields that do not have a direct equivalent typically either become part of the scope and content text or go into AtoM's "General note" field (ISAD 3.6.1). Document every mapping decision before you start transforming data; you will need to refer back to it during verification.
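A mapping like this can be kept as a small lookup table that doubles as documentation. The sketch below is illustrative only: the CALM keys are the field labels used above (your export may use different internal names), and the AtoM column names are those of the ISAD CSV template as I understand it, so check them against the template for your AtoM version. The "Shelf Mark" field is a hypothetical local field.

```python
# CALM field label -> AtoM CSV column name (verify against your CSV template).
FIELD_MAP = {
    "Reference": "identifier",
    "Title": "title",
    "Date": "eventDates",
    "Extent": "extentAndMedium",
    "Scope & Content": "scopeAndContent",
    "Arrangement": "arrangement",
    "Access Conditions": "accessConditions",
    "Copyright": "reproductionConditions",
    "Language": "language",  # values may need converting to language codes
    "Related Material": "relatedUnitsOfDescription",
    "Publication Note": "publicationNote",
    "Creator": "eventActors",
    "Custodial History": "archivalHistory",
    "Acquisition": "acquisition",
}


def map_record(calm_record, note_column="generalNote"):
    """Translate one CALM record (a dict) into an AtoM-style row (a dict)."""
    atom_row = {}
    notes = []
    for field, value in calm_record.items():
        if not value:
            continue  # skip empty fields
        column = FIELD_MAP.get(field)
        if column:
            atom_row[column] = value
        else:
            # Unmapped local field: keep its label so nothing is silently lost.
            notes.append(f"{field}: {value}")
    if notes:
        atom_row[note_column] = "\n".join(notes)
    return atom_row


example = map_record({
    "Reference": "ABC/1",
    "Title": "Deeds",
    "Copyright": "Crown copyright",
    "Shelf Mark": "B12",
    "Extent": "",
})
print(example)
```

Sending unmapped fields to a note by default is a design choice that favours not losing data; fields that deserve a better home can then be moved into FIELD_MAP one decision at a time.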

4 Clean your data

Data cleaning is usually the most time-consuming step in a CALM migration, and the one most likely to be underestimated. The issues found most commonly in UK county record office CALM databases include:

  • Date format inconsistency — records from different eras or different cataloguers often use different date formats. AtoM expects standardised dates. You need a consistent format for the display date and a normalised date for sorting and filtering. "c.1850", "circa 1850", "ca 1850-1860", and "undated" all need handling differently.
  • Authority record duplicates — personal names with variant spellings, maiden names, initials vs full forenames, and different forms of corporate names all create duplicate entries. Deduplication before import is significantly easier than after.
  • Hierarchy problems — descriptions at the wrong level, series without parent fonds, items linked to the wrong file. These are usually legacy issues from data entry over many years.
  • Extent field inconsistency — "3 boxes", "3 archive boxes", "approx. 3 boxes and 1 folder", "see catalogue" — these all need standardising.
  • HTML or legacy encoding artefacts — older CALM records sometimes contain HTML entities, line-break characters, or encoding relics from early versions of the software.

The best tool for data cleaning at scale is OpenRefine — free, open-source, and widely used in UK archival and library data work. It allows you to cluster similar values, apply transformations across thousands of records, and track every change you make. For complex authority deduplication or bespoke field transformations, Python scripts can handle what OpenRefine cannot.
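As an illustration of the kind of Python script meant here, the sketch below derives normalised start and end years from a few of the display date forms listed above. It is deliberately narrow: it only understands single years and year ranges, treats a circa date as the stated year, and sends everything else to manual review. The status labels are hypothetical, and your own rules for circa dates may differ.

```python
import re

UNDATED = {"", "undated", "n.d.", "nd", "no date"}
CIRCA = re.compile(r"^(?:circa|ca\.?|c\.?)\s*")
YEAR_RANGE = re.compile(r"(\d{4})(?:\s*[-\u2013]\s*(\d{4}))?")


def normalise_date(display):
    """Return (start_year, end_year, status) for a display date string.

    status is one of "exact", "circa", "undated" or "review".
    """
    text = display.strip().lower()
    if text in UNDATED:
        return (None, None, "undated")
    # Remove a leading circa marker; the display date keeps it.
    stripped = CIRCA.sub("", text)
    status = "circa" if stripped != text else "exact"
    match = YEAR_RANGE.fullmatch(stripped)
    if not match:
        return (None, None, "review")  # anything unrecognised goes to a person
    start = match.group(1)
    end = match.group(2) or start
    if end < start:
        return (None, None, "review")  # reversed range, probably a typo
    return (start, end, status)


for value in ["c.1850", "circa 1850", "ca 1850-1860", "undated", "12 March 1850"]:
    print(repr(value), normalise_date(value))
```

Keeping the original display date untouched and writing the normalised values to separate columns means the cataloguer's wording is preserved while sorting and filtering still work.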

5 Prepare your AtoM import files

For structured migrations, AtoM's CSV import templates are the most commonly used route. The format is well-documented, but the key structural points are:

  • Hierarchy is established via legacyId and parentId columns — each record needs a unique identifier, and child records reference their parent's identifier. Get this right and your entire hierarchy imports correctly. Get it wrong and you will spend significant time fixing parent-child relationships after the fact.
  • Authority records are imported in separate CSV files — creators (personal names, corporate names, family names), subjects, and places each have their own import format. Import these before the description CSV, so that links can be established during description import.
  • Digital objects — if you have digital objects to link, their file paths or URLs are included in the description CSV. They need to be accessible to the AtoM instance at import time.
  • Run a test import on a staging instance first — always. Never run your first import against a production system.
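The legacyId and parentId columns described above can often be derived from the reference codes themselves. The sketch below assumes that references use "/" as a level separator, so that the parent of "ABC/1/2" is "ABC/1"; that is an assumption about your data, not a rule, and the field names "RefNo" and "Title" are hypothetical. It also places parent rows before their children, which is the safe ordering for a hierarchical import.

```python
def build_hierarchy_rows(records):
    """Turn records with slash-separated references into rows with legacyId/parentId."""
    cleaned = [{"ref": r["RefNo"].strip(), "title": r["Title"]} for r in records]
    # Fewer segments sort first, so every parent row precedes its children.
    cleaned.sort(key=lambda r: (r["ref"].count("/"), r["ref"]))
    ids = {r["ref"]: str(n) for n, r in enumerate(cleaned, start=1)}

    rows = []
    for r in cleaned:
        ref = r["ref"]
        parent_ref = ref.rsplit("/", 1)[0] if "/" in ref else ""
        if parent_ref and parent_ref not in ids:
            # A gap in the hierarchy: better to stop than import an orphan.
            raise ValueError(f"{ref}: parent {parent_ref} is not in the export")
        rows.append({
            "legacyId": ids[ref],
            "parentId": ids.get(parent_ref, ""),
            # Whether this should be the full reference or only the last
            # segment depends on how your AtoM instance builds reference codes.
            "identifier": ref,
            "title": r["title"],
        })
    return rows


rows = build_hierarchy_rows([
    {"RefNo": "ABC/1/1", "Title": "Deeds of Home Farm"},
    {"RefNo": "ABC", "Title": "Example estate papers"},
    {"RefNo": "ABC/1 ", "Title": "Deeds"},
])
for row in rows:
    print(row)
```

Raising an error on a missing parent is intentional: it surfaces broken hierarchy during preparation rather than after the import.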

6 Stage, verify, and go live

A staging environment is not optional for a CALM migration. You need to be able to review the imported data before it goes public. During review:

  • Record count check — the number of records at each level in AtoM should match your pre-migration audit count. Any discrepancy needs investigating before you proceed.
  • Hierarchy spot-check — navigate the collection tree for a representative sample of fonds and confirm that series, sub-series, and files are nested correctly.
  • Authority record check — confirm that creator authority records have linked correctly to the relevant descriptions, and that subject and place terms are appearing as expected.
  • Date display check — verify that date normalisation has worked correctly across a sample of records with different original formats.
  • Digital object check — if you have digital objects, confirm that they are displaying correctly and accessible from the relevant description records.
  • Public interface review — use AtoM's public interface as a researcher would. Search for known records. Browse by creator. Check that access restrictions are applying correctly to restricted records.
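The record count check is simple enough to script. This sketch compares two sets of per-level counts and reports only the levels that differ; how you obtain the imported counts from your staging instance is left open, and the figures shown are made up.

```python
def compare_counts(audit, imported):
    """Return {level: (audit_count, imported_count)} for each level that differs."""
    differences = {}
    for level in sorted(set(audit) | set(imported)):
        before, after = audit.get(level, 0), imported.get(level, 0)
        if before != after:
            differences[level] = (before, after)
    return differences


# Hypothetical figures for illustration only.
audit = {"fonds": 120, "series": 940, "file": 15300, "item": 48210}
imported = {"fonds": 120, "series": 938, "file": 15300, "item": 48210, "collection": 2}

differences = compare_counts(audit, imported)
for level, (before, after) in differences.items():
    print(f"{level}: audit {before}, imported {after}")
```

In this made-up example the two missing series reappear as "collection", which would suggest a level-mapping problem rather than lost records.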

Only after your archivist has signed off on the staging environment should you proceed to go-live. At go-live, the AtoM instance is made publicly accessible at its permanent URL, and the old CALM public catalogue (if any) can be redirected.

Common problems and how to handle them

The hierarchy import fails or produces a flat structure

The most common cause is inconsistent legacyId and parentId values — leading or trailing spaces, inconsistent formatting, or records where the parentId references an ID that does not exist in the import file. Validate all ID references before import.
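A validation pass along these lines can be run over the import rows before every test import. The sketch below checks for stray whitespace, duplicate legacyId values, and parentId values that do not refer to an earlier row; it assumes the rows have already been read into dictionaries, for example with csv.DictReader, and the sample rows are hypothetical.

```python
def validate_ids(rows):
    """Return a list of (row_number, message) for every ID problem found."""
    problems = []
    seen = set()
    # Row numbers start at 2 because row 1 of the CSV is the header.
    for number, row in enumerate(rows, start=2):
        legacy = row.get("legacyId") or ""
        parent = row.get("parentId") or ""
        if legacy != legacy.strip() or parent != parent.strip():
            problems.append((number, "leading or trailing whitespace in an ID"))
        legacy, parent = legacy.strip(), parent.strip()
        if not legacy:
            problems.append((number, "missing legacyId"))
        elif legacy in seen:
            problems.append((number, f"duplicate legacyId {legacy!r}"))
        if parent and parent not in seen:
            # Covers both a missing parent and a parent listed after its child.
            problems.append((number, f"parentId {parent!r} not found on an earlier row"))
        if legacy:
            seen.add(legacy)
    return problems


problems = validate_ids([
    {"legacyId": "1", "parentId": ""},
    {"legacyId": "2 ", "parentId": "1"},
    {"legacyId": "3", "parentId": "9"},
    {"legacyId": "1", "parentId": ""},
])
for number, message in problems:
    print(f"row {number}: {message}")
```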

Authority records are not linking

AtoM links descriptions to authority records by name. If the name in the description CSV does not exactly match the authorised form of name in the authority CSV (including spaces, punctuation, and capitalisation), the link will not form. Normalise names consistently across both files before import.
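Mismatches of this kind can be found before import by comparing the two lists of names. The sketch below reports every creator name in the description file that has no exact match in the authority file, and suggests a near match where one exists. The near-match key is loosely modelled on the fingerprint clustering method in OpenRefine and is an approximation; the names are invented.

```python
import re
import unicodedata


def fingerprint(name):
    """Build a loose comparison key: no case, accents, punctuation or word order."""
    ascii_text = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    tokens = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return " ".join(sorted(set(tokens)))


def unmatched_creators(description_names, authority_names):
    """Return {name: suggested authorised form or None} for names with no exact match."""
    authorised = set(authority_names)
    by_key = {}
    for form in authority_names:
        by_key.setdefault(fingerprint(form), form)
    report = {}
    for name in description_names:
        if name not in authorised:
            report[name] = by_key.get(fingerprint(name))
    return report


authority = ["Smith, John, 1820-1890", "Evans & Sons Ltd"]
descriptions = ["Smith, John, 1820-1890", "smith, John 1820-1890 ", "Jones, Mary"]

report = unmatched_creators(descriptions, authority)
for name, suggestion in report.items():
    print(repr(name), "->", suggestion)
```

A name with a suggestion is probably a formatting variant to correct in the description file; a name with no suggestion probably needs a new authority record.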

Date sorting is wrong

AtoM sorts by normalised date, not display date. If normalised dates are missing or inconsistently formatted, records will appear in unexpected positions in sorted views. AtoM expects dates in YYYY, YYYY-MM, or YYYY-MM-DD format in the normalised date field.
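The three formats named above are easy to check mechanically. This sketch validates a normalised date value against them and also rejects impossible calendar dates; it says nothing about whether the date is the right one for the record.

```python
import re
from datetime import date

NORMALISED = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def valid_normalised_date(value):
    """Return True if value is YYYY, YYYY-MM or YYYY-MM-DD and a real date."""
    match = NORMALISED.fullmatch(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 purely for the calendar check.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


for value in ["1850", "1850-06", "1850-06-31", "1850-6-1", "c.1850"]:
    print(value, valid_normalised_date(value))
```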

Performance is slow after import

AtoM uses Elasticsearch for full-text search. After a large import, you need to rebuild the search index. This is a command-line operation on the server (AtoM's search:populate task) and can take time for large datasets. It is normal and expected; it is not a problem with the imported data.

Need help with the migration? We manage CALM to AtoM migrations for UK archives, handling the export, transformation, cleaning, and import. See our county record office page or migration resources for more detail, or contact us directly.

About the author: Matthew Bruton is a qualified archivist and the founder of Archives Hosting UK. He has managed CALM data migration and AtoM deployments across the British Isles.