Jumping in with both feet
Now that CA is installed, it’s time to get it working. I am going to take an ArtworkArchive account holding over 400 pieces, export everything, and import it all into CA.
Sounds easy, no? Let’s find out…
Process Overview
Importing takes two coordinated steps, one for the data (metadata) and one for the media files. First, prepare your metadata and a mapping spreadsheet, then import your records (objects, entities, etc.) via the “Import → Data” interface. Second, upload the actual media files (images, PDFs, etc.) into the designated import folder (set in your app.conf, typically something like <ca_base_dir>/import) and use the “Import → Media” interface to batch attach them to the records.
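As a rough sketch of the second step, the helper below copies already-downloaded files into an import folder. The function name `stage_media` and both example paths are hypothetical; the real import directory is whatever your app.conf points to.

```shell
#!/bin/bash
# Hypothetical helper: copy every regular file from a source folder
# into the CollectiveAccess import folder.
# Both folder paths are assumptions; check app.conf for the real one.
stage_media() {
    local src_dir="$1"
    local import_dir="$2"
    local count=0
    local file

    mkdir -p "$import_dir"
    for file in "$src_dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty folder.
        [ -f "$file" ] || continue
        cp "$file" "$import_dir"/ && count=$((count + 1))
    done
    echo "Staged $count file(s) into $import_dir"
}
```

A call might look like `stage_media downloads /path/to/ca/import`, after which the files should be visible to the “Import → Media” interface.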
Step One > Exporting from AA
For the purposes of this task, the export tool on the AA ‘pieces’ page is the ticket!
It can save out all data-points for every piece. This even includes URLs for all images that belong to the various pieces.
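With that many data points in one export, it helps to see which column holds what before doing anything else. Here is a minimal sketch that numbers the header columns of the exported CSV. The function name `list_columns` is made up for illustration, and it assumes that no header name contains a comma.

```shell
#!/bin/bash
# Hypothetical helper: print each header column of a CSV with its position.
# Assumes header names contain no commas; strips a trailing carriage return
# in case the export uses Windows line endings.
list_columns() {
    local csv_file="$1"
    awk -F',' 'NR == 1 {
        sub(/\r$/, "")
        for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i
        exit
    }' "$csv_file"
}
```

Running `list_columns data.csv` shows at a glance which column numbers hold the image URLs, which is handy when trimming the export down to just the columns you need.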
Step Two > Corralling the media
More details… I wrote a bash script that downloads the images, naming each file by combining the creation date and item name.
NOTE: the script and initial downloads are stored in code-server on unraid
#!/bin/bash
# Define the CSV file name. Adjust this if your CSV file has a different name or path.
CSV_FILE="data.csv"

# Create a directory for downloads if it doesn't already exist.
DOWNLOAD_DIR="downloads"
mkdir -p "$DOWNLOAD_DIR"

# Function to sanitize filenames:
# - Replaces spaces with underscores.
# - Removes any characters that are not alphanumeric, dots, underscores, or hyphens.
sanitize_filename() {
    local filename="$1"
    # Replace spaces with underscores.
    filename="${filename// /_}"
    # Remove any character that is not allowed in a filename.
    filename="$(echo "$filename" | sed 's/[^A-Za-z0-9._-]//g')"
    echo "$filename"
}

# Function to extract a file extension from a URL.
# If no extension is detected, it defaults to 'jpg'.
get_extension() {
    local url="$1"
    # Remove any query string and get the base name.
    local base
    base=$(basename "${url%%\?*}")
    # Extract the extension (the substring after the last dot).
    local ext="${base##*.}"
    # If the extension is identical to the base, then no extension was found.
    if [[ "$ext" == "$base" ]]; then
        echo "jpg"
    else
        echo "$ext"
    fi
}

# Function to download an image from a given URL.
# Uses base_filename, which the main loop sets for each CSV row.
# Parameters:
# $1 - URL to download.
# $2 - Optional suffix for additional images (e.g., "1" for Additional Image 1).
download_image() {
    local url="$1"
    local suffix="$2"
    local ext filename
    # If the URL is empty, skip this image.
    if [ -z "$url" ]; then
        return 0
    fi
    # Determine the file extension from the URL.
    ext=$(get_extension "$url")
    # Construct the filename: primary image uses no suffix; additional images get a numeric suffix.
    if [ -z "$suffix" ]; then
        filename="${base_filename}.${ext}"
    else
        filename="${base_filename}_${suffix}.${ext}"
    fi
    # Download the image using curl and save it in the downloads directory.
    # -f makes curl fail on HTTP errors (otherwise a 404 page would be saved as the image),
    # -sS hides the progress meter but still prints errors, -L follows redirects.
    echo "Downloading $url to ${DOWNLOAD_DIR}/${filename}"
    if curl -fsSL -o "${DOWNLOAD_DIR}/${filename}" "$url"; then
        echo "Downloaded: ${filename}"
    else
        echo "Error downloading $url"
    fi
}

# Skip the header row and process the CSV file line by line.
# "tail -n +2" skips the first header line.
# "tr -d '\r'" strips carriage returns in case the file has Windows line endings.
# IFS=, sets the delimiter to a comma.
# NOTE: this simple split assumes no field is quoted or contains a comma.
tail -n +2 "$CSV_FILE" | tr -d '\r' | while IFS=, read -r piece_id date_added name primary_url add1_url add2_url add3_url add4_url; do
    # Sanitize the date and name to create a safe base filename.
    sanitized_date=$(sanitize_filename "$date_added")
    sanitized_name=$(sanitize_filename "$name")
    base_filename="${sanitized_date}_${sanitized_name}"

    # Download the primary image (no suffix) and additional images with numeric suffixes.
    download_image "$primary_url" ""
    download_image "$add1_url" "1"
    download_image "$add2_url" "2"
    download_image "$add3_url" "3"
    download_image "$add4_url" "4"
done
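Because the files are named by date and item name rather than by piece ID, it can be useful to keep a record of which base filename belongs to which piece. The sketch below is not part of the original script. It repeats the same sanitizing rule and prints a small manifest, assuming the same eight-column CSV layout and, again, no quoted fields or embedded commas. The name `build_manifest` is made up for illustration.

```shell
#!/bin/bash
# Same sanitizing rule as the download script above:
# spaces become underscores, then anything outside A-Z a-z 0-9 . _ - is dropped.
sanitize_filename() {
    local filename="${1// /_}"
    printf '%s\n' "$filename" | sed 's/[^A-Za-z0-9._-]//g'
}

# Hypothetical helper: print "piece_id,base_filename" for every data row.
# Only the first three columns are used; the URL columns land in _rest.
build_manifest() {
    local csv_file="$1"
    local piece_id date_added name _rest
    echo "piece_id,base_filename"
    tail -n +2 "$csv_file" | tr -d '\r' |
        while IFS=, read -r piece_id date_added name _rest; do
            echo "${piece_id},$(sanitize_filename "$date_added")_$(sanitize_filename "$name")"
        done
}
```

One use for this is spotting collisions before downloading: `build_manifest data.csv | cut -d, -f2 | sort | uniq -d` lists any base filename shared by more than one piece, which would otherwise cause one download to overwrite another.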
Step Three > Create the Import Mapping
This has proven to be quite tricky… and it has led me to explore an alternative to CollectiveAccess! While researching FOSS alternatives to Artwork Archive, I had narrowed the field to two final contenders: Omeka Classic and CollectiveAccess. Here are some thoughts…
Omeka Classic vs CollectiveAccess
Omeka Classic and CollectiveAccess are both open-source content management systems designed for managing digital collections, but they have different strengths and ideal use cases.
Omeka Classic
Best for: Small to medium-sized digital collections, libraries, museums, and academic institutions.
- Ease of Use: User-friendly with a simple installation process and an intuitive interface.
- Customization: Uses themes and plugins for extensibility.
- Metadata Support: Primarily uses Dublin Core, but can be extended with plugins.
- Scalability: More suited for smaller projects; Omeka S is better for larger or linked data collections.
- Hosting: Can be self-hosted or hosted via Omeka.net.
- User Management: Simple role-based permissions.
- Strengths: Great for quickly setting up digital exhibits with minimal technical knowledge.
CollectiveAccess
Best for: Museums, archives, and research institutions that need deep cataloging features.
- Flexibility: Highly configurable with support for complex metadata and relationships.
- Metadata Support: Supports multiple metadata standards like Dublin Core, MODS, PBCore, and more.
- Scalability: Designed for large collections with complex relationships.
- Hosting: Self-hosted with more complex installation and configuration.
- User Management: More granular control over permissions and workflows.
- Strengths: Ideal for institutions needing highly detailed, structured cataloging and data management.
Key Differences
| Feature | Omeka Classic | CollectiveAccess |
|---|---|---|
| Ease of Use | Easier, plugin-based | Steeper learning curve, highly customizable |
| Metadata | Mostly Dublin Core | Supports multiple metadata schemas |
| Scalability | Best for small to medium collections | Designed for large and complex collections |
| Hosting | Self-hosted or Omeka.net | Self-hosted only |
| Customization | Plugin & theme-based | Highly configurable data model |
| Ideal Use Case | Digital exhibits, academic & community projects | Museums, archives, & research databases |
Which One to Choose?
- Choose Omeka Classic if you need a quick and easy way to display digital collections online with a focus on storytelling and user-friendly management.
- Choose CollectiveAccess if you need a robust cataloging system for managing complex metadata and large collections with deep relationships.
Decision Time!
…I will set up both, slap in some samples, compare and then make a final decision. See you in the next post!

