XMP wrangling on OS X
Wed Jul 11, 2012
1427 Words
XMP
XMP is the extensible metadata platform from Adobe. It can be used to embed structured metadata into a file. I think it’s most common usage is for storing image metadata in image files, but it also has a use in embedding metadata in PDF files.
Adobe provide an SDK that you can use to start coding applications for working with XMP. They provide a guide for developers, and I’ve posted a copy so you can have a quick look if you are interested, without downloading the full SDK.
In this post I will look at compiling the SDK on OSX 10.7.4 with Xcode 4.3.2, and run through getting one of the demo applications provided in the SDK working. This was my first time working with Xcode, so I have certainly done some stupid things with setting up the environment, however in the end it worked. YMMV.
Compiling the XMP libraries on OSX 10.7.4 with Xcode 4.3.2
I was unable to get the provided SDK to compile on my machine with Xcode 4.3.2. I kept running into issues that I just didn’t understand in how the distributed Xcode project was set up. I was bitching about this on Twitter and the amazing Matias Piipari offered to have a look. He said:
They’d done this silly thing where they require you to download 3rd party code and place it inside the project root, and then they hard-code the relative paths to those 3rd party source files in the #includes. Both expat and zlib are actually available as part of the OSX SDK, but was easier to just download zlib and unpackage them both into that 3rd-party directory rather than fix the includes to use systemwide header search paths and to set up dynamic linking for expat and zlib.
He modified the distributed project as described, and sent me a copy. Downloading this, and following the build instructions from the programmers guide compiles the libraries, raising about 54 warnings on my machine.
Building a demo application using the compiled libraries. The
First step, we need to set up a project in Xcode.
Creating an Xcode project to read XMP metadata
The following is based on the instructions from the developers guide, which start on page 62, however there are quite a few differences between the provided instructions, and what I did to get the sample code working.
- Create a new project
File > New > Project ...
or ⇧⌘N - Select the
Command Line tool
option. - Hit
next
. - Name your product, pick your company name and choose
C++
for the project type. - You will then be asked to chose where to save the project. I’m saving my project in
/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/samples/build/Xcode3/
, with the project nameMyReadingXMP
. Hitsave
. I have git installed on my system, and Xcode offers to place the project under source control, why not, let’s go with that. - In the project window, highlight the file main.cpp and delete it by choosing
Edit > Delete
or ⌘⌫. In the confirmation dialog selectMove to Trash
. - Add a new file, there are many options for adding a new file, ⌘N works. This brings up the file template chooser.
- Pick a
C++
file, hitNext
. - Name the file
MyReadingXMP.cpp
. The instructions in the programmers guide tell you to deselect the option to create a header file, however the version of Xcode that we are working with does not offer to create a header file, so you can ignore this step.
Compiling the application.
Selecting the project in the left hand pane of Xcode will show the project info and build settings in the central pane:
For finding the various options in the build settings, the search field is convenient.
- Select
Build Settings
and underBuild Locations
selectPer-configuration Build Products Path
. - Insert
$(PROJECT_DIR)
, you should see Xcode expand this to point to the directory that contains the project files. - Under
Search Paths
selectHeader Search Paths
. The XMP developer documentation says to include${SDKROOT}/public/include
. This is where things started to go tits up for me. There is a bug report on the adobe forums pointing out that the developer documentation is broken at this step, and that one should insert<xmpsdk>
in place of${SDKROOT}
, but I didn’t get it to work with this substitution either, so I started to hard code the paths. This is a bad practice, but it was the way that I got it to work. If I knew a little more about Xcode I am sure this would be trivial to fix, but for now I started to use/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/
in every place in the documentation where you are advised to use${SDKROOT}
, so we put in/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/public/include
for this variable. - Select Framework Search Paths and enter
/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/System/Library/Frameworks
. When I finally got to the compile stage I was getting a warning that this directory didn’t exist, so I manually created that directory inside of the project. - Deselect
Precompiled Header Uses Files From Build Directory
. - Select Other Linker Flags and add the paths to both the XMP Toolkit SDK static libraries:
/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/public/libraries/macintosh/debug/libXMPCoreStaticDebug.a
/Users/ian/code/private-code/xmp/XMP-Toolkit-SDK-5.1.2/public/libraries/macintosh/debug/libXMPFilesStaticDebug.a
- In Other Linker Flags, at the end of the paths to the static libraries enter the flag
-framework CoreServices
. At this point I may have tried compiling the empty project, or I may have toggled some other interaction with Xcode, in any case Xcode began downloading the CoreServices framework. - Select
Preprocessor Macros
and enterMAC_ENV=1
- The instructions provided direct you to set the Preprocessor Macros for the target, however in this version of Xcode when I looked, this setting carried over from the project setting, and I didn’t need to do anything.
I tried compiling the empty project, but ran into some bugs. I then cut and pasted the entire sample XMP reader code into the my empty project. I also manually linked the CoreServices library against the target using the Build Phases
option Xcode.
I honestly don’t know what I am doing at this point, and I don’t know whether this helped, but on the next attempt to compile the project compiled fully, and I got an executable.
It took me a while to find where the executable was located, but eventually I found it located in
~/Library/Developer/Xcode/DerivedData/
MyXMPReader-dzqqlyrtudnrpsganqbsozskatkp/
Build/Products/Debug /Users/ian/code/
private-code/xmp/XMP-Toolkit-SDK-5.1.2/
samples/build/Xcode3/MyXMPReader
Obviously I’ve made a typo in the name of the target, you can see that I have an extra space next to the ../Debug /..
part of the path, and I had expected the executable to land in the project directory, rather than in the system Library, but we have the working executable ‘MyXMPReader’.
Running the application
It’s a command line application, and takes a path to a file as an argument. To run this, point it at a PDF like so:
$ MyXMPReader test.pdf
It produces some terminal output, and writes the XMP data into a local XMPDump.txt
file. I verified the tool works by running it against a nature communications pdf, and you can see some of the output here:
$ head -40 XMPDump.txt
Dumping XMPMeta object "" (0x0)
dc: http://purl.org/dc/elements/1.1/ (0x80000000 : schema)
dc:format = "application/pdf"
dc:identifier = "
doi:10.1038/ncomms1828"
dc:creator (0x600 : isOrdered isArray)
[1] = "Zhong Yan"
[2] = "Guanxiong Liu"
[3] = "Javed M. Khan"
[4] = "Alexander A. Balandin"
dc:description (0x1E00 : isLangAlt isAlt isOrdered isArray)
[1] = "Nature Communications 3, 827 (2012). doi:10.1038/ncomms1828" (0x50 : hasLang hasQual)
? xml:lang = "x-default" (0x20 : isQual)
dc:publisher (0x200 : isArray)
[1] = "Nature Publishing Group"
dc:rights (0x1E00 : isLangAlt isAlt isOrdered isArray)
[1] = "<C2 A9> 2012 Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved." (0x50 : hasLang hasQual)
? xml:lang = "x-default" (0x20 : isQual)
dc:title (0x1E00 : isLangAlt isAlt isOrdered isArray)
[1] = "Graphene quilts for thermal management of high-power GaN transistors" (0x50 : hasLang hasQual)
? xml:lang = "x-default" (0x20 : isQual)
pdf: http://ns.adobe.com/pdf/1.3/ (0x80000000 : schema)
pdf:Producer = "Adobe PDF Library 7.0"
prism: http://prismstandard.org/namespaces/basic/2.0/ (0x80000000 : schema)
prism:copyright = "
<C2 A9> 2012 Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved."
prism:doi = "10.1038/ncomms1828"
prism:eIssn = "2041-1723"
prism:endingPage = "8"
prism:publicationName = "Nature Communications"
prism:rightsAgent = "permissions@nature.com"
prism:startingPage = "827"
prism:volume = "3"
prism:publicationDate (0x200 : isArray)
[1] = "--"
prism:url (0x200 : isArray)
I’ll explore some other tools that you can use to look at XMP data in pdfs in a later post, and I’ll also look at what we can find inside some recently published academic PDFs with an eye to deciding what we should be putting into the documents that we will be producing at eLife (don’t forget to submit).