Citation Style Language
Fri Sep 24, 2010
Martin Fenner asked me to comment on Mendeley’s relationship with the Citation Style Language, so I thought I would put my thoughts up on the blog too. Below is the original message that I sent over to him.
Hi Martin, thanks for getting in touch and asking about Mendeley’s relationship with CSL. There are basically three aspects of citation styling that I’d like to discuss: the first is Mendeley’s involvement, the second is the use of citation styling in the community and amongst publishers, and the third is how we identify things in academia, and how we format those identifiers.
We have been using the Citation Style Language for quite a while now. We think it is an amazing project and we are very strongly committed to working with the CSL community in encouraging uptake. We get a lot of feedback from our users, and one area that they constantly run into problems with is the need to format a citation in just the right way. The CSL project is the best way for us to support our users with these kinds of requests. Our developers, particularly Carles Pina, have been pushing patches upstream to the citeproc-js project. We have also just added a cut and paste style box on our article pages. If you have a look at a sample paper you will now see a little citeproc-js driven box that lets you cut and paste the citation formatted in one of the eight most popular citation formats used within Mendeley: APA, BibTeX, Cell, Chicago, Harvard, MLA, Nature and Science. We have also been supporting the creation of a WYSIWYG citation style editor. Most of the code for that project is complete; we just need to work on getting it integrated into our client, and on figuring out the best way to manage the creation of more styles and how that will work with the CSL community.
One of the things that we have been discussing with Bruce D’Arcus is how to manage the redistribution of new styles, how to make sure that corrupt styles don’t propagate, and how to make sure that people get the style that they are looking for. We have been discussing the creation of a styles commons or public repository, though the final plans are not completely in place yet. If people want to contribute, there is a lot of activity on the mailing list. One thing we hope Mendeley can help with is reporting usage statistics on specific style files, so that at least people can find the most popular version of a CSL file for a given style. I think it would also be awesome if publishers started authoring definitive CSL styles for their journals.
That leads me nicely to talk about the second thing that interests me about citations. After working for many years at Springer, and then Nature, I was well aware that most large publishers just push submitted manuscripts out to companies in India where the formatting of the paper happens. The input format is really for the most part of no importance to the publisher, and also the citation formatting really doesn’t matter to most publishers. They just tear the submitted manuscript to pieces and rebuild it in their chosen XML schema. However, most people using citations are not actually submitting manuscripts for publication, but rather are writing term papers, or theses, or reports. Someone from the Open University recently told me that tutors there can end up docking up to 10% off the marks of a paper if the citations are mis-formatted, and in many cases those tutors have their very own preferred version of a specific citation that they want their students to adhere to.
So the weird thing is that citations started off as a required identifier for the literature. Google Scholar and HTTP URIs have almost totally made formatted citations redundant as identifiers, and yet there is still a huge user need to format citations according to a huge variety of styles. Since that need is going to continue for quite a long time, it’s a need that we have to support.
This kind of leads me to my last point, which is a bit of a lament for the state of informational waste in the academic publishing system. As I pointed out, the big publishers don’t care about the submission format, but they have not really done a good job of communicating that to their editorial boards. Smaller publishers don’t have the resources to totally reformat submissions, and beyond academic publishing there are a huge number of people who just need to format citations. There is a huge waste of people’s time in reformatting papers for submission, in fixing styles according to changing requirements from departments, and in just trying to get the styling correct, when what should matter is the content. I’d love to get to a point where every publisher accepted the same type of XML input, and our authoring tools all created content conforming to that input. A citation should be an HTTP URI that can be rendered into the appropriate format using CSL and an API. Imagine if you had to resubmit a manuscript to a different publisher, and all you had to do was resubmit the same file, just imagine! I’ve been thinking things should work like this since about 2002, when I first started working in the publishing industry.
One of the things I really like about CSL is that it abstracts the form of the citation from the content of the citation. May more and more of our tools do the same!
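To make that separation of form and content a little more concrete, here is a rough Python sketch. It is not CSL and not how citeproc-js works (real CSL styles are XML files interpreted by a CSL processor). The citation record is made-up placeholder data, and the style names and functions are hypothetical, invented purely for illustration. The point is only that the data is stored once and each style is a separate rendering rule.

```python
# Illustrative sketch only. Real CSL styles are XML documents handled by a
# CSL processor; everything named here is hypothetical.

# The content of the citation: plain data with no formatting decisions.
# (Placeholder record, not a real paper.)
citation = {
    "authors": ["Smith J", "Jones A"],
    "year": 2009,
    "title": "An example paper",
    "journal": "Journal of Examples",
    "volume": 12,
    "pages": "34-56",
}


def author_year(c):
    """One possible form: authors first, year in parentheses after them."""
    authors = " and ".join(c["authors"])
    return (
        f'{authors} ({c["year"]}) {c["title"]}. '
        f'{c["journal"]}, {c["volume"]}, {c["pages"]}.'
    )


def year_last(c):
    """Another possible form: comma-separated authors, year at the end."""
    authors = ", ".join(c["authors"])
    return (
        f'{authors}. {c["title"]}. '
        f'{c["journal"]} {c["volume"]}, {c["pages"]} ({c["year"]}).'
    )


# The "styles" live apart from the data, so adding a style never touches it.
STYLES = {"author-year": author_year, "year-last": year_last}


def render(c, style_name):
    """Render the same citation data in whichever style is asked for."""
    return STYLES[style_name](c)


if __name__ == "__main__":
    for name in STYLES:
        print(f"{name}: {render(citation, name)}")
```

Switching the style a tutor or journal asks for then means changing one argument rather than retyping the reference, which is roughly the convenience that a CSL file gives a reference manager.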