As museums, libraries and archives digitise their collections, the term ‘petabyte’ (1,000 terabytes, or one million gigabytes) has become increasingly common in conversation.
While we are all used to free or low-cost digital storage, storage at this scale means cost is still a serious consideration.
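To give a sense of scale, here is a rough back-of-envelope sketch in Python of how uncompressed master images add up. The image dimensions and collection size are illustrative assumptions rather than figures from any particular institution, and the function names are made up for this example.

```python
# Rough storage estimate for uncompressed master images.
# All figures used here are illustrative assumptions, not measurements.

def uncompressed_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Size of the raw pixel data for one image, ignoring file headers."""
    return width_px * height_px * channels * bits_per_channel // 8


def collection_terabytes(image_bytes, image_count):
    """Total collection size in decimal terabytes (1 TB = 10**12 bytes)."""
    return image_bytes * image_count / 10**12


# Hypothetical master image: 4000 x 6000 pixels, 24-bit colour.
one_image = uncompressed_bytes(4000, 6000)            # 72,000,000 bytes (72 MB)

# Hypothetical collection of one million such images.
total_tb = collection_terabytes(one_image, 1_000_000)  # 72.0 TB

print(f"One image: {one_image / 10**6:.0f} MB")
print(f"One million images: {total_tb:.0f} TB ({total_tb / 1000:.3f} PB)")
```

On these assumed figures, ten million images would come to roughly 0.72 of a petabyte before any backup copies are counted, which is why the unit now comes up so often.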
Storing high-quality uncompressed image files is expensive, and this has led to growing use of the relatively little-known JP2 file format. So what is JP2?
JP2 (often referred to as JPEG 2000, although that is strictly the name of the compression method it uses) lets you take the most common master file type, uncompressed TIFF, and compress it in one of two ways: lossless or lossy.
Lossless compression produces smaller files than an uncompressed TIFF while preserving every pixel value, so the original image can be reconstructed exactly.
Lossy compression reduces the file size further by discarding some data, and although you might think this is a bad thing for your master files, research suggests that a degree of compression need not be.
Research by Sean Martin while at the British Library showed that compression up to a certain level does little more than clean the image of noise created during the capture process, whether by scanner or camera.
After numerous tests the British Library adopted JP2 with lossy compression for its newspaper digitisation programme, which brought a twofold benefit.
First, it greatly reduced file size compared with TIFF; second, image noise was reduced and optical character recognition (OCR) results improved.
A win-win situation, then. However, further research into the file format by Maureen Pennock, also at the British Library, showed that the same level of compression could not be used when digitising material with a high colour depth.
This is because there comes a point at which information starts to be lost, and that point varies with the capture method and the level of detail in the original.
So to take full advantage of the JP2 file format, detailed testing and a sound understanding of the format are needed.
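As an illustration of what such testing can involve, the sketch below compares an original set of pixel values with the values decoded after compression, using peak signal-to-noise ratio (PSNR). PSNR is simply one widely used measure of difference between two images; it is an assumption for this example and not necessarily the measure used in the British Library research. The sample values are invented, and in practice they would be read from the TIFF master and the decoded JP2.

```python
import math


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of sample values. Higher means closer to the original."""
    if len(original) != len(compressed):
        raise ValueError("both images must have the same number of samples")
    if not original:
        raise ValueError("images must contain at least one sample")
    # Mean squared error between corresponding samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        # Identical data: the result expected from a lossless round trip.
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


# Invented 8-bit sample values standing in for real image data.
master = [10, 20, 30, 40]
lossy_copy = [11, 19, 30, 42]

print(psnr(master, master))      # lossless round trip: inf
print(psnr(master, lossy_copy))  # finite value; drops as compression increases
```

Running the same comparison at several compression levels, for each capture method and type of material, is one way to see where quality begins to fall away.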
As more museums and libraries digitise and provide access to the material online, image size has become a talking point.
Too big and the image will be slow to load; too small and the quality will not be good enough.
JP2 certainly helps here: high-resolution, high-quality images can be produced at smaller file sizes, and JP2’s tiling can allow very fast loading and instant pan and zoom of the image.
Tiling is another part of JP2 that some imaging studios use and some don’t, but we’ll leave that topic for another day.
The Wellcome Library recently hosted a Digital Preservation Coalition (DPC) conference which featured the use of JP2 and the current trend for digital imaging in museums, libraries and archives.
Some interesting facts and figures were mentioned during the event, including that 87 per cent of imaging studios still use TIFF as their master file format.
Throughout the day it was highlighted that implementing JP2 is rather expensive, requiring either in-house software developers or outsourcing to commercial digitisation providers.
Purely as a cost exercise, it is difficult for smaller museums and libraries to justify the cost of implementation rather than simply paying for more digital storage.
So does this limit JP2 use to the larger archives and libraries with corresponding budgets?
Personally I think this is the current situation. However, all the signs are that JP2 is only going to become more popular, and with that, tools and software will become available far more cost-effectively. Understanding of the format will also increase.
At Genus we can help people understand the file format further, and we offer free consultation on the use of different file formats, JP2 in particular.
It is important that people understand the formats which they are using and why and when it is appropriate to employ them.
For our digitisation service we offer JP2 in both lossless and lossy form, removing the element of cost and uncertainty and producing secure archival images that are manageable for access and viewing on the web or in the cloud.
To return to Maureen Pennock’s presentation at the recent DPC event, she said: “Don’t just use JP2 because other people are; understanding and researching the file format is very important.” She’s right, and Genus can help with this, ensuring the benefits of JP2 are available far beyond just a small number of museums and libraries.