Thursday, October 15, 2009
Data Compression in CMS
The compressed-data content type for CMS is not defined in RFC 3852 with the other content types; it is defined separately, in RFC 3274. As it does not involve encryption or authentication, it has a simpler ASN.1 structure than the others and, as you will see from the code in the example, it is also a lot more straightforward to use.
ASN.1 Structure
CMS compressed-data is created by wrapping a CompressedData structure in a ContentInfo structure with the contentType field set to the OID id-ct-compressedData, which is defined as:
id-ct-compressedData OBJECT IDENTIFIER ::= { iso(1) member-body(2)
    us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) ct(1) 9 }
and the CompressedData structure is defined as:
CompressedData ::= SEQUENCE {
    version CMSVersion,
    compressionAlgorithm CompressionAlgorithmIdentifier,
    encapContentInfo EncapsulatedContentInfo }
As you can probably guess, CompressionAlgorithmIdentifier is further defined as being of the type AlgorithmIdentifier, and EncapsulatedContentInfo is the same type you encountered when you looked at CMS signed-data. The version number is always 0.
Currently the only compression algorithm specified is ZLIB, which is identified using the following OID:
id-alg-zlibCompress OBJECT IDENTIFIER ::= { iso(1) member-body(2)
    us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) alg(3) 8 }
Compared to the other CMS structures you have looked at, this one is quite simple. Fortunately, the Java classes related to compressed-data reflect this.
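In Java code, OIDs such as the two defined above are normally handled as dotted-decimal strings rather than in ASN.1 notation. The following is a minimal sketch of that conversion using only the JDK; the class name CmsCompressionOids and the method dotted() are hypothetical names chosen for illustration, and the arc values are copied from the definitions above.

```java
import java.util.StringJoiner;

/**
 * Illustrative only: converts the arcs of the two RFC 3274 OIDs
 * into the dotted-decimal strings usually seen in Java code.
 */
final class CmsCompressionOids
{
    // { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) ct(1) 9 }
    static final int[] ID_CT_COMPRESSED_DATA = { 1, 2, 840, 113549, 1, 9, 16, 1, 9 };

    // { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9) smime(16) alg(3) 8 }
    static final int[] ID_ALG_ZLIB_COMPRESS = { 1, 2, 840, 113549, 1, 9, 16, 3, 8 };

    // join the arcs with '.' to get the dotted-decimal form
    static String dotted(int[] arcs)
    {
        StringJoiner joiner = new StringJoiner(".");
        for (int arc : arcs)
        {
            joiner.add(Integer.toString(arc));
        }
        return joiner.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(dotted(ID_CT_COMPRESSED_DATA)); // prints 1.2.840.113549.1.9.16.1.9
        System.out.println(dotted(ID_ALG_ZLIB_COMPRESS));  // prints 1.2.840.113549.1.9.16.3.8
    }
}
```

The second string is the form in which you would expect to see the ZLIB algorithm identified when an OID is passed around as a Java String.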
The CMSCompressedData Class
The org.bouncycastle.cms.CMSCompressedData class has two general-use constructors, which take either a byte array or an InputStream representing a binary encoding of a ContentInfo structure and the CompressedData structure it contains. As you will see in the next example, CMSCompressedData objects can also be created using objects of the CMSCompressedDataGenerator class.
CMSCompressedData.getContent()
The getContent() method returns a byte array representing the contents of the encapContentInfo field after uncompressing.
The method will throw a CMSException if a problem occurs uncompressing the data.
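To get a feel for what compressing and uncompressing with ZLIB involves, here is a rough sketch of a round trip using only the JDK's java.util.zip classes. It is not the Bouncy Castle implementation and it does not produce any CMS structure; the class name ZlibRoundTrip and its methods are hypothetical, and the sketch assumes the whole input fits comfortably in memory.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/**
 * Illustrative ZLIB round trip using only the JDK.
 */
final class ZlibRoundTrip
{
    // compress the input into a ZLIB stream
    static byte[] compress(byte[] input)
    {
        Deflater deflater = new Deflater();
        try
        {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!deflater.finished())
            {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
        finally
        {
            deflater.end();
        }
    }

    // recover the original bytes from a ZLIB stream
    static byte[] uncompress(byte[] input) throws DataFormatException
    {
        Inflater inflater = new Inflater();
        try
        {
            inflater.setInput(input);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!inflater.finished())
            {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                {
                    // no progress is possible: the stream is incomplete or needs a preset dictionary
                    throw new DataFormatException("truncated or unsupported ZLIB stream");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
        finally
        {
            inflater.end();
        }
    }

    public static void main(String[] args) throws Exception
    {
        byte[] original = "Hello world!".getBytes(StandardCharsets.UTF_8);
        byte[] recovered = uncompress(compress(original));
        System.out.println(Arrays.equals(original, recovered)
            ? "data recovery succeeded" : "data recovery failed");
    }
}
```

The DataFormatException in this sketch plays roughly the role that CMSException plays for getContent(): it signals that the compressed bytes could not be turned back into the original data.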
CMSCompressedData.getEncoded()
The getEncoded() method returns the binary ASN.1 encoding of the object. The encoding may follow the BER or DER encoding rules.
The method will throw an IOException if a problem occurs generating the encoding.
Try It Out: Using Compression with CMS
Here is a simple example showing how to use the compressed-data content type. As you can see, it is much easier to deal with than the content types discussed previously.
package chapter9;

import java.util.Arrays;

import org.bouncycastle.cms.CMSCompressedData;
import org.bouncycastle.cms.CMSCompressedDataGenerator;
import org.bouncycastle.cms.CMSProcessableByteArray;

/**
 * Basic use of CMS compressed-data.
 */
public class CompressedDataExample
{
    public static void main(String[] args) throws Exception
    {
        // set up the generator
        CMSCompressedDataGenerator gen = new CMSCompressedDataGenerator();

        // compress the data
        CMSProcessableByteArray data = new CMSProcessableByteArray(
                                               "Hello world!".getBytes());

        CMSCompressedData compressed = gen.generate(data,
                                           CMSCompressedDataGenerator.ZLIB);

        // re-create and uncompress the data
        compressed = new CMSCompressedData(compressed.getEncoded());

        byte[] recData = compressed.getContent();

        // compare uncompressed data to the original data
        if (Arrays.equals((byte[])data.getContent(), recData))
        {
            System.out.println("data recovery succeeded");
        }
        else
        {
            System.out.println("data recovery failed");
        }
    }
}
Running the example produces the following message:
data recovery succeeded
indicating that the data was compressed and then recovered without problems.
How It Works
This example follows a similar pattern to the earlier ones in that a generator is created and used to create a CMSCompressedData object from an implementation of CMSProcessable using the following line:
    CMSCompressedData compressed = gen.generate(data,
                                       CMSCompressedDataGenerator.ZLIB);
where CMSCompressedDataGenerator.ZLIB is a string representing the OID for the ZLIB algorithm.
After this, the CMSCompressedData object is reconstructed from its binary encoding, and the original data is recovered from it.
One thing to note here is that, although ZLIB is a lossless compression algorithm, not all compression algorithms are, especially those used with images and sound. If you ever get to the point of using other compression algorithms and combining the compressed-data content type with the signed-data type, compress the data before creating the signed-data. Otherwise, the use of a "lossy" compression algorithm will mean your signatures are invalid, because the data recovered after uncompressing will no longer match the data that was signed.
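The effect of the ordering can be sketched with plain JDK signatures, without any CMS structures at all. Everything in the following example is an assumption made for illustration: the class CompressThenSignSketch is hypothetical, lossyRoundTrip() is a toy stand-in for a real lossy codec (it simply discards the low four bits of every byte), and a raw SHA256withRSA signature stands in for CMS signed-data.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

/**
 * Illustrative only: why lossy compression must happen before signing.
 */
final class CompressThenSignSketch
{
    // hypothetical lossy codec: compressing and uncompressing loses the low 4 bits of each byte
    static byte[] lossyRoundTrip(byte[] data)
    {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++)
        {
            out[i] = (byte)(data[i] & 0xF0);
        }
        return out;
    }

    static KeyPair newKeyPair() throws GeneralSecurityException
    {
        KeyPairGenerator kpGen = KeyPairGenerator.getInstance("RSA");
        kpGen.initialize(2048);
        return kpGen.generateKeyPair();
    }

    static byte[] sign(KeyPair kp, byte[] data) throws GeneralSecurityException
    {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(kp.getPrivate());
        sig.update(data);
        return sig.sign();
    }

    static boolean verify(KeyPair kp, byte[] data, byte[] signature) throws GeneralSecurityException
    {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(kp.getPublic());
        sig.update(data);
        return sig.verify(signature);
    }

    // sign the original, then push the data through the lossy codec
    static boolean signThenCompress(KeyPair kp, byte[] original) throws GeneralSecurityException
    {
        byte[] signature = sign(kp, original);
        byte[] received = lossyRoundTrip(original);
        return verify(kp, received, signature);
    }

    // run the lossy codec first, then sign what it produced
    static boolean compressThenSign(KeyPair kp, byte[] original) throws GeneralSecurityException
    {
        byte[] compressed = lossyRoundTrip(original);
        byte[] signature = sign(kp, compressed);
        return verify(kp, compressed, signature);
    }

    public static void main(String[] args) throws Exception
    {
        KeyPair kp = newKeyPair();
        byte[] data = "Hello world!".getBytes(StandardCharsets.UTF_8);

        System.out.println("sign then compress verifies: " + signThenCompress(kp, data)); // false
        System.out.println("compress then sign verifies: " + compressThenSign(kp, data)); // true
    }
}
```

In the first ordering the verifier sees bytes that differ from the ones that were signed, so verification fails; in the second, the signature covers exactly the bytes the recipient receives.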
CMS is not just an end in itself, but is used as the basis for a number of other protocols. Chief amongst these is S/MIME.