How to serialize object + compress it and then decompress + deserialize without third-party library?
You have a bug in your code, and the explanation is too long for a comment, so I present it as an answer even though it's not answering your real question.
You need to call memoryStream.ToArray() only after closing the GZipStream; otherwise you are creating compressed data that you will not be able to deserialize.
Fixed code follows:
using (var memoryStream = new System.IO.MemoryStream())
{
    using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress))
    {
        BinaryFormatter binaryFormatter = new BinaryFormatter();
        binaryFormatter.Serialize(gZipStream, obj);
    }
    // Read the buffer only after the GZipStream has been closed.
    return memoryStream.ToArray();
}
The GZipStream writes to the underlying buffer in chunks and also appends a footer to the end of the stream, and this is only performed at the moment you close the stream.
You can easily prove this by running the following code sample:
byte[] compressed;
int[] integers = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var mem1 = new MemoryStream();
using (var compressor = new GZipStream(mem1, CompressionMode.Compress))
{
    new BinaryFormatter().Serialize(compressor, integers);
    // Deliberate bug: the buffer is read before the GZipStream is closed.
    compressed = mem1.ToArray();
}
var mem2 = new MemoryStream(compressed);
using (var decompressor = new GZipStream(mem2, CompressionMode.Decompress))
{
    // The next line will throw SerializationException
    integers = (int[])new BinaryFormatter().Deserialize(decompressor);
}
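The same effect can be observed without BinaryFormatter. The following is a minimal sketch that compresses raw bytes and snapshots the buffer once after and once before the GZipStream is closed. The class and method names are made up for illustration, and Stream.CopyTo assumes .NET 4.0 or later.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GZipCloseDemo
{
    // Snapshots the buffer AFTER the GZipStream has been closed.
    public static byte[] CompressCorrectly(byte[] data)
    {
        using (var memory = new MemoryStream())
        {
            using (var gzip = new GZipStream(memory, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray() still works on a closed MemoryStream.
            return memory.ToArray();
        }
    }

    // Snapshots the buffer BEFORE the GZipStream is closed (the bug).
    public static byte[] CompressTooEarly(byte[] data)
    {
        using (var memory = new MemoryStream())
        using (var gzip = new GZipStream(memory, CompressionMode.Compress))
        {
            gzip.Write(data, 0, data.Length);
            return memory.ToArray();
        }
    }

    // Decompresses a complete gzip payload back into the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

The early snapshot is shorter than the complete one, because at minimum the gzip footer has not been written yet. Only the complete payload is expected to round-trip through Decompress.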
GZipStream from .NET 3.5 doesn't allow you to set the compression level. This parameter was introduced in .NET 4.5, but I don't know whether it will give you a better result or whether upgrading is an option for you. The built-in algorithm is not very optimal, due to patents AFAIK. So in 3.5 the only way to get better compression is to use a third-party library, such as the SDK provided by 7zip or SharpZipLib. You should probably experiment a little with different libraries to get better compression of your data.
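For reference, a minimal sketch of the .NET 4.5 constructor overload that takes a CompressionLevel. It assumes .NET 4.5 or later, and the class and method names are hypothetical. How much the level helps depends on your data, so measure it.

```csharp
using System.IO;
using System.IO.Compression;

public static class CompressionLevelSketch
{
    // Compresses data with the requested level
    // (CompressionLevel.Optimal, Fastest or NoCompression).
    public static byte[] Compress(byte[] data, CompressionLevel level)
    {
        using (var memory = new MemoryStream())
        {
            using (var gzip = new GZipStream(memory, level))
            {
                gzip.Write(data, 0, data.Length);
            }
            // Read the buffer only after the GZipStream has been closed.
            return memory.ToArray();
        }
    }
}
```

Calling Compress with each level on a sample of your own data and comparing the array lengths is a quick way to see whether the setting matters for you.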
The default CompressionLevel used is Optimal, at least according to http://msdn.microsoft.com/en-us/library/as1ff51s, so there is no way to tell the GZipStream to "try harder". It seems to me that a 3rd party lib would be better.
I personally never considered GZipStream to be 'good' in terms of compression; they probably put the effort into minimizing the memory footprint or maximizing speed. However, seeing how Windows XP/Windows Vista/Windows 7 handles ZIP files natively in Explorer, well, I can't say it is either fast or good at compressing. I wouldn't be surprised if Explorer in Win7 actually uses GZipStream: after all, they implemented it and put it into the framework, so they probably use it in many places (e.g., it seems to be used in HTTP GZIP handling). So I'd stay away from it if I needed efficient processing. I've never done any serious research on this topic, as my company bought a nice zip handler many years ago, when .NET was in its early days.
edit:
More refs:
http://dotnetzip.codeplex.com/workitem/7159 - but marked as "closed/resolved" in 2009. Maybe you will find something interesting in that code?
Heh, after a few minutes of googling, it seems that 7Zip exposes some C# bindings: http://www.splinter.com.au/compressing-using-the-7zip-lzma-algorithm-in/
edit#2:
Just an FYI about .NET 4.5: https://stackoverflow.com/a/9808000/717732
The original question was about .NET 3.5. Three years later, .NET 4.5 is much more likely to be used, and my answer is only valid for 4.5. As others mentioned earlier, the compression algorithm was improved considerably in .NET 4.5.
Today, I wanted to compress my data set to save some space. So, similar to the original question, but for .NET 4.5. And because I remembered having used the same trick with a double MemoryStream many years ago, I just gave it a try. My data set is a container object with many hash sets and lists of custom objects with string/int/DateTime properties. The data set contains about 45 000 objects and, when serialized without compression, it creates a 3500 kB binary file.
Now, with GZipStream, with a single or double MemoryStream as described in the question, or with DeflateStream (which uses zlib in 4.5), I always get a file of 818 kB. So I just want to stress here that the double-MemoryStream trick has become useless with .NET 4.5.
Eventually, my generic code is as follows:
public static byte[] SerializeAndCompress<T, TStream>(T objectToWrite, Func<TStream> createStream, Func<TStream, byte[]> returnMethod, Action catchAction)
    where T : class
    where TStream : Stream
{
    if (objectToWrite == null || createStream == null)
    {
        return null;
    }
    byte[] result = null;
    try
    {
        using (var outputStream = createStream())
        {
            using (var compressionStream = new GZipStream(outputStream, CompressionMode.Compress))
            {
                var formatter = new BinaryFormatter();
                formatter.Serialize(compressionStream, objectToWrite);
            }
            if (returnMethod != null)
                result = returnMethod(outputStream);
        }
    }
    catch (Exception ex)
    {
        Trace.TraceError(Exceptions.ExceptionFormat.Serialize(ex));
        catchAction?.Invoke();
    }
    return result;
}
so that I can use different TStream types, e.g.
public static void SerializeAndCompress<T>(T objectToWrite, string filePath) where T : class
{
    //var buffer = SerializeAndCompress(collection);
    //File.WriteAllBytes(filePath, buffer);
    SerializeAndCompress(objectToWrite, () => new FileStream(filePath, FileMode.Create), null, () =>
    {
        if (File.Exists(filePath))
            File.Delete(filePath);
    });
}

public static byte[] SerializeAndCompress<T>(T collection) where T : class
{
    return SerializeAndCompress(collection, () => new MemoryStream(), st => st.ToArray(), null);
}
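The code above covers only the write path. A read-side counterpart could look roughly like the sketch below. It is not part of the original answer: the class name, the method name and the deserialize delegate are hypothetical, and the deserializer is passed in as a delegate so that the helper does not depend on any particular formatter.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class DecompressSketch
{
    // Hypothetical read-side helper: opens a stream, wraps it in a
    // decompressing GZipStream and hands it to the supplied deserializer.
    public static T DecompressAndDeserialize<T, TStream>(
        Func<TStream> createStream, Func<Stream, T> deserialize)
        where T : class
        where TStream : Stream
    {
        if (createStream == null || deserialize == null)
        {
            return null;
        }
        using (var inputStream = createStream())
        using (var decompressionStream = new GZipStream(inputStream, CompressionMode.Decompress))
        {
            return deserialize(decompressionStream);
        }
    }
}
```

To mirror the BinaryFormatter-based writer, the delegate would be something like s => (MyType)new BinaryFormatter().Deserialize(s), where MyType stands for your own serializable type, and createStream would be () => new FileStream(filePath, FileMode.Open) or () => new MemoryStream(buffer).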
You can use a custom formatter:
public class GZipFormatter : IFormatter
{
    IFormatter formatter;

    public GZipFormatter()
    {
        this.formatter = new BinaryFormatter();
    }

    public GZipFormatter(IFormatter formatter)
    {
        this.formatter = formatter;
    }

    ISurrogateSelector IFormatter.SurrogateSelector { get => formatter.SurrogateSelector; set => formatter.SurrogateSelector = value; }
    SerializationBinder IFormatter.Binder { get => formatter.Binder; set => formatter.Binder = value; }
    StreamingContext IFormatter.Context { get => formatter.Context; set => formatter.Context = value; }

    object IFormatter.Deserialize(Stream serializationStream)
    {
        using (GZipStream gZipStream = new GZipStream(serializationStream, CompressionMode.Decompress))
        {
            return formatter.Deserialize(gZipStream);
        }
    }

    void IFormatter.Serialize(Stream serializationStream, object graph)
    {
        using (GZipStream gZipStream = new GZipStream(serializationStream, CompressionMode.Compress))
        using (MemoryStream msDecompressed = new MemoryStream())
        {
            // Serialize into memory first, then write the bytes through the compressor.
            formatter.Serialize(msDecompressed, graph);
            byte[] byteArray = msDecompressed.ToArray();
            gZipStream.Write(byteArray, 0, byteArray.Length);
            gZipStream.Close();
        }
    }
}
Then you can use it like this:
IFormatter formatter = new GZipFormatter();
using (Stream stream = new FileStream(path...)){
    formatter.Serialize(stream, obj);
}