Deep understanding of binary serialization

I Summary

Binary serialization is the main mechanism for transmitting and processing data in the company's self-developed microservice framework. Most developers, however, have never studied the binary format in depth, so when something goes wrong they have no systematic way to analyze and solve the problem. This article starts from an accident in the production environment and uses its analysis to discuss several technical points that usually go unnoticed. Binary serialization output is not as readable as JSON, and most people cannot interpret it. The article therefore closes by analyzing serialized output in detail, using real examples and comparing them with the MSDN documentation, to build an intuitive and in-depth understanding of what binary serialization produces.

II Accident Description

One night, a batch of alerts fired. The exchange at the time went like this:

 A: B, please take a look at your service. I'm getting alerts here.
 B: I just released a patch. Is it related to me?
 A: I haven't released anything on my side. Of course it's related. Roll back!
 B: I didn't change the interface you use. Why should we roll back?
 A: The errors are on my side, and I haven't released anything. Roll back!
 B: That interface hasn't been changed for a long time. It must be your own problem.
 A: No matter whose problem it is, let's roll back first and see.
 B: All right, wait a minute.
 Publishing Assistant: Rolling back... (the alerts stop after the rollback)
 A: ……
 B: ……

III Accident Analysis

Although rolling back the patch resolved the incident, the root-cause analysis went on late into the night.

The accident was simple to fix, but it directly challenged our understanding of the service. Without finding the root cause, it would be hard to carry on with follow-up work with any confidence.

Our previous understanding of services can be summarized as follows:

 Adding properties to a class will not cause client deserialization to fail.

This is not an official statement, however, but a rule of thumb that developers derived from day-to-day use. Such rules often lack theoretical backing, and they leave us helpless when a problem does occur.

When a problem occurs, the exception stack captured by the client is as follows:

 System.Runtime.Serialization.SerializationException
   HResult=0x8013150C
   Message=The ObjectManager found an invalid number of fixups. This usually indicates a problem in the Formatter.
   Source=mscorlib
   StackTrace:
    at System.Runtime.Serialization.ObjectManager.DoFixups()
    at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
    at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
    at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream)

The exception stack shows that the exception occurred during binary deserialization. After consulting several sources, the opinions on this issue boil down to two points:

  1. The client used for deserialization is too old. Replace the class used for deserialization with the latest version.
  2. The problem is related to generic collections. If a generic collection is added, this problem is likely to occur.

The first opinion does not help with the current problem, but the second is somewhat useful. It turned out that the microservice interface touched by that day's patch did not add a generic collection property. Instead, it added assignment logic for a generic collection property that had been added earlier but never used. Testing later confirmed that this change was indeed the cause. This also shows that the rules of thumb developers pick up in daily work have their limits, and that it is necessary to analyze in depth the conditions under which binary serialization leads to deserialization failure.

IV Binary Serialization and Deserialization Test

To test how different data types affect deserialization, we wrote a test plan covering the common data types. The test involves two code solutions: the serializing program (V1 for short) and the deserializing program (V2 for short).

Test steps:

  1. Declare the classes and properties in V1;
  2. In V1, binary serialize the class objects and save them to a file;
  3. Modify the classes in V1 by removing the declarations of the relevant properties, then recompile the DLL;
  4. V2 references the DLL generated in step 3 and reads the data generated in step 2 for deserialization.

    /// <summary>
    /// Class used in the V1 tests
    /// </summary>
    [Serializable]
    public class ObjectItem
    {
        public string TestStr { get; set; }
    }

    /// <summary>
    /// Struct used in the V1 tests
    /// </summary>
    [Serializable]
    public struct StructItem
    {
        public string TestStr;
    }

Results of testing common data types:

    New data type       Test value            Deserialization result
    int                 100                   Success
    int[]               {1,100}               Success
    string              "test"                Success
    string[]            {"a","1"}             Success
    double              1d                    Success
    double[]            {1d,2d}               Success
    bool                true                  Success
    bool[]              {false,true}          Success
    List<string>        null                  Success
    List<string>        {}                    Success
    List<string>        {"1","a"}             Success
    List<int>           null                  Success
    List<int>           {}                    Success
    List<int>           {1,100}               Success
    List<double>        null                  Success
    List<double>        {}                    Success
    List<double>        {1d,100d}             Success
    List<bool>          null                  Success
    List<bool>          {}                    Success
    List<bool>          {true,false}          Success
    ObjectItem          null                  Success
    ObjectItem          new ObjectItem()      Success
    ObjectItem[]        {}                    Success
    ObjectItem[]        {new ObjectItem()}    Failed (when deserializing, the client does not have the ObjectItem class)
    ObjectItem[]        {new ObjectItem()}    Success (when deserializing, the client has the ObjectItem class)
    List<ObjectItem>    null                  Success
    List<ObjectItem>    {}                    Success
    List<ObjectItem>    {new ObjectItem()}    Failed (when deserializing, the client does not have the ObjectItem class)
    List<ObjectItem>    {new ObjectItem()}    Success (when deserializing, the client has the ObjectItem class)
    StructItem          null                  Success
    StructItem          new StructItem()      Success
    List<StructItem>    null                  Success
    List<StructItem>    {}                    Success
    List<StructItem>    {new StructItem()}    Success (when deserializing, the client does not have the ObjectItem class)
    List<StructItem>    {new StructItem()}    Success (when deserializing, the client has the ObjectItem class)

Summary of test results: when binary data is deserialized, data newly added by the serializer is normally handled automatically and compatibly. In some cases, however, deserialization throws an exception.

Data types that triggered deserialization exceptions:

    1. Generic collections
    2. Arrays

These two data structures do not always cause binary deserialization errors. They fail only under certain conditions. For a generic collection, three conditions must all hold:

    1. A generic collection is added to the serialized object;
    2. The generic type argument is a new class;
    3. The new class does not exist on the deserializing side.

Arrays behave the same way. Binary deserialization fails only when all three conditions are met, and in the tests above only collections or arrays that actually contained an instance of the new class failed, while null and empty ones succeeded. This explains why earlier releases caused no problem, yet the microservice client reported an error as soon as the generic collection was assigned a value.
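The three conditions can be sketched with a single project that is built twice. This is only an illustration under assumptions: the Payload type, the V1 compilation symbol and the payload.dat file name are hypothetical, the code targets .NET Framework (where BinaryFormatter is available), and both builds must produce an assembly with the same name. According to the test results in this article, the V2 build is expected to report a SerializationException, but the sketch does not claim a specific message.

```csharp
// Build twice from the same project so the assembly name stays identical:
//   V1 build: define the compilation symbol V1 (serializer side)
//   V2 build: leave V1 undefined (deserializer side with the older contract)
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

#if V1
// Condition 2: a new class that only the serializer knows about.
[Serializable]
public class ObjectItem
{
    public string TestStr { get; set; }
}
#endif

// Hypothetical contract type shared by both builds.
[Serializable]
public class Payload
{
    public string Name { get; set; }
#if V1
    // Condition 1: a generic collection added on the serializer side.
    public List<ObjectItem> Items { get; set; }
#endif
}

public static class Program
{
    public static void Main()
    {
        var formatter = new BinaryFormatter();
#if V1
        // The collection holds a real instance, so the stream references ObjectItem.
        var payload = new Payload
        {
            Name = "test",
            Items = new List<ObjectItem> { new ObjectItem() }
        };
        using (var stream = File.Create("payload.dat"))
        {
            formatter.Serialize(stream, payload);
        }
#else
        // Condition 3: ObjectItem does not exist in this build.
        try
        {
            using (var stream = File.OpenRead("payload.dat"))
            {
                var payload = (Payload)formatter.Deserialize(stream);
                Console.WriteLine("Deserialized: " + payload.Name);
            }
        }
        catch (SerializationException ex)
        {
            Console.WriteLine("Deserialization failed: " + ex.Message);
        }
#endif
    }
}
```

Run the V1 build first to produce payload.dat, then run the V2 build in the same directory. Changing the V1 side to assign null or an empty list to Items should correspond to the rows of the table that succeeded.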

Now that testing has shown that binary deserialization does have an automatic compatibility mechanism, it is worth studying the theory behind its fault tolerance as documented on MSDN.

V Fault-Tolerance Mechanism of Binary Deserialization

During binary deserialization it is inevitable that the assembly version used to serialize differs from the one used to deserialize. Forcing the deserializer (for example, the microservice client) to stay in lockstep with the serializer (for example, the microservice server) at all times is unrealistic in practice. Starting with .NET Framework 2.0, .NET provides version tolerant serialization (VTS) for binary deserialization.

 The VTS features are enabled when BinaryFormatter is used. In particular, they are enabled for classes (including generic types) to which the SerializableAttribute attribute has been applied. VTS makes it possible to add new fields to those classes without breaking compatibility with other versions of the type.

When the client and server assemblies differ during serialization and deserialization, .NET does its best to stay compatible, so the mechanism goes largely unnoticed in day-to-day use and is simply taken for granted.

To ensure correct versioning behavior, follow these rules when modifying a type from one version to the next:

• Never remove a serialized field.
• Never apply the NonSerializedAttribute attribute to a field if the attribute was not applied to that field in the previous version.
• Never change the name or type of a serialized field.
• When adding a new serialized field, apply the OptionalFieldAttribute attribute.
• When removing the NonSerializedAttribute attribute from a field that was not serializable in a previous version, apply the OptionalFieldAttribute attribute.
• For all optional fields, use the serialization callbacks to set meaningful default values, unless 0 or null is acceptable as the default.

To ensure that a type remains compatible with future serialization engines, follow these guidelines:

• Always set the VersionAdded property on the OptionalFieldAttribute attribute correctly.
• Avoid branched versioning.
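As a minimal sketch of the rules above, the following shows a field added in a second version of a type, marked with OptionalFieldAttribute and given a default through an OnDeserializing callback. The Address type, the CountryCode field and the "US" default are hypothetical names chosen for illustration, and the code assumes .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type, shown as version 2 of its contract.
[Serializable]
public class Address
{
    public string Street;
    public string City;

    // New in version 2: streams written by version 1 do not contain this field.
    [OptionalField(VersionAdded = 2)]
    public string CountryCode;

    // Runs before the fields are read from the stream, so it provides the
    // value that remains when the stream has no CountryCode entry.
    [OnDeserializing]
    private void SetDefaults(StreamingContext context)
    {
        CountryCode = "US";
    }
}

public static class VtsDemo
{
    // Serializes and deserializes a value in memory.
    public static Address RoundTrip(Address value)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, value);
            stream.Position = 0;
            return (Address)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var copy = RoundTrip(new Address
        {
            Street = "One Microsoft Way",
            City = "Redmond",
            CountryCode = "CA"
        });
        // The value stored in the stream replaces the callback's default.
        Console.WriteLine(copy.CountryCode);
    }
}
```

OptionalFieldAttribute applies to fields. For an auto-implemented property, the attribute has to target the compiler-generated backing field, for example with `[field: OptionalField]`.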

VI Structure of Binary Serialized Data

The previous section covered the theory of binary serialization and version compatibility. Next, we look directly at the binary serialization output we use every day, so that it stops feeling unfamiliar.

6.1 Data sent during remote call

The .NET microservice framework we currently use serializes data in binary form. When a remote call is made, what data does the client actually send to the server?

The data actually sent by the client is as follows:

 0000  00 01 00 00 00 FF FF FF FF 01 00 00 00 00 00 00 .....ÿÿÿÿ.......
 0010  00 15 14 00 00 00 12 0B 53 65 6E 64 41 64 64 72 ........SendAddr
 0020  65 73 73 12 6F 44 4F 4A 52 65 6D 6F 74 69 6E 67 ess.oDOJRemoting
 0030  4D 65 74 61 64 61 74 61 2E 4D 79 53 65 72 76 65 Metadata.MyServe
 0040  72 2C 20 44 4F 4A 52 65 6D 6F 74 69 6E 67 4D 65 r, DOJRemotingMe
 0050  74 61 64 61 74 61 2C 20 56 65 72 73 69 6F 6E 3D tadata, Version=
 0060  31 2E 30 2E 32 36 32 32 2E 33 31 33 32 36 2C 20 1.0.2622.31326, 
 0070  43 75 6C 74 75 72 65 3D 6E 65 75 74 72 61 6C 2C Culture=neutral,
 0080  20 50 75 62 6C 69 63 4B 65 79 54 6F 6B 65 6E 3D  PublicKeyToken=
 0090  6E 75 6C 6C 10 01 00 00 00 01 00 00 00 09 02 00 null............
 00A0  00 00 0C 03 00 00 00 51 44 4F 4A 52 65 6D 6F 74 .......QDOJRemot
 00B0  69 6E 67 4D 65 74 61 64 61 74 61 2C 20 56 65 72 ingMetadata, Ver
 00C0  73 69 6F 6E 3D 31 2E 30 2E 32 36 32 32 2E 33 31 sion=1.0.2622.31
 00D0  33 32 36 2C 20 43 75 6C 74 75 72 65 3D 6E 65 75 326, Culture=neu
 00E0  74 72 61 6C 2C 20 50 75 62 6C 69 63 4B 65 79 54 tral, PublicKeyT
 00F0  6F 6B 65 6E 3D 6E 75 6C 6C 05 02 00 00 00 1B 44 oken=null......D
 0100  4F 4A 52 65 6D 6F 74 69 6E 67 4D 65 74 61 64 61 OJRemotingMetada
 0110  74 61 2E 41 64 64 72 65 73 73 04 00 00 00 06 53 ta.Address.....S
 0120  74 72 65 65 74 04 43 69 74 79 05 53 74 61 74 65 treet.City.State
 0130  03 5A 69 70 01 01 01 01 03 00 00 00 06 04 00 00 .Zip............
 0140  00 11 4F 6E 65 20 4D 69 63 72 6F 73 6F 66 74 20 ..One Microsoft 
 0150  57 61 79 06 05 00 00 00 07 52 65 64 6D 6F 6E 64 Way......Redmond
 0160  06 06 00 00 00 02 57 41 06 07 00 00 00 05 39 38 ......WA......98
 0170  30 35 34 0B                                     054.

The data above is binary. The serialized result contains the assembly information, the called method, the parameter classes used, their properties, and the value of each property. For a detailed analysis of this data, see reference 4.

6.2 Binary Serialization Results of Class Objects

There is no ready-made example of the serialized result of a class object, so we designed a simple scenario that saves the serialized data to a local file.

    using System;
    using System.IO;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Formatters.Binary;

    /// <summary>
    /// Custom serialized object
    /// </summary>
    [Serializable]
    public class MyObject
    {
        public bool BoolMember { get; set; }
        public int IntMember { get; set; }
    }

    /// <summary>
    /// Program entry
    /// </summary>
    class Program
    {
        static void Main(string[] args)
        {
            var obj = new MyObject();
            obj.BoolMember = true;
            obj.IntMember = 10000;

            // Serialize the object into data.dat in the working directory.
            IFormatter formatter = new BinaryFormatter();
            Stream stream = new FileStream("data.dat", FileMode.Create, FileAccess.Write, FileShare.None);
            formatter.Serialize(stream, obj);
            stream.Close();
        }
    }

Content in data.dat:

 0000: 00 01 00 00 00 ff ff ff ff 01 00 00 00 00 00 00  ................
 0010: 00 0c 02 00 00 00 4e 42 69 6e 61 72 79 53 65 72  ......NBinarySer
 0020: 69 61 6c 69 7a 65 50 72 61 63 74 69 73 65 2c 20  ializePractise, 
 0030: 56 65 72 73 69 6f 6e 3d 31 2e 30 2e 30 2e 30 2c  Version=1.0.0.0,
 0040: 20 43 75 6c 74 75 72 65 3d 6e 65 75 74 72 61 6c   Culture=neutral
 0050: 2c 20 50 75 62 6c 69 63 4b 65 79 54 6f 6b 65 6e  , PublicKeyToken
 0060: 3d 6e 75 6c 6c 05 01 00 00 00 20 42 69 6e 61 72  =null..... Binar
 0070: 79 53 65 72 69 61 6c 69 7a 65 50 72 61 63 74 69  ySerializePracti
 0080: 73 65 2e 4d 79 4f 62 6a 65 63 74 02 00 00 00 1b  se.MyObject.....
 0090: 3c 42 6f 6f 6c 4d 65 6d 62 65 72 3e 6b 5f 5f 42  <BoolMember>k__B
 00a0: 61 63 6b 69 6e 67 46 69 65 6c 64 1a 3c 49 6e 74  ackingField.<Int
 00b0: 4d 65 6d 62 65 72 3e 6b 5f 5f 42 61 63 6b 69 6e  Member>k__Backin
 00c0: 67 46 69 65 6c 64 00 00 01 08 02 00 00 00 01 10  gField..........
 00d0: 27 00 00 0b                                      '...
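A listing in this layout can be produced with a small helper. The following is a minimal sketch, not the tool used for the article: the HexDump class name is hypothetical, and it assumes data.dat sits in the working directory.

```csharp
using System;
using System.IO;
using System.Text;

public static class HexDump
{
    // Formats bytes as "offset: 16 hex bytes  ASCII", one row per 16 bytes.
    public static string Format(byte[] data)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += 16)
        {
            int count = Math.Min(16, data.Length - offset);
            sb.Append(offset.ToString("x4")).Append(": ");

            // Hex column, padded so the ASCII column stays aligned on a short last row.
            for (int i = 0; i < 16; i++)
            {
                sb.Append(i < count ? data[offset + i].ToString("x2") + " " : "   ");
            }
            sb.Append(' ');

            // ASCII column: printable characters as-is, everything else as '.'.
            for (int i = 0; i < count; i++)
            {
                byte b = data[offset + i];
                sb.Append(b >= 0x20 && b < 0x7f ? (char)b : '.');
            }
            sb.AppendLine();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Format(File.ReadAllBytes("data.dat")));
    }
}
```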

The binary serialization result of a class object has a different structure from the data sent in the remote call scenario.

According to [MS-NRBF], the stream begins with the serialization header record, which contains RecordTypeEnum, RootId, HeaderId, MajorVersion and MinorVersion. It is followed by a BinaryLibrary record that names the assembly, then a class record that carries the class name, its members and their types, and finally the value of each member.

 Binary Serialization Format
   SerializationHeaderRecord:
     RecordTypeEnum: SerializedStreamHeader (0x00)
     RootId: 1 (0x1)
     HeaderId: -1 (0xFFFFFFFF)
     MajorVersion: 1 (0x1)
     MinorVersion: 0 (0x0)
   BinaryLibrary:
     RecordTypeEnum: BinaryLibrary (0x0c)
     LibraryId: 2 (0x00000002)
     LibraryName:
       LengthPrefixedString:
         Length: 78 (0x4e)
         String: BinarySerializePractise, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null
   ClassWithMembersAndTypes:
     RecordTypeEnum: ClassWithMembersAndTypes (0x05)
     ClassInfo:
       ObjectId: 1 (0x00000001)
       Name:
         LengthPrefixedString:
           Length: 32 (0x20)
           String: BinarySerializePractise.MyObject
       MemberCount: 2 (0x00000002)
       MemberNames:
         LengthPrefixedString:
           Length: 27 (0x1b)
           String: <BoolMember>k__BackingField
         LengthPrefixedString:
           Length: 26 (0x1a)
           String: <IntMember>k__BackingField
     MemberTypeInfo:
       BinaryTypeEnums: Primitive (0x00), Primitive (0x00)
       AdditionalInfos: Boolean (0x01), Int32 (0x08)
     LibraryId: 2 (0x00000002)
     Member values:
       Value: true (0x01)
       Value: 10000 (0x00002710)
   MessageEnd:
     RecordTypeEnum: MessageEnd (0x0b)
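The first two records of such a stream can be read back with a plain BinaryReader, which makes the layout easier to verify by hand. The sketch below is a partial reader under assumptions, not a full [MS-NRBF] parser: the NrbfPeek class name is hypothetical, it only understands the header record and a directly following BinaryLibrary record, and it assumes data.dat is in the working directory. It relies on BinaryReader reading little-endian integers and on ReadString using a 7-bit encoded length prefix with UTF-8 text, which matches the LengthPrefixedString layout.

```csharp
using System;
using System.IO;

public static class NrbfPeek
{
    // Reads the SerializationHeaderRecord and, if it follows directly,
    // the BinaryLibrary record. The stream must contain at least one
    // more byte after the 17-byte header.
    public static string Describe(Stream input)
    {
        using (var reader = new BinaryReader(input))
        {
            if (reader.ReadByte() != 0x00)
            {
                throw new InvalidDataException("Stream does not start with a SerializedStreamHeader record.");
            }

            int rootId = reader.ReadInt32();
            int headerId = reader.ReadInt32();
            int majorVersion = reader.ReadInt32();
            int minorVersion = reader.ReadInt32();
            string text = string.Format(
                "RootId={0}, HeaderId={1}, Version={2}.{3}",
                rootId, headerId, majorVersion, minorVersion);

            // 0x0C marks a BinaryLibrary record: an Int32 id plus a length-prefixed name.
            if (reader.ReadByte() == 0x0C)
            {
                int libraryId = reader.ReadInt32();
                string libraryName = reader.ReadString();
                text += string.Format("; LibraryId={0}, LibraryName={1}", libraryId, libraryName);
            }

            return text;
        }
    }

    public static void Main()
    {
        using (var file = File.OpenRead("data.dat"))
        {
            Console.WriteLine(Describe(file));
        }
    }
}
```

Reading the remaining records the same way means handling each record type in turn, starting with the ClassWithMembersAndTypes record (0x05) shown in the breakdown.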

VII Summary

Although binary serialization and deserialization are the main way our current microservices process data, they remain mysterious to developers, who know little about the serialized data or the deserialization mechanism. In this article, through the analysis of one accident, we covered the deserialization mechanism, deserialization compatibility, and the structure of serialized data. We hope this makes binary serialization feel less unfamiliar and deepens our understanding of it.

VIII Reference materials

  1. Some gotchas in backward compatibility
  2. Version Tolerant Serialization
  3. [MS-NRBF]: .NET Remoting: Binary Format Data Structure
  4. [MS-NRBF]: 3 Structure Examples

Original article by Mo Tao, if reproduced, please indicate the source: https://imotao.com/583.html
