PolarSPARC

Introduction to Google Protocol Buffers


Bhaskar S 07/04/2020


Overview

Google Protocol Buffers (often referred to as protobuf) is a language-neutral and platform-neutral data serialization framework.

Installation and Setup

The installation is on an Ubuntu 20.04 LTS based Linux desktop.

We need to install the packages for the protobuf compiler called protobuf-compiler and the Python language bindings called python3-protobuf, from the Ubuntu repository.

To install the mentioned packages, execute the following commands:

$ sudo apt-get update

$ sudo apt-get install protobuf-compiler -y

$ sudo apt-get install python3-protobuf -y

For Java language bindings, we will need the JAR file called protobuf-java-3.x.y.jar, where x.y is the current version. At the time of this article, the current version is 3.11.4. We will leverage Maven to manage the Java dependencies.

The following is the Maven pom.xml file we will use:

pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

    <modelVersion>4.0.0</modelVersion>
    <groupId>com.polarsparc.protobuf3</groupId>
    <artifactId>Protobuf3</artifactId>
    <version>1.0</version>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.google.protobuf</groupId>
            <artifactId>protobuf-java</artifactId>
            <version>3.11.4</version>
        </dependency>

        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <version>5.6.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.8.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>3.2.0</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>add-source</goal>
                        </goals>
                        <configuration>
                            <sources>
                                <source>target/generated-sources/main/java</source>
                            </sources>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.xolstice.maven.plugins</groupId>
                <artifactId>protobuf-maven-plugin</artifactId>
                <version>0.6.1</version>
                <configuration>
                    <attachProtoSources>false</attachProtoSources>
                    <checkStaleness>true</checkStaleness>
                    <clearOutputDirectory>false</clearOutputDirectory>
                    <outputDirectory>target/generated-sources/main/java</outputDirectory>
                    <protocExecutable>/usr/bin/protoc</protocExecutable>
                    <protoSourceRoot>src/main/proto</protoSourceRoot>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>test-compile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Once the data format is defined and saved in a .proto file, we run the protobuf compiler (called protoc) on the .proto file to generate the data access classes for the desired language bindings (Java, Python, etc).

In this article, we will demonstrate serialization and deserialization using both the Java and Python language bindings.

Hands-on with Google Protocol Buffers

We will demonstrate the ability to both serialize and deserialize a simple object Customer with scalar types using protobuf.

The following is the schema definition for a Customer object defined in the file Customer.proto located in the directory src/main/proto as shown below:

Customer.proto
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

syntax = "proto3";

package com.polarsparc.protobuf3.simple;

message Customer {
  string first_name = 1;
  string last_name = 2;
  string email_id = 3;

  int32 age = 4;

  repeated string phone_no = 5;
}

Google Protocol Buffers version 3, referred to as proto3, is selected using the syntax keyword.

To avoid any namespace collisions, we use the package keyword.

A Customer object is defined using the message keyword. Each field within the Customer object is defined using the following general syntax:

[repeated] <field-type> <field-name> = <field-tag>

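where,

repeated is an optional keyword indicating that the field can hold zero or more values

<field-type> is a scalar type (such as string or int32), an enum type, or another message type

<field-name> is the name of the field

<field-tag> is a number, unique within the message, that identifies the field in the encoded binary data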
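To illustrate why the field tag matters, the following is a minimal Python sketch of how the protobuf wire format combines a field tag with a wire type to form the key that precedes each encoded value. The function name field_key and the constant names are hypothetical and chosen for illustration; they are not part of the protobuf library:

```python
# Wire type codes used by the protobuf binary encoding
WIRE_VARINT = 0  # int32, int64, bool, enum
WIRE_LEN = 2     # string, bytes, embedded messages

def field_key(field_tag: int, wire_type: int) -> int:
    """Combine a field tag and a wire type into the key that prefixes a value."""
    return (field_tag << 3) | wire_type

# Field tags taken from Customer.proto
print(hex(field_key(1, WIRE_LEN)))     # first_name -> 0xa
print(hex(field_key(4, WIRE_VARINT)))  # age        -> 0x20
print(hex(field_key(5, WIRE_LEN)))     # phone_no   -> 0x2a
```

Since the encoded data carries the field tag and not the field name, a field can be renamed without affecting existing data, whereas changing its tag cannot be done safely.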

Let us now compile the Customer.proto file for Java and Python.

For the Java binding, run Maven compile. This will generate a Java file called CustomerOuterClass.java in the directory target/generated-sources/main/java/com/polarsparc/protobuf3/simple, matching the package declared in Customer.proto.

Let us create a Java program called CustomerTest.java to use the generated class in the directory src/main/java/com/polarsparc/protobuf3 as shown below:

CustomerTest.java
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

package com.polarsparc.protobuf3;

import com.google.protobuf.InvalidProtocolBufferException;

import com.polarsparc.protobuf3.simple.CustomerOuterClass.Customer;

public class CustomerTest {
    public static void main(String[] args) {
        Customer.Builder builder = Customer.newBuilder();

        builder.setFirstName("Bugs")
                .setLastName("Bunny")
                .setEmailId("bugs.b@carrot.co")
                .addPhoneNo("100-100-1000")
                .addPhoneNo("100-100-1005");

        Customer customer = builder.build();

        System.out.printf("Customer fields: %s\n", customer.getAllFields());
        System.out.printf("Customer data size: %d\n", customer.getSerializedSize());
        System.out.printf("Customer: %s\n", customer);

        byte[] data = customer.toByteArray();

        Customer customer2 = null;
        try {
            customer2 = Customer.parseFrom(data);
        }
        catch (InvalidProtocolBufferException ex) {
            System.out.printf("Exception: %s\n", ex.getMessage());
        }

        System.out.printf("Customer deserialized: %s\n", customer2);
    }
}

The Java class(es) generated by the protobuf compiler are all immutable. To construct a Customer object, one must first use the corresponding builder class (called Customer.Builder) to set the field values and then call the build() method to get the object.

Executing the above Java program CustomerTest.java produces the results shown in Output.1 below:

Output.1

Customer fields: {com.polarsparc.protobuf3.Customer.first_name=Bugs, com.polarsparc.protobuf3.Customer.last_name=Bunny, com.polarsparc.protobuf3.Customer.email_id=bugs.b@carrot.co, com.polarsparc.protobuf3.Customer.phone_no=[100-100-1000, 100-100-1005]}
Customer data size: 59
Customer: first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"

Customer deserialized: first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"
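The data size of 59 bytes reported in Output.1 can be cross-checked by hand. The following is a minimal pure-Python sketch, assuming the standard protobuf wire format in which each string field is written as a varint key, a varint length, and the UTF-8 bytes. The helper names encode_varint, encode_string, and encode_customer are hypothetical and are not part of the protobuf library:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def encode_string(field_tag: int, text: str) -> bytes:
    """Encode a string field: key, byte length, then the UTF-8 bytes."""
    payload = text.encode("utf-8")
    key = (field_tag << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(key) + encode_varint(len(payload)) + payload

def encode_customer(first_name="", last_name="", email_id="", age=0, phone_nos=()):
    """Hand-encode the fields of the Customer message in field tag order."""
    data = b""
    for field_tag, text in ((1, first_name), (2, last_name), (3, email_id)):
        if text:  # proto3 skips a string field that is empty (its default)
            data += encode_string(field_tag, text)
    if age:  # skipped when 0; negative values are not handled in this sketch
        data += encode_varint((4 << 3) | 0) + encode_varint(age)
    for phone in phone_nos:  # a repeated string is written once per element
        data += encode_string(5, phone)
    return data

data = encode_customer(first_name="Bugs", last_name="Bunny",
                       email_id="bugs.b@carrot.co",
                       phone_nos=["100-100-1000", "100-100-1005"])
print(len(data))  # 59
```

The age field contributes nothing to the size because it was never set, and proto3 does not encode a scalar field that holds its default value.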

Now, switching gears to the Python binding, compile the Customer.proto file using the following command:

$ protoc --python_out=. ./Customer.proto

The compilation will generate a Python file called Customer_pb2.py in the specified directory, which is the current directory.

Let us create a Python script called CustomerTest.py to use the generated script in the current directory as shown below:

CustomerTest.py
#
# @Author: Bhaskar S
# @Blog:   https://www.polarsparc.com
# @Date:   04 Jul 2020
#
        
import Customer_pb2

customer = Customer_pb2.Customer()
customer.first_name = "Bugs"
customer.last_name = "Bunny"
customer.email_id = "bugs.b@carrot.co"
customer.phone_no.append("100-100-1000")
customer.phone_no.append("100-100-1005")

print("Customer fields: %s" % customer.ListFields())
print("Customer data size: %s" % customer.ByteSize())
print("Customer: %s" % customer)

data = customer.SerializeToString()

customer2 = Customer_pb2.Customer()
customer2.ParseFromString(data)

print("Customer deserialized: %s" % customer2)

To construct a Customer object, one must first import the generated module (called Customer_pb2) and invoke the empty constructor. One can then set the field values in the object like any regular Python object.

Executing the above Python script CustomerTest.py produces the results shown in Output.2 below:

Output.2

Customer fields: [(<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe687eca290>, 'Bugs'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe687eca2b0>, 'Bunny'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe687eca2d0>, 'bugs.b@carrot.co'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe687eca2f0>, ['100-100-1000', '100-100-1005'])]
Customer data size: 59
Customer: first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"

Customer deserialized: first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"

Now, we will demonstrate the ability to both serialize and deserialize an object Account with complex types using protobuf.

The following is the schema definition for Customer and Account objects defined in the file CustomerAccount.proto located in the directory src/main/proto as shown below:

CustomerAccount.proto
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

syntax = "proto3";

package com.polarsparc.protobuf3.complex;

option java_outer_classname = "CustomerAccount";

message Customer {
  string first_name = 1;
  string last_name = 2;
  string email_id = 3;

  int32 age = 4;

  repeated string phone_no = 5;
}

enum AccountType {
  CA_UNKNOWN = 0;
  CA_SAVINGS = 1;
  CA_CHECKING = 2;
  CA_BROKERAGE = 3;
}

message Account {
  string acct_no = 1;

  AccountType acct_type = 2;

  Customer customer = 3;
}

Notice the use of the option keyword with java_outer_classname to force the name of the outer class generated by the protobuf compiler.

To define a fixed set of named constants, we use the enum keyword. In this example AccountType is defined as an enum with the constants CA_UNKNOWN, CA_SAVINGS, CA_CHECKING, and CA_BROKERAGE. In proto3, the first constant of an enum *MUST* map to zero, so that 0 can serve as the numeric default value.

A field in a message can refer to other message types. In this example, one of the fields in the Account type references the Customer type.

Let us now compile the CustomerAccount.proto file for Java and Python.

For the Java binding, run Maven compile. This will generate a Java file called CustomerAccount.java in the directory target/generated-sources/main/java/com/polarsparc/protobuf3/complex.

Let us create a Java program called CustomerAccountTest.java to use the generated class(es) in the directory src/main/java/com/polarsparc/protobuf3 as shown below:

CustomerAccountTest.java
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

package com.polarsparc.protobuf3;

import com.google.protobuf.InvalidProtocolBufferException;

import com.polarsparc.protobuf3.complex.CustomerAccount.Account;
import com.polarsparc.protobuf3.complex.CustomerAccount.AccountType;
import com.polarsparc.protobuf3.complex.CustomerAccount.Customer;

public class CustomerAccountTest {
    public static void main(String[] args) {
        Customer customer = Customer.newBuilder()
                .setFirstName("Bugs")
                .setLastName("Bunny")
                .setEmailId("bugs.b@carrot.co")
                .addPhoneNo("100-100-1000")
                .addPhoneNo("100-100-1005")
                .build();

        Account account = Account.newBuilder()
                .setAcctNo("12345")
                .setAcctType(AccountType.CA_BROKERAGE)
                .setCustomer(customer)
                .build();

        System.out.printf("Account fields: %s\n", account.getAllFields());
        System.out.printf("Account data size: %d\n", account.getSerializedSize());
        System.out.printf("Account: %s\n", account);

        byte[] data = account.toByteArray();

        Account account2 = null;
        try {
            account2 = Account.parseFrom(data);
        }
        catch (InvalidProtocolBufferException ex) {
            System.out.printf("Exception: %s\n", ex.getMessage());
        }

        System.out.printf("Account deserialized: %s\n", account2);
    }
}

Executing the above Java program CustomerAccountTest.java produces the results shown in Output.3 below:

Output.3

Account fields: {com.polarsparc.protobuf3.Account.acct_no=12345, com.polarsparc.protobuf3.Account.acct_type=CA_BROKERAGE, com.polarsparc.protobuf3.Account.customer=first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"
}
Account data size: 70
Account: acct_no: "12345"
acct_type: CA_BROKERAGE
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@carrot.co"
  phone_no: "100-100-1000"
  phone_no: "100-100-1005"
}

Account deserialized: acct_no: "12345"
acct_type: CA_BROKERAGE
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@carrot.co"
  phone_no: "100-100-1000"
  phone_no: "100-100-1005"
}
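The 70 bytes reported in Output.3 follow from how an embedded message is encoded: the nested Customer is serialized first and then written into the Account as a length-delimited field, much like a string. The following is a minimal pure-Python sketch under that assumption; the helper names encode_varint, length_delimited, and varint_field are hypothetical and are not part of the protobuf library:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def length_delimited(field_tag: int, payload: bytes) -> bytes:
    """Key with wire type 2, then the payload length, then the payload."""
    return encode_varint((field_tag << 3) | 2) + encode_varint(len(payload)) + payload

def varint_field(field_tag: int, value: int) -> bytes:
    """Key with wire type 0, then the value as a varint (used for enums)."""
    return encode_varint((field_tag << 3) | 0) + encode_varint(value)

# The embedded Customer message, encoded on its own first
customer = b"".join([
    length_delimited(1, b"Bugs"),
    length_delimited(2, b"Bunny"),
    length_delimited(3, b"bugs.b@carrot.co"),
    length_delimited(5, b"100-100-1000"),
    length_delimited(5, b"100-100-1005"),
])

CA_BROKERAGE = 3  # numeric value from the AccountType enum

account = b"".join([
    length_delimited(1, b"12345"),   # acct_no
    varint_field(2, CA_BROKERAGE),   # acct_type
    length_delimited(3, customer),   # customer, embedded as bytes
])

print(len(customer), len(account))  # 59 70
```

The Account adds 11 bytes on top of the 59-byte Customer: 7 for acct_no, 2 for acct_type, and 2 for the key and length that wrap the embedded message.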

Now, switching gears to the Python binding, compile the CustomerAccount.proto file using the following command:

$ protoc --python_out=. ./CustomerAccount.proto

The compilation will generate a Python file called CustomerAccount_pb2.py in the specified directory, which is the current directory.

Let us create a Python script called CustomerAccountTest.py to use the generated script in the current directory as shown below:

CustomerAccountTest.py
#
# @Author: Bhaskar S
# @Blog:   https://www.polarsparc.com
# @Date:   04 Jul 2020
#

import CustomerAccount_pb2

account = CustomerAccount_pb2.Account()
account.acct_no = "12345"
account.acct_type = CustomerAccount_pb2.CA_BROKERAGE
account.customer.first_name = "Bugs"
account.customer.last_name = "Bunny"
account.customer.email_id = "bugs.b@carrot.co"
account.customer.phone_no.append("100-100-1000")
account.customer.phone_no.append("100-100-1005")

print("Account fields: %s" % account.ListFields())
print("Account data size: %s" % account.ByteSize())
print("Account: %s" % account)

data = account.SerializeToString()

account2 = CustomerAccount_pb2.Account()
account2.ParseFromString(data)

print("Account deserialized: %s" % account2)

Notice how the fields of the Customer object within the Account object are set in Python.

Executing the above Python script CustomerAccountTest.py produces the results shown in Output.4 below:

Output.4

Account fields: [(<google.protobuf.pyext._message.FieldDescriptor object at 0x7f19b6f9de10>, '12345'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7f19b6f9de30>, 2), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7f19b6f9de50>, first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@carrot.co"
phone_no: "100-100-1000"
phone_no: "100-100-1005"
)]
Account data size: 70
Account: acct_no: "12345"
acct_type: CA_BROKERAGE
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@carrot.co"
  phone_no: "100-100-1000"
  phone_no: "100-100-1005"
}

Account deserialized: acct_no: "12345"
acct_type: CA_BROKERAGE
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@carrot.co"
  phone_no: "100-100-1000"
  phone_no: "100-100-1005"
}

In the above example, we had all the object definitions in a single CustomerAccount.proto schema definition file.

We could modularize the object definitions and separate them into two .proto files - one for the Customer related object(s) and the other for the Account related object(s).

The following is the schema definition for the Customer2 related object(s) defined in the file Customer2.proto located in the directory src/main/proto as shown below:

Customer2.proto
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

syntax = "proto3";

package com.polarsparc.protobuf3.modular;

option java_outer_classname = "CustomerInfo";
option java_multiple_files = true;

enum PhoneType2 {
  PT_UNKNOWN = 0;
  PT_HOME = 1;
  PT_MOBILE = 2;
  PT_WORK = 3;
}

message PhoneNumber2 {
  string number = 1;

  PhoneType2 type = 2;
}

message Customer2 {
  string first_name = 1;
  string last_name = 2;
  string email_id = 3;

  int32 age = 4;

  repeated PhoneNumber2 phone_no = 5;
}

Notice the use of the option keyword with java_multiple_files which causes top-level messages, enums, etc to be defined at the package level, rather than inside an outer class file.

And, here is the schema definition for the Account2 related object(s) defined in the file Account2.proto located in the directory src/main/proto as shown below:

Account2.proto
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

syntax = "proto3";

package com.polarsparc.protobuf3.modular;

option java_outer_classname = "AccountInfo";
option java_multiple_files = true;

import "Customer2.proto";

enum AccountType2 {
  AT_UNKNOWN = 0;
  AT_SAVINGS = 1;
  AT_CHECKING = 2;
  AT_BROKERAGE = 3;
}

message Account2 {
  string acct_no = 1;

  AccountType2 acct_type = 2;

  Customer2 customer = 3;
}

In the above Account2.proto file, we import the Customer2.proto file.

Let us now compile both the Customer2.proto and the Account2.proto files for Java and Python.

For the Java binding, run Maven compile. This will generate a Java file for each object type defined in the schema file(s) in the directory target/generated-sources/main/java/com/polarsparc/protobuf3/modular.

Let us create a Java program called CustomerAccountTest2.java to use the generated classes in the directory src/main/java/com/polarsparc/protobuf3 as shown below:

CustomerAccountTest2.java
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

package com.polarsparc.protobuf3;

import com.google.protobuf.InvalidProtocolBufferException;

import com.polarsparc.protobuf3.modular.*;

import java.util.Arrays;

public class CustomerAccountTest2 {
    public static void main(String[] args) {
        PhoneNumber2 phoneNo1 = PhoneNumber2.newBuilder()
                .setNumber("100-100-1000")
                .setType(PhoneType2.PT_MOBILE)
                .build();

        PhoneNumber2 phoneNo2 = PhoneNumber2.newBuilder()
                .setNumber("100-100-1005")
                .setType(PhoneType2.PT_WORK)
                .build();

        Customer2 customer = Customer2.newBuilder()
                .setFirstName("Bugs")
                .setLastName("Bunny")
                .setEmailId("bugs.b@looney.us")
                .addAllPhoneNo(Arrays.asList(phoneNo1, phoneNo2))
                .build();

        Account2 account = Account2.newBuilder()
                .setAcctNo("12345")
                .setAcctType(AccountType2.AT_SAVINGS)
                .setCustomer(customer)
                .build();

        System.out.printf("Account fields: %s\n", account.getAllFields());
        System.out.printf("Account data size: %d\n", account.getSerializedSize());
        System.out.printf("Account: %s\n", account);

        byte[] data = account.toByteArray();

        Account2 account2 = null;
        try {
            account2 = Account2.parseFrom(data);
        }
        catch (InvalidProtocolBufferException ex) {
            System.out.printf("Exception: %s\n", ex.getMessage());
        }

        System.out.printf("Account deserialized: %s\n", account2);
    }
}

Executing the above Java program CustomerAccountTest2.java produces the results shown in Output.5 below:

Output.5

Account fields: {com.polarsparc.protobuf3.Account2.acct_no=12345, com.polarsparc.protobuf3.Account2.acct_type=AT_SAVINGS, com.polarsparc.protobuf3.Account2.customer=first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.b@looney.us"
phone_no {
  number: "100-100-1000"
  type: PT_MOBILE
}
phone_no {
  number: "100-100-1005"
  type: PT_WORK
}
}
Account data size: 78
Account: acct_no: "12345"
acct_type: AT_SAVINGS
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@looney.us"
  phone_no {
    number: "100-100-1000"
    type: PT_MOBILE
  }
  phone_no {
    number: "100-100-1005"
    type: PT_WORK
  }
}

Account deserialized: acct_no: "12345"
acct_type: AT_SAVINGS
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.b@looney.us"
  phone_no {
    number: "100-100-1000"
    type: PT_MOBILE
  }
  phone_no {
    number: "100-100-1005"
    type: PT_WORK
  }
}

Now, switching gears to the Python binding, compile both the Customer2.proto and Account2.proto files using the following commands:

$ protoc --python_out=. ./Customer2.proto

$ protoc --python_out=. ./Account2.proto

The compilation will generate two Python files called Customer2_pb2.py and Account2_pb2.py in the specified directory, which is the current directory.

Let us create a Python script called CustomerAccountTest2.py to use the generated scripts in the current directory as shown below:

CustomerAccountTest2.py
#
# @Author: Bhaskar S
# @Blog:   https://www.polarsparc.com
# @Date:   04 Jul 2020
#

import Customer2_pb2
import Account2_pb2

account = Account2_pb2.Account2()
account.acct_no = "12345"
account.acct_type = Account2_pb2.AT_SAVINGS
account.customer.first_name = "Bugs"
account.customer.last_name = "Bunny"
account.customer.email_id = "bugs.bunny@looney.us"
mobile = account.customer.phone_no.add()
mobile.number = "100-100-1000"
mobile.type = Customer2_pb2.PT_MOBILE
work = account.customer.phone_no.add()
work.number = "100-100-1005"
work.type = Customer2_pb2.PT_WORK

print("Account fields: %s" % account.ListFields())
print("Account data size: %s" % account.ByteSize())
print("Account: %s" % account)

data = account.SerializeToString()

account2 = Account2_pb2.Account2()
account2.ParseFromString(data)

print("Account deserialized: %s" % account2)

Notice how the fields of the PhoneNumber2 object within the Customer2 object inside the Account2 object are set in Python.

Executing the above Python script CustomerAccountTest2.py produces the results shown in Output.6 below:

Output.6

Account fields: [(<google.protobuf.pyext._message.FieldDescriptor object at 0x7fa590626570>, '12345'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fa590626230>, 1), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fa590626330>, first_name: "Bugs"
last_name: "Bunny"
email_id: "bugs.bunny@looney.us"
phone_no {
  number: "100-100-1000"
  type: PT_MOBILE
}
phone_no {
  number: "100-100-1005"
  type: PT_WORK
}
)]
Account data size: 82
Account: acct_no: "12345"
acct_type: AT_SAVINGS
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.bunny@looney.us"
  phone_no {
    number: "100-100-1000"
    type: PT_MOBILE
  }
  phone_no {
    number: "100-100-1005"
    type: PT_WORK
  }
}

Account deserialized: acct_no: "12345"
acct_type: AT_SAVINGS
customer {
  first_name: "Bugs"
  last_name: "Bunny"
  email_id: "bugs.bunny@looney.us"
  phone_no {
    number: "100-100-1000"
    type: PT_MOBILE
  }
  phone_no {
    number: "100-100-1005"
    type: PT_WORK
  }
}

We will now demonstrate the ability to serialize an instance of a Customer object to a file using Python and then deserialize the same Customer instance from the file using Java.

Let us create a Python script called SerializeCustomerTest.py to serialize an instance of the Customer object to a file called /tmp/customer.bin as shown below:

SerializeCustomerTest.py
#
# @Author: Bhaskar S
# @Blog:   https://www.polarsparc.com
# @Date:   04 Jul 2020
#

import Customer_pb2

customer = Customer_pb2.Customer()
customer.first_name = "Wile E"
customer.last_name = "Coyote"
customer.phone_no.append("200-101-2001")
customer.phone_no.append("201-102-2002")

print("Customer fields: %s" % customer.ListFields())
print("Customer data size: %s" % customer.ByteSize())
print("Customer: %s" % customer)

with open('/tmp/customer.bin', 'wb') as bf:
    bf.write(customer.SerializeToString())

print("Customer object serialized to /tmp/customer.bin")

Executing the above Python script SerializeCustomerTest.py produces the results shown in Output.7 below:

Output.7

Customer fields: [(<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe8fc223290>, 'Wile E'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe8fc2232b0>, 'Coyote'), (<google.protobuf.pyext._message.FieldDescriptor object at 0x7fe8fc2232d0>, ['200-101-2001', '201-102-2002'])]
Customer data size: 44
Customer: first_name: "Wile E"
last_name: "Coyote"
phone_no: "200-101-2001"
phone_no: "201-102-2002"

Customer object serialized to /tmp/customer.bin

Notice that we have not set a value for the email_id field in the Customer object.

Let us create a Java program called DeserializeCustomerTest.java in the directory src/test/com/polarsparc/protobuf3 as shown below:

DeserializeCustomerTest.java
/*
    @Author: Bhaskar S
    @Blog:   https://www.polarsparc.com
    @Date:   04 Jul 2020
*/

package com.polarsparc.protobuf3;

import com.google.protobuf.InvalidProtocolBufferException;

import com.polarsparc.protobuf3.simple.CustomerOuterClass;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DeserializeCustomerTest {
    public static void main(String[] args) {
        String fname = "/tmp/customer.bin";

        byte[] data = null;
        try {
            data = Files.readAllBytes(Paths.get(fname));
        }
        catch (IOException ex) {
            System.out.printf("Exception: %s\n", ex.getMessage());
        }

        CustomerOuterClass.Customer customer = null;
        try {
            if (data != null) {
                customer = CustomerOuterClass.Customer.parseFrom(data);
            }
        }
        catch (InvalidProtocolBufferException ex) {
            System.out.printf("Exception: %s\n", ex.getMessage());
        }

        System.out.printf("Customer deserialized: %s\n", customer);
    }
}

Executing the above Java program DeserializeCustomerTest.java produces the results shown in Output.8 below:

Output.8

Customer deserialized: first_name: "Wile E"
last_name: "Coyote"
phone_no: "200-101-2001"
phone_no: "201-102-2002"

When an object is deserialized, any field that is absent from the encoded message is set to the default value for its type in the parsed object: false for type bool, empty bytes for type bytes, the first defined enum constant (which must be zero) for type enum, zero (0) for numeric types (such as int32, int64, etc.), the empty string for type string, and an empty list for repeated fields. This is why the email_id and age fields do not appear in Output.8: they hold their default values.
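The default-value behavior can be illustrated with a minimal pure-Python sketch of a decoder for the Customer message. It starts from a record filled with the proto3 defaults and overwrites only the fields that are present in the encoded bytes. The names read_varint, string_field, and decode_customer are hypothetical and are not part of the protobuf library, and the input is hand-built to mirror the data written by SerializeCustomerTest.py:

```python
FIELD_NAMES = {1: "first_name", 2: "last_name", 3: "email_id", 4: "age", 5: "phone_no"}

def read_varint(buf: bytes, pos: int):
    """Return (value, next_pos) for the varint that starts at pos."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # no continuation bit: last byte of the varint
            return result, pos
        shift += 7

def decode_customer(buf: bytes) -> dict:
    """Decode Customer bytes, leaving absent fields at their defaults."""
    customer = {"first_name": "", "last_name": "", "email_id": "",
                "age": 0, "phone_no": []}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_tag, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint (age)
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        name = FIELD_NAMES.get(field_tag)
        if name == "phone_no":
            customer[name].append(value)  # repeated field: collect every value
        elif name is not None:
            customer[name] = value        # unknown tags are simply ignored

    return customer

def string_field(field_tag: int, text: str) -> bytes:
    """Single-byte key and length; valid only for tags < 16 and text < 128 bytes."""
    payload = text.encode("utf-8")
    return bytes([(field_tag << 3) | 2, len(payload)]) + payload

data = (string_field(1, "Wile E") + string_field(2, "Coyote") +
        string_field(5, "200-101-2001") + string_field(5, "201-102-2002"))

customer = decode_customer(data)
print(len(data))                                   # 44
print(repr(customer["email_id"]), customer["age"])  # '' 0
```

The 44-byte input matches the data size in Output.7, and the decoded record reports an empty email_id and an age of 0 even though neither field was ever written.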

This concludes the demonstration of Google Protocol Buffers (a.k.a. protobuf) in both Java and Python.

Source Code

Github - Java

Github - Python


© PolarSPARC