Pitfalls of GoLang interface streaming to JSON (part1)

Michael Francis
9 min readAug 15, 2022
Photo by Jachan DeVol on Unsplash

Sometimes writing go code isn’t quite as obvious as you first think. Let’s consider the simple exercise of round-tripping between JSON and Go. Please excuse the ignored errors in these code samples, the added verbosity makes the examples much harder to read. If errors concern you, you can read up on some of my thoughts here

So firstly, what is the problem? Consider we have a type that we wish to steam to JSON and later back into a Go struct, this is easily accomplished using the built-in json.Marshal and json.Unmarshal library functions.

type Test struct {
T float64 `json:"t"`
}
t := Test{T: 3.142}
b, _ := json.Marshal(t)
var newT Test
_ = json.Unmarshal(b, &newT)
if t != newT {
panic("all is lost")
}

Run the above in the main function and it will execute without issue. If I print out the string value of the byte steam

println(string(b))

We get the expected result: {“t”:3.142} The structure can be deeply nested, and it all just works (tm). Documentation for these functions is here

It’s very well written but doesn’t directly cover the case we are going to look at. Streaming an interface. In my code, I commonly come across examples where I have a structure where one of the members is an interface. The default methods allow me to stream this without issue. The following code defines three structures, creates an instance of the X type with embedded A or B, and shows the result.

import "encoding/json"

type Type string
type MyInterface interface {
Type() Type
}

type StructA struct {
A float64 `json:"a"`
}
type StructB struct {
B string `json:"b"`
}
type StructX struct {
X string `json:"x"`
MyInterface MyInterface `json:"my_interface"`
}

func (_ StructA) Type() Type {
return "StructA"
}

func (_ StructB) Type() Type {
return "StructB"
}

// Check that we have implemented the interface
var _ MyInterface = (*StructA)(nil)
var _ MyInterface = (*StructB)(nil)

func main() {
// Create an instance of each a turn to JSON
xa := StructX{X: "xyz", MyInterface: StructA{A: 1.23}}
xb := StructX{X: "xyz", MyInterface: StructB{B: "hello"}}

xaJSON, _ := json.Marshal(xa)
xbJSON, _ := json.Marshal(xb)
println(string(xaJSON))
println(string(xbJSON))
}

So, what do we get when we run this, exactly what you might have guessed:

{"x":"xyz","my_interface":{"a":1.23}}
{"x":"xyz","my_interface":{"b":"hello"}}

Ok, this is great, now I can just take one of those streams and convert it back to Go, so cribbing from the original example :

var newX StructX
err := json.Unmarshal(xaJSON, &newX)
if err != nil {
panic(err)
}

Now we run this and

panic: json: cannot unmarshal object into Go struct field StructX.my_interface of type main.MyInterface

Taking 20 seconds to consider this makes complete sense, how could the JSON UnMarshal routine ‘know’ what type to un-stream this into? A quick glance at the go docs and there is something called a custom unmartial function. If you want to do this, you verride the default behavior by implementing the Unmarshaller interface.

https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/encoding/json/decode.go;l=119

What does this look like? If we know we are always going to be getting a specific implementation of an interface returned, it should be easy …

func (x *StructX) Unmarshal(b []byte) error {
...
}

I just need to take the byte stream from the input and turn that into a structX with a nested structA (assuming we are only streaming A’s ) How do I do this though? I can’t call json.Unmarshal on the byte array as this will just call myslef recursively. This is where json.RawMessage comes to the rescue. json.RawMessage has a suprisingly simple implementation.

https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/encoding/json/stream.go;l=263

I’ve copied the relevant part here for completeness

// stream.go - Go (opensource.google)
// RawMessage is a raw encoded JSON value.
// It implements Marshaler and Unmarshaler and can
// be used to delay JSON decoding or precompute a JSON encoding.
type RawMessage []byte

// MarshalJSON returns m as the JSON encoding of m.
func (m RawMessage) MarshalJSON() ([]byte, error) {
if m == nil {
return []byte("null"), nil
}
return m, nil
}

// UnmarshalJSON sets *m to a copy of data.
func (m *RawMessage) UnmarshalJSON(data []byte) error {
if m == nil {
return errors.New("json.RawMessage: UnmarshalJSON on nil pointer")
}
*m = append((*m)[0:0], data...)
return nil
}

var _ Marshaler = (*RawMessage)(nil)
var _ Unmarshaler = (*RawMessage)(nil)

So, all it does is makes a copy of the byte stream, why would you want to do this? It allows us to delay decoding (or encoding) of a stream till later. To use this I need to define a structure with a different shape one that has a raw message in it.

type StructX struct {
X string `json:"x"`
MyInterface MyInterface `json:"my_interface"`
}

type StructXRAW struct {
X string `json:"x"`
MyInterface json.RawMessage `json:"my_interface"`
}

Now I can decode my json messge into the new structure

func (x *StructX) UnmarshalJSON(b []byte) error {
var xr StructXRAW
err := json.Unmarshal(b, &xr)
if err != nil {
return err
}
x.X = xr.X
return nil
}

This still doesn’t quite give us what we wanted as we now have only the value of x.X populated and we ignored the []byte (json.RawMessage) in the MyInterface field.

func (x *StructX) UnmarshalJSON(b []byte) error {
var xr StructXRAW
err := json.Unmarshal(b, &xr)
if err != nil {
return err
}
x.X = xr.X
var a StructA
err = json.Unmarshal(xr.MyInterface, &a)
if err != nil {
return err
}
x.MyInterface = a
return nil
}

and now I can round trip a StructX containing a StructA in the interface field. Unfortunately, this does mean that I need to know the type of the underlying struct that fits the interface, I guess I can do some fuzzy matching but let’s not. So what to do? well, we know that we can have a custom marshal routine and a custom unmarshal why not write the type of the struct to the stream?

Why do we use a pointer for un-marshaling but a value-based receiver for marshaling?

func (x StructX) MarshalJSON( ) ([]byte, error ) {
...
}

I do want to point out something about this signature. In the case of the unmarsal we used a pointer receiver, in the case of the marshal we use a value receiver. It is generally recommended not to mix pointer and value receivers on a given struct. The Go FAQ even states that you should not mix

and, if you are using a tool like GoLand by JetBrains: More than just a Go IDE (which I highly recommend) it will constantly give you a warning for this. So why do we define it this way? Consider this code:

type Foo struct {
Foo string
}

func (f *Foo) MarshalJSON() ([]byte, error) {
panic("Custom Unmarshal Called ")
}

func main() {
f := Foo{
Foo: "My String",
}
_, _ = json.Marshal(f)
}

What do you expect to happen? Naively I would expect panic to be generated but if you run the code, you will get no panic. A pointer receiver is NOT called for a non-pointer value. This makes sense when you think through the implications of a pointer, you can modify the underlying data, so calling a pointer receiver for a value would be the wrong thing to do. For the unmarshal routine we have to have a pointer receiver as we are going to modify the struct. If I modify the code to pass a pointer to the marshal function I get the expected panic

func main() {
f := Foo{
Foo: "My String",
}
_, _ = json.Marshal(&f)
}
panic: Custom Unmarshal Called

So this should be fine if I always have a pointer type but what about the case where I have a structure containing a value of type X? You guessed it, the custom routine does not get called. All of this goes away we define a value-based receiver for the marshal method, in either case, the correct routine is called.

func (f Foo) MarshalJSON() ([]byte, error) {
panic("Custom Unmarshal Called ")
}

Back to our custom routines, as we were defining a custom routine, we also defined another public struct. One of the nice things about Go is that we can simply define this inline and clean up our code a little.

func (x *StructX) UnmarshalJSON(b []byte) error {
var xr struct {
X string `json:"x"`
MyInterface json.RawMessage `json:"my_interface"`
}
err := json.Unmarshal(b, &xr)
if err != nil {
return err
}
x.X = xr.X
var a StructA
err = json.Unmarshal(xr.MyInterface, &a)
if err != nil {
return err
}
x.MyInterface = a
return nil
}

Here we embed the special type and only have it available in the scope of the streaming function. So let’s do the same for the encode and we will store the type in the stream!

func (x StructX) MarshalJSON() ([]byte, error) {
var xr struct {
X string `json:"x"`
MyInterface MyInterface `json:"my_interface"`
MyInterfaceType Type `json:"my_interface_type"`
}
xr.X = x.X
xr.MyInterface = x.MyInterface
xr.MyInterfaceType = x.MyInterface.Type()
return json.Marshal(xr)
}
{“x”:”xyz”,”my_interface”:{“a”:1.23},”my_interface_type”:”StructA”}
{“x”:”xyz”,”my_interface”:{

Now we just need to change the decoder to use the type.

func (x *StructX) UnmarshalJSON(b []byte) error {
var xr struct {
X string `json:"x"`
MyInterface json.RawMessage `json:"my_interface"`
MyInterfaceType Type `json:"my_interface_type"`
}
err := json.Unmarshal(b, &xr)
if err != nil {
return err
}
x.X = xr.X
var myInterface MyInterface
if xr.MyInterfaceType == "StructA" {
myInterface = &StructA{}
} else {
myInterface = &StructB{}
}
var a StructA
err = json.Unmarshal(xr.MyInterface, myInterface)
if err != nil {
return err
}
x.MyInterface = a
return nil
}

This block of code also is worth remarking on, you’ll notice on inspection that we use an interface type to contain the pointers to the underlying structs when I pass them to the json.Unmashal. Since I used a value-based definition for the Type() function technically I could have supplied either to the assignment to myInterface as follows

} else {
myInterface = StructB{}
}
var a StructA
err = json.Unmarshal(xr.MyInterface, myInterface)

I get a runtime panic!

panic: json: Unmarshal(non-pointer main.StructB)

Ouch, this is what I have in the past referred to as pointer interface duality and it will bite you if you are not aware.

In the next part, I will clean up the functions and add a registration method. I will also explain why reflection doesn’t add a whole lot in these cases.

Runnable code example

package main

import (
"encoding/json"
)

type Type string
type MyInterface interface {
Type() Type
}

type StructA struct {
A float64 `json:"a"`
}
type StructB struct {
B string `json:"b"`
}
type StructX struct {
X string `json:"x"`
MyInterface MyInterface `json:"my_interface"`
}

type StructXRAW struct {
X string `json:"x"`
MyInterface json.RawMessage `json:"my_interface"`
}

func (_ StructA) Type() Type {
return "StructA"
}

func (_ StructB) Type() Type {
return "StructB"
}

// Check that we have implemented the interface
var _ MyInterface = (*StructA)(nil)
var _ MyInterface = (*StructB)(nil)

func (x StructX) MarshalJSON() ([]byte, error) {
var xr struct {
X string `json:"x"`
MyInterface MyInterface `json:"my_interface"`
MyInterfaceType Type `json:"my_interface_type"`
}
xr.X = x.X
xr.MyInterface = x.MyInterface
xr.MyInterfaceType = x.MyInterface.Type()
return json.Marshal(xr)
}

func (x *StructX) UnmarshalJSON(b []byte) error {
var xr struct {
X string `json:"x"`
MyInterface json.RawMessage `json:"my_interface"`
MyInterfaceType Type `json:"my_interface_type"`
}
err := json.Unmarshal(b, &xr)
if err != nil {
return err
}
x.X = xr.X
var myInterface MyInterface
if xr.MyInterfaceType == "StructA" {
myInterface = &StructA{}
} else {
myInterface = StructB{}
}
var a StructA
err = json.Unmarshal(xr.MyInterface, myInterface)
if err != nil {
return err
}
x.MyInterface = a
return nil
}


func main() {
// Create an instance of each a turn to JSON
xa := StructX{X: "xyz", MyInterface: StructA{A: 1.23}}
xb := StructX{X: "xyz", MyInterface: StructB{B: "hello"}}

xaJSON, _ := json.Marshal(xa)
xbJSON, _ := json.Marshal(xb)
println(string(xaJSON))
println(string(xbJSON))

var newX StructX
err := json.Unmarshal(xaJSON, &newX)
if err != nil {
panic(err)
}
err = json.Unmarshal(xbJSON, &newX)
if err != nil {
panic(err)
}
}

Pitfalls of GoLang interface streaming to JSON (part2) | by Michael Francis | Aug, 2022 | Medium

--

--