[Arduino] Parse JSON efficiently

I’m currently working on an Arduino Ethernet project with my co-worker Levi.

We use a RESTful API between the Arduino and a server. The payload is stored in JSON strings.

Sounds easy, right? Well, it was not.

Only 2 KB of RAM

The Arduino Ethernet, like many other Arduino boards, has only 2 KB of RAM.

Once you have loaded all the libraries like Ethernet, SoftwareSerial, SPI, LiquidCrystal and a JSON parser library, there is very little room left.

Long story short, after I allocated my big char json[160], only 300 bytes were left for the JSON parser.

Existing JSON libraries

I tried both aJson and json-arduino, but neither of them was able to work under such memory-limited conditions.

The reason is extremely simple: these libraries share a serious design flaw. They both rely on malloc().

Here is an extract of aJson.cpp:

aJsonObject* aJsonClass::newItem()
{
    aJsonObject* node = (aJsonObject*) malloc(sizeof(aJsonObject));
    if (node) memset(node, 0, sizeof(aJsonObject));
    return node;
}

And another one from json-arduino.cpp:

token_list_t* create_token_list(int length)
{
    token_list_t *token_list = (token_list_t*)malloc(1*sizeof(token_list_t) );
    token_list->tokens = (jsmntok_t*)malloc(length*sizeof(jsmntok_t));
    token_list->length = length;
    token_list->count = 0;
    return token_list;
}

What’s wrong with malloc()?

malloc() performs dynamic memory allocation on the heap. It looks for free space and returns a pointer to it, or NULL if not enough contiguous space is available.

free() will make the memory area available for another allocation.

The problem is that this will create “holes” in the memory and these holes are likely to be too small for the next allocations.

This phenomenon is called “fragmentation”. It can be mitigated with a virtual address space and defragmentation, which is something that can be done on a computer but not on an Arduino.
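To make the “holes” concrete, here is a toy model of a heap. It is only a sketch for illustration and not how the Arduino’s malloc() is really implemented: ToyHeap, allocate(), release() and fragmentationDemo() are names invented for this example, and each cell stands for one byte.

```cpp
#include <cstddef>

// Toy heap for illustration only: one flag per byte, true means "in use".
// Real malloc() implementations keep free lists and block headers instead.
template <std::size_t SIZE>
class ToyHeap {
public:
    // First-fit search: returns the offset of a free run of n cells,
    // or -1 when no contiguous run of that size exists.
    int allocate(std::size_t n) {
        std::size_t run = 0;
        for (std::size_t i = 0; i < SIZE; ++i) {
            run = used_[i] ? 0 : run + 1;
            if (run == n) {
                std::size_t start = i + 1 - n;
                for (std::size_t j = start; j <= i; ++j) used_[j] = true;
                return static_cast<int>(start);
            }
        }
        return -1;
    }

    // Marks n cells starting at offset as free again.
    void release(int offset, std::size_t n) {
        for (std::size_t j = 0; j < n; ++j) used_[offset + j] = false;
    }

    // Total number of free cells, contiguous or not.
    std::size_t freeCells() const {
        std::size_t count = 0;
        for (std::size_t i = 0; i < SIZE; ++i) {
            if (!used_[i]) ++count;
        }
        return count;
    }

private:
    bool used_[SIZE] = {};
};

// Fills a 16-byte heap with four 4-byte blocks, frees the first and third,
// then asks for 6 bytes. Returns true when 8 bytes are free in total but the
// 6-byte request still fails because the free space is split into two holes.
bool fragmentationDemo() {
    ToyHeap<16> heap;
    int a = heap.allocate(4);  // cells 0..3
    int b = heap.allocate(4);  // cells 4..7
    int c = heap.allocate(4);  // cells 8..11
    int d = heap.allocate(4);  // cells 12..15
    (void)b;
    (void)d;
    heap.release(a, 4);
    heap.release(c, 4);
    return heap.freeCells() == 8 && heap.allocate(6) == -1;
}
```

In this model half of the heap is free, yet a request for 6 bytes cannot be served. With only 2 KB of RAM, this situation is reached quickly.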

[Figure: animation of memory fragmentation]

Moreover, memory allocation on the stack is much faster and easier to predict. If all your allocations are on the stack, you know that the result will be the same each time you run the program, and you don’t have to worry about malloc() returning NULL.

Bottom line: never use malloc() on an embedded system.

Back to JSON

Since I didn’t find any available JSON parser for Arduino that matched my requirements, I decided to write my own.

It’s open source and available on GitHub: https://github.com/bblanchon/ArduinoJson

Here is a taste of it:

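// A writable array, not a pointer to a string literal: the parse functions take a non-const char*.
char json[] = "{\"Name\":\"Blanchon\",\"Skills\":[\"C\",\"C++\",\"C#\"],\"Age\":32,\"Online\":true}";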

JsonParser<32> parser;

JsonHashTable hashTable = parser.parseHashTable(json);
   
if (!hashTable.success()) return;

char* name = hashTable.getString("Name");    
JsonArray skills = hashTable.getArray("Skills");    
long age = hashTable.getLong("Age");    
bool online = hashTable.getBool("Online");

As you can see, it’s extremely easy to use and supports nested objects (which json-arduino doesn’t, by the way).

How is it implemented?

Like json-arduino, I built my parser on top of an existing JSON tokenizer, jsmn (pronounced like ‘jasmine’). It is very lightweight and minimalistic. It’s also a very stable and well-proven product.

I first thought about using jsmn directly in my project, but it’s not convenient at all: that’s why you need a wrapper around it. I tried to make mine as thin as possible.

To avoid calls to malloc(), I embedded an array of tokens as a private member of the JsonParser. The number of tokens is specified in a template parameter:

template <int MAX_TOKENS>
class JsonParser
{
public:    
    JsonArray parseArray(char* json);    
    JsonHashTable parseHashTable(char* json);

private:
    jsmntok_t tokens[MAX_TOKENS];
};

The array of tokens is available as long as the JsonParser is in memory.
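The same idea can be shown in isolation. The following is a minimal sketch and not the library’s actual code: FakeToken is a simplified stand-in for jsmntok_t, and FixedTokenBuffer, push(), count() and capacity() are hypothetical names. The point is that the storage is part of the object itself, so its size is fixed at compile time and running out of room is reported as a failure instead of triggering an allocation.

```cpp
#include <cstddef>

// Simplified stand-in for jsmntok_t (assumption: only the slice bounds are kept).
struct FakeToken {
    int start;  // index of the first character of the token in the JSON text
    int end;    // index one past the last character
};

// Fixed-capacity token storage: the array lives inside the object,
// so placing the object on the stack places the tokens on the stack too.
template <int MAX_TOKENS>
class FixedTokenBuffer {
public:
    // Stores one token. Returns false when the buffer is full; never allocates.
    bool push(int start, int end) {
        if (count_ >= MAX_TOKENS) return false;
        tokens_[count_].start = start;
        tokens_[count_].end = end;
        ++count_;
        return true;
    }

    // Number of tokens stored so far.
    int count() const { return count_; }

    // Maximum number of tokens, known at compile time.
    static constexpr int capacity() { return MAX_TOKENS; }

private:
    FakeToken tokens_[MAX_TOKENS];
    int count_ = 0;
};
```

With a declaration such as FixedTokenBuffer<32> buffer; the memory cost is visible through sizeof(buffer) before the program ever runs, which is exactly what makes the memory usage predictable.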

Then, to avoid copying memory and to reduce memory usage, JsonArray and JsonHashTable don’t actually own the tokens; they only contain pointers to them. Here is their base class:

class JsonObjectBase
{
public:
     // ...removed irrelevant methods...

protected:
    char* json;
    jsmntok_t* tokens;
};

The char* json is simply the JSON string passed to JsonParser::parseArray() or JsonParser::parseHashTable(), so that there is no duplication of the JSON string either.

Of course, you need to use the JsonArray and JsonHashTable within the scope of the JsonParser; otherwise, the pointers will lead to an invalid location. This is an acceptable limitation given that it’s a very nice optimization.
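Here is a small sketch of what such a non-owning object looks like in practice. It is an illustration under assumptions, not the library’s code: TokenSpan stands in for jsmntok_t, ValueView and viewDemo() are hypothetical names, and the token offsets are written by hand instead of coming from a tokenizer. The view uses const pointers because it only reads the text.

```cpp
#include <cstddef>
#include <cstring>

// Simplified stand-in for jsmntok_t (assumption: only the slice bounds are kept).
struct TokenSpan {
    int start;  // index of the first character of the value
    int end;    // index one past the last character
};

// Non-owning view: it stores two pointers and copies neither the text nor the token.
class ValueView {
public:
    ValueView(const char* json, const TokenSpan* token)
        : json_(json), token_(token) {}

    // Length of the value, computed from the token bounds.
    std::size_t length() const {
        return static_cast<std::size_t>(token_->end - token_->start);
    }

    // Pointer into the original JSON text (not null-terminated at the value's end).
    const char* data() const { return json_ + token_->start; }

    // Compares the value with a C string without copying it.
    bool equals(const char* text) const {
        return std::strlen(text) == length() &&
               std::memcmp(data(), text, length()) == 0;
    }

private:
    const char* json_;
    const TokenSpan* token_;
};

// The view is only used while json and nameValue are still in scope.
bool viewDemo() {
    const char json[] = "{\"Name\":\"Blanchon\"}";
    const TokenSpan nameValue = {9, 17};  // hand-computed bounds of Blanchon
    ValueView view(json, &nameValue);
    return view.equals("Blanchon") && view.data() == json + 9;
}
```

The view costs two pointers no matter how long the JSON text is. The price is the lifetime rule described above: returning view from viewDemo() would leave it pointing at a buffer and a token that no longer exist.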

That’s all! This is how you create an efficient JSON parser library for embedded systems.

Where to go next?