[C++] Lightweight float to string conversion
This article presents a simple algorithm for converting floats to string. It’s designed to have a small code and limited RAM usage, making it a perfect fit for embedded systems. As a matter of fact, it’s part of ArduinoJson 5.10.
It features:
- Small code size
- Small memory usage
- No dynamic allocation
- Up to nine digits after the decimal point
- Fairly good precision
- Exponentiation of large and small values
As far as I know, no such an algorithm has been publicly described.
A. Introduction
A.1. Existing algorithms
There is not much information the Internet covering float to string conversion. The best document I could find is this piece of gold detailing the current state of the art in a language that I could understand:
Printing Floating-Point Numbers by Ryan Juckett
In his article, Ryan Juckett presents the following algorithms:
- Dragon4
- Bellerophon
- Grisu
However, they didn’t meet my requirement because:
- they use arbitrary precision arithmetic
- the code is massive
- the memory usage is likely to be high
A.2. A lightweight alternative
The algorithm I’m going to describe is inferior to the because:
- it supports a limited number of decimal places
- it doesn’t correct rounding error that could happen when reading back the value
- it doesn’t allow custom formatting
But, as you’ll see it does the job and fits in less than 1KB of AVR code and uses just a few dozens of bytes of stack memory.
A.3. Overall description
The algorithm is very simple. It consists in splitting the floating-point value into multiple parts:
- The sign (
+
or-
) - The integral part (32-bit unsigned integer)
- The decimal part (32-bit unsigned integer)
- The exponent part (16-bit signed integer)
Example
-12.34e56
becomes-
,12
,34
and56
.
B. The steps in detail
In this section, we’ll see the various steps of the algorithm. The next section will give the complete implementation in C++.
B.1. Be positive
The first thing we need to do is to check the sign of the input float.
If it’s positive, great!
If it’s negative, we print a minus sign and negate the value so the rest of the algorithm doesn’t have to worry about negative values.
B.2. Normalization
We use 32-bit integers, meaning that each part cannot exceed the capacity of a 32-bit integer.
Example
We’ll never be able to output
123456789123
because the integral part exceeds the capacity of a 32-bit integer.However, we’ll be able to output
1.234567891e11
which represents the same value with an acceptable loss of accuracy.
This means that every value that exceeds the capacity of a 32-bit integer must be normalized so that the integral part goes from 1 to 9 and the exponent part compensates the difference.
In my implementation, this threshold is 1e7
but you can choose any value up to 1e9
.
Similarly, when the input is very small, i.e. very close to zero, we normalize it to avoid printing a lot of leading zeros. This is very important as this algorithm only allows nine digits after the decimal point.
Example
If the input is
0.00000123
, then the output is1.23e-6
B.3. Binary exponentiation
As we saw, the normalization is the process of transforming 123456
to 1.23456e5
.
This could be implemented with log()
but this function is not always available on embedded platforms, and when it’s present, it’s fat and slow.
Now, we could implement the logarithm with a loop that divides by ten and counts the iterations. However, each floating point operation introduces an error, so each iteration magnifies the error. This could be OK for small values below 1000
, as they require only 4 iterations; but double
s range up to 1.79e+308
, requiring up to 308 iterations!
That’s why my implementation uses a technique called “binary exponentiation” which minimizes the number of operations. It works by decomposing the input value into factors, just like you would do with “prime factor decomposition”, except that each factor is a binary power of end.
Example
1e308 == 1e256 * 1e32 * 1e16 * 1e4
Only 4 operations are required to extract the 4 factors.
The biggest value that can be stored in a double
is 1.79769e+308
. As a consequence, the biggest exponent that we have to handle is 308
. This means that every positive exponent can be composed with the following factors: 1e1
, 1e2
, 1e4
, 1e8
, 1e16
, 1e32
, 1e64
, 1e128
and 1e256
.
So, to compute the logarithm of the input value, we just need to extract the composing factors one after the other. The algorithm goes like this.
- compare the input with a given factor (e.g.
1e4
) - if it’s greater, then divide the value and increment the exponent (e.g.
exponent += 4
) - repeat for each factor
Example
- Suppose input is
1234
1234
is smaller than1e4
, so we pass
- value is
1234
- exponent is
0
1234
is bigger than1e2
, so we divide and increment the exponent:
- value is now
12.34
- exponent is now
2
12.34
is bigger than1e1
, so we divide and increment the exponent:
- value is now
1.234
- exponent is now
3
The end result is:
1.234e3
B.4. Integral and decimal part
Extracting the integral part is easy: we just need to cast the floating-point value to an integer. We know that the value will fit in a 32-bit integer because bigger values have been normalized.
To extract the decimal part, we compute the remainder of the previous operation, multiply by 1e9
and cast to an integer.
Example
- Suppose input is
12.34
- First, we cast to integer and get
12
- The remainder is
0.34
- We multiply by
1e9
and get340000000
The end result is
12.340000000
B.5. Rounding
After extracting the decimal part, we can compute the remainder.
If it’s smaller than 0.5
, then great!
If it’s greater or equal to 0.5
, we need to round up.
To round up, we just need to increment the decimal part by one unit. If the decimal part is now equal to 1e9
, then we need to increment the integral part and clear the decimal part.
Example
- Suppose input is
0.9999999999
- Then integral part is
0
- And decimal part is
999999999
- And remainder is
0.9
so rounding is needed- We increment the decimal part:
1000000000
- As it’s equal to
1e9
, we increment the integral part- The results are
1
for the integral part and0
for the decimal part
B.6. Writing the integral part
To convert the integral part to a string, we extract digits one by one in reverse order, i.e. from right to left. This is done by taking the modulus 10
and dividing by 10
to move to the next digit. We stop when the remainder is 0
Example
- Suppose input is
12
12 % 10
gives2
, our right most digit.- We divide by
10
, this gives1
1 % 10
gives1
, our second digit.- We divide by
10
, this gives0
, so we stop
Since digits are extracted in reverse order, we need to store them is a temporary buffer before outputting them. The trick is to fill the buffer starting from the end so that they are in the right order in memory. We start from the end of the buffer and decrement a pointer so as to point to the previous character. We can add a null terminator at the end so it’s a valid C string.
Example
- Suppose an uninitialized buffer:
{ ???, ???, ??? }
- We add the null terminator
{ ???, ???, '\0' }
- Then add the first digit
{ ???, '2', '\0' }
- Then the seconds digit
{ '1', '2', '\0' }
- We call
puts()
to print the digits in the right order
B.7. Writing the decimal part.
To print the decimal part, we use a process very similar to the previous one, except that:
- We remove the trailing zeros
- We don’t stop when the remainder is
0
, but when the 9 digits have been extracted (or removed)
To remove the trailing zeros, we compute the modulus 10
, if it’s 0
we divide by 10
and start again.
Example
- Suppose input is
3400
3400 % 10
gives0
, so we divide by ten340 % 10
gives0
, so we divide by ten34 % 10
gives4
, so we stop
To print a fixed number of digits, we use a counter initialized to 9
and decrement it each time a digit is extracted. We stop when the counter reaches 0
.
B.8. The rest
We’ve done the biggest part of the job, here is what remains:
- Print the exponent part
- Handle
NaN
- Handle
+Inf
and-Inf
These are trivial tasks that don’t require explanation.
C. The implementation
Here is the main function:
void writeFloat(double value) {
if (isnan(value)) return puts("nan");
if (value < 0.0) {
putchar('-');
value = -value;
}
if (isinf(value)) return puts("inf");
uint32_t integralPart, decimalPart;
int16_t exponent;
splitFloat(value, integralPart, decimalPart, exponent);
writeInteger(integralPart);
if (decimalPart) writeDecimals(decimalPart);
if (exponent < 0) {
puts("e-");
writeInteger(-exponent);
}
if (exponent > 0) {
puts('e');
writeInteger(exponent);
}
}
Here is the function that extracts the integral part, the decimal part and the exponent part of a float:
void splitFloat(double value, uint32_t &integralPart,
uint32_t &decimalPart, int16_t &exponent) {
exponent = normalizeFloat(value);
integralPart = (uint32_t)value;
double remainder = value - integralPart;
remainder *= 1e9;
decimalPart = (uint32_t)remainder;
// rounding
remainder -= decimalPart;
if (remainder >= 0.5) {
decimalPart++;
if (decimalPart >= 1000000000) {
decimalPart = 0;
integralPart++;
if (exponent != 0 && integralPart >= 10) {
exponent++;
integralPart = 1;
}
}
}
}
Here is the function that normalizes the values between 1e-5
and 1e7
and returns the exponent:
int normalizeFloat(double& value) {
const double positiveExpThreshold = 1e7;
const double negativeExpThreshold = 1e-5;
int exponent = 0;
if (value >= positiveExpThreshold) {
if (value >= 1e256) {
value /= 1e256;
exponent += 256;
}
if (value >= 1e128) {
value /= 1e128;
exponent += 128;
}
if (value >= 1e64) {
value /= 1e64;
exponent += 64;
}
if (value >= 1e32) {
value /= 1e32;
exponent += 32;
}
if (value >= 1e16) {
value /= 1e16;
exponent += 16;
}
if (value >= 1e8) {
value /= 1e8;
exponent += 8;
}
if (value >= 1e4) {
value /= 1e4;
exponent += 4;
}
if (value >= 1e2) {
value /= 1e2;
exponent += 2;
}
if (value >= 1e1) {
value /= 1e1;
exponent += 1;
}
}
if (value > 0 && value <= negativeExpThreshold) {
if (value < 1e-255) {
value *= 1e256;
exponent -= 256;
}
if (value < 1e-127) {
value *= 1e128;
exponent -= 128;
}
if (value < 1e-63) {
value *= 1e64;
exponent -= 64;
}
if (value < 1e-31) {
value *= 1e32;
exponent -= 32;
}
if (value < 1e-15) {
value *= 1e16;
exponent -= 16;
}
if (value < 1e-7) {
value *= 1e8;
exponent -= 8;
}
if (value < 1e-3) {
value *= 1e4;
exponent -= 4;
}
if (value < 1e-1) {
value *= 1e2;
exponent -= 2;
}
if (value < 1e0) {
value *= 1e1;
exponent -= 1;
}
}
return exponent;
}
Here is the function that prints the integral part and the exponent:
void writeInteger(uint32_t value) {
char buffer[16];
char *ptr = buffer + sizeof(buffer) - 1;
*ptr = 0;
do {
*--ptr = (char)(value % 10) + '0';
value = (uint32_t)(value / 10);
} while (value);
puts(ptr);
}
Here is the function that prints the decimal part:
void writeDecimals(uint32_t value) {
int width = 9;
// remove trailing zeros
while (value % 10 == 0 && width > 0) {
value /= 10;
width--;
}
char buffer[16];
char *ptr = buffer + sizeof(buffer) - 1;
// write the string in reverse order
*ptr = 0;
while (width--) {
*--ptr = value % 10 + '0';
value /= 10;
}
*--ptr = '.';
// and dump it in the right order
puts(ptr);
}
Here is the link to the code in ArduinoJson
D. Possible variations
You could use 64-bit integers and get much more digits. You could also repeat the decimal extraction a second time to get 9 more digits. However, I don’t think it’s worth it, as 9 digits are sufficient in most cases.
You could be less picky with the normalization and instead of reducing the integral part from 1
to 9
, you could accept it to be from 1 to 9999 for example. That would significantly reduce the size of the normalize()
function.
EDIT 2017-07-14: ArduinoJson 5.11.1 reduces the number of decimal places printed depending on the number of integral places.