Image by Benjamin Lundquist

Quick and Small Exponential Scaling on the MSP430

For a recent project, I wanted to build a variable-brightness LED lamp, using an MSP430 to convert the linear slide potentiometer input into a PWM signal to drive the LED. Unfortunately, directly scaling the 10-bit ADC value to a 16-bit PWM duty cycle does not produce a linear apparent brightness response, because the eye’s response to light is (approximately) logarithmic. There are several ways to accomplish this conversion in an embedded system, with different trade-offs:

  • Use floating point math and powf() from math.h
  • Use a 1024-element lookup table of pre-computed values
  • Use a smaller lookup table with interpolation
  • Use an approximation that trades accuracy for speed & size

Implementation of approximation algorithm

I decided to try the last option and use the fact that 2^n is equivalent to 1 << n, since bit-shifts are much quicker than multiplication and division.

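#include <stdint.h>

static uint16_t convert_approximate(uint16_t adcValue)
{
    uint8_t highBits, lowBits;
    uint16_t const base = 256;   /* 1 << 8 */
    uint16_t const offset = 1;
 
    highBits = adcValue >> 7;    /* top 3 bits of the 10-bit input: the exponent n */
    lowBits = adcValue & 0x7F;   /* low 7 bits: position between 2^n and 2^(n+1) */
 
    /* The cast keeps the second shift unsigned where int is only 16 bits wide */
    return ((base << highBits) + ((uint16_t)(lowBits + 1) << (highBits + offset)) - 1);
}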

At a high level, this splits the input into more-significant (highBits) and less-significant (lowBits) parts, using highBits as n to compute “2^n” and lowBits to interpolate linearly between “2^n” and “2^(n+1)”. Two linked choices shape this implementation: where to split the input into highBits and lowBits, and the base value. With a base value of 1, highBits can be up to 4 bits wide, since its maximum value would then be 15 and 1 << 15 = 0x8000. Since we want the maximum input (1023) to map to the maximum 16-bit value (0xFFFF), the interpolation term must contribute the remaining 0x7FFF at that input. This gives:

0x7FFF = ((lowBits + 1) << (highBits + offset)) - 1
0x8000 = ((lowBits + 1) << (highBits + offset))

With 4 of the 10 input bits going to highBits, lowBits is the lower 6 bits of the input, so its maximum value is 63. Substituting the maximum values:

0x8000 = ((63 + 1) << (15 + offset))
0x8000 = 0x0040 << (15 + offset)
0x8000 = 0x0040 << 9

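Matching the shift amounts, 15 + offset = 9, which gives an offset of -6. After testing PWM value vs LED brightness, I found that PWM values below 0.5% (328/65535) were too dim to be useful, so I started with a base value of 256 instead. The same calculation applies: 256 << 7 = 0x8000, so highBits gets 3 bits (maximum 7) and lowBits gets the remaining 7 bits (maximum 127). Solving 0x8000 = (127 + 1) << (7 + offset) gives an offset of 1.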
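The arithmetic above can be written out as a small host-side sketch. The helper names (high_max, base_shift_for, offset_for) and the two macros are hypothetical, introduced here only for illustration; the sketch assumes a 10-bit input and that both terms should reach 0x8000 (1 << 15) at the maximum input.

```c
#include <assert.h>

#define INPUT_BITS 10   /* ADC resolution used in the article */
#define TOP_BIT    15   /* 1 << 15 == 0x8000, the target for each term */

/* Largest value that fits in a highBits field of the given width. */
static int high_max(int highWidth)
{
    assert(highWidth >= 1 && highWidth <= 4);
    return (1 << highWidth) - 1;
}

/* Exponent b of the base (base == 1 << b) such that
   base << highMax == 0x8000. */
static int base_shift_for(int highWidth)
{
    return TOP_BIT - high_max(highWidth);
}

/* Offset that solves (lowMax + 1) << (highMax + offset) == 0x8000,
   where lowMax + 1 == 1 << (INPUT_BITS - highWidth). */
static int offset_for(int highWidth)
{
    int lowWidth = INPUT_BITS - highWidth;
    return TOP_BIT - lowWidth - high_max(highWidth);
}
```

Calling offset_for(4) and base_shift_for(4) reproduces the base-1 case (offset -6, base 1 << 0), and a width of 3 reproduces the values used in convert_approximate() (offset 1, base 1 << 8). A negative offset means the shift amount can go negative for small highBits, which plain C shifts do not support, so that case would need extra handling.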

Implementation of the other algorithms

To determine the relative performance of each approach, I implemented all of them and compared them on the MSP430 Launchpad, using a hardware timer to count the number of CPU cycles required to convert each of the 1024 possible ADC inputs into the appropriate PWM duty cycle.

timerStart();
for (i = 0; i < 1024; i++) {
    testSum += convert(i);
}
timerStop();

The first implementation of convert() simply returned the ADC value in order to determine a baseline for the loop overhead. All tests were compiled with CCS v6.0, using -O5 and -mf 0 (optimized for size to avoid non-representative loop unrolling for some test cases).

static uint16_t convert_baseline(uint16_t adcValue)
{
    return adcValue;
}
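As a rough illustration of how the baseline is used, here is a host-side sketch that times a conversion function and subtracts the loop overhead measured with convert_baseline(). It is not the code that ran on the Launchpad: clock() from time.h stands in for the hardware timer, and the names convert_fn, time_conversions and net_ticks are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint16_t (*convert_fn)(uint16_t);

/* Checksum from the most recent run; stored so the results are used. */
static volatile uint32_t testSum;

static uint16_t convert_baseline(uint16_t adcValue)
{
    return adcValue;
}

/* Hypothetical stand-in for timerStart()/timerStop(): returns the
   clock() ticks spent converting all 1024 possible ADC inputs. */
static clock_t time_conversions(convert_fn convert)
{
    uint32_t sum = 0;
    uint16_t i;
    clock_t start, stop;

    assert(convert != NULL);
    start = clock();
    for (i = 0; i < 1024; i++) {
        sum += convert(i);
    }
    stop = clock();
    testSum = sum;
    return stop - start;
}

/* Ticks attributable to the conversion itself, after removing
   the loop overhead measured with the baseline. */
static long net_ticks(convert_fn convert)
{
    clock_t total = time_conversions(convert);
    clock_t overhead = time_conversions(convert_baseline);
    return (long)(total - overhead);
}
```

On a desktop, clock() is far too coarse to resolve 1024 iterations, so the numbers are not meaningful there; the point is the structure of measuring a baseline and subtracting it.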

The next test used floating point math to perform the conversion.

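#include <math.h>   /* powf() */

static uint16_t convert_float(uint16_t adcValue)
{
    /* 1024 input steps span 8 doublings: 128 steps per doubling */
    float exponent = (adcValue + 1) / 128.0f;
    float result = 256.0f * powf(2.0f, exponent) - 1.0f;
    return (uint16_t)(result + 0.5f);   /* round to nearest */
}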

This pulls in the software implementations of the floating point routines behind math.h, which increased the code size by 10kB. Next up is a direct table lookup implementation:

static uint16_t convert_table_lookup(uint16_t adcValue)
{
    static const uint16_t LOOKUP_TABLE[1024] = {
        256, 258, 259, 261, 262, 263, 265, 266, 268, 269, 271, 272, 274, 275, 277, 278,
        /* ... */
        60422, 60750, 61080, 61412, 61745, 62080, 62418, 62756, 63097, 63440, 63784, 64131, 64479, 64829, 65181, 65535
    };
    return LOOKUP_TABLE[(adcValue & 0x3FF)];
}
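The table entries shown appear consistent with rounding the formula from convert_float() for each input x. As a worked check of the first two entries and the last one (the decimal values are rounded):

```latex
\begin{align*}
T[x]    &= \left\lfloor 256 \cdot 2^{(x+1)/128} - 1 + 0.5 \right\rfloor,
           \qquad x = 0, \dots, 1023 \\
T[0]    &= \left\lfloor 256 \cdot 2^{1/128} - 0.5 \right\rfloor
           \approx \lfloor 256.89 \rfloor = 256 \\
T[1]    &= \left\lfloor 256 \cdot 2^{2/128} - 0.5 \right\rfloor
           \approx \lfloor 258.29 \rfloor = 258 \\
T[1023] &= \left\lfloor 256 \cdot 2^{8} - 0.5 \right\rfloor
           = \lfloor 65535.5 \rfloor = 65535
\end{align*}
```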

Since we no longer need any software floating point math, execution is much faster, and the 2kB lookup table is smaller than the 10kB math library. However, this method does not scale well: a 12-bit input would need 8kB and a 16-bit input 128kB. To reduce the size further, we can use a sparse lookup table and interpolate, which is the next test:

static uint16_t convert_linear_interp(uint16_t adcValue)
{
    /* Knots of the piecewise-linear curve; assumes adcValue <= 1023 */
    static const uint16_t ADC_SAMPLE[9] = {0, 127, 255, 383, 511, 639, 767, 895, 1023};
    static const uint16_t PWM_OUTPUT[9] = {257, 511, 1023, 2047, 4095, 8191, 16383, 32767, 65535};
 
    unsigned int i;
    /* Find the first knot at or above the input */
    for (i=1; i < 9; i++) {
        if (adcValue <= ADC_SAMPLE[i]) {
            break;
        }
    }
 
    /* Interpolate between knots i-1 and i; the product is widened to 32 bits */
    return PWM_OUTPUT[i-1] + (uint16_t)((((uint32_t)adcValue - ADC_SAMPLE[i-1]) * (PWM_OUTPUT[i] - PWM_OUTPUT[i-1])) / (ADC_SAMPLE[i] - ADC_SAMPLE[i-1]));
}

This implementation only uses 36 bytes for the lookup table, but is considerably slower because interpolation requires integer multiplication and division, and the MSP430 lacks hardware support for those operations.
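The knots in ADC_SAMPLE and PWM_OUTPUT coincide with the segment boundaries of the shift-based approximation, and working through the arithmetic suggests the two functions return the same value for every 10-bit input. A host-side sketch to check that is below. Both functions are copies of the article's code with explicit casts added, and count_mismatches is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Copy of the shift-based approximation, computed in 32 bits. */
static uint16_t convert_approximate(uint16_t adcValue)
{
    uint8_t highBits = (uint8_t)(adcValue >> 7);
    uint8_t lowBits = (uint8_t)(adcValue & 0x7F);
    uint32_t result = ((uint32_t)256 << highBits)
                    + ((uint32_t)(lowBits + 1) << (highBits + 1)) - 1;
    return (uint16_t)result;
}

/* Copy of the sparse-table interpolation. */
static uint16_t convert_linear_interp(uint16_t adcValue)
{
    static const uint16_t ADC_SAMPLE[9] = {0, 127, 255, 383, 511, 639, 767, 895, 1023};
    static const uint16_t PWM_OUTPUT[9] = {257, 511, 1023, 2047, 4095, 8191, 16383, 32767, 65535};
    unsigned int i;
    uint32_t rise, run, step;

    assert(adcValue <= 1023);   /* keeps i within the tables */
    for (i = 1; i < 9; i++) {
        if (adcValue <= ADC_SAMPLE[i]) {
            break;
        }
    }
    rise = (uint32_t)(PWM_OUTPUT[i] - PWM_OUTPUT[i - 1]);
    run = (uint32_t)(ADC_SAMPLE[i] - ADC_SAMPLE[i - 1]);
    step = (uint32_t)(adcValue - ADC_SAMPLE[i - 1]);
    return (uint16_t)(PWM_OUTPUT[i - 1] + (step * rise) / run);
}

/* Number of 10-bit inputs for which the two functions disagree. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    uint16_t adc;

    for (adc = 0; adc < 1024; adc++) {
        if (convert_approximate(adc) != convert_linear_interp(adc)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

If count_mismatches() returns 0, the difference between the two approaches comes down to how the same piecewise-linear curve is computed, with shifts in one case and multiplication and division in the other.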

Results

[Table image: fast approx table]

G2230: http://www.ti.com/product/msp430g2230
G2332: http://www.ti.com/product/msp430g2332
G2553: http://www.ti.com/product/msp430g2553

The code size for the fast approximation solution is the smallest at only 172 bytes, and its execution time is second only to the direct table lookup, which trades code size (2064 bytes) for speed. Looking at processor cost, the table interpolation algorithm and the approximation algorithm can both fit on a G2230 (2kB flash) for $0.40. To get the faster speed of the direct table lookup solution, you’d pay an extra $0.20 for the G2332 (4kB flash) at $0.60 each. Using math.h’s floating point routines, the slowest algorithm, requires the G2553 (16kB flash) at $0.90.

[Graph image: fast approx graph]