Apollo3Blue I2S via hardware pattern generator

In my last article I described how the Apollo3Blue can generate patterns in hardware: http://blog.io-expert.com/hardware-generated-pattern-with-apollo3blue

In this article I will go a step deeper and implement a full protocol for driving audio: I2S.

In general, I2S transfers left and right channel data as 16-bit PCM per channel over an SPI-like synchronous interface. To distinguish the left and right channel, a word-select line toggles every 16 bits. A small peculiarity of I2S is that the data is shifted by one bit clock relative to the word-select line: the MSB of each word is sent one bit clock after word select changes.
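
The one-clock shift can be made concrete with a small model. The sketch below is only an illustration and not part of the Apollo3Blue code: the helper names `i2s_ws_level` and `i2s_data_slot` are made up for this example, and it assumes a 16-bit stereo frame of 32 bit clocks with word select low for the left channel.

```c
/* Illustrative model only: one I2S frame = 32 bit clocks (16 left + 16 right). */
#define I2S_BITS_PER_CHANNEL 16u
#define I2S_BITS_PER_FRAME   (2u * I2S_BITS_PER_CHANNEL)

/* Word-select level at bit clock n: 0 = left channel, 1 = right channel. */
static unsigned i2s_ws_level(unsigned n)
{
    return ((n % I2S_BITS_PER_FRAME) >= I2S_BITS_PER_CHANNEL) ? 1u : 0u;
}

/* Reports which sample bit is on the data line at bit clock n.
 * The data lags word select by one bit clock, so the position inside
 * the frame is taken from the previous bit clock (n - 1, wrapped). */
static void i2s_data_slot(unsigned n, unsigned *channel, unsigned *bit)
{
    unsigned pos = (n + I2S_BITS_PER_FRAME - 1u) % I2S_BITS_PER_FRAME;

    *channel = (pos >= I2S_BITS_PER_CHANNEL) ? 1u : 0u;              /* 0 = left, 1 = right */
    *bit = (I2S_BITS_PER_CHANNEL - 1u) - (pos % I2S_BITS_PER_CHANNEL); /* 15 = MSB, 0 = LSB  */
}
```

In this model the MSB (bit 15) of the left channel appears at bit clock 1, and at bit clock 16, when word select has already switched to the right channel, the LSB of the left channel is still on the data line.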

In theory it is possible to run three synchronized timers in parallel: one for the bit clock, one for the word-select clock and one for the data pattern generation. However, the I2S bit clock has to be a multiple of the sample frequency, which is sometimes not possible with the internal dividers and would require an external oscillator with a special frequency (for example 1.4112MHz). The SGTL5000 codec used here accepts a clock between 8 and 27MHz on its main clock input, and its internal PLL then generates a conforming BCLK frequency of FS*16*2 (for example, FS = 44.1kHz gives 1.4112MHz).
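
The bit-clock arithmetic can be checked with a few lines of C. This is a minimal sketch with made-up helper names (`i2s_bclk_hz`, `divides_evenly`); it only reproduces the FS*16*2 calculation and tests whether a source clock could reach the result with a plain integer divider.

```c
#include <stdint.h>

/* I2S bit clock in Hz: sample rate * bits per channel * number of channels. */
static uint32_t i2s_bclk_hz(uint32_t fs_hz, uint32_t bits_per_channel, uint32_t channels)
{
    return fs_hz * bits_per_channel * channels;
}

/* Returns 1 if src_hz can be turned into target_hz with an integer divider. */
static int divides_evenly(uint32_t src_hz, uint32_t target_hz)
{
    return (target_hz != 0u) && (src_hz % target_hz == 0u);
}
```

For FS = 44100Hz, `i2s_bclk_hz(44100, 16, 2)` gives 1411200Hz. The 48MHz HFRC is not an integer multiple of that value, which is why the bit clock is left to the codec PLL, while 12MHz (48MHz divided by 4) is reachable directly.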

The Apollo3Blue can output a divided version of its 48MHz HFRC clock. Divided by 4, the resulting 12MHz is a perfect match for the SGTL5000 PLL input. One timer (B0) is used to synchronize on the word-select clock, and a second timer (A0), configured as pattern generator, is triggered by this synchronized word-select timer.

The resulting data looks like the following timing diagram:

In software, the 12MHz clock can be generated with a few lines of code:

//select HFRC divided by 4
CLKGEN->CLKOUT_b.CKSEL = CLKGEN_CLKOUT_CKSEL_HFRC_DIV4;
//enable clock-out generation
CLKGEN->CLKOUT_b.CKEN = 1;
//unlock pad configuration
GPIO->PADKEY = 0x73; 
//select PAD7 for clock output
GPIO->PADREGB_b.PAD7FNCSEL = GPIO_PADREGB_PAD7FNCSEL_CLKOUT;
//lock  pad configuration
GPIO->PADKEY = 0; 

Now the codec can be initialized to work as master with PLL input:

//
// Reset codec
//
Stgl5000_I2C_write(CHIP_CLK_CTRL, 0);       // writing reset value
Stgl5000_I2C_write(CHIP_I2S_CTRL, 0x0008);  // writing reset value
Stgl5000_I2C_write(CHIP_DIG_POWER, 0);      // writing reset value
Stgl5000_I2C_write(CHIP_LINREG_CTRL,0);     // writing reset value
Stgl5000_I2C_write(CHIP_ANA_POWER, 0x7060); // writing reset value
 
delay(400);

//
//  Init codec
//
Stgl5000_I2C_write(CHIP_ANA_POWER, 0x4060);  // VDDD is externally driven with 1.8V
Stgl5000_I2C_write(CHIP_LINREG_CTRL, 0x006C);  // VDDA & VDDIO both over 3.1V
Stgl5000_I2C_write(CHIP_REF_CTRL, 0x01F2); // VAG=1.575, normal ramp, +12.5% bias current
Stgl5000_I2C_write(CHIP_LINE_OUT_CTRL, 0x0F22); // LO_VAGCNTRL=1.65V, OUT_CURRENT=0.54mA
Stgl5000_I2C_write(CHIP_SHORT_CTRL, 0x4446);  // allow up to 125mA
Stgl5000_I2C_write(CHIP_ANA_CTRL, 0x0137);  // enable zero cross detectors
Stgl5000_I2C_write(CHIP_ANA_POWER, 0x44FF); // power up: lineout, hp, adc, dac
Stgl5000_I2C_write(CHIP_ANA_POWER, 0x45FF); // power up: lineout, hp, adc, dac
Stgl5000_I2C_write(CHIP_PLL_CTRL,(15 << 11) | (108 << 0)); // writing PLL settings for 12MHz input
Stgl5000_I2C_write(CHIP_DIG_POWER, 0x0073); // power up all digital stuff

delay(400);

Stgl5000_I2C_write(CHIP_LINE_OUT_VOL, 0x1D1D); // default approx 1.3 volts peak-to-peak
Stgl5000_I2C_write(CHIP_CLK_CTRL, 0x0007);  // 44.1 kHz, PLL
Stgl5000_I2C_write(CHIP_I2S_CTRL, 0x01B0); // SCLK=32*Fs, 16bit, I2S format, master mode
Stgl5000_I2C_write(CHIP_SSS_CTRL, 0x0050); // ADC->I2S, I2S->DAC
Stgl5000_I2C_write(CHIP_ADCDAC_CTRL, 0x0000); // disable dac mute
Stgl5000_I2C_write(CHIP_DAC_VOL, 0x3C3C); // digital gain, 0dB
Stgl5000_I2C_write(CHIP_ANA_HP_CTRL, 0x3C3C); // set volume to 0dB (0x7F7F = lowest level)
Stgl5000_I2C_write(CHIP_ANA_CTRL, 0x0036);  // enable zero cross detectors
Stgl5000_I2C_write(CHIP_ANA_CTRL, 0x0026);  // enable HP
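
The PLL value written to CHIP_PLL_CTRL above, (15 << 11) | (108 << 0), can be reproduced with a short calculation. The sketch below rests on assumptions that should be checked against the SGTL5000 datasheet before relying on it: a PLL output of 180.6336MHz for 44.1kHz-based sample rates, an integer divisor in bits 15:11 and a fractional divisor scaled by 2048 in bits 10:0. The function name `sgtl5000_pll_ctrl` is made up for this example.

```c
#include <stdint.h>

/* Assumed PLL output frequency for 44.1kHz-based sample rates. */
#define PLL_OUT_44K1_HZ 180633600ull

/* Computes the assumed CHIP_PLL_CTRL value for a given codec input clock.
 * Uses integer math only:
 *   int_div  = floor(PLL_OUT / mclk)
 *   frac_div = floor((PLL_OUT / mclk - int_div) * 2048)
 * Returns 0 for an input clock of 0 Hz. */
static uint16_t sgtl5000_pll_ctrl(uint32_t mclk_hz)
{
    uint64_t int_div;
    uint64_t frac_div;

    if (mclk_hz == 0u)
    {
        return 0u;
    }
    int_div  = PLL_OUT_44K1_HZ / mclk_hz;
    frac_div = ((PLL_OUT_44K1_HZ % mclk_hz) * 2048ull) / mclk_hz;
    return (uint16_t)((int_div << 11) | frac_div);
}
```

With the 12MHz clock from the Apollo3Blue, this yields an integer divisor of 15 and a fractional divisor of 108, matching the value used in the initialization. As far as I recall, the datasheet additionally halves input clocks above 17MHz, which this sketch does not handle.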

The resulting output of the codec shows the bit-clock output (red) and the word select output (yellow)

Now the MCU is ready to feed its data into the bit-clock and word-select frame provided by the codec. First of all, the necessary pins have to be configured:

GPIO->PADKEY = 0x00000073; //unlock pin selection

//use pin 12 for timer output (data out signal)
GPIO->PADREGD_b.PAD12FNCSEL = GPIO_PADREGD_PAD12FNCSEL_CT0;
//pad 12 output is push-pull
GPIO->CFGB_b.GPIO12OUTCFG = GPIO_CFGB_GPIO12OUTCFG_PUSHPULL;


//use pin 25 for timer A0 input (clock)
GPIO->PADREGG_b.PAD25FNCSEL = GPIO_PADREGG_PAD25FNCSEL_CT1;
//enable input at pad 25, pullup enable
GPIO->PADREGG_b.PAD25INPEN = 1;
GPIO->PADREGG_b.PAD25PULL = 1;


//use pin 13 as sync timer input (B0)
GPIO->PADREGD_b.PAD13INPEN = 1;
GPIO->PADREGD_b.PAD13PULL = 1;
GPIO->PADREGD_b.PAD13FNCSEL = GPIO_PADREGD_PAD13FNCSEL_CT2;
//GPIO->INT0EN_b.GPIO13 = 1;

GPIO->PADKEY = 0; //lock pin selection

Timer B0, which is connected to the word-select line, is a simple single-count timer that triggers the pattern generator timer A0 once it has finished counting. In this example the counter waits for 3 word-select clocks before triggering A0:

    //timer B0, WS input, B0OUT output
    CTIMER->INCFG_b.CFGB0 = CTIMER_INCFG_CFGB0_CT2;
    CTIMER->CTRL0_b.TMRB0CLR = 1;         //clear timer
    CTIMER->CTRL0_b.TMRB0FN = CTIMER_CTRL0_TMRB0FN_SINGLECOUNT;
    CTIMER->CTRL0_b.TMRB0CLK = CTIMER_CTRL0_TMRB0CLK_TMRPIN;
    CTIMER->INTEN_b.CTMRB0C0INT = 0;
    CTIMER->INTEN_b.CTMRB0C1INT = 1; 
    CTIMER->CMPRB0_b.CMPR0B0 = 2;         //count - 1
    CTIMER->CMPRB0_b.CMPR1B0 = 4;
    CTIMER->AUX0_b.TMRB0NOSYNC;           //note: no value is assigned, NOSYNC is left unchanged
    CTIMER->CTRL0_b.TMRB0CLR = 0;         //release clear timer

Timer A0 is configured as a 64-bit pattern generator. Its pattern is double-buffered in two 32-bit halves, so one half can be reloaded every 32 bits:

//timer A0, data generation
CTIMER->OUTCFG0_b.CFG0 = CTIMER_OUTCFG0_CFG0_A0OUT; //Output configuration to ct0
CTIMER->INCFG_b.CFGA0 = CTIMER_INCFG_CFGA0_CT1;     //use CT1 as in config
GPIO->CTENCFG_b.EN0 = 0;             //Timer output enabled
CTIMER->CTRL0_b.TMRA0EN = 0;
CTIMER->CTRL0_b.TMRA0CLR = 1;         //clear timer
CTIMER->CTRL0_b.TMRA0IE0 = 1;        
CTIMER->CTRL0_b.TMRA0IE1 = 1;
CTIMER->INTEN_b.CTMRA0C0INT = 1;
CTIMER->INTEN_b.CTMRA0C1INT = 1;
CTIMER->CTRL0_b.TMRA0FN = CTIMER_CTRL0_TMRA0FN_SINGLEPATTERN;
CTIMER->CTRL0_b.TMRA0CLK = CTIMER_CTRL0_TMRA0CLK_TMRPIN;
CTIMER->AUX0_b.TMRA0LMT = 63;              //set limit to 64
CTIMER->AUX0_b.TMRA0TRIG = CTIMER_AUX0_TMRA0TRIG_B0OUT;
CTIMER->AUX0_b.TMRA0NOSYNC;           //note: no value is assigned, NOSYNC is left unchanged
CTIMER->CTRL0_b.TMRA0CLR = 0;         //release clear timer
CTIMER->CTRL0_b.TMRA0EN = 1;               

Now the interrupts need to be enabled; the data is then processed in the timer interrupt handler every 32 bits:

NVIC_DisableIRQ(CTIMER_IRQn);         //disable IRQ in the NVIC
NVIC_ClearPendingIRQ(CTIMER_IRQn);    //clear pending flag in the NVIC
NVIC_SetPriority(CTIMER_IRQn,1);      //set the interrupts priority, smaller means higher priority
NVIC_EnableIRQ(CTIMER_IRQn);          //enable the IRQ in the NVIC

The audio sample data itself is stored in the following variables:

static uint32_t u32DataPos = 0;   //current position in the sample
static uint8_t au8Data[] = {...}; //audio sample data
// size of the sample is sizeof(au8Data) //audio sample size

The interrupt service routine needs several parts. First of all, the pattern generator has to run once until it stops. It raises one interrupt as soon as 32 bits have been transferred and a second interrupt as soon as 64 bits have been transferred, so two interrupts occur in which the data is pre-loaded. After this first run, the pattern generator is switched to output just 62 bits. This produces the one-clock shift relative to the word-select line, and the word-select timer can then be started so that it re-triggers the pattern generator with the correct timing.

As soon as the pattern generator (timer A0) is triggered by timer B0, the next two interrupts are used to change its configuration:

  • the first interrupt switches the pattern generator from single-pattern to repeated-pattern mode
  • the second interrupt sets the pattern length back to 64 bits

/**
****************************
** \brief CTIMER IRQ handler
**  
***************************/
void CTIMER_IRQHandler(void)
{
    static volatile uint32_t u32InitCount = 0;
    uint32_t u32Status = CTIMER->INTSTAT;
    static uint32_t u32Tmp;
    

    if (u32Status & ((1 << CTIMER_INTSTAT_CTMRA0C0INT_Pos) | (1 << CTIMER_INTSTAT_CTMRA0C1INT_Pos)))
    {
        if (u32InitCount < 4)
        {
            if (u32InitCount == 1)
            {
                //
                // pattern generator was run one-time in dummy mode, activating WS clocked timer B0
                // to trigger pattern generator A0 in single-mode in time
                //
                CTIMER->AUX0_b.TMRA0LMT = 61; // generating clock shift required by I2S spec.
                CTIMER->CTRL0_b.TMRB0EN = 1;  // enable WS-clocked timer B0, which triggers the pattern generator
            }
            if (u32InitCount == 2)
            {
                //
                // first clock interrupt happened, we switch the single pattern mode to repeated pattern
                //
                CTIMER->CTRL0_b.TMRA0FN = CTIMER_CTRL0_TMRA0FN_REPEATPATTERN;
            }
            if (u32InitCount == 3)
            {
                //
                //  switching number of bits back to 64
                //
                CTIMER->AUX0_b.TMRA0LMT = 63;
            }
            u32InitCount++;
        }
        ...

As soon as all clock shifting and setup is done, data can be loaded 32 bits at a time via the double buffer into the pattern generator. In addition, I2S needs the MSB/LSB order switched, which can be done with a single instruction on ARM Cortex-M MCUs, available as an intrinsic function in the CMSIS standard: __RBIT

        ...
        else
        {
            memcpy(&u32Tmp,&au8Data[u32DataPos],4);
            u32Tmp = __RBIT(u32Tmp);
            u32DataPos+= 4;
            if (u32DataPos >= sizeof(au8Data)) u32DataPos = 0;
            if (u32Status & (1 << CTIMER_INTSTAT_CTMRA0C0INT_Pos))
            {
                CTIMER->CMPRAUXA0 = u32Tmp;
            } else
            {
                CTIMER->CMPRA0 = u32Tmp;
            }
            
        }
    }
    CTIMER->INTCLR = u32Status;
}
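
For testing the data path on a PC, where the CMSIS intrinsic is not available, the bit reversal can be reproduced in portable C. The sketch below is only a stand-in for __RBIT with a made-up name (`reverse_bits32`); it swaps progressively larger groups of bits until the whole 32-bit word is mirrored.

```c
#include <stdint.h>

/* Portable equivalent of a 32-bit bit reversal (bit 0 <-> bit 31, bit 1 <-> bit 30, ...). */
static uint32_t reverse_bits32(uint32_t v)
{
    v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1); /* swap neighbouring bits */
    v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2); /* swap bit pairs         */
    v = ((v >> 4) & 0x0F0F0F0Fu) | ((v & 0x0F0F0F0Fu) << 4); /* swap nibbles           */
    v = ((v >> 8) & 0x00FF00FFu) | ((v & 0x00FF00FFu) << 8); /* swap bytes             */
    v = (v >> 16) | (v << 16);                               /* swap half-words        */
    return v;
}
```

On the target the single-instruction __RBIT remains the better choice, since the handler runs once every 32 bits.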

A software example will be available soon at http://www.feeu.com/apollo3blue

By the way, Audacity is a good tool for converting wave files to a binary file.


2 Comments

  1. n stocks

    Dear Shreinerman, great blog – I read it with interest. We are also thinking of adding audio to the Apollo3 and I wouldn’t have thought of using the pattern generator to configure an I2S output. It looks like you have got great results so we are considering following your example. But can I ask why you didn’t use one of the interfaces that exist on the Apollo3 such as SPI or I2C? My concern with using the pattern generator is that it will hog CPU time and hence limit the CPU’s ability to do other signal processing tasks. Also, what current drain did you observe?

    • schreinerman

      Hi Nigel,

      So far I have not managed to run I2S output via SPI without gaps/jitter between transaction blocks. In addition, exact timing (a multiple of the sample frequency) is also an issue, and it is not possible to clock the SPI master with an external clock. The SPI slave can of course do that and is great for a virtual register / virtual memory model, but it is therefore also not usable for I2S.

      For a 64-bit pattern generator the ISR has to be handled every 32 bits, so at 44100Hz. You can also reduce the CPU load by chaining two timers into a 128-bit pattern generator, so that the ISR is handled every 64 bits, i.e. at 22050Hz. With the 64-bit pattern generator, an Apollo3Blue running at 48MHz can execute 1088 cycles between two interrupts, and 2176 cycles at 96MHz. With the 128-bit pattern generator this becomes 2176 cycles at 48MHz and 4353 cycles at 96MHz. Reducing the sample frequency to 22050Hz, for example, would again double all values.

      In addition, Apollo3Blue has the advantage of DMA for several peripherals, so other communication can run without CPU usage.

      I have not measured the current so far, but Apollo3Blue runs at less than 0.5mA active at 48MHz. With Apollo1/2, an enabled CTIMER consumed about 200uA. So my expectation is less than 1mA (without codec). I can try to verify these values in the next days.
