There is a lot of projects out there which use R-2R ladder and an Arduino to recreate sounds from either SD card or short audio clips programmed directly to MCU's flash memory. Although using SD card is fairly reasonable and provides a lot of flexibility it's not very challenging (from the software point of view) and it requires some additional hardware. Believe it or not we already have all the needed hardware in the Arduino itself.
In this project I'll play mp3 or any other multimedia files from the PC using Arduino. Bellow are the details:
- Play PCM 8kHz 8bit Audio with Arduino
- Audio samples will be transfered via USART from the PC in binary format using SLIP
- Audio files will be decoded on the PC side and only the RAW data will be send to Arduino
Arduino is a powerful machine, powerful enough that it's possible to play audio with even higher sampling rates and bit-resolutions than assumed above, the bottleneck in this case is the Serial Port. Assuming that I'll use the highest standard USART speed available which is 115200 bps (8 data bits one start bit and stop bit = 10 bits to send a byte) it's possible to send up to 11520 bytes per second. In order to play a second of 8 kHz 8 bit Audio I need to have at least 8 kB of data. So the USART is around 30% faster as a data producer than the audio samples consumption rate, which means that 8kHz standard sampling rate is the highest I can get.
The timing is crucial thing in this project. The data must be buffered but what kind of buffer do I need to have ? At first I thought that the bigger the better, but that's not exactly true. Accordingly to assumptions in implementation of the SLIP protocol in libpca, one SLIP frame can carry up to 256 bytes of data, we must not forget that this transfer is not instantaneous it takes time and we're in the middle of the playback not really knowing (at the PC side) how much audio samples have already been consumed (the consequence is that it's hard to tell how much space is available in the buffer). If the buffer at Arduino side doesn't have enough data to hold the new upcoming chunk - the data will be lost and we'll hear an audio glitch - and that situation for sure will happen since we're sending faster than consuming.
First approach that came to my mind is to simply wait after each data block in order to be sure that there will always be enough space in the buffer (since most of the data will be consumed). Let's think about it for a moment and evaluate some rough timing calculations.
- playing 256 samples takes 1/8000 * 256 = 32 ms
- sending 256 samples takes 256 + 2 (SLIP END characters) / 11520 = 22 ms
The second calculation does not take into consideration any additional SLIP ESCAPE characters that may be included. But more or less I have a 10 ms I can wait after sending the chunk before sending another one, right ? WRONG. Those calculations do not consider that during those 22 ms when we were sending the data already something around 176 samples have been played (68% of the data). OK, so if I'll wait 32%*10ms = 3.2 ms then it should be fine, right ? WRONG again. By waiting, everything that is done is only hopelessly trying to make the data transfer speed equal to the audio consumption rate. It's impossible to synchronize ideally those two independent processes since there are so many factors that can have it's influence on the timing that all the efforts by definition have simply no point. If the sending rate will be too low, the audio will be chopped since there will be gaps in the playback, if the data will come too fast, there will be no space available and will have drop the data from time to time.
It's even worse. Let's look on the slip_recv function prototype:
uint8_t slip_recv(uint8_t *a_buff, uint8_t a_buflen);
It takes a pointer to a buffer for the incoming data, since this is a transmission buffer it can't be used to realize another transmission until the data is completely consumed. That means that after the reception we must COPY the data from the frame to the proper audio buffer. Which makes the timing COMPLETELY unpredictable with the required precision.
How to cope with this situation then ? First of all we cannot afford to copy the data from one buffer to another - it's simply a waste of time and by clever memory organization this problem can be easily eliminated. The playback must be done from the transmission buffer directly. But how to perform the transmission and audio playback using the same buffer in the same time ? It's actually pretty easy. Have a look:
I have four transmission buffers (let's call them buffer banks) each of them holding 64 samples. The audio will be played directly from the samples table. The beauty of the picked size is that
64 * 4 = 256
which means that the audio data can be addressed using a single byte, the following way:
sample = p[(g_tail >> 6) & 0x03].samples[g_tail & 0x3f];
Since g_tail is an 8 bit variable 2 upper bits select the "bank" (0-3), and the rest, addresses the audio data (0 - 63). I address both the buffer bank and the audio data with a single variable. It more or less looks like using a single continuous buffer.
When receiving data I track which buffer bank is free (1-4) and I write to it. The playback happens from the previous buffers. If there is no free buffer available I send a "WAIT" command to the PC so it can wait a little while (a time shorter than the time needed to consume 3 buffers = 192 audio samples, basically the time must be longer than 8 ms (consume 64 bytes = 1 bank) and shorter than 24 ms). I chose 1,9 ms which lasts for around 15 audio samples. It's too short isn't it ? No it's not. Let's assume 2 ms of explicit waiting (due to the function inaccuracies) + time needed to receive the "WAIT" string through the Serial Line: 350 us which already makes it let's say 2.5 ms not mentioning any other processing times and of course the time needed to transfer 64 bytes block >= 8 ms, which in total gives at least 10,5 ms. Of course the time is much more longer (SLIP special characters have not been taken into consideration and as mentioned indeterminable processing time has been omitted as well).
The transmission/consumption process is depicted bellow:
|Transmission of binary frames through the Serial Port and new data placement in the Arduino's buffers.|
Let's talk about how to play the samples ? I use timer in CTC mode and play the samples by placing them on the port directly, but again there's a little catch here as well. The only "full" 8 bit port available on Arduino is PORTD, unfortunately we can't use it's two lower PINS 0,1 since they are shared with USART and I'm using USART as a data source. Because of that The bottom part of the byte (bits 0 - 5) is placed on PINS 2-7 of PORTD and the remaing two most significant bits (6 - 7) are placed on the adjacent PORTB. This may have an influence on the audio quality since placing the data on two ports is not an atomic operation - first we place one piece of data - which in effect generates some sort of voltage on the R-2R output, then we place the remaining piece of data on the other port - this will probably generate a high frequency glitch for every sample. Have a look on the schematics.
A word about the Ladder itself. I chose a value of R = 5k. I had more in mind the 2R value = 10k which is the standard one. I had 4k7 resistors laying near hand so I decided to use them. It wasn't a good idea. The R-2R relation must be as good as possible. It's best to use the same resistors for both R and 2R and connect two of them in parallel to form a R value, so in my case R = 5k (two 10k connected in parallel) and 2R = 10k. I took some measurements using Arduinos ADC (the R-2R ladder output connected to the ADC) and below are the results. It's pretty visible that using 4,7k and 10k for R-2R ladder is a bad idea.
|R-2R = 4k7-10k Ladder signal response (sawtooth).|
|R-2R = 4k7-10k Ladder signal response (sine wave).|
|Major differences when using two 10k in parallel as 5k.|
|"Steps" visible anyway - the curve magnified.|
|R-2R Ladder with 4k7 and 10k resistors. Work in progress.|
|R-2R Ladder with 10k restistors connected in parallel.|
|R-2R Ladder Schematics.|
The source code.
Arduino program is pretty simple. I already mentioned that samples are played in the timer interrupt. Besides that a standard data collection algorithm happens in the while loop in a very similar fashion to the one from the Arduino MIDI player.
The function collecting the samples, takes a pointer to the destination buffer. The buffer is selected from four available by using 2 most significant bits of the g_head counter. g_head & g_tail indexes realize a queue on the buffers. The g_tail is incremented by the timer interrupt whenever new sample is played. The g_head is incremented along with new data received. As long as g_tail != g_head I know that there is still data available in the buffers.
serial_collect_samples((void *)&p[(g_head >> 6) & 0x03]);
A word about the sample collect function
First thing is to check if number of available samples is higher than 192 (3 banks) if yes, then the "WAIT" command is send, to tell the PC side to refrain from sending new data for a while. The slip_recv call is blocking. It will block until new data has been collected, after CRC verification and making sure that the data is genuine the g_head index is incremented by the number of samples received in the frame.
Perl script is responsible for feeding Arduino with data. This script accepts a 8 kHz 8 bit WAVE file as an input. It's pretty straight forward and similar to previous script for the Arduino MIDI player. First it initialize the Serial port, then it tries to open the WAVE file and read it's header. After that is successful I read 44 bytes of header and unpack them.
die "Unable to read WAV header\n"
unless($offset = read $g_fh, $header, 44);
my @header = unpack "a4la4a4ls2l2s2a4l", $header;
It get's more sense if you look on the WAVE file header:
So, the unpack call extracts RIFF file id, the WAVE file format and the rest of the header fields. The purpose is to detect if it's a WAVE file and if it has the only compatible sampling rate and bit rate. Once that is confirmed, the script goes to the "transfer loop". If there is no "WAIT" command received from Arduino it simply reads the 64 byte data chunk from the file and feeds the Arduino. If "WAIT" has been received, the script will wait for around 2 ms (1900 us). The loop continues until the whole file is processed. The frame contains two bytes of CRC, one byte indicating how many samples it conveys and the samples themselves.
Although it's completely fine to use this script directly I use it in a wrapper (I could've made everything in the single script, but I was too lazy :)). The "player.sh" shell script accepts any audio format as an input it uses sox to convert the provided audio file to an intermediate WAVE file which then will be provided to Perl in order to be played by Arduino. It's a lot more useful than converting the files manually every time.
How to use it ?
The whole project snapshot can be obtained here. One can fetch the newest version of libpca and the project itself from my public GitHub repository as well. The snapshot and the GitHub version have a slightly different Makefile, paths to the libpca are a little bit different, everything else is exactly the same.
git clone email@example.com:dagon666/avr_Libpca pca
git clone firstname.lastname@example.org:dagon666/avr_ArduinoProjects projects
The project resides under dac directory. After navigating to it. One should change the branch to r2r_dac:
git checkout -b r2r_dac origin/r2r_dac
Download and unpack the snapshot from here
... the rest is common:
Build and flash the Arduino firmware:
Launch the player.sh script with an audio file as a parameter:
$ ./player.sh audio.mp3
Finally we can play some audio. The video presents an Arduino playing mp3 of my metal project Tangible Void - you can check it out on youtube - here.