Instructions | Raspi Newspaper Synthesizer/Reader

Step 1
- If you only want to use the TTS engine with ALSA integration, follow steps 1, 2, 4 and 5. For the Newspaper Reader, follow all steps.
First install the TTS engine. I suggest you use mfurquim's fork.
```
git clone https://github.com/mfurquim/picopi.git
```
As an alternative you can go with the original repository from Doug
```
git clone https://github.com/DougGore/picopi.git
```
If you go with the original version you will need to add these files, which you can get from mfurquim's fork: strdup8to16.c, strdup16to8.c, strdup16to8.cpp, jstring.h
Follow the first steps of Doug Gore's instructions to set up the library:
```
cd picopi/pico/lib
make && sudo make install 
```
It is normal to see a lot of warnings.
You now have the TTS library installed.

Step 2

Now let's install ALSA's development environment. This can take some time.

```
# run from the directory that contains your picopi clone
cd picopi/pico
wget http://alsa.cybermirror.org/lib/alsa-lib-1.0.29.tar.bz2
tar -xjvf alsa-lib-1.0.29.tar.bz2
cd alsa-lib-1.0*
./configure && make
sudo make install
cd ..
rm alsa-lib-1.0.29.tar.bz2
```

UPDATE FEBRUARY 2017: ALSA 1.1.3 is the current version. Use these commands instead to download and compile:
```
# run from the directory that contains your picopi clone
cd picopi/pico
wget ftp://ftp.alsa-project.org/pub/lib/alsa-lib-1.1.3.tar.bz2
tar -xjvf alsa-lib-1.1.3.tar.bz2
cd alsa-lib-1.1*
./configure && make
sudo make install
cd ..
rm alsa-lib-1.1.3.tar.bz2
```

Then test that it works by compiling one of the supplied example programs. If you built 1.1.3, change into alsa-lib-1.1.3/test instead.
```
cd alsa-lib-1.0.29/test
make pcm
amixer cset numid=3 1
./pcm
```

You should hear a 440 Hz sine wave when you plug headphones into the analog jack. The amixer command above routes sound to the analog jack. If you want a different output you need to change the amixer options.
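As a rough sketch of what "changing the amixer options" means: on the Raspbian releases this guide targets, control numid=3 of the on-board sound driver selects the playback route (0 = automatic, 1 = analog jack, 2 = HDMI). The helper names below are made up for illustration, and newer Raspberry Pi OS releases may expose the outputs differently.

```shell
# Map a human-readable output name to the value for "amixer cset numid=3".
# Assumption: classic on-board Raspberry Pi audio driver, where
# 0 = automatic, 1 = analog jack, 2 = HDMI.
audio_route_id() {
    case "$1" in
        auto)   echo 0 ;;
        analog) echo 1 ;;
        hdmi)   echo 2 ;;
        *)      echo "unknown output: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: switch the output by name.
set_audio_route() {
    id=$(audio_route_id "$1") || return 1
    amixer cset numid=3 "$id"
}

# Example: set_audio_route hdmi
```

Calling set_audio_route analog is equivalent to the amixer command used in the test above.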

At the time of writing alsa-lib-1.0.29 was the latest version. As of February 2017 it is ALSA 1.1.3.
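Since the version number appears in several commands, one option is to keep it in a single place. The sketch below wraps the download and build steps from this step in a function. The function names are invented for illustration, and the URL pattern is taken from the 1.1.3 download above; other versions are only assumed to follow the same pattern.

```shell
# Name of the source tarball for a given alsa-lib version.
alsa_tarball() {
    echo "alsa-lib-$1.tar.bz2"
}

# Download URL, following the pattern of the 1.1.3 link in this guide.
# Assumption: other versions live under the same directory.
alsa_url() {
    echo "ftp://ftp.alsa-project.org/pub/lib/$(alsa_tarball "$1")"
}

# Download, unpack, build and install one version, then remove the tarball.
build_alsa() {
    version=$1
    wget "$(alsa_url "$version")" &&
    tar -xjvf "$(alsa_tarball "$version")" &&
    ( cd "alsa-lib-$version" && ./configure && make && sudo make install ) &&
    rm "$(alsa_tarball "$version")"
}

# Example, run from picopi/pico: build_alsa 1.1.3
```

The matching test directory is then alsa-lib-1.1.3/test, or whichever version you passed in.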

Step 3
Now we will add some of the components we need for Python. Again, this may take some time.
```
sudo apt-get install espeak python-espeak python-dev python-pip python-alsaaudio
sudo pip install feedparser beautifulsoup4 
```
Why do we need eSpeak if we are going to run the Android TTS engine? I have gone back and forth on this point but decided that eSpeak is good enough for reading out the names of the papers. It gives you an extra voice that makes it easier for the user to determine, just based on voice, where in the menu hierarchy (s)he is.
In addition, if you plan to use the MPL3155 module you need to install
```
sudo apt-get install python-smbus
```
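To confirm that the packages from this step are usable before moving on, a quick import check can help. This is a minimal sketch: the function name is made up, and it assumes the packages install the modules feedparser, bs4, espeak, alsaaudio and smbus and that the interpreter is reachable as "python" (override it with the PYTHON variable).

```shell
# Print the name of every module that the Python interpreter cannot import.
# Assumption: PYTHON names the interpreter to test; defaults to "python".
missing_python_modules() {
    for module in "$@"; do
        "${PYTHON:-python}" -c "import $module" 2>/dev/null || echo "$module"
    done
}

# Example:
#   missing_python_modules feedparser bs4 espeak alsaaudio smbus
# An empty result means every listed module could be imported.
```

Leave smbus out of the list if you did not install python-smbus.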

Step 4

Now we need to add a file that is missing from the tts repositories. Fortunately Google themselves make it available but it needs some modifications to make it work with the Raspi port. You can find the file at https://android.googlesource.com/platform/external/svox/+/9b08cc440f25c4722ca112642be87053fc47f918/pico/tts/com_svox_picottsengine.cpp

```
cd picopi/pico/tts
nano com_svox_picottsengine.cpp
```

Paste the file in the editor. Before you can save it a few changes need to be made within the file.


- Add the following define statements at the beginning of the file. As written the log macros expand to nothing; change #if 0 to #if 1 to send the log messages to printf.
```
#if 0
#define ALOGE printf
#define ALOGI printf
#define ALOGV printf
#else
#define ALOGE
#define ALOGI
#define ALOGV
#endif
```
- Find and replace all occurrences of LOGE by ALOGE, LOGV by ALOGV and LOGI by ALOGI. Match whole words only, or do this before adding the defines above. Otherwise the new ALOGE, ALOGI and ALOGV names are renamed as well.
- Comment out or delete the include statements for log.h and AndroidRuntime.h.
- Add these include statements. Make sure they are included before the include for TtsEngine.h.
```
#include "jstring.h"
#include "svox_ssml_parser.h"
#include <string.h>
#include <stdint.h>
#include <stdlib.h>
```
- Change #include <tts/TtsEngine.h> to #include <TtsEngine.h>
- After the line
```
using namespace android;
```
add
```
using namespace std;
```
- Change the file path for PICO_LINGWARE_PATH to
```
const char* PICO_LINGWARE_PATH = "../lang/";
```
- After
```
const char * PICO_PHONEME_OPEN_TAG = "<phoneme ph='";
```
add
```
const char * PICO_PHONEME_CLOSE_TAG = "'/>";
```
- Find the function tts_result TtsEngine::init( synthDoneCB_t synthDoneCBPtr ). In the original file that you pulled from the web this would have been in line 522. Add a second parameter so that it reads
```
tts_result TtsEngine::init( synthDoneCB_t synthDoneCBPtr , const char *config)
```
- Find TtsEngine::setAudioFormat(AudioSystem::audio_format& encoding, uint32 and replace it with TtsEngine::setAudioFormat(tts_audio_format& encoding, uint32, leaving the rest of the line as is.
- Find the line
```
ret = pico_resetEngine(pico_Engine);
```
and replace it with (or add the extra argument)
```
ret = pico_resetEngine(pico_Engine, PICO_RESET_SOFT);
```
- Find the three occurrences of AudioSystem::PCM_16_BIT and replace each with TTS_AUDIO_FORMAT_16_BIT
- Remove the entire function TtsEngine::synthesizeIpa( ... ) { ... }. After the end of the synthesizeText function you should then have the TtsEngine::stop() function.
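The purely mechanical find-and-replace changes in the list above can also be scripted instead of done by hand in the editor. The following is a sketch under a few assumptions: it relies on GNU sed (the -i and -E options and the \b word boundary), and the function name patch_pico_engine is made up for illustration. It only covers the search-and-replace items; the defines, the include changes, the init signature and the removal of synthesizeIpa still have to be done manually.

```shell
# Apply the search-and-replace edits from this step to one source file.
# Assumption: GNU sed. \b restricts the LOG rename to whole words, so
# ALOGE/ALOGI/ALOGV lines that already exist are left alone.
patch_pico_engine() {
    sed -i -E \
        -e 's/\bLOG([EIV])\b/ALOG\1/g' \
        -e 's|#include <tts/TtsEngine.h>|#include <TtsEngine.h>|' \
        -e 's/AudioSystem::audio_format& encoding/tts_audio_format\& encoding/' \
        -e 's/pico_resetEngine\(pico_Engine\)/pico_resetEngine(pico_Engine, PICO_RESET_SOFT)/' \
        -e 's/AudioSystem::PCM_16_BIT/TTS_AUDIO_FORMAT_16_BIT/g' \
        "$1"
}

# Only touch the file if it is present in the current directory.
if [ -f com_svox_picottsengine.cpp ]; then
    patch_pico_engine com_svox_picottsengine.cpp
fi
```

Because the rename matches whole words, it makes no difference whether the define statements were added before or after running it.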

Add these functions

/** createPhonemeString
 *  Wrap all individual words in <phoneme> tags.
 *  The Pico <phoneme> tag only supports one word in each tag,
 *  therefore they must be individually wrapped!
 *  @xsampa - text to convert to Pico phoneme string
 *  @length - length of the input string
 *  return new string with tags applied
*/
extern char * createPhonemeString( const char * xsampa, int length )
{
    char *  convstring = NULL;
    int     origStrLen = strlen(xsampa);
    int     numWords   = 1;
    int     start, totalLength, i, j;

    for (i = 0; i < origStrLen; i ++) {
        if ((xsampa[i] == ' ') || (xsampa[i] == '#')) {
            numWords ++;
        }
    }

    if (numWords == 1) {
        convstring = new char[origStrLen + 17];
        convstring[0] = '\0';
        strcat(convstring, PICO_PHONEME_OPEN_TAG);
        strcat(convstring, xsampa);
        strcat(convstring, PICO_PHONEME_CLOSE_TAG);
    } else {
        char * words[numWords];
        start = 0; totalLength = 0; i = 0; j = 0;
        for (i=0, j=0; i < origStrLen; i++) {
            if ((xsampa[i] == ' ') || (xsampa[i] == '#')) {
                words[j]    = new char[i+1-start+17];
                words[j][0] = '\0';
                strcat( words[j], PICO_PHONEME_OPEN_TAG);
                strncat(words[j], xsampa+start, i-start);
                strcat( words[j], PICO_PHONEME_CLOSE_TAG);
                start = i + 1;
                j++;
                totalLength += strlen(words[j-1]);
            }
        }
        words[j]    = new char[i+1-start+17];
        words[j][0] = '\0';
        strcat(words[j], PICO_PHONEME_OPEN_TAG);
        strcat(words[j], xsampa+start);
        strcat(words[j], PICO_PHONEME_CLOSE_TAG);
        totalLength += strlen(words[j]);
        convstring = new char[totalLength + 1];
        convstring[0] = '\0';
        for (i=0 ; i < numWords ; i++) {
            strcat(convstring, words[i]);
            delete [] words[i];
        }
    }

    return convstring;
}

/* The XSAMPA uses as many as 5 characters to represent a single IPA code.  */
typedef struct tagPhnArr
{
    char16_t    strIPA;             /* IPA Unicode symbol       */
    char        strXSAMPA[6];       /* SAMPA sequence           */
} PArr;

#define phn_cnt (134+7)

PArr    PhnAry[phn_cnt] = {

    /* XSAMPA conversion table
	   This maps a single IPA symbol to a sequence representing XSAMPA.
       This relies upon a direct one-to-one correspondence
       including diphthongs and affricates.						      */

    /* Vowels (23) complete     */
    {0x025B,        "E"},
    {0x0251,        "A"},
    {0x0254,        "O"},
    {0x00F8,        "2"},
    {0x0153,        "9"},
    {0x0276,        "&"},
    {0x0252,        "Q"},
    {0x028C,        "V"},
    {0x0264,        "7"},
    {0x026F,        "M"},
    {0x0268,        "1"},
    {0x0289,        "}"},
    {0x026A,        "I"},
    {0x028F,        "Y"},
    {0x028A,        "U"},
    {0x0259,        "@"},
    {0x0275,        "8"},
    {0x0250,        "6"},
    {0x00E6,        "{"},
    {0x025C,        "3"},
    {0x025A,        "@`"},
    {0x025E,        "3\\\\"},
    {0x0258,        "@\\\\"},

    /* Consonants (60) complete */
    {0x0288,        "t`"},
    {0x0256,        "d`"},
    {0x025F,        "J\\\\"},
    {0x0261,        "g"},
    {0x0262,        "G\\\\"},
    {0x0294,        "?"},
    {0x0271,        "F"},
    {0x0273,        "n`"},
    {0x0272,        "J"},
    {0x014B,        "N"},
    {0x0274,        "N\\\\"},
    {0x0299,        "B\\\\"},
    {0x0280,        "R\\\\"},
    {0x027E,        "4"},
    {0x027D,        "r`"},
    {0x0278,        "p\\\\"},
    {0x03B2,        "B"},
    {0x03B8,        "T"},
    {0x00F0,        "D"},
    {0x0283,        "S"},
    {0x0292,        "Z"},
    {0x0282,        "s`"},
    {0x0290,        "z`"},
    {0x00E7,        "C"},
    {0x029D,        "j\\\\"},
    {0x0263,        "G"},
    {0x03C7,        "X"},
    {0x0281,        "R"},
    {0x0127,        "X\\\\"},
    {0x0295,        "?\\\\"},
    {0x0266,        "h\\\\"},
    {0x026C,        "K"},
    {0x026E,        "K\\\\"},
    {0x028B,        "P"},
    {0x0279,        "r\\\\"},
    {0x027B,        "r\\\\'"},
    {0x0270,        "M\\\\"},
    {0x026D,        "l`"},
    {0x028E,        "L"},
    {0x029F,        "L\\\\"},
    {0x0253,        "b_<"},
    {0x0257,        "d_<"},
    {0x0284,        "J\\_<"},
    {0x0260,        "g_<"},
    {0x029B,        "G\\_<"},
    {0x028D,        "W"},
    {0x0265,        "H"},
    {0x029C,        "H\\\\"},
    {0x02A1,        ">\\\\"},
    {0x02A2,        "<\\\\"},
    {0x0267,        "x\\\\"},		/* hooktop heng	*/
    {0x0298,        "O\\\\"},
    {0x01C0,        "|\\\\"},
    {0x01C3,        "!\\\\"},
    {0x01C2,        "=\\"},
    {0x01C1,        "|\\|\\"},
    {0x027A,        "l\\\\"},
    {0x0255,        "s\\\\"},
    {0x0291,        "z\\\\"},
    {0x026B,        "l_G"},


    /* Diacritics (37) complete */
    {0x02BC,        "_>"},
    {0x0325,        "_0"},
    {0x030A,        "_0"},
    {0x032C,        "_v"},
    {0x02B0,        "_h"},
    {0x0324,        "_t"},
    {0x0330,        "_k"},
    {0x033C,        "_N"},
    {0x032A,        "_d"},
    {0x033A,        "_a"},
    {0x033B,        "_m"},
    {0x0339,        "_O"},
    {0x031C,        "_c"},
    {0x031F,        "_+"},
    {0x0320,        "_-"},
    {0x0308,        "_\""},     /* centralized		*/
    {0x033D,        "_x"},
    {0x0318,        "_A"},
    {0x0319,        "_q"},
    {0x02DE,        "`"},
    {0x02B7,        "_w"},
    {0x02B2,        "_j"},
    {0x02E0,        "_G"},
    {0x02E4,        "_?\\\\"},	/* pharyngealized	*/
    {0x0303,        "~"},		/* nasalized		*/
    {0x207F,        "_n"},
    {0x02E1,        "_l"},
    {0x031A,        "_}"},
    {0x0334,        "_e"},
    {0x031D,        "_r"},		/* raised  equivalent to 02D4 */
    {0x02D4,        "_r"},		/* raised  equivalent to 031D */
    {0x031E,        "_o"},		/* lowered equivalent to 02D5 */
    {0x02D5,        "_o"},		/* lowered equivalent to 031E */
    {0x0329,        "="},		/* syllabic			*/
    {0x032F,        "_^"},		/* non-syllabic		*/
    {0x0361,        "_"},		/* top tie bar		*/
    {0x035C,        "_"},

    /* Suprasegmental (15) incomplete */
    {0x02C8,        "\""},		/* primary   stress	*/
    {0x02CC,        "%"},		/* secondary stress	*/
    {0x02D0,        ":"},		/* long				*/
    {0x02D1,        ":\\\\"},	/* half-long		*/
    {0x0306,        "_X"},		/* extra short		*/

    {0x2016,        "||"},		/* major group		*/
    {0x203F,        "-\\\\"},	/* bottom tie bar	*/
    {0x2197,        "<R>"},		/* global rise		*/
    {0x2198,        "<F>"},		/* global fall		*/
    {0x2193,        "<D>"},		/* downstep			*/
    {0x2191,        "<U>"},		/* upstep			*/
    {0x02E5,        "<T>"},		/* extra high level	*/
    {0x02E7,        "<M>"},		/* mid level		*/
    {0x02E9,        "<B>"},		/* extra low level	*/

    {0x025D,        "3`:"},		/* non-IPA	%%		*/

    /* Affricates (6) complete  */
    {0x02A3,        "d_z"},
    {0x02A4,        "d_Z"},
    {0x02A5,        "d_z\\\\"},
    {0x02A6,        "t_s"},
    {0x02A7,        "t_S"},
    {0x02A8,        "t_s\\\\"}
    };


void CnvIPAPnt( const char16_t IPnt, char * XPnt )
{
    char16_t        ThisPnt = IPnt;                     /* local copy of single IPA codepoint   */
    int             idx;                                /* index into table         */

    /* Convert an individual IPA codepoint.
       A single IPA code could map to a string.
       Search the table.  If it is not found, use the same character.
       Since most codepoints can be contained within 16 bits,
       they are represented as wide chars.              */
    XPnt[0] = 0;                                        /* clear the result string  */

    /* Search the table for the conversion. */
    for (idx = 0; idx < phn_cnt; idx ++) {               /* for each item in table   */
        if (IPnt == PhnAry[idx].strIPA) {                /* matches IPA code         */
            strcat( XPnt, (const char *)&(PhnAry[idx].strXSAMPA) ); /* copy the XSAMPA string   */
            return;
        }
    }
    strcat(XPnt, (const char *)&ThisPnt);               /* just copy it             */
}


/** cnvIpaToXsampa
 *  Convert an IPA character string to an XSAMPA character string.
 *  @ipaString - input IPA string to convert
 *  @outXsampaString - converted XSAMPA string is passed back in this parameter
 *  return size of the new string
*/

int cnvIpaToXsampa( const char16_t * ipaString, size_t ipaStringSize, char ** outXsampaString )
{
    size_t xsize;                                  /* size of result               */
    size_t ipidx;                                  /* index into IPA string        */
    char * XPnt;                                   /* short XSAMPA char sequence   */

    /* Convert an IPA string to an XSAMPA string and store the xsampa string in *outXsampaString.
       It is the responsibility of the caller to free the allocated string.
       Increment through the string.  For each base & combination convert it to the XSAMPA equivalent.
       Because of the XSAMPA limitations, not all IPA characters will be covered.       */
    XPnt = (char *) malloc(6);
    xsize   = (4 * ipaStringSize) + 8;          /* assume more than double size */
    *outXsampaString = (char *) malloc( xsize );/* allocate return string   */
    *outXsampaString[0] = 0;
    xsize = 0;                                  /* clear final              */

    for (ipidx = 0; ipidx < ipaStringSize; ipidx ++) { /* for each IPA code        */
        CnvIPAPnt( ipaString[ipidx], XPnt );           /* get converted character  */
        strcat((char *)*outXsampaString, XPnt );       /* concatenate XSAMPA       */
    }
    free(XPnt);
    xsize = strlen(*outXsampaString);                  /* get the final length     */
    return xsize;
}

Save and exit.

Step 5
Next delete main.cpp and the Makefile in the picopi/pico/tts directory, which you should still be in. They will be replaced by my versions from git.
Then build the program in the tts directory.
```
rm Makefile
rm main.cpp
git clone https://github.com/kirchnet/Newspaper-Reader.git
make
```
Ignore the warnings. You should see a new executable tts.alsa in the tts subdirectory. Try it out:
```
amixer cset numid=3 1
./tts.alsa
```
If you plug headphones into the audio jack you should hear "test, test, test." To use a different output, change the amixer command.
The file language.txt defines the language in which the contents of the file temp.txt are synthesized. Try other languages. Allowable values are eng-USA, eng-GBR, deu-DEU, spa-ESP, fra-FRA or ita-ITA. As a bonus I added an excerpt from a book I published last year that illustrates the use of the <pitch> tag in the text. This is meant to be synthesized in deu-DEU.
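Switching language and text can be wrapped in a small helper. This is only a sketch: the function names are invented, and it assumes that language.txt holds just the language code on one line and that temp.txt holds the text to speak. Check the files shipped in the repository for the exact format before relying on it.

```shell
# Succeeds only for the language codes listed in this guide.
is_pico_language() {
    case "$1" in
        eng-USA|eng-GBR|deu-DEU|spa-ESP|fra-FRA|ita-ITA) return 0 ;;
        *) return 1 ;;
    esac
}

# Write language.txt and temp.txt in the current directory.
# Assumption: one language code per file, text stored as plain text.
prepare_speech() {
    if ! is_pico_language "$1"; then
        echo "unsupported language: $1" >&2
        return 1
    fi
    printf '%s\n' "$1" > language.txt
    printf '%s\n' "$2" > temp.txt
}

# Example, run from the directory that contains tts.alsa:
#   prepare_speech eng-GBR "Good morning." && ./tts.alsa
```

Rejecting unknown codes up front avoids handing the engine a language it has no data for.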
If all you want to do is use the TTS engine you are done now. If you want to build the Newspaper Reader continue with the next step.

Step 6

more to come soon ...

Step 7

Install WiringPi. Works on both Raspi and BananaPi. <not needed? need to double check>

Step 8

On the hardware side we need to add three buttons, and if you choose to use it we also need to add <TODO>
