BOTev

Winter 2019

A neural network poetry generator in Bulgarian.

 

Built during SAP GeekyCamp 5.0 in Sofia.

Uses Python + Keras + TensorFlow.

While generating poetry with machine learning is not a novel idea, I had never seen a tool that did it in Bulgarian.

I wanted to explore existing implementations for English and test how they dealt with the nuances of Bulgarian: its Cyrillic script and its different word order and grammar.

I settled on an approach using a combination of a recurrent neural network and Markov chains, as described by Potash et al. Special thanks to Paul Mooney for his Poetry Generator (RNN Markov).
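To illustrate the Markov chain half of that combination, here is a minimal sketch in plain Python. It is not the project's actual code: the function names (build_chain, generate_line) and the toy corpus are hypothetical, the three corpus lines are simply borrowed from the sample output further down, and the RNN part is left out entirely because the details of how the two were combined are not given here.

```python
import random
from collections import defaultdict


def build_chain(lines):
    """Map each word to the list of words observed directly after it."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate_line(chain, start, max_words=6, rng=None):
    """Walk the chain from `start`, picking a random observed follower each step."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words:
        followers = chain.get(words[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        words.append(rng.choice(followers))
    return " ".join(words)


# Toy corpus (an assumption for illustration, not the real training data).
corpus = [
    "до теб ще е във гроба",
    "до теб ще е на жал",
    "че от радост веч не щем",
]

chain = build_chain(corpus)
line = generate_line(chain, "до", rng=random.Random(0))
print(line)
```

Because Python strings are Unicode, splitting on whitespace works for Cyrillic text without any special handling. With a corpus this small the chain can only reproduce lines it has already seen; new combinations appear only once many lines share words.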

 

I trained the model on the works of prominent 19th- and 20th-century Bulgarian authors, including Hristo Botev (who inspired the name of the project), Ivan Vazov and Pencho Slaveykov. Because the datasets are relatively small, the generated poems are largely nonsensical, but they do show a semblance of rhyme and subject.

Output from the model trained on the works of Ivan Vazov

Не че той ще има —

до теб ще е във гроба,

за всички, що в гърди ми!

Да, не съм в тез години

От кой ли ще ме окичи

ох, аз ще да се кае,

Да, всичко в теб да се отчайм

до теб ще е на жал…

че от радост веч не щем:

Ти ще се да не кажат

Но що от нищо на светът!

а той не се в часът

той от радост веч не желая!

че чух как си ти Русия!

искам и аз ще го напълня

Не че той ще да знайш!