Liquid State Machines (LSMs), introduced by W. Maass, and Echo State Networks (ESNs), introduced by H. Jaeger, are two closely related computational frameworks that use a recurrent neural network as a reservoir: the inputs are fed into the reservoir, which itself is left untrained. Instead, a linear classifier is applied to the dynamical state of the reservoir; this classifier can be trained quickly, with a guaranteed optimal solution for a given error metric. Here, we present our work on applying both instances of reservoir computing, LSMs and ESNs, to the task of recognizing isolated digits spoken by multiple speakers. Our approach differs substantially from the traditional setup used in automatic speech recognition, i.e., a mel-frequency cepstrum front end followed by a Hidden Markov Model (HMM) based classifier. We use a biologically realistic model of the human cochlea (Lyon's passive cochlea) to preprocess the speech, which is then fed as input to either an LSM or an ESN. With this novel setup, we attain surprisingly good results and even outperform state-of-the-art HMM-based recognizers, using rather small reservoirs that require little or no parameter tuning and appear to be quite robust to noise.
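As an illustration of the reservoir-computing pipeline described above, the following is a minimal sketch of an echo state network with a linear readout trained by ridge regression (one way to obtain the "guaranteed optimal solution" for a squared-error metric). The reservoir size, spectral radius, input dimensionality, and ridge parameter are illustrative assumptions, not the paper's actual settings, and random noise stands in for the cochlea-preprocessed speech.

```python
# Minimal ESN sketch (illustrative assumptions; not the paper's exact setup).
# Fixed random reservoir with tanh units, rescaled to spectral radius < 1,
# plus a ridge-regression readout trained on the collected reservoir states.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_RES, N_OUT = 13, 100, 10  # assumed: 13 input channels, 10 digit classes

# Untrained input and reservoir weight matrices.
W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))
W = rng.uniform(-0.5, 0.5, (N_RES, N_RES))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

def run_reservoir(u_seq):
    """Drive the reservoir with an input sequence; return all states."""
    x = np.zeros(N_RES)
    states = []
    for u in u_seq:
        x = np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

def train_readout(X, Y, ridge=1e-6):
    """Linear readout via ridge regression: the fast, closed-form step.
    X: (T, N_RES) reservoir states; Y: (T, N_OUT) one-hot targets."""
    return np.linalg.solve(X.T @ X + ridge * np.eye(N_RES), X.T @ Y)

# Toy usage: random input stands in for cochlea-preprocessed speech.
u_seq = rng.standard_normal((200, N_IN))
X = run_reservoir(u_seq)
Y = np.eye(N_OUT)[rng.integers(0, N_OUT, 200)]  # fake one-hot labels
W_out = train_readout(X, Y)
pred = np.argmax(X @ W_out, axis=1)             # per-timestep class guesses
```

Only `W_out` is learned; the reservoir weights stay fixed, which is what makes training fast compared with fitting a full recurrent network.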