How do you make a BPE file for Tokenizer?

Wouldn’t it be easier to call the Python tokenizer from C#?

This is how I did it in Ruby, and it works fine for me. I use it for many tasks, including (1) counting tokens in text and (2) creating logit_bias params.
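For anyone who wants to try this route, here is a rough sketch of what the Python side could look like. It is not my actual script: the function names are made up for illustration, and it assumes the tiktoken package is installed and that the "cl100k_base" encoding matches your model. It reads text and writes JSON, so any caller (Ruby, C#, etc.) can parse the result.

```python
import json
import sys


def count_tokens(encode, text):
    """Return how many tokens `encode` produces for `text`."""
    return len(encode(text))


def build_logit_bias(encode, phrases, bias=-100):
    """Map every token ID found in `phrases` to `bias`.

    Keys are strings because JSON object keys must be strings.
    """
    return {str(tok): bias for phrase in phrases for tok in encode(phrase)}


def main(argv=None, stdin=None, stdout=None, encode=None):
    """Read text from stdin, write a JSON summary to stdout.

    Any command-line arguments are treated as phrases to bias.
    """
    argv = sys.argv[1:] if argv is None else argv
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    if encode is None:
        # Assumption: tiktoken is installed and "cl100k_base" suits your model.
        import tiktoken
        encode = tiktoken.get_encoding("cl100k_base").encode

    text = stdin.read()
    result = {
        "token_count": count_tokens(encode, text),
        "logit_bias": build_logit_bias(encode, argv),
    }
    json.dump(result, stdout)
```

Saved as a script with the usual `if __name__ == "__main__": main()` guard at the bottom, it could be started as a child process (in C#, via `System.Diagnostics.Process` with standard input and output redirected), with the text piped in and the JSON read back from its output.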

Here is a random tutorial demonstrating how to call a Python script from C#. There are many other tutorials on the net on the topic:

HTH

:slight_smile:

Note that you can also call Python directly from a C# program using IronPython (I have not tested this, as it has been a few years since I wrote C# code):

https://ironpython.net/
