How do you make a bpe file for Tokenizer

ruby_coder · March 12, 2023, 4:22am

Wouldn’t it be easier to call the Python tokenizer from C#?

That is how I do it from Ruby, and it works fine for me. I use it for many tasks, including (1) counting tokens in text and (2) creating logit_bias params.
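For illustration, here is a minimal sketch of what the Python side could look like. It is not my actual script. It assumes the tiktoken package is installed and that cl100k_base is the right encoding for your model. The function names (count_tokens, build_logit_bias, main) and the JSON output shape are made up for this example.

```python
import json
import sys


def count_tokens(text, encode):
    """Return the number of tokens that `encode` produces for `text`."""
    return len(encode(text))


def build_logit_bias(phrases, encode, bias=-100):
    """Map every token ID found in `phrases` to the same bias value.

    The logit_bias parameter takes token IDs as keys and a bias
    between -100 and 100 as values.
    """
    if not -100 <= bias <= 100:
        raise ValueError("bias must be between -100 and 100")
    return {str(tok): bias for phrase in phrases for tok in encode(phrase)}


def main(argv):
    # Imported lazily so the helpers above work with any encoder callable.
    import tiktoken

    # Assumption: cl100k_base; pick the encoding that matches your model.
    encode = tiktoken.get_encoding("cl100k_base").encode

    text = sys.stdin.read()  # text to count arrives on standard input
    result = {
        "token_count": count_tokens(text, encode),
        # Phrases to bias are passed as command-line arguments.
        "logit_bias": build_logit_bias(argv[1:], encode),
    }
    # One JSON object on stdout is easy for the calling process to parse.
    print(json.dumps(result))
```

To use it as a script, call main(sys.argv) under the usual `if __name__ == "__main__":` guard. The calling program (Ruby, C#, or anything else that can start a process) writes the text to the script's standard input and reads the JSON line back from standard output.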

Here is a random tutorial demonstrating how to call a Python script from C#. There are many other tutorials on the net on this topic:

HTH

Note that you can also run Python directly inside a C# program using IronPython (I have not tested this, as it has been a few years since I wrote C# code):

https://ironpython.net/
