I would like to define a differentiable function that converts binary number representations of bits into one-hot vectors. This can be accomplished by using fuzzy logic operators to convert into using definitions of product t-norms and t-conorms and the fact that
Why did I ask this? My previous post explored a generalized version of the cross-entropy minimization assumption from my recent paper. This function could be used to make that assumption about the input IDs to a language model while keeping the dimension of small.
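As a rough illustration of the kind of function asked about above, here is a minimal sketch in Python. It rests on assumptions that are not spelled out in the text: fuzzy AND is taken to be the product t-norm (plain multiplication), fuzzy NOT is taken to be `1 - b`, and the bits are ordered most significant first. The name `soft_binary_to_one_hot` is hypothetical. Each output component is the fuzzy conjunction of "bit k matches the k-th binary digit of this index", which makes the output a polynomial in the inputs and therefore differentiable. The t-conorm is not needed in this particular sketch, so it may differ from the construction intended in the question.

```python
from itertools import product


def soft_binary_to_one_hot(bits):
    """Map n soft bits in [0, 1] to a vector of length 2**n.

    Assumption: bits are ordered most significant first, so output
    index i corresponds to the integer whose binary digits are `bits`.
    """
    out = []
    # product((0, 1), repeat=n) enumerates bit patterns in increasing
    # integer order: (0, 0), (0, 1), (1, 0), (1, 1) for n = 2.
    for pattern in product((0, 1), repeat=len(bits)):
        weight = 1.0
        for b, target in zip(bits, pattern):
            # Fuzzy NOT is 1 - b; product t-norm AND is multiplication.
            weight *= b if target else 1.0 - b
        out.append(weight)
    return out


# Hard bits "10" (the integer 2) select index 2 of a length-4 vector.
hard = soft_binary_to_one_hot([1.0, 0.0])

# Soft bits give a distribution over indices whose entries sum to 1.
soft = soft_binary_to_one_hot([0.5, 0.25])
```

With hard 0/1 inputs the result is an exact one-hot vector. With soft inputs the entries stay non-negative and sum to 1, because each factor pair `b` and `1 - b` sums to 1. The input needs only n numbers while the output has 2**n entries.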