标签 embeddings1 jieba1 linux2 llm2 nlp4 python1 semantic search1 softmax1 text generation1 tokenizer2 unicode1