A Transformer model to auto insert Vietnamese accent marks

Fine-tuning XLM-RoBERTa to automatically insert Vietnamese accent marks (diacritics)
vietnamese accent marks
finetuned xlm-roberta
Author

Hung Hoang

Published

September 3, 2024

I completed this project quite some time ago but had not published the model until now. I'm glad to say it is now available on the HuggingFace hub here.

The model was fine-tuned from XLM-RoBERTa (multilingual RoBERTa), a Transformer encoder, for the task of inserting Vietnamese accent marks.

Accent insertion is modelled as token classification: each input token is assigned a label, and that label corresponds to the transformation needed to turn the unaccented token into its accented form.
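To make the idea concrete, here is a minimal sketch of what a "label as transformation" scheme could look like. It is an illustration only, not the model's actual label set: the `index:char` label format, the `KEEP` label, and the helper names `strip_accents`, `make_label` and `apply_label` are all assumptions made for this example. It also treats whitespace-separated words as tokens, whereas XLM-RoBERTa works on subword tokens.

```python
import unicodedata

KEEP = "KEEP"  # hypothetical label meaning "this token needs no accents"


def strip_accents(text: str) -> str:
    """Remove Vietnamese diacritics; đ/Đ have no Unicode decomposition, so map them by hand."""
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_label(plain: str, accented: str) -> str:
    """Encode the edits that turn `plain` into `accented` as 'index:char' pairs (assumed format)."""
    accented = unicodedata.normalize("NFC", accented)
    edits = [f"{i}:{a}" for i, (p, a) in enumerate(zip(plain, accented)) if p != a]
    return ",".join(edits) if edits else KEEP


def apply_label(plain: str, label: str) -> str:
    """Apply a label produced by `make_label` to an unaccented token."""
    if label == KEEP:
        return plain
    chars = list(plain)
    for edit in label.split(","):
        index, char = edit.split(":")
        chars[int(index)] = char
    return "".join(chars)


# Build training-style (token, label) pairs from an accented sentence.
sentence = "Tôi yêu tiếng Việt"
plain_tokens = strip_accents(sentence).split()
labels = [make_label(p, a) for p, a in zip(plain_tokens, sentence.split())]

# At inference time a classifier would predict one label per token;
# here the gold labels are reused to show the round trip.
restored = " ".join(apply_label(p, l) for p, l in zip(plain_tokens, labels))
print(plain_tokens, labels, restored)
```

With a scheme like this, the classifier only has to pick one label per token from a fixed set rather than generate text, which is why an encoder-only model is a natural fit for the task.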

The HF model page linked to above also contains detailed instructions on how to use the model from input to output.