
Tuning text models with RLHF - custom model vs base model offerings?

Attached screenshots:
1. Screenshot 2024-03-12 at 11.00.48 AM.png
2. Screenshot 2024-03-12 at 11.02.26 AM.png
3. Screenshot 2024-03-12 at 11.03.03 AM.png


Based on the information shown above, it seems that only the 6 base models offered can be tuned with RLHF. Is it possible to bring our own SFT model, for example a custom task-specific SFT model, and use it as the starting point for RLHF tuning? If so, where can I find the relevant resources?
