Having looked through the information provided here in several places, it seems that only the 6 base models offered can be RLHF-tuned. Is it possible to bring our own SFT model, for example a custom task-specific SFT model, and use it for RLHF tuning? If so, where can I find the relevant resources?