RL Environments
```
Position: RL Environments Engineer
Type: contractor, full-time (40h/week), flexible hours with ≥4h PST overlap
Location: remote, anywhere
Salary: $80–$200/hour or $14,000–$32,000/month
Job description: see the sections below
Best way to contact: @smartdev1227
```

Anything else you want to add:

About us: Preference Model builds next-gen RL environments that bring real-world use cases into distribution for AI models. The founding team previously worked on Anthropic’s data infrastructure, tokenizers, and datasets behind Claude, and we partner with leading AI labs.

```
Responsibilities:
- Design and build MLE/SWE environments and diverse, well-specified tasks.
- Target specific language models and meet defined difficulty distributions.
- Deliver ~1 task every 3–5 hours once onboarded.
- Implement fast edits within 24 hours based on feedback.
- Work independently with minimal supervision.
```

```
Requirements:
- Heavy LLM user with strong Python skills.
- Experience designing RL/evaluation environments or tasks.
- Advanced English (C1/C2).
- Ability to meet throughput (1 task every 3–5 hours) and turn edits around within 24 hours.
- ≥4 hours of overlap with PST business hours.
```

```
Working conditions:
- Remote independent contractor engagement.
- Full-time, 40 hours/week, with ≥4h PST overlap.
- Deliverables-driven; begin shipping on day one.
- Potential path to FTE and Bay Area relocation if there’s mutual fit.
```

Evaluation: A take-home assignment is required and will be compensated if an offer is made.

Send your CV via Telegram and we’ll take care of next steps.
