Hugging Face Releases SmolVLA: A Compact Vision-Language-Action Model for Affordable, Efficient Robotics
Although robotic control has recently progressed through large-scale vision-language-action (VLA) models, real-world deployment is still limited by hardware and data requirements. Most VLA models depend on transformer-based backbones with billions of parameters,...