Model Cascading: Optimize AI Performance on Embedded Devices
In our interview with Sergi Mansilla from Edge Impulse at embedded world 2025, we got an update on machine learning at the edge. As we’ve seen at every trade fair, the technology is maturing rapidly. Through model cascading, first-stage ML algorithms perform pre-filtering before more power-hungry models are executed. We also discussed what Qualcomm’s acquisition of Edge Impulse means for the community.
A lot’s going on in the world of edge AI, with embedded processors becoming more powerful and cloud algorithms being adapted for execution on them. After speaking with Sergi Mansilla of Edge Impulse, it is clear that the industry is leveraging these changes to the benefit of embedded systems.
By cascading machine learning models, low-power algorithms can make a preliminary assessment of incoming data. Should core attributes be present, additional, more power-hungry processing can be undertaken by a second algorithm, such as a large language model. Vision language models (VLMs) are also increasingly making their way into edge systems, allowing natural language algorithms to detect objects or describe a scene.
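To illustrate the idea, here is a minimal sketch of a two-stage cascade in Python. The model classes, names, and threshold are hypothetical stand-ins, not Edge Impulse or Qualcomm APIs: a cheap first-stage classifier screens every frame, and the expensive second-stage model runs only when the first stage flags something worth a closer look.

```python
# Model cascading sketch: a low-power first-stage model gates access to a
# heavier second-stage model. All classes below are illustrative placeholders.

import random
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float


class TinyMotionClassifier:
    """First stage: a small, always-on classifier (e.g., motion / no motion)."""

    def predict(self, frame) -> float:
        # Placeholder: return a pseudo-confidence that the frame is interesting.
        return random.random()


class VisionLanguageModel:
    """Second stage: a power-hungry model, invoked only when the gate opens."""

    def describe(self, frame) -> Detection:
        # Placeholder: a real VLM would return a scene description here.
        return Detection(label="person near doorway", confidence=0.91)


def cascade(frames, gate_threshold: float = 0.8):
    """Run the cheap model on every frame; escalate only above the threshold."""
    first_stage = TinyMotionClassifier()
    second_stage = VisionLanguageModel()
    for frame in frames:
        score = first_stage.predict(frame)
        if score < gate_threshold:
            continue                          # stay on the low-power path
        yield second_stage.describe(frame)    # pay the heavy cost only when justified


if __name__ == "__main__":
    dummy_frames = [object() for _ in range(100)]
    for detection in cascade(dummy_frames):
        print(f"{detection.label} ({detection.confidence:.2f})")
```

The gating threshold is the key tuning knob in such a design: set it too low and the second-stage model runs constantly, erasing the power savings; set it too high and genuine events are missed before the heavier model ever sees them.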
We also discussed the benefits of Qualcomm’s acquisition of Edge Impulse to the community and the access to new hardware it offers developers.