Petals
A distributed inference system for running large language models collaboratively across multiple computers. Petals enables swarm hosting: many participants together host a single large model, with the model's layers split across their machines. This sharply reduces the hardware each participant needs, while output quality matches centralized inference, since the same model weights are used end to end. An interesting open-source, experimental alternative to traditional cloud-based inference.
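The layer-splitting idea can be sketched in plain Python. This is an illustrative partitioning scheme only, not Petals' actual routing or scheduling logic, and the node names and layer count are made up:

```python
def partition_layers(num_layers, nodes):
    """Assign contiguous blocks of layers to each node, as evenly as
    possible, so every participant hosts only a slice of the model."""
    base, extra = divmod(num_layers, len(nodes))
    assignment = {}
    start = 0
    for i, node in enumerate(nodes):
        # The first `extra` nodes take one additional layer each.
        count = base + (1 if i < extra else 0)
        assignment[node] = list(range(start, start + count))
        start += count
    return assignment

# E.g. an 80-layer model shared by three hypothetical participants:
plan = partition_layers(80, ["alice", "bob", "carol"])
for node, layers in plan.items():
    print(node, f"hosts layers {layers[0]}..{layers[-1]}")
```

During inference, activations would then flow from one participant's slice to the next until the final layer is reached.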