A large number of today’s botnets leverage the HTTP protocol to communicate with their botmasters or perpetrate malicious activities. In this paper, we present a new scalable system for network-level behavioral clustering of HTTP-based malware that aims to eﬃciently group newly collected malware samples into malware family clusters. The end goal is to obtain malware clusters that can aid the automatic generation of high quality network signatures, which can in turn be used to detect botnet command-and-control (C&C) and other malware-generated communications at the network perimeter. We achieve scalability in our clustering system by simplifying the multi-step clustering process proposed in , and by leveraging incremental clustering algorithms that run eﬃciently on very large datasets. At the same time, we show that scalability is achieved while retaining a good trade-oﬀ between detection rate and false positives for the signatures derived from the obtained malware clusters. We implemented a proof-of-concept version of our new scalable malware clustering system and performed experiments with about 65,000 distinct malware samples. Results from our evaluation conﬁrm the eﬀectiveness of the proposed system and show that, compared to , our approach can reduce processing times from several hours to a few minutes, and scales well to large datasets containing tens of thousands of distinct malware samples.