Brian D. is a senior machine learning (ML) engineer on our AI Inference team, which is part of the broader AI Engineering team at Red Hat. Based remotely in Chicago, Brian helps maintain LLM Compressor, a key component of vLLM (an open source inference server originally developed at UC Berkeley and now supported by a global community). vLLM is designed to make AI inference (in other words, responses from models) more efficient. Through LLM Compressor, Brian and his team make it possible to optimize and deploy LLMs so that they run faster, consume less energy, and operate on fewer GPUs, without
