---
license: cc0-1.0
tags:
- art
- computer vision
- Image segmentation
---

# DeepLabV3+ ResNet50 for human body parts segmentation

This is a very simple ONNX model that can segment human body parts.

## Why this model

This model is an ONNX conversion of [keras-io/deeplabv3p-resnet50](https://huggingface.co/keras-io/deeplabv3p-resnet50), whose provided model can segment human body parts. All the other models that I found were trained on city segmentation.

The original model was built for an old version of Keras and cannot be used with recent versions of TensorFlow, so I translated it to the ONNX format.

## Usage

Get the `deeplabv3p-resnet50-human.onnx` file and use it with the ONNXRuntime package.

The result of `model.run` is a `(1, 1, 512, 512, 20)` tensor:

- 1: number of outputs (you can squeeze it)
- 1: batch size (you can squeeze it)
- 512, 512: the size of the image (fixed)
- 20: number of classes, so you can take the `argmax` of the tensor to get the class of each pixel

```python
import sys

import numpy as np
import onnxruntime
from PIL import Image

model = onnxruntime.InferenceSession("deeplabv3p-resnet50-human.onnx")

# load the image, force 3 channels and resize to the fixed input size
img = Image.open(sys.argv[1] if len(sys.argv) > 1 else "image.jpg")
img = img.convert("RGB").resize((512, 512))
# scale the pixels to [-1, 1]
img = np.array(img).astype(np.float32) / 127.5 - 1
# add the batch dimension (the output has one, so the input is assumed to need it)
img = np.expand_dims(img, axis=0)

# infer
input_name = model.get_inputs()[0].name
output_name = model.get_outputs()[0].name
result = model.run([output_name], {input_name: img})

# model.run returns a list with a single output, take it
result = np.array(result[0])
# argmax the classes, remove the batch size
result = result.argmax(axis=3).squeeze(0)

# get the masks
for i in range(20):
    detected = result == i  # boolean (512, 512) array of the pixels of class i
    mask = np.zeros(detected.shape, dtype=np.uint8)
    mask[detected] = 255
    Image.fromarray(mask).show()  # or save, or return the mask...
```

## Classes index

This is the list of classes that the model can detect (some classes are not specifically identified, see below):

- 0: "background"
- 1: "unknown"
- 2: "hair"
- 3: "unknown"
- 4: "glasses"
- 5: "top-clothes"
- 6: "unknown"
- 7: "unknown"
- 8: "unknown"
- 9: "bottom-clothes"
- 10: "torso-skin"
- 11: "unknown"
- 12: "unknown"
- 13: "face"
- 14: "left-arm"
- 15: "right-arm"
- 16: "left-leg"
- 17: "right-leg"
- 18: "left-foot"
- 19: "right-foot"

## Known limitations

- The model can fail on portrait images, because it was trained on "full body" images.
- I don't know what some of the classes are. I can't find the list of classes (help!).
- The model is not perfect and can fail on some images. I'm not the author of the model, so I can't fix it.

## License

The [original model card](https://huggingface.co/keras-io/deeplabv3p-resnet50/blob/main/README.md) proposes the "CC0-1.0" license. I don't know if it's the right license for the model, but I kept it.

> Anyway, thanks to the authors of the model for sharing it and for leaving it open to use.

This means that you may use the model, share, modify, and distribute it without any restriction.
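
## Example: named masks from the class index

The class index listed earlier can be turned into a small lookup table, so that masks are addressed by name rather than by number. This is only a sketch: `CLASS_NAMES`, `named_masks` and `colorize` are hypothetical helpers that are not shipped with the model, the palette is arbitrary, and the `class_map` below is synthetic data standing in for the `(512, 512)` array obtained after `argmax` and `squeeze`.

```python
import numpy as np

# Names copied from the "Classes index" list; "unknown" marks the
# indexes whose meaning is not documented.
CLASS_NAMES = [
    "background", "unknown", "hair", "unknown", "glasses",
    "top-clothes", "unknown", "unknown", "unknown", "bottom-clothes",
    "torso-skin", "unknown", "unknown", "face", "left-arm",
    "right-arm", "left-leg", "right-leg", "left-foot", "right-foot",
]


def named_masks(class_map):
    """Return {name: boolean mask} for each identified class present in class_map."""
    masks = {}
    for index, name in enumerate(CLASS_NAMES):
        if name == "unknown":
            continue  # skip the undocumented classes
        mask = class_map == index
        if mask.any():
            masks[name] = mask
    return masks


def colorize(class_map):
    """Turn a (H, W) class map into a (H, W, 3) uint8 image with an arbitrary palette."""
    palette = np.array(
        [[(i * 37) % 256, (i * 91) % 256, (i * 173) % 256]
         for i in range(len(CLASS_NAMES))],
        dtype=np.uint8,
    )  # index 0 (background) maps to black
    return palette[class_map]


# Synthetic class map standing in for the real model output.
class_map = np.zeros((512, 512), dtype=np.int64)
class_map[100:200, 200:300] = 13  # pretend a face was detected here
class_map[0:50, 0:50] = 2         # pretend some hair was detected here

masks = named_masks(class_map)
print(sorted(masks))
print(colorize(class_map).shape)
```

With the real model, `class_map` would be the `result` array from the usage example, and the output of `colorize` can be passed to `Image.fromarray` to preview all classes at once.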