Image I/O and manipulation

Working with images is a common task in computer graphics. SlangPy provides the Bitmap class for common image-related tasks, including:

  • Reading/writing images from/to files in various file formats (PNG, JPEG, EXR and more).

  • Converting images between different formats.

  • Accessing/manipulating the raw pixel data.

To start using the Bitmap class, we first import slangpy:

[34]:
import slangpy as spy

Reading images from disk

Bitmap supports reading and writing images in most common file formats, including PNG, JPEG and EXR.

Reading an image from disk is as simple as:

[35]:
bmp = spy.Bitmap("vangogh.jpg")

Using the string representation of the Bitmap we can get more information about the image:

[36]:
print(bmp)
Bitmap(
  pixel_format = rgb,
  component_type = uint8,
  width = 500,
  height = 400,
  srgb_gamma = true,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="R", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=0),
      Field(name="G", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=1),
      Field(name="B", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=2),
    ],
    size = 3,
    alignment = 1
  ),
  data = 585.94 kB
)

We can see the following information:

  • pixel_format: Shows the format of the pixel data such as rgb, rgba, multi_channel and others (see Bitmap.PixelFormat).

  • component_type: Shows the type of each pixel component such as uint8, uint16, float32 and others (see Bitmap.ComponentType).

  • width and height: Shows the dimensions of the image in pixels.

  • srgb_gamma: Shows if the image data is interpreted as sRGB gamma corrected.

  • pixel_struct: Shows the raw structure of the underlying pixel data (see DataStruct).

  • data: Shows the size of the raw image data.
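
The reported data size follows from the other fields: width times height times the size of the pixel struct. The sketch below reproduces that arithmetic in plain Python. The helper names are hypothetical and not part of SlangPy, and the 1024-based units are an assumption inferred from the printed values.

```python
def raw_size_bytes(width: int, height: int, pixel_size: int) -> int:
    """Bytes needed for tightly packed pixel data (no row padding)."""
    return width * height * pixel_size


def format_size(num_bytes: int) -> str:
    """Format a byte count using 1024-based units (assumed, see lead-in)."""
    units = ["B", "kB", "MB", "GB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# 500 x 400 pixels, 3 bytes per pixel (three uint8 components).
size = raw_size_bytes(500, 400, 3)
print(size, format_size(size))  # 600000 585.94 kB
```

The same calculation with a pixel size of 12 bytes (three float32 components) gives 2.29 MB.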

Images loaded from JPEG or PNG files are by default treated as sRGB gamma corrected. Other formats are treated as linear.

To load multiple images in parallel, you can use Bitmap.read_multiple().

For convenience, if a Bitmap is used as a cell output in Jupyter notebooks, it will be displayed as an image:

[37]:
bmp
[37]:

Note: Images are encoded as PNG files when displayed in Jupyter notebooks. This means that HDR images are first converted to LDR before being displayed.

Converting images

Images can be converted into different formats using the Bitmap.convert() method. For example, to convert the loaded image into a floating point image in linear space, we can do:

[38]:
bmp_float = bmp.convert(component_type=spy.Bitmap.ComponentType.float32, srgb_gamma=False)
print(bmp_float)
Bitmap(
  pixel_format = rgb,
  component_type = float32,
  width = 500,
  height = 400,
  srgb_gamma = false,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="R", type=float32, flags=none, size=4, offset=0),
      Field(name="G", type=float32, flags=none, size=4, offset=4),
      Field(name="B", type=float32, flags=none, size=4, offset=8),
    ],
    size = 12,
    alignment = 1
  ),
  data = 2.29 MB
)
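
To illustrate what converting from sRGB-encoded uint8 to linear float32 involves, here is a minimal NumPy sketch of the standard sRGB decoding function. It is an illustration only: the function name is hypothetical, and it is an assumption that Bitmap.convert() applies an equivalent transfer function internally.

```python
import numpy as np


def srgb_to_linear(encoded: np.ndarray) -> np.ndarray:
    """Decode sRGB-encoded values in [0, 1] to linear values in [0, 1]."""
    encoded = np.asarray(encoded, dtype=np.float32)
    # Piecewise sRGB transfer function: linear segment near black,
    # power curve everywhere else.
    low = encoded / 12.92
    high = ((encoded + 0.055) / 1.055) ** 2.4
    return np.where(encoded <= 0.04045, low, high).astype(np.float32)


# Normalize uint8 components to [0, 1] before decoding.
pixels_u8 = np.array([0, 128, 255], dtype=np.uint8)
linear = srgb_to_linear(pixels_u8.astype(np.float32) / 255.0)
print(linear)
```

Note that mid-gray (128) decodes to roughly 0.216 rather than 0.5, which is why the srgb_gamma flag matters when doing arithmetic on pixel values.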

As another example, we can convert the image to RGBA (defaulting alpha to fully opaque), preserving the component type and sRGB gamma:

[39]:
bmp_rgba = bmp.convert(pixel_format=spy.Bitmap.PixelFormat.rgba)
print(bmp_rgba)
Bitmap(
  pixel_format = rgba,
  component_type = uint8,
  width = 500,
  height = 400,
  srgb_gamma = true,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="R", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=0),
      Field(name="G", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=1),
      Field(name="B", type=uint8, flags=(normalized | srgb_gamma), size=1, offset=2),
      Field(name="A", type=uint8, flags=normalized, size=1, offset=3),
    ],
    size = 4,
    alignment = 1
  ),
  data = 781.25 kB
)

As a final example, let’s convert the image to a linear 16-bit luminance image:

[40]:
bmp_y = bmp.convert(
    pixel_format=spy.Bitmap.PixelFormat.y,
    component_type=spy.Bitmap.ComponentType.uint16,
    srgb_gamma=False,
)
print(bmp_y)
bmp_y
Bitmap(
  pixel_format = y,
  component_type = uint16,
  width = 500,
  height = 400,
  srgb_gamma = false,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="Y", type=uint16, flags=normalized, size=2, offset=0),
    ],
    size = 2,
    alignment = 1
  ),
  data = 390.62 kB
)
[40]:
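
For intuition, a luminance conversion of this kind can be sketched in NumPy as a weighted sum of linear RGB followed by quantization to 16 bits. The Rec. 709 weights used here are an assumption, since the tutorial does not state which weights Bitmap.convert() uses, and the function name is hypothetical.

```python
import numpy as np

# Rec. 709 luminance weights (assumed, see lead-in).
REC709_WEIGHTS = np.array([0.2126, 0.7152, 0.0722], dtype=np.float32)


def linear_rgb_to_y16(linear_rgb: np.ndarray) -> np.ndarray:
    """Convert linear RGB in [0, 1], shape (H, W, 3), to uint16 luminance."""
    y = linear_rgb.astype(np.float32) @ REC709_WEIGHTS
    y = np.clip(y, 0.0, 1.0)
    # Normalized uint16: 0.0 maps to 0 and 1.0 maps to 65535.
    return np.round(y * 65535.0).astype(np.uint16)


# One row of three pixels: black, white, pure green.
rgb = np.array([[[0, 0, 0], [1, 1, 1], [0, 1, 0]]], dtype=np.float32)
y16 = linear_rgb_to_y16(rgb)
print(y16)
```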

Array Interface

The Bitmap class implements the NumPy array interface, which allows for seamless integration with other Python libraries.

For example, we can display an image using matplotlib:

[41]:
import matplotlib.pyplot as plt

plt.imshow(bmp)
plt.axis("off")

[41]:
(np.float64(-0.5), np.float64(499.5), np.float64(399.5), np.float64(-0.5))
../../_images/src_tutorials_image_io_and_manipulation_16_1.png

Bitmaps can easily be converted to NumPy arrays:

[42]:
import numpy as np

bmp_np = np.array(bmp)
type(bmp_np), bmp_np.shape, bmp_np.dtype
[42]:
(numpy.ndarray, (400, 500, 3), dtype('uint8'))

Note that the above conversion will create a copy of the pixel data. You can set copy=False to avoid the copy and reference the underlying data directly:

[43]:
bmp_np = np.array(bmp, copy=False)

This allows manipulating the image data using NumPy operations:

[44]:
bmp_np[75:125,75:425,:] = [255,0,0]
bmp_np[175:225,75:425,:] = [0,255,0]
bmp_np[275:325,75:425,:] = [0,0,255]
bmp
[44]:
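
The difference between a copy and a view can be checked with plain NumPy. The sketch below uses an ordinary ndarray as a stand-in for the bitmap's pixel storage (an assumption made so that it runs without SlangPy) and shows that only writes through the view reach the original data.

```python
import numpy as np

# Stand-in for the bitmap's pixel storage (400 x 500 RGB, uint8).
pixels = np.zeros((400, 500, 3), dtype=np.uint8)

copied = np.array(pixels)   # independent copy of the data
view = np.asarray(pixels)   # references the same memory, no copy

# Writing through the view changes the original storage ...
view[75:125, 75:425, :] = [255, 0, 0]

print(np.shares_memory(view, pixels))    # True
print(np.shares_memory(copied, pixels))  # False
# ... while the copy still holds the old values.
print(pixels[100, 100], copied[100, 100])
```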

Of course, bitmaps can also be created directly from NumPy arrays:

[45]:
bmp_np = np.tile(np.linspace(0, 1, 100, dtype=np.float32), (100, 1))
bmp_np[25:75,25:75] = 0.5
bmp = spy.Bitmap(bmp_np)

plt.imshow(bmp)
plt.axis("off")
[45]:
(np.float64(-0.5), np.float64(99.5), np.float64(99.5), np.float64(-0.5))
../../_images/src_tutorials_image_io_and_manipulation_24_1.png

Multi-channel images

The Bitmap class supports multi-channel images. This is most useful for working with EXR files, which can contain an arbitrary number of channels.

To create a multi-channel bitmap, we have to pass a set of channel names.

[46]:
bmp_np = np.zeros((256, 256, 7), dtype=np.float32)
bmp_multi = spy.Bitmap(
    bmp_np,
    channel_names=[
        "color.R",
        "color.G",
        "color.B",
        "normal.R",
        "normal.G",
        "normal.B",
        "depth.T",
    ],
)
bmp_multi
[46]:
Bitmap(
  pixel_format = multi_channel,
  component_type = float32,
  width = 256,
  height = 256,
  srgb_gamma = false,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="color.R", type=float32, flags=none, size=4, offset=0),
      Field(name="color.G", type=float32, flags=none, size=4, offset=4),
      Field(name="color.B", type=float32, flags=none, size=4, offset=8),
      Field(name="normal.R", type=float32, flags=none, size=4, offset=12),
      Field(name="normal.G", type=float32, flags=none, size=4, offset=16),
      Field(name="normal.B", type=float32, flags=none, size=4, offset=20),
      Field(name="depth.T", type=float32, flags=none, size=4, offset=24),
    ],
    size = 28,
    alignment = 1
  ),
  data = 1.75 MB
)
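
In practice the channels usually come from separate arrays. The following NumPy sketch shows one way to stack them and build a matching list of channel names. The `layers` dictionary and its contents are hypothetical placeholders.

```python
import numpy as np

height, width = 256, 256

# Hypothetical per-layer data: prefix -> (channel suffixes, array).
layers = {
    "color": ("RGB", np.zeros((height, width, 3), dtype=np.float32)),
    "normal": ("RGB", np.zeros((height, width, 3), dtype=np.float32)),
    "depth": ("T", np.ones((height, width, 1), dtype=np.float32)),
}

channel_names = []
planes = []
for prefix, (suffixes, data) in layers.items():
    # Each layer must provide one plane per channel suffix.
    assert data.shape[2] == len(suffixes)
    channel_names += [f"{prefix}.{suffix}" for suffix in suffixes]
    planes.append(data)

# Stack all layers along the channel axis.
pixels = np.concatenate(planes, axis=2)
print(pixels.shape, channel_names)
```

The resulting `pixels` and `channel_names` have the same shape and naming as the arguments passed to spy.Bitmap in the cell above.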

We can split the multi-channel bitmap into its individual parts using the Bitmap.split() method. It groups the channels by their common prefix and tries to infer the correct pixel format for each group. The parts are returned as a list of tuples, each containing the prefix name and the corresponding bitmap.

[47]:
parts = bmp_multi.split()
print([(part[0], part[1].pixel_format) for part in parts])
[('color', PixelFormat.rgb), ('depth', PixelFormat.multi_channel), ('normal', PixelFormat.rgb)]
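
The grouping step can be sketched in plain Python. This is only an illustration of grouping by prefix, not SlangPy's implementation. The function name is hypothetical, and placing names without a dot under an empty prefix is an assumption.

```python
from collections import defaultdict


def group_channels(channel_names):
    """Group channel names by the part before the last dot.

    Returns a dict mapping prefix -> list of (suffix, channel index),
    sorted by prefix.
    """
    groups = defaultdict(list)
    for index, name in enumerate(channel_names):
        # "color.R" -> ("color", ".", "R"); "R" -> ("", "", "R")
        prefix, _, suffix = name.rpartition(".")
        groups[prefix].append((suffix, index))
    return dict(sorted(groups.items()))


names = ["color.R", "color.G", "color.B",
         "normal.R", "normal.G", "normal.B", "depth.T"]
groups = group_channels(names)
print(list(groups))      # ['color', 'depth', 'normal']
print(groups["depth"])   # [('T', 6)]
```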

The following example loads a multi-channel EXR image and displays some of its parts side-by-side:

[48]:
bmp_multi = spy.Bitmap("multi_channel.exr")
print(bmp_multi)

parts = dict(bmp_multi.split())

fig, axs = plt.subplots(1, 3, figsize=(12, 4))
axs[0].imshow(parts["image"])
axs[0].axis("off")
axs[0].set_title("image")
axs[1].imshow(parts["sh_normal"])
axs[1].axis("off")
axs[1].set_title("sh_normal")
axs[2].imshow(parts["depth"])
axs[2].axis("off")
axs[2].set_title("depth")
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..18.773415].
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [-1.0..1.0].
Bitmap(
  pixel_format = multi_channel,
  component_type = float32,
  width = 256,
  height = 256,
  srgb_gamma = false,
  pixel_struct = Struct(
    pack = true,
    byte_order = little_endian,
    fields = [
      Field(name="R", type=float32, flags=none, size=4, offset=0),
      Field(name="G", type=float32, flags=none, size=4, offset=4),
      Field(name="B", type=float32, flags=none, size=4, offset=8),
      Field(name="depth.T", type=float32, flags=none, size=4, offset=12),
      Field(name="image.R", type=float32, flags=none, size=4, offset=16),
      Field(name="image.G", type=float32, flags=none, size=4, offset=20),
      Field(name="image.B", type=float32, flags=none, size=4, offset=24),
      Field(name="image.A", type=float32, flags=none, size=4, offset=28),
      Field(name="position.X", type=float32, flags=none, size=4, offset=32),
      Field(name="position.Y", type=float32, flags=none, size=4, offset=36),
      Field(name="position.Z", type=float32, flags=none, size=4, offset=40),
      Field(name="sh_normal.X", type=float32, flags=none, size=4, offset=44),
      Field(name="sh_normal.Y", type=float32, flags=none, size=4, offset=48),
      Field(name="sh_normal.Z", type=float32, flags=none, size=4, offset=52),
    ],
    size = 56,
    alignment = 1
  ),
  data = 3.50 MB
)
[48]:
Text(0.5, 1.0, 'depth')
../../_images/src_tutorials_image_io_and_manipulation_30_3.png
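
The clipping warnings above appear because the HDR image exceeds 1.0 and the normals span [-1, 1]. A minimal NumPy sketch for remapping such data into [0, 1] before display follows. The helper names are hypothetical, and the x / (1 + x) curve is just one simple tone-mapping choice, not something the tutorial prescribes.

```python
import numpy as np


def normals_to_display(normals: np.ndarray) -> np.ndarray:
    """Map normal components from [-1, 1] to [0, 1]."""
    return np.clip(normals * 0.5 + 0.5, 0.0, 1.0)


def hdr_to_display(hdr: np.ndarray) -> np.ndarray:
    """Compress non-negative HDR values into [0, 1) with x / (1 + x)."""
    hdr = np.maximum(hdr, 0.0)
    return hdr / (1.0 + hdr)


# Sample values taken from the ranges reported in the warnings.
normals = np.array([[-1.0, 0.0, 1.0]], dtype=np.float32)
hdr = np.array([[0.0, 1.0, 18.773415]], dtype=np.float32)

print(normals_to_display(normals))
print(hdr_to_display(hdr))
```

The helpers accept any float array, for example np.array(parts["sh_normal"]), and the result can be passed to imshow without triggering the warning.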

Writing images to disk

Writing images to disk is as simple as reading them. For example, to load a JPEG and save it as a PNG, we can simply do:

[49]:
bmp = spy.Bitmap("vangogh.jpg")
bmp.write("vangogh.png")

Note that different file formats support different pixel formats and component types. For example, JPEG only supports rgb and rgba pixel formats with uint8 component type. You have to convert the image to the correct format before saving it.

Bitmaps can also be written asynchronously using the Bitmap.write_async() method.

See also