python - How to modify my code to handle RGBX (4-channel) images for semantic segmentation?

I'm new to this field and have been following a U-Net tutorial that uses 3-channel RGB images for semantic segmentation (https://www.youtube.com/watch?v=68HR_eyzk00&list=PLZsOBAyNTZwbR08R959iCvYT3qzhxvGOE&index=2&ab_channel=DigitalSreeni), and it worked fine for me. However, I now need to extend the pipeline to support 4-channel RGBX images (i.e., RGB plus another channel), but I'm not sure how to modify the code to accommodate the additional channel, especially the preprocessing and ImageDataGenerator parts (I think ImageDataGenerator doesn't support 4-channel images).

This is the code (after patchifying the image to (256 * 256 * 4) and the masks to (256*256)):

import os
import cv2
import numpy as np
import glob
from matplotlib import pyplot as plt
import tensorflow as tf
import splitfolders
import segmentation_models as sm
from tensorflow.keras.metrics import MeanIoU
from sklearn.preprocessing import MinMaxScaler
from keras.utils import to_categorical


input_folder='path folder to my images and masks '
output_folder='path to output folder'
#split with a ratio
splitfolders.ratio(input_folder, output=output_folder, seed=42, ratio=(.75,.25),group_prefix=None) 

#Rearrange the folder structure for keras augmentation


seed=24
batch_size=16 
n_classes=2 


scaler=MinMaxScaler()


BACKBONE='resnet34'  
preprocess_input=sm.get_preprocessing(BACKBONE)

def preprocess_data(img, mask, num_class):
    #Scale images
    img=scaler.fit_transform(img.reshape(-1, img.shape[-1])).reshape(img.shape)
    img=preprocess_input(img)  #Preprocess based on the pretrained backbone
    mask=to_categorical(mask, num_class)
    return (img,mask)

from tensorflow.keras.preprocessing.image import ImageDataGenerator
def trainGenerator(train_img_path, train_mask_path, num_class):
    img_data_gen_args=dict(horizontal_flip=True, vertical_flip=True, fill_mode='reflect') #Data augmentation
    
    image_datagen=ImageDataGenerator(**img_data_gen_args)
    mask_datagen=ImageDataGenerator(**img_data_gen_args)
    
    image_generator=image_datagen.flow_from_directory(train_img_path, class_mode=None, batch_size=batch_size, seed=seed)
    mask_generator=mask_datagen.flow_from_directory(train_mask_path, class_mode=None, color_mode='grayscale', batch_size=batch_size, seed=seed)
    
    train_generator=zip(image_generator, mask_generator)
    
    for (img, mask) in train_generator:
        img, mask= preprocess_data(img, mask, num_class)
        yield (img, mask)

train_img_path='path for training images'
train_mask_path='path for training masks'
train_img_gen=trainGenerator(train_img_path, train_mask_path, num_class=2)

val_img_path='path for validation images'
val_mask_path='path for validation masks'
val_img_gen=trainGenerator(val_img_path, val_mask_path, num_class=2)


x, y=train_img_gen.__next__()

for i in range(0,3):
    image=x[i]
    mask=np.argmax(y[i], axis=2)
    plt.subplot(1,2,1)
    plt.imshow(image)
    plt.subplot(1,2,2)
    plt.imshow(mask, cmap='gray')
    plt.show()


num_train_imgs=len(os.listdir('path for training images'))
num_val_images=len(os.listdir('path for validation image'))
steps_per_epochs=num_train_imgs//batch_size
val_steps_per_epoch=num_val_images//batch_size

IMG_HEIGHT=x.shape[1]
IMG_WIDTH=x.shape[2]
IMG_CHANNELS=x.shape[3]

n_classes=2

model=sm.Unet('resnet34', encoder_weights=None, input_shape=(IMG_HEIGHT,IMG_WIDTH,IMG_CHANNELS), classes=n_classes,activation='softmax')
model.compile('Adam', loss=sm.losses.binary_crossentropy, metrics=[sm.metrics.iou_score, sm.metrics.FScore()])

history=model.fit(train_img_gen, steps_per_epoch=steps_per_epochs, epochs=100, verbose=1, validation_data=val_img_gen, validation_steps=val_steps_per_epoch)

asked Nov 18, 2024 at 20:06 by FF123456, edited Nov 19, 2024 at 17:07

a change in the number of input channels? unless the preprocessing ahead of inference involves collapsing to grayscale, such an operation will require retraining the model or performing surgery on the model's input layer and weights. – Christoph Rackwitz Commented Nov 18, 2024 at 21:01
@ChristophRackwitz Thanks for your response! You're right that with the change to 4 channels (RGBX), I need to retrain the model. I’ve set encoder_weights=None because I want to train from scratch, given the additional channel. My main issue now is modifying the preprocessing and ImageDataGenerator to handle 4-channel images. The 4th channel is essential for my task, so I need to adjust the pipeline accordingly. – FF123456 Commented Nov 18, 2024 at 21:47
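
A minimal sketch of one way to do what this comment asks, keeping all 4 channels. It assumes the RGBX patches and integer masks are already loaded as NumPy arrays; the names `rgbx_generator`, `scale_per_channel`, and `one_hot` are hypothetical helpers, not part of Keras or segmentation_models. The scaling mirrors the per-channel MinMaxScaler step in the question, and the flips are applied identically to each image and its mask. Separately, `flow_from_directory` accepts `color_mode='rgba'`, which may already be enough if the patches are stored as 4-channel PNG files.

```python
import numpy as np


def scale_per_channel(batch):
    """Min-max scale each channel of a (N, H, W, C) batch to [0, 1]."""
    batch = batch.astype(np.float32)
    lo = batch.min(axis=(0, 1, 2), keepdims=True)
    hi = batch.max(axis=(0, 1, 2), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # flat channels map to 0 instead of NaN
    return (batch - lo) / span


def one_hot(masks, num_class):
    """Turn integer masks (N, H, W) into one-hot masks (N, H, W, num_class)."""
    return np.eye(num_class, dtype=np.float32)[masks.astype(np.int64)]


def rgbx_generator(images, masks, num_class, batch_size=16, seed=24):
    """Yield (image, mask) batches forever with paired random flips.

    images: (N, H, W, 4) array of RGBX patches (assumed to be in memory)
    masks:  (N, H, W) integer class labels in [0, num_class)
    """
    n = len(images)
    if n < batch_size:
        raise ValueError("need at least batch_size samples")
    rng = np.random.default_rng(seed)
    while True:
        order = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            img, msk = images[idx], masks[idx]  # fancy indexing returns copies
            flip_h = rng.random(batch_size) < 0.5
            flip_v = rng.random(batch_size) < 0.5
            # The same boolean selection is used for image and mask,
            # so each pair stays aligned after flipping.
            img[flip_h] = img[flip_h][:, :, ::-1]  # flip along width
            msk[flip_h] = msk[flip_h][:, :, ::-1]
            img[flip_v] = img[flip_v][:, ::-1]     # flip along height
            msk[flip_v] = msk[flip_v][:, ::-1]
            yield scale_per_channel(img), one_hot(msk, num_class)
```

A generator like this could be passed to `model.fit` in place of `train_img_gen`, with `steps_per_epoch=len(images)//batch_size`. If the masks are stored as 0/255 rather than 0/1, they would need to be mapped to class indices first.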


1 Answer


You could drop the 4th band of data, typically an alpha channel, while reading it in with OpenCV like this. With the default flag (cv2.IMREAD_COLOR), cv2.imread returns a 3-channel BGR array and discards any 4th channel.

import cv2

img = cv2.imread(filename)

If the workflow requires an image path instead of a NumPy object, then I might run a pre-processing workflow that copies 3-channel images to a new directory, train_img_path.
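
If the patches are already in memory as NumPy arrays (for example after patchifying), the same effect can be sketched with plain slicing. This is only an illustration under that assumption; `drop_fourth_band` is a hypothetical helper name. Note that this discards the extra channel, so it only fits cases where that channel is not needed. To keep it when reading with OpenCV, `cv2.imread(filename, cv2.IMREAD_UNCHANGED)` returns all stored channels.

```python
import numpy as np


def drop_fourth_band(patches):
    """Return only the first three channels of (..., H, W, C) arrays."""
    if patches.shape[-1] < 3:
        raise ValueError("expected at least 3 channels in the last axis")
    return patches[..., :3]


# Hypothetical batch of two 256x256 RGBX patches.
rgbx = np.zeros((2, 256, 256, 4), dtype=np.uint8)
rgb = drop_fourth_band(rgbx)
print(rgb.shape)  # (2, 256, 256, 3)
```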

