Colorization of Grayscale Images | IEEE | November, 2025

by SkillAiNest

Importance of image processing and computer vision:

Image processing covers the manipulation and enhancement operations performed on digital images, enabling machines to analyze images numerically. Computer vision, in turn, goes beyond this analysis to understand the semantic meaning within the image; for example, it can recognize objects or faces. These technologies support a wide spectrum of applications in daily life, from autonomous vehicles to medical diagnostics.

Computer vision is an interdisciplinary scientific field concerned with the ability of computers to derive high-level understanding from digital images or videos. Through engineering methods, it seeks to understand and automate tasks that the human visual system can perform. Computer vision involves the automatic extraction, analysis, and understanding of useful information from a single image or sequence of images. To achieve automated visual understanding, it involves the development of a foundation derived from theory and algorithms.

As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. Image data can take many forms, such as video sequences, views from multiple cameras, or multidimensional data from a medical scanner. As a technical discipline, computer vision tries to apply its theories and models to the construction of computer vision systems. The field is also related to areas such as solid-state physics, neuroscience, and signal processing.

Loss of color information in grayscale:

In a color image, each pixel has three distinct values: red, green, and blue. The grayscale conversion process, however, reduces these three values to a single brightness value. During this process, a formula determined by the light sensitivity of the human eye is commonly used:

Gray = 0.2989 × R + 0.5870 × G + 0.1140 × B

However, in this conversion, specific color information is lost. We are only left with the information of “how light or dark”. For example, light green and light blue tones can be converted to the same gray value. Therefore, in grayscale, the color information is completely eliminated. Only the light information is preserved.

“It’s impossible to know which color corresponds to a gray pixel.” In a color image, each pixel consists of three components: red, green, and blue (RGB). The grayscale conversion process reduces these three channels to a single brightness value, so the three numbers in each pixel collapse into one. Many different color combinations can therefore produce the same gray level; for example, both light yellow and light blue tones can correspond to the same gray value. Performing the reverse operation, that is, recovering the original color of a gray pixel, is mathematically ill-posed, because we cannot know which triplet produced that value.
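The conversion and its non-invertibility can be demonstrated with a short sketch in plain Python (the two example colors below are chosen purely for illustration):

```python
# Luminance weights from the grayscale formula above (ITU-R BT.601).
def to_gray(rgb):
    """Collapse an (R, G, B) triplet to a single brightness value."""
    r, g, b = rgb
    return 0.2989 * r + 0.5870 * g + 0.1140 * b

# Two clearly different colors...
reddish = (200, 80, 100)
greenish = (50, 150, 130)

# ...that land on the same 8-bit gray value, so the mapping cannot be inverted.
print(round(to_gray(reddish)))   # 118
print(round(to_gray(greenish)))  # 118
```

Since both triplets map to the same gray value, no deterministic formula can recover the original color; a colorization model can only estimate a plausible one.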

Motivation:

The colorization of grayscale images is not merely a technical transformation; it restores lost visual information. The human brain uses color as a primary cue for making sense of its surroundings. Rendering accurate, natural colors onto black-and-white images is therefore valuable for bringing the past to life, highlighting critical regions in medical images, or revealing otherwise hidden details in forensic analysis. Deep learning-based color estimation models can learn color relationships from textural and compositional features in images, enabling them to colorize a grayscale scene in a manner consistent with human perception.

Image processing and computer vision form the basis of technologies encountered in many areas of daily life. Color estimation models, which have attracted significant interest and active development in recent years, can learn color relationships observed in everyday life.

Areas of Application:

· Historic Photo and Film Restoration: Black and white photos and old films can be colored to present visual memories of the past in a more vivid and impressive way. For example, in a photograph of a city from the 1920s, cars, signs, and streets appear in their natural colors.

· Security and Forensic Analysis: Grayscale images obtained from CCTV or old security cameras can be colorized to aid in faster analysis and identification.

· Film Industry and Advertising: Old films or animation works can be re-colored and presented to a modern audience. Especially in the fields of advertising and visual effects, color estimation is very important to make scenes more effective.

· Medical Imaging: Grayscale medical images, such as MRI or X-ray, can be interpreted more quickly and clinical decisions can be supported by highlighting specific tissues and regions in specific colors.

Notable projects in this field include DeOldify and Algorithmia. DeOldify is an open-source project that has had notable success in colorizing historical photographs and films in particular. Algorithmia, on the other hand, offers a cloud-based color estimation service that helps developers colorize grayscale images quickly. These projects show that such models can perform well both technically and visually while enriching the user experience.

Finally, the expected output of models that colorize grayscale images is not only to make color predictions but also to present the viewer with the sensation of a scene taken from life. Colors that are appropriate for human perception, add meaning to the scene, and facilitate emotional connection determine the true success of the model.

Literature and Methods:

Deep learning methods used in colorization of black-and-white images perform pixel-level color estimation by analyzing the structure within the grayscale image. Commonly used approaches in this field are CNN, encoder-decoder, and GAN-based models. Each operates with a different learning logic.

🌀CNN (Convolutional Neural Network):

CNN models recognize visual features such as edges, textures, and patterns in images using learned filters. Based on these features, they predict the most likely color of each pixel. For example, the light gray tones of the sky can be mapped to blue, and the gray tones of leaves to green. In this process, the network draws on color relationships learned during training.

Advantage: It is fast and easily trained due to its simple structure.

Disadvantage: It can create false colors in complex scenes because it doesn’t fully understand the context.
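As an illustration of the filter mechanism that CNNs rely on, here is a minimal plain-Python sketch of a single 2D convolution. The hand-crafted Sobel-style edge filter stands in for the filters a CNN would learn from data:

```python
def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation), the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            # Slide the kernel over the image and sum the element-wise products.
            row.append(sum(image[i + u][j + v] * kernel[u][v]
                           for u in range(kh) for v in range(kw)))
        out.append(row)
    return out

# A tiny grayscale patch: dark on the left, bright on the right.
patch = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]

# Sobel-style vertical-edge filter: responds strongly at the dark/bright boundary.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

response = conv2d(patch, sobel_x)
print(response)  # [[40, 40], [40, 40]] — a strong response along the vertical edge
```

A trained colorization CNN stacks many such learned filters, using the detected edges and textures as evidence for each pixel's likely color.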

🧩 Encoder-Decoder:

In this architecture, the model first converts the gray image into a compact summary (a feature map) in the encoder section. The decoder then uses this information to reconstruct the color image. For example, in a landscape image, the encoder part learns things like the sky, mountains, and trees. The decoder then adds the appropriate colors (blue, brown, green) to these regions. (Diagram: Encoder -> ResNet -> Decoder)

Advantage: Provides a consistent color distribution throughout the image.

Disadvantage: Color loss can occur in small details (e.g., leaf veins or small objects).
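A highly simplified sketch of this compress-then-reconstruct idea: average pooling stands in for the learned encoder and nearest-neighbor upsampling for the decoder. A real model would use learned convolutions and predict color channels rather than just restoring resolution:

```python
def encode(image, f=2):
    """Encoder sketch: f x f average pooling compresses the image into a smaller summary."""
    h, w = len(image), len(image[0])
    return [[sum(image[i * f + u][j * f + v] for u in range(f) for v in range(f)) / (f * f)
             for j in range(w // f)]
            for i in range(h // f)]

def decode(features, f=2):
    """Decoder sketch: nearest-neighbor upsampling restores the original resolution."""
    return [[features[i // f][j // f] for j in range(len(features[0]) * f)]
            for i in range(len(features) * f)]

gray = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]

compact = encode(gray)       # 2x2 summary, analogous to the bottleneck feature map
restored = decode(compact)   # back to 4x4 resolution
print(compact)               # [[2.5, 4.5], [10.5, 12.5]]
```

Fine detail (e.g. the distinction between 0 and 5 inside one pooled block) is lost in the bottleneck, which is exactly why encoder-decoder colorizers can blur small structures.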

🤖 GAN (Generative Adversarial Network):

In GANs, two networks are trained in competition:

· Generator: Produces a color image.

· Discriminator: Tries to determine whether the generated image is real or fake. For example, in a black-and-white face image, the generator produces natural skin, lip, and eye colors, while the discriminator examines whether these predictions can be distinguished from a real image. Over time, this competition drives the model to produce colors that look more natural to the human eye.

Advantage: Produces highly realistic and vivid colors.

Disadvantage: Training is hard. Achieving balance between networks is time-consuming.
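The two competing objectives can be sketched with the standard binary cross-entropy formulation; the probabilities and function names below are illustrative, not taken from a specific implementation:

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability prediction."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(prediction + eps)
             + (1 - target) * math.log(1 - prediction + eps))

def discriminator_loss(d_real, d_fake):
    """Discriminator objective: score real images as 1 and generated ones as 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """Generator objective: win when the discriminator scores its output as real."""
    return bce(d_fake, 1.0)

# Early in training the discriminator easily spots fakes, so the generator's loss is high.
print(round(generator_loss(0.1), 3))  # 2.303
# Near equilibrium, fakes look real (score ~0.5) and the loss drops.
print(round(generator_loss(0.5), 3))  # 0.693
```

The difficulty mentioned above is keeping these two losses in balance: if the discriminator wins too decisively, the generator receives almost no useful gradient.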

Loss functions used for color estimation:

Color estimation models are optimized using different loss functions:

Total loss = α · L_MSE + β · L_perceptual + γ · L_adversarial
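A minimal sketch of this weighted combination; the weight values for α, β, and γ below are illustrative placeholders, not values from this article:

```python
def total_loss(l_mse, l_perceptual, l_adversarial,
               alpha=1.0, beta=0.1, gamma=0.01):
    """Weighted sum of the three loss terms; weights are tuning hyperparameters."""
    return alpha * l_mse + beta * l_perceptual + gamma * l_adversarial

# Example: pixel error dominates, the other terms contribute smaller corrections.
print(round(total_loss(0.5, 2.0, 10.0), 3))  # 0.8
```

In practice α, β, and γ are tuned on a validation set: increasing β pushes the model toward perceptually plausible textures, while increasing γ favors the vivid colors the GAN critic rewards.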

A comparison of methods

Expected Results:

At the end of this study, the developed model is expected to produce color estimates from grayscale images that are close to the actual colors of the scene. The goal is for the model not only to color the pixels, but to understand the context of the objects and generate meaningful, natural color distributions. For example, realistic rendering of the sky as blue, leaves as green, and skin in natural tones will indicate the model's contextual understanding.

The aim is for the model to produce results consistent with human perception in terms of naturalness, visual integrity and perceptual similarity. Consistent color transitions, preservation of tonal balance, and accurate color capture in small details will be critical metrics of success.

From a technical point of view, the goal is to reduce error at both the pixel and perceptual levels by jointly optimizing the MSE, perceptual, and adversarial loss functions. In this way, the model will achieve not only accurate colors but also a visual quality that creates a true photographic feel.

The results obtained are expected to be practically applicable in areas such as historical image restoration, medical image analysis, security systems, and media production. In conclusion, this study will represent not only a technical achievement but also an important step in the recovery of visual meaning.

Bibliography:

Computer vision and loss of color information in grayscale:

· https://tr.wikipedia.org/wiki/bilgisayarl%C4%B1_G%C3%B6R%C3%BC

· https://www.geeksforgeeks.org/electronics-engineering/what-is-grayscale-image

· https://www3.cs.stonybrook.edu/~mueller/research/colorize/colorize-sig02.pdf

· researchgate.net/publications/369610399_A_REVIEWE_PAPAPE_ON_COMPUTER_VISION

Literature and Methods:

· Goodfellow, I., Pouget-Abadie, J., et al. (2014). Generative Adversarial Nets. Advances in Neural Information Processing Systems (NeurIPS).

· LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

· Zhang, R., Isola, P., & Efros, A. A. (2016). Colorful Image Colorization. European Conference on Computer Vision (ECCV).

· https://arxiv.org/abs/1406.2661

· https://www.iangoodfellow.com/slides/2016-04-gtc.pdf

· https://richzhang.github.io/colorization/resources/colorful_eccv2016.pdf

· https://arxiv.org/abs/1603.08511

Expected Results and Application Areas:

· Zhang, R., Isola, P., & Efros, A. A. (2016). Colorful Image Colorization. European Conference on Computer Vision (ECCV), 2016.

· Iizuka, S., Simo-Serra, E., & Ishikawa, H. (2016). Let there be Color!: Joint End-to-End Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification. ACM Transactions on Graphics (TOG), 35(4), 110.

· Isola, P., Zhu, J.-Y., Zhou, T., & Efros, A. A. (2017). Image-to-Image Translation with Conditional Adversarial Networks. CVPR 2017.

· DeOldify. Available: https://github.com/jantic/deoldify

· https://algorithmia.com/algorithms/deeplearning/colorfulimagecolorization

· Ruderman, D. L., Cronin, T. W., & Chiao, C. C. (1998). Statistics of cone responses to natural images. Journal of the Optical Society of America A, 15(8), 2036–2045.

· Sharma, G. (2003). Digital Color Imaging Handbook. CRC Press.

· He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. CVPR 2016.

· Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI 2015.

In essence, automated colorization transforms grayscale imagery into emotional storytelling – integrating art, memory and machine intelligence into a single frame.

👩‍💻 Authors:
✍ Mustafa Turaş
✍ Göktüğ Ezhan Karademir
✍ Yusuf Burke Yildirim

📖 Translation: Emery Carata
