{{Short description|Practice of encoding images}}
[[File:Common chroma subsampling ratios YCbCr CORRECTED.svg|thumb|Widely used chroma subsampling formats]]
'''Chroma subsampling''' is the practice of encoding images by implementing less resolution for [[Chrominance|chroma]] [[information]] than for [[luma (video)|luma]] information, taking advantage of the human visual system's lower acuity for color differences than for luminance.<ref>
{{cite book

It is used in many video and still image encoding schemes – both analog and digital – including in JPEG encoding.

==Rationale==
[[File:Colorcomp.jpg|thumb|In [{{filepath:Colorcomp.jpg}} full size], this image shows the difference between four subsampling schemes. Note how similar the color images appear. The lower row shows the resolution of the color information.]]


Digital signals are often compressed to reduce file size and save transmission time. Since the human visual system is much more sensitive to variations in brightness than color, a video system can be optimized by devoting more bandwidth to the [[luma (video)|luma]] component (usually denoted Y') than to the color difference components '''Cb''' and '''Cr'''. In compressed images, for example, the 4:2:2 [[YCbCr|Y'CbCr]] scheme requires two-thirds the bandwidth of non-subsampled "4:4:4" [[RGB|R'G'B']].{{efn|The prime sign indicates gamma correction or any other non-linear EOTF.}} This reduction results in almost no visual difference as perceived by the viewer.


===How subsampling works===
The [[Visual perception|human vision system]] (HVS) processes color information ([[hue]] and [[colorfulness]]) at about a third of the resolution of [[Relative luminance|luminance]] (lightness/darkness information in an image). Therefore it is possible to [[Sampling (signal processing)|sample]] color information at a lower resolution while maintaining good image quality.


This is achieved by encoding [[RGB color model|RGB]] image data into a composite [[black and white]] image, with separated color difference data ([[Chrominance|chroma]]). For example with <math>Y'C_bC_r</math>, [[gamma correction|gamma encoded]] <math>R'G'B'</math> components are weighted and then summed together to create the [[Luma (video)|luma]] <math>Y'</math> component. The color difference components are created by subtracting two of the weighted <math>R'G'B'</math> components from the third. A variety of [[Image scaling|filtering]] methods can be used to limit the resolution.
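
The following sketch illustrates that weighting for the common BT.601 coefficients (an assumption; BT.709 and BT.2020 use different weights), with gamma-encoded full-range values in [0, 1]; function and variable names are illustrative only:

<syntaxhighlight lang="python">
import numpy as np

# BT.601 luma weights (assumption; other standards use different coefficients).
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycbcr(rgb):
    """Convert gamma-encoded R'G'B' values in [0, 1] to full-range Y'CbCr."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = KR * r + KG * g + KB * b          # luma: weighted sum of R'G'B'
    cb = (b - y) / (2.0 * (1.0 - KB))     # blue-difference, scaled to [-0.5, 0.5]
    cr = (r - y) / (2.0 * (1.0 - KR))     # red-difference, scaled to [-0.5, 0.5]
    return np.stack([y, cb, cr], axis=-1)
</syntaxhighlight>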


====Regarding gamma and transfer functions====
Gamma encoded luma <math>Y'</math> should not be confused with linear [[Relative luminance|luminance]] <math>Y</math>. The presence of gamma encoding is denoted with the [[prime symbol]] <math>'</math>. In very early video systems, gamma-correction was necessary due to the nonlinear response of a [[cathode-ray tube]] (CRT).


While CRTs are no longer widely used, gamma curves, or electro-optical transfer functions (EOTF), are still very useful due to the nonlinear response of human vision. The use of gamma improves the perceived signal-to-noise ratio in analogue systems, and allows for more efficient data encoding in digital systems. This encoding uses more levels for darker colors than for lighter ones, accommodating human vision sensitivity.<ref name=plea>Poynton, Charles. [http://www.poynton.com/PDFs/YUV_and_luminance_harmful.pdf "YUV and ''luminance'' considered harmful: A plea for precise terminology in video"].</ref>
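
As a rough numeric illustration (assuming a pure 2.2 power-law curve rather than any particular broadcast EOTF), gamma encoding devotes far more of the 256 8-bit code values to the darkest tones than linear encoding does:

<syntaxhighlight lang="python">
# Count how many of the 256 8-bit codes land in the darkest 10% of linear light.
codes = [i / 255.0 for i in range(256)]
linear_codes_in_darkest_tenth = sum(1 for v in codes if v <= 0.1)        # about 26 codes
gamma_codes_in_darkest_tenth = sum(1 for v in codes if v ** 2.2 <= 0.1)  # about 90 codes
print(linear_codes_in_darkest_tenth, gamma_codes_in_darkest_tenth)
</syntaxhighlight>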


==Sampling systems and ratios==
The subsampling scheme is commonly expressed as a three-part ratio ''J'':''a'':''b'' (e.g. 4:2:2), or four parts if an alpha channel is present (e.g. 4:2:2:4), that describe the number of luminance and chrominance samples in a conceptual region that is ''J'' pixels wide and 2 pixels high. The parts are (in their respective order):
* ''J'': horizontal sampling reference (width of the conceptual region). Usually, 4.
* ''a'': number of chrominance samples ('''Cr''', '''Cb''') in the first row of ''J'' pixels.
* ''b'': number of changes of chrominance samples ('''Cr''', '''Cb''') between first and second row of ''J'' pixels. ''b'' has to be either zero or equal to ''a'' (except in rare irregular cases like 4:4:1 and 4:2:1, which do not follow this convention).
* ''[[Alpha compositing|Alpha]]'': horizontal factor (relative to first digit). May be omitted if alpha component is not present, and is equal to ''J'' when present.


This notation is not valid for all combinations and has exceptions, e.g. 4:1:0 (where the height of the region is not 2 pixels, but 4 pixels, so if 8 bits per component are used, the media would be 9 bits per pixel) and 4:2:1.

{| class="wikitable"
|+ Chroma resolution of common subsampling schemes
|-
! Scheme !! Horizontal chroma resolution !! Vertical chroma resolution
|-
| 4:1:1 || ¼ || full
|-
| 4:2:0 || ½ || ½
|-
| 4:2:2 || ½ || full
|-
| 4:4:0 || full || ½
|-
| 4:4:4 || full || full
|}


The mapping examples given are only theoretical and for illustration. Also, the diagram does not indicate any chroma filtering, which should be applied to avoid [[aliasing]]. To calculate the required bandwidth factor relative to 4:4:4 (or 4:4:4:4), one sums all the factors and divides the result by 12 (or 16, if alpha is present).
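
That rule can be expressed as a small helper function (an illustrative sketch; the function name is not standard):

<syntaxhighlight lang="python">
def bandwidth_factor(j, a, b, alpha=None):
    """Bandwidth of a J:a:b(:alpha) scheme relative to 4:4:4 (or 4:4:4:4)."""
    total = j + a + b          # sum all the factors
    divisor = 12
    if alpha is not None:      # four-part notation: include alpha, divide by 16
        total += alpha
        divisor = 16
    return total / divisor

# bandwidth_factor(4, 2, 2) -> 2/3, bandwidth_factor(4, 2, 0) -> 1/2
</syntaxhighlight>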


==Types of sampling and subsampling==

===4:4:4===
Each of the three [[YCbCr|Y'CbCr]] components has the same sample rate, thus there is no chroma subsampling. This scheme is sometimes used in high-end film scanners and cinematic post-production.


Note that "4:4:4" may instead be wrongly referring to [[RGB|R'G'B']] color space, which implicitly also does not have any chroma subsampling (except in JPEG R'G'B' can be subsampled). Formats such as [[HDCAM SR]] can record 4:4:4 R'G'B' over dual-link [[HD-SDI]].
"4:4:4" may instead be wrongly referring to [[RGB|R'G'B']] color space, which implicitly also does not have any chroma subsampling (except in JPEG R'G'B' can be subsampled). Formats such as [[HDCAM SR]] can record 4:4:4 R'G'B' over dual-link [[HD-SDI]].


===4:2:2===

The two chroma components are sampled at half the horizontal sample rate of luma: the horizontal chroma resolution is halved. This reduces the bandwidth of an uncompressed video signal by one-third, so an 8-bit-per-component signal without alpha needs only 16 bits per pixel instead of 24, as in the NV16 pixel format.

Many high-end digital video formats and interfaces use this scheme:
* [[Betacam#Digital Betacam|Digital Betacam]]
* [[Betacam#Betacam SX|Betacam SX]]
* [[DV (video format)#DVCPRO50|DVCPRO50]] and [[DV (video format)#DVCPRO HD|DVCPRO HD]]
* [[Digital-S]]
* [[CCIR 601]] / [[Serial digital interface]] / [[D-1 (Sony)|D-1]]
* [[ProRes|ProRes (HQ, 422, LT, and Proxy)]]
* [[XDCAM|XDCAM HD422]]
* [[Canon XF-300|Canon MXF HD422]]



===4:1:1===
In 4:1:1 chroma subsampling, the horizontal color resolution is quartered, and the bandwidth is halved compared to no chroma subsampling. Initially, 4:1:1 chroma subsampling of the [[DV (video format)|DV]] format was not considered to be broadcast quality and was only acceptable for low-end and consumer applications.<ref name="dv-betacam">{{cite web |url=http://www.dvcentral.org/DV-Beta.html |title=DV vs. Betacam SP |last=Jennings |first=Roger |author2=Bertel Schmitt |year=1997 |work=DV Central |access-date=2008-08-29 |archive-url=https://web.archive.org/web/20080702013821/http://www.dvcentral.org/DV-Beta.html |archive-date=2008-07-02 |url-status=dead}}</ref><ref name="dv-formats">{{cite web |url=http://www.adamwilt.com/DV-FAQ-tech.html |title=DV, DVCAM & DVCPRO Formats |last=Wilt |first=Adam J. |year=2006 |work=adamwilt.com |access-date=2008-08-29}}</ref> However, [[DV (video format)|DV]]-based formats (some of which use 4:1:1 chroma subsampling) have been used professionally in electronic news gathering and in playout servers. DV has also been sporadically used in feature films and in [[digital cinematography]].


In the [[480i]] "NTSC" system, if the luma is sampled at 13.5&nbsp;MHz, then this means that the '''Cr''' and '''Cb''' signals will each be sampled at 3.375&nbsp;MHz, which corresponds to a maximum [[Nyquist–Shannon sampling theorem|Nyquist]] bandwidth of 1.6875&nbsp;MHz, whereas traditional "high-end broadcast [[NTSC|analog NTSC]] encoder" would have a Nyquist bandwidth of 1.5&nbsp;MHz and 0.5&nbsp;MHz for the [[YIQ|I/Q]] channels. However, in most equipment, especially cheap TV sets and [[VHS]]/[[Betamax]] [[Videocassette recorder|VCRs]], the chroma channels have only the 0.5&nbsp;MHz bandwidth for both '''Cr''' and '''Cb''' (or equivalently for I/Q). Thus the DV system actually provides a superior color bandwidth compared to the best [[Composite video|composite analog]] specifications for NTSC, despite having only 1/4 of the chroma bandwidth of a "full" digital signal.
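
The sampling figures above follow from simple arithmetic (illustrative only):

<syntaxhighlight lang="python">
luma_rate = 13.5e6                 # 480i luma sampling rate, Hz
chroma_rate = luma_rate / 4        # 4:1:1 -> 3.375 MHz for each of Cr and Cb
chroma_nyquist = chroma_rate / 2   # maximum representable chroma bandwidth: 1.6875 MHz
</syntaxhighlight>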


Formats that use 4:1:1 chroma subsampling include:
* [[DV (video format)#DVCPRO|DVCPRO]] / D-7 ([[NTSC]] and [[PAL]])
* [[480i]] "NTSC" [[DV]] and [[DV#DVCAM|DVCAM]]
* [[480i]] "NTSC" [[DV (video format)|DV]] and [[DV (video format)#DVCAM|DVCAM]]


===4:2:0===

In 4:2:0, the horizontal sampling is doubled compared to 4:1:1, but as the '''Cb''' and '''Cr''' channels are only sampled on each alternate line in this scheme, the vertical resolution is halved. The data rate is thus the same. This fits reasonably well with the PAL color encoding system, since PAL has only half the vertical chrominance resolution of NTSC. It would also fit extremely well with the SECAM color encoding system, since like that format, 4:2:0 only stores and transmits one color channel per line (the other channel being recovered from the previous line). However, little equipment has actually been produced that outputs a SECAM analogue video signal. In general, SECAM territories either have to use a PAL-capable display or a transcoder to convert the PAL signal to SECAM for display.

Different variants of 4:2:0 chroma configurations are found in:
* All [[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] [[MPEG]] and [[ITU-T]] [[VCEG]] H.26x video coding standards including [[H.262/MPEG-2 Part 2]] implementations (although some profiles of [[MPEG-4 Part 2]] and [[H.264/MPEG-4 AVC]] allow higher-quality sampling schemes such as 4:4:4)
* [[DVD-Video]] and [[Blu-ray Disc]].<ref name=AudioholicsHDMIApril2008>{{cite news |title=HDMI Enhanced Black Levels, xvYCC and RGB |author=Clint DeBoer |publisher=[[Audioholics]] |url=http://www.audioholics.com/tweaks/calibrate-your-system/hdmi-black-levels-xvycc-rgb |date=2008-04-16 |access-date=2013-06-02}}</ref><ref name=TelairityDigitalColorCodingPDF>{{cite news |title=Digital Color Coding |publisher=Telairity |url=http://www.telairity.com/assets/downloads/Digital%20Color%20Coding.pdf |access-date=2013-06-02 |archive-url=https://web.archive.org/web/20140107171831/http://www.telairity.com/assets/downloads/Digital%20Color%20Coding.pdf |archive-date=2014-01-07 |url-status=dead }}</ref>
* [[576i]] "PAL" [[DV]] and [[DV#DVCAM|DVCAM]]
* [[576i]] "PAL" [[DV (video format)|DV]] and [[DV (video format)#DVCAM|DVCAM]]
* [[HDV]]
* [[AVCHD]] and [[AVC-Intra|AVC-Intra 50]]
* [[VC-1]]
* [[WebP]]
* [[YJK]],<ref>{{Cite web |last=MSX Licensing Corporation |date=2022 |title=The YJK screen modes |url=http://map.grauw.nl/articles/yjk/ |website=MSX Assembly Page}}</ref><ref>{{Cite book |last=Niemietz |first=Ricardo Cancho |url=http://rs.gr8bit.ru/Documentation/Issues-on-YJK-colour-model-implemented-in-Yamaha-V9958-VDP-chip.pdf |title=Issues on YJK colour model implemented in Yamaha V9958 VDP chip |year=2014}}</ref><ref>{{Cite web |title=VCFe Vortrag vom 2016.04.30 Homecomputer und Spielkonsolen Videoarchitekturen als visuelles Medium |url=http://neil.franklin.ch/Articles/20160430_VCFe_Video_als_Medium.html |access-date=2022-11-13 |website=neil.franklin.ch}}</ref> a proprietary [[color space]] implemented by the [[Yamaha V9958]]<ref>{{Cite book |url=https://books.google.com/books?id=8RwSAQAAMAAJ&dq=%22Yamaha+V9958%22+-wikipedia&pg=PA3984 |title=IC Master |date=2001 |publisher=United Technical Publications |language=en}}</ref><ref>{{Cite thesis |title=Arqueología informática: los ordenadores MSX en los inicios de la microinformática doméstica |url=https://riunet.upv.es/handle/10251/70909 |publisher=Universitat Politècnica de València |date=2016-10-03 |degree=Proyecto/Trabajo fin de carrera/grado |first=Sergio |last=Martín Sesma}}</ref><ref>{{Cite web |last=Redazione |date=2008-10-20 |title=MSX Vari Costruttori- 1983 |url=https://www.cyberludus.com/2008/10/msx-vari-costruttori-1983/ |access-date=2022-11-13 |website=CyberLudus.com |language=it-IT}}</ref> graphic chip on [[MSX2+]] computers.<ref name="auto">{{Cite web |date=1988 |title=V9958 MSX-VIDEO TECHNICAL DATA BOOK |url=http://map.grauw.nl/resources/video/yamaha_v9958_ocr.pdf}}</ref><ref>{{Cite journal |last=Alex |first=Wulms |date=1995 |title=Schermen op MSX De 2+ schermen |url=http://www.msxarchive.nl/pub/msx/mirrors/hanso/hwdoityourself/msxplus.pdf |journal=MSX Computer & Club Magazine |issue=72}}</ref>


'''Cb''' and '''Cr''' are each subsampled at a factor of 2 both horizontally and vertically. Most digital video formats corresponding to 576i "PAL" use 4:2:0 chroma subsampling.
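
The following sketch shows the simplest form of that 2×2 reduction on a single chroma plane, using a plain averaging ("box") filter; practical encoders may use better filters and one of the sample sitings described below:

<syntaxhighlight lang="python">
import numpy as np

def subsample_chroma_420(plane):
    """Average each 2x2 block of a chroma plane (simple box filter sketch)."""
    h, w = plane.shape
    p = plane[:h - h % 2, :w - w % 2].astype(np.float64)   # crop to even dimensions
    return (p[0::2, 0::2] + p[0::2, 1::2] +
            p[1::2, 0::2] + p[1::2, 1::2]) / 4.0
</syntaxhighlight>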


==== Sampling positions ====
There are four main variants of 4:2:0 schemes, having different horizontal and vertical sampling siting relative to the 2&times;2 "square" of the original input size.<ref name="chroma-subsampling-notation">{{cite web |url=http://www.poynton.com/PDFs/Chroma_subsampling_notation.pdf |title=Chroma Subsampling Notation |last=Poynton |first=Charles |year=2008 |work=Poynton.com |access-date=2008-10-01}}</ref>


* In MPEG-2, MPEG-4, and AVC, ''Cb'' and ''Cr'' are taken at the midpoint of the left edge of the 2&times;2 square. In other words, they have the same horizontal location as the top-left pixel, but are shifted one-half pixel down vertically. Also called "left".<ref name=AvChromaLocation>[https://ffmpeg.org/doxygen/3.1/pixfmt_8h.html#a1f86ed1b6a420faccacf77c98db6c1ff enum AvChromaLocation], ffmpeg 3.1.</ref>
* In JPEG/JFIF, H.261, and MPEG-1, ''Cb'' and ''Cr'' are taken at the center of the 2&times;2 square. In other words, they are offset one-half pixel to the right and one-half pixel down compared to the top-left pixel. Also called "center".<ref name=AvChromaLocation/>
* In HEVC for BT.2020 and [[Rec. 2100#Chroma sample location|BT.2100]] content (in particular on Blu-rays), ''Cb'' and ''Cr'' are sampled at the same location as the group's top-left Y pixel ("co-sited", "co-located"). Also called "top-left". An analogous co-sited sampling is used in MPEG-2 4:2:2.<ref name=AvChromaLocation/>
* In 4:2:0 PAL-DV (IEC 61834-2), ''Cb'' is sampled at the same location as the group's top-left Y pixel, but ''Cr'' is sampled one pixel down.<ref>{{cite web |title=y4minput.c - webm/libvpx - Git at Google |url=https://chromium.googlesource.com/webm/libvpx/+/refs/heads/main/y4minput.c |website=chromium.googlesource.com |quote=420paldv chroma samples are sited like:}}</ref> It is ''also'' called "top-left" in ffmpeg.<ref name=AvChromaLocation/>
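
These positions can be summarised as offsets of the chroma sample from the group's top-left luma sample, in luma-pixel units (an illustrative sketch using the ffmpeg naming cited above):

<syntaxhighlight lang="python">
# (horizontal, vertical) offset of the chroma sample within each 2x2 luma group.
CHROMA_SITING = {
    "left":     (0.0, 0.5),  # MPEG-2 / MPEG-4 / AVC
    "center":   (0.5, 0.5),  # JPEG/JFIF, H.261, MPEG-1
    "top-left": (0.0, 0.0),  # HEVC with BT.2020 / BT.2100 (co-sited)
}
# 4:2:0 PAL-DV (IEC 61834-2) sites Cb at (0.0, 0.0) but Cr a full pixel lower, at (0.0, 1.0).
</syntaxhighlight>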


===== Interlaced and progressive =====
With [[interlaced]] material, 4:2:0 chroma subsampling can result in motion artifacts if it is implemented the same way as for progressive material. The luma samples are derived from separate time intervals, while the chroma samples would be derived from both time intervals. It is this difference that can result in motion artifacts. The MPEG-2 standard allows for an alternate interlaced sampling scheme, where 4:2:0 is applied to each field (not both fields at once). This solves the problem of motion artifacts, but reduces the vertical chroma resolution by half and can introduce comb-like artifacts in the image.




[[Image:420-progressive-single-field.png]]
<br />4:2:0 '''progressive''' sampling applied to moving ''interlaced'' material. The chroma leads and trails the moving text. This image shows a single field.


[[File:420-interlaced-single-field.png]]
<br />4:2:0 '''interlaced''' sampling applied to moving ''interlaced'' material. This image shows a single field.

In the 4:2:0 interlaced scheme, however, vertical resolution of the chroma is roughly halved, since the chroma samples effectively describe an area 2 samples wide by 4 samples tall instead of 2&times;2. As well, the spatial displacement between both fields can result in the appearance of comb-like chroma artifacts.

If the interlaced material is to be de-interlaced, the comb-like chroma artifacts (from 4:2:0 interlaced sampling) can be removed by blurring the chroma vertically.


===4:1:0===
This ratio is possible, and some [[codec]]s support it, but it is not widely used. This ratio uses half the vertical and one-fourth the horizontal color resolution, so only one-eighth of the bandwidth of the full color resolution is used. Uncompressed video in this format with 8-bit quantization uses 10 bytes for every macropixel (a 4×2 block of pixels), or 10 bits per pixel. It has the equivalent chrominance bandwidth of a PAL-I or PAL-M signal decoded with a delay-line decoder, and is still much superior to NTSC.
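
The storage figures above follow from counting samples in one macropixel (illustrative arithmetic, assuming one Cb and one Cr sample per 4×2 block):

<syntaxhighlight lang="python">
luma_bits = 4 * 2 * 8        # eight 8-bit luma samples per 4x2 macropixel
chroma_bits = 2 * 8          # one Cb and one Cr sample per macropixel
bytes_per_macropixel = (luma_bits + chroma_bits) // 8   # 10 bytes
bits_per_pixel = (luma_bits + chroma_bits) / (4 * 2)    # 10 bits per pixel
</syntaxhighlight>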


===3:1:1===
Used by Sony in their HDCAM High Definition recorders (not HDCAM SR). In the horizontal dimension, luma is sampled horizontally at three quarters of the full HD sampling rate{{snd}} 1440 samples per row instead of 1920. Chroma is sampled at 480 samples per row, a third of the luma sampling rate. In the vertical dimension, both luma and chroma are sampled at the full HD sampling rate (1080 samples vertically).

=== Different Cb and Cr rates ===
A number of legacy schemes allow different subsampling factors in Cb and Cr, similar to how a different amount of bandwidth is allocated to the two chroma values in broadcast systems such as [[CCIR System M]]. These schemes are not expressible in ''J:a:b'' notation. Instead, they adopt a ''Y:Cb:Cr'' notation, with each part describing the amount of resolution for the corresponding component. It is unspecified whether the resolution reduction happens in the horizontal or vertical direction.

* In JPEG, 4:4:2 and 4:2:1 have half the vertical resolution of ''Cb'' compared to 4:4:4 and 4:4:0.<ref>{{cite web |title=Support decoding yuv442 and yuv421 jpeg images. · FFmpeg/FFmpeg@387d860 |url=https://github.com/FFmpeg/FFmpeg/commit/387d86077f5237e53cec7a4ed31f5531224feebf |website=GitHub |language=en}}</ref>
* In another version of {{vanchor|4:2:1}}, '''Cb''' horizontal resolution is half that of '''Cr''' (and a quarter of the horizontal resolution of '''Y''').
* 4:1:0.5 or 4:1:0.25 are variants of 4:1:0 with reduced horizontal resolution on Cb, similar to VHS quality.


==Artifacts==

Chroma subsampling suffers from two main types of artifacts, causing degradation more noticeable than intended where colors change abruptly.

=== Gamma luminance error ===
{{see|Gamma correction#Scaling and blending}}
Gamma-corrected signals like Y'CbCr have an issue where chroma errors "bleed" into luma. In those signals, a low chroma actually makes a color appear less bright than one with equivalent luma. As a result, when a saturated color blends with an unsaturated or complementary color, a loss of luminance occurs at the border. This can be seen in the example between magenta and green.<ref name=better>{{cite journal |last1=Chan |first1=Glenn |title=Toward Better Chroma Subsampling: Recipient of the 2007 SMPTE Student Paper Award |journal=SMPTE Motion Imaging Journal |date=May 2008 |volume=117 |issue=4 |pages=39–45 |doi=10.5594/J15100 |url=http://www.glennchan.info/articles/technical/chroma/chroma1.htm|doi-access=free }}</ref> This issue persists in HDR video where gamma is generalized into a transfer function "[[EOTF]]". A steeper EOTF shows a stronger luminance loss.<ref name=Larbier>{{cite journal |last1=Larbier |first1=Pierre |title=High Dynamic Range: Compression Challenges |journal=SMPTE 2015 Annual Technical Conference and Exhibition |date=October 2015 |pages=1–15 |doi=10.5594/M001639|isbn=978-1-61482-956-0 }}</ref>
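
A minimal numeric illustration of the effect (assuming a pure 2.2 power law rather than a specific EOTF): averaging two gamma-encoded values and then linearising yields much less light than averaging the linearised values, which is why borders between saturated and complementary colors darken when chroma is averaged across them.

<syntaxhighlight lang="python">
gamma = 2.2
a, b = 1.0, 0.0                                         # two gamma-encoded component values
mixed_then_linearised = ((a + b) / 2) ** gamma          # about 0.22 of full linear light
linearised_then_mixed = (a ** gamma + b ** gamma) / 2   # 0.5 of full linear light
</syntaxhighlight>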


Some proposed corrections of this issue are:
* Luma-weighted average (Kornelski, experiment for mozjpeg)
* Iterative ''sharp YUV'' method, used by [[WebP]] and optionally [[AVIF]]. Sharp YUV assumes a [[bilinear interpolation|bilinear]] upscaling for chroma.<ref>{{cite web |title=WebP: sharpyuv/sharpyuv.h {{!}} Fossies |url=https://fossies.org/linux/libwebp/sharpyuv/sharpyuv.h |website=fossies.org |quote=Assumes that the image will be upsampled using a bilinear filter. If nearest neighbor is used instead, the upsampled image might look worse than with standard downsampling.}}</ref>
* RGB subsampling in linear space before chroma subsampling (HDRTools)<ref name=Larbier/>
* Iterative or closed-form luma correction to minimize luminance error (HDRTools)<ref>{{cite conference |url=https://norkin.org/pdf/SPIE_2016_HDR_conversion_metrics.pdf |last1=Norkin |first1=Andrey |title=HDR color conversion with varying distortion metrics |date=27 September 2016 |pages=99710E |conference=SPIE Optical Engineering + Applications, 2016 |doi=10.1117/12.2237040}}</ref>

[[Rec. 2020]] defines a "constant luminance" Yc'CbcCrc, which is calculated from linear RGB components and then gamma-encoded. This version does not suffer from the luminance loss by design.<ref name=Recommendation2020>{{cite news |title=BT.2020: Parameter values for ultra-high definition television systems for production and international programme exchange |publisher=[[International Telecommunication Union]] |url=https://www.itu.int/rec/R-REC-BT.2020/en |date=2014-07-17 |access-date=2014-08-31}}</ref>


=== Gamut clipping ===
Another artifact that can occur with chroma subsampling is that out-of-[[gamut]] colors can occur upon chroma reconstruction. Suppose the image consisted of alternating 1-pixel red and black lines and the subsampling omitted the chroma for the black pixels. Chroma from the red pixels will be reconstructed onto the black pixels, causing the new pixels to have positive red and ''negative'' green and blue values. As displays cannot output negative light (negative light does not exist), these negative values will effectively be clipped, and the resulting luma value will be too high. Other sub-sampling filters (especially the averaging "box") have a similar issue that is harder to make a simple example out of. Similar artifacts arise in the less artificial example of gradation near a fairly sharp red/black boundary.<ref name=better/>
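
The red/black example can be checked numerically (an illustrative sketch assuming full-range BT.601 conversion constants):

<syntaxhighlight lang="python">
import numpy as np

# Full-range BT.601 constants (assumption).
KR, KB = 0.299, 0.114

def ycbcr_to_rgb(y, cb, cr):
    """Standard full-range BT.601 inverse conversion."""
    return np.array([y + 1.402 * cr,
                     y - 0.344136 * cb - 0.714136 * cr,
                     y + 1.772 * cb])

# Chroma of a pure red pixel (R'G'B' = 1, 0, 0):
y_red = KR * 1.0
cb_red = (0.0 - y_red) / (2 * (1 - KB))
cr_red = (1.0 - y_red) / (2 * (1 - KR))

# Reuse that chroma for a neighbouring black pixel (Y' = 0), as a crude subsampler would:
reconstructed_black = ycbcr_to_rgb(0.0, cb_red, cr_red)
# -> approximately [ 0.70, -0.30, -0.30 ]: positive red, negative green and blue.
# A display clips the negative components to zero, leaving the pixel brighter than intended.
</syntaxhighlight>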


It is possible for the decoder to deal with out-of-gamut colors by considering how much chroma a given luma value can hold and distribute it into the 4:4:4 intermediate accordingly, termed "in-range chroma reconstruction" by Glenn Chan. The "proportion" method is in spirit similar to Kornelski's luma-weighted average, while the "spill" method resembles [[error diffusion]].<ref name=better/> Improving chroma reconstruction remains an active field of research.<ref>{{cite journal |last1=Chung |first1=Kuo-Liang |last2=Liang |first2=Yan-Cheng |last3=Wang |first3=Ching-Sheng |title=Effective Content-Aware Chroma Reconstruction Method for Screen Content Images |journal=IEEE Transactions on Image Processing |date=March 2019 |volume=28 |issue=3 |pages=1108–1117 |doi=10.1109/TIP.2018.2875340|pmid=30307864 |bibcode=2019ITIP...28.1108C |s2cid=52964340 }}</ref>


==Terminology==
The term [[YUV|Y'UV]] refers to an analog TV encoding scheme (ITU-R Rec. BT.470) while Y'CbCr refers to a digital encoding scheme.<ref name=plea/> One difference between the two is that the scale factors on the chroma components (U, V, Cb, and Cr) are different. However, the term YUV is often used erroneously to refer to Y'CbCr encoding. Hence, expressions like "4:2:2 YUV" always refer to 4:2:2 Y'CbCr, since there simply is no such thing as 4:x:x in analog encoding (such as YUV). Pixel formats used in Y'CbCr can be referred to as YUV too, for example yuv420p, yuvj420p and many others.


In a similar vein, the term luminance and the symbol Y are often used erroneously to refer to luma, which is denoted with the symbol Y'. The ''luma'' (Y') of video engineering deviates from the ''luminance'' (Y) of color science (as defined by [[International Commission on Illumination|CIE]]). Luma is formed as the weighted sum of ''gamma-corrected'' (tristimulus) RGB components. Luminance is formed as a weighted sum of ''linear'' (tristimulus) RGB components. In practice, the [[International Commission on Illumination|CIE]] symbol Y is often incorrectly used to denote luma. In 1993, [[SMPTE]] adopted Engineering Guideline EG&nbsp;28, clarifying the two terms. The prime symbol ' is used to indicate gamma correction.<ref>{{cite book |title=Annotated Glossary of Essential Terms for Electronic Production |url=https://ieeexplore.ieee.org/document/7291332 |doi=10.5594/SMPTE.EG28.1993 |quote=luma: To avoid the interdisciplinary confusion resulting from the two distinct definitions of luminance, it has been proposed that the video documents use luma for luminance, television (i.e., the luminance signal), and chroma for chrominance television (i.e., the chrominance signal) |isbn=978-1-61482-022-2 }}</ref>


Similarly, the chroma of video engineering differs from the chrominance of color science. The chroma of video engineering is formed from weighted tristimulus components (gamma corrected, OETF), not linear components. In video engineering practice, the terms ''chroma'', ''chrominance'', and ''saturation'' are often used interchangeably to refer to chroma, but it is not a good practice, as ITU-T Rec H.273 says.<ref name="H273">{{cite web |date=2016 |title=H.273 : Coding-independent code points for video signal type identification |url=https://www.itu.int/rec/T-REC-H.273-201612-S/en |website=www.itu.int |quote=NOTE – The term chroma is used rather than the term chrominance in order to avoid the implication of the use of linear light transfer characteristics that is often associated with the term chrominance. [...] NOTE – The term luma is used rather than the term luminance in order to avoid the implication of the use of linear light transfer characteristics that is often associated with the term luminance. The symbol L is sometimes used instead of the symbol Y to avoid confusion with the symbol y as used for vertical location.}}</ref>


==History==
{{Unreferenced section|date=July 2022}}
Chroma subsampling was developed in the 1950s by [[Alda Bedford]] during the development of color television by [[RCA]], which developed into the [[NTSC]] standard; luma–chroma separation was developed earlier, in 1938 by [[Georges Valensi]]. Through studies{{Which?|date=July 2022}}, Bedford showed that the human eye has high resolution only for black and white, somewhat less for "mid-range" colors like yellows and greens, and much less for colors at the ends of the spectrum, reds and blues.{{Clarify|reason=High resolution in what domain?|date=July 2022}} This knowledge allowed RCA to develop a system in which they discarded most of the blue signal after it comes from the camera, keeping most of the green and only some of the red; this is chroma subsampling in the [[YIQ]] color space and is roughly analogous to 4:2:1 subsampling, in that it has decreasing resolution for luma, yellow/green, and red/blue.


==See also==
* [[Color]]
* [[Color space]]
* [[Color vision]]
** [[Rod cell]]
** [[Cone cell]]
* [[Digital video]]
* [[High-definition television]]
* [[Multiple sub-Nyquist sampling encoding]]
* [[Rec. 601]] 4:2:2 [[SDTV]]
* [[SMPTE]] – Society of Motion Picture and Television Engineers
* [[YCbCr]]
* [[YJK]]
* [[YPbPr]]
* [[YUV]]


==References==
{{notelist}}
{{reflist}}
* Poynton, Charles. "Digital Video and HDTV: Algorithms and Interfaces". U.S.: Morgan Kaufmann Publishers, 2003.

Latest revision as of 04:17, 26 May 2024

Widely used chroma subsampling formats

Chroma subsampling is the practice of encoding images by implementing less resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance.[1]

It is used in many video and still image encoding schemes – both analog and digital – including in JPEG encoding.

Rationale[edit]

In full size, this image shows the difference between four subsampling schemes. Note how similar the color images appear. The lower row shows the resolution of the color information.

Digital signals are often compressed to reduce file size and save transmission time. Since the human visual system is much more sensitive to variations in brightness than color, a video system can be optimized by devoting more bandwidth to the luma component (usually denoted Y'), than to the color difference components Cb and Cr. In compressed images, for example, the 4:2:2 Y'CbCr scheme requires two-thirds the bandwidth of non-subsampled "4:4:4" R'G'B'.[a] This reduction results in almost no visual difference as perceived by the viewer.

How subsampling works[edit]

The human vision system (HVS) processes color information (hue and colorfulness) at about a third of the resolution of luminance (lightness/darkness information in an image). Therefore it is possible to sample color information at a lower resolution while maintaining good image quality.

This is achieved by encoding RGB image data into a composite black and white image, with separated color difference data (chroma). For example with , gamma encoded components are weighted and then summed together to create the luma component. The color difference components are created by subtracting two of the weighted components from the third. A variety of filtering methods can be used to limit the resolution.

Regarding gamma and transfer functions[edit]

Gamma encoded luma should not be confused with linear luminance . The presence of gamma encoding is denoted with the prime symbol . In very early video systems, gamma-correction was necessary due to the nonlinear response of a cathode-ray tube (CRT).

While CRTs are no longer widely used, gamma or electro-optical transfer curves (EOTF), are still very useful due to the nonlinear response of human vision. The use of gamma improves perceived signal-to-noise in analogue systems, and allows for more efficient data encoding in digital systems. This encoding uses more levels for darker colors than for lighter ones, accommodating human vision sensitivity.[2]

Sampling systems and ratios[edit]

The subsampling scheme is commonly expressed as a three-part ratio J:a:b (e.g. 4:2:2) or four parts, if alpha channel is present (e.g. 4:2:2:4), that describe the number of luminance and chrominance samples in a conceptual region that is J pixels wide and 2 pixels high. The parts are (in their respective order):

  • J: horizontal sampling reference (width of the conceptual region). Usually, 4.
  • a: number of chrominance samples (Cr, Cb) in the first row of J pixels.
  • b: number of changes of chrominance samples (Cr, Cb) between first and second row of J pixels. b has to be either zero or equal to a (except in rare irregular cases like 4:4:1 and 4:2:1, which do not follow this convention).
  • Alpha: horizontal factor (relative to first digit). May be omitted if alpha component is not present, and is equal to J when present.

This notation is not valid for all combinations and has exceptions, e.g. 4:1:0 (where the height of the region is not 2 pixels, but 4 pixels, so if 8 bits per component are used, the media would be 9 bits per pixel) and 4:2:1.

4:1:1 4:2:0 4:2:2 4:4:0 4:4:4
Y'CrCb  
 
= = = = =
Y'  
 
+ + + + +
1 2 3 4 J = 4 1 2 3 4 J = 4 1 2 3 4 J = 4 1 2 3 4 J = 4 1 2 3 4 J = 4
(Cr, Cb) 1 a 1 1 2 a 2 1 2 a 2 1 2 3 4 a 4 1 2 3 4 a 4
1 b 1 b 0 1 2 b 2 b 0 1 2 3 4 b 4
¼ horizontal resolution,
full vertical resolution
½ horizontal resolution,
½ vertical resolution
½ horizontal resolution,
full vertical resolution
full horizontal resolution,
½ vertical resolution
full horizontal resolution,
full vertical resolution

The mapping examples given are only theoretical and for illustration. Also the diagram does not indicate any chroma filtering, which should be applied to avoid aliasing. To calculate required bandwidth factor relative to 4:4:4 (or 4:4:4:4), one needs to sum all the factors and divide the result by 12 (or 16, if alpha is present).

Types of sampling and subsampling[edit]

4:4:4[edit]

Each of the three Y'CbCr components has the same sample rate, thus there is no chroma subsampling. This scheme is sometimes used in high-end film scanners and cinematic post-production.

"4:4:4" may instead be wrongly referring to R'G'B' color space, which implicitly also does not have any chroma subsampling (except in JPEG R'G'B' can be subsampled). Formats such as HDCAM SR can record 4:4:4 R'G'B' over dual-link HD-SDI.

4:2:2[edit]

The two chroma components are sampled at half the horizontal sample rate of luma: the horizontal chroma resolution is halved. This reduces the bandwidth of an uncompressed video signal by one-third, which means for 8 bit per component without alpha (24 bit per pixel) only 16 bits are enough, as in NV16.

Many high-end digital video formats and interfaces use this scheme:

4:1:1[edit]

In 4:1:1 chroma subsampling, the horizontal color resolution is quartered, and the bandwidth is halved compared to no chroma subsampling. Initially, 4:1:1 chroma subsampling of the DV format was not considered to be broadcast quality and was only acceptable for low-end and consumer applications.[3][4] However, DV-based formats (some of which use 4:1:1 chroma subsampling) have been used professionally in electronic news gathering and in playout servers. DV has also been sporadically used in feature films and in digital cinematography.

In the 480i "NTSC" system, if the luma is sampled at 13.5 MHz, then this means that the Cr and Cb signals will each be sampled at 3.375 MHz, which corresponds to a maximum Nyquist bandwidth of 1.6875 MHz, whereas traditional "high-end broadcast analog NTSC encoder" would have a Nyquist bandwidth of 1.5 MHz and 0.5 MHz for the I/Q channels. However, in most equipment, especially cheap TV sets and VHS/Betamax VCRs, the chroma channels have only the 0.5 MHz bandwidth for both Cr and Cb (or equivalently for I/Q). Thus the DV system actually provides a superior color bandwidth compared to the best composite analog specifications for NTSC, despite having only 1/4 of the chroma bandwidth of a "full" digital signal.

Formats that use 4:1:1 chroma subsampling include:

4:2:0[edit]

In 4:2:0, the horizontal sampling is doubled compared to 4:1:1, but as the Cb and Cr channels are only sampled on each alternate line in this scheme, the vertical resolution is halved. The data rate is thus the same. This fits reasonably well with the PAL color encoding system, since this has only half the vertical chrominance resolution of NTSC. It would also fit extremely well with the SECAM color encoding system, since like that format, 4:2:0 only stores and transmits one color channel per line (the other channel being recovered from the previous line). However, little equipment has actually been produced that outputs a SECAM analogue video signal. In general, SECAM territories either have to use a PAL-capable display or a transcoder to convert the PAL signal to SECAM for display.

Different variants of 4:2:0 chroma configurations are found in:

Cb and Cr are each subsampled at a factor of 2 both horizontally and vertically. Most digital video formats corresponding to 576i "PAL" use 4:2:0 chroma subsampling.

Sampling positions[edit]

There are four main variants of 4:2:0 schemes, having different horizontal and vertical sampling siting relative to the 2×2 "square" of the original input size.[15]

  • In MPEG-2, MPEG-4, and AVC, Cb and Cr are taken on midpoint of the left-edge of the 2×2 square. In other words, they have the same horizontal location as the top-left pixel, but is shifted one-half pixel down vertically. Also called "left".[16]
  • In JPEG/JFIF, H.261, and MPEG-1, Cb and Cr are taken at the center of 2×2 the square. In other words, they are offset one-half pixel to the right and one-half pixel down compared to the top-left pixel. Also called "center".[16]
  • In HEVC for BT.2020 and BT.2100 content (in particular on Blu-rays), Cb and Cr are sampled at the same location as the group's top-left Y pixel ("co-sited", "co-located"). Also called "top-left". An analogous co-sited sampling is used in MPEG-2 4:2:2.[16]
  • In 4:2:0 PAL-DV (IEC 61834-2), Cb is sampled at the same location as the group's top-left Y pixel, but Cr is sampled one pixel down.[17] It is also called "top-left" in ffmpeg.[16]
Interlaced and progressive[edit]

With interlaced material, 4:2:0 chroma subsampling can result in motion artifacts if it is implemented the same way as for progressive material. The luma samples are derived from separate time intervals, while the chroma samples would be derived from both time intervals. It is this difference that can result in motion artifacts. The MPEG-2 standard allows for an alternate interlaced sampling scheme, where 4:2:0 is applied to each field (not both fields at once). This solves the problem of motion artifacts, reduces the vertical chroma resolution by half, and can introduce comb-like artifacts in the image.


Original. This image shows a single field. The moving text has some motion blur applied to it.


4:2:0 progressive sampling applied to moving interlaced material. The chroma leads and trails the moving text. This image shows a single field.


4:2:0 interlaced sampling applied to moving interlaced material. This image shows a single field.

In the 4:2:0 interlaced scheme, however, vertical resolution of the chroma is roughly halved, since the chroma samples effectively describe an area 2 samples wide by 4 samples tall instead of 2×2. As well, the spatial displacement between both fields can result in the appearance of comb-like chroma artifacts.


Original still image.


4:2:0 progressive sampling applied to a still image. Both fields are shown.


4:2:0 interlaced sampling applied to a still image. Both fields are shown.

If the interlaced material is to be de-interlaced, the comb-like chroma artifacts (from 4:2:0 interlaced sampling) can be removed by blurring the chroma vertically.[18]

4:1:0[edit]

This ratio is possible, and some codecs support it, but it is not widely used. This ratio uses half of the vertical and one-fourth the horizontal color resolutions, with only one-eighth of the bandwidth of the maximum color resolutions used. Uncompressed video in this format with 8-bit quantization uses 10 bytes for every macropixel (which is 4×2 pixels) or 10 bit for every pixel. It has the equivalent chrominance bandwidth of a PAL-I or PAL-M signal decoded with a delay line decoder, and still very much superior to NTSC.

3:1:1[edit]

Used by Sony in their HDCAM High Definition recorders (not HDCAM SR). In the horizontal dimension, luma is sampled horizontally at three quarters of the full HD sampling rate – 1440 samples per row instead of 1920. Chroma is sampled at 480 samples per row, a third of the luma sampling rate. In the vertical dimension, both luma and chroma are sampled at the full HD sampling rate (1080 samples vertically).

Different Cb and Cr rates

A number of legacy schemes allow different subsampling factors for Cb and Cr, similar to how broadcast systems such as CCIR System M allocate different amounts of bandwidth to the two chroma components. These schemes are not expressible in J:a:b notation; instead, they adopt a Y:Cb:Cr notation, with each part describing the relative resolution of the corresponding component. Whether the resolution reduction happens in the horizontal or the vertical direction is unspecified.

  • In JPEG, 4:4:2 and 4:2:1 halve the vertical resolution of Cb compared to 4:4:4 and 4:4:0.[19]
  • In another version of 4:2:1, Cb horizontal resolution is half that of Cr (and a quarter of the horizontal resolution of Y).
  • 4:1:0.5 or 4:1:0.25 are variants of 4:1:0 with reduced horizontal resolution on Cb, similar to VHS quality.

Artifacts

Original image without color subsampling. 200% zoom.
Image after color subsampling (Sony Vegas DV codec, box filtering).
Note the bleeding in lightness near the borders.

Chroma subsampling suffers from two main types of artifacts, both of which degrade the image more than the resolution loss alone would suggest wherever colors change abruptly.

Gamma luminance error

Gamma-corrected signals such as Y'CbCr have an issue where chroma errors "bleed" into luma. In these signals, a lowered chroma actually makes a color appear less bright than one with equivalent luma. Consequently, when a saturated color blends with an unsaturated or complementary color, a loss of luminance occurs at the border, as can be seen in the example between magenta and green.[20] The issue persists in HDR video, where gamma is generalized into a transfer function (EOTF); the steeper the EOTF, the stronger the luminance loss.[21]
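
The effect can be reproduced numerically. The following Python/NumPy sketch (an illustrative simplification, not from the cited sources: BT.709 luma coefficients, a pure power-law gamma of 2.2, and a plain two-pixel average standing in for a real chroma filter) averages the chroma of a magenta pixel and a green pixel and compares the linear-light luminance before and after:

    import numpy as np

    KR, KG, KB = 0.2126, 0.7152, 0.0722   # BT.709 luma coefficients

    def luminance(rgb_gamma):
        """Linear-light luminance of gamma-encoded R'G'B' (power-law 2.2)."""
        lin = rgb_gamma ** 2.2
        return KR * lin[..., 0] + KG * lin[..., 1] + KB * lin[..., 2]

    def rgb_to_ycbcr(rgb):
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = KR * r + KG * g + KB * b
        return y, (b - y) / 1.8556, (r - y) / 1.5748

    def ycbcr_to_rgb(y, cb, cr):
        r = y + 1.5748 * cr
        b = y + 1.8556 * cb
        g = (y - KR * r - KB * b) / KG
        return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)

    # Two neighbouring pixels across a magenta/green border (gamma-encoded R'G'B').
    row = np.array([[1.0, 0.0, 1.0],    # magenta
                    [0.0, 1.0, 0.0]])   # green

    y, cb, cr = rgb_to_ycbcr(row)
    # Horizontal chroma subsampling: one averaged Cb/Cr pair serves both pixels.
    rec = ycbcr_to_rgb(y, np.full_like(cb, cb.mean()), np.full_like(cr, cr.mean()))

    print(luminance(row))  # ~[0.28, 0.72]
    print(luminance(rec))  # ~[0.06, 0.48] -- both pixels come out darker
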

Some proposed corrections of this issue are:

  • Luma-weighted average (Kornelski, experiment for mozjpeg)[22]
  • Iterative sharp YUV method, used by WebP and optionally AVIF. Sharp YUV assumes a bilinear upscaling for chroma.[23]
  • RGB subsampling in linear space before chroma subsampling (HDRTools)[21]
  • Iterative or closed-form luma correction to minimize luminance error (HDRTools)[24]

Rec. 2020 defines a "constant luminance" Yc'CbcCrc, which is calculated from linear RGB components and then gamma-encoded. This version does not suffer from the luminance loss by design.[25]

Gamut clipping

Another artifact that can occur with chroma subsampling is that out-of-gamut colors can appear upon chroma reconstruction. Suppose the image consists of alternating one-pixel red and black lines and the subsampling omits the chroma for the black pixels. Chroma from the red pixels is then reconstructed onto the black pixels, giving the new pixels a positive red value and negative green and blue values. Displays cannot output negative light, so these negative values are effectively clipped and the resulting luma value ends up too high; see the sketch below. Other subsampling filters (especially the averaging "box" filter) have a similar issue, although it is harder to construct a simple example for them. Similar artifacts arise in the less artificial case of a gradation near a fairly sharp red/black boundary.[20]
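
A crude numerical version of the red/black example (a sketch with the same BT.709 assumptions as above and a deliberately naive subsampler that simply reuses the red pixel's chroma for its black neighbour) makes the out-of-gamut values visible:

    import numpy as np

    KR, KG, KB = 0.2126, 0.7152, 0.0722   # BT.709 luma coefficients

    # Gamma-encoded R'G'B' of one red pixel and one black pixel.
    red   = np.array([1.0, 0.0, 0.0])
    black = np.array([0.0, 0.0, 0.0])

    def to_ycbcr(rgb):
        y = KR * rgb[0] + KG * rgb[1] + KB * rgb[2]
        return y, (rgb[2] - y) / 1.8556, (rgb[0] - y) / 1.5748

    y_red, cb_red, cr_red = to_ycbcr(red)
    y_black, _, _ = to_ycbcr(black)

    # Naive subsampling drops the black pixel's chroma; on reconstruction the
    # black pixel inherits the red pixel's Cb/Cr but keeps its own luma of 0.
    r = y_black + 1.5748 * cr_red
    b = y_black + 1.8556 * cb_red
    g = (y_black - KR * r - KB * b) / KG
    print(r, g, b)   # ~0.79, ~-0.21, ~-0.21: green and blue are negative

    # A display clips the negative components, so the pixel actually shown is
    # roughly (0.79, 0, 0), whose luma is about 0.17 instead of the encoded 0.
    print(KR * max(r, 0.0) + KG * max(g, 0.0) + KB * max(b, 0.0))
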

The decoder can deal with out-of-gamut colors by considering how much chroma a given luma value can hold and distributing the chroma into the 4:4:4 intermediate accordingly, a technique Glenn Chan terms "in-range chroma reconstruction". His "proportion" method is similar in spirit to Kornelski's luma-weighted average, while the "spill" method resembles error diffusion.[20] Improving chroma reconstruction remains an active field of research.[26]

Terminology

The term Y'UV refers to an analog TV encoding scheme (ITU-R Rec. BT.470), while Y'CbCr refers to a digital encoding scheme.[2] One difference between the two is that the scale factors on the chroma components (U and V versus Cb and Cr) differ. However, the term YUV is often used erroneously to refer to Y'CbCr encoding; hence, expressions like "4:2:2 YUV" always refer to 4:2:2 Y'CbCr, since there is simply no such thing as 4:x:x in analog encoding (such as YUV). Pixel formats used in Y'CbCr are often referred to as YUV as well, for example yuv420p, yuvj420p and many others.

In a similar vein, the term luminance and the symbol Y are often used erroneously to refer to luma, which is denoted with the symbol Y'. The luma (Y') of video engineering deviates from the luminance (Y) of color science (as defined by the CIE): luma is formed as a weighted sum of gamma-corrected (tristimulus) RGB components, whereas luminance is formed as a weighted sum of linear (tristimulus) RGB components. In practice, the CIE symbol Y is often incorrectly used to denote luma. In 1993, SMPTE adopted Engineering Guideline EG 28, clarifying the two terms. The prime symbol ' is used to indicate gamma correction.[27]

Similarly, the chroma of video engineering differs from the chrominance of color science: it is formed from weighted gamma-corrected (OETF-encoded) tristimulus components, not linear ones. In video engineering practice, the terms chroma, chrominance, and saturation are often used interchangeably, but as ITU-T Rec. H.273 notes, this is not good practice.[28]

History

Chroma subsampling was developed in the 1950s by Alda Bedford for the development of color television by RCA, whose system developed into the NTSC standard; luma–chroma separation had been developed earlier, in 1938, by Georges Valensi. Through studies[which?], Bedford showed that the human eye has high resolution only for black and white, somewhat less for "mid-range" colors like yellows and greens, and much less for colors at the ends of the spectrum, reds and blues.[clarification needed] This knowledge allowed RCA to design a system that discarded most of the blue signal after it came from the camera, keeping most of the green and only some of the red; this is chroma subsampling in the YIQ color space and is roughly analogous to 4:2:1 subsampling, in that it has decreasing resolution for luma, yellow/green, and red/blue.

See also

References

  1. ^ The prime sign indicates gamma correction or any non-linear EOTF.
  1. ^ S. Winkler, C. J. van den Branden Lambrecht, and M. Kunt (2001). "Vision and Video: Models and Applications". In Christian J. van den Branden Lambrecht (ed.). Vision models and applications to image and video processing. Springer. p. 209. ISBN 978-0-7923-7422-0.
  2. ^ a b Poynton, Charles. "YUV and luminance considered harmful: A plea for precise terminology in video".
  3. ^ Jennings, Roger; Bertel Schmitt (1997). "DV vs. Betacam SP". DV Central. Archived from the original on 2008-07-02. Retrieved 2008-08-29.
  4. ^ Wilt, Adam J. (2006). "DV, DVCAM & DVCPRO Formats". adamwilt.com. Retrieved 2008-08-29.
  5. ^ Clint DeBoer (2008-04-16). "HDMI Enhanced Black Levels, xvYCC and RGB". Audioholics. Retrieved 2013-06-02.
  6. ^ "Digital Color Coding" (PDF). Telairity. Archived from the original (PDF) on 2014-01-07. Retrieved 2013-06-02.
  7. ^ MSX Licensing Corporation (2022). "The YJK screen modes". MSX Assembly Page.
  8. ^ Niemietz, Ricardo Cancho (2014). Issues on YJK colour model implemented in Yamaha V9958 VDP chip (PDF).
  9. ^ "VCFe Vortrag vom 2016.04.30 – Homecomputer und Spielkonsolen – Videoarchitekturen als visuelles Medium". neil.franklin.ch. Retrieved 2022-11-13.
  10. ^ IC Master. United Technical Publications. 2001.
  11. ^ Martín Sesma, Sergio (2016-10-03). Arqueología informática: los ordenadores MSX en los inicios de la microinformática doméstica (Proyecto/Trabajo fin de carrera/grado thesis). Universitat Politècnica de València.
  12. ^ Redazione (2008-10-20). "MSX – Vari Costruttori- 1983". CyberLudus.com (in Italian). Retrieved 2022-11-13.
  13. ^ "V9958 MSX-VIDEO TECHNICAL DATA BOOK" (PDF). 1988.
  14. ^ Alex, Wulms (1995). "Schermen op MSX – De 2+ schermen" (PDF). MSX Computer & Club Magazine (72).
  15. ^ Poynton, Charles (2008). "Chroma Subsampling Notation" (PDF). Poynton.com. Retrieved 2008-10-01.
  16. ^ a b c d enum AvChromaLocation, ffmpeg 3.1.
  17. ^ "y4minput.c - webm/libvpx - Git at Google". chromium.googlesource.com. 420paldv chroma samples are sited like:
  18. ^ Munsil, Don; Stacey Spears (2003). "DVD Player Benchmark – Chroma Upsampling Error". Secrets of Home Theater and High Fidelity. Archived from the original on 2008-06-06. Retrieved 2008-08-29.
  19. ^ "Support decoding yuv442 and yuv421 jpeg images. · FFmpeg/FFmpeg@387d860". GitHub.
  20. ^ a b c Chan, Glenn (May 2008). "Toward Better Chroma Subsampling: Recipient of the 2007 SMPTE Student Paper Award". SMPTE Motion Imaging Journal. 117 (4): 39–45. doi:10.5594/J15100.
  21. ^ a b Larbier, Pierre (October 2015). "High Dynamic Range: Compression Challenges". SMPTE 2015 Annual Technical Conference and Exhibition: 1–15. doi:10.5594/M001639. ISBN 978-1-61482-956-0.
  22. ^ "Gamma-correct chroma subsampling · Issue #193 · mozilla/mozjpeg". GitHub.
  23. ^ "WebP: sharpyuv/sharpyuv.h | Fossies". fossies.org. Assumes that the image will be upsampled using a bilinear filter. If nearest neighbor is used instead, the upsampled image might look worse than with standard downsampling.
  24. ^ Norkin, Andrey (27 September 2016). HDR color conversion with varying distortion metrics (PDF). SPIE Optical Engineering + Applications, 2016. pp. 99710E. doi:10.1117/12.2237040.
  25. ^ "BT.2020: Parameter values for ultra-high definition television systems for production and international programme exchange". International Telecommunication Union. 2014-07-17. Retrieved 2014-08-31.
  26. ^ Chung, Kuo-Liang; Liang, Yan-Cheng; Wang, Ching-Sheng (March 2019). "Effective Content-Aware Chroma Reconstruction Method for Screen Content Images". IEEE Transactions on Image Processing. 28 (3): 1108–1117. Bibcode:2019ITIP...28.1108C. doi:10.1109/TIP.2018.2875340. PMID 30307864. S2CID 52964340.
  27. ^ Annotated Glossary of Essential Terms for Electronic Production. doi:10.5594/SMPTE.EG28.1993. ISBN 978-1-61482-022-2. luma: To avoid the interdisciplinary confusion resulting from the two distinct definitions of luminance, it has been proposed that the video documents use luma for luminance, television (i.e., the luminance signal), and chroma for chrominance television (i.e., the chrominance signal)
  28. ^ "H.273 : Coding-independent code points for video signal type identification". www.itu.int. 2016. NOTE – The term chroma is used rather than the term chrominance in order to avoid the implication of the use of linear light transfer characteristics that is often associated with the term chrominance. [...] NOTE – The term luma is used rather than the term luminance in order to avoid the implication of the use of linear light transfer characteristics that is often associated with the term luminance. The symbol L is sometimes used instead of the symbol Y to avoid confusion with the symbol y as used for vertical location.

External links