H.264 Encoding Tips
--1-pass vs 2-pass--
I recommend 2-pass mode for any encode that must reach a specific file size. 1-pass mode is strictly for those who are severely short on time or do not expect high-quality output. It is sometimes used to produce streaming content (see the streaming section below) or constant-quality content (see the Constant Quality section below).
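When aiming for a file size, the target video bitrate follows from the size, the running time, and the audio bitrate. Here is a minimal sketch of that arithmetic; the function name and parameters are my own illustration and are not part of any encoder. It also ignores container overhead, so treat the result as an upper bound.

```python
def video_bitrate_kbps(target_mib, duration_s, audio_kbps=0.0):
    """Average video bitrate (kbit/s) that fits target_mib over duration_s.

    Assumes 1 MiB = 1024 * 1024 bytes and 1 kbit = 1000 bits.
    Container overhead is ignored.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    total_bits = target_mib * 1024 * 1024 * 8
    total_kbps = total_bits / 1000 / duration_s
    video_kbps = total_kbps - audio_kbps
    if video_kbps <= 0:
        raise ValueError("audio alone exceeds the target size")
    return video_kbps


# Example: a 2-hour film on one 700 MiB CD with 128 kbit/s audio.
print(round(video_bitrate_kbps(700, 2 * 60 * 60, audio_kbps=128), 1))
```

Feed the resulting figure into the bitrate field of a 2-pass encode.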
--Constant Quality--
If you wish your video to maintain a specific, constant quality the whole way through, use only the Constant Quality feature of 1-pass mode.
<> Do not use a quantizer of under 15 unless you are working for archive/reproduction quality.
<> Also, do not use anything over 40: the quality is simply unbearable unless you are encoding from an extremely sharp source of high-contrast edges and plan on streaming it over the internet.
<> A good bet for most people interested in high quality video would be the range of 22-30 (or, more specifically, 24-28). Of course, this varies depending on individual taste and how much free space you have.
<> On animated content with few detailed textures, consider using a higher quantizer.
<> On "real-life" content, especially that with many dark scenes and important subtle textures, consider using a much lower quantizer value.
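The quantizer guidelines above can be written down as a small lookup. This is only a sketch of my rules of thumb: the function name and the wording of the messages are made up for illustration, and the 0-51 bound is the H.264 quantizer range.

```python
def quantizer_advice(q):
    """Classify a constant-quality quantizer against the guidelines above."""
    if not 0 <= q <= 51:
        raise ValueError("H.264 quantizers run from 0 to 51")
    if q < 15:
        return "archive/reproduction quality only"
    if q > 40:
        return "too rough for anything but sharp, high-contrast streaming content"
    if 24 <= q <= 28:
        return "sweet spot for high-quality video"
    if 22 <= q <= 30:
        return "good range for high-quality video"
    return "usable, but outside the recommended range"


for q in (12, 26, 30, 35, 45):
    print(q, "->", quantizer_advice(q))
```

Shift your choice upward for animation with few detailed textures, and downward for dark, finely textured live-action material.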
--Encoding: High Speed, High Quality--
Best mix of encoding quality and encoding speed:
These are the recommended settings which produce the maximum quality AVC encodes while maximizing encoding speed.
<> 4-5 references are the typically accepted maximum. However, if you have a little bit of extra time on your hands (or a beefy computer), consider using up to 8 references (which decode just about as fast as 5, according to my Pocket PC AVC benchmarks).
<> I very strongly urge you to use 2-4 consecutive B-frames*, but only with content with a framerate over 23.976fps (I typically use 0-1 on 12-15fps animation or Very-Extremely-Low bitrate streaming content).
<> B-frame reduction (if you choose to use B-frames) should be adjusted according to the desired datarate. For most DVD backups, the default B-frame reduction of 30% seems to be sufficient. However, with certain animated medium bitrate content, I suggest use of up to 50% B-frame reduction.
<> For the bitrate variability field, see the "Bitrate Variability" section below.
<> I keep sub-pel refinement on "6 RD-oh!"; the quality increase is very noticeable and well worth the mere seconds lost in the overall encoding job.
<> I check every single one of the checkboxes (for High Profile output) or all but the 8x8 transform (for MP-compliant streams). You may want to consider un-checking chroma estimation for non-animated source, but I find the quality is always helped by it (controversial issue). Of course, you can disable B-frame search if you disabled B-frames, though it makes no difference to the codec.
<> Don't ever use CAVLC... My benchmarks show that decoding CAVLC streams is only slightly faster than decoding CABAC streams, and compression is drastically reduced by its use. In other words, always keep CABAC checked.
<> Personally, I find a keyframe threshold of 45% and a keyframe QP boost of 20 to be optimal for quality at medium bitrates. However, if you wish to produce Ultra-Low-Bitrate video (DVD backups smaller than one CD), consider reducing the boost to 0, since your video's quality (PSNR) per MB will increase.
<> I suggest hexagonal motion estimation if encoding speed is desired. It provides sufficient quality for its speed. However, if you have a particularly complex source (or if you really want to squeeze out the best possible quality), try Uneven Multi-Hexagon with an ME range of 16-32, depending on the framerate of the source. Generally, lower framerates (12-15 as opposed to 23.976-29.97) need a higher estimation range.
<> Consult the Deblocking Guide near the end of this guide for the best deblocking settings.
I find that this method delivers the best possible quality and filesize while achieving fast encoding speed.
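If you drive x264 from the command line instead of the GUI, the core of the list above can be sketched as a small preset. The option names `--ref`, `--bframes`, `--subme` and `--me` are x264 command-line options, but how each GUI field maps onto them is my assumption, and the preset name, helper function and file names are hypothetical.

```python
# Hypothetical preset mirroring the speed/quality list above.
# The mapping from GUI fields to these options is an assumption.
FAST_HQ = {
    "ref": 5,       # 4-5 reference frames
    "bframes": 3,   # 2-4 consecutive B-frames
    "subme": 6,     # sub-pel refinement level 6
    "me": "hex",    # hexagonal motion estimation
}


def x264_args(settings, source, output):
    """Turn a settings dict into an x264 argument list."""
    args = ["x264"]
    for name, value in settings.items():
        args += ["--" + name, str(value)]
    args += ["-o", output, source]
    return args


# Placeholder file names; pass the list to subprocess.run() to start the encode.
print(" ".join(x264_args(FAST_HQ, "input.y4m", "output.264")))
```

Check the options against the help output of your own x264 build before relying on them, since they have changed between revisions.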
--Encoding: Absolute Max Quality--
These are the recommended settings which produce the maximum quality AVC encodes regardless of encoding speed.
Best Possible Quality (regardless of encoding speed)
<> Use 16 reference frames (the max allowed). The compression quality will be increased slightly at the expense of some encoding time.
<> Use 2-3 max consecutive B-frames* with 30% B-frame size reduction for medium-high bitrates (2-CD backups). Use 1 B-frame with 40% reduction on animated or low bitrate (1-CD backups or less) content. Use 0 B-frames if you are using Very Extremely Low streaming bitrates. Also, use 0 B-frames if you wish to encode archive quality material.
<> Enable all motion searches and CABAC for HP-compliant content (in other words, check all the checkboxes on the lower half of x264's GUI) OR all but the 8x8 DCT for MP-compliant content. Although HP content technically offers significantly better compression quality, it is not yet supported in many multimedia applications.
<> Sub-pel refinement should be set to 6 or 7 (CLI), the maximum.
<> For maximum-quality content, I find a keyframe threshold of 35% and a keyframe QP boost of either 30 or 0 to be optimal for quality at medium-high bitrates.
<> Use Exhaustive motion estimation with a range of 16 (for pretty darn slow encoding and exemplary quality) or 32 (for painfully slow encoding but highest possible quality). Do not set this value higher than 32 or you will risk losing quality. Consider 32 to be the peak of the quality mountain, so to speak.
<> For the bitrate variability field, see the "Bitrate Variability" section below.
<> Consult the Deblocking Guide near the end of this guide for the best deblocking settings.
<> Take advantage of the newly released custom H.264 quantization matrices. These nifty little numerical spreadsheets control how and where the bitrate is spent in each frame. The matrices are beginning to see some circulation, but they are still being tweaked to provide the best quality possible. The x264 codec currently supports this feature (rev. 365).
*B-frames can be activated in "pyramid" mode, which allows B-frames to serve as references. If you wish to use a lot of references (and thereby increase quality slightly), consider selecting the "Use as references" checkbox. However, this option may cause some crashes with both the encoder and the decoder (if either is old enough). For this reason, I recommend NOT using B-frame references.
Again, these are the best possible settings for optimal performance in the quality-only range.
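The B-frame rules from this section boil down to a short decision function. The sketch below uses hypothetical names (`bframe_settings`, the bitrate class labels) and picks the upper end where I gave a range; it only encodes my recommendations and is not anything the codec does itself.

```python
def bframe_settings(bitrate_class, animated=False, archive=False):
    """Return (max consecutive B-frames, B-frame reduction in percent).

    bitrate_class is one of "very_low" (streaming), "low" (1 CD or less)
    or "medium_high" (2-CD backups). The reduction is None when B-frames
    are disabled, because it no longer applies.
    """
    if bitrate_class not in ("very_low", "low", "medium_high"):
        raise ValueError("unknown bitrate class: %r" % (bitrate_class,))
    if archive or bitrate_class == "very_low":
        return (0, None)
    if animated or bitrate_class == "low":
        return (1, 40)
    return (3, 30)  # upper end of the suggested 2-3 B-frames


print(bframe_settings("medium_high"))
print(bframe_settings("medium_high", animated=True))
print(bframe_settings("very_low"))
```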
--Bitrate Variability--
The bitrate variability feature controls how much your datarate can fluctuate at any given time. That is, if you plan on encoding at 500kbps and set this value to 40, the maximum bitrate given to more complex scenes is 700kbps. In a nutshell, the lower you set this value, the better still/non-complex scenes look, but high-motion/complex scenes will look more shabby and garbled. The higher you set this value, the more equal the overall quality will become: still scenes would look worse than with a low value, and high-motion/complex scenes would look a lot better. Overall, it comes down to this: if your source content is an extremely high-paced surge of complex/fast motion (the background moves a lot, there are many scene changes, colors change a lot, high-contrast particles are always flying, etc.), make the value higher. Otherwise, leave it be, or lower it if your source is...calmer.
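The 500kbps example works out as a simple percentage on top of the target. A minimal sketch of that reading follows; the function name is my own, and the encoder's internal rate control may treat the value differently from this back-of-the-envelope figure.

```python
def peak_bitrate_kbps(target_kbps, variability_percent):
    """Peak bitrate for complex scenes, reading variability as a percentage."""
    if variability_percent < 0:
        raise ValueError("variability_percent must not be negative")
    return target_kbps * (100 + variability_percent) / 100


print(peak_bitrate_kbps(500, 40))  # 700.0
```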
---Deblocking Guide---
The deblocker (also known as the "in-loop filter") is an interesting feature of H.264 that was previously unavailable in the older MPEG-4 ASP standard. Such filtering is very useful for eliminating blocks, as its name suggests, but it is also very controversial because it can lead to some pretty bad misuse as well. When enabled (as it always should be, in my opinion, except in lossless, quantizer=0 encoding), it is typically centered at 0, with a threshold of 0.
Think of it as a clothes washing service. The deblocking threshold determines how much of the material needs to be "washed," while the deblocking strength determines how strong a "washing" effect is needed to eliminate the blocks, or "stains." Naturally, if you don't wash (deblock) enough of the material, there will still be some stains (blocks) left, away from the washed area. Also, if you don't wash (deblock) hard or strong enough, the stains (blocks) simply won't fade/disappear. However, if you wash too much or scrub too hard, you'll ruin the material, because instead of stains, there will be an ugly lack of color where all the texture and detail were rubbed out.
Both bars are initially set to 0 for a reason: this is the standard deblocking that will lead to the most accurate mix of deblocking while maintaining detail. If, however, you find the result unsatisfying, look to these tips:
<> If multiple consecutive B-frames are used, you may wish to increase deblocking strength to eliminate between-frame noise. This is especially important if you encode with Recode.
<> If you are encoding an animated source, heavier deblocking is suggested to eliminate all blocks possible. The drawn content is more resistant to smearing due to the high-contrast edges. On the other hand, if you are encoding a "real-life" video, especially one with intricate textures and low/poor lighting, consider decreasing the deblocking to preserve such things without creating a washing effect.
<> To determine what you think is best, try the standard 0/0 settings on a small but indicative sample of the source and experiment with it, keeping in mind the guidelines below.
<> For deblocking strength, try not to go outside the -3 to 3 range. Generally, any more than 3 will turn your result into mush while decreasing PSNR and actually increasing the output file size slightly. Any less than -3 may cause the result to look a bit too blocky, and any lack of texture will merely become more apparent as all smoothing is taken out.
<> For deblocking threshold, it all depends on the source. Lighting and content type (edge contrast) seem to play the biggest role in determining what needs to be deblocked. Stick to the range of -4 to 4: if the value is set too high, everything is smeared. Setting it too low would mean that hardly anything actually gets deblocked, leaving sharp images with too much mosquito noise.
<> Try to keep a positive correlation between the two settings. That is, if you want heavier deblocking, make sure to increase the threshold so that more gets deblocked, and vice versa. Recall the comparison with the clothes washing: you don't want to heavily wash a small area while the rest remains unwashed; the unwashed areas will stand out more in stark vividness and provide an ugly visual effect.
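The range and correlation guidelines above are easy to check mechanically before starting a long encode. The sketch below is one way to do it; `check_deblock` and its warning texts are hypothetical, and the limits are simply the ranges I suggest, not hard limits of the codec.

```python
def check_deblock(strength, threshold):
    """Return a list of warnings for a deblocking strength/threshold pair."""
    warnings = []
    if not -3 <= strength <= 3:
        warnings.append("strength outside the suggested -3 to 3 range")
    if not -4 <= threshold <= 4:
        warnings.append("threshold outside the suggested -4 to 4 range")
    # Opposite signs mean heavy washing of a small area, or the reverse.
    if strength * threshold < 0:
        warnings.append("strength and threshold pull in opposite directions")
    return warnings


print(check_deblock(0, 0))   # the default: no warnings
print(check_deblock(2, 1))   # heavier deblocking, e.g. for animation
print(check_deblock(4, -5))  # violates all three guidelines
```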
---Adaptive Quantization---
Present in certain AVC encoders, lumi masking is an encoding technique in which darker areas are awarded less quality than lighter ones, giving more bitrate to the more "visible" parts of the video. This is an example of an encoding tool that can LOWER the PSNR quality measure but actually increase the quality that your eyes see.
<> The main determining factor in whether or not you would like to enable Lumi Masking is the darkness of your source. In films with very dark scenes (dungeons, tunnels, night, shadow), turn this OFF or LOW for good measure. If the source isn't very dark, I'd recommend you try moderate enhancement. Strong lumi masking can lead to problems (the codec might think important semi-dark things are "dark enough" to heavily reduce the quality and create blocking, which may be obvious to the viewer). Check out CiNcH's post below for more info...